Newspaper Selection Analysis Technique

Das, Gourab; Setua, S. K.

doi:10.1007/978-981-10-7871-2_60


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 710)


Abstract

Print agencies are fighting for their existence in the current data-driven, digital era. Every day they come up with new approaches to attract the current generation. Following this trend, they now seek the help of data scientists to generate new ideas by analyzing future business. Building on this approach, this paper predicts the reading habits of the general public. To analyze the dataset properly, we divide our work into data preprocessing and machine learning. Training a machine learning model on raw data alone rarely produces a good solution; efficient preprocessing techniques must be incorporated to obtain better results. It is also important to note that not every machine learning model is equally useful. To obtain better accuracy on this classification problem, we trained ensemble classifiers, namely gradient boosting and extreme gradient boosting, on the dataset. After training both classifiers on the training dataset, we measured their accuracy on an unseen test dataset. The main aim of this paper is to show that these machine learning models generalize well to the test dataset and do not overfit the training dataset.
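
The train-then-evaluate workflow described in the abstract can be illustrated with a small sketch. It is not the authors' code: the chapter's dataset and preprocessing are not available in this preview, so the data here is synthetic, and the classifier is a deliberately simplified gradient boosting written from scratch (one-split trees fitted to log-loss residuals, with the mean residual as leaf value instead of a line search or Newton step). All names (`make_data`, `fit_stump`, `train`, `predict`, `accuracy`) and all parameter values are assumptions made for illustration. Extreme gradient boosting is not covered by this sketch.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_data(n, rng):
    """Synthetic two-feature binary data (a stand-in, not the chapter's dataset)."""
    X, y = [], []
    for _ in range(n):
        x0, x1 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        X.append([x0, x1])
        y.append(1 if x0 + x1 + rng.gauss(0, 0.2) > 0 else 0)
    return X, y


def fit_stump(X, residuals, n_thresholds=15):
    """Least-squares fit of a one-split tree (stump) to the residuals."""
    best = None
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        lo, hi = min(col), max(col)
        for k in range(1, n_thresholds + 1):
            t = lo + (hi - lo) * k / (n_thresholds + 1)
            left = [r for v, r in zip(col, residuals) if v <= t]
            right = [r for v, r in zip(col, residuals) if v > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]  # (feature index, threshold, left value, right value)


def train(X, y, n_rounds=40, lr=0.5):
    """Boost stumps on the negative gradient of the log-loss."""
    p = min(max(sum(y) / len(y), 1e-6), 1 - 1e-6)
    f0 = math.log(p / (1 - p))  # initial score: log-odds of the positive class
    scores = [f0] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of log-loss with respect to the score.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, t, lv, rv = fit_stump(X, residuals)
        stumps.append((j, t, lv, rv))
        scores = [s + lr * (lv if row[j] <= t else rv) for s, row in zip(scores, X)]
    return f0, lr, stumps


def predict(model, row):
    f0, lr, stumps = model
    score = f0 + sum(lr * (lv if row[j] <= t else rv) for j, t, lv, rv in stumps)
    return 1 if score > 0 else 0


def accuracy(model, X, y):
    return sum(predict(model, row) == yi for row, yi in zip(X, y)) / len(y)


rng = random.Random(0)  # fixed seed so the run is repeatable
X_train, y_train = make_data(150, rng)
X_test, y_test = make_data(60, rng)  # held out, never seen during training

model = train(X_train, y_train)
train_acc = accuracy(model, X_train, y_train)
test_acc = accuracy(model, X_test, y_test)
print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
print(f"gap:            {train_acc - test_acc:.3f}")
```

A small gap between the two accuracies is the kind of evidence the abstract refers to when it says the models generalize and do not overfit; a training accuracy far above the test accuracy would suggest the opposite. In practice a library implementation would normally replace the hand-written booster.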



Author information

Authors and Affiliations

University of Calcutta, Kolkata, India
Gourab Das & S. K. Setua


Corresponding author

Correspondence to Gourab Das.

Editor information

Editors and Affiliations

School of Computer Engineering, KIIT University, Bhubaneswar, Odisha, India
Prasant Kumar Pattnaik
School of Computer Engineering, KIIT University, Bhubaneswar, Odisha, India
Siddharth Swarup Rautaray
School of Computer Engineering, KIIT University, Bhubaneswar, Odisha, India
Himansu Das
Department of Computer Science and Engineering, Sri Sivani College of Engineering, Srikakulam, Andhra Pradesh, India
Janmenjoy Nayak


About this paper

Cite this paper

Das, G., Setua, S.K. (2018). Newspaper Selection Analysis Technique. In: Pattnaik, P., Rautaray, S., Das, H., Nayak, J. (eds) Progress in Computing, Analytics and Networking. Advances in Intelligent Systems and Computing, vol 710. Springer, Singapore. https://doi.org/10.1007/978-981-10-7871-2_60


DOI: https://doi.org/10.1007/978-981-10-7871-2_60
Published: 11 April 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7870-5
Online ISBN: 978-981-10-7871-2
eBook Packages: Engineering, Engineering (R0)
