Newspaper Selection Analysis Technique

  • Conference paper
Progress in Computing, Analytics and Networking

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 710)

Abstract

Print agencies are fighting for their existence in the current data-driven, digital era. Every day they come up with new approaches to attract the current generation. Following this trend, they are now seeking the help of data scientists to generate new ideas by analyzing future business. Building on this approach, this paper predicts the reading habits of ordinary people. To analyze the dataset effectively, we divide our work into data preprocessing and machine learning. Training a machine learning model on raw data alone rarely produces a good solution; efficient preprocessing techniques need to be applied to obtain better results. It is also important to note that not all machine learning models are equally useful. To get better accuracy on this classification problem, we trained ensemble classifiers, namely gradient boosting and extreme gradient boosting, on the dataset. After training both classifiers on the training dataset, we measured their accuracy on an unseen test dataset. The main aim of this paper is to show that these machine learning models generalize well to the test dataset and do not overfit the training dataset.
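The train-versus-test comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' code or dataset: the data below is synthetic, the helper names (make_data, fit_stump, fit_gradient_boosting, accuracy) are hypothetical, and the model is a simplified gradient boosting of decision stumps on the logistic loss written with the Python standard library only, standing in for the library implementations of gradient boosting and extreme gradient boosting that the paper reports using.

```python
import math
import random


def make_data(n, rng, noise=0.05):
    """Synthetic two-feature binary data (an assumption, not the paper's dataset)."""
    X, y = [], []
    for _ in range(n):
        x = [rng.random(), rng.random()]
        label = 1 if x[0] + x[1] > 1.0 else 0
        if rng.random() < noise:  # flip a small share of labels
            label = 1 - label
        X.append(x)
        y.append(label)
    return X, y


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Least-squares regression stump: returns (feature, threshold, left, right)."""
    n = len(X)
    total = sum(residuals)
    best = None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            if X[order[k - 1]][j] == X[order[k]][j]:
                continue  # cannot split between identical feature values
            right_sum = total - left_sum
            # Maximizing this score is equivalent to minimizing squared error.
            score = left_sum ** 2 / k + right_sum ** 2 / (n - k)
            if best is None or score > best[0]:
                thr = (X[order[k - 1]][j] + X[order[k]][j]) / 2
                best = (score, j, thr, left_sum / k, right_sum / (n - k))
    return best[1:]


def fit_gradient_boosting(X, y, n_rounds=100, learning_rate=0.5):
    """Each round fits a stump to the negative gradient of the logistic loss."""
    p0 = sum(y) / len(y)
    f0 = math.log(p0 / (1.0 - p0))  # initial score: log-odds of the positive class
    scores = [f0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, thr, left, right = fit_stump(X, residuals)
        stumps.append((j, thr, left, right))
        scores = [s + learning_rate * (left if x[j] <= thr else right)
                  for x, s in zip(X, scores)]
    return f0, learning_rate, stumps


def predict(model, x):
    f0, lr, stumps = model
    score = f0 + sum(lr * (left if x[j] <= thr else right)
                     for j, thr, left, right in stumps)
    return 1 if score > 0 else 0


def accuracy(model, X, y):
    return sum(predict(model, x) == yi for x, yi in zip(X, y)) / len(y)


rng = random.Random(0)
X_train, y_train = make_data(300, rng)
X_test, y_test = make_data(300, rng)  # held out, never seen during training

model = fit_gradient_boosting(X_train, y_train)
train_acc = accuracy(model, X_train, y_train)
test_acc = accuracy(model, X_test, y_test)
print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
print(f"gap:            {train_acc - test_acc:.3f}")
```

A small gap between the two accuracies is the kind of evidence the abstract refers to when it says a model generalizes rather than overfits; a training accuracy far above the test accuracy would suggest the opposite.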

Author information

Corresponding author

Correspondence to Gourab Das.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Das, G., Setua, S.K. (2018). Newspaper Selection Analysis Technique. In: Pattnaik, P., Rautaray, S., Das, H., Nayak, J. (eds) Progress in Computing, Analytics and Networking. Advances in Intelligent Systems and Computing, vol 710. Springer, Singapore. https://doi.org/10.1007/978-981-10-7871-2_60

  • DOI: https://doi.org/10.1007/978-981-10-7871-2_60

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-7870-5

  • Online ISBN: 978-981-10-7871-2

  • eBook Packages: Engineering, Engineering (R0)
