
Efficient Feature Selection Framework for Digital Marketing Applications

  • Conference paper
  • In: Advances in Knowledge Discovery and Data Mining (PAKDD 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10939)


Abstract

Digital marketing strategies can help businesses achieve better return on investment (ROI), and big data and predictive modelling are key to identifying the customers most likely to respond. Yet the rich but mostly irrelevant attributes (features) in such data adversely affect predictive modelling performance, both computationally and qualitatively, so selecting relevant features is a crucial task for marketing applications. The feature selection process is very time consuming due to the large amount of data and the high dimensionality of the features. In this paper, we propose to reduce the computation time by regularizing the feature search process with expert knowledge. We also combine the regularized search with a generative filtering step, which addresses potential problems with the regularized search and further speeds up the process. In addition, a progressive sampling and coarse-to-fine selection framework is built to further lower the space and time requirements.
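The coarse-to-fine idea in the abstract, a cheap filter stage followed by a finer wrapper search run on progressively larger data samples, might be sketched as follows. All names, scores, and toy data here are hypothetical illustrations of the general technique, not the paper's actual algorithm:

```python
def filter_stage(features, scores, keep_ratio=0.5):
    """Coarse stage: keep the top fraction of features ranked by a
    cheap univariate relevance score (stand-in for the paper's
    generative filtering step)."""
    ranked = sorted(features, key=lambda f: scores[f], reverse=True)
    k = max(1, int(len(ranked) * keep_ratio))
    return ranked[:k]

def wrapper_stage(features, evaluate, sample_sizes):
    """Fine stage: greedy forward selection, where each round is
    evaluated on a progressively larger sample of the data to save
    computation on early, uncertain decisions."""
    selected = []
    for n in sample_sizes:                 # progressive sampling
        best_gain, best_f = 0.0, None
        base = evaluate(selected, n)
        for f in features:
            if f in selected:
                continue
            gain = evaluate(selected + [f], n) - base
            if gain > best_gain:
                best_gain, best_f = gain, f
        if best_f is not None:
            selected.append(best_f)
    return selected

# Toy usage: two genuinely relevant features among five.
relevant = {"age", "income"}
feats = ["age", "income", "noise1", "noise2", "noise3"]
scores = {"age": 0.9, "income": 0.8,
          "noise1": 0.2, "noise2": 0.1, "noise3": 0.05}

def evaluate(subset, n):
    # Stand-in for model quality fitted on an n-row sample.
    return sum(1.0 for f in subset if f in relevant)

coarse = filter_stage(feats, scores, keep_ratio=0.6)
final = wrapper_stage(coarse, evaluate, sample_sizes=[1000, 10000])
```

The filter stage shrinks the candidate pool before any model is fitted, so the expensive wrapper loop runs over far fewer features and, early on, over far fewer rows.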


Notes

  1. C(n) depends on the modelling algorithm. It is linear w.r.t. n for many commonly used algorithms, such as logistic regression and random forest [8]. In this case, the complexity term \(C(1)+C(2)+ \ldots +C(n) \propto n^2\). Without loss of generality, we use Eq. 1 to represent the complexity; the derivation in Sect. 4.1 holds either way.
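The quadratic growth stated in the note can be checked with a short sketch (the cost constant c and the function name are illustrative assumptions):

```python
# If one model fit on i features costs C(i) = c * i (linear, as for
# logistic regression or random forest per the note), then evaluating
# nested subsets of size 1..n costs C(1) + C(2) + ... + C(n)
# = c * n(n+1)/2, i.e. proportional to n^2.
def sweep_cost(n, c=1.0):
    """Total cost of n model fits with linearly growing feature counts."""
    return sum(c * i for i in range(1, n + 1))

# Doubling the feature count roughly quadruples the total cost:
ratio = sweep_cost(200) / sweep_cost(100)   # ≈ 3.98, approaching 4
```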

References

  1. Berrendero, J.R., Cuevas, A., Torrecilla, J.L.: The mRMR variable selection method: a comparative study for functional data. J. Stat. Comput. Simul. 86(5), 891–907 (2016)
  2. Breiman, L.: Random forests. Mach. Learn. 45, 5–32 (2001)
  3. Deng, K.: Omega: On-line memory-based general purpose system classifier. Ph.D. dissertation, Carnegie Mellon University (1998)
  4. Farahat, A.K., Ghodsi, A., Kamel, M.S.: An efficient greedy method for unsupervised feature selection. In: ICDM, pp. 161–170 (2011)
  5. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)
  6. Hsu, H.H., Hsieh, C.W., Lu, M.D.: Hybrid feature selection by combining filters and wrappers. Expert Syst. Appl. 38(7), 8144–8150 (2011)
  7. Huda, S., Yearwood, J., Stranieri, A.: Hybrid wrapper-filter approaches for input feature selection using maximum relevance-minimum redundancy and artificial neural network input gain measurement approximation. In: ACSC, pp. 43–52 (2011)
  8. Iyer, K.: Computational complexity of data mining algorithms used in fraud detection. Ph.D. dissertation, Pennsylvania State University (2005)
  9. Kotsiantis, S.: Feature selection for machine learning classification problems: a recent overview. Artif. Intell. Rev. 42, 1–20 (2011)
  10. Kroeger, P.R.: Analyzing Grammar: An Introduction. Cambridge University Press, Cambridge (2005)
  11. Lee, C.P., Leu, Y.: A novel hybrid feature selection method for microarray data analysis. Appl. Soft Comput. 11(1), 208–213 (2011)
  12. Mahdokht, M., Yan, Y., Cui, Y., Dy, J.: Convex principal feature selection. In: Proceedings of the 2010 SIAM International Conference on Data Mining, pp. 619–628 (2010)
  13. Manikandan, P., Venkateswaran, C.J.: Feature selection algorithms: literature review. Smart Comput. Rev. 4(3) (2014)
  14. Minka, T.P.: A comparison of numerical optimizers for logistic regression. Unpublished draft (2003)
  15. Nguyen, X.V., Chan, J., Romano, S., Bailey, J.: Effective global approaches for mutual information based feature selection. In: KDD, pp. 512–521 (2014)
  16. Peng, H., Long, F., Ding, C.: Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 27(8), 1226–1238 (2005)
  17. Senliol, B., Gulgezen, G., Yu, L., Cataltepe, Z.: Fast correlation based filter (FCBF) with a different search strategy. In: 23rd International Symposium on Computer and Information Sciences, pp. 1–4 (2008)
  18. Shao, W., He, L., Lu, C., Wei, X., Yu, P.: Online unsupervised multi-view feature selection. In: ICDM, pp. 1203–1208 (2016)
  19. Tang, J., Alelyani, S., Liu, H.: Feature selection for classification: a review (2014)
  20. Tang, J., Hu, X., Gao, H., Liu, H.: Unsupervised feature selection for multi-view data in social media. In: SDM, pp. 270–278 (2013)
  21. Torkkola, K.: Feature extraction by non-parametric mutual information maximization. J. Mach. Learn. Res. 3, 1415–1438 (2003)
  22. Venkateswara, H., Lade, P., Lin, B., Ye, J., Panchanathan, S.: Efficient approximate solutions to mutual information based global feature selection. In: ICDM, pp. 1009–1014 (2015)
  23. Vinzamuri, B., Padthe, K.K., Reddy, C.K.: Feature grouping using weighted l1 norm for high-dimensional data. In: ICDM, pp. 1233–1238. IEEE (2016)


Author information

Correspondence to Wei Zhang.



Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper


Cite this paper

Zhang, W., Bose, S., Kobeissi, S., Tomko, S., Challis, C. (2018). Efficient Feature Selection Framework for Digital Marketing Applications. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science, vol 10939. Springer, Cham. https://doi.org/10.1007/978-3-319-93040-4_3


  • DOI: https://doi.org/10.1007/978-3-319-93040-4_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-93039-8

  • Online ISBN: 978-3-319-93040-4

  • eBook Packages: Computer Science, Computer Science (R0)
