Abstract
Digital marketing strategies can help businesses achieve a better return on investment (ROI). Big data and predictive modelling are key to identifying the specific customers to target. However, abundant and mostly irrelevant attributes (features) adversely affect predictive-modelling performance, both computationally and qualitatively, so selecting relevant features is a crucial task for marketing applications. Feature selection is very time consuming because of the large data volume and the high dimensionality of the feature space. In this paper, we propose to reduce the computation time by regularizing the feature search process with expert knowledge. We also combine the regularized search with a generative filtering step, which addresses potential problems with the regularized search and further speeds up the process. In addition, we build a progressive-sampling, coarse-to-fine selection framework to further lower the space and time requirements.
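The coarse-to-fine idea sketched in the abstract can be illustrated with a toy example. This is not the paper's actual algorithm, only a minimal sketch under simplifying assumptions: a plain correlation filter is scored first on a small row sample (the coarse, cheap pass), and only the surviving features are re-scored on the full data (the fine pass). All names here (`coarse_to_fine_select`, `corr`) are illustrative.

```python
import random

def corr(xs, ys):
    """Pearson correlation between two equal-length lists (pure Python)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def coarse_to_fine_select(X, y, k_keep, k_final, sample_frac=0.2, seed=0):
    """Coarse-to-fine filter: rank all features on a small row sample,
    keep the top k_keep, then re-rank only those survivors on the full data."""
    rng = random.Random(seed)
    n = len(y)
    idx = rng.sample(range(n), max(2, int(sample_frac * n)))
    # Coarse stage: score every feature, but only on the sampled rows.
    scores = [(abs(corr([X[i][j] for i in idx], [y[i] for i in idx])), j)
              for j in range(len(X[0]))]
    survivors = [j for _, j in sorted(scores, reverse=True)[:k_keep]]
    # Fine stage: re-score the few survivors on all rows.
    fine = [(abs(corr([row[j] for row in X], y)), j) for j in survivors]
    return [j for _, j in sorted(fine, reverse=True)[:k_final]]
```

The cheap coarse pass touches every feature but only a fraction of the rows, so the expensive full-data scoring runs on just `k_keep` survivors instead of all features.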
Notes
1. C(n) depends on the modelling algorithm. It is linear w.r.t. n for many commonly used algorithms, such as logistic regression and random forests [8]. In this case, the complexity term \(C(1)+C(2)+ \cdots +C(n) \propto n^2\). Without loss of generality, we use Eq. 1 to represent the complexity; the derivation in Sect. 4.1 holds either way.
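The quadratic bound in the note can be made explicit. Assuming each fit costs \(C(k) = c\,k\) for some constant \(c\) (the linear case mentioned above), the summed cost is a triangular number:

\[
\sum_{k=1}^{n} C(k) \;=\; c \sum_{k=1}^{n} k \;=\; \frac{c\,n(n+1)}{2} \;\in\; \Theta(n^2).
\]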
References
Berrendero, J.R., Cuevas, A., Torrecilla, J.L.: The mRMR variable selection method: a comparative study for functional data. J. Stat. Comput. Simul. 86(5), 891–907 (2016)
Breiman, L.: Random forests. Mach. Learn. 45, 5–32 (2001)
Deng, K.: Omega: On-line memory-based general purpose system classifier. Ph.D. dissertation, Carnegie Mellon University (1998)
Farahat, A.K., Ghodsi, A., Kamel, M.S.: An efficient greedy method for unsupervised feature selection. In: ICDM, pp. 161–170 (2011)
Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)
Hsu, H.H., Hsieh, C.W., Lu, M.D.: Hybrid feature selection by combining filters and wrappers. Expert Syst. Appl. 38(7), 8144–8150 (2011)
Huda, S., Yearwood, J., Stranieri, A.: Hybrid wrapper-filter approaches for input feature selection using maximum relevance-minimum redundancy and artificial neural network input gain measurement approximation. In: ACSC, pp. 43–52 (2011)
Iyer, K.: Computational complexity of data mining algorithms used in fraud detection. Ph.D. dissertation, Pennsylvania State University (2005)
Kotsiantis, S.: Feature selection for machine learning classification problems: a recent overview. Artif. Intell. Rev. 42, 1–20 (2011)
Kroeger, P.R.: Analyzing Grammar: An Introduction. Cambridge University Press, Cambridge (2005)
Lee, C.P., Leu, Y.: A novel hybrid feature selection method for microarray data analysis. Appl. Soft Comput. 11(1), 208–213 (2011)
Masaeli, M., Yan, Y., Cui, Y., Dy, J.: Convex principal feature selection. In: Proceedings of the 2010 SIAM International Conference on Data Mining, pp. 619–628 (2010)
Manikandan, P., Venkateswaran, C.J.: Feature selection algorithms: literature review. Smart Comput. Rev. 4(3) (2014)
Minka, T.P.: A comparison of numerical optimizers for logistic regression. Unpublished draft (2003)
Nguyen, X.V., Chan, J., Romano, S., Bailey, J.: Effective global approaches for mutual information based feature selection. In: KDD, pp. 512–521 (2014)
Peng, H., Long, F., Ding, C.: Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 27(8), 1226–1238 (2005)
Senliol, B., Gulgezen, G., Yu, L., Cataltepe, Z.: Fast correlation based filter (FCBF) with a different search strategy. In: 23rd International Symposium on Computer and Information Sciences, pp. 1–4 (2008)
Shao, W., He, L., Lu, C., Wei, X., Yu, P.: Online unsupervised multi-view feature selection. In: ICDM, pp. 1203–1208 (2016)
Tang, J., Alelyani, S., Liu, H.: Feature selection for classification: a review (2014)
Tang, J., Hu, X., Gao, H., Liu, H.: Unsupervised feature selection for multi-view data in social media. In: SDM, pp. 270–278 (2013)
Torkkola, K.: Feature extraction by non-parametric mutual information maximization. J. Mach. Learn. Res. 3, 1415–1438 (2003)
Venkateswara, H., Lade, P., Lin, B., Ye, J., Panchanathan, S.: Efficient approximate solutions to mutual information based global feature selection. In: ICDM, pp. 1009–1014 (2015)
Vinzamuri, B., Padthe, K.K., Reddy, C.K.: Feature grouping using weighted l1 norm for high-dimensional data. In: ICDM, pp. 1233–1238. IEEE (2016)
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Zhang, W., Bose, S., Kobeissi, S., Tomko, S., Challis, C. (2018). Efficient Feature Selection Framework for Digital Marketing Applications. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science(), vol 10939. Springer, Cham. https://doi.org/10.1007/978-3-319-93040-4_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93039-8
Online ISBN: 978-3-319-93040-4
eBook Packages: Computer Science (R0)