Pattern-Based Causal Feature Extraction

Almeida, Diogo Moitinho de

doi:10.1007/978-3-030-21810-2_10


Part of the book series: The Springer Series on Challenges in Machine Learning (SSCML)


Abstract

The cause-effect pairs challenge was motivated by the contrast between the high cost of controlled experiments for determining causality and the abundance of observational data. Our goal was to produce a value representing our confidence in a causal relationship, inferred from observational data alone, to help identify the most promising variables for experimental verification. By identifying patterns in the functions that generate relevant features, we designed a feature extraction pipeline that creates large numbers of complex features with minimal human intervention. Using this pipeline, we finished second on the public leaderboard and first on the private leaderboard. By default, the process generates over 20,000 features. In this paper, we analyze which aspects matter most and build a new pipeline that achieves comparable performance with only 324 features.
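The idea of generating features from patterns, rather than writing each one by hand, can be illustrated with a minimal sketch. It is not the author's implementation: every function name, the two polynomial predictors, and the three aggregators below are hypothetical choices made for illustration. The sketch assumes two numerical variables with non-zero variance, fits a regression in each direction (A to B and B to A), aggregates the residuals, and adds a relative feature comparing the two directions, so the feature count grows as the product of the component lists.

```python
import itertools

import numpy as np


def standardize(values):
    """Rescale to zero mean and unit variance so both directions are comparable."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()


def linear_fit_residuals(x, y):
    """Residuals of a least-squares line y ~ slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return y - (slope * x + intercept)


def cubic_fit_residuals(x, y):
    """Residuals of a least-squares cubic polynomial fit of y on x."""
    coeffs = np.polyfit(x, y, deg=3)
    return y - np.polyval(coeffs, x)


# Hypothetical component lists: adding one entry to either dictionary
# multiplies the number of generated features without new hand-written code.
PREDICTORS = {"linear": linear_fit_residuals, "cubic": cubic_fit_residuals}
AGGREGATORS = {
    "mean_abs": lambda r: float(np.mean(np.abs(r))),
    "std": lambda r: float(np.std(r)),
    "max_abs": lambda r: float(np.max(np.abs(r))),
}


def extract_features(a, b):
    """Build direction-wise and relative features for one pair (A, B)."""
    a, b = standardize(a), standardize(b)
    features = {}
    combos = itertools.product(PREDICTORS.items(), AGGREGATORS.items())
    for (p_name, predict), (g_name, aggregate) in combos:
        a_to_b = aggregate(predict(a, b))  # how well A explains B
        b_to_a = aggregate(predict(b, a))  # how well B explains A
        key = f"{p_name}_{g_name}"
        features[f"{key}_a_to_b"] = a_to_b
        features[f"{key}_b_to_a"] = b_to_a
        features[f"{key}_diff"] = a_to_b - b_to_a  # relative feature
    return features


# Usage on synthetic data where B is a noisy function of A.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = a**3 + 0.1 * rng.normal(size=200)
features = extract_features(a, b)
```

With two predictors and three aggregators the sketch yields 18 features per pair. The resulting dictionary would then be fed to a downstream classifier; which model and which components to use are left open here.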


Notes

1. Available at https://github.com/diogo149/autocause.
2. See the configs subdirectory of https://github.com/diogo149/CauseEffectPairsPaper.


Acknowledgment

Special thanks to the organizers of the ChaLearn Cause-Effect Pair Challenge hosted by Kaggle.

Author information

Authors and Affiliations

Google, Menlo Park, CA, USA
Diogo Moitinho de Almeida


Editor information

Editors and Affiliations

Team TAU - CNRS, INRIA, Université Paris Sud, Université Paris Saclay, Orsay France, ChaLearn, Berkeley, CA, USA
Isabelle Guyon
SoFi, San Francisco, CA, USA
Alexander Statnikov
University of Paris-Sud, Paris-Saclay, Paris, France
Berna Bakir Batu

Appendix

1.1 Results

See Tables 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 10.12, and 10.13.

Table 10.6 Results of experiment on meta-features
Table 10.7 Results of experiment on relative features
Table 10.8 Results of experiment on aggregation features
Table 10.9 Results of experiment on numerical vs categorical features
Table 10.10 Results of experiment on numerical to categorical transformation
Table 10.11 Results of experiment on categorical to numerical transformation
Table 10.12 Results of experiment on classifiers
Table 10.13 Results of experiment on regression predictors


About this chapter

Cite this chapter

Almeida, D.M.d. (2019). Pattern-Based Causal Feature Extraction. In: Guyon, I., Statnikov, A., Batu, B. (eds) Cause Effect Pairs in Machine Learning. The Springer Series on Challenges in Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-21810-2_10


DOI: https://doi.org/10.1007/978-3-030-21810-2_10
Published: 23 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21809-6
Online ISBN: 978-3-030-21810-2
eBook Packages: Computer Science, Computer Science (R0)
