Abstract
In recent years, the field of multivariate analysis and machine learning has evolved rapidly and has provided powerful techniques that are now adopted in all fields of science. Prominent use cases include image and speech recognition, stock-market trading, fraud detection, and medical diagnosis.
Notes
1. A dataset without signal, but with comparable background properties.
2. The Lebesgue space of p-integrable functions.
3. A wrapper would provide a backend-agnostic interface and, in consequence, only the lowest common denominator of all backends.
4. An in-memory representation of the current event.
5. A Python-based ecosystem of open-source software for mathematics, science, and engineering.
6. FastBDT: one hundred trees with a depth of three.
7. Different quantities can be used to construct the receiver operating characteristic; here the signal efficiency and the background rejection are used.
8. A joint region in the fit variables.
9. In the sense defined in [23].
10. Meaning that the events cannot be distinguished by the features used in the training of the classifier.
11. Quantitative parameters have an intrinsic ordering; examples are the number of trees in a BDT or the weight-decay constant in the loss function of a NN.
12. Qualitative parameters do not have an intrinsic ordering; examples are the separation gain measure in a BDT and the optimization algorithm of a NN.
13. A hyper-parameter with minor or no influence on the score.
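The receiver operating characteristic convention mentioned in the notes (signal efficiency versus background rejection, rather than the true-positive/false-positive axes common elsewhere) can be sketched in a few lines. The function name, the toy Gaussian classifier outputs, and the chosen means and widths below are illustrative assumptions, not taken from the chapter:

```python
import numpy as np

def roc_sig_eff_bkg_rej(scores_signal, scores_background):
    """ROC in the HEP convention: signal efficiency vs. background rejection.

    For each threshold t, events with score >= t are classified as signal.
    Signal efficiency  = fraction of signal events kept.
    Background rejection = fraction of background events removed.
    """
    thresholds = np.sort(np.concatenate([scores_signal, scores_background]))
    sig_eff = np.array([(scores_signal >= t).mean() for t in thresholds])
    bkg_rej = np.array([(scores_background < t).mean() for t in thresholds])
    return sig_eff, bkg_rej

# Toy classifier outputs (assumed for illustration): signal peaks at +1,
# background at -1, both with unit width.
rng = np.random.default_rng(0)
sig = rng.normal(+1.0, 1.0, 1000)
bkg = rng.normal(-1.0, 1.0, 1000)

eff, rej = roc_sig_eff_bkg_rej(sig, bkg)

# Area under the ROC curve via its probabilistic interpretation:
# P(score of a random signal event > score of a random background event).
auc = (sig[:, None] > bkg[None, :]).mean()
```

Regardless of which pair of axes is plotted, the area under the curve carries the same information; the Mann-Whitney form used for `auc` above avoids any numerical integration of the curve.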
References
C.M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics) (Springer-Verlag, New York, Inc., 2006). ISBN: 0387310738
T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning (Springer-Verlag, New York, Inc., 2001). ISBN: 978-0-387-84858-7
V. Vapnik, Principles of risk minimization for learning theory, in NIPS (1991)
L. Rosasco et al., Are loss functions all the same? Neural Comput. 16(5), 1063–1076 (2004). https://doi.org/10.1162/089976604773135104
P. McCullagh, J.A. Nelder, Generalized Linear Models, 2nd edn. (Chapman & Hall, 1989). ISBN: 9780412317606
E. Parzen, On estimation of a probability density function and mode. Ann. Math. Stat. 33(3), 1065–1076 (1962). https://doi.org/10.1214/aoms/1177704472
R.A. Rigby, D.M. Stasinopoulos, Generalized additive models for location, scale and shape. J. R. Stat. Soc. Ser. C (Appl. Stat.) 54(3), 507–554 (2005). https://doi.org/10.1111/j.1467-9876.2005.00510.x
R.W. Koenker, G. Bassett, Regression quantiles. Econometrica 46(1), 33–50 (1978)
J. Neyman, E.S. Pearson, On the problem of the most efficient tests of statistical hypotheses. Philos. Trans. R. Soc. Lond. Ser. A Contain. Pap. Math. Phys. Character 231, 289–337 (1933). https://doi.org/10.1098/rsta.1933.0009
R.A. Fisher, The use of multiple measurements in taxonomic problems. Ann. Eugen. 7(7), 179–188 (1936). https://doi.org/10.1111/j.1469-1809.1936.tb02137.x
J.H. Friedman, Stochastic gradient boosting. Comput. Stat. Data Anal. 38(4), 367–378 (2002). https://doi.org/10.1016/S0167-9473(01)00065-2
J. Bergstra, Y. Bengio, Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281–305 (2012)
D. Maclaurin, D. Duvenaud, R. Adams, Gradient-based hyperparameter optimization through reversible learning, in Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pp. 2113–2122 (2015), http://jmlr.org/proceedings/papers/v37/maclaurin15.pdf
J. Snoek, H. Larochelle, R.P. Adams, Practical Bayesian optimization of machine learning algorithms. Adv. Neural Inf. Process. Syst. 25, 2951–2959 (2012), arXiv: 1206.2944 [stat.ML]
O. Behnke, K. Kroeninger, T. Schoerner-Sadenius, G. Schott, Data Analysis in High Energy Physics (Wiley-VCH, 2013). ISBN: 9783527410583
K. Cranmer, I. Yavin, RECAST: extending the impact of existing analyses. JHEP 04, 038 (2011). https://doi.org/10.1007/JHEP04(2011)038
M. Feindt et al., A hierarchical NeuroBayes-based algorithm for full reconstruction of B mesons at B factories. Nucl. Instrum. Methods A654, 432–440 (2011). https://doi.org/10.1016/j.nima.2011.06.008
K. Hornik, Approximation capabilities of multilayer feedforward networks. Neural Netw. 4(2), 251–257 (1991). https://doi.org/10.1016/0893-6080(91)90009-T
H.W. Lin, M. Tegmark, D. Rolnick, Why does deep and cheap learning work so well? J. Stat. Phys. (2017). https://doi.org/10.1007/s10955-017-1836-5
P. Baldi, P. Sadowski, D. Whiteson, Searching for exotic particles in high-energy physics with deep learning. Nat. Commun. 5, 4308 (2014). https://doi.org/10.1038/ncomms5308
Y. LeCun, Y. Bengio, G. Hinton, Deep learning. Nature 521, 436–444 (2015). https://doi.org/10.1038/nature14539
I. Goodfellow et al., Generative adversarial nets, in Advances in Neural Information Processing Systems 27, pp. 2672–2680. Curran Associates Inc. (2014), http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf
G. Louppe, M. Kagan, K. Cranmer, Learning to pivot with adversarial networks, in NIPS (2016), arXiv: 1611.01046 [stat.ME]
Y. Bengio, A. Courville, P. Vincent, Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013). https://doi.org/10.1109/TPAMI.2013.50
O. Vinyals, A. Toshev, S. Bengio, D. Erhan, Show and tell: lessons learned from the 2015 MSCOCO image captioning challenge. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 652–663 (2017). https://doi.org/10.1109/TPAMI.2016.2587640
M. Pivk, F.R. Le Diberder, SPlot: a statistical tool to unfold data distributions. Nucl. Instrum. Methods A555, 356–369 (2005). https://doi.org/10.1016/j.nima.2005.08.106
D. Martschei, M. Feindt, S. Honc, J. Wagner-Kuhr, Advanced event reweighting using multivariate analysis, in Proceedings, 14th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2011), vol. 368, p. 012028 (2012). https://doi.org/10.1088/1742-6596/368/1/012028
T. Keck, FastBDT: a speed-optimized multivariate classification algorithm for the Belle II experiment. Comput. Softw. Big Sci. 1(1) (2017). https://doi.org/10.1007/s41781-017-0002-8
FastBDT repository, https://github.com/thomaskeck/FastBDT. Accessed 02 October 2017
J. Therhaag et al., TMVA–Toolkit for multivariate data analysis. AIP Conf. Proc. 1504(1), 1013–1016 (2012). https://doi.org/10.1063/1.4771869
S. Nissen, Implementation of a fast artificial neural network library (FANN). Technical report, Department of Computer Science University of Copenhagen (DIKU) (2003), http://fann.sf.net
M. Feindt, U. Kerzel, The NeuroBayes neural network package. Nucl. Instrum. Methods A559, 190–194 (2006). https://doi.org/10.1016/j.nima.2005.11.166
F. Pedregosa et al., Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
T. Chen, C. Guestrin, XGBoost: a scalable tree boosting system, in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794 (2016). https://doi.org/10.1145/2939672.2939785
M. Abadi et al., TensorFlow: a system for large-scale machine learning, in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265–283 (2016), https://www.usenix.org/system/files/conference/osdi16/osdi16-abadi.pdf
F. Chollet et al., Keras (2015), https://github.com/fchollet/keras
I.J. Goodfellow et al., Pylearn2: a machine learning research library (2013), arXiv: 1308.4214 [stat.ML]
R. Al-Rfou et al., Theano: a Python framework for fast computation of mathematical expressions, arXiv: 1605.02688 [cs.SC]
C. Patrignani et al., Review of particle physics. Chin. Phys. C40(10), 100001 (2016). https://doi.org/10.1088/1674-1137/40/10/100001
J.A. Hanley, B.J. McNeil, The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143(1), 29–36 (1982). https://doi.org/10.1148/radiology.143.1.7063747
M. Gelb, Neutral B Meson flavor tagging for Belle II. MA thesis, KIT (2015), https://ekp-invenio.physik.uni-karlsruhe.de/record/48719
J. Gemmler, Study of B Meson flavor tagging with deep neural networks at Belle and Belle II. MA thesis, KIT (2016), https://ekp-invenio.physik.uni-karlsruhe.de/record/48849
D.M. Asner, M. Athanas, D.W. Bliss et al., Search for exclusive charmless hadronic B decays. Phys. Rev. D 53, 1039–1050 (1996). https://doi.org/10.1103/PhysRevD.53.1039
G.C. Fox, S. Wolfram, Observables for the analysis of event shapes in \(e^{+}e^{-}\) annihilation and other processes. Phys. Rev. Lett. 41, 1581–1585 (1978). https://doi.org/10.1103/PhysRevLett.41.1581
A.J. Bevan et al., The physics of the B factories. Eur. Phys. J. C 74, 3026 (2014). https://doi.org/10.1140/epjc/s10052-014-3026-9
D. Weyland, Continuum suppression with deep learning techniques for the Belle II experiment. MA thesis, KIT (2017), https://ekp-invenio.physik.uni-karlsruhe.de/record/48934
A. Rogozhnikov et al., New approaches for boosting to uniformity. JINST 10(03), T03002 (2015). https://doi.org/10.1088/1748-0221/10/03/T03002
M. Feindt, M. Prim, An algorithm for quantifying dependence in multivariate data sets. Nucl. Instrum. Methods A698, 84–89 (2013). https://doi.org/10.1016/j.nima.2012.09.043
J. Dolen et al., Thinking outside the ROCs: designing decorrelated taggers (DDT) for jet substructure. JHEP 05, 156 (2016). https://doi.org/10.1007/JHEP05(2016)156
J. Stevens, M. Williams, uBoost: a boosting method for producing uniform selection efficiencies from multivariate classifiers. JINST 8, P12013 (2013). https://doi.org/10.1088/1748-0221/8/12/P12013
B. Lipp, sPlot-based training of multivariate classifiers in the Belle II analysis software framework. BA thesis, KIT (2015), https://ekp-invenio.physik.uni-karlsruhe.de/record/48717
© 2018 Springer Nature Switzerland AG
Cite this chapter
Keck, T. (2018). Multivariate Analysis Algorithms. In: Machine Learning at the Belle II Experiment. Springer Theses. Springer, Cham. https://doi.org/10.1007/978-3-319-98249-6_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98248-9
Online ISBN: 978-3-319-98249-6
eBook Packages: Physics and Astronomy (R0)