
Multivariate Analysis Algorithms

Chapter in: Machine Learning at the Belle II Experiment

Part of the book series: Springer Theses


Abstract

In recent years, the field of multivariate analysis and machine learning has evolved rapidly and has produced powerful techniques that are now adopted across all fields of science. Prominent use cases include image and speech recognition, stock-market trading, fraud detection, and medical diagnosis.


Notes

  1. A dataset without signal, but with comparable background properties.

  2. The Lebesgue space of p-integrable functions (the standard definition is given after these notes).

  3. A wrapper would provide a backend-agnostic interface and would consequently be limited to the lowest common denominator of all backends.

  4. An in-memory representation of the current event.

  5. A Python-based ecosystem of open-source software for mathematics, science, and engineering.

  6. FastBDT: one hundred trees with a depth of three (see the configuration sketch after these notes).

  7. Different quantities can be used to construct the receiver operating characteristic; here the signal efficiency and the background rejection are used (see the example after these notes).

  8. A joint region in the fit variables.

  9. In the sense defined in [23].

  10. Meaning that the events cannot be distinguished by the features used in the training of the classifier.

  11. Quantitative parameters have an intrinsic ordering; examples are the number of trees in a BDT, or the weight-decay constant in the loss function of a NN (see the search sketch after these notes).

  12. Qualitative parameters do not have an intrinsic ordering; examples are the separation-gain measure in a BDT and the optimization algorithm of a NN.

  13. A hyper-parameter with minor or no influence on the score.
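
Note 2 refers to the Lebesgue space; for reference, its standard definition reads:

    % Standard definition of the Lebesgue space of p-integrable functions:
    \[
      L^{p}(\Omega) = \Bigl\{ f : \Omega \to \mathbb{R} \;\Bigm|\;
        \lVert f \rVert_{p} = \Bigl( \int_{\Omega} \lvert f(x) \rvert^{p} \, \mathrm{d}x \Bigr)^{1/p} < \infty \Bigr\},
      \qquad 1 \le p < \infty .
    \]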
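
Note 6 fixes the FastBDT [28, 29] hyper-parameters at one hundred trees of depth three. As an illustration only, not the actual training code of this chapter, the same two hyper-parameters can be set on scikit-learn's [33] GradientBoostingClassifier:

    # Illustrative stand-in for the FastBDT setting of note 6 (100 trees,
    # depth 3), using scikit-learn's gradient-boosting classifier. The
    # synthetic dataset is a hypothetical placeholder.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=10_000, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = GradientBoostingClassifier(n_estimators=100, max_depth=3)
    clf.fit(X_train, y_train)
    print("test accuracy:", clf.score(X_test, y_test))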
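
Note 7 uses the signal efficiency and the background rejection as ROC axes instead of the more common true-positive/false-positive-rate convention. A minimal sketch of the translation, with hypothetical classifier outputs:

    # Sketch of the ROC convention in note 7. 'scores' stands in for the
    # output of a trained classifier: signal peaks near 1, background near 0.
    import numpy as np
    from sklearn.metrics import roc_curve, auc

    rng = np.random.default_rng(0)
    scores = np.concatenate([rng.beta(5, 2, 5000), rng.beta(2, 5, 5000)])
    labels = np.concatenate([np.ones(5000), np.zeros(5000)])

    fpr, tpr, _ = roc_curve(labels, scores)
    signal_efficiency = tpr           # fraction of signal retained
    background_rejection = 1.0 - fpr  # fraction of background rejected
    print("area under the ROC curve:", auc(fpr, tpr))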
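
Notes 11-13 classify hyper-parameters as quantitative, qualitative, or insignificant. A random search in the sense of [12] samples all three kinds from one joint space; the parameter names and the score function below are hypothetical placeholders:

    # Schematic random search over a mixed hyper-parameter space (cf. [12]).
    # 'n_trees' and 'depth' are quantitative (intrinsic ordering),
    # 'separation_gain' is qualitative (no ordering), and 'random_seed'
    # illustrates an insignificant parameter in the sense of note 13.
    import random

    def score(params):
        # Placeholder for a full train-and-validate cycle.
        return random.random()

    space = {
        "n_trees": lambda: random.randrange(50, 500),
        "depth": lambda: random.randrange(2, 6),
        "separation_gain": lambda: random.choice(["gini", "entropy"]),
        "random_seed": lambda: random.randrange(10**6),
    }

    best = max(({name: draw() for name, draw in space.items()}
                for _ in range(100)), key=score)
    print("best configuration found:", best)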

References

  1. C.M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics) (Springer-Verlag, New York, 2006). ISBN: 0387310738

  2. T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning (Springer-Verlag, New York, 2001). ISBN: 978-0-387-84858-7

  3. V. Vapnik, Principles of risk minimization for learning theory, in NIPS (1991)

  4. L. Rosasco et al., Are loss functions all the same? Neural Comput. 16(5), 1063–1076 (2004). https://doi.org/10.1162/089976604773135104

  5. P. McCullagh, J.A. Nelder, Generalized Linear Models, 2nd edn. (Chapman & Hall, 1989). ISBN: 9780412317606

  6. E. Parzen, On estimation of a probability density function and mode. Ann. Math. Stat. 33(3), 1065–1076 (1962). https://doi.org/10.1214/aoms/1177704472

  7. R.A. Rigby, D.M. Stasinopoulos, Generalized additive models for location, scale and shape. J. R. Stat. Soc. Ser. C (Appl. Stat.) 54(3), 507–554 (2005). https://doi.org/10.1111/j.1467-9876.2005.00510.x

  8. R.W. Koenker, G. Bassett, Regression quantiles. Econometrica 46(1), 33–50 (1978)

  9. J. Neyman, E.S. Pearson, On the problem of the most efficient tests of statistical hypotheses. Philos. Trans. R. Soc. Lond. Ser. A 231, 289–337 (1933). https://doi.org/10.1098/rsta.1933.0009

  10. R.A. Fisher, The use of multiple measurements in taxonomic problems. Ann. Eugen. 7(7), 179–188 (1936). https://doi.org/10.1111/j.1469-1809.1936.tb02137.x

  11. J.H. Friedman, Stochastic gradient boosting. Comput. Stat. Data Anal. 38(4), 367–378 (2002). https://doi.org/10.1016/S0167-9473(01)00065-2

  12. J. Bergstra, Y. Bengio, Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281–305 (2012)

  13. D. Maclaurin, D. Duvenaud, R. Adams, Gradient-based hyperparameter optimization through reversible learning, in Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pp. 2113–2122 (2015). http://jmlr.org/proceedings/papers/v37/maclaurin15.pdf

  14. J. Snoek, H. Larochelle, R.P. Adams, Practical Bayesian optimization of machine learning algorithms. Adv. Neural Inf. Process. Syst. 25, 2951–2959 (2012). arXiv:1206.2944 [stat.ML]

  15. O. Behnke, K. Kroeninger, T. Schoerner-Sadenius, G. Schott, Data Analysis in High Energy Physics (Wiley-VCH, 2013). ISBN: 9783527410583

  16. K. Cranmer, I. Yavin, RECAST: extending the impact of existing analyses. JHEP 04, 038 (2011). https://doi.org/10.1007/JHEP04(2011)038

  17. M. Feindt et al., A hierarchical NeuroBayes-based algorithm for full reconstruction of B mesons at B factories. Nucl. Instrum. Methods A 654, 432–440 (2011). https://doi.org/10.1016/j.nima.2011.06.008

  18. K. Hornik, Approximation capabilities of multilayer feedforward networks. Neural Netw. 4(2), 251–257 (1991). https://doi.org/10.1016/0893-6080(91)90009-T

  19. H.W. Lin, M. Tegmark, D. Rolnick, Why does deep and cheap learning work so well? J. Stat. Phys. (2017). https://doi.org/10.1007/s10955-017-1836-5

  20. P. Baldi, P. Sadowski, D. Whiteson, Searching for exotic particles in high-energy physics with deep learning. Nat. Commun. 5, 4308 (2014). https://doi.org/10.1038/ncomms5308

  21. Y. LeCun, Y. Bengio, G. Hinton, Deep learning. Nature 521, 436–444 (2015). https://doi.org/10.1038/nature14539

  22. I. Goodfellow et al., Generative adversarial nets, in Advances in Neural Information Processing Systems 27 (Curran Associates, Inc., 2014), pp. 2672–2680. http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf

  23. G. Louppe, M. Kagan, K. Cranmer, Learning to pivot with adversarial networks, in NIPS (2016). arXiv:1611.01046 [stat.ME]

  24. Y. Bengio, A. Courville, P. Vincent, Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013). https://doi.org/10.1109/TPAMI.2013.50

  25. O. Vinyals, A. Toshev, S. Bengio, D. Erhan, Show and tell: lessons learned from the 2015 MSCOCO image captioning challenge. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 652–663 (2017). https://doi.org/10.1109/TPAMI.2016.2587640

  26. M. Pivk, F.R. Le Diberder, sPlot: a statistical tool to unfold data distributions. Nucl. Instrum. Methods A 555, 356–369 (2005). https://doi.org/10.1016/j.nima.2005.08.106

  27. D. Martschei, M. Feindt, S. Honc, J. Wagner-Kuhr, Advanced event reweighting using multivariate analysis, in Proceedings, 14th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2011). J. Phys. Conf. Ser. 368, 012028 (2012). https://doi.org/10.1088/1742-6596/368/1/012028

  28. T. Keck, FastBDT: a speed-optimized multivariate classification algorithm for the Belle II experiment. Comput. Softw. Big Sci. 1(1) (2017). https://doi.org/10.1007/s41781-017-0002-8

  29. T. Keck, FastBDT source code, https://github.com/thomaskeck/FastBDT. Accessed 02 October 2017

  30. J. Therhaag et al., TMVA - Toolkit for multivariate data analysis. AIP Conf. Proc. 1504(1), 1013–1016 (2012). https://doi.org/10.1063/1.4771869

  31. S. Nissen, Implementation of a fast artificial neural network library (FANN). Technical report, Department of Computer Science, University of Copenhagen (DIKU) (2003). http://fann.sf.net

  32. M. Feindt, U. Kerzel, The NeuroBayes neural network package. Nucl. Instrum. Methods A 559, 190–194 (2006). https://doi.org/10.1016/j.nima.2005.11.166

  33. F. Pedregosa et al., Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)

  34. T. Chen, C. Guestrin, XGBoost: a scalable tree boosting system, in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794 (2016). https://doi.org/10.1145/2939672.2939785

  35. M. Abadi et al., TensorFlow: a system for large-scale machine learning, in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265–283 (2016). https://www.usenix.org/system/files/conference/osdi16/osdi16-abadi.pdf

  36. F. Chollet et al., Keras (2015). https://github.com/fchollet/keras

  37. I.J. Goodfellow et al., Pylearn2: a machine learning research library (2013). arXiv:1308.4214 [stat.ML]

  38. R. Al-Rfou et al., Theano: a Python framework for fast computation of mathematical expressions (2016). arXiv:1605.02688 [cs.SC]

  39. C. Patrignani et al., Review of particle physics. Chin. Phys. C 40(10), 100001 (2016). https://doi.org/10.1088/1674-1137/40/10/100001

  40. J.A. Hanley, B.J. McNeil, The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143(1), 29–36 (1982). https://doi.org/10.1148/radiology.143.1.7063747

  41. M. Gelb, Neutral B meson flavor tagging for Belle II. MA thesis, KIT (2015). https://ekp-invenio.physik.uni-karlsruhe.de/record/48719

  42. J. Gemmler, Study of B meson flavor tagging with deep neural networks at Belle and Belle II. MA thesis, KIT (2016). https://ekp-invenio.physik.uni-karlsruhe.de/record/48849

  43. D.M. Asner, M. Athanas, D.W. Bliss et al., Search for exclusive charmless hadronic B decays. Phys. Rev. D 53, 1039–1050 (1996). https://doi.org/10.1103/PhysRevD.53.1039

  44. G.C. Fox, S. Wolfram, Observables for the analysis of event shapes in \(e^{+}e^{-}\) annihilation and other processes. Phys. Rev. Lett. 41, 1581–1585 (1978). https://doi.org/10.1103/PhysRevLett.41.1581

  45. A.J. Bevan et al., The physics of the B factories. Eur. Phys. J. C 74, 3026 (2014). https://doi.org/10.1140/epjc/s10052-014-3026-9

  46. D. Weyland, Continuum suppression with deep learning techniques for the Belle II experiment. MA thesis, KIT (2017). https://ekp-invenio.physik.uni-karlsruhe.de/record/48934

  47. A. Rogozhnikov et al., New approaches for boosting to uniformity. JINST 10(03), T03002 (2015). https://doi.org/10.1088/1748-0221/10/03/T03002

  48. M. Feindt, M. Prim, An algorithm for quantifying dependence in multivariate data sets. Nucl. Instrum. Methods A 698, 84–89 (2013). https://doi.org/10.1016/j.nima.2012.09.043

  49. J. Dolen et al., Thinking outside the ROCs: designing decorrelated taggers (DDT) for jet substructure. JHEP 05, 156 (2016). https://doi.org/10.1007/JHEP05(2016)156

  50. J. Stevens, M. Williams, uBoost: a boosting method for producing uniform selection efficiencies from multivariate classifiers. JINST 8, P12013 (2013). https://doi.org/10.1088/1748-0221/8/12/P12013

  51. B. Lipp, sPlot-based training of multivariate classifiers in the Belle II analysis software framework. BA thesis, KIT (2015). https://ekp-invenio.physik.uni-karlsruhe.de/record/48717


Author information


Correspondence to Thomas Keck.


Copyright information

© 2018 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Keck, T. (2018). Multivariate Analysis Algorithms. In: Machine Learning at the Belle II Experiment. Springer Theses. Springer, Cham. https://doi.org/10.1007/978-3-319-98249-6_3
