(Machine) learning to do more with less

Cohen, Timothy; Freytsis, Marat; Ostdiek, Bryan

doi:10.1007/JHEP02(2018)034

Regular Article - Experimental Physics
Open access
Published: 06 February 2018

Volume 2018, article number 34, (2018)

Journal of High Energy Physics

A preprint version of the article is available at arXiv.

Abstract

Determining the best method for training a machine learning algorithm is critical to maximizing its ability to classify data. In this paper, we compare the standard “fully supervised” approach (which relies on knowledge of event-by-event truth-level labels) with a recent proposal that instead utilizes class ratios as the only discriminating information provided during training. This so-called “weakly supervised” technique has access to less information than the fully supervised method and yet is still able to yield impressive discriminating power. In addition, weak supervision seems particularly well suited to particle physics since quantum mechanics is incompatible with the notion of mapping an individual event onto any single Feynman diagram. We examine the technique in detail — both analytically and numerically — with a focus on the robustness to issues of mischaracterizing the training samples. Weakly supervised networks turn out to be remarkably insensitive to a class of systematic mismodeling. Furthermore, we demonstrate that the event level outputs for weakly versus fully supervised networks are probing different kinematics, even though the numerical quality metrics are essentially identical. This implies that it should be possible to improve the overall classification ability by combining the output from the two types of networks. For concreteness, we apply this technology to a signature of beyond the Standard Model physics to demonstrate that all these impressive features continue to hold in a scenario of relevance to the LHC. Example code is provided on GitHub.
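To make the idea of training on class ratios concrete, here is a minimal sketch, not the authors' code. It assumes toy one-dimensional Gaussian events, a single logistic unit in place of a neural network, and a squared-difference loss between the mean prediction in each batch and that batch's known signal fraction. The function names (`make_batch`, `train_weak`), the fractions 0.4 and 0.7, and the loss form are all illustrative assumptions; the article's own example code is the authoritative reference.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_batch(n, signal_fraction):
    """Toy 1D events (assumption): signal ~ N(+1, 1), background ~ N(-1, 1)."""
    n_sig = int(round(n * signal_fraction))
    labels = np.concatenate([np.ones(n_sig), np.zeros(n - n_sig)])
    x = rng.normal(2.0 * labels - 1.0, 1.0)
    return x, labels


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_weak(batches, fractions, steps=2000, lr=0.5):
    """Fit p(x) = sigmoid(w*x + b) using only per-batch signal fractions.

    Loss (assumed form): sum over batches of (mean prediction - fraction)^2.
    No event-by-event labels enter this function.
    """
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w, grad_b = 0.0, 0.0
        for x, f in zip(batches, fractions):
            p = sigmoid(w * x + b)
            resid = p.mean() - f          # batch-level mismatch
            dp = p * (1.0 - p)            # derivative of the sigmoid
            grad_w += 2.0 * resid * np.mean(dp * x)
            grad_b += 2.0 * resid * np.mean(dp)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Two training batches with different, known signal fractions.
fractions = [0.4, 0.7]
batches = [make_batch(5000, f)[0] for f in fractions]  # truth labels discarded
w, b = train_weak(batches, fractions)

# Truth labels are used only here, to evaluate the trained classifier.
x_test, y_test = make_batch(4000, 0.5)
scores = sigmoid(w * x_test + b)
sig, bkg = scores[y_test == 1], scores[y_test == 0]
auc = (sig[:, None] > bkg[None, :]).mean()  # pairwise ranking estimate of AUC
print(f"w={w:.2f}, b={b:.2f}, AUC={auc:.3f}")
```

The two batches need different signal fractions: with identical fractions the batch-level constraint carries no information about which events are signal-like. In this toy setup the classifier is monotonic in the single feature, so a positive `w` is enough to rank events as well as a fully supervised fit would.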

Open Access

This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.

Author information

Authors and Affiliations

Institute of Theoretical Science, University of Oregon, Eugene, Oregon, 97403, U.S.A.
Timothy Cohen, Marat Freytsis & Bryan Ostdiek

Authors

Timothy Cohen
Marat Freytsis
Bryan Ostdiek

Corresponding author

Correspondence to Bryan Ostdiek.

Additional information

ArXiv ePrint: 1706.09451

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Cohen, T., Freytsis, M. & Ostdiek, B. (Machine) learning to do more with less. J. High Energ. Phys. 2018, 34 (2018). https://doi.org/10.1007/JHEP02(2018)034

Received: 15 August 2017
Revised: 04 January 2018
Accepted: 29 January 2018
Published: 06 February 2018
DOI: https://doi.org/10.1007/JHEP02(2018)034
