Parent Assignment Is Hard for the MDL, AIC, and NML Costs

  • Conference paper
Learning Theory (COLT 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4005)

Abstract

Several hardness results are presented for the parent assignment problem: given m observations of n attributes x_1, ..., x_n, find the best parents for x_n, that is, a subset of the preceding attributes that minimizes a fixed cost function. This attribute or feature selection task plays an important role, e.g., in structure learning in Bayesian networks, yet little is known about its computational complexity. In this paper we prove that, under the commonly adopted full-multinomial likelihood model, the MDL, BIC, or AIC cost cannot be approximated in polynomial time to a ratio less than 2 unless there exists a polynomial-time algorithm for determining whether a directed graph on n nodes has a dominating set of size log n, a LOGSNP-complete problem for which no polynomial-time algorithm is known. As we also show, it is unlikely that these penalized maximum likelihood costs can be approximated to within any constant ratio. For the NML (normalized maximum likelihood) cost we prove an NP-completeness result. These results both justify the application of existing methods and motivate research on heuristic and super-polynomial-time algorithms.
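
As a concrete illustration of the cost functions involved (not taken from the paper), the following Python sketch scores a candidate parent set for a target attribute with a BIC/MDL-style penalized maximum-likelihood cost under the full-multinomial model, and finds the best parent set by brute-force search. The function names, the AIC penalty convention, and the use of observed value counts for the parameter count are illustrative assumptions.

from itertools import combinations
from collections import Counter
from math import log

def penalized_cost(data, target, parents, penalty="BIC"):
    # data: list of m rows, each a tuple of discrete attribute values.
    # target: column index of x_n; parents: tuple of preceding column indices.
    m = len(data)
    r = len({row[target] for row in data})              # observed target values
    joint = Counter((tuple(row[p] for p in parents), row[target]) for row in data)
    marg = Counter(tuple(row[p] for p in parents) for row in data)

    # Maximized multinomial log-likelihood: sum over parent configurations s
    # and target values v of N(s, v) * log(N(s, v) / N(s)).
    loglik = sum(n * log(n / marg[s]) for (s, _v), n in joint.items())

    # Free parameters: (r - 1) per parent configuration; here q is taken as
    # the product of the observed parent cardinalities (an assumption).
    q = 1
    for p in parents:
        q *= len({row[p] for row in data})
    k = (r - 1) * q

    pen = 0.5 * k * log(m) if penalty in ("BIC", "MDL") else k   # AIC-style otherwise
    return -loglik + pen

def best_parents(data, target, penalty="BIC"):
    # Exhaustive search over all subsets of the preceding attributes; this is
    # exponential in the number of candidates, in line with the paper's message
    # that efficient (approximation) algorithms are unlikely to exist.
    candidates = list(range(target))
    subsets = (s for size in range(len(candidates) + 1)
               for s in combinations(candidates, size))
    return min(subsets, key=lambda s: penalized_cost(data, target, s, penalty))

For instance, best_parents([(0, 1, 1), (1, 0, 1), (1, 1, 0)], target=2) evaluates all four subsets of the two preceding attributes and returns the one with the lowest penalized cost.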

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Koivisto, M. (2006). Parent Assignment Is Hard for the MDL, AIC, and NML Costs. In: Lugosi, G., Simon, H.U. (eds) Learning Theory. COLT 2006. Lecture Notes in Computer Science, vol 4005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11776420_23

  • DOI: https://doi.org/10.1007/11776420_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-35294-5

  • Online ISBN: 978-3-540-35296-9

  • eBook Packages: Computer Science (R0)
