Abstract
We present a novel approach for constructing a tree-structured belief network in which the "nodes" are subsets of the variables of a dataset. We call this model the Large Node Chow-Liu Tree (LNCLT). The technique uses the concept of association rules from the database literature to guide the construction of the LNCLT. Like the Chow-Liu Tree (CLT), the LNCLT is well suited to density estimation and classification. More importantly, our model partially overcomes a key disadvantage of the CLT, namely its inability to represent non-tree structures, and we show theoretically that it is superior to the CLT. Moreover, we conduct a series of digit recognition experiments on the MNIST hand-printed digit database to verify our approach. The results show that, compared with the CLT, the LNCLT structure improves both the approximation accuracy and the recognition rate.
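The LNCLT builds on the classical Chow-Liu construction, which fits the maximum-weight spanning tree over the pairwise mutual information of the variables, and then departs from it by grouping strongly associated variables (as suggested by mined association rules) into large nodes. The following is a minimal sketch of the classical CLT step only, under assumed conventions: discrete data stored column-wise in a NumPy array, empirical mutual information, and a Prim-style spanning tree. The function names and data layout are illustrative, not the authors' implementation.

```python
# Minimal sketch of classical Chow-Liu tree construction (the baseline the
# LNCLT extends). Assumes discrete data: rows = samples, columns = variables.
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete columns."""
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            p_ab = np.mean((x == a) & (y == b))
            if p_ab > 0:
                p_a = np.mean(x == a)
                p_b = np.mean(y == b)
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def chow_liu_tree(data):
    """Return the edges of a maximum-weight spanning tree over pairwise MI."""
    d = data.shape[1]
    # Pairwise mutual-information matrix.
    mi = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            mi[i, j] = mi[j, i] = mutual_information(data[:, i], data[:, j])
    # Prim's algorithm, always adding the highest-MI edge that grows the tree.
    in_tree = {0}
    edges = []
    while len(in_tree) < d:
        best = max(((i, j) for i in in_tree for j in range(d) if j not in in_tree),
                   key=lambda e: mi[e])
        edges.append(best)
        in_tree.add(best[1])
    return edges

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.integers(0, 2, 500)
    x1 = (x0 ^ (rng.random(500) < 0.1)).astype(int)  # strongly dependent on x0
    x2 = rng.integers(0, 2, 500)                      # independent noise
    print(chow_liu_tree(np.column_stack([x0, x1, x2])))
```

In the LNCLT, variables that co-occur in high-confidence association rules would instead be merged into a single "large node" before (or while) the spanning tree is built, so that some non-tree dependencies among the original variables can still be captured.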
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this chapter
Huang, K., King, I., Lyu, M.R., Yang, H. (2004). Improving Chow-Liu Tree Performance by Mining Association Rules. In: Rajapakse, J.C., Wang, L. (eds) Neural Information Processing: Research and Development. Studies in Fuzziness and Soft Computing, vol 152. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39935-3_6
DOI: https://doi.org/10.1007/978-3-540-39935-3_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-53564-2
Online ISBN: 978-3-540-39935-3