Abstract
To mine significant dependencies among predictive attributes, much work has been carried out to learn Bayesian network classifiers (BNC\(_\mathcal {T}\)s) from a labeled training data set \(\mathcal {T}\). However, if BNC\(_\mathcal {T}\) does not capture the “right” dependencies, those most relevant to an unlabeled testing instance, classification performance degrades. To address this issue, we propose a novel framework, called target learning, that takes each unlabeled testing instance as a target and builds an “unstable” Bayesian model BNC\(_\mathcal {P}\) for it. To make BNC\(_\mathcal {P}\) and BNC\(_\mathcal {T}\) complementary to each other and work efficiently in combination, the same learning strategy is applied to build both. Experimental comparison on 32 large data sets from the UCI machine learning repository shows that, for BNCs with different degrees of dependence, target learning always helps improve the generalization performance with minimal additional computation.
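As a rough illustration of how two complementary classifiers might "work in combination", the following minimal Python sketch averages the class-posterior estimates of a model learned from the training set and a model built for one testing instance, then predicts the most probable class. The abstract does not state the combination rule, so the uniform average used here is an assumption, and the names `combine_posteriors` and `predict` are hypothetical.

```python
from typing import List, Sequence


def combine_posteriors(p_t: Sequence[float], p_p: Sequence[float]) -> List[float]:
    """Average two class-posterior vectors and renormalize.

    p_t: posterior estimates from the model learned on the training set
         (playing the role of BNC_T).
    p_p: posterior estimates from the model built for this testing instance
         (playing the role of BNC_P).
    ASSUMPTION: a uniform average; the abstract does not specify the rule.
    """
    if len(p_t) != len(p_p):
        raise ValueError("posterior vectors must have the same length")
    averaged = [(a + b) / 2.0 for a, b in zip(p_t, p_p)]
    total = sum(averaged)
    if total <= 0:
        raise ValueError("posterior estimates must have positive total mass")
    return [v / total for v in averaged]


def predict(p_t: Sequence[float], p_p: Sequence[float]) -> int:
    """Return the index of the class with the highest combined posterior."""
    combined = combine_posteriors(p_t, p_p)
    return max(range(len(combined)), key=combined.__getitem__)


if __name__ == "__main__":
    # Illustrative numbers only: the two models disagree on a two-class problem.
    print(combine_posteriors([0.6, 0.4], [0.2, 0.8]))
    print(predict([0.6, 0.4], [0.2, 0.8]))
```

In this toy case the instance-specific estimate outweighs the training-set estimate, so the combined prediction follows it; other weightings are equally plausible under the abstract's description.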
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Wang, L., Chen, S., Mammadov, M. (2018). Target Learning: A Novel Framework to Mine Significant Dependencies for Unlabeled Data. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science(), vol 10937. Springer, Cham. https://doi.org/10.1007/978-3-319-93034-3_9
DOI: https://doi.org/10.1007/978-3-319-93034-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93033-6
Online ISBN: 978-3-319-93034-3
eBook Packages: Computer Science, Computer Science (R0)