Abstract
Estimating joint probabilities plays an important role in many data mining and machine learning tasks. In this paper we introduce two methods, minAB and prodAB, to estimate joint probabilities. Both methods are based on a lightweight structure, partition support. The core idea is to maintain the partition support of itemsets over logically disjoint partitions and then use it to estimate the joint probabilities of itemsets of higher cardinalities. We present extensive mathematical analyses of both methods and compare their performance on synthetic datasets. We also present a case study that uses the estimation methods in the Apriori algorithm for fast association mining. Moreover, we explore the usefulness of the estimation methods in other mining and learning tasks [9]. Experimental results show the effectiveness of the estimation methods.
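The abstract names the two estimators but does not define them, so the following is only a minimal sketch of how partition support might be used, not the authors' definitions. It assumes, from the method names alone, that minAB sums the per-partition minimum of the two item counts and that prodAB sums the per-partition product of the counts divided by the partition size (that is, it treats the items as independent within each partition). The function names and the toy transactions are hypothetical.

```python
from typing import List, Set

Transaction = Set[str]


def partition_support(partitions: List[List[Transaction]], item: str) -> List[int]:
    """Count, per partition, how many transactions contain `item`."""
    return [sum(1 for t in part if item in t) for part in partitions]


def min_ab(sup_a: List[int], sup_b: List[int], total: int) -> float:
    """ASSUMED minAB: sum of per-partition minima, as a fraction of all transactions."""
    return sum(min(a, b) for a, b in zip(sup_a, sup_b)) / total


def prod_ab(sup_a: List[int], sup_b: List[int], sizes: List[int], total: int) -> float:
    """ASSUMED prodAB: per-partition independence estimate a*b/n, summed and normalized."""
    return sum(a * b / n for a, b, n in zip(sup_a, sup_b, sizes) if n > 0) / total


# Hypothetical toy data: two disjoint partitions of four transactions each.
partitions = [
    [{"a", "b"}, {"a"}, {"b"}, {"c"}],
    [{"a", "b"}, {"a", "b"}, {"c"}, {"c"}],
]
sizes = [len(p) for p in partitions]
total = sum(sizes)

sup_a = partition_support(partitions, "a")
sup_b = partition_support(partitions, "b")

est_min = min_ab(sup_a, sup_b, total)
est_prod = prod_ab(sup_a, sup_b, sizes, total)

# Exact joint probability, computed by direct counting for comparison.
true_ab = sum(1 for p in partitions for t in p if {"a", "b"} <= t) / total
```

Under these assumptions, only the per-partition counts of single items are needed to estimate the joint probability of the pair, and the minimum-based estimate can never fall below the exact value because the joint count in a partition is at most the smaller of the two item counts.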
This project is supported in part by NIH Grants 5-P41-RR09283, RO1-AG18231, and P30-AG18254 and by NSF Grants EIA-0080124, CCR-9701911, and DUE-9980943. We would also like to thank Dr. Meng Xiang Tang and Xianghui Liu for their helpful discussions.
References
R. Agrawal, T. Imielinski, and A. Swami. Mining associations between sets of items in massive databases. In Proc. of ACM SIGMOD, 1993.
R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. In VLDB, 1994.
S. Brin, R. Motwani, and C. Silverstein. Beyond market baskets: generalizing association rules to correlations. In Proc. of ACM SIGMOD, 1997.
Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2001.
C. Hidber. Online association rule mining. In Proc. of ACM SIGMOD, 1999.
J. Hipp, U. Güntzer, and G. Nakhaeizadeh. Algorithms for association rule mining: a general survey and comparison. SIGKDD Explorations, 2(1):58–63, 2000.
M. Kamber, J. Han, and J. Y. Chiang. Metarules-guided mining of multidimensional association rules using data cubes. In Proc. of ACM SIGKDD, 1997.
L. V. S. Lakshmanan, C. K. S. Leung, and R. T. Ng. The segment support map: Scalable mining of frequent itemsets. SIGKDD Explorations, 2(2), December 2000.
Tao Li, Shenghuo Zhu, Mitsunori Ogihara, and Yinhe Cheng. Estimating joint probabilities without combinatory counting. Technical Report TR-764, Computer Science Department, University of Rochester, 2002.
C. Silverstein, S. Brin, R. Motwani, and J. Ullman. Scalable techniques for mining causal structures. In Proc. of VLDB, 1998.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, T., Zhu, S., Ogihara, M., Cheng, Y. (2002). Estimating Joint Probabilities from Marginal Ones. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2002. Lecture Notes in Computer Science, vol 2454. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46145-0_4
DOI: https://doi.org/10.1007/3-540-46145-0_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44123-6
Online ISBN: 978-3-540-46145-6
eBook Packages: Springer Book Archive