Abstract
In this paper, we first introduce frequent few-overlapped monotone DNF formulas under the minimum support σ, the minimum term support τ, and the maximum overlap λ. A monotone DNF formula is frequent if its support is greater than σ and the support of each of its terms (itemsets) is greater than τ; it is few-overlapped if its overlap is less than λ and λ < τ. We then design the algorithm ffo_dnf to extract such formulas. The algorithm first enumerates all maximal frequent itemsets under τ and then connects the extracted itemsets by disjunction ∨ until σ and λ are satisfied. The first step, called depth-first pruning, follows from the property that every pair of itemsets in a few-overlapped monotone DNF formula is incomparable under the subset relation. Furthermore, we show that the formulas extracted by ffo_dnf are representative. Finally, we apply ffo_dnf to bacterial culture data.
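The two steps described in the abstract can be illustrated with a small Python sketch. It is not the authors' ffo_dnf. Step one is replaced by a brute-force enumeration rather than depth-first pruning, and step two by a simple greedy pass. The abstract does not define the overlap measure, so the sketch assumes it is the largest fraction of transactions covered by any two terms at once. All function names and the toy data are hypothetical.

```python
from itertools import combinations


def cover(itemset, transactions):
    """Indices of the transactions that contain every item of `itemset`."""
    return {i for i, t in enumerate(transactions) if itemset <= t}


def maximal_frequent_itemsets(transactions, tau):
    """Brute-force stand-in for step one: itemsets with support > tau that
    have no frequent proper superset. (The paper uses depth-first pruning.)"""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    frequent = [
        frozenset(combo)
        for k in range(1, len(items) + 1)
        for combo in combinations(items, k)
        if len(cover(frozenset(combo), transactions)) / n > tau
    ]
    return [s for s in frequent if not any(s < other for other in frequent)]


def support(terms, transactions):
    """Fraction of transactions satisfied by at least one term of the DNF."""
    covered = set().union(*(cover(t, transactions) for t in terms))
    return len(covered) / len(transactions)


def overlap(terms, transactions):
    """ASSUMPTION: largest fraction of transactions shared by two terms."""
    covers = [cover(t, transactions) for t in terms]
    n = len(transactions)
    return max((len(a & b) / n for a, b in combinations(covers, 2)), default=0.0)


def greedy_dnf(transactions, sigma, tau, lam):
    """Greedy stand-in for step two: add maximal itemsets as disjuncts while
    the overlap stays below lam, stopping once the support exceeds sigma."""
    candidates = sorted(
        maximal_frequent_itemsets(transactions, tau),
        key=lambda s: (-len(cover(s, transactions)), sorted(s)),
    )
    terms = []
    for c in candidates:
        if overlap(terms + [c], transactions) < lam:
            terms.append(c)
        if support(terms, transactions) > sigma:
            return terms
    return None  # no formula found by this greedy pass


# Hypothetical toy data: eight transactions over the items a..e.
transactions = [
    {"a", "b"}, {"a", "b"}, {"a", "b"},
    {"c", "d"}, {"c", "d"}, {"c", "d"},
    {"a", "c"}, {"e"},
]
formula = greedy_dnf(transactions, sigma=0.7, tau=0.3, lam=0.2)
```

On this toy data the maximal frequent itemsets under τ = 0.3 are {a, b} and {c, d}. Their covers are disjoint, so their disjunction meets the overlap bound and covers six of the eight transactions. Under the assumed overlap measure the incomparability property is easy to see: if one term were a subset of another, every transaction covered by the larger term would also be covered by the smaller one, so the overlap would be at least the larger term's support, which exceeds τ and therefore λ.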
This work is partially supported by Grant-in-Aid for Scientific Research 15700137 and 16016275 from the Ministry of Education, Culture, Sports, Science and Technology, Japan.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shima, Y., Hirata, K., Harao, M. (2005). Extraction of Frequent Few-Overlapped Monotone DNF Formulas with Depth-First Pruning. In: Ho, T.B., Cheung, D., Liu, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2005. Lecture Notes in Computer Science, vol 3518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11430919_8
DOI: https://doi.org/10.1007/11430919_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26076-9
Online ISBN: 978-3-540-31935-1
eBook Packages: Computer Science, Computer Science (R0)