Abstract
Frequent itemset mining is often regarded as advanced querying, in which a user specifies the source dataset and the pattern constraints using a given constraint model. Recently, a new problem has been considered: optimizing the processing of batches of frequent itemset queries. The best technique proposed so far for this problem is Common Counting, which processes frequent itemset queries concurrently and integrates their database scans. Common Counting requires the data structures of several queries to be held in main memory at the same time. Since memory is limited in practice, the crucial problem is scheduling the queries into Common Counting phases so that the I/O cost is minimized. According to our previous studies, the best algorithm for this task that is applicable to large batches of queries is CCAgglomerative. In this paper we present a novel query scheduling method, CCAgglomerativeNoise, which is built around CCAgglomerative and increases its chances of finding an optimal solution.
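The idea of integrating database scans can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `common_counting_scan`, the representation of a query as a selection predicate plus a list of candidate itemsets, and the toy data are all assumptions made for illustration. It shows only that, when the candidates of several queries are held in memory together, one pass over the data can update the support counts for all of them.

```python
def common_counting_scan(transactions, queries):
    """Count candidate supports for several queries in ONE pass over the data.

    transactions: iterable of (tid, frozenset_of_items) pairs.
    queries: dict mapping a query name to (selects, candidates), where
             selects(tid) says whether the transaction belongs to the
             query's source dataset and candidates is a list of frozensets.
    This layout is a hypothetical simplification, not taken from the paper.
    """
    # All queries' candidate counters live in memory at the same time.
    counts = {
        name: {cand: 0 for cand in candidates}
        for name, (_, candidates) in queries.items()
    }
    for tid, items in transactions:  # the single shared scan
        for name, (selects, candidates) in queries.items():
            if not selects(tid):
                continue  # transaction lies outside this query's dataset
            for cand in candidates:
                if cand <= items:  # candidate is contained in the transaction
                    counts[name][cand] += 1
    return counts


# Toy data: two queries whose source datasets overlap on transaction 2.
transactions = [
    (1, frozenset("ab")),
    (2, frozenset("abc")),
    (3, frozenset("bc")),
]
queries = {
    "q1": (lambda tid: tid <= 2, [frozenset("ab")]),
    "q2": (lambda tid: tid >= 2, [frozenset("bc")]),
}
result = common_counting_scan(transactions, queries)
```

In this sketch the memory cost grows with the number of queries processed together. That is why, as the abstract notes, a large batch has to be split into phases that each fit in memory.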
© 2006 Springer
About this paper
Cite this paper
Boinski, P., Jozwiak, K., Wojciechowski, M., Zakrzewicz, M. (2006). Improving Quality of Agglomerative Scheduling in Concurrent Processing of Frequent Itemset Queries. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol 35. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-33521-8_23
DOI: https://doi.org/10.1007/3-540-33521-8_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33520-7
Online ISBN: 978-3-540-33521-4
eBook Packages: Engineering, Engineering (R0)