Improving Quality of Agglomerative Scheduling in Concurrent Processing of Frequent Itemset Queries

Boinski, Pawel; Jozwiak, Konrad; Wojciechowski, Marek; Zakrzewicz, Maciej

doi:10.1007/3-540-33521-8_23

Part of the book series: Advances in Soft Computing (AINSC, volume 35)

Abstract

Frequent itemset mining is often regarded as advanced querying in which a user specifies the source dataset and pattern constraints using a given constraint model. Recently, a new problem of optimizing the processing of batches of frequent itemset queries has been considered. The best technique proposed so far for this problem is Common Counting, which processes frequent itemset queries concurrently and integrates their database scans. Common Counting requires that the data structures of several queries be stored in main memory at the same time. Since memory is limited in practice, the crucial problem is scheduling the queries to Common Counting phases so that the I/O cost is minimized. According to our previous studies, the best algorithm for this task that is applicable to large batches of queries is CCAgglomerative. In this paper we present a novel query scheduling method, CCAgglomerativeNoise, built around CCAgglomerative, which increases its chances of finding an optimal solution.
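The scheduling problem named in the abstract can be illustrated with a small sketch. The abstract does not give the details of CCAgglomerative or CCAgglomerativeNoise, so the code below is not the authors' algorithm. It is a generic greedy agglomerative heuristic under assumptions made up for illustration: each query reads a set of disjoint data partitions, a phase reads every partition needed by any of its queries exactly once, and each query has a fixed memory footprint. The names `scan_cost`, `schedule_cost`, `greedy_agglomerative`, and all example data are hypothetical.

```python
from itertools import combinations


def scan_cost(phase, query_parts, part_size):
    """I/O cost of one phase: every partition needed by any query is read once."""
    parts = set().union(*(query_parts[q] for q in phase))
    return sum(part_size[p] for p in parts)


def schedule_cost(phases, query_parts, part_size):
    """Total I/O cost of a schedule, i.e. the sum of its phase costs."""
    return sum(scan_cost(ph, query_parts, part_size) for ph in phases)


def greedy_agglomerative(query_parts, part_size, query_mem, mem_limit):
    """Start with one query per phase and repeatedly merge the pair of phases
    that saves the most I/O, as long as the merged phase fits in memory.
    Illustrative heuristic only (assumption), not the paper's CCAgglomerative."""
    phases = [frozenset([q]) for q in sorted(query_parts)]
    while True:
        best = None
        for a, b in combinations(phases, 2):
            merged = a | b
            if sum(query_mem[q] for q in merged) > mem_limit:
                continue  # merged phase would not fit in main memory
            gain = (scan_cost(a, query_parts, part_size)
                    + scan_cost(b, query_parts, part_size)
                    - scan_cost(merged, query_parts, part_size))
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, a, b)
        if best is None:
            return phases  # no feasible merge saves any I/O
        _, a, b = best
        phases = [p for p in phases if p not in (a, b)] + [a | b]


# Hypothetical example: three queries over three partitions.
part_size = {"p1": 100, "p2": 50, "p3": 80}
query_parts = {"q1": {"p1", "p2"}, "q2": {"p2", "p3"}, "q3": {"p1"}}
query_mem = {"q1": 1, "q2": 1, "q3": 1}

phases = greedy_agglomerative(query_parts, part_size, query_mem, mem_limit=2)
total = schedule_cost(phases, query_parts, part_size)
```

In this example, running each query in its own phase would read 380 units of data, while the greedy schedule groups q1 with q3 and reads 280. A greedy merge order can miss the optimum, which is the kind of weakness a randomized variant would try to address.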

Author information

Authors and Affiliations

Poznan University of Technology, ul. Piotrowo 2, Poznan, Poland
Pawel Boinski, Konrad Jozwiak, Marek Wojciechowski & Maciej Zakrzewicz

Editor information

Editors and Affiliations

Polish Academy of Sciences, Institute of Computer Science, ul. Ordona 21, 01-237, Warszawa, Poland
Mieczysław A. Kłopotek, Sławomir T. Wierzchoń & Krzysztof Trojanowski

About this paper

Cite this paper

Boinski, P., Jozwiak, K., Wojciechowski, M., Zakrzewicz, M. (2006). Improving Quality of Agglomerative Scheduling in Concurrent Processing of Frequent Itemset Queries. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol 35. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-33521-8_23

DOI: https://doi.org/10.1007/3-540-33521-8_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33520-7
Online ISBN: 978-3-540-33521-4
eBook Packages: Engineering, Engineering (R0)