Minimizing Space Time Complexity by RSTDB a New Method for Frequent Pattern Mining
Data mining is the extraction of meaningful patterns from large data sources. Association Rule Mining (ARM) is an important data mining technique, and mining frequent patterns is one of its central problems. The earlier Apriori approach suffers from its candidate-generation-and-test mechanism: it becomes inefficient when either the length of the frequent set or the size of the Transaction Database (TDB) increases. Apriori adopts a bottom-up, breadth-first approach to mining. In this research work, we propose a Reduced Scanning Transaction Database (RSTDB) algorithm that uses a heuristic function to reduce the number of Transaction Database passes required to generate the maximum frequent set needed for ARM. The approach is a hybrid of the bottom-up and top-down approaches, and it uses both the upward and downward closure properties to evaluate frequent item sets.
In this work we compare the Apriori approach with the proposed approach for frequent pattern mining. We evaluate the shortcomings of the proposed approach and examine how efficient it is and in which cases. The RSTDB algorithm not only reduces the number of database scans but also helps reduce candidate generation for a phase whose value is less than the minimum support threshold.
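For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of the standard level-wise Apriori procedure (candidate generation, downward-closure pruning, and one full database pass per level), not of RSTDB, whose heuristic function is not specified here. The function name `apriori`, the `passes` counter, and the toy transactions are assumptions made for this example only.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise Apriori sketch.

    Returns (frequent itemsets with support counts, number of database passes).
    min_support is an absolute count, not a fraction.
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count every single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    passes = 1

    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        prev = set(frequent)
        # Candidate generation: join frequent (k-1)-itemsets, then prune any
        # candidate with an infrequent (k-1)-subset (downward closure).
        candidates = set()
        for a in prev:
            for b in prev:
                union = a | b
                if len(union) == k and all(
                    frozenset(sub) in prev for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)
        if not candidates:
            break

        # Test step: one full pass over the database per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        passes += 1

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result, passes


# Toy database (hypothetical data, for illustration only).
demo = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent, passes = apriori(demo, min_support=3)
```

On this toy database the sketch makes three passes: the third pass is spent counting the candidate {a, b, c}, which turns out to be infrequent. Each additional level costs one more full scan, which is the cost that grows with the length of the frequent set and the size of the database.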
Keywords: Heuristic Function, Apriori Algorithm, Transaction Database, Frequent Pattern Mining, Minimum Support Threshold