Minimizing Space Time Complexity by RSTDB a New Method for Frequent Pattern Mining

  • Vaibhav Kant Singh
  • Vinay Kumar Singh

Abstract

Data mining is the extraction of meaningful patterns from large sources of data. Association Rule Mining (ARM) is an important data mining technique, and mining frequent patterns is one of its central problems. The earlier Apriori approach suffers from its candidate-generation-and-test mechanism: it becomes inefficient when either the length of the frequent itemsets or the size of the Transaction Database (TDB) increases. Apriori adopts a bottom-up, breadth-first approach to mining. In this research work, we propose the Reduced Scanning Transaction Database (RSTDB) algorithm, which uses a heuristic function to reduce the number of passes over the Transaction Database required to generate the maximum frequent set needed for ARM. The approach is a hybrid of the bottom-up and top-down approaches, and it uses both the upward and downward closure properties to evaluate frequent itemsets.
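As a point of reference for the candidate-generation-and-test mechanism described above, the following is a minimal sketch of classical level-wise Apriori, not of RSTDB, whose heuristic is not specified in this abstract. The function name `apriori`, the toy database `tdb`, and the use of an absolute support count are assumptions made for illustration. The sketch counts one full database pass per level, which is the cost RSTDB aims to reduce.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Classical level-wise Apriori (illustrative sketch, not RSTDB).

    min_support is an absolute count (an assumption of this sketch).
    Returns (dict of frequent itemset -> support count, number of passes).
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    candidates = [frozenset([i]) for i in items]
    frequent = {}
    passes = 0
    k = 1
    while candidates:
        # Test step: one full pass over the database for this level.
        passes += 1
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Generation step: join frequent k-itemsets into (k+1)-candidates and
        # keep only those whose k-subsets are all frequent.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent, passes


# Hypothetical toy transaction database.
tdb = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent, passes = apriori(tdb, min_support=3)
print(passes, sorted("".join(sorted(s)) for s in frequent))
```

On this toy database every level costs a full scan, including the last one, whose only candidate turns out to be infrequent.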

In this work we compare the Apriori approach with the proposed approach for frequent pattern mining. We evaluate the shortcomings of the proposed approach and examine how efficient it is and in which cases. The RSTDB algorithm not only reduces the number of database scans but also helps reduce candidate generation for a phase whose value is below the minimum support threshold.
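The abstract names the upward and downward closure properties but does not describe how RSTDB applies them. The sketch below only illustrates the properties themselves: a candidate contained in a known frequent itemset must be frequent, and a candidate containing a known infrequent itemset must be infrequent, so neither needs a database scan. The function `classify_without_scan` and the example itemsets are hypothetical.

```python
def classify_without_scan(candidate, known_frequent, known_infrequent):
    """Decide a candidate's status from already-known itemsets, if possible.

    Returns "frequent", "infrequent", or None when a scan is still needed.
    Illustrative only; this is not the RSTDB heuristic.
    """
    candidate = frozenset(candidate)
    # Downward closure: every subset of a frequent itemset is frequent.
    if any(candidate <= f for f in known_frequent):
        return "frequent"
    # Upward closure: every superset of an infrequent itemset is infrequent.
    if any(i <= candidate for i in known_infrequent):
        return "infrequent"
    return None


# Hypothetical knowledge gathered from earlier passes.
known_frequent = [frozenset("abc")]
known_infrequent = [frozenset("d")]

results = {
    name: classify_without_scan(name, known_frequent, known_infrequent)
    for name in ("ab", "ad", "ae")
}
print(results)
```

Only candidates that return `None` would still have to be counted against the database.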

Keywords

  • Heuristic Function
  • Apriori Algorithm
  • Transaction Database
  • Frequent Pattern Mining
  • Minimum Support Threshold

Copyright information

© Indian Institute of Information Technology, India 2009

Authors and Affiliations

  • Vaibhav Kant Singh (1)
  • Vinay Kumar Singh (2)

  1. Department of Computer Science & Engineering, SATI, Vidisha, India
  2. Department of MCA, GGU, Bilaspur, India
