
PSMSP: A Parallelized Sampling-Based Approach for Mining Top-k Sequential Patterns in Database Graphs

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11448)


Abstract

We study how to improve the efficiency of finding top-k sequential patterns in database graphs, where each edge (or vertex) is associated with multiple transactions and each transaction consists of a set of items. The task is to discover subsequences of transaction sequences that appear frequently across many paths. We propose PSMSP, a Parallelized Sampling-based Approach for Mining Top-k Sequential Patterns, which involves: (a) a parallelized, unbiased sequence sampling approach, and (b) a novel PSP-Tree structure that mines the patterns efficiently by exploiting their anti-monotonicity properties. We validate our approach through extensive experiments on real-world datasets.
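To make the two ingredients named in the abstract concrete, the following is a minimal single-threaded sketch of sampling followed by top-k mining with anti-monotone pruning. It is not the authors' PSMSP algorithm or their PSP-Tree: it assumes the transaction sequences along paths have already been listed, it restricts patterns to one item per step, and the function names (contains, support, draw_sample, top_k_from_sample) are hypothetical.

```python
import random


def contains(sequence, pattern):
    """Return True if the items of `pattern` occur in order, each in a
    strictly later transaction of `sequence` than the previous one."""
    pos = 0
    for item in pattern:
        # Greedily advance to the earliest transaction holding this item.
        while pos < len(sequence) and item not in sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1
    return True


def support(sample, pattern):
    """Fraction of sampled sequences that contain `pattern`."""
    return sum(contains(seq, pattern) for seq in sample) / len(sample)


def draw_sample(sequences, n, seed=0):
    """Assumption: draw n sequences uniformly at random, with replacement."""
    return random.Random(seed).choices(sequences, k=n)


def top_k_from_sample(sample, k, max_len=3):
    """Level-wise search for the k patterns with the highest sample support."""
    items = sorted({item for seq in sample for txn in seq for item in txn})
    scored = {}
    frontier = [()]
    for _ in range(max_len):
        level = {}
        for prefix in frontier:
            for item in items:
                cand = prefix + (item,)
                s = support(sample, cand)
                if s > 0:
                    level[cand] = s
        scored.update(level)
        # Anti-monotonicity: extending a pattern never raises its support,
        # so a candidate below the current k-th best support cannot lead
        # to a top-k pattern and is not extended further.
        ranked = sorted(scored.values(), reverse=True)
        threshold = ranked[k - 1] if len(ranked) >= k else 0.0
        frontier = [p for p, s in level.items() if s >= threshold]
        if not frontier:
            break
    best = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))
    return best[:k]
```

Each sequence here is a list of transactions, and each transaction is a set of items. The pruning threshold only grows as more patterns are scored, which is why discarding a candidate early is safe in this sketch.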

This work has been supported in part by the National Key Research and Development Program of China (No. 2017YFB0803301), the Natural Science Foundation of China (No. U1836215), and DongGuan Innovative Research Team Program (No. 201636000100038).

Author information

Corresponding author

Correspondence to Xi Zhang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Lei, M., Zhang, X., Yang, J., Fang, B. (2019). PSMSP: A Parallelized Sampling-Based Approach for Mining Top-k Sequential Patterns in Database Graphs. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds) Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science, vol 11448. Springer, Cham. https://doi.org/10.1007/978-3-030-18590-9_23

  • DOI: https://doi.org/10.1007/978-3-030-18590-9_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-18589-3

  • Online ISBN: 978-3-030-18590-9

  • eBook Packages: Computer Science, Computer Science (R0)
