RCP Mining: Towards the Summarization of Spatial Co-location Patterns
Co-location pattern mining is an important task in spatial data mining. However, the traditional framework of co-location pattern mining produces an exponential number of patterns because of the downward closure property, which makes the results hard for users to understand or apply. To address this issue, we study the problem of mining representative co-location patterns (RCP). We first define a covering relationship between two co-location patterns by introducing a new measure that appropriately quantifies the distance between patterns in terms of their prevalence, based on which the problem of RCP mining is formally formulated. To solve the problem of RCP mining, we first propose an algorithm called RCPFast, which adopts the post-mining framework commonly used by existing distance-based pattern summarization techniques. To address the challenge peculiar to spatial data mining, we further propose another algorithm, RCPMS, which employs a mine-and-summarize framework that pushes pattern summarization into the co-location mining process. Optimization strategies are also designed to further improve the performance of RCPMS. Our experimental results on both synthetic and real-world data sets demonstrate that RCP mining effectively summarizes spatial co-location patterns, and that RCPMS is more efficient than RCPFast, especially on dense data sets.
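To make the notions of prevalence and downward closure concrete, the following is a minimal Python sketch. It assumes the participation index, a prevalence measure commonly used in the co-location literature; the abstract does not name the measure the paper uses, and the paper's distance measure and covering relationship are not reproduced here. The function name `participation_index`, the toy data `INSTANCES`, and the neighbor distance `RADIUS` are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product
from math import dist

# Toy spatial data (hypothetical): (feature type, (x, y)) for each instance.
INSTANCES = [
    ("A", (0, 0)), ("A", (5, 5)),
    ("B", (0, 1)), ("B", (5, 6)),
    ("C", (1, 0)), ("C", (9, 9)),
]
RADIUS = 1.5  # assumed neighbor distance threshold


def participation_index(pattern, instances, radius):
    """Brute-force participation index of a set of feature types.

    A row instance is one instance per feature such that all chosen
    instances are pairwise within `radius`. The index is the minimum,
    over the features, of the fraction of that feature's instances
    that take part in at least one row instance.
    """
    by_feature = {
        f: [i for i, (g, _) in enumerate(instances) if g == f] for f in pattern
    }
    if any(not ids for ids in by_feature.values()):
        return 0.0  # a feature with no instances cannot participate
    participating = {f: set() for f in pattern}
    for row in product(*(by_feature[f] for f in pattern)):
        is_clique = all(
            dist(instances[a][1], instances[b][1]) <= radius
            for a, b in combinations(row, 2)
        )
        if is_clique:
            for f, idx in zip(pattern, row):
                participating[f].add(idx)
    return min(len(participating[f]) / len(by_feature[f]) for f in pattern)


if __name__ == "__main__":
    for size in (2, 3):
        for pattern in combinations("ABC", size):
            print(pattern, participation_index(pattern, INSTANCES, RADIUS))
```

Under this assumed measure, the index of a pattern never exceeds that of any of its subsets. This is the downward closure property mentioned above: every subset of a prevalent pattern is itself prevalent, which is why the number of reported patterns can grow exponentially.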
Keywords: Approximation Strategy, Frequent Itemset, Compression Rate, Representative Pattern, Cover Relationship
We thank the anonymous reviewers for their detailed suggestions for improving the paper. This work was supported, in part, by the Australian Research Council (ARC) Discovery Project under Grant No. DP140100545.