Mining evolutions of complex spatial objects using a single-attributed Directed Acyclic Graph

Abstract

Directed acyclic graphs (DAGs) are used in many domains, from computer science and bioinformatics to industry and geoscience. They make it possible to model complex evolutions in which spatial objects (e.g., soil erosion) may move, appear, disappear, merge or split. We study a new graph-based representation, called attributed DAG (a-DAG), which captures both the interactions between objects and information on the objects themselves (e.g., characteristics or events). In this paper, we focus on pattern mining in such data. Our patterns, called weighted paths, offer a good trade-off between expressiveness and complexity. Frequency and compactness constraints are used to filter out uninteresting patterns. These constraints lead to an exact condensed representation (without loss of information) in the single-graph setting. To discover weighted paths efficiently, we propose a depth-first search strategy and an optimized data structure; the algorithm progressively extends patterns using database projections. Relevance, scalability and genericity are illustrated by qualitative and quantitative results obtained on various real and synthetic datasets. In particular, we show how such an approach can be used to monitor soil erosion using remote sensing and geographical information system (GIS) data.
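The depth-first, projection-based extension mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not define weighted paths or the compactness constraint, so the sketch assumes a much simpler pattern (a sequence of single attributes read along a directed path) and assumes that the support of a pattern is the number of distinct vertices from which at least one occurrence starts. The toy graph and the names ATTRS, EDGES and mine_paths are hypothetical.

```python
from collections import defaultdict

# Toy attributed DAG (hypothetical data): each vertex carries a set of attributes.
ATTRS = {1: {"a"}, 2: {"a", "b"}, 3: {"b"}, 4: {"c"}, 5: {"b"}, 6: {"c"}}
# Adjacency lists of the directed, acyclic edges.
EDGES = {1: [3], 2: [3, 5], 3: [4], 4: [], 5: [6], 6: []}


def mine_paths(attrs, edges, min_support):
    """Return {pattern: support} for every frequent attribute path.

    A pattern is a tuple of attributes. A projection is the set of
    (start vertex, end vertex) pairs of its occurrences in the graph.
    Support (an assumption of this sketch) counts distinct start vertices,
    so it can only shrink when a pattern is extended at its end.
    """
    results = {}

    def support(projection):
        return len({start for start, _ in projection})

    def grow(pattern, projection):
        results[pattern] = support(projection)
        # Project: follow one outgoing edge from the end of each occurrence.
        extensions = defaultdict(set)
        for start, end in projection:
            for nxt in edges.get(end, ()):
                for attr in attrs[nxt]:
                    extensions[attr].add((start, nxt))
        # Depth-first: recurse only into extensions that stay frequent.
        for attr in sorted(extensions):
            if support(extensions[attr]) >= min_support:
                grow(pattern + (attr,), extensions[attr])

    # Initial projections: every vertex is an occurrence of each of its attributes.
    initial = defaultdict(set)
    for vertex, labels in attrs.items():
        for attr in labels:
            initial[attr].add((vertex, vertex))
    for attr in sorted(initial):
        if support(initial[attr]) >= min_support:
            grow((attr,), initial[attr])
    return results


if __name__ == "__main__":
    for pattern, sup in mine_paths(ATTRS, EDGES, min_support=2).items():
        print(" -> ".join(pattern), sup)
```

Because the graph is acyclic, every occurrence has finite length and the recursion terminates. Counting start vertices rather than raw occurrences is what makes pruning on min_support safe in this sketch: an extended pattern can never start from more vertices than its prefix.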




Author information


Corresponding author

Correspondence to Frédéric Flouvat.

Additional information


This research was supported by the Project FOSTER ANR-2010-COSI-012-01 funded by the French Ministry of Higher Education and Research.


About this article


Cite this article

Flouvat, F., Selmaoui-Folcher, N., Sanhes, J. et al. Mining evolutions of complex spatial objects using a single-attributed Directed Acyclic Graph. Knowl Inf Syst (2020). https://doi.org/10.1007/s10115-020-01478-9


Keywords

  • Graph mining
  • Spatiotemporal data
  • Attributed DAG
  • Weighted path
  • Environmental monitoring