ITISS: an efficient framework for querying big temporal data

GeoInformatica

Abstract

In the real world, temporal data arises in many applications, and its volume is growing rapidly. Managing and querying big temporal data efficiently and effectively is both important and challenging, owing to the sheer data volume and the need for real-time responses. Processing such data with a distributed system is a natural choice, since a single-machine system has limited computing capacity. Nevertheless, existing distributed systems and methods either are disk-based or cannot support native queries, and so may not meet the demands of low latency and high throughput. To address these issues, this article proposes a new approach to handling big temporal data: an In-memory based Two-level Index Solution in Spark, dubbed ITISS. The proposed framework is easy to understand and implement, without sacrificing effectiveness or efficiency. Based on this framework, the article develops targeted algorithms for time travel, temporal aggregation, and temporal join queries, respectively. We have implemented the framework in Apache Spark, extended Apache Spark SQL with a declarative SQL interface that lets users express temporal queries in a few lines of SQL, and conducted extensive experiments to evaluate the performance of our solution. The experimental results, on both real and synthetic datasets, consistently demonstrate that the proposed solution is efficient and competitive for processing big temporal data.
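
To make the three query types named in the abstract concrete, the following is a minimal single-machine sketch of what each one could compute over interval-stamped records. It is not the ITISS algorithm, index, or SQL interface. The `Record` layout, the half-open interval convention, the choice of SUM as the aggregate, the key-equality join condition, and all function names are assumptions made here for illustration only.

```python
from collections import namedtuple

# Hypothetical record layout (an assumption, not taken from the article):
# a key, a numeric value, and a half-open validity interval [start, end).
Record = namedtuple("Record", ["key", "value", "start", "end"])


def time_travel(records, t):
    """Return the records whose validity interval contains instant t."""
    return [r for r in records if r.start <= t < r.end]


def temporal_sum(records, t):
    """Temporal aggregation (SUM assumed): add up the values alive at instant t."""
    return sum(r.value for r in time_travel(records, t))


def temporal_join(left, right):
    """Pair records that share a key and whose intervals overlap.

    Each result tuple also carries the overlapping sub-interval [lo, hi).
    This is a naive nested loop, used only to show the semantics.
    """
    out = []
    for a in left:
        for b in right:
            lo, hi = max(a.start, b.start), min(a.end, b.end)
            if a.key == b.key and lo < hi:
                out.append((a.key, a.value, b.value, lo, hi))
    return out


# Toy data, invented for this example.
prices = [Record("x", 10, 0, 5), Record("x", 12, 5, 9), Record("y", 7, 2, 8)]
stock = [Record("x", 3, 4, 6), Record("y", 1, 8, 10)]

snapshot = time_travel(prices, 4)      # records alive at t = 4
total = temporal_sum(prices, 4)        # 10 + 7 = 17
joined = temporal_join(prices, stock)  # overlapping pairs with equal keys
```

Every function here scans all records, which is the cost that an in-memory index in a distributed setting is meant to avoid. The sketch only pins down the expected answers, not how a system such as ITISS would obtain them.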

Notes

  1. https://www.didiglobal.com/

Acknowledgements

This work was supported by the NSFC (61872235, 61729202, 61832017, U1636210, U61811264, 61832013 and 61672351), and the National Key Research and Development Program of China (2018YFC1504504, 2016YFB0700502 and 2018YFB1004400).

Author information

Corresponding author

Correspondence to Bin Yao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, Z., Yao, B., Wang, ZJ. et al. ITISS: an efficient framework for querying big temporal data. Geoinformatica 24, 27–59 (2020). https://doi.org/10.1007/s10707-019-00362-1

  • DOI: https://doi.org/10.1007/s10707-019-00362-1
