A probabilistic stop and move classifier for noisy GPS trajectories

Abstract

Stop and move information can be used to uncover useful semantic patterns; annotating GPS trajectories as either stopping or moving is therefore beneficial. However, automatically determining whether an entity is stopping or moving is challenging because real-world GPS trajectories are spatially noisy. Existing approaches classify each entry definitively as either a stop or a move, hiding any indication that some classifications can be made with more certainty than others. Such an indication of the “goodness of classification” of each entry would allow the user to filter out stop classifications that appear too ambiguous for their use case, which in a data-mining context may ultimately lead to fewer false patterns. In this work we propose an approach that takes a noisy GPS trajectory as input and calculates the stop probability at each entry. Through a minimum stop probability parameter, our approach allows the user to directly filter out any classified stops whose probability is unacceptably low for their application. Using several real-world and synthetic GPS trajectories (which we have made available), we compared the classification effectiveness, parameter sensitivity, and running time of our approach with those of two well-known existing approaches, SMoT and CB-SMoT. Experimental results indicated the efficiency, effectiveness, and sampling-rate robustness of our approach compared with the existing approaches. The results also demonstrated that the user can increase the minimum stop probability parameter to easily filter out low-probability stop classifications, which effectively reduced the number of false positive classifications in our ground truth experiments. Lastly, we proposed estimation heuristics for each of our approach’s parameters and empirically demonstrated the effectiveness of each heuristic using real-world trajectories. Specifically, the results revealed that even when all of the parameters were estimated, the classification effectiveness of our approach was higher than that of the existing approaches across a range of sampling rates.
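
The role of the minimum stop probability parameter can be illustrated with a minimal sketch. It assumes that per-entry stop probabilities have already been computed by some probabilistic classifier; how the proposed approach computes them is not shown here. The function names, the probability values, the ground-truth labels, and the choice of a greater-than-or-equal comparison are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def classify_stops(stop_probs: Sequence[float],
                   min_stop_probability: float) -> List[str]:
    """Label an entry 'stop' when its probability meets the threshold, else 'move'.

    Assumption: the threshold is inclusive (>=); the source does not specify this.
    """
    if not 0.0 <= min_stop_probability <= 1.0:
        raise ValueError("min_stop_probability must lie in [0, 1]")
    return ["stop" if p >= min_stop_probability else "move" for p in stop_probs]


def count_false_positives(labels: Sequence[str], truth: Sequence[str]) -> int:
    """Count entries labelled 'stop' whose ground-truth label is 'move'."""
    return sum(1 for lab, tru in zip(labels, truth)
               if lab == "stop" and tru == "move")


# Hypothetical stop probabilities and ground truth, one value per trajectory entry.
probs = [0.95, 0.90, 0.55, 0.20, 0.10, 0.60, 0.85]
truth = ["stop", "stop", "move", "move", "move", "move", "stop"]

lenient = classify_stops(probs, min_stop_probability=0.5)
strict = classify_stops(probs, min_stop_probability=0.8)

print(count_false_positives(lenient, truth))
print(count_false_positives(strict, truth))
```

On this made-up data, raising the threshold from 0.5 to 0.8 discards the two ambiguous entries (0.55 and 0.60) that the lenient setting labels as stops, which is the kind of filtering the abstract describes. A stricter threshold can also discard true stops, so the appropriate value depends on the application.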

Keywords

Stop and move, GPS trajectory, Probabilistic classifier, Semantic trajectory, Preprocessing

Copyright information

© The Author(s) 2018

Authors and Affiliations

  1. Information Technology Academy, James Cook University, Cairns, Australia