A probabilistic stop and move classifier for noisy GPS trajectories

Data Mining and Knowledge Discovery

Abstract

Stop and move information can be used to uncover useful semantic patterns; therefore, annotating GPS trajectories as either stopping or moving is beneficial. However, automatically determining whether an entity is stopping or moving is challenging because real-world GPS trajectories are spatially noisy. Existing approaches classify each entry definitively as either a stop or a move, hiding any indication that some classifications can be made with more certainty than others. Such an indication of the “goodness of classification” of each entry would allow the user to filter out stop classifications that appear too ambiguous for their use case, which in a data-mining context may ultimately lead to fewer false patterns. In this work we propose such an approach: it takes a noisy GPS trajectory as input and calculates the stop probability at each entry. Through a minimum stop probability parameter, our proposed approach allows the user to directly filter out any classified stops whose probability is unacceptably low for their application. Using several real-world and synthetic GPS trajectories (that we have made available), we compared the classification effectiveness, parameter sensitivity, and running time of our approach to two well-known existing approaches, SMoT and CB-SMoT. Experimental results indicated the efficiency, effectiveness, and sampling rate robustness of our approach compared to the existing approaches. The results also demonstrated that the user can increase the minimum stop probability parameter to easily filter out low-probability stop classifications, which effectively reduced the number of false positive classifications in our ground truth experiments. Lastly, we proposed estimation heuristics for each of our approach’s parameters and empirically demonstrated the effectiveness of each heuristic using real-world trajectories. Specifically, the results revealed that even when all of the parameters were estimated, the classification effectiveness of our approach was higher than that of existing approaches across a range of sampling rates.
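The role of the minimum stop probability parameter can be illustrated with a minimal sketch. It assumes the per-entry stop probabilities have already been computed (the values below are invented for illustration and are not produced by the paper's method), and the function name `classify_stops` and the inclusive `>=` comparison are assumptions rather than details taken from the article.

```python
from typing import List


def classify_stops(stop_probabilities: List[float],
                   min_stop_probability: float) -> List[bool]:
    """Label an entry as a stop only if its probability reaches the threshold.

    The inclusive comparison is an assumption made for this sketch.
    """
    if not 0.0 <= min_stop_probability <= 1.0:
        raise ValueError("min_stop_probability must lie in [0, 1]")
    return [p >= min_stop_probability for p in stop_probabilities]


# Hypothetical stop probabilities for five consecutive trajectory entries.
probs = [0.95, 0.80, 0.55, 0.20, 0.05]

lenient = classify_stops(probs, 0.5)  # keeps the ambiguous 0.55 entry as a stop
strict = classify_stops(probs, 0.9)   # keeps only the most certain stop
```

Raising the threshold from 0.5 to 0.9 turns the two less certain stops into moves, which is the filtering behaviour the abstract describes.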

Notes

  1. Where stops must last for some minimum amount of time, it is straightforward to enforce this constraint on POSMIT’s stop/move classification result. First, all contiguous entries classified as stops are merged into groups; next, the combined duration of each group is calculated; finally, groups whose duration is below the minimum are reclassified as moves.

  2. Entries whose spatial coordinates are not in a Cartesian coordinate system will need to be converted to one before a suitable Euclidean distance can be calculated. Euclidean distance was chosen over great-circle distance for this problem because it is the most widely used in spatial analysis (Smith et al. 2015) and is faster to compute. In addition, the distances between points in a candidate stop are intrinsically small, so the effect of the Earth’s curvature is negligible in this case.

  3. https://github.com/lukehb/137-GPS-Tracker.

  4. http://doi.org/10.13140/RG.2.2.29896.01281.

  5. https://github.com/lukehb/137-stopmove.
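The minimum-duration post-processing described in Note 1 can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function name `enforce_min_stop_duration` is hypothetical, timestamps are assumed to be sorted numeric values in seconds, and a group's duration is assumed to be the time between its first and last entry.

```python
from itertools import groupby
from typing import List, Sequence


def enforce_min_stop_duration(timestamps: Sequence[float],
                              is_stop: Sequence[bool],
                              min_duration: float) -> List[bool]:
    """Reclassify stop groups shorter than min_duration as moves."""
    if len(timestamps) != len(is_stop):
        raise ValueError("timestamps and is_stop must have the same length")

    result = list(is_stop)
    index = 0
    # groupby merges contiguous entries that share the same label.
    for label, run in groupby(is_stop):
        run_length = len(list(run))
        first, last = index, index + run_length - 1
        # Assumed duration: elapsed time from first to last entry of the group.
        if label and timestamps[last] - timestamps[first] < min_duration:
            for i in range(first, last + 1):
                result[i] = False  # the whole group becomes a move
        index += run_length
    return result


# Hypothetical trajectory sampled every 10 s.
timestamps = [0, 10, 20, 30, 40, 50, 60, 70]
labels = [True, True, False, True, True, True, True, False]
filtered = enforce_min_stop_duration(timestamps, labels, min_duration=25)
```

In this example the first stop group spans only 10 s and is turned into moves, while the second spans 30 s and is kept.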

References

  • Alvares LO, Bogorny V, Kuijpers B, de Macedo JAF, Moelans B, Vaisman A (2007) A model for enriching trajectories with semantic geographical information. In: Proceedings of the 15th annual ACM international symposium on advances in geographic information systems GIS ’07. ACM, New York, pp 22:1–22:8

  • Ankerst M, Breunig MM, Kriegel HP, Sander J (1999) OPTICS: ordering points to identify the clustering structure. SIGMOD Rec 28(2):49–60. https://doi.org/10.1145/304181.304187

  • Boukhechba M, Bouzouane A, Bouchard B, Gouin-Vallerand C, Giroux S (2015) Online recognition of people’s activities from raw GPS data: semantic trajectory data analysis. In: Proceedings of the 8th ACM international conference on PErvasive technologies related to assistive environments PETRA ’15. ACM, New York, pp 40:1–40:8. https://doi.org/10.1145/2769493.2769498

  • Calenge C, Dray S, Royer-Carenzi M (2009) The concept of animals’ trajectories from a data analysis perspective. Ecol Inf 4(1):34–41. https://doi.org/10.1016/j.ecoinf.2008.10.002

  • Cao H, Mamoulis N, Cheung DW (2007) Discovery of periodic patterns in spatiotemporal sequences. IEEE Trans Knowl Data Eng 19(4):453–467. https://doi.org/10.1109/TKDE.2007.1002

  • Cao X, Cong G, Jensen CS (2010) Mining significant semantic locations from GPS data. Proc VLDB Endow 3(1–2):1009–1020. https://doi.org/10.14778/1920841.1920968

  • DATA.GOV.IE: Dublin bus GPS sample data from Dublin city council (insight project) (2013). https://data.gov.ie/dataset/dublin-bus-gps-sample-data-from-dublin-city-council-insight-project. Accessed 12 Nov 2017

  • de Smith MJ, Goodchild MF, Longley PA (2015) Geospatial analysis: a comprehensive guide to principles, techniques and software tools, 5th edn. The Winchelsea Press

  • Ester M, Kriegel HP, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD ’96: proceedings of the 2nd international conference on knowledge discovery and data mining. AAAI Press, pp 226–231

  • Fischer MM, Getis A (eds) (2010) Handbook of applied spatial analysis: software tools, methods and applications. Springer

  • Fu Z, Tian Z, Xu Y, Qiao C (2016) A two-step clustering approach to extract locations from individual GPS trajectory data. ISPRS Int J Geoinf. https://doi.org/10.3390/ijgi5100166

  • Giannotti F, Nanni M, Pinelli F, Pedreschi D (2007) Trajectory pattern mining. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining KDD ’07. ACM, New York, pp 330–339. https://doi.org/10.1145/1281192.1281230

  • Gong L, Sato H, Yamamoto T, Miwa T, Morikawa T (2015) Identification of activity stop locations in GPS trajectories by density-based clustering method combined with support vector machines. J Mod Transp 23(3):202–213. https://doi.org/10.1007/s40534-015-0079-x

  • Gonzalez MC, Hidalgo CA, Barabasi AL (2008) Understanding individual human mobility patterns. Nature 453(7196):779–782

  • Guidotti R, Trasarti R, Nanni M (2015) TOSCA: two-steps clustering algorithm for personal locations detection. In: Proceedings of the 23rd SIGSPATIAL international conference on advances in geographic information systems SIGSPATIAL ’15. ACM, New York, pp 38:1–38:10

  • Guidotti R, Trasarti R, Nanni M, Giannotti F, Pedreschi D (2017) There’s a path for everyone: a data-driven personal model reproducing mobility agendas. In: 2017 IEEE international conference on data science and advanced analytics (DSAA) pp 303–312

  • Haining R (2003) Spatial data analysis: theory and practice. Cambridge University Press. https://books.google.com.au/books?id=CYZSh347eiAC

  • Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning. Springer, New York

  • Huang L, Li Q, Yue Y (2010) Activity identification from GPS trajectories using spatial temporal POIs’ attractiveness. In: Proceedings of the 2nd ACM SIGSPATIAL international workshop on location based social networks LBSN ’10. ACM, New York, pp 27–30. https://doi.org/10.1145/1867699.1867704

  • Hwang YC, Lin CC, Chang JR, Mori H, Huang HC (2009) Predicting essential genes based on network and sequence analysis. Mol BioSyst 5:1672–1678

  • Hwang S, Evans C, Hanke T (2017) Detecting stop episodes from GPS trajectories with gaps. Springer, New York, pp 427–439. https://doi.org/10.1007/978-3-319-40902-3_23

  • Khetarpaul S, Chauhan R, Gupta SK, Subramaniam LV, Nambiar U (2011) Mining GPS data to determine interesting locations. In: Proceedings of the 8th international workshop on information integration on the Web: In Conjunction with WWW 2011 IIWeb ’11. ACM, New York, pp 8:1–8:6. https://doi.org/10.1145/1982624.1982632

  • Leung KWT, Lee DL, Lee WC (2011) CLR: a collaborative location recommendation framework based on co-clustering. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval SIGIR ’11. ACM, New York, pp 305–314. https://doi.org/10.1145/2009916.2009960

  • Luo T, Zheng X, Xu G, Fu K, Ren W (2017) An improved DBSCAN algorithm to detect stops in individual trajectories. ISPRS Int J Geo-Inf. https://doi.org/10.3390/ijgi6030063

  • MacQueen J (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability Volume 1: Statistics. University of California Press, Berkeley, pp 281–297. http://projecteuclid.org/euclid.bsmsp/1200512992

  • McCarroll D (2017) Simple statistical tests for geography. CRC Press, Boca Raton

  • Nadaraya EA (1964) On estimating regression. Theory Probab Appl 9(1):141–142

  • Palma AT, Bogorny V, Kuijpers B, Alvares LO (2008) A clustering-based approach for discovering interesting places in trajectories. In: Proceedings of the 2008 ACM symposium on applied computing SAC ’08. ACM, New York, pp 863–868. https://doi.org/10.1145/1363686.1363886

  • Pelekis N, Kopanakis I, Kotsifakos E, Frentzos E, Theodoridis Y (2009) Clustering trajectories of moving objects in an uncertain world. In: 2009 Ninth IEEE international conference on data mining, pp 417–427. https://doi.org/10.1109/ICDM.2009.57

  • Powers D (2011) Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. J Mach Learn Technol 2:37–63

  • Rocha JAMR, Times VC, Oliveira G, Alvares LO, Bogorny V (2010) DB-SMoT: a direction-based spatio-temporal clustering method. In: 2010 5th IEEE international conference intelligent systems, pp 114–119. https://doi.org/10.1109/IS.2010.5548396

  • Satopaa V, Albrecht J, Irwin D, Raghavan B (2011) Finding a “kneedle” in a haystack: detecting knee points in system behavior. In: Proceedings of the 2011 31st international conference on distributed computing systems workshops ICDCSW ’11. IEEE Computer Society, Washington, pp 166–171. https://doi.org/10.1109/ICDCSW.2011.20

  • Smith MJ, Goodchild MF, Longley PA (2015) Geospatial analysis: a comprehensive guide to principles techniques and software tools, 5th edn. The Winchelsea Press, Leicester

  • Spaccapietra S, Parent C, Damiani ML, de Macedo JA, Porto F, Vangenot C (2008) A conceptual view on trajectories. Data Knowl Eng 65(1):126–146. https://doi.org/10.1016/j.datak.2007.10.008

  • Spinsanti L, Celli F, Renso C (2010) Where you stop is who you are: understanding peoples activities by places visited. In: BMI ’10: Proceedings of the 5th BMI workshop on behaviour monitoring and interpretation. CEUR-WS Karlsruhe, Germany, pp 38–52

  • Takeuchi Y, Sugimoto M (2006) CityVoyager: an outdoor recommendation system based on user location history. In: Proceedings of the third international conference on ubiquitous intelligence and computing UIC’06. Springer, Berlin, pp 625–636. https://doi.org/10.1007/11833529_64

  • Thierry B, Chaix B, Kestens Y (2013) Detecting activity locations from raw GPS data: a novel kernel-based algorithm. Int J Health Geogr 12(1):14

  • Tobler WR (1970) A computer movie simulating urban growth in the Detroit region. Econ Geogr 46:234–240

  • Trajcevski G (2011) Uncertainty in spatial trajectories. Springer, New York, pp 63–107. https://doi.org/10.1007/978-1-4614-1629-6_3

  • Tran LH, Nguyen QVH, Do NH, Yan Z (2011) Robust and hierarchical stop discovery in sparse and diverse trajectories. Technical report, EPFL

  • Xiang L, Gao M, Wu T (2016) Extracting stops from noisy trajectories: a sequence oriented clustering approach. ISPRS Int J Geo-Inf. https://doi.org/10.3390/ijgi5030029

  • Xie K, Deng K, Zhou X (2009) From trajectories to activities: a spatio-temporal join approach. In: Proceedings of the 2009 international workshop on location based social networks LBSN ’09. ACM, New York, pp 25–32. https://doi.org/10.1145/1629890.1629897

  • Ying JJC, Lee WC, Tseng VS (2014) Mining geographic-temporal-semantic patterns in trajectories for location prediction. ACM Trans Intell Syst Technol 5(1):2:1–2:33. https://doi.org/10.1145/2542182.2542184

  • Yuan J, Zheng Y, Zhang C, Xie W, Xie X, Sun G, Huang Y (2010) T-drive: driving directions based on taxi trajectories. In: Proceedings of the 18th SIGSPATIAL international conference on advances in geographic information systems GIS ’10. ACM, New York, pp 99–108. https://doi.org/10.1145/1869790.1869807

  • Zheng Y, Zhang L, Xie X, Ma WY (2009) Mining interesting locations and travel sequences from GPS trajectories. In: Proceedings of the 18th international conference on World Wide Web WWW ’09. ACM, New York, pp 791–800. https://doi.org/10.1145/1526709.1526816

  • Zimmermann M, Kirste T, Spiliopoulou M (2009) Finding stops in error-prone trajectories of moving objects with time-based clustering. Springer, Berlin, pp 275–286. https://doi.org/10.1007/978-3-642-10263-9_24

Author information

Corresponding author

Correspondence to Ickjai Lee.

Additional information

Responsible editor: Srinivasan Parthasarathy.

About this article

Cite this article

Bermingham, L., Lee, I. A probabilistic stop and move classifier for noisy GPS trajectories. Data Min Knowl Disc 32, 1634–1662 (2018). https://doi.org/10.1007/s10618-018-0568-8

  • DOI: https://doi.org/10.1007/s10618-018-0568-8
