Preference-aware sequence matching for location-based services

  • Hao Wang
  • Ziyu Lu


Sequential data are important in many real-world location-based services. In this paper, we study the problem of sequence matching. Specifically, we want to identify the sequences most similar to a given sequence under the three most commonly used preference-aware similarity measures, i.e., Fagin’s intersection metric, Kendall’s tau, and Spearman’s footrule. We first analyze the properties of these three measures, revealing the connection between them and set intersection. Then, we build an index structure, which is essentially a doubly linked list, to facilitate efficient sequence matching. Lower and upper bounds are derived to support prefix-based filtering. Experiments on various datasets show that our proposed method outperforms the baselines by a large margin.
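For readers unfamiliar with the three measures named above, the following is a minimal sketch of their textbook forms, written as distances (smaller means more similar). It assumes both sequences rank the same set of distinct items with no ties; the function names are illustrative, and the exact definitions used in the paper (for example, for partial or top-k lists over different item sets) may differ.

```python
from itertools import combinations


def kendall_tau_distance(a, b):
    """Count item pairs that a and b place in opposite relative order.

    Assumes a and b are rankings of the same distinct items.
    """
    pos_b = {item: i for i, item in enumerate(b)}
    # combinations(a, 2) yields (x, y) with x ranked before y in a.
    return sum(1 for x, y in combinations(a, 2) if pos_b[x] > pos_b[y])


def spearman_footrule(a, b):
    """Sum of absolute rank displacements between a and b.

    Assumes a and b are rankings of the same distinct items.
    """
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(abs(i - pos_b[item]) for i, item in enumerate(a))


def fagin_intersection_distance(a, b):
    """Average, over prefix lengths i = 1..k, of |A_i symdiff B_i| / (2i).

    A_i and B_i are the sets of the first i items of a and b. Because both
    prefixes hold i distinct items, |A_i symdiff B_i| = 2i - 2|A_i & B_i|,
    which is where set intersection enters.
    """
    k = len(a)
    total = 0.0
    for i in range(1, k + 1):
        total += len(set(a[:i]) ^ set(b[:i])) / (2 * i)
    return total / k


if __name__ == "__main__":
    a = ["A", "B", "C", "D"]
    b = ["B", "A", "D", "C"]
    print(kendall_tau_distance(a, b))                   # 2
    print(spearman_footrule(a, b))                      # 4
    print(round(fagin_intersection_distance(a, b), 3))  # 0.333
```

In this small example the footrule value (4) lies between the Kendall value (2) and twice that value, consistent with the classical relationship between the two measures on full rankings.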


Sequence Matching Preference Similarity 




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Inception Institute of Artificial Intelligence, Abu Dhabi, UAE
  2. Central University of Finance and Economics, Beijing, China
