Paths in Complex Networks
Glossary
 Path

A path is a sequence of nodes and edges in a graph such that each node and edge of the path is contained in the graph
 Polygonal curve or polygonal chain

A sequence of connected line segments (in geometry, usually in the Euclidean plane). It is also uniquely determined by the sequence of points at which the line segments are connected
 Sequence

A sequence is an ordered list of elements in which elements are of arbitrary type and repetitions of elements are allowed
 Trajectory

A trajectory describes the position of a moving object through space. A discrete trajectory is usually a sequence of (possibly timestamped) locations in two- or three-dimensional space, for example, given as GPS coordinates
Definition
Given a simple, undirected, and unweighted graph \( G = (V, E) \) with a set of nodes \( V = \{v_1, \dots, v_n\} \) and a set of edges \( E \subseteq V \times V \), a path in \( G \) is defined as a finite sequence \( P = (p_1\, e_{p_1}\, p_2 \dots p_{k-1}\, e_{p_{k-1}}\, p_k) \) with \( p_i \in V \) for all \( i \in \{1, \dots, k\} \), \( e_{p_i} = (p_i, p_{i+1}) \in E \) for all \( i \in \{1, \dots, k-1\} \), and \( k \in \mathbb{N} \). Two different terminologies are in common use. In the first, \( P \) is called a path if its nodes and edges are not required to be distinct, a simple path if its edges are distinct, and an elementary path if its nodes and edges are distinct. In the second, the same three notions are called walk, trail, and path, respectively. In this entry, the former terminology is used, i.e., the term path carries no assumption about whether its nodes and edges are distinct.
Since the considered graphs are simple, a path is uniquely determined by its node sequence, and the notation can be simplified to \( P = (p_1\, p_2 \dots p_k) \), which is used in the following.
Let \( V(P) = \{p_1, \dots, p_k\} \) and \( E(P) = \{e_{p_1}, \dots, e_{p_{k-1}}\} \) denote the sets of nodes and edges, respectively, which are contained in a path \( P \). The length \( |P| = k - 1 \) of a path \( P \) is defined as the number of (not necessarily distinct) edges.
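The definitions above can be illustrated with a minimal Python sketch. The adjacency-set representation and the helper names `make_graph`, `is_path`, `path_length`, `node_set`, and `edge_set` are assumptions made for this illustration and do not appear in the entry.

```python
def make_graph(edges):
    """Build a simple undirected graph as a dict mapping each node to its neighbor set."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def is_path(adj, nodes):
    """Return True if every pair of consecutive nodes is joined by an edge.

    Repeated nodes and edges are allowed, matching the terminology used here.
    """
    if not nodes or any(p not in adj for p in nodes):
        return False
    return all(q in adj[p] for p, q in zip(nodes, nodes[1:]))


def path_length(nodes):
    """Length |P| = k - 1: the number of (not necessarily distinct) edges."""
    return len(nodes) - 1


def node_set(nodes):
    """V(P): the set of nodes contained in the path."""
    return set(nodes)


def edge_set(nodes):
    """E(P): the set of undirected edges contained in the path."""
    return {frozenset(edge) for edge in zip(nodes, nodes[1:])}


adj = make_graph([("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")])
P = ["a", "b", "c", "a", "b"]  # revisits nodes a, b and edge {a, b}
print(is_path(adj, P), path_length(P), len(node_set(P)), len(edge_set(P)))
```

In this example the path has length 4 although it contains only three distinct nodes and three distinct edges, which is exactly the information that is lost when a path is reduced to V(P) or E(P).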
A similarity measure on a set of elements X is a function σ : X × X → ℝ which indicates how similar two objects of X are. The more similar two objects are, the higher the value of the similarity function should be.
A distance measure on a set X is a function δ : X × X → ℝ which indicates the dissimilarity or the distance of two objects.
A distance measure δ is said to satisfy nonnegativity if for all x , y ∈ X, it holds that δ(x, y) ≥ 0. δ satisfies coincidence, if for all x, y ∈ X, it holds that δ(x, y) = 0 ⇔ x = y. δ satisfies symmetry, if for all x, y ∈ X, it holds that δ(x, y) = δ(y, x). δ satisfies the triangle inequality if for all x, y , z ∈ X, it holds that δ(x, z) ≤ δ(x, y) + δ(y, z).
A distance metric on a set X is a distance measure satisfying nonnegativity, coincidence, symmetry, and the triangle inequality.
A metric space (X, d) is a set of elements X on which there is a distance metric d defined for any pair of elements of X.
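For a finite set X, the four properties can be checked by brute force. The following Python sketch is only an illustration; the function name `metric_violations` and the numerical tolerance `tol` are assumptions, not part of the entry.

```python
from itertools import product


def metric_violations(elements, delta, tol=1e-12):
    """Return the names of the metric properties that delta violates on a finite set."""
    elements = list(elements)
    violated = set()
    for x, y in product(elements, repeat=2):
        d = delta(x, y)
        if d < -tol:
            violated.add("nonnegativity")
        # Coincidence: delta(x, y) = 0 if and only if x = y.
        if (abs(d) <= tol) != (x == y):
            violated.add("coincidence")
        if abs(d - delta(y, x)) > tol:
            violated.add("symmetry")
    for x, y, z in product(elements, repeat=3):
        if delta(x, z) > delta(x, y) + delta(y, z) + tol:
            violated.add("triangle inequality")
    return violated


points = [0, 1, 3, 7]
print(metric_violations(points, lambda x, y: abs(x - y)))    # absolute difference
print(metric_violations(points, lambda x, y: (x - y) ** 2))  # squared difference
```

The absolute difference passes all four checks on this set, whereas the squared difference is a distance measure but not a distance metric: for 0, 1, and 3 it gives 9 > 1 + 4, which violates the triangle inequality.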
Introduction
Key Points
When humans or other entities traverse a complex network, they usually do not take the shortest path, but they also do not move randomly. The structure of these paths is an important research area which has just started to develop. First studies have been published exploring which ways humans or other entities take in complex networks. Of particular relevance are methods for summarizing the structure of these paths. Such methods will help reduce (possibly large) data sets of paths to a few representative groups of paths. For developing such methods, an appropriate similarity measure for paths is needed.
Historical Background
Human navigation in spatial environments has been the subject of numerous studies in the cognitive sciences for decades (e.g., McDonald and Pellegrino 1993). These studies focus primarily on human orientation in a (possibly unknown) spatial environment (Moeser 1988) and on the mental representations humans have of their environment (Aginsky et al. 1997).
The observation that humans are able to find surprisingly short paths in other environments as well, although they do not know the complete environment, was illustrated by Milgram (1967): in his famous experiment, people were asked to send a letter to a target person via one of their acquaintances. Although the structure of the social network was not known to any of the persons involved, the letters which arrived at their destination were routed over only five intermediate persons on average, which is a remarkably small number (to be fair, over all runs of the experiment, the percentage of letters which actually arrived at the target person ranged only from 15 to 35%). A similar result was found by Sudarshan Iyengar et al. (2012), who looked at players of a word game and found that the paths taken by the participants were on average around 1.7 times longer than the shortest paths. The results of West and Leskovec (2012) support these observations: they analyzed the paths taken by humans while seeking information in the Wikipedia network. Based on almost 30,000 distinct paths for different pairs of articles, their analysis revealed that human wayfinding in information networks is surprisingly efficient although the players do not know the complete network structure.
Although this has been known for decades, how humans actually find these short paths was not investigated until Kleinberg (2000a) asked how the structure of the network affects the performance of decentralized algorithms (i.e., algorithms for finding paths from a given source to a given target in a network, using only local information of the nodes). Many networks, including social networks and information networks, are examples of small-world networks, i.e., networks with a strongly local structure (a high clustering coefficient) and, at the same time, a few long-range connections (which lead to a short average path length) (Watts and Strogatz 1998). Kleinberg generalizes the small-world model of Watts and Strogatz and proves that the capability of any decentralized algorithm to find short paths (short compared to the diameter of the network, which is not necessarily the shortest path) depends crucially on the value of one parameter of the model: there is exactly one parameter value for which a decentralized algorithm is able to find a short path with high probability. For other values of the parameter, no decentralized algorithm can find short paths (Kleinberg 2000a, b). This might also explain why, in Milgram’s experiment, the majority of the letters did not reach their target.
Structure of Paths
Not only the length of the found paths is interesting to study but also their structure. In the context of information networks, West and Leskovec (2012) analyze the structure and the properties of the paths taken by human players by considering different qualities of the nodes on the paths, for example, the degree of a node, its distance to the target node, or its lucrative degree (i.e., the number of outgoing links which decrease the distance to the target node). They show that the paths taken by humans share the same structure: the players first aim at reaching a hub node, i.e., a node with a large number of outgoing links. After that, the players narrow down their search again, meaning that the articles get more specific again, their degrees decrease, and the textual similarity to the target article increases. Similarly, Sudarshan Iyengar et al. (2012) found that all participants of the word game selected an individual “landmark” node with a low closeness value, through which they navigated in almost all paths.
A further approach is presented by González et al. (2008), who consider human trajectories created by a large set of individual travels, collected from the recorded locations of individuals’ mobile phones over a period of 6 months. Although single travel paths seem to be very individual, González et al. show that merging all trajectories results in a single spatial probability distribution, which means that human travel patterns show a high degree of regularity. They find that most individuals travel mostly short distances, but there are also several individuals who travel longer distances. A similar result is found by Cho et al. (2011), who aim at quantifying the factors which lead to these regularities. They observe that the majority of short-distance travels can be explained by the daily routines of the people, which therefore show a high spatial and temporal periodicity. Long-distance travels do not show this behavior; however, when the social network of the user is taken into account, the long-distance travels can be explained.
Similarity Measures for Paths
Similarity and distance measures for paths, depending on how a path is modeled. The references refer either to the authors who introduced the original measure or to authors who applied it in the corresponding context.

| Modeling | Measure | References |
|---|---|---|
| Sets | Number of common elements | |
| Sets | Jaccard index | Jaccard (1912) |
| Sequences | Longest common substring distance | Gusfield (1997) |
| Sequences | Longest common subsequence distance | Needleman and Wunsch (1970) |
| Sequences | Levenshtein distance | Levenshtein (1966) |
| Sequences | Further edit distances | Navarro (2001) |
| Sets of points in metric space | Hausdorff distance | Hausdorff (1914) |
| Sets of points in metric space | Sum of minimum distances | Niiniluoto (1987) |
| Sets of points in metric space | (Fair) Surjection distance | Oddie (1986) |
| Sets of points in metric space | Link distance | Eiter and Mannila (1997) |
| Sets of points in metric space | Matching distance | Ramon and Bruynooghe (2001) |
| (Discrete) Curves in metric space | Fréchet distance | Fréchet (1906), Alt and Godau (1995), Eiter and Mannila (1994) |
| (Discrete) Curves in metric space | Discrete Fréchet distance | Eiter and Mannila (1994) |
| (Discrete) Curves in metric space | LCSS distance | Vlachos et al. (2002) |
| (Discrete) Curves in metric space | Euclidean distance | |
A systematic evaluation of similarity measures for paths and their properties has been proposed by Bockholt (2015).
Paths as Sets
Furthermore, when modeling paths as sets of nodes or edges, several pieces of information contained in a path are discarded, for example, the order in which the nodes and edges occur in the paths as well as the information whether nodes or edges are contained once or multiple times in the path. This information can be considered when modeling paths as sequences of nodes.
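The loss of information can be seen in a small Python sketch of the Jaccard index on node sets, using its standard definition as the size of the intersection divided by the size of the union. The function name `jaccard_index` and the convention of returning 1.0 for two empty paths are assumptions made for this illustration.

```python
def jaccard_index(path_a, path_b):
    """Jaccard index of the node sets of two paths: |A ∩ B| / |A ∪ B|."""
    a, b = set(path_a), set(path_b)
    if not a and not b:
        return 1.0  # assumed convention for two empty paths
    return len(a & b) / len(a | b)


P = [1, 2, 3, 4]
Q = [4, 3, 2, 1, 2]  # reversed direction, node 2 visited twice
R = [1, 2, 5, 6]

print(jaccard_index(P, Q))  # same node set as P, so the index is maximal
print(jaccard_index(P, R))  # 2 shared nodes out of 6 distinct nodes
```

P and Q are rated as identical although Q runs in the opposite direction and revisits a node, because both the order and the multiplicity of the nodes are discarded.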
Paths as Sequences
A path P can be modeled as a sequence of nodes, for which many similarity and distance measures have been developed, for example, in the bioinformatics community to quantify the similarity of genetic sequences. Most of the existing similarity measures for sequences can be formulated as an edit distance: for a set of allowed edit operations (such as deletion, insertion, or substitution of elements) and costs associated with each operation, the edit distance between two sequences is the minimum total cost necessary for transforming one sequence into the other. Allowing insertion and deletion as edit operations with a cost of 1 for each operation yields the longest common subsequence distance (Needleman and Wunsch 1970). The name is due to the following connection: for a sequence A = (a_1, …, a_k), a subsequence of A is defined as any sequence of elements which can be obtained by deleting elements from A. For A and another sequence B = (b_1, …, b_ℓ), let lcs(A, B) denote the length of the longest common subsequence, i.e., the maximal number of elements which occur both in A and B in the same order. The longest common subsequence distance for A and B is then ∣A∣ + ∣B∣ − 2 lcs(A, B) (which can be normalized by the length of the longer sequence, allowing the comparison of sequences whose lengths differ by orders of magnitude).
Sequences are not only of interest in the area of computational biology but also in other research fields. Researchers from data mining analyze sequences of events, such as sequences of events in telecommunication data, ordered lists of courses a student has taken during studies, or sequences of stock prices from financial data (see, e.g., Kumar et al. 2010; Das et al. 1997; Mannila and Ronkainen 1997; Moen 2000; Laasonen 2005a). Another example is the analysis of clickstream data in order to investigate and predict the behavior of a user in a web environment (Gündüz and Özsu 2003; Wang and Zaiane 2002).
The analysis of sequences is of interest for the comparison of paths in complex networks. The longest common subsequence distance can be applied directly to paths as sequences of nodes and measures the number of nodes which occur in both paths in the same order. Therefore, the paths shown in Figs. 2 and 3 yield large values for this distance measure, which seems more intuitive than the set-based measures. Note furthermore that it satisfies all properties of a distance metric. However, this measure is designed for comparing sequences of letters where the sequences are long in relation to the size of the alphabet from which the letters are taken. For paths in complex networks, the opposite is usually the case: the alphabet, i.e., the node set of the graph, is usually much larger than the length of the sequences, i.e., the considered paths. This implies that two arbitrary paths in a network usually share only a small fraction of their nodes or are even totally disjoint. In these cases, the distance measure is not able to yield meaningful results, since it cannot distinguish between paths which are disjoint and “close” in the graph and paths which are disjoint but “distant” in the graph. For example, if two people drive from the same city to the same other city, but one on a highway and one using only country roads next to the highway, the two paths should be rated as quite similar. However, if one drives from north to south and the other from east to west, the paths should be rated as very different. For disjoint paths, however, the longest common subsequence distance returns values which depend only on the lengths of the paths.
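As an illustration, the longest common subsequence distance can be computed with the standard dynamic program. The following Python sketch is one possible formulation; the function names are assumptions, and `len` counts the elements of a sequence (for a path, its nodes rather than its edges).

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lcs_distance(a, b):
    """|A| + |B| - 2 * lcs(A, B): insertions and deletions needed to turn A into B."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


def normalized_lcs_distance(a, b):
    """LCS distance divided by the length of the longer sequence."""
    longer = max(len(a), len(b))
    return lcs_distance(a, b) / longer if longer else 0.0


P = ["a", "b", "c", "d"]
Q = ["a", "x", "c", "d"]
print(lcs_length(P, Q), lcs_distance(P, Q))  # common subsequence a, c, d
print(lcs_distance([1, 2, 3], [4, 5]))       # disjoint sequences
```

For two disjoint sequences, lcs is 0 and the distance reduces to the sum of the two lengths, regardless of where the sequences are located.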
Therefore, modeling paths as sequences will not always yield satisfying results since the information of where the paths are situated in the context of the network is not captured by this approach. Incorporating the location of the paths within the network leads to the next two modeling possibilities.
Paths as Sets of Points in Metric Space
Clearly, these measures consider the distance between the elements of the paths and allow a comparison of totally disjoint paths. They furthermore include all elements of both paths in the computation, which is why extreme points do not have such a big impact. However, since they are set based, these measures ignore the order of the nodes in the path, which is also why they fail to satisfy coincidence. A further main drawback is the violation of the triangle inequality (an interesting approach for fixing this issue is presented by Ramon and Bruynooghe (2001)).
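Two of the point-set measures from the table can be sketched for paths by using the graph distance (shortest-path length in an undirected, unweighted graph) as the underlying metric. The Python code below is an illustrative sketch only: the helper names are assumptions, and the factor 1/2 in the sum of minimum distances follows one common formulation and is likewise an assumption here.

```python
from collections import deque


def make_graph(edges):
    """Undirected simple graph as a dict mapping each node to its neighbor set."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Graph distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def min_distances(adj, path_a, path_b):
    """For each distinct node of path_a, the distance to the nearest node of path_b."""
    targets = set(path_b)
    result = []
    for a in set(path_a):
        dist = bfs_distances(adj, a)
        result.append(min(dist.get(b, float("inf")) for b in targets))
    return result


def hausdorff_distance(adj, path_a, path_b):
    """Largest distance from any node of one path to the nearest node of the other."""
    return max(max(min_distances(adj, path_a, path_b)),
               max(min_distances(adj, path_b, path_a)))


def sum_of_minimum_distances(adj, path_a, path_b):
    """Half the sum of all nearest-node distances in both directions (assumed form)."""
    return 0.5 * (sum(min_distances(adj, path_a, path_b))
                  + sum(min_distances(adj, path_b, path_a)))


# Hypothetical ladder graph: two parallel chains t0..t3 and b0..b3 joined by rungs.
edges = [(f"t{i}", f"t{i + 1}") for i in range(3)]
edges += [(f"b{i}", f"b{i + 1}") for i in range(3)]
edges += [(f"t{i}", f"b{i}") for i in range(4)]
adj = make_graph(edges)
top = ["t0", "t1", "t2", "t3"]
bottom = ["b0", "b1", "b2", "b3"]

print(hausdorff_distance(adj, top, bottom))             # disjoint but adjacent paths
print(sum_of_minimum_distances(adj, top, bottom))
print(sum_of_minimum_distances(adj, top, top[::-1]))    # order is ignored
```

The two disjoint but parallel paths receive a small finite distance, while a path and its reversal receive distance 0 although they differ as sequences, which is the failure of coincidence described above.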
Paths as Polygonal Chains
In recent years, it has become very cheap to equip mobile devices with all kinds of sensors able to track the device’s position. This has led to a huge amount of available data containing the tracked movement of individuals, e.g., animals in the wild (Shamoun-Baranes et al. 2011), mobile phone users (González et al. 2008; Laasonen 2005a, b), taxis (Yuan et al. 2010, 2011), bicycles from shared bike systems (Vogel et al. 2014; Sener et al. 2009), or shopping carts with RFID chips which track the customers’ way through the supermarket (Larson et al. 2005). These trajectories, given as sequences of GPS coordinates or in some other form, are basically sequences of points in a metric space, such that results from this research field can be applied. The (discrete) Fréchet distance has also proven useful in the analysis of GPS trajectories, for example, for finding clusters of similar trajectories (Gudmundsson et al. 2012) or for detecting recurring patterns in the trajectories (e.g., Buchin et al. 2008).
However, other similarity measures are also used for the comparison of GPS trajectories: although developed for sets and sensitive to outliers, the Hausdorff distance is widely used (e.g., by Junejo et al. 2004). Vlachos et al. (2002) develop a similarity measure which is based on the longest common subsequence but, by introducing two parameters, is applicable to trajectories. While the classic longest common subsequence lcs of two sequences counts the number of elements which are exactly the same in both sequences (and in the same order), the LCSS distance of Vlachos et al. (2002) counts points as matched if they are “close enough” to each other, i.e., their distance to each other is smaller than the given parameter. This idea can be useful for the comparison of paths in order to overcome the problem described in the section Paths as Sequences: the network usually contains many more nodes than the paths, such that most paths cannot be distinguished by the lcs because they yield a similarity of 0. Counting nodes of two paths as matched if they are close enough in the network addresses this problem.
These approaches are also of relevance for the analysis of paths in complex networks: in order to summarize a huge number of paths in a network, an appropriate clustering procedure with a meaningful similarity or distance measure is required. Of relevance is also the work of Lee et al. (2007), who point out that clustering trajectories as a whole might often not be appropriate, since this approach misses similar subtrajectories. They therefore propose a partition-and-group framework in which the trajectories are first partitioned into line segments, and the line segments are then grouped.
Although many results from the analysis of polygonal curves and trajectories can be applied to the analysis of paths in complex networks, the crucial difference between these concepts is that paths are embedded in the structure of a complex network. This fact provides additional information which none of the existing approaches can exploit. An open question here is how to adapt the methods from trajectory analysis to paths in directed or weighted networks or networks with multiple edges. In a directed network, the graph distance is no longer a distance metric: for example, if there is a path from node v to node w, this does not imply that the shortest path from w to v has the same length or that it exists at all. Thus, adapting distance measures such as the (discrete) Fréchet distance or the LCSS distance to paths in directed graphs requires some consideration.
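The discrete Fréchet distance can be computed with the dynamic program attributed to Eiter and Mannila (1994). The following Python sketch uses an iterative table instead of recursion; the function name `discrete_frechet` and the use of the Euclidean distance `math.dist` as the default point distance are assumptions made for this illustration.

```python
import math


def discrete_frechet(p, q, dist=math.dist):
    """Discrete Fréchet distance between two non-empty point sequences p and q."""
    if not p or not q:
        raise ValueError("both sequences must be non-empty")
    n, m = len(p), len(q)
    # ca[i][j] is the coupling distance between the prefixes p[:i+1] and q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[n - 1][m - 1]


P = [(0, 0), (1, 0), (2, 0)]
Q = [(0, 1), (1, 1), (2, 1)]

print(discrete_frechet(P, Q))        # parallel trajectories in the same direction
print(discrete_frechet(P, Q[::-1]))  # same points, traversed in the opposite direction
```

Unlike the set-based measures, the result changes when one trajectory is reversed, because the first and last points of both sequences must be matched to each other. Passing a graph distance as `dist` would apply the same recurrence to node sequences.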
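The idea of matching nodes that are close enough in the network can be sketched as a thresholded longest common subsequence. The Python code below is inspired by the LCSS measure but is not the definition of Vlachos et al. (2002): the function name `thresholded_lcs`, the use of "at most `eps`" instead of "smaller than", the optional index-gap parameter `max_index_gap`, and the omission of any normalization are all assumptions made for this illustration.

```python
from collections import deque


def make_graph(edges):
    """Undirected simple graph as a dict mapping each node to its neighbor set."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Graph distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def thresholded_lcs(adj, path_a, path_b, eps, max_index_gap=None):
    """Maximum number of order-preserving matches between two paths.

    Two nodes match if their graph distance is at most eps and, if
    max_index_gap is given, their positions differ by at most that value.
    With eps = 0 and no index gap this is the classic lcs.
    """
    dist_from = {a: bfs_distances(adj, a) for a in set(path_a)}

    def matches(i, j):
        if max_index_gap is not None and abs(i - j) > max_index_gap:
            return False
        return dist_from[path_a[i]].get(path_b[j], float("inf")) <= eps

    n, m = len(path_a), len(path_b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = max(table[i - 1][j], table[i][j - 1])
            if matches(i - 1, j - 1):
                best = max(best, table[i - 1][j - 1] + 1)
            table[i][j] = best
    return table[n][m]


# Hypothetical ladder graph: two parallel chains t0..t3 and b0..b3 joined by rungs.
edges = [(f"t{i}", f"t{i + 1}") for i in range(3)]
edges += [(f"b{i}", f"b{i + 1}") for i in range(3)]
edges += [(f"t{i}", f"b{i}") for i in range(4)]
adj = make_graph(edges)
top = ["t0", "t1", "t2", "t3"]
bottom = ["b0", "b1", "b2", "b3"]

print(thresholded_lcs(adj, top, bottom, eps=0))  # classic lcs of disjoint paths
print(thresholded_lcs(adj, top, bottom, eps=1))  # neighbors count as matched
```

With `eps=0` the two disjoint paths share nothing, whereas with `eps=1` every node of one path is matched to its neighbor on the other path, so parallel paths are recognized as similar.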
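The asymmetry of the graph distance in directed networks can be seen in a small Python sketch. The successor-list representation, the function name `directed_distance`, and the example graph are assumptions made for this illustration.

```python
from collections import deque


def directed_distance(succ, source, target):
    """Length of a shortest directed path from source to target, or inf if none exists."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in succ.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return float("inf")


# Hypothetical directed graph: a cycle a -> b -> c -> a with an extra edge c -> d.
succ = {"a": ["b"], "b": ["c"], "c": ["a", "d"]}

print(directed_distance(succ, "a", "b"), directed_distance(succ, "b", "a"))
print(directed_distance(succ, "a", "d"), directed_distance(succ, "d", "a"))
```

Here the distance from a to b is 1 but the distance from b to a is 2, and d can be reached from a while a cannot be reached from d at all, so symmetry fails and the graph distance is not a distance metric.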
Future Directions
Refining Centrality Indices
Although it has been observed that humans are often able to find surprisingly short paths through a network with only local knowledge, they still rarely choose shortest paths. At the same time, popular measures for identifying the most central node in a network are based on paths through the network. They all assume, however, that the entities move on shortest paths, which they usually do not. Therefore, refining centrality indices so that they take into account the actually taken paths seems to be a promising approach. Dorn et al. (2012), for example, could already show that centrality indices are less prone to artifacts when using actually taken paths.
Developing Realistic Models of Paths
The same holds not only for centrality indices but for all network models. When predictions about the behavior of users or processes in networks are made, it might not be sufficient to consider only the network structure; the actual usage of the network should be taken into account as well. The mere existence of an edge does not imply that this edge is actually used. Realistic models of network usage that follows neither shortest nor randomly chosen paths are needed.
Using the Knowledge in the Paths
On the other hand, the actual usage of the network by thousands of entities (which paths are taken often, which rarely or never, possibly also dependent on other factors) contains a huge amount of knowledge. This knowledge may be used in order to infer knowledge about the network itself. This has already been done for GPS trajectories in order to provide the effectively shortest path to car drivers, based on the aggregated knowledge of thousands of taxi drivers (Yuan et al. 2010, 2011), or to extract interesting locations from individual travel sequences (Benkert et al. 2010; Zheng et al. 2009).
Visualizing Paths in Complex Networks
The human eye is a powerful tool for identifying common patterns and structure in data. Therefore, the visualization of large path data sets will be an important task for future research.
Compilation of Data Sets
Since the analysis of actually taken paths in complex networks is a research area that has just started, researchers face the challenge that only a few data sets of paths in networks are available. Compiling and publishing data sets of complex networks and their actual usage is an important task for the network community, so that the paths can be mined and analyzed together with the underlying network structure.
References
Aginsky V, Harris C, Rensink R, Beusmans J (1997) Two strategies for learning a route in a driving simulator. J Environ Psychol 17(4):317–331
Alt H, Godau M (1995) Computing the Fréchet distance between two polygonal curves. Int J Comput Geom Appl 5:75–91
Benkert M, Djordjevic B, Gudmundsson J, Wolle T (2010) Finding popular places. Int J Comput Geom Appl 20(01):19–42
Bockholt M (2015) Measures for the similarity of paths in complex networks. Master thesis, TU Kaiserslautern
Buchin K, Buchin M, Gudmundsson J, Löffler M, Luo J (2008) Detecting commuting patterns by clustering subtrajectories. In: Algorithms and computation: 19th international symposium, ISAAC 2008, Gold Coast, 15–17 Dec 2008. Proceedings, September, pp 644–655
Cho E, Myers SA, Leskovec J (2011) Friendship and mobility: user movement in location-based social networks. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining (KDD ’11). ACM, New York, pp 1082–1090. doi:10.1145/2020408.2020579
Das G, Gunopulos D, Mannila H (1997) Finding similar time series. In: Komorowski J, Zytkow J (eds) Principles of data mining and knowledge discovery, Lecture notes in computer science, vol 1263. Springer, Berlin/Heidelberg, pp 88–100
Dorn I, Lindenblatt A, Zweig KA (2012) The trilemma of network analysis. In: Proceedings of the 2012 IEEE/ACM international conference on advances in social network analysis and mining, Istanbul
Ducruet C, Notteboom T (2012) The worldwide maritime network of container shipping: spatial structure and regional dynamics. Glob Netw 12(3):395–423
Eiter T, Mannila H (1994) Computing discrete Fréchet distance. Technical report. Information Systems Department, Technical University of Vienna
Eiter T, Mannila H (1997) Distance measures for point sets and their computation. Acta Inform 34(2):109–133
Fréchet MM (1906) Sur quelques points du calcul fonctionnel. Rend Circ Mat Palermo (1884–1940) 22(1):1–72
González MC, Hidalgo CA, Barabási AL (2008) Understanding individual human mobility patterns. Nature 453(7196):779–782. arXiv:0806.1256
Gudmundsson J, Thom A, Vahrenhold J (2012) Of motifs and goals: mining trajectory data. In: Proceedings of the 20th international conference on advances in geographic information systems – SIGSPATIAL ’12, ACM, New York, pp 129–138
Guimerá R, Amaral LAN (2004) Modeling the worldwide airport network. Eur Phys J B 38(2):381–385
Gündüz Ş, Özsu MT (2003) A web page prediction model based on clickstream tree representation of user behavior. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, pp 535–540
Gusfield D (1997) Algorithms on strings, trees and sequences: computer science and computational biology. Cambridge University Press, New York
Hausdorff F (1914) Grundzüge der Mengenlehre. Veit and Company, Leipzig
Jaccard P (1912) The distribution of the flora in the alpine zone. New Phytol 11(2):37–50
Jarušek P, Pelánek R (2011) What determines difficulty of transport puzzles? In: Proceedings of Florida Artificial Intelligence Research Society conference. AAAI Press, pp 428–433
Junejo IN, Javed O, Shah M (2004) Multi feature path modeling for video surveillance. In: Proceedings of the international conference on pattern recognition, vol 2, pp 716–719
Kaluza P, Kölzsch A, Gastner MT, Blasius B (2010) The complex network of global cargo ship movements. J R Soc Interface 7(48):1093–1103. arXiv:1001.2172
Kleinberg J (2000a) The small-world phenomenon: an algorithmic perspective. In: Proceedings of the thirty-second annual ACM symposium on theory of computing, STOC ’00, ACM, New York, pp 163–170
Kleinberg JM (2000b) Navigation in a small world. Nature 406:845–845
Kumar P, Raju BS, Radha Krishna P (2010) A new similarity metric for sequential data. Int J Data Warehous Min 6(4):16–32. doi:10.4018/jdwm.2010100102
Laasonen K (2005a) Clustering and prediction of mobile user routes from cellular data. In: Knowledge discovery in databases: PKDD 2005. Lecture notes in computer science, vol 3721. Springer, Berlin/Heidelberg, pp 569–576
Laasonen K (2005b) Route prediction from cellular data. In: Workshop on Context-Awareness for Proactive Systems (CAPS), vol 1617
Larson JS, Bradlow ET, Fader PS (2005) An exploratory look at supermarket shopping paths. Int J Res Mark 22(4):395–414
Lee JG, Han J, Whang KY (2007) Trajectory clustering: a partition-and-group framework. In: Proceedings of the 2007 ACM SIGMOD international conference on management of data, ACM, New York, pp 593–604
Levenshtein V (1966) Binary codes capable of correcting deletions, insertions and reversals. Sov Phys Dokl 10(8):707–710. Original in Russian in Dokl Akad Nauk SSSR 163(4):845–848, 1965
Mannila H, Ronkainen P (1997) Similarity of event sequences. In: Temporal representation and reasoning, 1997 (TIME ’97), Proceedings, fourth international workshop on, Daytona Beach, pp 136–139. doi:10.1109/TIME.1997.600793
McDonald TP, Pellegrino JW (1993) Psychological perspectives on spatial cognition. In: Gärling T, Golledge RG (eds) Behavior and environment – psychological and geographical approaches, advances in psychology, vol 96. Elsevier Science Publishers, North-Holland, pp 47–82
Milgram S (1967) The small world problem. Psychol Today 2(1):60–67
Moen P (2000) Attribute, event sequence, and event type similarity notions for data mining. PhD thesis, Department of Computer Science, University of Helsinki
Moeser SD (1988) Cognitive mapping in a complex building. Environ Behav 20(1):21–49
Navarro G (2001) A guided tour to approximate string matching. ACM Comput Surv 33(1):31–88
Needleman SB, Wunsch CD (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 48(3):443–453
Newell A (1980) Reasoning, problem solving and decision processes: the problem space as a fundamental category. In: Nickerson R (ed) Attention and performance VIII. Erlbaum, Hillsdale. (Also available as Technical Report, Carnegie Mellon University, Computer Science, Report No. 2482, 1979)
Niiniluoto I (1987) Truthlikeness, vol 185. Springer Science & Business Media, Dordrecht
Oddie G (1986) Likeness to truth. D. Reidel, Dordrecht
Ramon J, Bruynooghe M (2001) A polynomial time computable metric between point sets. Acta Inform 37(10):765–780
Sen P, Dasgupta S, Chatterjee A, Sreeram PA, Mukherjee G, Manna SS (2003) Small-world properties of the Indian railway network. Phys Rev E 67(3):036106
Sener IN, Eluru N, Bhat CR (2009) An analysis of bicycle route choice preferences in Texas, US. Transportation 36(5):511–539
Shamoun-Baranes J, van Loon EE, Purves RS, Speckmann B, Weiskopf D, Camphuysen C (2011) Analysis and visualization of animal movement. Biol Lett 8(1):6–9
Sudarshan Iyengar S, Veni Madhavan C, Zweig KA, Natarajan A (2012) Understanding human navigation using network analysis. Top Cogn Sci 4(1):121–134
Vlachos M, Kollios G, Gunopulos D (2002) Discovering similar multidimensional trajectories. In: Proceedings 18th international conference on data engineering, San Jose, pp 673–684. doi:10.1109/ICDE.2002.994784
Vogel M, Hamon R, Lozenguez G, Merchez L, Abry P, Barnier J, Borgnat P, Flandrin P, Mallon I, Robardet C (2014) From bicycle sharing system movements to users: a typology of Vélo’v cyclists in Lyon based on large-scale behavioural dataset. J Transp Geogr 41:280–291
Wang Q, Zaiane OR (2002) Clustering web sessions by sequence alignment. In: Proceedings. 13th international workshop on database and expert systems applications, Aix-en-Provence, pp 394–398. doi:10.1109/DEXA.2002.1045928
Watts DJ, Strogatz SH (1998) Collective dynamics of small-world networks. Nature 393(6684):440–442
West R, Leskovec J (2012) Human wayfinding in information networks. In: Proceedings of the 21st international conference on World wide web, ACM, New York, pp 619–628
West R, Pineau J, Precup D (2009) Wikispeedia: an online game for inferring semantic distances between concepts. In: Kitano H (ed) Proceedings of the 21st international joint conference on artificial intelligence (IJCAI ’09). Morgan Kaufmann, San Francisco, pp 1598–1603
Yuan J, Zheng Y, Zhang C, Xie W (2010) T-Drive: driving directions based on Taxi trajectories. In: Proceedings of the 18th SIGSPATIAL international conference on advances in geographic information systems, pp 99–108
Yuan J, Zheng Y, Xie X, Sun G (2011) Driving with knowledge from the physical world. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining (KDD ’11), vol 5, pp 316–324
Zheng Y, Zhang L, Xie X, Ma WY (2009) Mining interesting locations and travel sequences from GPS trajectories. In: Proceedings of the 18th international conference on World Wide Web. ACM, pp 791–800