Abstract
Directed contact networks (DCNs) are a particularly flexible and convenient class of temporal networks, useful for modeling and analyzing the transfer of discrete quantities in communications, transportation, etc. Transfers modeled by contacts typically underlie flows that associate multiple contacts based on their spatiotemporal relationships. To infer these flows, we introduce a simple inhomogeneous Markov model associated to a DCN and show how it can be effectively used for data reduction and anomaly detection through an example of kernel-level information transfers within a computer.
Notes
1. A useful analogy is a flight departing from s at \(\tau _0\) and arriving at t at \(\tau _1\): the contacts \((s,*,\tau _0)\) and \((*,t,\tau _1)\) correspond respectively to embarking and debarking. This analogy also highlights that alternative representations could include additional contacts \((s,*,\tau _*)\) with \(\tau _0 \le \tau _* < \tau _1\), depending on the desired behavior.
2. In fact the EPG is a directed graph with vertices bipartitioned into files and events; arcs indicating a subject or object go from events to files. However, there is an obvious bijective correspondence between this and our moral characterization.
3. If \(a_1 \in \tau (\mathcal {C})\), we can simply consider instead \(a'_1 = a_1+\varepsilon _\mathcal {C}\), where \(\varepsilon _\mathcal {C} := \frac{1}{2} \min _{t,t' \in \tau (\mathcal {C}), t \ne t'} |t - t'|\). Note that here we assume without loss of generality that \(|\tau (\mathcal {C})| > 1\), i.e., that \(\mathcal {C}\) is nontrivial as a DCN (versus, e.g., a digraph).
4. Bearing the concept of negative absolute temperature [12] in mind, we note that \(\beta = -\infty \) corresponds to “absolute hot”, and \(\beta = \infty \) corresponds to absolute zero. We follow a natural convention (and it is nothing more) for our model, in which lower temperatures correspond to slower dynamics: thus \(\beta \uparrow \infty \) and \(\beta \downarrow -\infty \) are respectively the limits in which no temporal and spatial arcs are traversed. In practice, we follow a physical analogy and set \(\beta ^{-1}\) to the average time between contacts.
5. If there are (say) contacts of the form \((v,w,\tau _*)\) and \((w,v,\tau _*)\) with \(\tau _* = \tau ^{@v}_{|\mathcal {C}@v|-2} = \tau ^{@w}_{|\mathcal {C}@w|-2}\), then (5) entails that the probability of a transition from \((v,\tau _*)\) to \((v,a_m)\) or from \((w,\tau _*)\) to \((w,a_m)\) would be exponentially small were it not for the \(\varepsilon \) term. While in principle this is not an issue, in numerical practice this leads to floating-point underflow. Taking \(\varepsilon > 0\) avoids this problem without significant side effects.
6. The \(\Delta \tau \) dependence is necessary and, in the context of information flows, is a plausible approximant (for small values) to the conditional Kolmogorov complexity of the intervening computation.
7. For a weighted DCN, normalizing so that the sum of outbound weights equals either \(d^+\) or zero as appropriate and replacing the first case in (5) with the corresponding normalized weight gives an easy and consistent generalization.
8. In many cases either the source or the target of a ground truth event does not exist. For example, the userspace commands hostname and put /tmp/netrecon correspond to the (\(\text {process name},\text {filename}\)) pairs \((\texttt {hostname},\varnothing )\) and \((\varnothing ,\texttt {/tmp/netrecon})\), respectively. By way of comparison, the command rm -f /tmp/netrecon.log corresponds to the pair \((\texttt {rm},\texttt {/tmp/netrecon.log})\).
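The quantities defined in the notes on \(\varepsilon _\mathcal {C}\) and \(\beta \) can be illustrated with a minimal Python sketch. It is not the author's implementation. It assumes that a DCN is given as a list of (source, target, time) triples, and it reads "the average time between contacts" as the mean gap between consecutive contacts in time order; both are assumptions made here, and all function names are hypothetical.

```python
def contact_times(contacts):
    """Sorted distinct contact times, i.e. tau(C) for a list of (s, t, time) triples."""
    return sorted({time for _, _, time in contacts})


def epsilon_c(contacts):
    """Half the smallest gap between two distinct contact times."""
    times = contact_times(contacts)
    if len(times) < 2:
        raise ValueError("the DCN must have at least two distinct contact times")
    # The minimum of |t - t'| over distinct times is attained by neighbours
    # in sorted order, so only consecutive pairs need to be checked.
    return 0.5 * min(later - earlier for earlier, later in zip(times, times[1:]))


def shift_off_contact_time(a1, contacts):
    """Return a1 + epsilon_C if a1 coincides with a contact time, else a1 unchanged."""
    if a1 in contact_times(contacts):
        return a1 + epsilon_c(contacts)
    return a1


def inverse_temperature(contacts):
    """beta with 1/beta equal to the mean gap between consecutive contacts.

    Assumption: the mean is taken over all contacts sorted by time,
    so it equals (last time - first time) / (number of contacts - 1).
    """
    times = sorted(time for _, _, time in contacts)
    span = times[-1] - times[0] if times else 0.0
    if span <= 0:
        raise ValueError("contacts must span a positive time interval")
    return (len(times) - 1) / span


# Toy DCN (illustrative data only, not from the paper).
toy_contacts = [("a", "b", 0.0), ("b", "c", 1.0), ("a", "c", 1.5), ("c", "d", 4.0)]
eps = epsilon_c(toy_contacts)                           # smallest gap is 0.5
shifted = shift_off_contact_time(1.0, toy_contacts)     # 1.0 is a contact time
untouched = shift_off_contact_time(2.0, toy_contacts)   # 2.0 is not
beta = inverse_temperature(toy_contacts)
```

Because \(\varepsilon _\mathcal {C}\) is half the smallest gap, the shifted endpoint always lies strictly between two consecutive contact times and so can never coincide with another contact time.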
References
1. Brémaud, P.: Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues. Springer, Berlin (1999)
2. Chan, S.C., et al.: Expressiveness benchmarking for system-level provenance. In: TaPP (2017)
3. Grindrod, P., Higham, D.J.: A matrix iteration for dynamic network summaries. SIAM Rev. 55, 118 (2013)
4. Holme, P.: Modern temporal network theory: a colloquium. Eur. Phys. J. B 88, 234 (2015)
5. Huntsman, S.: Topological mixture estimation. In: ICML (2018)
6. Jenkinson, G., et al.: Applying provenance in APT monitoring and analysis. In: TaPP (2017)
7. King, S.T., Chen, P.M.: Backtracking intrusions. ACM Trans. Comput. Syst. 23, 51 (2005)
8. Lencastre, P., et al.: From empirical data to continuous Markov processes: a systematic approach. Phys. Rev. E 93, 032135 (2016)
9. Masuda, N., Lambiotte, R.: A Guide to Temporal Networks. World Scientific, Singapore (2016)
10. Meilă, M.: Comparing clusterings-an information based distance. J. Multivar. Anal. 98, 873 (2007)
11. Perra, N., et al.: Random walks and search in time-varying networks. Phys. Rev. Lett. 109, 238701 (2012)
12. Ramsey, N.F.: Thermodynamics and statistical mechanics at negative absolute temperatures. Phys. Rev. 103, 20 (1956)
13. Rocha, L.E.C., Masuda, N.: Random walk centrality for temporal networks. New J. Phys. 16, 063023 (2014)
14. Saramäki, J., Holme, P.: Exploring temporal networks with greedy walks. Eur. Phys. J. B 88, 334 (2015)
15. Ser-Giacomi, E., et al.: Most probable paths in temporal weighted networks: an application to ocean transport. Phys. Rev. E 92, 012818 (2015)
16. Starnini, M., et al.: Random walks on temporal networks. Phys. Rev. E 85, 056115 (2012)
Acknowledgements
The author thanks Yingbo Song, Rob Ross, and Mike Weber for many helpful discussions as well as creating the summary and ground truth data used in Sect. 4, and George Cybenko for still more helpful discussions. This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA or AFRL.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Huntsman, S. (2019). A Markov Model for Inferring Flows in Directed Contact Networks. In: Aiello, L., Cherifi, C., Cherifi, H., Lambiotte, R., Lió, P., Rocha, L. (eds) Complex Networks and Their Applications VII. COMPLEX NETWORKS 2018. Studies in Computational Intelligence, vol 812. Springer, Cham. https://doi.org/10.1007/978-3-030-05411-3_35
DOI: https://doi.org/10.1007/978-3-030-05411-3_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05410-6
Online ISBN: 978-3-030-05411-3
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)