A Markov Model for Inferring Flows in Directed Contact Networks

  • Conference paper

Part of the book series: Studies in Computational Intelligence (SCI, volume 812)

Abstract

Directed contact networks (DCNs) are a particularly flexible and convenient class of temporal networks, useful for modeling and analyzing the transfer of discrete quantities in communications, transportation, etc. Transfers modeled by contacts typically underlie flows that associate multiple contacts based on their spatiotemporal relationships. To infer these flows, we introduce a simple inhomogeneous Markov model associated to a DCN and show how it can be effectively used for data reduction and anomaly detection through an example of kernel-level information transfers within a computer.

Notes

  1. A useful analogy is of a flight departing from s at \(\tau _0\) and arriving at t at \(\tau _1\): the contacts \((s,*,\tau _0)\) and \((*,t,\tau _1)\) respectively correspond to embarking and debarking. This analogy also highlights that alternative representations could include additional contacts \((s,*,\tau _*)\) with \(\tau _0 \le \tau _* < \tau _1\), depending on the desired behavior.

  2. In fact the EPG is a directed graph with vertices bipartitioned into files and events; arcs indicating a subject or object go from events to files. However, there is an obvious bijective correspondence between this and our moral characterization.

  3. If \(a_1 \in \tau (\mathcal {C})\), we can simply consider instead \(a'_1 = a_1+\varepsilon _\mathcal {C}\), where \(\varepsilon _\mathcal {C} := \frac{1}{2} \min _{t,t' \in \tau (\mathcal {C}), t \ne t'} |t - t'|\). Note that here we assume without loss of generality that \(|\tau (\mathcal {C})| > 1\), i.e., that \(\mathcal {C}\) is nontrivial as a DCN (versus, e.g., a digraph).

  4. Bearing the concept of negative absolute temperature [12] in mind, we note that \(\beta = -\infty \) corresponds to “absolute hot”, and \(\beta = \infty \) corresponds to absolute zero. We follow a natural convention (and it is nothing more) for our model, in which lower temperatures correspond to slower dynamics: thus \(\beta \uparrow \infty \) and \(\beta \downarrow -\infty \) are respectively the limits in which no temporal and spatial arcs are traversed. In practice, we follow a physical analogy and set \(\beta ^{-1}\) to the average time between contacts.

  5. If there are (say) contacts of the form \((v,w,\tau _*)\) and \((w,v,\tau _*)\) with \(\tau _* = \tau ^{@v}_{|\mathcal {C}@v|-2} = \tau ^{@w}_{|\mathcal {C}@w|-2}\), then (5) entails that the probability of a transition from \((v,\tau _*)\) to \((v,a_m)\) or from \((w,\tau _*)\) to \((w,a_m)\) would be exponentially small were it not for the \(\varepsilon \) term. While in principle this is not an issue, in numerical practice this leads to floating-point underflow. Taking \(\varepsilon > 0\) avoids this problem without significant side effects.

  6. The \(\Delta \tau \) dependence is necessary and, in the context of information flows, is a plausible approximant (for small values) to the conditional Kolmogorov complexity of the intervening computation.

  7. For a weighted DCN, an easy and consistent generalization is obtained by normalizing so that the sum of outbound weights equals either \(d^+\) or zero as appropriate, and replacing the first case in (5) with the corresponding normalized weight.

  8. In many cases either the source or the target of a ground truth event does not exist. For example, the userspace commands hostname and put /tmp/netrecon correspond to the (process name, filename) pairs \((\texttt {hostname},\varnothing )\) and \((\varnothing ,\texttt {/tmp/netrecon})\), respectively. By way of comparison, the command rm -f /tmp/netrecon.log corresponds to the pair \((\texttt {rm},\texttt {/tmp/netrecon.log})\).
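
The quantities in notes 3 and 4 can be computed directly from a list of contact times. The following is a minimal sketch, not the author's implementation: the function names are hypothetical, contact times are assumed to be plain numbers, and "average time between contacts" is read here as the mean gap between successive contacts in time order, which is only one possible interpretation.

```python
from statistics import mean


def half_min_gap(times):
    """epsilon_C from note 3: half the smallest gap between distinct contact times."""
    distinct = sorted(set(times))
    if len(distinct) < 2:
        # Note 3 assumes |tau(C)| > 1, i.e. at least two distinct times.
        raise ValueError("need at least two distinct contact times")
    # The minimum over all pairs of distinct times is attained by neighbors
    # in sorted order, so only adjacent differences need to be checked.
    return 0.5 * min(b - a for a, b in zip(distinct, distinct[1:]))


def shift_endpoint(a1, times):
    """Replace a1 by a1 + epsilon_C when a1 coincides with a contact time."""
    return a1 + half_min_gap(times) if a1 in set(times) else a1


def inverse_temperature(times):
    """beta with 1/beta equal to the mean gap between successive contacts.

    Assumption: simultaneous contacts contribute gaps of zero to the mean.
    """
    ordered = sorted(times)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return 1.0 / mean(gaps)


if __name__ == "__main__":
    # Hypothetical contact times; two contacts share the time 1.0.
    contact_times = [0.0, 1.0, 1.0, 4.0]
    print(half_min_gap(contact_times))
    print(shift_endpoint(0.0, contact_times))
    print(inverse_temperature(contact_times))
```

Because the shift is strictly smaller than any gap between distinct contact times, the shifted endpoint cannot coincide with another contact time.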

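The underflow issue described in note 5 is easy to reproduce in double precision. The sketch below is illustrative only: equation (5) is not reproduced on this page, so the weights are taken to be generic terms of the form exp(-x), and the additive placement of the floor `eps` is an assumption rather than the paper's exact formula. The function name `normalized_row` is hypothetical.

```python
import math


def normalized_row(exponents, eps=0.0):
    """Normalize weights exp(-x) + eps into a probability row.

    Assumption: eps enters additively; the paper's equation (5) may differ.
    """
    weights = [math.exp(-x) + eps for x in exponents]
    total = sum(weights)
    if total == 0.0:
        # Every exp(-x) underflowed to exactly 0.0, so the row cannot be normalized.
        raise ZeroDivisionError("every weight underflowed to zero")
    return [w / total for w in weights]


if __name__ == "__main__":
    large = [800.0, 900.0]  # exp(-800) is below the smallest positive double
    try:
        normalized_row(large)
    except ZeroDivisionError as err:
        print("without eps:", err)
    print("with eps:", normalized_row(large, eps=1e-12))
```

With a small positive floor the row stays normalizable, at the cost of perturbing each weight by at most `eps`.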
References

  1. Brémaud, P.: Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues. Springer, Berlin (1999)

  2. Chan, S.C., et al.: Expressiveness benchmarking for system-level provenance. In: TaPP (2017)

  3. Grindrod, P., Higham, D.J.: A matrix iteration for dynamic network summaries. SIAM Rev. 55, 118 (2013)

  4. Holme, P.: Modern temporal network theory: a colloquium. Eur. Phys. J. B 88, 234 (2015)

  5. Huntsman, S.: Topological mixture estimation. In: ICML (2018)

  6. Jenkinson, G., et al.: Applying provenance in APT monitoring and analysis. In: TaPP (2017)

  7. King, S.T., Chen, P.M.: Backtracking intrusions. ACM Trans. Comput. Syst. 23, 51 (2005)

  8. Lencastre, P., et al.: From empirical data to continuous Markov processes: a systematic approach. Phys. Rev. E 93, 032135 (2016)

  9. Masuda, N., Lambiotte, R.: A Guide to Temporal Networks. World Scientific, Singapore (2016)

  10. Meilă, M.: Comparing clusterings-an information based distance. J. Multivar. Anal. 98, 873 (2007)

  11. Perra, N., et al.: Random walks and search in time-varying networks. Phys. Rev. Lett. 109, 238701 (2012)

  12. Ramsey, N.F.: Thermodynamics and statistical mechanics at negative absolute temperatures. Phys. Rev. 103, 20 (1956)

  13. Rocha, L.E.C., Masuda, N.: Random walk centrality for temporal networks. New J. Phys. 16, 063023 (2014)

  14. Saramäki, J., Holme, P.: Exploring temporal networks with greedy walks. Eur. Phys. J. B 88, 334 (2015)

  15. Ser-Giacomi, E., et al.: Most probable paths in temporal weighted networks: an application to ocean transport. Phys. Rev. E 92, 012818 (2015)

  16. Starnini, M., et al.: Random walks on temporal networks. Phys. Rev. E 85, 056115 (2012)

Acknowledgements

The author thanks Yingbo Song, Rob Ross, and Mike Weber for many helpful discussions as well as creating the summary and ground truth data used in Sect. 4, and George Cybenko for still more helpful discussions. This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA or AFRL.

Author information

Corresponding author

Correspondence to Steve Huntsman.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Huntsman, S. (2019). A Markov Model for Inferring Flows in Directed Contact Networks. In: Aiello, L., Cherifi, C., Cherifi, H., Lambiotte, R., Lió, P., Rocha, L. (eds) Complex Networks and Their Applications VII. COMPLEX NETWORKS 2018. Studies in Computational Intelligence, vol 812. Springer, Cham. https://doi.org/10.1007/978-3-030-05411-3_35
