Abstract
Manifest data is a log of container shipments from foreign lading ports to U.S. unlading ports. We provide several time-varying, network-based representations of this data in order to extract its most “discrepant” port pairs and contents patterns. We treat each time-varying network representation as a combinatorial set system and use its discrepancy and firing rate (Abello et al. 2010; Chazelle 2000) as the main statistics to track the most “salient” network elements. The output of the entire process is a “fossil” sub-network that encodes those port pairs and contents that exhibit unusual time-varying patterns. We expect substantial deviations from these patterns to be useful triggers for further content inspections. The applicability of the proposed techniques is not limited to manifest data.
Notes
1. See Sect. 5.3.
References
Abello J, Eliassi-Rad T, Devanur N (2010) Detecting novel discrepancies in communications networks. In: International conference on data mining, ICDM Dec 2010, Sydney, Australia, pp 8–17
ASFOI.txt (2009) Department of Homeland Security, DyDan Center, DIMACS, Rutgers University
Bastian M, Heymann S, Jacomy M (2009) Gephi: an open source software for exploring and manipulating networks. In: Proceedings of the third international conference on weblogs and social media, May 2009
Chazelle B (2000) The discrepancy method: randomness and complexity. Cambridge University Press, New York
Lin J (1991) Divergence measures based on the Shannon entropy. IEEE Trans Inform Theory 37(1):145–151
Acknowledgments
We thank Tsvetan Asamov, Jerry Chen, and Nishchal Devanur for collaborations at the initial stages of this project. We acknowledge the support provided by DHS Agreement Number 2008-DN-077-ARSI012-04, DIMACS Special Focus on Algorithmic Foundations of the Internet, NSF Grant #CNS-0721113, and mgvis.com.
Appendices
Appendix 1: Sample Records of Original Data
5.1.1 Human Readable version
5.1.2 Corresponding Original Text Data
Appendix 2: Sample Fossil Information as Seen in Figs. 5.6 and 5.7
Recall that an edge is selected to be in the i-fossil at time t if its weight is at least i standard deviations away from the overall data discrepancy at that time. The following are samples of edges in the i-fossil, where i is the maximum number of standard deviations at time t. Fossil frequency is denoted FOS_FREQ in the tables below; its calculation ignores timestamp 1.
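The selection rule above can be illustrated with a minimal Python sketch. It is not the chapter's implementation. It assumes that each timestamp comes with a dictionary of edge weights keyed by (lading port, unlading port), that the overall data discrepancy at that time is supplied as a single number (here called `center`), and that the standard deviation is the population standard deviation of the edge weights at that timestamp. The function names, the sample ports, and the sample weights are all hypothetical.

```python
import math
from collections import Counter
from statistics import pstdev


def i_fossil(weights, center, i):
    """Edges whose weight is at least i standard deviations from `center`.

    `weights` maps an edge (port pair) to its weight at one timestamp.
    `center` stands in for the overall data discrepancy (an assumption).
    """
    sigma = pstdev(list(weights.values()))
    if sigma == 0:
        return set()
    return {e for e, w in weights.items() if abs(w - center) >= i * sigma}


def max_level(weights, center):
    """Largest whole number of standard deviations reached by any edge."""
    sigma = pstdev(list(weights.values()))
    if sigma == 0:
        return 0
    return math.floor(max(abs(w - center) for w in weights.values()) / sigma)


def fossil_frequency(snapshots, centers, skip=(1,)):
    """Count, per edge, the timestamps at which it lies in the i-fossil,
    with i taken as the maximum level at that timestamp.
    Timestamps listed in `skip` are ignored (timestamp 1 by default)."""
    freq = Counter()
    for t, weights in snapshots.items():
        if t in skip:
            continue
        i = max_level(weights, centers[t])
        if i >= 1:
            freq.update(i_fossil(weights, centers[t], i))
    return freq


# Hypothetical data: three timestamps, five port pairs.
edges = [("A", "X"), ("B", "Y"), ("C", "Z"), ("D", "W"), ("E", "V")]
snapshots = {
    1: dict(zip(edges, [100, 1, 1, 1, 1])),
    2: dict(zip(edges, [1, 2, 3, 4, 20])),
    3: dict(zip(edges, [3, 1, 4, 2, 25])),
}
# Assumption: the mean edge weight is used as the overall discrepancy.
centers = {t: sum(w.values()) / len(w) for t, w in snapshots.items()}
freq = fossil_frequency(snapshots, centers)
```

In this toy data the edge ("E", "V") is the only one selected at timestamps 2 and 3, and the large weight on ("A", "X") at timestamp 1 does not count because that timestamp is skipped.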
5.1.1 Ascend Descend Coloring Fossil
5.1.2 Random Coloring Fossil
Appendix 3: Salient Edges Selected by Both Coloring Schemes
The following are examples of edges that are included in the fossil with respect to both random coloring and ascend descend coloring.
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Abello, J., Chen, M., Parikh, N. (2013). Time Discrepant Shipments in Manifest Data. In: Herrmann, J. (eds) Handbook of Operations Research for Homeland Security. International Series in Operations Research & Management Science, vol 183. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-5278-2_5
DOI: https://doi.org/10.1007/978-1-4614-5278-2_5
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-5277-5
Online ISBN: 978-1-4614-5278-2
eBook Packages: Business and Economics, Business and Management (R0)