Abstract
In recent decades, digital forensics has become a prominent activity in modern investigations. An important source of evidence is often the information stored on the devices under examination. Because this kind of inquiry is complex, the digital tools that support it are a central concern. This paper introduces a clustering-based text mining technique for investigative purposes. The proposed methodology is applied experimentally to the publicly available Enron dataset, which fits a plausible forensic analysis context well.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Decherchi, S., Tacconi, S., Redi, J., Leoncini, A., Sangiacomo, F., Zunino, R. (2009). Text Clustering for Digital Forensics Analysis. In: Herrero, Á., Gastaldo, P., Zunino, R., Corchado, E. (eds) Computational Intelligence in Security for Information Systems. Advances in Intelligent and Soft Computing, vol 63. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04091-7_4
DOI: https://doi.org/10.1007/978-3-642-04091-7_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04090-0
Online ISBN: 978-3-642-04091-7
eBook Packages: Engineering, Engineering (R0)