Abstract
User interest profiles are of great importance for security monitoring and forensic investigation. Once a specific topic becomes sensitive or suspected, being able to quickly determine who has shown an interest in that topic helps investigators focus their attention within massive data and develop effective investigation strategies. To automatically generate user interest profiles, we extend the Author-Topic model to explicitly model a user’s dynamic interests based on the text posted by that user. Our model is able to monitor the evolution of user interest from time-stamped documents. Moreover, instead of modeling a topic as a multinomial distribution over words, we develop a model that can discover and output multi-word phrases to describe topics, which facilitates human interpretation of unorganized text. Our technique therefore has the potential to reduce the cost of investigation and to discover latent evidence that is often missed by expression-based searches. We evaluate the effectiveness and performance of our algorithm on Enron, a real-life forensic dataset. The experimental results demonstrate that our algorithm can effectively discover users’ dynamic interests. The generated user interest profiles can further help investigators discover latent evidence in textual forensic data and perform security monitoring.
Acknowledgements
This work is supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDA06030200).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Yang, M., Xu, F., Chow, KP. (2016). Interest Profiling for Security Monitoring and Forensic Investigation. In: Liu, J., Steinfeld, R. (eds) Information Security and Privacy. ACISP 2016. Lecture Notes in Computer Science, vol 9723. Springer, Cham. https://doi.org/10.1007/978-3-319-40367-0_30
DOI: https://doi.org/10.1007/978-3-319-40367-0_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-40366-3
Online ISBN: 978-3-319-40367-0
eBook Packages: Computer Science (R0)