Abstract
Tracking users’ activities on the World Wide Web (WWW) allows researchers to analyze each user’s internet behavior over time and the amount of time spent on a particular domain. This analysis can inform research design, as researchers may gain access to their participants’ behaviors while browsing the web. Web search behavior has been a subject of interest because of its real-world applications in marketing, digital advertising, and identifying potential threats online. In this paper, we present an image-processing-based method to extract the domains that a participant visits across multiple browsers during a lab session. This method could provide another way to collect users’ activities during an online session, provided that a session recorder captured the data. The method can also be used to collect the textual content of web pages that an individual visits for later analysis.
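To make the domain and time-on-domain analysis concrete, the following is a minimal sketch of the post-processing step only. It assumes that an OCR engine has already produced the address-bar text for each sampled video frame; the input format, the function names `extract_domain` and `time_per_domain`, and the rule of attributing each interval to the earlier frame are all illustrative assumptions, not the method described in the paper.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical input: one (timestamp_in_seconds, ocr_text) pair per sampled
# frame, where ocr_text is the OCR output for the browser address bar.

def extract_domain(ocr_text):
    """Return the lower-cased host found in noisy address-bar text, or None."""
    # OCR often inserts stray spaces, so remove all whitespace-like gaps.
    text = ocr_text.strip().replace(" ", "")
    if not text:
        return None
    # Address bars frequently hide the scheme; add one so urlsplit finds the host.
    if "://" not in text:
        text = "http://" + text
    try:
        host = urlsplit(text).hostname
    except ValueError:
        return None
    # Require at least one dot so plain words are not mistaken for domains.
    if not host or not re.fullmatch(r"[a-z0-9-]+(\.[a-z0-9-]+)+", host):
        return None
    return host[4:] if host.startswith("www.") else host


def time_per_domain(frames):
    """Sum the seconds between consecutive frames, credited to the earlier frame's domain."""
    totals = defaultdict(float)
    ordered = sorted(frames, key=lambda frame: frame[0])
    for (t0, text), (t1, _) in zip(ordered, ordered[1:]):
        domain = extract_domain(text)
        if domain is not None:
            totals[domain] += t1 - t0
    return dict(totals)


# Illustrative data, not taken from the paper.
frames = [
    (0.0, "https://www.example.com/search?q=ocr"),
    (2.0, "www.example.com/page"),
    (4.0, "news.example.org"),
    (7.0, ""),
    (8.0, "news.example.org/a"),
]
usage = time_per_domain(frames)
```

In this sketch, frames whose text cannot be parsed as a domain contribute no time, which is one simple way to tolerate OCR failures; a real pipeline might instead carry the last valid domain forward.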
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Heidarysafa, M., Reed, J., Kowsari, K., Leviton, A.C.R., Warren, J.I., Brown, D.E. (2020). From Videos to URLs: A Multi-Browser Guide to Extract User’s Behavior with Optical Character Recognition. In: Arai, K., Kapoor, S. (eds) Advances in Computer Vision. CVC 2019. Advances in Intelligent Systems and Computing, vol 943. Springer, Cham. https://doi.org/10.1007/978-3-030-17795-9_37
DOI: https://doi.org/10.1007/978-3-030-17795-9_37
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-17794-2
Online ISBN: 978-3-030-17795-9
eBook Packages: Intelligent Technologies and Robotics (R0)