Skip to main content

From Videos to URLs: A Multi-Browser Guide to Extract User’s Behavior with Optical Character Recognition

  • Conference paper
  • First Online:
Advances in Computer Vision (CVC 2019)

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 943))

Included in the following conference series:

  • 2626 Accesses

Abstract

Tracking users’ activities on the World Wide Web (WWW) allows researchers to analyze each user’s internet behavior as time passes and for the amount of time spent on a particular domain. This analysis can be used in research design, as researchers may access to their participant’s behaviors while browsing the web. Web search behavior has been a subject of interest because of its real-world applications in marketing, digital advertisement, and identifying potential threats online. In this paper, we present an image-processing based method to extract domains which are visited by a participant over multiple browsers during a lab session. This method could provide another way to collect users’ activities during an online session given that the session recorder collected the data. The method can also be used to collect the textual content of web-pages that an individual visits for later analysis.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 169.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 219.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Agichtein, E., Brill, E., Dumais, S.: Improving web search ranking by incorporating user behavior information. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 19–26. ACM (2006)

    Google Scholar 

  2. Barve, S.: Optical character recognition using artificial neural network. Int. J. Adv. Res. Comput. Eng. Technol. (IJARCET) 1(4), 131 (2012)

    Google Scholar 

  3. Berchmans, D., Kumar, S.: Optical character recognition: an overview and an insight. In: 2014 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT), pp. 1361–1365. IEEE (2014)

    Google Scholar 

  4. Borisov, A., Markov, I., de Rijke, M., Serdyukov, P.: A context-aware time model for web search. In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 205–214. ACM (2016)

    Google Scholar 

  5. Bradski, G., Kaehler, A.: Learning OpenCV: Computer Vision with the OpenCV Library. O’Reilly Media, Inc., Sebastopol (2008)

    Google Scholar 

  6. Buades, A., Coll, B., Morel, J.M.: Image denoising methods. A new nonlocal principle. SIAM Rev. 52(1), 113–147 (2010)

    Article  MathSciNet  Google Scholar 

  7. Catledge, L.D., Pitkow, J.E.: Characterizing browsing behaviors on the world-wide web. Technical report, Georgia Institute of Technology (1995)

    Google Scholar 

  8. Chandarana, J., Kapadia, M.: Optical character recognition. Int. J. Emerg. Technol. Adv. Eng. 4(5), 219–223 (2014)

    Google Scholar 

  9. Hölscher, C., Strube, G.: Web search behavior of internet experts and newbies. Comput. Netw. 33(1–6), 337–346 (2000)

    Article  Google Scholar 

  10. Hsieh-Yee, I.: Research on web search behavior. Libr. Inf. Sci. Res. 23(2), 167–185 (2001)

    Article  Google Scholar 

  11. Kumar, G., Bhatia, P.K.: A detailed review of feature extraction in image processing systems. In: 2014 Fourth International Conference on Advanced Computing and Communication Technologies (ACCT), pp. 5–12. IEEE (2014)

    Google Scholar 

  12. Lowe, D.: CPSC 425: Computer Vision (January–April 2007) (2007)

    Google Scholar 

  13. Mori, S., Nishida, H., Yamada, H.: Optical Character Recognition. Wiley, New York (1999)

    Google Scholar 

  14. Patel, C., Patel, A., Patel, D.: Optical character recognition by open source OCR tool Tesseract: a case study. Int. J. Comput. Appl. 55(10), 50–56 (2012)

    Google Scholar 

  15. Rose, D.E., Levinson, D.: Understanding user goals in web search. In: Proceedings of the 13th International Conference on World Wide Web, pp. 13–19. ACM (2004)

    Google Scholar 

  16. Shao, L., Yan, R., Li, X., Liu, Y.: From heuristic optimization to dictionary learning: a review and comprehensive comparison of image denoising algorithms. IEEE Trans. Cybern. 44(7), 1001–1013 (2014)

    Article  Google Scholar 

  17. Smith, R.: An overview of the Tesseract OCR engine. In: 2007 Ninth International Conference on Document Analysis and Recognition, ICDAR 2007, vol. 2, pp. 629–633. IEEE (2007)

    Google Scholar 

  18. Spalevic, Z., Ilic, M.: The use of dark web for the purpose of illegal activity spreading. Ekonomika 63(1), 73–82 (2017)

    Article  Google Scholar 

  19. Srivastava, J., Cooley, R., Deshpande, M., Tan, P.N.: Web usage mining: discovery and applications of usage patterns from web data. ACM SIGKDD Explor. Newsl. 1(2), 12–23 (2000)

    Article  Google Scholar 

  20. Xue, Y.: Optical character recognition. Department of Biomedical Engineering, University of Michigan (2014)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Mojtaba Heidarysafa .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Heidarysafa, M., Reed, J., Kowsari, K., Leviton, A.C.R., Warren, J.I., Brown, D.E. (2020). From Videos to URLs: A Multi-Browser Guide to Extract User’s Behavior with Optical Character Recognition. In: Arai, K., Kapoor, S. (eds) Advances in Computer Vision. CVC 2019. Advances in Intelligent Systems and Computing, vol 943. Springer, Cham. https://doi.org/10.1007/978-3-030-17795-9_37

Download citation

Publish with us

Policies and ethics