Tangent-V: Math Formula Image Search Using Line-of-Sight Graphs

  • Kenny Davila (Email author)
  • Ritvik Joshi
  • Srirangaraj Setlur
  • Venu Govindaraju
  • Richard Zanibbi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11437)

Abstract

We present a visual search engine for graphics such as math, chemical diagrams, and figures. Graphics are represented using Line-of-Sight (LOS) graphs, with symbols connected only when they can ‘see’ each other along an unobstructed line. Symbol identities may be provided (e.g., in PDF) or taken from Optical Character Recognition applied to images. Graphics are indexed by pairs of symbols that ‘see’ each other using their labels, spatial displacement, and size ratio. Retrieval has two layers: the first matches query symbol pairs in an inverted index, while the second aligns candidates with the query and scores the resulting matches using the identity and relative position of symbols. For PDFs, we also introduce a new tool that quickly extracts characters and their locations. We have applied our model to the NTCIR-12 Wikipedia Formula Browsing Task, and found that the method can locate relevant matches without unification of symbols or using a math expression grammar. In the future, one might index LOS graphs for entire pages and search for text and graphics. Our source code has been made publicly available.
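The pair-based indexing and first retrieval layer described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `Symbol` type, the sampled center-to-center blocking test, the diagonal-based size measure, and the thresholds `min_cos` and `max_ratio` are all assumptions made for illustration. Bounding boxes are assumed non-degenerate, and the second layer (aligning candidates with the query and re-scoring them) is omitted.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Symbol(NamedTuple):
    """Hypothetical symbol record: a label plus a bounding box (y grows downward)."""
    label: str
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def center(self):
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)

    @property
    def size(self):
        # Bounding-box diagonal as a simple size measure (an assumption).
        return ((self.x1 - self.x0) ** 2 + (self.y1 - self.y0) ** 2) ** 0.5


def segment_hits_box(p, q, box, steps=50):
    """Crude blocking test: sample points along p->q and check box containment."""
    for i in range(1, steps):
        t = i / steps
        x = p[0] + t * (q[0] - p[0])
        y = p[1] + t * (q[1] - p[1])
        if box.x0 <= x <= box.x1 and box.y0 <= y <= box.y1:
            return True
    return False


def los_pairs(symbols):
    """Yield pairs whose center-to-center segment is not blocked by a third symbol."""
    for a, b in combinations(symbols, 2):
        others = (s for s in symbols if s is not a and s is not b)
        if not any(segment_hits_box(a.center, b.center, s) for s in others):
            yield a, b


def pair_entry(a, b):
    """Return (key, features): labels, unit displacement, and size ratio."""
    # Order the pair deterministically by center (left-to-right, then top-to-bottom).
    a, b = sorted((a, b), key=lambda s: s.center)
    dx, dy = b.center[0] - a.center[0], b.center[1] - a.center[1]
    dist = (dx * dx + dy * dy) ** 0.5 or 1.0
    return (a.label, b.label), (dx / dist, dy / dist, a.size / b.size)


def build_index(docs):
    """Inverted index: (label, label) -> list of (doc_id, pair features)."""
    index = defaultdict(list)
    for doc_id, symbols in docs.items():
        for a, b in los_pairs(symbols):
            key, feats = pair_entry(a, b)
            index[key].append((doc_id, feats))
    return index


def search(index, query_symbols, min_cos=0.9, max_ratio=1.5):
    """First-layer retrieval only: count query pairs matched per document."""
    scores = defaultdict(int)
    for a, b in los_pairs(query_symbols):
        key, (qx, qy, qr) = pair_entry(a, b)
        matched = set()
        for doc_id, (dx, dy, dr) in index.get(key, []):
            same_direction = qx * dx + qy * dy >= min_cos
            similar_ratio = max(qr, dr) / min(qr, dr) <= max_ratio
            if same_direction and similar_ratio:
                matched.add(doc_id)
        for doc_id in matched:
            scores[doc_id] += 1
    return sorted(scores.items(), key=lambda kv: -kv[1])


# Toy collection: "x" with a superscript 2 versus "x" with a subscript 2.
docs = {
    "x2_sup": [Symbol("x", 0, 10, 10, 20), Symbol("2", 11, 4, 16, 11)],
    "x2_sub": [Symbol("x", 0, 10, 10, 20), Symbol("2", 11, 18, 16, 25)],
}
index = build_index(docs)

# Query: the superscript layout drawn at twice the scale.
query = [Symbol("x", 0, 20, 20, 40), Symbol("2", 22, 8, 32, 22)]
results = search(index, query)
print(results)
```

Because the features are a unit displacement vector and a size ratio, the scaled query still matches the superscript formula, while the subscript formula is rejected on direction even though it shares the same symbol labels.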

Keywords

Graphics search; Mathematical Information Retrieval (MIR); Image search; PDF symbol extraction

Notes

Acknowledgements

We are grateful to Chris Bondy for his help with designing SymbolScraper. This material is based upon work supported by the National Science Foundation (USA) under Grant Nos. HCC-1218801, III-1717997, and 1640867 (OAC/DMR).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Kenny Davila (1), Email author
  • Ritvik Joshi (2)
  • Srirangaraj Setlur (1)
  • Venu Govindaraju (1)
  • Richard Zanibbi (2)
  1. University at Buffalo, Buffalo, USA
  2. Rochester Institute of Technology, Rochester, USA