A Structure Based Approach for Mathematical Expression Retrieval

  • Conference paper
Multi-disciplinary Trends in Artificial Intelligence (MIWAI 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7694)

Abstract

The mathematical expression (ME) retrieval problem has recently received much attention owing to the widespread availability of MEs on the World Wide Web. Because MEs are two-dimensional in nature, the traditional text retrieval techniques used in natural language processing are not sufficient for retrieving them. In this paper, we propose a novel structure-based approach to the ME retrieval problem. In our approach, a query given in LaTeX format is preprocessed to eliminate extraneous keywords (such as \displaystyle and \begin{array}) while retaining structure information such as superscript and subscript relationships. MEs in the database are preprocessed and stored in the same manner. We have created a database of 829 MEs in LaTeX form that covers various branches of mathematics, including algebra, trigonometry, and calculus. The preprocessed query is matched against the database of preprocessed MEs using the Longest Common Subsequence (LCS) algorithm. LCS is used because it preserves the order of keywords in the preprocessed MEs, unlike the bag-of-words approach of traditional text retrieval techniques. We have incorporated structure information into the LCS algorithm and propose a measure, based on the modified algorithm, for ranking the MEs in the database. Because the proposed approach exploits structure information, it is closer to human intuition. Retrieval performance has been evaluated using the standard precision measure.
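
The pipeline described in the abstract (preprocess, match with LCS, rank) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses the textbook LCS rather than their structure-aware modification, and the names `EXTRANEOUS`, `preprocess`, `lcs_length`, `similarity`, and `rank` are hypothetical. The keyword list beyond \displaystyle and \begin{array}, the tokenizer, and the normalized score are assumptions made only for illustration. Structure is kept here only in the simplest possible way, by leaving the `^`, `_`, `{`, and `}` tokens in the sequence.

```python
import re

# Assumed set of layout-only keywords; the abstract names only
# \displaystyle and \begin{array} as examples.
EXTRANEOUS = {r"\displaystyle", r"\left", r"\right", r"\,", r"\;", r"\!"}

# Hypothetical tokenizer: environment markers, control words,
# control symbols, then any other non-space character.
TOKEN_RE = re.compile(r"\\(?:begin|end)\{[a-zA-Z*]+\}|\\[a-zA-Z]+|\\.|[^\s]")


def preprocess(latex):
    """Drop extraneous keywords but keep ^, _, { and } as structure tokens."""
    tokens = TOKEN_RE.findall(latex)
    return [
        t for t in tokens
        if t not in EXTRANEOUS and not t.startswith((r"\begin{", r"\end{"))
    ]


def lcs_length(a, b):
    """Textbook dynamic-programming LCS length over two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(query, candidate):
    """Assumed score: 2 * LCS / (len(query) + len(candidate)), in [0, 1]."""
    q, c = preprocess(query), preprocess(candidate)
    if not q or not c:
        return 0.0
    return 2 * lcs_length(q, c) / (len(q) + len(c))


def rank(query, database):
    """Return database entries sorted by decreasing similarity to the query."""
    return sorted(database, key=lambda me: similarity(query, me), reverse=True)


if __name__ == "__main__":
    database = [r"\sin x", r"x_{2}+y_{2}", r"x^{2}+y^{2}"]
    for me in rank(r"\displaystyle x^{2}+y^{2}", database):
        print(me)
```

Because LCS compares ordered token sequences, two expressions with the same symbols in a different order (for example `a-b` and `b-a`) do not match fully, whereas a bag-of-words comparison would treat them as identical.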

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pavan Kumar, P., Agarwal, A., Bhagvati, C. (2012). A Structure Based Approach for Mathematical Expression Retrieval. In: Sombattheera, C., Loi, N.K., Wankar, R., Quan, T. (eds) Multi-disciplinary Trends in Artificial Intelligence. MIWAI 2012. Lecture Notes in Computer Science, vol 7694. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35455-7_3

  • DOI: https://doi.org/10.1007/978-3-642-35455-7_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35454-0

  • Online ISBN: 978-3-642-35455-7

  • eBook Packages: Computer Science, Computer Science (R0)
