Storing Data Once in M-tree and PM-tree

  • Conference paper
  • In: Similarity Search and Applications (SISAP 2019)

Abstract

Since the publication of the M-tree, several enhancements to its structure have been proposed. One of the most exciting is the use of additional global pivots, which resulted in the PM-tree. In this paper, we revisit both the M-tree and the PM-tree and propose a new construction algorithm that stores each data element only once in the tree hierarchy. The main challenge is selecting data elements when an inner node must be split. The idea is that, because a data element is evaluated for pruning during traversal, it can also become part of the result set, allowing nearest neighbor algorithms to converge faster. As shown in the experimental evaluation, the new insert and query algorithms enable faster retrieval, lower node occupation in trees built with the same parameters, and reduced overlap among nodes.
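
The idea of reusing the distance computed for pruning can be illustrated with a minimal sketch. This is not the construction or query algorithm of the paper: it assumes a toy ball tree in which every node holds exactly one data element (so each element is stored once) plus a covering radius, and the names `Node`, `build`, `offer`, and `knn` are hypothetical. During a best-first k-NN traversal, the distance from the query to a node's element is needed to decide whether the subtree can be pruned, and the same distance is offered to the result set immediately.

```python
import heapq
import itertools
import math
from dataclasses import dataclass, field


def dist(a, b):
    """Euclidean distance; any metric would do for this sketch."""
    return math.dist(a, b)


@dataclass
class Node:
    pivot: tuple                 # a data element, stored only in this node
    radius: float                # covers every element in the subtree
    children: list = field(default_factory=list)


def build(points, fanout=3):
    """Toy bulk build (assumption): the first point becomes the node's element,
    the rest are sorted by distance to it and split into up to `fanout` groups."""
    pivot = points[0]
    rest = sorted(points[1:], key=lambda p: dist(pivot, p))
    node = Node(pivot, dist(pivot, rest[-1]) if rest else 0.0)
    if rest:
        step = -(-len(rest) // fanout)  # ceiling division
        node.children = [build(rest[i:i + step], fanout)
                         for i in range(0, len(rest), step)]
    return node


def offer(result, k, d, p):
    """Keep the k closest candidates in a max-heap keyed by negated distance."""
    if len(result) < k:
        heapq.heappush(result, (-d, p))
    elif d < -result[0][0]:
        heapq.heapreplace(result, (-d, p))


def knn(root, q, k):
    result, tie = [], itertools.count()
    d = dist(q, root.pivot)
    offer(result, k, d, root.pivot)
    queue = [(max(d - root.radius, 0.0), next(tie), root)]
    while queue:
        lower, _, node = heapq.heappop(queue)
        if len(result) == k and lower > -result[0][0]:
            break  # no remaining subtree can improve the result
        for child in node.children:
            d = dist(q, child.pivot)           # computed for pruning anyway...
            offer(result, k, d, child.pivot)   # ...so reuse it as a candidate
            lower = max(d - child.radius, 0.0)  # triangle-inequality bound
            if len(result) < k or lower <= -result[0][0]:
                heapq.heappush(queue, (lower, next(tie), child))
    return sorted((-nd, p) for nd, p in result)
```

Because candidates enter the result set while inner nodes are still being examined, the k-th distance bound tightens before any leaf-level work, which is the convergence effect the abstract describes. How the paper chooses elements at node splits is not covered by this sketch.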

This work has been supported by CNPq (Brazilian National Council for Scientific and Technological Development), by CAPES (Brazilian Coordination for Improvement of Higher Level Personnel), by FAPEMIG (Minas Gerais State Research Foundation) and by PROPP/UFU.



Author information


Correspondence to Humberto Razente.


Copyright information

© 2019 Springer Nature Switzerland AG

Cite this paper

Razente, H., Nardini Barioni, M.C. (2019). Storing Data Once in M-tree and PM-tree. In: Amato, G., Gennaro, C., Oria, V., Radovanović, M. (eds) Similarity Search and Applications. SISAP 2019. Lecture Notes in Computer Science, vol 11807. Springer, Cham. https://doi.org/10.1007/978-3-030-32047-8_2

  • DOI: https://doi.org/10.1007/978-3-030-32047-8_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32046-1

  • Online ISBN: 978-3-030-32047-8

  • eBook Packages: Computer Science, Computer Science (R0)
