Abstract
Since the publication of the M-tree, several enhancements to its structure have been proposed. One of the most exciting is the use of additional global pivots, which resulted in the PM-tree. In this paper, we revisit both the M-tree and the PM-tree and propose a new construction algorithm that stores each data element only once in the tree hierarchy. The main challenge is selecting data elements when an inner node must be split. The idea is that a data element evaluated for pruning during traversal can itself become part of the result set, allowing faster convergence of nearest neighbor algorithms. As shown in the experimental evaluation, the new insert and query algorithms enable faster retrieval, decrease node occupation in trees built with the same parameters, and reduce the overlap among nodes.
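The query-side idea can be illustrated with a minimal in-memory sketch. This is not the construction or query algorithm of the paper: it assumes a simplified ball-partitioned tree in which every routing element is a data element stored only at its own node, and all names (`Node`, `knn`, `euclidean`, the example `root`) are hypothetical. The point it shows is that the distance computed to a routing element for pruning is reused to offer that element as a result candidate.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, ...]


def euclidean(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


@dataclass
class Node:
    pivot: Point                 # routing element; stored here only (assumption)
    radius: float = 0.0          # covers every element stored below this node
    children: List["Node"] = field(default_factory=list)
    bucket: List[Point] = field(default_factory=list)  # non-routing elements


def knn(root: Node, q: Point, k: int, dist=euclidean):
    best = []  # max-heap of (-distance, element), at most k entries

    def offer(d, p):
        if len(best) < k:
            heapq.heappush(best, (-d, p))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, p))

    def bound():
        # Current k-th nearest distance, or infinity while fewer than k found.
        return -best[0][0] if len(best) == k else float("inf")

    d_root = dist(q, root.pivot)
    offer(d_root, root.pivot)
    frontier = [(max(0.0, d_root - root.radius), 0, root)]
    tie = 1  # tie-breaker so heapq never compares Node objects
    while frontier:
        lower, _, node = heapq.heappop(frontier)
        if lower > bound():
            break  # no remaining subtree can improve the result
        for p in node.bucket:
            offer(dist(q, p), p)
        for child in node.children:
            d = dist(q, child.pivot)
            offer(d, child.pivot)  # pruning distance doubles as a candidate
            child_lower = max(0.0, d - child.radius)
            if child_lower <= bound():
                heapq.heappush(frontier, (child_lower, tie, child))
                tie += 1
    return sorted((-nd, p) for nd, p in best)


# Hand-built example tree: seven elements, each stored exactly once.
root = Node(pivot=(0.0, 0.0), radius=13.0, children=[
    Node(pivot=(1.0, 1.0), radius=1.0, bucket=[(1.0, 2.0), (2.0, 1.0)]),
    Node(pivot=(8.0, 8.0), radius=1.0, bucket=[(9.0, 8.0), (8.0, 9.0)]),
])
```

Calling `knn(root, (1.2, 1.1), 2)` returns the two nearest elements with their distances. Because routing elements enter the candidate heap as soon as their distance is known, the pruning bound can tighten before any bucket is read, which is the effect the abstract attributes to storing data once.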
This work has been supported by CNPq (Brazilian National Council for Scientific and Technological Development), by CAPES (Brazilian Coordination for Improvement of Higher Level Personnel), by FAPEMIG (Minas Gerais State Research Foundation) and by PROPP/UFU.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Razente, H., Nardini Barioni, M.C. (2019). Storing Data Once in M-tree and PM-tree. In: Amato, G., Gennaro, C., Oria, V., Radovanović, M. (eds) Similarity Search and Applications. SISAP 2019. Lecture Notes in Computer Science, vol 11807. Springer, Cham. https://doi.org/10.1007/978-3-030-32047-8_2
DOI: https://doi.org/10.1007/978-3-030-32047-8_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32046-1
Online ISBN: 978-3-030-32047-8
eBook Packages: Computer Science (R0)