Abstract
In this paper, we present cache-efficient algorithms for trie search. These algorithms have three key features. First, they use different data structures (partitioned arrays, B-trees, hash tables, vectors) to represent different nodes in a trie; the choice of data structure depends on cache characteristics as well as the fanout of the node. Second, they adapt to changes in the fanout at a node by dynamically switching the data structure used to represent the node. Third, the size and layout of each data structure are determined by the size of the symbols in the alphabet and the characteristics of the cache(s). We evaluate the performance of these algorithms on real and simulated memory hierarchies. Our evaluation indicates that these algorithms outperform alternatives that are otherwise efficient but do not take cache characteristics into consideration. A comparison of the number of instructions executed indicates that these algorithms derive their performance advantage primarily from making better use of the memory hierarchy.
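The per-node switching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only two of the representations named above (a small vector and a hash table), and the names `AdaptiveNode`, `AdaptiveTrie`, `CACHE_LINE_BYTES`, `ENTRY_BYTES`, and `SMALL_FANOUT_LIMIT`, as well as the switching rule itself, are assumptions made for illustration. Python does not expose memory layout, so the sketch shows only the adaptation logic, not actual cache behavior.

```python
# Illustrative sketch only; not the paper's implementation.
# The constants and the switching rule below are assumptions.

CACHE_LINE_BYTES = 64   # assumed cache line size
ENTRY_BYTES = 8         # assumed size of one (symbol, child pointer) entry
SMALL_FANOUT_LIMIT = CACHE_LINE_BYTES // ENTRY_BYTES  # entries per cache line


class AdaptiveNode:
    """Trie node that switches representation as its fanout grows."""

    def __init__(self):
        self.kind = "vector"   # "vector" for small fanout, "hash" for large
        self.symbols = []      # parallel lists used while kind == "vector"
        self.children = []
        self.table = None      # dict used while kind == "hash"
        self.is_key = False    # True if a stored key ends at this node

    def fanout(self):
        return len(self.symbols) if self.kind == "vector" else len(self.table)

    def find(self, symbol):
        if self.kind == "vector":
            # Linear scan over a short array sized to fit one cache line.
            for i, s in enumerate(self.symbols):
                if s == symbol:
                    return self.children[i]
            return None
        return self.table.get(symbol)

    def add(self, symbol):
        child = self.find(symbol)
        if child is not None:
            return child
        child = AdaptiveNode()
        if self.kind == "vector" and len(self.symbols) >= SMALL_FANOUT_LIMIT:
            # Fanout outgrew the small representation: switch to a hash table.
            self.table = dict(zip(self.symbols, self.children))
            self.symbols, self.children = [], []
            self.kind = "hash"
        if self.kind == "vector":
            self.symbols.append(symbol)
            self.children.append(child)
        else:
            self.table[symbol] = child
        return child


class AdaptiveTrie:
    def __init__(self):
        self.root = AdaptiveNode()

    def insert(self, key):
        node = self.root
        for symbol in key:
            node = node.add(symbol)
        node.is_key = True

    def contains(self, key):
        node = self.root
        for symbol in key:
            node = node.find(symbol)
            if node is None:
                return False
        return node.is_key
```

In this sketch a node starts as a short vector and is converted to a hash table once its fanout exceeds what is assumed to fit in one cache line. Switching back when fanout shrinks, and the partitioned-array and B-tree representations, are omitted for brevity.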
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Acharya, A., Zhu, H., Shen, K. (1999). Adaptive Algorithms for Cache-efficient Trie Search. In: Goodrich, M.T., McGeoch, C.C. (eds) Algorithm Engineering and Experimentation. ALENEX 1999. Lecture Notes in Computer Science, vol 1619. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48518-X_18
DOI: https://doi.org/10.1007/3-540-48518-X_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66227-3
Online ISBN: 978-3-540-48518-6