Browsing Digital Collections with Reconfigurable Faceted Thesauri

  • Joaquín Gayoso-Cabada
  • Daniel Rodríguez-Cerezo
  • José-Luis Sierra
Conference paper
Part of the Lecture Notes in Information Systems and Organisation book series (LNISO, volume 22)


Faceted thesauri group classification terms into hierarchically arranged facets. They enable faceted browsing, a well-known browsing technique that makes it possible to narrow down digital collections by recursively adding filtering terms from the facet hierarchy. In this paper we develop an approach to faceted browsing in live collections, in which not only the contents but also the thesauri can be constantly reorganized. For this purpose we start by introducing a model of digital collections based on faceted thesauri, in which users can freely rearrange the hierarchical organization of facets. Then we analyze how to react efficiently to thesaurus reconfigurations by representing all the possible ways of browsing a collection with a finite state machine called a navigation automaton. Since, in the worst case, the number of states in a navigation automaton can grow exponentially with the size of the collection, we propose two indexing strategies that avoid this exponential worst-case complexity: one based on inverted indexes, and another inspired by hierarchical clustering, which makes use of so-called navigation dendrograms. Experimental results obtained with Clavy, a system for managing digital collections with reconfigurable structures in digital humanities and educational settings, provide evidence that the navigation dendrogram organization outperforms the one based on inverted indexes.
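The browsing model summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy collection, the term names and the functions `build_inverted_index`, `narrow` and `reachable_states` are all hypothetical, the facet hierarchy is flattened into a plain set of terms, and "navigation automaton" is read naively as the set of distinct non-empty resource sets reachable by adding filtering terms.

```python
from collections import defaultdict

# Hypothetical toy collection: resource id -> set of thesaurus terms.
COLLECTION = {
    "r1": {"ceramic", "inca"},
    "r2": {"ceramic", "nazca"},
    "r3": {"textile", "inca"},
}


def build_inverted_index(collection):
    """Map each term to the set of resources annotated with it."""
    index = defaultdict(set)
    for resource, terms in collection.items():
        for term in terms:
            index[term].add(resource)
    return dict(index)


def narrow(index, current, term):
    """One browsing step: keep only the resources that also carry `term`."""
    return current & index.get(term, set())


def reachable_states(collection, index):
    """Enumerate the distinct non-empty resource sets reachable by
    repeatedly adding terms (a naive stand-in for automaton states)."""
    start = frozenset(collection)
    seen = {start}
    pending = [start]
    while pending:
        state = pending.pop()
        for term in index:
            nxt = frozenset(narrow(index, state, term))
            if nxt and nxt not in seen:
                seen.add(nxt)
                pending.append(nxt)
    return seen


INDEX = build_inverted_index(COLLECTION)
STATES = reachable_states(COLLECTION, INDEX)
```

With the inverted index, each narrowing step is a set intersection computed on demand, so no state has to be stored in advance. Enumerating `STATES` explicitly, by contrast, materializes every reachable selection, which is the quantity the abstract says can grow exponentially in the worst case.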


Faceted browsing, Faceted thesauri, Indexing, Reconfigurable collections



This work has been supported by the BBVA Foundation (research grant HUM14_251) and by the Spanish Ministry of Economy and Competitiveness (research grant TIN2014-52010-R). The Chasqui repository was created and is maintained by Prof. Mercedes Guinea (currently a researcher at the “El Caño” Foundation). The thesaurus used as an example in this work is adapted from Chasqui’s cataloguing schema. Chasqui’s original software infrastructure was developed by Alfredo Fernández-Valmayor (currently also at the “El Caño” Foundation).



Copyright information

© Springer International Publishing Switzerland 2017

Authors and Affiliations

  • Joaquín Gayoso-Cabada (1)
  • Daniel Rodríguez-Cerezo (1)
  • José-Luis Sierra (1)
  1. Complutense University of Madrid, Madrid, Spain
