Text Mining Scientific Papers: A Survey on FCA-Based Information Retrieval Research

  • Conference paper
Advances in Data Mining. Applications and Theoretical Aspects (ICDM 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7377)

Abstract

Formal Concept Analysis (FCA) is an unsupervised clustering technique, and many scientific papers are devoted to applying FCA in Information Retrieval (IR) research. We collected 103 papers published between 2003 and 2009 that mention FCA and information retrieval in the abstract, title, or keywords. Using a prototype of our FCA-based toolset CORDIET, we converted the PDF files containing the papers to plain text, indexed them with Lucene using a thesaurus of terms related to FCA research, and then created the concept lattice shown in this paper. We visualized, analyzed, and explored the literature with concept lattices and discovered multiple interesting research streams in IR, of which we give an extensive overview. The core contributions of this paper are the innovative application of FCA to the text mining of scientific papers and the survey of FCA-based IR research.
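
The pipeline described in the abstract (plain text, thesaurus-based indexing, concept lattice) can be illustrated with a small sketch. This is not CORDIET and does not use Lucene: the toy texts in `PAPERS`, the three-term `THESAURUS`, and all function names are hypothetical, and substring matching stands in for real indexing. The concepts are enumerated by brute force, which is only reasonable for toy data.

```python
from itertools import combinations

# Hypothetical toy thesaurus: term -> search phrases (not the authors' thesaurus).
THESAURUS = {
    "fca": ["formal concept analysis", "concept lattice"],
    "ir": ["information retrieval"],
    "web": ["web search", "web service"],
}

# Hypothetical stand-ins for plain text extracted from the papers.
PAPERS = {
    "p1": "Formal concept analysis for information retrieval on web search results.",
    "p2": "A concept lattice approach to information retrieval.",
    "p3": "Web service classification with formal concept analysis.",
}


def build_context(papers, thesaurus):
    """Formal context: each paper maps to the thesaurus terms found in its text."""
    context = {}
    for paper_id, text in papers.items():
        lowered = text.lower()
        context[paper_id] = frozenset(
            term
            for term, phrases in thesaurus.items()
            if any(phrase in lowered for phrase in phrases)
        )
    return context


def extent(context, attrs):
    """All papers that carry every term in attrs."""
    return frozenset(obj for obj, terms in context.items() if attrs <= terms)


def intent(context, objs, all_attrs):
    """All terms shared by every paper in objs (all terms if objs is empty)."""
    shared = frozenset(all_attrs)
    for obj in objs:
        shared &= context[obj]
    return shared


def formal_concepts(context, all_attrs):
    """Enumerate (extent, intent) pairs by closing every subset of terms."""
    concepts = set()
    attrs = sorted(all_attrs)
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            objs = extent(context, frozenset(subset))
            concepts.add((objs, intent(context, objs, all_attrs)))
    return concepts


context = build_context(PAPERS, THESAURUS)
concepts = formal_concepts(context, THESAURUS.keys())

# Print concepts from the most general (largest extent) to the most specific.
for objs, terms in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(objs), sorted(terms))
```

Each printed pair is a formal concept: a maximal group of papers together with the terms they all share. Ordering these pairs by extent inclusion yields the concept lattice; a real collection would call for a dedicated algorithm rather than subset enumeration.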

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Poelmans, J., Ignatov, D.I., Viaene, S., Dedene, G., Kuznetsov, S.O. (2012). Text Mining Scientific Papers: A Survey on FCA-Based Information Retrieval Research. In: Perner, P. (eds) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2012. Lecture Notes in Computer Science, vol 7377. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31488-9_22

  • DOI: https://doi.org/10.1007/978-3-642-31488-9_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31487-2

  • Online ISBN: 978-3-642-31488-9

  • eBook Packages: Computer Science, Computer Science (R0)
