Abstract

This paper presents an exploration engine for text mining and cross-context link discovery, implemented as a web application with a user-friendly interface. The system supports experts in advanced document exploration by facilitating document retrieval, analysis and visualization. Documents can be retrieved from public databases such as PubMed or by querying the web, and are then cleaned and filtered according to several criteria. Document analysis covers the presentation of documents in terms of statistical and similarity-based properties, as well as topic ontology construction through document clustering. The distinguishing feature of the system is its cross-context and cross-domain document exploration facility, which relies on bridging term discovery to find potential cross-domain linking terms. Term ranking based on the developed ensemble heuristic lets the expert focus on the terms with the greatest potential for cross-context link discovery. Additionally, the system helps the expert find relevant documents and terms by providing customizable document visualization, a color-based domain separation scheme and highlighting of top-ranked bisociative terms.
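
The idea of ranking candidate bridging terms with an ensemble heuristic can be illustrated with a minimal sketch. The abstract does not specify the heuristics or the voting scheme, so everything below is an assumption made for illustration: the function names, the stop-word list, the three base heuristics and the rule that a term receives one vote from each heuristic that places it in the top fraction of its ranking. It is not the system's actual implementation.

```python
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "in", "of", "the"}


def tokenize(text):
    """Lowercase a document, keep alphabetic tokens, drop stop words."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return [t for t in cleaned.split() if t not in STOP_WORDS]


def doc_freq(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return df


def rank_bridging_terms(domain_a, domain_b, top_fraction=0.5):
    """Rank terms shared by two document sets using simple ensemble voting.

    Returns a list of (term, votes) pairs, highest vote count first.
    """
    df_a, df_b = doc_freq(domain_a), doc_freq(domain_b)
    # Candidate bridging terms are those that occur in both domains.
    candidates = sorted(set(df_a) & set(df_b))

    # Illustrative base heuristics (assumptions, not taken from the paper).
    heuristics = [
        # Balanced relative document frequency across the two domains.
        lambda t: min(df_a[t] / len(domain_a), df_b[t] / len(domain_b)),
        # Overall document frequency.
        lambda t: df_a[t] + df_b[t],
        # Rarity: terms seen in fewer documents score higher.
        lambda t: -(df_a[t] + df_b[t]),
    ]

    votes = {t: 0 for t in candidates}
    cutoff = max(1, int(len(candidates) * top_fraction))
    for score in heuristics:
        # Sort by descending score; ties are broken alphabetically.
        ranked = sorted(candidates, key=lambda t: (-score(t), t))
        for term in ranked[:cutoff]:
            votes[term] += 1

    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    migraine_docs = ["Migraine and stress", "Migraine with magnesium deficiency"]
    magnesium_docs = ["Magnesium regulates calcium", "Stress and calcium channels"]
    for term, vote_count in rank_bridging_terms(migraine_docs, magnesium_docs):
        print(term, vote_count)
```

Voting over several rankings, rather than averaging raw scores, keeps the sketch insensitive to the different scales of the individual heuristics. A real system would use richer heuristics and a proper preprocessing pipeline.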

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Juršič, M., Cestnik, B., Urbančič, T., Lavrač, N. (2013). HCI Empowered Literature Mining for Cross-Domain Knowledge Discovery. In: Holzinger, A., Pasi, G. (eds) Human-Computer Interaction and Knowledge Discovery in Complex, Unstructured, Big Data. HCI-KDD 2013. Lecture Notes in Computer Science, vol 7947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39146-0_12

  • DOI: https://doi.org/10.1007/978-3-642-39146-0_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39145-3

  • Online ISBN: 978-3-642-39146-0

  • eBook Packages: Computer Science, Computer Science (R0)
