Semantic Analytics of PubMed Content

  • Dominik Ślęzak
  • Andrzej Janusz
  • Wojciech Świeboda
  • Hung Son Nguyen
  • Jan G. Bazan
  • Andrzej Skowron
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7058)


We present an architecture aimed at semantic search and synthesis of information acquired from document repositories. The proposed framework is expected to provide domain knowledge interfaces that enable its internally implemented algorithms to identify relationships between documents, researchers, and institutions, as well as concepts extracted from various types of knowledge bases. The framework should be scalable with respect to data volumes, diversity of analytic processes, and the speed of search. In this paper, we investigate these requirements for the case of medical publications gathered in PubMed.


Semantic Search and Analytics, PubMed, MeSH, RDBMS, Document Repositories, Decision Support Systems, Behavioral Patterns





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Dominik Ślęzak (1, 2)
  • Andrzej Janusz (1)
  • Wojciech Świeboda (1)
  • Hung Son Nguyen (1)
  • Jan G. Bazan (3, 1)
  • Andrzej Skowron (1)
  1. Institute of Mathematics, University of Warsaw, Warsaw, Poland
  2. Infobright Inc., Warsaw, Poland
  3. Chair of Computer Science, University of Rzeszów, Rzeszów, Poland
