Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6266)

Abstract

Biobanks are important resources for medical research: they collect biological material (samples) together with data describing that material, and they provide medical researchers with the material and data they need for their studies. Data availability varies greatly among samples, which makes retrieving data and identifying relevant samples a strenuous task. We show the challenges and limitations of using pure SQL statements to query a relational database. To tackle the problem of locating interesting material, we present a novel approach that automatically generates approximate queries with ranking capabilities. Medical researchers use a Query by Example interface to specify desired attributes and restrictions, and assign weights to them to influence the ranking function.
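The idea of weighted, approximate matching can be illustrated with a minimal sketch. It is not the method of the paper, which generates queries against a relational database; it only shows, in plain Python, how user-assigned weights might turn a strict filter into a ranking. The names `Criterion`, `score`, and `rank`, the sample attributes, and the scoring formula (matched weight divided by total weight) are all assumptions made for illustration. A missing attribute value is treated as a non-match rather than as a reason to exclude the sample.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Criterion:
    """One hypothetical Query-by-Example restriction with a user-assigned weight."""
    attribute: str
    predicate: Callable[[Any], bool]
    weight: float


def score(sample: dict, criteria: list) -> float:
    """Return the fraction of total weight whose restrictions the sample satisfies.

    Assumption: a missing (None or absent) value counts as not satisfied,
    so incomplete samples are ranked lower instead of being filtered out.
    """
    total = sum(c.weight for c in criteria)
    if total == 0:
        return 0.0
    matched = 0.0
    for c in criteria:
        value = sample.get(c.attribute)
        if value is not None and c.predicate(value):
            matched += c.weight
    return matched / total


def rank(samples: list, criteria: list) -> list:
    """Return (score, sample) pairs, best match first."""
    scored = [(score(s, criteria), s) for s in samples]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Invented example data: three samples with uneven data availability.
samples = [
    {"id": "S1", "tissue": "liver", "age": 62, "bmi": None},
    {"id": "S2", "tissue": "liver", "age": 45},
    {"id": "S3", "tissue": "kidney", "age": 50, "bmi": 24.0},
]

criteria = [
    Criterion("tissue", lambda v: v == "liver", weight=3.0),
    Criterion("age", lambda v: v >= 60, weight=2.0),
    Criterion("bmi", lambda v: v < 30, weight=1.0),
]

ranked = rank(samples, criteria)
for value, sample in ranked:
    print(f"{sample['id']}: {value:.2f}")
```

A strict conjunctive SQL `WHERE` clause over the same three restrictions would return none of these samples, since no sample satisfies all of them; the weighted score instead orders them by how closely they match.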

The work reported here was partially supported by the European Commission 7th Framework Programme (project BBMRI) and by the Austrian Ministry of Science and Research within the GEN-AU program (project GATIB II).

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dabringer, C., Eder, J. (2010). Retrieving Samples from Biobanks. In: Khuri, S., Lhotská, L., Pisanti, N. (eds) Information Technology in Bio- and Medical Informatics, ITBAM 2010. Lecture Notes in Computer Science, vol 6266. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15020-3_16

  • DOI: https://doi.org/10.1007/978-3-642-15020-3_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15019-7

  • Online ISBN: 978-3-642-15020-3

  • eBook Packages: Computer Science, Computer Science (R0)
