Abstract
In genomic research, understanding a specific gene name is the entry point to most Information Retrieval (IR) and Information Extraction (IE) systems. In this paper we present an automatic method for extracting a genomic glossary triggered by the initial gene name in a query. LocusLink gene names and MEDLINE abstracts are employed in our system as query triggers and genomic corpus, respectively. The extracted glossary is evaluated through query expansion in the TREC 2003 Genomics Track ad hoc retrieval task, and the experimental results show that 90.15% recall can be achieved.
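As a rough illustration of the evaluation idea, glossary-based query expansion and the recall measure can be sketched as follows. This is a minimal sketch under stated assumptions, not the system described in the chapter: the function names `expand_query` and `recall`, the dictionary layout of the glossary, and the toy entries are all hypothetical.

```python
def expand_query(query, glossary):
    """Append glossary terms for every trigger gene name found in the query.

    `glossary` is assumed to map a lower-cased gene name to a list of
    related terms (hypothetical layout, not taken from the chapter).
    """
    tokens = query.lower().split()
    expanded = list(tokens)
    for token in tokens:
        for term in glossary.get(token, []):
            if term not in expanded:  # avoid duplicate expansion terms
                expanded.append(term)
    return expanded


def recall(retrieved_ids, relevant_ids):
    """Fraction of relevant documents that appear in the retrieved set."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(relevant & set(retrieved_ids)) / len(relevant)


# Toy data for illustration only.
toy_glossary = {"tp53": ["p53", "tumor protein p53"]}
expanded_terms = expand_query("TP53 function", toy_glossary)
toy_recall = recall(["d1", "d2", "d4"], ["d1", "d2", "d3"])
```

In this sketch the gene name in the query acts as the trigger: only tokens that match a glossary key contribute expansion terms, and recall is then computed over whatever documents the expanded query retrieves.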
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, J., Zhu, X. (2006). Automatic Extraction of Genomic Glossary Triggered by Query. In: Li, J., Yang, Q., Tan, AH. (eds) Data Mining for Biomedical Applications. BioDM 2006. Lecture Notes in Computer Science, vol 3916. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11691730_4
DOI: https://doi.org/10.1007/11691730_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33104-9
Online ISBN: 978-3-540-33105-6
eBook Packages: Computer Science (R0)