Abstract
A variety of biological databases are currently available to researchers in XML format. Homology-related querying on such databases presents several challenges, as most available exhaustive mining techniques do not incorporate the semantic relationships inherent in these data collections. This chapter identifies an index-based approach to mining such data and explores the improvement in query-result quality achieved by applying genetic algorithms. Our experiments confirm the widely accepted advantages of index-based and vector-space models for biological data. Specifically, they show that applying genetic algorithms optimizes the search, achieving higher precision and accuracy in heterogeneous databases and faster query execution across all data collections.
Keywords
- Genetic Algorithm
- Latent Semantic Analysis
- Latent Semantic Indexing
- Semantic Approach
- Biological Database
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Venugopal, K.R., Srinivasa, K.G., Patnaik, L.M. (2009). A Semantic Approach for Mining Biological Databases. In: Soft Computing for Data Mining Applications. Studies in Computational Intelligence, vol 190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00193-2_13
DOI: https://doi.org/10.1007/978-3-642-00193-2_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00192-5
Online ISBN: 978-3-642-00193-2
eBook Packages: Engineering, Engineering (R0)