Classifying Noisy and Incomplete Medical Data by a Differential Latent Semantic Indexing Approach

Chapter in: Data Mining in Biomedicine

Part of the book series: Springer Optimization and Its Applications (SOIA, volume 7)

Abstract

It is well recognized that medical datasets are often noisy and incomplete because data collection and integration are difficult. Noise and incompleteness in medical data pose substantial challenges for accurate classification. Differential latent semantic indexing (DLSI), an improvement on the standard LSI method, has been proposed for information retrieval and has demonstrated better performance than standard LSI. The key idea is that DLSI adapts to the unique characteristics of each individual record or document. Experimental results on real datasets show that DLSI outperforms the standard LSI method on noisy and incomplete medical datasets. The results strongly indicate that the DLSI approach is also capable of analyzing numerical medical data.
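The abstract gives no algorithmic detail, so the following is only a rough sketch of the general idea and not the authors' method. It assumes that a "differential" vector is the difference between a normalized record and a normalized class centroid, fills missing values (`None`) with per-feature means, and leaves out the SVD projection that LSI and DLSI would apply on top. All function names, class labels, and the toy data are hypothetical.

```python
import math

def impute(rows):
    """Replace None entries with the mean of the observed values in that column."""
    means = []
    for col in zip(*rows):
        seen = [v for v in col if v is not None]
        means.append(sum(seen) / len(seen) if seen else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]

def normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else list(vec)

def class_centroids(rows, labels):
    """Average the normalized records of each class, then normalize the average."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(normalize(row))
    return {label: normalize([sum(col) / len(col) for col in zip(*members)])
            for label, members in groups.items()}

def differential_norm(record, centroid):
    """Length of the differential vector: normalized record minus class centroid."""
    unit = normalize(record)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(unit, centroid)))

def classify(record, centroids):
    """Assign the class whose centroid yields the shortest differential vector."""
    return min(centroids, key=lambda label: differential_norm(record, centroids[label]))

# Hypothetical toy data: three numeric features, one missing value.
train = [
    [1.0, 0.1, None],
    [0.9, 0.2, 0.1],
    [0.1, 1.0, 0.9],
    [0.2, 0.8, 1.0],
]
labels = ["class_a", "class_a", "class_b", "class_b"]

centroids = class_centroids(impute(train), labels)
print(classify([0.95, 0.15, 0.2], centroids))
print(classify([0.1, 0.9, 0.95], centroids))
```

Mean imputation and nearest-centroid assignment are stand-ins chosen for brevity. A fuller treatment would project the differential vectors into a reduced space with a truncated SVD before comparing them.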

Copyright information

© 2007 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Chen, L., Zeng, J., Pei, J. (2007). Classifying Noisy and Incomplete Medical Data by a Differential Latent Semantic Indexing Approach. In: Pardalos, P.M., Boginski, V.L., Vazacopoulos, A. (eds) Data Mining in Biomedicine. Springer Optimization and Its Applications, vol 7. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-69319-4_10
