
Three Transductive Set Covering Machines

  • Conference paper
  • In: Data Analysis, Machine Learning and Knowledge Discovery

Abstract

We propose three transductive versions of the set covering machine with data-dependent rays for classification in the molecular high-throughput setting. By utilizing both labeled and unlabeled samples, these transductive classifiers can learn from both sample types rather than from labeled samples alone. The transductive set covering machines are based on modified selection criteria for their ensemble members: via counting arguments, we include the unlabeled information in the base classifier selection. One of the three methods uniformly increased the classification accuracy, whereas the other two showed mixed behaviour across the data sets. We show that observing only the order of the unlabeled samples, not their distances, was sufficient to increase classification accuracies, which makes these approaches useful even when very little information is available.
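
The counting idea can be illustrated with a minimal sketch. It is not the authors' selection criterion, which the abstract does not specify. It assumes a ray is a single-feature threshold classifier and shows that the counts a selection criterion could draw on depend only on which side of the threshold each sample falls. The names `Ray` and `ray_counts`, the 0/1 label coding, and the toy data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Ray:
    """Assumed form of a ray: a threshold classifier on one feature."""
    feature: int
    threshold: float
    direction: int  # +1 or -1

    def predict(self, x: Sequence[float]) -> int:
        # Predicts class 1 on the side of the threshold the ray points to.
        return 1 if self.direction * (x[self.feature] - self.threshold) >= 0 else 0


def ray_counts(ray: Ray,
               labeled: Sequence[Sequence[float]],
               labels: Sequence[int],
               unlabeled: Sequence[Sequence[float]]) -> Dict[str, int]:
    """Count how labeled and unlabeled samples split under one ray.

    Only the side of the threshold matters, never the distance to it.
    """
    pairs = list(zip(labeled, labels))
    unlabeled_pos = sum(ray.predict(x) for x in unlabeled)
    return {
        # Negatives the ray correctly assigns to class 0.
        "covered_negatives": sum(1 for x, y in pairs if y == 0 and ray.predict(x) == 0),
        # Positives the ray wrongly assigns to class 0.
        "misclassified_positives": sum(1 for x, y in pairs if y == 1 and ray.predict(x) == 0),
        # Hypothetical transductive counts: unlabeled samples on each side.
        "unlabeled_positive_side": unlabeled_pos,
        "unlabeled_negative_side": len(unlabeled) - unlabeled_pos,
    }


# Toy data (invented for illustration): one feature, four labeled, four unlabeled samples.
labeled = [[1.0], [2.0], [4.0], [5.0]]
labels = [0, 0, 1, 1]
unlabeled = [[0.5], [3.0], [4.5], [6.0]]

# Data-dependent threshold: taken from a labeled sample's feature value.
ray = Ray(feature=0, threshold=4.0, direction=+1)
counts = ray_counts(ray, labeled, labels, unlabeled)

# A strictly increasing transformation (here: cubing) preserves the order and hence the counts.
cube = lambda data: [[v ** 3 for v in x] for x in data]
counts_cubed = ray_counts(Ray(0, 4.0 ** 3, +1), cube(labeled), labels, cube(unlabeled))
```

In this sketch `counts` and `counts_cubed` are identical, which is the sense in which such a criterion uses the order of the samples but not their distances. How the unlabeled counts would enter a selection score is left open here.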

Notes

  1. We only utilize the information given by the ordering of feature values, not the information given by an ordinal structure of the class labels (e.g. Herbrich et al. 1999).

References

  • Armstrong, S. A., Staunton, J. E., Silverman, L. B., Pieters, R., den Boer, M. L., Minden, M. D., et al. (2002). MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Nature Genetics, 30(1), 41–47.

  • Bishop, C. M. (2006). Pattern recognition and machine learning. Secaucus: Springer.

  • Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Belmont: Wadsworth.

  • Buchholz, M., Kestler, H. A., Bauer, A., Böck, W., Rau, B., Leder, G., et al. (2005). Specialized DNA arrays for the differentiation of pancreatic tumors. Clinical Cancer Research, 11(22), 8048–8054.

  • Herbrich, R., Graepel, T., & Obermayer, K. (1999). Regression models for ordinal data: A machine learning approach. Technical report, TU Berlin.

  • Jolliffe, I. T. (2002). Principal component analysis. New York: Springer.

  • Kestler, H. A., Lausser, L., Lindner, W., & Palm, G. (2011). On the fusion of threshold classifiers for categorization and dimensionality reduction. Computational Statistics, 26, 321–340.

  • Lausser, L., Schmid, F., & Kestler, H. A. (2011). On the utility of partially labeled data for classification of microarray data. In F. Schwenker & E. Trentin (Eds.), Partially supervised learning (pp. 96–109). Berlin: Springer.

  • Marchand, M., & Shawe-Taylor, J. (2003). The set covering machine. Journal of Machine Learning Research, 3, 723–746.

  • Su, A. I., Welsh, J. B., Sapinoso, L. M., Kern, S. G., Dimitrov, P., Lapp, H., et al. (2001). Molecular classification of human carcinomas by use of gene expression signatures. Cancer Research, 61(20), 7388–7393.

  • Valk, P. J., Verhaak, R. G., Beijen, M. A., Erpelinck, C. A., Barjesteh van Waalwijk van Doorn-Khosrovani, S., Boer, J. M., et al. (2004). Prognostically useful gene-expression profiles in acute myeloid leukemia. New England Journal of Medicine, 350(16), 1617–1628.

  • Vapnik, V. N. (1998). Statistical learning theory. New York: Wiley.

  • Weston, J., Pérez-Cruz, F., Bousquet, O., Chapelle, O., Elisseeff, A., Schölkopf, B., et al. (2003). Feature selection and transduction for prediction of molecular bioactivity for drug design. Bioinformatics, 19(6), 764–771.


Acknowledgements

This work was funded in part by a Karl-Steinbuch grant to Florian Schmid, by the German Federal Ministry of Education and Research (BMBF) within the framework of the program of medical genome research (PaCa-Net; Project ID PKB-01GS08) and the framework GERONTOSYS 2 (Forschungskern SyStaR, Project ID 0315894A), and by the German Science Foundation (SFB 1074, Project Z1) to Hans A. Kestler. The responsibility for the content lies exclusively with the authors.

Author information

Corresponding author

Correspondence to Hans A. Kestler.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Schmid, F., Lausser, L., Kestler, H.A. (2014). Three Transductive Set Covering Machines. In: Spiliopoulou, M., Schmidt-Thieme, L., Janning, R. (eds) Data Analysis, Machine Learning and Knowledge Discovery. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-01595-8_33
