Three Transductive Set Covering Machines

Schmid, Florian; Lausser, Ludwig; Kestler, Hans A.

doi:10.1007/978-3-319-01595-8_33

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

We propose three transductive versions of the set covering machine with data-dependent rays for classification in the molecular high-throughput setting. By using both labeled and unlabeled samples, these transductive classifiers can learn from both sample types rather than from labeled samples alone. The transductive set covering machines are based on modified selection criteria for their ensemble members: via counting arguments, we include the unlabeled information in the base classifier selection. One of the three methods uniformly increased the classification accuracy, while the other two showed mixed behaviour across the data sets. We show that observing only the order of the unlabeled samples, not their distances, is sufficient to increase classification accuracy, which makes these approaches useful even when very little information is available.
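The idea of selecting a base classifier by counting unlabeled samples can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not state the three selection criteria, so the balance-based tie-breaker below (`unlabeled_imbalance`) is a hypothetical stand-in, and all names (`Ray`, `candidate_rays`, `select_ray`) are assumptions made for illustration. The sketch only selects a single ray; a full set covering machine would combine several rays into a conjunction or disjunction.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Ray:
    """Single-feature threshold classifier (illustrative definition).

    Predicts 1 if direction * (x[feature] - threshold) >= 0, else 0.
    """
    feature: int
    threshold: float
    direction: int  # +1 or -1

    def predict(self, x: Sequence[float]) -> int:
        return 1 if self.direction * (x[self.feature] - self.threshold) >= 0 else 0


def candidate_rays(X: Sequence[Sequence[float]]) -> List[Ray]:
    """Data-dependent rays: thresholds are taken from labeled feature values."""
    rays = []
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            rays.append(Ray(j, t, +1))
            rays.append(Ray(j, t, -1))
    return rays


def labeled_errors(ray: Ray, X, y) -> int:
    """Number of labeled samples the ray misclassifies."""
    return sum(ray.predict(x) != label for x, label in zip(X, y))


def unlabeled_imbalance(ray: Ray, U) -> int:
    """HYPOTHETICAL counting criterion, not taken from the paper.

    Counts unlabeled samples on each side of the threshold and returns the
    absolute difference. Only the order of samples relative to the threshold
    is used, never a distance.
    """
    positives = sum(ray.predict(u) for u in U)
    return abs(2 * positives - len(U))


def select_ray(X, y, U) -> Ray:
    """Pick the ray with fewest labeled errors; break ties by unlabeled counts."""
    return min(
        candidate_rays(X),
        key=lambda r: (labeled_errors(r, X, y), unlabeled_imbalance(r, U)),
    )
```

With two labeled samples that both features separate equally well, the unlabeled samples decide which feature is used: the ray that splits them evenly is preferred under this assumed criterion.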

Notes

1. We only utilize the information of orderings of feature values, not the information given by an ordinal structure of the class labels (e.g. Herbrich et al. 1999).

References

Armstrong, S. A., Staunton, J. E., Silverman, L. B., Pieters, R., den Boer, M. L., Minden, M. D., et al. (2002). MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Nature Genetics, 30(1), 41–47.
Bishop, C. M. (2006). Pattern recognition and machine learning. Secaucus: Springer.
Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Belmont: Wadsworth.
Buchholz, M., Kestler, H. A., Bauer, A., Böck, W., Rau, B., Leder, G., et al. (2005). Specialized DNA arrays for the differentiation of pancreatic tumors. Clinical Cancer Research, 11(22), 8048–8054.
Herbrich, R., Graepel, T., & Obermayer, K. (1999). Regression models for ordinal data: A machine learning approach. Technical report, TU Berlin.
Jolliffe, I. T. (2002). Principal component analysis. New York: Springer.
Kestler, H. A., Lausser, L., Lindner, W., & Palm, G. (2011). On the fusion of threshold classifiers for categorization and dimensionality reduction. Computational Statistics, 26, 321–340.
Lausser, L., Schmid, F., & Kestler, H. A. (2011). On the utility of partially labeled data for classification of microarray data. In F. Schwenker & E. Trentin (Eds.), Partially supervised learning (pp. 96–109). Berlin: Springer.
Marchand, M., & Shawe-Taylor, J. (2003). The set covering machine. Journal of Machine Learning Research, 3, 723–746.
Su, A. I., Welsh, J. B., Sapinoso, L. M., Kern, S. G., Dimitrov, P., Lapp, H., et al. (2001). Molecular classification of human carcinomas by use of gene expression signatures. Cancer Research, 61(20), 7388–7393.
Valk, P. J., Verhaak, R. G., Beijen, M. A., Erpelinck, C. A., Barjesteh van Waalwijk van Doorn-Khosrovani, S., Boer, J. M., et al. (2004). Prognostically useful gene-expression profiles in acute myeloid leukemia. New England Journal of Medicine, 350(16), 1617–1628.
Vapnik, V. N. (1998). Statistical learning theory. New York: Wiley.
Weston, J., Pérez-Cruz, F., Bousquet, O., Chapelle, O., Elisseeff, A., Schölkopf, B., et al. (2003). Feature selection and transduction for prediction of molecular bioactivity for drug design. Bioinformatics, 19(6), 764–771.

Acknowledgements

This work was funded in part by a Karl-Steinbuch grant to Florian Schmid, by the German Federal Ministry of Education and Research (BMBF) within the framework of the program of medical genome research (PaCa-Net; Project ID PKB-01GS08) and the framework GERONTOSYS 2 (Forschungskern SyStaR, Project ID 0315894A), and by the German Science Foundation (SFB 1074, Project Z1) to Hans A. Kestler. The responsibility for the content lies exclusively with the authors.

Author information

Authors and Affiliations

Research Group Bioinformatics and Systems Biology, Institute of Neural Information Processing, Ulm University, 89069, Ulm, Germany
Florian Schmid, Ludwig Lausser & Hans A. Kestler

Corresponding author

Correspondence to Hans A. Kestler.

Editor information

Editors and Affiliations

Faculty of Computer Science, Otto-von-Guericke-Universität Magdeburg, Magdeburg, Germany
Myra Spiliopoulou
Institute of Computer Science, University of Hildesheim, Hildesheim, Germany
Lars Schmidt-Thieme
Institute of Computer Science, University of Hildesheim, Hildesheim, Germany
Ruth Janning

Cite this paper

Schmid, F., Lausser, L., Kestler, H.A. (2014). Three Transductive Set Covering Machines. In: Spiliopoulou, M., Schmidt-Thieme, L., Janning, R. (eds) Data Analysis, Machine Learning and Knowledge Discovery. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-01595-8_33

DOI: https://doi.org/10.1007/978-3-319-01595-8_33
Published: 10 October 2013
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01594-1
Online ISBN: 978-3-319-01595-8
eBook Packages: Mathematics and Statistics (R0)
