Abstract
The basic assumption that training and test data are drawn from the same distribution is often violated in practice. In this paper, we propose a single framework that covers several scenarios of learning under “different but related distributions.” Explicit examples include (a) sample selection bias between training and test data, (b) transfer learning, where the target domain has no labeled data, and (c) noisy or uncertain training data. The main motivation is that one should ideally be able to solve as many of these problems as possible with a single approach. The proposed solution extends graph transduction by applying the maximum margin principle over unlabeled data. Under reasonable assumptions, the error of the proposed method is bounded even when the training and test distributions differ. Experimental results demonstrate that the proposed method improves traditional graph transduction by as much as 15% in accuracy and AUC in all common situations of distribution difference. Most importantly, it outperforms, by up to 10% in accuracy, several state-of-the-art approaches designed for specific categories of distribution difference, e.g., BRSD [1] for sample selection bias and CDSC [2] for transfer learning. The main claim is that adaptive graph transduction is a general and competitive method for handling distribution differences implicitly, without knowing or worrying about their exact type. These include at least sample selection bias, transfer learning, and uncertainty mining, as well as related settings that have not yet been studied. The source code and datasets are available from the authors.
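The abstract builds on standard graph transduction, which propagates labels from labeled to unlabeled points over a similarity graph. As background, the sketch below shows the classical harmonic-function solution of Zhu and Lafferty [8] that the paper extends; it is a minimal illustration of plain graph transduction, not the adaptive maximum-margin method proposed in the paper, and the RBF affinity, `sigma` bandwidth, and function names are illustrative choices of ours.

```python
import numpy as np

def harmonic_label_propagation(X, y_labeled, labeled_idx, sigma=1.0):
    """Plain graph transduction via the harmonic-function solution.

    Builds an RBF-weighted graph over all points, then solves
    L_UU f_U = W_UL y_L for the scores of the unlabeled points,
    where L = D - W is the graph Laplacian.
    """
    n = X.shape[0]
    # Pairwise squared distances -> RBF affinity matrix W.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops

    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    D = np.diag(W.sum(axis=1))
    L = D - W  # unnormalized graph Laplacian

    # Harmonic solution: unlabeled scores are the weighted average
    # of their neighbors, with labeled scores clamped to y_labeled.
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
    f_u = np.linalg.solve(L_uu, W_ul @ y_labeled)

    scores = np.zeros(n)
    scores[labeled_idx] = y_labeled
    scores[unlabeled_idx] = f_u
    return scores  # sign(scores) gives the predicted binary labels
```

When training and test distributions differ, the paper's point is that this fixed propagation can be misled, which motivates selecting among transductive solutions by their margin on the unlabeled data.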
References
1. Ren, J., Shi, X., Fan, W., Yu, P.S.: Type-independent correction of sample selection bias via structural discovery and re-balancing. In: Proceedings of the Eighth SIAM International Conference on Data Mining, SDM 2008, pp. 565–576. SIAM, Philadelphia (2008)
2. Ling, X., Dai, W., Xue, G.R., Yang, Q., Yu, Y.: Spectral domain-transfer learning. In: KDD 2008: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 488–496. ACM, New York (2008)
3. Blitzer, J., Crammer, K., Kulesza, A., Pereira, F., Wortman, J.: Learning bounds for domain adaptation. In: Platt, J., Koller, D., Singer, Y., Roweis, S. (eds.) Advances in Neural Information Processing Systems 20, pp. 129–136. MIT Press, Cambridge (2008)
4. Wang, J., Jebara, T., Chang, S.F.: Graph transduction via alternating minimization. In: ICML 2008: Proceedings of the 25th International Conference on Machine Learning, pp. 1144–1151. ACM, New York (2008)
5. Amini, M., Laviolette, F., Usunier, N.: A transductive bound for the voted classifier with an application to semi-supervised learning. In: Koller, D., Schuurmans, D., Bengio, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems 21 (2009)
6. Fan, W., Davidson, I.: On sample selection bias and its efficient correction via model averaging and unlabeled examples. In: Proceedings of the Seventh SIAM International Conference on Data Mining, SDM 2007, Minneapolis, Minnesota. SIAM, Philadelphia (2007)
7. Zhu, X.: Semi-supervised learning with graphs. PhD thesis, Carnegie Mellon University, Pittsburgh, PA, USA (2005)
8. Zhu, X., Lafferty, J.: Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions. In: ICML 2003 Workshop on The Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining, pp. 58–65 (2003)
9. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Morgan Kaufmann Series in Data Management Systems. Morgan Kaufmann, San Francisco (2005)
10. Asuncion, A., Newman, D.J.: UCI machine learning repository (2007), http://www.ics.uci.edu/mlearn/MLRepository.html
11. Liu, H., Li, J.: Kent Ridge bio-medical data set repository (2005), http://leo.ugr.es/elvira/DBCRepository/index.html
12. Melville, P., Shah, N., Mihalkova, L., Mooney, R.J.: Experiments on ensembles with missing and noisy data. In: Roli, F., Kittler, J., Windeatt, T. (eds.) MCS 2004. LNCS, vol. 3077, pp. 293–302. Springer, Heidelberg (2004)
13. Pan, S.J., Kwok, J.T., Yang, Q.: Transfer learning via dimensionality reduction. In: Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, AAAI 2008, Chicago, Illinois, USA, July 13-17, pp. 677–682 (2008)
14. Sugiyama, M., Nakajima, S., Kashima, H., Buenau, P.V., Kawanabe, M.: Direct importance estimation with model selection and its application to covariate shift adaptation. In: Platt, J., Koller, D., Singer, Y., Roweis, S. (eds.) Advances in Neural Information Processing Systems 20, pp. 1433–1440. MIT Press, Cambridge (2008)
15. Aggarwal, C.C.: On density based transforms for uncertain data mining. In: Proceedings of the 23rd International Conference on Data Engineering, ICDE 2007, Istanbul, Turkey, pp. 866–875. IEEE, Los Alamitos (2007)
16. El Ghaoui, L., Lanckriet, G.R.G., Natsoulis, G.: Robust classification with interval data. Technical Report UCB/CSD-03-1279, EECS Department, University of California, Berkeley (October 2003)
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhong, E., Fan, W., Peng, J., Verscheure, O., Ren, J. (2009). Universal Learning over Related Distributions and Adaptive Graph Transduction. In: Buntine, W., Grobelnik, M., Mladenić, D., Shawe-Taylor, J. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2009. Lecture Notes in Computer Science(), vol 5782. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04174-7_44
DOI: https://doi.org/10.1007/978-3-642-04174-7_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04173-0
Online ISBN: 978-3-642-04174-7
eBook Packages: Computer Science (R0)