Learning Robust Multi-label Sample Specific Distances for Identifying HIV-1 Drug Resistance

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2019)

Abstract

Acquired immunodeficiency syndrome (AIDS) is a syndrome caused by the human immunodeficiency virus (HIV). During the progression of AIDS, the patient’s immune system is weakened, which increases the patient’s susceptibility to infections and diseases. Although antiretroviral drugs can effectively suppress HIV, the virus mutates very quickly and can become resistant to treatment. Through mutation, the virus can also become resistant to treatments not currently in use, a phenomenon known in the clinical research community as cross-resistance. Since a single HIV strain can be resistant to multiple drugs, this problem is naturally represented as a multi-label classification problem. Given this multi-class relationship, traditional single-label classification methods usually fail to identify the drug resistances that may develop after a particular virus mutation. In this paper, we propose a novel multi-label Robust Sample Specific Distance (RSSD) method to identify multi-class HIV drug resistance. Our method is novel in that it can illustrate the relative strength of the drug resistance of a reverse transcriptase (RT) sequence against a given nucleoside analogue drug and learn the distance metrics for all the drug resistances. To learn the proposed RSSDs, we formulate a learning objective that maximizes the ratio of the summations of a number of \(\ell _1\)-norm distances, which is difficult to solve in general. To solve this optimization problem, we derive an efficient, non-greedy, iterative algorithm with rigorously proved convergence. We verified our method on a public HIV-1 drug resistance data set with over 600 RT sequences and five nucleoside analogues. We compared it against other state-of-the-art multi-label classification methods, and the experimental results demonstrate the effectiveness of the proposed method.
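
To make the two central ideas of the abstract concrete, the following is a minimal sketch of (a) multi-label targets, where one sequence can carry several resistance labels, and (b) a ratio of summed \(\ell _1\)-norm distances under a linear projection. It is an illustration only, not the RSSD objective or algorithm from the paper: the function names, the toy feature vectors, the drug labels `drug_1` and `drug_2`, and the choice of numerator and denominator are all assumptions made for this example.

```python
from typing import Dict, List, Sequence, Set

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def l1_distance(x: Vector, y: Vector) -> float:
    """Sum of absolute coordinate differences (the l1-norm of x - y)."""
    return sum(abs(a - b) for a, b in zip(x, y))


def project(W: Matrix, x: Vector) -> List[float]:
    """Apply a linear map W (given as a list of rows) to the vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def l1_distance_ratio(W: Matrix, anchor: Vector,
                      far: Sequence[Vector], near: Sequence[Vector]) -> float:
    """Hypothetical ratio: summed l1 distances from the projected anchor to
    samples that do NOT share a label, divided by the summed l1 distances
    to samples that DO share it. A larger value means better separation."""
    pa = project(W, anchor)
    numerator = sum(l1_distance(pa, project(W, s)) for s in far)
    denominator = sum(l1_distance(pa, project(W, s)) for s in near)
    return numerator / denominator


# Toy numeric features standing in for encoded sequences (made up).
features: Dict[str, List[float]] = {
    "seq_a": [1.0, 0.0],
    "seq_b": [0.9, 0.1],
    "seq_c": [0.0, 1.0],
}
# Multi-label targets: a sequence may be resistant to several drugs at once.
labels: Dict[str, Set[str]] = {
    "seq_a": {"drug_1", "drug_2"},
    "seq_b": {"drug_1"},
    "seq_c": set(),
}

drug = "drug_1"
anchor_id = "seq_a"
near = [features[s] for s in features if s != anchor_id and drug in labels[s]]
far = [features[s] for s in features if drug not in labels[s]]

identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for a learned projection
ratio = l1_distance_ratio(identity, features[anchor_id], far, near)
print(f"l1 distance ratio for {drug}: {ratio:.3f}")
```

In this sketch the projection is fixed to the identity; a learning method would instead search for the projection that makes such a ratio large, which is the kind of optimization the abstract describes as difficult in general.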

Acknowledgments

This work was partially supported by the National Science Foundation under Grant NSF-IIS 1652943. This research was also partially supported by the Army Research Office (ARO) under Grant W911NF-17-1-0447, the U.S. Air Force Academy (USAFA) under Grant FA7000-18-2-0016, and the Distributed and Collaborative Intelligent Systems and Technology (DCIST) CRA under Grant W911NF-17-2-0181.

Author information

Corresponding author

Correspondence to Hua Wang.

Copyright information

© 2019 Springer Nature Switzerland AG

Cite this paper

Brand, L., Yang, X., Liu, K., Elbeleidy, S., Wang, H., Zhang, H. (2019). Learning Robust Multi-label Sample Specific Distances for Identifying HIV-1 Drug Resistance. In: Cowen, L. (ed.) Research in Computational Molecular Biology. RECOMB 2019. Lecture Notes in Computer Science, vol 11467. Springer, Cham. https://doi.org/10.1007/978-3-030-17083-7_4

  • DOI: https://doi.org/10.1007/978-3-030-17083-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-17082-0

  • Online ISBN: 978-3-030-17083-7

  • eBook Packages: Computer Science (R0)
