Semiparametric Kernel-Based Regression for Evaluating Interaction Between Pathway Effect and Covariate

Journal of Agricultural, Biological and Environmental Statistics

Abstract

Pathway-based analysis can detect subtle changes in response variables that gene-based analysis may miss. Just as genes interact with covariates such as environmental or clinical variables, so do pathways, which are sets of genes that serve particular cellular or physiological functions. However, because pathways are sets of genes and environmental or clinical variables do not have parametric relationships with response variables, it is difficult to model unknown interaction terms between high-dimensional variables (pathways) and low-dimensional variables such as environmental or clinical variables. In this paper, we propose a semiparametric interaction model with two unknown functions to evaluate the interaction between a pathway and an environmental or clinical variable: for the pathway, we use an unknown high-dimensional function, and for the environmental or clinical variable, we use an unknown low-dimensional function. We model the environmental or clinical variable nonparametrically via a natural cubic spline. We model both the pathway effect and the interaction between the pathway and the environmental or clinical variable nonparametrically via a kernel machine. Because both the interactions among genes within the same pathway and the interaction between the pathway and the environmental or clinical variable are complex, we allow for the possibility that a pathway interacts with environmental or clinical variables while the genes within that pathway interact with each other. We illustrate our approach using simulated data and genetic pathway data for type II diabetes. Supplementary materials accompanying this paper appear online.
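
As a rough illustration of the additive structure described in the abstract (a covariate effect, a pathway effect, and their interaction), the following Python sketch fits all three components with a single kernel ridge solve. It is not the authors' estimation procedure. The abstract specifies a natural cubic spline for the covariate; here a Gaussian kernel on the covariate is swapped in to keep the example short. The product of the covariate and pathway kernels is used as one common way to encode an interaction, which is an assumption. The function names, the kernel scale parameters `rho_x` and `rho_z`, the penalty `lam`, and the simulated data are all hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, rho):
    """Gaussian kernel matrix with entries exp(-||a - b||^2 / rho)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / rho)

def fit_components(x, Z, y, rho_x=1.0, rho_z=None, lam=1.0):
    """Kernel ridge fit of y on covariate x, pathway Z, and their interaction.

    Assumption: a Gaussian kernel on x stands in for the natural cubic
    spline, and the elementwise product K_x * K_z stands in for the
    interaction kernel. Both choices are illustrative only.
    """
    n, p = Z.shape
    rho_z = float(p) if rho_z is None else rho_z
    X = x.reshape(-1, 1)
    K_x = gaussian_kernel(X, X, rho_x)      # low-dimensional covariate
    K_z = gaussian_kernel(Z, Z, rho_z)      # high-dimensional pathway
    K_int = K_x * K_z                       # pathway-by-covariate interaction
    K = K_x + K_z + K_int
    alpha = np.linalg.solve(K + lam * np.eye(n), y)
    return {
        "covariate": K_x @ alpha,
        "pathway": K_z @ alpha,
        "interaction": K_int @ alpha,
        "alpha": alpha,
    }

# Simulated data (hypothetical): 60 subjects, a pathway of 5 genes.
rng = np.random.default_rng(0)
n, p = 60, 5
x = rng.uniform(-1, 1, n)
Z = rng.normal(size=(n, p))
y = np.sin(2 * x) + Z[:, 0] * Z[:, 1] + x * Z[:, 0] + 0.1 * rng.normal(size=n)

parts = fit_components(x, Z, y)
fitted = parts["covariate"] + parts["pathway"] + parts["interaction"]
```

The three returned vectors sum to the overall fitted values, so the interaction contribution can be inspected separately from the two main effects. A formal test for the interaction, as the paper's title suggests, would need more than this point estimate.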

Acknowledgements

This study was supported in part by the National Science Foundation Grant Number 0964680.

Author information

Corresponding author

Correspondence to Inyoung Kim.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 469 KB)

About this article

Cite this article

Fang, Z., Kim, I. & Jung, J. Semiparametric Kernel-Based Regression for Evaluating Interaction Between Pathway Effect and Covariate. JABES 23, 129–152 (2018). https://doi.org/10.1007/s13253-017-0317-2

  • DOI: https://doi.org/10.1007/s13253-017-0317-2
