Functional Mapping of Expression Quantitative Trait Loci that Regulate Oscillatory Gene Expression

Berg, Arthur; Li, Ning; Tong, Chunfa; Wang, Zhong; Berceli, Scott A.; Wu, Rongling

doi:10.1007/978-1-61779-086-7_12

Part of the book series: Methods in Molecular Biology (MIMB, volume 734)

Abstract

Genetic networks underlying many biological processes, such as vertebrate somitogenesis, cell cycle, hormonal signaling, and circadian rhythms, are characterized by oscillations in gene expression. It has been recognized that the frequency and amplitude of gene expression oscillations vary among individuals and can be controlled by specific expression quantitative trait loci (eQTLs). In this chapter, we develop a dynamic model for mapping and identifying such eQTLs by integrating mathematical aspects of oscillatory dynamics into the functional mapping framework. The model can determine whether and how eQTLs regulate individual genes’ activation kinetics and expression dynamics by estimating and testing Fourier series parameters for different eQTL genotypes. We incorporate a general autoregressive moving-average process of order (r,s), the so-called ARMA(r,s), to model the covariance structure of gene expression profiles measured over a time course, broadening the applicability of the new dynamic model to mapping eQTLs in practice. An expectation-maximization (EM) algorithm is derived to estimate all parameters of the mean–covariance structures within a mixture model setting. Simulation studies are performed to investigate the statistical behavior of the model. The model provides a powerful statistical tool for mapping eQTLs and their epistatic interactions that regulate oscillations in gene expression, helping to construct regulatory genetic networks for such periodic biological phenomena.

References

1. Brem, R., Yvert, G., Clinton, R., and Kruglyak, L. (2002) Genetic dissection of transcriptional regulation in budding yeast. Science 296, 752–755.
2. Cheung, V. and Spielman, R. (2002) The genetics of variation in gene expression. Nature Genetics 32, 522–525.
3. Cheung, V. and Spielman, R. (2009) Genetics of human gene expression: mapping DNA variants that influence gene expression. Nature Reviews Genetics 10, 595–604.
4. Cheung, V., Conlin, L., Weber, T., Arcaro, M., Jen, K., Morley, M., and Spielman, R. (2003) Natural variation in human gene expression assessed in lymphoblastoid cells. Nature Genetics 33, 422–425.
5. Cookson, W., Liang, L., Abecasis, G., Moffatt, M., and Lathrop, M. (2009) Mapping complex disease traits with global gene expression. Nature Reviews Genetics 10, 184–194.
6. Schadt, E., Monks, S., Drake, T., Lusis, A., Che, N., Colinayo, V., Ruff, T., Milligan, S., Lamb, J., Cavet, G., et al. (2003) Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297–302.
7. Jansen, R. and Nap, J. (2001) Genetical genomics: the added value from segregation. Trends in Genetics 17, 388–391.
8. Goldbeter, A. (2002) Computational approaches to cellular rhythms. Nature 420, 238–245.
9. Rustici, G., Mata, J., Kivinen, K., Lió, P., Penkett, C., Burns, G., Hayles, J., Brazma, A., Nurse, P., and Bahler, J. (2004) Periodic gene expression program of the fission yeast cell cycle. Nature Genetics 36, 809–817.
10. Swinburne, I., Miguez, D., Landgraf, D., and Silver, P. (2008) Intron length increases oscillatory periods of gene expression in animal cells. Genes & Development 22, 2342–2346.
11. Holter, N., Maritan, A., Cieplak, M., Fedoroff, N., and Banavar, J. (2001) Dynamic modeling of gene expression data. Proceedings of the National Academy of Sciences of the United States of America 98, 1693–1698.
12. Qian, J., Dolled-Filhart, M., Lin, J., Yu, H., and Gerstein, M. (2001) Beyond synexpression relationships: local clustering of time-shifted and inverted gene expression profiles identifies new, biologically relevant interactions. Journal of Molecular Biology 314, 1053–1066.
13. Bar-Joseph, Z., Gerber, G., Gifford, D., Jaakkola, T., and Simon, I. (2003) Continuous representations of time-series gene expression data. Journal of Computational Biology 10, 341–356.
14. Luan, Y. and Li, H. (2003) Clustering of time-course gene expression data using a mixed-effects model with B-splines. Bioinformatics 19, 474–482.
15. Park, T., Yi, S., Lee, S., Lee, S., Yoo, D., Ahn, J., and Lee, Y. (2003) Statistical tests for identifying differentially expressed genes in time-course microarray experiments. Bioinformatics 19, 694–703.
16. Wakefield, J., Zhou, C., and Self, S. (2003) Modelling gene expression over time: curve clustering with informative prior distributions. Bayesian Statistics 7, 721–732.
17. Ernst, J., Nau, G., and Bar-Joseph, Z. (2005) Clustering short time series gene expression data. Bioinformatics 21, 159–168.
18. Storey, J., Xiao, W., Leek, J., Tompkins, R., and Davis, R. (2005) Significance analysis of time course microarray experiments. Proceedings of the National Academy of Sciences of the United States of America 102, 12837–12842.
19. Ma, P., Castillo-Davis, C., Zhong, W., and Liu, J. (2006) A data-driven clustering method for time course gene expression data. Nucleic Acids Research 34, 1261–1269.
20. Ng, S., McLachlan, G., Wang, K., Ben-Tovim Jones, L., and Ng, S. (2006) A mixture model with random-effects components for clustering correlated gene-expression profiles. Bioinformatics 22, 1745–1752.
21. Inoue, L., Neira, M., Nelson, C., Gleave, M., and Etzioni, R. (2007) Cluster-based network model for time-course gene expression data. Biostatistics 8, 507–525.
22. Kim, B., Zhang, L., Berg, A., Fan, J., and Wu, R. (2008) A computational approach to the functional clustering of periodic gene expression profiles. Genetics 180, 821–834.
23. Wang, Z. and Wu, R. (2004) A statistical model for high-resolution mapping of quantitative trait loci determining human HIV-1 dynamics. Statistics in Medicine 23, 3033–3051.
24. Ma, C., Casella, G., and Wu, R. (2002) Functional mapping of quantitative trait loci underlying the character process: a theoretical framework. Genetics 161, 1751–1762.
25. Li, N., McMurry, T., Berg, A., Zhong, W., Berceli, S., and Wu, R. (2010) Functional clustering of periodic transcriptional profiles through ARMA(p,q). PLoS ONE 5, e9894.
26. Spellman, P., Sherlock, G., Zhang, M., Iyer, V., Anders, K., Eisen, M., Brown, P., Botstein, D., and Futcher, B. (1998) Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell 9, 3273–3297.
27. Brockwell, P. and Davis, R. (1991) Time series: theory and methods. Springer.
28. Haddad, J. (2004) On the closed form of the covariance matrix and its inverse of the causal ARMA process. Journal of Time Series Analysis 25, 443–448.
29. Akaike, H. (1974) A new look at the statistical model identification. IEEE Transactions on Automatic Control 19, 716–723.
30. Schwarz, G. (1978) Estimating the dimension of a model. The Annals of Statistics 6, 461–464.

Acknowledgments

This work is supported by NSF/NIH joint grant DMS/NIGMS-0540745 and NIH ARRA grant 09095.

Author information

Authors and Affiliations

Center for Statistical Genetics, Pennsylvania State University, Hershey, PA, USA
Rongling Wu

Corresponding author

Correspondence to Rongling Wu.

Editor information

Editors and Affiliations

Biochemisches Institut, Universität Zürich, Winterthurerstr. 190, Zürich, 8057, Switzerland
Attila Becskei

Appendix

Here, we give the procedure for estimating the parameters in $ \Lambda = (\Theta, \Psi) $ within the EM algorithm framework. In the E-step, the posterior expectation of $ z_{ij} $ is evaluated as:

$$ P_{j|i} = E[z_{ij} | \Lambda, y_i] = \Pr[z_{ij} = 1 | \Lambda, y_i] = \frac{\omega_{j|i} f_j(y_i; \mu_j, \Sigma_i)}{\sum\nolimits_{j' = 0}^{2} \omega_{j'|i} f_{j'}(y_i; \mu_{j'}, \Sigma_i)}, $$

(14)

where we assume that the covariance matrix is subject specific.
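The E-step in Eq. 14 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: $ f_j $ is taken to be a multivariate normal density, and the function names `log_mvn_density` and `posterior_probs` are hypothetical, not taken from the chapter. The computation is done on the log scale so that long expression profiles do not underflow.

```python
import numpy as np

def log_mvn_density(y, mu, cov):
    """Log density of a multivariate normal N(mu, cov) evaluated at y."""
    m = y.shape[0]
    chol = np.linalg.cholesky(cov)             # cov = chol @ chol.T
    z = np.linalg.solve(chol, y - mu)          # whitened residual
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (m * np.log(2.0 * np.pi) + log_det + z @ z)

def posterior_probs(y_i, omega_i, mus, cov_i):
    """Eq. 14: posterior probabilities P_{j|i} for one subject.

    omega_i : prior weights omega_{j|i}, one per mixture component j
    mus     : list of mean vectors mu_j, one per component
    cov_i   : subject-specific covariance matrix Sigma_i
    """
    log_terms = np.array([
        np.log(w) + log_mvn_density(y_i, mu, cov_i)
        for w, mu in zip(omega_i, mus)
    ])
    log_terms -= log_terms.max()               # guard against underflow
    probs = np.exp(log_terms)
    return probs / probs.sum()
```

Calling `posterior_probs` once per subject yields the weights $ P_{j|i} $ that enter every M-step update.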

In the M-step, closed-form solutions exist for ω (see Eqs. 8–11) and for the parameters in $ \Lambda $ except for τ and σ².

Suppose the gene expression trajectory is approximated by the first K orders of the Fourier series. Then $ \Lambda_j = (c_j, \tau_j) $, where $ c_j = (\alpha_{0j}, \alpha_{1j}, \beta_{1j}, \ldots, \alpha_{Kj}, \beta_{Kj}) $. We have

$$ \frac{\partial \log L_c(\Lambda | y)}{\partial c_j} = \left[ \frac{\partial \log L_c(\Lambda | y)}{\partial \mu_j} \right] \left[ \frac{\partial \mu_j}{\partial c_j} \right]. $$

(15)

The parameter $ c_j $ can be updated by setting Eq. 15 to zero. Since

$$ \frac{{\partial \log {L_c}(\Lambda |y)}}{{\partial {\mu_j}}} = \sum\limits_{i = 1}^n {{P_{j|i}}{{({y_i} - {\mu_j})}^T}\Sigma_i^{ - 1}} $$

and $ \partial {\mu_j}/\partial {c_j} = {D_i}({\tau_j}) $, where

$$ D_i(\tau_j) = \left( \begin{array}{cccccc} 1 & \cos\left(\frac{2\pi t_{i1}}{\tau_j}\right) & \sin\left(\frac{2\pi t_{i1}}{\tau_j}\right) & \cdots & \cos\left(\frac{2\pi K t_{i1}}{\tau_j}\right) & \sin\left(\frac{2\pi K t_{i1}}{\tau_j}\right) \\ 1 & \cos\left(\frac{2\pi t_{i2}}{\tau_j}\right) & \sin\left(\frac{2\pi t_{i2}}{\tau_j}\right) & \cdots & \cos\left(\frac{2\pi K t_{i2}}{\tau_j}\right) & \sin\left(\frac{2\pi K t_{i2}}{\tau_j}\right) \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & \cos\left(\frac{2\pi t_{i m_i}}{\tau_j}\right) & \sin\left(\frac{2\pi t_{i m_i}}{\tau_j}\right) & \cdots & \cos\left(\frac{2\pi K t_{i m_i}}{\tau_j}\right) & \sin\left(\frac{2\pi K t_{i m_i}}{\tau_j}\right) \\ \end{array} \right), $$

we have

$$ \hat{c}_j = \left[ \sum\limits_{i = 1}^n P_{j|i} D_i(\tau_j)^T \Sigma_i^{-1} D_i(\tau_j) \right]^{-1} \left[ \sum\limits_{i = 1}^n P_{j|i} D_i(\tau_j)^T \Sigma_i^{-1} y_i \right]. $$
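The update for $ \hat{c}_j $ is a posterior-weighted generalized least-squares solve, which the following NumPy sketch illustrates. The helper names `fourier_design` and `update_c` are hypothetical, and the sketch assumes each $ \Sigma_i $ is supplied as a dense symmetric matrix; it uses linear solves rather than explicit inverses, a substitution made here for numerical convenience.

```python
import numpy as np

def fourier_design(t, tau, K):
    """Design matrix D_i(tau) with columns 1, cos(2*pi*k*t/tau), sin(2*pi*k*t/tau), k = 1..K."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, K + 1):
        angle = 2.0 * np.pi * k * t / tau
        cols.append(np.cos(angle))
        cols.append(np.sin(angle))
    return np.column_stack(cols)

def update_c(ys, ts, post, covs, tau, K):
    """Posterior-weighted GLS update of the Fourier coefficients c_j.

    ys, ts, covs : per-subject expression vectors, time points, covariances
    post         : posterior weights P_{j|i}, one per subject
    """
    p = 2 * K + 1
    lhs = np.zeros((p, p))
    rhs = np.zeros(p)
    for y, t, w, cov in zip(ys, ts, post, covs):
        D = fourier_design(t, tau, K)
        cov_inv_D = np.linalg.solve(cov, D)    # Sigma_i^{-1} D_i
        lhs += w * D.T @ cov_inv_D             # P * D^T Sigma^{-1} D
        rhs += w * cov_inv_D.T @ y             # P * D^T Sigma^{-1} y (Sigma symmetric)
    return np.linalg.solve(lhs, rhs)
```

Because subjects may be measured at different time points, the design matrix is rebuilt per subject; only the accumulated $ (2K + 1) \times (2K + 1) $ system is solved once.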

Since the analytical form of the inverse of $ \Sigma_i $ is not available, we calculate it recursively, obtaining the inverse covariance matrix of ARMA(p,q) through its association with that of ARMA(p,q − 1).

We can write $ \Sigma_i = \sigma^2 R_i $, where $ R_i $ is the correlation matrix that is entirely determined by the ARMA parameters $ \varphi_1, \ldots, \varphi_p, \theta_1, \ldots, \theta_q $. The variance σ² can be updated by:

$$ {\hat{\sigma }^2} = \frac{{\sum\nolimits_{i = 1}^n {\sum\nolimits_{j = 1}^J {{P_{j|i}}{{({y_i} - {\mu_j})}^T}R_i^{ - 1}({y_i} - {\mu_j})} } }}{{\sum\nolimits_{i = 1}^n {{m_i}} }}. $$

(16)

Again $ R_i^{ - 1} $ can be calculated by the method of Haddad (28).
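As a rough illustration of Eq. 16, the sketch below accumulates the posterior-weighted quadratic forms and divides by the total number of observations. The function name `update_sigma2` and the nested-list data layout are assumptions made for this example, and a generic linear solve stands in for the closed-form inverse of $ R_i $ referred to above.

```python
import numpy as np

def update_sigma2(resids, post, R_list):
    """Eq. 16: update of the common variance sigma^2.

    resids[i][j] : residual vector y_i - mu_j for subject i, component j
    post[i][j]   : posterior weight P_{j|i}
    R_list[i]    : correlation matrix R_i of subject i
    """
    numerator = 0.0
    total_obs = 0
    for r_i, p_i, R in zip(resids, post, R_list):
        total_obs += R.shape[0]                # m_i time points for subject i
        for r, p in zip(r_i, p_i):
            # r^T R^{-1} r via a linear solve (stand-in for a closed-form inverse)
            numerator += p * (r @ np.linalg.solve(R, r))
    return numerator / total_obs
```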

Because there are no closed-form solutions for $ \tau_j $ and the ARMA parameters $ \varphi_1, \ldots, \varphi_p $ and $ \theta_1, \ldots, \theta_q $, their estimates are updated using a one-step Newton–Raphson method within each iteration. In particular, in the $ (\nu + 1) $th iteration, $ \tau_j $ can be updated by:

$$ \tau_j^{\nu + 1} = \tau_j^\nu - \frac{{\partial /\partial {\tau_j}\log {L_c}(\Lambda |y){|_{\Lambda = {\Lambda^\nu }}}}}{{{\partial^2}/\partial \tau_j^2\log {L_c}(\Lambda |y){|_{\Lambda = {\Lambda^\nu }}}}}, $$

(17)

where

$$ \frac{\partial }{{\partial {\tau_j}}}\log {L_c}(\Lambda |y) = \sum\limits_{i = 1}^n {{P_{j|i}}{{({y_i} - {\mu_j})}^T}\Sigma_i^{ - 1}} {\delta_{ij}} $$

with $ \delta_{ij} = \partial \mu_j / \partial \tau_j $ being an $ m_i \times 1 $ vector with components

$$ \delta_{ijl} = \sum\limits_{k = 1}^K \left[ \alpha_{kj} \sin\left( \frac{2\pi k t_{il}}{\tau_j} \right) \frac{2\pi k t_{il}}{\tau_j^2} - \beta_{kj} \cos\left( \frac{2\pi k t_{il}}{\tau_j} \right) \frac{2\pi k t_{il}}{\tau_j^2} \right], $$

and

$$ \frac{{{\partial^2}}}{{\partial \tau_j^2}}\log {L_c}(\Lambda |y) = \sum\limits_{i = 1}^n {\left[ { - {P_{j|i}}\delta_{ij}^T\Sigma_i^{ - 1}{\delta_{ij}} + {P_{j|i}}{{({y_i} - {\mu_j})}^T}\Sigma_i^{ - 1}\frac{{{\partial^2}}}{{\partial \tau_j^2}}{\mu_j}} \right]}. $$
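The Newton–Raphson step for $ \tau_j $ in Eq. 17 can be sketched as follows. The second derivative of the mean curve is not written out in the text; the expression used in `d2mu_dtau2` is obtained here by differentiating the first derivative once more and should be read as a derived assumption. All function names are hypothetical, and the inverse covariance matrices are assumed to be precomputed.

```python
import numpy as np

def _harmonics(t, c, tau):
    """Yield (alpha_k, beta_k, u_k) with u_k = 2*pi*k*t/tau for k = 1..K."""
    t = np.asarray(t, dtype=float)
    K = (len(c) - 1) // 2
    for k in range(1, K + 1):
        yield c[2 * k - 1], c[2 * k], 2.0 * np.pi * k * t / tau

def mean_curve(t, c, tau):
    """Fourier mean: alpha_0 + sum_k [alpha_k cos(u_k) + beta_k sin(u_k)]."""
    mu = np.full(np.shape(t), float(c[0]))
    for a, b, u in _harmonics(t, c, tau):
        mu = mu + a * np.cos(u) + b * np.sin(u)
    return mu

def dmu_dtau(t, c, tau):
    """First derivative of the mean curve with respect to tau (the vector delta)."""
    d = np.zeros(np.shape(t))
    for a, b, u in _harmonics(t, c, tau):
        d = d + (a * np.sin(u) - b * np.cos(u)) * u / tau
    return d

def d2mu_dtau2(t, c, tau):
    """Second derivative of the mean curve with respect to tau (derived here)."""
    d2 = np.zeros(np.shape(t))
    for a, b, u in _harmonics(t, c, tau):
        d2 = d2 - (a * np.cos(u) + b * np.sin(u)) * (u / tau) ** 2
        d2 = d2 - 2.0 * (a * np.sin(u) - b * np.cos(u)) * u / tau ** 2
    return d2

def newton_step_tau(ys, ts, post, cov_invs, c, tau):
    """One Newton-Raphson update of tau (Eq. 17) for a single component."""
    grad, hess = 0.0, 0.0
    for y, t, w, s_inv in zip(ys, ts, post, cov_invs):
        r = np.asarray(y, dtype=float) - mean_curve(t, c, tau)
        d = dmu_dtau(t, c, tau)
        grad += w * (r @ s_inv @ d)
        hess += w * (-(d @ s_inv @ d) + r @ s_inv @ d2mu_dtau2(t, c, tau))
    return tau - grad / hess
```

A single step is taken per EM iteration, so the starting value of the period matters: a Newton step is only reliable when the current value is already reasonably close to a maximum.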

Similarly, the parameters $ \varphi_1, \ldots, \varphi_p $ and $ \theta_1, \ldots, \theta_q $ can be updated by the one-step Newton–Raphson method outlined above. However, because there are no analytical forms of the first and second derivatives of the expected complete-data log-likelihood with respect to the $ \varphi $'s and $ \theta $'s, we use numerical differentiation to calculate these quantities. To ease the presentation of the method, denote the $ (p + q) $-dimensional vector $ \psi = (\varphi_1, \ldots, \varphi_p, \theta_1, \ldots, \theta_q) $. The first and second derivatives with respect to the $ \kappa $th component of $ \psi $ are approximated, respectively, by:

$$ \frac{{E\left[ {\log {L_c}({\Lambda_{ - \psi }},\psi + {h_n}{e_\kappa }|y)} \right] - E\left[ {\log {L_c}(\Lambda |y)} \right]}}{{{h_n}}}, $$

(18)

and

$$ \frac{{E\left[ {\log {L_c}({\Lambda_{ - \psi }},\psi + {h_n}{e_\kappa }|y)} \right] - 2E\left[ {\log {L_c}(\Lambda |y)} \right] + E\left[ {\log {L_c}({\Lambda_{ - \psi }},\psi - {h_n}{e_\kappa }|y)} \right]}}{{h_n^2}}, $$

(19)

where E denotes the posterior expectation of the complete-data log-likelihood with respect to the latent variables $ z_{ij} $, $ \Lambda_{-\psi} $ denotes the parameters in $ \Lambda $ other than $ \psi $, $ e_\kappa $ is the $ (p + q) $-dimensional unit vector whose $ \kappa $th component is 1, and $ h_n $ is the bandwidth (step size) chosen by the investigator. When $ h_n $ is small enough, the numerical differentiation approximates the true derivatives adequately. On the other hand, if $ h_n $ is too small, rounding errors from the numerical computation may degrade the results.
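The finite-difference approximations in Eqs. 18 and 19 can be sketched as below. Here `q` is a hypothetical stand-in for the expected complete-data log-likelihood viewed as a function of $ \psi $ alone (all other parameters held fixed), and `numeric_derivatives` is an illustrative name rather than anything defined in the chapter.

```python
import numpy as np

def numeric_derivatives(q, psi, kappa, h):
    """Finite-difference derivatives of q along component kappa of psi.

    first  : forward difference, as in Eq. 18
    second : central second difference, as in Eq. 19
    """
    psi = np.asarray(psi, dtype=float)
    e = np.zeros_like(psi)
    e[kappa] = 1.0                             # unit vector e_kappa
    q0 = q(psi)
    q_plus = q(psi + h * e)
    q_minus = q(psi - h * e)
    first = (q_plus - q0) / h
    second = (q_plus - 2.0 * q0 + q_minus) / h ** 2
    return first, second

def newton_update_component(q, psi, kappa, h):
    """One-step Newton-Raphson update of a single component of psi."""
    first, second = numeric_derivatives(q, psi, kappa, h)
    new_psi = np.array(psi, dtype=float)
    new_psi[kappa] -= first / second
    return new_psi
```

The second difference divides by $ h_n^2 $, so it is the first quantity to suffer when the step is made very small; in double precision a step far below about $ 10^{-5} $ times the scale of the parameter is usually where rounding error begins to dominate.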

About this protocol

Cite this protocol

Berg, A., Li, N., Tong, C., Wang, Z., Berceli, S.A., Wu, R. (2011). Functional Mapping of Expression Quantitative Trait Loci that Regulate Oscillatory Gene Expression. In: Becskei, A. (eds) Yeast Genetic Networks. Methods in Molecular Biology, vol 734. Humana Press. https://doi.org/10.1007/978-1-61779-086-7_12

DOI: https://doi.org/10.1007/978-1-61779-086-7_12
Published: 08 March 2011
Publisher Name: Humana Press
Print ISBN: 978-1-61779-085-0
Online ISBN: 978-1-61779-086-7
eBook Packages: Springer Protocols
