Functional Mapping of Expression Quantitative Trait Loci that Regulate Oscillatory Gene Expression

Yeast Genetic Networks

Part of the book series: Methods in Molecular Biology (MIMB, volume 734)

Abstract

Genetic networks underlying many biological processes, such as vertebrate somitogenesis, cell cycle, hormonal signaling, and circadian rhythms, are characterized by oscillations in gene expression. It has been recognized that the frequency and amplitude of gene expression oscillations vary among individuals and can be controlled by specific expression quantitative trait loci (eQTLs). In this chapter, we develop a dynamic model for mapping and identifying such eQTLs by integrating mathematical aspects of oscillatory dynamics into the functional mapping framework. The model can determine whether and how eQTLs regulate individual genes’ activation kinetics and expression dynamics by estimating and testing Fourier series parameters for different eQTL genotypes. We incorporate a general autoregressive moving-average process of order (r,s), the so-called ARMA(r,s), to model the covariance structure of gene expression profiles measured over a time course, broadening the applicability of the new dynamic model to mapping eQTLs in practice. The expectation-maximization (EM) algorithm was derived to estimate all parameters modeling the mean–covariance structures within a mixture model setting. Simulation studies were performed to investigate the statistical behavior of the model. The model will provide a powerful statistical tool for mapping eQTLs and their epistatic interactions that regulate oscillations in gene expression, helping to construct a regulatory genetic network for those periodic biological phenomena.

References

  1. Brem, R., Yvert, G., Clinton, R., and Kruglyak, L. (2002) Genetic dissection of transcriptional regulation in budding yeast. Science 296, 752–755.

  2. Cheung, V. and Spielman, R. (2002) The genetics of variation in gene expression. Nature Genetics 32, 522–525.

  3. Cheung, V. and Spielman, R. (2009) Genetics of human gene expression: mapping DNA variants that influence gene expression. Nature Reviews Genetics 10, 595–604.

  4. Cheung, V., Conlin, L., Weber, T., Arcaro, M., Jen, K., Morley, M., and Spielman, R. (2003) Natural variation in human gene expression assessed in lymphoblastoid cells. Nature Genetics 33, 422–425.

  5. Cookson, W., Liang, L., Abecasis, G., Moffatt, M., and Lathrop, M. (2009) Mapping complex disease traits with global gene expression. Nature Reviews Genetics 10, 184–194.

  6. Schadt, E., Monks, S., Drake, T., Lusis, A., Che, N., Colinayo, V., Ruff, T., Milligan, S., Lamb, J., Cavet, G., et al. (2003) Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297–302.

  7. Jansen, R. and Nap, J. (2001) Genetical genomics: the added value from segregation. Trends in Genetics 17, 388–391.

  8. Goldbeter, A. (2002) Computational approaches to cellular rhythms. Nature 420, 238–245.

  9. Rustici, G., Mata, J., Kivinen, K., Lió, P., Penkett, C., Burns, G., Hayles, J., Brazma, A., Nurse, P., and Bahler, J. (2004) Periodic gene expression program of the fission yeast cell cycle. Nature Genetics 36, 809–817.

  10. Swinburne, I., Miguez, D., Landgraf, D., and Silver, P. (2008) Intron length increases oscillatory periods of gene expression in animal cells. Genes & Development 22, 2342–2346.

  11. Holter, N., Maritan, A., Cieplak, M., Fedoroff, N., and Banavar, J. (2001) Dynamic modeling of gene expression data. Proceedings of the National Academy of Sciences of the United States of America 98, 1693–1698.

  12. Qian, J., Dolled-Filhart, M., Lin, J., Yu, H., and Gerstein, M. (2001) Beyond synexpression relationships: local clustering of time-shifted and inverted gene expression profiles identifies new, biologically relevant interactions. Journal of Molecular Biology 314, 1053–1066.

  13. Bar-Joseph, Z., Gerber, G., Gifford, D., Jaakkola, T., and Simon, I. (2003) Continuous representations of time-series gene expression data. Journal of Computational Biology 10, 341–356.

  14. Luan, Y. and Li, H. (2003) Clustering of time-course gene expression data using a mixed-effects model with B-splines. Bioinformatics 19, 474–482.

  15. Park, T., Yi, S., Lee, S., Lee, S., Yoo, D., Ahn, J., and Lee, Y. (2003) Statistical tests for identifying differentially expressed genes in time-course microarray experiments. Bioinformatics 19, 694–703.

  16. Wakefield, J., Zhou, C., and Self, S. (2003) Modelling gene expression over time: curve clustering with informative prior distributions. Bayesian Statistics 7, 721–732.

  17. Ernst, J., Nau, G., and Bar-Joseph, Z. (2005) Clustering short time series gene expression data. Bioinformatics 21, 159–168.

  18. Storey, J., Xiao, W., Leek, J., Tompkins, R., and Davis, R. (2005) Significance analysis of time course microarray experiments. Proceedings of the National Academy of Sciences of the United States of America 102, 12837–12842.

  19. Ma, P., Castillo-Davis, C., Zhong, W., and Liu, J. (2006) A data-driven clustering method for time course gene expression data. Nucleic Acids Research 34, 1261–1269.

  20. Ng, S., McLachlan, G., Wang, K., Ben-Tovim Jones, L., and Ng, S. (2006) A mixture model with random-effects components for clustering correlated gene-expression profiles. Bioinformatics 22, 1745–1752.

  21. Inoue, L., Neira, M., Nelson, C., Gleave, M., and Etzioni, R. (2007) Cluster-based network model for time-course gene expression data. Biostatistics 8, 507–525.

  22. Kim, B., Zhang, L., Berg, A., Fan, J., and Wu, R. (2008) A Computational Approach to the Functional Clustering of Periodic Gene Expression Profiles. Genetics 180, 821–834.

  23. Wang, Z. and Wu, R. (2004) A statistical model for high-resolution mapping of quantitative trait loci determining human HIV-1 dynamics. Statistics in Medicine 23, 3033–3051.

  24. Ma, C., Casella, G., and Wu, R. (2002) Functional mapping of quantitative trait loci underlying the character process: a theoretical framework. Genetics 161, 1751–1762.

  25. Li, N., McMurry, T., Berg, A., Zhong, W., Berceli, S., and Wu, R. (2010) Functional clustering of periodic transcriptional profiles through ARMA(p,q). PLoS ONE 5, e9894.

  26. Spellman, P., Sherlock, G., Zhang, M., Iyer, V., Anders, K., Eisen, M., Brown, P., Botstein, D., and Futcher, B. (1998) Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell 9, 3273–3297.

  27. Brockwell, P. and Davis, R. (1991) Time series: theory and methods. (Springer).

  28. Haddad, J. (2004) On the closed form of the covariance matrix and its inverse of the causal ARMA process. Journal of Time Series Analysis 25, 443–448.

  29. Akaike, H. (1974) A new look at the statistical model identification. IEEE Transactions on Automatic Control 19, 716–723.

  30. Schwarz, G. (1978) Estimating the dimension of a model. The Annals of Statistics 6, 461–464.

Acknowledgments

This work is supported by NSF/NIH joint grant DMS/NIGMS-0540745 and NIH ARRA grant 09095.

Author information

Corresponding author

Correspondence to Rongling Wu.

Appendix

Here, we give the procedure for estimating the parameters in \( \Lambda = (\Theta, \Psi ) \) within the EM algorithm framework. In the E-step, the posterior expectation of \( z_{ij} \) is evaluated as:

$$ P_{j|i} = E[z_{ij}|\Lambda, y_i] = \Pr [z_{ij} = 1|\Lambda, y_i] = \frac{\omega_{j|i} f_j(y_i;\mu_j,\Sigma_i)}{\sum\nolimits_{j' = 0}^{2} \omega_{j'|i} f_{j'}(y_i;\mu_{j'},\Sigma_i)}, $$
(14)

where we assume that the covariance matrix is subject specific.
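The E-step in Eq. 14 can be illustrated with a minimal Python sketch. It assumes multivariate normal densities \( f_j \) and that the prior genotype probabilities \( \omega_{j|i} \) have already been computed from the marker data; the function names `mvn_logpdf` and `e_step` and the toy inputs are hypothetical and are not taken from the chapter.

```python
import numpy as np

def mvn_logpdf(y, mu, cov):
    """Log-density of a multivariate normal N(mu, cov) evaluated at y."""
    resid = y - mu
    _, logdet = np.linalg.slogdet(cov)
    quad = resid @ np.linalg.solve(cov, resid)
    return -0.5 * (len(y) * np.log(2.0 * np.pi) + logdet + quad)

def e_step(ys, omegas, mus, covs):
    """Posterior genotype probabilities P[i, j] in the form of Eq. 14.

    ys     : list of n response vectors y_i
    omegas : (n, J) array of prior probabilities omega_{j|i}
    mus    : list of n arrays; mus[i][j] is the mean of genotype j at
             subject i's time points
    covs   : list of n subject-specific covariance matrices Sigma_i
    """
    n, J = omegas.shape
    post = np.empty((n, J))
    for i in range(n):
        log_w = np.array([
            np.log(omegas[i, j]) + mvn_logpdf(ys[i], mus[i][j], covs[i])
            for j in range(J)
        ])
        log_w -= log_w.max()      # stabilise before exponentiating
        w = np.exp(log_w)
        post[i] = w / w.sum()     # normalise over genotypes j' = 0, 1, 2
    return post

# Toy example (assumed values): three genotypes with flat means 0, 1, 2.
mus_one = np.vstack([np.zeros(5), np.ones(5), 2.0 * np.ones(5)])
ys = [np.full(5, 2.0), np.full(5, 0.1)]
omegas = np.array([[0.25, 0.5, 0.25], [0.25, 0.5, 0.25]])
post = e_step(ys, omegas, [mus_one, mus_one], [np.eye(5), np.eye(5)])
print(post.round(3))
```

Working on the log scale and subtracting the maximum before exponentiating is a design choice of this sketch that avoids underflow when the time series are long.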

In the M-step, closed form solutions exist for \( \omega \) (see Eqs. 8–11) and the parameters in \( \Lambda \) except for \( \tau \) and \( \sigma^2 \).

Suppose the gene expression trajectory is approximated by the first K orders of the Fourier series, then \( {\Lambda_j} = ({c_j},{\tau_j}), \) where \( {c_j} = ({\alpha_{0j}},{\alpha_{1j}},{\beta_{1j}},\; \ldots, \;{\alpha_{Kj}},{\beta_{Kj}}). \) We have

$$ \frac{\partial \log L_c(\Lambda |y)}{\partial c_j} = \left[ \frac{\partial \log L_c(\Lambda |y)}{\partial \mu_j} \right]\left[ \frac{\partial \mu_j}{\partial c_j} \right]. $$
(15)

The parameter \( c_j \) can be updated by setting Eq. 15 to zero. Since

$$ \frac{{\partial \log {L_c}(\Lambda |y)}}{{\partial {\mu_j}}} = \sum\limits_{i = 1}^n {{P_{j|i}}{{({y_i} - {\mu_j})}^T}\Sigma_i^{ - 1}} $$

and \( \partial {\mu_j}/\partial {c_j} = {D_i}({\tau_j}) \), where

$$ D_i(\tau_j) = \left( \begin{array}{cccccc} 1 & \cos \left( \frac{2\pi t_{i1}}{\tau_j} \right) & \sin \left( \frac{2\pi t_{i1}}{\tau_j} \right) & \cdots & \cos \left( \frac{2\pi K t_{i1}}{\tau_j} \right) & \sin \left( \frac{2\pi K t_{i1}}{\tau_j} \right) \\ 1 & \cos \left( \frac{2\pi t_{i2}}{\tau_j} \right) & \sin \left( \frac{2\pi t_{i2}}{\tau_j} \right) & \cdots & \cos \left( \frac{2\pi K t_{i2}}{\tau_j} \right) & \sin \left( \frac{2\pi K t_{i2}}{\tau_j} \right) \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & \cos \left( \frac{2\pi t_{i m_i}}{\tau_j} \right) & \sin \left( \frac{2\pi t_{i m_i}}{\tau_j} \right) & \cdots & \cos \left( \frac{2\pi K t_{i m_i}}{\tau_j} \right) & \sin \left( \frac{2\pi K t_{i m_i}}{\tau_j} \right) \\ \end{array} \right), $$

we have

$$ \hat{c}_j = \left[ \sum\limits_{i = 1}^n P_{j|i} D_i(\tau_j)^T \Sigma_i^{-1} D_i(\tau_j) \right]^{-1} \left[ \sum\limits_{i = 1}^n P_{j|i} D_i(\tau_j)^T \Sigma_i^{-1} y_i \right]. $$
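The update of \( c_j \) is a posterior-weighted generalized least-squares solve. The sketch below builds the Fourier design matrix \( D_i(\tau_j) \) and computes \( \hat{c}_j \) for one genotype with \( \tau_j \) held fixed. The function names `fourier_design` and `update_c`, the noise-free toy data, and the weights are assumptions made for illustration only.

```python
import numpy as np

def fourier_design(t, tau, K):
    """D_i(tau): columns 1, cos(2*pi*k*t/tau), sin(2*pi*k*t/tau), k = 1..K."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, K + 1):
        cols.append(np.cos(2.0 * np.pi * k * t / tau))
        cols.append(np.sin(2.0 * np.pi * k * t / tau))
    return np.column_stack(cols)

def update_c(ys, ts, post_j, cov_invs, tau, K):
    """Weighted least-squares update of c_j for a single genotype j."""
    p = 2 * K + 1
    lhs = np.zeros((p, p))
    rhs = np.zeros(p)
    for y, t, w, s_inv in zip(ys, ts, post_j, cov_invs):
        D = fourier_design(t, tau, K)
        lhs += w * (D.T @ s_inv @ D)   # sum_i P_{j|i} D_i^T Sigma_i^{-1} D_i
        rhs += w * (D.T @ s_inv @ y)   # sum_i P_{j|i} D_i^T Sigma_i^{-1} y_i
    return np.linalg.solve(lhs, rhs)

# Toy check (assumed values): noise-free profiles on two different time grids.
c_true = np.array([1.0, 0.5, -0.3])
ts = [np.arange(10.0), np.arange(0.5, 12.5, 1.0)]
ys = [fourier_design(t, 8.0, 1) @ c_true for t in ts]
cov_invs = [np.eye(len(t)) for t in ts]
c_hat = update_c(ys, ts, post_j=[0.9, 0.4], cov_invs=cov_invs, tau=8.0, K=1)
print(c_hat)
```

Solving the linear system with `np.linalg.solve` rather than forming the explicit inverse is a numerical convenience of the sketch and gives the same estimate.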

Since the analytical form of the inverse of \( {\Sigma_i} \) is not available, we use a recursive method to calculate the inverse covariance matrix of the ARMA(p,q) process through its association with that of the ARMA(p,q − 1) process.

We can write \( {\Sigma_i} = {\sigma^2}{R_i} \), where \( R_i \) is the correlation matrix that is entirely determined by the ARMA parameters \( {\varphi_1},\; \ldots, \;{\varphi_p},\;{\theta_1},\; \ldots, \;{\theta_q}. \) The variance \( \sigma^2 \) can be updated by:

$$ \hat{\sigma}^2 = \frac{\sum\nolimits_{i = 1}^n \sum\nolimits_{j = 0}^{2} P_{j|i} (y_i - \mu_j)^T R_i^{-1} (y_i - \mu_j)}{\sum\nolimits_{i = 1}^n m_i}. $$
(16)

Again \( R_i^{ - 1} \) can be calculated by the method of Haddad (28).
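To make the decomposition \( \Sigma_i = \sigma^2 R_i \) concrete, the following sketch constructs \( R_i \) for the special case of a stationary ARMA(1,1) process observed at equally spaced times and then applies the variance update of Eq. 16. Several assumptions are made here: the ARMA(1,1) case and equal spacing are chosen only for illustration, the sign convention is \( x_t = \varphi x_{t-1} + \varepsilon_t + \theta \varepsilon_{t-1} \), and the inverse is obtained with a plain numerical `np.linalg.inv` rather than the recursive method of Haddad (28). The function names are hypothetical.

```python
import numpy as np

def arma11_corr(m, phi, theta):
    """Correlation matrix of m equally spaced points of a stationary ARMA(1,1)."""
    rho = np.ones(m)
    if m > 1:
        # Lag-1 autocorrelation; higher lags decay geometrically with phi.
        rho1 = (1 + phi * theta) * (phi + theta) / (1 + 2 * phi * theta + theta ** 2)
        rho[1:] = rho1 * phi ** np.arange(m - 1)
    lags = np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
    return rho[lags]

def update_sigma2(ys, mus, post, corr_invs):
    """Variance update in the form of Eq. 16.

    mus[i][j] is the mean vector of genotype j at subject i's time points,
    post[i, j] is P_{j|i}, and corr_invs[i] is R_i^{-1}.
    """
    num = 0.0
    for i, y in enumerate(ys):
        for j in range(post.shape[1]):
            r = y - mus[i][j]
            num += post[i, j] * (r @ corr_invs[i] @ r)
    return num / sum(len(y) for y in ys)

# With theta = 0 the process reduces to AR(1), so rho(k) = phi**k.
R = arma11_corr(4, phi=0.6, theta=0.0)
R_inv = np.linalg.inv(R)   # numerical inverse, not the recursion of ref. 28

# Single subject, single component, identity correlation (assumed toy values).
s2 = update_sigma2(
    ys=[np.array([1.0, 2.0, 3.0, 4.0])],
    mus=[[np.zeros(4)]],
    post=np.array([[1.0]]),
    corr_invs=[np.eye(4)],
)
print(R[0], s2)
```

For unequally spaced or subject-specific time points, the lag structure would have to be adapted; this sketch does not attempt that.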

Because there are no closed form solutions for \( \tau_j \) and the ARMA parameters \( {\varphi_1},\; \ldots, \;{\varphi_p} \) and \( {\theta_1},\; \ldots, \;{\theta_q} \), their estimates are updated using a one-step Newton–Raphson method within each iteration. In particular, in the \( (\nu + 1) \)th iteration, \( \tau_j \) can be updated by:

$$ \tau_j^{\nu + 1} = \tau_j^\nu - \frac{{\partial /\partial {\tau_j}\log {L_c}(\Lambda |y){|_{\Lambda = {\Lambda^\nu }}}}}{{{\partial^2}/\partial \tau_j^2\log {L_c}(\Lambda |y){|_{\Lambda = {\Lambda^\nu }}}}}, $$
(17)

where

$$ \frac{\partial }{{\partial {\tau_j}}}\log {L_c}(\Lambda |y) = \sum\limits_{i = 1}^n {{P_{j|i}}{{({y_i} - {\mu_j})}^T}\Sigma_i^{ - 1}} {\delta_{ij}} $$

with \( {\delta_{ij}} \) being an \( {m_i} \times 1 \) vector whose components are

$$ \delta_{ijl} = \sum\limits_{k = 1}^K \left[ \alpha_{kj} \sin \left( \frac{2\pi k t_{il}}{\tau_j} \right) \frac{2\pi k t_{il}}{\tau_j^2} - \beta_{kj} \cos \left( \frac{2\pi k t_{il}}{\tau_j} \right) \frac{2\pi k t_{il}}{\tau_j^2} \right], $$

and

$$ \frac{{{\partial^2}}}{{\partial \tau_j^2}}\log {L_c}(\Lambda |y) = \sum\limits_{i = 1}^n {\left[ { - {P_{j|i}}\delta_{ij}^T\Sigma_i^{ - 1}{\delta_{ij}} + {P_{j|i}}{{({y_i} - {\mu_j})}^T}\Sigma_i^{ - 1}\frac{{{\partial^2}}}{{\partial \tau_j^2}}{\mu_j}} \right]}. $$
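The one-step Newton–Raphson update of \( \tau_j \) in Eq. 17 can be sketched as follows, using the score and second derivative given above. The second derivative of the mean, \( \partial^2 \mu_j / \partial \tau_j^2 \), is not written out in the text, so the expression in the code is derived here by differentiating \( \delta_{ij} \) once more and should be read as an assumption of this sketch. The function names and toy data are hypothetical.

```python
import numpy as np

def fourier_mean_derivs(t, c, tau):
    """Return mu, d mu / d tau, and d^2 mu / d tau^2 for the Fourier mean.

    c = (alpha_0, alpha_1, beta_1, ..., alpha_K, beta_K).
    """
    t = np.asarray(t, dtype=float)
    K = (len(c) - 1) // 2
    mu = np.full_like(t, c[0])
    d1 = np.zeros_like(t)
    d2 = np.zeros_like(t)
    for k in range(1, K + 1):
        a, b = c[2 * k - 1], c[2 * k]
        u = 2.0 * np.pi * k * t / tau
        mu += a * np.cos(u) + b * np.sin(u)
        # First derivative: the components delta_{ijl} given in the text.
        d1 += (a * np.sin(u) - b * np.cos(u)) * u / tau
        # Second derivative: obtained by differentiating d1 with respect to tau.
        d2 += (-(a * np.cos(u) + b * np.sin(u)) * u ** 2
               - 2.0 * (a * np.sin(u) - b * np.cos(u)) * u) / tau ** 2
    return mu, d1, d2

def newton_step_tau(ys, ts, post_j, cov_invs, c, tau):
    """One Newton-Raphson update of tau_j with all other parameters fixed."""
    score, hess = 0.0, 0.0
    for y, t, w, s_inv in zip(ys, ts, post_j, cov_invs):
        mu, d1, d2 = fourier_mean_derivs(t, c, tau)
        r = y - mu
        score += w * (r @ s_inv @ d1)
        hess += w * (-(d1 @ s_inv @ d1) + r @ s_inv @ d2)
    return tau - score / hess, score, hess

# Toy data (assumed values): noise-free profiles with true period 10.
c = np.array([1.0, 0.8, -0.4])
t = np.arange(0.0, 20.0, 0.5)
y_true, _, _ = fourier_mean_derivs(t, c, 10.0)
ys, ts = [y_true, y_true], [t, t]
cov_invs = [np.eye(len(t)), np.eye(len(t))]
tau_new, score, hess = newton_step_tau(ys, ts, [0.7, 0.3], cov_invs, c, tau=10.1)
print(tau_new)
```

Because only a single Newton step is taken per EM iteration, a starting value far from the true period can send the update in the wrong direction; the sketch does not include any step-size safeguard.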

Similarly, the parameters \( {\varphi_1},\; \ldots, \;{\varphi_p} \) and \( {\theta_1},\; \ldots, \;{\theta_q} \) can be updated by the one-step Newton–Raphson method outlined above. However, because there are no analytical forms of the first and second derivatives of the expected complete data log-likelihood with respect to the \( \varphi \)'s and \( \theta \)'s, we use numerical differentiation to calculate these quantities. To ease the presentation of the method, denote the \( (p + q) \)-dimensional vector \( \psi = ({\varphi_1},\; \ldots, \;{\varphi_p},\;{\theta_1},\; \ldots, \;{\theta_q}). \) The first and second derivatives with respect to the \( \kappa \)th component of \( \psi \) are approximated, respectively, by:

$$ \frac{{E\left[ {\log {L_c}({\Lambda_{ - \psi }},\psi + {h_n}{e_\kappa }|y)} \right] - E\left[ {\log {L_c}(\Lambda |y)} \right]}}{{{h_n}}}, $$
(18)

and

$$ \frac{{E\left[ {\log {L_c}({\Lambda_{ - \psi }},\psi + {h_n}{e_\kappa }|y)} \right] - 2E\left[ {\log {L_c}(\Lambda |y)} \right] + E\left[ {\log {L_c}({\Lambda_{ - \psi }},\psi - {h_n}{e_\kappa }|y)} \right]}}{{h_n^2}}, $$
(19)

where E denotes the posterior expectation of the complete data log-likelihood with respect to the missing indicators \( z_{ij} \), \( {\Lambda_{ - \psi }} \) denotes the parameters in \( \Lambda \) other than \( \psi \), \( e_\kappa \) is the \( (p + q) \)-dimensional unit vector whose \( \kappa \)th component is 1, and \( h_n \) is the bandwidth (step size) chosen by the investigator. When \( h_n \) is small enough, the numerical differentiation approximates the true derivatives adequately. On the other hand, if \( h_n \) is too small, round-off errors from the numerical computation may degrade the results.
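The finite-difference approximations in Eqs. 18 and 19 can be sketched in a few lines. In the code below, `q` is a hypothetical stand-in for the expected complete data log-likelihood viewed as a function of \( \psi \) alone; a simple concave quadratic is used so that the exact derivatives are known. The function names and the choice \( h_n = 10^{-4} \) are assumptions for illustration.

```python
import numpy as np

def numeric_derivs(q, psi, kappa, h):
    """Forward-difference first derivative (Eq. 18) and central second
    difference (Eq. 19) of q along the kappa-th coordinate of psi."""
    psi = np.asarray(psi, dtype=float)
    e = np.zeros_like(psi)
    e[kappa] = 1.0                      # unit vector e_kappa
    q0 = q(psi)
    q_plus = q(psi + h * e)
    q_minus = q(psi - h * e)
    first = (q_plus - q0) / h
    second = (q_plus - 2.0 * q0 + q_minus) / h ** 2
    return first, second

def newton_update(q, psi, kappa, h):
    """One Newton-Raphson step on the kappa-th component of psi."""
    first, second = numeric_derivs(q, psi, kappa, h)
    out = np.array(psi, dtype=float)
    out[kappa] -= first / second
    return out

# Hypothetical stand-in for the expected complete data log-likelihood.
def q(psi):
    return -(psi[0] - 0.3) ** 2 - 2.0 * (psi[1] + 0.1) ** 2

psi0 = np.array([0.0, 0.0])
first, second = numeric_derivs(q, psi0, kappa=0, h=1e-4)
psi1 = newton_update(q, psi0, kappa=0, h=1e-4)
print(first, second, psi1)
```

Repeating the call with a much smaller step, for example `h=1e-12`, is a quick way to see the trade-off described above: the second difference divides round-off error by \( h_n^2 \) and can become unreliable.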

Copyright information

© 2011 Springer Science+Business Media, LLC

Cite this protocol

Berg, A., Li, N., Tong, C., Wang, Z., Berceli, S.A., Wu, R. (2011). Functional Mapping of Expression Quantitative Trait Loci that Regulate Oscillatory Gene Expression. In: Becskei, A. (ed.) Yeast Genetic Networks. Methods in Molecular Biology, vol 734. Humana Press. https://doi.org/10.1007/978-1-61779-086-7_12

  • DOI: https://doi.org/10.1007/978-1-61779-086-7_12

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-61779-085-0

  • Online ISBN: 978-1-61779-086-7

  • eBook Packages: Springer Protocols
