Transcriptomics within the Exposome Paradigm

  • D. A. Sarigiannis


The advent of omics technologies has significantly enhanced our capacity to interpret mechanistically the association between environmental exposure and disease. Although understanding these interactions requires capturing perturbations at different levels of biological organization, transcriptomics plays a key role, because modulation of gene expression represents the initial biological perturbation due to environmental exposure. This is of particular importance when assessing real-life exposure, which involves multiple stressors in highly variable time regimes. This chapter aims at (a) demonstrating the place of transcriptomics in modern risk assessment and environmental health associations, highlighting the bioinformatics tools necessary for interpretation, and (b) demonstrating the feasibility of using transcriptomics to understand the environmental risk associated with real-life ubiquitous mixtures. Although environmental exposure involves mixtures of chemicals rather than individual agents, most of the toxic effects of air pollutants are ascribed to single chemicals. There is growing recognition in both the scientific and regulatory communities, however, that more comprehensive approaches are needed for managing the potential impact of complex environmental chemical mixtures on human health. From this perspective, toxicogenomics is expected to be an appropriate screening method for assessing the biological effects of complex chemical mixtures, allowing us to review the whole spectrum of potential biological responses rather than focusing on a predefined number of endpoints, as in classical toxicological analysis.
In this chapter, beyond an overview of the analytical and computational aspects necessary for implementing toxicogenomics in the context of the exposome, a concrete application is presented. It concerns a typical indoor air mixture, as defined in the EU-wide review study INDEX, and a mixture of polyaromatic hydrocarbons (PAHs) isolated from urban air in the city of Milan, with the aim of identifying specific sets of biomarkers for each of the two types of exposure (indoor or outdoor). A human cell line derived from the bronchopulmonary system (A549) was used as the in vitro model to support investigation of the molecular basis for adverse outcomes attributed to indoor and/or outdoor air pollution on the basis of epidemiological evidence. Using a total gene expression assay on Applied Biosystems microarrays, large sets of genes modulated by exposure to each mixture were profiled. This process led us to identify common biochemical pathways and specific molecular responses. Indoor air mixtures induced a higher level of gene modulation than ambient air PAHs. A closer look at the biological responses confirmed major differences in the mode of action of the two mixtures. Indoor air primarily induced modulation of genes associated with protein targeting and localization, in particular cytoskeletal organization; PAHs mostly modulated the expression of genes related to cell motility and of gene networks regulating cell–cell signaling, cell proliferation, and differentiation. These results provide biological information useful for articulating mechanistic hypotheses linking exposure to xenobiotic mixtures with physiological responses. The latter are supported by a large body of epidemiological evidence associating exposure to urban air pollution with respiratory allergies, chronic obstructive pulmonary disease, cardiovascular disease, and cancer. 
Lately, such evidence has been extended to include associations of exposure to polluted ambient and indoor air with kidney disease and even neurodegenerative disorders, and in particular dementia.
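
The comparison of shared and mixture-specific gene modulation described above can be illustrated with a minimal sketch. It is not the analysis pipeline used in the chapter: the gene identifiers, fold-change values, and the cutoff of 1.0 on the absolute log2 fold change are hypothetical placeholders chosen only to show the set logic.

```python
# Hypothetical log2 fold changes (exposed vs. control) for two exposures.
# Gene IDs and values are placeholders, not data from the study.
indoor_log2fc = {"GENE_A": 1.8, "GENE_B": -1.2, "GENE_C": 0.3, "GENE_D": 2.1}
pah_log2fc = {"GENE_A": 1.1, "GENE_B": 0.2, "GENE_C": -1.5, "GENE_E": 1.4}


def modulated_genes(log2fc, threshold=1.0):
    """Return the genes whose absolute log2 fold change meets the threshold.

    The threshold is an assumed cutoff; real studies typically also apply
    a statistical test with multiple-testing correction.
    """
    return {gene for gene, value in log2fc.items() if abs(value) >= threshold}


indoor = modulated_genes(indoor_log2fc)
pah = modulated_genes(pah_log2fc)

common = indoor & pah        # modulated by both mixtures
indoor_only = indoor - pah   # candidate indoor-specific markers
pah_only = pah - indoor      # candidate PAH-specific markers

print("common:", sorted(common))
print("indoor only:", sorted(indoor_only))
print("PAH only:", sorted(pah_only))
```

In practice, the resulting gene sets would then be passed to a pathway or gene ontology enrichment step to relate them to processes such as cytoskeletal organization or cell motility.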


Keywords: Transcriptomics; Genetic susceptibility; Integrated exposure biology; Systems biology



The author gratefully acknowledges the support of the European Commission through the grant No. 603946 (HEALS—Health and Environment-wide Associations via Large Population Studies) funded through the 7th Framework Program for Research and Technological Development of the EU.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2019

Authors and Affiliations

  1. HERACLES Research Center on the Exposome and Health, Center for Interdisciplinary Research and Innovation, Thessaloniki, Greece
  2. Environmental Engineering Laboratory, School of Chemical Engineering, Aristotle University of Thessaloniki, Thessaloniki, Greece
  3. University School for Advanced Study (IUSS), Pavia, Italy
