Robust multivariate analysis of compositional data of treated wastewaters

  • Petr Praus
Original Article

Abstract

A dataset of water samples collected downstream of a biological wastewater treatment plant (BWWTP) over one year was processed as compositional data by log-ratio transformation and then analysed by robust principal component analysis (RPCA) and robust Mahalanobis distances (RMDs). For this purpose, covariance matrices were computed using the minimum covariance determinant (MCD) algorithm. The 11 physico-chemical parameters, both raw and transformed, were reduced to 4 robust principal components (RPCs). Correlations between the centred log-ratio (clr)-transformed parameters and their RPCs were found to be more realistic than those between the raw parameters and their RPCs. The first and second RPCs represented nitrogen and phosphorus compounds, respectively. Their temporal changes were explained by processes occurring during biological wastewater treatment. A nitrification process was also demonstrated by the temporal changes of the raw and clr-transformed ammonium concentrations. The robust and classical Mahalanobis distances were computed from the raw and isometric log-ratio (ilr)-transformed data to show the overall temporal changes in treated wastewater composition and to detect outlying samples.
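
The two log-ratio transformations named in the abstract can be illustrated with a minimal sketch. It is not the author's MATLAB implementation: the function names, the sample concentrations, and the choice of pivot coordinates as the ilr basis are assumptions made here for illustration (any orthonormal basis gives a valid ilr).

```python
import math

def clr(x):
    """Centred log-ratio: log of each part minus the mean of the logs
    (equivalently, log of each part over the geometric mean)."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

def ilr(x):
    """Isometric log-ratio in pivot coordinates (an assumed basis choice).
    Maps D strictly positive parts to D - 1 orthonormal coordinates."""
    logs = [math.log(v) for v in x]
    coords = []
    for i in range(len(logs) - 1):
        rest = logs[i + 1:]   # logs of the parts after part i
        k = len(rest)         # number of remaining parts
        coords.append(math.sqrt(k / (k + 1)) * (logs[i] - sum(rest) / k))
    return coords

# Hypothetical concentrations (mg/L); not data from the study.
sample = [12.0, 3.5, 0.8, 45.0]
clr_sample = clr(sample)
ilr_sample = ilr(sample)
```

The clr coordinates always sum to zero, so their covariance matrix is singular. Covariance-based tools that need a matrix inverse, such as Mahalanobis distances, are therefore usually applied to the ilr coordinates instead, which is consistent with the abstract's use of ilr for the distances and clr for interpreting the components.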

Keywords

Compositional data; Log-ratio transformation; Treated wastewaters; Multivariate analysis

Notes

Acknowledgements

The author thanks Dr. Zdeněk Matěj (Lund University, Sweden) for his help with the MATLAB subroutines. This work was financially supported by the project “Institute of Environmental Technology—Excellent Research” (CZ.02.1.01/0.0/0.0/16_019/0000853) provided by the Ministry of Education, Youth and Sports of the Czech Republic.

Supplementary material

12665_2019_8248_MOESM1_ESM.docx (177 kb)
Supplementary material 1 (DOCX 180 kb)

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Institute of Environmental Technology and Department of Chemistry, VŠB-Technical University of Ostrava, Ostrava, Poruba, Czech Republic