Abstract
R comes with an excellent tutorial that, like many fine tutorials, tends to be ignored by people with little patience for material presented in a general manner. This is why the present chapter uses oceanographic examples to explain R concepts, and why code makes up so much of the text. The early examples are designed to encourage readers to become comfortable whilst navigating the R documentation, because this skill can be the key to moving from simple examples to real-world applications. The main concepts of R data types and language features are illustrated here in practical terms, with many of the explanations involving graphical representation. Since experienced R users are unlikely to study this chapter in great depth, specialized methods of oceanographic analysis are mainly deferred to succeeding chapters.
Notes
- 1.
- 2. The package count was inferred from web archives at http://wayback.archive.org/.
- 3.
- 4. The indented lines of the Makefile must start with a tab character, not spaces.
- 5. R displays a prompt before the input, but this is omitted throughout this book.
- 6. A subtle point is that R does not always look up the values of variables until they are needed. This is related to the R concepts of “lazy evaluation” and “promises”.
- 7.
- 8. Alternative time origins may be specified to as.POSIXlt(), and this can be helpful in working with times represented in other systems such as SPSS and SAS.
- 9. This profile results from a nonlinear regression (Sect. 2.5.5.2) of the oxygen profile at station 112 of the section dataset in the ocedata package.
- 10. See http://www.gnu.org/software/gsl/ for more on GSL.
- 11. Note the use of the UNESCO equation of state here; with the GSW equation, longitude and latitude would also have to be supplied; see Sect. 5.2.1 and Appendix D.
- 12.
- 13. For more on performance issues, see Appendix E.
- 14. See Sect. 5.7 for more on Argo floats.
- 15. Several coastline resolutions are provided in the ocedata and oce packages.
- 16. Home electricity provides a dramatic illustration. Although voltage measurements may give a confidence interval on the mean that barely departs from 0 V, the measurement uncertainty will indicate that any given measurement could easily be of order 100 V. That is why electrical outlets must be covered up in houses with young children.
- 17. Note the use of set.seed() to let readers reconstruct the example.
- 18. It is unwise to use hypothesis tests without considering their limitations. Some issues of misapplication are outlined by, e.g., Johnson and Omland (2004) and Hauer (2004), and deep concerns about the misuse of p values are raised in a highly influential editorial in The American Statistician (Wasserstein and Lazar 2016).
- 19. If p < 2.2 × 10⁻¹⁶, R regression summaries simply report “p-value: <2.2e-16”.
- 20. See also the NISTnls package, which provides data and code for statistical test suites developed by researchers at the U.S. National Institute for Standards and Technology.
- 21. An alternative to stack() is melt(), from the reshape2 package (Wickham 2007). If this is used, then aov() must use value for values and variable for ind.
- 22. It is not strictly necessary to use as.ctd() to create a "ctd" object, but it makes it easier to create a standardized plot with isopycnals.
- 23. A test with a 90 Mb file on the author’s machine revealed read_csv() to be nearly 6 times faster than read.csv().
- 24.
- 25.
- 26.
- 27.
- 28.
- 29. RStudio has a variety of other helpful features, e.g. a code-completing editor and a code-analysis tool that can recommend alterations that may make code more robust.
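Notes 6, 8 and 17 can each be illustrated in a few lines of base R. The following is a minimal sketch and not code taken from the chapter: the function name ignore_argument, the seed value 123, and the choice of 1960-01-01 as an example origin (the epoch conventionally used for SAS datetime values, assumed here) are all illustrative.

```r
# Note 6, lazy evaluation: an argument is held as a promise and is only
# evaluated if the function body actually uses it.
ignore_argument <- function(x) {
    "x was never evaluated"
}
# stop() would raise an error if evaluated, but the promise is never forced.
lazy_result <- ignore_argument(stop("this error is never raised"))

# Note 8, alternative time origins: interpret a count of seconds relative
# to an origin other than the default. Here, one day (86400 s) is added to
# an assumed origin of 1960-01-01.
t <- as.POSIXlt(86400, origin = "1960-01-01", tz = "UTC")
t_date <- format(t, "%Y-%m-%d")

# Note 17, reproducibility: resetting the seed to the same value makes the
# random number generator repeat the same sequence.
set.seed(123)  # hypothetical seed value
first_draw <- rnorm(5)
set.seed(123)
second_draw <- rnorm(5)
same_draws <- identical(first_draw, second_draw)
```

After running this, lazy_result holds the returned string (no error is raised), t_date is the day after the chosen origin, and same_draws is TRUE.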
References
Albert, J., 2009. Bayesian computation with R. Use R! Springer, New York, NY, USA, second edition.
Bååth, R., 2012. The state of naming conventions in R. The R Journal, 4(2):74–75.
Becker, R. A. and Chambers, J. M., 1984. S: an interactive environment for data analysis and graphics. Wadsworth statistics/probability series. Wadsworth Advanced Book Program, Belmont, CA, USA.
Becker, R. A., Chambers, J. M., and Wilks, A. R., 1988. The new S language. Wadsworth & Brooks/Cole, Pacific Grove, CA, USA.
Borcard, D., Gillet, F., and Legendre, P., 2011. Numerical Ecology with R. Use R! Springer-Verlag, New York, NY, USA.
Boyer, T. P., Antonov, J. I., Baranova, O. K., Garcia, H. E., Johnson, D. R., Locarnini, R. A., Mishonov, A. V., O’Brien, T. D., Seidov, D., Smolyar, V., and Zweng, M. M., 2009. World ocean atlas 2009. Technical report, US Government Printing Office.
Carr, D. B., 1991. Looking at large data sets using binned data plots. In Buja, A. and Tukey, P. A., editors, Computing and Graphics in Statistics, pages 7–39. Springer-Verlag New York, Inc., New York, NY, USA.
Chambers, J. M., 2008. Software for data analysis: programming with R. Statistics and computing. Springer-Verlag, New York, NY, USA.
Chambers, J. M. and Hastie, T. J., 1992. Statistical models in S. Wadsworth & Brooks/Cole, Pacific Grove, CA, USA.
Clarke, A. J. and Van Gorder, S., 2012. On fitting a straight line to data when the “noise” in both variables is unknown. Journal of Atmospheric and Oceanic Technology, 30(1):151–158.
Cleveland, W. S. and McGill, R., 1984. Graphical perception: Theory, experimentation, and application to the development of graphical methods. Journal of the American Statistical Association, 79(387):531–554.
Dalgaard, P., 2002. Introductory Statistics with R. Statistics and Computing. Springer, New York, NY, USA.
De Veaux, R. D., Velleman, P. R., and Bock, D. E., 2006. Intro Stats. Pearson Addison Wesley, Boston, MA, USA, 2nd edition.
deYoung, B., Barange, M., Beaugrand, G., Harris, R., Perry, R. I., Scheffer, M., and Werner, F., 2008. Regime shifts in marine ecosystems: detection, prediction and management. Trends in Ecology & Evolution, 23(7):402–409.
Estivill-Castro, V., 2002. Why so many clustering algorithms—a position paper. ACM SIGKDD Explorations Newsletter, 4(1):65–75.
Faraway, J. J., 2002. Practical regression and ANOVA using R. The Comprehensive R Archive Network (online).
Faraway, J. J., 2005. Linear models with R. Texts in statistical science. Chapman & Hall/CRC, Boca Raton, FL, USA.
Gallant, A. R., 1975. Nonlinear regression. The American Statistician, 29(2):73–81.
Garratt, J. R., 1977. Review of drag coefficients over oceans and continents. Monthly Weather Review, 105:915–927.
Gentleman, R. and Ihaka, R., 2000. Lexical scope and statistical computing. Journal of Computational and Graphical Statistics, 9(3):491–508.
Grant, H. L., Stewart, R. W., and Moilliet, A., 1962. Turbulence spectra from a tidal channel. Journal of Fluid Mechanics, 12(2):241–268.
Grolemund, G. and Wickham, H., 2011. Dates and times made easy with lubridate. Journal of Statistical Software, 40(3):1–25.
Hansen, J., Ruedy, R., Sato, M., and Lo, K., 2010. Global surface temperature change. Reviews of Geophysics, 48(4):RG4004.
Hartigan, J. A. and Wong, M. A., 1979. A k-means clustering algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics), 28(1):100–108.
Hauer, E., 2004. The harm done by tests of significance. Accident Analysis and Prevention, 36:495–500.
Horton, N. J. and Kleinman, K. P., 2007. Much ado about nothing: a comparison of missing data methods and software to fit incomplete data regression models. American Statistician, 61(1):79–90.
Hothorn, T., Hornik, K., and Zeileis, A., 2006. Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics, 15(3):651–674.
Ihaka, R., 2003. Colour for presentation graphics. In Proceedings of the 3rd international workshop on distributed statistical computing, Technische Universität Wien, Vienna, Austria.
Ihaka, R. and Gentleman, R., 1996. R: A language for data analysis and graphics. Journal of Computational & Graphical Statistics, 5(3):299–314.
Johnson, J. B. and Omland, K. S., 2004. Model selection in ecology and evolution. Trends in Ecology & Evolution, 19(2):101–108.
Killick, R. and Eckley, I. A., 2014. changepoint: An R package for changepoint analysis. Journal of Statistical Software, 58(3):1–19.
Killick, R., Haynes, K., and Eckley, I. A., 2016. changepoint: An R package for changepoint analysis.
Lämmel, R., 2008. Google’s MapReduce programming model–revisited. Science of Computer Programming, 70(1):1–30.
Legendre, P., 2014. lmodel2: Model II Regression. Comprehensive R Archive Network.
Legendre, P. and Legendre, L., 1998. Numerical Ecology. Developments in environmental modeling 20. Elsevier, Amsterdam, 2nd English edition.
Leisch, F., 2002. Sweave: Dynamic generation of statistical reports using literate data analysis. In Härdle, W. and Rönz, B., editors, Compstat 2002 — Proceedings in Computational Statistics, pages 575–580. Physica Verlag, Heidelberg. ISBN 3-7908-1517-9.
Lindegren, M., Dakos, V., Gröger, J. P., Gårdmark, A., Kornilovs, G., Otto, S. A., and Möllmann, C., 2012. Early detection of ecosystem regime shifts: A multiple method evaluation for management application. PLoS ONE, 7(7):e38410.
Marsden, R. F., 1999. A proposal for a neutral regression. Journal of Atmospheric and Oceanic Technology, 16(7):876–883.
McArdle, B. H., 2003. Lines, models, and errors: regression in the field. Limnology and Oceanography, 48(3):1363–1366.
Miller, A. J., Cayan, D. R., Barnett, T. P., Graham, N. E., and Oberhuber, J. M., 1994. The 1976–77 climate shift of the Pacific Ocean. Oceanography, 7(1):21–26.
Muggeo, V. M. R., 2008. segmented: An R package to fit regression models with broken-line relationships. R News, 8(1):20–25.
Murrell, P., 2006. R Graphics. Chapman & Hall/CRC, Boca Raton, FL, USA.
R Core Team, 2017. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Ricker, W. E., 1973. Linear regressions in fishery research. Journal of the Fisheries Research Board of Canada, 30:409–434.
Ripley, B. D., 1996. Pattern recognition and neural networks. Cambridge University Press, Cambridge, UK.
Ripley, B. D. and Hornik, K., 2001. Date-time classes. R News, 1(2):8–11.
Rudnick, D. L. and Davis, R. E., 2003. Red noise and regime shifts. Deep Sea Research Part I: Oceanographic Research Papers, 50(6):691–699.
Shumway, R. H. and Stoffer, D. S., 2006. Time Series Analysis and its Applications: With R Examples. Springer-Verlag, New York, 2nd edition.
Taylor, B. N. and Kuyatt, C. E., 1994. Guidelines for evaluating and expressing the uncertainty of NIST measurement results. NIST Technical Note 1297, U.S. Department of Commerce Technology Administration: National Institute of Standards and Technology, Gaithersburg, MD, USA.
Tukey, J. W., 1977. Exploratory Data Analysis. Addison-Wesley, Reading, MA, USA.
Venables, W. N. and Ripley, B. D., 1999. Modern applied statistics with S-plus. Springer-Verlag, New York, NY, USA, third edition.
Warton, D. I., Duursma, R. A., Falster, D. S., and Taskinen, S., 2012. smatr 3–an R package for estimation and inference about allometric lines. Methods in Ecology and Evolution, 3:257–259.
Warton, D. I., Wright, I. J., Falster, D. S., and Westoby, M., 2006. Bivariate line-fitting methods for allometry. Biological Reviews, 81(2):259–291.
Wasserstein, R. L. and Lazar, N. A., 2016. The ASA’s statement on p-values: Context, process, and purpose. The American Statistician, 70(2):129–133.
Wessel, P., Smith, W. H. F., Scharroo, R., Luis, J. F., and Wobbe, F., 2013. Generic mapping tools: improved version released. Transactions, American Geophysical Union, 94:409–410.
Wickham, H., 2007. Reshaping data with the reshape package. Journal of Statistical Software, 21(12):1–20.
Wickham, H., 2009. ggplot2: elegant graphics for data analysis. Springer, New York, USA.
Wickham, H., 2011. The split-apply-combine strategy for data analysis. Journal of Statistical Software, 40(1):1–29.
Wickham, H., 2014. Advanced R. The R Series. Chapman and Hall/CRC.
Wood, S. N., 2001. mgcv: GAMs and generalized ridge regression for R. R News, 1(2):20–25.
Zeileis, A., Hornik, K., and Murrell, P., 2009. Escaping RGBland: Selecting colors for statistical graphics. Computational Statistics and Data Analysis, 53(9):3259–3270.
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this chapter
Cite this chapter
Kelley, D.E. (2018). R Tutorial for Oceanographers. In: Oceanographic Analysis with R. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-8844-0_2
DOI: https://doi.org/10.1007/978-1-4939-8844-0_2
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-8842-6
Online ISBN: 978-1-4939-8844-0
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)