
Data Input and Output


Abstract

Nearly all scientific computing and data analysis applications need data input and output, for example, to load datasets or to store results persistently. Getting data in and out of programs is consequently a key step in the computational workflow. There are many standardized formats for storing structured and unstructured data, and the benefit of using them is clear: you can rely on existing libraries for reading and writing data, which saves both time and effort. In the course of working with scientific and technical computing, you are likely to encounter a variety of data formats through interaction with colleagues and peers, or when acquiring data from sources such as equipment and databases. As a computational practitioner, it is important to be able to handle data efficiently and seamlessly, regardless of the format it comes in. For this reason, this entire chapter is devoted to the topic.


Notes

  1. Although RFC 4180, http://tools.ietf.org/html/rfc4180, is sometimes taken as an unofficial specification, in practice there exist many varieties and dialects of CSV.

  2. http://www.hdfgroup.org.

  3. This is also known as out-of-core computing. For another recent project that also provides out-of-core computing capabilities in Python, see the dask library (http://dask.pydata.org/en/latest).

  4. Note that the Python module provided by the PyTables library is named tables. Therefore, tables.open_file refers to the open_file function in the tables module provided by the PyTables library.

  5. For more information about JSON, see http://json.org.

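  6. In Python 2, an alternative to the pickle module is the cPickle module, a more efficient reimplementation that is also available in the Python standard library. In Python 3, the pickle module uses the accelerated implementation automatically when it is available, and there is no separate cPickle module. See also the dill library at http://trac.mystic.cacr.caltech.edu/project/pathos/wiki/dill.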
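
To illustrate the remark in note 1 about CSV dialects, the following minimal sketch parses the same records written in two different conventions using the csv module from the Python standard library. The sample strings and the helper name read_rows are hypothetical and chosen only for illustration; they are not taken from the chapter.

```python
import csv
import io

# Two hypothetical snippets holding the same records in different CSV dialects:
# one is comma-delimited with a quoted field, the other is semicolon-delimited.
comma_text = 'name,value\n"Smith, J.",1.5\n'
semicolon_text = "name;value\nSmith, J.;1.5\n"


def read_rows(text, delimiter):
    """Parse CSV text with an explicit delimiter and return a list of rows."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


rows_comma = read_rows(comma_text, ",")
rows_semicolon = read_rows(semicolon_text, ";")

# Both dialects decode to the same rows; every field is returned as a string.
print(rows_comma)
print(rows_semicolon)
```

Passing the delimiter explicitly, rather than relying on a default, is one simple way to handle files whose dialect is known in advance.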

Copyright information

© 2015 Robert Johansson

About this chapter

Cite this chapter

Johansson, R. (2015). Data Input and Output. In: Numerical Python. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-0553-2_18
