Data Input and Output

  • Chapter
Numerical Python

Abstract

Nearly all scientific computing and data analysis applications need data input and output. This includes loading datasets and persistently storing results in files on disk or in databases. Getting data in and out of programs is consequently a key step in the computational workflow. There are many standardized formats for storing structured and unstructured data. The benefit of using standardized formats is that you can use existing libraries for reading and writing data, saving yourself both time and effort. In the course of working with scientific and technical computing, you are likely to encounter a variety of data formats, through interaction with colleagues and peers or when acquiring data from sources such as equipment and databases. As a computational practitioner, it is important to be able to handle data efficiently and seamlessly, regardless of which format it comes in. This is why this entire chapter is devoted to the topic of data input and output.

Notes

  1. Although RFC 4180, http://tools.ietf.org/html/rfc4180, is sometimes taken as an unofficial specification, in practice there exist many varieties and dialects of CSV.

  2. www.hdfgroup.org

  3. This is also known as out-of-core computing. For another recent project that also provides out-of-core computing capabilities in Python, see the dask library (http://dask.pydata.org/en/latest).

  4. Note that the Python module provided by the PyTables library is named tables. Therefore, tables.open_file refers to the open_file function in the tables module provided by the PyTables library.

  5. For more information about JSON, see http://json.org.

  6. In Python 2, an alternative to the pickle module is the cPickle module, a more efficient reimplementation that is also available in the standard library. In Python 3, cPickle no longer exists as a separate module; pickle uses the accelerated implementation automatically when it is available. See also the dill library at https://pypi.org/project/dill.
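
Note 1 observes that CSV comes in many dialects. As a minimal sketch using Python's standard csv module (the sample strings and the read_rows helper are hypothetical, introduced here only for illustration), the same table stored with two different delimiters can be read back into identical rows by stating the delimiter explicitly:

```python
import csv
import io

# Hypothetical sample data: the same table written in two CSV dialects.
# The comma dialect must quote the field that itself contains a comma.
comma_text = 'name,value\n"Smith, J.",1.5\n'
semicolon_text = 'name;value\nSmith, J.;1.5\n'


def read_rows(text, delimiter):
    """Parse CSV text into a list of rows using the given delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


rows_comma = read_rows(comma_text, ",")
rows_semicolon = read_rows(semicolon_text, ";")

print(rows_comma)
print(rows_comma == rows_semicolon)
```

Both calls yield the same list of string fields. The csv module does not convert values to numbers, so any numeric conversion is a separate step.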

Copyright information

© 2019 Robert Johansson

About this chapter

Cite this chapter

Johansson, R. (2019). Data Input and Output. In: Numerical Python. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-4246-9_18
