Python Basics

Working with Files

This video segment explains how to work with files in Python, how to open files, read from files and write to files, whether text files or binary files.

Keywords

  • Python basics
  • files
  • opening files
  • file handles
  • reading files
  • reading lines
  • end of line character
  • writing files

About this video

Author(s)
Coen de Groot
First online
17 March 2020
DOI
https://doi.org/10.1007/978-1-4842-5831-6_15
Online ISBN
978-1-4842-5831-6
Publisher
Apress
Copyright information
© Coen de Groot 2020

Video Transcript

Let’s start by looking at how we can read and create files. You use the open function to open a file. However, I recommend you don’t use open by itself. Instead, use it as a context. It is often important to close a file as soon as you’re done with it. When you have written something to a file, part of this might still be in memory. If you exit your program without closing the file, this may be lost and you could have a broken and unusable file. Also, there is often a limit to how many files you can have open at the same time.

The simplest way to make sure all files are closed, regardless of any errors, is to open them using a context. Start creating the context using “with.” Then use the open function to open the file. The target value, the name after “as,” will contain the file handle which points to the opened file.

As usual, we finish this with a colon and one or more indented lines of code. Within this code block, you can now access the file through its handle, for instance, to read its content. When Python leaves the code block, it will automatically close the file. It even does this if there is an error anywhere in the code block. In that case, the rest of the code in the code block will not be executed and, unless the error is handled, Python will quit. But, first, Python will close the file.
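The behaviour described above can be sketched as follows. This is a minimal illustration, not the example shown in the video: the file name example.txt and its content are made up, and the sample file is first created in a temporary folder so the code runs anywhere.

```python
import os
import tempfile

# Create a small sample file to read (hypothetical name and content).
folder = tempfile.mkdtemp()
path = os.path.join(folder, "example.txt")
with open(path, "w") as handle:
    handle.write("Hello\nworld\n")

# Open the file as a context; "handle" is the target value.
with open(path) as handle:
    content = handle.read()
    print(handle.closed)  # False: we are still inside the code block

print(handle.closed)  # True: Python closed the file when leaving the block

# The file is also closed when an error occurs inside the code block.
try:
    with open(path) as handle:
        raise ValueError("something went wrong")
except ValueError:
    pass

print(handle.closed)  # True
```

Catching the error here is only done so the sketch can show that the file was closed; without the try block, Python would close the file and then quit with the error.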

As you can see in this example, “read” reads the whole file content. Typically, in a text file, each line ends with an end of line character. This sends the cursor to the start of the next line. In Python, the end of line character is explicitly shown with a backslash followed by an n. The readlines function returns the file as a list of strings.

Each string is a separate line. Using a for loop, you can print the content on separate lines or you can print just one of the lines. In this case, the third one. If you want to work on a file in the current folder, typically where we were when we started Python, that is quite straightforward.
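As a rough sketch of the difference between read and readlines, assuming a three-line sample file (the file name lines.txt and its content are invented for this illustration):

```python
import os
import tempfile

# Create a three-line sample file (hypothetical name and content).
folder = tempfile.mkdtemp()
path = os.path.join(folder, "lines.txt")
with open(path, "w") as handle:
    handle.write("first\nsecond\nthird\n")

# read returns the whole file as one string, end of line characters included.
with open(path) as handle:
    content = handle.read()
print(repr(content))  # 'first\nsecond\nthird\n'

# readlines returns a list with one string per line.
with open(path) as handle:
    lines = handle.readlines()
print(lines)  # ['first\n', 'second\n', 'third\n']

# Print every line; end="" avoids doubling the end of line character.
for line in lines:
    print(line, end="")

# Print just the third line (list indexes start at 0).
print(lines[2], end="")
```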

Often we need to specify the path to a file, either relative to where the program is, say, an images folder for a game, or as an absolute path. To separate the parts of a path, the route to the end point, Windows uses a different separator from Linux and Mac OS. Windows uses a backslash, whilst Linux and Mac OS use a forward slash.

So if on your Windows computer you wrote the program with a backslash, it may fail when you run it in the Linux or Mac OS environment. This would work fine on the Windows computer, but fails on my Linux machine even though there is an images folder which does contain a sunset image. To get this to work on my Linux laptop, I have to change it to forward slash.

The join function adapts to the environment and uses the correct separator for the machine your program runs on. I am recording this video on a Linux laptop, so I get the forward slash. Running the same code on the Windows machine, I got a backslash. If you write some code which needs to work on multiple different machines, use the join function. You can join together two, three, or more parts in one go.
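A small sketch of building a path in a portable way, assuming the join function referred to is os.path.join from the standard library. The folder and file names (assets, images, sunset.png) are placeholders for illustration only.

```python
import os

# Join two parts; Python inserts the separator for the current machine.
image_path = os.path.join("images", "sunset.png")
print(image_path)  # images/sunset.png on Linux and Mac OS, images\sunset.png on Windows

# Join three parts in one go.
deep_path = os.path.join("assets", "images", "sunset.png")
print(deep_path)

# os.sep holds the separator used on this machine.
print(os.sep)
```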

The default file mode for the open function is read. So that is what you get when using it without specifying the file mode. When writing some data to a file, we do need to specify the file mode when opening it.

The second parameter of the open function is the file mode. “W” stands for write mode. This will create a new file if it doesn’t exist yet or it will overwrite the file if it does exist. It does this without asking for confirmation, so be careful when using W: you may accidentally overwrite something valuable. The write function returns the number of characters which were written, like 11.

The write function adds a single string to the output file. If you want each string to start on a new line, you need to add the new line character at the end of each string. Without the backslash n, the two lines will be glued together on one line. The writelines function takes a list of strings and writes each string to the file. It does not add a new line character, so each string needs to end with one if you want it on its own line.
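The following sketch shows write and writelines side by side. The file name output.txt and the strings are invented for illustration. Note that writelines writes the strings exactly as given, so each string in the list carries its own backslash n.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "output.txt")  # hypothetical file name

# write returns the number of characters written.
with open(path, "w") as handle:
    count = handle.write("Hello world")
print(count)  # 11

# Without the end of line character the strings are glued together.
with open(path, "w") as handle:
    handle.write("line one")
    handle.write("line two")
with open(path) as handle:
    glued = handle.read()
print(glued)  # line oneline two

# writelines takes a list of strings; each one ends with "\n" here.
with open(path, "w") as handle:
    handle.writelines(["line one\n", "line two\n"])
with open(path) as handle:
    separate = handle.read()
print(separate, end="")
```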

I have already mentioned how we can open files in read or write mode. ‘r’ stands for read mode. This will raise an error if the file doesn’t exist. ‘w’ stands for write mode. It will overwrite an existing file or create a new file. ‘a’ stands for append mode. If the file already exists, anything you write to it will be added at the end. If it doesn’t exist, it will be created.
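To make the three modes concrete, here is a minimal sketch using a made-up file name, log.txt, in a temporary folder. The error raised for a missing file in read mode is FileNotFoundError.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "log.txt")  # hypothetical file name

# 'r': opening a file which doesn't exist raises an error.
missing = False
try:
    with open(path, "r") as handle:
        handle.read()
except FileNotFoundError:
    missing = True
print(missing)  # True

# 'a': creates the file if needed, then adds to the end each time.
with open(path, "a") as handle:
    handle.write("first entry\n")
with open(path, "a") as handle:
    handle.write("second entry\n")
with open(path) as handle:
    after_append = handle.read()
print(after_append, end="")

# 'w': overwrites the existing content without asking.
with open(path, "w") as handle:
    handle.write("fresh start\n")
with open(path) as handle:
    after_write = handle.read()
print(after_write, end="")
```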

The default mode is text. This is usually used for lines of human-readable text separated by the new line character. Add a ‘b’ to the file mode to open a binary file such as an image or sound file. You can combine file modes. For instance, to write to a binary file, use ‘wb’: write and binary.
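A short sketch of combining modes for binary files. Instead of a real image or sound file, it writes a few arbitrary bytes to a made-up file, data.bin. In binary mode you write and read bytes objects rather than strings.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "data.bin")  # hypothetical file name

# Four arbitrary byte values standing in for real binary data.
payload = bytes([0, 1, 2, 255])

# 'wb': write and binary.
with open(path, "wb") as handle:
    handle.write(payload)

# 'rb': read and binary; read returns a bytes object.
with open(path, "rb") as handle:
    data = handle.read()

print(data)        # b'\x00\x01\x02\xff'
print(type(data))  # <class 'bytes'>
```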

There’s a lot more you can do when opening and using files, such as specifying the character encoding and overwriting part of the content. But this is plenty to get you started with files. Just be careful when opening a file in write mode so you don’t overwrite any important files.