Asynchronous Programming with Python

Race conditions

This segment shows how race conditions can give you unexpected results, and how Python’s GIL helps to prevent race conditions internally.

Keywords

  • Race conditions
  • Python
  • GIL
  • Global interpreter lock

About this video

Author(s)
Coen de Groot
First online
22 December 2020
DOI
https://doi.org/10.1007/978-1-4842-6582-6_3
Online ISBN
978-1-4842-6582-6
Publisher
Apress
Copyright information
© Coen de Groot 2020

Video Transcript

Threads can be interrupted at any time. This can cause some surprising problems known as race conditions. Here is some code. I’m using a dictionary to count occurrences of each number from 0 to 9. Using a defaultdict from the collections library would make this more flexible, but that is not needed for this example.

Every time plus_many gets called, the count for 0’s is increased 100,000 times. The underscore in the number 100,000 doesn’t do anything. It just helps to break up the number so it is easier to see how many 0’s there are. plus_many increases the 0 count by 100,000. plus_many is run 10 times, increasing the 0 count 10 times 100,000, which is 1,000,000, as you can see.
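The single-threaded version described above can be sketched as follows (the video’s exact code isn’t shown here, so this follows the transcript’s description of plus_many and the counter dictionary):

```python
counter = {n: 0 for n in range(10)}  # one count per number 0-9

def plus_many():
    # increase the count for 0 one hundred thousand times;
    # the underscore is the conventional name for an unused loop variable
    for _ in range(100_000):
        counter[0] += 1

for _ in range(10):
    plus_many()

print(counter[0])  # 1000000 when run sequentially
```

Run sequentially there is nothing to go wrong: each call finishes before the next one starts.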

Here is the same code, but with some threading. plus_many is now run by 10 threads, once per thread. The totals should still come to one million. Let’s see. That is significantly less. What happened here? Let’s try it once more. Still less, but a different result.
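A threaded version along the lines described might look like this (a sketch; how large a shortfall you actually observe depends on your Python version and on timing):

```python
import threading

counter = {n: 0 for n in range(10)}

def plus_many():
    for _ in range(100_000):
        counter[0] += 1  # read-modify-write: not atomic across threads

# one thread per call to plus_many, ten threads in total
threads = [threading.Thread(target=plus_many) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter[0])  # may fall well short of 1000000 when increments are lost
```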

This is a race condition. The result varies between runs and has something to do with the way the threads interact. Before we look at what happened, notice how some threads start later than others, at which point the counter has already been increased. They don’t all end at the same time, either.

Here is Python running a single thread. In memory, we have our Python code and the variables, such as the counter dictionary in our example. The thread has some registers, such as the program counter, which tracks the next piece of code that needs to be run. The stack is like a scratch pad. It constantly changes: new values are pushed onto the top of the stack, values are popped off the top for some function, a list on the stack is replaced with its sum, and much more.

Here is Python running multiple threads. Each thread has its own stack, its own scratch pad, and its own program counter. Thread A does some work. Then thread B gets a turn. Thread C, back to A, et cetera. Because each thread has its own program counter and scratch pad, it just picks up its work where it stopped last time. So far, so good.

In our example, we have 10 threads. Actually, to be correct, we have 11 threads: there is also the main thread. There’s always one thread at the start. In this case, our main thread created an additional 10 subthreads for a total of 11 threads. Each subthread will do a bit of work in turn, running the plus_many function.

Within plus_many, there is a local variable, which I’ve called underscore. This is a convention amongst Python programmers for variables which the Python syntax requires, but which we don’t use otherwise. The underscore variable is a local variable. It is local to the function. With 10 threads each running plus_many, there are 10 different copies of the underscore variable.

The counter variable is a global variable. There is only one entity shared by all the threads. As you can see, all threads share the same memory. So there really is only one counter variable shared by all threads. If the underscore variable was shared, the threads would take turns incrementing the same underscore variable, and the final result would be 100,000, not one million.

If the counter dictionary were not shared by the threads, each thread would increase a different counter, and the final result would probably be 0, because the main thread itself never increases the counter. However, the counter dictionary is shared. This is just what you would expect for a normal Python program. There are no surprises here. The only reason each thread has its own underscore variable is that it is a local variable. So far, so good.

But why is the counter so low? And why does it vary between runs? When a thread is finished, it relinquishes control so the next thread can do some work. When a thread is waiting for I/O or sleeping, it also yields control. When the thread is done waiting, it rejoins the list of active threads and will get its turn again soon.

Both of these are examples of cooperative multitasking. The threads are working together. Some threads take a while and never wait. If they were left to run until done, other essential tasks may be left until too late. For instance, Python gradually clears out variables which are no longer needed. This is called garbage collection. If this never happens, Python will run out of memory over time.

After a thread has run a while, Python will automatically switch to the next thread. This is called pre-emptive multitasking. Threads are often interrupted before they are done. We’ve got 10 threads, each running the plus_many function, each with a local underscore variable and updating the global counter variable.

What happens if a thread gets interrupted? If it has just finished the for line, the thread would just wait a bit and continue later. If it has finished the next line, it has just increased the counter, then waits for its next turn. It seems that whenever the thread is interrupted, no harm is done. The counter is increased now or later, but it is never skipped.

Let’s look at this in a bit more detail. The dis library disassembles Python bytecode. Before Python runs your code, it first converts it to bytecode. The Python runtime engine then runs the bytecode. The documentation for the dis library gives a list of all the bytecodes. Most bytecodes do something to the stack, the thread’s scratch pad.

The code a = 5; a += 1 breaks down into the following bytecodes. The first two lines store the value 5 into a variable called a. Line 4 puts the value of a on top of the stack. Line 6 puts a 1 on top of the stack. Line 8 replaces the top two elements of the stack with their sum, 1 plus 5, for a total of 6. Line 10 stores the result back into a.
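You can reproduce this on your own machine; note that the exact opcode names and the byte offsets (the line numbers mentioned above) vary between Python versions, so your output may differ from the video’s:

```python
import dis

def bump():
    a = 5
    a += 1
    return a

# prints the bytecode: LOAD/STORE instructions plus an in-place add opcode
dis.dis(bump)
```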

What happens when our plus_many function gets interrupted? Let’s say the counter is at 100. Thread P takes the counter value and puts it on top of its stack. It then also puts a 1 on its stack. Time is up. Thread P is interrupted, and thread Q starts. Thread Q takes the counter value, which is still 100, puts it on its stack, and puts a 1 on its stack. Then Q adds the top two values of its stack and replaces them with the result, 101, then moves it from the top of the stack to the dictionary element. Our counter is now 101.

The CPU switches back to Thread P, which also adds up the top two elements of its stack to get 101, then moves it from its stack back to the dictionary element. Both threads have increased the number by 1. We started with 100, and the end result is 101. Q’s increase was overwritten by P’s increase. They both increase the same number, 100, instead of taking turns.

This is a typical race condition, when a shared resource like our global dictionary is retrieved and updated by two threads in parallel without coordination. Locks are a very common mechanism for preventing race conditions. We will look at how to use them soon.
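As a brief preview of that fix (the details are deferred here), a minimal sketch puts a threading.Lock around the increment so the read-modify-write becomes indivisible:

```python
import threading

counter = {n: 0 for n in range(10)}
lock = threading.Lock()

def plus_many():
    for _ in range(100_000):
        with lock:            # only one thread at a time may read and update the counter
            counter[0] += 1

threads = [threading.Thread(target=plus_many) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter[0])  # 1000000 every run: no increments are lost
```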

But first, let us briefly look at the GIL, Python’s Global Interpreter Lock. The GIL is part of CPython, the Python interpreter from the Python Software Foundation. It is the most popular Python interpreter. If you install Python from python.org, you are using CPython. There are other Python interpreters.

A and B are two variables which reference the same list. When I change one of the list elements, both A and B show the new value, because it is the same list. When I delete A, I only delete the variable A. The list still exists. Now, the variable B is gone as well. What about the list?

Well, that depends. It may be gone. It may still be there for a little while. For each object, like our list, Python keeps track of the number of variables which reference it. It does reference counting. For our list, the number of references went up to 2 when A and B both referred to it, then dropped to 0 after we deleted A and B.
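You can watch this happen with sys.getrefcount (a sketch; note that getrefcount reports one extra reference for its own argument, so the numbers are higher than the 2 and 0 mentioned above):

```python
import sys

a = [1, 2, 3]
b = a                      # a and b reference the same list
b[0] = 99
seen_via_a = a[0]          # 99: the change is visible through both names

refs = sys.getrefcount(a)  # counts a, b, plus the call's own temporary reference
del a                      # only the name a is gone; the list survives via b
seen_via_b = b[0]
del b                      # last reference gone; the list can now be reclaimed
```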

Every time Python runs its garbage collection, it removes all objects with a reference count of 0 to make space for new objects. So our list will be removed as soon as the garbage collection runs. The reference count is a shared resource, so we have to protect it from race conditions. If the count is too high, an object may stay in memory well after it is out of scope or deleted, and little by little the memory fills up until we run out. If the count is too low, an object is garbage collected too early and disappears even though a variable still points to it.

Protecting the reference counts is the job of the GIL. A Python thread can only run when it holds the GIL. Only one thread at a time can update the reference count. Even on a CPU with multiple cores, the GIL will only allow one thread at a time. This makes multithreading a bit tricky, as we’ll soon see.