What Are Computer Threads?
To explain a race condition, we first have to understand a little about how computers work internally. When you use an operating system, you take various actions, such as opening a terminal window or a browser. Each of those actions causes the operating system to start at least one new thread of execution.
A thread is a unit of execution within a process: it runs the steps (programming instructions, originally written as source code and typically compiled by a compiler) required to carry out the task you asked the operating system, or software running on it, to perform.
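In Linux, each running process is uniquely identified by a PID (Process Identifier), and every thread inside it has its own thread ID (TID); for a process with a single thread, the two values are the same. To learn more about PIDs in Linux, you can read our articles Bash Automation and Scripting Basics (Part 3) and How Linux Signals Work: SIGINT, SIGTERM, and SIGKILL.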
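To make these identifiers concrete, here is a minimal sketch in Python (our choice of language for illustration; the helper name report is hypothetical). It assumes Python 3.8 or later, where threading.get_native_id() is available, and prints the process ID together with the operating-system-level thread ID seen from two different threads of the same process.

```python
import os
import threading

def report(results):
    # Record the process ID and the OS-level thread ID seen by this thread.
    results.append((os.getpid(), threading.get_native_id()))

results = []
report(results)  # called from the thread running this script

worker = threading.Thread(target=report, args=(results,))
worker.start()
worker.join()

for pid, tid in results:
    print(f"process ID: {pid}, thread ID: {tid}")
```

Both lines should show the same process ID but different thread IDs, since the two threads live inside one process.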
In Windows, a process is also uniquely identified by a process ID (see the PID column in Windows Task Manager), though process handling is implemented differently in Linux and Windows: the underlying code and the tools for working with PIDs differ, and compatibility between the two is limited. Also, the Windows process ID (PID) is not to be confused with the Product ID (also abbreviated PID; same term, different meaning) or the VID (Vendor ID). The latter two identify hardware devices and are unrelated to process management.
When a thread starts, it can in itself start other threads. The original thread is often referred to as the main or the parent thread. For example, when you click the icon of your favorite web browser, it will immediately start a thread (the main thread), and that thread will very quickly start several subthreads, or child threads, and thus become the parent thread.
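The parent and child relationship can be sketched in a few lines of Python. This is only an illustration, not how a browser actually does it; the names child_task, task-0, and child-0 are made up for the example.

```python
import threading

def child_task(name, log):
    # Each child thread records its task and the thread that ran it.
    log.append((name, threading.current_thread().name))

log = []
parent = threading.current_thread()  # the thread that starts the others

children = [
    threading.Thread(target=child_task, args=(f"task-{i}", log), name=f"child-{i}")
    for i in range(3)
]
for t in children:
    t.start()
for t in children:
    t.join()  # the parent waits for every child to finish

print(parent.name, "started:", sorted(log))
```

The order in which the children append to the log is not guaranteed, which is why the output is sorted before printing.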
You may also think about threads like runners in a race. For example, think about a busy database server serving many different connected clients. Each of those client connections will, in many cases, have at least one thread of its own on the database host server and/or inside the database software itself (that is, possibly two threads: one at the operating system level and one inside the database software).
The database server tries to serve all of those threads at the same time, hence the terms concurrent processes and concurrent threads. If there are bugs in the database software (or the operating system, etc.), then sooner or later it may run into a race condition.
What Is a Race Condition?
A simple way to relate a race condition to runners in a race is to picture a photo finish where two runners truly cross the finish line at the same point in time. It is possible, though quite unlikely, for this to occur in human races. For computers, which can process millions of operations per millisecond, it becomes a lot more likely.
As another example, picture a relay race where runners pass a baton (the flashy colored stick) from one person to the next. Imagine now that one of the race participants makes a mistake, and there are now two runners who think they should get the red-colored baton.
A significant event in a relay race is the passing on of the baton, as this may mean that the previous holder of the baton can stop running, and it’s now up to the new baton owner to give their best. There are now two runners grabbing for the baton. It’s going to be an interesting situation to watch on TV (if you like that sort of thing), but it is clear there is going to be some amount of fallout.
In essence, a race condition is a bug, error, or flaw in computer system code that produces unpredictable results because the outcome depends on an unexpected sequence or timing of events. It is normally caused by two threads conflicting in some way, though more than two threads may be involved in the actual conflict, and often many more threads are running in the faulting software.
In our relay race example, we had two people reaching for one object at approximately the same time. The corruption (a computing term indicating that some data was damaged, whether it resides in memory, on disk, in the CPU, etc.) took place at the moment when the two people (or two threads, in the computer analogy) tried to grab the baton and a conflict occurred. In computer terms, two threads tried to write to a memory space that should normally be written by only one thread (one runner) at a time.
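The baton grab can be reproduced in a small Python sketch. Real race conditions depend on unlucky timing, so to make the outcome repeatable this example forces the bad timing with a threading.Barrier; that barrier is an assumption added purely for demonstration, and the names shared and unsafe_increment are hypothetical.

```python
import threading

shared = {"count": 0}                  # memory space both threads write to
both_have_read = threading.Barrier(2)  # used only to force the bad timing

def unsafe_increment():
    seen = shared["count"]             # step 1: read the shared value
    both_have_read.wait()              # both threads read before either writes
    shared["count"] = seen + 1         # step 2: write back a stale result

threads = [threading.Thread(target=unsafe_increment) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared["count"])  # 1, not 2: one of the two increments was lost
```

Each thread believed it held the baton: both read 0, and both wrote 1. In real software nothing forces this interleaving, which is exactly why such bugs appear only occasionally and are hard to reproduce.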
Race conditions can happen in various areas, such as inside electronics, in computer software, and in general life. For example, a call collision is a telecommunications term to describe the situation where a communications channel is seized at both ends simultaneously. Inside computer software, one of the most prominent areas for race conditions, a wide variety of race conditions is possible.
As another example of a race condition inside computer software, picture two computing threads working with a given memory space. A user has just submitted a form, and the backend software is writing this form into memory. Simultaneously, another user is reading out the fields of this form from the same memory space. Depending on the exact timing, the reading user may receive a partially incorrect form that mixes old and updated information.
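The form scenario might look as follows in Python. The form fields and values are invented for the example, and two threading.Event objects are used to force the reader in at the worst possible moment, which in real software would only happen by chance.

```python
import threading

form = {"name": "Alice", "city": "Paris"}  # hypothetical form held in memory
half_written = threading.Event()
read_done = threading.Event()
snapshot = {}

def writer():
    form["name"] = "Bob"        # first field updated
    half_written.set()          # the reader gets in at the worst moment
    read_done.wait()
    form["city"] = "Berlin"     # second field updated too late for the reader

def reader():
    half_written.wait()
    snapshot.update(form)       # copies a half-updated form
    read_done.set()

threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(snapshot)  # {'name': 'Bob', 'city': 'Paris'}
```

The reader ends up with a form that never existed as a whole: the new name combined with the old city.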
Preventing Race Conditions: Thread Safety
There has been much discussion around race conditions in the IT industry. Depending on the coding language you use, there may be extensive, or few, provisions for handling race conditions. An often-used term is thread safety, as in a thread-safe application or a thread-safe programming language construct. Such terms are used to indicate whether a piece of code, or software as a whole, is thread-safe, i.e., written in such a way as to avoid and even prevent race conditions.
If software is deemed thread-safe, it is considered to be free of the possibility of race conditions. In many cases, ‘deemed’ thread-safe is the best developers can deliver, all the more so when many threads and interactions are possible. The complexity of many threads working with many resources can easily lead to a tangle of handling code and an even larger number of possible race conditions.
Various programming constructs, such as semaphores and mutexes, may be used to prevent race conditions. The complexity of using such constructs depends on the programming language being used and its native support for thread handling. For example, in C++ one may look at the std::mutex class for implementing a mutex (i.e., mutual exclusion) lock. In Bash, however, one does not find such a construct natively.
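As a rough illustration of a mutex in practice, here is a sketch using Python's threading.Lock, which plays a role similar to std::mutex in C++. The SafeCounter class and the work function are hypothetical names chosen for this example.

```python
import threading

class SafeCounter:
    """A counter whose read-modify-write step is guarded by a mutex."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the code inside this block.
        with self._lock:
            current = self.value
            self.value = current + 1

def work(counter, times):
    for _ in range(times):
        counter.increment()

counter = SafeCounter()
threads = [threading.Thread(target=work, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 40000
```

Because the read and the write happen while the lock is held, no other thread can slip in between them, and no increment is lost.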
Stepping further, one may also consider which particular constructs, functions, or even executables and libraries are thread-safe already, and then use such constructs, functions, executables, and libraries as a base for building a new construct, function, executable, library, or full software package.
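One example of building on an existing thread-safe component is Python's queue.Queue, which handles its own locking internally. The sketch below is an assumed worker setup (the squaring task and the None sentinel are choices made for illustration) that passes work between threads without any hand-written locks.

```python
import queue
import threading

jobs = queue.Queue()     # thread-safe: the queue does its own locking
results = queue.Queue()

def worker():
    while True:
        n = jobs.get()
        if n is None:    # sentinel value: no more work for this worker
            break
        results.put(n * n)

workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()
for n in range(10):
    jobs.put(n)
for _ in workers:
    jobs.put(None)       # one sentinel per worker
for w in workers:
    w.join()

squares = sorted(results.get() for _ in range(10))
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

The results arrive in no particular order, so they are sorted before printing; the thread safety of the queue guarantees only that no item is lost or duplicated.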
Implementing even basic thread-safety handling constructs can be a complex matter. For example, consider the difficulty of implementing a Semaphore in Bash.
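By way of contrast, in a language with native support the same idea takes only a few lines. The following Python sketch (not Bash) uses threading.Semaphore to allow at most two threads into a section at once; the stats dictionary, the limit of two, and the short sleep standing in for real work are all assumptions made for the example.

```python
import threading
import time

slots = threading.Semaphore(2)      # at most two threads inside at once
stats_lock = threading.Lock()
stats = {"active": 0, "peak": 0}

def limited_task():
    with slots:                     # blocks while both slots are taken
        with stats_lock:
            stats["active"] += 1
            stats["peak"] = max(stats["peak"], stats["active"])
        time.sleep(0.01)            # stand-in for real work
        with stats_lock:
            stats["active"] -= 1

threads = [threading.Thread(target=limited_task) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("peak concurrency:", stats["peak"])  # never more than 2
```

Where a mutex admits exactly one thread, a semaphore admits up to a fixed number, which makes it useful for limiting access to a pool of resources.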
Wrapping up
In this article, we explored computing threads and race conditions. We looked at analogies with running races and relay races in human life to explore some basic race conditions which may happen inside computers. Finally, we explored thread safety, the different implementations of race condition handling in computer coding languages, and how we may prevent race conditions.
If you liked this article, have a look at our article How Logic Gates Work: OR, AND, XOR, NOR, NAND, XNOR, and NOT.