Understanding Thread-Safety in .NET

Writing code that always behaves as expected when run in parallel is HARD. I'll be the first to admit that I'm NOT an expert in this field, but I do my best to understand the principles that make parallel programming possible. This post attempts to unpack the term "thread-safe" as it applies to .NET code and explain some of the techniques that make thread safety possible.

Aaron Bos | Thursday, August 31, 2023

First ask, "Is it necessary?"

Whenever the words "parallel", "concurrent", "thread-safe", or the like come up in system design conversations I'll ask myself if any of those things are actually necessary. The answer may depend on several factors.

  • Is the system in question wholly new or existing?
  • What are the known constraints on the system (network, CPU, memory, etc.)?
  • Has this been done before or is it uncharted territory?
  • If performance is in question, do we have any measurements or benchmarks?

There are other factors that play into system design decisions, but I will typically opt for simple solutions initially and hold off introducing complexity until it becomes apparent that the simple solution won't continue to work.

I think there's an inherent complexity that comes with writing code that needs to be executed consistently in parallel or from multiple threads. Numerous aspects of programming become more complicated when code is executed at the same time. For example, understanding the general flow of a program and debugging both become more challenging.

With all of that being said, throughout my career, I've taken the approach of attempting to learn the underpinnings of technology in order to help me utilize it effectively. This post is my attempt to share my understanding of the underlying complexity in "thread safety", which contributes to my comfort in using it as needed.

Thread safety in general

According to Wikipedia the definition of thread safety is:

A computer programming concept applicable to multi-threaded code. Thread-safe code only manipulates shared data structures in a manner that ensures that all threads behave properly and fulfill their design specifications without unintended interaction.

The interesting aspects of this definition lie in the complexity of "behaving properly" and "fulfilling their design specifications without unintended interaction." The reason for this is that they can mean different things to different programs in different situations. Behaving properly for my application that requires concurrent reads may be completely different than behaving properly for your application requiring concurrent writes.

Most software will provide some level of guarantees against common multi-threaded problems such as deadlocks and race conditions, but those levels and guarantees will differ based on the software you're utilizing.

This is where the importance of going another layer deeper comes into play. Once we've decided that parallelism is needed, do we have the ability to understand how our program needs to behave and can we achieve that behavior with a given toolset? For the purpose of this post, our toolset will be the libraries .NET provides for threading.

Managed Threading in .NET

The main goal of threading is to increase application responsiveness or overall throughput. In .NET a program is started on a single thread, but it is possible to create additional threads for the program to execute in parallel or concurrently. When operating in a single-threaded manner, the program can really only accomplish one task at a time. When multithreading is introduced, the program is able to do multiple tasks simultaneously, which leads to more responsiveness and throughput (generally speaking). In .NET an object is considered thread-safe if it can be called from multiple threads without those calls interfering with one another. In the next section, we'll look more closely at the techniques and primitives available for managing this.
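To make that concrete, here's a minimal sketch (the Sum method and the numbers are my own, not from any particular library) that runs two pieces of work on thread pool threads with Task.Run and waits for both to finish:

```csharp
using System;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        // Each Task.Run queues work to a thread pool thread,
        // so the two sums can execute at the same time.
        Task<int> first = Task.Run(() => Sum(1, 1_000));
        Task<int> second = Task.Run(() => Sum(1_001, 2_000));

        // Task.WaitAll blocks the main thread until both tasks complete.
        Task.WaitAll(first, second);
        Console.WriteLine(first.Result + second.Result); // 2001000
    }

    private static int Sum(int from, int to)
    {
        int total = 0;
        for (int i = from; i <= to; i++) total += i;
        return total;
    }
}
```

Because neither task touches shared state, no synchronization is needed here; the interesting problems start once they do.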

When possible, multithreaded operations should use the Task Parallel Library or PLINQ because they provide great abstractions over raw threads. These libraries rely heavily on System.Threading.ThreadPool for multithreaded behavior, and by default all thread pool threads will be background threads.
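As a small taste of that abstraction (the example itself is mine), PLINQ can parallelize an ordinary LINQ query with a single AsParallel() call:

```csharp
using System;
using System.Linq;

public class Program
{
    public static void Main()
    {
        // AsParallel() partitions the source across thread pool threads;
        // AsOrdered() preserves the original ordering of results.
        int[] squares = Enumerable.Range(1, 5)
            .AsParallel()
            .AsOrdered()
            .Select(n => n * n)
            .ToArray();

        Console.WriteLine(string.Join(", ", squares)); // 1, 4, 9, 16, 25
    }
}
```

The library handles partitioning the work and merging the results, which is exactly the kind of bookkeeping that's easy to get wrong by hand.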

Multithreading can be pretty straightforward in certain scenarios, but oftentimes there will be challenges with managing shared state or data. For example, a collection or property may need to be accessed or updated from multiple threads. How can we go about doing that without running into issues of accessing the value at the same time or the data being out of sync? This is where thread safety comes in. Luckily for us .NET provides a great deal of functionality to help us avoid "shooting ourselves in the foot" when dealing with multithreading.
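One example of that functionality is the System.Collections.Concurrent namespace. As a sketch (the bucket-counting scenario is my own), a ConcurrentDictionary can absorb simultaneous updates from many threads without any explicit locking on our part:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        var counts = new ConcurrentDictionary<int, int>();

        // 100 parallel iterations all update the same dictionary;
        // AddOrUpdate applies each update atomically per key.
        Parallel.For(0, 100, i =>
        {
            int key = i % 4; // four buckets: 0..3
            counts.AddOrUpdate(key, 1, (_, current) => current + 1);
        });

        Console.WriteLine(counts.Values.Sum()); // 100: no updates were lost
    }
}
```

With a plain Dictionary the same code could corrupt internal state or silently lose increments; the concurrent collection makes those outcomes impossible by design.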

Typically thread safety will involve some synchronization mechanism, which manages the complexity of making sure that multiple threads are not trying to access or modify the data at the same time. There are different synchronization mechanisms (some of which I'll mention in the next section), but their main purpose is to avoid common issues like deadlocks and race conditions. Let's take a look at some of these synchronization techniques that are used by .NET under the hood to avoid common multithreading problems in our code.
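Not every shared update needs a full-blown lock, either. For simple cases .NET offers atomic operations on the Interlocked class; this sketch (my own example) increments a shared counter from many threads without losing any updates:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class Program
{
    private static int _counter = 0;

    public static void Main()
    {
        // A plain "_counter++" is a read-modify-write and can lose updates
        // under contention; Interlocked.Increment performs the whole
        // operation atomically.
        Parallel.For(0, 1_000, _ => Interlocked.Increment(ref _counter));

        Console.WriteLine(_counter); // 1000
    }
}
```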

Synchronization Techniques

In this section, I'll be covering some of the manual techniques that can be used to provide thread safety in .NET. I'd like to mention that in most cases it is best to use the higher level libraries like TPL and PLINQ instead of implementing synchronization by hand. This section is intended to shed a little light on what might be going on behind the scenes in the higher-level abstractions.

The first synchronization technique that we'll be discussing involves the lock keyword which is used to control access to shared resources. When using locks in our code there are a few general guidelines to follow.

  • Don't share the same lock object across multiple shared resources
  • Don't lock on this because it could be locked by callers
  • Don't lock on Type instances because they can be obtained elsewhere (e.g. through typeof or reflection)
  • Don't lock on string instances because interning may cause the same instance to be shared across the application

In the code sample below we've created a private field called _scoreLock to act as a lock controlling access to the _score field. The lock ensures that only a single thread can access and modify the integer value at a time.

using System;
using System.Threading.Tasks;

public class Program
{
    private static object _scoreLock = new object();
    private static int _score = 0;

    public static void Main()
    {
        Parallel.For(0, 20, (idx) => AddScore(idx * 2));
    }

    private static void AddScore(int value)
    {
        // Only one thread at a time can hold _scoreLock, so the
        // read-modify-write of _score cannot be interleaved.
        lock (_scoreLock)
        {
            Console.WriteLine($"Increasing the score by {value}");
            _score += value;
        }
    }
}

While using the lock keyword may seem low-level, we can actually look another layer deeper at the Monitor class which is used by lock.

The Monitor class is used for controlling access to a block of code by acquiring and releasing a lock object. If that sounds similar to the lock keyword, it should: under the hood, the lock keyword uses a couple of static methods on the Monitor class to synchronize access to the block of code. Let's take a look at the AddScore method from our previous example after it has been compiled to see what the lowered C# code looks like.

private static void AddScore(int value)
{
    object scoreLock = _scoreLock;
    bool lockTaken = false;
    try
    {
        Monitor.Enter(scoreLock, ref lockTaken);
        Console.WriteLine($"Increasing the score by {value}");
        _score += value;
    }
    finally
    {
        if (lockTaken)
        {
            Monitor.Exit(scoreLock);
        }
    }
}
As you can see, the lock keyword is no longer present in the code sample after it has been compiled. The lock block is replaced with a try/finally block. In order to acquire the lock, Monitor.Enter is called, passing in the lock object and a boolean to indicate whether the lock has been taken. The finally block ensures that the lock object is released after use with Monitor.Exit.
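Using Monitor directly also unlocks behavior the lock keyword doesn't offer, such as giving up if the lock can't be acquired within a timeout. Here's a hedged sketch (the method and field names are my own) using Monitor.TryEnter:

```csharp
using System;
using System.Threading;

public class Program
{
    private static readonly object _gate = new object();

    public static void Main()
    {
        TryDoWork();
    }

    private static void TryDoWork()
    {
        bool lockTaken = false;
        try
        {
            // Unlike lock, Monitor.TryEnter lets us bail out if the
            // lock isn't available within 500 milliseconds.
            Monitor.TryEnter(_gate, TimeSpan.FromMilliseconds(500), ref lockTaken);
            if (lockTaken)
            {
                Console.WriteLine("Lock acquired; doing work.");
            }
            else
            {
                Console.WriteLine("Could not acquire lock; skipping work.");
            }
        }
        finally
        {
            // Only release the lock if we actually acquired it.
            if (lockTaken)
            {
                Monitor.Exit(_gate);
            }
        }
    }
}
```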

In my opinion, one of the trickier aspects of understanding thread-safety in .NET is that there are several ways to achieve similar synchronization behavior, but each method has slightly different functionality that can make a huge difference when choosing the right one. I'm not going to go into each one of these methods in depth, but I'll provide a bit of information relevant to choosing among them.

  • Mutex
    • The Mutex class is similar to Monitor except that mutexes can be used to synchronize data between processes.
  • SpinLock
    • The SpinLock class will wait for access to a lock by repeatedly looping until the lock is available.
  • ReaderWriterLockSlim
    • ReaderWriterLockSlim allows only a single thread to hold the lock for writing, while allowing multiple threads concurrent access for reading.
  • Semaphore and SemaphoreSlim
    • Semaphores in .NET differ from the Monitor class in that they don't have thread affinity, which means locks can be acquired and released from different threads.
    • If you're interested in learning more about semaphores, I wrote a post about them previously here.
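To make the semaphore idea concrete, here's a small sketch (my own example, not from the post above) that uses SemaphoreSlim to allow at most two workers into a guarded region at once:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class Program
{
    // Two slots: at most two threads run the guarded section at a time.
    private static readonly SemaphoreSlim _semaphore = new SemaphoreSlim(2, 2);
    private static int _active = 0;

    public static void Main()
    {
        Parallel.For(0, 10, i =>
        {
            _semaphore.Wait(); // take a slot (blocks if both are in use)
            try
            {
                int now = Interlocked.Increment(ref _active);
                Console.WriteLine($"Worker {i} entered; active = {now}"); // never above 2
                Thread.Sleep(10); // simulate work
                Interlocked.Decrement(ref _active);
            }
            finally
            {
                _semaphore.Release(); // give the slot back, even on failure
            }
        });
    }
}
```

Note that, unlike lock, Wait and Release here could legally run on different threads, which is what makes semaphores a good fit for async code as well.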


Thread safety and multithreading are difficult topics, and I think this post illustrates that difficulty. I could have gone on much longer and dug much deeper, but I'll leave that to you if you're interested. I hope this post was helpful in gaining a better understanding of thread safety in general and provides an on-ramp to learning more in the future.
As always thank you for taking the time to read this blog post!