Multithreading made easy – Parallel ForEach

Introduction

This week, I was presented with an interesting problem. I had a WPF application which allowed users to scan documents from a commercial scanner into a list of System.Drawing.Image objects, then the images would be searched for barcodes using the great ZXing.Net library.

All worked well, but I found out users were sometimes scanning upwards of 50 images at once, and the process of searching an image for a barcode was taking 4-5 seconds each, meaning it could take 4 minutes to finish! I tried as best as I can to optimise the barcode decoding process, but I still couldn’t get it down to a reasonable time.

My code looked like this:

public List<BarcodedImage> Run(List<Image> images)
{
  var barcodedImages = new List<BarcodedImage>();
  foreach (var image in images)
  {
    //The below operation takes 4-5 seconds
    string barcode = ReadBarcode(image);
    
    barcodedImages.Add(new BarcodedImage(barcode,image));
  }
  
  return barcodedImages;
}

If I were to process 10 images, this would take roughly 45 seconds, which is unacceptable in my application.

Using Parallel.ForEach

I then started to think this may be a good candidate for multithreading – we don’t need to ‘wait’ for an iteration to finish, and we don’t care about the order in which we add them to the list. I started looking at splitting them into batches and having multiple Tasks running, but I came across something called the Task Parallel Library introduced in .NET Framework 4. This library includes the ‘Parallel.ForEach’ method, which promises to allow you to iterate through a collection using multiple threads with minimal thread management – bingo! The syntax is slightly different as it takes a lambda expression, but it’s quite clear once you see it

public static List<BarcodedImage> RunParallel(List<Image> images)
{
  var barcodedImages = new List<BarcodedImage>();
  Parallel.ForEach(images, (image) =>
  {
    //The below operation takes 4-5 seconds
    string barcode = ReadBarcode(image);
    
    barcodedImages.Add(new BarcodedImage(barcode,image));
  });
  
  return barcodedImages;
}

It doesn’t look too different does it? And it worked perfectly for me – the time it took for 10 images went down to just 6 seconds – over a 7x improvement!

Thread safety

However, there is a very major bug in the above code which I didn’t realise until it was close to rolling out to the customer. Previously, my experience with threading was minimal so I failed to understand the pitfalls and considerations that you need to take, and crucially the unpredictability of the execution. So while the code above worked for me, every now and then I’d get a call back from QA saying that one of the images they scanned was not in the ‘barcoded’ list – impossible, I thought. So I made a poor man’s unit test as below

var images = Enumerable.Repeat<Image>(null,100).ToList();		
var barcodedImages = RunParallel(images);
Console.WriteLine($"Passed in {images.Count} images, received {barcodedImages.Count} barcoded images");

Imagine my surprise when the barcodedImages list only contained 97 items! Then on subsequent ones, sometimes it’d be 99, sometimes 94…

In hindsight, I should have realised my mistake. If you were to dig into the code of the Add method on the List type, you’ll see that it’s backed by an array which is dynamically resized when required. If you have multiple threads calling Add on the same method, then multiple threads may be resizing the array, or overwriting the same ‘slot’. I’ve come to realize that you should never assume a type is threadsafe, as from my experience it’s likely not! The upshot of using threading is that your code becomes unpredictable, which makes it a powerful but risky too.

There’s a few ways you can fix the above code – one is by using the lock keyword, and creating an object to use as the lock.

private static object _lockObj = new object();
public static List<BarcodedImage> RunParallel(List<Image> images)
{
  var barcodedImages = new List<BarcodedImage>();
  Parallel.ForEach(images, (image) =>
  {
    //The below opeartion takes 4-5 seconds
    string barcode = ReadBarcode(image);
    lock(_lockObj)
    {
      barcodedImages.Add(new BarcodedImage(barcode,image));
    }
  });
  return barcodedImages;
}

That works, and consistently gives us the correct number of results. Success! However, we can do better. The above code relies on us explicitly locking an object ourselves, which is not only easy to forget but also easy to get wrong. You may end up locking something that you shouldn’t, which could have disastrous consequences on your application. If you were to move the lock keyword above the ReadBarcode method call, the application would never allow the method call concurrently – defeating the point of the multithreading all together! Using locks is the ‘defacto’ way of using non threadsafe types in a threadsafe manner, but this should always be a last resort.  The correct way of approaching is to tackle the problem head on – a List<T> is not threadsafe, we should use a type that is

Concurrent collections

Enter the wonders of the System.Collections.Concurrent namespace. This is a set of generic collections from Microsoft that are designed with thread safety in mind, introduced in .NET Framework 4.

The namespace contains a wide variety of types such as dictionaries, queues and stacks. The one which help with the issue I’ve detailed earlier is the ConcurrentBag<T> – this is essentially an analog to a List<T>, with the crucial difference being that ConcurrentBag is defined as unordered. A List is naturally ordered – when you use the Add method to add an item, it goes to the end of the List. You can also use the Insert method to insert at a specific index. With a ConcurrentBag, the order is not guaranteed, and this is completely logical when you usually have no control over the order of a parallel loop

So if we replace our List<BarcodedImage> image with ConcurrentBag<BarcodedImage>, we can remove the lock and everything will work perfectly!

public static List<BarcodedImage> RunParallel(List<Image> images)
{
  var barcodedImages = new ConcurrentBag<BarcodedImage>();
  Parallel.ForEach(images, (image) =>
  {
    //The below opeartion takes 4-5 seconds
    string barcode = ReadBarcode(image);
    barcodedImages.Add(new BarcodedImage(barcode,image));
  });
return barcodedImages.ToList();
}

One import thing to note – I’m using the System.Linq namespace that allows me to call ToList on the ConcurrentBag (as it implements IEnumerable).

Incrementing a counter

Quite often, you want to keep track of how many items have been processed. The obvious way is below

var items = Enumerable.Range(0,100).ToList();
var storage = new ConcurrentBag<int>();
		
int i = 0;
Parallel.ForEach(items, (item) => 
{
	if (item < 90)
	{
		i++;
		storage.Add(item);
	}
});
		
Console.WriteLine($"Processed {i}. Bag has {storage.Count}");

However, the output is likely not to be what you would expect – this is what I had when I ran it:

Processed 65. Bag has 90
How can the counter possibly be wrong if we’re incrementing it just before we add to our collection? The key, again, is thread safety. Incrementing an integer is not atomic, and thus not thread safe; the variable is read, and then 1 is added to it. If the variable has changed in the meantime by another thread, the increment operation may not occur.
 
You could use a ‘lock’ as we did with our List<T>, but Microsoft provide a collection of operations in the Interlocked class, which will carry out certain operations atomically  (in one single operation). This makes it inherently thread safe, if a bit slower.
 
The amended code is below, along with the new output
var items = Enumerable.Range(0,100).ToList();
var storage = new ConcurrentBag<int>();
		
int i = 0;
Parallel.ForEach(items, (item) => 
{
	if (item < 90)
	{
		Interlocked.Increment(ref i);
		storage.Add(item);
	}
});
		
Console.WriteLine($"Processed {i}. Bag has {storage.Count}");

Conclusion

In my opinion, it’s so easy (especially if you’re primarily develop for the web) to fail to consider threading when developing. It’s important to not multithread for the sake of it though – there are many cases where the parallel ForEach function described earlier would be slower than a regular foreach loop. However, whenever you have a loop that performs some intensive operations, it’s always worth considering!