ScalabilitySystem Design

An Overview of Concurrent Software Systems

This is the 3rd post in a series on System Design.

Multiple independent pieces of code execute simultaneously on multiple processing nodes, across multiple locations, in a distributed system.

We can increase our processing capacity locally and system-wide by explicitly writing our software to perform multiple actions simultaneously on a single node.

Earlier computer systems had only one CPU/core, but today, operating systems such as Linux can run multiple programs at the same time. The operating system schedules another program to run while one waits for an I/O event. Using a software structure that allows multiple activities to run simultaneously, the operating system schedules tasks that have work to do while others await I/O.

IBM introduced the world’s first multicore processor in 2001, a chip with two CPUs. Even my laptop has six cores today. The multicore chip allows software systems with multiple parallel activities to run concurrently on each core, up to the number of available cores. This lets us fully utilize the processing resources on a multicore chip, increasing the processing power of our application.

Concurrency vs parallelism

  • Concurrency — Concurrency is the ability of parts of a program to work correctly even when they are executed out of order. Consider tasks A and B, for example. It is possible to execute them sequentially, meaning doing all steps for A, and then all steps for B. In contrast, concurrent execution alternates between each task until both are completed. A program can make progress even when some parts are blocked due to concurrency. A system may switch to another task while waiting for user input, for instance.
  • Parallelism — Parallelism refers to when tasks run at the same time rather than interleaved. It is possible to run instructions simultaneously on multiple CPU cores.


The use of threads is the primary way to structure a software system as concurrent activities.

By default, every software process has a single thread of execution. As the OS schedules the process for execution, it manages this thread. In Java, the main() function is the entry point to the code that defines the behavior of this thread.

There are features in every programming language that allow you to create and execute additional threads. Each thread runs independently and has its own runtime stack for managing local object creation and method calls. The global data and environment of the process are also accessible to each thread.

A thread can be defined in Java using a class that implements the Runnable interface and defines the run() method.

Using our Runnable instance, we can create a Thread object and call the start() method to execute the thread.

// Output
Running: worker-0, Thread[Thread-0,5,main]
Running: worker-1, Thread[Thread-1,5,main]
In Main: Thread[main,5,main]
In Main: Done with all workers

The Thread.currentThread() method returns a string containing the following information:

  • The system-generated thread identifier.
  • The thread priority, which by default is 5 for all threads.
  • The identifier of the parent thread

By calling the join() method on Thread, this method blocks the calling thread (main) until thread0 terminates.

These threads execute in a nondeterministic order.

Synchronization problems with thread

Concurrent programming involves coordinating multiple threads so that whatever order they are executed, they produce the correct result. Since threads can be started and preempted nondeterministically, there are many possible execution orders.

Race Conditions

The non-deterministic execution of threads implies the following:

  • Each thread will execute sequentially as defined.
  • Threads can overlap in any order. The scheduler determines the number of statements to execute for each thread execution slot.

Thus, when many threads are executed on a single processor, their execution is interleaved. The CPU performs some steps from one thread, then from another, and so on. On a multicore CPU, we can execute one thread per core. The statements of each thread execution are still however interleaved in a nondeterministic manner.

This is not a problem if every thread does its own thing and is completely independent. It is possible for threads to coordinate their work and communicate status across threads by using shared data structures, which may lead to race conditions. Identifying and protecting critical sections is key. A critical section is a section of code that updates shared data structures and hence must be executed atomically if accessed by multiple threads.

In Java, the synchronized keyword defines a critical section. When used to decorate a method, only one thread is permitted to enter the critical section when multiple threads attempt to invoke that method on the same shared object. All others block until the thread exits the synchronized method, at which point the scheduler selects the next thread to execute the critical section. Because only one thread can execute the code inside the critical section at a time, it is serialized.

synchronized public void criticalMethod(){
// Critical Section Code

The synchronized keyword can also be applied to blocks of statements within a method.

public void criticalMethod(){
// Critical Section Code

To minimize serialized code, keep critical sections as small as possible.


I explained that we must serialize access to critical sections in multithreaded code to ensure correct results. As a result, race conditions are avoided. It is possible, however, to write code so restrictive of nondeterminism that our program stops working if we are not careful. The cycle never ends. It is formally known as a deadlock.

Deadlocks occur when two or more threads are unable to proceed. When threads need exclusive access to the same set of resources and acquire locks in different orders, this happens.


Livelock occurs when two tasks in your system are constantly changing states as a result of the actions of the other. Therefore, they are unable to continue due to the constant state changes.

Thread States

In multithreaded systems, a system scheduler determines when to run which threads. Priority is assigned to every thread. Priority is inherited from a thread’s parent. Threads with a higher priority are scheduled more frequently than those with a lower priority

Thread Coordination

In the classic producer-consumer problem, producers generate and send messages to consumers via a shared FIFO buffer. From the buffer, consumers retrieve these messages, process them, and then request more work.

The buffer has a limited capacity, just like virtually all real resources. When the buffer is full, producers must wait until some item(s) have been consumed before adding new items to the buffer. If consumers consume faster than producers produce, they must wait if there are no items in the buffer and somehow be alerted when new items arrive.

Polling is one solution, but every time producers and consumers retry and fail, resources are consumed. It would be better if producers and consumers blocked until their desired operations, put or get, were successful. Blocking threads consumes no resources, making them an efficient solution. In thread programming models, blocking operations enable threads to signal other threads when an event occurs. The basic scheme for the producer-consumer problem is as follows:

  • Any blocked consumers are notified when a producer adds an item to the buffer by sending them a signal.
  • If a consumer retrieves an item from the buffer, it sends a signal to any blocked producers notifying them that there is capacity in the buffer.

Java provides two basic primitives for implementing this signaling scheme, namely, wait() and notify().

  • In synchronized blocks, threads may call wait() if they cannot hold a condition.
  • When notify() is called on the object, it wakes up a thread that has called wait() on it.

Guarded blocks are implemented using these Java primitives. The guarded block uses a condition to prevent threads from resuming execution before the guard holds.

// Output
Thread[consumer0,5,main] waiting to get notified at time:1657789782461
All the threads are started
Thread[producer,5,main] started
Thread[consumer0,5,main] Consumer thread got notified at time:1657789784476
Thread[consumer0,5,main] processed Message: Hello from Producer

Java provides thread-safe abstractions to hide this complexity from your code. In a producer-consumer scenario, a BlockingQueue implementation provides a thread-safe buffer that can be used as the buffer.

Thread Pools

It is common for multithreaded systems to create and manage a collection of threads that perform similar tasks. Thread pools are collections of threads. A thread pool is a collection of worker threads that perform similar tasks and are managed as a group. Thread pools are supported by the ExecutorService interface in java.util.concurrent.

// Output
Producer thread: Thread[pool-1-thread-1,5,main] started
Producer thread: Thread[pool-1-thread-2,5,main] started
Producer thread: Thread[pool-1-thread-1,5,main] started
Message received: Hello from Producer: Thread[pool-1-thread-1,5,main]
Producer thread: Thread[pool-1-thread-2,5,main] started
Message received: Hello from Producer: Thread[pool-1-thread-2,5,main]
Message received: Hello from Producer: Thread[pool-1-thread-1,5,main]
Producer thread: Thread[pool-1-thread-1,5,main] started
Message received: Hello from Producer: Thread[pool-1-thread-2,5,main]
Message received: Hello from Producer: Thread[pool-1-thread-1,5,main]
Message received: <EOM>
Consumer Done with all messages



In 2008, Google announced the creation of a new web browser

  • multi-process (one process per tab)
  • multi-threaded (multiple threads handle the loading of content within each tab)

Some of the advantages they cite for this design

  • stability — Single-process, multi-threaded browsers are vulnerable to having a crash in one tab bringing down the entire browser
  • speed — Multi-process browsers can be more responsive due to OS support.
  • security — Exploits in single-process browsers are easier if malware loaded in one tab can grab information contained in another tab; much harder to grab information across processes.


Book: Mastering Concurrency Programming with Java
Book: Foundations of Scalable Systems

Leave a Reply

Your email address will not be published.