Understanding Processes and Threads

A comprehensive guide to processes, threads, and their differences

Overview

Understanding the differences between processes and threads is crucial for software development. Let’s explore these concepts in detail.

Operating systems manage resources and provide services for computer programs. Two fundamental concepts in operating system design are processes and threads, which enable concurrent execution of tasks. This article provides a comprehensive understanding of these concepts, their differences, and how they interact with the CPU and memory.


Program vs Process

A program is a set of instructions stored on disk, while a process is a running instance of a program. When you execute a program, the operating system loads it into memory, allocates resources, and creates a process to run it.
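This distinction is easy to demonstrate. The following Python sketch (illustrative, not from the original article) launches the same one-line program twice and shows that each run becomes a distinct process with its own PID:

```python
import subprocess
import sys

def run_once() -> int:
    """Run a tiny program that prints its own PID; return that PID."""
    out = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getpid())"],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout.strip())

if __name__ == "__main__":
    pid_a, pid_b = run_once(), run_once()
    # The same program on disk yields two different processes.
    print(pid_a != pid_b)  # True
```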

Key Differences:
  • A program is passive: a static set of instructions stored on disk
  • A process is active: it has a program counter, registers, and a state
  • One program can be run as several independent processes at the same time

Process Components:
  • Text (code) section: the executable instructions
  • Data section: global and static variables
  • Heap: dynamically allocated memory
  • Stack: function parameters, return addresses, local variables
  • Process Control Block (PCB): the operating system's bookkeeping structure



Process vs Thread

Process

A process is an independent execution unit with its own memory space and system resources. The operating system manages processes, providing isolation and protection between them.

Characteristics:
  • Independent execution unit with its own memory space
  • Isolated from and protected against other processes
  • Relatively expensive to create and switch between
  • Communicates with other processes via IPC mechanisms
  • Key concept: Isolation

Process Control Block (PCB):
  • Process ID (PID)
  • Process state
  • Program counter
  • CPU registers
  • Memory-management information
  • Open files and I/O status
  • Scheduling and accounting information

Thread

A thread is a lightweight execution unit that exists within a process. Multiple threads in the same process share memory and resources but execute independently.


Characteristics:
  • Lightweight unit within a process
  • Shares memory with other threads in the same process
  • Has its own thread control block (TCB)
  • Faster to create and destroy than processes
  • Key concept: Concurrency

Thread-specific Resources:
  • Thread ID
  • Program counter
  • Register set
  • Stack

Shared Resources (with other threads in the same process):
  • Code section
  • Data section
  • Open files
  • Signals
  • Heap memory
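The split between shared and per-thread resources can be seen in a short Python sketch (a simple illustration using the standard threading module): module-level data is visible to every thread, while each thread's local variables live on its own private stack:

```python
import threading

counter = 0                 # shared data section: visible to all threads
lock = threading.Lock()     # coordinates access to the shared counter

def worker(n: int) -> None:
    global counter
    local_total = 0         # lives on this thread's private stack
    for _ in range(n):
        local_total += 1
    with lock:              # shared data requires synchronization
        counter += local_total

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 4000: every thread updated the same shared variable
```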



Comparison Table

Aspect          | Process                         | Thread
----------------+---------------------------------+------------------------------------------
Definition      | Program instance in execution   | Smallest unit of programmed instructions
Memory          | Separate memory space           | Shared memory within process
Creation        | Resource-intensive              | Lightweight
Communication   | Complex (IPC)                   | Simple (shared memory)
Control         | Controlled by operating system  | Controlled by process
Overhead        | High                            | Low
Use case        | When isolation is needed        | For concurrent tasks
Failure impact  | Failure is isolated             | Failure affects other threads
Context switch  | Expensive                       | Less expensive
Data sharing    | Requires IPC mechanisms         | Direct access to shared data



Process States and Lifecycle

Throughout its lifetime, a process goes through various states as it executes. The operating system’s scheduler manages these transitions based on system conditions and process behavior.

Basic Process States

NEW: The process is being created. The operating system is allocating resources and setting up the process control block.

READY: The process is waiting to be assigned to a processor. It has all the resources it needs to run, but the CPU is not currently executing its instructions.

RUNNING: The process's instructions are currently being executed by the CPU.

WAITING (BLOCKED): The process is waiting for some event to occur (such as I/O completion or receiving a signal).

TERMINATED: The process has finished execution or has been terminated by the operating system. Resources may still be held until the parent process collects its exit status.

Expanded Process States

In addition to the basic states, many operating systems define additional states to better manage processes:

SUSPENDED READY: Process is temporarily removed from memory and placed on disk, but is otherwise ready to execute.

SUSPENDED WAITING: Process is waiting for an event and has been swapped out to disk.

ZOMBIE: Process execution has completed, but its entry still remains in the process table until the parent collects its exit status.
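The transitions between these states follow fixed rules, which can be encoded as a small table of legal moves. The following Python snippet is a simplified model for illustration, not an operating-system implementation:

```python
# Legal transitions in the basic process lifecycle (simplified model).
TRANSITIONS = {
    "NEW":        {"READY"},
    "READY":      {"RUNNING"},
    "RUNNING":    {"READY", "WAITING", "TERMINATED"},  # preempt / block / exit
    "WAITING":    {"READY"},                           # awaited event occurred
    "TERMINATED": set(),
}

def can_move(src: str, dst: str) -> bool:
    """Return True if the scheduler may move a process from src to dst."""
    return dst in TRANSITIONS.get(src, set())

print(can_move("RUNNING", "WAITING"))   # True: process blocks on I/O
print(can_move("WAITING", "RUNNING"))   # False: must pass through READY first
```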


Process Creation

Processes can be created through several mechanisms:

  1. System Initialization: Initial processes started during boot
  2. Process Creation System Call (fork): A process creates a copy of itself
  3. User Request: User starting an application
  4. Batch Job Initiation: In batch systems

Process Creation in Unix/Linux

#include <stdio.h>
#include <sys/types.h>   /* pid_t */
#include <sys/wait.h>    /* wait() */
#include <unistd.h>      /* fork(), getpid() */

int main(void) {
    pid_t pid = fork();  // Create child process
    
    if (pid < 0) {
        // Error occurred
        fprintf(stderr, "Fork Failed\n");
        return 1;
    } else if (pid == 0) {
        // Child process
        printf("Child Process: PID = %d\n", getpid());
    } else {
        // Parent process
        printf("Parent Process: PID = %d, Child PID = %d\n", getpid(), pid);
        wait(NULL);  // Reap the child so it does not linger as a zombie
    }
    
    return 0;
}


Process Creation in Windows

#include <windows.h>
#include <stdio.h>

int main(void) {
    STARTUPINFO si;
    PROCESS_INFORMATION pi;
    // CreateProcess may modify its command-line argument, so pass a writable buffer
    TCHAR cmdLine[] = TEXT("notepad.exe");
    
    ZeroMemory(&si, sizeof(si));
    si.cb = sizeof(si);
    ZeroMemory(&pi, sizeof(pi));
    
    // Create child process
    if (!CreateProcess(NULL,   // Application name
                      cmdLine, // Command line (writable buffer)
                      NULL,   // Process handle not inheritable
                      NULL,   // Thread handle not inheritable
                      FALSE,  // Set handle inheritance to FALSE
                      0,      // No creation flags
                      NULL,   // Use parent's environment block
                      NULL,   // Use parent's starting directory
                      &si,    // Pointer to STARTUPINFO structure
                      &pi))   // Pointer to PROCESS_INFORMATION structure
    {
        printf("CreateProcess failed (%lu).\n", GetLastError());
        return 1;
    }
    
    printf("Process created with ID: %lu\n", pi.dwProcessId);
    
    // Close process and thread handles
    CloseHandle(pi.hProcess);
    CloseHandle(pi.hThread);
    
    return 0;
}
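For comparison with the platform-specific C examples above, Python's standard-library subprocess module offers a portable way to create a child process. The one-line child program here is just a placeholder:

```python
import subprocess
import sys

# Spawn a child process running a one-line Python program (placeholder command).
result = subprocess.run(
    [sys.executable, "-c", "print('child process ran')"],
    capture_output=True, text=True,
)

print(result.stdout.strip())   # child process ran
print(result.returncode)       # 0 on success
```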



Thread Types and Models

Threads can be implemented at different levels, each with its own advantages and disadvantages.

User-Level Threads

User-level threads are managed by a thread library rather than the operating system. The kernel is unaware of these threads.

Advantages:
  • Thread operations (create, switch, destroy) are fast because they require no system calls
  • Portable across operating systems
  • The thread library can use a custom scheduling policy

Disadvantages:
  • If one thread makes a blocking system call, the entire process blocks
  • Cannot exploit multiple CPU cores, since the kernel sees only one thread
  • The kernel cannot schedule the threads individually

Examples: GNU Portable Threads (Pth), early Java "green threads"

Kernel-Level Threads

Kernel-level threads are managed directly by the operating system. The kernel maintains context information for both the process and its threads.


Advantages:
  • If one thread blocks, the kernel can schedule another thread from the same process
  • True parallelism on multiprocessor systems
  • Kernel can schedule threads from different processes

Disadvantages:
  • Thread operations are slower because they require system calls
  • More overhead for the kernel to manage threads
  • Thread implementation is operating system dependent

Examples: Windows threads, Linux's Native POSIX Thread Library (NPTL)

Thread Models

Operating systems implement various threading models:

  1. Many-to-One Model: Many user-level threads mapped to a single kernel thread
  2. One-to-One Model: Each user-level thread maps to a kernel thread
  3. Many-to-Many Model: Maps many user-level threads to a smaller or equal number of kernel threads
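As a concrete data point, CPython's threading module follows the one-to-one model on major platforms: each Thread object is backed by a distinct kernel thread. A quick sketch (requires Python 3.8+ for get_native_id):

```python
import threading

native_ids = []
lock = threading.Lock()

def record_id() -> None:
    # get_native_id() returns the kernel-assigned thread ID (Python 3.8+).
    with lock:
        native_ids.append(threading.get_native_id())

threads = [threading.Thread(target=record_id) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each user-level Thread object was backed by its own kernel thread.
print(len(set(native_ids)))  # 4
```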



Multi-Threading vs Multi-Processing

Multi-Threading

Multi-threading involves multiple threads of execution within a single process. These threads share the same memory space and resources but execute independently.

Characteristics:
  • Multiple threads within a single process
  • Shared memory space and resources
  • Faster context switching
  • Lower resource usage
  • Simple data sharing, but synchronization is required
  • A crash in one thread can bring down the whole process

Use Cases:
  • I/O-bound applications (network servers, file processing)
  • Responsive user interfaces
  • Tasks that operate on large shared data structures
  • Lightweight concurrency within one application

Multi-Processing

Multi-processing involves multiple processes running concurrently. Each process has its own memory space and resources, providing isolation but requiring more overhead for communication.


Characteristics:
  • Multiple processes
  • Separate memory spaces
  • Slower context switching
  • Higher resource usage
  • Better isolation and stability
  • More complex inter-process communication

Use Cases:
  • CPU-intensive applications
  • Applications requiring high reliability
  • Security-critical applications
  • Systems with large memory requirements

Python Example: Multi-Processing vs Multi-Threading

# Multi-processing example
from multiprocessing import Process

def process_function(name):
    print(f'Process {name} is running')

if __name__ == '__main__':
    processes = []
    for i in range(5):
        p = Process(target=process_function, args=(f'P{i}',))
        processes.append(p)
        p.start()
    
    for p in processes:
        p.join()

# Multi-threading example
import threading

def thread_function(name):
    print(f'Thread {name} is running')

if __name__ == '__main__':
    threads = []
    for i in range(5):
        t = threading.Thread(target=thread_function, args=(f'T{i}',))
        threads.append(t)
        t.start()
    
    for t in threads:
        t.join()



Context Switching

Context switching is the process of saving the state of the currently running process or thread and restoring the state of the next one, so that multiple processes or threads can share a single CPU. The saved context includes the CPU registers, the program counter, memory-management information, and any other data required to resume execution exactly where it left off.


Steps in Context Switching

  1. Time Out (Preemption):
    • The running process (e.g., P1) reaches its time slice limit.
    • The Scheduler is notified to switch the running process.
  2. Saving State:
    • The state of P1 (CPU registers, program counter, etc.) is saved to memory.
    • This ensures that P1 can resume from where it stopped.
  3. Loading Next Process State:
    • The state of P2 is loaded from memory.
    • P2 is now ready to run.
  4. Dispatching:
    • The CPU starts executing P2.
    • This cycle repeats for process scheduling.
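The four steps above can be mimicked with a toy round-robin scheduler in Python. This is purely illustrative: real context switches happen inside the kernel, and here the saved "context" is just a remaining-work counter:

```python
from collections import deque

def round_robin(processes, quantum):
    """Simulate context switching between processes under round-robin scheduling."""
    ready = deque(processes.items())        # (name, remaining_units) pairs
    order = []
    while ready:
        name, remaining = ready.popleft()   # load next process state (dispatch)
        order.append(name)                  # CPU runs this process
        remaining -= quantum                # time slice elapses (preemption)
        if remaining > 0:
            ready.append((name, remaining)) # save state; back of the ready queue
    return order

print(round_robin({"P1": 3, "P2": 2}, quantum=1))
# ['P1', 'P2', 'P1', 'P2', 'P1']
```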


Context Switching Overhead

Context switching incurs overhead that affects system performance:

  • Direct Costs: CPU time spent saving and loading registers, changing memory maps, etc.
  • Indirect Costs: Cache misses, TLB flushes, pipeline stalls
  • Process vs Thread Switching: Thread context switches are generally less expensive since they share memory address space
  • Scheduling Algorithms: Different algorithms can affect the frequency of context switches


Reducing Context Switch Overhead

Operating systems employ various techniques to reduce context switching overhead:

  1. Efficient Thread Scheduling: Group related threads together
  2. CPU Affinity: Keep processes/threads on the same CPU core
  3. Processor Sharing: Use time slicing to reduce the number of full context switches
  4. Asynchronous I/O: Avoid blocking operations that force context switches
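The asynchronous I/O point can be illustrated with Python's standard-library asyncio: instead of parking one blocked thread per I/O wait and paying for the resulting context switches, a single thread multiplexes many waits. The sleep calls here are placeholders for real network or disk I/O:

```python
import asyncio
import time

async def fake_io(name, delay):
    await asyncio.sleep(delay)   # stands in for a network or disk wait
    return name

async def main():
    # Three "I/O operations" overlap on a single thread: no extra
    # threads are created, and no blocking forces a kernel context switch.
    return await asyncio.gather(fake_io("a", 0.05),
                                fake_io("b", 0.05),
                                fake_io("c", 0.05))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

print(results)           # ['a', 'b', 'c']
print(elapsed < 0.15)    # True: the waits overlapped instead of adding up
```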



Hyper-Threading and Hardware Concurrency

Hyper-Threading is Intel’s implementation of Simultaneous Multi-Threading (SMT) technology, which allows a single physical CPU core to execute multiple threads simultaneously.


How Hyper-Threading Works

  1. Physical Core Duplication
    • Duplicate register sets
    • Multiple thread states managed by the same core
    • Appears as multiple logical processors to the operating system
  2. Shared Resources
    • Shared execution engine
    • Shared caches (L1, L2, L3)
    • Shared arithmetic logic units and floating-point units
  3. Benefits
    • Improved efficiency for multi-threaded applications
    • Enhanced parallelism for tasks with different resource needs
    • Better multitasking performance
    • Utilization of idle execution units
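From software, the logical processors that SMT exposes are what os.cpu_count() reports in Python. Note that the physical-core count is not available from the standard library; tools such as lscpu or the psutil package can report it:

```python
import os

# Logical processors, including SMT siblings, as seen by the OS scheduler.
logical = os.cpu_count()
print(f"Logical processors visible to the OS: {logical}")
# On a 4-core CPU with 2-way SMT this typically reports 8.
```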


Performance Considerations

Hyper-Threading generally provides a performance boost of 15-30% compared to single-threaded performance on the same physical core. However, the benefit varies depending on:

  • The workload mix (compute-bound vs memory-bound, integer vs floating-point)
  • Contention for shared caches and execution units
  • How well the two threads' resource needs complement each other
  • Whether the operating system scheduler is SMT-aware

AMD's Implementation: Simultaneous Multi-Threading (SMT)

AMD implements its own version of SMT in its Ryzen and EPYC processors, which functions similarly to Intel's Hyper-Threading but with architecture-specific optimizations.

Key Differences from Intel Hyper-Threading:
  • Different cache hierarchy
  • CCX (Core Complex) based design
  • Thread prioritization mechanisms


Practical Examples: Process and Thread Management

Process and Thread Creation in Java

// Process creation in Java
public class ProcessExample {
    public static void main(String[] args) throws Exception {
        ProcessBuilder processBuilder = new ProcessBuilder("notepad.exe");
        Process process = processBuilder.start();
        System.out.println("Notepad process started");
        
        // Wait for process to complete
        int exitCode = process.waitFor();
        System.out.println("Notepad exited with code: " + exitCode);
    }
}

// Thread creation in Java
public class ThreadExample {
    public static void main(String[] args) {
        Thread thread = new Thread(() -> {
            System.out.println("Thread is running");
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            System.out.println("Thread completed");
        });
        
        thread.start();
        System.out.println("Main thread continues execution");
    }
}

Process and Thread Management in Linux

# List all processes
ps aux

# View process hierarchy
pstree

# Monitor processes in real-time
top

# View thread information for process with PID 1234
ps -T -p 1234

# Set process priority (niceness)
nice -n 10 ./myprogram

# Change priority of running process
renice +5 -p 1234


