Concurrency Part 1

Parallelism through Threads

Revisiting our Game from chapter 3

How to create multiple independent actors? How do they get access to the board?

Terminology

Parallelism

  • The stronger form of concurrency
    • Multiple things happening at the same time
  • Concurrency only implies multiple things happening within the same time span
  • Many ways of achieving parallelism:
    • Instruction-level parallelism (pipelining, superscalar execution)
    • Vectorization
    • Threads
    • Processes

Parallel streams of instructions

  • Machine code is executed sequentially (from the programmer’s point-of-view)
  • If we have multiple cores, we can run multiple streams of instructions in parallel
  • The OS encapsulates these streams in the form of threads and schedules threads onto the available CPU cores
    • A form of concurrency (e.g time-slicing)

Threads vs. processes

Feature Threads Processes
Address space Shared Separate
Stacks Separate One per thread
Program Counter Separate One per thread
Creation cost Cheap(-ish) Expensive

Thread control

  • Threads are managed by the OS
    • On bare metal: Set entry address for CPU cores manually
Operation POSIX Windows
Creation pthread_create CreateThread
Wait for completion pthread_join WaitForSingleObject
Exiting pthread_exit ExitThread
Suspend & Resume Condition variables using pthread_cond_wait, pthread_cond_signal SuspendThread, ResumeThread, or condition variables

Threads in systems programming languages

  • Standard libraries typically abstract thread management into a platform-independent API
  • Plus various types and modules for synchronization
    • Atomics, mutexes, semaphores, condition variables, channels, futures etc.

Threads and shared resources

  • Threads share an address space, so they can share resources
    • What does this tell us about ownership?
fn main() {
    let mut board = gen_board();

    let split_point = 4;
    let counting_region = &board[..split_point];
    let scrambling_region = &mut board[split_point..];

    // Both threads share the `board` resource, but who owns it?
    // (WARNING: Doesn't compile!)
    std::thread::spawn(|| count_whites(counting_region));
    std::thread::spawn(|| scramble(scrambling_region));
}

Ownership models

Ownership models (cont.)

Ownership models (cont.)

Ownership models (cont.)

  • Thread ownership is an extension of function ownership

The Rule Of One

  • With threads, the Rule Of One suddenly makes more sense
    • Concurrent reads and writes are memory unsafe (race condition)
  • Concurrent readers (i.e. immutable borrows) are safe
  • Writers must be unique (i.e. mutable borrows)
  • The known Rust rules apply and prevent concurrency bugs!

Resource decomposition

  • Writing concurrent algorithms often requires identifying independent (sub-)resources
  • In our Go-Game: Scramble one section of the board, count another (disjoint) section
let (counting_region, scrambling_region) = board.split_at_mut(split_point);
std::thread::spawn(|| count_whites(counting_region));
std::thread::spawn(|| scramble(scrambling_region));
  • This fixes concurrent access to our board, but doesn’t fix the lifetime requirement
    • Each spawned thread may live longer than main!

Threads and lifetimes

// std::thread::spawn:
pub fn spawn<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T,
    F: Send + 'static,  // <--- requires 'static lifetime!
    T: Send + 'static {
        todo!("details...")
    }
  • A spawned thread can detach and run as long as the whole program
    • Thread functions must live for 'static (i.e. as long as the program)
  • Options:
    1. Use ownership transfer (move data into thread, move || { ... })
    2. Use scoped threads (std::thread::scope)
    3. Use shared ownership (Arc<T>)
    4. (Global variables (static))

Ownership transfer

  • Use when one thread creates a piece of work and another thread processes it
fn process_file_contents(data: Vec<u8>) { todo!() }

fn main() -> Result<()> {
    let data = std::fs::read_all(file_path)?;
    std::thread::spawn(move || { // move transfers ownership into thread
        process_file_contents(data);
    });
    // ... do other stuff
}

Scoped threads

  • Use when you want to process data owned by the current function from multiple threads
let (counting_region, scrambling_region) = board.split_at_mut(split_point);
std::thread::scope(|s| {
    s.spawn(|| count_whites(counting_region));
    s.spawn(|| scramble(scrambling_region));
}); // <- joins with all threads, no thread lives longer than current function

// ... do something with board

Shared ownership

  • Use when multiple threads need access to the same resource but no single thread is the “main” thread
fn process(lut: &[u8]) { todo!() }

fn main() {
    let lookup_table: Vec<u8> = build_lut(configuration);
    // Lift to multiple ownership (tracked at runtime):
    let multi_owner_lut = Arc::new(lookup_table);
    const NUM_WORKERS: usize = 4;
    for _ in 0..NUM_WORKERS {
        // Clones the *handle*, not the LUT itself!
        let lut = Arc::clone(&multi_owner_lut);
        std::thread::spawn(move || process(lut.as_ref()));
    }
}

When to use which?

  • When would you use shared ownership, scoped threads, or ownership transfers?
    1. Configuration object accessible from multiple threads, loaded at program startup
    2. Fork-join parallelism, e.g. processing a file in parallel
    3. Batch-writing of e.g. log messages to disk
    4. Concurrent compaction step of an append-only database (like in the lab)
    5. Rendering thread that processes draw commands generated from another thread in a game engine
    6. Producer/consumer problems, e.g. chunk-wise image decoding
    7. Response handler in an HTTP server, using an in-memory cache

Shared ownership problems

  • Shared ownership requires heap allocations and runtime tracking of owners
    • Typically realized through reference counting (Arc<T>: Atomic Reference Counted)
  • Shared ownership breaks the Rule Of One!
    • Multiple owners means one owner might write while the other reads

Arc<T> in practice

  • Use-case: A lookup-table (LUT) shared between threads and generated lazily
#[derive(Default)]
struct Lut {
    data: Vec<u8>,
}

impl Lut {
    pub fn lookup(&mut self, key: u8) -> u8 {
        if self.data.is_empty() {
            self.generate();
        }
        self.data[key as usize]
    }

    fn generate(&mut self) {
        todo!()
    }
}

Arc<T> in practice (cont.)

fn process(lut: &Lut, thread_id: u8) {
    println!("{}", lut.lookup(thread_id));
}

fn main() {
    // Lazily calculated LUT
    let shared_lut = Arc::new(Lut::default());
    const NUM_WORKERS: usize = 4;
    for id in 0..NUM_WORKERS {
        let lut = Arc::clone(&shared_lut);
        std::thread::spawn(move || process(lut.as_ref(), id as u8));
    }
}

Arc<T> in practice (cont.)

fn process(lut: &Lut, thread_id: u8) {
    println!("{}", lut.lookup(thread_id));
}

fn main() {
    // Lazily calculated LUT
    let shared_lut = Arc::new(Lut::default());
    const NUM_WORKERS: usize = 4;
    for id in 0..NUM_WORKERS {
        let lut = Arc::clone(&shared_lut);
        std::thread::spawn(move || process(lut.as_ref(), id as u8));
    }
}
cannot borrow `*lut` as mutable, as it is behind a `&` reference

Arc<T> in practice (cont.)

  • Let’s just replace as_ref with as_mut:
fn process(lut: &mut Lut, thread_id: u8) {
    println!("{}", lut.lookup(thread_id));
}

fn main() {
    // Lazily calculated LUT
    let shared_lut = Arc::new(Lut::default());
    const NUM_WORKERS: usize = 4;
    for id in 0..NUM_WORKERS {
        let lut = Arc::clone(&shared_lut);
        std::thread::spawn(move || process(lut.as_mut(), id as u8));
    }
}
no method named `as_mut` found for struct `Arc<Lut>` in the current scope

The API of Arc<T>

  • There is Arc::get_mut, but it looks like this:
pub fn get_mut(this: &mut Arc<T, A>) -> Option<&mut T>
  • It will return None if this has more than one owner (i.e. the reference count is > 1)
  • The documentation states:
Shared references in Rust disallow mutation by default, and Arc is no exception: you cannot generally obtain a mutable reference to something inside an Arc.

Reference counting and interior mutability

  • Reference counted pointers (Arc<T>, Rc<T>) only allow &T access
  • We need a way to obtain a &mut T from a &Arc<T> (or &Rc<T>)
  • Clearly, this is only sometimes safe: If we currently are the only ones trying to get mutable access
    • Just like Arc<T> tracks the number of owners at runtime, we can track the number of borrows at runtime
    • This is called interior mutability
  • Examples:
    • Cell<T>, RefCell<T> for single-threaded code
    • Mutex<T>, RwLock<T>, Atomic<T> for multi-threaded code

Terminology

  • Inherited mutability: If I have a &Thing, all member variables of Thing are also immutable for me
    • The mutability is inherited from the outer type to the members
  • Interior mutability: If I have a &Thing, I can still mutate the member variables of Thing
    • The mutability is an interior property and not related to the outer type

Interior mutability example using Cell<T>

  • The Rule Of One only applies if we borrow data. What if we always copy it instead?
use std::cell::Cell;

struct SomeStruct {
    regular_field: u8,
    special_field: Cell<u8>,
}

let my_struct = SomeStruct {
    regular_field: 0,
    special_field: Cell::new(1),
};

let new_value = 100;

// ERROR: `my_struct` is immutable
// my_struct.regular_field = new_value;

// WORKS: although `my_struct` is immutable, `special_field` is a `Cell`,
// which can always be mutated
my_struct.special_field.set(new_value);
assert_eq!(my_struct.special_field.get(), new_value); // get() returns *copy*

“But I want access to the same value!”

  • Then we need to count the number of borrows and fail if there is already a mutable borrow
use std::cell::RefCell;

struct SomeStruct {
    regular_field: u8,
    special_field: RefCell<u8>,
}

let my_struct = SomeStruct {
    regular_field: 0,
    special_field: RefCell::new(1),
};

let mut borrow = my_struct.special_field.borrow_mut();
*borrow = 100;
// Calling .borrow() again would panic!
// my_struct.special_field.borrow()

RefCell<T> looks like magic!”

pub struct RefCell<T: ?Sized> {
    borrow: Cell<BorrowCounter>,
    value: UnsafeCell<T>, 
}                         
           
pub struct UnsafeCell<T: ?Sized> {
    value: T,
}

impl<T: ?Sized> UnsafeCell<T> {
    // Access only through raw pointers!
    pub const fn get(&self) -> *mut T { todo!() }
}

RefCell<T> in multi-threaded code

fn process(lut: Arc<RefCell<Lut>>, thread_id: u8) {
    let mut borrow = lut.borrow_mut();
    println!("{}", borrow.lookup(thread_id));
}

fn main() {
    // Lazily calculated LUT
    let shared_lut = Arc::new(RefCell::new(Lut::default()));
    const NUM_WORKERS: usize = 4;
    for id in 0..NUM_WORKERS {
        let lut = Arc::clone(&shared_lut);
        std::thread::spawn(move || process(lut, id as u8));
    }
}

RefCell<T> in multi-threaded code

fn process(lut: Arc<RefCell<Lut>>, thread_id: u8) {
    let mut borrow = lut.borrow_mut();
    println!("{}", borrow.lookup(thread_id));
}

fn main() {
    // Lazily calculated LUT
    let shared_lut = Arc::new(RefCell::new(Lut::default()));
    const NUM_WORKERS: usize = 4;
    for id in 0..NUM_WORKERS {
        let lut = Arc::clone(&shared_lut);
        std::thread::spawn(move || process(lut, id as u8));
    }
}
`RefCell<Lut>` cannot be shared between threads safely
the trait `Sync` is not implemented for `RefCell<Lut>`
if you want to do aliasing and mutation between multiple threads, use `std::sync::RwLock` instead

Thread-safety at the type level

  • RwLock<T> (or Mutex<T>) will solve our problem
    • Arc<RwLock<T>> or Arc<Mutex<T>> is a common Rust pattern for shared mutable ownership
    • Arc<T> shares an instance between threads
    • Mutex<T>/RwLock<T> prevent race conditions
  • Rust detects thread-safety at the type level through two traits: Send and Sync

Send and Sync

  • A type implements Send if it is safe to send it to another thread
  • A type implements Sync if it is safe to share between threads
  • These two marker traits form a relation:
    • T is Sync if and only if &T is Send
    • “Sending a shared reference to another thread is the same as sharing the value between threads”
  • Things that are not Send/Sync
    • Raw pointers (anything goes)
    • UnsafeCell<T> (explicitly allows shared mutable state)
    • Rc<T> (uses raw pointers internally)

Send and Sync as generic bounds

  • Here is std::thread::spawn again:
pub fn spawn<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T,
    F: Send + 'static, // All data must be safe to *send* to the new thread
    T: Send + 'static  // The return value will be sent back from the new thread
  • Arc<T> is only Send if T is Sync
    • By definition: Multiple Arcs point to the same value, sending one Arc to another thread gives that thread access to the underlying value, so the value must be Sync!
unsafe impl<T: ?Sized + Sync + Send> Sync for Arc<T> {}

The fixed Game code

fn process(lut: Arc<Mutex<Lut>>, thread_id: u8) {
    let mut lut = lut.lock().expect("Lock was poisoned");
    println!("{}", lut.lookup(thread_id));
}

fn main() {
    // Lazily calculated LUT
    let shared_lut = Arc::new(Mutex::new(Lut::default()));
    const NUM_WORKERS: usize = 4;
    for id in 0..NUM_WORKERS {
        let lut = Arc::clone(&shared_lut);
        std::thread::spawn(move || process(lut, id as u8));
    }
}

These are all the rules we need to know for writing thread-safe code in Rust!