Concurrency Part 1

Parallelism through Threads

Revisiting our Game from chapter 3

How to create multiple independent actors? How do they get access to the board?

Terminology

Parallelism

The stronger form of concurrency
- Multiple things happening at the same time
Concurrency only implies multiple things happening within the same time span
Many ways of achieving parallelism:
- Instruction-level parallelism (pipelining, superscalar execution)
- Vectorization
- Threads
- Processes

Parallel streams of instructions

Machine code is executed sequentially (from the programmer’s point-of-view)
If we have multiple cores, we can run multiple streams of instructions in parallel
The OS encapsulates these streams in the form of threads and schedules threads onto the available CPU cores
- A form of concurrency (e.g time-slicing)

Threads vs. processes

Feature	Threads	Processes
Address space	Shared	Separate
Stacks	Separate	One per thread
Program Counter	Separate	One per thread
Creation cost	Cheap(-ish)	Expensive

Thread control

Threads are managed by the OS
- On bare metal: Set entry address for CPU cores manually

Operation	POSIX	Windows
Creation	`pthread_create`	`CreateThread`
Wait for completion	`pthread_join`	`WaitForSingleObject`
Exiting	`pthread_exit`	`ExitThread`
Suspend & Resume	Condition variables using `pthread_cond_wait`, `pthread_cond_signal`	`SuspendThread`, `ResumeThread`, or condition variables

Threads in systems programming languages

Standard libraries typically abstract thread management into a platform-independent API
- std::thread in C++
- threading.Thread in Python
- std::thread module in Rust
Plus various types and modules for synchronization
- Atomics, mutexes, semaphores, condition variables, channels, futures etc.

Threads and shared resources

Threads share an address space, so they can share resources
- What does this tell us about ownership?

fn main() {
    let mut board = gen_board();

    let split_point = 4;
    let counting_region = &board[..split_point];
    let scrambling_region = &mut board[split_point..];

    // Both threads share the `board` resource, but who owns it?
    // (WARNING: Doesn't compile!)
    std::thread::spawn(|| count_whites(counting_region));
    std::thread::spawn(|| scramble(scrambling_region));
}

Ownership models

Ownership models (cont.)

Thread ownership is an extension of function ownership

The Rule Of One

With threads, the Rule Of One suddenly makes more sense
- Concurrent reads and writes are memory unsafe (race condition)
Concurrent readers (i.e. immutable borrows) are safe
Writers must be unique (i.e. mutable borrows)
The known Rust rules apply and prevent concurrency bugs!

Resource decomposition

Writing concurrent algorithms often requires identifying independent (sub-)resources
In our Go-Game: Scramble one section of the board, count another (disjoint) section

let (counting_region, scrambling_region) = board.split_at_mut(split_point);
std::thread::spawn(|| count_whites(counting_region));
std::thread::spawn(|| scramble(scrambling_region));

This fixes concurrent access to our board, but doesn’t fix the lifetime requirement
- Each spawned thread may live longer than main!

Threads and lifetimes

// std::thread::spawn:
pub fn spawn<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T,
    F: Send + 'static,  // <--- requires 'static lifetime!
    T: Send + 'static {
        todo!("details...")
    }

A spawned thread can detach and run as long as the whole program
- Thread functions must live for 'static (i.e. as long as the program)
Options:
1. Use ownership transfer (move data into thread, move || { ... })
2. Use scoped threads (std::thread::scope)
3. Use shared ownership (Arc<T>)
4. (Global variables (static))

Ownership transfer

Use when one thread creates a piece of work and another thread processes it

fn process_file_contents(data: Vec<u8>) { todo!() }

fn main() -> Result<()> {
    let data = std::fs::read_all(file_path)?;
    std::thread::spawn(move || { // move transfers ownership into thread
        process_file_contents(data);
    });
    // ... do other stuff
}

Scoped threads

Use when you want to process data owned by the current function from multiple threads

let (counting_region, scrambling_region) = board.split_at_mut(split_point);
std::thread::scope(|s| {
    s.spawn(|| count_whites(counting_region));
    s.spawn(|| scramble(scrambling_region));
}); // <- joins with all threads, no thread lives longer than current function

// ... do something with board

Shared ownership

Use when multiple threads need access to the same resource but no single thread is the “main” thread

fn process(lut: &[u8]) { todo!() }

fn main() {
    let lookup_table: Vec<u8> = build_lut(configuration);
    // Lift to multiple ownership (tracked at runtime):
    let multi_owner_lut = Arc::new(lookup_table);
    const NUM_WORKERS: usize = 4;
    for _ in 0..NUM_WORKERS {
        // Clones the *handle*, not the LUT itself!
        let lut = Arc::clone(&multi_owner_lut);
        std::thread::spawn(move || process(lut.as_ref()));
    }
}

When to use which?

When would you use shared ownership, scoped threads, or ownership transfers?
1. Configuration object accessible from multiple threads, loaded at program startup
2. Fork-join parallelism, e.g. processing a file in parallel
3. Batch-writing of e.g. log messages to disk
4. Concurrent compaction step of an append-only database (like in the lab)
5. Rendering thread that processes draw commands generated from another thread in a game engine
6. Producer/consumer problems, e.g. chunk-wise image decoding
7. Response handler in an HTTP server, using an in-memory cache

Shared ownership problems

Shared ownership requires heap allocations and runtime tracking of owners
- Typically realized through reference counting (Arc<T>: Atomic Reference Counted)
Shared ownership breaks the Rule Of One!
- Multiple owners means one owner might write while the other reads

`Arc<T>` in practice

Use-case: A lookup-table (LUT) shared between threads and generated lazily

#[derive(Default)]
struct Lut {
    data: Vec<u8>,
}

impl Lut {
    pub fn lookup(&mut self, key: u8) -> u8 {
        if self.data.is_empty() {
            self.generate();
        }
        self.data[key as usize]
    }

    fn generate(&mut self) {
        todo!()
    }
}

`Arc<T>` in practice (cont.)

fn process(lut: &Lut, thread_id: u8) {
    println!("{}", lut.lookup(thread_id));
}

fn main() {
    // Lazily calculated LUT
    let shared_lut = Arc::new(Lut::default());
    const NUM_WORKERS: usize = 4;
    for id in 0..NUM_WORKERS {
        let lut = Arc::clone(&shared_lut);
        std::thread::spawn(move || process(lut.as_ref(), id as u8));
    }
}

`Arc<T>` in practice (cont.)

fn process(lut: &Lut, thread_id: u8) {
    println!("{}", lut.lookup(thread_id));
}

fn main() {
    // Lazily calculated LUT
    let shared_lut = Arc::new(Lut::default());
    const NUM_WORKERS: usize = 4;
    for id in 0..NUM_WORKERS {
        let lut = Arc::clone(&shared_lut);
        std::thread::spawn(move || process(lut.as_ref(), id as u8));
    }
}

cannot borrow `*lut` as mutable, as it is behind a `&` reference

`Arc<T>` in practice (cont.)

Let’s just replace as_ref with as_mut:

fn process(lut: &mut Lut, thread_id: u8) {
    println!("{}", lut.lookup(thread_id));
}

fn main() {
    // Lazily calculated LUT
    let shared_lut = Arc::new(Lut::default());
    const NUM_WORKERS: usize = 4;
    for id in 0..NUM_WORKERS {
        let lut = Arc::clone(&shared_lut);
        std::thread::spawn(move || process(lut.as_mut(), id as u8));
    }
}

no method named `as_mut` found for struct `Arc<Lut>` in the current scope

The API of `Arc<T>`

There is Arc::get_mut, but it looks like this:

pub fn get_mut(this: &mut Arc<T, A>) -> Option<&mut T>

It will return None if this has more than one owner (i.e. the reference count is > 1)
The documentation states:

Shared references in Rust disallow mutation by default, and Arc is no exception: you cannot generally obtain a mutable reference to something inside an Arc.

Reference counting and interior mutability

Reference counted pointers (Arc<T>, Rc<T>) only allow &T access
We need a way to obtain a &mut T from a &Arc<T> (or &Rc<T>)
Clearly, this is only sometimes safe: If we currently are the only ones trying to get mutable access
- Just like Arc<T> tracks the number of owners at runtime, we can track the number of borrows at runtime
- This is called interior mutability
Examples:
- Cell<T>, RefCell<T> for single-threaded code
- Mutex<T>, RwLock<T>, Atomic<T> for multi-threaded code

Terminology

Inherited mutability: If I have a &Thing, all member variables of Thing are also immutable for me
- The mutability is inherited from the outer type to the members
Interior mutability: If I have a &Thing, I can still mutate the member variables of Thing
- The mutability is an interior property and not related to the outer type

Interior mutability example using `Cell<T>`

The Rule Of One only applies if we borrow data. What if we always copy it instead?

use std::cell::Cell;

struct SomeStruct {
    regular_field: u8,
    special_field: Cell<u8>,
}

let my_struct = SomeStruct {
    regular_field: 0,
    special_field: Cell::new(1),
};

let new_value = 100;

// ERROR: `my_struct` is immutable
// my_struct.regular_field = new_value;

// WORKS: although `my_struct` is immutable, `special_field` is a `Cell`,
// which can always be mutated
my_struct.special_field.set(new_value);
assert_eq!(my_struct.special_field.get(), new_value); // get() returns *copy*

“But I want access to the same value!”

Then we need to count the number of borrows and fail if there is already a mutable borrow

use std::cell::RefCell;

struct SomeStruct {
    regular_field: u8,
    special_field: RefCell<u8>,
}

let my_struct = SomeStruct {
    regular_field: 0,
    special_field: RefCell::new(1),
};

let mut borrow = my_struct.special_field.borrow_mut();
*borrow = 100;
// Calling .borrow() again would panic!
// my_struct.special_field.borrow()

“`RefCell<T>` looks like magic!”

pub struct RefCell<T: ?Sized> {
    borrow: Cell<BorrowCounter>,
    value: UnsafeCell<T>, 
}                         
           
pub struct UnsafeCell<T: ?Sized> {
    value: T,
}

impl<T: ?Sized> UnsafeCell<T> {
    // Access only through raw pointers!
    pub const fn get(&self) -> *mut T { todo!() }
}

`RefCell<T>` in multi-threaded code

fn process(lut: Arc<RefCell<Lut>>, thread_id: u8) {
    let mut borrow = lut.borrow_mut();
    println!("{}", borrow.lookup(thread_id));
}

fn main() {
    // Lazily calculated LUT
    let shared_lut = Arc::new(RefCell::new(Lut::default()));
    const NUM_WORKERS: usize = 4;
    for id in 0..NUM_WORKERS {
        let lut = Arc::clone(&shared_lut);
        std::thread::spawn(move || process(lut, id as u8));
    }
}

`RefCell<T>` in multi-threaded code

fn process(lut: Arc<RefCell<Lut>>, thread_id: u8) {
    let mut borrow = lut.borrow_mut();
    println!("{}", borrow.lookup(thread_id));
}

fn main() {
    // Lazily calculated LUT
    let shared_lut = Arc::new(RefCell::new(Lut::default()));
    const NUM_WORKERS: usize = 4;
    for id in 0..NUM_WORKERS {
        let lut = Arc::clone(&shared_lut);
        std::thread::spawn(move || process(lut, id as u8));
    }
}

`RefCell<Lut>` cannot be shared between threads safely
the trait `Sync` is not implemented for `RefCell<Lut>`
if you want to do aliasing and mutation between multiple threads, use `std::sync::RwLock` instead

Thread-safety at the type level

RwLock<T> (or Mutex<T>) will solve our problem
- Arc<RwLock<T>> or Arc<Mutex<T>> is a common Rust pattern for shared mutable ownership
- Arc<T> shares an instance between threads
- Mutex<T>/RwLock<T> prevent race conditions
Rust detects thread-safety at the type level through two traits: Send and Sync

`Send` and `Sync`

A type implements Send if it is safe to send it to another thread
A type implements Sync if it is safe to share between threads
These two marker traits form a relation:
- T is Sync if and only if &T is Send
- “Sending a shared reference to another thread is the same as sharing the value between threads”
Things that are not Send/Sync
- Raw pointers (anything goes)
- UnsafeCell<T> (explicitly allows shared mutable state)
- Rc<T> (uses raw pointers internally)

`Send` and `Sync` as generic bounds

Here is std::thread::spawn again:

pub fn spawn<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T,
    F: Send + 'static, // All data must be safe to *send* to the new thread
    T: Send + 'static  // The return value will be sent back from the new thread

Arc<T> is only Send if T is Sync
- By definition: Multiple Arcs point to the same value, sending one Arc to another thread gives that thread access to the underlying value, so the value must be Sync!

unsafe impl<T: ?Sized + Sync + Send> Sync for Arc<T> {}

The fixed Game code

fn process(lut: Arc<Mutex<Lut>>, thread_id: u8) {
    let mut lut = lut.lock().expect("Lock was poisoned");
    println!("{}", lut.lookup(thread_id));
}

fn main() {
    // Lazily calculated LUT
    let shared_lut = Arc::new(Mutex::new(Lut::default()));
    const NUM_WORKERS: usize = 4;
    for id in 0..NUM_WORKERS {
        let lut = Arc::clone(&shared_lut);
        std::thread::spawn(move || process(lut, id as u8));
    }
}

These are all the rules we need to know for writing thread-safe code in Rust!

Concurrency Part 1

Revisiting our Game from chapter 3

Terminology

Parallelism

Parallel streams of instructions

Threads vs. processes

Thread control

Threads in systems programming languages

Threads and shared resources

Ownership models

Ownership models (cont.)

Ownership models (cont.)

Ownership models (cont.)

The Rule Of One

Resource decomposition

Threads and lifetimes

Ownership transfer

Scoped threads

Shared ownership

When to use which?

Shared ownership problems

Arc<T> in practice

Arc<T> in practice (cont.)

Arc<T> in practice (cont.)

Arc<T> in practice (cont.)

The API of Arc<T>

Reference counting and interior mutability

Terminology

Interior mutability example using Cell<T>

“But I want access to the same value!”

“RefCell<T> looks like magic!”

RefCell<T> in multi-threaded code

RefCell<T> in multi-threaded code

Thread-safety at the type level

Send and Sync

Send and Sync as generic bounds

The fixed Game code

`Arc<T>` in practice

`Arc<T>` in practice (cont.)

`Arc<T>` in practice (cont.)

`Arc<T>` in practice (cont.)

The API of `Arc<T>`

Interior mutability example using `Cell<T>`

“`RefCell<T>` looks like magic!”

`RefCell<T>` in multi-threaded code

`RefCell<T>` in multi-threaded code

`Send` and `Sync`

`Send` and `Sync` as generic bounds