Memory Management 1

Memory + Static Type Systems

Why do modern computers have different types of memory?

The Memory Hierarchy

Memory differs in size, access speed, physical location and cost
Memory forms a hierarchy:

Caches

Caches are storage buffers for frequently / recently used memory

Memory meets Physics

Question 1: On a 3 GHz CPU, how much time does a single CPU cycle take?
Question 2: Assume information travels at the speed of light and you want to read a byte from RAM, process it with a single instruction, and write it back to RAM at 3 GHz. What is the maximum distance between CPU and Memory that is physically possible?

Memory Size and Latency

Memory that is arbitrarily large and arbitrarily fast is impossible: These two factors are mutually exclusive!

Memory Type	Typical Size	Typical access time
Register	A few bytes	One CPU cycle (less than a nanosecond)
L1 Cache	Dozens of KiBs	A few CPU cycles (about a nanosecond)
L2 Cache	Hundreds of KiBs	~10ns
L3 Cache	A few MiBs	20-40ns
Main Memory	GiBs	100-300ns
SSD	TiBs	A few hundred microseconds
HDD	Dozens of TiBs	A few milliseconds
Network Storage	Up to Exabytes	Hundreds of milliseconds to seconds

The Stack (recap)

Special memory region & data structure for handling subroutine calls (i.e. functions)
Needed for:
- Local variables
- Large arguments (that don’t fit in registers)
One advantage of higher level languages: Automatic stack management

Scopes

The region in the code where a given variable is valid is called the scope of that variable

fn foo() { // -------
    let var = 42; // | *scope* of var (and also of fn foo)
} // ----------------

fn main() {
    foo();
    println!("{var}"); // Does not compile, var does not exist
}

C, C++, and Rust use curly braces {} to define scopes

Scopes and the stack

fn f3(arg1: &i32, arg2: &i32) {
    let var_f3 = 42;
    println!("f3: {} {} {}", arg1, arg2, 
        var_f3);
}

fn f2(arg: &i32) {
    let var_f2 = 53;
    f3(arg, &var_f2);
    println!("f2: {} {}", var_f2, arg);
}

fn f1() {
    let var_f1 = 64;
    f2(&var_f1);
    println!("f1: {}", var_f1);
}

How does the stack look like in line 3? And in line 10?

Stack memory is managed by the compiler!

No need to write manual push and pop statements
This is a form of automatic memory management!
- It has no overhead compared to manually writing push and pop (if your compiler is smart)
Downsides:
- Stack size is limited (usually a few MiB)
- Stack memory is strictly hierarchical
- Allocations on the stack have to be know at compile time

Limitations of the stack

fn main() {
    let mut input = String::new();
    let mut numbers = ???;
    loop {
        println!("Insert a number or leave empty to finish:");
        std::io::stdin().read_line(&mut input)
            .expect("Failed to read input");
        if input.is_empty_or_whitespace() {
            break;
        }

        let number = input.parse::<i32>().expect("invalid value");
        numbers.push(number);
    }

    numbers.sort();
    println!("{numbers}");
}

Limitations of the stack

fn main() {
    let mut input = String::new();
    let mut numbers = ???;
    loop {
        println!("Insert a number or leave empty to finish:");
        std::io::stdin().read_line(&mut input)
            .expect("Failed to read input");
        if input.is_empty_or_whitespace() {
            break;
        }

        let number = input.parse::<i32>().expect("invalid value");
        numbers.push(number);
    }

    numbers.sort();
    println!("{numbers}");
}

How many numbers will there be?
How big will the String be?

Stack-only code is possible

const MAX_STRING_LEN: usize = 256;
const MAX_NUMBERS: usize = 128;

fn stoi(bytes: &[u8]) -> i32 { todo!() }

fn main() {
    let mut input: [u8; _] = [0; MAX_STRING_LEN];
    let mut input_size: usize = 0;
    let mut numbers: [i32; _] = [0; MAX_NUMBERS];
    let mut numbers_size: usize = 0;
    loop {
        println!("Insert a number or leave empty to finish:");
        input_size = std::io::stdin()
            .read(&mut input)
            .expect("Failed to read input");
        if input_size == 0 {
            break;
        }

        let number = stoi(&input[..input_size]);
        numbers[numbers_size] = number;
        numbers_size += 1;
    }

    numbers.sort();
    println!("{:?}", &numbers[..numbers_size]);
}

Your task (in groups): Find as many things as you can that are wrong with this code!
You can use LLMs, but group into “problems I understand” and “problems I don’t understand”
10 minutes

Dynamic memory allocations

We need growable arrays:

fn main() {
    let mut input = String::new();
    let mut numbers = Vec::new(); 
    loop {
        println!("Insert a number or leave empty to finish:");
        std::io::stdin().read_line(&mut input)
            .expect("Failed to read input");
        if input.is_empty_or_whitespace() {
            break;
        }

        let number = input.parse::<i32>().expect("invalid value");
        numbers.push(number);
    }

    numbers.sort();
    println!("{numbers}");
}

A look into `Vec<T>`

Vec<T> (or std::vector<T> in C++) manages a dynamic piece of memory
- The allocation is dynamic: It happens on-demand
- The size is dynamic: It can grow and shrink

Open questions

How is memory allocated?
How is it freed? When is it freed?
How do uninitialized elements work?
What happens if the memory on the heap is full?

Open questions

How is memory allocated? -> unsafe Rust
How is it freed? When is it freed? -> RAII
How do uninitialized elements work? -> unsafe Rust
What happens if the memory on the heap is full? -> unsafe Rust

Automatic deallocation?

numbers goes out of scope at the end of main
This cleans up the stack memory. What about the heap?

fn main() {
    let mut input = String::new();
    let mut numbers = Vec::new(); 
    loop {
        println!("Insert a number or leave empty to finish:");
        std::io::stdin().read_line(&mut input)
            .expect("Failed to read input");
        if input.is_empty_or_whitespace() {
            break;
        }

        let number = input.parse::<i32>().expect("invalid value");
        numbers.push(number); // Will allocate
    }

    numbers.sort(); 
    println!("{numbers}");
}

The `Drop` trait - Destructors in Rust

Tell the Rust compiler: “Call this function when an instance of this type goes out of scope!”
Allows us to clean up more than just the stack memory
Mandatory whenever we write a type that owns a resource:
- Vec<T> owns a dynamic array on the heap
- String also owns a dynamic array (of UTF-8 bytes) on the heap
- File owns a file handle from the OS

The `Drop` trait - Example

impl<T> Drop for Vec<T> {
    fn drop(&mut self) {
        todo!("deallocate the buffer")
    }
}

fn main() {
    let mut numbers = Vec::new(); 
    numbers.push(42);
}

The `Drop` trait - Example

impl<T> Drop for Vec<T> {
    fn drop(&mut self) {
        todo!("deallocate the buffer")
    }
}

fn main() {
    let mut numbers = Vec::new(); 
    numbers.push(42);
    Drop::drop(&mut numbers);
}

Compiler inserts the drop call
Now we can never forget to call free/delete anymore!

RAII

This is called resource acquisition is initialization (RAII)
- Terrible name for an amazing concept
- Better name: Scoped-based resource management
Acquire a resource during initialization of the type
Release the resource during destruction of the type

Is it really that simple?

fn print_numbers(numbers: Vec<i32>) {
    for number in numbers { println!("{number}"); }
}

fn main() {
    let mut numbers = Vec::new(); 
    loop {
        // Read numbers from command line...
    }

    print_numbers(numbers);
    numbers.sort();
    print_numbers(numbers);
}

Is it really that simple?

fn print_numbers(numbers: Vec<i32>) {
    for number in numbers { println!("{number}"); }
}

fn main() {
    let mut numbers = Vec::new(); 
    loop {
        // Read numbers from command line...
    }

    print_numbers(numbers);
    numbers.sort();
    print_numbers(numbers);
}

borrow of moved value: `numbers`

Semantics

What does it mean to pass a value into a function?
- print_numbers(numbers)
How does the function get access to the value?
The CPU only knows bytes…
Depends on the semantics of the programming language
- Semantics: The study of meaning in languages

Passing values to functions

We have three options:
1. Copy
2. Move
3. Indirection (i.e. pointers)

Rust has move-semantics

Rust moves values on assignment and function call
A moved-from value becomes invalid

fn main() {
    let mut numbers = Vec::new(); 
    // ...
    print_numbers(numbers);
    numbers.sort(); // <- `numbers` has been moved!
    print_numbers(numbers);
}

Why would we choose move-by-default?

Indirection by default?

A problem with copies

Copying resource owners

We can fix our issues by doing a deep copy
How is that performance-wise?
What if an element on the heap is another resource-owner?
- How deep does a deep copy have to be?

Performance tradeoff

Copy-by-default:
- Hidden performance hits possible
- Straightforward, doesn’t “steal” anything
Move-by-default:
- Best performance
- Transfers ownership (unusual?)
Indirection:
- On primitive types: Overkill
- On large resource owners: Good

Primitive types and move

Does this code compile?

fn print(number: i32) {
    println!("{number}");
}

fn main() {
    let num = 1024;
    print(num); // `num` moved into `fn print`
    print(num); // ?
}

It does, but why?

The `Copy` trait

A move has no direct equivalent on the hardware level
- It can be a copy
- The copy can be elided (e.g. return values)
- Registers might be reused for efficiency
Primitive types always fit into registers
- This makes them cheap to copy
Rust has a marker trait called Copy for types that support bitwise copies
- All primitive types implement Copy

`Copy` turns moves into copies

fn print(number: i32) {
    println!("{number}");
}

fn main() {
    let num = 1024; // i32 implements `Copy`
    print(num); // `num` NOT moved but COPIED!
    print(num); // `num` still valid :)
}

Who can implement `Copy`?

Only fully self-contained types
- No indirection!
- The whole value of the type must live entirely on the stack
What if we want deep copies instead?

Manually copying with the `Clone` trait

We use the Clone trait!

trait Clone: Sized {
    fn clone(&self) -> Self;
}

impl<T: Clone> Clone for Vec<T> {
    //  ^---- deep copies require transitive Clone!
    fn clone(&self) -> Self {
        let mut cloned = Self::new();
        for value in &self {
            cloned.push(value.clone());
        }
        cloned
    }
}

`Copy`, `Clone`, moving and pinning

Come up with a Copy + Clone type, a Clone type, and a move-only type

Work in groups, duplicated types cancel out!

From growable arrays to their limitations:
- Don’t forget to clean up memory (leads to RAII)
- Indirection / resource ownership changes value semantics (leads to Copy, Clone, move-by-default)

Memory Management 1

The Memory Hierarchy

Caches

Memory meets Physics

Memory Size and Latency

The Stack (recap)

Scopes

Scopes and the stack

Stack memory is managed by the compiler!

Limitations of the stack

Limitations of the stack

Stack-only code is possible

Dynamic memory allocations

A look into Vec<T>

Open questions

Open questions

Automatic deallocation?

The Drop trait - Destructors in Rust

The Drop trait - Example

The Drop trait - Example

RAII

Is it really that simple?

Is it really that simple?

Semantics

Passing values to functions

Rust has move-semantics

Indirection by default?

A problem with copies

Copying resource owners

Performance tradeoff

Primitive types and move

The Copy trait

Copy turns moves into copies

Who can implement Copy?

Manually copying with the Clone trait

Copy, Clone, moving and pinning

A look into `Vec<T>`

The `Drop` trait - Destructors in Rust

The `Drop` trait - Example

The `Drop` trait - Example

The `Copy` trait

`Copy` turns moves into copies

Who can implement `Copy`?

Manually copying with the `Clone` trait

`Copy`, `Clone`, moving and pinning