Pointers to single objects in C++

The next step after abstractions for owning and non-owning arrays in dynamic memory is to think about single instances of a type. We will first look at how this is typically done in C++; the Rust way of doing things is covered in the next section. This section covers the following topics:

Motivation: Why do we want single instances on the heap?

What reason is there to put a single instance onto the heap? For an array, our motivation was clear: We want a dynamic size that we only know at runtime. For a single instance, we know our size - the size of the type - so why use the heap here? Try to first come up with an answer for yourself before reading on:

Question: What reasons are there to put a single instance of a type onto the heap?

*  The type is too big to fit onto the stack
*  The type is not copyable, not movable, and has a non-trivial lifetime
*  The size of the type is not known

It is easy to come up with an example for the first situation: A type that is too large to fit onto the stack:

struct LargeType {
    char ten_mb_of_text[10 * 1024 * 1024];
};
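
As a rough sketch of what this means in practice: default stack sizes are platform-specific, but on many systems they are only a few megabytes (this is an assumption, not a guarantee). A local variable of type LargeType may therefore overflow the stack, while the same object allocated with new lives on the heap without problems. The function name check_heap_allocation is made up for this illustration.

```cpp
#include <cassert>

struct LargeType {
    char ten_mb_of_text[10 * 1024 * 1024];
};

// Hypothetical helper: puts one LargeType on the heap, uses it, frees it again.
void check_heap_allocation() {
    // LargeType local;               // roughly 10 MiB on the stack: likely a stack overflow
    LargeType* large = new LargeType;  // roughly 10 MiB on the heap: fine
    large->ten_mb_of_text[0] = 'x';
    assert(large->ten_mb_of_text[0] == 'x');
    delete large;                      // manual cleanup, as with every raw 'new'
}
```

Calling check_heap_allocation() from main() should run without issues, whereas uncommenting the local variable is likely to crash on systems with a small default stack.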

The second example is more tricky. Up until now, we have mostly dealt with types that can be copied and moved. Depending on the notion of 'moving', it is not even clear why a type should not be at least movable. We saw one previous example of a non-copyable type, during our SillyVector implementation before learning about deep copies. At this point, it is hard to come up with a good example of a non-movable type, but the C++ and Rust standard libraries know this concept: std::mutex in C++ can be neither copied nor moved, and Pin<T> in Rust exists to prevent values from being moved. Sometimes you also want to create self-referential types, that is, types that contain a reference to themselves. These types generally should not be movable, because moving them in memory will invalidate the reference.
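
To make this a bit more concrete, here is a small sketch. The type traits from <type_traits> confirm that std::mutex can be neither copied nor moved. The type SelfReferential and the function check_self_referential are made up for this illustration; they show why a type that stores a pointer to one of its own members should disable copying and moving.

```cpp
#include <cassert>
#include <mutex>
#include <type_traits>

// std::mutex is a standard library type that is neither copyable nor movable.
static_assert(!std::is_copy_constructible<std::mutex>::value, "mutex is not copyable");
static_assert(!std::is_move_constructible<std::mutex>::value, "mutex is not movable");

// Hypothetical self-referential type: 'self' always points at 'value'
// inside the very same object.
struct SelfReferential {
    explicit SelfReferential(int v) : value(v), self(&value) {}

    // A memberwise copy or move would duplicate 'self', which would then
    // still point into the old object. So we forbid both operations.
    SelfReferential(const SelfReferential&) = delete;
    SelfReferential& operator=(const SelfReferential&) = delete;

    bool points_to_itself() const { return self == &value; }

    int value;
    int* self;
};

// Deleting the copy operations also suppresses the implicit move operations.
static_assert(!std::is_move_constructible<SelfReferential>::value,
              "SelfReferential is not movable");

// Such an object is created once on the heap and then only passed around by pointer.
void check_self_referential() {
    SelfReferential* obj = new SelfReferential{42};
    assert(obj->points_to_itself());
    delete obj;
}
```

Because the object never changes its address after construction, the internal pointer stays valid for its whole lifetime.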

The last example is interesting: A type whose size is not known at compile-time. Wasn't the whole point of having a strong type system to know the size of all types at compile-time? There are situations where this is not the case, even in a strong type system. This is the whole point of interfaces in many programming languages: To have a piece of code that works with instances whose type is only known at runtime. The concept of subtyping polymorphism works this way, for example when you have classes with virtual methods in C++. If you work with instances of a derived class through its base class, you end up with a type whose size is only known at runtime. Let's look at an example:

#include <iostream>
#include <string>

struct Base {
    virtual ~Base() {}
    virtual void foo() {
        std::cout << "Base" << std::endl;
    }
};

struct DerivedA : Base {
    explicit DerivedA(int v) : val(v) {}
    void foo() override {
        std::cout << "DerivedA(" << val << ")" << std::endl;
    }

    int val;
};

struct DerivedB : Base {
    explicit DerivedB(std::string v) : val(v) {}
    void foo() override {
        std::cout << "DerivedB(" << val << ")" << std::endl;
    }

    std::string val;
};

int main() {
    Base a = DerivedA{42};
    Base b = DerivedB{"hello"};
    
    a.foo();
    b.foo();

    return 0;
}

In this example, we have a base class Base, and two derived classes which store different types and thus have different sizes. We then create an instance of both derived classes and call a virtual method on them...

Output of this example:
Base
Base

...and find out that none of our virtual methods in the subclasses have been called. Somehow, our code did not delegate to the appropriate virtual methods in the subclasses. This is reflected in the assembly code of this example:

    ; ...
    call    Base::foo()
    ; ...

So the C++ compiler saw that we were declaring a value of type Base and were calling a method named foo() on it. It didn't care that our function was declared virtual, because the way we invoked that function was on a value, which is just an ordinary function call. What we want is a virtual function call, and in C++ virtual function calls are only supported if a function is called through a pointer or reference. So let's try that:

int main() {
    Base a = DerivedA{42};
    Base b = DerivedB{"hello"};

    Base& a_ref = a;
    Base& b_ref = b;
    
    a_ref.foo();
    b_ref.foo();

    return 0;
}

Output of this example:
Base
Base

Still the same result?! So what did we do wrong? Well, we committed a cardinal sin in C++: We assigned a value of a derived class to a variable of a base class type. Recall that all local variables are allocated on the stack. To do so, the compiler has to know the size of the variable to allocate enough memory on the stack. Our variables a and b are both of type Base, so the compiler generates the necessary instructions to allocate memory for two instances of Base. We can use the sizeof operator in C++ to figure out how big one Base instance is, which in this case yields 8 bytes (the exact numbers depend on the platform and standard library; the ones shown here are from a 64-bit system). We then assign instances of DerivedA and DerivedB to these variables. How big are those?

sizeof(DerivedA): 16
sizeof(DerivedB): 40

How does that work, assigning a larger object to a smaller object? In C++, this is done through slicing, and it is not something you typically want. In the statement Base a = DerivedA{42};, we assign a 16-byte object into an 8-byte memory region. To make this fit, the compiler copies only the Base part of the larger object and slices off everything else. So all the information that makes our DerivedA type special is gone, including its override of foo().
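
The following sketch contrasts slicing with binding a reference to the original, unsliced object. It uses a simplified version of the class hierarchy from above; the virtual function name() and the three helper functions are made up for this illustration so that the result can be checked without printing.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

struct DerivedA : Base {
    explicit DerivedA(int v) : val(v) {}
    std::string name() const override { return "DerivedA"; }
    int val;
};

// Copying into a Base *value* slices: only the Base part is copied.
std::string name_after_slicing() {
    DerivedA derived{42};
    Base sliced = derived;  // slicing happens here
    return sliced.name();
}

// Binding a Base *reference* to the original object copies nothing.
std::string name_through_reference() {
    DerivedA derived{42};
    Base& ref = derived;    // no copy, no slicing
    return ref.name();
}

void check_slicing() {
    assert(name_after_slicing() == "Base");
    assert(name_through_reference() == "DerivedA");
}
```

So references and pointers do give us virtual dispatch, but only if the object they refer to still is a complete DerivedA.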

Of course this is a toy example, no one would write code like this, right? Not like this maybe, but how about writing a function that returns an object derived from some base class? Let's say in our example that the function takes a bool parameter, and if this parameter is true, it returns a DerivedA instance, otherwise a DerivedB instance:

Base magic_factory(bool flavor) {
    if(flavor) {
        return DerivedA{42};
    }
    return DerivedB{"hello"};
}

int main() {
    Base a = magic_factory(true);
    Base b = magic_factory(false);

    return 0;
}

This is the exact same situation! magic_factory returns an object of type Base, but constructs objects of the (larger) subtypes, so slicing will happen. In order to work correctly, magic_factory has to return a type whose size is unknown at compile-time! This is why polymorphic objects whose concrete type is only known at runtime are usually allocated on the heap in C++: Their size can be unknown at compile-time, so we have to use the only type available to us that can refer to a variable-size memory region - a pointer!

So there we have it, the last reason for allocating a single instance of a type on the heap. Just for completeness, here is the correct code for the previous example:

Base* magic_factory(bool flavor) {
    if(flavor) {
        return new DerivedA{42};
    }
    return new DerivedB{"hello"};
}

int main() {
    Base* a = magic_factory(true);
    Base* b = magic_factory(false);

    a->foo();
    b->foo();

    delete a;
    delete b;

    return 0;
}

Managing single instances on the heap

We can now think about writing a good abstraction that represents a single instance allocated on the heap and manages the memory allocation for us. We already have an abstraction for multiple instances on the heap: Our SillyVector type (or the better std::vector type from the C++ standard library). Let's try to use it:

SillyVector<Base> magic_factory(bool flavor) {
    SillyVector<Base> ret;
    if(flavor) {
        ret.push(DerivedA{42});
    } else {
        ret.push(DerivedB{"hello"});
    }
    return ret;
}

int main() {
    SillyVector<Base> a = magic_factory(true);
    SillyVector<Base> b = magic_factory(false);

    a.at(0).foo();
    b.at(0).foo();

    return 0;
}

Ok, not super nice in terms of usability, with this a.at(0) syntax, but a start. Let's run this to confirm that it works:

Base
Base

Of course, our SillyVector<T> stores multiple instances of a single type known at compile-time. When we call the push() function, slicing happens again, because push() expects a value of type Base and we pass it a value of type DerivedA or DerivedB. So clearly, we need something better that can deal with derived classes.

What is it that we want from our new type? It should be able to hold a single instance of some type T, or any type U that derives from T! Just like SillyVector, we want all dynamic memory allocations to happen automatically, and memory should be cleaned up correctly once the instance of our new type is destroyed. So really what we want is a pointer that is just a bit smarter than a regular pointer:

template<typename T>
struct SmartPtr {
    SmartPtr() : ptr(nullptr) {}
    explicit SmartPtr(T* dumb_ptr) : ptr(dumb_ptr) {}
    ~SmartPtr() {
        if(ptr) delete ptr;
    }

    T& get() { return *ptr; }
    const T& get() const { return *ptr; }
private:
    T* ptr;
};

This SmartPtr type wraps around a regular (dumb) pointer and makes sure that delete is called when the SmartPtr object goes out of scope. It also provides a get() method that returns a reference to the object (though calling it on a SmartPtr that holds nullptr is not a good idea). Note that this type is generic on the parameter T, but stores a pointer to T! The C++ rules allow a pointer of a derived type (U*) to be converted into a pointer of the base type (T*), so we can create a SmartPtr<Base> from a DerivedA* or DerivedB*, which is exactly what we want:

SmartPtr<Base> magic_factory(bool flavor) {
    if(flavor) {
        return SmartPtr<Base>{ new DerivedA{42} };
    } 
    return SmartPtr<Base>{ new DerivedB{"hello"} };
}

int main() {
    SmartPtr<Base> a = magic_factory(true);
    SmartPtr<Base> b = magic_factory(false);

    a.get().foo();
    b.get().foo();

    return 0;
}

The only unfortunate thing is that we have to call new DerivedA{...} ourselves in this example. To avoid this, we could pass a value of some type U that is T or derives from T to the constructor, and then copy or move this value onto the heap. The other option would be to pass all the constructor arguments for the type U to the constructor of SmartPtr and then call new U{args} there. For the first option, we have to introduce a second template type U only for the constructor, and then make sure that this constructor can only get called when U derives from T. Prior to C++20, which introduced concepts for constraining generic types, this is how we would write such a constructor:

template<typename U, 
        typename = std::enable_if_t<std::is_base_of<T, U>::value>>
explicit SmartPtr(U val) {
    ptr = new U{std::move(val)};
}

The std::enable_if_t construct effectively tells the compiler to only compile this templated function if the condition std::is_base_of<T, U> holds, that is if T is a base class of U (or the same type as U). Within the constructor, we then create a new instance of U on the heap and initialize it with the contents of val. std::move tells the compiler to try and move val into the constructor call of U. If U is not a movable type, it is copied instead.
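
Since the argument order of std::is_base_of is easy to mix up, here is a small sketch that checks it at compile time. The types Base, DerivedA and Unrelated as well as the free function copy_to_heap are stand-ins made up for this illustration; copy_to_heap uses the same constraint as the constructor above, just outside of SmartPtr.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Base { virtual ~Base() {} };
struct DerivedA : Base {};
struct Unrelated {};

// std::is_base_of<B, D> asks "is B a base class of D?". The base comes first.
static_assert(std::is_base_of<Base, DerivedA>::value, "Base is a base of DerivedA");
static_assert(!std::is_base_of<DerivedA, Base>::value, "the reverse does not hold");
static_assert(!std::is_base_of<Base, Unrelated>::value, "Unrelated does not derive from Base");
// A class also counts as a base of itself, so U == T passes the check.
static_assert(std::is_base_of<Base, Base>::value, "true for the same class type");

// Hypothetical helper with the same constraint as the SmartPtr constructor:
// only participates in overload resolution if T is a base of U.
template<typename T, typename U,
         typename = std::enable_if_t<std::is_base_of<T, U>::value>>
T* copy_to_heap(U val) {
    return new U(std::move(val));
}

void check_constraint() {
    Base* p = copy_to_heap<Base>(DerivedA{});
    assert(dynamic_cast<DerivedA*>(p) != nullptr);  // the heap object is a full DerivedA
    delete p;
    // copy_to_heap<Base>(Unrelated{});  // does not compile: constraint not satisfied
}
```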

With C++20 and concepts, we can write this code a bit nicer:

template<typename U>
explicit SmartPtr(U val) requires std::derived_from<U, T> {
    ptr = new U{std::move(val)};
}

The downside of this approach is that we potentially create a copy of our val object. It would be nicer if we could construct the instance directly on the heap, without any copies! For this, we have to pass all constructor arguments to make an instance of U to the constructor of SmartPtr. However, there can be arbitrarily many constructor arguments, of arbitrary types. Luckily, C++ supports variadic templates since C++11, which can be used for situations where we have an arbitrary number of template arguments:

template<typename... Args>
explicit SmartPtr(Args... args) {
    ptr = new ??? {args...};
}

Now we have the problem that we have to tell the SmartPtr constructor explicitly which type we want to construct. Maybe we can add this type as another generic type?

template<typename U, typename... Args>
explicit SmartPtr(Args... args) {
    ptr = new U {args...};
}

But now we run into a problem of how to call this code:

SmartPtr<Base> magic_factory(bool flavor) {
    if(flavor) {
        return SmartPtr<Base>{ 42 };
    } 
    return SmartPtr<Base>{ "hello" };
}

This does not compile because the compiler can't figure out what type to use for the U template parameter. There is no good way to do this in C++, because explicitly specifying template arguments for a constructor is not supported by the language. It is for regular functions, just not for constructors. So we can use a regular function that behaves like a constructor: A factory function:

template<typename T, typename... Args>
SmartPtr<T> smart_new(Args... args) {
    auto ptr = new T{args...};
    return SmartPtr<T>{ ptr };
}

Which gets called like this:

SmartPtr<Base> magic_factory(bool flavor) {
    if(flavor) {
        return smart_new<DerivedA>(42);
    } 
    return smart_new<DerivedB>("hello");
}

Which again does not compile :( This time, the compiler is complaining that we are returning an instance of the type SmartPtr<DerivedA> from a function that returns SmartPtr<Base>. These two types are different, and the compiler can't figure out how to convert from one into the other. What we want is the same property that pointers have, namely the ability to go from U* to T*, if U derives from T. This property is called covariance, and we can enable it for our SmartPtr type by adding yet another constructor:

template<typename U> friend struct SmartPtr;

template<typename U>
SmartPtr(SmartPtr<U> other) requires std::derived_from<U, T> {
    ptr = other.ptr;
    other.ptr = nullptr; 
}

In this constructor, we steal the data from another SmartPtr object that is passed in, but this is only allowed if the other SmartPtr points to an instance of a derived type. Since SmartPtr<T> and SmartPtr<U> are different types, we also need a friend declaration so that we can access the private ptr member.
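
The difference between raw pointers and class templates can be checked at compile time. In the following sketch, PlainWrapper and ConvertibleWrapper are made-up, non-owning stand-ins for SmartPtr (with a public pointer member, so no friend declaration is needed). The sketch uses the pre-C++20 std::enable_if_t form of the constraint.

```cpp
#include <cassert>
#include <type_traits>

struct Base { virtual ~Base() {} };
struct DerivedA : Base {};

// Hypothetical wrapper without a converting constructor.
template<typename T>
struct PlainWrapper {
    T* ptr = nullptr;
};

// Hypothetical wrapper with a converting constructor, similar in spirit to SmartPtr.
template<typename T>
struct ConvertibleWrapper {
    ConvertibleWrapper() = default;

    template<typename U,
             typename = std::enable_if_t<std::is_base_of<T, U>::value>>
    ConvertibleWrapper(ConvertibleWrapper<U> other) : ptr(other.ptr) {}

    T* ptr = nullptr;
};

// Raw pointers convert from derived to base out of the box...
static_assert(std::is_convertible<DerivedA*, Base*>::value, "pointer conversion works");
// ...but two instantiations of a template are unrelated types by default...
static_assert(!std::is_convertible<PlainWrapper<DerivedA>, PlainWrapper<Base>>::value,
              "no conversion without a converting constructor");
// ...unless we add the conversion ourselves.
static_assert(std::is_convertible<ConvertibleWrapper<DerivedA>, ConvertibleWrapper<Base>>::value,
              "conversion enabled by the converting constructor");

void check_wrapper_conversion() {
    DerivedA derived;
    ConvertibleWrapper<DerivedA> from;
    from.ptr = &derived;
    ConvertibleWrapper<Base> to = from;  // uses the converting constructor
    assert(to.ptr == &derived);
}
```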

With that, we have a first working version of a type that manages single instances of types on the heap. We called it SmartPtr, because that is the name that such types go by: Smart pointers.

The problem with ownership - again

You know what is coming: Our SmartPtr type is incorrect. Look at this code:

int main() {
    SmartPtr<Base> a = magic_factory(true);
    
    {
        SmartPtr<Base> b = a;
        b.get().foo();
    }

    a.get().foo();

    return 0;
}

Here we create a copy of our heap-allocated object. We never defined a copy constructor, so we get the one that the compiler generates for us, which - as in the example of SillyVector - creates a shallow copy. Now b goes out of scope, calls delete on the underlying pointer, but we still have another pointer to the same object living inside a. When we then dereference a, we access memory that has been freed, leading to undefined behavior.

Again we have two options: Create a deep copy, or disallow copying the SmartPtr type. Let's try the first approach by adding an appropriate copy constructor:

SmartPtr(const SmartPtr<T>& other) {
    if(other.ptr) {
        ptr = new T{other.get()};
    } else {
        ptr = nullptr;
    }
}

If we run this, we are in for a surprise:

Base
DerivedA(42)

See this line right here: SmartPtr<Base> b = a;? It invokes the copy constructor for SmartPtr<Base>, which calls new Base{other.get()}. But we only store a pointer to Base, the actual type of the heap-allocated object is DerivedA! But the compiler can't know that, because the actual type is only known at runtime! So how would we ever know which copy constructor to call? What a mess...

We could spend some more time trying to find a workaround, but at this point it makes sense to look at the standard library and see what they do. What we have tried to implement with SmartPtr is available since C++11 as std::unique_ptr. For basically the same reasons that we just saw, std::unique_ptr is a non-copyable type! Which makes sense since - as the name implies - this type manages an instance of a type with unique ownership. While we can't copy a type that has unique ownership semantics, we can move it. We already did something like this, in our converting constructor from SmartPtr<U> to SmartPtr<T>: Here we stole the data from the original SmartPtr and moved it into the new SmartPtr. This way, ownership has been transferred and at no point in time were there two owners to the same piece of memory.
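
Here is a sketch of how our running example might look with std::unique_ptr. The class hierarchy is a simplified version of the one from above, with a made-up name() function instead of foo() so that the result can be checked without printing; check_unique_ownership is likewise a made-up helper. std::make_unique (available since C++14) plays the role of our smart_new function.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

struct Base {
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

struct DerivedA : Base {
    explicit DerivedA(int v) : val(v) {}
    std::string name() const override { return "DerivedA(" + std::to_string(val) + ")"; }
    int val;
};

struct DerivedB : Base {
    explicit DerivedB(std::string v) : val(std::move(v)) {}
    std::string name() const override { return "DerivedB(" + val + ")"; }
    std::string val;
};

// std::unique_ptr cannot be copied, only moved.
static_assert(!std::is_copy_constructible<std::unique_ptr<Base>>::value, "not copyable");
static_assert(std::is_move_constructible<std::unique_ptr<Base>>::value, "movable");

std::unique_ptr<Base> magic_factory(bool flavor) {
    if (flavor) {
        // unique_ptr<DerivedA> converts to unique_ptr<Base>, just like our SmartPtr
        return std::make_unique<DerivedA>(42);
    }
    return std::make_unique<DerivedB>("hello");
}

void check_unique_ownership() {
    std::unique_ptr<Base> a = magic_factory(true);
    assert(a->name() == "DerivedA(42)");

    // std::unique_ptr<Base> b = a;          // does not compile: copying is disabled
    std::unique_ptr<Base> b = std::move(a);  // ownership moves from a to b
    assert(a == nullptr);                    // a moved-from unique_ptr is empty
    assert(b->name() == "DerivedA(42)");
}   // b goes out of scope here and deletes the DerivedA object exactly once
```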

A primer on moving in C++

In Rust, moving was implicitly done on assignment. In C++, we have to do some extra work to enable moving. First, we have to make a type movable by implementing a move constructor and/or move assignment operator:

SmartPtr(SmartPtr<T>&& other) : ptr(other.ptr) {
    other.ptr = nullptr;
}

SmartPtr<T>& operator=(SmartPtr<T>&& other) {
    std::swap(ptr, other.ptr);
    return *this;
}

The move constructor has a similar signature to a copy constructor, but instead of taking a const reference (const SmartPtr<T>&), it takes something that looks like a double-reference: SmartPtr<T>&&. The double ampersand is a special piece of syntax introduced in C++11 and is called an rvalue reference. Without going into the nasty details of C++ value-categories, we can think of rvalues as everything that can appear on the right-hand side of an assignment. There are also lvalues, which are things on the left-hand side of an assignment:

int lvalue = 42;
// The variable 'lvalue' is an lvalue, the value '42' in this context is an rvalue

Another definition is that lvalues are things with a name and rvalues are things without a name. How exactly does this help us though? Looking at the concept of moving, remember that moving an instance of a type from location A to location B means effectively stealing the instance from location A. To steal it, we have to make sure that no one can use it anymore. Since the only things that we can use are things with a name, converting a value to an rvalue is what makes moving possible in C++. (Just to be clear: This is a massive oversimplification of the actual language rules of C++. Since C++ is a very complex language with lots of rules that even experts argue over, we will leave it at this simplification for the sake of this course.) Think of the value 42 in the statement above. It is a temporary value: we can't take the address of this value and thus we can't give it a name. If we were to steal this value, no one would care, because no one can name this value to access it.

Since C++ defines a special type for rvalues, it is possible to write a function that only accepts rvalues, that is to say a function that only accepts values that are temporary and unnamed. This is how the move constructor is detected by the C++ compiler: If the argument is temporary and thus unnamed, it is an rvalue and thus a match for the function signature of the move constructor. If it is a named value, it is not an rvalue and thus can't be passed to the move constructor. The following example illustrates this:

#include <iostream>
#include <utility>

struct MoveCopy {
    MoveCopy() {}
    MoveCopy(const MoveCopy&) {
        std::cout << "Copy constructor" << std::endl;
    }
    MoveCopy(MoveCopy&&) {
        std::cout << "Move constructor" << std::endl;
    }

    MoveCopy& operator=(const MoveCopy&) {
        std::cout << "Copy assignment" << std::endl;
        return *this;
    }

    MoveCopy& operator=(MoveCopy&&) {
        std::cout << "Move assignment" << std::endl;
        return *this;
    }
};

MoveCopy foo() {
    return MoveCopy{};
}

int main() {
    MoveCopy lvalue;
    // Call the copy constructor, because 'lvalue' has a name and is an lvalue
    MoveCopy copy(lvalue);

    // Call the copy assignment operator, because 'lvalue' is an lvalue
    copy = lvalue;
    // Call the move assignment operator, because the expression 'MoveCopy{}' creates a temporary object, which is an rvalue
    copy = MoveCopy{};

    // Explicitly calling the move constructor can be done with std::move:
    MoveCopy moved{std::move(lvalue)};

    return 0;
}

If we want to explicitly move an object that has a name, for example the local variable lvalue in the previous example, we have to convert it to an rvalue. For this conversion, we can use the std::move() function. std::move() is one of those weird C++ things that don't make sense at first. Look at the (simplified) implementation of std::move() in the C++ standard library:

template <class T>
inline typename remove_reference<T>::type&&
move(T&& t)
{
    typedef typename remove_reference<T>::type U;
    return static_cast<U&&>(t);
}

std::move() is just a type cast! No runtime code is generated when we call std::move(); it just takes its argument and casts it to an rvalue reference. The remove_reference stuff exists so that we can call std::move() with regular references and still get an rvalue reference back.
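
We can convince ourselves of this with a small experiment: writing the cast by hand selects the move constructor just like std::move() does. The type Counter and the function check_move_is_a_cast are made up for this sketch; Counter simply counts how often its copy and move constructors run.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper type that counts how it was constructed.
struct Counter {
    static int copies;
    static int moves;

    Counter() {}
    Counter(const Counter&) { ++copies; }
    Counter(Counter&&) { ++moves; }
};
int Counter::copies = 0;
int Counter::moves = 0;

void check_move_is_a_cast() {
    Counter::copies = 0;
    Counter::moves = 0;

    Counter original;
    Counter a{original};                          // lvalue: copy constructor
    Counter b{std::move(original)};               // rvalue via std::move: move constructor
    Counter c{static_cast<Counter&&>(original)};  // the same cast by hand: move constructor

    assert(Counter::copies == 1);
    assert(Counter::moves == 2);
}
```

Note that neither std::move() nor the static_cast moves anything by itself. The actual work happens in whichever constructor the compiler picks based on the resulting type.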

At the end of the day, we have to accept that this is just how moving an object in C++ works. Perhaps the biggest issue with moving in C++ is that the language does not enforce what happens to the object that was moved. Look at this code:

#include <iostream>
#include <string>
#include <utility>

int main() {
    std::string str{"hello"};
    std::cout << str << std::endl;

    std::string other = std::move(str);
    // WE STILL HAVE A VARIABLE ('str') TO THE MOVED OBJECT!!
    std::cout << "String after moving: '" << str << "'" << std::endl;

    return 0;
}

No one prevents us from still using the object that we just moved! What happened to the object that we moved from? The C++ standard is quite vague here, saying that moving 'leave[s] the argument in some valid but otherwise indeterminate state'. So it is something that we as programmers have to be aware of.
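
In practice, a reasonable rule of thumb is to either stop using a moved-from object or to give it a new value before reading from it again. The following sketch (the function name check_moved_from_objects is made up) shows this for std::string. It also shows that some types do document a precise moved-from state: a moved-from std::unique_ptr is guaranteed to be empty.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

void check_moved_from_objects() {
    std::string str{"hello"};
    std::string other = std::move(str);
    assert(other == "hello");

    // 'str' is still a valid std::string, but we should not rely on its contents.
    // Assigning a new value is always fine and puts it into a known state again:
    str = "fresh value";
    assert(str == "fresh value");

    // A moved-from std::unique_ptr is guaranteed to hold nullptr.
    std::unique_ptr<int> p = std::make_unique<int>(42);
    std::unique_ptr<int> q = std::move(p);
    assert(p == nullptr);
    assert(*q == 42);
}
```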