2.5. Rust and ad-hoc polymorphism using traits

When we talked about language paradigms in section 2.2, we also briefly talked about the concept of polymorphism, which enables writing one piece of code and using it with many different types. In this chapter, we will look closer at the way polymorphism is achieved in Rust, in particular at the concept of traits.

Polymorphism in systems programming

Polymorphism has become perhaps one of the most important concepts in modern programming languages. The idea that code can be written in a way that is extensible to different types without having to rewrite the code is immensely useful in practice. Polymorphism enables grouping entities by their shared features and treating them in an abstract way. In application programming, this concept is used all the time: A web-shop can sell shoes and books and computers by treating them all as products with common properties like name, price and product ID. A game treats the player, enemies, bullets and power-ups all as entities that have a graphical representation. One can come up with many more examples in this fashion, all of which get easier to program by utilizing polymorphism.

In systems programming, polymorphism traditionally played a less important role because the focus was more on utilizing specific hardware resources instead of coming up with more high-level abstractions. As systems have grown in their complexity, so did the usefulness of polymorphism, which is perhaps the reason why C++ is an object-oriented programming language where C is not. When writing systems software, we will see that there are many common properties of types that come up time and again and where it is typically worthwhile to employ polymorphism to enable our code to efficiently work with whole classes of types at once. Where a game might care about whether an entity is an enemy or a player, or whether it has a graphical representation or not, in systems software we often care about how many bytes a certain type takes up in memory, if it can be trivially copied from one memory location to another, or if it can be safely used in a multithreaded context.

A good example of this is a copy function that copies the contents of one array into another array. We want this function to work with arrays of arbitary types (as long as the type is copyable), but we also want it to be efficient. If we have an array of primitive types, say int or float, we know that we can copy the raw memory from the source array to the target array. If the type is more complex, requiring some setup for example or some dynamic memory allocation, we are not allowed to do that. Let's try writing such a function in C++:

template<typename T, unsigned N>
void copy_array(const T (&source)[N], T (&target)[N]) {
    for(unsigned idx = 0; idx < N; ++idx) {
        target[idx] = source[idx];
    }
}

This function takes two arrays source and target of the same size N and the same type T and performs a trivial loop that copies every element from source into target using the C++ copy assignment operator. While this function is correct even for non-trivial types, it might not be as efficient as we want it to be. In particular, we know that if we call it with a primitive type, we should be able to just copy the raw memory from source into target using the std::memcpy routine, which is the fastest way to copy memory without resorting to writing manual assembly code. We could write two functions, say copy_array_trivial and copy_array_complex, but that would lift the burden of choosing the right implementation to the programmer. Instead, we would like our code to work with any copyable type and select the right implementation automatically. This is where polymorphism comes in. The unfortunate thing is that there are many different approaches to polymorphism and not every approach is equally well suited for the same problem. There a three main types of polymorphism:

  • Ad-hoc polymorphism
  • Parametric polymorphism
  • Subtyping

Our function copy_array is actually using a form of polymorphism already, namely parametric polymorphism. We wrote a function that works for an arbitrary type T, so copy_array is polymorphic over the parameter T. Note that we never explicitly state what the type T is, which is the defining characteristic of parametric polymorphism: A function that works with an arbitrary, unspecified type.

In object-oriented languages, you will often find polymorphism using subtyping, where there is a common superclass or base type that defines e.g. a function signature that is then implemented by many different subtypes. We can achieve this in C++ through virtual methods. Here is a somewhat contrived example that defines a common type for copyable entities using subtyping:

#include <string>
#include <memory>
#include <cstring>

struct Copyable {
    virtual ~Copyable() {}
    virtual void copy_to(Copyable& other) const = 0;
};

struct Foo : Copyable {
    std::string val;

    explicit Foo(std::string val) : val(val) {}
    void copy_to(Copyable& other) const override {
        auto as_foo = dynamic_cast<Foo*>(&other);
        if(!as_foo) return;

        as_foo->val = val;
    }
};

struct Bar : Copyable {
    int val;

    explicit Bar(int val) : val(val) {}
    void copy_to(Copyable& other) const override {
        auto as_bar = dynamic_cast<Bar*>(&other);
        if(!as_bar) return;

        std::memcpy(&as_bar->val, &val, sizeof(int));
    }
};

std::unique_ptr<Copyable> make_random_copyable() {
    if((rand() % 2) == 0) {
        return std::make_unique<Foo>("foo");
    } else {
        return std::make_unique<Bar>(42);
    }
}

int main() {
    auto a = make_random_copyable();
    auto b = make_random_copyable();

    a->copy_to(*b);
}

Run this code

The common type Copyable defines our polymorphic method (copy_to), with specific implementations in all the subtypes (Foo and Bar). The subtype Foo wraps a complex type (std::string) and copies it using the copy assignment operator. The subtype Bar wraps a primitive type (int) and uses std::memcpy. At runtime, the correct implementation of copy_to is chosen based on the specific type that our variable a has. For this contrived example, there are many downsides, the biggest one being that we cannot (at least in C++) introduce common base types to existing types. So there is no way for us to write an efficient copy_to method for builtin types such as int, std::string or std::vector. Nonetheless, subtyping polymorphism is often useful, especially when writing functions that should be able to work with arbitrary types that satisfy some specific interface. We will see plenty examples of this throughout this lecture series.

We haven't really made any progress on our initial copy_array function however. Parametric polymorphism was too generic, and subtyping polymorphism was not usable on existing types. What we can do is using ad-hoc polymorphism. With ad-hoc polymorphism, we define a common interface for a specific, well-known set of types. This sounds similar to subtyping polymorphism, however with ad-hoc polymorphism all types are known at compile-time and the polymorphism is resolved at compile-time. This is done through providing an implementation of the interface ad-hoc, independent of all other types. Two very common realizations of ad-hoc polymorphism are function overloading and operator overloading:

#include <memory>
#include <cstring>
#include <string>

template<unsigned N>
void copy_array(const int (&src)[N], int (&target)[N]) {
    std::memcpy(target, src, N * sizeof(int));
}

template<unsigned N>
void copy_array(const std::string (&src)[N], std::string (&target)[N]) {
    for(unsigned idx = 0; idx < N; ++idx) {
        target[idx] = src[idx];
    }
}

int main() {
    int int_src[4]{1,2,3,4};
    int int_target[4];
    copy_array(int_src, int_target);

    std::string str_src[2]{"hello", "goodbye"};
    std::string str_target[2];
    copy_array(str_src, str_target);
}

Run this code

First, we provide an interface for our polymorphic function. In this case, this is the function name copy_array which takes a source array to copy from and target array to copy to. Then, we provide ad-hoc implementations of our interface for specific types, namely integer arrays and arrays of type std::string. Since both functions have the same name and return type, this is an instance of function overloading in C++. Calling the function is then done through the same syntax for both the integer array and std::string array. This might appear trivial, however we have achieved at least part of what we set out to do: The correct implementation for our function copy_vec is selected automatically by the C++-compiler, and it is impossible to call the slow std::string implementation with an integer array, or the std::memcpy implementation with a std::string array. This is an application of ad-hoc polymorphism. Compared to subtyping polymorphism, ad-hoc polymorphism can be applied to existing types quite easily.

It is worth noting that we made use of ad-hoc polymorphism in the example on parametric polymorphism as well, albeit a little hidden. Besides function overloading, operator overloading is another form of ad-hoc polymorphism. In the copy_to implementation of the Foo type, we used the copy assignment operator for copying the std::string type. This operator can be overloaded as well, based on the types of the assignment, and provides a common interface for copying types in C++.

In Rust, most polymorphism is ad-hoc polymorphism, which is why writing polymorphic code in Rust feels a little different from what one might used to in C++ or Java. In particular, while most Rust polymorphism is ad-hoc polymorphism, Rust does not support function overloading. So how then does Rust realize ad-hoc polymorphism? This is where traits come into play.

Rust traits

Traits are a special language construct in Rust that enables the programmer to introduce polymorphic types and provide ad-hoc implementations for arbitrary types. Here is a simple example of traits in Rust:

trait Foo {
    fn do_stuff(&self);
}

impl Foo for i32 {
    fn do_stuff(&self) {
        println!("i32({})", self);
    }
}

impl Foo for &str {
    fn do_stuff(&self) {
        println!("str({})", self);
    }
}

fn main() {
    42.do_stuff();
    "hello".do_stuff();
}

Run this code

Here, we define a polymorphic type Foo with a single function do_stuff. The &self parameter is Rusts way of defining a const member function for types. We then provide ad-hoc implementations of the do_stuff function for the builtin types i32 and &str. For now, we can say that i32 is equivalent to int in C++, and &str is equivalent to a C-string (const char*) in C++. We can then invoke our new polymorphic do_stuff method on an integer or character literal. The method-call syntax is a bit of syntactic sugar that Rust applies, in principle this is equivalent to writing a free-standing do_stuff(i32) or do_stuff(&str) function.

Since traits can be implemented for existing types, it is easy to add new functionality to existing types. Since traits are a realization of ad-hoc polymorphism, in most scenarios this includes no runtime overhead as the appropriate functions to call can be resolved at compile time. As a consequence, traits have to be available in the current scope to be used, which in Rust means that they have to be imported:

// Foo trait implemented in another module/crate. Without the 'use' statement, the code wouldn't compile
use other_mod::Foo;

fn main() {
    42.do_stuff();
}

Rust provides a large number of traits for common behaviour, such as printing objects, calculating hashes, comparing objects or writing to and reading from a stream. Some of these traits also solve problems particularily relevant to systems programming. For example, to identify if a type can be bitwise copied we can use the trait Copy:

fn copy_binary<T: Copy>(src: &T, dst: &mut T) {
    *dst = *src;
}

fn main() {
    let simple_src = 42;
    let mut simple_dst = 23;
    copy_binary(&simple_src, &mut simple_dst);

    let complex_src = String::from("hello");
    let mut complex_dst = String::new();
    // The next line does not compile because String does not implement the Copy trait!
    //copy_binary(&complex_src, &mut complex_dst);
}

Run this code

Here, we made use of both generic programming, to define a function copy_binary that works with arbitrary types, and traits, to constraint the types that are valid for our copy_binary function. Using a type constraint in such a way makes it possible to write generic code that requires specific functionalities from its types, without ever naming the actual types themselves. For the longest time, writing code like this was possible but quite difficult in C++. Only with the most recent version of the C++ standard, C++20, did C++ get a type constraint mechanism called concepts which works similar to what we have seen here in Rust.

As in the previous C++ example, the actual functionality that the Copy trait provides is somewhat hidden. Copy enables the copy assignment from one value to another using the = operator. In C++, this would be the default behaviour for assignment: Writing a = b; in C++ copies the value of b into a, leaving b untouched. In Rust, if b implements Copy, b is copied into a, otherwise it is moved into a, in which case b is unusable. We will learn more about move and copy in chapter 3.

Besides the Copy trait, there are also traits that determine whether a type is safely usable in a multithreaded context (namely Send and Sync). These traits, together with the concept of borrow checking, is what enables us to write efficient, safe multithreaded code in Rust, which is a highly valuable feature to have available in systems programming. We will learn more about writing multithreaded code in chapter 7.

Recap

In this chapter, we learned about traits in Rust and how they are used to achieve polymorphism. We talked about the usefullnes of polymorphism for systems programming and saw how we can write code that automatically chooses the most efficient implementation of an algorithm based on the capabilities of a type. Traits in Rust make it fairly easy to write such code because we can give generic functions type constraints, which ensure that a certain trait is implemented for all types that the function gets called with.