C++ Constructors and Move Semantics

In order to do almost anything with a class in C++, you need to define a few constructors. If you don’t, depending on usage, the compiler will generate those constructors for you. For example, consider the following code:

class Person
{
public:
    string name;
    int age;
};

The class above does not explicitly define a constructor of any kind, and if you don’t use the class at all, the compiler will just ignore it. However, if you do the following:

cout << "[p1]" << endl;
Person p1;
p1.name = "Felix Mendelssohn";
p1.age = 38;

cout << "[p2]" << endl;
Person p2(p1);

cout << "[p3]" << endl;
Person p3;
p3 = p1;

cout << "[" << &p2 << "] " <<  p2.name << " " << p2.age << endl;
cout << "[" << &p3 << "] " <<  p3.name << " " << p3.age << endl;

The compiler will then generate a parameterless (default) constructor, a copy constructor and a copy assignment operator for the class Person. The latter two copy each non-static data member of the source to the destination. The last two lines of the output look like this: two distinct objects, with identical data, stored in different places.

[0x7ffedfd275d0] Felix Mendelssohn 38
[0x7ffedfd27600] Felix Mendelssohn 38

It would be as though you had explicitly written the following class:

class Person
{
public:
    string name;
    int age;

    Person()
    {
        cout << "Person Parameterless Constructor Called" << endl;
    }

    Person (const Person& rhs)
        : name(rhs.name), age(rhs.age)
    {
        cout << "Person Copy Constructor Called" << endl;
    }

    Person& operator=(const Person& rhs)
    {
        cout << "Person Copy Assignment Operator Called" << endl;
        this->name = rhs.name;
        this->age = rhs.age;
        return *this; // an assignment operator must return the assigned-to object
    }
};

Which results in the following output.

[p1]
Person Parameterless Constructor Called
[p2]
Person Copy Constructor Called
[p3]
Person Parameterless Constructor Called
Person Copy Assignment Operator Called
[0x7ffe0c4e9df0] Felix Mendelssohn 38
[0x7ffe0c4e9e20] Felix Mendelssohn 38

Notice that for p3, the parameterless constructor is called first, followed by the copy assignment operator. This implementation is fairly straightforward, though the assignment operator should really use the copy and swap idiom.
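
As a rough sketch of what the copy and swap idiom could look like for this small class: the swap friend function and the copyAndSwapDemo check are hypothetical helpers introduced here for illustration, not part of the original Person.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Person
{
public:
    std::string name;
    int age = 0;

    Person() = default;
    Person(const Person& rhs) = default;

    // Hypothetical helper: exchanges the contents of two Person objects.
    friend void swap(Person& a, Person& b) noexcept
    {
        using std::swap;
        swap(a.name, b.name);
        swap(a.age, b.age);
    }

    // rhs is taken by value, so the copy constructor has already made
    // the copy by the time the body runs; we only swap it into place.
    Person& operator=(Person rhs)
    {
        swap(*this, rhs);
        return *this;
    }
};

// Hypothetical check: assignment copies the data and leaves the source intact.
inline void copyAndSwapDemo()
{
    Person p1;
    p1.name = "Felix Mendelssohn";
    p1.age = 38;

    Person p3;
    p3 = p1;

    assert(p3.name == "Felix Mendelssohn" && p3.age == 38);
    assert(p1.name == "Felix Mendelssohn" && p1.age == 38);

    // Self-assignment is harmless: we only ever swap with a copy.
    Person& alias = p3;
    p3 = alias;
    assert(p3.name == "Felix Mendelssohn" && p3.age == 38);
}
```

The appeal of this design is that all the risky work (the copy) happens before the target object is touched, so a failed copy leaves the target unchanged.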

Big Objects and Performance

The above example should work fine, given that we are dealing with small amounts of data, names don’t get that long…

…Ah, you’re back. If the amount of data within an object does become large, then the copy constructor can sometimes lead to a great deal of inefficiency. Let’s rework the class to demonstrate this.

#include <cstring>  // memcpy
#include <iostream>
#include <string>

using namespace std;

const int IMAGE_SIZE = 1024;

class Person
{
private:   
    string name;
    int age;
    unsigned char* image;
    
    void loadImage()
    {
        cout << "loadImage called" << endl;
        image = new unsigned char[IMAGE_SIZE]; // Imagine we fetch the image here
    }

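    void copyImage(const unsigned char* source)
    {
        cout << "copyImage called" << endl;
        if (source == NULL) // the source has no image, so there is nothing to copy
        {
            image = NULL;
            return;
        }
        image = new unsigned char[IMAGE_SIZE];
        memcpy(image, source, IMAGE_SIZE);
    }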
    
public:
    Person()
        : age(0), image(NULL) // no image yet, so the destructor has nothing to free
    {
        cout << "Person Parameterless Constructor Called" << endl;
    }

    Person(const string name, const int age)
        : name(name), age(age)
    {
        cout << "Person Constructor Called" << endl;
        loadImage();
    }

    Person (const Person& rhs)
        : name(rhs.name), age(rhs.age)
    {
        cout << "Person Copy Constructor Called" << endl;
        copyImage(rhs.image);
    }

    Person& operator=(const Person& rhs)
    {
        cout << "Person Copy Assignment Operator Called" << endl;
        if (this != &rhs) // guard against self-assignment
        {
            this->name = rhs.name;
            this->age = rhs.age;
            delete[] image; // release the old buffer before replacing it
            copyImage(rhs.image);
        }
        return *this;
    }

    ~Person()
    {
        cout << "Destructor called" << endl;
        delete[] image;
    }

    void printDetails()
    {
        cout << "[" << this << "] " <<  name << " " << age << endl;
    }

};

int main()
{
    cout << "[p1]" << endl;

    Person p1("Felix Mendelssohn", 38);

    cout << "[p2]" << endl;
    Person p2(p1);

    cout << "[p3]" << endl;
    Person p3;
    p3 = p1;
}

A note on memory management here: it would be better to let a smart pointer manage image, but that’s a detail we can put aside for the time being. Notice also that we now use a proper constructor without touching the inner members of the class. Yay, encapsulation.
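
For the curious, here is a minimal sketch of what the smart pointer variant might look like, assuming std::unique_ptr<unsigned char[]> as the owner of the buffer. The hasImage and imageData accessors, the smartPointerDemo check and the zero-filled stand-in for the fetched image are all hypothetical additions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <utility>

const std::size_t IMAGE_SIZE = 1024;

class Person
{
private:
    std::string name;
    int age = 0;
    // unique_ptr frees the buffer automatically, so no destructor is needed.
    std::unique_ptr<unsigned char[]> image;

public:
    Person() = default;

    Person(std::string name, int age)
        : name(std::move(name)), age(age),
          image(new unsigned char[IMAGE_SIZE]()) // zero-filled stand-in for a fetched image
    {
    }

    // Copying still has to duplicate the buffer by hand.
    Person(const Person& rhs)
        : name(rhs.name), age(rhs.age)
    {
        if (rhs.image)
        {
            image.reset(new unsigned char[IMAGE_SIZE]);
            std::memcpy(image.get(), rhs.image.get(), IMAGE_SIZE);
        }
    }

    Person& operator=(const Person& rhs)
    {
        Person copy(rhs); // if this throws, *this is left untouched
        name = std::move(copy.name);
        age = copy.age;
        image = std::move(copy.image); // the old buffer is released here
        return *this;
    }

    // Hypothetical accessors, only here so the sketch can be checked.
    bool hasImage() const { return image != nullptr; }
    const unsigned char* imageData() const { return image.get(); }
};

// Hypothetical check: every copy owns its own, separate buffer.
inline void smartPointerDemo()
{
    Person p1("Felix Mendelssohn", 38);
    Person p2(p1);
    assert(p1.hasImage() && p2.hasImage());
    assert(p1.imageData() != p2.imageData());

    Person p3;
    assert(!p3.hasImage());
    p3 = p1;
    assert(p3.hasImage() && p3.imageData() != p1.imageData());
}
```

The rest of this post sticks with the raw pointer, because it makes the ownership hand-over easier to see.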

The important thing, though, is that there is now a non-trivial amount of work involved in creating each Person object. The constructor needs to read a new image from storage using loadImage, while both the copy constructor and the copy assignment operator need to create a copy using copyImage. If you now pass a Person object to a function by value, additional work is done to copy the values within the Person.

void passByValue(Person person)
{
}

int main()
{
    Person p1("Felix Mendelssohn", 38);
    passByValue(p1);
}

This creates the following output.

Person Constructor Called
loadImage called
Person Copy Constructor Called
copyImage called
Destructor called
Destructor called

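Notice that we are now creating and destroying two distinct objects. You might expect the same output even if you create the person only to pass it as a parameter, for example if you want to add the person to a collection of some sort. In practice most compilers elide that copy, and C++17 requires them to, so the output shown for the next example is what you get only with copy elision disabled (for example with g++ -fno-elide-constructors before C++17).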

void addToList(Person person)
{
}

int main()
{
    addToList(Person("Felix Mendelssohn", 38));
}

Without copy elision, this creates the following output.

Person Constructor Called
loadImage called
Person Copy Constructor Called
copyImage called
Destructor called
Destructor called

You could of course pass it by reference (addToList(const Person& person)), but if addToList wants to keep its own Person, you still have to copy the image data from the object passed by reference. What you need is a way to move ownership of the passed object’s image member to a new object inside addToList, and that brings us to the somewhat mystic notion of lvalues and rvalues.

lvalues and rvalues

Historically, an lvalue is an expression that can appear on the left hand side of an assignment operation. For example:

int a = 10;

In the above a is an lvalue. Lvalues represent locations in memory that can be used to store objects in.

An rvalue is anything that’s not an lvalue. In the above example, 10 is an rvalue because you can’t assign to it.

10 = a; // Not allowed

That’s the basic definition. The specifics can get quite nuanced and I will avoid repetition in order to get back to the substance of this post.
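
To make the distinction a little more concrete, here is a small sketch that lets the compiler tell us which category an expression falls into. The describe overloads and valueCategoryDemo are hypothetical names made up for this illustration; the && syntax they rely on is explained in the next section.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload pair: the compiler picks one based on the
// value category of the argument.
inline std::string describe(const std::string&) { return "lvalue"; }
inline std::string describe(std::string&&)      { return "rvalue"; }

inline void valueCategoryDemo()
{
    std::string name = "Felix";

    assert(describe(name) == "lvalue");                 // a named object
    assert(describe(std::string("Felix")) == "rvalue"); // a temporary
    assert(describe(name + "!") == "rvalue");           // the result of an expression
    assert(describe(std::move(name)) == "rvalue");      // an lvalue cast to an rvalue
}
```

Nothing is actually moved here, because describe never touches its argument; the overloads only report which one the compiler selected.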

The Substance of This Post

Let’s take a look at a slightly modified version of the code:

const int IMAGE_SIZE = 1024;

class Person
{
private:   
    string name;
    int age;
    unsigned char* image;
    
    ...
    
public:
    ...

    Person (Person&& rhs)
        : name(move(rhs.name)), age(rhs.age) // move the string too, rather than copy it
    {
        cout << "Person Move Constructor Called" << endl;

        // Move the image from one object to the other
        unsigned char* image = rhs.image;
        rhs.image = NULL;
        this->image = image;
    }

    ...
};

void addToList(Person person)
{
    cout << "addToList called for ";
    person.printDetails();
}

int main()
{
    Person felix("Felix Mendelssohn", 38);
    addToList(felix); // (1) lvalue
    addToList(move(felix)); // (2) rvalue
}

This creates the following output.

Person Constructor Called
loadImage called
Person Copy Constructor Called
copyImage called
addToList called for [0x7fff3c990500] Felix Mendelssohn 38
Destructor called
Person Move Constructor Called
addToList called for [0x7fff3c990530] Felix Mendelssohn 38
Destructor called
Destructor called

This adds two new things. The first is the move constructor Person (Person&& rhs), which takes a strange Person&& as its parameter (as though & wasn’t overloaded enough already in C++). The second is the call to addToList, augmented with a strange call to the standard library move function. Note that the above code requires C++11 or later; with G++ versions that default to an older standard, pass the -std=c++11 flag.

Let’s deal with Person&& first (read ‘person ref ref’). This is what’s called an rvalue reference. It works much like an lvalue reference (Person (const Person& rhs)), but it binds only to rvalues, which lets us overload on the two reference types and so distinguish lvalue arguments from rvalue arguments. The first call, addToList(felix), passes the person into addToList as an lvalue, causing the copy constructor to be called.

The second call, addToList(move(felix)), passes the person into addToList as an rvalue, causing the move constructor to be called. This magic is done by the std::move function, which is in essence a static cast to an rvalue reference: it unconditionally converts the lvalue into an rvalue, which can then be passed to the rvalue-accepting move constructor. The result is two different behaviours that depend on the semantics (meaning) of what you are trying to do. The call addToList(move(felix)) therefore invokes the move constructor, shown again below.
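
To see that there is no magic in the cast itself, here is a sketch of a function that does roughly what std::move does. castToRvalue and castDemo are hypothetical names used only for illustration; real standard library implementations may differ in detail.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for std::move, for illustration only.
template <typename T>
constexpr typename std::remove_reference<T>::type&& castToRvalue(T&& value) noexcept
{
    return static_cast<typename std::remove_reference<T>::type&&>(value);
}

inline void castDemo()
{
    std::string source = "Felix Mendelssohn";

    // The cast alone moves nothing: ref is just another name for source.
    std::string&& ref = castToRvalue(source);
    assert(&ref == &source);
    assert(source == "Felix Mendelssohn");

    // The string's move constructor is what transfers the contents.
    std::string target(castToRvalue(source));
    assert(target == "Felix Mendelssohn");
}
```

The cast on its own moves nothing; the work happens in whichever constructor or operator receives the rvalue, which for Person is the move constructor: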

Person (Person&& rhs)
: name(move(rhs.name)), age(rhs.age) // move the string too, rather than copy it
{
    cout << "Person Move Constructor Called" << endl;

    // Move the image from one object to the other
    unsigned char* image = rhs.image;
    rhs.image = NULL;
    this->image = image;
}

The move constructor behaves differently from the copy constructor. Because it is invoked with the meaning ‘move the internals of the rhs object into this one’, it can set its image pointer to the internal buffer provided by the incoming rhs object. As you would expect, this is much more efficient than allocating a new buffer and copying the contents of the old one into it. It does, however, scar the incoming rhs object: it sets rhs.image to NULL to ensure that two different objects don’t try to release the same buffer during destruction. This also means that once an object has been moved from, as felix was in this case, it should not be used again without first giving it a new value, as its internal state is no longer meaningful. This is also why the move constructor takes Person&& rather than const Person&&, as a const parameter would preclude us from altering rhs.
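
The point about const can be checked with a small sketch. Tracker and constMoveDemo are hypothetical names; the type only records which constructor created it.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that records how it was constructed.
struct Tracker
{
    bool copied = false;
    bool moved = false;

    Tracker() = default;
    Tracker(const Tracker&) : copied(true) {}
    Tracker(Tracker&&) noexcept : moved(true) {}
};

inline void constMoveDemo()
{
    Tracker plain;
    Tracker fromPlain(std::move(plain)); // Tracker&& binds: move constructor
    assert(fromPlain.moved && !fromPlain.copied);

    const Tracker fixed{};
    // std::move(fixed) yields a const Tracker&&, which cannot bind to
    // Tracker&&, so overload resolution falls back to the copy constructor.
    Tracker fromConst(std::move(fixed));
    assert(fromConst.copied && !fromConst.moved);
}
```

In other words, calling move on a const object compiles without complaint but quietly copies, which is worth keeping in mind when a move you expected does not seem to happen.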

Like the copy constructor and the copy assignment operator, the move constructor has its move assignment counterpart. It looks like this:

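Person& operator=(Person&& rhs)
{
    cout << "Person Move Assignment Operator Called" << endl;
    if (this != &rhs) // guard against self-assignment
    {
        this->name = move(rhs.name);
        this->age = rhs.age;

        // Release our own buffer first; this assumes every constructor
        // initializes image (to NULL when there is no image yet)
        delete[] this->image;
        this->image = rhs.image;
        rhs.image = NULL;
    }

    return *this;
}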

int main()
{
    Person felix("Felix Mendelssohn", 38);
    Person anotherFelix;
    anotherFelix = move(felix); // rvalue
}

This creates the following output.

Person Constructor Called
loadImage called
Person Parameterless Constructor Called
Person Move Assignment Operator Called
Destructor called
Destructor called

Together, the rvalue references of C++11 and the move constructor and move assignment operator built on them allow you to transfer ownership of an object’s resources from one place to another.
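
As a closing sketch, here is how that transfer might look when the destination is a real collection, assuming a std::vector as the list. This cut-down Person, its static counters and containerDemo are hypothetical and exist only to count what happens.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical cut-down Person that counts copies and moves.
struct Person
{
    static int copies;
    static int moves;

    std::string name;

    explicit Person(std::string name) : name(std::move(name)) {}
    Person(const Person& rhs) : name(rhs.name) { ++copies; }
    Person(Person&& rhs) noexcept : name(std::move(rhs.name)) { ++moves; }
};

int Person::copies = 0;
int Person::moves = 0;

inline void containerDemo()
{
    std::vector<Person> people;
    people.reserve(2); // avoid reallocation so the counts stay predictable

    Person felix("Felix Mendelssohn");
    people.push_back(felix);            // lvalue: copied into the vector
    people.push_back(std::move(felix)); // rvalue: moved into the vector

    assert(Person::copies == 1);
    assert(Person::moves == 1);
    assert(people[1].name == "Felix Mendelssohn");
}
```

Marking the move constructor noexcept is a deliberate choice in this sketch: std::vector is allowed to move its elements when it grows only if doing so cannot throw, and otherwise falls back to copying them.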