Intro

Have you ever thought about how a class is structured in memory? How the data and functions are laid out? What happens when inheritance is involved? In this post, I’ll walk through the answers.

As quoted from § 12.1.17 of N4649 (C++17):

Non-static data members of a (non-union) class with the same access control are allocated so that later members have higher addresses within a class object. The order of allocation of non-static data members with different access control is unspecified. Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions and virtual base classes.

Non-static data members under a given access specifier are allocated in order of their declaration. Only the memory layout of standard-layout types (which, among other restrictions, have all data members under the same access control) is pinned down by the standard, and these are only a small subset of all types. Everything else is left to the implementation. An object needs space for its data members, base class subobjects, virtual function and virtual base management, padding, and so on. How these are arranged is up to the implementation, but most compilers follow the Itanium C++ ABI, part of which specifies the object layout. GCC uses it. So most of this post is about implementation details. To follow along, you can dump the object layout with the -fdump-lang-class option in GCC (-fdump-class-hierarchy before GCC 8) and with -cc1 -fdump-record-layouts -fdump-vtable-layouts -emit-llvm in Clang. The discussion below is mostly based on GCC; Clang has some subtle differences, and I have no idea what MSVC does behind the scenes.
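The ordering rule can be checked directly on a standard-layout type. The following is a minimal sketch; the struct Sample and the helpers OffsetsIncrease and PaddingBytes are names I made up for illustration, and the exact offsets depend on your compiler's alignment rules.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical standard-layout type: all members share one access level.
struct Sample {
    char   c;  // declared first
    double d;  // declared second, usually preceded by padding
    int    i;  // declared third
};

static_assert(std::is_standard_layout<Sample>::value,
              "offsetof is only guaranteed for standard-layout types");

// True when members declared later sit at higher offsets.
bool OffsetsIncrease() {
    return offsetof(Sample, c) < offsetof(Sample, d) &&
           offsetof(Sample, d) < offsetof(Sample, i);
}

// Bytes of the object that belong to no member, i.e. padding.
std::size_t PaddingBytes() {
    return sizeof(Sample) - (sizeof(char) + sizeof(double) + sizeof(int));
}
```

Calling assert(OffsetsIncrease()) from main should pass on any conforming compiler, while the value of PaddingBytes() is platform-specific.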

Simple Class

For a class without any inheritance involved,

class Object {
public:
    double GetVal() const {
      return val;
    }
private:
    double val;
};

The non-static member function Object::GetVal is placed in the code segment and is treated as a global function by the compiler. From the perspective of the compiler, this function may look as follows,

inline double Object::GetVal(const Object* this) {
  return this->val;
}

Data members are accessed through the implicit object parameter, the pointer this, which is cv-qualified when the member function is declared cv-qualified (hence the const Object* above). The call may look like this,

// double val = obj.GetVal();
double val = GetVal(&obj); // resolved like global free functions
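To make the transformation concrete, here is a small sketch that writes the free-function form by hand. Object_GetVal and its parameter name self are hypothetical (a real compiler emits a mangled name and you cannot name a parameter this), and the constructor is added only so the example can be initialized.

```cpp
#include <cassert>

class Object {
public:
    explicit Object(double v) : val(v) {}
    double GetVal() const { return val; }
    // Let the illustrative free function read the private member.
    friend double Object_GetVal(const Object* self);
private:
    double val;
};

// Hand-written equivalent of the member function: the hidden `this`
// becomes an explicit first parameter, const because GetVal is const.
double Object_GetVal(const Object* self) {
    assert(self != nullptr);
    return self->val;
}
```

For any Object obj, obj.GetVal() and Object_GetVal(&obj) return the same value.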

Don’t worry about name collisions once member functions are converted to global free functions: compilers mangle these function names so that every function has a unique name. For example, the GetVal function is mangled as _ZNK6Object6GetValEv, and you can use a demangling tool like c++filt to recover the original function name.

> c++filt _ZNK6Object6GetValEv
Object::GetVal() const

Static members are all treated as global variables or global functions, so they take no space in the object. A simple class instance only needs space for its data members (alignment padding aside), which is just val here. Objects of type Object are laid out like the image below.
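A quick way to convince yourself is to compare sizes. In this sketch the types Plain and WithExtras and the helper SameSizeAsPlain are made-up names; the equality is what GCC and Clang give in practice, not something the standard promises.

```cpp
#include <cassert>

struct Plain {
    double val;
};

struct WithExtras {
    double val;
    static int counter;                    // static data lives outside objects
    double GetVal() const { return val; }  // code is not stored per object
    static int Count() { return counter; }
};

int WithExtras::counter = 0;  // the single definition of the static member

// Member functions and static members should not change the object size.
bool SameSizeAsPlain() { return sizeof(WithExtras) == sizeof(Plain); }
```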

Single Inheritance

Say I have a base class and a derived class, without any virtual functions.

class Base {
public:
    double b;
};

class Derived : public Base {
public:
    double d;
};

An object of Derived is laid out by simply placing the Base subobject first and Derived’s own data members after it.
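The stacking can be observed by measuring addresses on a live object. This is a sketch under the assumption of an Itanium-ABI compiler such as GCC or Clang; the helper OffsetIn is a name I introduce here.

```cpp
#include <cassert>
#include <cstddef>

struct Base { double b; };
struct Derived : Base { double d; };

// Byte offset of `p` from the start of the complete object `obj`.
std::ptrdiff_t OffsetIn(const Derived& obj, const void* p) {
    return static_cast<const char*>(p) -
           reinterpret_cast<const char*>(&obj);
}
```

With Derived obj{}, I would expect OffsetIn(obj, &obj.b) to be 0 and OffsetIn(obj, &obj.d) to equal sizeof(Base), i.e. d sits right after the Base subobject.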

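If a pointer to the base class points at a derived object, it can only see the Base subobject part. This resembles object slicing, except that nothing is copied: the Derived part is still there, just not reachable through the pointer’s static type.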

Base* pB = new Derived(); // base pointer to derived object
pB->b;    // OK: access the data in the Base subobject
// pB->d; // error: d is not visible through a Base*

The span of pB encompasses only the Base subobject of Derived. This rule still applies under multiple inheritance.

Things become complicated when virtual functions get involved. Most compilers use virtual tables to manage virtual functions.

If you have virtual functions,

class Base {
public:
    virtual ~Base();
    virtual void Foo();
    virtual void Bar();
    double b;
};

class Derived : public Base {
public:
    void Foo() override; // override
    virtual void Baz(); // new virtual function
    double d;
};

The memory layout will be like this,

[Figure: memory layouts of Base and Derived objects]

If a class has any virtual functions, it has a pointer vptr, placed at the beginning of an object, to the associated virtual table, where these virtual function pointers are stored. The virtual table is constructed during compile time. For each virtual function a class has, a slot in the vtable is used to store the function address.
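The hidden vptr shows up in sizeof. Below is a rough sketch with two made-up types, PlainPoint and VirtualPoint, that differ only in having a virtual function; the helper HasHiddenPointer is also my own name.

```cpp
#include <cassert>

struct PlainPoint {
    double x;
};

struct VirtualPoint {
    virtual ~VirtualPoint() = default;  // makes the class polymorphic
    double x;
};

// A polymorphic object should be at least one pointer larger.
bool HasHiddenPointer() {
    return sizeof(VirtualPoint) >= sizeof(PlainPoint) + sizeof(void*);
}
```

On a typical 64-bit target I would expect sizeof(PlainPoint) to be 8 and sizeof(VirtualPoint) to be 16, though only the inequality is checked here.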

Note: In reality, the vtable holds two destructor entries for a class, the complete object destructor and the deleting destructor. For simplicity, I only show one in this post.

When you call a virtual function through a base pointer, even though only the Base subobject is visible to the pointer, the call can reach the derived function through the vptr. At compile time we don’t know the exact type of the object the base pointer addresses, but we do know which slot stores the address of the called function.

Base* pB = new Derived();
// Base* pB = new Base();
pB->Foo(); // can't determine the exact type until runtime

The compiler can determine that Foo is located in the second slot of the vtable, i.e. index 1 counting from the address pointed to by vptr (index 2 once both destructor entries are counted), and it internally transforms the call to

(*pB->vptr[1])(pB);

The vptr is handled entirely by the compiler; it is set when the object is constructed.
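To see the mechanism without any compiler magic, the dispatch can be imitated with a plain array of function pointers. This is only a hand-rolled model, not what GCC actually emits: Obj, Slot, the two tables and the Call helpers are all hypothetical names, the destructor slot is left out (so Foo is at index 0 here), and the functions return integers just so the result can be observed.

```cpp
#include <cassert>

struct Obj;                     // stand-in for a polymorphic object
using Slot = int (*)(Obj*);     // each slot takes the object as `this`

struct Obj {
    const Slot* vptr;           // set at construction, points at a table
};

int Base_Foo(Obj*)    { return 1; }
int Base_Bar(Obj*)    { return 2; }
int Derived_Foo(Obj*) { return 10; }  // overrides Foo

// One table per class, built once and shared by all its objects.
const Slot kBaseVtable[]    = { &Base_Foo,    &Base_Bar };
const Slot kDerivedVtable[] = { &Derived_Foo, &Base_Bar };  // Bar inherited

// The slot index is fixed at compile time; the table is picked at run time.
int CallFoo(Obj* p) { return (*p->vptr[0])(p); }
int CallBar(Obj* p) { return (*p->vptr[1])(p); }
```

An Obj initialized with kDerivedVtable reaches Derived_Foo through CallFoo, while CallBar still lands in Base_Bar, which mirrors how an inherited virtual function keeps its slot.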

You may notice there are other things besides the virtual function pointers. Actually, the address stored in the object’s vptr is not the beginning of the virtual table. It is called the address point of the virtual table. In this case, above the address point, the virtual table also contains the typeinfo pointer and the offset to top, which we’ll talk about later.

Multiple Inheritance

Under a single inheritance hierarchy, the virtual function mechanism is well behaved; it is both efficient and easily modeled. What does the world look like under a multiple inheritance hierarchy?

struct Base1 {
    virtual ~Base1();
    virtual void Foo();
    double b1;
};

struct Base2 {
    virtual ~Base2();
    virtual void Bar();
    double b2;
};

struct Derived: public Base1, public Base2 {
    ~Derived();
    void Foo() override;
    double d;
};

The memory layout is like this,

For each base class that has a virtual table, the derived object has a corresponding vptr. This is reasonable because each of these subobjects needs its own vptr. When a base pointer is assigned the address of a derived object, it should point to the matching subobject, which is not necessarily the beginning of the object. If the base class is the first one to be inherited, it works the same as the single-inheritance case. Consider the following case,

Base1* pB = new Derived();
pB->Foo();

It works fine. But what if we use Base2 here?

Base2* pB = new Derived();
// generated code
// Derived* tmp = new Derived();
// Base2* pB = (Base2*)((char*)tmp + sizeof(Base1));

The pointer pB is adjusted to the subobject Base2 here. If we call pB->Bar(), the function resolution is like what I said before. How about deleting the object through the base pointer?
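The adjustment is easy to observe by comparing addresses. The sketch below gives the classes trivial inline bodies so it links on its own; the helper Base2Offset is a name I made up.

```cpp
#include <cassert>
#include <cstddef>

struct Base1 {
    virtual ~Base1() = default;
    virtual void Foo() {}
    double b1 = 0;
};

struct Base2 {
    virtual ~Base2() = default;
    virtual void Bar() {}
    double b2 = 0;
};

struct Derived : Base1, Base2 {
    void Foo() override {}
    double d = 0;
};

// Byte distance from the start of `obj` to its Base2 subobject.
std::ptrdiff_t Base2Offset(const Derived& obj) {
    const Base2* sub = &obj;  // the implicit conversion applies the adjustment
    return reinterpret_cast<const char*>(sub) -
           reinterpret_cast<const char*>(&obj);
}
```

With GCC on a 64-bit target I would expect Base2Offset to return sizeof(Base1); a static_cast back to Derived* undoes the adjustment.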

Base1* pB = new Derived();
delete pB; // this is OK since pB points to the beginning

Base2* pB = new Derived();
delete pB; // it must be adjusted back to the beginning

Even if the corresponding slot stored the correct destructor Derived::~Derived(), for things to work the pointer pB, i.e. this in the function parameter, must be readjusted to the beginning of the object. Of course, the adjustment could be computed at runtime. But for runtime efficiency, a thunk function is generated at compile time since the offset is already known. The thunk function might look as follows:

void thunk_to_Derived_Destructor(Base2* this) {
    // step back by the byte offset of the Base2 subobject
    Derived* self = (Derived*)((char*)this - sizeof(Base1));
    Derived::~Derived(self);
}

This explains what the second slot of the Base2 subobject vtable, “thunk to Derived::~Derived”, is used for.
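The effect of the thunk can be observed from the outside: delete through a Base2* and record the this that the destructor receives. This is a sketch with simplified classes; the seen_this member and the helper DeleteThroughBase2AdjustsThis are hypothetical, and addresses are stored as integers so that nothing is compared after the object is freed.

```cpp
#include <cassert>
#include <cstdint>

struct Base1 { virtual ~Base1() = default; double b1 = 0; };
struct Base2 { virtual ~Base2() = default; double b2 = 0; };

struct Derived : Base1, Base2 {
    explicit Derived(std::uintptr_t* out) : seen_this(out) {}
    // Record the `this` the destructor actually runs with.
    ~Derived() override { *seen_this = reinterpret_cast<std::uintptr_t>(this); }
    std::uintptr_t* seen_this;
    double d = 0;
};

// True when ~Derived() ran with `this` at the start of the full object,
// even though delete was given a pointer to the Base2 subobject.
bool DeleteThroughBase2AdjustsThis() {
    std::uintptr_t seen = 0;
    Derived* full = new Derived(&seen);
    Base2* pB = full;  // points into the middle of the object
    const auto start  = reinterpret_cast<std::uintptr_t>(full);
    const auto middle = reinterpret_cast<std::uintptr_t>(pB);
    delete pB;         // dispatched through the Base2 vtable
    return seen == start && seen != middle;
}
```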

Little Compiler Trick

When the leftmost base class has no virtual functions, compilers move the first base class with virtual functions to the beginning of the object, so that the object always begins with the vptr. This base class is also called the primary base. Let’s make a small change to the above case so that Base1 no longer has any virtual functions.

struct Base1 {
    void Foo();
    double b1;
};

The memory will be like,

Although Base1 is the leftmost base class of Derived, due to the lack of any virtual functions for Base1, the compiler changes the order in which the final object is composed.
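The reordering can be measured too. This sketch assumes an Itanium-ABI compiler such as GCC or Clang, since the standard itself says nothing about where base subobjects go; the helper template OffsetOf is my own name.

```cpp
#include <cassert>
#include <cstddef>

struct Base1 {            // no virtual functions anymore
    void Foo() {}
    double b1 = 0;
};

struct Base2 {
    virtual ~Base2() = default;
    virtual void Bar() {}
    double b2 = 0;
};

struct Derived : Base1, Base2 {
    double d = 0;
};

// Byte offset of the BaseT subobject inside a Derived object.
template <typename BaseT>
std::ptrdiff_t OffsetOf(const Derived& obj) {
    const BaseT* sub = &obj;
    return reinterpret_cast<const char*>(sub) -
           reinterpret_cast<const char*>(&obj);
}
```

Under that assumption, OffsetOf<Base2> is 0 because Base2 is the primary base, and OffsetOf<Base1> is greater than 0 even though Base1 is listed first.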

Virtual Inheritance

To solve the Diamond Inheritance Problem, C++ introduces the virtual inheritance mechanism to guarantee only one copy of virtually inherited bases under the diamond hierarchy. Consider the following code fragment,

struct VBase {
    virtual void Foo();
    double v;
};

struct Base1 : virtual public VBase {
    void Foo() override;
    virtual void Bar();
    double b1;
};

struct Base2 : virtual public VBase {
    virtual void Baz();
    double b2;
};

struct Derived: public Base1, public Base2 {
    double d;
};

To keep only one virtual base in the derived object, a class is split into two parts: the virtual base subobject, which will be shared in the diamond inheritance case, and everything else. The virtual base is also located through the virtual table pointer. Let’s first look at the memory layout of Base1, because in essence there is no big difference between Derived and Base1.

Two new kinds of entries are added to the vtable: vbase offsets and vcall offsets.

Virtual Base(Vbase) Offset

This offset is used when you access the virtual bases of an object. Suppose a pointer of type VBase* is pointed at a Base1 object. To get the real address of the VBase subobject, the pointer is adjusted by this offset.

VBase* pV = new Base1();
// generated code
// Base1* tmp = new Base1();
// VBase* pV = (VBase*)((char*)tmp + vbase_offset);
// or
// pV = (VBase*)((char*)tmp + tmp->vptr[index]); // index is determined at compile time, -3 here

The vbase offset is 16 here, which tells the compiler to skip 16 bytes from the start of the object to reach the virtual base. The reason for going through the vtable is that the offset of a virtual base differs depending on which class it is inherited into. No such problem arises for a non-virtual base class, since its offset is fixed at compile time.
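That the offset really varies can be checked by measuring it for a standalone Base1 and for the Base1 part of a Derived. This is a sketch with inline function bodies added so it links; VBaseOffsetFrom is a hypothetical helper, and the concrete numbers depend on the ABI.

```cpp
#include <cassert>
#include <cstddef>

struct VBase {
    virtual void Foo() {}
    double v = 0;
};

struct Base1 : virtual VBase {
    void Foo() override {}
    virtual void Bar() {}
    double b1 = 0;
};

struct Base2 : virtual VBase {
    virtual void Baz() {}
    double b2 = 0;
};

struct Derived : Base1, Base2 {
    double d = 0;
};

// Distance in bytes from a Base1 subobject to the VBase it uses.
// The function cannot know the dynamic type, so the conversion has to
// look the offset up at run time.
std::ptrdiff_t VBaseOffsetFrom(const Base1& b1) {
    const VBase* vb = &b1;
    return reinterpret_cast<const char*>(vb) -
           reinterpret_cast<const char*>(&b1);
}
```

The same function returns different offsets for a Base1 object and for a Derived object, while both paths through Derived end at the one shared VBase.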

Virtual Call(Vcall) Offset

Vcall offsets play a similar role to vbase offsets. Consider the case

VBase* pV = new Base1();
pV->Foo();

Since the function Foo is overridden, the pointer needs to be readjusted back to the beginning of the Base1 object. As in thunk functions, an offset is added to the this pointer before the correct function instance is called. The difference is that the offset is stored in the vcall offset slot, so two indirections are performed to invoke the function. This special kind of thunk is called a virtual thunk. The generated code may look like this,

void virtual_thunk_to_Base1_Foo(VBase* this) {
    // every virtual call has a corresponding vcall offset index
    ptrdiff_t vcall_offset = this->vptr[index];
    Base1* self = (Base1*)((char*)this + vcall_offset);
    Base1::Foo(self);
}
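The adjustment done by the virtual thunk is visible if the overrider reports the this it was called with. In this sketch Foo is changed to return a pointer purely for observation, the classes are trimmed down, and ThisSeenByFoo is a made-up helper.

```cpp
#include <cassert>

struct VBase {
    virtual const void* Foo() { return this; }
    double v = 0;
};

struct Base1 : virtual VBase {
    const void* Foo() override { return this; }  // `this` is a Base1* here
    double b1 = 0;
};

struct Base2 : virtual VBase {
    double b2 = 0;
};

struct Derived : Base1, Base2 {
    double d = 0;
};

// Calls Foo() through a VBase* and returns the `this` the overrider saw.
const void* ThisSeenByFoo(VBase* pV) { return pV->Foo(); }
```

Whether the object is a plain Base1 or a Derived, the call made through the VBase pointer arrives in Base1::Foo with this pointing at the Base1 subobject, not at the VBase the caller held.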

Now, the more complicated Derived class memory layout can be understood.

Let’s compare the layout with Base1’s. Due to the double member d and the Base2 subobject, the offset of VBase is different from that in Base1. To share the same thunk to Base1::Foo between Base1 and Derived, the vcall offset and virtual thunk are used. This results in fewer thunks, which may cause fewer instruction cache misses. The trade-off is one extra load before the offset is added. According to the Itanium C++ ABI Examples, this trade-off is worthwhile:

Since the offset is smaller than the code for a thunk, the load should miss in cache less frequently, so better cache miss behavior should produce better results in spite of the 2 or more cycles required for the vcall offset load.

Pure Virtual Functions

Pure virtual functions are a special kind of virtual function: they need not be implemented, and classes with pure virtual functions cannot be instantiated.

Since a pure virtual function may have no definition, the ABI specifies a special function, __cxa_pure_virtual, as the placeholder for all pure virtual functions. The corresponding virtual function pointer in the virtual table points to this placeholder.
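From the language side, the restriction looks like this. Shape, Square and AreaOf are names I picked for illustration; the placeholder itself is an ABI detail that this sketch does not touch.

```cpp
#include <cassert>
#include <type_traits>

struct Shape {
    virtual ~Shape() = default;
    virtual double Area() const = 0;  // pure: Shape's own slot has no real target
};

struct Square : Shape {
    explicit Square(double s) : side(s) {}
    double Area() const override { return side * side; }  // fills the slot
    double side;
};

static_assert(std::is_abstract<Shape>::value, "Shape cannot be instantiated");
static_assert(!std::is_abstract<Square>::value, "Square overrides Area");

// Works for any concrete shape; dispatches through the vtable.
double AreaOf(const Shape& s) { return s.Area(); }
```

Writing Shape s; is rejected by the compiler, while AreaOf(Square(2.0)) goes through the slot that Square filled in.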

Other Components in the Virtual Table

So far, we have covered only three things in the virtual table: virtual function pointers, vbase offsets, and vcall offsets. The remaining two, the offset to top and the typeinfo pointer, are explained here. They have little to do with the content above.

Offset to Top

This is the displacement to the top of the object from the location of the vtable pointer within the object. It’s used especially in dynamic_cast<>. Let’s explore more details here.

Recall the multiple inheritance case. Suppose I already have a pointer of type Base2* to a Derived object; how can we safely down-cast it to a Derived pointer? The pointer addresses the Base2 subobject rather than the top of Derived, so to cast it back the offset to top is added. In pseudo C++ code it looks like this,

Derived* pD = (Derived*)((char*)pB + pB->vptr[index]);

The index is determined by the compiler since the whole vtable is structured at compile time. dynamic_cast only works on polymorphic classes, i.e. classes with virtual functions. If Base2 is non-polymorphic, there’s no virtual table pointer and the down-cast will not even compile.

But what if pB points to a plain Base2 object instead of a Derived? It could still retrieve an offset through the vptr and produce a “good-looking” but invalid pointer. To solve this problem, the compiler needs to access the object’s RTTI (Run-Time Type Identification). dynamic_cast<T> first checks via the type info whether our object is really of type T, and only then performs the cast. That’s why the typeinfo pointer is introduced.
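Both behaviours can be observed from ordinary code. This is a small sketch with trimmed-down classes; ToDerived and TopOf are helper names I made up. dynamic_cast<void*> is standard C++ and returns the address of the most derived object, which is where the offset to top comes into play.

```cpp
#include <cassert>

struct Base1 { virtual ~Base1() = default; double b1 = 0; };
struct Base2 { virtual ~Base2() = default; double b2 = 0; };
struct Derived : Base1, Base2 { double d = 0; };

// Returns nullptr unless *pB is really part of a Derived object.
Derived* ToDerived(Base2* pB) { return dynamic_cast<Derived*>(pB); }

// Address of the most derived object that *pB belongs to.
void* TopOf(Base2* pB) { return dynamic_cast<void*>(pB); }
```

For a real Derived the cast walks back to the top of the object; for a plain Base2 the type check fails and the result is a null pointer instead of a bogus address.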

Typeinfo Pointer

C++ supports a mechanism for runtime type information through the typeid operator. The typeid expression refers to a compile-time generated object of type std::type_info, which holds meta information about a type, e.g. its name. It is also what the typeinfo pointer in the virtual tables points to. All typeinfo pointers in the vtables of a given class shall point to the same object. They are always valid pointers for polymorphic classes, but in the virtual table of a class without virtual functions, i.e. one with virtual bases only, these pointers are null.

Typeinfo pointers and offset to top are always present in a virtual table.
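The difference between polymorphic and non-polymorphic types shows up directly in typeid. In this sketch the four small types and the two helper functions are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <typeinfo>

struct Base { virtual ~Base() = default; };
struct Derived : Base {};

struct NonPoly {};
struct NonPolyChild : NonPoly {};

// Polymorphic type: typeid consults the object at run time.
bool IsReallyDerived(const Base& b) {
    return typeid(b) == typeid(Derived);
}

// Non-polymorphic type: typeid only sees the static type NonPoly.
bool LooksLikeChild(const NonPoly& n) {
    return typeid(n) == typeid(NonPolyChild);
}
```

Passing a Derived to IsReallyDerived yields true, whereas passing a NonPolyChild to LooksLikeChild yields false because there is no vptr to follow.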

Now, let’s complete the memory layout of Derived under virtual inheritance.

Ending Words

There are some interesting topics I haven’t mentioned yet, such as virtual table tables (VTTs). How these vtables are constructed is also quite fun. I plan to delve into them in future posts.

Reference