This is a complement to my previous post about object memory layout and virtual tables.

As mentioned before, the virtual table pointer inside an object is assigned during construction. The virtual table table (VTT) is a table of vtable pointers that ensures virtual table pointers are set correctly while base classes are constructed in a hierarchy with virtual inheritance. Recall that when an object is constructed, its direct and indirect base classes are constructed first, starting from the most basic one.

Construction under Non-virtual Inheritance

struct Base1 {
    virtual ~Base1();
    virtual void Foo();
    double b1;
};

struct Base2 {
    virtual ~Base2();
    virtual void Bar();
    double b2;
};

struct Derived: public Base1, public Base2 {
    ~Derived();
    void Foo() override;
    double d;
};

To construct a Derived instance, we must first construct the Base1 subobject, then the Base2 subobject, and finally Derived itself. Look at the assembly of Derived’s constructor (Godbolt):

Base1::Base1():
  movq $vtable for Base1+16, (%rdi)
  ret
  
Base2::Base2():
  movq $vtable for Base2+16, (%rdi)
  ret
  
Derived::Derived(): 
  movq %rdi, %rbx # store this pointer
  call Base1::Base1()
  leaq 16(%rbx), %rdi # this = this + 16; offset adjustment
  call Base2::Base2()
  movq $vtable for Derived+16, (%rbx) # this->vptr = &vtable_for_Derived + 16;
  movq $vtable for Derived+56, 16(%rbx) # vptr of the Base2 subobject (at this + 16) = &vtable_for_Derived + 56;

The register %rdi holds the this pointer, which is passed to constructors as a hidden argument. After the two subobjects’ constructors have been called, Derived’s two virtual table pointers are set. Take a look at the memory layout and you can understand the process.

The whole virtual table is 80 bytes long, with 8 bytes per entry. (The compiler generates two versions of the destructor, so I simplified the destructor-related entries into a single 16-byte slot.) The first virtual table pointer, shared by Derived and Base1, is set to the address 16 bytes from the start of the virtual table, and the second one, owned by Base2, is set to the address 56 bytes from the start.
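
The subobject offsets behind the leaq 16(%rbx) adjustment can be checked from plain C++. The following is a minimal sketch, assuming an Itanium-style ABI on a 64-bit target; the inline function bodies and the helpers OffsetOf and Base2Offset are additions for illustration and are not part of the original example.

```cpp
#include <cassert>
#include <cstddef>

// The post's non-virtual hierarchy, with inline bodies so that it links.
struct Base1 {
    virtual ~Base1() = default;
    virtual void Foo() {}
    double b1 = 0;
};

struct Base2 {
    virtual ~Base2() = default;
    virtual void Bar() {}
    double b2 = 0;
};

struct Derived : public Base1, public Base2 {
    void Foo() override {}
    double d = 0;
};

// Hypothetical helper: byte offset of the Sub subobject inside a Derived.
template <class Sub>
std::ptrdiff_t OffsetOf(Derived& obj) {
    Sub* sub = &obj;  // implicit derived-to-base conversion adjusts the pointer
    return reinterpret_cast<char*>(sub) - reinterpret_cast<char*>(&obj);
}

// Hypothetical helper: offset of the Base2 subobject in a fresh Derived.
std::ptrdiff_t Base2Offset() {
    Derived d;
    // Assumption: the first non-virtual base shares its address with the
    // derived object, as it does under the Itanium ABI.
    assert(OffsetOf<Base1>(d) == 0);
    return OffsetOf<Base2>(d);
}
```

On x86-64 with GCC or Clang I would expect Base2Offset() to return 16, the same displacement used for the second vptr store in the listing above, but the exact value depends on the ABI.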

Inside the subobjects’ constructors, the two pointers are first set to the vtable for Base1 and the vtable for Base2 respectively, before being set to their final addresses. You may wonder: since these two pointers will end up pointing into the virtual table of the derived class, is it really necessary to set them to the base vtables first? Yes. Suppose a base constructor calls a virtual function or queries the type through typeid(*this) while a derived object is being created. At that point the derived part does not exist yet, so the call must resolve to the base class’s version. For that to work, the base subobject must be complete on its own, with its vptr pointing to the base class’s vtable.
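
To make this concrete, here is a small sketch of what a base constructor observes while a derived object is being built. Animal, Dog and the recording members are hypothetical names chosen for illustration; they are not the classes from the example above.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

struct Animal {
    std::string seen_in_ctor;  // what Name() returned while Animal() ran
    bool was_animal_in_ctor;   // whether typeid(*this) was Animal at that time

    Animal() {
        // While this constructor runs, the dynamic type is Animal, so the
        // virtual call resolves to Animal::Name, not Dog::Name.
        seen_in_ctor = Name();
        was_animal_in_ctor = (typeid(*this) == typeid(Animal));
    }
    virtual ~Animal() = default;
    virtual std::string Name() const { return "Animal"; }
};

struct Dog : Animal {
    std::string Name() const override { return "Dog"; }
};

// Returns the name seen during base construction, a slash, and the name
// seen through a base reference once the object is fully constructed.
std::string NamesDuringAndAfterConstruction() {
    Dog d;
    assert(d.was_animal_in_ctor);
    const Animal& a = d;
    return d.seen_in_ctor + "/" + a.Name();
}
```

The first half of the result comes from the moment the vptr still referred to the base class, the second half from after the derived constructor installed its own vtable pointer.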

Construction under Virtual Inheritance

Without virtual inheritance, each base class’s constructor is responsible for initializing its own subobject. That no longer works for a class with virtual bases. In a diamond hierarchy, two base classes share the same virtual base subobject, and it would of course be wrong to initialize the virtual base part twice. To solve this problem, the Itanium C++ ABI introduces two kinds of constructors: the complete object constructor and the base object constructor. The base object constructor does not construct the virtual base part, while the complete object constructor does, in addition to everything the base object constructor handles. Specifically, the complete object constructor is called whenever a whole (most derived) object is created, and it takes care of everything.
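
The language-level consequence of this split can be observed without looking at assembly. The sketch below uses hypothetical classes (Root, Left, Right, Bottom) and a counter added for illustration: the virtual base is constructed exactly once, and only the most derived class's initializer for it takes effect.

```cpp
#include <cassert>

struct Root {
    static int constructed;  // number of Root subobjects built so far
    int tag;
    explicit Root(int t) : tag(t) { ++constructed; }
    virtual ~Root() = default;
};
int Root::constructed = 0;

struct Left : virtual Root {
    Left() : Root(1) {}    // Root(1) is used only when Left is most derived
};

struct Right : virtual Root {
    Right() : Root(2) {}   // Root(2) is used only when Right is most derived
};

struct Bottom : Left, Right {
    Bottom() : Root(3) {}  // the most derived class initializes the virtual base
};

// Builds one Bottom and returns the tag its single Root subobject received.
int BuildBottomAndGetTag() {
    Root::constructed = 0;
    Bottom b;
    assert(Root::constructed == 1);  // constructed once, not twice
    return b.tag;
}

// Builds a standalone Left and returns the tag of its Root subobject.
int BuildLeftAndGetTag() {
    Left l;
    return l.tag;
}
```

When Bottom is built, the Root(1) and Root(2) initializers in Left and Right are skipped, which is the behavior the base object constructors implement. When Left is built on its own, its complete object constructor runs and Root(1) is used.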

In the following virtual base example,

struct VBase {
    virtual void Foo();
    double v;
};

struct Base1 : virtual public VBase {
    void Foo() override;
    virtual void Bar();
    double b1;
};

struct Base2 : virtual public VBase {
    virtual void Baz();
    double b2;
};

struct Derived: public Base1, public Base2 {
    double d;
};

When a standalone Base1 object is created, its complete object constructor first calls the base object constructor for the virtual base class VBase and then handles the other parts.

Base1 has two virtual table pointers, one for its own virtual functions and one in its virtual base subobject. Since the virtual base here belongs only to Base1, both pointers end up pointing into the virtual table for Base1 after the constructor of VBase has been called. So far so good; the process is similar to the non-virtual case.

Now think about what happens during the construction of Derived. The order in the complete object constructor for Derived is VBase -> Base1 -> Base2 -> the remaining parts. The constructors called for the base classes here are all base object constructors. When the Base1 subobject is constructed, where should its virtual table pointer point? To the virtual table for Base1? No! The offset of the virtual base subobject in Derived is different from its offset in a standalone Base1. Here comes the problem. Special virtual tables called construction virtual tables are introduced to address this issue. When the Base1 subobject is constructed in Derived, its virtual table pointer is set to an address in the Base1-in-Derived virtual table. There are two construction virtual tables here, Base1-in-Derived and Base2-in-Derived. All of the virtual table addresses that are assigned during the construction of a class containing virtual bases are stored in its virtual table table (VTT). Take a look at what the VTT for Derived looks like.
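
The claim that the virtual base sits at a different offset in Derived than in a standalone Base1 can be measured directly. This is a rough sketch using the hierarchy from this section with empty inline bodies added so that it links; the helper functions are hypothetical and exist only for the measurement.

```cpp
#include <cassert>
#include <cstddef>

struct VBase {
    virtual void Foo() {}
    double v = 0;
};

struct Base1 : virtual public VBase {
    void Foo() override {}
    virtual void Bar() {}
    double b1 = 0;
};

struct Base2 : virtual public VBase {
    virtual void Baz() {}
    double b2 = 0;
};

struct Derived : public Base1, public Base2 {
    double d = 0;
};

// Byte distance from a Base1 subobject to the VBase subobject of the same
// complete object. The conversion to VBase* looks up the vbase offset at
// run time, because it depends on the complete object's type.
std::ptrdiff_t VBaseOffsetFrom(Base1& b1) {
    VBase* vb = &b1;
    return reinterpret_cast<char*>(vb) - reinterpret_cast<char*>(&b1);
}

std::ptrdiff_t OffsetInStandaloneBase1() {
    Base1 b;
    return VBaseOffsetFrom(b);
}

std::ptrdiff_t OffsetInDerived() {
    Derived d;
    return VBaseOffsetFrom(d);
}

// Sanity check: the same Base1 code sees two different vbase offsets.
void CheckOffsetsDiffer() {
    assert(OffsetInStandaloneBase1() != OffsetInDerived());
}
```

On x86-64 with the Itanium ABI I would expect 16 for the standalone Base1 and 40 inside Derived, but the point is only that the two values differ, which is why a single fixed vtable for Base1 cannot serve both cases during construction.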

I put the virtual table for Base1 and the one for Base1-in-Derived side by side; pay attention to the differences. The virtual-base-related entries are replaced with the corresponding values from the virtual table for Derived. To be specific, these entries are the vbase offset and the vcall offset. Not all the pointers in the VTT are used during construction. Take a look at the generated code (Godbolt):

Base1::Base1() # base object constructor
  movq (%rsi), %rax
  movq 8(%rsi), %rdx
  movq %rax, (%rdi) # this->vptr = &vtable_for_Base1_in_Derived + 24
  movq -24(%rax), %rax # get the vbase offset
  movq %rdx, (%rdi,%rax) # vptr of the VBase subobject (at this + vbase offset) = second VTT entry
  ret

Derived::Derived(): # complete object constructor
  movq %rdi, %rbx # store this
  leaq 40(%rdi), %rdi # adjust this to subobject VBase
  call VBase::VBase()
  movl $VTT for Derived+8, %esi # %rsi = address of the VTT entries for Base1-in-Derived
  call Base1::Base1()
  leaq 16(%rbx), %rdi # adjust this to subobject Base2
  movl $VTT for Derived+24, %esi # %rsi = address of the VTT entries for Base2-in-Derived
  call Base2::Base2()
  movq $vtable for Derived+24, (%rbx) # this->vptr = &vtable_for_Derived + 24; // vptr shared by Derived and Base1
  movq $vtable for Derived+96, 40(%rbx) # vptr at this + 40 = &vtable_for_Derived + 96; // vptr for virtual base VBase
  movq $vtable for Derived+64, 16(%rbx) # vptr at this + 16 = &vtable_for_Derived + 64; // vptr owned by Base2

The last three lines are easy to understand.

The interesting part here is loading VTT addresses into the register %rsi. %rsi is not used directly in Derived::Derived, but in the base object constructors Base1::Base1 and Base2::Base2. Base1’s constructor treats %rsi as a hidden argument that points to the VTT entries it should use. In effect, it knows: “I can set my vptrs to the right places by reading through %rsi.” In Base1::Base1, the first entry, the address point of the construction vtable for Base1-in-Derived, is stored to (%rdi), i.e. Base1’s own vptr. Then the virtual base offset is fetched with movq -24(%rax), %rax, since the compiler already knows that the vbase offset sits 24 bytes before the address point (held in %rax). The second vptr, the one for VBase, is set afterward.

This explains why the VTT exists. Since all address points are known at compile time, the compiler could of course replace the indirect access through the VTT with the concrete addresses in the corresponding virtual tables, but then each hierarchy would need its own copy of the constructor. The advantage of the VTT is that, no matter which hierarchy Base1 ends up in, Base1-in-A or Base1-in-B, they can all share the same base object constructor Base1::Base1, thanks to the contiguous layout of the VTT. All the constructor needs to know is that %rsi points to the correct VTT entries.
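
The effect of the construction vtable is visible from ordinary C++ as well. The following sketch is a hypothetical variant of the hierarchy above: the Who() function, the constructor bodies and the recording members v_seen and who_seen are additions for illustration. It records what Base1's constructor sees while it runs as part of building a Derived.

```cpp
#include <cassert>
#include <string>

struct VBase {
    double v;
    VBase() : v(42.0) {}
    virtual ~VBase() = default;
    virtual std::string Who() const { return "VBase"; }
};

struct Base1 : virtual VBase {
    double v_seen;         // VBase::v as read while Base1() was running
    std::string who_seen;  // result of the virtual call made in Base1()

    // Reaching v goes through the virtual base offset and Who() goes through
    // the vptr. On an Itanium-style ABI both are read from whichever vtable
    // the vptr refers to at this moment.
    Base1() : v_seen(v), who_seen(Who()) {}
    std::string Who() const override { return "Base1"; }
};

struct Base2 : virtual VBase {
    double b2 = 0;
};

struct Derived : Base1, Base2 {
    std::string Who() const override { return "Derived"; }
};

double VSeenByBase1Ctor() {
    Derived d;
    return d.v_seen;
}

std::string WhoSeenByBase1Ctor() {
    Derived d;
    return d.who_seen;
}

std::string WhoAfterConstruction() {
    Derived d;
    const VBase& vb = d;
    return vb.Who();
}

// Sanity check: Base1() found the VBase part at its real location.
void CheckVirtualBaseAccess() {
    assert(VSeenByBase1Ctor() == 42.0);
}
```

During Base1's constructor the object must behave like a Base1 (Who() yields "Base1") while still locating VBase at the offset it has inside Derived (v reads 42.0). A table that combines Base1's function entries with Derived's offsets is exactly what the Base1-in-Derived construction vtable provides.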

Summary

In this post, I’ve covered how an object is constructed and how the VTT works. The design of the VTT is a trade-off between a few extra instructions and an explosion in code size. It seems to be common practice to reduce overall code size at the cost of some extra instructions, for a better L1 instruction cache hit rate, as with the thunk design.