Intro
Have you ever thought about how a class is structured in memory? How are the data and functions laid out? What happens when inheritance is involved? In this post, I’ll explain all of this.
As quoted from § 12.1.17 of N4649 (C++17):
Non-static data members of a (non-union) class with the same access control are allocated so that later members have higher addresses within a class object. The order of allocation of non-static data members with different access control is unspecified. Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions and virtual base classes.
Non-static data members under a given access specifier are allocated in order of their declaration. The standard fully determines the memory layout only for standard-layout types (which, among other restrictions, have all data members under the same access control), a very small subset of all types. Everything else is left to the implementation. An object needs space for its data members, base class subobjects, virtual function and virtual base management, padding, and so on. The standard does not pin these details down, but most compilers adhere to the Itanium C++ ABI, part of which specifies how objects are laid out. GCC uses it, so what we’re talking about in this post is mostly implementation detail. To dump the object layout for better understanding, you can use the -fdump-lang-class option in GCC (-fdump-class-hierarchy before GCC 8.0) and the -cc1 -fdump-record-layouts -fdump-vtable-layouts -emit-llvm options in Clang. The content discussed below is mostly about GCC; Clang has some subtle differences. I have no idea what MSVC does behind the scenes.
Simple Class
For a class without any inheritance involved,
class Object {
public:
double GetVal() const {
return val;
}
private:
double val;
};
The non-static member function Object::GetVal is placed in the code segment and is treated like a global function by the compiler. From the compiler’s perspective, this function may look as follows,
inline double Object::GetVal(const Object* this) {
return this->val;
}
Data members are accessed through an implicit pointer to the object, this, which is cv-qualified if the member function is declared cv-qualified. The call may look like this,
// double val = obj.GetVal();
double val = GetVal(&obj); // resolved like global free functions
Don’t worry about name collisions after member functions are converted to global free functions: compilers mangle these function names to make sure every function has a unique name. For example, the GetVal function is mangled as _ZNK6Object6GetValEv, and you can use a demangling tool like c++filt to get the original function name.
> c++filt _ZNK6Object6GetValEv
Object::GetVal() const
Static members are all treated as global variables or global functions. A simple class instance needs space only for its non-static data members (putting alignment padding aside), which is val here. Objects of type Object are laid out like the image below.
Single Inheritance
Say I have a base class and a derived class, without any virtual functions.
class Base {
public:
double b;
};
class Derived : public Base {
public:
double d;
};
Objects of Derived are laid out by simply stacking the Base subobject on top of Derived’s own data members.
If a pointer to the base class is assigned the address of a derived object, it can only see the base subobject part. The effect resembles object slicing, although nothing is actually copied.
Base* pB = new Derived(); // base pointer to derived object
pB->b; // access the data in the Base subobject
// pB->d; // error: Base has no member named d
The span of pB encompasses only the Base subobject of Derived. This rule still applies under multiple inheritance.
Things become complicated when virtual functions get involved. Most compilers use virtual tables to manage the virtual functions.
If you have virtual functions,
class Base {
public:
virtual ~Base();
virtual void Foo();
virtual void Bar();
double b;
};
class Derived : public Base {
public:
void Foo() override; // override
virtual void Baz(); // new virtual function
double d;
};
The memory layout will be like this,
[Figure: memory layouts of Base and Derived, side by side]
If a class has any virtual functions, it has a pointer vptr, placed at the beginning of the object, to the associated virtual table, where the virtual function pointers are stored. The virtual table is constructed at compile time. For each virtual function a class has, a slot in the vtable stores the function address.
Note: In reality, two destructors are generated for a class, the deleting destructor and the complete object destructor. For simplicity, I only show one in this post.
When you call a virtual function through a base pointer, even though only the base subobject is visible to the pointer, the call can reach the derived function through the vptr. At compile time we don’t know the exact type of the object the base pointer addresses, but we can determine which slot stores the address of the called function.
Base* pB = new Derived();
// Base* pB = new Base();
pB->Foo(); // can't determine the exact type until runtime
The compiler can determine that Foo is located at the second slot in the vtable (index 1, counting from the address pointed to by vptr, with the single destructor shown here occupying index 0) and it internally transforms the call to
(*pB->vptr[1])(pB);
The vptr is handled automatically by the compiler; it’s set when the object is constructed.
You may notice there are other things besides the virtual function pointers. Actually, the address stored in the object’s vptr is not the beginning of the virtual table. It’s called the address point of the virtual table. In this case, above the address point, the virtual table also contains the typeinfo pointer and the offset to top, which we’ll talk about later.
Multiple Inheritance
Under a single inheritance hierarchy, the virtual function mechanism is well behaved; it is both efficient and easily modeled. What will the world be like under a multiple inheritance hierarchy?
struct Base1 {
virtual ~Base1();
virtual void Foo();
double b1;
};
struct Base2 {
virtual ~Base2();
virtual void Bar();
double b2;
};
struct Derived: public Base1, public Base2 {
~Derived();
void Foo() override;
double d;
};
The memory layout is like this,
For each base class that has a virtual table, the derived object has a corresponding vptr. That’s reasonable because each of these subobjects should have its own vptr. When a base pointer is assigned the address of a derived object, it should point to the corresponding subobject, which is not necessarily the beginning of the object. If the base class is the first one to be inherited, it works the same as the single-inheritance case. Consider the following case,
Base1* pB = new Derived();
pB->Foo();
It works fine. But what if we use Base2
here?
Base2* pB = new Derived();
// generated code
// Derived* tmp = new Derived();
// Base2* pB = (Base2*)((char*)tmp + sizeof(Base1)); // adjust in bytes
The pointer pB is adjusted to the Base2 subobject here. If we call pB->Bar(), the function is resolved as described before. How about deleting the object through the base pointer?
Base1* pB = new Derived();
delete pB; // this is OK since pB points to the beginning
Base2* pB = new Derived();
delete pB; // it must be adjusted back to the beginning
Even though the corresponding slot can store the address of the correct destructor Derived::~Derived(), to make things work, the pointer pB, i.e. this in the function parameter, must be readjusted to the beginning of the object. Of course, the adjustment could be computed at runtime. But for runtime efficiency, a thunk function is generated at compile time since the offset is already known. The thunk function might look as follows:
void thunk_to_Derived_Destructor(Base2* this) {
this = (Base2*)((char*)this - sizeof(Base1)); // step back over the Base1 subobject, in bytes
Derived::~Derived(this);
}
This explains what the second slot of the Base2 subobject vtable, “thunk to Derived::~Derived”, is used for.
Little Compiler Trick
When the leftmost base class has no virtual functions, compilers move the first base class with virtual functions to the beginning of the object, so that the object always begins with the vptr. This base class is also called the primary base. Let’s make a small change to the above case, so that Base1 doesn’t have any virtual functions anymore.
struct Base1 {
void Foo();
double b1;
};
The memory layout will be like this,
Although Base1 is the leftmost base class of Derived, because Base1 lacks any virtual functions, the compiler changes the order in which the final object is composed.
Virtual Inheritance
To solve the Diamond Inheritance Problem, C++ introduces the virtual inheritance mechanism to guarantee only one copy of virtually inherited bases under the diamond hierarchy. Consider the following code fragment,
struct VBase {
virtual void Foo();
double v;
};
struct Base1 : virtual public VBase {
void Foo() override;
virtual void Bar();
double b1;
};
struct Base2 : virtual public VBase {
virtual void Baz();
double b2;
};
struct Derived: public Base1, public Base2 {
double d;
};
To keep only one virtual base in the derived object, a class is split into two parts: the virtual base subobject, which will be shared in the diamond inheritance case, and everything else. The virtual base is accessed through a virtual table pointer as well. Let’s first see the memory layout of Base1, because there is no big difference between Derived and Base1 in essence.
There’re two kinds of new things added to the vtable, vbase offset and vcall offset.
Virtual Base(Vbase) Offset
This offset is used when you access the virtual bases of an object. Suppose you’re using a pointer of type VBase* to address a Base1 object. To get the real address of the subobject, the pointer is adjusted by this offset.
VBase* pV = new Base1();
// generated code
// Base1* tmp = new Base1();
// VBase* pV = (VBase*)((char*)tmp + vbase_offset); // adjust in bytes
// or
// pV = (VBase*)((char*)tmp + *(tmp->vptr + index)); // index is determined at compile time, -3 here
The vbase offset is 16 here; it tells the compiler to skip 16 bytes to reach the virtual base. The real reason for storing it in the vtable is that the virtual base sits at different offsets when the class is inherited by different classes. There is no such problem for a non-virtual base class, since its offset is fixed at compile time.
Virtual Call(Vcall) Offset
Vcall offsets play a similar role to vbase offsets. Consider the case
VBase* pV = new Base1();
pV->Foo();
Since the function Foo is overridden, the pointer needs to be readjusted back to the beginning of the Base1 object. Like what is done in thunk functions, the offset is added to the this pointer and the correct function instance is called. The difference here is that the offset is stored in the vcall offset slot. Two indirections are performed to invoke the function. This special kind of thunk is called a virtual thunk. The generated code may look like this,
void virtual_thunk_to_Base1_Foo(VBase* this) {
ptrdiff_t vcall_offset = *(this->vptr + index); // every virtual call has a corresponding index
this = (VBase*)((char*)this + vcall_offset); // adjust in bytes; the offset is negative here
Base1::Foo(this);
}
Now, the more complicated memory layout of the Derived class can be understood.
Let’s compare the layout with Base1’s. Due to the double member d and the Base2 subobject, the offset of VBase is different from the one in Base1. To share the same thunk to Base1::Foo between both Base1 and Derived, the vcall offset and virtual thunk are used. This results in fewer thunks, which may cause fewer instruction cache misses. The trade-off is one extra load before the offset is added. According to the Itanium C++ ABI examples, quoted below, this trade-off is worthwhile.
Since the offset is smaller than the code for a thunk, the load should miss in cache less frequently, so better cache miss behavior should produce better results in spite of the 2 or more cycles required for the vcall offset load.
Pure Virtual Functions
Pure virtual functions are a special kind of virtual function: they don’t have to be implemented by the class that declares them, and classes with pure virtual functions cannot be instantiated.
Since a pure virtual function usually has no definition, the ABI specifies a special function __cxa_pure_virtual as the placeholder for all pure virtual functions. The corresponding virtual function pointer in the virtual table points to this placeholder.
Other Components in the Virtual Table
So far, we have covered only three things in the virtual table: virtual function pointers, vbase offsets, and vcall offsets. The remaining two things, the offset to top and the typeinfo pointer, are explained here. They have little to do with the content above.
Offset to Top
This is the displacement to the top of the object from the location of the vtable pointer within the object. It’s used especially in dynamic_cast<>. Let’s explore more details here.
Recall the multiple inheritance case. Suppose I already have a pointer of type Base2* to a Derived object. How can we safely down-cast it to a Derived pointer, given that this pointer addresses the subobject rather than the top of Derived? To cast it back, the offset is added. In pseudo C++ code, it looks like this,
Derived* pD = (Derived*)((char*)pB + *(pB->vptr + index)); // adjust in bytes
The index is determined by the compiler since the whole vtable is structured at compile time. dynamic_cast only works on polymorphic classes, i.e. classes with virtual functions. If Base2 is non-polymorphic, there’s no virtual table pointer and the down-cast will not even compile.
But what if pB points to a plain Base2 object instead of a Derived? It could still retrieve an offset through its vptr and produce a pointer that merely looks good. To solve this problem, the compiler needs to access the object’s RTTI (Run-Time Type Identification). dynamic_cast<T> will first check via the type info whether our object really is of type T and only proceed with the cast if it is. That’s why the typeinfo pointer is introduced.
Typeinfo Pointer
C++ supports a mechanism for runtime type information through the typeid operator. A typeid expression refers to a compile-time generated object of type std::type_info, which holds meta information about a type, e.g. its name. It is also what the typeinfo pointer in the virtual tables points to. All typeinfo pointers in a vtable shall point to the same object. They are always valid pointers for polymorphic classes, but in the virtual table of a class without virtual functions, i.e. one with virtual bases only, these pointers are null pointers.
Typeinfo pointers and offset to top are always present in a virtual table.
Now, let’s complete the memory layout of Derived
under virtual inheritance.
Ending Words
There are some interesting things not mentioned yet, such as virtual table tables (VTTs). How these vtables are constructed is also quite fun. I plan to delve into them in future posts.
Reference
- Inside the C++ Object Model, By Stanley B. Lippman
- Itanium C++ ABI
- C++ ABI for IA-64: Code and Implementation Examples
- What is the first (int (*)(…))0 vtable entry in the output of g++ -fdump-class-hierarchy? on StackOverflow