Why can't you use offsetof on non-POD structures in C++?

Accepted answer
Score: 49

Bluehorn's answer is correct, but for me it doesn't explain the reason for the problem in the simplest terms. The way I understand it is as follows:

If NonPOD is a non-POD class, then when you do:

NonPOD np;
np.field;

the compiler does not necessarily access the field by adding some offset to the base pointer and dereferencing. For a POD class, the C++ Standard constrains it to do that (or something equivalent), but for a non-POD class it does not. The compiler might instead read a pointer out of the object, add an offset to that value to give the storage location of the field, and then dereference. This is a common mechanism with virtual inheritance if the field is a member of a virtual base of NonPOD. But it is not restricted to that case. The compiler can do pretty much anything it likes. It could call a hidden compiler-generated virtual member function if it wants.

In the complex cases, it is obviously not possible to represent the location of the field as an integer offset. So offsetof is not valid on non-POD classes.

In cases where your compiler just so happens to store the object in a simple way (such as single inheritance, and normally even non-virtual multiple inheritance, and normally fields defined right in the class that you're referencing the object by as opposed to in some base class), then it will just so happen to work. There are probably cases which just so happen to work on every single compiler there is. This doesn't make it valid.
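
To make the distinction concrete, here is a minimal sketch. The types Plain, Base and Virt and the helper value_is_at_offset are hypothetical names invented for illustration, and it uses the C++11 std::is_standard_layout trait as the modern stand-in for "POD" in this context. For the plain struct, the member really does live at "object address plus offsetof"; for the class with a virtual base, the trait reports that no such guarantee exists.

```cpp
#include <cassert>
#include <cstddef>      // offsetof
#include <type_traits>  // std::is_standard_layout

// Hypothetical types, for illustration only.
struct Plain { int id; double value; };   // no bases, no virtual functions
struct Base  { int a; };
struct Virt  : virtual Base { int b; };   // has a virtual base

static_assert(std::is_standard_layout<Plain>::value,
              "Plain has an offset-addressable layout");
static_assert(!std::is_standard_layout<Virt>::value,
              "a virtual base rules out standard layout");

// Returns true when the member's real address equals
// the object's address plus offsetof(Plain, value).
inline bool value_is_at_offset(const Plain& p) {
    const char* base   = reinterpret_cast<const char*>(&p);
    const char* member = reinterpret_cast<const char*>(&p.value);
    return member == base + offsetof(Plain, value);
}
```

A static_assert like the first one is a cheap guard to put next to any offsetof use, so that a later change to the type fails at compile time instead of silently relying on layout luck.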

Appendix: how does virtual inheritance work?

With simple inheritance, if B is derived from A, the usual implementation is that a pointer to B is just a pointer to A, with B's additional data stuck on the end:

A* ---> field of A  <--- B*
        field of A
        field of B

With simple multiple inheritance, you generally assume that B's base classes (call 'em A1 and A2) are arranged in some order peculiar to B. But the same trick with the pointers can't work:

A1* ---> field of A1
         field of A1
A2* ---> field of A2
         field of A2

A1 and A2 "know" nothing about the fact that they're both base classes of B. So if you cast a B* to A1*, it has to point to the fields of A1, and if you cast it to A2* it has to point to the fields of A2. The pointer conversion operator applies an offset. So you might end up with this:

A1* ---> field of A1 <---- B*
         field of A1
A2* ---> field of A2
         field of A2
         field of B
         field of B

Then casting a B* to A1* doesn't change the pointer value, but casting it to A2* adds sizeof(A1) bytes. This is the "other" reason why, in the absence of a virtual destructor, deleting B through a pointer to A2 goes wrong. It doesn't just fail to call the destructor of B and A1, it doesn't even free the right address.

Anyway, B "knows" where all its base classes are, they're always stored at the same offsets. So in this arrangement offsetof would still work. The standard doesn't require implementations to do multiple inheritance this way, but they often do (or something like it). So offsetof might work in this case on your implementation, but it is not guaranteed to.
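
The pointer adjustment can be observed directly. The following is a small sketch using the same class names as the diagrams; the member names and the helpers a2_offset_in and round_trips are made up for illustration. It measures how far the A2 subobject sits from the start of a B, and checks that converting back undoes the adjustment. The exact distance is up to the implementation, so the code only relies on it being the same for every B.

```cpp
#include <cassert>
#include <cstddef>  // std::ptrdiff_t

// Same shape as the diagram: two non-virtual bases, then B's own fields.
struct A1 { int x1; int y1; };
struct A2 { int x2; int y2; };
struct B : A1, A2 { int b1; int b2; };

// Byte distance from the start of a B object to its A2 subobject.
inline std::ptrdiff_t a2_offset_in(const B& obj) {
    const A2* as_a2 = &obj;  // derived-to-base conversion may adjust the address
    return reinterpret_cast<const char*>(as_a2) -
           reinterpret_cast<const char*>(&obj);
}

// Converting back to B* applies the inverse adjustment.
inline bool round_trips(B& obj) {
    A2* as_a2 = &obj;
    return static_cast<B*>(as_a2) == &obj;
}
```

On the layout drawn above, a2_offset_in would return sizeof(A1), but that is an observation about common implementations and not something the standard promises.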

Now, what about virtual inheritance? Suppose B1 and B2 both have A as a virtual base. This makes them single-inheritance classes, so you might think that the first trick will work again:

A* ---> field of A   <--- B1* A* ---> field of A   <--- B2* 
        field of A                    field of A
        field of B1                   field of B2

But hang on. What happens when C derives (non-virtually, for simplicity) from both B1 and B2? C must only contain 1 copy of the fields of A. Those fields can't immediately precede the fields of B1, and also immediately precede the fields of B2. We're in trouble.

So what implementations might do instead is:

// an instance of B1 looks like this, and B2 similar
A* --->  field of A
         field of A
B1* ---> pointer to A 
         field of B1

Although I've indicated B1* pointing to the first part of the object after the A subobject, I suspect (without bothering to check) the actual address won't be there, it'll be the start of A. It's just that unlike simple inheritance, the offsets between the actual address in the pointer, and the address I've indicated in the diagram, will never be used unless the compiler is certain of the dynamic type of the object. Instead, it will always go through the meta-information to reach A correctly. So my diagrams will point there, since that offset will always be applied for the uses we're interested in.

The "pointer" to A could be a pointer or an offset, it doesn't really matter. In an instance of B1, created as a B1, it points to (char*)this - sizeof(A), and the same in an instance of B2. But if we create a C, it can look like this:

A* --->  field of A
         field of A
B1* ---> pointer to A    // points to (char*)(this) - sizeof(A) as before
         field of B1
B2* ---> pointer to A    // points to (char*)(this) - sizeof(A) - sizeof(B1)
         field of B2
C* ----> pointer to A    // points to (char*)(this) - sizeof(A) - sizeof(B1) - sizeof(B2)
         field of C
         field of C

So to access a field of A using a pointer or reference to B2 requires more than just applying an offset. We must read the "pointer to A" field of B2, follow it, and only then apply an offset, because depending what class B2 is a base of, that pointer will have different values. There is no such thing as offsetof(B2,field of A): there can't be. offsetof will never work with virtual inheritance, on any implementation.
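
This can be checked with a short sketch. The class names follow the diagrams; the member names and the helper a_offset_from are invented for illustration. It measures the distance from a B2 subobject to the A subobject it refers to, once for a standalone B2 and once for the B2 inside a C. The actual values, and where the compiler places A, are implementation details; on the common ABIs I am aware of the two distances differ, which is exactly why no single offsetof(B2, a) can exist.

```cpp
#include <cassert>
#include <cstddef>  // std::ptrdiff_t

struct A  { int a = 0; };
struct B1 : virtual A { int b1 = 0; };
struct B2 : virtual A { int b2 = 0; };
struct C  : B1, B2 { int c = 0; };

// Byte distance from a B2 subobject to the A subobject it shares.
// The conversion to A& uses whatever bookkeeping the compiler keeps
// for the virtual base, so it works for any most-derived type.
inline std::ptrdiff_t a_offset_from(const B2& b2) {
    const A& base = b2;
    return reinterpret_cast<const char*>(&base) -
           reinterpret_cast<const char*>(&b2);
}
```

Calling a_offset_from on a standalone B2 and on the B2 part of a C, then comparing the results, shows the dependence on the most-derived type without ever dereferencing an invalid pointer.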

Score: 38

Short answer: offsetof is a feature that is only in the C++ standard for legacy C compatibility. Therefore it is basically restricted to the stuff that can be done in C. C++ supports only what it must for C compatibility.

As offsetof is basically a hack (implemented as a macro) that relies on the simple memory model of C, supporting it for arbitrary classes would take a lot of freedom away from C++ compiler implementors in how they organize class instance layout.

The effect is that offsetof will often work (depending on source code and compiler used) in C++ even where not backed by the standard - except where it doesn't. So you should be very careful with offsetof usage in C++. Modern GCC and Clang will emit a warning if offsetof is used outside the standard (-Winvalid-offsetof).

Edit: As you asked for an example, the following might clarify the problem:

#include <iostream>
using namespace std;

struct A { int a; };
struct B : public virtual A   { int b; };
struct C : public virtual A   { int c; };
struct D : public B, public C { int d; };

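// offset_d: distance from the object address to the field address, on a real instance.
#define offset_d(i,f)    (long(&(i)->f) - long(i))
// offset_s: the same computation on a fake pointer (no object lives at address 1000).
// This is undefined behavior, done on purpose to imitate a naive offsetof macro.
#define offset_s(t,f)    offset_d((t*)1000, f)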

#define dyn(inst,field) {\
    cout << "Dynamic offset of " #field " in " #inst ": "; \
    cout << offset_d(&i##inst, field) << endl; }

#define stat(type,field) {\
    cout << "Static offset of " #field " in " #type ": "; \
    cout.flush(); \
    cout << offset_s(type, field) << endl; }

int main() {
    A iA; B iB; C iC; D iD;
    dyn(A, a); dyn(B, a); dyn(C, a); dyn(D, a);
    stat(A, a); stat(B, a); stat(C, a); stat(D, a);
    return 0;
}

This will crash when trying to locate the field a inside type B statically, while it works when an instance is available. This is because of the virtual inheritance, where the location of the base class is stored in a lookup table.

While this is a contrived example, an implementation could also use a lookup table to find the public, protected and private sections of a class instance. Or make the lookup completely dynamic (use a hash table for fields), etc.

The standard just leaves all possibilities open by restricting offsetof to POD (IOW: no way to use a hash table for POD structs... :)

Just another note: I had to reimplement offsetof (here: offset_s) for this example as GCC actually errors out when I call offsetof for a field of a virtual base class.

Score: 5

In general, when you ask "why is something undefined", the answer is "because the standard says so". Usually, the rationale is along one or more reasons like:

  • it is difficult to detect statically in which case you are;

  • corner cases are difficult to define and nobody took the pain of defining special cases;

  • its use is mostly covered by other features;

  • existing practices at the time of standardization varied, and breaking existing implementations and programs depending on them was deemed more harmful than standardization.

Back to offsetof, the second reason is probably a dominant one. If you look at C++0X, where the standard was previously using POD, it is now using "standard layout", "layout compatible", "POD" allowing more refined cases. And offsetof now needs "standard layout" classes, which are the cases where the committee was willing to constrain the layout.

You also have to consider the common use of offsetof(), which is to get the value of a field when you have a void* pointer to the object. Multiple inheritance -- virtual or not -- is problematic for that use.
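
As an illustration of that common use, here is a minimal sketch. Record and load_int are hypothetical names, not taken from any library. The caller has only a void* to the object plus a byte offset, and the helper copies the field out. This is only sound because Record is a standard-layout type and the void* really is the address of the whole Record. With multiple inheritance, a void* obtained from a base-class pointer might not be the start of the complete object, and the same arithmetic would read the wrong bytes.

```cpp
#include <cassert>
#include <cstddef>      // offsetof, std::size_t
#include <cstring>      // std::memcpy
#include <type_traits>  // std::is_standard_layout

// Hypothetical record type, for illustration only.
struct Record { int id; float weight; };
static_assert(std::is_standard_layout<Record>::value,
              "offsetof requires a standard-layout type");

// Copies out an int stored `offset` bytes into the object `obj` points to.
// The caller is responsible for `obj` being the start of the object
// and for `offset` naming an int field.
inline int load_int(const void* obj, std::size_t offset) {
    int out;
    std::memcpy(&out, static_cast<const char*>(obj) + offset, sizeof out);
    return out;
}
```

A typical call would be load_int(ptr, offsetof(Record, id)), where ptr is a type-erased pointer to a Record.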

Score: 2

I think your class fits the C++0x definition of a POD. g++ has implemented some of C++0x in their latest releases. I think that VS2008 also has some C++0x bits in it.

From wikipedia's c++0x article

C++0x will relax several rules with regard to the POD definition.

A class/struct is considered a POD if it is trivial, standard-layout, and if all of its non-static members are PODs.

A trivial class or struct is defined as one that:

  1. Has a trivial default constructor. This may use the default constructor syntax (SomeConstructor() = default;).
  2. Has a trivial copy constructor, which may use the default syntax.
  3. Has a trivial copy assignment operator, which may use the default syntax.
  4. Has a trivial destructor, which must not be virtual.

A standard-layout class or struct is defined as one that:

  1. Has only non-static data members that are of standard-layout type
  2. Has the same access control (public, private, protected) for all non-static members
  3. Has no virtual functions
  4. Has no virtual base classes
  5. Has only base classes that are of standard-layout type
  6. Has no base classes of the same type as the first defined non-static member
  7. Either has no base classes with non-static members, or has no non-static data members in the most derived class and at most one base class with non-static members. In essence, there may be only one class in this class's hierarchy that has non-static members.
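
The two property lists above can be checked mechanically with the C++11 type traits. The sketch below is illustrative only: the four structs and the helper pod_like are made-up names, and "trivial" is spelled here as trivially default constructible plus trivially copyable, which is my reading of the rules quoted above.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical types, each chosen to break one of the rules above.
struct Pod         { int x; double y; };                   // trivial and standard-layout
struct WithCtor    { int x; WithCtor() : x(0) {} };        // standard-layout, not trivial
struct MixedAccess {                                       // trivial, not standard-layout
    int x;
    int get_y() const { return y; }
private:
    int y;
};
struct WithVirtual { int x; virtual ~WithVirtual() {} };   // neither

// True when T is both trivial and standard-layout.
template <class T>
constexpr bool pod_like() {
    return std::is_trivially_default_constructible<T>::value &&
           std::is_trivially_copyable<T>::value &&
           std::is_standard_layout<T>::value;
}

static_assert(std::is_standard_layout<WithCtor>::value, "a constructor does not affect layout");
static_assert(!std::is_standard_layout<MixedAccess>::value, "mixed access control breaks rule 2");
static_assert(!std::is_standard_layout<WithVirtual>::value, "virtual functions break rule 3");
```

Of the two properties, only standard layout matters for offsetof; WithCtor shows a class that is not a POD under these rules yet still has a layout that offsetof can describe.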

Score: 0

For the definition of a POD data structure, here is the explanation [already posted in another question on Stack Overflow]:

What are POD types in C++?

Now, coming to your code, it is working fine as expected. This is because you are trying to find the offsetof() for the public members of your class, which is valid.

Please let me know the correct question if my viewpoint above does not clarify your doubt.
