
[ C- Structures and Unions ]

The C standard states that all members of a union are stored at the same address, which is why only one member can hold a meaningful value at a time. Since the compiler overlays the storage for the members of a union, changing one member alters any value previously stored in any of the other members. So if we try to read a member other than the one stored last, the value should be meaningless or undefined. Now here is my question:

struct catalog_item
{
   int stock_number;
   double price;
   int item_type;
   union
     {
       struct
          {
            char title[TITLE_LEN+1];
            char author[AUTHOR_LEN+1];
            int num_pages;
          } book;
       struct
          {
            char design[DESIGN_LEN+1];
          } mug;
       struct
          {
            char design[DESIGN_LEN+1];
            int colors;
            int sizes;
          } shirt;
     } item;
} c;

Now if the following is done

strcpy(c.item.mug.design, "Butterfly");

then both of the following have the same value

printf("%s",c.item.mug.design);          //1

and

printf("%s",c.item.shirt.design);        //2

Why is the result of "2" not undefined or meaningless?

Answer 1


Fundamentally, you need to think about data storage in C differently. In essence, a union in C can hold any one of its members (enough memory is allocated for the largest of them), but a single instance holds only one at a time. When you access a field, such as an integer, you are looking at a place in memory and treating the bits (0s and 1s) there as an integer. Strings are stored as arrays of characters. You can read any place in memory as a number and you will get one, built from whatever bits happened to be there before. What you are seeing is that both structures, when overlaid on the union, contain the string. This is because the string sits at the same location in either struct (at the beginning, offset 0), so either access resolves to the same location in memory.

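Note that in general the layout of a struct (such as padding between members) is up to the compiler, so later members need not line up this way. Here, however, the match is guaranteed: the first member of a struct is always at offset 0, and the standard explicitly allows inspecting the common initial sequence of structs that share a union (C99 6.5.2.3).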
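
The offsets this answer talks about can be checked with offsetof. The following is only a sketch: the struct and union tags (mug_info, shirt_info, item_info), the helper function names, and the value of DESIGN_LEN are assumptions made for the example, since the question uses anonymous structs and never shows the macro.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

#define DESIGN_LEN 40 /* assumed value; the question does not show it */

/* Named copies of two of the anonymous structs from the question. */
struct mug_info   { char design[DESIGN_LEN + 1]; };
struct shirt_info { char design[DESIGN_LEN + 1]; int colors; int sizes; };

union item_info {
    struct mug_info   mug;
    struct shirt_info shirt;
};

/* Byte offset of mug.design measured from the start of the union. */
size_t mug_design_offset(void)
{
    return offsetof(union item_info, mug) + offsetof(struct mug_info, design);
}

/* Byte offset of shirt.design measured from the start of the union. */
size_t shirt_design_offset(void)
{
    return offsetof(union item_info, shirt) + offsetof(struct shirt_info, design);
}

/* Aborts if either design array does not start at offset 0. */
void check_design_offsets(void)
{
    assert(mug_design_offset() == 0);
    assert(shirt_design_offset() == 0);
}
```

Calling check_design_offsets() from main should pass silently, because every union member starts at the beginning of the union and the first member of a struct starts at the beginning of the struct.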

Answer 2


This is not undefined behavior; it is implementation-defined behavior. Footnote 82 of the C99 draft standard says:

If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.

In practice this is supported in C, and you should read your specific compiler's documentation. For example, the Implementation-defined behavior section for Structures, unions, enumerations, and bit-fields in the gcc manual points to the documentation on type-punning, where the -fstrict-aliasing section says:

The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type.

There are caveats, though, and you should read up on strict aliasing to understand all the details.
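
As a rough illustration of the type punning described in the footnote, the sketch below stores a float in a union and reads the same bytes back as an integer. The names float_bits and bits_of are made up for this example, and it assumes that float is a 32-bit IEEE 754 type, which the C standard does not require.

```c
#include <assert.h>
#include <stdint.h>

/* Both members occupy the same storage. */
union float_bits {
    float    f;
    uint32_t u;
};

/* Stores through one member and reads through the other. */
uint32_t bits_of(float value)
{
    union float_bits fb;

    /* Assumption: float and uint32_t have the same size on this platform. */
    assert(sizeof(float) == sizeof(uint32_t));

    fb.f = value;   /* last member stored: f */
    return fb.u;    /* the same bytes, reinterpreted as uint32_t */
}
```

Under the IEEE 754 assumption, bits_of(1.0f) yields 0x3F800000. The access goes through the union type itself, which is the form the gcc documentation quoted above says is allowed even with -fstrict-aliasing.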

For completeness, section 3.4.1 defines implementation-defined behavior as:

unspecified behavior where each implementation documents how the choice is made

and Annex J.1 of the standard (Unspecified behavior), which lists the unspecified behavior covered by the standard, includes this line:

The value of a union member other than the last one stored into (6.2.6.1).

Answer 3


Your three structs in the union use the same memory area. The "design" field in two of those structs falls at the same memory location. Thus, writing one also writes the other, because the actual address is the same.
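
A small sketch of the same point using the struct from the question: it writes through mug and reads through shirt. The three *_LEN values and the function name shirt_sees_mug_design are assumptions for the example; the question does not show the macros.

```c
#include <assert.h>
#include <string.h>

/* Assumed lengths; the question never shows these macros. */
#define TITLE_LEN  40
#define AUTHOR_LEN 40
#define DESIGN_LEN 40

struct catalog_item {
    int stock_number;
    double price;
    int item_type;
    union {
        struct {
            char title[TITLE_LEN + 1];
            char author[AUTHOR_LEN + 1];
            int num_pages;
        } book;
        struct {
            char design[DESIGN_LEN + 1];
        } mug;
        struct {
            char design[DESIGN_LEN + 1];
            int colors;
            int sizes;
        } shirt;
    } item;
};

/* Writes through mug.design, reads through shirt.design.
   Returns 1 if the string read back matches what was written. */
int shirt_sees_mug_design(void)
{
    struct catalog_item c;

    strcpy(c.item.mug.design, "Butterfly");

    /* Both arrays start at the first byte of the union. */
    assert((void *)c.item.mug.design == (void *)c.item.shirt.design);

    return strcmp(c.item.shirt.design, "Butterfly") == 0;
}
```

The function returns 1 because both design arrays name the same bytes, so the strcpy through one is visible through the other.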