repr(Rust)

First and foremost, all types have an alignment specified in bytes. The alignment of a type specifies what addresses are valid to store the value at. A value with alignment n must only be stored at an address that is a multiple of n. So alignment 2 means you must be stored at an even address, and 1 means that you can be stored anywhere. Alignment is at least 1, and always a power of 2.

Primitives are usually aligned to their size, although this is platform-specific behavior. For example, on x86 u64 and f64 are often aligned to 4 bytes (32 bits).

A type’s size must always be a multiple of its alignment. This ensures that an array of that type may always be indexed by offsetting by a multiple of its size. Note that the size and alignment of a type may not be known statically in the case of dynamically sized types.
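The two invariants above (alignment is a power of two, and size is a multiple of alignment) can be checked with `std::mem::size_of` and `std::mem::align_of`. The following is a minimal sketch; the helper names `layout_invariants_hold` and `is_properly_aligned` are made up for illustration and are not part of any library.

```rust
use std::mem::{align_of, size_of};

/// Returns true if `T` upholds the two invariants described above:
/// its alignment is a power of two (and therefore at least 1), and
/// its size is a multiple of its alignment.
/// (Hypothetical helper, named for this example only.)
fn layout_invariants_hold<T>() -> bool {
    let align = align_of::<T>();
    let size = size_of::<T>();
    align.is_power_of_two() && size % align == 0
}

/// Returns true if `value` lives at an address that is a multiple of
/// its type's alignment. For a valid reference this always holds.
/// (Hypothetical helper, named for this example only.)
fn is_properly_aligned<T>(value: &T) -> bool {
    (value as *const T as usize) % align_of::<T>() == 0
}
```

Calling `layout_invariants_hold::<u64>()` or `is_properly_aligned(&5u64)` should return `true` on any target. The concrete alignment values are deliberately not asserted here, since they are platform-specific.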

Rust gives you the following ways to lay out composite data:

  • structs (named product types)
  • tuples (anonymous product types)
  • arrays (homogeneous product types)
  • enums (named sum types — tagged unions)
  • unions (untagged unions)

An enum is said to be field-less if none of its variants have associated data.

By default, composite structures have an alignment equal to the maximum of their fields’ alignments. Rust will consequently insert padding where necessary to ensure that all fields are properly aligned and that the overall type’s size is a multiple of its alignment. For instance:

    struct A {
        a: u8,
        b: u32,
        c: u16,
    }

will be 32-bit aligned on a target that aligns these primitives to their respective sizes. The whole struct will therefore have a size that is a multiple of 32 bits. It may become:

    struct A {
        a: u8,
        _pad1: [u8; 3], // to align `b`
        b: u32,
        c: u16,
        _pad2: [u8; 2], // to make overall size multiple of 4
    }

or maybe:

    struct A {
        b: u32,
        c: u16,
        a: u8,
        _pad: u8,
    }
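The effect of the two candidate layouts can be observed by comparing sizes. The sketch below assumes a target where u32 is 4-byte aligned; `ADeclOrder` and `size_and_align` are hypothetical names introduced only for this comparison. `ADeclOrder` uses `repr(C)` to force declaration order, which corresponds to the first padded layout above.

```rust
use std::mem::{align_of, size_of};

// Default representation: the compiler is free to choose the field
// order and padding.
#[allow(dead_code)]
struct A {
    a: u8,
    b: u32,
    c: u16,
}

// Hypothetical comparison type: `repr(C)` keeps the fields in
// declaration order, so padding is needed after `a` and after `c`.
#[allow(dead_code)]
#[repr(C)]
struct ADeclOrder {
    a: u8,
    b: u32,
    c: u16,
}

/// Returns (size, alignment) of `T` in bytes.
fn size_and_align<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}
```

On such a target, `size_and_align::<ADeclOrder>()` yields `(12, 4)`. For `A`, the only things that can be relied on are that the size is at least 7 (the sum of the field sizes) and a multiple of the alignment; a compiler that reorders fields can fit it in 8 bytes, but that is not guaranteed.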

There is no indirection for these types; all data is stored within the struct, as you would expect in C. However, with the exception of arrays (which are densely packed and in-order), the layout of data is not specified by default. Given the two following struct definitions:

    struct A {
        a: i32,
        b: u64,
    }
    struct B {
        a: i32,
        b: u64,
    }

Rust does guarantee that two instances of A have their data laid out in exactly the same way. However, Rust does not currently guarantee that an instance of A has the same field ordering or padding as an instance of B.

With A and B as written, this point would seem to be pedantic, but several other features of Rust make it desirable for the language to play with data layout in complex ways.

For instance, consider this struct:

    struct Foo<T, U> {
        count: u16,
        data1: T,
        data2: U,
    }

Now consider the monomorphizations of Foo<u32, u16> and Foo<u16, u32>. If Rust lays out the fields in the order specified, we expect it to pad the values in the struct to satisfy their alignment requirements. So if Rust didn’t reorder fields, we would expect it to produce the following:

    struct Foo<u16, u32> {
        count: u16,
        data1: u16,
        data2: u32,
    }
    struct Foo<u32, u16> {
        count: u16,
        _pad1: u16,
        data1: u32,
        data2: u16,
        _pad2: u16,
    }

The latter case quite simply wastes space. An optimal use of space requires different monomorphizations to have different field orderings.
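The cost of keeping declaration order can be measured directly. This sketch assumes a target where u16 and u32 are aligned to their sizes; `FooDeclOrder`, `decl_order_sizes`, and `default_sizes` are hypothetical names used only here. `FooDeclOrder` uses `repr(C)` to prevent reordering, standing in for the "no reordering" scenario described above.

```rust
use std::mem::size_of;

// Default representation, as in the text: fields may be reordered
// independently for each monomorphization.
#[allow(dead_code)]
struct Foo<T, U> {
    count: u16,
    data1: T,
    data2: U,
}

// Hypothetical variant that forces declaration order via `repr(C)`.
#[allow(dead_code)]
#[repr(C)]
struct FooDeclOrder<T, U> {
    count: u16,
    data1: T,
    data2: U,
}

/// Sizes in bytes of the two monomorphizations when fields stay in
/// declaration order: (Foo<u16, u32>, Foo<u32, u16>).
fn decl_order_sizes() -> (usize, usize) {
    (
        size_of::<FooDeclOrder<u16, u32>>(),
        size_of::<FooDeclOrder<u32, u16>>(),
    )
}

/// The same two sizes under the default representation.
fn default_sizes() -> (usize, usize) {
    (size_of::<Foo<u16, u32>>(), size_of::<Foo<u32, u16>>())
}
```

Under the stated assumption, `decl_order_sizes()` returns `(8, 12)`: the second monomorphization pays 4 bytes of padding. With the default representation, both sizes are at least 8 (the sum of the field sizes), and a compiler that reorders fields can reach exactly 8 for both, though the language does not promise it.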

Enums make this consideration even more complicated. Naively, an enum such as:

    enum Foo {
        A(u32),
        B(u64),
        C(u8),
    }

might be laid out as:

    struct FooRepr {
        data: u64, // this is either a u64, u32, or u8 based on `tag`
        tag: u8, // 0 = A, 1 = B, 2 = C
    }

And indeed this is approximately how it would be laid out (modulo the size and position of tag).
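A quick way to see that a tag must be stored somewhere is to compare the enum's size with that of its largest payload. This is only a sketch; `foo_size` is a hypothetical helper name.

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum Foo {
    A(u32),
    B(u64),
    C(u8),
}

/// Size of `Foo` in bytes under the default representation.
fn foo_size() -> usize {
    size_of::<Foo>()
}
```

Because every bit pattern of the u64 payload in `B` is a valid value, there is no spare bit pattern to encode the other variants, so `foo_size()` must be strictly greater than `size_of::<u64>()`. The exact number is unspecified and depends on the tag's size and on alignment padding.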

However there are several cases where such a representation is inefficient. The classic case of this is Rust’s “null pointer optimization”: an enum consisting of a single outer unit variant (e.g. None) and a (potentially nested) non-nullable pointer variant (e.g. Some(&T)) makes the tag unnecessary. A null pointer can safely be interpreted as the unit (None) variant. The net result is that, for example, size_of::<Option<&T>>() == size_of::<&T>().
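The size equality stated above can be checked for several non-nullable pointer types. A minimal sketch follows; `option_adds_no_size` and `pointer_options_are_free` are hypothetical helper names.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Returns true if wrapping `T` in `Option` does not increase its size,
/// i.e. `None` is encoded in a value that is invalid for `T`.
fn option_adds_no_size<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

/// Checks the null pointer optimization for a few pointer types.
fn pointer_options_are_free() -> bool {
    option_adds_no_size::<&u32>()
        && option_adds_no_size::<&mut u32>()
        && option_adds_no_size::<Box<u32>>()
        && option_adds_no_size::<NonNull<u32>>()
}
```

`pointer_options_are_free()` should return `true`. By contrast, `option_adds_no_size::<u32>()` returns `false`: every bit pattern of a u32 is a valid value, so `Option<u32>` needs extra space for its tag.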

There are many types in Rust that are, or contain, non-nullable pointers, such as Box<T>, Vec<T>, String, &T, and &mut T. Similarly, one can imagine nested enums pooling their tags into a single discriminant, as they are by definition known to have a limited range of valid values. In principle enums could use fairly elaborate algorithms to store bits throughout nested types with forbidden values. As such it is especially desirable that we leave enum layout unspecified today.