Allocating Memory

Using Unique throws a wrench in an important feature of Vec (and indeed all of the std collections): an empty Vec doesn’t actually allocate at all. So if we can’t allocate, but also can’t put a null pointer in ptr, what do we do in Vec::new? Well, we just put some other garbage in there!

This is perfectly fine because we already have cap == 0 as our sentinel for no allocation. We don’t even need to handle it specially in almost any code because we usually need to check if cap > len or len > 0 anyway. The recommended Rust value to put here is mem::align_of::<T>(). Unique provides a convenience for this: Unique::empty(). There are quite a few places where we’ll want to use empty because there’s no real allocation to talk about but null would make the compiler do bad things.

So:

    #![feature(alloc, heap_api)]

    use std::mem;

    impl<T> Vec<T> {
        fn new() -> Self {
            assert!(mem::size_of::<T>() != 0, "We're not ready to handle ZSTs");
            Vec { ptr: Unique::empty(), len: 0, cap: 0 }
        }
    }

I slipped in that assert there because zero-sized types will require some special handling throughout our code, and I want to defer the issue for now. Without this assert, some of our early drafts will do some Very Bad Things.

Next we need to figure out what to actually do when we do want space. For that, we’ll need to use the rest of the heap APIs. These basically allow us to talk directly to Rust’s allocator (jemalloc by default).

We’ll also need a way to handle out-of-memory (OOM) conditions. The standard library calls std::alloc::oom(), which in turn calls the oom lang item, which aborts the program in a platform-specific manner. The reason we abort and don’t panic is because unwinding can cause allocations to happen, and that seems like a bad thing to do when your allocator just came back with “hey I don’t have any more memory”.

Of course, this is a bit silly since most platforms don’t actually run out of memory in a conventional way. Your operating system will probably kill the application by another means if you legitimately start using up all the memory. The most likely way we’ll trigger OOM is by just asking for ludicrous quantities of memory at once (e.g. half the theoretical address space). As such it’s probably fine to panic and nothing bad will happen. Still, we’re trying to be like the standard library as much as possible, so we’ll just kill the whole program.
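
We won’t reimplement the lang item here, but as a rough sketch, an abort-instead-of-panic handler of our own (a stand-in for illustration only, not the real std::alloc::oom) would look something like this:

    use std::process;

    // Illustrative stand-in for an OOM handler: abort rather than panic,
    // since unwinding could itself try to allocate.
    fn oom() -> ! {
        process::abort()
    }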

Okay, now we can write growing. Roughly, we want to have this logic:

    if cap == 0:
        allocate()
        cap = 1
    else:
        reallocate()
        cap *= 2
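
In plain Rust terms the capacity policy alone is trivial; something like this hypothetical helper (ignoring the overflow and allocation concerns discussed below) captures it:

    // Hypothetical helper mirroring the growth policy above:
    // start at a capacity of 1, then double on every subsequent grow.
    fn next_cap(cap: usize) -> usize {
        if cap == 0 { 1 } else { cap * 2 }
    }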

But Rust’s only supported allocator API is so low level that we’ll need to do a fair bit of extra work. We also need to guard against some special conditions that can occur with really large allocations or empty allocations.

In particular, ptr::offset will cause us a lot of trouble, because it has the semantics of LLVM’s GEP inbounds instruction. If you’re fortunate enough to not have dealt with this instruction, here’s the basic story with GEP: alias analysis, alias analysis, alias analysis. It’s super important to an optimizing compiler to be able to reason about data dependencies and aliasing.

As a simple example, consider the following fragment of code:

    # let x = &mut 0;
    # let y = &mut 0;
    *x *= 7;
    *y *= 3;

If the compiler can prove that x and y point to different locations in memory, the two operations can in theory be executed in parallel (by e.g. loading them into different registers and working on them independently). However the compiler can’t do this in general because if x and y point to the same location in memory, the operations need to be done to the same value, and they can’t just be merged afterwards.

When you use GEP inbounds, you are specifically telling LLVM that the offsets you’re about to do are within the bounds of a single “allocated” entity. The ultimate payoff being that LLVM can assume that if two pointers are known to point to two disjoint objects, all the offsets of those pointers are also known to not alias (because you won’t just end up in some random place in memory). LLVM is heavily optimized to work with GEP offsets, and inbounds offsets are the best of all, so it’s important that we use them as much as possible.
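
On the Rust side, ptr::offset is the operation that lowers to a GEP inbounds, which is why its safety contract requires staying inside (or one past the end of) a single allocation. A small illustration:

    fn main() {
        let arr = [1u32, 2, 3, 4];
        let p = arr.as_ptr();
        unsafe {
            // `offset` becomes a GEP inbounds: the starting pointer and the
            // result must both lie within (or one past the end of) `arr`.
            let third = *p.offset(2);
            assert_eq!(third, 3);
        }
    }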

So that’s what GEP’s about, how can it cause us trouble?

The first problem is that we index into arrays with unsigned integers, but GEP (and as a consequence ptr::offset) takes a signed integer. This means that half of the seemingly valid indices into an array will overflow GEP and actually go in the wrong direction! As such we must limit all allocations to isize::MAX elements. This actually means we only need to worry about byte-sized objects, because e.g. > isize::MAX u16s will truly exhaust all of the system’s memory. However in order to avoid subtle corner cases where someone reinterprets some array of < isize::MAX objects as bytes, std limits all allocations to isize::MAX bytes.

On all 64-bit targets that Rust currently supports we’re artificially limited to significantly less than all 64 bits of the address space (modern x64 platforms only expose 48-bit addressing), so we can rely on just running out of memory first. However on 32-bit targets, particularly those with extensions to use more of the address space (PAE x86 or x32), it’s theoretically possible to successfully allocate more than isize::MAX bytes of memory.

However since this is a tutorial, we’re not going to be particularly optimal here, and will just unconditionally check, rather than use clever platform-specific cfgs.
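
As a standalone sketch, that unconditional check amounts to something like the following hypothetical helper (the real code below folds it into grow):

    use std::mem;

    // Hypothetical helper: byte size of `cap` elements of `T`, rejecting
    // anything that overflows or exceeds `isize::MAX` bytes.
    fn alloc_size<T>(cap: usize) -> usize {
        let bytes = mem::size_of::<T>()
            .checked_mul(cap)
            .expect("capacity overflow");
        assert!(bytes <= ::std::isize::MAX as usize, "capacity overflow");
        bytes
    }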

The other corner-case we need to worry about is empty allocations. There will be two kinds of empty allocations we need to worry about: cap = 0 for all T, and cap > 0 for zero-sized types.

These cases are tricky because they come down to what LLVM means by “allocated”. LLVM’s notion of an allocation is significantly more abstract than how we usually use it. Because LLVM needs to work with different languages’ semantics and custom allocators, it can’t really intimately understand allocation. Instead, the main idea behind allocation is “doesn’t overlap with other stuff”. That is, heap allocations, stack allocations, and globals don’t randomly overlap. Yep, it’s about alias analysis. As such, Rust can technically play a bit fast and loose with the notion of an allocation as long as it’s consistent.

Getting back to the empty allocation case, there are a couple of places where we want to offset by 0 as a consequence of generic code. The question is then: is it consistent to do so? For zero-sized types, we have concluded that it is indeed consistent to do a GEP inbounds offset by an arbitrary number of elements. This is a runtime no-op because every element takes up no space, and it’s fine to pretend that there’s an infinite number of zero-sized types allocated at 0x01. No allocator will ever allocate that address, because they won’t allocate 0x00 and they generally allocate to some minimal alignment higher than a byte. Also, the whole first page of memory is generally protected from being allocated anyway (a whole 4k, on many platforms).
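
To see why it’s a runtime no-op, here’s a small illustration (using a throwaway zero-sized struct) of offsetting a dangling but aligned pointer to a ZST:

    use std::mem;

    struct Zst; // a zero-sized type

    fn main() {
        // The aligned "empty" sentinel: the type's alignment (1 here), never 0x00.
        let ptr = mem::align_of::<Zst>() as *mut Zst;
        unsafe {
            // Each element occupies zero bytes, so offsetting never moves the pointer.
            let moved = ptr.offset(100);
            assert_eq!(ptr as usize, moved as usize);
        }
    }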

However what about for positive-sized types? That one’s a bit trickier. In principle, you can argue that offsetting by 0 gives LLVM no information: either there’s an element before the address or after it, but it can’t know which. However we’ve chosen to conservatively assume that it may do bad things. As such we will guard against this case explicitly.

Phew

Ok with all the nonsense out of the way, let’s actually allocate some memory:

    use std::alloc::oom;

    fn grow(&mut self) {
        // this is all pretty delicate, so let's say it's all unsafe
        unsafe {
            // current API requires us to specify size and alignment manually.
            let align = mem::align_of::<T>();
            let elem_size = mem::size_of::<T>();

            let (new_cap, ptr) = if self.cap == 0 {
                let ptr = heap::allocate(elem_size, align);
                (1, ptr)
            } else {
                // as an invariant, we can assume that `self.cap < isize::MAX`,
                // so this doesn't need to be checked.
                let new_cap = self.cap * 2;
                // Similarly this can't overflow due to previously allocating this
                let old_num_bytes = self.cap * elem_size;

                // check that the new allocation doesn't exceed `isize::MAX` at all
                // regardless of the actual size of the capacity. This combines the
                // `new_cap <= isize::MAX` and `new_num_bytes <= usize::MAX` checks
                // we need to make. We lose the ability to allocate e.g. 2/3rds of
                // the address space with a single Vec of i16's on 32-bit though.
                // Alas, poor Yorick -- I knew him, Horatio.
                assert!(old_num_bytes <= (::std::isize::MAX as usize) / 2,
                        "capacity overflow");

                let new_num_bytes = old_num_bytes * 2;
                let ptr = heap::reallocate(self.ptr.as_ptr() as *mut _,
                                           old_num_bytes,
                                           new_num_bytes,
                                           align);
                (new_cap, ptr)
            };

            // If allocate or reallocate fail, we'll get `null` back
            if ptr.is_null() { oom(); }

            self.ptr = Unique::new(ptr as *mut _);
            self.cap = new_cap;
        }
    }

Nothing particularly tricky here. Just computing sizes and alignments and doing some careful multiplication checks.