Exception Safety

Although programs should use unwinding sparingly, there’s a lot of code that can panic. If you unwrap a None, index out of bounds, or divide by 0, your program will panic. On debug builds, every arithmetic operation can panic if it overflows. Unless you are very careful and tightly control what code runs, pretty much everything can unwind, and you need to be ready for it.

Being ready for unwinding is often referred to as exception safety in the broader programming world. In Rust, there are two levels of exception safety that one may concern themselves with:

  • In unsafe code, we must be exception safe to the point of not violating memory safety. We’ll call this minimal exception safety.

  • In safe code, it is good to be exception safe to the point of your program doing the right thing. We’ll call this maximal exception safety.

As is the case in many places in Rust, Unsafe code must be ready to deal with bad Safe code when it comes to unwinding. Code that transiently creates unsound states must be careful that a panic does not cause that state to be used. Generally this means ensuring that only non-panicking code is run while these states exist, or making a guard that cleans up the state in the case of a panic. This does not necessarily mean that the state a panic witnesses is a fully coherent state. We need only guarantee that it’s a safe state.

Most Unsafe code is leaf-like, and therefore fairly easy to make exception-safe. It controls all the code that runs, and most of that code can’t panic. However it is not uncommon for Unsafe code to work with arrays of temporarily uninitialized data while repeatedly invoking caller-provided code. Such code needs to be careful and consider exception safety.

Vec::push_all

Vec::push_all is a temporary hack to get extending a Vec by a slice reliably efficient without specialization. Here’s a simple implementation:

    impl<T: Clone> Vec<T> {
        fn push_all(&mut self, to_push: &[T]) {
            self.reserve(to_push.len());
            unsafe {
                let old_len = self.len();
                // can't overflow because we just reserved this
                self.set_len(old_len + to_push.len());

                for (i, x) in to_push.iter().enumerate() {
                    // write each clone into the freshly reserved tail
                    self.ptr().offset((old_len + i) as isize).write(x.clone());
                }
            }
        }
    }

We bypass push in order to avoid redundant capacity and len checks on the Vec that we definitely know has capacity. The logic is totally correct, except there’s a subtle problem with our code: it’s not exception-safe! set_len, offset, and write are all fine; clone is the panic bomb we overlooked.

Clone is completely out of our control, and is totally free to panic. If it does, our function will exit early with the length of the Vec set too large. If the Vec is looked at or dropped, uninitialized memory will be read!

The fix in this case is fairly simple. If we want to guarantee that the values we did clone are dropped, we can set the len every loop iteration. If we just want to guarantee that uninitialized memory can’t be observed, we can set the len after the loop.
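
For example, the first option (setting the len on every iteration) might look something like the following sketch, written as if it lived inside the Vec implementation like the snippet above, so the private self.ptr() helper is available:

    impl<T: Clone> Vec<T> {
        fn push_all(&mut self, to_push: &[T]) {
            self.reserve(to_push.len());
            unsafe {
                for x in to_push {
                    let len = self.len();
                    // Clone first; if this panics, `len` still covers exactly
                    // the initialized elements, so nothing uninitialized can
                    // ever be observed or dropped.
                    self.ptr().offset(len as isize).write(x.clone());
                    // Only now admit the new element into the Vec's length.
                    self.set_len(len + 1);
                }
            }
        }
    }

Setting the len once after the loop instead would leak the already-cloned elements if a later clone panics, but it still never exposes uninitialized memory.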

BinaryHeap::sift_up

Bubbling an element up a heap is a bit more complicated than extending a Vec. The pseudocode is as follows:

    bubble_up(heap, index):
        while index != 0 && heap[index] < heap[parent(index)]:
            heap.swap(index, parent(index))
            index = parent(index)
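
For reference, a direct transcription into safe Rust might look something like this sketch (the parent index helper and the bare &mut [T] storage are assumptions for illustration, not the real BinaryHeap internals):

    fn parent(index: usize) -> usize {
        // standard binary-heap layout; callers ensure `index != 0`
        (index - 1) / 2
    }

    fn bubble_up<T: Ord>(heap: &mut [T], mut index: usize) {
        // Swap the element toward the root while it is out of order.
        while index != 0 && heap[index] < heap[parent(index)] {
            heap.swap(index, parent(index));
            index = parent(index);
        }
    }

Because this version contains no unsafe code, a panicking comparison can at worst leave the heap out of order; it can never make the program unsound.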

A literal transcription of this code to Rust is totally fine, but has an annoying performance characteristic: the self element is swapped over and over again uselessly. We would rather have the following:

    bubble_up(heap, index):
        let elem = heap[index]
        while index != 0 && elem < heap[parent(index)]:
            heap[index] = heap[parent(index)]
            index = parent(index)
        heap[index] = elem

This code ensures that each element is copied as little as possible (it is in fact necessary that elem be copied twice in general). However it now exposes some exception safety trouble! At all times, there exist two copies of one value. If we panic in this function something will be double-dropped. Unfortunately, we also don’t have full control of the code: that comparison is user-defined!

Unlike Vec, the fix isn’t as easy here. One option is to break the user-defined code and the unsafe code into two separate phases:

    bubble_up(heap, index):
        let end_index = index;
        while end_index != 0 && heap[index] < heap[parent(end_index)]:
            end_index = parent(end_index)

        let elem = heap[index]
        while index != end_index:
            heap[index] = heap[parent(index)]
            index = parent(index)
        heap[index] = elem

If the user-defined code blows up, that’s no problem anymore, because we haven’t actually touched the state of the heap yet. Once we do start messing with the heap, we’re working with only data and functions that we trust, so there’s no concern of panics.
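
In Rust, that two-phase split might look something like the following sketch (again over a bare &mut [T] with an assumed parent helper; this is illustrative, not the standard library’s implementation):

    use std::ptr;

    fn parent(index: usize) -> usize { (index - 1) / 2 }

    fn bubble_up<T: Ord>(heap: &mut [T], mut index: usize) {
        // Phase 1: only user-defined comparisons run here, and they merely
        // read the heap, so a panic leaves its state untouched.
        let mut end_index = index;
        while end_index != 0 && heap[index] < heap[parent(end_index)] {
            end_index = parent(end_index);
        }

        // Phase 2: only trusted, non-panicking code runs from here on.
        unsafe {
            let ptr = heap.as_mut_ptr();
            let elem = ptr::read(ptr.add(index));
            while index != end_index {
                let parent_idx = parent(index);
                // shift the ancestor down into the current slot
                ptr::copy_nonoverlapping(ptr.add(parent_idx), ptr.add(index), 1);
                index = parent_idx;
            }
            // drop the removed element into its final position
            ptr::write(ptr.add(index), elem);
        }
    }

By the time any value is duplicated with ptr::read, no user code can run again, so a double-drop is impossible.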

Perhaps you’re not happy with this design. Surely it’s cheating! And we have to do the complex heap traversal twice! Alright, let’s bite the bullet. Let’s intermix untrusted and unsafe code for reals.

If Rust had try and finally like in Java, we could do the following:

    bubble_up(heap, index):
        let elem = heap[index]
        try:
            while index != 0 && elem < heap[parent(index)]:
                heap[index] = heap[parent(index)]
                index = parent(index)
        finally:
            heap[index] = elem

The basic idea is simple: if the comparison panics, we just toss the loose element in the logically uninitialized index and bail out. Anyone who observes the heap will see a potentially inconsistent heap, but at least it won’t cause any double-drops! If the algorithm terminates normally, then this operation happens to coincide precisely with how we finish up regardless.

Sadly, Rust has no such construct, so we’re going to need to roll our own! The way to do this is to store the algorithm’s state in a separate struct with a destructor for the “finally” logic. Whether we panic or not, that destructor will run and clean up after us.

    use std::ptr;

    // A guard that holds an element out of the slice and fills the hole
    // back in on drop, whether we unwind or return normally.
    struct Hole<'a, T: 'a> {
        data: &'a mut [T],
        /// `elt` is always `Some` from new until drop.
        elt: Option<T>,
        pos: usize,
    }

    impl<'a, T> Hole<'a, T> {
        fn new(data: &'a mut [T], pos: usize) -> Self {
            unsafe {
                let elt = ptr::read(&data[pos]);
                Hole {
                    data: data,
                    elt: Some(elt),
                    pos: pos,
                }
            }
        }

        fn pos(&self) -> usize { self.pos }

        fn removed(&self) -> &T { self.elt.as_ref().unwrap() }

        unsafe fn get(&self, index: usize) -> &T { &self.data[index] }

        unsafe fn move_to(&mut self, index: usize) {
            let index_ptr: *const _ = &self.data[index];
            let hole_ptr = &mut self.data[self.pos];
            ptr::copy_nonoverlapping(index_ptr, hole_ptr, 1);
            self.pos = index;
        }
    }

    impl<'a, T> Drop for Hole<'a, T> {
        fn drop(&mut self) {
            // fill the hole again
            unsafe {
                let pos = self.pos;
                ptr::write(&mut self.data[pos], self.elt.take().unwrap());
            }
        }
    }

    impl<T: Ord> BinaryHeap<T> {
        fn sift_up(&mut self, pos: usize) {
            unsafe {
                // Take out the value at `pos` and create a hole.
                let mut hole = Hole::new(&mut self.data, pos);

                while hole.pos() != 0 {
                    let parent = parent(hole.pos());
                    if hole.removed() <= hole.get(parent) { break }
                    hole.move_to(parent);
                }

                // Hole will be unconditionally filled here; panic or not!
            }
        }
    }
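
As in the sketches above, the parent function used by sift_up is just the standard binary-heap index computation, (index - 1) / 2, and callers guarantee that index != 0.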