Working with Unsafe

Rust generally only gives us the tools to talk about Unsafe Rust in a scoped and binary manner. Unfortunately, reality is significantly more complicated than that. For instance, consider the following toy function:

fn index(idx: usize, arr: &[u8]) -> Option<u8> {
    if idx < arr.len() {
        unsafe {
            Some(*arr.get_unchecked(idx))
        }
    } else {
        None
    }
}

This function is safe and correct. We check that the index is in bounds, and if it is, index into the array in an unchecked manner. But even in such a trivial function, the scope of the unsafe block is questionable. Consider changing the < to a <=:

fn index(idx: usize, arr: &[u8]) -> Option<u8> {
    if idx <= arr.len() {
        unsafe {
            Some(*arr.get_unchecked(idx))
        }
    } else {
        None
    }
}

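This program is now unsound, and yet we only modified safe code. When idx equals arr.len(), get_unchecked reads one element past the end of the slice, which is undefined behavior. This is the fundamental problem of safety: it’s non-local. The soundness of our unsafe operations necessarily depends on the state established by otherwise “safe” operations.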
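One way to keep this boundary honest is to compare the unchecked version against a fully safe counterpart. The block below is a minimal sketch under that assumption: index is the correct function from above, while index_checked is a hypothetical helper name introduced only for illustration, built on the standard slice::get method.

```rust
/// The correct version from the text: the bounds check uses `<`.
fn index(idx: usize, arr: &[u8]) -> Option<u8> {
    if idx < arr.len() {
        // SAFETY: `idx < arr.len()` was checked on the line above,
        // so `idx` is a valid position in `arr`.
        unsafe { Some(*arr.get_unchecked(idx)) }
    } else {
        None
    }
}

/// Hypothetical safe counterpart (not in the original text).
/// `slice::get` performs the bounds check itself.
fn index_checked(idx: usize, arr: &[u8]) -> Option<u8> {
    arr.get(idx).copied()
}
```

The two functions should agree on every input, including the boundary case where idx equals arr.len(). A test that exercises that boundary gives safe code a chance to flag an accidental <= before the unsafe block turns it into undefined behavior.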

Safety is modular in the sense that opting into unsafety doesn’t require you to consider arbitrary other kinds of badness. For instance, doing an unchecked index into a slice doesn’t mean you suddenly need to worry about the slice being null or containing uninitialized memory. Nothing fundamentally changes. However, safety isn’t modular in the sense that programs are inherently stateful and your unsafe operations may depend on arbitrary other state.

This non-locality gets much worse when we incorporate actual persistent state. Consider a simple implementation of Vec:

use std::ptr;
// Note: This definition is naive. See the chapter on implementing Vec.
pub struct Vec<T> {
    ptr: *mut T,
    len: usize,
    cap: usize,
}
// Note this implementation does not correctly handle zero-sized types.
// See the chapter on implementing Vec.
impl<T> Vec<T> {
    pub fn push(&mut self, elem: T) {
        if self.len == self.cap {
            // not important for this example
            self.reallocate();
        }
        unsafe {
            ptr::write(self.ptr.add(self.len), elem);
            self.len += 1;
        }
    }
    // Stub so the example compiles; a real Vec would grow its allocation here.
    fn reallocate(&mut self) {}
}

fn main() {}

This code is simple enough to reasonably audit and informally verify. Now consider adding the following method:

fn make_room(&mut self) {
    // grow the capacity
    self.cap += 1;
}

This code is 100% Safe Rust, but it is also completely unsound. Changing the capacity violates the invariants of Vec (that cap reflects the allocated space in the Vec). This is not something the rest of Vec can guard against. It has to trust the capacity field because there’s no way to verify it.

Because it relies on invariants of a struct field, this unsafe code does more than pollute a whole function: it pollutes a whole module. Generally, the only bullet-proof way to limit the scope of unsafe code is at the module boundary with privacy.

However, this works perfectly. The existence of make_room is not a problem for the soundness of Vec because we didn’t mark it as public. Only the module that defines this function can call it. Also, make_room directly accesses the private fields of Vec, so it can only be written in the same module as Vec.

It is therefore possible for us to write a completely safe abstraction that relies on complex invariants. This is critical to the relationship between Safe Rust and Unsafe Rust.
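To make the role of privacy concrete, here is a small sketch that is not taken from the Vec chapter. The module stack, the type Stack, and the constant CAP are all hypothetical names chosen for illustration. The unsafe read in last relies on the invariant len <= CAP, and because len is a private field, only code inside the module can break that invariant.

```rust
mod stack {
    /// Capacity of the hypothetical fixed-size stack.
    const CAP: usize = 4;

    /// Invariant relied on by unsafe code below: `len <= CAP`.
    pub struct Stack {
        data: [u8; CAP],
        len: usize, // private: code outside this module cannot modify it
    }

    impl Stack {
        pub fn new() -> Self {
            Stack { data: [0; CAP], len: 0 }
        }

        /// Pushes a byte; returns false (and changes nothing) when full.
        pub fn push(&mut self, byte: u8) -> bool {
            if self.len == CAP {
                return false;
            }
            self.data[self.len] = byte; // checked indexing
            self.len += 1; // still `<= CAP` because `len < CAP` held before
            true
        }

        /// Returns the most recently pushed byte, if any.
        pub fn last(&self) -> Option<u8> {
            if self.len == 0 {
                None
            } else {
                // SAFETY: `0 < len <= CAP` by the module invariant,
                // so `len - 1` is in bounds for `data`.
                unsafe { Some(*self.data.get_unchecked(self.len - 1)) }
            }
        }
    }
}
```

Callers outside the module can only go through new, push, and last, so auditing the unsafe block means auditing this one module rather than every user of Stack. A method like make_room added inside the module could still break the invariant, which is why the module, not the unsafe block, is the unit to review.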

We have already seen that Unsafe code must trust some Safe code, but shouldn’t trust generic Safe code. Privacy is important to unsafe code for similar reasons: it saves us from having to trust all the safe code in the universe not to mess with our trusted state.

Safety lives!