Concurrency

Concurrency happens whenever different parts of your program might execute at different times or out of order. In an embedded context, this includes:

  • interrupt handlers, which run whenever the associated interrupt happens,
  • various forms of multithreading, where your microprocessor regularly swaps between parts of your program,
  • and in some systems, multiple-core microprocessors, where each core can be independently running a different part of your program at the same time.

Since many embedded programs need to deal with interrupts, concurrency will usually come up sooner or later, and it's also where many subtle and difficult bugs can occur. Luckily, Rust provides a number of abstractions and safety guarantees to help us write correct code.

No Concurrency

The simplest concurrency for an embedded program is no concurrency: your software consists of a single main loop which just keeps running, and there are no interrupts at all. Sometimes this is perfectly suited to the problem at hand! Typically your loop will read some inputs, perform some processing, and write some outputs.

```rust
#[entry]
fn main() {
    let peripherals = setup_peripherals();
    loop {
        let inputs = read_inputs(&peripherals);
        let outputs = process(inputs);
        write_outputs(&peripherals, outputs);
    }
}
```

Since there's no concurrency, there's no need to worry about sharing data between parts of your program or synchronising access to peripherals. If you can get away with such a simple approach this can be a great solution.

Global Mutable Data

Unlike non-embedded Rust, we will not usually have the luxury of creating heap allocations and passing references to that data into a newly-created thread. Instead, our interrupt handlers might be called at any time and must know how to access whatever shared memory we are using. At the lowest level, this means we must have statically allocated mutable memory, which both the interrupt handler and the main code can refer to.

In Rust, such static mut variables are always unsafe to read or write, because without taking special care, you might trigger a race condition, where your access to the variable is interrupted halfway through by an interrupt which also accesses that variable.

For an example of how this behaviour can cause subtle errors in your code, consider an embedded program which counts rising edges of some input signal in each one-second period (a frequency counter):

```rust
static mut COUNTER: u32 = 0;

#[entry]
fn main() -> ! {
    set_timer_1hz();
    let mut last_state = false;
    loop {
        let state = read_signal_level();
        if state && !last_state {
            // DANGER - Not actually safe! Could cause data races.
            unsafe { COUNTER += 1 };
        }
        last_state = state;
    }
}

#[interrupt]
fn timer() {
    unsafe { COUNTER = 0; }
}
```

Each second, the timer interrupt sets the counter back to 0. Meanwhile, the main loop continually measures the signal, and increments the counter when it sees a change from low to high. We've had to use unsafe to access COUNTER, as it's static mut, and that means we're promising the compiler we won't cause any undefined behaviour. Can you spot the race condition? The increment on COUNTER is not guaranteed to be atomic; in fact, on most embedded platforms, it will be split into a load, then the increment, then a store. If the interrupt fired after the load but before the store, the reset back to 0 would be ignored after the interrupt returns, and we would count twice as many transitions for that period.
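The lost-update failure mode can be reproduced on a host machine. The sketch below is an illustration rather than embedded code: it uses std threads in place of an interrupt, and splits the read-modify-write into an explicit load and store to stand in for the load/increment/store sequence the compiler emits. Updates performed between another thread's load and store are lost:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Emulate a non-atomic `COUNTER += 1` by splitting it into an
// explicit load, add, and store. The pair is NOT atomic, so an
// update made by another thread in between is overwritten.
fn racy_increments(counter: &AtomicUsize, n: usize) {
    for _ in 0..n {
        let value = counter.load(Ordering::Relaxed);
        counter.store(value + 1, Ordering::Relaxed);
    }
}

fn main() {
    let counter = AtomicUsize::new(0);
    thread::scope(|s| {
        s.spawn(|| racy_increments(&counter, 100_000));
        s.spawn(|| racy_increments(&counter, 100_000));
    });
    // Updates are routinely lost: the total is usually well below 200000.
    println!("counted {} of 200000 increments", counter.load(Ordering::Relaxed));
}
```

With a single thread the function behaves correctly; only concurrent access exposes the race, which is exactly why these bugs are so easy to miss.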

Critical Sections

So, what can we do about data races? A simple approach is to use critical sections, a context where interrupts are disabled. By wrapping the access to COUNTER in main in a critical section, we can be sure the timer interrupt will not fire until we're finished incrementing COUNTER:

```rust
static mut COUNTER: u32 = 0;

#[entry]
fn main() -> ! {
    set_timer_1hz();
    let mut last_state = false;
    loop {
        let state = read_signal_level();
        if state && !last_state {
            // New critical section ensures synchronised access to COUNTER
            cortex_m::interrupt::free(|_| {
                unsafe { COUNTER += 1 };
            });
        }
        last_state = state;
    }
}

#[interrupt]
fn timer() {
    unsafe { COUNTER = 0; }
}
```

In this example we use cortex_m::interrupt::free, but other platforms will have similar mechanisms for executing code in a critical section. This is the same as disabling interrupts, running some code, and then re-enabling interrupts.

Note that we didn't need to put a critical section inside the timer interrupt, for two reasons:

  • Writing 0 to COUNTER can't be affected by a race since we don't read it
  • It will never be interrupted by the main thread anyway

If COUNTER was being shared by multiple interrupt handlers that might preempt each other, then each one might require a critical section as well.

This solves our immediate problem, but we're still left writing a lot of unsafe code which we need to carefully reason about, and we might be using critical sections needlessly, which costs overhead and adds interrupt latency and jitter.

It's worth noting that while a critical section guarantees no interrupts will fire, it does not provide an exclusivity guarantee on multi-core systems! The other core could be happily accessing the same memory as your core, even without interrupts. You will need stronger synchronisation primitives if you are using multiple cores.

Atomic Access

On some platforms, atomic instructions are available, which provide guarantees about read-modify-write operations. Specifically for Cortex-M, thumbv6 (Cortex-M0) does not provide atomic instructions, while thumbv7 (Cortex-M3 and above) does. These instructions give an alternative to the heavy-handed disabling of all interrupts: we can attempt the increment, and it will succeed most of the time, but if it was interrupted it will automatically retry the entire increment operation. These atomic operations are safe even across multiple cores.

```rust
use core::sync::atomic::{AtomicUsize, Ordering};

static COUNTER: AtomicUsize = AtomicUsize::new(0);

#[entry]
fn main() -> ! {
    set_timer_1hz();
    let mut last_state = false;
    loop {
        let state = read_signal_level();
        if state && !last_state {
            // Use `fetch_add` to atomically add 1 to COUNTER
            COUNTER.fetch_add(1, Ordering::Relaxed);
        }
        last_state = state;
    }
}

#[interrupt]
fn timer() {
    // Use `store` to write 0 directly to COUNTER
    COUNTER.store(0, Ordering::Relaxed)
}
```

This time COUNTER is a safe static variable. Thanks to the AtomicUsize type, COUNTER can be safely modified from both the interrupt handler and the main thread without disabling interrupts. When possible, this is a better solution, but it may not be supported on your platform.

A note on Ordering: this affects how the compiler and hardware may reorder instructions, and also has consequences on cache visibility. Assuming that the target is a single-core platform, Relaxed is sufficient and the most efficient choice in this particular case. Stricter ordering will cause the compiler to emit memory barriers around the atomic operations; depending on what you're using atomics for, you may or may not need this! The precise details of the atomic model are complicated and best described elsewhere.
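As a brief host-side illustration of when Relaxed is not enough (hypothetical names, not embedded code): if one thread writes some data and then sets a "ready" flag, the flag needs Release/Acquire ordering so that a thread which observes the flag is also guaranteed to observe the data written before it:

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::thread;

static DATA: AtomicU32 = AtomicU32::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn producer() {
    DATA.store(42, Ordering::Relaxed);
    // Release: all writes before this store become visible to any
    // thread that observes READY == true with an Acquire load.
    READY.store(true, Ordering::Release);
}

fn consumer() -> u32 {
    // Acquire pairs with the Release store above.
    while !READY.load(Ordering::Acquire) {}
    // Guaranteed to see 42; with Relaxed on the flag, the hardware or
    // compiler could legally reorder things so that we might not.
    DATA.load(Ordering::Relaxed)
}

fn main() {
    let t = thread::spawn(producer);
    let value = consumer();
    t.join().unwrap();
    println!("consumer saw {value}");
}
```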

For more details on atomics and ordering, see the nomicon.

Abstractions, Send, and Sync

None of the above solutions are especially satisfactory. They require unsafe blocks which must be very carefully checked and are not ergonomic. Surely we can do better in Rust!

We can abstract our counter into a safe interface which can be used safely anywhere else in our code. For this example we'll use the critical-section counter, but you could do something very similar with atomics.

```rust
use core::cell::UnsafeCell;
use cortex_m::interrupt;

// Our counter is just a wrapper around UnsafeCell<u32>, which is the heart
// of interior mutability in Rust. By using interior mutability, we can have
// COUNTER be `static` instead of `static mut`, but still able to mutate
// its counter value.
struct CSCounter(UnsafeCell<u32>);

const CS_COUNTER_INIT: CSCounter = CSCounter(UnsafeCell::new(0));

impl CSCounter {
    pub fn reset(&self, _cs: &interrupt::CriticalSection) {
        // By requiring a CriticalSection be passed in, we know we must
        // be operating inside a CriticalSection, and so can confidently
        // use this unsafe block (required to call UnsafeCell::get).
        unsafe { *self.0.get() = 0 };
    }

    pub fn increment(&self, _cs: &interrupt::CriticalSection) {
        unsafe { *self.0.get() += 1 };
    }
}

// Required to allow static CSCounter. See explanation below.
unsafe impl Sync for CSCounter {}

// COUNTER is no longer `mut` as it uses interior mutability;
// therefore it also no longer requires unsafe blocks to access.
static COUNTER: CSCounter = CS_COUNTER_INIT;

#[entry]
fn main() -> ! {
    set_timer_1hz();
    let mut last_state = false;
    loop {
        let state = read_signal_level();
        if state && !last_state {
            // No unsafe here!
            interrupt::free(|cs| COUNTER.increment(cs));
        }
        last_state = state;
    }
}

#[interrupt]
fn timer() {
    // We do need to enter a critical section here just to obtain a valid
    // cs token, even though we know no other interrupt could pre-empt
    // this one.
    interrupt::free(|cs| COUNTER.reset(cs));

    // We could use unsafe code to generate a fake CriticalSection if we
    // really wanted to, avoiding the overhead:
    // let cs = unsafe { interrupt::CriticalSection::new() };
}
```

We've moved our unsafe code to inside our carefully-planned abstraction, and now our application code does not contain any unsafe blocks.

This design requires that the application pass a CriticalSection token in: these tokens are only safely generated by interrupt::free, so by requiring one be passed in, we ensure we are operating inside a critical section, without having to actually do the lock ourselves. This guarantee is provided statically by the compiler: there won't be any runtime overhead associated with cs. If we had multiple counters, they could all be given the same cs, without requiring multiple nested critical sections.

This also brings up an important topic for concurrency in Rust: the Send and Sync traits. To summarise the Rust book, a type is Send when it can safely be moved to another thread, while it is Sync when it can be safely shared between multiple threads. In an embedded context, we consider interrupts to be executing in a separate thread to the application code, so variables accessed by both an interrupt and the main code must be Sync.

For most types in Rust, both of these traits are automatically derived for you by the compiler. However, because CSCounter contains an UnsafeCell, it is not Sync, and therefore we could not make a static CSCounter: static variables must be Sync, since they can be accessed by multiple threads.

To tell the compiler we have taken care that the CSCounter is in fact safe to share between threads, we implement the Sync trait explicitly. As with the previous use of critical sections, this is only safe on single-core platforms: with multiple cores you would need to go to greater lengths to ensure safety.

Mutexes

We've created a useful abstraction specific to our counter problem, but there are many common abstractions used for concurrency.

One such synchronisation primitive is a mutex, short for mutual exclusion. These constructs ensure exclusive access to a variable, such as our counter. A thread can attempt to lock (or acquire) the mutex, and either succeeds immediately, or blocks waiting for the lock to be acquired, or returns an error that the mutex could not be locked. While that thread holds the lock, it is granted access to the protected data. When the thread is done, it unlocks (or releases) the mutex, allowing another thread to lock it. In Rust, we would usually implement the unlock using the Drop trait to ensure it is always released when the mutex goes out of scope.

Using a mutex with interrupt handlers can be tricky: it is not normally acceptable for the interrupt handler to block, and it would be especially disastrous for it to block waiting for the main thread to release a lock, since we would then deadlock (the main thread will never release the lock because execution stays in the interrupt handler). Deadlocking is not considered unsafe: it is possible even in safe Rust.

To avoid this behaviour entirely, we could implement a mutex which requires a critical section to lock, just like our counter example. So long as the critical section lasts as long as the lock, we can be sure we have exclusive access to the wrapped variable without even needing to track the lock/unlock state of the mutex.
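To sketch the idea (with hypothetical names CsToken and CsMutex; the real implementation lives in the cortex_m crate), such a mutex only needs proof that we are inside a critical section, and stores no lock state at all:

```rust
use core::cell::UnsafeCell;

// Hypothetical token proving we are inside a critical section. On real
// hardware it would only be handed out while interrupts are disabled,
// which is why constructing one directly is unsafe.
pub struct CsToken(());

impl CsToken {
    pub unsafe fn new() -> Self {
        CsToken(())
    }
}

pub struct CsMutex<T> {
    inner: UnsafeCell<T>,
}

impl<T> CsMutex<T> {
    pub const fn new(value: T) -> Self {
        CsMutex { inner: UnsafeCell::new(value) }
    }

    // Borrowing requires a token, so access is only possible inside a
    // critical section; the borrow cannot outlive the token's lifetime,
    // and no lock/unlock state needs tracking at runtime.
    pub fn borrow<'cs>(&'cs self, _cs: &'cs CsToken) -> &'cs T {
        unsafe { &*self.inner.get() }
    }
}

// Safe only because access requires a critical section, and only on
// single-core systems (a second core ignores our interrupt masking).
unsafe impl<T: Send> Sync for CsMutex<T> {}

fn main() {
    static M: CsMutex<u32> = CsMutex::new(7);
    // In real code the token would come from something like interrupt::free.
    let cs = unsafe { CsToken::new() };
    println!("value = {}", M.borrow(&cs));
}
```

This is a simplified sketch under those assumptions, not the actual cortex_m source; the real Mutex also integrates with RefCell and Cell as shown below.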

This is in fact done for us in the cortex_m crate! We could have written our counter using it:

```rust
use core::cell::Cell;
use cortex_m::interrupt::{self, Mutex};

static COUNTER: Mutex<Cell<u32>> = Mutex::new(Cell::new(0));

#[entry]
fn main() -> ! {
    set_timer_1hz();
    let mut last_state = false;
    loop {
        let state = read_signal_level();
        if state && !last_state {
            interrupt::free(|cs|
                COUNTER.borrow(cs).set(COUNTER.borrow(cs).get() + 1));
        }
        last_state = state;
    }
}

#[interrupt]
fn timer() {
    // We still need to enter a critical section here to satisfy the Mutex.
    interrupt::free(|cs| COUNTER.borrow(cs).set(0));
}
```

We're now using Cell, which along with its sibling RefCell is used to provide safe interior mutability. We've already seen UnsafeCell, which is the bottom layer of interior mutability in Rust: it allows you to obtain multiple mutable references to its value, but only with unsafe code. A Cell is like an UnsafeCell but it provides a safe interface: it only permits taking a copy of the current value or replacing it, not taking a reference, and since it is not Sync, it cannot be shared between threads. These constraints mean it's safe to use, but we couldn't use it directly in a static variable as a static must be Sync.
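A quick host-side illustration of the Cell interface described above: values are only ever copied in and out, and no reference to the contents is handed out, so aliased handles cannot conflict:

```rust
use std::cell::Cell;

fn main() {
    let counter = Cell::new(0u32);
    // Two shared handles to the same Cell: fine, because Cell only
    // lets us copy values in (set) and out (get), never borrow them.
    let a = &counter;
    let b = &counter;
    a.set(a.get() + 1);
    b.set(b.get() + 1);
    assert_eq!(counter.get(), 2);
    println!("counter = {}", counter.get());
}
```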

So why does the example above work? The Mutex<T> implements Sync for any T which is Send, such as a Cell. It can do this safely because it only gives access to its contents during a critical section. We're therefore able to get a safe counter with no unsafe code at all!

This is great for simple types like the u32 of our counter, but what about more complex types which are not Copy? An extremely common example in an embedded context is a peripheral struct, which generally is not Copy. For that we can turn to RefCell.

Sharing Peripherals

Device crates generated using svd2rust and similar abstractions provide safe access to peripherals by enforcing that only one instance of the peripheral struct can exist at a time. This ensures safety, but makes it difficult to access a peripheral from both the main thread and an interrupt handler.

To safely share peripheral access, we can use the Mutex we saw before. We'll also need to use RefCell, which uses a runtime check to ensure only one reference to a peripheral is given out at a time. This has more overhead than the plain Cell, but since we are giving out references rather than copies, we must be sure only one exists at a time.

Finally, we'll also have to account for somehow moving the peripheral into the shared variable after it has been initialised in the main code. To do this we can use the Option type, initialised to None and later set to the instance of the peripheral.

```rust
use core::cell::RefCell;
use cortex_m::interrupt::{self, Mutex};
use stm32f4::stm32f405;

static MY_GPIO: Mutex<RefCell<Option<stm32f405::GPIOA>>> =
    Mutex::new(RefCell::new(None));

#[entry]
fn main() -> ! {
    // Obtain the peripheral singletons and configure it.
    // This example is from an svd2rust-generated crate, but
    // most embedded device crates will be similar.
    let dp = stm32f405::Peripherals::take().unwrap();
    let gpioa = &dp.GPIOA;

    // Some sort of configuration function.
    // Assume it sets PA0 to an input and PA1 to an output.
    configure_gpio(gpioa);

    // Store the GPIOA in the mutex, moving it.
    interrupt::free(|cs| MY_GPIO.borrow(cs).replace(Some(dp.GPIOA)));
    // We can no longer use `gpioa` or `dp.GPIOA`, and instead have to
    // access it via the mutex.

    // Be careful to enable the interrupt only after setting MY_GPIO:
    // otherwise the interrupt might fire while it still contains None,
    // and as-written (with `unwrap()`), it would panic.
    set_timer_1hz();
    let mut last_state = false;
    loop {
        // We'll now read state as a digital input, via the mutex
        let state = interrupt::free(|cs| {
            let gpioa = MY_GPIO.borrow(cs).borrow();
            gpioa.as_ref().unwrap().idr.read().idr0().bit_is_set()
        });

        if state && !last_state {
            // Set PA1 high if we've seen a rising edge on PA0.
            interrupt::free(|cs| {
                let gpioa = MY_GPIO.borrow(cs).borrow();
                gpioa.as_ref().unwrap().odr.modify(|_, w| w.odr1().set_bit());
            });
        }
        last_state = state;
    }
}

#[interrupt]
fn timer() {
    // This time in the interrupt we'll just clear PA0.
    interrupt::free(|cs| {
        // We can use `unwrap()` because we know the interrupt wasn't enabled
        // until after MY_GPIO was set; otherwise we should handle the potential
        // for a None value.
        let gpioa = MY_GPIO.borrow(cs).borrow();
        gpioa.as_ref().unwrap().odr.modify(|_, w| w.odr1().clear_bit());
    });
}
```

That's quite a lot to take in, so let's break down the important lines.

```rust
static MY_GPIO: Mutex<RefCell<Option<stm32f405::GPIOA>>> =
    Mutex::new(RefCell::new(None));
```

Our shared variable is now a Mutex around a RefCell which contains an Option. The Mutex ensures we only have access during a critical section, and therefore makes the variable Sync, even though a plain RefCell would not be Sync. The RefCell gives us interior mutability with references, which we'll need to use our GPIOA. The Option lets us initialise this variable to something empty, and only later actually move the variable in. We cannot access the peripheral singleton statically, only at runtime, so this is required.

```rust
interrupt::free(|cs| MY_GPIO.borrow(cs).replace(Some(dp.GPIOA)));
```

Inside a critical section we can call borrow() on the mutex, which gives us a reference to the RefCell. We then call replace() to move our new value into the RefCell.

```rust
interrupt::free(|cs| {
    let gpioa = MY_GPIO.borrow(cs).borrow();
    gpioa.as_ref().unwrap().odr.modify(|_, w| w.odr1().set_bit());
});
```

Finally we use MY_GPIO in a safe and concurrent fashion. The critical section prevents the interrupt firing as usual, and lets us borrow the mutex. The RefCell then gives us an &Option<GPIOA>, and tracks how long it remains borrowed: once that reference goes out of scope, the RefCell will be updated to indicate it is no longer borrowed.

Since we can't move the GPIOA out of the &Option, we need to convert it to an &Option<&GPIOA> with as_ref(), which we can finally unwrap() to obtain the &GPIOA which lets us modify the peripheral.
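The runtime borrow tracking described above can be observed directly on a host machine. This small illustration (plain std Rust, not embedded code) shows a second, conflicting borrow being refused while the first is alive, and succeeding once it is released:

```rust
use std::cell::RefCell;

fn main() {
    let cell = RefCell::new(5u32);
    {
        let first = cell.borrow();
        // While `first` is alive, a mutable borrow is refused at
        // runtime rather than allowing aliased mutable access.
        assert!(cell.try_borrow_mut().is_err());
        assert_eq!(*first, 5);
    } // `first` goes out of scope: the borrow count returns to zero.

    // Now a mutable borrow succeeds.
    *cell.borrow_mut() += 1;
    assert_eq!(*cell.borrow(), 6);
    println!("final value = {}", cell.borrow());
}
```

Using borrow() or borrow_mut() in place of the try_ variants would panic instead of returning an error, which is why the peripheral examples are careful about when borrows are taken.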

If we need a mutable reference to a shared resource, then borrow_mut and deref_mut should be used instead. The following code shows an example using the TIM2 timer.

```rust
use core::cell::RefCell;
use core::ops::DerefMut;
use cortex_m::interrupt::{self, Mutex};
use cortex_m::asm::wfi;
use stm32f4::stm32f405;

static G_TIM: Mutex<RefCell<Option<Timer<stm32f405::TIM2>>>> =
    Mutex::new(RefCell::new(None));

#[entry]
fn main() -> ! {
    let mut cp = cortex_m::Peripherals::take().unwrap();
    let dp = stm32f405::Peripherals::take().unwrap();

    // Some sort of timer configuration function.
    // Assume it configures the TIM2 timer, its NVIC interrupt,
    // and finally starts the timer.
    let tim = configure_timer_interrupt(&mut cp, dp);

    interrupt::free(|cs| {
        G_TIM.borrow(cs).replace(Some(tim));
    });

    loop {
        wfi();
    }
}

#[interrupt]
fn timer() {
    interrupt::free(|cs| {
        if let Some(ref mut tim) = G_TIM.borrow(cs).borrow_mut().deref_mut() {
            tim.start(1.hz());
        }
    });
}
```

NOTE

At the moment, the cortex-m crate hides const versions of some functions (including Mutex::new()) behind the const-fn feature. So you need to add the const-fn feature to the cortex-m dependency in your Cargo.toml to make the above examples work:

```toml
[dependencies.cortex-m]
version = "0.6.0"
features = ["const-fn"]
```

Meanwhile, const-fn has been working on stable Rust for some time now. So this additional switch in Cargo.toml will not be needed as soon as it is enabled in cortex-m by default.

Whew! This is safe, but it is also a little unwieldy. Is there anything else we can do?

RTFM

One alternative is the RTFM framework, short for Real Time For the Masses. It enforces static priorities and tracks accesses to static mut variables ("resources") to statically ensure that shared resources are always accessed safely, without requiring the overhead of always entering critical sections and using reference counting (as in RefCell). This has a number of advantages such as guaranteeing no deadlocks and giving extremely low time and memory overhead.

The framework also includes other features like message passing, which reduces the need for explicit shared state, and the ability to schedule tasks to run at a given time, which can be used to implement periodic tasks. Check out the documentation for more information!

Real Time Operating Systems

Another common model for embedded concurrency is the real-time operating system (RTOS). While currently less well explored in Rust, they are widely used in traditional embedded development. Open source examples include FreeRTOS and ChibiOS. These RTOSs provide support for running multiple application threads which the CPU swaps between, either when the threads yield control (called cooperative multitasking) or based on a regular timer or interrupts (preemptive multitasking). The RTOS typically provides mutexes and other synchronisation primitives, and often interoperates with hardware features such as DMA engines.

At the time of writing there are not many Rust RTOS examples to point to, but it's an interesting area, so watch this space!

Multiple Cores

It is becoming more common to have two or more cores in embedded processors, which adds an extra layer of complexity to concurrency. All the examples using a critical section (including the cortex_m::interrupt::Mutex) assume the only other execution thread is the interrupt thread, but on a multi-core system that's no longer true. Instead, we'll need synchronisation primitives designed for multiple cores (also called SMP, for symmetric multi-processing).

These typically use the atomic instructions we saw earlier, since the processing system will ensure that atomicity is maintained over all cores.
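As a simplified, host-runnable sketch of such a primitive (hypothetical code, not a production implementation: real spinlocks also consider fairness, power, and priority inversion), here is a minimal spinlock built on AtomicBool::compare_exchange, using the Drop-based unlock mentioned earlier so the lock is released when the guard goes out of scope:

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

// Minimal spinlock: compare_exchange atomically claims the lock, so it
// remains correct even when cores genuinely run in parallel.
pub struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// Safe to share between threads because access to `value` is gated
// by `locked`.
unsafe impl<T: Send> Sync for SpinLock<T> {}

pub struct Guard<'a, T> {
    lock: &'a SpinLock<T>,
}

impl<T> SpinLock<T> {
    pub const fn new(value: T) -> Self {
        SpinLock { locked: AtomicBool::new(false), value: UnsafeCell::new(value) }
    }

    pub fn lock(&self) -> Guard<'_, T> {
        // Spin until we atomically flip `locked` from false to true.
        while self
            .locked
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {}
        Guard { lock: self }
    }
}

impl<T> std::ops::Deref for Guard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { &*self.lock.value.get() }
    }
}

impl<T> std::ops::DerefMut for Guard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        unsafe { &mut *self.lock.value.get() }
    }
}

impl<T> Drop for Guard<'_, T> {
    // Dropping the guard releases the lock, so it can never be forgotten.
    fn drop(&mut self) {
        self.lock.locked.store(false, Ordering::Release);
    }
}

fn main() {
    static COUNTER: SpinLock<u32> = SpinLock::new(0);
    thread::scope(|s| {
        for _ in 0..2 {
            s.spawn(|| {
                for _ in 0..10_000 {
                    *COUNTER.lock() += 1;
                }
            });
        }
    });
    println!("total = {}", *COUNTER.lock());
}
```

Note that on a single-core system with interrupts, a spinlock like this can deadlock if an interrupt spins while the main thread holds the lock, which is one reason the critical-section approach is preferred there.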

Covering these topics in detail is currently beyond the scope of this book, but the general patterns are the same as for the single-core case.