Getting Ready to Rust
Before we can start running Rust code, we need to do some initialisation.
    .section .init.entry, "ax"
    .global entry
    entry:
        /*
         * Load and apply the memory management configuration, ready to
         * enable MMU and caches.
         */
        adrp x30, idmap
        msr ttbr0_el1, x30

        mov_i x30, .Lmairval
        msr mair_el1, x30

        mov_i x30, .Ltcrval
        /* Copy the supported PA range into TCR_EL1.IPS. */
        mrs x29, id_aa64mmfr0_el1
        bfi x30, x29, #32, #4

        msr tcr_el1, x30

        mov_i x30, .Lsctlrval

        /*
         * Ensure everything before this point has completed, then
         * invalidate any potentially stale local TLB entries before they
         * start being used.
         */
        isb
        tlbi vmalle1
        ic iallu
        dsb nsh
        isb

        /*
         * Configure sctlr_el1 to enable MMU and cache and don't proceed
         * until this has completed.
         */
        msr sctlr_el1, x30
        isb

        /* Disable trapping floating point access in EL1. */
        mrs x30, cpacr_el1
        orr x30, x30, #(0x3 << 20)
        msr cpacr_el1, x30
        isb

        /* Zero out the bss section. */
        adr_l x29, bss_begin
        adr_l x30, bss_end
    0:  cmp x29, x30
        b.hs 1f
        stp xzr, xzr, [x29], #16
        b 0b

    1:  /* Prepare the stack. */
        adr_l x30, boot_stack_end
        mov sp, x30

        /* Set up exception vector. */
        adr x30, vector_table_el1
        msr vbar_el1, x30

        /* Call into Rust code. */
        bl main

        /* Loop forever waiting for interrupts. */
    2:  wfi
        b 2b
- This is the same as it would be for C: initialising the processor state, zeroing the BSS, and setting up the stack pointer.
- The BSS (block starting symbol, for historical reasons) is the part of the object file that contains statically allocated variables which are initialised to zero. They are omitted from the image, to avoid wasting space on zeroes. The compiler assumes that the loader will take care of zeroing them.
- The BSS may already be zeroed, depending on how memory is initialised and the image is loaded, but we zero it to be sure.
- We need to enable the MMU and cache before reading or writing any memory. If we don’t:
  - Unaligned accesses will fault. We build the Rust code for the `aarch64-unknown-none` target, which sets `+strict-align` to prevent the compiler from generating unaligned accesses, so it should be fine in this case, but this is not necessarily the case in general.
  - If the code is running in a VM, this can lead to cache coherency issues. The problem is that the VM is accessing memory directly with the cache disabled, while the host has cacheable aliases to the same memory. Even if the host doesn’t explicitly access the memory, speculative accesses can lead to cache fills, and then changes from one or the other will get lost when the cache is cleaned or the VM enables the cache. (Cache is keyed by physical address, not VA or IPA.)
- For simplicity, we just use a hardcoded pagetable (see `idmap.S`) which identity maps the first 1 GiB of address space for devices, the next 1 GiB for DRAM, and another 1 GiB higher up for more devices. This matches the memory layout that QEMU uses.
- We also set up the exception vector (`vbar_el1`), which we’ll see more about later.
- All examples this afternoon assume we will be running at exception level 1 (EL1). If you need to run at a different exception level you’ll need to modify `entry.S` accordingly.
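
To make the BSS zeroing step more concrete, here is a minimal host-side sketch in Rust of what the `0:`/`1:` loop in the assembly does. It is a model for illustration, not code from the example project: the function name `zero_bss` and the byte-slice "memory" are assumptions, and the `begin`/`end` offsets merely stand in for the `bss_begin` and `bss_end` symbols.

```rust
/// Host-side model of the BSS zeroing loop. `mem` is a hypothetical stand-in
/// for RAM; `begin` and `end` play the role of `bss_begin` and `bss_end`.
fn zero_bss(mem: &mut [u8], begin: usize, end: usize) {
    let mut cursor = begin; // x29
    // `cmp x29, x30; b.hs 1f`: leave the loop once cursor >= end.
    while cursor < end {
        // `stp xzr, xzr, [x29], #16`: store two 64-bit zeroes (16 bytes),
        // then advance the pointer by 16.
        mem[cursor..cursor + 16].fill(0);
        cursor += 16;
    }
}

fn main() {
    // Fill the fake memory with a marker so we can see what was cleared.
    let mut mem = [0xAAu8; 64];
    zero_bss(&mut mem, 16, 48);

    assert!(mem[..16].iter().all(|&b| b == 0xAA)); // before the region: untouched
    assert!(mem[16..48].iter().all(|&b| b == 0)); // the region itself: zeroed
    assert!(mem[48..].iter().all(|&b| b == 0xAA)); // after the region: untouched
}
```

Because each iteration writes 16 bytes, a region whose size is not a multiple of 16 would be overshot by this loop. The sketch therefore assumes, as the assembly appears to, that the linker script keeps both ends of the BSS suitably aligned.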
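
Two of the register updates in the assembly rely on bit-field arithmetic that is easy to misread: `bfi x30, x29, #32, #4` copies the low four bits of `ID_AA64MMFR0_EL1` (the PARange field) into bits 32 and up of the `TCR_EL1` value, and `orr x30, x30, #(0x3 << 20)` sets bits 20 and 21 of `CPACR_EL1` (the FPEN field). The sketch below models both on plain `u64` values so they can be checked on a host machine. The function names and the sample register values are hypothetical, chosen only for illustration.

```rust
/// Model of `bfi xd, xn, #lsb, #width`: replace `width` bits of `dst`,
/// starting at bit `lsb`, with the low `width` bits of `src`.
/// This sketch only handles widths from 1 to 63.
fn bfi(dst: u64, src: u64, lsb: u32, width: u32) -> u64 {
    let mask = ((1u64 << width) - 1) << lsb;
    (dst & !mask) | ((src << lsb) & mask)
}

/// Model of `orr x30, x30, #(0x3 << 20)` applied to a CPACR_EL1 value.
fn enable_fp(cpacr: u64) -> u64 {
    cpacr | (0x3 << 20)
}

fn main() {
    // Hypothetical values: `tcr_base` stands in for `.Ltcrval`, and `mmfr0`
    // for whatever `mrs x29, id_aa64mmfr0_el1` returns on a given CPU.
    let tcr_base: u64 = 0x0000_0000_0000_0010;
    let mmfr0: u64 = 0x0000_0000_0000_1122; // low nibble (PARange) is 0x2

    // Only the low nibble of `mmfr0` lands in bits [35:32]; the rest of
    // `tcr_base` is preserved.
    assert_eq!(bfi(tcr_base, mmfr0, 32, 4), 0x0000_0002_0000_0010);

    // Bits 20 and 21 are set; other bits are left alone.
    assert_eq!(enable_fp(0), 0x30_0000);
    assert_eq!(enable_fp(0x1), 0x30_0001);
}
```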