Memory

The Zig language performs no memory management on behalf of the programmer. This is why Zig has no runtime, and why Zig code works seamlessly in so many environments, including real-time software, operating system kernels, embedded devices, and low latency servers. As a consequence, Zig programmers must always be able to answer the question:

Where are the bytes?

Like Zig, the C programming language has manual memory management. Unlike Zig, however, C has a default allocator: malloc, realloc, and free. When linking against libc, Zig exposes this allocator as std.heap.c_allocator. By convention, though, there is no default allocator in Zig. Instead, functions which need to allocate accept an *Allocator parameter. Likewise, data structures such as std.ArrayList accept an *Allocator parameter in their initialization functions:

allocator.zig

    const std = @import("std");
    const Allocator = std.mem.Allocator;
    const assert = std.debug.assert;

    test "using an allocator" {
        // 100 bytes of stack memory back every allocation in this test.
        var buffer: [100]u8 = undefined;
        const allocator = &std.heap.FixedBufferAllocator.init(&buffer).allocator;
        const result = try concat(allocator, "foo", "bar");
        assert(std.mem.eql(u8, "foobar", result));
    }

    // Returns a newly allocated slice holding `a` followed by `b`.
    fn concat(allocator: *Allocator, a: []const u8, b: []const u8) ![]u8 {
        const result = try allocator.alloc(u8, a.len + b.len);
        std.mem.copy(u8, result, a);
        std.mem.copy(u8, result[a.len..], b);
        return result;
    }

    $ zig test allocator.zig
    1/1 test "using an allocator"...OK
    All tests passed.

In the above example, 100 bytes of stack memory are used to initialize a FixedBufferAllocator, which is then passed to a function. As a convenience, there is a global FixedBufferAllocator available for quick tests at std.debug.global_allocator; however, it is deprecated and should be avoided in favor of directly using a FixedBufferAllocator as in the example above.

Currently Zig has no general purpose allocator, but there is one under active development. Once it is merged into the Zig standard library it will become available to import with std.heap.default_allocator. However, it will still be recommended to follow the Choosing an Allocator guide.

Choosing an Allocator

Which allocator to use depends on a number of factors. The following questions, taken in order, can help you decide:

  • Are you making a library? In this case, it is best to accept an *Allocator as a parameter and allow your library's users to decide what allocator to use.
  • Are you linking libc? In this case, std.heap.c_allocator is likely the right choice, at least for your main allocator.
  • Are you building for WebAssembly? In this case, std.heap.wasm_allocator is likely the right choice for your main allocator as it uses WebAssembly's memory instructions.
  • Is the maximum number of bytes that you will need bounded by a number known at comptime? In this case, use std.heap.FixedBufferAllocator or std.heap.ThreadSafeFixedBufferAllocator depending on whether you need thread-safety or not.
  • Is your program a command line application which runs from start to end without any fundamental cyclical pattern (such as a video game main loop, or a web server request handler), such that it would make sense to free everything at once at the end? In this case, it is recommended to follow this pattern:

cli_allocation.zig

    const std = @import("std");

    pub fn main() !void {
        var arena = std.heap.ArenaAllocator.init(std.heap.direct_allocator);
        defer arena.deinit();

        const allocator = &arena.allocator;

        const ptr = try allocator.create(i32);
        std.debug.warn("ptr={*}\n", ptr);
    }

    $ zig build-exe cli_allocation.zig
    $ ./cli_allocation
    ptr=i32@7f1f6daaa018

When using this kind of allocator, there is no need to free anything manually. Everything gets freed at once with the call to arena.deinit().

  • Are the allocations part of a cyclical pattern such as a video game main loop, or a web server request handler? If the allocations can all be freed at once, at the end of the cycle, for example once the video game frame has been fully rendered, or the web server request has been served, then std.heap.ArenaAllocator is a great candidate. As demonstrated in the previous bullet point, this allows you to free entire arenas at once. Note also that if an upper bound of memory can be established, then std.heap.FixedBufferAllocator can be used as a further optimization.
  • Are you writing a test, and you want to make sure error.OutOfMemory is handled correctly? In this case, use std.debug.FailingAllocator.
  • Finally, if none of the above apply, you need a general purpose allocator. Zig does not yet have a general purpose allocator in the standard library, but one is being actively developed. You can also consider Implementing an Allocator.
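The cyclical pattern described in the list above can be sketched as one arena per cycle. This is a minimal sketch, assuming the same standard library version as the other examples in this section (an *Allocator parameter, std.heap.direct_allocator as the backing allocator). The names renderFrame and runFrames are hypothetical and exist only for illustration.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;
const assert = std.debug.assert;

// Hypothetical per-cycle work: allocates scratch memory and never frees it
// individually. Returns the number of scratch bytes it used.
fn renderFrame(allocator: *Allocator, frame: usize) !usize {
    const scratch = try allocator.alloc(u8, 16 * (frame + 1));
    std.mem.set(u8, scratch, 0);
    return scratch.len;
}

// Runs `frame_count` cycles, each with its own arena, and returns the
// total number of scratch bytes used across all cycles.
fn runFrames(child: *Allocator, frame_count: usize) !usize {
    var total: usize = 0;
    var frame: usize = 0;
    while (frame < frame_count) : (frame += 1) {
        var arena = std.heap.ArenaAllocator.init(child);
        // Runs at the end of every iteration and frees everything
        // allocated during this cycle at once.
        defer arena.deinit();

        total += try renderFrame(&arena.allocator, frame);
    }
    return total;
}

test "one arena per cycle" {
    const total = try runFrames(std.heap.direct_allocator, 3);
    assert(total == 16 + 32 + 48);
}
```

Because the arena is created and destroyed inside the loop body, memory use is bounded by the most expensive single cycle rather than by the lifetime of the program.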

Where are the bytes?

String literals such as "foo" are in the global constant data section. This is why it is an error to pass a string literal where a mutable slice is expected, like this:

test.zig

    fn foo(s: []u8) void {}

    test "string literal to mutable slice" {
        foo("hello");
    }

    $ zig test test.zig
    /home/andy/dev/zig/docgen_tmp/test.zig:4:9: error: expected type '[]u8', found '[5]u8'
        foo("hello");
            ^
    /home/andy/dev/zig/docgen_tmp/test.zig:4:8: note: referenced here
        foo("hello");
           ^

However, if you make the slice constant, it works:

strlit.zig

    fn foo(s: []const u8) void {}

    test "string literal to constant slice" {
        foo("hello");
    }

    $ zig test strlit.zig
    1/1 test "string literal to constant slice"...OK
    All tests passed.

Just like string literals, const declarations whose values are known at comptime are stored in the global constant data section. Compile Time Variables are also stored in the global constant data section.

var declarations inside functions are stored in the function's stack frame. Once a function returns, any Pointers to variables in the function's stack frame become invalid references, and dereferencing them becomes unchecked Undefined Behavior.

var declarations at the top level or in struct declarations are stored in the global data section.

The location of memory allocated with allocator.alloc or allocator.create is determined by the allocator's implementation.
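The storage rules above can be gathered into one short sketch. It assumes the same standard library version as the other examples in this section. The names greeting, counter, and bump are hypothetical, and the comments restate the rules from the text rather than anything the compiler reports.

```zig
const std = @import("std");
const assert = std.debug.assert;

// const with a comptime-known value: global constant data section.
const greeting = "hello";

// Top-level var: global data section.
var counter: u32 = 0;

fn bump() u32 {
    // Local var: lives in bump's stack frame and is gone once bump returns.
    var step: u32 = 1;
    counter += step;
    // Returned by value, so no pointer into the stack frame escapes.
    return counter;
}

test "where are the bytes" {
    assert(greeting.len == 5);
    assert(bump() == 1);
    assert(bump() == 2);

    // The allocator's implementation decides where allocated memory lives.
    // Here it is carved out of `buffer`, which is on this test's stack.
    var buffer: [16]u8 = undefined;
    const allocator = &std.heap.FixedBufferAllocator.init(&buffer).allocator;
    const value = try allocator.create(u32);
    value.* = counter;
    assert(value.* == 2);
}
```

Returning &step from bump instead of the value would compile, but the pointer would become invalid as soon as the function returned.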

TODO: thread local variables

Implementing an Allocator

Zig programmers can implement their own allocators by fulfilling the Allocator interface. To do this, one must carefully read the documentation comments in std/mem.zig and then supply a reallocFn and a shrinkFn.

There are many example allocators to look at for inspiration. Look at std/heap.zig and at this work-in-progress general purpose allocator. TODO: once #21 is done, link to the docs here.
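As a rough illustration of supplying a reallocFn and a shrinkFn, the sketch below wraps another allocator and counts calls. CountingAllocator is a hypothetical name, and the parameter lists of the two functions are an assumption modeled on the allocators in std/heap.zig; they must be checked against the documentation comments in std/mem.zig for the Zig version in use.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;
const assert = std.debug.assert;

// Hypothetical wrapper: forwards every request to `child` and counts how
// many times its reallocFn is invoked.
const CountingAllocator = struct {
    allocator: Allocator,
    child: *Allocator,
    realloc_calls: usize,

    fn init(child: *Allocator) CountingAllocator {
        return CountingAllocator{
            .allocator = Allocator{
                .reallocFn = realloc,
                .shrinkFn = shrink,
            },
            .child = child,
            .realloc_calls = 0,
        };
    }

    // Assumed signature; verify it against std/mem.zig.
    fn realloc(allocator: *Allocator, old_mem: []u8, old_align: u29, new_size: usize, new_align: u29) ![]u8 {
        // Recover the wrapper from the pointer to its embedded interface.
        const self = @fieldParentPtr(CountingAllocator, "allocator", allocator);
        self.realloc_calls += 1;
        return self.child.reallocFn(self.child, old_mem, old_align, new_size, new_align);
    }

    // Assumed signature; shrinking is not allowed to fail.
    fn shrink(allocator: *Allocator, old_mem: []u8, old_align: u29, new_size: usize, new_align: u29) []u8 {
        const self = @fieldParentPtr(CountingAllocator, "allocator", allocator);
        return self.child.shrinkFn(self.child, old_mem, old_align, new_size, new_align);
    }
};

test "counting allocator forwards to its child" {
    var buffer: [64]u8 = undefined;
    var fixed = std.heap.FixedBufferAllocator.init(&buffer);
    var counting = CountingAllocator.init(&fixed.allocator);
    const allocator = &counting.allocator;

    const bytes = try allocator.alloc(u8, 8);
    defer allocator.free(bytes);

    assert(bytes.len == 8);
    assert(counting.realloc_calls > 0);
}
```

The wrapper must outlive every pointer to its allocator field, because @fieldParentPtr recovers the wrapper from that field's address.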

Heap Allocation Failure

Many programming languages choose to handle the possibility of heap allocation failure by unconditionally crashing. By convention, Zig programmers do not consider this to be a satisfactory solution. Instead, error.OutOfMemory represents heap allocation failure, and Zig libraries return this error code whenever heap allocation failure prevented an operation from completing successfully.
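A minimal sketch of this convention follows, assuming the same standard library version as the other examples in this section. The function zeroed is hypothetical. A deliberately small FixedBufferAllocator stands in for a system that has run out of memory.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;
const assert = std.debug.assert;

// Hypothetical library function: returns `count` zeroed bytes.
// Caller owns the returned memory.
fn zeroed(allocator: *Allocator, count: usize) ![]u8 {
    // On failure, error.OutOfMemory propagates to the caller.
    const bytes = try allocator.alloc(u8, count);
    std.mem.set(u8, bytes, 0);
    return bytes;
}

test "allocation failure is reported, not fatal" {
    var buffer: [8]u8 = undefined;
    const allocator = &std.heap.FixedBufferAllocator.init(&buffer).allocator;

    // The request exceeds the 8-byte buffer, so the caller receives an
    // error that it can handle.
    if (zeroed(allocator, 64)) |_| {
        unreachable;
    } else |err| {
        assert(err == error.OutOfMemory);
    }

    // The program keeps running, and a request that fits still succeeds.
    const small = try zeroed(allocator, 4);
    assert(small.len == 4);
}
```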

Some have argued that because some operating systems such as Linux have memory overcommit enabled by default, it is pointless to handle heap allocation failure. There are many problems with this reasoning:

  • Only some operating systems have an overcommit feature.
    • Linux has it enabled by default, but it is configurable.
    • Windows does not overcommit.
    • Embedded systems do not have overcommit.
    • Hobby operating systems may or may not have overcommit.
  • For real-time systems, not only is there no overcommit, but typically the maximum amount of memory per application is determined ahead of time.
  • When writing a library, one of the main goals is code reuse. By making code handle allocation failure correctly, a library becomes eligible to be reused in more contexts.
  • Although some software has grown to depend on overcommit being enabled, its existence is the source of countless user experience disasters. When a system with overcommit enabled, such as Linux on default settings, comes close to memory exhaustion, the system locks up and becomes unusable. At this point, the OOM Killer selects an application to kill based on heuristics. This non-deterministic decision often results in an important process being killed, and often fails to return the system to working order.

Recursion

Recursion is a fundamental tool in modeling software. However, it has an often-overlooked problem: unbounded memory allocation.

Recursion is an area of active experimentation in Zig and so the documentation here is not final. You can read a summary of recursion status in the 0.3.0 release notes.

The short summary is that currently recursion works normally as you would expect. Although Zig code is not yet protected from stack overflow, it is planned that a future version of Zig will provide such protection, with some degree of cooperation from Zig code required.

Lifetime and Ownership

It is the Zig programmer's responsibility to ensure that a pointer is not accessed when the memory pointed to is no longer available. Note that a slice is a form of pointer, in that it references other memory.

In order to prevent bugs, there are some helpful conventions to follow when dealing with pointers. In general, when a function returns a pointer, the documentation for the function should explain who "owns" the pointer. This concept helps the programmer decide when it is appropriate, if ever, to free the pointer.

For example, the function's documentation may say "caller owns the returned memory", in which case the code that calls the function must have a plan for when to free that memory. In this situation, the function will probably accept an *Allocator parameter.
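The "caller owns the returned memory" convention might look like the following sketch, which assumes the same standard library version as the other examples in this section. The function duplicate is hypothetical.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;
const assert = std.debug.assert;

// Returns a copy of `s`. Caller owns the returned memory.
fn duplicate(allocator: *Allocator, s: []const u8) ![]u8 {
    const copy = try allocator.alloc(u8, s.len);
    std.mem.copy(u8, copy, s);
    return copy;
}

test "caller owns the returned memory" {
    var buffer: [32]u8 = undefined;
    const allocator = &std.heap.FixedBufferAllocator.init(&buffer).allocator;

    const copy = try duplicate(allocator, "hello");
    // The caller's plan: release the memory when this scope exits, using
    // the same allocator that was passed to duplicate.
    defer allocator.free(copy);

    assert(std.mem.eql(u8, copy, "hello"));
}
```

Placing the defer directly after the call keeps the ownership decision next to the point where ownership was acquired.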

Sometimes the lifetime of a pointer may be more complicated. For example, when using std.ArrayList(T).toSlice(), the returned slice has a lifetime that remains valid until the next time the list is resized, such as by appending new elements.
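The ArrayList case can be sketched as follows, assuming the same standard library version as the other examples in this section. The rule it illustrates is to fetch the slice again after any operation that may resize the list, rather than holding on to an earlier one.

```zig
const std = @import("std");
const assert = std.debug.assert;

test "re-fetch the slice after the list may have been resized" {
    var buffer: [64]u8 = undefined;
    const allocator = &std.heap.FixedBufferAllocator.init(&buffer).allocator;

    var list = std.ArrayList(u8).init(allocator);
    defer list.deinit();

    try list.append('a');
    // Valid only until the next operation that may resize the list.
    const before = list.toSlice();
    assert(before[0] == 'a');

    // Appending may move the list's memory, so `before` must not be used
    // after this line.
    try list.append('b');

    // Fetch a fresh slice instead of reusing the old one.
    const after = list.toSlice();
    assert(after.len == 2);
    assert(after[1] == 'b');
}
```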

The API documentation for functions and data structures should take great care to explain the ownership and lifetime semantics of pointers. Ownership determines whose responsibility it is to free the memory referenced by the pointer, and lifetime determines the point at which the memory becomes inaccessible (lest Undefined Behavior occur).