Pointers


In C, lists cannot be explored without dealing properly with pointers. Pointers are a famously misunderstood aspect of C. They are difficult to teach because, while conceptually very simple, they come with a lot of new terminology and often no clear use-case. This makes them appear far more monstrous than they are. Luckily for us, we have a couple of ideal use-cases, both of which are extremely typical in C, and which will likely cover most of the ways you end up using pointers.

We need pointers in C because of how function calling works. When you call a function in C the arguments are always passed by value. This means a copy of each argument is passed to the function you call. This is true for int, long, char, and user-defined struct types such as lval. Most of the time this is great, but occasionally it can cause issues.
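To make pass-by-value concrete, here is a minimal sketch. The function names set_to_zero and pass_by_value_demo are made up for illustration and are not part of our program.

```c
/* Receives a copy of the caller's int. Changing the copy
   has no effect on the caller's variable. */
void set_to_zero(int x) {
  x = 0;
}

/* Calls set_to_zero and returns the caller's variable afterwards. */
int pass_by_value_demo(void) {
  int n = 5;
  set_to_zero(n);  /* only the copy inside set_to_zero becomes 0 */
  return n;        /* n is still 5 here */
}
```

Calling pass_by_value_demo() returns 5, because set_to_zero only ever saw its own copy of n.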

A common problem occurs when we have a large struct, containing many other sub-structs, that we wish to pass around. Every time we pass it to a function, another copy of it must be made. Suddenly the amount of data that needs to be copied around just to call a function can become huge!
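The cost of copying can be sketched as follows. The type big_t and both functions are hypothetical names chosen for this example. The second function takes a pointer, using the syntax explained later in this section.

```c
/* A deliberately large struct: 1024 doubles, typically 8192 bytes. */
typedef struct {
  double samples[1024];
} big_t;

/* Pass by value: the whole struct is copied for every call. */
double first_by_value(big_t b) {
  return b.samples[0];
}

/* Pass by pointer: only an address is copied, however big the struct is. */
double first_by_pointer(const big_t* b) {
  return b->samples[0];
}
```

Both functions return the same answer, but on a typical machine sizeof(big_t*) is only 4 or 8 bytes, while sizeof(big_t) is thousands.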

A second problem is this. When we define a struct, it is always a fixed size. It has a limited number of fields, and each field is itself a fixed size. If I want to call a function with just a list of things, where the number of things varies from call to call, clearly I can’t use a struct to do this.

To get around these issues the developers of C (or y’know…someone) came up with a clever idea. They imagined computer memory as a single huge list of bytes. In this list each byte can be given a global index, or position. A bit like a house number. The first byte is numbered 0, the second is 1, etc.

In this model every piece of data in the computer, including the structs and variables used in the currently running program, starts at some index in this huge list. If, rather than copying the data itself to a function, we instead copy a number representing the index where this data starts, the function being called can look up any amount of data it wants.
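One way to see that addresses behave like positions in a list of bytes is to measure the distance between two of them. This is only a sketch, and the name gap_between_adjacent_ints is invented for the example. It uses the & operator and pointer types, which are explained later in this section.

```c
#include <stddef.h>

/* Returns the distance, in bytes, between the starting positions
   of two adjacent ints in an array. */
ptrdiff_t gap_between_adjacent_ints(void) {
  int pair[2] = {10, 20};
  char* first = (char*)&pair[0];   /* where the first int starts */
  char* second = (char*)&pair[1];  /* where the second int starts */
  return second - first;           /* number of bytes between them */
}
```

The result equals sizeof(int), because the second int begins exactly where the first one ends.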

By using addresses instead of the actual data, we can allow a function to access and modify some location in memory without having to copy any data. Functions can also use pointers for other purposes, such as writing their output to an address given as input.
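Writing output to an address given as input might look like the sketch below. The function divide is a hypothetical example and not part of our program. The caller supplies the addresses of two variables, and the function fills them in.

```c
/* Divides a by b, writing the results to the addresses supplied
   by the caller. Returns 1 on success, or 0 if b is zero, in which
   case nothing is written. */
int divide(int a, int b, int* quotient, int* remainder) {
  if (b == 0) { return 0; }
  *quotient = a / b;
  *remainder = a % b;
  return 1;
}
```

A caller would declare int q, r; and then call divide(17, 5, &q, &r), after which q is 3 and r is 2. This is one way for a C function to hand back more than one result.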

Because the range of possible addresses on a given machine is fixed, the number of bytes needed to represent an address is always the same. But the amount of data the address points to can grow and shrink, as long as we keep track of its size. This means we can create a variable-sized data structure and still pass it to a function, which can inspect and modify it.
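A variable-sized list can be sketched with a pointer and a separate count. The functions append and sum are illustrative names only. The sketch assumes the list was allocated with the standard realloc function, or is NULL when empty.

```c
#include <stdlib.h>

/* Adds up count ints starting at the address items. */
long sum(const int* items, int count) {
  long total = 0;
  for (int i = 0; i < count; i++) {
    total += items[i];
  }
  return total;
}

/* Grows the list by one element and stores value at the end.
   Returns the (possibly moved) address of the list, or NULL if
   memory could not be allocated, leaving the old list untouched. */
int* append(int* items, int* count, int value) {
  int* grown = realloc(items, sizeof(int) * (size_t)(*count + 1));
  if (grown == NULL) { return NULL; }
  grown[*count] = value;
  (*count)++;
  return grown;
}
```

The pointer itself never changes size, but the count tells each function how much data lives at that address. The caller is responsible for calling free on the list when finished with it.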

So a pointer is just a number. A number representing the starting index of some data in memory. The type of the pointer hints to us, and the compiler, what type of data might be accessible at this location.

We can declare pointer types by suffixing existing types with the * character. We’ve seen some examples of this already with mpc_parser_t*, mpc_ast_t*, or char*.

To create a pointer to some data, we need to get its index, or address. To get the address of some data we use the address-of operator &. Again you’ve seen this before, when we passed a pointer to mpc_parse so that it would output into our mpc_result_t.
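Declaring a pointer and taking an address can be sketched as below. The function name address_of_demo is made up for this example. The address it prints will differ from machine to machine and from run to run.

```c
#include <stdio.h>

/* Takes the address of a local variable and stores it in a pointer.
   Returns 1 if the pointer holds the address of n. */
int address_of_demo(void) {
  int n = 42;
  int* p = &n;  /* p has type int* and holds the address of n */
  printf("n is stored at address %p\n", (void*)p);
  return p == &n;
}
```

Note that p is only valid while n exists, so the address of a local variable should not be used after the function returns.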

Finally, to get the data at an address, called dereferencing, we use the * operator in front of a pointer variable. To access a field of a struct through a pointer to it we use the arrow ->. This you saw in chapter 7.
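Both operators can be seen together in a small sketch. The type point_t and the functions increment and move_right are hypothetical names used only for illustration.

```c
typedef struct {
  int x;
  int y;
} point_t;

/* Dereferences n to read the caller's int and write a new value back. */
void increment(int* n) {
  *n = *n + 1;
}

/* Uses the arrow to reach a field through a pointer to a struct.
   Writing p->x is shorthand for (*p).x */
void move_right(point_t* p, int dx) {
  p->x += dx;
}
```

Because these functions receive addresses rather than copies, the changes they make are visible to the caller. After int n = 1; increment(&n); the value of n is 2.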