External Functions

The JIT provides a number of other, more advanced interfaces for things like freeing allocated machine code, re-JITing functions to update them, and so on. However, even with this simple code, we get some surprisingly powerful capabilities. Check this out:

    ready> extern sin(x);
    ; ModuleID = 'my cool jit'
    declare double @sin(double)
    ready> extern cos(x);
    ; ModuleID = 'my cool jit'
    declare double @sin(double)
    declare double @cos(double)
    ready> sin(1.0);
    ; ModuleID = 'my cool jit'
    declare double @sin(double)
    declare double @cos(double)
    define double @main() {
    entry:
      %0 = call double @sin(double 1.000000e+00)
      ret double %0
    }
    Evaluated to: 0.8414709848078965

Whoa, how does the JIT know about sin and cos? The answer is surprisingly simple: in this example, the JIT started execution of a function and got to a function call. It realized that the function was not yet JIT compiled and invoked the standard set of routines to resolve the function. In this case, there is no body defined for the function, so the JIT ended up calling dlsym("sin") on the Kaleidoscope process itself. Since “sin” is defined within the JIT’s address space, it simply patches up calls in the module to call the libm version of sin directly.
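To get a feel for what that fallback lookup does, here is a minimal Haskell sketch using the unix package’s System.Posix.DynamicLinker. This is illustrative only and not part of our JIT code; mkUnaryFn is just a hypothetical helper name, and the example assumes the running binary links libm, which a GHC-built executable normally does:

    {-# LANGUAGE ForeignFunctionInterface #-}
    import Foreign.Ptr (FunPtr)
    import System.Posix.DynamicLinker (DL (Default), dlsym)

    -- Turn a raw function pointer into a callable Haskell function.
    foreign import ccall "dynamic"
      mkUnaryFn :: FunPtr (Double -> Double) -> (Double -> Double)

    main :: IO ()
    main = do
      -- Look "sin" up among the symbols already loaded into this process,
      -- which is essentially what the JIT's dlsym fallback does.
      sinPtr <- dlsym Default "sin"
      print (mkUnaryFn sinPtr 1.0)  -- 0.8414709848078965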

The LLVM JIT provides a number of interfaces for controlling how unknown functions get resolved. It allows us to establish explicit mappings between IR objects and addresses (useful for LLVM global variables that we want to map to static tables, for example), to decide on the fly based on the function name, and even to JIT-compile functions lazily the first time they’re called.
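As a rough sketch of that resolution strategy (this is not the actual llvm-general API; resolveSymbol is a hypothetical helper), a resolver can consult an explicit name-to-address table first and fall back to a process-wide dlsym lookup only when the name is unknown:

    import qualified Data.Map as Map
    import Foreign.Ptr (FunPtr)
    import System.Posix.DynamicLinker (DL (Default), dlsym)

    -- Hypothetical resolver: prefer an explicit mapping from symbol names
    -- to addresses, otherwise ask the process's dynamic loader by name.
    resolveSymbol :: Map.Map String (FunPtr ()) -> String -> IO (FunPtr ())
    resolveSymbol table name =
      case Map.lookup name table of
        Just addr -> return addr
        Nothing   -> dlsym Default name

    main :: IO ()
    main = do
      -- With an empty table this behaves exactly like the dlsym fallback.
      _ <- resolveSymbol Map.empty "cos"
      putStrLn "resolved cos"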

One interesting application of this is that we can now extend the language by writing arbitrary C code to implement operations. For example, we can create a shared library cbits.so:

    /* cbits.c
     * $ gcc -fPIC -shared cbits.c -o cbits.so
     * $ clang -fPIC -shared cbits.c -o cbits.so
     */
    #include <stdio.h>

    // putchard - putchar that takes a double and returns 0.
    double putchard(double X) {
      putchar((char)X);
      fflush(stdout);
      return 0;
    }

Compile this with your favorite C compiler. We can then link this into our Haskell binary by simply including it alongside the rest of the Haskell source files:

    $ ghc cbits.so --make Main.hs -o Main

Now we can produce simple output to the console by using things like: extern putchard(x); putchard(120);, which prints a lowercase ‘x’ on the console (120 is the ASCII code for ‘x’). Similar code could be used to implement file I/O, console input, and many other capabilities in Kaleidoscope.

To bring external shared objects into the process address space, we can call Haskell’s bindings to the system dynamic loader to load external libraries at runtime. In addition, if we are statically compiling our interpreter, we can tell GHC to link against the shared objects explicitly by passing them in with the -l flag.
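For instance, here is a sketch of loading cbits.so at runtime with the unix package’s System.Posix.DynamicLinker (again illustrative rather than part of the tutorial’s driver; mkIOFn is a hypothetical helper). Opening the library with RTLD_GLOBAL makes its symbols, such as putchard, visible to later lookups by name in the process, including the JIT’s:

    {-# LANGUAGE ForeignFunctionInterface #-}
    import Foreign.Ptr (FunPtr)
    import System.Posix.DynamicLinker
      (RTLDFlags (RTLD_GLOBAL, RTLD_NOW), dlopen, dlsym)

    -- putchard performs I/O, so give the dynamic call an IO result type.
    foreign import ccall "dynamic"
      mkIOFn :: FunPtr (Double -> IO Double) -> (Double -> IO Double)

    main :: IO ()
    main = do
      -- Load the shared object and export its symbols globally so that
      -- later lookups by name can find putchard.
      dl <- dlopen "./cbits.so" [RTLD_NOW, RTLD_GLOBAL]
      fn <- dlsym dl "putchard"
      _  <- mkIOFn fn 120  -- prints a lowercase 'x'
      return ()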

This completes the JIT and optimizer chapter of the Kaleidoscope tutorial. At this point, we can compile a non-Turing-complete programming language, and optimize and JIT-compile it in a user-driven way. Next up, we’ll look at extending the language with control flow constructs, tackling some interesting LLVM IR issues along the way.