Instructions

Now that we have the basic infrastructure in place we’ll wrap the raw llvm-hs AST nodes inside a collection of helper functions to push instructions onto the stack held within our monad.

Instructions in LLVM are either numbered sequentially (%0, %1, …) or given explicit variable names (%a, %foo, ..). For example, the arguments to the following function are named values, while the result of the add instruction is unnamed.

  1. define i32 @add(i32 %a, i32 %b) {
  2. %1 = add i32 %a, %b
  3. ret i32 %1
  4. }

In the implementation of llvm-hs both these types are represented in a sum type containing the constructors UnName and Name. For most of our purpose we will simply use numbered expressions and map the numbers to identifiers within our symbol table. Every instruction added will increment the internal counter, to accomplish this we add a fresh name supply.

  1. fresh :: Codegen Word
  2. fresh = do
  3. i <- gets count
  4. modify $ \s -> s { count = 1 + i }
  5. return $ i + 1

Throughout our code we will however refer named values within the module, these have a special data type Name (with an associated IsString instance so that Haskell can automatically perform the boilerplate coercions between String types) for which we’ll create a second name supply map which guarantees that our block names are unique.

  1. type Names = Map.Map String Int
  2. uniqueName :: String -> Names -> (String, Names)
  3. uniqueName nm ns =
  4. case Map.lookup nm ns of
  5. Nothing -> (nm, Map.insert nm 1 ns)
  6. Just ix -> (nm ++ show ix, Map.insert nm (ix+1) ns)

Since we can now work with named LLVM values we need to create several functions for referring to references of values.

  1. local :: Name -> Operand
  2. local = LocalReference double
  3. externf :: Name -> Operand
  4. externf = ConstantOperand . C.GlobalReference double

Our function externf will emit a named value which refers to a toplevel function (@add) in our module or will refer to an externally declared function (@putchar). For instance:

  1. declare i32 @putchar(i32)
  2. define i32 @add(i32 %a, i32 %b) {
  3. %1 = add i32 %a, %b
  4. ret i32 %1
  5. }
  6. define void @main() {
  7. %1 = call i32 @add(i32 0, i32 97)
  8. call i32 @putchar(i32 %1)
  9. ret void
  10. }

Since we’d like to refer to values on the stack by named quantities we’ll implement a simple symbol table as an association list letting us assign variable names to operand quantities and subsequently look them up when used.

  1. assign :: String -> Operand -> Codegen ()
  2. assign var x = do
  3. lcls <- gets symtab
  4. modify $ \s -> s { symtab = [(var, x)] ++ lcls }
  5. getvar :: String -> Codegen Operand
  6. getvar var = do
  7. syms <- gets symtab
  8. case lookup var syms of
  9. Just x -> return x
  10. Nothing -> error $ "Local variable not in scope: " ++ show var

Now that we have a way of naming instructions we’ll create an internal function to take an llvm-hs AST node and push it on the current basic block stack. We’ll return the left hand side reference of the instruction. Instructions will come in two flavors, instructions and terminators. Every basic block has a unique terminator and every last basic block in a function must terminate in a ret.

  1. instr :: Instruction -> Codegen (Operand)
  2. instr ins = do
  3. n <- fresh
  4. let ref = (UnName n)
  5. blk <- current
  6. let i = stack blk
  7. modifyBlock (blk { stack = (ref := ins) : i } )
  8. return $ local ref
  9. terminator :: Named Terminator -> Codegen (Named Terminator)
  10. terminator trm = do
  11. blk <- current
  12. modifyBlock (blk { term = Just trm })
  13. return trm

Using the instr function we now wrap the AST nodes for basic arithmetic operations of floating point values.

  1. fadd :: Operand -> Operand -> Codegen Operand
  2. fadd a b = instr $ FAdd NoFastMathFlags a b []
  3. fsub :: Operand -> Operand -> Codegen Operand
  4. fsub a b = instr $ FSub NoFastMathFlags a b []
  5. fmul :: Operand -> Operand -> Codegen Operand
  6. fmul a b = instr $ FMul NoFastMathFlags a b []
  7. fdiv :: Operand -> Operand -> Codegen Operand
  8. fdiv a b = instr $ FDiv NoFastMathFlags a b []

On top of the basic arithmetic functions we’ll add the basic control flow operations which will allow us to direct the control flow between basic blocks and return values.

  1. br :: Name -> Codegen (Named Terminator)
  2. br val = terminator $ Do $ Br val []
  3. cbr :: Operand -> Name -> Name -> Codegen (Named Terminator)
  4. cbr cond tr fl = terminator $ Do $ CondBr cond tr fl []
  5. ret :: Operand -> Codegen (Named Terminator)
  6. ret val = terminator $ Do $ Ret (Just val) []

Finally we’ll add several “effect” instructions which will invoke memory and evaluation side-effects. The call instruction will simply take a named function reference and a list of arguments and evaluate it and simply invoke it at the current position. The alloca instruction will create a pointer to a stack allocated uninitialized value of the given type.

  1. call :: Operand -> [Operand] -> Codegen Operand
  2. call fn args = instr $ Call Nothing CC.C [] (Right fn) (toArgs args) [] []
  3. alloca :: Type -> Codegen Operand
  4. alloca ty = instr $ Alloca ty Nothing 0 []
  5. store :: Operand -> Operand -> Codegen Operand
  6. store ptr val = instr $ Store False ptr val Nothing 0 []
  7. load :: Operand -> Codegen Operand
  8. load ptr = instr $ Load False ptr Nothing 0 []