The Parser

The AST for a program captures its behavior in such a way that it is easy for later stages of the compiler (e.g. code generation) to interpret. We basically want one object for each construct in the language, and the AST should closely model the language. In Kaleidoscope, we have expressions, and a function object. When parsing with Parsec we will unpack tokens straight into our AST which we define as the Expr algebraic data type:

  1. module Syntax where
  2. type Name = String
  3. data Expr
  4. = Float Double
  5. | BinOp Op Expr Expr
  6. | Var String
  7. | Call Name [Expr]
  8. | Function Name [Expr] Expr
  9. | Extern Name [Expr]
  10. deriving (Eq, Ord, Show)
  11. data Op
  12. = Plus
  13. | Minus
  14. | Times
  15. | Divide
  16. deriving (Eq, Ord, Show)

This is all (intentionally) rather straight-forward: variables capture the variable name, binary operators capture their operation (e.g. Plus, Minus, …), and calls capture a function name as well as a list of any argument expressions.

We create Parsec parser which will scan an input source and unpack it into our Expr type. The code composes within the Parser to generate the resulting parser which is then executed using the parse function.

  1. module Parser where
  2. import Text.Parsec
  3. import Text.Parsec.String (Parser)
  4. import qualified Text.Parsec.Expr as Ex
  5. import qualified Text.Parsec.Token as Tok
  6. import Lexer
  7. import Syntax
  8. binary s f assoc = Ex.Infix (reservedOp s >> return (BinOp f)) assoc
  9. table = [[binary "*" Times Ex.AssocLeft,
  10. binary "/" Divide Ex.AssocLeft]
  11. ,[binary "+" Plus Ex.AssocLeft,
  12. binary "-" Minus Ex.AssocLeft]]
  13. int :: Parser Expr
  14. int = do
  15. n <- integer
  16. return $ Float (fromInteger n)
  17. floating :: Parser Expr
  18. floating = do
  19. n <- float
  20. return $ Float n
  21. expr :: Parser Expr
  22. expr = Ex.buildExpressionParser table factor
  23. variable :: Parser Expr
  24. variable = do
  25. var <- identifier
  26. return $ Var var
  27. function :: Parser Expr
  28. function = do
  29. reserved "def"
  30. name <- identifier
  31. args <- parens $ many variable
  32. body <- expr
  33. return $ Function name args body
  34. extern :: Parser Expr
  35. extern = do
  36. reserved "extern"
  37. name <- identifier
  38. args <- parens $ many variable
  39. return $ Extern name args
  40. call :: Parser Expr
  41. call = do
  42. name <- identifier
  43. args <- parens $ commaSep expr
  44. return $ Call name args
  45. factor :: Parser Expr
  46. factor = try floating
  47. <|> try int
  48. <|> try extern
  49. <|> try function
  50. <|> try call
  51. <|> variable
  52. <|> parens expr
  53. defn :: Parser Expr
  54. defn = try extern
  55. <|> try function
  56. <|> expr
  57. contents :: Parser a -> Parser a
  58. contents p = do
  59. Tok.whiteSpace lexer
  60. r <- p
  61. eof
  62. return r
  63. toplevel :: Parser [Expr]
  64. toplevel = many $ do
  65. def <- defn
  66. reservedOp ";"
  67. return def
  68. parseExpr :: String -> Either ParseError Expr
  69. parseExpr s = parse (contents expr) "<stdin>" s
  70. parseToplevel :: String -> Either ParseError [Expr]
  71. parseToplevel s = parse (contents toplevel) "<stdin>" s