Symbol Syntax

Now that we’re going to allow user-defined variables, we need to make the grammar for symbols more flexible. Rather than matching just our builtin functions, it should match any valid symbol. Unlike in C, where the names a variable can be given are fairly restrictive, we’re going to allow all sorts of characters in the name of a variable.

We can create a regular expression that expresses the range of characters available as follows.

  /[a-zA-Z0-9_+\\-*\\/\\\\=<>!&]+/

At first glance this looks like we’ve just bashed our hands into the keyboard. Actually it is a regular expression using a big range specifier []. Inside range specifiers special characters lose their meaning, but some of these characters still need to be escaped with backslashes. Because this is part of a C string we need to put two backslashes to represent a single backslash character in the regular expression, so matching a literal backslash takes four.
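To make the effect of the doubled backslashes concrete, here is a small sketch that stores the pattern in a C string and inspects what actually ends up in memory. The names symbol_pattern, count_backslashes and print_symbol_pattern are made up for this illustration and are not part of our interpreter or of mpc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* The symbol pattern as typed in C source, without the / delimiters
   that the grammar puts around it. The name is hypothetical. */
static const char symbol_pattern[] = "[a-zA-Z0-9_+\\-*\\/\\\\=<>!&]+";

/* Counts the backslash characters actually stored in a string. */
size_t count_backslashes(const char *s) {
  assert(s != NULL);
  size_t n = 0;
  for (; *s != '\0'; s++) {
    if (*s == '\\') { n++; }
  }
  return n;
}

/* Prints the pattern as the regular expression engine receives it. */
void print_symbol_pattern(void) {
  /* Prints: [a-zA-Z0-9_+\-*\/\\=<>!&]+ */
  printf("%s\n", symbol_pattern);
  /* The source text has eight backslashes; the stored string has four. */
  printf("backslashes: %zu\n", count_backslashes(symbol_pattern));
}
```

Calling print_symbol_pattern() from main shows that the C compiler has already halved every pair of backslashes before the regular expression is ever parsed.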

This regular expression lets symbols contain any of the normal C identifier characters a-zA-Z0-9_, the arithmetic operator characters +\\-*\\/, the backslash character \\\\, the comparison operator characters =<>!, or an ampersand &. This gives us all the flexibility we need for defining new and existing symbols.

  mpca_lang(MPCA_LANG_DEFAULT,
    "                                                     \
      number : /-?[0-9]+/ ;                               \
      symbol : /[a-zA-Z0-9_+\\-*\\/\\\\=<>!&]+/ ;         \
      sexpr  : '(' <expr>* ')' ;                          \
      qexpr  : '{' <expr>* '}' ;                          \
      expr   : <number> | <symbol> | <sexpr> | <qexpr> ;  \
      lispy  : /^/ <expr>* /$/ ;                          \
    ",
    Number, Symbol, Sexpr, Qexpr, Expr, Lispy);
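To get a feel for which names the symbol rule accepts, we can mirror its character class by hand. The following is only a rough sketch: symbol_chars and is_symbol_name are hypothetical helpers, and the check uses the standard library function strspn rather than mpc, so it says nothing about how the full grammar treats a given input.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Every character the symbol rule accepts, written out in full. */
static const char symbol_chars[] =
  "abcdefghijklmnopqrstuvwxyz"
  "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  "0123456789_+-*/\\=<>!&";

/* Returns true if name is non-empty and made only of symbol characters.
   This mirrors the character class by hand; it does not call mpc. */
bool is_symbol_name(const char *name) {
  assert(name != NULL);
  size_t len = strlen(name);
  /* strspn gives the length of the leading run of allowed characters. */
  return len > 0 && strspn(name, symbol_chars) == len;
}
```

With this helper, names such as head, +, <= and my-var! pass, while the empty string, a b and {x} do not. A string of digits such as 10 also passes, because digits belong to the character class; note that the expr rule lists <number> before <symbol>.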