Example: Syntax Highlighting

The following Node.js script demonstrates how to use the Esprima tokenizer to apply syntax highlighting to a fragment of JavaScript code. It reads the input from stdin and writes a color-coded version to stdout using ANSI escape codes.

    const esprima = require('esprima');
    const readline = require('readline');

    // ANSI escape codes: switch the foreground color to cyan, then reset it.
    const CYAN = '\x1b[36m';
    const RESET = '\x1b[0m';
    let source = '';

    readline.createInterface({ input: process.stdin, terminal: false })
        .on('line', line => { source += line + '\n' })
        .on('close', () => {
            const tokens = esprima.tokenize(source, { range: true });
            const ids = tokens.filter(x => x.type === 'Identifier');
            // Sort from last to first so that earlier offsets stay valid.
            const markers = ids.sort((a, b) => { return b.range[0] - a.range[0] });
            markers.forEach(t => {
                const id = CYAN + t.value + RESET;
                const start = t.range[0];
                const end = t.range[1];
                source = source.slice(0, start) + id + source.slice(end);
            });
            console.log(source);
        });

An example run is shown in the following screenshot (the script is called highlight.js):

[Screenshot: Syntax highlighting]

The script uses the readline module to read the input line by line, appending each line to a string buffer. Once there is no more input, it invokes the Esprima tokenizer to break the source into a list of tokens. The script only cares about identifier tokens, hence the filtering. For each token, the start of its range determines where to insert the escape code that changes the color to cyan, and the end of its range determines where to insert the code that resets the color. The tokens are processed from the last identifier to the first, which necessitates sorting them in reverse order: inserting escape codes makes the string longer, so working backwards keeps the offsets of the tokens not yet processed valid.
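The effect of the reverse ordering can be checked without the tokenizer. The following is a minimal sketch that applies the same splice to hand-written ranges; the helper name `highlightRanges` and the sample ranges are assumptions made for illustration and are not part of Esprima.

```javascript
const CYAN = '\x1b[36m';
const RESET = '\x1b[0m';

// Hypothetical helper: wrap each [start, end) range of `source` in color codes.
// Ranges are handled from the last to the first, so an insertion never shifts
// the offsets of a range that has not been processed yet.
function highlightRanges(source, ranges) {
    const sorted = ranges.slice().sort((a, b) => b[0] - a[0]);
    let result = source;
    for (const [start, end] of sorted) {
        result = result.slice(0, start) + CYAN + result.slice(start, end) +
            RESET + result.slice(end);
    }
    return result;
}

// Hand-written ranges for the identifiers `x` and `y` in the sample string.
const sample = 'var x = y;';
const sampleRanges = [[4, 5], [8, 9]];
console.log(highlightRanges(sample, sampleRanges));
```

If the ranges were processed in ascending order instead, the escape codes inserted around `x` would push `y` to a later position, and the range `[8, 9]` would no longer point at it.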

For a real-world syntax highlighter with many more features, take a look at cardinal (source repository: github.com/thlorenz/cardinal). It uses a similar approach: the Esprima tokenizer breaks the source into tokens, and each token is then wrapped in a type-specific color.
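As a rough illustration of type-specific coloring, the sketch below maps token types to ANSI colors. It is not cardinal's implementation: the `COLORS` palette, the `colorize` helper, and the hand-written token list are all assumptions. The tokens mimic the shape produced by `esprima.tokenize(source, { range: true })`, and the sketch assumes they arrive in source order, so it builds the output left to right instead of splicing in reverse.

```javascript
const RESET = '\x1b[0m';

// Assumed palette; any token type not listed here is left uncolored.
const COLORS = {
    Keyword: '\x1b[35m',    // magenta
    Identifier: '\x1b[36m', // cyan
    Numeric: '\x1b[33m',    // yellow
    String: '\x1b[32m'      // green
};

// Hypothetical helper: copy the text between tokens unchanged and wrap each
// token whose type has a palette entry. Expects tokens sorted by position.
function colorize(source, tokens) {
    let output = '';
    let cursor = 0;
    for (const token of tokens) {
        const [start, end] = token.range;
        const text = source.slice(start, end);
        const color = COLORS[token.type];
        output += source.slice(cursor, start);
        output += color ? color + text + RESET : text;
        cursor = end;
    }
    return output + source.slice(cursor);
}

// Hand-written tokens for the sample, in the tokenizer's output shape.
const code = 'const answer = 42;';
const codeTokens = [
    { type: 'Keyword', value: 'const', range: [0, 5] },
    { type: 'Identifier', value: 'answer', range: [6, 12] },
    { type: 'Punctuator', value: '=', range: [13, 14] },
    { type: 'Numeric', value: '42', range: [15, 17] },
    { type: 'Punctuator', value: ';', range: [17, 18] }
];
console.log(colorize(code, codeTokens));
```

Building the result in a single forward pass avoids the sort and the repeated string splicing, at the cost of relying on the tokens being ordered by position.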