Handling Hashbang/Shebang

In a Unix environment, an executable script often begins with a hashbang (also called a shebang) line, which starts with #! and names the interpreter. A common example is a utility intended to be executed by Node.js, which may look like the following:

  #!/usr/bin/env node
  console.log('Hello from Node.js!');

If the Esprima parser is used to process the content of the above file, it will throw an exception, because Esprima does not accept the hashbang line as valid JavaScript. A quick Node.js REPL session illustrates the point:

  $ node
  > var esprima = require('esprima')
  > var src = ['#!/usr/bin/env node', 'answer = 42'].join('\n')
  > esprima.parseScript(src)
  Error: Line 1: Unexpected token ILLEGAL

The workaround is to remove the hashbang line from the source before passing the source to the parser. One way to do that is with a regular expression, as shown below:

  $ node
  > var esprima = require('esprima')
  > var src = ['#!/usr/bin/env node', 'answer = 42'].join('\n')
  > src = src.replace(/^#!(.*\n)/, '')
  'answer = 42'
  > esprima.parseScript(src)
  Script {
    type: 'Program',
    body: [ ExpressionStatement { type: 'ExpressionStatement', expression: [Object] } ],
    sourceType: 'script' }
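
The regular expression above only matches when a line break follows the hashbang. As a minimal sketch, assuming the input may also use CRLF line endings or consist of the hashbang line alone, the removal could be wrapped in a small helper. The name stripHashbang is hypothetical and not part of Esprima.

```javascript
// Hypothetical helper (not part of Esprima): drop a leading hashbang line.
function stripHashbang(src) {
  // Match "#!" only at the very start of the input, up to and including
  // the first line break (LF or CRLF), or up to the end of the input.
  return src.replace(/^#!.*(?:\r?\n|$)/, '');
}

const src = ['#!/usr/bin/env node', 'answer = 42'].join('\n');
console.log(stripHashbang(src)); // prints: answer = 42
```

The returned string can then be passed to esprima.parseScript in the same way as in the session above. Input without a hashbang is returned unchanged.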

Note that the above approach shortens the source string and, because the line break is removed as well, shifts every subsequent line up by one. If the string length needs to be preserved, e.g. to facilitate an exact location mapping to the original version, then the hashbang line can be replaced with an equal number of spaces instead. A modified approach looks like the following:

  $ node
  > var esprima = require('esprima')
  > var src = ['#!/usr/bin/env node', 'answer = 42'].join('\n')
  > src = src.replace(/(^#!.*)/, function(m) { return Array(m.length + 1).join(' ') });
  > esprima.parseScript(src, { range: true })
  Script {
    type: 'Program',
    body:
     [ ExpressionStatement {
         type: 'ExpressionStatement',
         expression: [Object],
         range: [Object] } ],
    sourceType: 'script',
    range: [ 20, 31 ] }
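
To check that the padding really preserves offsets, the same replacement can be written as a standalone function and compared against the original string. This is a sketch only. The name blankHashbang is hypothetical and not part of Esprima, and String.prototype.repeat stands in for the Array(...).join(' ') idiom used above.

```javascript
// Hypothetical helper (not part of Esprima): overwrite a leading hashbang
// with spaces so that every later character keeps its original offset.
function blankHashbang(src) {
  return src.replace(/^#!.*/, (m) => ' '.repeat(m.length));
}

const original = ['#!/usr/bin/env node', 'answer = 42'].join('\n');
const padded = blankHashbang(original);

// The overall length and the position of the first statement are unchanged.
console.log(padded.length === original.length); // prints: true
console.log(padded.indexOf('answer') === original.indexOf('answer')); // prints: true
```

Because the line break after the hashbang is left in place, line numbers reported by the parser should also still match the original file.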