Prevent malicious RegEx from overloading your single-threaded execution

One Paragraph Explainer

The risk inherent in using Regular Expressions is the computational resources they require to parse text and match a given pattern. On the Node.js platform, where a single-threaded event loop is dominant, a CPU-bound operation like evaluating a regular expression can render the application unresponsive. Avoid RegEx when possible, defer the task to a dedicated library like validator.js, or use safe-regex to check whether a RegEx pattern is safe.

Some OWASP examples for vulnerable RegEx patterns:

  • (a|aa)+
  • ([a-zA-Z]+)*
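To make the cost of such patterns concrete, here is a minimal sketch that times an anchored variant of the first pattern against inputs that almost match. The helper name `timeMatch`, the anchors `^` and `$`, and the chosen input lengths are assumptions made for illustration. Actual timings depend on the machine and the Node.js version, so no specific numbers are claimed.

```javascript
// Hypothetical helper: measures how long a single regex test takes, in milliseconds.
function timeMatch(pattern, input) {
  const start = process.hrtime.bigint();
  const matched = pattern.test(input);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { matched, elapsedMs };
}

// Anchored variant of the OWASP example (a|aa)+ (the anchors are an assumption
// added here so that a failed match forces the engine to try every split).
const evilPattern = /^(a|aa)+$/;

// Each input is a run of valid characters followed by one character that can never match.
for (const n of [10, 20, 30]) {
  const input = 'a'.repeat(n) + '!';
  const { matched, elapsedMs } = timeMatch(evilPattern, input);
  console.log(`n=${n} matched=${matched} elapsedMs=${elapsedMs.toFixed(2)}`);
}
```

While the match is being evaluated, the event loop cannot serve any other request, which is why the elapsed time typically growing rapidly with a few extra characters is a problem.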

Code Example – Validating exponential time RegEx and using validators instead of RegEx

  const saferegex = require('safe-regex');
  const emailRegex = /^([a-zA-Z0-9])(([\-.]|[_]+)?([a-zA-Z0-9]+))*(@){1}[a-z0-9]+[.]{1}(([a-z]{2,3})|([a-z]{2,3}[.]{1}[a-z]{2,3}))$/;
  // should output false because the emailRegex is vulnerable to ReDoS attacks
  console.log(saferegex(emailRegex));
  // instead of the regex pattern, use validator:
  const validator = require('validator');
  console.log(validator.isEmail('liran.tal@gmail.com'));

Book Quote: “A vulnerable Regular Expression is known as one which applies repetition”

From the book Essential Node.js Security by Liran Tal

Often, programmers will use RegEx to validate that an input received from a user conforms to an expected condition. A vulnerable Regular Expression is known as one which applies repetition to a repeating capturing group, and where the string to match is composed of a suffix of a valid matching pattern plus characters that aren’t matching the capturing group.
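The shape described in the quote can be sketched as follows, using the second OWASP pattern from above. The helper `buildAttackInput` and the flattened pattern are illustrative assumptions, not taken from the book. The rewrite shown is one possible way to remove the nested repetition for this particular pattern, not a general recipe.

```javascript
// Vulnerable shape: a quantifier (*) applied to a capturing group that itself repeats (+).
const nestedPattern = /^([a-zA-Z]+)*$/;

// One possible rewrite (an assumption for illustration): it accepts the same
// strings but uses a single level of repetition.
const flatPattern = /^[a-zA-Z]*$/;

// Hypothetical helper: a run of characters that match the group, followed by
// one character that does not, as the quote describes.
function buildAttackInput(length) {
  return 'a'.repeat(length) + '!';
}

const samples = ['', 'abc', 'abc1', buildAttackInput(20)];
for (const s of samples) {
  // Both patterns agree on every sample; only the amount of backtracking differs.
  console.log(JSON.stringify(s), nestedPattern.test(s), flatPattern.test(s));
}
```

Keeping the attack input short here is deliberate: each additional matching character can roughly double the work the nested pattern needs before it reports a failed match.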