Extracting components from a formula

Problem

You want to extract parts of a formula for further use.

Solution

You can index into the formula object as though it were a list, using the [[ operator.

    f <- y ~ x1 + x2

    # Take a look at f
    str(f)
    #> Class 'formula' language y ~ x1 + x2
    #> ..- attr(*, ".Environment")=<environment: 0x1e46710>

    # Get each part
    f[[1]]
    #> `~`
    f[[2]]
    #> y
    f[[3]]
    #> x1 + x2

    # Or view the whole thing as a list
    as.list(f)
    #> [[1]]
    #> `~`
    #>
    #> [[2]]
    #> y
    #>
    #> [[3]]
    #> x1 + x2
    #>
    #> <environment: 0x1e46710>
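The indexed elements can be stored and inspected like any other R object. The short sketch below assumes the same formula as above; the variable names lhs and rhs are chosen only for illustration.

```r
f <- y ~ x1 + x2

# A two-sided formula has three elements: `~`, the left side, the right side
length(f)
#> [1] 3

lhs <- f[[2]]   # left-hand side: a single symbol
rhs <- f[[3]]   # right-hand side: an unevaluated call

class(lhs)
#> [1] "name"
class(rhs)
#> [1] "call"
```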

For formulas that have nothing on the left side, there are only two elements:

    f2 <- ~ x1 + x2
    as.list(f2)
    #> [[1]]
    #> `~`
    #>
    #> [[2]]
    #> x1 + x2
    #>
    #> <environment: 0x1e46710>
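Because the position of the right-hand side depends on whether a left-hand side is present, code that handles both kinds of formula may want to check length() first. The following is a minimal sketch; rhs_of() and lhs_of() are hypothetical helper names, not functions provided by R.

```r
# Hypothetical helper: return the right-hand side of any formula.
# The right-hand side is always the last element.
rhs_of <- function(f) {
  stopifnot(inherits(f, "formula"))
  f[[length(f)]]
}

# Hypothetical helper: return the left-hand side, or NULL if there is none.
lhs_of <- function(f) {
  stopifnot(inherits(f, "formula"))
  if (length(f) == 3) f[[2]] else NULL
}

rhs_of(y ~ x1 + x2)
#> x1 + x2
rhs_of(~ x1 + x2)
#> x1 + x2
lhs_of(y ~ x1 + x2)
#> y
lhs_of(~ x1 + x2)
#> NULL
```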

Each element of the formula is either a symbol or a language object (which consists of multiple symbols):

    str(f[[1]])
    #> symbol ~
    str(f[[2]])
    #> symbol y
    str(f[[3]])
    #> language x1 + x2

    # Look at parts of the language object
    str(f[[3]][[1]])
    #> symbol +
    str(f[[3]][[2]])
    #> symbol x1
    str(f[[3]][[3]])
    #> symbol x2
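Since a language object can itself contain further language objects, the same indexing can be applied recursively. The sketch below walks an expression and collects every symbol it finds; symbols_in() is a hypothetical helper written for this example.

```r
# Hypothetical helper: collect the names of all symbols in an expression
symbols_in <- function(x) {
  if (is.name(x)) {
    as.character(x)
  } else if (is.call(x)) {
    # A call is list-like: recurse into the function name and each argument
    unlist(lapply(as.list(x), symbols_in))
  } else {
    character(0)   # constants such as numbers carry no symbol
  }
}

f <- y ~ x1 + x2
symbols_in(f[[3]])
#> [1] "+"  "x1" "x2"
```

If only the variable names are needed, without operators, base R's all.vars(f) returns them directly.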

You can use as.character() or deparse() to convert any of these to strings. deparse() can give a more natural-looking result:

    as.character(f[[1]])
    #> [1] "~"
    as.character(f[[2]])
    #> [1] "y"

    # The language object gets coerced into a string that represents the parse tree:
    as.character(f[[3]])
    #> [1] "+"  "x1" "x2"

    # You can use deparse() to get a more natural-looking string
    deparse(f[[3]])
    #> [1] "x1 + x2"
    deparse(f)
    #> [1] "y ~ x1 + x2"
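One thing to watch for: deparse() breaks long expressions into several strings (its width.cutoff argument defaults to 60), so the result is not always a single string. A minimal sketch of collapsing the pieces follows; the variable names in the formula are made up for illustration.

```r
# A formula with a long right-hand side (variable names are made up)
f_long <- outcome ~ predictor_one + predictor_two + predictor_three +
  predictor_four + predictor_five

parts <- deparse(f_long[[3]])
length(parts)   # more than 1 when the expression exceeds the width cutoff

# Collapse the pieces into one string and squeeze repeated whitespace
rhs_string <- gsub("\\s+", " ", paste(parts, collapse = " "))
rhs_string
```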

The formula object also captures the environment in which it was created, as we saw earlier in the output of str(f). To extract it, use environment():

    environment(f)
    #> <environment: 0x1e46710>
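The captured environment matters most when a formula is created inside a function, because it keeps that call's local variables reachable. The sketch below illustrates this; make_formula() and scale_factor are hypothetical names used only for this example.

```r
# Hypothetical function that creates a formula internally
make_formula <- function() {
  scale_factor <- 10   # local variable, visible only inside this call
  y ~ x1 + x2
}

g <- make_formula()
env <- environment(g)

# The local variable can be retrieved from the captured environment
get("scale_factor", envir = env)
#> [1] 10
```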