The GraalVM implementation of R, also known as FastR, is compatible with GNU R, can run R code at unparalleled performance, integrates with the GraalVM ecosystem, and provides additional R-level features.

Warning: The support for R is currently experimental.

Installing R

The R language can be installed to a GraalVM build with the gu command. See graalvm/bin/gu --help for more information.

Requirements

The GraalVM R engine requires the OpenMP runtime library and the GFortran 3 runtime libraries to be installed on the target system. The following commands should install those dependencies.

  • Ubuntu 18.04 and 19.04: apt-get install libgfortran3 libgomp1
  • Oracle Linux 7: yum install libgfortran libgomp
  • Oracle Linux 8: yum install compat-libgfortran-48
  • macOS: brew install gcc@4.9

On macOS, it is necessary to run $R_HOME/bin/configure_fastr. This script will attempt to locate the necessary runtime libraries on your computer and will fine-tune the GraalVM R installation according to your system. On Linux systems, this script will check that the necessary libraries are installed and, if not, suggest how to install them.

Moreover, to install R packages that contain C/C++ or Fortran code, compilers for those languages must be present on the target system. The following packages satisfy the dependencies of the most common R packages:

  • Ubuntu 18.04 and 19.04:
      apt-get install libgfortran3 build-essential gfortran libxml2-dev libc++-dev
  • Oracle Linux 7 and 8:
      yum groupinstall 'Development Tools' && yum install gcc-gfortran bzip2 libxml2-devel

Note that you still need to install the GFortran 3 runtime libraries unless this is the default version on your system.

Search Paths for Packages

The default R library location is within the GraalVM installation directory. To allow installation of additional packages for users who do not have write access to the GraalVM installation directory, edit the R_LIBS_USER variable in the $R_HOME/etc/Renviron file.
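Before editing R_LIBS_USER, it can help to check which library locations the current session searches and which of them are writable. The following is a minimal sketch that uses only base R functions; nothing in it is specific to the GraalVM R engine.

```r
# List the library search path and check write access to each entry.
lib_paths <- .libPaths()

# file.access() returns 0 when the requested access (mode 2 = write) is allowed.
writable <- file.access(lib_paths, mode = 2) == 0
print(data.frame(path = lib_paths, writable = writable))

# Show the user library configured through R_LIBS_USER (may be unset).
user_lib <- Sys.getenv("R_LIBS_USER", unset = "")
cat("R_LIBS_USER:", if (nzchar(user_lib)) user_lib else "<not set>", "\n")
```

If no entry is reported as writable, package installation will fail until a writable location is configured.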

Running R Code

Run R code with the R and Rscript commands:

    $ R [polyglot options] [R options] [filename]
    $ Rscript [polyglot options] [R options] [filename]

The GraalVM R engine uses the same polyglot options as other GraalVM languages and the same R options as GNU R, e.g., bin/R --vanilla. Use --help to print the list of supported options. The most important options include:

  • --jvm to enable Java interoperability
  • --polyglot to enable interoperability with other GraalVM languages
  • --vm.Djava.net.useSystemProxies=true as an example of passing options to the JVM; this is translated to -Djava.net.useSystemProxies=true

Note: unlike other GraalVM languages, R does not yet ship with a Native Image of its runtime. Therefore, the --native option, which is the default, will still start Rscript on top of the JVM, but for the sake of future compatibility, Java interoperability will not be available in that case.

Users can optionally build the native image using:

    jre/languages/R/bin/install_r_native_image

Running R Extensions

The GraalVM R engine can run R extensions in two modes:

  • native: the native machine code is run directly on your CPU. This is the same way GNU R runs R extensions.
  • llvm: if the LLVM bitcode is available, it can be interpreted by GraalVM LLVM.

The native mode is better suited for code that does not extensively interact with the R API, for example, plain C or Fortran numerical computations working on primitive arrays. The llvm mode provides significantly better performance for extensions that frequently call between R and the C/C++ code: because the GraalVM LLVM interpreter is partially evaluated by the Truffle library, just like the R code, both can be inlined and optimized as one compilation unit. Moreover, GraalVM LLVM is supported by GraalVM tools, which makes it possible, for instance, to debug R and C code together.

In one GraalVM R process, any R package can be loaded in either mode. That is, GraalVM R supports mixing packages loaded in the native mode with packages loaded in the llvm mode in one process.

Generating LLVM Bitcode

As of version 19.3.0, the GraalVM R engine is configured to use the LLVM toolchain to compile the native code of R packages. This toolchain produces standard executable binaries for a given system, but it also embeds the corresponding LLVM bitcode into them. The binaries produced by the LLVM toolchain can be loaded in both modes: native or llvm.

The GraalVM R engine can be reconfigured to use your system default compilers when installing R packages by running:

    # use the local installation of GCC:
    $ R -e 'fastr.setToolchain("native")'
    # to revert to using GraalVM's LLVM toolchain:
    $ R -e 'fastr.setToolchain("llvm")'

Using the system default compilers may be more reliable, but you lose the ability to load the R packages built with the LLVM toolchain in the llvm mode, because they will not contain the embedded bitcode. Moreover, mixing packages built by the local system default compilers and packages built by the LLVM toolchain in one GraalVM R process may cause linking issues.

Choosing the Running Mode

Starting with version 19.3.0, the GraalVM R engine uses the following defaults:

  • native mode to load the packages
  • llvm toolchain to build their sources

To enable the llvm mode for loading the packages, use --R.BackEnd=llvm. You can also enable each mode selectively for given R packages by using:

  • --R.BackEndLLVM=package1,package2
  • --R.BackEndNative=package1,package2

GraalVM R Engine Compatibility

The GraalVM implementation of R, known as FastR, is based on GNU R and reuses the base packages. It is currently based on GNU R 3.6.1 and moves to new major versions of R as they become available and stable. The FastR project maintains an extensive set of unit tests for all aspects of the R language and the built-in functionality, and these tests are available as part of the R source code. The GraalVM R engine aims to be fully compatible with GNU R, including its native interface as used by R extensions. It can install and run unmodified complex R packages like ggplot2, Shiny, or Rcpp. As some packages rely on unspecified behavior or implementation details of GNU R, support for packages is a work in progress, and some packages might not install successfully or work as expected.

Packages can be installed using the install.packages function or the R CMD INSTALL shell command. By default, R uses a fixed snapshot of the CRAN repository [1]. This behavior can be overridden by explicitly setting the repos argument of the install.packages function. This functionality does not interfere with the checkpoint package. If you are behind a proxy server, make sure to configure the proxy either with environment variables or using the JVM options, e.g., --vm.Djava.net.useSystemProxies=true.
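As a sketch of overriding the default repository, the repos option (or the repos argument of a single install.packages call) can point to another mirror. The mirror URL and the package name below are illustrative assumptions only, and the install call is commented out so that the snippet does not need network access.

```r
# Remember the current setting so it can be restored afterwards.
old_repos <- getOption("repos")

# Assumption: any CRAN mirror URL can be used here; this one is only an example.
mirror <- c(CRAN = "https://cloud.r-project.org")
options(repos = mirror)
print(getOption("repos"))

# Alternatively, override the repository for a single call
# (the package name is only an example):
# install.packages("jsonlite", repos = mirror)

# Restore the previous setting.
options(repos = old_repos)
```

Setting the option affects every later install.packages call in the session, whereas the repos argument affects only the call it is passed to.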

Versions of some packages specifically patched for the GraalVM implementation of R can be installed using the install.fastr.packages function, which downloads them from the GitHub repository. Currently, those are rJava and data.table.

Known limitations of GraalVM implementation of R compared to GNU R:

  • Only small parts of the low-level graphics package are functional. However, the grid package is supported and R can install and run packages based on it like ggplot2. Support for the graphics package in R is planned for future releases.
  • Encoding of character vectors. Related builtins (e.g., Encoding) are available, but do not execute any useful code. Character vectors are represented as Java Strings and therefore encoded in UTF-16 format. GraalVM implementation of R will add support for encoding in future releases.
  • Some parts of the native API (e.g., DATAPTR) expose implementation details that are hard to emulate for alternative implementations of R. These are implemented as needed while testing the GraalVM implementation of R with various CRAN packages.
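To see whether the encoding limitation matters for a given script, the encoding-related builtins can be probed directly. The sketch below uses only base R functions. Because the value reported by Encoding may differ between GNU R and the GraalVM implementation of R, no expected output is shown.

```r
# A string with one non-ASCII character, written with an escape so that
# the source file itself stays ASCII.
s <- "caf\u00e9"

# Declared encoding of the string (the value may be engine-dependent).
print(Encoding(s))

# Number of characters versus number of bytes in the string.
print(nchar(s, type = "chars"))
print(nchar(s, type = "bytes"))
```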

You can use the compatibility checker to find whether the CRAN packages you are interested in are tested on GraalVM and whether the tests pass successfully.

High Performance

The GraalVM runtime optimizes R code that runs for extended periods of time. The speculative optimizations based on the runtime behavior of the R code and the dynamic compilation employed by the GraalVM runtime are capable of removing most of the abstraction penalty incurred by the dynamism and complexity of the R language.

Let us look at an algorithm in R code. The following example calculates the mutual information of a large matrix:

    x <- matrix(runif(1000000), 1000, 1000)
    mutual_R <- function(joint_dist) {
      joint_dist <- joint_dist / sum(joint_dist)
      mutual_information <- 0
      num_rows <- nrow(joint_dist)
      num_cols <- ncol(joint_dist)
      colsums <- colSums(joint_dist)
      rowsums <- rowSums(joint_dist)
      for (i in seq_along(1:num_rows)) {
        for (j in seq_along(1:num_cols)) {
          temp <- log((joint_dist[i, j] / (colsums[j] * rowsums[i])))
          if (!is.finite(temp)) {
            temp = 0
          }
          mutual_information <-
            mutual_information + joint_dist[i, j] * temp
        }
      }
      mutual_information
    }
    system.time(mutual_R(x))
    #   user  system elapsed
    #  1.321   0.010   1.279
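For reference, the loop above sums p[i, j] * log(p[i, j] / (rowsums[i] * colsums[j])) over all cells and skips terms whose logarithm is not finite. The following vectorized sketch computes the same sum and can serve as a cross-check on small inputs. The helper name mutual_vec is hypothetical and not part of the original benchmark.

```r
# Vectorized mutual information of a joint distribution (hypothetical helper).
mutual_vec <- function(joint_dist) {
  p <- joint_dist / sum(joint_dist)
  # expected[i, j] = rowSums(p)[i] * colSums(p)[j]
  expected <- outer(rowSums(p), colSums(p))
  terms <- p * log(p / expected)
  # Non-finite terms (e.g. from p == 0) contribute nothing, as in the loop.
  sum(terms[is.finite(terms)])
}

# Independent variables carry no mutual information.
print(mutual_vec(matrix(1, 2, 2)))

# Perfectly dependent binary variables carry log(2) nats.
print(all.equal(mutual_vec(diag(2)), log(2)))
```

The vectorized form allocates full intermediate matrices, so it trades memory for brevity. The loop version is the one the timings in this section refer to.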

Algorithms such as this one usually require C/C++ code to run efficiently [2]:

    if (!require('RcppArmadillo')) {
      install.packages('RcppArmadillo')
      library(RcppArmadillo)
    }
    library(Rcpp)
    sourceCpp("r_mutual.cpp")
    x <- matrix(runif(1000000), 1000, 1000)
    system.time(mutual_cpp(x))
    #   user  system elapsed
    #  0.037   0.003   0.040

(Uses r_mutual.cpp.) However, after a few iterations, GraalVM runs the R code efficiently enough to make the performance advantage of C/C++ negligible:

    system.time(mutual_R(x))
    #   user  system elapsed
    #  0.063   0.001   0.077

The GraalVM implementation of R is primarily aimed at long-running applications. Therefore, peak performance is usually only achieved after a warmup period. While startup time is currently slower than GNU R's, due to the overhead from Java class loading and compilation, future releases will contain a native image of R with improved startup.
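The warmup effect can be observed by timing the same function repeatedly within one session. The sketch below is an illustration only: the function work and the iteration counts are arbitrary assumptions, and actual timings depend on the machine and the engine, so none are shown.

```r
# A small CPU-bound function to be timed repeatedly (arbitrary example).
work <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + sqrt(i)
  total
}

# Time ten consecutive runs in the same process and collect elapsed seconds.
timings <- vapply(
  1:10,
  function(run) system.time(work(2e5))[["elapsed"]],
  numeric(1)
)
print(round(timings, 3))
```

On an engine with dynamic compilation, later entries would be expected to be smaller than the first ones. On GNU R the entries should be roughly constant.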

GraalVM Integration

The R language integration with the GraalVM ecosystem includes:

To start debugging the code, start the R script with the --inspect option:

    $ Rscript --inspect myScript.R

Note that GNU R compatible debugging using, for example, debug(myFunction) is also supported.
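A minimal sketch of the GNU R compatible workflow, assuming the base functions debug, isdebugged, and undebug behave as they do in GNU R. The function myFunction is a placeholder. It is not called while flagged, because that would open the interactive browser.

```r
myFunction <- function(x) {
  y <- x * 2
  y + 1
}

# Flag the function: the next call would enter the browser at its first line.
debug(myFunction)
flagged <- isdebugged(myFunction)
print(flagged)

# Remove the flag again so that later calls run normally.
undebug(myFunction)
print(isdebugged(myFunction))

result <- myFunction(20)
print(result)
```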

Interoperability

GraalVM supports several other programming languages, including JavaScript, Ruby, Python, and LLVM. The GraalVM implementation of R also provides an API for programming language interoperability that lets you execute code from any other language that GraalVM supports. Note that you must start the R script with --polyglot to have access to other GraalVM languages.

GraalVM execution of R provides the following interoperability primitives:

  • eval.polyglot('languageId', 'code') evaluates code in some other language; the languageId can be, e.g., js.
  • eval.polyglot(path = '/path/to/file.extension') evaluates code loaded from a file. The language is recognized from the extension.
  • export('polyglot-value-name', rObject) exports an R object so that it can be imported by other languages.
  • import('exported-polyglot-value-name') imports a polyglot value exported by some other language.

Please use the ?functionName syntax to learn more. The following example demonstrates the interoperability features:

    # get an array from Ruby
    x <- eval.polyglot('ruby', '[1,2,3]')
    print(x[[1]])
    # [1] 1
    # get a JavaScript object
    x <- eval.polyglot(path = 'r_example.js')
    print(x$a)
    # [1] "value"
    # use R vector in JavaScript
    export('robj', c(1, 2, 3))
    eval.polyglot('js', paste0(
      'rvalue = Polyglot.import("robj"); ',
      'console.log("JavaScript: " + rvalue.length);'))
    # JavaScript: 3
    # NULL -- the return value of eval.polyglot

(Uses r_example.js.)
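Scripts that must also run on GNU R can guard polyglot calls. The sketch below assumes that the presence of the eval.polyglot function is a sufficient signal for running on the GraalVM R engine. Errors, for instance when R was started without --polyglot, are turned into a message instead of aborting the script.

```r
# eval.polyglot is only defined by the GraalVM R engine.
polyglot_available <- exists("eval.polyglot", mode = "function")

answer <- if (polyglot_available) {
  # This may fail unless R was started with --polyglot and JavaScript
  # is installed, so errors are converted into a message.
  tryCatch(
    eval.polyglot("js", "6 * 7"),
    error = function(e) paste("polyglot call failed:", conditionMessage(e))
  )
} else {
  "eval.polyglot is not available on this R engine"
}
print(answer)
```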

R vectors are presented as arrays to other languages. This includes single-element vectors, e.g., 42L or NA. However, single-element vectors that do not contain NA can typically be used in places where the other languages expect a scalar value. An array subscript or a similar operation can be used in other languages to access individual elements of an R vector. If the element of the vector is not NA, the actual value is returned as a scalar value. If the element is NA, then a special object that looks like null is returned. The following Ruby code demonstrates this.

    vec = Polyglot.eval("R", "c(NA, 42)")
    p vec[0].nil?
    # true
    p vec[1]
    # 42
    vec = Polyglot.eval("R", "42")
    p vec.to_s
    # "[42]"
    p vec[0]
    # 42

The foreign objects passed to R are implicitly treated as specific R types. The following table gives some examples.

Example of foreign object (Java)      Viewed 'as if' on the R side
int[] {1,2,3}                         c(1L,2L,3L)
int[][] { {1, 2, 3}, {1, 2, 3} }      matrix(c(1:3,1:3),nrow=3)
int[][] { {1, 2, 3}, {1, 3} }         not supported: raises error
Object[] {1, 'a', '1'}                list(1L, 'a', '1')
42                                    42L
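To make the R-side column concrete, the sketch below builds the R values named in the table in plain R and inspects their types and shapes. It does not involve any foreign objects and only illustrates what the R-side views look like.

```r
# R-side counterparts of the table entries, built in plain R.
v <- c(1L, 2L, 3L)
m <- matrix(c(1:3, 1:3), nrow = 3)
l <- list(1L, "a", "1")

str(v)          # an integer vector
print(dim(m))   # matrix() fills column by column
print(m)
str(l)          # a list can mix integers and strings
```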

In the following code example, we simply pass the Ruby array to the R built-in function sum, which works with the Ruby array as if it were an integer vector.

    sum(eval.polyglot('ruby', '[1,2,3]'))

Foreign objects can also be explicitly wrapped into adapters that make them look like the desired R type. In such a case, no data copying occurs if possible. The code snippet below shows the most common use cases.

    # gives a list instead of an integer vector
    as.list(eval.polyglot('ruby', '[1,2,3]'))
    # assume the following Java code:
    # public class ClassWithArrays {
    #   public boolean[] b = {true, false, true};
    #   public int[] i = {1, 2, 3};
    # }
    x <- new('ClassWithArrays') # see Java interop below
    as.list(x)
    # gives: list(c(T,F,T), c(1L,2L,3L))

For more details, please refer to the executable specification of the implicit and explicit foreign object conversions.

Note that R contexts started from other languages or Java (as opposed to via the bin/R script) will default to non-interactive mode, similar to bin/Rscript. This affects console output (results are not echoed) and graphics (output defaults to a file instead of a window), and some packages may behave differently in non-interactive mode.
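Code that may run in either mode can query the mode with the base function interactive() and print results explicitly instead of relying on echoing. The following is a minimal sketch. The variable names are illustrative only.

```r
# Report the mode the current R context is running in.
mode_label <- if (interactive()) "interactive" else "non-interactive"
cat("Running in", mode_label, "mode\n")

total <- sum(1:10)
# In an embedded context the value of a bare expression is not echoed,
# so print explicitly; this works the same way in both modes.
print(total)
```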

See the Polyglot Reference and the Embedding documentation for more information about interoperability with other programming languages.

Interoperability with Java

The GraalVM R engine provides built-in interoperability with Java. Java class objects can be obtained via java.type(…). The standard new function interprets string arguments as a Java class if such a class exists. new also accepts Java types returned from java.type. Fields and methods of Java objects can be accessed using the $ operator. Additionally, you can use awt(…) to open an R drawing device directly on a Java Graphics surface; for more details see Java Based Graphics.

The following example creates a new Java BufferedImage object, plots random data to it using R's grid package, and shows the image in a window using Java's AWT framework. Note that you must start the R script with --jvm to have access to Java interoperability.

    library(grid)
    openJavaWindow <- function() {
      # create image and register graphics
      imageClass <- java.type('java.awt.image.BufferedImage')
      image <- new(imageClass, 450, 450, imageClass$TYPE_INT_RGB)
      graphics <- image$getGraphics()
      graphics$setBackground(java.type('java.awt.Color')$white)
      grDevices:::awt(image$getWidth(), image$getHeight(), graphics)
      # draw image
      grid.newpage()
      pushViewport(plotViewport(margins = c(5.1, 4.1, 4.1, 2.1)))
      grid.xaxis(); grid.yaxis()
      grid.points(x = runif(10, 0, 1), y = runif(10, 0, 1),
                  size = unit(0.01, "npc"))
      # open frame with image
      imageIcon <- new("javax.swing.ImageIcon", image)
      label <- new("javax.swing.JLabel", imageIcon)
      panel <- new("javax.swing.JPanel")
      panel$add(label)
      frame <- new("javax.swing.JFrame")
      frame$setMinimumSize(new("java.awt.Dimension",
                               image$getWidth(), image$getHeight()))
      frame$add(panel)
      frame$setVisible(TRUE)
      while (frame$isVisible()) Sys.sleep(1)
    }
    openJavaWindow()
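A smaller sketch of the same primitives (java.type, new, and the $ operator) that does not open a window. It assumes that the presence of the java.type function indicates the GraalVM R engine. On other engines, or when R was started without --jvm, it falls back to a message instead of failing.

```r
# java.type is only defined by the GraalVM R engine.
java_available <- exists("java.type", mode = "function")

size <- if (java_available) {
  tryCatch({
    # Obtain the Java class, instantiate it, and call methods with `$`.
    ArrayList <- java.type("java.util.ArrayList")
    items <- new(ArrayList)
    items$add("first")
    items$add("second")
    items$size()
  }, error = function(e) paste("Java interop failed:", conditionMessage(e)))
} else {
  "java.type is not available on this R engine"
}
print(size)
```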

For more information on FastR interoperability with Java and other languages implemented with the Truffle framework, refer to the Interoperability tutorial.

The GraalVM implementation of R provides its own rJava-compatible replacement package, available on GitHub, which can be installed using:

    $ R -e "install.fastr.packages('rJava')"

GraalVM R Engine Additional Features

Java Based Graphics

The GraalVM implementation of R includes its own Java based implementation of the grid package and the following graphics devices: png, jpeg, bmp, svg and awt (X11 is aliased to awt). The graphics package and most of its functions are not supported at the moment.

The awt device is based on the Java Graphics2D object, and users can pass it their own Graphics2D object instance when opening the device using the awt function, as shown in the Java interop example. When the Graphics2D object is not provided to awt, it opens a new window similarly to X11.

The svg device in the GraalVM implementation of R generates more lightweight SVG code than the svg implementation in GNU R. Moreover, functions tailored to manipulate the SVG device are provided: svg.off and svg.string. The SVG device is demonstrated in the following code sample. Please use the ?functionName syntax to learn more.

    library(lattice)
    svg()
    mtcars$cars <- rownames(mtcars)
    print(barchart(cars ~ mpg, data = mtcars))
    svgCode <- svg.off()
    cat(svgCode)

In-Process Parallel Execution

GraalVM R engine adds a new cluster type SHARED for the parallel package. This cluster starts new jobs as new threads inside the same process. Example:

    library(parallel)
    cl0 <- makeCluster(7, 'SHARED')
    clusterApply(cl0, seq_along(cl0), function(i) i)

Worker nodes inherit the attached packages from the parent node with copy-on-write semantics, but not the global environment. This means that you do not need to load R libraries again on the worker nodes, but values (including functions) from the global environment have to be transferred to the worker nodes, e.g., using clusterExport.
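The following sketch shows a value being transferred with clusterExport before it is used on the workers. To keep it runnable on GNU R as well, it uses a PSOCK cluster. That choice is an assumption made for illustration, and on the GraalVM R engine the type could be 'SHARED' instead. The variable offset is a made-up example.

```r
library(parallel)

# Assumption: "PSOCK" is used so that the sketch also runs on GNU R;
# on the GraalVM R engine the type could be "SHARED" instead.
cluster_type <- "PSOCK"
cl <- makeCluster(2, type = cluster_type)

# A value that the workers need but do not inherit automatically.
offset <- 10
clusterExport(cl, "offset", envir = environment())

# Each worker adds the exported offset to its input.
res <- clusterApply(cl, 1:4, function(i) i + offset)
stopCluster(cl)
print(unlist(res))
```

Calling stopCluster when the work is done releases the workers.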

Note that, unlike with the FORK or PSOCK clusters, the child nodes in a SHARED cluster run in the same process; therefore, e.g., locking files with lockfile or flock will not work. Moreover, the SHARED cluster is based on the assumption that packages' native code does not mutate shared vectors (which is a discouraged practice) and is thread-safe and re-entrant on the C level.

If the code that you want to parallelize does not match these expectations, you can still use the PSOCK cluster with the GraalVM R engine. The FORK cluster and functions depending solely on forking (e.g., mcparallel) are not supported at the moment.

[1] More technically, the GraalVM implementation of R uses a fixed MRAN URL from $R_HOME/etc/DEFAULT_CRAN_MIRROR, which is a snapshot of the CRAN repository as it was visible at a given date from the URL string.

[2] When this example is run for the first time, it installs the RcppArmadillo package, which may take a few minutes. Note that this example can be run both in R executed with GraalVM and in GNU R.