Good practices in R programming

R is a free software environment for statistical computing and graphics, available from The R Project for Statistical Computing. At Indiana University, R is available on research computing systems. R is also available via IUanyWare.

Following are guidelines and code examples that illustrate good practices in R programming. For additional help with developing R programs, contact the UITS Research Analytics group.

Avoid unnecessary operators

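R is an interpreted language: every operator in your R scripts, including each pair of parentheses, requires a name lookup every time it is evaluated.

The following two code examples are functionally equivalent. However, the first can take about twice as much processing time, because each of its ten pairs of parentheses is itself a function call that R must look up and evaluate on every iteration. Exact timings vary by machine and R version.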

Example 1:

system.time({
    I <- 0
    while (I < 100000) {
        ((((((((((10))))))))))
        I <- I + 1
    }
})
   user  system elapsed
  0.055   0.000   0.055

Example 2:

system.time({
    I <- 0
    while (I < 100000) {
        10
        I <- I + 1
    }
})
   user  system elapsed
  0.055   0.000   0.055
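
The point about parentheses can be checked directly. The following is a minimal sketch, not part of the original examples; the variable names (paren_result, sum_result, nested, plain) are chosen here for illustration only. It shows that parentheses and arithmetic operators are ordinary functions that can be called by name, and that redundant parentheses change the work done but not the value.

```r
# In R, parentheses and operators are ordinary functions.
is.function(`(`)
is.function(`+`)

# Calling them by name gives the same result as the usual syntax.
paren_result <- `(`(10)      # same as (10)
sum_result   <- `+`(1, 2)    # same as 1 + 2

# Redundant parentheses do not change the value, only the number of
# function calls needed to produce it.
nested <- ((((((((((10))))))))))
plain  <- 10
identical(nested, plain)
```

Because each pair of parentheses is one more call, removing the ones that are not needed for grouping removes work from every loop iteration.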

Avoid growing objects inside loops

Always pre-allocate vectors, lists, and data frames that you fill inside a loop. Loops in R are already slow, and growing an object inside one makes your program slower still, because each time the object grows R typically has to allocate a new, larger object and copy the existing contents into it.

Consider the following two code examples. The first accesses and grows a vector inside the for loop while the second pre-allocates the vector and accesses the vector inside the for loop without growing its size.

Example 1:

square_loop_noinit <- function(n) {
    x <- c()
    for (i in seq_len(n)) {
        x <- c(x, i^2)    # builds a new, longer vector on every iteration
    }
    x
}
system.time({
    square_loop_noinit(200)
})
   user  system elapsed
  0.257   0.000   0.257

Example 2:

square_loop_init <- function(n) {
    x <- numeric(n)       # allocate the full vector once, up front
    for (i in seq_len(n)) {
        x[i] <- i^2       # fill in place; the vector never grows
    }
    x
}
system.time({
    square_loop_init(200)
})
   user  system elapsed
  0.099   0.000   0.099
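
The same pattern applies to lists. The following is a minimal sketch under the assumption that the number of results is known before the loop starts; the function names collect_grow and collect_prealloc are hypothetical and used here only for illustration.

```r
# Hypothetical helper: grows the list by one element per iteration.
collect_grow <- function(n) {
    out <- list()
    for (i in seq_len(n)) {
        out[[length(out) + 1]] <- i^2
    }
    out
}

# Hypothetical helper: allocates all n slots first, then fills them.
collect_prealloc <- function(n) {
    out <- vector("list", n)
    for (i in seq_len(n)) {
        out[[i]] <- i^2
    }
    out
}

# Both approaches produce the same list; only the cost differs.
same_result <- identical(collect_grow(10), collect_prealloc(10))

# seq_len(n) is used instead of 1:n so that n = 0 gives an empty loop:
# 1:0 counts down (1, 0), while seq_len(0) has length zero.
length(1:0)
length(seq_len(0))
```

For a data frame, one common approach is to pre-allocate one vector per column in the same way and build the data frame once after the loop.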

Use vectorization if possible

In R, the basic data types are vectors; even a single number is a vector of length one. Whenever possible, write vectorized code or use pre-existing compiled kernels (which are already vectorized and optimized) to avoid interpreter overhead.
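
As a small illustration of what vectorized means here, the sketch below (variable names x and v are arbitrary) shows that a single number already behaves as a vector, and that arithmetic and many built-in functions operate on every element without an explicit loop.

```r
x <- 10
length(x)       # a single number has length one
is.vector(x)

v <- c(1, 2, 3)
doubled <- v * 2      # arithmetic applies to each element in turn
roots   <- sqrt(v)    # so do many built-in functions
doubled
```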

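Consider the following two code examples. The first calls rnorm 1,000 times, once per element, and returns a list of 1,000 length-one vectors; the second calls rnorm once and returns a single vector of 1,000 values. The second example achieves about a 38-fold speedup by letting the compiled kernel do the work in one vectorized call.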

Example 1:

Ply <- function() lapply(rep(1, 1000), rnorm)
system.time({
    Ply()
})
   user  system elapsed
  0.348   0.000   0.348

Example 2:

vec <- function() rnorm(1000)
system.time({
    vec()
})
   user  system elapsed
  0.009   0.000   0.009
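
Vectorization can also replace an explicit loop outright. The following is a minimal sketch, not taken from the original examples, that computes the squares of 1 through n both with a pre-allocated loop and with a single vectorized expression; the function names squares_loop and squares_vectorized are hypothetical.

```r
# Hypothetical helper: pre-allocated loop, one assignment per element.
squares_loop <- function(n) {
    x <- numeric(n)
    for (i in seq_len(n)) {
        x[i] <- i^2
    }
    x
}

# Hypothetical helper: one vectorized expression, no explicit loop.
squares_vectorized <- function(n) {
    seq_len(n)^2
}

# Both functions return the same numeric vector.
same_squares <- identical(squares_loop(200), squares_vectorized(200))
squares_vectorized(5)
```

The vectorized form is shorter and hands the iteration to compiled code, so the interpreter evaluates one expression rather than one per element.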

This is document aaxp in the Knowledge Base.
Last modified on 2017-08-17 12:00:55.
