Use R on Big Red II at IU

On Big Red II at Indiana University, you can set up and run R batch jobs on compute nodes in the native Extreme Scalability Mode (ESM) execution environment, or run interactive R jobs on compute nodes in the Cluster Compatibility Mode (CCM) execution environment. Both execution environments are features of the Cray Linux Environment (CLE) operating system running on Big Red II.

Set up your user environment

To add the default R module, you need to have the Intel programming environment module already added to your user environment. To determine which modules are currently loaded, on the command line, enter:

  module list

If another programming environment module (e.g., PrgEnv-cray) is loaded, use the module swap command to replace it with the PrgEnv-intel module; on the command line, enter:

  module swap PrgEnv-cray PrgEnv-intel

To use any of the non-default versions of R available on Big Red II, add the GNU programming environment (the PrgEnv-gnu module) instead.

To load the default R package, on the command line, enter:

  module load r

To load a non-default R package, specify the version number; for example:

  module load r/3.1.1

To make permanent changes to your environment, edit your ~/.modules file. For more, see Use a .modules file in your home directory to save your user environment on an IU research supercomputer.

For example, to make sure the required modules are loaded every time you log into Big Red II, add the following lines to your ~/.modules file:

  module swap PrgEnv-cray PrgEnv-intel
  module load r

For more about using Modules to configure your user environment, see Use Modules to manage your software environment on IU's research computing systems.
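Before launching R, it can help to confirm that the expected modules are actually loaded. The following is a minimal sketch, not an official UITS tool: is_loaded is a hypothetical helper name, and the sketch assumes that module list prints entries such as PrgEnv-intel/version separated by whitespace and that it may write to standard error (hence the 2>&1 redirection).

```shell
#!/bin/bash
# Sketch: report whether the modules this page relies on are loaded.
# is_loaded is a hypothetical helper, not part of the Modules package.

# Return 0 if module name $1 appears as a whole entry in the listing text $2.
is_loaded() {
    local pattern="(^|[[:space:]])$1(/|[[:space:]]|\$)"
    printf '%s\n' "$2" | grep -Eq "$pattern"
}

# Capture the listing; "module list" may write to stderr, so merge the streams.
if command -v module >/dev/null 2>&1; then
    listing=$(module list 2>&1)
else
    listing=""   # the module command is not available in this shell
fi

for name in PrgEnv-intel r; do
    if is_loaded "$name" "$listing"; then
        echo "$name: loaded"
    else
        echo "$name: not loaded"
    fi
done
```

Run the sketch from a login shell (or source it), because the module command is usually defined as a shell function in your login environment.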

Submit an R batch job

To submit an R batch job that will run on Big Red II's compute nodes in the ESM execution environment:

  1. Create an R file (for example, R_input.R) containing the commands R should run.
  2. Create a job script (for example, R_job) that includes the aprun command to launch R on a compute node in the ESM execution environment (see the job script example below).
  3. Submit your job script (for example, R_job) to the TORQUE resource manager; at the command prompt, enter:
      qsub R_job
  4. To check the status of your job, use the qstat command (replace username with your IU username):
      qstat -u username
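Steps 3 and 4 can be combined in a small wrapper. The following is a sketch under stated assumptions: submit_and_check is a hypothetical function name, and the sketch relies on qsub printing the new job's identifier on standard output so that qstat can be asked about that one job.

```shell
#!/bin/bash
# Sketch: submit a job script, then show the status of the job just created.
# submit_and_check is a hypothetical helper name.

submit_and_check() {
    local script="$1"
    if ! command -v qsub >/dev/null 2>&1; then
        echo "qsub not found; run this on a Big Red II login node" >&2
        return 1
    fi
    # qsub prints the identifier of the new job on standard output.
    local jobid
    jobid=$(qsub "$script") || return 1
    echo "Submitted job: $jobid"
    # Show the status of that job only.
    qstat "$jobid"
}

# Example call (commented out so that sourcing this sketch submits nothing):
# submit_and_check R_job
```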

Job script example

The following example can be used (with some minor modifications) to run an R batch job on all 32 processors of one compute node in Big Red II's ESM execution environment:

  #PBS -l nodes=1:ppn=32
  #PBS -l walltime=01:00:00
  #PBS -q cpu
  #PBS -m abe
  #PBS -M
  #PBS -N my_R_job

  cd /N/u/username/BigRed2/your_working_directory

  aprun -n 32 R CMD BATCH --no-save R_input.R

The TORQUE directives in the example script do the following:

  Directive                   Function
  #PBS -l nodes=1:ppn=32      Sets resource requirements for the job to one node, 32 processors per node
  #PBS -l walltime=01:00:00   Requests one hour of wall-clock time for the job
  #PBS -q cpu                 Sends the job to Big Red II's routing queue (cpu); jobs submitted to the cpu routing queue are placed in the normal, long, or serial queue based on their resource requirements (for more, see Big Red II queue information)
  #PBS -m abe                 Sets event notification to send email if the job is (a) aborted, when it (b) begins, and when it (e) ends
  #PBS -M                     Indicates where to send event notifications (add your IU email address after -M)
  #PBS -N my_R_job            Assigns a job name (my_R_job)

The commands in the body of the script do the following:

  • The cd command changes the working directory to the job submission directory (where the R input file is located) before executing further commands; this is necessary because TORQUE scripts execute in your home directory by default.
  • The aprun -n 32 command launches the specified application on all 32 cores of one compute node in the ESM execution environment.
  • The R CMD BATCH --no-save R_input.R string starts R in batch mode, tells R not to save an image of the current workspace at the end of the session (--no-save), and specifies the file from which R should take its input (R_input.R).
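To try the pieces described above end to end, the two files can be generated from the shell. This is an illustrative sketch, not the official template: the R commands are placeholders, the mail directives are left out, and the script changes to $PBS_O_WORKDIR (the TORQUE variable holding the directory from which qsub was run) instead of a hard-coded path, which assumes you submit the job from the directory that holds R_input.R.

```shell
#!/bin/bash
# Sketch: write a tiny R input file and a matching job script
# into the current directory.

# R commands for the batch run (contents are illustrative only).
cat > R_input.R <<'EOF'
x <- c(1, 2, 3, 4)
print(mean(x))
EOF

# Job script modeled on the example above; the mail directives are omitted.
cat > R_job <<'EOF'
#PBS -l nodes=1:ppn=32
#PBS -l walltime=01:00:00
#PBS -q cpu
#PBS -N my_R_job

# Assumption: the job is submitted from the directory containing R_input.R.
# PBS_O_WORKDIR is set by TORQUE to the directory where qsub was run.
cd "$PBS_O_WORKDIR"

aprun -n 32 R CMD BATCH --no-save R_input.R
EOF

echo "Wrote R_input.R and R_job in $(pwd)"
```

When no output file is named, R CMD BATCH writes its output to a file named after the input file with "out" appended (here, R_input.Rout), so that file is the place to look once the job ends.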

Run R interactively

If your interactive session will require less than 20 minutes of processor time, you can load the required modules and launch R from the Big Red II command line; for example:

  gchawwaa@login2:~> module swap PrgEnv-cray PrgEnv-intel
  gchawwaa@login2:~> module load r
  gchawwaa@login2:~> R

If your interactive session will require more than 20 minutes of processor time, you must run an interactive job on the compute nodes in Big Red II's Cluster Compatibility Mode (CCM) execution environment.

Because the login nodes are not intended for computational work, UITS strongly recommends this method of interactive execution.

To run an interactive R job on Big Red II:

  1. Make sure your user environment is configured properly. In addition to the PrgEnv-intel and r modules, you must also load the ccm module. Enter the following commands, or add them to your ~/.modules file:
      module swap PrgEnv-cray PrgEnv-intel
      module load r
      module load ccm
  2. From the command line, enter the qsub command with the -I (interactive), -l gres=ccm (use CCM), and -q cpu (CPU queue) flags added; for example:
      qsub -I -l walltime=01:00:00 -l nodes=1:ppn=32 -l gres=ccm -q cpu

    When the requested resources are available, your job will start. Once the CCM execution environment is initialized, you'll be placed on one of Big Red II's aprun nodes:

      chebacca@login2:~> qsub -I -l walltime=01:00:00 -l nodes=1:ppn=32 -l gres=ccm -q cpu
      qsub: waiting for job 788009 to start
      qsub: job 788009 ready
      In CCM JOB:  788009  JID  788009  USER  chebacca  GROUP  wook
      Initializing CCM environment, Please Wait
      CCM Start success, 1 of 1 responses
      Directory: /N/u/chebacca/BigRed2
      Thu Mar 26 16:09:44 EDT 2018
  3. From the aprun command line, enter the ccmlogin command:
      chebacca@aprun8:~> ccmlogin

    This will place you on a Big Red II compute node (for example, nid00885):

      Warning: Permanently added '[nid00885]:203' (RSA) to the list of known hosts.
  4. From the compute node command prompt, enter R to launch R:
      chebacca@nid00885:~> R
      R version 3.1.1 (2014-07-10) -- "Sock it to Me"
      Copyright (C) 2014 The R Foundation for Statistical Computing
      Platform: x86_64-unknown-linux-gnu (64-bit)
      R is free software and comes with ABSOLUTELY NO WARRANTY.
      You are welcome to redistribute it under certain conditions.
      Type 'license()' or 'licence()' for distribution details.
      R is a collaborative project with many contributors.
      Type 'contributors()' for more information and
      'citation()' on how to cite R or R packages in publications.
      Type 'demo()' for some demos, 'help()' for on-line help, or
      'help.start()' for an HTML browser interface to help.
      Type 'q()' to quit R.

To use the features of the R graphical user interface (GUI), you must SSH to Big Red II with X forwarding enabled, and then use qsub with the -I (interactive) and -X (X forwarding) switches, as well as the -l gres=ccm (use CCM) switch; for example:

  lpawaroo@login1:~> qsub -I -X -l walltime=01:00:00 -l nodes=1:ppn=32 -l gres=ccm -q cpu
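Requesting -X is only useful when the login session itself has X forwarding active, which SSH signals by setting the DISPLAY variable. The following sketch (build_qsub_cmd is a hypothetical helper name) prints the interactive qsub command from this page, adding -X only when a display value is supplied; it prints the command rather than running it.

```shell
#!/bin/bash
# Sketch: print the interactive qsub command, with -X only if a display is set.
# build_qsub_cmd is a hypothetical helper name.

build_qsub_cmd() {
    local display="$1"
    local cmd="qsub -I"
    if [ -n "$display" ]; then
        cmd="$cmd -X"    # X forwarding is active in the login session
    fi
    echo "$cmd -l walltime=01:00:00 -l nodes=1:ppn=32 -l gres=ccm -q cpu"
}

# Print the command that matches the current session.
build_qsub_cmd "${DISPLAY:-}"
```

If the printed command lacks -X when you expected it, reconnect to Big Red II with X forwarding enabled in your SSH client before submitting the job.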

Get help

Support for IU research computing systems, software, and services is provided by the Research Technologies division of UITS. To ask a question or get help, contact UITS Research Technologies.

This is document bdrv in the Knowledge Base.
Last modified on 2019-01-21 16:56:50.
