ARCHIVED: Run OpenMP or hybrid OpenMP/MPI jobs on Big Red II at IU

This content has been archived and is no longer maintained by Indiana University. Information here may no longer be accurate, and links may no longer be available or reliable.

Note:

Big Red II was retired from service on December 15, 2019; for more, see ARCHIVED: About Big Red II at Indiana University (Retired).


Overview

On Big Red II at Indiana University, the Extreme Scalability Mode (ESM) execution environment supports C, C++, and Fortran applications that use shared memory (OpenMP) parallelism, as well as hybrid OpenMP/MPI applications that use OpenMP threading within each MPI rank to reduce the number of ranks, and therefore the memory, required per node.

To use OpenMP on Big Red II, you must build your code with a compiler-specific command-line flag that enables OpenMP. Additionally, your TORQUE job script must set the OpenMP environment variable OMP_NUM_THREADS and use the aprun application launch command to instruct the runtime environment how to distribute processing elements.

You'll use the same compiler driver commands (cc to compile C code, CC to compile C++ code, and ftn to compile Fortran code) regardless of which programming environment module is added to your user environment. However, the command-line flag that enables OpenMP depends on which programming environment module (compiler collection) is currently loaded in your Big Red II user environment:

  Compiler collection              Programming environment module   OpenMP command-line flag
  Cray Compiler Environment        PrgEnv-cray                      (see note below)
  GNU Compiler Collection (GCC)    PrgEnv-gnu                       -fopenmp
  Intel Compiler Suite             PrgEnv-intel                     -openmp
  Portland Group (PGI) compilers   PrgEnv-pgi                       -mp
Note:
The Cray compilers have OpenMP support enabled by default.
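
To confirm that the flag took effect, you can compile and run a small test program. The following minimal C sketch (the file name omp_check.c is ours, for illustration) uses the standard _OPENMP macro, which any compiler defines when OpenMP support is enabled, to print one line per thread:

      /* omp_check.c: prints one line per OpenMP thread. */
      #include <stdio.h>
      #ifdef _OPENMP
      #include <omp.h>
      #endif

      int main(void)
      {
      #ifdef _OPENMP
          /* The thread count is taken from OMP_NUM_THREADS at run time. */
          #pragma omp parallel
          printf("Thread %d of %d\n",
                 omp_get_thread_num(), omp_get_num_threads());
      #else
          printf("Compiled without OpenMP support\n");
      #endif
          return 0;
      }

If the program prints the fallback message, the OpenMP flag for your loaded compiler collection was not applied.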

Run OpenMP applications

To run an OpenMP application in the ESM execution environment on Big Red II:

  1. Verify that the desired programming environment module is loaded. The Cray programming environment module (PrgEnv-cray) is the default on Big Red II. To see which programming environment module is currently loaded, run module list on the command line, and review the list of currently loaded modules. If necessary, use the module swap command to replace the currently loaded programming environment module with the desired one; for example, to swap PrgEnv-cray for PrgEnv-gnu, on the command line, enter:
      module swap PrgEnv-cray PrgEnv-gnu

    Alternatively, use the module unload command to remove the currently loaded programming environment:

      module unload PrgEnv-cray

    Then, load the desired programming environment module (for example, PrgEnv-gnu) by adding a line such as the following to your ~/.modules file:

      module load PrgEnv-gnu

    For more, see Use modules to manage your software environment on IU research supercomputers.

  2. Execute the language-specific compiler driver command with the compiler-specific OpenMP flag added. For example:
    • If you're using the GNU programming environment (the PrgEnv-gnu module), to compile the C program my_sourcefile.c with OpenMP support enabled, on the command line, enter:
        cc -fopenmp -o my_binary my_sourcefile.c
    • If you're using the PGI programming environment (the PrgEnv-pgi module), to compile the C++ application my_sourcefile.CC with OpenMP support enabled, on the command line, enter:
        CC -mp -o my_binary my_sourcefile.CC
    Note:
    Within a given programming environment, consult the specific compiler's manual page (for example, man gcc or man pgCC) for any options its OpenMP flag may take.
  3. Prepare a TORQUE script that includes PBS directives (specifying the resources required to run your application) and the aprun application launch command (instructing the runtime environment how to distribute processing elements among those resources).

    The following example job script runs my_binary on all 32 cores of an XE6 node on Big Red II:

      #!/bin/bash
      #PBS -l nodes=1:ppn=32
      #PBS -l walltime=00:10:00
      #PBS -N my_job 
      #PBS -q cpu
      
      export OMP_NUM_THREADS=32
      aprun -n 1 -d 32 my_binary

    In this example:

    • The -n option tells aprun to dispatch one processing element for the job.
    • The -d option tells aprun that each processing element should have a depth of 32 CPU cores (room for all 32 OpenMP threads).
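
    The OMP_NUM_THREADS value and the aprun -d value should agree. As an illustrative sketch (the job name and thread count here are ours, not from the original example), the same binary could be run on only 16 of the node's 32 cores like this:

      #!/bin/bash
      #PBS -l nodes=1:ppn=32
      #PBS -l walltime=00:10:00
      #PBS -N my_job_16
      #PBS -q cpu
      
      # Use 16 of the node's 32 cores; keep OMP_NUM_THREADS equal
      # to the aprun -d value.
      export OMP_NUM_THREADS=16
      aprun -n 1 -d 16 my_binary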

Run hybrid OpenMP/MPI applications

To run an MPI application with OpenMP threading added:

  1. Make sure the desired programming environment module is added to your user environment (refer to step 1 in the previous section).
  2. Compile your code using the language-specific compiler driver command with the compiler-specific OpenMP flag added.

    For example:

    • If you're using the GNU programming environment (the PrgEnv-gnu module), to compile the C MPI program my_mpi_sourcefile.c with OpenMP support enabled, on the command line, enter:
        cc -fopenmp -o my_mpi_binary my_mpi_sourcefile.c
    • If you're using the PGI programming environment (the PrgEnv-pgi module), to compile the C++ MPI program my_mpi_sourcefile.CC with OpenMP support enabled, on the command line, enter:
        CC -mp -o my_mpi_binary my_mpi_sourcefile.CC
    Note:
    Within a given programming environment, consult the specific compiler's manual page (for example, man gcc or man pgCC) for any options its OpenMP flag may take.
  3. Prepare a TORQUE job script to run the application using any combination of MPI ranks and threads, on any group of nodes.

    The following example job script runs a job across 32 nodes with four MPI ranks per node, each having eight OpenMP threads:

      #!/bin/bash
      #PBS -l nodes=32:ppn=32
      #PBS -l walltime=00:10:00
      #PBS -N my_mpi_job 
      #PBS -q cpu
      
      export OMP_NUM_THREADS=8
      aprun -n 128 -N 4 -d 8 my_mpi_binary

    In the above example:

    • The -n option tells aprun to dispatch a total of 128 processing elements (equivalent to MPI ranks, for the purposes of this example).
    • The -N option tells aprun to dispatch four processing elements per node.
    • The -d option tells aprun that each processing element should have a depth of eight CPU cores (each processing element can accommodate a total of eight OpenMP threads).
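
    For reference, here is a minimal hybrid C sketch of the kind these steps assume (the file name my_mpi_sourcefile.c matches the compile example above; the output format is ours, for illustration). Built with the step 2 command and launched with the script above, each of the 128 ranks would print eight lines, one per OpenMP thread:

      /* my_mpi_sourcefile.c: each MPI rank spawns OMP_NUM_THREADS threads. */
      #include <stdio.h>
      #include <mpi.h>
      #include <omp.h>
      
      int main(int argc, char *argv[])
      {
          int provided, rank, nranks;
      
          /* MPI_THREAD_FUNNELED suffices when only the main thread
             makes MPI calls. */
          MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &nranks);
      
          #pragma omp parallel
          printf("Rank %d of %d, thread %d of %d\n",
                 rank, nranks, omp_get_thread_num(), omp_get_num_threads());
      
          MPI_Finalize();
          return 0;
      }

    The Cray compiler drivers link the MPI library automatically, so the step 2 compile command needs no additional MPI flags.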

Get help

Research computing support at IU is provided by the Research Technologies division of UITS. To ask a question or get help regarding Research Technologies services, including IU's research supercomputers and research storage systems, and the scientific, statistical, and mathematical applications available on those systems, contact UITS Research Technologies. For service-specific support contact information, see Research computing support at IU.

This is document bdos in the Knowledge Base.
Last modified on 2023-04-21 16:58:31.