Use PCP to bundle multiple serial jobs to run in parallel on IU research supercomputers

Parallel Command Processor (PCP), developed by the Ohio Supercomputer Center (OSC) and the National Institute for Computational Sciences (NICS), is an application that lets you bundle multiple serial jobs and run them concurrently.

Using PCP to bundle and run your serial jobs in parallel lets you make efficient use of all the cores on a compute node. By contrast, running one serial job at a time can waste more than 90% of a node's computational power. PCP also lets you request multiple nodes for your jobs.

PCP is especially useful for running parametric studies and Monte Carlo simulations.

Add PCP to your user environment

At Indiana University, PCP is available on the IU research supercomputers. Before you can use it, you must add it to your user environment.

On Big Red 200 or Quartz, to add the default version of PCP to your user environment, enter:

module load pcp

You can save your customized user environment so that it loads every time you start a new session; for instructions, see Use modules to manage your software environment on IU research supercomputers.

Run multiple serial jobs in parallel with PCP

For a parallel job with N processors allocated, the PCP manager process reads the first N-1 commands in the command stream and distributes them to the other N-1 processors. As commands complete, the PCP manager reads the next command in the stream and runs it on an idle core.
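The dispatch pattern described above (start a fixed number of commands, then hand the next command to whichever worker becomes free) can be approximated on a single machine with xargs. This is only an analogy for illustration, not how PCP itself is implemented. It assumes an xargs that supports the -P option (the GNU and BSD versions do), and the file names demo_cmds.txt and demo_o*.txt are hypothetical.

```shell
#!/bin/bash
# Illustration only: mimic PCP-style dispatch locally with xargs (not PCP itself).
# N stands in for the number of allocated processors; N-1 commands run at once,
# mirroring the N-1 workers that remain after the manager takes one processor.
N=4
WORKERS=$((N - 1))

# Hypothetical command stream, one command per line.
cat > demo_cmds.txt <<'EOF'
echo one > demo_o1.txt
echo two > demo_o2.txt
echo three > demo_o3.txt
echo four > demo_o4.txt
echo five > demo_o5.txt
EOF

# xargs starts up to WORKERS commands and launches the next line
# as soon as one of the running commands exits.
xargs -P "$WORKERS" -I{} sh -c '{}' < demo_cmds.txt
```

With N=4, three commands start immediately and the remaining two start as earlier ones finish, which is the same behavior the PCP manager provides across the cores of a Slurm allocation.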

For example, to run 47 serial jobs in parallel:

  1. Create a text file that lists each job, one per line (for example, list.txt):
    ./a.out > o1.txt
    ./a.out > o2.txt
    ./a.out > o3.txt
    ...
    ./a.out > o46.txt
    ./a.out > o47.txt
  2. Create a Slurm job script. In your script:
    • Use the --ntasks-per-node option to request the cores needed to run all the jobs in your job list and the PCP manager process. (For this example, the script would need to request 48 cores to run the 47 jobs in list.txt plus the PCP manager process.)
    • Use srun to launch PCP and execute the jobs in your job list file.
    • For jobs on Big Red 200 or Quartz, use the -A option to indicate the Slurm Account Name to which resources used by this job should be charged.

      Users belonging to projects approved through RT Projects can find their allocation's Slurm Account Name on the "Home" page in RT Projects; look under "Submitting Slurm Jobs with your Project's Account"; alternatively, on the "Home" page, under "Allocations", select an allocation and look in the table under "Allocation Attributes".

      For more about RT Projects, see Use RT Projects to request and manage access to specialized Research Technologies resources.

    For example, on Big Red 200, the following Slurm script will launch PCP and run the 47 jobs in the list.txt job list file (replace slurm-account-name with your allocation's Slurm Account Name; replace email-address with the email address to which Slurm should send job-related mail):

    #!/bin/bash

    #SBATCH -J job_name
    #SBATCH -p general
    #SBATCH -A slurm-account-name
    #SBATCH -o filename_%j.txt
    #SBATCH -e filename_%j.err
    #SBATCH --mail-type=ALL
    #SBATCH --mail-user=email-address
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=48
    #SBATCH --cpus-per-task=1
    #SBATCH --time=02:00:00
    #SBATCH --mem=58G

    # Add PCP to the job's environment
    module load pcp

    # Launch 48 tasks: 1 PCP manager plus 47 workers, one per line in list.txt
    srun --ntasks-per-node=48 pcp list.txt
  3. Submit the script from the command line using sbatch, passing the name of your job script as the argument.
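For longer job lists, writing list.txt by hand becomes tedious. The following is a minimal sketch, assuming the same ./a.out command and oN.txt output naming used in step 1. It generates the job list with a loop and then derives the task count to request in step 2 (one task per job plus one for the PCP manager). The variable names NJOBS and NTASKS are illustrative.

```shell
#!/bin/bash
# Sketch: build the 47-line job list from step 1 with a loop instead of by hand.
NJOBS=47
: > list.txt                       # start with an empty file
for i in $(seq 1 "$NJOBS"); do
    echo "./a.out > o${i}.txt" >> list.txt
done

# One extra task is needed for the PCP manager process.
NTASKS=$(( $(wc -l < list.txt) + 1 ))
echo "Request --ntasks-per-node=${NTASKS}"
```

For the 47-job example, this reports a task count of 48, matching the value used in the job script.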

When the job starts, the 47 serial jobs in list.txt will run in parallel. When the PCP manager runs out of commands to run, it will wait for any remaining running processes to complete, and then shut itself down.
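After the job finishes, it can be useful to confirm that every command in the list actually produced its output file. The sketch below assumes the o1.txt through o47.txt naming from the example job list and treats an empty file as a sign of a problem, which may not hold if your program legitimately writes nothing. The function name check_outputs is hypothetical.

```shell
#!/bin/bash
# Sketch: after the Slurm job ends, report any expected output file
# (o1.txt ... oN.txt, as named in list.txt) that is missing or empty.
check_outputs() {
    local njobs=$1 missing=0 i
    for i in $(seq 1 "$njobs"); do
        if [ ! -s "o${i}.txt" ]; then
            echo "missing or empty: o${i}.txt"
            missing=$((missing + 1))
        fi
    done
    echo "${missing} of ${njobs} outputs missing or empty"
    [ "$missing" -eq 0 ]           # succeed only if nothing is missing
}

# Run from the directory that holds the output files.
check_outputs 47 || echo "Some jobs may need to be rerun."
```

Any commands that are reported can be copied into a new, shorter job list and resubmitted with a correspondingly smaller task count.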

