On Big Red at IU, how do I use the paralleljob script to submit jobs?
The paralleljob script provides a convenient way to submit
parallel (multiple-processor) programs to the LoadLeveler batch
queuing system. A program must consist of a single executable file;
master/worker programs in which the master and workers are separate
executable files are not supported. For complete documentation,
enter man paralleljob on Big Red.
The paralleljob command
When you submit a job with paralleljob, you may specify
the number of processes to start, how long the job should be allowed
to run, and the queue to which the job should be submitted. The
default is to launch four processes for up to two hours in the LONG
queue of Big Red. The general form of the command is:
paralleljob program-name [program-options] [-CPUS np] [-wallhours n] [-queue queue-name]
Items in brackets are optional. Replace the example text above as follows:
- For program-name, substitute the name of the program to submit.
- For program-options, substitute the command-line options you want to pass to the program.
- For np, use the number of processes to start.
- For n, use the number of hours that the job should be allowed to run.
- For queue-name, substitute the name of the queue to which the job will be submitted.
For example, suppose you've written a program called speedster that takes options that specify speed and the name of the file to be processed. To run the program with four processes for up to two hours in the LONG queue, you would enter:
paralleljob speedster -speed super mydata.dat
To launch 16 processes and run for up to 48 hours, you would enter:
paralleljob speedster -speed super mydata.dat -CPUS 16 -wallhours 48
To launch 512 processes and run for 10 hours in the NORMAL queue of Big Red, you would enter:
paralleljob speedster -speed super mydata.dat -CPUS 512 -wallhours 10 -queue NORMAL
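If you assemble submissions from a script, it can help to preview the command line before running it. The following is a minimal sketch, not part of paralleljob itself: build_paralleljob_cmd is a hypothetical helper that only prints a command in the same form as the examples above, using the -CPUS, -wallhours, and -queue options shown there. It assumes the program options contain no embedded spaces, because the preview does not preserve quoting.

```shell
# Hypothetical helper (not part of paralleljob): print the command line
# that would be submitted, without running anything.
# Usage: build_paralleljob_cmd CPUS HOURS QUEUE program [program-options...]
build_paralleljob_cmd() (
    cpus=$1 hours=$2 queue=$3
    shift 3
    # "$*" joins the program name and its options with single spaces.
    printf 'paralleljob %s -CPUS %s -wallhours %s -queue %s\n' \
        "$*" "$cpus" "$hours" "$queue"
)

# Preview the 512-process example from the text.
build_paralleljob_cmd 512 10 NORMAL speedster -speed super mydata.dat
```

Once the printed line looks right, you can copy it to the prompt on Big Red to submit the job.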
If the program that you wish to run is not in your default path,
be sure to use the fully qualified path name of the program. When
your job runs, the current working directory of your program is the
directory from which you ran the paralleljob command.
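One way to make sure you pass a fully qualified path is to resolve the program name before submitting. The sketch below is an illustration only: resolve_program is a hypothetical helper, and it assumes that a program not found in your path lives in the current directory.

```shell
# Hypothetical helper: print an absolute path for a program name.
resolve_program() (
    prog=$1
    case $prog in
        /*)  ;;                          # already a fully qualified path
        */*) prog=$(pwd)/$prog ;;        # relative path with a directory part
        *)
            if found=$(command -v "$prog" 2>/dev/null); then
                prog=$found              # found in the default path
            else
                prog=$(pwd)/$prog        # assumption: it is in the current directory
            fi ;;
    esac
    # Refuse anything that is not an executable file.
    [ -x "$prog" ] || { echo "not executable: $prog" >&2; exit 1; }
    printf '%s\n' "$prog"
)

# Demonstration with a program that exists on most systems.
resolve_program sh

# On Big Red, the result could be passed to paralleljob, for example:
#   paralleljob "$(resolve_program speedster)" -speed super mydata.dat
```

Note that command -v reports shell built-ins by name only, so the helper rejects a program whose name matches a built-in unless you give its path explicitly.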
Processor and wall time limits
In the default queue (LONG), you can request up to 32 nodes (128 processes) for up to 336 hours (14 days). In the NORMAL queue, you can request up to 1,024 processes for up to 48 hours (2 days). The DEBUG queue is available for debugging, and it allows up to 16 processes for up to 15 minutes. For more about the intended uses and characteristics of the batch queues, see Big Red usage policies.
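To catch an oversized request before it reaches the queue, you could check it against the limits above. This is a sketch under stated assumptions: check_limits is a hypothetical helper, the numbers are copied from this section and may change, and hours are treated as whole numbers. The DEBUG queue is left out because its 15-minute limit is shorter than one hour and this page does not say how paralleljob expresses limits below one hour.

```shell
# Hypothetical helper: succeed only if the request fits the queue limits
# listed in this section.
# Usage: check_limits QUEUE PROCESSES HOURS
check_limits() (
    queue=$1 cpus=$2 hours=$3
    case $queue in
        LONG)   max_cpus=128  max_hours=336 ;;
        NORMAL) max_cpus=1024 max_hours=48  ;;
        *) echo "no limits known for queue: $queue" >&2; exit 2 ;;
    esac
    if [ "$cpus" -gt "$max_cpus" ]; then
        echo "$queue allows at most $max_cpus processes" >&2
        exit 1
    fi
    if [ "$hours" -gt "$max_hours" ]; then
        echo "$queue allows at most $max_hours hours" >&2
        exit 1
    fi
)

# The 512-process, 10-hour example fits in NORMAL but not in LONG.
check_limits NORMAL 512 10 && echo "request fits"
```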
Accessing paralleljob
An appropriate version of mpirun must be in your path or
specified by the MPIRUN environment variable. Each flavor of MPI has
its own version of mpirun. If you have compiled your
parallel application using mpicc or another appropriate
compiler, mpirun should already be in your path.
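Before submitting, you can confirm that an mpirun is available. The sketch below is illustrative: find_mpirun is a hypothetical helper, and it assumes that MPIRUN, when set, holds the path to the mpirun executable and takes precedence over your path. This page does not state either detail, so check man paralleljob for the actual behavior.

```shell
# Hypothetical helper: print the mpirun that would be used, or fail.
find_mpirun() (
    if [ -n "${MPIRUN:-}" ]; then
        # Assumption: MPIRUN holds the path to the mpirun executable.
        [ -x "$MPIRUN" ] || { echo "MPIRUN is not executable: $MPIRUN" >&2; exit 1; }
        printf '%s\n' "$MPIRUN"
    elif found=$(command -v mpirun 2>/dev/null); then
        printf '%s\n' "$found"
    else
        echo "mpirun not found: add it to your path or set MPIRUN" >&2
        exit 1
    fi
)

# Report which mpirun was found, if any.
if path=$(find_mpirun); then
    echo "using $path"
fi
```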
Limits of paralleljob
The paralleljob script works only for parallel
applications that are "single program multiple data", that is, a
single binary. It does not work for parallel applications that consist
of more than one binary, "multiple programs multiple data".
paralleljob accepts only double-quoted arguments;
it treats single quotes as double quotes.
This document was developed with support from the National Science Foundation (NSF) under Grant No. 0503697 to the University of Chicago and subcontracted to Indiana University. Additional support was provided by IU through its participation in the TeraGrid, which is supported by the NSF under Grants No. 0833618, SCI451237, SCI535258, and SCI504075. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.
Last modified on December 09, 2008.







