Use Slurm to submit and manage jobs on IU's research computing systems
On this page:
- Overview
- Batch jobs
- Interactive jobs
- Monitor or delete your job
- View partition and node information
- Get help
Overview
The Indiana University research supercomputers use the Slurm Workload Manager to coordinate resource management and job scheduling.
Slurm user commands include numerous options for specifying the resources and other attributes needed to run batch jobs or interactive sessions. Options can be invoked on the command line or with directives contained in a job script.
Common user commands in Slurm include:
Command | Description |
---|---|
sbatch | Submit a batch script to Slurm. The command exits immediately when the script is transferred to the Slurm controller daemon and assigned a Slurm job ID. For more, see the Batch jobs section below. |
srun | Run a job on allocated resources. Commonly used in job scripts to launch programs, srun is also used to request resources for interactive jobs. |
squeue | Monitor job status information. For more, see the Monitor or delete your job section below. |
scancel | Terminate a queued or running job prior to its completion. For more, see the Monitor or delete your job section below. |
sinfo | View partition information. For more, see the View partition and node information section below. |
Batch jobs
About job scripts
To run a job in batch mode, first prepare a job script that specifies the application you want to launch and the resources required to run it. Then, use the sbatch command to submit your job script to Slurm. For example, if your script is named my_job.script, you would enter sbatch my_job.script to submit the script to Slurm. If the command runs successfully, it will return a job ID to standard output; for example:
[username@h1 ~]$ sbatch my_job.script
Submitted batch job 9472
Slurm job scripts most commonly have at least one executable line preceded by a list of options that specify the resources and attributes needed to run your job (for example, wall-clock time, the number of nodes and processors, and filenames for job output and errors). When you write a job script, tailor it to the needs of your program; most importantly, make sure it requests the proper amount of resources, including memory and time, required to run your program.
Serial jobs
A job script for running a serial batch job may look similar to the following:
#!/bin/bash
#SBATCH -J job_name
#SBATCH -p general
#SBATCH -o filename_%j.txt
#SBATCH -e filename_%j.err
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@iu.edu
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=02:00:00
#SBATCH --mem=16G
#SBATCH -A slurm-account-name
#Load any modules that your program needs
module load modulename
#Run your program
srun ./my_program my_program_arguments
In the above example:
- The first line indicates that the script should be read using the Bash command interpreter.
- The #SBATCH lines are directives that pass options to the sbatch command:
  - -J job_name specifies a name for the job allocation. The specified name will appear along with the job ID number when you query running jobs on the system.
  - -p general specifies that the job should run in the general partition.
  - -o filename_%j.txt and -e filename_%j.err instruct Slurm to connect the job's standard output and standard error, respectively, to the file names specified, where %j is automatically replaced by the job ID.
  - --mail-type=<type> directs Slurm to send job-related email when an event of the specified type(s) occurs; valid type values include all, begin, end, and fail.
  - --mail-user=username@iu.edu indicates the email address to which Slurm will send job-related mail.
  - --nodes=1 requests that a minimum of one node be allocated to this job.
  - --ntasks-per-node=1 specifies that one task should be launched per node.
  - --time=02:00:00 requests two hours for the job to run.
  - --mem=16G requests 16 GB of memory.
  - -A slurm-account-name indicates the Slurm Account Name to which resources used by this job should be charged.
    Users belonging to projects approved through RT Projects can find their allocation's Slurm Account Name on the "Home" page in RT Projects; look under "Submitting Slurm Jobs with your Project's Account". Alternatively, on the "Home" page, under "Allocations", select an allocation and look in the table under "Allocation Attributes".
    Users without RT Projects allocations should use the -A general option to indicate the general Slurm account.
    For more about RT Projects, see Use RT Projects to request and manage access to specialized Research Technologies resources.
- At the bottom are the two executable lines that the job will run. In this case, the module command is used to load a module (modulename), and then srun is used to execute the application with the arguments specified. In your script, replace my_program and my_program_arguments with your program's name and any necessary arguments, respectively.
For information about running GPU-enabled jobs, see Run GPU-accelerated jobs on Carbonate or Big Red 200 at IU.
OpenMP jobs
If your program can take advantage of multiple processors (for example, if it uses OpenMP), you can add a #SBATCH directive to pass the --cpus-per-task option to sbatch. For example, you could add this line to request that 12 CPUs per task be allocated to your job:
#SBATCH --cpus-per-task=12
If you include this line, make sure it does not request more than the maximum number of CPUs available per node (each system has a different maximum). This type of parallel program can only take advantage of multiple CPUs that are on a single node. Typically, before calling such a program, you should set the OMP_NUM_THREADS environment variable to indicate the number of OpenMP threads that can be used. Unless you want more than one thread running on each CPU, this value is typically equal to the number of CPUs requested. For example:
#Run your program
export OMP_NUM_THREADS=12
srun ./my_program my_program_arguments
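One optional variation (a sketch, not required by the example above) is to derive the thread count from the SLURM_CPUS_PER_TASK environment variable, which Slurm sets in the job's environment when --cpus-per-task is specified; this keeps the thread count in sync with your resource request:
#Run your program
#SLURM_CPUS_PER_TASK matches the --cpus-per-task value requested in the directives above
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun ./my_program my_program_arguments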
MPI jobs
If your program uses MPI (that is, the code uses MPI directives), it can take advantage of multiple processors on more than one node. Request more than one node only if your program is specifically structured to communicate across nodes. MPI programs launch multiple copies of the same program, which then communicate through MPI. One Slurm task is used to run each MPI process. For example, if your MPI program can make efficient use of 48 processes, and the maximum number of processors available on each node is 24, you could alter the above serial job script example to set --nodes=2 (to request two nodes) and --ntasks-per-node=24 (to request 24 tasks per node):
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=24
You also may want to indicate the total number of tasks in your srun command:
srun -n 48 ./my_mpi_program my_program_arguments
The number of processes that can run successfully on one node is limited by the amount of memory available on that node. If each process of a program needs 20 GB of memory, and the node has 240 GB of memory available, you could run a maximum of 12 tasks on each node. In such a case, to run 48 tasks, your script would set --nodes=4 (to request four nodes) and --ntasks-per-node=12 (to request 12 tasks per node). Your script should also set --mem to request the maximum amount of memory per node, because not all of the node's processors would be requested. To determine the correct values for your job script, make sure you know the amount of memory and the number of processors available per node on the system you are using.
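As a minimal sketch under the assumptions in that example (20 GB per process and 240 GB of memory available per node), the relevant directives might look like this:
#Hypothetical values; adjust to the memory and processor counts of your system
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=12
#SBATCH --mem=240G
Replace 240G with the maximum amount of memory actually available per node on the system you are using.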
Hybrid OpenMP-MPI jobs
In a hybrid OpenMP-MPI job, each MPI process uses multiple threads. In addition to #SBATCH directives for MPI, your script should include a #SBATCH directive that requests multiple CPUs per task, and an executable line placed before your srun command that sets the OMP_NUM_THREADS environment variable. Typically, one CPU should be allocated to each thread of each process. If each node has 24 processors, and you want to give each process four threads, then a maximum of six tasks can run on each node (if each node has enough memory available to run six copies of the program).
For example, if you want to run 12 processes, each with four threads, include the following #SBATCH directives in your script:
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=6
#SBATCH --cpus-per-task=4
Also, include a line that sets OMP_NUM_THREADS=4 before your srun command:
export OMP_NUM_THREADS=4
srun -n 12 ./my_mpi_program my_program_arguments
Other sbatch options
Depending on the resources needed to run your executable lines, you may need to include other sbatch options in your job script. Here are a few other useful ones:
Option | Action |
---|---|
--begin=YYYY-MM-DDTHH:MM:SS | Defer allocation of your job until the specified date and time, after which the job is eligible to execute. For example, to defer allocation of your job until 10:30pm October 31, 2022, use --begin=2022-10-31T22:30:00. |
--no-requeue | Specify that the job is not rerunnable. Setting this option prevents the job from being requeued after it has been interrupted, for example, by a scheduled downtime or preemption by a higher priority job. |
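As a minimal sketch, either option can be added to a job script (such as the serial example above) as an additional directive; the date and time shown are placeholders:
#Placeholder date/time; defer the job and prevent it from being requeued
#SBATCH --begin=2022-10-31T22:30:00
#SBATCH --no-requeue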
For complete documentation about the sbatch command and its options, see the sbatch manual page (on the web, see sbatch; on the IU research supercomputers, enter man sbatch).
Interactive jobs
To request resources for an interactive job, use the srun command with the --pty option.
For example:
- To launch a Bash session that uses one node in the general partition, on the command line, enter:
srun -p general -A slurm-account-name --pty bash
- To perform debugging, submit an interactive job to the debug or general partition; for example:
- To request an hour of wall time in the debug partition, on the command line, enter:
srun -p debug -A slurm-account-name --time=01:00:00 --pty bash
- To request an hour of wall time in the general partition, on the command line, enter:
srun -p general -A slurm-account-name --time=01:00:00 --pty bash
- To run an interactive job with X11 forwarding enabled, add the --x11 flag; for example:
srun -p general -A slurm-account-name --x11 --time=01:00:00 --pty bash
In the above examples, replace slurm-account-name with your allocation's Slurm Account Name.
Users belonging to projects approved through RT Projects can find their allocation's Slurm Account Name on the "Home" page in RT Projects; look under "Submitting Slurm Jobs with your Project's Account". Alternatively, on the "Home" page, under "Allocations", select an allocation and look in the table under "Allocation Attributes".
Users without RT Projects allocations should use the -A general option to indicate the general Slurm account.
For more about RT Projects, see Use RT Projects to request and manage access to specialized Research Technologies resources.
When the requested resources are allocated to your job, you will be placed at the command prompt on a compute node. From there, you can launch graphical X applications and your own binaries from the command line. You may need to load the module for a desired X client before launching the application.
When you are finished with your interactive session, on the command line, enter exit to release the allocated resources.
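As a sketch of a typical interactive session (the host names, module name, and program name below are placeholders reused from the examples above), the workflow looks like this:
[username@h1 ~]$ srun -p general -A slurm-account-name --time=01:00:00 --pty bash
[username@c18 ~]$ module load modulename
[username@c18 ~]$ ./my_program my_program_arguments
[username@c18 ~]$ exit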
If you use srun to launch an interactive session as described above, you will not be able to run additional srun commands on the allocated resources. If you need this functionality, you can instead use the salloc command to get a Slurm job allocation, execute a command (such as srun or a shell script containing srun commands), and then, when the command finishes, enter exit to release the allocated resources.
If you do not give salloc a command, your default shell is executed. From that shell, you can issue any number of commands (including srun commands), and those commands will run on the allocation. When the commands are finished, enter exit to quit the shell and release the allocated resources.
For example (replace slurm-account-name with your allocation's Slurm Account Name):
$ salloc -A slurm-account-name --nodes=1 --ntasks-per-node=24 --time=2:00:00 --mem=128G
salloc: Granted job allocation 109347
salloc: Waiting for resource configuration
salloc: Nodes c18 are ready for job
$ srun -n 24 python my_great_python_mpi_program.py
$ srun <any other commands you want to run>
$ exit
exit
salloc: Relinquishing job allocation 109347
For complete documentation about the srun command, see the srun manual page (on the web, see srun; on the IU research supercomputers, enter man srun).
For complete documentation about the salloc command, see the salloc manual page (on the web, see salloc; on the IU research supercomputers, enter man salloc).
Monitor or delete your job
To monitor the status of jobs in a Slurm partition, use the squeue command. Some useful squeue options include:
Option | Description |
---|---|
-a | Display information for all jobs. |
-j <jobid> | Display information for the specified job ID. |
-j <jobid> -o %all | Display all information fields (with a vertical bar separating each field) for the specified job ID. |
-l | Display information in long format. |
-n <job_name> | Display information for the specified job name. |
-p <partition_name> | Display jobs in the specified partition. |
-t <state_list> | Display jobs that have the specified state(s). Valid job states include PENDING, RUNNING, SUSPENDED, COMPLETED, CANCELLED, FAILED, TIMEOUT, NODE_FAIL, PREEMPTED, BOOT_FAIL, DEADLINE, OUT_OF_MEMORY, COMPLETING, CONFIGURING, RESIZING, REVOKED, and SPECIAL_EXIT. |
-u <username> | Display jobs owned by the specified user. |
For example:
- To see all jobs running in the general partition, enter:
squeue -p general -t RUNNING
- To see pending jobs in the general partition that belong to username, enter:
squeue -u username -p general -t PENDING
For complete documentation about the squeue command, see the squeue manual page (on the web, see squeue; on the IU research supercomputers, enter man squeue).
To delete your pending or running job, use the scancel command with your job's job ID; for example, to delete your job that has a job ID of 8990, on the command line, enter:
scancel 8990
Alternatively:
- To cancel a job named my_job, enter:
scancel -n my_job
- To cancel a job owned by username, enter:
scancel -u username
For complete documentation about the scancel command, see the scancel manual page (on the web, see scancel; on the IU research supercomputers, enter man scancel).
View partition and node information
To view information about the nodes and partitions that Slurm manages, use the sinfo command.
By default, sinfo (without any options) displays:
- All partition names
- Availability of each partition
- Maximum wall time allowed for jobs in each partition
- Number of nodes in each partition
- State of the nodes in each partition
- Names of the nodes in each partition
To display node-specific information, use sinfo -N, which lists:
- All node names
- Partition to which each node belongs
- State of each node
To display additional node-specific information, use sinfo -lN, which adds the following fields to the previous output:
- Number of cores per node
- Number of sockets per node, cores per socket, and threads per core
- Size of memory per node in megabytes
Alternatively, to specify which information fields are displayed and to control the formatting of the output, use sinfo with the -o option; for example (replace # with a number to set the display width of the field, and field1 and field2 with the desired field specifications):
sinfo -o "%<#><field1> %<#><field2>"
Available field specifications include:
Specification | Field displayed |
---|---|
%<#>P | Partition name (set field width to # characters) |
%<#>N | List of node names (set field width to # characters) |
%<#>c | Number of cores per node (set field width to # characters) |
%<#>m | Size of memory per node in megabytes (set field width to # characters) |
%<#>l | Maximum wall time allowed (set field width to # characters) |
%<#>s | Maximum number of nodes allowed per job (set field width to # characters) |
%<#>G | Generic resource associated with a node (set field width to # characters) |
For example, on Quartz, the following sinfo command outputs a list that includes partition names, node names, the number of cores per node, the amount of memory per node, the maximum wall time allowed per job, and the status of each node:
sinfo -No "%10P %8N %4c %7m %10l %.6t"
The resulting output looks similar to this:
PARTITION  NODELIST CPUS MEMORY  TIMELIMIT   STATE
debug      c1       128  515770  4:00:00       mix
debug      c2       128  515770  4:00:00      idle
general*   c3       128  515770  4-00:00:00    mix
general*   c4       128  515770  4-00:00:00    mix
general*   c5       128  515770  4-00:00:00    mix
general*   c6       128  515770  4-00:00:00    mix
general*   c7       128  515770  4-00:00:00    mix
general*   c8       128  515770  4-00:00:00  alloc
general*   c9       128  515770  4-00:00:00    mix
general*   c10      128  515770  4-00:00:00  alloc
general*   c11      128  515770  4-00:00:00    mix
general*   c12      128  515770  4-00:00:00    mix
[................................................]
general*   c90      128  515770  4-00:00:00   idle
general*   c91      128  515770  4-00:00:00   idle
general*   c92      128  515770  4-00:00:00   idle
For complete documentation about the sinfo command, see the sinfo manual page (on the web, see sinfo; on the IU research supercomputers, enter man sinfo).
Get help
SchedMD, the company that distributes and maintains the canonical version of Slurm, provides online user documentation, including a summary of Slurm commands and options, manual pages for all Slurm commands, and a Rosetta Stone of Workload Managers for help determining the Slurm equivalents of commands and options used in other resource management and scheduling systems (for example, TORQUE/PBS).
Support for IU research supercomputers, software, and services is provided by various teams within the Research Technologies division of UITS.
- If you have a technical issue or system-specific question, contact the High Performance Systems (HPS) team.
- If you have a programming question about compilers, scientific/numerical libraries, or debuggers, contact the UITS Research Applications and Deep Learning team.
For general questions about research computing at IU, contact UITS Research Technologies.
For more options, see Research computing support at IU.
This is document awrz in the Knowledge Base.
Last modified on 2023-08-07 16:43:57.