General information

SLURM is the job scheduler used on all our clusters. Although different versions are installed for different groups of clusters, the basic commands are the same. Here is a list of them:

Command   Information                                                  Usage
sbatch    Submit a job script to SLURM                                 sbatch $jobscript
srun      Run a parallel job without a job script                      srun --ntasks=N --cpus-per-task=CPT
salloc    Allocate nodes for direct access                             salloc --nodes=$nNodes
sinfo     View information about SLURM nodes and partitions            sinfo
squeue    View information about jobs in the SLURM scheduling queue    squeue
scancel   Cancel a job                                                 scancel $jobID
sacct     Display accounting data of jobs                              sacct -u $user
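A typical workflow with these commands could look like the following sketch (the job script name and the job ID are placeholders, not output from a real cluster):

```shell
# Submit a job script; sbatch replies with "Submitted batch job <jobID>"
sbatch job.sh

# Inspect your queued and running jobs
squeue -u $USER

# Show accounting data for your jobs
sacct -u $USER

# Cancel a job by its ID (12345 is a placeholder)
scancel 12345
```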

SLURM examples

Preliminary information

SLURM allows users to set different options specifying the resources that will be allocated and, potentially, used by the submitted job. Here is a list of the most common ones:

# To specify the number of processes (~ MPI ranks) to be spawned
--ntasks=N
# To specify how many cores each task will use (~ number of OpenMP/OmpSs threads)
--cpus-per-task=CPT
# To specify the maximum execution time of your job.
# The job will be killed after this amount of time
--time=HH:MM:SS
# To specify the name of the job. Useful for identifying it in squeue/sinfo/sacct output
-J $jobName
# To specify the files where standard output and error will be redirected.
# If -e is not specified, stderr will be redirected to stdout.
# If -o is not specified, the output file will be placed in the working directory,
# with the name slurm-%j.out. %j is the job ID.
-o out/file-%j.out
-e err/file-%j.err
# To specify the partition where the job will be submitted.
# Each cluster has its own partitions.
--partition=$partitionName
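Combined into one file, a minimal job script using the options above might look like this sketch (the partition, file paths, and binary name are illustrative placeholders):

```shell
#!/bin/bash
#SBATCH --ntasks=4               # 4 processes (~ MPI ranks)
#SBATCH --cpus-per-task=2        # 2 cores (~ threads) per task
#SBATCH --time=00:10:00          # job is killed after 10 minutes
#SBATCH -J example_job           # name shown in squeue/sinfo/sacct output
#SBATCH -o out/file-%j.out       # stdout; %j is the job ID
#SBATCH -e err/file-%j.err       # stderr
#SBATCH --partition=thunder      # partition names differ per cluster

srun ./my_binary                 # placeholder binary
```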

Job Dependencies

As a user, one can specify dependencies that a job must fulfill before it starts executing. These dependencies can be on other jobs or on a future point in time. The following dependencies are allowed:

  • Between jobs (specified with --dependency=$type:$jobID)
    • After job begin (type "after")
      • This job can begin execution after the specified jobs have begun execution.
    • After job finish (type "afterany")
      • This job can begin execution after the specified jobs have terminated.
    • After job fail (type "afternotok")
      • This job can begin execution after the specified jobs have terminated in some failed state (non-zero exit code, node failure, timeout, etc.).
    • After job finish successfully (type "afterok")
      • This job can begin execution after the specified jobs have run to completion with an exit code of zero.
    • Singleton (--dependency=singleton)
      • This job can begin execution after any previously submitted job sharing the same job name and user has terminated.
  • Time dependency
    • Start Time
      • Submit the batch script to the SLURM controller immediately, as usual, but tell the controller to defer the allocation of the job until the specified time.
      • The time format can be one of the following:
        • --begin=16:00
        • --begin=now+4hour
        • --begin=now+60 (seconds by default)
        • --begin=2010-01-20T12:34:00
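As a sketch, these dependencies are passed to sbatch with the --dependency (or --begin) option; the job script names below are placeholders:

```shell
# Submit a first job and capture its ID (--parsable makes sbatch print only the job ID)
jobid=$(sbatch --parsable first_step.sh)

# Run only after the first job completed with exit code zero
sbatch --dependency=afterok:${jobid} second_step.sh

# Run only if the first job terminated in a failed state
sbatch --dependency=afternotok:${jobid} cleanup.sh

# Defer the allocation until 4 hours from now
sbatch --begin=now+4hour deferred.sh
```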


OpenMP

#!/bin/bash
#SBATCH --partition=thunder
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --out=omp-%j.out
#SBATCH --err=omp-%j.err
#SBATCH --time=10:00
srun ./omp_binary


OmpSs

#!/bin/bash
#SBATCH --partition=thunder
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --out=ompss-%j.out
#SBATCH --err=ompss-%j.err
#SBATCH --time=10:00
export NX_ARGS="--smp-workers=8"
srun ./ompss_binary


MPI

#!/bin/bash
#SBATCH --partition=thunder
#SBATCH --ntasks=48
#SBATCH --cpus-per-task=1
#SBATCH --out=mpi-%j.out
#SBATCH --err=mpi-%j.err
#SBATCH --time=10:00
srun ./mpi_binary


CUDA

#!/bin/bash
#SBATCH --partition=jetson-tx
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --out=cuda-%j.out
#SBATCH --err=cuda-%j.err
#SBATCH --time=10:00
#SBATCH --gres=gpu
srun ./cuda_binary


MPI+OpenMP

#!/bin/bash
#SBATCH --partition=thunder
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=48
#SBATCH --out=mpi_omp-%j.out
#SBATCH --err=mpi_omp-%j.err
#SBATCH --time=10:00
srun ./mpi_omp_binary

MPI+OpenMP with CPU binding

The following code shows an example, on the Thunder cluster, of how to bind different MPI ranks to different cores. For more information, check the man page of the srun command.

#!/bin/bash
#SBATCH --partition=thunder
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=48
#SBATCH --out=mpi_omp-%j.out
#SBATCH --err=mpi_omp-%j.err
#SBATCH --time=10:00
# With maskCPU0/maskCPU1 set to per-socket masks, this maps rank 0 to one
# socket and rank 1 to the other
srun --cpu_bind=verbose,mask_cpu:${maskCPU0},${maskCPU1} ./mpi_omp_binary
# With maskCPU0 selecting the even cores and maskCPU1 the odd cores, this maps
# rank 0 to the even cores and rank 1 to the odd cores
srun --cpu_bind=verbose,mask_cpu:${maskCPU0},${maskCPU1} ./mpi_omp_binary
# This maps tasks to sockets. If the number of tasks differs from the number
# of allocated sockets, the binding may be sub-optimal.
srun --cpu_bind=verbose,sockets ./mpi_omp_binary
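The mask values themselves are left as variables above. As a sketch of how they could be computed, the snippet below builds the even-core/odd-core and per-socket masks for a hypothetical 48-core node with two 24-core sockets (the core counts are assumptions for illustration, not cluster facts):

```shell
#!/bin/bash
# Build hex CPU masks for a hypothetical node with 48 cores (2 x 24-core sockets).
ncores=48

# Masks selecting the even cores (bits 0,2,4,...) and the odd cores (bits 1,3,5,...)
even=0; odd=0
for (( c = 0; c < ncores; c++ )); do
    if (( c % 2 == 0 )); then
        (( even |= 1 << c ))
    else
        (( odd |= 1 << c ))
    fi
done
maskCPU0=$(printf '0x%x' "$even")
maskCPU1=$(printf '0x%x' "$odd")

# Per-socket masks for the "one rank per socket" mapping
sock0=$(( (1 << 24) - 1 ))   # cores 0-23
sock1=$(( sock0 << 24 ))     # cores 24-47

echo "even/odd masks: $maskCPU0 $maskCPU1"
printf 'socket masks: 0x%x 0x%x\n' "$sock0" "$sock1"
```

These values would then be passed to srun as shown above, e.g. --cpu_bind=verbose,mask_cpu:${maskCPU0},${maskCPU1}.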
wiki/prototype/slurmusage.txt · Last modified: 2017/04/11 10:44 (external edit)