The Deepthought cluster uses the Slurm Resource Manager for job queuing and scheduling. The cluster has several queues with different priorities.
Deepthought currently has 4 queues. In order of descending priority, the queues are shown in the table below:
| Queue | Description |
|-------|-------------|
| | This queue is usable only by faculty who have access to Deepthought. |
| class | This queue is usable only by students currently enrolled in classes for which the instructor has requested access to Deepthought. Use this queue for any multiple-node jobs. |
| class_long | This queue is usable only by students currently enrolled in classes for which the instructor has requested access to Deepthought. Use this queue for long-running jobs that require only one node. |
Since wait times can vary, non-interactive batch jobs are the recommended way to run programs on the cluster. However, if your program is inherently interactive (e.g. Matlab), you can also request an interactive shell on one or more compute nodes.
The easiest way to submit a batch job is to create a wrapper script that includes comments to instruct Slurm how to run your job. Below is an example script, which we will call "myWrapperScript.sh," to run the program "myprog" in the class queue:
#!/bin/bash
#PBS -q class
#PBS -l nodes=1:gpu
#PBS -l walltime=00:05:00
#PBS -N MyJobName
./myprog arg1 arg2 arg3
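Since the cluster runs Slurm, the same job can also be written with native "#SBATCH" directives instead of "#PBS" comments. The following is an assumed translation, not a confirmed site configuration: it presumes the "class" queue maps to a Slurm partition of the same name and that GPU nodes are requested through a "gpu" generic resource (both are site-configuration assumptions).

#!/bin/bash
#SBATCH --partition=class        # queue name; assumes a matching Slurm partition
#SBATCH --nodes=1                # number of nodes to reserve
#SBATCH --gres=gpu:1             # GPU request; assumes a "gpu" GRES is configured
#SBATCH --time=00:05:00          # walltime limit
#SBATCH --job-name=MyJobName     # job name
./myprog arg1 arg2 arg3

Such a script is submitted with sbatch in the same way as the "#PBS" version.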
This script runs "myprog" on a single node with a maximum of 5 minutes of walltime. The "#PBS" comments correspond to the arguments accepted by the qsub command:
- "#PBS -q": Required. Specify the name of the queue for this job.
- "#PBS -l nodes=1:gpu": Optional but recommended. Specify the number of nodes to reserve, but be careful not to request more than the queue's resource limits will allow. A node requirement in the resource list is necessary to request more than 1 node or to specify a node attribute. You may substitute any desired node attribute in place of "gpu." Multiple attributes may be specified, separated by colons; however, take care to ensure that nodes exist that satisfy all of the specified attributes.
- "#PBS -l walltime": Optional but recommended. Specify the maximum time your job is allowed to run. If you know that your job should complete within a certain amount of time, this argument is recommended. It will prevent your job from keeping a node occupied for longer than necessary in the event that something is wrong with your program.
- "#PBS -N": Optional. Up to 15 non-whitespace characters to identify your job.
To submit your job to the queue, use the qsub command (or its Slurm equivalent, sbatch):
[user@deepthought-login ~]$ qsub ./myWrapperScript.sh
[user@deepthought-login ~]$ sbatch ./myWrapperScript.sh
Slurm automatically redirects stdout and stderr to files, and once your job completes, it copies them to your home directory. If you provided a name for your job, these files will be named "jobname.ojobid" and "jobname.ejobid," respectively. For example, if you submitted "myWrapperScript.sh" as written above, and it was assigned the Torque job id 1135, the output and error files would be named "MyJobName.o1135" and "MyJobName.e1135," respectively.
If you were to run "myWrapperScript.sh" but without the line that provides the job name, the output and error files would be named "myWrapperScript.sh.o1135" and "myWrapperScript.sh.e1135," respectively. Again, this assumes that 1135 is the Torque job ID.
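Note that jobs submitted with native Slurm directives name their output differently: by default, Slurm combines stdout and stderr into a single file named "slurm-jobid.out" in the submission directory. To approximate the Torque naming described above, the output and error file patterns can be set explicitly in the batch script ("%j" expands to the job ID):

#SBATCH --output=MyJobName.o%j   # stdout file pattern
#SBATCH --error=MyJobName.e%j    # stderr file pattern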
If you need an interactive shell on a compute node to run your program (e.g. Matlab), use qsub with the "-I" flag (or, under Slurm, srun with the "--pty" flag):
[user@deepthought-login ~]$ qsub -I -q class -l nodes=1:sixcore
[user@deepthought-login ~]$ srun -p class --pty /bin/bash
This will give you an interactive shell on one of the nodes you reserved (1 node by default). In this case, the job will run on a node configured with the "sixcore" attribute. Any other node attribute may be substituted.
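Under Slurm, node attributes are requested with the "--constraint" option instead; this assumes that the corresponding node features (e.g. "sixcore") have been defined in the cluster's Slurm configuration. For example:

[user@deepthought-login ~]$ srun -p class --constraint=sixcore --pty /bin/bash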
You can run MPI jobs as standard Slurm batch jobs. For example, to run the MPI program "myMPIprog" with 16 MPI processes per node on 3 nodes, submit the following script:
#PBS -q class
#PBS -l nodes=3:fourcore
#PBS -l walltime=00:05:00
#PBS -N MyJobName
OMPI_MCA_mpi_yield_when_idle=0 /opt/openmpi-1.4.3-gcc44/bin/mpirun --hostfile $PBS_NODEFILE -np 48 myMPIprog
Torque automatically creates a file containing the list of nodes allocated to a job and stores the path to that file in the $PBS_NODEFILE environment variable. As with any other job, MPI jobs can be executed interactively by submitting a standard interactive job and running mpirun.
Note: because of our Torque configuration, mpirun is unaware that each node has multiple processors, so running more than one process per node forces MPI processes to run in "degraded" mode (see http://www.open-mpi.org/faq/?category=running#oversubscribing). To force them to run in "aggressive" mode, which is the correct behavior if you have no more processes per node than CPU cores/threads, set the environment variable OMPI_MCA_mpi_yield_when_idle=0.
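For reference, a native Slurm version of the MPI script above might look like the sketch below. It assumes the same "class" partition and "fourcore" node feature exist under Slurm; if Open MPI was built with Slurm support, mpirun can discover the node allocation itself, so no hostfile argument is needed:

#!/bin/bash
#SBATCH --partition=class        # assumes a "class" partition exists
#SBATCH --constraint=fourcore    # assumes a "fourcore" node feature is defined
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=16
#SBATCH --time=00:05:00
#SBATCH --job-name=MyJobName
OMPI_MCA_mpi_yield_when_idle=0 /opt/openmpi-1.4.3-gcc44/bin/mpirun -np 48 myMPIprog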
Checking the Status of Jobs and Queues
Another very useful Torque command is qstat, which provides information about jobs and queues. Below is a list of common qstat commands and equivalent SLURM commands:
- 'qstat' or 'squeue': Show all running, queued, and held jobs.
- 'qstat -a' or 'squeue -l': Same as above, but provides more information about each job.
- 'qstat -f': Same as above, but provides extremely verbose information about each job.
- 'qstat jobid' or 'squeue -j jobid': same as 'qstat,' but show only the job with the specified ID.
- 'qstat -f jobid': same as 'qstat -f,' but show detailed information only for the job with the specified ID.
- 'qstat -q': List the Torque queues, showing the resource limits for each queue.
- 'qstat -Q': List the Torque queues, showing job statistics for each queue.
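squeue also accepts filters. For example, to show only your own jobs (substitute your actual username):

[user@deepthought-login ~]$ squeue -u username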
Deleting/Canceling a Job
To delete a queued or running job, use the qdel or scancel command followed by the job ID. For example, to cancel job number 1573, type:
[user@deepthought-login ~]$ qdel 1573
[user@deepthought-login ~]$ scancel 1573
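scancel can also cancel jobs by filter rather than by ID. For example, to cancel all of your own queued and running jobs (substitute your actual username):

[user@deepthought-login ~]$ scancel -u username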
Additional Information on Torque Commands
For more complete information on all Torque commands, consult the Torque Commands Overview.
For translation of Torque commands to SLURM, consult this Quick Guide.