Slurm examples
Job Submission
PIConGPU job submission on the Taurus cluster at TU Dresden:
tbg -s sbatch -c etc/picongpu/0008gpus.cfg -t etc/picongpu/taurus-tud/k80.tpl $SCRATCH/runs/test-001
Job Control
interactive job:
salloc --time=1:00:00 --nodes=1 --ntasks-per-node=2 --cpus-per-task=8 --partition gpu-interactivee.g.
srun "hostname"GPU allocation on taurus requires an additional flag, e.g. for two GPUs
--gres=gpu:2
details for my jobs:
scontrol -d show job 12345all details for job with <job id>12345squeue -u $(whoami) -lall jobs under my user name
details for queues:
squeue -p queueName -llist full queuesqueue -p queueName --start(show start times for pending jobs)squeue -p queueName -l -t R(only show running jobs in queue)sinfo -p queueName(show online/offline nodes in queue)sview(alternative on taurus:module load llviewandllview)scontrol show partition queueName
communicate with job:
scancel <job id>abort jobscancel -s <signal number> <job id>send signal or signal name to jobscontrol update timelimit=4:00:00 jobid=12345change the walltime of a jobscontrol update jobid=12345 dependency=afterany:54321only start job12345after job with id54321has finishedscontrol hold <job id>prevent the job from startingscontrol release <job id>release the job to be eligible for run (after it was set on hold)