Running Jobs

To run a computation on the HPC cluster, you must submit the work as a “job”. This differs from a traditional computer such as a laptop or an office workstation, where work is done mostly interactively through a graphical user interface (GUI). Large jobs, or large numbers of computations, are submitted in “batches”.

The job scheduler SLURM runs all jobs submitted to the cluster. It is very similar in function to the Grid Engine scheduler used on other clusters. SLURM is a fault-tolerant cluster management and job scheduling system that allocates resources on the compute nodes to jobs so that their programs can run. It also arbitrates contention for cluster resources by managing a queue of pending work.
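
As an illustration of what submitting a job looks like, here is a minimal sketch of a SLURM batch script. The job name, resource requests, and time limit are placeholder values chosen for this example and are not taken from this cluster's configuration; partitions, accounts, and limits vary by site. The `#SBATCH` lines are read by SLURM at submission time and are ordinary comments to the shell.

```shell
#!/bin/bash
#SBATCH --job-name=hello          # name shown in the queue (placeholder)
#SBATCH --ntasks=1                # request a single task
#SBATCH --time=00:05:00           # wall-clock limit of five minutes (placeholder)
#SBATCH --output=hello_%j.out     # %j is replaced by the job ID

# SLURM sets SLURM_JOB_ID inside a running job; fall back to "unset"
# so the script also runs outside the scheduler for a quick check.
msg="Job ${SLURM_JOB_ID:-unset} running on ${HOSTNAME:-unknown}"
echo "$msg"
```

Saved under a hypothetical name such as hello.sh, a script like this would typically be submitted with `sbatch hello.sh`, and `squeue -u $USER` would list it while it is pending or running.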