Below is an example of submitting a batch R job to the cluster. If you have questions about creating a SLURM file, submitting a SLURM job, or running R jobs interactively, please check the links below:
This R example reads in an Operational Taxonomic Unit (OTU) table (otu_table.csv) of microbial abundance counts and normalizes it by removing missing entries, replacing zero values with nominal values, and then scaling all values. The normalized table is then written to a new file, normalized_otu_matrix.csv.
Contents of the SLURM file:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem=4G
#SBATCH --time=00:10:00
#SBATCH --job-name=R_normalize
#SBATCH --error=R_normalize.%J.err
#SBATCH --output=R_normalize.%J.out
module load R/4.0.2
Rscript normalize.R
This script requests 1 node, 1 core, 4 GB of memory, and 10 minutes of run time from the SLURM job scheduler. The job name is R_normalize. In --error=R_normalize.%J.err and --output=R_normalize.%J.out, %J is replaced by the job ID once the job starts to run.
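Before submitting, it can help to review exactly which resources a submit file requests. The snippet below is a minimal sketch of one way to do that; the function name list_sbatch_directives is a hypothetical helper, not part of SLURM, and it assumes every directive starts at the beginning of a line with #SBATCH.

```shell
#!/bin/bash
# Hypothetical helper (not a SLURM command): print the option part of every
# "#SBATCH" line in a submit file, one option per line.
list_sbatch_directives() {
    # With -n, sed prints only the lines where the substitution succeeded,
    # so lines that do not start with "#SBATCH" are skipped.
    sed -n 's/^#SBATCH[[:space:]]*//p' "$1"
}

# Usage (assumes SLURM_R.submit is in the current directory):
#   list_sbatch_directives SLURM_R.submit
```

For the submit file above, this would list the seven options, starting with --nodes=1 and ending with the --output pattern.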
To download the example files, use the command below:
wget https://scholarblogs.emory.edu/rsph-hpc/files/2020/09/R_example.zip
To submit this job to the cluster, use the command sbatch SLURM_R.submit.
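If you want to reuse the job ID in later commands (for example with squeue -j), you can capture it from the sbatch output instead of copying it by hand. The following is a small sketch; extract_job_id is a hypothetical helper name, and it assumes sbatch prints its usual "Submitted batch job <id>" line. sbatch also offers a --parsable option that prints the ID without the surrounding text, which may be simpler on your cluster.

```shell
#!/bin/bash
# Hypothetical helper: read sbatch output on stdin and print only the job ID.
extract_job_id() {
    # The confirmation line has the form "Submitted batch job <id>",
    # so the ID is the fourth whitespace-separated field.
    awk '/^Submitted batch job/ {print $4}'
}

# Usage on the cluster (not run here, since it needs sbatch):
#   job_id=$(sbatch SLURM_R.submit | extract_job_id)
#   squeue -j "$job_id"
```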
Below is a full walkthrough:
#Create an R_example directory
[jzhan61@clogin01 ~]$ mkdir R_example
[jzhan61@clogin01 ~]$ cd R_example/
#Download example files
[jzhan61@clogin01 R_example]$ wget https://scholarblogs.emory.edu/rsph-hpc/files/2020/09/R_example.zip
--2020-09-24 15:04:04-- https://scholarblogs.emory.edu/rsph-hpc/files/2020/09/R_example.zip
Resolving scholarblogs.emory.edu (scholarblogs.emory.edu)... 34.196.187.114, 34.198.138.92
Connecting to scholarblogs.emory.edu (scholarblogs.emory.edu)|34.196.187.114|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 797805 (779K) [application/zip]
Saving to: ‘R_example.zip’
R_example.zip 100%[========================================================================================================================================>] 779.11K --.-KB/s in 0.1s
2020-09-24 15:04:05 (5.16 MB/s) - ‘R_example.zip’ saved [797805/797805]
#Unzip example files
[jzhan61@clogin01 R_example]$ unzip R_example.zip
Archive: R_example.zip
inflating: normalize.R
inflating: otu_table.csv.gz
inflating: README.md
inflating: SLURM_R.submit
#Check the list of files
[jzhan61@clogin01 R_example]$ ls
normalize.R otu_table.csv.gz README.md R_example.zip SLURM_R.submit
#Print the contents of SLURM_R.submit to screen
[jzhan61@clogin01 R_example]$ cat SLURM_R.submit
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem=4G
#SBATCH --time=00:10:00
#SBATCH --job-name=R_normalize
#SBATCH --error=R_normalize.%J.err
#SBATCH --output=R_normalize.%J.out
module load R/4.0.2
Rscript normalize.R
#Submit this R job to the cluster using the sbatch command
[jzhan61@clogin01 R_example]$ sbatch SLURM_R.submit
Submitted batch job 14059
#Check the job status using the job ID
[jzhan61@clogin01 R_example]$ squeue -j 14059
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
14059 short-cpu R_normal jzhan61 R 0:14 1 node8
#Once the job is completed, squeue no longer lists it (only the header row is shown)
[jzhan61@clogin01 R_example]$ squeue -j 14059
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
#Check the list of files in the folder again. The result file normalized_otu_matrix.csv has been generated successfully.
[jzhan61@clogin01 R_example]$ ls
normalized_otu_matrix.csv normalize.R otu_table.csv.gz README.md R_example.zip R_normalize.14059.err R_normalize.14059.out SLURM_R.submit
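After the job leaves the queue, a quick check of the result file and the error log confirms whether the run produced what was expected. The sketch below assumes the file names used in this example (normalized_otu_matrix.csv and R_normalize.<job id>.err) and is run from the R_example directory; check_job_logs is a hypothetical helper name. A non-empty error log does not always mean failure, because R also writes ordinary messages and warnings to it, so the function only flags it for review.

```shell
#!/bin/bash
# Hypothetical helper: check the outputs of the example job with the given ID.
check_job_logs() {
    local job_id="$1"
    local err_file="R_normalize.${job_id}.err"
    local result="normalized_otu_matrix.csv"

    # The job is considered failed if the result file was never written.
    if [ ! -f "$result" ]; then
        echo "missing result file: $result"
        return 1
    fi

    # -s is true only when the file exists and has a size greater than zero.
    if [ -s "$err_file" ]; then
        echo "result file present, error log not empty: review $err_file"
    else
        echo "result file present, error log empty or absent"
    fi
}

# Usage (job ID taken from the walkthrough above):
#   check_job_logs 14059
```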