Managing R Threading

When you run R code that uses multi-threaded libraries, R by default detects the system's core count and attempts to start an equal number of threads. The detected count is that of the whole compute node, not the number of cores requested in your Slurm job, so the job usually starts more threads than it has cores and performs badly. In addition to hurting your job’s performance, these jobs create a high load on our compute nodes and can cause undesirable behavior. To prevent this issue, add the following code segments to your R script:

# specifies that the core count is taken from the mc.cores option set below
# (option names are case-sensitive: note the capital C in availableCores)
options(future.availableCores.methods = "mc.cores")

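# SLURM_CPUS_PER_TASK is the number of cores per task specified in the job environment
# Sys.getenv() returns a string, so convert it to an integer; fall back to 1 if it is unset
options(mc.cores = as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "1")))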

In general, the options above should be used at a minimum. However, there are usually additional changes to make to your script for each parallel library you load. These are typically function arguments or options with names such as ncores, cores, or num.threads.
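
As an illustration of a per-library setting, the sketch below passes the Slurm allocation to mclapply() from the parallel package that ships with R. The helper name slurm_cores is hypothetical and not part of any package, and the fallback to 1 core when SLURM_CPUS_PER_TASK is missing or invalid is an assumption. Other libraries name the equivalent argument differently, so check each library's documentation.

```r
library(parallel)  # ships with base R

# Hypothetical helper: read the Slurm allocation, falling back to 1 core
# when the variable is unset, empty, or not a positive integer
# (for example, when the script runs outside a Slurm job).
slurm_cores <- function() {
  n <- suppressWarnings(
    as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "1"))
  )
  if (is.na(n) || n < 1L) 1L else n
}

n_cores <- slurm_cores()

# Pass the allocated core count explicitly rather than relying on
# a detected value such as parallel::detectCores().
squares <- mclapply(1:8, function(i) i^2, mc.cores = n_cores)
```

mclapply() forks worker processes, which works on the Linux compute nodes assumed here. On Windows it only accepts mc.cores = 1.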