SMP Nodes
Hybrid batch script for 48 threads/node
- Line 2: specify the total number of MPI tasks to be started by Slurm (-n <tasks>)
- Line 2: specify the total number of nodes equal to the number of tasks (-N <nodes>), i.e., 1 task per node
- Line 4: set the number of threads for each process
- Line 5: PAMPering at job level:
  - numactl appears within TACC's ibrun command, so it applies to all MPI tasks in the job
  - numactl controls behavior (e.g., process-core affinity) in exactly the same way for ALL tasks
  - there is no simple/standard way to control thread-core affinity with numactl (see the sketch after the scripts below)
Bourne shell (bash):

1 ...
2 #SBATCH -n 10 -N 10
3 ...
4 export OMP_NUM_THREADS=48
5 ibrun numactl -i all ./a.out
C shell (csh):

1 ...
2 #SBATCH -n 10 -N 10
3 ...
4 setenv OMP_NUM_THREADS 48
5 ibrun numactl -i all ./a.out
In the batch scripts above, memory allocations are interleaved among sockets with numactl -i all. Note that this policy is not necessarily recommended for all multithreaded programs. On most operating systems, the default policy is instead to allocate memory on a "first touch" basis: the first thread to write to an area of shared memory will have that address range allocated on its local NUMA node. This default policy can work well for many purposes. If one simply accepts it, then a good general rule is to have each thread initialize the same memory that it will later be working on; in this case, the numactl -i option should be omitted.
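
To make the first-touch rule concrete, here is a minimal sketch (the array size and the update loop are illustrative placeholders, not from the original): each thread initializes, under the same static schedule, exactly the slice of the array it will later update, so those pages are faulted in on that thread's local NUMA node.

#include <stdlib.h>

int main(void)
{
    const long N = 1L << 24;              /* placeholder problem size */
    double *a = malloc(N * sizeof *a);    /* no physical pages touched yet */
    if (!a) return 1;

    /* First touch: this parallel initialization, not the malloc, decides
       which NUMA node each page lands on. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < N; i++)
        a[i] = 0.0;

    /* Later work uses the same static schedule, so each thread revisits
       the pages it touched first, i.e., pages on its local NUMA node. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < N; i++)
        a[i] = 2.0 * a[i] + 1.0;

    free(a);
    return 0;
}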