The Phoenix cluster nodes are grouped into sets called partitions, each fulfilling a different purpose. These partitions can be considered separate "job queues", each of which can have constraints on certain resources (nodes, processors, memory, time, etc.), so-called quality of service (QoS) constraints, associated with it. The Phoenix cluster runs the SLURM scheduler to manage all partitions.
The following will help you select the right partition for your job.
Contents
- 1 How to submit to a partition (queue)
- 2 Partitions
How to submit to a partition (queue)
There are two ways to tell SLURM which partition your job should run on. The first is to specify the partition directly on the command line when submitting a job:
sbatch -p <partition> jobscript
The second, more common, way is to define it in your jobscript via:
#SBATCH -p <partition>
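For example, a minimal job script using the second method might look like the following sketch (the partition name, resource values and the echo body are illustrative placeholders; substitute your own program):

```shell
#!/bin/bash
# Minimal SLURM job script (illustrative values; adjust to your job).
#SBATCH -p batch            # partition (queue) to submit to
#SBATCH -n 1                # number of tasks
#SBATCH --time=01:00:00     # wall-time limit
#SBATCH --mem=4G            # memory per node

# Replace this placeholder with your actual program:
PARTITION="${SLURM_JOB_PARTITION:-batch}"
echo "Running in partition: ${PARTITION}"
```

Submit it with sbatch jobscript. The #SBATCH lines are comments to the shell, so SLURM reads them as directives while a plain bash run ignores them.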
Partitions
The Phoenix cluster currently supports a number of different major SLURM partitions:
- batch
- cpu
- skylake
- skypool
- gpu
- volta
- test
- skytest
- highmem
The batch partition
This is the default partition, and in the majority of cases you should submit your job to the batch partition. When the batch partition is specified, the scheduler uses an internal algorithm to determine which partitions your job is eligible for, and the job is routed to all eligible partitions. The following partitions are considered for job eligibility:
- cpu, skylake, skypool for CPU-only jobs
- gpu, volta for GPU jobs
The majority of jobs will be eligible for multiple partitions, and will run in the first partition with available resources that meet the job requirements.
The cpu, skylake and skypool partitions
These partitions are meant for general computational jobs that do not require GPU accelerators and need only a moderate amount of RAM. Most of the jobs submitted to Phoenix will run in these partitions.
cpu partition limits
The cpu partition targets Phoenix nodes that have the Haswell Intel CPU architecture. It is also the partition with the most resources (i.e. nodes) associated with it.
| cpu job constraints | |
| --- | --- |
| Max job time | 3-00:00:00 |
| Max cpus/node | 32 |
| Max RAM per cpu | 4 GB |
| Max RAM per node | 125 GB |
| Cost | 1 SU per CPU hour |
| Can use GPUs | No |
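As a sketch, a job script requesting the full per-node limits of the cpu partition could look like this (values are illustrative; the arithmetic at the end is only a worked SU-cost example, not part of a real job):

```shell
#!/bin/bash
# Illustrative cpu-partition job using the per-node maxima from the table.
#SBATCH -p cpu
#SBATCH -N 1                  # one node
#SBATCH -n 32                 # max 32 cpus/node
#SBATCH --mem-per-cpu=4G      # max 4 GB RAM per cpu
#SBATCH --time=3-00:00:00     # max job time: 3 days

# At 1 SU per CPU hour, 32 cpus for the full 72 hours cost at most:
MAX_SU=$((32 * 72))
echo "Maximum possible cost: ${MAX_SU} SU"
```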
skylake partition limits
The skylake partition targets Phoenix nodes that have the Skylake Intel CPU architecture. This partition is suitable for general computational jobs that do not require any GPU accelerators, and a moderate to large amount of RAM. The Skylake nodes have up to 384 GB of memory available, which is triple the amount available to the Haswell nodes.
| skylake job constraints | |
| --- | --- |
| Max job time | 3-00:00:00 |
| Max cpus/node | 40 |
| Max RAM per cpu | 9665 MB |
| Max RAM per node | 377 GB |
| Cost | 1.25 SU per CPU hour |
| Can use GPUs | No |
skypool partition limits
Jobs that are eligible to run on the skylake partition will also be considered for the skypool partition. The purpose of the skypool partition is to maximise CPU utilisation across the Skylake nodes by allowing CPU-only jobs that have a sufficiently small core-per-node requirement to run alongside GPU jobs in the volta partition.
| skypool job constraints | |
| --- | --- |
| Max job time | 3-00:00:00 |
| Max cpus/node | 32 |
| Max RAM per cpu | 9665 MB |
| Max RAM per node | 301 GB |
| Cost | 1.25 SU per CPU hour |
| Can use GPUs | No |
Long QoS
If it is not possible to run your job within three days, even after e.g. increasing the number of CPUs or amount of RAM, or adding check-pointing (saving the current state and restarting from it), we offer a long QoS upon special request. Each request is assessed on an individual basis. Accessing the long QoS requires a fair-share factor of greater than 0.25.
If you have been granted access to the long QoS, you can select it by adding the following to your job submission script:
#SBATCH --qos=long
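Assuming access has been granted, a long-QoS job script might look like the following sketch (partition and resource values are illustrative):

```shell
#!/bin/bash
# Illustrative long-QoS job; only valid once access has been granted.
#SBATCH -p cpu
#SBATCH -n 8
#SBATCH --qos=long
#SBATCH --time=7-00:00:00   # long QoS raises the limit to 7 days

# 7 days expressed in hours, as used for SU accounting:
LONG_HOURS=$((7 * 24))
echo "Requested wall time: ${LONG_HOURS} hours"
```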
The long QoS features are the following:
| long QoS constraints | |
| --- | --- |
| Max job time | 7-00:00:00 |
The gpu and volta partitions
These partitions are suitable for computational jobs which require GPU accelerators. Programmes that can run efficiently on GPUs can see speed-ups of 10x or better (comparing one GPU hour with one CPU hour).
gpu partition limits
The gpu partition targets the Phoenix nodes that have Nvidia K80 accelerators. For the Nvidia K80s, the cost for one GPU-hour is 8 service units (SU). The minimum number of cores per GPU is 2, and the maximum is 4. The K80 nodes have 2 x K80 cards, and as each K80 is a dual-GPU device, there is a maximum of 4 GPUs that can be used per node.
| gpu job constraints | |
| --- | --- |
| Max job time | 2-00:00:00 |
| Max cpus/node | 16 |
| Max RAM per cpu | 4 GB |
| Max RAM per node | 64 GB |
| Cost | 8 SU per GPU hour |
| Can use GPUs | Yes |
The resources available for jobs that are assigned to the gpu partition depend on the number of GPUs requested. The scheduler controls these limits by automatically assigning each job one of three QoS flags, named gxs, gxm and gxl. These flags are designed to optimise overall system utilisation, with each providing different core and memory limits based on the number of GPUs required. The following table summarises the constraints.
| QoS | gxs | gxm | gxl |
| --- | --- | --- | --- |
| GPUs/node | 1 | 2 | 3 or 4 |
| Max cpus/node requestable | 4 | 8 | 16 |
| Max RAM/node requestable | 16 GB | 32 GB | 64 GB |
| Max RAM per cpu | 4 GB | 4 GB | 4 GB |
| Max job time | 2-00:00:00 | 2-00:00:00 | 2-00:00:00 |
| Cost | 8 SU per GPU hour | 8 SU per GPU hour | 8 SU per GPU hour |
Jobs submitted to the gpu partition will also be considered for routing to the volta partition, if the resource requirements allow for it.
If you wish to specifically request K80 GPUs for your job, you can do so using the kepler gres subtype. For example, to request 2 kepler GPUs, place the following line in your job script:
#SBATCH --gres=gpu:kepler:2
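Putting the pieces together: a single-K80 job falls under the gxs QoS, so it can request at most 4 cpus and 16 GB of RAM per node. A sketch with illustrative values:

```shell
#!/bin/bash
# Illustrative single-K80 job; one GPU means the gxs QoS limits apply.
#SBATCH -p gpu
#SBATCH --gres=gpu:kepler:1   # one K80 GPU
#SBATCH -n 4                  # gxs: up to 4 cpus/node
#SBATCH --mem=16G             # gxs: up to 16 GB RAM/node
#SBATCH --time=2-00:00:00     # max job time: 2 days

# At 8 SU per GPU hour, one GPU for the full 48 hours costs at most:
K80_SU=$((8 * 48))
echo "Maximum possible cost: ${K80_SU} SU"
```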
volta partition limits
The volta partition targets the Phoenix nodes that have Nvidia V100 accelerators. This partition is suitable for all computational jobs which require up to 2 GPU accelerators per node. In particular, the Nvidia V100 cards have a number of tensor cores that are optimised for deep learning workloads.
For the Nvidia V100s, the cost for one GPU-hour is 32 service units (SU). The minimum number of cores per GPU is 2, and the maximum is 8. There is a maximum of 2 V100 GPUs that can be used per node.
| volta job constraints | |
| --- | --- |
| Max job time | 2-00:00:00 |
| Max cpus/node | 16 |
| Max RAM per cpu | 9665 MB |
| Max RAM per node | 151 GB |
| Cost | 32 SU per GPU hour |
| Can use GPUs | Yes |
The resources available for jobs that are assigned to the volta partition depend on the number of GPUs requested. The scheduler controls these limits by automatically assigning each job one of two QoS flags, named vxs and vxm. These flags are designed to optimise overall system utilisation, with each providing different core and memory limits based on the number of GPUs required. The following table summarises the constraints.
| QoS | vxs | vxm |
| --- | --- | --- |
| GPUs/node | 1 | 2 |
| Max cpus/node requestable | 8 | 16 |
| Max RAM/node requestable | 76 GB | 152 GB |
| Max RAM per cpu | 9.5 GB | 9.5 GB |
| Max job time | 2-00:00:00 | 2-00:00:00 |
| Cost | 32 SU per GPU hour | 32 SU per GPU hour |
If you wish to specifically request V100 GPUs for your job, you can do so using the volta gres subtype. For example, to request 2 volta GPUs, place the following line in your job script:
#SBATCH --gres=gpu:volta:2
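For example, a job using both V100s on a node falls under the vxm QoS and can be sketched as follows (values illustrative):

```shell
#!/bin/bash
# Illustrative dual-V100 job; two GPUs means the vxm QoS limits apply.
#SBATCH -p volta
#SBATCH --gres=gpu:volta:2    # both V100 GPUs on the node
#SBATCH -n 16                 # vxm: up to 16 cpus/node
#SBATCH --mem=150G            # within the vxm per-node RAM limit
#SBATCH --time=2-00:00:00     # max job time: 2 days

# At 32 SU per GPU hour, two GPUs for the full 48 hours cost at most:
V100_SU=$((32 * 2 * 48))
echo "Maximum possible cost: ${V100_SU} SU"
```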
The test and skytest partitions
These partitions are meant for job testing. Jobs submitted to them have elevated priority to ensure a quick start time. Strict limits apply to the wall time and the number of cpus a test job can use, and running through the testing partitions is three times more expensive. Furthermore, users with a fair-share factor of less than 0.25 cannot use these queues until their fair-share factor rises above that threshold.
test partition limits
The test partition targets Phoenix nodes which have Haswell cores and K80 GPUs.
| test job constraints | |
| --- | --- |
| Max job time | 02:00:00 |
| Max RAM per cpu | 4 GB |
| Max RAM per node | 64 GB |
| Cost | 3 SU per CPU hour, 24 SU per GPU hour |
| Max number of cpus | 16 |
| Max number of GPUs | 4 |
| Can use GPUs | Yes |
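A short test job might look like the following sketch (values illustrative; keeping requests small helps the elevated-priority queue turn over quickly):

```shell
#!/bin/bash
# Illustrative test-partition job with a short wall time.
#SBATCH -p test
#SBATCH -n 2
#SBATCH --time=00:30:00

# At 3 SU per CPU hour, 2 cpus for the full 2-hour limit cost at most:
TEST_SU=$((3 * 2 * 2))
echo "Maximum possible cost: ${TEST_SU} SU"
```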
skytest partition limits
The skytest partition targets Phoenix nodes which have Skylake cores and V100 GPUs.
| skytest job constraints | |
| --- | --- |
| Max job time | 02:00:00 |
| Max RAM per cpu | 9665 MB |
| Max RAM per node | 190 GB |
| Cost | 3.75 SU per CPU hour, 96 SU per GPU hour |
| Max number of cpus | 20 |
| Max number of GPUs | 2 |
| Can use GPUs | Yes |
The highmem partition
This partition is meant for jobs that have high memory requirements. The Phoenix cluster currently holds 3 nodes with 512 GB of RAM and 3 nodes with 1.5 TB of RAM. Demand for these nodes can fluctuate. Make sure to check your memory requirements thoroughly before submitting to the highmem partition to avoid unnecessary queuing times.
highmem partition limits
The highmem partition targets Phoenix nodes which have Haswell cores and a minimum of 512 GB of RAM.
| highmem job constraints | |
| --- | --- |
| Max job time | 3-00:00:00 |
| Max RAM per cpu | 16 GB (512 GB nodes) or 48 GB (1.5 TB nodes) |
| Max RAM per node | 503 GB (512 GB nodes) or 1511 GB (1.5 TB nodes) |
| Cost | 1 SU per CPU hour |
| Can use GPUs | No |
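As a sketch, a highmem job on one of the 512 GB nodes could request up to the 16 GB per-cpu limit, e.g. 16 cpus for 256 GB in total (values illustrative):

```shell
#!/bin/bash
# Illustrative highmem job on a 512 GB node (16 GB per-cpu limit).
#SBATCH -p highmem
#SBATCH -n 16
#SBATCH --mem-per-cpu=16G     # per-cpu maximum on the 512 GB nodes
#SBATCH --time=3-00:00:00

# Total memory requested: 16 cpus at 16 GB each:
TOTAL_GB=$((16 * 16))
echo "Total memory requested: ${TOTAL_GB} GB"
```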