https://wiki.adelaide.edu.au/hpc/index.php/Getting_Started_Guide
Contents
1 Logging In
1.1 Windows Operating System
1.1.1 Cygwin/Windows
1.2 Linux and OS X
2 Transferring files to Phoenix
2.1 Using a file transfer client
2.2 Using terminal commands
3 Interacting with the Phoenix System: Mastering Linux Basics
4 Loading software packages
4.1 For advanced users: To install your own packages
5 Preparing to submit a job
5.1 Creating a simple job script
5.2 Creating an MPI Job Script
5.3 Debiting compute time: Multiple associations
6 HPC user guides - Selecting a Job Queue
7 Submitting A Job
8 Monitoring The Queue
9 Canceling a Job
Logging In
Access is provided by remote login using the secure shell (ssh) protocol.
To log in, you need the appropriate client software installed on your desktop or laptop computer.
Linux and OS X
6/07/2016 10:01 AM
where you replace the X's with your UofA ID number, and enter your UofA password when prompted.
3. To change your shell permanently, use 'rcshell -s /bin/bash' to set $SHELL to /bin/bash.
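On Linux or OS X the connection is made from a terminal. Following the same account and hostname pattern as the sftp command later in this guide, the login command looks like:

```shell
# Replace the X's with your UofA ID number; you will be
# prompted for your UofA password.
ssh aXXXXXXX@phoenix.adelaide.edu.au
```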
Remote file locations are specified by prepending the file path with the user and hostname as in
the example above. For further details you may wish to refer to this scp command Tutorial
(https://www.garron.me/en/articles/scp.html).
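A typical scp transfer, with the remote location written as user@hostname:path, might look like the following sketch (the filenames are illustrative only):

```shell
# Copy a local file to your home directory on Phoenix:
scp results.dat aXXXXXXX@phoenix.adelaide.edu.au:~/

# Copy a file from Phoenix back to the current local directory:
scp aXXXXXXX@phoenix.adelaide.edu.au:~/results.dat .
```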
sftp (secure ftp) provides a similar interface to the ftp command. First, navigate to the
local directory where your files reside. You can then initiate an sftp session to Phoenix
with the following command:
sftp aXXXXXXX@phoenix.adelaide.edu.au
Once the sftp session has started, you can use put and get to upload and download files to and from
the remote computer. There are other commands available in the sftp protocol; to learn more,
see this sftp tutorial (https://www.garron.me/en/articles/sftp.html).
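A short session might look like the following sketch (filenames are illustrative only):

```shell
sftp aXXXXXXX@phoenix.adelaide.edu.au
sftp> put input.dat      # upload a local file to Phoenix
sftp> get results.out    # download a file from Phoenix
sftp> quit
```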
There are two components required to submit a job for analysis in the Phoenix system.
1. The software you wish to run (and any associated data files)
2. A job script that requests system resources
To ensure a fair and optimized system for all Phoenix users, we use a resource management tool, SLURM,
for job scheduling. In order to submit a job to Phoenix, you must create a SLURM job script and save it,
along with your program files, into your directory on Phoenix. Below are sample job
scripts called <my_job.sh> for each of the two common job types, namely simple jobs and parallel (MPI)
jobs. For each job type, a downloadable version is provided for you to use. Please configure your job script
according to whichever of the following best suits your requirements.
We'll begin by explaining the purpose of each line of the script example:
The header line #!/bin/bash simply tells the scheduler which shell language is going to be used to
interpret the script. The default shell on Phoenix is bash.
The next set of lines all begin with the prefix #SBATCH. This prefix is used to indicate that we are
specifying a resource request for the scheduler.
The scheduler divides the cluster workload into partitions, or work queues. Different partitions
are used for different types of compute job. Each compute job must select a partition with the -p
option. To learn more about the different partitions available on Phoenix, see <reference>.
The Phoenix cluster is a collection of compute nodes, where each node has multiple CPU cores.
Each job must specify the CPU resources it requires, using the -N option to request the number of
nodes and the -n option to request the number of cores (tasks). See <reference>
Each compute job needs to specify an estimate of the amount of time it needs to complete. This
is commonly referred to as the walltime, specified with the --time=HH:MM:SS option. The
estimated walltime needs to be larger than the actual time needed by the job, otherwise the
scheduler will terminate the job for exceeding its requested time.
Dedicated memory (RAM) is allocated for each job when it runs, and the amount of memory
required per node must be specified with the --mem option.
A simple job is one in which the computational process is sequential and is carried out by a single node.
(Note: If your program file does not use MPI or MPI enabled libraries, your job belongs to this category.)
Depending on your computational needs, you may need to use either CPU or GPU-accelerated computing
nodes. This post provides some insights about the differences between CPU and GPU-accelerated
computing. If you need further support with this, please contact the team to discuss.
Below is a sample job script for simple CPU jobs. You will need to create a .sh file in your directory on
Phoenix, and can copy and paste the script below into that file. You can also download this job script file,
transfer it to your directory on Phoenix, and edit it as required. Remember that you must then configure the
job script to your needs. The most common fields that need modification are the number of nodes and cores
you wish to use, the duration for which you wish to run the job, and the email address to which
notifications should be sent (i.e. your email address).
#!/bin/bash
#SBATCH -p batch
#SBATCH -N 1
#SBATCH -n 2
#SBATCH --time=01:00:00
#SBATCH --mem=32GB
# Notification configuration
#SBATCH --mail-type=END
#SBATCH --mail-type=FAIL
#SBATCH --mail-user=firstname.lastname@adelaide.edu.au

# Execute the program (a sequential bash script is used here for demonstration;
# replace this line with the command that runs your own program)
bash ./my_program.sh
For simple GPU jobs, the following example job script can be copied and pasted into a new .sh file in your
Phoenix directory, or you can download this job script file, transfer it to Phoenix, and edit it as needed.
#!/bin/bash
# Configure the resources required
#SBATCH -p batch
#SBATCH -n 8
#SBATCH --time=01:00:00
#SBATCH --gres=gpu:4
#SBATCH --mem=16GB
# Configure notifications
#SBATCH --mail-type=END
#SBATCH --mail-type=FAIL
#SBATCH --mail-user=my_email@adelaide.edu.au
# Execute the program (a sequential bash script is used here for demonstration;
# replace this line with the command that runs your own program)
bash ./my_program.sh
For jobs that use GPU accelerators, <my_job.sh> will look something like the example below. You can
also obtain a copy here.
#!/bin/bash
#SBATCH -p batch
#SBATCH -n 2
#SBATCH --time=01:00:00
#SBATCH --gres=gpu:1
#SBATCH --mem=16GB
Other commonly used options that can be added as #SBATCH lines include:
#SBATCH --mail-type=END
#SBATCH --mail-type=FAIL
#SBATCH --mail-user=my_email@adelaide.edu.au
SLURM is very powerful and allows detailed tailoring to fit your specific needs. If you want to explore all
available SLURM parameters, simply type the following at the command line:
man sbatch
Submitting A Job
To submit a job script to the queue use the sbatch command:
$ sbatch my_job.sh
If your job script requires additional variables you can define these with the --export option to sbatch:
$ sbatch --export=ALL,var1=val1,var2=val2 my_job.sh
Be sure to include the ALL option to --export to ensure your job runs correctly.
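Inside the job script, variables passed with --export behave like ordinary environment variables. A minimal sketch (the variable names var1 and var2 match the example above; the echo line is illustrative):

```shell
#!/bin/bash
#SBATCH -p batch
#SBATCH -n 1
#SBATCH --time=00:10:00
#SBATCH --mem=1GB

# var1 and var2 are expected to be supplied at submission time with:
#   sbatch --export=ALL,var1=val1,var2=val2 my_job.sh
# If they were not passed in, fall back to "unset".
echo "var1 is ${var1:-unset}, var2 is ${var2:-unset}"
```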
Monitoring The Queue
Running squeue without arguments will list all currently queued and running jobs. The fifth column gives
the status of each job: R - running, PD - pending, F - failed, ST - stopped, TO - timeout. If the list
displayed is too long for you to easily locate your job, you can limit the output to your own jobs by
using the -u argument like this:
squeue -u aXXXXXXX
Canceling a Job
To cancel a job you own, use the scancel command followed by the SLURM job ID:
$ scancel 2914
To cancel all jobs you own in a particular queue, use the -p argument:
$ scancel -p batch