Version 1.2 July 2012 IITS (Research Support) Singapore Management University
Page 1 of 35
Revision History
Version 1.0 (27 June 2012):
- Modified HPCC users guide after creating the new Ultra HPC cluster
Version 1.1 (3 July 2012):
- Modified HPCC users guide after changing the job queue selection option in the HPC job submission script
Version 1.2 (25 July 2012):
- Modified HPCC users guide after adding new job queue selection option in the HPC job submission script
Table of Contents
Introduction
    HARDWARE
    SOFTWARE
    HOW TO USE THIS GUIDE
Appendix B: The vi Text Editor
Appendix C: Cluster Software Availability List
Introduction
HARDWARE
The High Performance Computing (HPC) Cluster is available in SMU for researchers needing computational resources to run numerical analysis and simulations on the Linux platform. Managed by the Integrated IT Services (IITS) Research Support, the cluster comprises:
a. 1 x compute server with Intel Xeon X5460 3.16GHz processors (8 cores) and 16GB RAM, acting as the frontend login host
b. 8 x compute servers with Intel Xeon X5460 3.16GHz processors (8 cores) and 32GB RAM per server
c. 2 x compute servers with Intel Xeon E5620 2.40GHz processors (80 cores) and 512GB RAM per server
d. 2 x compute servers with Intel Xeon X7542 2.67GHz processors (24 cores) and 128GB RAM per server
The servers are connected through a 10 Gigabit Ethernet switch on a private network for high-speed, low-latency data transmission and are clustered using CentOS (a variant of Red Hat Enterprise Linux).
The frontend server mounts the home directory of users and serves as the software compilation and job submission host for the HPC cluster. It also handles the assignment of jobs to the compute nodes.
SOFTWARE
The list of available software in the cluster includes compilers, mathematical software and statistical software. Please refer to Appendix C for details.
If you need additional software in the cluster, please email the IITS Helpdesk (helpdesk@smu.edu.sg) or drop by the helpdesk at the basement level of the School of Accountancy Building.
Chapter 1 (Accessing HPC Cluster) tells you how you can connect to the cluster from inside and outside of the SMU network.
Chapter 2 (Job Scheduler: Univa Grid Engine) touches on the three Linux commands that you need to know to run and manage jobs in the cluster.
Chapter 3 (Software Job Submission) provides you with information required to run jobs for the specific software in the cluster.
The appendices provide quick references or refreshers when you're using the cluster:
Appendix A (Linux Commands for home directory administration) shows some Linux commands for self administration of your home folder.
Appendix B (The vi Text Editor) provides a quick look up table of shortcuts for the vi editor.
Appendix C (Cluster Software Availability List) lists the software available in the cluster.
The Ultra cluster is a shared resource. The current user disk quota is set at 12GB (soft limit) and 15GB (hard limit). Please ensure that you have sufficient disk space before running large data jobs. If you need additional disk space, please email us at helpdesk@smu.edu.sg.
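As a rough way to check your usage before a large job, the following sketch compares the size of your home directory with the limits quoted above. The helper name quota_status is hypothetical, and the sketch assumes that du reports roughly what the quota system counts, which may not hold exactly.

```shell
#!/bin/bash
# Sketch: compare home-directory usage against the quota figures in this guide.
# quota_status is a hypothetical helper, not a cluster command.
SOFT_LIMIT_KB=$((12 * 1024 * 1024))   # 12GB soft limit, in KB
HARD_LIMIT_KB=$((15 * 1024 * 1024))   # 15GB hard limit, in KB

# quota_status USAGE_KB -> prints ok, over-soft or over-hard
quota_status() {
    if [ "$1" -ge "$HARD_LIMIT_KB" ]; then
        echo "over-hard"
    elif [ "$1" -ge "$SOFT_LIMIT_KB" ]; then
        echo "over-soft"
    else
        echo "ok"
    fi
}

# du -sk reports the total size of a directory in 1KB blocks
used_kb=$(du -sk "$HOME" 2>/dev/null | cut -f1)
used_kb=${used_kb:-0}   # fall back to 0 if du produced no output
echo "Home directory usage: ${used_kb} KB ($(quota_status "$used_kb"))"
```

If the status is over-soft or over-hard, free up space or request more before submitting the job.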
To log in to the cluster, you will need a secure shell program or any Linux-based OS. Suggested secure shell programs include:

2. SSH Secure Shell
   http://www.ssh.com/support/downloads/secureshellwks/non-commercial.html
   A commercial program that is freely available for non-commercial purposes (terms and conditions apply).
Fill in the hostname ultra.smu.edu.sg as shown in Figure 1.1. At this point, please note that you must have an account in Ultra before you access it. Otherwise, you risk locking out your SMU domain account. After clicking Open, you will be prompted for some inputs. The login ID that you should use is your SMU user-id (without the smustu or smustf) and the password is your SMU domain password. After authenticating successfully, you will see a command-line shell environment awaiting your command.
1. Using WinSCP (both inside and outside of the SMU network)
   Download from: http://winscp.sourceforge.net/eng/

This program provides a graphical user interface that allows data transfer to and from Ultra, either from within SMU or from outside SMU, without requiring VPN.
Fill in the hostname, user name, password and choose the SFTP protocol as detailed in Figure 1.3 above. Once logged in, you can proceed to transfer files between the 2 displayed windows (one of the windows will be your desktop while the other will be your home directory in Ultra).
All simulation jobs must be submitted through the Univa Grid Engine. Jobs not submitted through the Grid Engine will have to be terminated to ensure the integrity of the system.
2.1.1.
This command is used to submit a job description file (which describes the job you wish to run) for job queuing and execution. More information on the job description files for specific software is provided in Chapter 3, which discusses the software in the cluster.
Syntax:
qsub [ job script ]
[ job script ]: Job description file to submit

Example: To submit a job:
qsub jobscript.sh
2.2
For every job submitted, the screen output and errors will be dumped into a file named after your job submission script in the following format: output log: job_name.sh.ojob_id
If the job did not run as expected and encountered errors, the file might log some information to aid troubleshooting.
Code samples and instructions for running them are available in each user's home directory under the subfolder ultra-samples. Please contact IITS (email: helpdesk@smu.edu.sg) for assistance with software not listed above.
3.1. Overview
Most users follow the steps below to run jobs:
1. Transfer files over to the Ultra cluster.
2. Edit files using a UNIX-friendly text editor like EditPadLite.
   Note: Windows Notepad and Wordpad are not UNIX-friendly and cannot be used. Job submission scripts created using Notepad or Wordpad always cause an error on job submission.
3. Create a job submission script by referencing the sample scripts in ultra-samples.
4. Log in to the cluster using the secure shell program.
5. Submit the job submission script using qsub. Then use qstat to check whether the job is running.
6. You can log out of the cluster (type exit) while the job is running.
7. When the job ends, you can transfer the output files back to your office PC.
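Scripts saved by Windows editors usually end each line with a carriage return plus a line feed, which is a likely cause of the submission errors mentioned in step 2. If a script has already been saved that way, a sketch like the following can strip the carriage returns on the cluster. The helper name to_unix and the file name demo-job.sh are made up for illustration.

```shell
#!/bin/bash
# Sketch: strip Windows carriage returns from a job script.
# to_unix is a hypothetical helper; demo-job.sh is only an example file name.
to_unix() {
    # tr -d '\r' deletes every carriage-return character
    tr -d '\r' < "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Demonstration on a throwaway file written with Windows line endings
printf '#! /bin/bash\r\n#$ -cwd\r\n./sim1\r\n' > demo-job.sh
to_unix demo-job.sh
cat demo-job.sh
```

Creating the script with a UNIX-friendly editor in the first place remains the simpler option.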
You need to create a job submission script for every job that you intend to submit. This job submission script contains the Univa Grid Engine switches (parameters) and software specific instructions to run your code. This section shows you how to write a job submission script.
A job submission script consists of 2 parts:
1. Univa Grid Engine (GE) switches
2. Software-specific instructions
[Figure: example job submission script with two annotated parts: GE switches, and software specifics (./sim1)]
The following 4 switches are optional, but recommended:
#$ -j y        Merge the error and output stream files into a single file
#$ -cwd        Stands for current working directory. Means to take the current folder as the reference for file paths
#$ -m e        Send an email notification when the job ends (use with the -M option below)
#$ -M email    Send job notification to this email address

1 of the 2 following switches must be indicated as the choice of platform for the job to be run:
#$ -q express.q    Send the job to the HP DL980 G7 Intel Xeon 64-bit nodes
#$ -q short.q      Send the job to the HP DL580 G7 Intel Xeon 64-bit nodes
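To show how these switches fit together, the sketch below writes a complete job submission script from a queue name, an email address and a command. The helper write_job_script and the file name demo-sim1.sh are hypothetical, and the email address is the example one used elsewhere in this guide.

```shell
#!/bin/bash
# Sketch: write a job submission script that uses the switches described above.
# write_job_script is a hypothetical helper, not a cluster command.
# Usage: write_job_script FILE QUEUE EMAIL COMMAND
write_job_script() {
    # \$ keeps a literal dollar sign so each switch line starts with "#$"
    cat > "$1" <<EOF
#! /bin/bash
#\$ -j y
#\$ -cwd
#\$ -m e
#\$ -M $3
#\$ -q $2
$4
EOF
}

write_job_script demo-sim1.sh short.q legolas@smu.edu.sg "./sim1 > out.txt"
cat demo-sim1.sh
```

The generated file can then be submitted with qsub in the usual way.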
Launching Instructions

Compiled binary
    ./[ compiled filename ] > [ output file ]
    example: ./sim1 > out.txt

Gauss pseudocode
    (see section 3.4, GAUSS)

ILOG CPLEX
    cplex < [ cplex script ] > [ output file ]
    example: cplex < sim1 > result.txt

JAVA
    . /etc/profile.d/java.sh
    java [ java classname ]
    example:
    . /etc/profile.d/java.sh
    java testrun > results.txt

MATLAB
    matlab -r '[ filename ]'
    example (the .m file extension must be omitted): matlab -r 'matrixsim1'

MATLAB compiled binary
    (see the MATLAB compilation section)

R
    R -q --vanilla < [ R filename ] > [ output file ]
    example: R -q --vanilla < runsim.R > results.txt

STATA
    (see section 3.6, STATA SE)
Reminder: To create / edit a job submission script, use a UNIX-friendly text editor from your Windows PC or use the text editors in the cluster (refer to Appendix B). Do not use Microsoft Notepad or Wordpad to do this.
1. Name the script with a file extension .sh to indicate that the file is a shell script
2. Start the script with the line: #! /bin/bash
3. Add the Grid Engine switches
4. Add the software specific instructions
The following is an example of a job submission script named mysim1.sh, which is used to run a 64-bit MATLAB compiled binary file named sim1:

#! /bin/bash
#$ -j y
#$ -cwd
#$ -m e
#$ -M legolas@smu.edu.sg
#$ -q short.q
. /etc/profile.d/matlab.sh
./sim1
Example output:
[legolas@Ultra R]$ qsub mysim1.sh
Your job 6505 ("mysim1.sh") has been submitted
[legolas@Ultra R]$
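The job ID in this message is the same number that appears in the name of the output log (see section 2.2). The following sketch derives the log name from the message. The helper job_log_name is hypothetical, and it assumes the message always has the form shown in the example output.

```shell
#!/bin/bash
# Sketch: derive the output log name from qsub's confirmation message.
# job_log_name is a hypothetical helper. The message format is taken from the
# example output in this guide, and the log format from section 2.2.
job_log_name() {
    # $1 = script name, $2 = message printed by qsub
    job_id=$(printf '%s\n' "$2" | sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p')
    echo "$1.o$job_id"
}

msg='Your job 6505 ("mysim1.sh") has been submitted'
job_log_name mysim1.sh "$msg"   # prints mysim1.sh.o6505
# On the cluster the message could be captured with: msg=$(qsub mysim1.sh)
```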
After submitting the job, you will want to check whether the job is executing. To do so, run the command qstat and check the state column:
Example output:
job-ID  prior    name        user       state  submit/start at      queue           slots  ja-task-ID
------------------------------------------------------------------------------------------------------
110     0.55500  sendjob1f.  kenyamada  r      06/28/2012 18:57:50  express.q@c3-1  1
123     0.55500  sendjob1f.  kenyamada  r      06/28/2012 19:44:20  express.q@c3-2  1
126     0.55500  sendjob1f.  kenyamada  r      06/28/2012 19:44:35  express.q@c3-2  1

Possible states are:
r:    running
qw:   queuing
t:    transfer
Eqw:  error queuing
Alternatively, run the command jobwatch, which displays a per-second update of qstat. To break out of jobwatch, press Ctrl-C.

Note: If your job state is Eqw, you need to delete the job. Then check the error and output file for hints on why the job could not run. You may also want to double-check your job script for typos. Then you can resubmit the job.
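When many jobs are queued, it can help to summarise their states rather than read every row. The sketch below counts jobs per state. The helper count_states is hypothetical, and it assumes the column layout of the example qstat output in this guide (two header lines, with the state in the fifth column).

```shell
#!/bin/bash
# Sketch: summarise job states from qstat-style output.
# count_states is a hypothetical helper. It assumes two header lines and the
# state in column 5, as in the example output in this guide.
count_states() {
    awk 'NR > 2 && NF >= 5 { n[$5]++ } END { for (s in n) print s, n[s] }'
}

# Sample rows copied from the example output
count_states <<'EOF'
job-ID prior name user state submit/start at queue slots ja-task-ID
-------------------------------------------------------------------
110 0.55500 sendjob1f. kenyamada r 06/28/2012 18:57:50 express.q@c3-1 1
123 0.55500 sendjob1f. kenyamada r 06/28/2012 19:44:20 express.q@c3-2 1
126 0.55500 sendjob1f. kenyamada r 06/28/2012 19:44:35 express.q@c3-2 1
EOF
```

On the cluster, the same helper could be fed live data with: qstat | count_states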
3.2. MATLAB
You can run MATLAB codes directly in the cluster. However, you are encouraged to compile MATLAB codes instead as compiled MATLAB jobs do not utilize network licenses. More information on MATLAB compilation can be found in the next section.
while on the Linux platform, it is written as: /home/[ your userid ]/run1/compute.m
If you use such file path references in your code, you have to change it accordingly when you port the codes over to the cluster.
matlab -r 'simulation'
The following command submits the job submission script: qsub sendjob1.sh
To compile MATLAB codes, scripts must first be marked as functions. This usually involves putting in a header line like function main. The following command then compiles the code inside MATLAB:
>> mcc -mv simulation1.m
Using the above example, a few related files will be created in the folder:
1. simulation (this is the compiled binary)
2. files with extension .c (these can be removed)
3. simulation.prj (this can be removed)
4. simulation.log (this can be removed)
5. run_simulation.sh (this can be removed)
. /etc/profile.d/matlab.sh
./compiled_simulation
The following command submits the job submission script: qsub sendjob1.sh
3.4. GAUSS
There are no GAUSS licenses available on the compute nodes. Instead, the compute nodes are installed with the GAUSS Run-Time Module that can run GAUSS-compiled pseudo-codes. On the cluster, the front-end node has the following GAUSS packages:
1. Constrained Maximum Likelihood add-on
2. Constrained Optimization add-on
3. GAUSS Run-Time module
while on the Linux platform, it is written as: /home/[ your userid ]/run1/compute.m
If you use such file path references in your code, you have to change it accordingly when you port the codes over to the cluster.
The above command will produce compute.e.gcg pseudo-code file in the folder which can then be run from the Linux command line using the gsrun command. All that is needed now is a submission job script for submitting the job. See section below.
In the following example, compute.e.gcg is the GAUSS pseudo-code file that we want to run. We create a job submission script sendjob1.sh to describe compute.e.gcg as the Gauss pseudo-code file to be run:
gsrun -b compute.e.gcg
The following command submits the job submission script: qsub sendjob1.sh
3.5. R
R is available on all nodes in the cluster.
Example job submission script sendjob1.sh:
#! /bin/bash
#$ -j y
#$ -cwd
#$ -m e
#$ -M legolas@smu.edu.sg
#$ -q short.q
R -q --vanilla < simulation.R
The following command submits the job submission script: qsub sendjob1.sh
3.6. STATA SE
STATA SE is available only on the Intel EM64T and AMD64 nodes in the HPC Cluster.
stata -b do sortdata
The following command submits the job submission script: qsub sendjob1.sh
The following illustrates an example (also found in the sub-folder Ultra-samples/ilog) for submitting a CPLEX job:

(a) create the CPLEX problem: (example file name problem.lp)
------------------------------
maximize
 x1 + 2 x2 + 3 x3
subject to
 -x1 + x2 + x3 <= 20
 x1 - 3 x2 + x3 <= 30
bounds
 0 <= x1 <= 40
 0 <= x2
 0 <= x3
end
------------------------------
(b) create the CPLEX script file: (example file name cplex-script)
------------------------------
read problem.lp
optimize
display solution variables x1-x3
quit
------------------------------
(example file name cplex-submit.sh)
------------------------------
#! /bin/bash
#$ -j y
#$ -cwd
#$ -m e
#$ -M legolas@smu.edu.sg
#$ -q short.q
cplex < cplex-script
------------------------------
The following command submits the job submission script: qsub cplex-submit.sh
The Intel compilers have been optimized for the Intel platform (here 64-bit architecture) while the GNU compilers are generalized x86 open-source compilers. For this cluster, it is recommended that you use the Intel compilers unless you prefer the GNU compilers.
If the code is compiled successfully, it can then be dispatched for running by creating and submitting a job submission file. See section below.
#$ -j y
The following command submits the job submission script: qsub sendjob1.sh
2 methods are suggested below to compile the FORTRAN code that uses IMSL functions:
(a) First, set up the environment for the IMSL libraries:
. /opt/vni/CTT6.0/ctt/bin/cttsetup.sh
Next, type the following command line in a single continuous line (case-sensitive):
$F90 $F90FLAGS -o lin_eig_self_ex4 lin_eig_self_ex4.f90 $LINK_F90_STATIC -lpthread
The above command line will compile the code lin_eig_self_ex4.f90 into an output binary file named lin_eig_self_ex4.
(OR)
(b) First, set up the environment for the IMSL libraries:
. /opt/vni/CTT6.0/ctt/bin/cttsetup.sh
Next, use the script imsl-compile that we created and specify the FORTRAN code file:
./imsl-compile lin_eig_self_ex4
Note that the filename lin_eig_self_ex4 is not specified with the .f90 extension.
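This guide does not show the contents of imsl-compile. As an illustration only, a wrapper of that kind might look like the sketch below, which is built from the command line in method (a). The function name imsl_compile and the fallback to echo are assumptions, and the real script may differ.

```shell
#!/bin/bash
# Sketch: what a wrapper like imsl-compile might do. The real script is not
# shown in this guide, so this function is an assumption based on method (a).
# It expects cttsetup.sh to have been sourced already, which is assumed to
# define $F90, $F90FLAGS and $LINK_F90_STATIC.
imsl_compile() {
    name=${1%.f90}    # accept the name with or without the .f90 extension
    # Variables are left unquoted on purpose so multi-word flags split into arguments
    $F90 $F90FLAGS -o "$name" "$name.f90" $LINK_F90_STATIC -lpthread
}

# Dry run: if the IMSL environment is not set up, print the command line
# instead of compiling
F90=${F90:-echo}
imsl_compile lin_eig_self_ex4
```

Without the IMSL environment, the last line only prints the compiler arguments it would have used, which makes the sketch safe to try on any machine.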
After successfully compiling the code, prepare the job submission script:
#$ -j y
#$ -cwd
#$ -m e
#$ -M legolas@smu.edu.sg
#$ -q short.q
The following command submits the job submission script: qsub imsl-submit.sh When the job is completed without any problems, an output file named output.txt will be created. Check the output file. If it shows Example 4 of LIN_EIG_SELF is correct, then the job ran successfully.
3.10. JAVA
The Java SDK is available across all nodes in the cluster.
3.10.1. JAVA Compiler
You can compile your Java codes using the Java compiler installed on the frontend node. You can test out the example code in the Ultra-samples/java folder in your home directory.
After the code is compiled successfully, you can submit the job using a job submission script:
The following command then submits the job submission script: qsub sendjob1.sh
---------------------------------------------------------------------------
Column 2 (blocks) shows your current quota usage.
Column 3 (quota) shows the soft limit quota.
Column 4 (limit) shows the hard limit quota.
---------------------------------------------------------------------------
In the example above, the sub-folder sim1 is taking up 841MB of space.
A.2 Directory Handling
To list folders and files, type:
ls [ options ] [ file or dir name ]
[ options ]:
[ -l ]   : Display detailed attributes of files and folders
[ -lt ]  : Sort by modification time
To create a folder:
To rename a file:
To change directory:
cd [ foldername ]
pwd
--------------------------------------------------------------------------------------------------------------------------
Starting from the left of the line:
d represents a directory while - represents a file.
rwx represents read/write/execute respectively.
The 3 sets of rwx represent the permission settings for owner/group/others respectively.
legolas on the left represents the owner while legolas on the right represents the group.
To set the permission for owner/group/others, take the numbers above and join them together. For example, rwxr-x--- is interpreted as 750.
After calculating the numbers from the above 2 steps, type the command to complete the permissions change: chmod 755 [ filename ]
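The calculation can be sketched as a small shell function that adds 4 for read, 2 for write and 1 for execute in each of the three sets. The helper name perm_to_octal is made up for illustration and is not a command on the cluster.

```shell
#!/bin/bash
# Sketch: convert a 9-character permission string such as rwxr-x--- into the
# 3-digit number that chmod expects. perm_to_octal is a hypothetical helper.
perm_to_octal() {
    perms=$1
    result=""
    for start in 1 4 7; do
        # Take the 3 characters for owner, group and others in turn
        triplet=$(printf '%s' "$perms" | cut -c"$start"-"$((start + 2))")
        digit=0
        case $triplet in r??) digit=$((digit + 4)) ;; esac   # read = 4
        case $triplet in ?w?) digit=$((digit + 2)) ;; esac   # write = 2
        case $triplet in ??x) digit=$((digit + 1)) ;; esac   # execute = 1
        result="$result$digit"
    done
    echo "$result"
}

perm_to_octal rwxr-x---   # prints 750
perm_to_octal rwxr-xr-x   # prints 755
```

The resulting number is what you pass to chmod, for example chmod 750 [ filename ].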
This is useful if you want to share a particular folder with a collaborator (contact IITS to create a common group for both you and the collaborator).
To get help for a command, type:
man [ command ]         (for manual pages of the command)
[ command ] --help      (for a brief syntax guide)
To dump content of text files to screen:
cat [ filename ]                 (to output the file to the display)
cat [ filename ] | more          (to pause the display page by page)
cat -n [ file1 ] > [ file2 ]     (display file1 with line numbers and send the output to file2)
VI uses a set of key codes to indicate commands. The following table shows the commonly used commands (case-sensitive):

File Operations                                     Keys
Write to file (save)                                :w
Write to file and Quit (save and quit)              :wq
Write to another filename file                      :w file
Quit                                                :q
Quit and discard changes                            :q!
Quit, saving changes if any                         :x

Text Editing                                        Keys
Insert text (Go into Insert mode)                   i
Change to replace text (overwrite)                  Insert
End insert text (End Insert mode)                   Esc
Delete current line (cut in ms word)                dd
Delete n number of lines (cut in ms word)           ndd
Yank the current line (copy in ms word)             yy
Yank n number of lines (copy in ms word)            nyy
Place the text from buffer (paste from clipboard
  in ms word after the cut or copy)                 p
Undo last change                                    u

Movement                                            Keys
Jump to start of file                               gg
Jump to end of file                                 G
Go to High / Medium / Low part of screen            H / M / L
Go to n lines below the High part of screen         nH
Go to n lines above the Low part of screen          nL
Move to line number n                               :n

Search                                              Keys
Search forward for pattern                          /pattern
Search backwards for pattern                        ?pattern
COMPILERS

Software: GNU C/C++ Compiler
Cmd / Location: gcc / g++
Documentation: Official Software: http://www.gnu.org/software/gcc/
Man Pages: Type: man gcc

Software: GNU F77 Compiler
Cmd / Location: g77
Man Pages: Type: man g77
MATHEMATICAL AND STATISTICAL

Software: GAUSS
- Constrained Max. Likelihood
- Constrained Optimization
- Run-Time Module
Cmd / Location: tgauss, gauss, gsrun
Documentation: Official Software: http://www.aptech.com
Man Pages: Type: gauss -h, gsrun -h

Software: MATLAB
Cmd / Location: matlab
Documentation: Official Software: http://www.mathworks.com
Man Pages: Type: matlab -h
Software: STATA SE
Cmd / Location: stata

Software: ILOG CPLEX
Cmd / Location: cplex
MATHEMATICAL LIBRARY ROUTINES (FOR USE WITH FORTRAN COMPILERS)

Software: IMSL Fortran Numerical Libraries (32-bit version)
Cmd / Location: $F90 $F90FLAGS -o example example.f90 $LINK_F90_STATIC -lpthread
Documentation: Official Software: http://www.vni.com
Man Pages: None