Hints and FAQ for Huge

Q1: Which filesystems are available?
Q2: How do I submit jobs?
Q3: What does a typical batch job look like?
Q4: How do I compile and run MPI programs?
Q5: How can I speed up a program?
Q6: How can I limit memory usage to protect the system?


Q1: Which filesystems are available?
  On Huge, the user filesystem is the same as on the Power4 IBM cluster (Sleipner, Fenris, Hugin and Munin). It is NFS-mounted on Huge through a rather slow 100 MB/s line.
The scratch filesystem on Huge consists of four 300 GB disks, about 1.1 TB in total. When a batch job starts, a job-specific directory is created on /scratch; refer to it as /scratch/$PBS_JOBID. Please note that this scratch directory and all its contents are erased when the job terminates.
A running job should never work directly in the user filesystem. Instead, let the first step of the job copy the relevant files from the user filesystem to the scratch directory, and let the last step copy useful data back to the user filesystem.

Q2: How do I submit jobs?
  The queueing system on Huge is Torque (formerly 'OpenPBS').
To submit a one-processor job to the Huge-node:
  qsub  -l nodes=huge jobscript
To submit a one-processor job to the Huge2-node:
  qsub  -l nodes=huge2 jobscript
To submit a job requiring 8 CPUs (say):
  qsub -l nodes=huge:ppn=8 jobscript

NB: Do not use the ncpus option; it does not work correctly. Currently it is not necessary to specify the amount of memory required.
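
As an illustration only, the submission forms above can be generated by a small wrapper. The function name huge_qsub_cmd is hypothetical and not part of Torque; it prints the qsub command instead of submitting anything, so the resource request can be checked first.

```shell
#!/bin/bash
# Hypothetical helper (not part of Torque): print the qsub command line
# for a node name, a CPU count and a job script, without submitting.
huge_qsub_cmd() {
  local node=$1 ncpus=$2 script=$3
  case $node in
    huge|huge2) ;;                                  # the two nodes named above
    *) echo "unknown node: $node" >&2; return 1 ;;
  esac
  if [ "$ncpus" -gt 1 ]; then
    # Several CPUs on one node are requested with ppn, never with ncpus
    echo "qsub -l nodes=$node:ppn=$ncpus $script"
  else
    echo "qsub -l nodes=$node $script"
  fi
}

huge_qsub_cmd huge 1 jobscript
huge_qsub_cmd huge 8 jobscript
```

Once the printed line looks right, it can be run as is.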


Q3: What does a typical batch job look like?
  To achieve the best performance, all jobs should use the scratch filesystem. A prototype of a typical serial job would therefore be:
  #!/bin/bash
  #PBS -N jobname
  # Mail me when the job starts and ends
  #PBS -m abe
  # Go to the directory where the job was submitted
  cd "$PBS_O_WORKDIR"
  # Copy relevant files to scratch
  cp *.H2O "/scratch/$PBS_JOBID"
  cp my/exe/dir/myprogram "/scratch/$PBS_JOBID/myprogram"
  # Change directory to scratch
  cd "/scratch/$PBS_JOBID"
  # Run the program
  ./myprogram
  # Copy back results
  cp *.result "$PBS_O_WORKDIR"
  echo "====== Job finished at `date` ====="
  #
Note that the scratch directory is deleted automatically when the job has finished.
The job is submitted with the command qsub jobscript
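
Because the scratch directory disappears with the job, results are lost if the script stops before its final copy step. One possible safeguard, sketched here under assumptions, is to register the copy-back step with the shell's trap builtin so that it also runs on an early exit. The function name stage_out and the file demo.result are illustrative, and the temporary-directory fallbacks exist only so that the sketch can be tried outside a batch job.

```shell
#!/bin/bash
# Fall back to temporary directories so the sketch also runs outside a batch job
scratch=${PBS_JOBID:+/scratch/$PBS_JOBID}
scratch=${scratch:-$(mktemp -d)}
workdir=${PBS_O_WORKDIR:-$(mktemp -d)}

# Copy result files back to the submission directory
stage_out() {
  cp "$scratch"/*.result "$workdir"/ 2>/dev/null
}
# Run stage_out whenever the script exits, normally or after an error
trap stage_out EXIT

cd "$scratch" || exit 1
# Placeholder for the real program
echo "42" > demo.result
```

A trap cannot run if the job is killed with SIGKILL, so this reduces the risk of losing results but does not remove it.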

Q4: How do I compile and run MPI programs?
  Use the 'mpxlf_r' command to compile and link an MPI program. All switches to 'xlf' also apply to 'mpxlf_r'.
To run the MPI program, use a batch job like this:
  #!/bin/bash
  #PBS -l nodes=1:ppn=2
  #PBS -N jobname
  #PBS -S /bin/bash
  #
  # Keep these 3 definitions:
  export MP_SHARED_MEMORY=yes
  export MP_HOSTFILE=/etc/host.list
  export MP_PROCS=`awk 'END{print NR}' $PBS_NODEFILE`
  #
  #
  cd $PBS_O_WORKDIR
  here=`pwd -P`
  #
  echo "====== Job started  at `date` ====="
  #
  /usr/bin/poe $here/myprogram arguments
  #
  echo "====== Job finished at `date` ====="
  #
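
The MP_PROCS definition in the script above counts the lines of $PBS_NODEFILE, which Torque fills with one line per allocated processor slot. The sketch below runs the same awk expression on a made-up stand-in file, so the effect can be checked outside the batch system; the file contents are an assumption for illustration.

```shell
#!/bin/bash
# Stand-in for the file Torque would provide; the contents are made up
nodefile=$(mktemp)
printf 'huge\nhuge\n' > "$nodefile"

# Same awk expression as in the job script: print the number of lines read
nprocs=$(awk 'END{print NR}' "$nodefile")
echo "MP_PROCS would be set to $nprocs"

rm -f "$nodefile"
```

With nodes=1:ppn=2 the real file should likewise hold two lines, so poe would start two tasks.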

Q5: How can I speed up a program?
  If a program is memory-intensive, it might benefit from accessing memory with a larger page size. This can be done by running this command on the executable:
  ldedit -bdatapsize:64K a.out
Note: No recompilation is necessary. Just run the command on an existing executable!
Also note that 64K normally gives a noticeable performance improvement (10-15%), but a different value might suit your program better.
The current datapsize (if it has been set) can be displayed with:
  dump -vo a.out
Look for DPageSize.
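
To compare page sizes, one option is to make one copy of the executable per candidate size and time the copies against each other. The sketch below assumes that 4K and 64K are accepted values on this system. The function name make_psize_variants and the LDEDIT override variable are hypothetical, introduced only so that the loop can be dry-run with echo on a machine without ldedit.

```shell
#!/bin/bash
# Hypothetical helper: create one copy of an executable per candidate
# data page size and apply ldedit to each copy.
make_psize_variants() {
  local exe=$1 size
  shift
  for size in "$@"; do
    cp "$exe" "$exe.$size" || return 1
    # ${LDEDIT:-ldedit} allows a replacement command, e.g. LDEDIT=echo
    ${LDEDIT:-ldedit} "-bdatapsize:$size" "$exe.$size" || return 1
  done
}

# Example on Huge (candidate sizes are assumptions):
#   make_psize_variants a.out 4K 64K
#   time ./a.out.4K
#   time ./a.out.64K
```

Keeping the original a.out untouched makes it easy to return to the default if no variant is faster.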

Q6: How can I limit memory usage to protect the system?
  The Huge cluster is basically two shared-memory machines. Overallocation of memory may affect other users and may even crash the whole system. To avoid that, the user can specify the maximum amount of physical memory a job can use. If the job requests more memory than specified, the job is killed.
Example:
  qsub -l nodes=huge:ppn=3,pmem=1500mb jobscript
Here the sum of memory usage of the 3 tasks may not exceed 1.5 GB.