
Batchjobs on Karlsen

All jobs on Karlsen must be run as batch jobs through the queueing system, which is LSF from Platform Computing. Every job must be submitted to an appropriate queue. The following queues are currently defined (in order of ascending priority):

 

Queue   Description
qi      Idle queue. Jobs are run only if the system is otherwise idle.
q1      For sequential jobs.
q1r     For sequential jobs which cannot be suspended. Jobs in q1 are
        automatically moved to this queue when it has room for more
        jobs. You cannot submit jobs to this queue manually.
q1s     High-priority queue for short sequential jobs.
q4      For jobs requiring fewer than 4 CPUs.
q8      For jobs requiring fewer than 8 CPUs.
q16     For jobs requiring fewer than 16 CPUs.
q32     For jobs requiring fewer than 32 CPUs. Not generally open.
qexp    High-priority queue for short/exp. jobs.
qncwh   Special queue, not generally open.

Generally, jobs in lower-priority queues will be suspended by jobs in higher-priority queues, within the limits for the particular queue. The command 'bql' shows the currently defined limits and queue priorities. Jobs must be submitted to the right queue; for example, a job requiring 6 CPUs must be sent to queue "q8" (i.e. 'bsub -n 6 -q q8 jobscript').
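The queue choice described above can be sketched as a small shell helper. This is only an illustration: the function names pick_queue and submit_cmd are hypothetical, the helper is written in POSIX sh rather than csh, and it assumes the "fewer than N CPUs" limits in the table are strict. The limits actually in force are the ones reported by 'bql'.

```shell
#!/bin/sh
# Sketch: choose a queue name from the number of CPUs a job requests.
# Assumption: the "fewer than N CPUs" limits in the queue table are strict;
# check the real limits with 'bql' before relying on this.
pick_queue() {
    ncpu=$1
    if [ "$ncpu" -le 1 ]; then
        echo q1          # sequential job
    elif [ "$ncpu" -lt 4 ]; then
        echo q4
    elif [ "$ncpu" -lt 8 ]; then
        echo q8
    elif [ "$ncpu" -lt 16 ]; then
        echo q16
    elif [ "$ncpu" -lt 32 ]; then
        echo q32         # not generally open
    else
        echo "no queue for $ncpu CPUs" >&2
        return 1
    fi
}

# Print the submit command instead of running it (dry run).
submit_cmd() {
    queue=$(pick_queue "$1") || return 1
    echo "bsub -n $1 -q $queue $2"
}

submit_cmd 6 jobscript    # prints: bsub -n 6 -q q8 jobscript
```

Printing the command rather than executing it makes it easy to check the chosen queue before the job is actually submitted.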

When a job starts, a unique directory is created for it in the /scratch filesystem. You can refer to this directory via the SCRDIR environment variable, as in the MPI jobscript below. When the job terminates, the scratch directory and its contents are automatically erased.
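Because the scratch directory is erased when the job ends, any files that should be kept have to be copied out before the jobscript finishes. A minimal sketch of that pattern follows, written in POSIX sh. The helper names stage_in and stage_out, the file names input.dat and result.dat, and the program path are hypothetical; only SCRDIR comes from the system.

```shell
#!/bin/sh
# Sketch of staging files in and out of the per-job scratch directory.
# Assumption: input.dat, result.dat and /path/to/myprogram are placeholders.

# Copy each named file into the scratch directory.
stage_in() {
    for f in "$@"; do
        cp "$f" "$SCRDIR"/ || return 1
    done
}

# Copy one result file from scratch to a permanent directory.
stage_out() {
    cp "$SCRDIR/$1" "$2"/
}

# Typical use inside a jobscript (commented out; the names are placeholders):
#   stage_in "$HOME/input.dat"
#   cd "$SCRDIR"
#   /path/to/myprogram < input.dat > result.dat
#   stage_out result.dat "$HOME"
```

Calling stage_out as the last step keeps the results even though everything left in the scratch directory disappears with the job.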

A job-packer daemon checks every 10 minutes whether jobs that are pending due to queue limits can be started on otherwise idle CPUs.

Useful commands for handling batch jobs on Karlsen:

Submit a job to the system
% bsub -q queue -n N jobscript
- where N is the number of CPUs requested, and jobscript is the name of the file containing the job (see below).
Delete a pending or running job
% bkill -s 9 jobid
- where jobid is the unique identifier of the job, which can be seen with the 'js' command.
Display jobs in the queues
% js
- You can monitor your jobs dynamically with 'js -mon'.
Display the reason why a job is pending
% bjobs -p jobid

Example of a Gaussian 98 jobscript:

#!/bin/csh
# Disable core dumps and set the stack limit to 1024 MB.
limit core 0
limit stack 1024m
# Name of the Gaussian input file.
set INP=b.com
setenv g98root /usr/local/g98RevA7

# The _DSM variables are necessary to run Gaussian in parallel.
setenv _DSM_BARRIER SHM
setenv _DSM_OFF     OFF

# Set up the Gaussian environment, then run g98 on the input file.
source $g98root/g98/bsd/g98.login
$g98root/g98/g98 < $INP > ${INP}.out
#
NB: The subg98 tool is a more comprehensive way to submit Gaussian 98 jobs: 'subg98 gaussjob'

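Example of a parallel (MPI) jobscript:

#!/bin/csh
# Work in the job's scratch directory.
cd $SCRDIR
mpirun -np 6 /path/to/mpiprogram arg1 arg2 > outdata
# Print the result lines to the job output; outdata itself is erased
# together with the scratch directory when the job terminates.
grep Result: outdata
#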