
 

The IBM-cluster hardware


The IBM-cluster nodes have these main features:

Nodename  Type      CPUs  Freq. (GHz)  L3 cache (MB)  Memory (GB)  Local scratch (TB)
Sleipner  IBM p690  32    1.3          32 (+)         128          1
Fenris    IBM p690  32    1.3          32 (+)         128          1
Hugin     IBM p655  8     1.1          32 (+)         32           1 (*)
Munin     IBM p655  8     1.1          32 (+)         32           1 (*)

(*) shared by Hugin and Munin
(+) shared between two CPUs

Disk system: a FAStT700 providing 3.2 TB of disk storage in a SAN connected to all the nodes. Of this, 2.1 TB is allocated as a common home file system for all users.

The IBM p690 and p655 computers are based on Multi-Chip Modules (MCMs), which are "building blocks" containing 8 Power4 CPUs plus caches. In the p690s the CPU clock speed is 1.3 GHz, and in the p655s it is 1.1 GHz.
The computers are connected via Fast Ethernet to a Cisco 3550 switch for ordinary user access, and to a BayStack 350T switch for administrative purposes (GPFS).
The SAN gives all computers common access to the disk storage. The IBM Global Parallel File System (GPFS) runs on top of the SAN and provides a high-performance shared file system, removing the need for the traditional but slow Network File System (NFS). All user home directories and the scratch file systems for Hugin and Munin reside on GPFS. Sleipner and Fenris have local scratch file systems.

 

 

The Multi-Chip Module


An IBM pSeries 690 "Regatta" system with 32 CPUs consists of 4 MCMs interconnected by a high-speed fabric. Each MCM consists of 4 dual-CPU chips, each with 2 CPUs and a common L2 cache. The external L3 cache and the memory are attached to each dual-CPU chip.

An IBM pSeries 655 system with 8 CPUs consists of 1 MCM.
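The CPU counts follow directly from the MCM layout described above. The short Python sketch below only restates that arithmetic; the constant and function names are illustrative, and the total L3 figure assumes that the 32 MB in the hardware table (shared between two CPUs) is the amount attached to each dual-CPU chip.

```python
# Layout figures taken from the text above.
CPUS_PER_CHIP = 2     # each dual-CPU chip holds 2 CPUs
CHIPS_PER_MCM = 4     # each MCM holds 4 dual-CPU chips
L3_MB_PER_CHIP = 32   # assumption: the table's 32 MB L3 is per dual-CPU chip

# Number of MCMs per system type, as stated in the text.
MCMS_PER_SYSTEM = {"p690": 4, "p655": 1}


def cpu_count(mcms: int) -> int:
    """Number of CPUs in a system built from the given number of MCMs."""
    return mcms * CHIPS_PER_MCM * CPUS_PER_CHIP


def total_l3_mb(mcms: int) -> int:
    """Total L3 cache in MB, under the per-chip assumption stated above."""
    return mcms * CHIPS_PER_MCM * L3_MB_PER_CHIP


if __name__ == "__main__":
    for system, mcms in MCMS_PER_SYSTEM.items():
        print(f"{system}: {cpu_count(mcms)} CPUs, {total_l3_mb(mcms)} MB L3 in total")
```

Under these assumptions a p690 comes out at 32 CPUs and a p655 at 8 CPUs, matching the hardware table.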

The bandwidth of the interconnect ("network") between the CPUs is roughly 50 times higher than that of a 100 Mbps network, which is typically used in Beowulf clusters.
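To put the factor of 50 into perspective, the sketch below derives the implied bandwidth and compares idealised transfer times. It is a rough illustration only: the function names are hypothetical, the factor is the approximate figure quoted above, and latency and protocol overhead are ignored.

```python
FAST_ETHERNET_MBPS = 100   # 100 Mbps network, from the text
SPEEDUP_FACTOR = 50        # "roughly 50 times", from the text


def implied_bandwidth_mbps(base_mbps: float = FAST_ETHERNET_MBPS,
                           factor: float = SPEEDUP_FACTOR) -> float:
    """Bandwidth implied by the quoted factor, in megabits per second."""
    return base_mbps * factor


def transfer_seconds(megabytes: float, mbps: float) -> float:
    """Idealised time to move `megabytes` of data at `mbps` (no latency, no overhead)."""
    return megabytes * 8 / mbps


if __name__ == "__main__":
    fabric = implied_bandwidth_mbps()
    print(f"Implied interconnect bandwidth: about {fabric / 1000:.0f} Gbps")
    for label, mbps in (("100 Mbps Ethernet", FAST_ETHERNET_MBPS), ("MCM interconnect", fabric)):
        print(f"100 MB over {label}: {transfer_seconds(100, mbps):.2f} s")
```

With these figures the implied bandwidth is on the order of 5 Gbps, so a 100 MB transfer that needs about 8 seconds on Fast Ethernet would take a fraction of a second across the interconnect.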

Each CPU can perform two "multiply-add" instructions per clock cycle, that is, 4 floating-point operations per cycle. Fenris' total peak performance is therefore 4 * 1.3 * 32 Gflops = 166.4 Gflops, or roughly 166 Gflops.
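The same calculation can be applied to every node in the cluster. The Python sketch below uses the CPU counts and clock speeds from the hardware table; the names `NODES` and `peak_gflops` are illustrative, and the results are theoretical peaks, not measured performance.

```python
# (CPU count, clock speed in GHz) per node, taken from the hardware table.
NODES = {
    "Sleipner": (32, 1.3),
    "Fenris": (32, 1.3),
    "Hugin": (8, 1.1),
    "Munin": (8, 1.1),
}

# Two multiply-add instructions per cycle, each counted as 2 floating-point operations.
FLOPS_PER_CYCLE = 2 * 2


def peak_gflops(cpus: int, freq_ghz: float) -> float:
    """Theoretical peak in Gflops: flops per cycle * clock speed (GHz) * CPU count."""
    return FLOPS_PER_CYCLE * freq_ghz * cpus


if __name__ == "__main__":
    for name, (cpus, freq_ghz) in NODES.items():
        print(f"{name}: {peak_gflops(cpus, freq_ghz):.1f} Gflops")
```

For the p690 nodes this reproduces the 166.4 Gflops figure above; for the 8-CPU p655 nodes at 1.1 GHz the same formula gives 35.2 Gflops each.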