cluster - parallel speedup

March 25, 2005, 07:00

Hi,

I'm looking for experience from anybody who runs CFD computations on a cluster of PCs connected by Fast Ethernet. Can you say what the smallest problem size is for which splitting a simulation between nodes still gives a speedup?

On a 64-bit shared-memory compute server one gets an almost linear increase in computing speed when the simulation is split among processors. But on a cluster, when the problem is relatively small, network traffic soon becomes the bottleneck that prevents further speedup as the simulation is split across more processors.

Do you have experience with this, and can you tell us what the limit is on your system?

Thanx

George

March 25, 2005, 12:31

What matters is the ratio (time spent communicating) / (time spent computing) on a single PC. If the ratio is relatively large (e.g. > 1), your speedup will be very low. I used 6 PC nodes for computing and my speedup was about 5. In other words, the efficiency is about 5/6, i.e. roughly 83%.
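The arithmetic in this reply can be sketched in a few lines of Python. The function names are made up for illustration, and the model behind `speedup_from_ratio` is an assumption rather than something stated in the post: each node is taken to compute for some time t and then communicate for ratio * t, with no overlap between the two.

```python
def efficiency(speedup, nodes):
    """Parallel efficiency: measured speedup divided by node count."""
    return speedup / nodes


def speedup_from_ratio(nodes, comm_to_comp_ratio):
    """Speedup under an assumed no-overlap model.

    Each node computes for t and communicates for ratio * t,
    so speedup = nodes / (1 + ratio).
    """
    return nodes / (1.0 + comm_to_comp_ratio)


def ratio_from_speedup(nodes, speedup):
    """Invert the model: the ratio implied by a measured speedup."""
    return nodes / speedup - 1.0


if __name__ == "__main__":
    print(f"efficiency on 6 nodes: {efficiency(5, 6):.1%}")
    print(f"implied comm/comp ratio: {ratio_from_speedup(6, 5):.2f}")
    print(f"speedup on 6 nodes at ratio 1: {speedup_from_ratio(6, 1.0):.1f}")
```

Under this simple model a speedup of 5 on 6 nodes corresponds to a communication/computation ratio of about 0.2, while a ratio of 1 would halve the ideal speedup.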

March 26, 2005, 22:54

This depends on what kind of algorithm you are using and what type of platform you are working on. You will certainly have to run the tests yourself. There are several metrics. I suggest you read Chapter 7 of Parallel Programming in C with MPI and OpenMP by Quinn. You can measure the execution speed and communication speed for your machine, then do the calculation.

You should worry more about whether your code scales in terms of processors and memory. You don't really want to run small problems on parallel computers, do you?
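As a rough illustration of "measure, then do the calculation", here is a minimal Python sketch of one possible cost model. Everything in it is an assumption for illustration only: the model itself (compute time divided by node count, plus one halo exchange and a number of global reductions per step), the parameter names, and the placeholder input values, which should be replaced by numbers measured on your own machine. The halo size is kept fixed for simplicity, although in practice it shrinks as the subdomains get smaller.

```python
import math


def step_time(p, cells, t_cell, latency, bandwidth, halo_bytes, n_reductions):
    """Estimated wall time of one time step on p nodes (assumed model)."""
    compute = t_cell * cells / p
    if p == 1:
        return compute  # no communication on a single node
    # One halo exchange per step: message latency plus transfer time.
    halo = latency + halo_bytes / bandwidth
    # Global reductions (e.g. dot products in an implicit solver),
    # assumed to cost latency * ceil(log2(p)) each.
    reductions = n_reductions * latency * math.ceil(math.log2(p))
    return compute + halo + reductions


def speedups(max_nodes, **params):
    """Map node count -> predicted speedup relative to one node."""
    t1 = step_time(1, **params)
    return {p: t1 / step_time(p, **params) for p in range(1, max_nodes + 1)}


if __name__ == "__main__":
    # Placeholder inputs: replace with values measured on your machine.
    params = dict(
        cells=32**3,         # total grid size
        t_cell=1e-5,         # seconds of compute per cell per step
        latency=1e-4,        # seconds per message
        bandwidth=12.5e6,    # bytes/s (100 Mbit/s nominal)
        halo_bytes=200_000,  # bytes exchanged per node per step
        n_reductions=50,     # global reductions per step
    )
    table = speedups(16, **params)
    best = max(table, key=table.get)
    for p, s in table.items():
        print(f"{p:2d} nodes: speedup {s:.2f}")
    print(f"predicted best node count (up to 16): {best}")
```

The point of such a model is only to show where the predicted speedup flattens or turns over; the real limit has to come from timing runs of the actual code.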

March 29, 2005, 12:32

For our reasonably typical incompressible LES code (a mixture of explicitly and implicitly solved equations) and a grid size big enough to be useful (32^3 I think) the maximum number of usable PC nodes with fast ethernet was about 4. That is, using more than 4 made the simulation slower. Using 4 nodes gave about a doubling in performance relative to 1 processor. We got almost the same speed on 2 processors.

With gigabit ethernet the maximum number of usable nodes would appear to be around 50-100 for grid sizes of 20^3 - 30^3 on each processor.
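One reason the grid size per processor matters is the surface-to-volume ratio of each subdomain. The Python sketch below is an illustration added here, not part of the original post: it assumes a cubic block of n^3 cells that exchanges one layer of cells across each of its six faces, and ignores edges and corners. The function name and the `halo_width` parameter are hypothetical.

```python
def halo_fraction(n, halo_width=1):
    """Ratio of exchanged halo cells to interior cells for an n^3 block.

    Assumes a cubic subdomain that exchanges `halo_width` layers of
    cells across all six faces (edges and corners are ignored).
    """
    return 6 * halo_width * n**2 / n**3


if __name__ == "__main__":
    for n in (10, 20, 30, 32, 64):
        print(f"{n:3d}^3 cells per processor: halo/interior = {halo_fraction(n):.3f}")
```

Under these assumptions the ratio is simply 6/n, so halving the block edge doubles the amount of communicated data relative to the work done per processor.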

Unless you are performing purely explicit simulations (e.g. particle codes, some compressible codes,...) fast ethernet is no longer a viable interconnect with current processors. It used to be OK for small clusters of Pentium IIIs though.

Posts in this thread:
#1	George	March 25, 2005, 07:00
#2	Yan XIONG	March 25, 2005, 12:31
#3	Chen Xiaoming	March 26, 2005, 22:54
#4	andy	March 29, 2005, 12:32
