September 1, 1999, 09:03 |
CPU efficiency over 100% !!
|
#1 |
Guest
Posts: n/a
|
Running our parallel CFD code, I have found that the CPU efficiency sometimes exceeds 100%. For example, a calculation that needs 10 hours on a single processor needs only 1 hour on 8 processors, which means the speedup factor is 10 and the CPU efficiency is 125%. This surprised me. I now suppose that the cache plays a very important role: with a single processor the cache load is too high and data must be transferred to and from RAM, which slows the calculation; with 8 processors the cache load is distributed over the several processors, and the calculation runs as fast as if there were no data transfer to RAM. Has anybody had the same experience?
X. Ye |
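The arithmetic in the post above can be checked with a few lines of Python. This is only a sketch: the function names `speedup` and `parallel_efficiency` are made up for illustration, and the timings are the 10 hours and 1 hour quoted in the post.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-processor wall time to parallel wall time."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial, t_parallel, n_procs):
    """Speedup divided by the number of processors (1.0 means ideal linear scaling)."""
    return speedup(t_serial, t_parallel) / n_procs


# Timings from the post: 10 hours on 1 processor, 1 hour on 8 processors.
s = speedup(10.0, 1.0)
e = parallel_efficiency(10.0, 1.0, 8)
print(f"speedup = {s:.1f}, efficiency = {e:.0%}")  # speedup = 10.0, efficiency = 125%
```

Any efficiency above 1.0 (100%) by this definition is what the replies call superlinear speedup.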
|
September 1, 1999, 10:01 |
Re: CPU efficiency over 100% !!
|
#2 |
Guest
Posts: n/a
|
Yeah, superlinear behaviour of parallel CFD codes just means that you picked (or wrote) a good one! The thing is that the performance of a CFD code depends critically on memory fetch, so sometimes you get unbelievable scalability results. If you look into the commercial parallel CFD code brochures, you'll find that lots of people report this sort of thing (it looks cool!). Nothing to worry about. My guess is that: a) you've got a really decent parallel platform with big bus bandwidth (SGI?); b) you might be a bit low on the number of cells per processor (I would say the optimum is between 30k and 50k, depending on the communication width vs. the type of solver you're using). If, on top of everything, you have the option of running partial (processor-by-processor) IC preconditioning (I assume you're using conjugate gradient solvers) rather than diagonal preconditioning, you'll get a tiny difference in results (round-off errors are different) but also a LOT of music (performance) for your money!
|
|
September 3, 1999, 13:17 |
Re: CPU efficiency over 100% !!
|
#3 |
Guest
Posts: n/a
|
I wouldn't say CPU efficiency is >100%, since CPU efficiency is (measured flop rate / peak flop rate) × 100. However, you are getting a superlinear speedup, which is good. Some parallel codes have a "sweet spot" where the cache load is just right and little or no data has to be fetched from RAM. What code are you using?
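To make the distinction concrete, here is a small sketch of CPU efficiency as defined in this reply. The function name `cpu_efficiency_percent` and the flop rates are hypothetical values chosen for illustration, not measurements from the machine discussed in this thread.

```python
def cpu_efficiency_percent(measured_flops, peak_flops):
    """Sustained flop rate as a percentage of the processor's peak flop rate."""
    if peak_flops <= 0:
        raise ValueError("peak_flops must be positive")
    return 100.0 * measured_flops / peak_flops


# Hypothetical numbers: a code sustaining 180 Mflop/s on a 720 Mflop/s processor.
print(cpu_efficiency_percent(180e6, 720e6))
```

Since the measured rate cannot exceed the peak rate, this quantity stays at or below 100%. The 125% figure in the original post is speedup divided by processor count, which is a different quantity.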
|
|
September 6, 1999, 10:00 |
Re: CPU efficiency over 100% !!
|
#4 |
Guest
Posts: n/a
|
I am using a finite volume, multiblock code with explicit Runge-Kutta time marching. The machine is an HP Convex 220 with the native parallel compiler.
X. Ye |
|
September 6, 1999, 15:39 |
Re: CPU efficiency over 100% !!
|
#5 |
Guest
Posts: n/a
|
(1) Are you getting the same trend with 2, 4, and 6 CPUs? (2) If you can establish the trend, the machine should be extended to more CPUs to take advantage of this over-100% speed gain.
|
|
September 7, 1999, 04:34 |
Re: CPU efficiency over 100% !!
|
#6 |
Guest
Posts: n/a
|
Dear John,
Thanks for your message. I am getting the same trend with fewer CPUs. We'll scale up the number of CPUs in the near future. X. Ye |
|
September 7, 1999, 11:07 |
Re: CPU efficiency over 100% !!
|
#7 |
Guest
Posts: n/a
|
This superlinear speedup, of course, does not last forever. As more processors are used, the time needed for loading data into cache drops to a minimum and remains stable, while the time for message passing increases. The speedup will quickly become sublinear.
Regards Frank |
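The trend described above can be illustrated with a toy cost model. Everything in it is an assumption made for illustration: the cell count, the cache capacity, the per-cell costs, and the linear message-passing term are invented, and they do not describe the code or machine in this thread. The per-cell compute cost falls as each processor's share of the mesh approaches the cache size, while the communication cost grows with the processor count.

```python
def time_per_cell(n_local, cache_cells, t_fast, t_slow):
    """Per-cell cost: t_fast if the local cells fit in cache, rising towards t_slow otherwise."""
    miss_fraction = max(0.0, 1.0 - cache_cells / n_local)
    return t_fast + miss_fraction * (t_slow - t_fast)


def run_time(p, n_cells=800_000, cache_cells=150_000,
             t_fast=1.0, t_slow=1.5, t_msg=2_000.0):
    """Toy wall time on p processors (arbitrary units, all parameters hypothetical)."""
    n_local = n_cells / p
    compute = n_local * time_per_cell(n_local, cache_cells, t_fast, t_slow)
    communicate = t_msg * (p - 1)  # assumed to grow linearly with processor count
    return compute + communicate


def efficiency(p):
    """Parallel efficiency: speedup relative to one processor, divided by p."""
    return run_time(1) / (p * run_time(p))


for p in (1, 2, 4, 8, 16, 32):
    print(f"p = {p:2d}  efficiency = {efficiency(p):.2f}")
```

With these made-up parameters the efficiency rises above 1 once the per-processor working set fits in cache, then drops below 1 as the message-passing term starts to dominate.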
|