August 24, 2001, 03:55 |
benchmark results
#1
Guest
Posts: n/a
These are the results of a benchmark I did with STAR v3.15. I used a simple case with approx. 500000 cells (see below).
Machines: SGI Origin 2000, 8 * R10000, 195 MHz Linux PC with P3, 1 GHz Linux PC Cluster, 8 * P4, 1.7 Ghz, 100 MBit Network Results for serial run (times as reported by STAR as "ELAPSED TIME" in the .run file): R10000: 24473 s P3: 16638 s P4: 4841 s This means that (for this case) the P4 is 5 times faster than the R10000 and 3.4 times faster than the P3! Results for HPC: R10000: CPUs----time----Speedup to serial 2------13926----1.76 4-------6887----3.55 8-------3009----8.13 P4: CPUs----time----Speedup to serial 2-------2504----1.93 4-------1332----3.63 6-------1034----4.68 8--------901----5.37 (optimized metis decomposition failed, used basic) For the cluster, the problem seems to be too small to get an adequate speedup with more than 4 CPUs. This should be better for a problem with more cells and equations. As it would be interesting to compare these results with other machines, you are invited to run the benchmark yourself and post the results here. The case I used can easily be set up using the following commands (delete the blank lines): *set ncel 32 *set nvrt ncel + 1 vc3d 0 1 ncel 0 1 ncel 0 1 ncel *set coff ncel * ncel * ncel *set voff nvrt * nvrt * nvrt cset news cran 0 * coff + 1 1 * coff$cgen 3 voff cset,,,vgen 1 1 0 0 cset news cran 2 * coff + 1 3 * coff$cgen 3 voff cset,,,vgen 1 0 0 1 cset news cran 4 * coff + 1 5 * coff$cgen 3 voff cset,,,vgen 1 0 1 0 cset news cran 6 * coff + 1 7 * coff$cgen 3 voff cset,,,vgen 1 1 0 0 cset news cran 8 * coff + 1 9 * coff$cgen 3 voff cset,,,vgen 1 0 0 -1 cset news cran 10 * coff + 1 11 * coff$cgen 3 voff cset,,,vgen 1 0 -1 0 cset news cran 12 * coff + 1 13 * coff$cgen 3 voff cset,,,vgen 1 1 0 0 vmer all n vcom all y cset all axis z view -1 -1 1 plty ehid cplo live surf cset news gran,,0.01 cplo bzon 1 all cset news gran 6.99 cplo bzon 2 all cset news shell cdele cset ccom all y dens const 1000 lvis const 1e-3 rdef 1 inlet 0.025 rdef 2 outlet split 1 iter 10000 geom,,1e-2 prob,, The run converges in 383 (or 384) 
iterations (using single precision). (I had some difficulties to post this message in a readable form. The tables and commands were concatenated to single strings, blanks and tabs werde deleted. So I had to add the "-"s and the blank lines. How can this be avoided?) |
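For anyone comparing their own runs: the speedup figures in the tables are simply the serial elapsed time divided by the parallel elapsed time. As a minimal sketch (Python, not part of the original post), the P4 cluster numbers can be reproduced along with the per-CPU parallel efficiency:

```python
# Parallel speedup and efficiency from STAR "ELAPSED TIME" values.
# Times (in seconds) are the P4 cluster results reported above.
serial = 4841
parallel = {2: 2504, 4: 1332, 6: 1034, 8: 901}

for ncpu, t in parallel.items():
    speedup = serial / t          # how many times faster than the serial run
    efficiency = speedup / ncpu   # 1.0 would be ideal linear scaling
    print(f"{ncpu} CPUs: speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

An efficiency near 1.0 means near-ideal scaling; on these numbers the P4 cluster falls to roughly two-thirds efficiency at 8 CPUs, which matches the observation that this case is too small for more than 4 CPUs over the 100 MBit network.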
August 31, 2001, 07:25 |
Re: benchmark results
#3
Guest
Posts: n/a
Here are some results obtained on an IBM RS/6000 SP-SMP with POWER3-II 375 MHz CPUs. All computations have been done on 8-way Nighthawk II nodes.
Serial run: 9400 s elapsed time, 8753 s CPU time. Values and comparisons below are elapsed times.

CPUs    time (s)    Speedup vs. serial
2           4190     2.24
4           1924     4.88
8            930    10.11
September 10, 2001, 10:48 |
Re: benchmark results
#4
Guest
Posts: n/a
Your results are quite interesting and I am glad you are sharing them. I think you drew one possibly invalid conclusion concerning scalability, although, as you say, running a much larger job might help. The P4 cluster is using 100 MBit Ethernet as its message-passing medium, and that is just not good enough for 8 machines of that speed. When running in parallel, everything has to be balanced or the slowest component becomes the choke point; in this case it is the Ethernet. There are two relatively cheap things you can do to improve performance (assuming you have not done them already):

1) Put a second Ethernet card in each machine and dedicate it to MPI, while the first handles general network activity (NFS, FTP, X...).
2) Connect the cards to an Ethernet switch rather than a hub.

If this does not help, then you likely need a better (and far more expensive) message-passing medium to get good scaling.

Steve
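Steve's bottleneck argument can be made semi-quantitative with Amdahl's law, which caps speedup at 1/(s + (1-s)/N) when a fraction s of the runtime (serial work plus communication overhead) does not parallelize. A minimal Python sketch, not from the thread; the fraction s is merely inferred from the 8-CPU P4 speedup of 5.37 reported earlier, as an illustration:

```python
# Amdahl's law: speedup(N) = 1 / (s + (1 - s) / N), where s is the
# non-parallelizable fraction (serial work, communication overhead, ...).
def amdahl_speedup(s, n):
    return 1.0 / (s + (1.0 - s) / n)

# Infer s from the measured 8-CPU speedup of 5.37 on the P4 cluster.
measured = 5.37
s = (1.0 / measured - 1.0 / 8) / (1.0 - 1.0 / 8)

print(f"inferred non-parallel fraction: {s:.3f}")
# Even with infinitely many CPUs, the speedup would be capped at 1/s:
print(f"asymptotic speedup limit: {1.0 / s:.1f}")
```

Treating the network cost as a fixed fraction s is a simplification (in reality communication cost grows with the number of machines), but even this rough model shows why adding CPUs beyond 4 on a 100 MBit network yields rapidly diminishing returns.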