January 31, 2008, 12:31
speedup questions
#1
Guest
Posts: n/a
Hi, all,
It seems that when I change from serial to parallel using 4 partitions of MPICH2 Local Parallel for Windows, the computational time increases (it almost doubles!), which is the opposite of what I expected. I am using CFX 11.

My machine: Fujitsu Siemens Celsius V830, AMD64 2.61 GHz, 2x dual core (4 processors total), 8 GB memory, Windows XP x64 Edition.
The mesh: 581,665 nodes, 2.7M elements (mostly tets).

I tried several cases with different meshes and different partition numbers, and I never saw a speedup; serial mode is always faster. Did I miss something in the solver setup or other software setup? Thank you for your comments.

January 31, 2008, 13:46
Re: speedup questions
#2
Guest
Posts: n/a
Hi Tony,
That sounds unusual. You certainly have a large enough problem to get reasonable scaling, and with 8 GB of RAM you have enough memory. Could you answer the following:
1. How many iterations are you running?
2. How long is the partitioning taking vs. the iterations? (compare the time reported just before the iterations start with the time after)
3. How does the CPU time compare to the wall clock time?
4. How does the performance compare for 2 and 3 partitions?
5. Do you have a lot of domain interfaces (GGIs)?
6. Do other cases display similar behavior?
-CycLone

January 31, 2008, 15:49
Re: speedup questions
#3
Guest
Posts: n/a
Hi, CycLone,
Thank you for your reply.
1. I haven't done any real comparison runs on a clean machine like most previous posts did. I noticed that the CPU time for one iteration stays almost the same (maybe I am wrong here), so I just compared the CPU seconds for one iteration on the same def file with different partition counts.
2. The summed CPU time for mesh partitioning is 19 s; the CPU time for each iteration is about 500 s.
3. For this run I stopped the solver at the 17th iteration, and the wall clock time was 4 minutes longer than the CPU time (it wrote 4 *_full.bak files).
4. I will check this later. I tried once before and didn't see any advantage from parallel computing.
5. Yes, I do have a lot of GGI interfaces, and one of the interfaces has a lot of 2D regions. Is this the reason?
6. I haven't tried other cases without interfaces.
Thanks.

January 31, 2008, 17:24
Re: speedup questions
#4
Guest
Posts: n/a
Hi,
Lots of GGI interfaces will reduce parallel efficiency. If you have lots of small domains, those will also parallelise poorly. Writing results files parallelises poorly as well, so if you stop writing results files it should improve.

Try the benchmark.def file located in the examples directory of CFX. Run it serial and parallel; you should get a reasonable speedup with that model. That is a good test of whether the issue is the model or your computer setup.
Glenn Horrocks

February 1, 2008, 11:58
Re: speedup questions
#5
Guest
Posts: n/a
Hi, Glenn,
Yes, that benchmark gives me a reasonable speedup:

partitions: speedup
4: 2.58
3: 2.32
2: 1.78

I think I have to minimize or stop writing backup files. Thanks.

February 3, 2008, 18:26
Re: speedup questions
#6
Guest
Posts: n/a
Hi,
It looks like the backup files are the cause then. It sounds like the simulation is spending more time writing backup files than solving, so you will get a big speedup by writing fewer backup files.
Glenn Horrocks
|