June 1, 2009, 19:58
Parallel performance of icoFoam
#1
Senior Member
Senthil Kabilan
Join Date: Mar 2009
Posts: 113
Rep Power: 17
Hi All,
I get linear speed-up up to 4 processors with icoFoam; when I use more than 4 processors, performance decreases. Has anyone come across this issue? I will be more than happy to report the details if need be.

Regards,
Senthil
June 2, 2009, 06:11
#2
Senior Member
Martin Beaudoin
Join Date: Mar 2009
Posts: 332
Rep Power: 22
Hello Senthil,
Could you give us more details on your hardware, software, and problem configuration? CPU type, amount of RAM, type of interconnect, version of OpenFOAM, size of the meshes, etc.

Martin
June 2, 2009, 14:17
#3
Member
Rachel Vogl
Join Date: Jun 2009
Posts: 48
Rep Power: 17
Hello Senthil,
What was your problem size? Many solvers will not scale when the number of cells per CPU drops below about 10,000: the communication overhead becomes so large that more time is spent communicating than actually computing, and the resulting performance is bad.
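A quick back-of-the-envelope check of that rule of thumb (a Python sketch; the 10,000-cells-per-rank threshold is the heuristic from this post, the rank counts are illustrative, and only the 98,334-cell figure comes from the checkMesh output later in this thread):

```python
def cells_per_rank(total_cells, n_ranks):
    """Average number of cells each MPI rank owns after decomposition."""
    return total_cells // n_ranks

def likely_to_scale(total_cells, n_ranks, min_cells=10_000):
    """Heuristic: parallel scaling usually degrades below ~10k cells/rank."""
    return cells_per_rank(total_cells, n_ranks) >= min_cells

total = 98_334  # cells reported by checkMesh in post #4
for p in (2, 4, 8, 16):
    # e.g. 8 ranks -> 12291 cells each, still above the 10k heuristic
    print(p, cells_per_rank(total, p), likely_to_scale(total, p))
```

Note that by this heuristic alone, 8 ranks (about 12k cells each) should still scale for this mesh, so the degradation seen above 4 processors likely has another cause as well, as discussed later in the thread.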
June 2, 2009, 15:47
#4
Senior Member
Senthil Kabilan
Join Date: Mar 2009
Posts: 113
Rep Power: 17
Hi Martin/Rachel,
Thanks for your inputs. The machine I am using is a Dell dual-processor, quad-core Intel (Harpertown running at 2.33 GHz) with 16 GB of RAM and no interconnect (it just uses shared memory). Here is the output from checkMesh:

Create time

Create polyMesh for time = constant

Time = constant

Mesh stats
    points:           22303
    faces:            209018
    internal faces:   184318
    cells:            98334
    boundary patches: 10
    point zones:      0
    face zones:       0
    cell zones:       0

Number of cells of each type:
    hexahedra:  0
    prisms:     0
    wedges:     0
    pyramids:   0
    tet wedges: 0
    tetrahedra: 98334
    polyhedra:  0

Checking topology...
    Boundary definition OK.
    Point usage OK.
    Upper triangular ordering OK.
    Topological cell zip-up check OK.
    Face vertices OK.
    Face-face connectivity OK.
    Number of regions: 1 (OK).

Checking patch topology for multiply connected surfaces ...
    Patch   Faces   Points  Surface
    inlet   111     72      ok (not multiply connected)
    out2    74      52      ok (not multiply connected)
    out3    62      40      ok (not multiply connected)
    out4    65      44      ok (not multiply connected)
    out5    44      31      ok (not multiply connected)
    out6    58      38      ok (not multiply connected)
    out7    51      34      ok (not multiply connected)
    out8    50      34      ok (not multiply connected)
    out1    46      32      ok (not multiply connected)
    w1      24139   12150   ok (not multiply connected)

Checking geometry...
    Domain bounding box: (-0.0609739 -0.106362 -0.025452) (0.0609426 0.106513 0.0254047)
    Boundary openness (-2.2934e-17 -4.63998e-17 7.83197e-18) OK.
    Max cell openness = 3.5536e-16 OK.
    Max aspect ratio = 13.8974 OK.
    Minimum face area = 5.56535e-09. Maximum face area = 1.18647e-05. Face area magnitudes OK.
    Min volume = 2.28107e-13. Max volume = 1.27222e-08. Total volume = 5.34576e-05. Cell volumes OK.
    Mesh non-orthogonality Max: 77.0421 average: 22.9327
   *Number of severely non-orthogonal faces: 17.
    Non-orthogonality check OK.
  <<Writing 17 non-orthogonal faces to set nonOrthoFaces
    Face pyramids OK.
    Max skewness = 1.44383 OK.
    All angles in faces OK.
    All face flatness OK.

Mesh OK.

End
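For context, an 8-way run of a case like this would typically be set up with a `system/decomposeParDict` along these lines (a hypothetical sketch, not taken from the thread; the `simple` method splits the domain into an nx x ny x nz grid of subdomains):

```
// Hypothetical decomposeParDict for an 8-way run of a ~98k-cell case
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    object      decomposeParDict;
}

numberOfSubdomains 8;

method          simple;

simpleCoeffs
{
    n           (2 2 2);   // 2 x 2 x 2 = 8 subdomains
    delta       0.001;
}
```

followed by `decomposePar` and then `mpirun -np 8 icoFoam -parallel`.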
July 14, 2009, 16:47
#5
Member
David P. Schmidt
Join Date: Mar 2009
Posts: 72
Rep Power: 17
Hi,
Two things to look for here. First, as you decompose the domain into more and more small subdomains, the surface-area-to-volume ratio of each subdomain increases, where surface area is measured in faces (which must be communicated) and volume in cells (which are computed). At some point your machine spends more time communicating than processing; I would not divide up a domain into chunks of fewer than 50K cells. Secondly, shared-memory machines like yours will start to have memory-transfer bottlenecks, where your fast processors spend much of their time waiting for accesses to main memory. In unstructured codes, pre-fetching is hard, even with the special cell ordering OpenFOAM uses.

-David
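The surface-to-volume argument can be made concrete with a toy model (a Python sketch under simplifying assumptions: a cubic mesh of n^3 cells cut into P one-dimensional slabs, looking at an interior slab; real decompositions are better balanced but follow the same trend):

```python
def halo_ratio(n, p):
    """Ratio of communicated interface faces to owned cells for an
    interior slab of a cubic n^3-cell mesh cut into p slabs."""
    cells_per_slab = n ** 3 / p
    # an interior slab shares two n*n face-interfaces with its neighbours
    interface_faces = 2 * n * n if p > 1 else 0
    return interface_faces / cells_per_slab  # equals 2*p/n for p > 1

# n = 46 gives 46^3 = 97336 cells, close to the 98334-cell mesh above;
# the communication-to-computation ratio grows linearly with p
for p in (1, 2, 4, 8, 16):
    print(p, round(halo_ratio(46, p), 3))
```

The linear growth of the ratio with the number of subdomains is why doubling the processor count eventually stops paying off.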
July 16, 2009, 15:43
#6
Senior Member
Senthil Kabilan
Join Date: Mar 2009
Posts: 113
Rep Power: 17
Hi David,
Thanks for the valuable input. Makes sense...

Warm Regards,
Senthil
August 13, 2009, 19:13
#7
New Member
Karpenko Anton
Join Date: Aug 2009
Location: Saint-Petersburg
Posts: 4
Rep Power: 17
This happens because you are using Harpertown. See http://www.fluent.com/software/fluen.../truck_14m.htm and pay attention to the line for INTEL WHITEBOX (INTEL_X5482_HTN4, 3200, RHEL5).