Parallel Performance of Large Case

#21 - February 16, 2015, 10:19
Andrea Pasquali (Senior Member)
Hi,
I did not see different running times when running on nodes on a single switch.
My test was mesh generation with the refinement stage in snappyHexMesh.
As I said, I have not investigated it in detail yet. I only tried recompiling MPI and OpenFOAM once with Intel 12, but I still got the same (bad) performance over InfiniBand...

Andrea

#22 - February 18, 2015, 06:09
arnaud6 (New Member)
Hello,

So I have tried renumberMesh before solving, and it looks like it has improved the performance a bit on both single and multiple switches, reducing the running time by ~10%.

But I still can't see why the running times are so slow across multiple switches. renumberMesh or not, we should get roughly the same running time whichever nodes are selected, right?
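For reference, the renumbering step itself is just something along these lines (assuming the case is already decomposed; the core count is only an example):

Code:
mpirun -np 64 renumberMesh -overwrite -parallel

i.e. the processor meshes are renumbered in place before the solver is started.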

#23 - February 22, 2015, 15:12
Bruno Santos (wyldckat, Retired Super Moderator)
Greetings to all!

@arnaud6:
Quote:
Originally Posted by arnaud6
renumberMesh or not, we should get roughly the same running time whichever nodes are selected, right?
InfiniBand uses a special addressing mechanism that Ethernet-based MPI does not; AFAIK, InfiniBand shares memory directly between nodes, mapping out as much as possible both the RAM of the machines and the "paths of least resistance" for communicating between each machine. This is to say that an InfiniBand switch is far more complex than an Ethernet switch, because as many paths as possible are mapped out between each pair of ports on that switch.

The problem is that when 3 switches are used, the tree becomes a lot larger and is sectioned into 3 parts, making it harder to map out the communications.

Commercial CFD software might already take these kinds of configurations into account, either by asking the InfiniBand layer to adjust accordingly, or by balancing this out on its own: placing sub-domains close to each other on machines that share a switch and keeping communication with machines connected to other switches to a minimum. But when you use OpenFOAM, you are probably not taking this into account.

I haven't had to deal with this myself, so I have no idea how this is properly configured, but there are at least a few things I can imagine that could work:
  • Have you tried PCG yet? If not, you had better try it as well.
  • Try multi-level decomposition: http://www.cfd-online.com/Forums/ope...tml#post367979 post #8 - the idea is that the first level should be divided by switch group (see the decomposeParDict sketch after this list).
    • Note: if you have 3 switches, either you have one master switch that only connects the 2 other switches and has no machines of its own, or you have 1 switch per group of machines in a daisy chain. Keep this in mind when using multi-level decomposition.
  • Contact your InfiniBand support line about how to configure mpirun so that it maps out the communications properly.
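For illustration, a multi-level decomposeParDict could look roughly like this (a sketch only; I'm assuming 3 switch groups with 64 cores each, so adjust the numbers and the first-level split to your own cluster):

Code:
// system/decomposeParDict
numberOfSubdomains  192;

method              multiLevel;

multiLevelCoeffs
{
    // first level: one sub-domain per switch group
    switches
    {
        numberOfSubdomains  3;
        method              scotch;
    }
    // second level: the cores within each group
    cores
    {
        numberOfSubdomains  64;
        method              scotch;
    }
}

The first level only has to produce one chunk per switch group; whether each chunk actually lands on the right group of machines still depends on how the ranks are placed by mpirun.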
Best regards,
Bruno

#24 - February 26, 2015, 06:15
arnaud6 (New Member)
Hi Bruno,

Thanks for your ideas!

I am looking at the PCG solvers.
Would you advise using PCG for p and PBiCG for the other variables, or PCG for p while keeping the other variables on a smoothSolver/Gauss-Seidel? In my case it looks like p is the hardest to solve (at least it is the variable that takes the longest at each iteration).
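To be concrete, the two kinds of setups I am comparing in fvSolution look roughly like this (just a sketch; the tolerances and preconditioners are placeholders, not my actual settings):

Code:
p
{
    solver          PCG;
    preconditioner  DIC;
    tolerance       1e-7;
    relTol          0.01;
}

U
{
    // option A: Krylov solver for the other variables
    solver          PBiCG;
    preconditioner  DILU;
    tolerance       1e-8;
    relTol          0.1;

    // option B: keep the smoother instead
    //solver          smoothSolver;
    //smoother        GaussSeidel;
    //nSweeps         1;
}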

The difficulty is that I don't have much control over which nodes, and thus which switches, are selected when I submit my parallel job...
I will see what I can do with the IB support.

#25 - October 24, 2015, 16:37
Bruno Santos (wyldckat, Retired Super Moderator)
Hi arnaud6,

Quote:
Originally Posted by arnaud6
I am looking at the PCG solvers.
Would you advise using PCG for p and PBiCG for the other variables, or PCG for p while keeping the other variables on a smoothSolver/Gauss-Seidel? In my case it looks like p is the hardest to solve (at least it is the variable that takes the longest at each iteration).
Sorry for the really late reply; I've had this on my to-do list for a long time and only now did I take a quick look into it. Unfortunately, I still don't have a specific answer/solution for this.
The best I could tell you back then and now is to run a few iterations yourself with each configuration.
Even the GAMG matrix solver can sometimes be improved if you fine-tune its parameters and do some trial-and-error sessions with your case, because these parameters depend on the case size and on how the sub-domains in the case are structured.
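For example, in fvSolution the agglomeration controls are the ones I would sweep first (a sketch only; the values are common starting points, not a recommendation for your case):

Code:
p
{
    solver                  GAMG;
    smoother                GaussSeidel;
    tolerance               1e-7;
    relTol                  0.01;

    // settings worth a trial-and-error sweep on large parallel cases
    nCellsInCoarsestLevel   100;    // larger value -> fewer coarse levels
    mergeLevels             1;
    nPreSweeps              0;
    nPostSweeps             2;
    cacheAgglomeration      true;
    agglomerator            faceAreaPair;
}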

Either way, I hope you managed to figure this out on your own.

Best regards,
Bruno

#26 - November 4, 2015, 11:49
mgg (New Member)
Hi Bruno,

Indeed. In my experience, how the sub-domains are structured has a strong impact on performance, so I choose to decompose manually.
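As an illustration of a manual, axis-aligned decomposition for a long pipe (a sketch only; I'm assuming the pipe axis is x and 512 sub-domains, so that slicing only along the axis keeps the processor boundaries small and regular):

Code:
// system/decomposeParDict
numberOfSubdomains  512;

method              hierarchical;

hierarchicalCoeffs
{
    n       (512 1 1);
    delta   0.001;
    order   xyz;
}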

My problem now is as follows:

I am running a DNS case (22 million cells) using buoyantPimpleFoam (OpenFOAM 2.4). The case is a long pipe with an inlet and an outlet. The fluid is air and the inlet Re is about 5400.

To get better scalability, I use PCG for the pressure equation. With the perfect gas equation of state, the number of iterations is around 100, which is acceptable. With icoPolynomial or rhoConst for the density, the number of iterations is around 4000! With GAMG for the p equation, the number of iterations is under 5, but the scalability is poor above 500 cores. Does anyone have an opinion?

How can I improve the PCG solver to decrease the number of iterations? Thank you.


#27 - November 7, 2015, 12:52
Bruno Santos (wyldckat, Retired Super Moderator)
Quote:
Originally Posted by mgg
With icoPolynomial or rhoConst for the density, the number of iterations is around 4000! With GAMG for the p equation, the number of iterations is under 5, but the scalability is poor above 500 cores. Does anyone have an opinion?

How can I improve the PCG solver to decrease the number of iterations? Thank you.
Quick questions/answers:
  • I don't know how to improve the PCG solver... perhaps you need another preconditioner? I can't remember right now, but can't GAMG be used as a preconditioner? (see the sketch after this list)
  • If GAMG can do it in 5 iterations, are those 5 iterations taking a lot longer than 4000 iterations of PCG?
  • I'm not familiar enough with DNS to know this, but isn't it possible to solve the same pressure equation a few times, with relaxation steps in between, the way PIMPLE and SIMPLE can?
  • GAMG is very configurable. Are you simply using a standard set of settings, or have you tried to find the optimum settings for GAMG? GAMG can only scale well if you configure it correctly. I know there was a thread about this somewhere...
  • After a quick search:
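On the GAMG-as-preconditioner point: if it is available in your OpenFOAM version, I would expect the fvSolution entry to look roughly like this (a sketch, not something I have tested; the inner and outer tolerances are placeholders):

Code:
p
{
    solver          PCG;
    preconditioner
    {
        preconditioner          GAMG;
        smoother                GaussSeidel;
        nCellsInCoarsestLevel   100;
        tolerance               1e-2;   // loose tolerance for the inner GAMG cycle
        relTol                  0;
    }
    tolerance       1e-7;
    relTol          0.01;
}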

#28 - March 25, 2016, 06:03
arnaud6 (New Member)
Sorry for getting back so late on this one. The problem was Open MPI 1.6.5. As soon as I switched to Open MPI 1.8.3, the slowness disappeared!
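In case it helps anyone else, checking which MPI an OpenFOAM build is actually using is something like this (assuming an Open MPI based installation):

Code:
echo $FOAM_MPI      # the MPI flavour OpenFOAM was configured with, e.g. openmpi-1.8.3
mpirun --version    # the Open MPI actually found on the PATH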

