interFoam - Not getting better performance on parallel run

August 1, 2021, 16:14   #1
christopher van rees (ctvanrees), New Member, Chile
Dear Foamers,

I am relatively new to OpenFOAM and I am running a validation test case with interFoam. My objective is to measure the fluid elevation at a specific point in space and compare it to experimental results. I am using a dynamic mesh approach, since I want to refine the regions where alpha.water is between 0.01 and 0.99 to get a better representation of the free surface. My problem is very similar to a dam break problem, but the fluid studied is a low-concentration slurry modelled as a non-Newtonian Bingham plastic (yield stress 2.5 Pa, plastic viscosity 0.15 Pa*s).
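For reference, a Bingham plastic corresponds to OpenFOAM's HerschelBulkley viscosity model with n = 1, so the water-phase entry in constant/transportProperties looks roughly like the sketch below. The kinematic values assume a slurry density of about 1000 kg/m3 and are only illustrative; the actual setup is in the attached case files.

Code:
water
{
    transportModel  HerschelBulkley;   // nu = min(nu0, tau0/gammaDot + k*gammaDot^(n-1))
    HerschelBulkleyCoeffs
    {
        nu0     nu0  [0 2 -1 0 0 0 0] 1e-03;    // limiting viscosity at low shear (illustrative)
        tau0    tau0 [0 2 -2 0 0 0 0] 2.5e-03;  // 2.5 Pa / ~1000 kg/m3
        k       k    [0 2 -1 0 0 0 0] 1.5e-04;  // 0.15 Pa*s / ~1000 kg/m3
        n       n    [0 0 0 0 0 0 0]  1;        // n = 1 -> Bingham plastic
    }
    rho     rho [1 -3 0 0 0 0 0] 1000;          // exact dictionary syntax varies between OpenFOAM versions
}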

My first question: I am not getting better performance by increasing the core count for my mesh. I have tried 6, 8 and 16 cores and get similar run times, which are pretty slow (around 20 days to finish 15 seconds of simulation). I have an initial cell count of around 350,000 cells. Is there a way I can increase the parallel performance of interFoam? Am I using the wrong schemes/solvers? My case files are attached.

Also, how can I improve the quality of my results? I am already using a maximum Co number of 0.05, and I am following the recommendations for interFoam given in this paper: https://www.tandfonline.com/doi/full...0.2019.1609713

Attached is a graph of my preliminary results vs the experimental data (the run took about 10 days to complete). The green line is the OpenFOAM result; the blue dots are the experiment.

Also attached is a snapshot of the model.

Thanks again for any insight and/or recommendation.

Regards,
CVP
Attached Images
File Type: jpg Results.jpg (39.5 KB, 50 views)
File Type: jpg Geometry.JPG (27.6 KB, 66 views)
Attached Files
File Type: zip casefiles.zip (57.7 KB, 15 views)

August 2, 2021, 08:39   #2
geth03, Senior Member, Cologne, Germany
When I start a new simulation with interFoam, I start with the easy, basic setup before going into complexity.

1. Before using a dynamic mesh, can't you use a static one to see if your case even runs? For 350k cells, you should be able to achieve a speed-up with 6 or more cores relatively easily.

2. Use a higher Courant number than 0.05; this value is really low. Co = 0.25 would be 5 times faster and is still a good value, just not quite as accurate.
Do you want to wait 5 times longer, or get a result faster with a few percent deviation?

August 2, 2021, 17:20   #3
christopher van rees (ctvanrees), New Member, Chile
geth03,

Thank you for your reply. I have tried using a static mesh and was still not able to achieve better performance with more cores. Could it be because my Co number is set too low? I will check whether I get scaling using Co = 0.25 and the 350K-cell case (the simplest case). Roughly how much time should I expect with these settings?

I need to find a good trade-off between time and quality, in order to achieve "reasonable" results in a "reasonable" amount of time (no more than 3 days of computing).

My case runs and it is stable; the problem is that it does not scale and it is very slow. I was thinking that maybe one of the solution schemes I am using is one of my "bottlenecks".

Let me know what you think. I will get back with my results using a higher Co number and a static mesh.

Regards,
CVP

August 3, 2021, 04:08   #4
geth03, Senior Member, Cologne, Germany
Speed-up with core count does not depend on the Courant number. The Co only sets the time step; the number of equations to be solved, the complexity of the matrix and therefore the number of iterations needed depend on the cell count, mesh quality and residuals.

You can easily check the cell quality of a static mesh with checkMesh.
Could you show the results from the terminal?

I can't tell you off the top of my head how to set the Co to get results within 3 days.
Check the terminal output for the time step size and the clock time needed per time step, then calculate how many time steps you would need to finish your simulation and adjust your Co accordingly to get a faster result. You should definitely not go over Co = 1; the lower the Co, the more accurate your result.
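As a rough illustration (made-up numbers, read the real ones from your own log): if at Co = 0.25 the solver settles around deltaT = 2e-4 s and one time step takes about 1.5 s of clock time, then 15 s of simulation needs roughly 15 / 2e-4 = 75,000 steps, i.e. about 75,000 x 1.5 s, which is around 31 hours of wall time; halving the Co roughly doubles both the step count and the wall time.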

August 3, 2021, 04:25   #5
Yann, Senior Member, France
Hi CVP,

I see you are using the yPlus function object with writeControl set to timeStep but no writeInterval. This means the yPlus field is written at every time step, which can significantly slow down your simulation.

Try switching your yPlus writeControl from timeStep to writeTime and see if it improves your calculation time.
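A minimal sketch of what the entry could look like in the functions section of controlDict (the entry name and libs line are just an example; keep whatever you already have apart from the writeControl):

Code:
yPlus1
{
    type            yPlus;
    libs            ("libfieldFunctionObjects.so");
    writeControl    writeTime;   // write only at the regular output times
}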

Yann

August 3, 2021, 14:08   #6
christopher van rees (ctvanrees), New Member, Chile
geth03,

Attached is the log from running checkMesh. You are right, the scaling does not depend on the Co number. It looks like Co = 0.25 gives a reasonable amount of computing time; I still have to check whether this value gives me reasonable results.

Also, I did a test run using 4 cores for 0.45 s of simulation and got 1139 s of clock time, versus 1322 s using 8 cores (so actually less time using fewer cores...). Maybe it is the way I have set up the solver for alpha.water? (see the fvSolution attached to my first post).

Yann,

Thanks for your suggestion. I made the modification you suggested and obtained the following results (for 350K cells):
- 4 cores for 0.45 s of simulation: 1146 s
- 8 cores for 0.45 s of simulation: 1113 s

So I did not get a significant improvement, and I still do not see considerably better performance with more cores.

Regards,
CVP
Attached Files
File Type: txt log.checkMesh.txt (3.3 KB, 7 views)

August 7, 2021, 19:58   #7
Reviewer #2 (randolph), Senior Member, Knoxville, TN
Chris,

What is your time step size?

From the snapshot, I would recommend you mute the surface tension calculation (sigma in the transport properties). Let me know if this helps.

Thanks,
Rdf

August 8, 2021, 14:54   #8
christopher van rees (ctvanrees), New Member, Chile
randolph,

My time step size is controlled by the Co number, so it is adjusted automatically to keep Co < 0.25. My initial time step is 1e-5.

How could I mute the surface tension calculation? From my understanding, it is not possible to do this. Maybe I could set the surface tension to 0?

Thanks for your reply,
CVP

August 8, 2021, 19:43   #9
Reviewer #2 (randolph), Senior Member, Knoxville, TN
Chris,

Try this in your transportProperties and see whether it accelerates your simulation:

Code:
sigma           0;

Thanks,
Rdf

August 10, 2021, 23:50   #10
christopher van rees (ctvanrees), New Member, Chile
randolph,

So I followed your suggestion, and this is the result that I got for 0.45s of simulation:

-4 CPU cores = 1545s
-8 CPU cores = 1480s

So for some reason it took longer, haha. I am still not sure why I am not able to get proper scaling. Attached you can find my log files for the decomposePar command for both cases. I noticed that the maximum number of faces between processors is quite high in the 8-processor case (I am using the scotch method).
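For reference, the decomposition dictionary is essentially just the following (a sketch only; the actual dictionaries and logs are attached):

Code:
// system/decomposeParDict (sketch)
numberOfSubdomains  8;       // 4 for the 4-core run
method              scotch;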

Let me know what you think; I would appreciate your insights.

Regards,
CVP
Attached Files
File Type: txt log.decompose8CPU.txt (4.9 KB, 6 views)
File Type: txt log.decompose4CPU.txt (3.0 KB, 4 views)

August 11, 2021, 09:27   #11
Reviewer #2 (randolph), Senior Member, Knoxville, TN
Chris,

That's interesting. My experience with the existing surface tension calculation in OpenFOAM is that it sometimes generates spurious oscillations at the water surface and slows down the calculation, so it is bizarre that muting the surface tension slows down the computation. Nevertheless, with dynamic refinement at the water surface, I would expect some additional computational effort.

As for parallel scaling, I typically would not expect linear scaling from OpenFOAM, for many reasons. My interFoam simulations typically use meshes of 8M to 20M cells, and I rarely use more than 64 cores for that mesh range. I have been able to simulate 70 minutes on an 8M mesh with 4 Xeon E5-2698 v3 processors (64 cores in total) in 7 days (wall time).

One ugly yet effective approach for prototyping the simulation (if simulation speed is the priority) is to drop all the schemes to first order and use limited schemes for the gradient and Laplacian terms, as in the snippets below. I would recommend getting a solution that you can afford first and then gradually bringing the accuracy back up by using more accurate schemes.

Thanks,
Rdf

Code:
gradSchemes
{
    default            cellLimited Gauss linear 1;
}
Code:
laplacianSchemes
{
    default         Gauss linear limited 0.5;
}

interpolationSchemes
{
    default         linear;
}

snGradSchemes
{
    default         limited 0.5;
}
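For the divergence terms, a fully first-order/bounded starting point could look something like this; the exact div(...) entries depend on your case, so treat it as a sketch rather than your actual fvSchemes:

Code:
divSchemes
{
    default                             none;
    div(rhoPhi,U)                       Gauss upwind;   // first-order convection for U
    div(phi,alpha)                      Gauss vanLeer;  // bounded scheme for the phase fraction
    div(phirb,alpha)                    Gauss linear;
    div(((rho*nuEff)*dev2(T(grad(U))))) Gauss linear;
}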

August 15, 2021, 23:29   #12
christopher van rees (ctvanrees), New Member, Chile
randolph,

Thanks again for your reply and suggestions; I will make some runs with your proposed schemes. Do you think the schemes I am using could be the bottleneck that prevents scaling?

I did another scaling test with surface tension = 0 and a laminar setup, with a total cell count of around 900K. The results are in the attached graph.

Do you think I could get better performance with 16 cores, for example?

Regards,
CVP
Attached Images
File Type: jpg Scalling.jpg (94.2 KB, 21 views)

August 20, 2021, 22:54   #13
Reviewer #2 (randolph), Senior Member, Knoxville, TN
Chris,

Apologies for the late reply.

The bottleneck for scaling is complicated, but with a 0.9M mesh I would not go beyond 8 processors. Typically you need a larger mesh to get decent scaling; if you test your model on a small mesh, most of the simulation time is spent on communication instead of actual computation. I remember there is a tool in OpenFOAM to check your communication time versus computation time.

If your application is okay with 0.9M resolution, what is your purpose for testing the scaling? A 0.9M-cell model is not computationally intense.

Thanks,
Rdf

August 22, 2021, 22:28   #14
christopher van rees (ctvanrees), New Member, Chile
randolph,

Thanks for your reply. I am now using a machine with 8 cores, since it does not make sense to go any further with 900K cells. My issue is that my simulations are taking too long to complete (about 6 days for 20 seconds of simulation).

My first attempt to speed up my simulations was to increase the core count. The second option would be to increase the Co number, but I still want decent results (maximum Co = 0.5).

Let me know what you think. Maybe it is okay to have 6 days of computation time?

Regards,
CVP

August 28, 2021, 09:42   #15
Reviewer #2 (randolph), Senior Member, Knoxville, TN
Chris,

Is the application sensitive to the surface wave resolution?

If not, I would somewhat lower the resolution of the surface wave. In my humble opinion, keeping mass conservation (I would monitor the water mass in the simulation) is more important than resolving the waves.

I could also be wrong; maybe resolving the (shock) wave front is important in this type of dam break problem. Nevertheless, the CFL number may not be the constraint on the resolution of the wave surface; many times the spatial resolution is the constraint. In that case, having a tight CFL criterion does not really improve your resolution. I would run some tests and compare the influence of the time step size (i.e., the CFL number).


An 8-core simulation for 6 days sounds okay to me. Of course, this depends entirely on the computational resources one has at hand, as well as on the expected model accuracy.

Thanks,
Rdf

August 30, 2021, 13:11   #16
christopher van rees (ctvanrees), New Member, Chile
randolph,

Thanks again for your reply. Yes, the validation exercise I am performing is to reproduce the experimentally measured slurry depth at a certain point. I will relax my CFL condition and use a finer mesh to see what I get in this case.

What Co number would you use, for both the time-stepping and the interface Co? Currently I am limiting my Co to 0.3.

Let me know what you think.

Regards,
CVP

August 31, 2021, 09:30   #17
Reviewer #2 (randolph), Senior Member, Knoxville, TN
Chris,

I would set both the convective CFL and the wave (interface) CFL as large as I can, as long as the simulation does not blow up (e.g., 0.9), and I typically just go with PISO (or an outer correction loop of 1 for PIMPLE). One will typically find that the wave CFL is the one that actually limits the time step. In that case, a more diffuse interface would accelerate and stabilize the computation, though at the cost of accuracy.
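In controlDict terms that would be something along these lines (a sketch only; adjust the limits to your case):

Code:
adjustTimeStep  yes;
maxCo           0.9;    // convective Courant number limit
maxAlphaCo      0.9;    // interface (wave) Courant number limit
maxDeltaT       1;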

Thanks,
Rdf