Why OpenFOAM decompose domain so slow?

Old   October 6, 2022, 03:42
Default Why OpenFOAM decompose domain so slow?
  #1
Senior Member
 
Dongyue Li
Join Date: Jun 2012
Location: Beijing, China
Posts: 844
Hi there,

I know OpenFOAM needs to decompose the domain. Recently I tried ANSYS Fluent on meshes with 30 million cells, and its partitioning is far faster than OpenFOAM's. For a mesh of this size on 800 CPUs, I would say Fluent partitions about 60 times faster: OpenFOAM needs around 5 minutes to decompose, while Fluent needs about 5 seconds.

I don't know why.
__________________
My OpenFOAM algorithm website: http://dyfluid.com
By far the largest Chinese CFD-based forum: http://www.cfd-china.com/category/6/openfoam
We provide lots of clusters to Chinese customers, and we are considering to do business overseas: http://dyfluid.com/DMCmodel.html

Old   October 6, 2022, 04:02
Default
  #2
Member
 
Al Csc
Join Date: Jul 2018
Posts: 31
Parallel programming in OpenFOAM
You may want to check this thread

Cheers!

Old   October 6, 2022, 04:03
Default
  #3
Member
 
Al Csc
Join Date: Jul 2018
Posts: 31
And mainly this one, which I forgot:
Parallelizing decomposePar utility

Old   October 7, 2022, 03:32
Default
  #4
Senior Member
 
Hrvoje Jasak
Join Date: Mar 2009
Location: London, England
Posts: 1,906
Well... First, I am the guy who wrote the decomposePar and reconstructPar utilities in the first place, so I'm the one to blame. What the utility does is read in the original mesh (fixed cost), run an external package to decide which cell goes to which processor (~10 seconds), and then build the individual processor meshes by sub-setting the original mesh for points, faces, cells, boundary faces, and point, face and cell zones. Each processor mesh is then packed with local-to-global addressing and everything is written out.
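
To make the sub-setting step concrete, here is a minimal Python sketch on a toy mesh. It is not the decomposePar code: the function name `build_processor_meshes`, the dictionary keys and the toy data are all made up for illustration, and only cells and points are handled (no faces, boundaries or zones). It assumes the cell-to-processor assignment has already been produced by the external partitioner.

```python
def build_processor_meshes(cell_points, cell_to_proc, n_procs):
    """Split a toy mesh into per-processor sub-meshes.

    cell_points  : list of tuples, global point indices of each cell
    cell_to_proc : list giving the processor id chosen for each cell
    Returns one dict per processor with local connectivity plus
    local-to-global addressing for cells and points.
    """
    meshes = []
    for proc in range(n_procs):
        # Local cell i corresponds to global cell cell_addressing[i].
        cell_addressing = [c for c, p in enumerate(cell_to_proc) if p == proc]

        point_addressing = []   # local point index -> global point index
        global_to_local = {}    # reverse lookup used while renumbering
        local_cells = []
        for c in cell_addressing:
            local = []
            for gp in cell_points[c]:
                if gp not in global_to_local:
                    global_to_local[gp] = len(point_addressing)
                    point_addressing.append(gp)
                local.append(global_to_local[gp])
            local_cells.append(tuple(local))

        meshes.append({
            "cells": local_cells,
            "cell_addressing": cell_addressing,
            "point_addressing": point_addressing,
        })
    return meshes


# Four 1D "cells" in a row, each spanning two consecutive points.
cell_points = [(0, 1), (1, 2), (2, 3), (3, 4)]
cell_to_proc = [0, 0, 1, 1]   # hypothetical partitioner output
procs = build_processor_meshes(cell_points, cell_to_proc, 2)
print(procs[1])
```

Point 2 ends up in both sub-meshes because it sits on the processor boundary. Even in this toy version every cell and point gets touched and renumbered, which is why this stage costs far more than the partitioning call itself on a large mesh.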


Fluent is faster because it does not ACTUALLY decompose the mesh: to my knowledge, all processors read the same file through the master Scheme interpreter, and the mesh is actually decomposed when you run the case. There is no such thing as processor directories. Since the two codes do different things, you cannot expect them to run at the same speed.


Now, making things faster:


- Why do you have a serial mesh in the first place? cfMesh and snappy will build you a parallel mesh if you ask, so there is no need for decomposition. ParaView can post-process a distributed case with no problem, and you can program your own function object for on-the-fly post-processing.


- Building individual processor meshes can be made faster only in the search. I have checked the code and can do something about it: I am currently using a lot of SLLists, and I could count and allocate hard arrays instead. This makes a difference, but it is a significant project which requires FUNDING. Do you have any?


- We can change the way decomposition works altogether, using cell renumbering into a space-filling-curve format and then chop-read the single input file. This would seem to work on the parallel filing systems used today, but I am keen to consider distributed filing systems instead. For the latter, the current chopped-up processor mesh files are ideal.
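
As a rough illustration of the last idea, here is a small Python sketch of renumbering cells along a space-filling curve and then chopping the renumbered list into contiguous per-processor ranges. The post does not say which curve would be used; a Morton (Z-order) curve is assumed here only because it is short to write. The function names and the toy 4 x 4 grid are hypothetical.

```python
def morton_key(ix, iy, iz, bits=10):
    """Interleave the bits of three integer cell coordinates (Z-order)."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


def chunk_ranges(n_cells, n_procs):
    """Contiguous [start, end) ranges, one per processor, sizes within 1."""
    base, extra = divmod(n_cells, n_procs)
    ranges, start = [], 0
    for p in range(n_procs):
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Toy 4 x 4 x 1 structured grid: cell id -> integer (i, j, k) index.
cells = [(i, j, 0) for j in range(4) for i in range(4)]

# Renumber: new position n holds old cell order[n].
order = sorted(range(len(cells)), key=lambda c: morton_key(*cells[c]))

# Each processor would read only its own contiguous slice of the file.
ranges = chunk_ranges(len(cells), 4)
chunks = [[cells[c] for c in order[s:e]] for s, e in ranges]
for proc, chunk in enumerate(chunks):
    print(proc, chunk)
```

On this toy grid each of the four slices comes out as a compact 2 x 2 block, which is the property that would let every processor read one contiguous range of a single file and still get a spatially local piece of the mesh.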


Happy to hear your comments,


Hrv
__________________
Hrvoje Jasak
Providing commercial FOAM/OpenFOAM and CFD Consulting: http://wikki.co.uk

Old   October 7, 2022, 21:06
Default
  #5
Senior Member
 
Dongyue Li
Join Date: Jun 2012
Location: Beijing, China
Posts: 844
Hi Hrvoje,

Good to hear from you.

I have been using OpenFOAM for a while, so over the past several years I have considered its behavior quite normal and accepted it in both theory and practice. But recently we tried Fluent and found something new.

> ... and the mesh is actually decomposed when you run the case.
> ... which requires FUNDING.

I am just curious what exactly ANSYS Fluent does and why it is so fast. From your comments, it looks like Fluent also needs to decompose the mesh, just not at the beginning. Whether it decomposes the mesh at the beginning or in the middle of the simulation, I have not noticed it spending any LONG time on decomposition (32 million cells). Please see the log at the end of this post.

> why do you have a serial mesh in the first place?

We run extremely simple simulations to test the cluster's speed, such as cavity and pitzDaily, but with 100 million cells, so we just use blockMesh to make the mesh. Indeed, I am thinking of using snappyHexMesh to build it in parallel, with snapping off and no object inside.
Right now, for an OpenFOAM case with 100 million cells, we decompose it once and never delete the processor* directories; otherwise we would need to decompose it again.



Best,
Dongyue



The following is extremely fast compared with OpenFOAM. It also partitions the mesh, but it is extremely fast. For a mesh with 32 million cells, the following takes around 20 seconds, I would guess.
Code:
> /file/read-case
case file name [".cas.h5"] steadyState.cas

Reading "steadyState.cas"...

Buffering for file scan...



32000000 hexahedral cells, zone  1, binary.
Warning: reading 60 partition grid onto 768 compute node machine;
         will auto partition.
32000000 cell partition ids, zone  1, 60 partitions, binary.
95680000 quadrilateral interior faces, zone  2, binary.
   80000 quadrilateral velocity-inlet faces, zone 10, binary.
   80000 quadrilateral pressure-outlet faces, zone 11, binary.
   80000 quadrilateral symmetry faces, zone 12, binary.
   80000 quadrilateral wall faces, zone 13, binary.
  160000 quadrilateral symmetry faces, zone 14, binary.
  160000 quadrilateral symmetry faces, zone 15, binary.
32321001 nodes, binary.

The following takes around 5 seconds, I would guess. Very fast.

Code:
Building...
     mesh
        auto partitioning mesh by Metis (fast),
        distributing mesh
                parts..................................................,
                faces..................................................,
                nodes..................................................,
                cells..................................................,
        inter-node communication reduction using architecture-aware remapping: 51%
        bandwidth reduction using Reverse Cuthill-McKee: 3237/1261 = 2.56701

Old   October 7, 2022, 21:14
Default
  #6
Senior Member
 
Dongyue Li
Join Date: Jun 2012
Location: Beijing, China
Posts: 844
Quote:
Originally Posted by al.csc View Post
Parallel programming in OpenFOAM
You may want to check this thread

Cheers!
I tried their method.
It looks like the utility does not support Scotch.
With the simple method on a mesh with 5 million cells, it ran much faster.
On a mesh with 100 million cells, it hangs and I have to kill it.

Old   March 21, 2023, 01:25
Default
  #7
Senior Member
 
Dongyue Li
Join Date: Jun 2012
Location: Beijing, China
Posts: 844
I am still wondering what technique ANSYS Fluent uses. It looks like it does not need to decompose the domain, at least not explicitly. Even if it does, it is extremely fast. Indeed, I can see the "metis" keyword in the log. But even when OpenFOAM uses metis, it is still slow.

Old   March 29, 2023, 03:59
Default
  #8
Super Moderator
 
 
Tobias Holzmann
Join Date: Oct 2010
Location: Bad Wörishofen
Posts: 2,711
Blog Entries: 6
As Hrv already mentioned, Fluent seems to use a different method. If a 30M-cell case is decomposed in 5 s, you can be sure that no real decomposition is being done. Just imagine a 10 GB file that you split into ten 1 GB files and send over to the nodes. Even the reading and sending alone (depending on your system) will not be done within 5 s. It has been a while since I used Fluent, but if you have a license, you can call Fluent support and ask about that.
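
The arithmetic behind this argument can be written out in a few lines of Python. The 10 GB file and the 5 s window are the example numbers from the post; the 0.5 GB/s link speed is a purely hypothetical value, not a measurement of any real system.

```python
def required_throughput_gb_s(total_gb, seconds):
    """Sustained rate needed to move total_gb within the given time."""
    return total_gb / seconds


def transfer_time_s(total_gb, gb_per_s):
    """Time needed to move total_gb at a given sustained rate."""
    return total_gb / gb_per_s


# Example numbers from the post: a 10 GB mesh file handled in 5 s.
needed = required_throughput_gb_s(10.0, 5.0)
print(f"needed: {needed:.1f} GB/s for read + split + send")

# Hypothetical sustained rate of 0.5 GB/s (an assumption for illustration).
print(f"at 0.5 GB/s: {transfer_time_s(10.0, 0.5):.0f} s")
```

So writing and shipping real per-processor files in 5 s would need about 2 GB/s sustained end to end, before any time is spent on the partitioning itself.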

By the way, my suggestion is to run redistributePar in parallel with ptscotch rather than scotch.
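
A minimal sketch of what that setup could look like, written as a small Python helper that only builds the dictionary text and the command line without running anything. The helper names are hypothetical and the processor count of 30 is just an example. It is an assumption that your OpenFOAM version ships redistributePar with the `-decompose` option, so check `redistributePar -help` first.

```python
def render_decompose_par_dict(n_subdomains, method="ptscotch"):
    """Return the text of a minimal system/decomposeParDict."""
    return (
        "FoamFile\n"
        "{\n"
        "    version     2.0;\n"
        "    format      ascii;\n"
        "    class       dictionary;\n"
        "    object      decomposeParDict;\n"
        "}\n"
        "\n"
        f"numberOfSubdomains {n_subdomains};\n"
        f"method          {method};\n"
    )


def redistribute_command(n_subdomains):
    """Command line (as an argument list) for a parallel decomposition.

    Assumption: the installed OpenFOAM provides redistributePar with the
    -decompose option; check `redistributePar -help` on your version.
    """
    return ["mpirun", "-np", str(n_subdomains),
            "redistributePar", "-decompose", "-parallel"]


dict_text = render_decompose_par_dict(30)
command = redistribute_command(30)
print(dict_text)
print(" ".join(command))
```

The point of ptscotch here is that the partitioning runs across the MPI ranks rather than on a single process, so the work is shared instead of being done serially as in decomposePar.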
__________________
Keep foaming,
Tobias Holzmann

Old   March 29, 2023, 04:09
Default
  #9
Super Moderator
 
 
Tobias Holzmann
Join Date: Oct 2010
Location: Bad Wörishofen
Posts: 2,711
Blog Entries: 6
I just made a test:


  • I created a mesh with nCells: 20160000
  • Decomposed it to 30 procs
  • Time = 80 s


Please note: Fluent uses single precision by default, while OpenFOAM uses double precision by default. This also makes a major difference.
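
To put a rough number on the precision point, the Python sketch below compares the raw size of the point coordinates alone in single and double precision. It uses the node count from the Fluent log in post #5 as an example and deliberately ignores everything else in a mesh file (integer connectivity, zones, headers), so it is only a lower-bound illustration, not a statement about either code's file format.

```python
def coordinate_bytes(n_points, bytes_per_real):
    """Bytes needed to store x, y, z for every mesh point."""
    return 3 * n_points * bytes_per_real


n_points = 32_321_001   # node count from the Fluent log in post #5

single = coordinate_bytes(n_points, 4)   # IEEE 754 single precision
double = coordinate_bytes(n_points, 8)   # IEEE 754 double precision

print(f"single: {single / 1e6:.0f} MB, double: {double / 1e6:.0f} MB")
```

The floating-point data doubles in size, while the integer addressing is unaffected, so the overall effect on read, write and transfer time is at most a factor of two.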

Old   April 14, 2023, 08:31
Default
  #10
Senior Member
 
Dongyue Li
Join Date: Jun 2012
Location: Beijing, China
Posts: 844
Hi Tobi,

> As Hrv already mentioned, fluent seems to use a different method. If you decompose a 30M case in 5 s, you can believe that nothing will be done in terms of real decomposing. Just imagine a file that has 10 GB and you will split it up to 10 times 1GB files and send it over to the nodes. Even the reading and sending (depending on your system) will not be done within 5 s. Its a while ago when I used fluent but if you have a license, you can call the fluent support and ask about that

Yeah, this is what I want to know. It looks like they use a very different algorithm. I will ask them and see if they can provide some useful information.
