January 19, 2016, 13:11 |
Setting parallelization in CFX
|
#1 |
Member
|
Please help me set up parallel runs in CFX on a local computer.
I find that the performance gain from adding cores is not large. 506549230.png Y axis: time spent on computation. X axis: number of cores. Case: axial compressor rotor. |
|
January 19, 2016, 14:30 |
|
#2 |
Senior Member
Join Date: Jun 2009
Posts: 1,853
Rep Power: 33 |
What kind of problem are you solving? Not all problems scale equally.
How many cores do you have available? Is HyperThreading enabled? |
|
January 19, 2016, 15:42 |
|
#4 |
Member
|
||
January 19, 2016, 16:59 |
|
#5 |
Super Moderator
Glenn Horrocks
Join Date: Mar 2009
Location: Sydney, Australia
Posts: 17,819
Rep Power: 144 |
Small meshes do not parallelise well. Some physical models can have problems in parallel: radiation modelling especially, but some multiphase models also slow things down. Lots of file IO from results files, monitor points or anything else will also hurt parallel performance.
Finally: the only part of the simulation which runs in parallel is the solver iterations. Partitioning, initial condition interpolation, set up to begin and pack up to end (including writing the results files) are done in serial. To get a good speed up you will need a simulation where the majority of the time is spent in the solver iterations. |
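The point about serial stages can be put into numbers with Amdahl's law. The sketch below is not anything from CFX; `amdahl_speedup` and the solver fractions in the loop are illustrative assumptions, chosen only to show how quickly the serial part caps the achievable speed up.

```python
def amdahl_speedup(parallel_fraction: float, n_cores: int) -> float:
    """Ideal speed up when only `parallel_fraction` of the serial run time scales with cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    # Partitioning, interpolation and results writing stay serial in this model.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


if __name__ == "__main__":
    # Hypothetical fractions of run time spent in solver iterations.
    for fraction in (0.5, 0.9, 0.99):
        row = ", ".join(
            f"{n} cores: {amdahl_speedup(fraction, n):.2f}x" for n in (1, 2, 4, 8)
        )
        print(f"solver fraction {fraction:.2f} -> {row}")
```

A short run with few iterations spends proportionally more time in the serial stages, so it sits near the low end of these fractions and will show little gain from extra cores.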
|
January 20, 2016, 02:17 |
|
#6 | |
Member
|
Quote:
https://drive.google.com/file/d/0B5g...ew?usp=sharing case number 5(3) |
||
January 20, 2016, 16:14 |
|
#7 |
Super Moderator
Glenn Horrocks
Join Date: Mar 2009
Location: Sydney, Australia
Posts: 17,819
Rep Power: 144 |
I do not have time to look at your project.
If you want us to look at your simulation, post some images of your mesh and your CCL (or output file). |
|
January 21, 2016, 03:21 |
|
#8 | |
Member
|
Quote:
506685770.png 506685896.png 506685896.png 506685746.png These pictures show the mesh quality and limits, the solver settings, and the results of a series of runs. My processor is an Intel(R) Xeon(R) CPU E5430 @ 2.66GHz, 3024 MHz. Link to my mesh in TG: https://drive.google.com/file/d/0B5g...ew?usp=sharing Link to my results: https://drive.google.com/file/d/0B5g...ew?usp=sharing |
||
January 21, 2016, 17:05 |
|
#9 |
Super Moderator
Glenn Horrocks
Join Date: Mar 2009
Location: Sydney, Australia
Posts: 17,819
Rep Power: 144 |
Is this a simple, single phase simulation? No radiation or multiphase?
In the <CFX ROOT>/examples directory there is a file called Benchmark.def. Can you run this in serial and in local parallel? Please report the wall clock time for the solver iterations (not the total time). It is listed in the output file after the solver iterations have converged and before the results file contents. I use this benchmark simulation to debug parallel problems. |
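Once the solver wall clock times have been read from the output files, the comparison between serial and parallel runs is simple arithmetic. The sketch below assumes the times are typed in by hand; `scaling_table` is a hypothetical helper and the timings in the example are made up, not results from this thread.

```python
def scaling_table(wall_clock_seconds):
    """Return (cores, speedup, efficiency) rows relative to the serial run.

    `wall_clock_seconds` maps number of cores to solver wall clock time in seconds.
    """
    if 1 not in wall_clock_seconds:
        raise ValueError("a serial (1 core) timing is required as the baseline")
    serial = wall_clock_seconds[1]
    rows = []
    for cores in sorted(wall_clock_seconds):
        speedup = serial / wall_clock_seconds[cores]
        # Efficiency of 1.0 means perfectly linear scaling.
        rows.append((cores, speedup, speedup / cores))
    return rows


if __name__ == "__main__":
    # Made-up timings, for illustration only.
    timings = {1: 100.0, 2: 50.0, 4: 40.0}
    for cores, speedup, efficiency in scaling_table(timings):
        print(f"{cores} cores: speedup {speedup:.2f}, efficiency {efficiency:.0%}")
```

Using only the solver iteration time keeps the serial start-up and results writing out of the comparison, which is why the total time is not the number to report.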
|
January 30, 2016, 08:22 |
|
#10 | |
Member
|
Quote:
|
||
January 31, 2016, 04:25 |
|
#11 |
Super Moderator
Glenn Horrocks
Join Date: Mar 2009
Location: Sydney, Australia
Posts: 17,819
Rep Power: 144 |
OK thanks. Please run the Benchmark.def file I mentioned before in serial and local parallel modes.
|
|
February 5, 2016, 03:45 |
|
#12 |
Member
|
||
February 5, 2016, 04:15 |
|
#13 | |
Super Moderator
Glenn Horrocks
Join Date: Mar 2009
Location: Sydney, Australia
Posts: 17,819
Rep Power: 144 |
See post #9:
Quote:
|
||
February 5, 2016, 05:28 |
|
#14 |
Member
Join Date: Jan 2015
Posts: 63
Rep Power: 11 |
To benefit from HPC the number of elements needs to be large; otherwise the time spent on inter-process communication and I/O will be significant.
Ideally the speed-up should scale linearly with the number of cores; if it does not, you are not exploiting the parallelism effectively. I hope this helps. |
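A rough way to see why the element count matters is a surface-to-volume argument: the work per partition grows with the number of elements it holds, while the data exchanged with neighbouring partitions grows roughly with its surface. The sketch below is a toy model only; `parallel_efficiency` and the `comm_cost_ratio` constant are assumptions with no connection to how CFX actually partitions or communicates.

```python
def parallel_efficiency(n_elements: float, n_cores: int, comm_cost_ratio: float = 10.0) -> float:
    """Toy estimate of parallel efficiency for a 3D mesh split into equal partitions."""
    if n_elements <= 0 or n_cores < 1:
        raise ValueError("n_elements must be positive and n_cores at least 1")
    if n_cores == 1:
        return 1.0  # a serial run has no communication cost
    per_core = n_elements / n_cores
    compute = per_core  # work proportional to elements in the partition
    # Communication proportional to partition surface, weighted by an arbitrary constant.
    communicate = comm_cost_ratio * per_core ** (2.0 / 3.0)
    return compute / (compute + communicate)


if __name__ == "__main__":
    for elements in (1e5, 1e6, 1e7):
        row = ", ".join(
            f"{n} cores: {parallel_efficiency(elements, n):.0%}" for n in (2, 4, 8)
        )
        print(f"{elements:.0e} elements -> {row}")
```

In this model the efficiency depends only on the elements per core, so a larger mesh tolerates more cores before communication dominates.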
|
February 5, 2016, 17:40 |
|
#15 |
Senior Member
Erik
Join Date: Feb 2011
Location: Earth (Land portion)
Posts: 1,176
Rep Power: 23 |
My guess is that you are running out of memory bandwidth. That CPU only has 2 memory channels at 1333 MHz. You scale quite well from 1-2 cores, but lose efficiency quickly after that.
On my computer with 4 memory channels @ 2133 MHz, I do not get too much speedup past 4 cores. 3 cores is actually the sweet spot where it is pretty linear from 1, then efficiency starts to drop off at 4 cores. I've tried 5 cores and got the same performance as 4 in some cases. |
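The memory bandwidth argument can be sketched with a simple cap: once the cores together ask for more bandwidth than the memory system can deliver, adding cores stops helping. The code below assumes 64-bit (8 byte) channels and reads the quoted MHz figures as mega-transfers per second; the per-core demand is a made-up number, since the real value depends on the solver and the case.

```python
def peak_bandwidth_gb_s(channels: int, mega_transfers_per_s: float, bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1000 MB)."""
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000.0


def bandwidth_limited_speedup(n_cores: int, peak_gb_s: float, demand_per_core_gb_s: float) -> float:
    """Speed up capped by memory bandwidth; never below 1.0 for a valid run."""
    if n_cores < 1 or peak_gb_s <= 0 or demand_per_core_gb_s <= 0:
        raise ValueError("inputs must be positive")
    cap = max(1.0, peak_gb_s / demand_per_core_gb_s)
    return min(float(n_cores), cap)


if __name__ == "__main__":
    demand = 10.0  # hypothetical GB/s needed by one solver process
    for label, channels, rate in (("2 ch @ 1333", 2, 1333), ("4 ch @ 2133", 4, 2133)):
        peak = peak_bandwidth_gb_s(channels, rate)
        row = ", ".join(
            f"{n} cores: {bandwidth_limited_speedup(n, peak, demand):.2f}x" for n in (1, 2, 4, 8)
        )
        print(f"{label}: peak {peak:.1f} GB/s -> {row}")
```

Under these assumptions the two configurations have theoretical peaks of about 21.3 GB/s and 68.3 GB/s, which is consistent in spirit with scaling flattening out at a low core count on the older machine.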
|
Tags |
axial compress rotor |