January 19, 2023, 09:05 |
Running CFX solver in batch parallel mode
|
#1 |
New Member
Felipe Silva Maffei
Join Date: Dec 2015
Posts: 12
Rep Power: 11 |
Hi,
I am trying to run the CFX solver on my university cluster, but when I execute the following command:
Code:
cfx5solve -numa auto -def Fluid\ Flow\ CFX.def -start-method "Intel MPI Distributed Parallel" -par-dist n02*40,n03*40 -batch -monitor log
Do you know how to make the CFX solver use all the cores of both nodes? |
|
January 19, 2023, 17:29 |
|
#2 |
Super Moderator
Glenn Horrocks
Join Date: Mar 2009
Location: Sydney, Australia
Posts: 17,870
Rep Power: 144 |
The most likely cause for this is that you are trying to feed the multipartition data for 80 partitions through one interconnect. Unless you have a very high-end network between these two machines the interconnect will be flooded and will bottleneck the simulation.
It would be better to have 8 machines with 10 partitions each rather than 2 machines with 40 partitions each as the network load will get spread over more interconnects.
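As a rough illustration of the two layouts, here is a minimal Python sketch that builds the host list in the same `host*count` form used in the command from the first post. The helper name `par_dist` and the host names `n04` to `n09` are hypothetical; only `n02` and `n03` appear in the thread.

```python
def par_dist(hosts, partitions_per_host):
    """Build a host list in the 'host*count,host*count' form shown in post #1."""
    if partitions_per_host < 1:
        raise ValueError("partitions_per_host must be at least 1")
    return ",".join(f"{host}*{partitions_per_host}" for host in hosts)


# Two nodes with 40 partitions each (the layout from the first post).
two_nodes = par_dist(["n02", "n03"], 40)

# Eight nodes with 10 partitions each; n04..n09 are hypothetical host names.
eight_nodes = par_dist([f"n{i:02d}" for i in range(2, 10)], 10)

print(two_nodes)
print(eight_nodes)
```

Both strings describe 80 partitions in total; the second spreads them over four times as many hosts, which is the point of the suggestion above.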
__________________
Note: I do not answer CFD questions by PM. CFD questions should be posted on the forum. |
|
January 19, 2023, 17:32 |
|
#3 |
Super Moderator
Glenn Horrocks
Join Date: Mar 2009
Location: Sydney, Australia
Posts: 17,870
Rep Power: 144 |
Oh - and the problem might not necessarily be the network connection. It could be the connection of the network adapter to the CPU, so that means the FSB, memory interconnect and other motherboard stuff. So the motherboard quality is critical as well.
|
January 20, 2023, 09:44 |
|
#4 |
New Member
Felipe Silva Maffei
Join Date: Dec 2015
Posts: 12
Rep Power: 11 |
Thanks for the answer,
OK, I will check these points, and also whether I can use more nodes with fewer CPUs on each one. Is there another possibility? I ask because I ran one test where I started the same case without allocating the node that I think worked well on the first try, and I observed one node with 40 CPUs working at 3% of its capacity. I was wondering whether it is something related to the host machine configuration (either Ansys or the cluster). |
|
January 21, 2023, 02:18 |
|
#5 |
Super Moderator
Glenn Horrocks
Join Date: Mar 2009
Location: Sydney, Australia
Posts: 17,870
Rep Power: 144 |
It is a really good idea to test the capabilities of your cluster before you start using it. I recommend getting a benchmark simulation, running it with 1, 2, 4, 8, 16, 32, etc. partitions on one machine and checking the scaling, and then doing the same across 2 machines, then 4, 8, etc.
This will tell you how many partitions you can put on a single node (as it is likely performance will drop off before you use all cores), and how it scales across multiple nodes. It is very instructive - and it will also tell you the optimum configuration you should define to get best performance from your cluster. As you are seeing, optimum performance is almost certainly not using the maximum number of partitions.
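A minimal sketch of how the results of such a scaling test could be tabulated, assuming you record the wall-clock time of the same benchmark for each partition count. The function names, the 70% efficiency cut-off and the timings in the usage example are all hypothetical placeholders, not measurements from any cluster.

```python
def scaling_table(timings):
    """Return (partitions, speedup, efficiency) rows.

    timings maps partition count -> wall-clock seconds for the same benchmark.
    Speedup and efficiency are relative to the smallest partition count run.
    """
    base_n = min(timings)
    base_t = timings[base_n]
    rows = []
    for n in sorted(timings):
        speedup = base_t / timings[n]
        efficiency = speedup * base_n / n  # 1.0 means ideal scaling
        rows.append((n, speedup, efficiency))
    return rows


def best_partition_count(timings, min_efficiency=0.7):
    """Largest partition count whose efficiency stays at or above the cut-off.

    The 0.7 default is an arbitrary assumption; pick your own threshold.
    """
    ok = [n for n, _, eff in scaling_table(timings) if eff >= min_efficiency]
    return max(ok)


# Placeholder timings in seconds, for illustration only (not real data).
example = {1: 1000.0, 2: 520.0, 4: 280.0, 8: 170.0, 16: 150.0}

for n, speedup, eff in scaling_table(example):
    print(f"{n:3d} partitions: speedup {speedup:5.2f}, efficiency {eff:4.0%}")
print("suggested partition count:", best_partition_count(example))
```

Running the same tabulation on the single-node series and then on the 2, 4, 8 node series gives the two curves described above: where one node stops scaling, and how well the job scales across nodes.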
|
January 21, 2023, 08:32 |
|
#6 |
New Member
Felipe Silva Maffei
Join Date: Dec 2015
Posts: 12
Rep Power: 11 |
OK, thanks for the explanation. I will try it. One more thing: can cluster performance be software dependent? I ask because a friend in my lab runs OpenFOAM on 4 nodes using all the CPUs, and he doesn't report problems like this.
|
|
January 21, 2023, 18:19 |
|
#7 |
Super Moderator
Glenn Horrocks
Join Date: Mar 2009
Location: Sydney, Australia
Posts: 17,870
Rep Power: 144 |
Yes, cluster performance is software dependent. Different software puts different loads on main memory, L1 and L2 caches, hard drive, inter-partition communications and so on. The optimisation options used when the software is compiled also make a big difference.
But most respectable Navier-Stokes-based CFD codes should be very similar. You should only see major differences when going to extreme numbers of partitions. If you compare CFD software with ray tracing software, for instance, I would expect them to scale very differently on a large cluster.
|
Tags |
ansys, cfx, cluster |