July 30, 2021, 10:43 |
Small cluster scaling with GbE
|
#1 |
New Member
Max Spencer
Join Date: Dec 2019
Posts: 11
Rep Power: 7 |
I have a small cluster of 3 machines. Each node has 2x 8-core CPUs (E5-2650). They are currently connected via GbE through a single switch (HP ProCurve 1800-24G), and I have teamed/bonded two GbE ports between each of the three machines and the switch.
I have set up an NFS share for the solution files, mounted it on each machine, and configured passwordless SSH between the nodes. The cluster seems to function properly and distributes the job correctly. However, there is a severe drop-off in performance when moving much beyond 2 nodes.

My test case is ~12M cells, so around 375k cells/core at 32 cores.

2 nodes (32 cores) = good scaling (~6.8 sec/iteration)
2 nodes (32 cores) + 3/4 cores from third node = marginal scaling (6.5 sec/iteration)
2 nodes (32 cores) + 6 or more cores from third node = terribly slow (9.8 sec/iteration)
3 nodes with 10 cores each = terribly slow (12 sec/iteration)

Two points:
1) The bottleneck does not appear to be throughput, as I did not notice any significant improvement from teaming a second GbE link. The traffic monitor shows maybe 120-150 Mbit/s usage between nodes.
2) Adding any cores on a third node seems to tank performance, even if the total core count is kept below 32 cores.

I understand that scaling over GbE will be limited and that the recommended approach is to go InfiniBand. However, I wanted to see whether this behavior is expected or whether there is something more I can fix/tune to get any scaling on a third node. Perhaps I have missed something in my setup.

Thanks for any pointers!
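For reference, the arithmetic behind the numbers above can be sketched in a few lines of Python. This is only an illustration, not part of the actual setup: the function names are made up, and the core counts for the two partial third-node runs are assumptions ("3/4 cores" is taken as 4 and "6 or more" as 6). Efficiency is expressed relative to the 2-node, 32-core run, since no single-node timing was reported.

```python
# Rough scaling arithmetic for the timings reported in the post.
# ASSUMPTIONS: "3/4 cores" is taken as 4 cores and "6 or more" as 6 cores;
# the 32-core run is used as the baseline because no smaller run was reported.

TOTAL_CELLS = 12_000_000  # "~12M cells"

# (label, total cores, seconds per iteration)
RUNS = [
    ("2 nodes", 32, 6.8),
    ("2 nodes + 4 cores (assumed)", 36, 6.5),
    ("2 nodes + 6 cores (assumed)", 38, 9.8),
    ("3 nodes x 10 cores", 30, 12.0),
]


def cells_per_core(total_cells, cores):
    """Average number of mesh cells handled by each core."""
    return total_cells / cores


def relative_efficiency(cores, sec_per_iter, base_cores=32, base_sec=6.8):
    """Core-seconds per iteration of the baseline divided by this run's.

    1.0 means the run uses its cores as effectively as the baseline;
    lower values mean core time is being lost (e.g. to communication).
    """
    return (base_cores * base_sec) / (cores * sec_per_iter)


def summarize(runs=RUNS, total_cells=TOTAL_CELLS):
    """Return (label, cores, cells per core, relative efficiency) per run."""
    return [
        (label, cores, cells_per_core(total_cells, cores),
         relative_efficiency(cores, sec))
        for label, cores, sec in runs
    ]


if __name__ == "__main__":
    for label, cores, cpc, eff in summarize():
        print(f"{label:30s} {cores:3d} cores  "
              f"{cpc:9.0f} cells/core  efficiency {eff:.2f}")
```

Under these assumptions, the 30-core run spread over three nodes comes out at roughly 60% of the efficiency of the 32-core run on two nodes, which is consistent with the observation that the third node hurts even at a lower total core count.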