|
December 28, 2019, 04:58 |
Performance problems on AMD Epyc cluster
|
#1 |
New Member
Join Date: Dec 2018
Posts: 6
Rep Power: 8 |
Dear All!
At my workplace we have a new AMD-based cluster that we use with OpenFOAM v1906 for steady-state incompressible turbulent simulations on meshes of up to 40 million cells:
- 2x AMD Epyc 7702 (2x 64 cores)
- 256 GB DDR4 RAM
- hard disks in RAID 5
- CentOS 7.7
Now we have some problems when using many cores simultaneously. As a benchmark, I ran a single-core simpleFoam case with an airfoil mesh (500,000 tetrahedral cells) several times at once. With 4 simultaneous runs the test takes about 1200 s; with 128 simultaneous runs it takes about 4 hours. We also noticed that the single-core performance differs a lot from core to core, and the spread grows as more cores are used. What could cause such differences in single-core performance?
We also ran a simpleFoam case with a mesh of about 15 million cells for 50 iterations. On 16 cores the test takes 660 s, on 32 cores 600 s.
We ran the same tests on an Intel cluster with 2x Intel Xeon Gold (28 cores in total). In the first test, all cores showed very similar times. The second case (15 million tetrahedral cells) on 28 cores takes about 400 s.
For now we are disappointed, because we had read about excellent multi-core performance of the AMD Epyc platform. Does anyone have experience with OpenFOAM scalability and performance on AMD Epyc 7002? Thank you very much! |
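For reference, here is a minimal sketch of how such a simultaneous single-core benchmark could be scripted; the case-directory names (run_0, run_1, ...) and the number of runs are placeholders, not the actual setup used here.

Code:
#!/usr/bin/env python3
# Minimal sketch: run the same single-core simpleFoam case on several cores
# at once and report each run's wall time, to expose per-core differences.
# The case copies run_0, run_1, ... are placeholders; adjust to your setup.

import subprocess
import time
from concurrent.futures import ThreadPoolExecutor

N_RUNS = 4  # number of simultaneous single-core runs

def run_one(core):
    case_dir = f"run_{core}"
    # taskset pins the solver to one core so the runs do not migrate
    cmd = ["taskset", "-c", str(core), "simpleFoam", "-case", case_dir]
    t0 = time.time()
    with open(f"{case_dir}/log.simpleFoam", "w") as log:
        subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT, check=True)
    return time.time() - t0

with ThreadPoolExecutor(max_workers=N_RUNS) as pool:
    times = list(pool.map(run_one, range(N_RUNS)))

for core, t in enumerate(times):
    print(f"core {core:3d}: {t:8.1f} s")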
|
December 28, 2019, 08:23 |
|
#2 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
So far, there is one Epyc Rome result in the benchmark thread. It took first place as far as dual-socket systems are concerned.
OpenFOAM benchmarks on various hardware
So in theory, such a system can be fast in OpenFOAM. In practice, performance depends on a lot of factors. A few things you should check:
- Use test cases that are large enough. 500k cells is definitely too small for 128 cores.
- Disable SMT in the BIOS.
- Make sure the CPU clock speed stays in the proper range when the system is under load, e.g. using turbostat.
- Check the memory configuration. You need 16 DIMMs of DDR4-3200, populated in the correct DIMM slots.
- Check how the system distributes the threads across the cores, e.g. using htop (see the sketch below).
You can also try a newer operating system. CentOS 8 finally switched to a 4.x kernel version, which might be better for bleeding-edge hardware like yours.
And last but not least: adjust your expectations. I would not expect much scaling beyond 64 cores, due to memory bandwidth limitations.
Edit: also, "hard disk RAID 5"... do your timing checks include meshing and I/O times, or do you only look at solver times? |
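As a rough illustration of the last three checks, here is a minimal Python sketch that reports the SMT state, the per-core clocks, and where the solver processes are currently running. It assumes a standard Linux /proc and /sys layout and a solver binary named simpleFoam; turbostat and htop remain the more convenient interactive tools.

Code:
#!/usr/bin/env python3
# Sketch: quick sanity checks by reading /proc and /sys directly
# (a rough stand-in for turbostat/htop output). Verify the paths
# exist on your kernel version before relying on the results.

import glob

# 1) SMT check: if any core lists more than one thread sibling, SMT is on.
siblings = set()
for path in glob.glob("/sys/devices/system/cpu/cpu*/topology/thread_siblings_list"):
    with open(path) as f:
        siblings.add(f.read().strip())
smt_on = any("," in s or "-" in s for s in siblings)
print(f"SMT appears to be {'ENABLED' if smt_on else 'disabled'}")

# 2) Clock check: current frequency reported per core (MHz).
freqs = []
with open("/proc/cpuinfo") as f:
    for line in f:
        if line.startswith("cpu MHz"):
            freqs.append(float(line.split(":")[1]))
if freqs:
    print(f"cores: {len(freqs)}, clock min/avg/max: "
          f"{min(freqs):.0f}/{sum(freqs)/len(freqs):.0f}/{max(freqs):.0f} MHz")

# 3) Placement check: which core each running simpleFoam process sits on.
for stat in glob.glob("/proc/[0-9]*/stat"):
    try:
        with open(stat) as f:
            fields = f.read().split()
    except OSError:
        continue  # process may have exited in the meantime
    if fields[1].strip("()") == "simpleFoam":
        # field 39 (index 38) of /proc/<pid>/stat is the CPU last run on
        print(f"pid {fields[0]}: simpleFoam on cpu {fields[38]}")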
|
December 30, 2019, 08:14 |
|
#3 |
New Member
Join Date: Dec 2018
Posts: 6
Rep Power: 8 |
Thank you for your answer! I'll check those.
I used 500k cells because I ran it as a single-core case, n times simultaneously. My timing checks only include the solver times. |
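For what it's worth, a small sketch of how the solver-only time can be read from an OpenFOAM log, using the "ExecutionTime = ... s  ClockTime = ... s" lines the solver prints; the default log name here (log.simpleFoam) is just a placeholder.

Code:
#!/usr/bin/env python3
# Sketch: report the last ExecutionTime/ClockTime pair found in a solver log.

import re
import sys

log_file = sys.argv[1] if len(sys.argv) > 1 else "log.simpleFoam"

exec_time = clock_time = None
pattern = re.compile(r"ExecutionTime\s*=\s*([\d.]+)\s*s\s*ClockTime\s*=\s*([\d.]+)\s*s")
with open(log_file) as f:
    for line in f:
        m = pattern.search(line)
        if m:
            exec_time, clock_time = float(m.group(1)), float(m.group(2))

if exec_time is None:
    print(f"no ExecutionTime lines found in {log_file}")
else:
    print(f"{log_file}: ExecutionTime {exec_time:.1f} s, ClockTime {clock_time:.1f} s")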
|
February 17, 2020, 09:50 |
|
#4 |
New Member
Leo Natan
Join Date: Dec 2019
Posts: 6
Rep Power: 7 |
I disabled SMT in the BIOS and everything is OK now!
|