March 15, 2023, 13:21 |
|
#661 |
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14 |
||
March 15, 2023, 14:27 |
|
#662 |
New Member
Chermac Rolle
Join Date: Mar 2023
Posts: 5
Rep Power: 3 |
EDITS:
I am still not quite confident I got it right (hence not posting times before), but initial findings included now below.
Hardware
Software
Benchmark
Code:
# cores   Wall time (s):
------------------------
Meshing Times:
1   755.06
2   492.13
3   353.34
4   311.02
5   259.87
6   256.08
7   217.3
8   209.71
9   210.58
10  196.55
11  184.98
12  180.38
13  186.5
14  172.59
15  213.75
16  180.15
Flow Calculation:
1   516.34
2   296.19
3   208.3
4   181.82
5   51.41
6   167.92
7   166.2
8   162.63
9   167.11
10  164.64
11  166.08
12  165.67
13  168.1
14  167.56
15  170.41
16  170.9
The sample benchmark throws errors with the Windows binary. I am still investigating this (as a relative OpenFOAM newbie), but it seems to do with changing the subdomain to a number other than 6. It also needs the controlDict to be edited:
Code:
writeCompression uncompressed;
Code:
writeCompression off;
Benchmark results:
Windows 10 Pro 22H2 Build 19045.2673
OpenFOAM-v2212-windows-mingw
Flow Calculation:
1   499.908
2   353.212
3   310.956
4   283.392
5   70.862
6   259.059
7   253.411
8   243.51
9   226.215
10  215.78
11  210.594
12  210.564
13  210.029
14  212.242
15  192.991
16  195.159
Last edited by iamchermac; March 15, 2023 at 19:05. Reason: Corrected all-core speed and added benchmark timings for Windows binary. |
|
March 15, 2023, 17:52 |
|
#663 |
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14 |
The result does not look bad. It is normal that this processor cannot take advantage of all these fast cores, because it has only two memory channels (populated with four DIMMs).
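To put numbers on that plateau, here is a minimal sketch that turns a few of the Linux flow-calculation wall times from post #662 into speedup and parallel efficiency relative to the single-core run. The function name `scaling_table` is made up for illustration, and only the 1, 2, 4, 8 and 16 core entries are used.

```python
# Flow-calculation wall times (s) copied from post #662 (Linux run),
# restricted to a handful of core counts.
wall_times = {1: 516.34, 2: 296.19, 4: 181.82, 8: 162.63, 16: 170.9}


def scaling_table(times):
    """Return (cores, speedup, efficiency) tuples relative to the 1-core run."""
    t1 = times[1]
    rows = []
    for cores in sorted(times):
        speedup = t1 / times[cores]          # how many times faster than 1 core
        rows.append((cores, speedup, speedup / cores))  # efficiency = speedup per core
    return rows


for cores, speedup, efficiency in scaling_table(wall_times):
    print(f"{cores:2d} cores: speedup {speedup:4.2f}, efficiency {efficiency:5.1%}")
```

With these numbers the speedup stays close to 3 from 8 cores onwards, and the 16-core run is slightly slower than the 8-core one, which is what one would expect if memory bandwidth rather than core count is the limit.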
|
|
March 15, 2023, 21:11 |
Ryzen 5600G and 5700G
|
#664 |
New Member
Chermac Rolle
Join Date: Mar 2023
Posts: 5
Rep Power: 3 |
Edits added:
Hardware
Software
Benchmark
Code:
# cores   Wall time (s):
------------------------
Meshing Times:
1   779.15
2   541.52
3   385.36
4   334.29
5   301.73
6   277.16
Flow Calculation:
1   549.71
2   343.31
3   273.09
4   248.09
5   241.1
6   235.34
Code:
Flow Calculation:
1   565.38
2   361.24
3   292.32
4   270.56
5   261.67
6   256.19
Code:
Flow Calculation:
1   627.44
2   409.15
3   338.29
4   314.01
5   304.08
6   297.7
Hardware
Software
Benchmark (a simple rerun of the above 5600G due to meshing issue)
Code:
# cores   Wall time (s):
------------------------
Flow Calculation:
1   608.68
2   362.62
3   295.94
4   277.87
5   262.39
6   255.66
Last edited by iamchermac; March 16, 2023 at 20:44. Reason: Additional benchmarks provided for 5600G with adjusted RAM speeds. |
|
March 17, 2023, 13:30 |
|
#665 |
Senior Member
Join Date: May 2012
Posts: 551
Rep Power: 16 |
Intel 13900k (HT off), 32 GB DDR5@7200 MT/s (34-44-44-96), Ubuntu 22.04, OpenFOAM v10
Meshing (1, 2, 4, 8 cores):
7m45,887s
5m32,672s
3m24,995s
2m16,678s
# cores   Wall time (s):
------------------------
1   301.118
2   164.46
4   101.268
8   70.3852
------------------------
Pretty OK results for a dual channel system. |
|
March 17, 2023, 18:24 |
|
#666 |
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14 |
What did the system cost you approximately? It is an amazing result!
|
|
March 17, 2023, 18:50 |
|
#667 | |
Senior Member
Join Date: May 2012
Posts: 551
Rep Power: 16 |
Quote:
It was close to 3000 Euro. The prices in Sweden are really high at the moment though, and the system is not optimized for price, so by cutting back on the PSU, case, SSDs, cooling solution (I picked an absolutely massive radiator for the AiO) and GPU it could be a lot cheaper. This will primarily be used for single core tasks, but it is fun to see how it fares in CFD as well. |
||
March 17, 2023, 19:52 |
|
#668 |
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14 |
||
March 18, 2023, 05:45 |
|
#669 |
Senior Member
Join Date: May 2012
Posts: 551
Rep Power: 16 |
I think it is interesting to analyze the influence of bandwidth.
Older hardware (maximum number of cores available):
Threadripper 1950X: 154 s @ 102 GB/s peak memory bandwidth
8700k: 247 s @ 51.2 GB/s peak memory bandwidth
Epyc 7301: 36.8 s @ 340 GB/s peak memory bandwidth
From my last tests with recent hardware we have (all cases with 8 cores):
5800X3D: 138 s @ 51.2 GB/s peak memory bandwidth
5800X3D: 122 s @ 60.8 GB/s peak memory bandwidth
M1 Max: 84 s @ 243 GB/s peak memory bandwidth
13900k: 70 s @ 115 GB/s peak memory bandwidth |
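One way to read the 8-core figures above is to express each system's bandwidth and wall time relative to the slowest configuration. The sketch below does that with the numbers as posted; the helper name `relative_to` and the labels are only illustrative.

```python
# (label, wall time in s, peak memory bandwidth in GB/s) for the 8-core cases
# listed in the post above.
results = [
    ("5800X3D (51.2 GB/s)", 138.0, 51.2),
    ("5800X3D (60.8 GB/s)", 122.0, 60.8),
    ("M1 Max", 84.0, 243.0),
    ("13900k", 70.0, 115.0),
]


def relative_to(baseline, entries):
    """Return (label, bandwidth ratio, speedup) relative to the baseline entry."""
    _, t0, bw0 = baseline
    return [(label, bw / bw0, t0 / t) for label, t, bw in entries]


for label, bw_ratio, speedup in relative_to(results[0], results):
    print(f"{label:22s} bandwidth x{bw_ratio:4.2f}  speedup x{speedup:4.2f}")
```

For the two 5800X3D runs, roughly 19% more bandwidth goes with a roughly 13% shorter wall time. The M1 Max has the most bandwidth of the four yet is slower than the 13900k, so these few data points suggest bandwidth is a strong factor but not the only one.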
|
March 18, 2023, 05:54 |
|
#670 |
Member
Erik Andresen
Join Date: Feb 2016
Location: Denmark
Posts: 35
Rep Power: 10 |
Simbelmynė, very impressive results! Was it necessary to switch off the efficient cores before you ran the test?
It's a dual channel system, but 2 x 7200 = 4.5 x 3200, so it corresponds to a 4.5-channel DDR4-3200 system. Your results show that new systems for running CFD should use DDR5. |
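The channel arithmetic can be checked with the usual theoretical peak estimate: channels times transfer rate times bytes per transfer. This is a small sketch that assumes a 64-bit (8-byte) data path per channel; `peak_bandwidth_gbs` is a name chosen here for illustration.

```python
BYTES_PER_TRANSFER = 8  # assumption: 64-bit data path per memory channel


def peak_bandwidth_gbs(channels, mega_transfers_per_s):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER / 1e9


ddr5_dual = peak_bandwidth_gbs(2, 7200)     # dual-channel DDR5-7200
ddr4_single = peak_bandwidth_gbs(1, 3200)   # one DDR4-3200 channel

print(f"Dual-channel DDR5-7200: {ddr5_dual:.1f} GB/s")
print(f"One DDR4-3200 channel:  {ddr4_single:.1f} GB/s")
print(f"Equivalent DDR4-3200 channels: {ddr5_dual / ddr4_single:.2f}")
```

Under that assumption the dual-channel DDR5-7200 figure comes out at about 115 GB/s, consistent with the peak bandwidth quoted for the 13900k earlier in the thread.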
|
March 18, 2023, 06:17 |
|
#671 | |
Member
Erik Andresen
Join Date: Feb 2016
Location: Denmark
Posts: 35
Rep Power: 10 |
Quote:
Apple M1 Ultra: 46 s @ 800 GB/s peak memory bandwidth
ARM-based cores are not the strongest when it comes to floating point performance, but it is still an impressive result. |
||
March 18, 2023, 06:31 |
|
#672 | ||
Senior Member
Join Date: May 2012
Posts: 551
Rep Power: 16 |
Quote:
I did not switch off the e-cores for this test, only HT. It seems the Linux kernel in Ubuntu 22.04.2 is reasonably good for hybrid core setups.
Quote:
Agreed, and I would not purchase any Apple silicon product for CFD (or any engineering for that matter) since compatibility is really poor. However, the power draw is a different matter, and if that is important then they can be very good. If you dislike noisy fans, for instance, you can get a Mac Studio that performs 20% slower than the 13900k at about the same price. OpenFOAM and Comsol work well on Apple silicon. |
|||
March 18, 2023, 07:26 |
|
#673 | |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
Quote:
Then again, if one wanted to tweak Intel's 13th gen desktop CPUs for efficiency, I am sure there is A LOT to be gained here. Out of the box, an i9-13900K is geared towards maximum performance at all costs. That's what competition brought us. I would not be surprised if power draw could be halved, at the cost of maybe 10% less performance in this benchmark. |
||
March 18, 2023, 07:35 |
|
#674 |
Senior Member
Join Date: May 2012
Posts: 551
Rep Power: 16 |
Yeah, that would be a very nice weekend test. The x86 based platform is so much more mature in terms of available software and instructions.
|
|
March 20, 2023, 08:12 |
Ryzen 7950x3d and 7900x3d
|
#675 |
New Member
Bruce
Join Date: Jun 2015
Posts: 2
Rep Power: 0 |
The Phoronix site has some benchmarking of the Ryzen 7950X3D and 7900X3D for technical applications. OpenFOAM is one of the applications:
https://www.phoronix.com/review/amd-7900x3d-7950x3d/2
The extra 3D cache seems to really help in CFD. |
|
March 20, 2023, 14:00 |
|
#676 | |
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14 |
Quote:
This is true, but the effect gets smaller when the mesh is larger. That is a weakness of the OpenFOAM benchmarking we do here: the benchmark is a relatively small problem. So people who need to run large problems should definitely look at the Phoronix site. For large problems a large cache will not be equivalent to more memory channels. |
||
March 20, 2023, 15:32 |
|
#677 | |
New Member
Bruce
Join Date: Jun 2015
Posts: 2
Rep Power: 0 |
Quote:
If AMD releases a new Threadripper-X cpu, then it would have the extra memory channels and larger cache, albeit at a much higher cost. From all the posted results in the thread, it seems that once you get past 8 or so cores, then you plateau out and reach diminishing returns. Is this an artifact of the size of the OpenFOAM test problems, or is it also true with large problems on a dedicated high end workstation? I am new to OpenFOAM, and most of my work has been neutronic analysis via Monte Carlo methods, and the speedup is mainly dependent on the number of cores. Very interesting thread, especially if I were a graduate student starting out in CFD work. |
||
March 21, 2023, 02:07 |
|
#678 | |
New Member
Dmitry
Join Date: Feb 2013
Posts: 29
Rep Power: 13 |
Quote:
Monte Carlo neutron physics tasks are highly scalable. They compute a lot of independent tasks. Every Monte Carlo task usually does tons of integer sums in parallel with floating-point operations. That is why SMT/HT gives a 1.5x boost to Monte Carlo performance (as in the CPU-Z benchmark). These codes are strongly bound by CPU frequency. |
||
March 21, 2023, 02:28 |
|
#679 | |
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14 |
Quote:
The point of diminishing returns depends on memory bandwidth. A Genoa CPU with 12 memory channels can make productive use of more than 8 cores. |
||
March 25, 2023, 04:34 |
2x Epyc 7402
|
#680 |
New Member
Kaissar Nabbout
Join Date: Feb 2022
Posts: 1
Rep Power: 0 |
Here is my setup and the results:
Processor: 2x Epyc 7402
Memory: 16x Hynix 16 GB (256 GB total) DDR4-3200 ECC REG HMA82GR7CJR8N-XN
Motherboard: Supermicro H11DSi-NT
# cores   Wall time (s):
------------------------
48  22.93
46  23.81
40  25.44
32  27.22
24  34.18
16  41.89
12  55.09
8   80.78
4   163.07
2   384.07
1   761.34
I tested both 46 and 48 cores because I usually run on 46: while a simulation is running I sometimes visualize results in ParaView, so I leave 2 cores free for that. |
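As a rough check on where returns start to diminish on this dual-socket system, the following sketch computes how much faster the run gets each time the core count doubles, using only the wall times posted above. A value of 2.0 would be ideal scaling; `doubling_gains` is a hypothetical helper name.

```python
# Wall times (s) copied from the post above (2x Epyc 7402).
wall_times = {
    48: 22.93, 46: 23.81, 40: 25.44, 32: 27.22, 24: 34.18, 16: 41.89,
    12: 55.09, 8: 80.78, 4: 163.07, 2: 384.07, 1: 761.34,
}


def doubling_gains(times):
    """For every core count n that also has a 2n entry, return (n, 2n, t(n) / t(2n))."""
    return [
        (n, 2 * n, times[n] / times[2 * n])
        for n in sorted(times)
        if 2 * n in times
    ]


for n, m, gain in doubling_gains(wall_times):
    print(f"{n:2d} -> {m:2d} cores: {gain:.2f}x faster")
```

On these numbers, going from 8 to 16 cores still gives about 1.9x, while the step from 24 to 48 cores gives a little under 1.5x, so the plateau sits much later than on the dual-channel desktop systems earlier in the thread.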
|