OpenFOAM benchmarks on various hardware

Old   March 15, 2023, 13:21
Default
  #661
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14
wkernkamp is on a distinguished road
Quote:
Originally Posted by iamchermac View Post
Hi Will, thanks for the quick reply.

...


I am still getting an odd failure when using 5-cores (for tinkering and interest purposes I benchmarked 1-16 in increments of 1). I will investigate this oddity a bit more later.

What were your times?
wkernkamp is offline   Reply With Quote

Old   March 15, 2023, 14:27
Default
  #662
New Member
 
Chermac Rolle
Join Date: Mar 2023
Posts: 5
Rep Power: 3
iamchermac is on a distinguished road
EDITS:

Quote:
Originally Posted by wkernkamp View Post
What were your times?
I am still not quite confident I got it right (hence not posting times before), but initial findings included now below.

Hardware
  • CPU: Ryzen 5950x (PBO Curve Optimiser -ve 15 with 200MHz boost, SMT enabled). Single-core 5.05GHz; All-core 4.70GHz. 144W Package full load.
  • RAM: 4x 16GB GSKILL F4-3600C18-16GTZN (Dual Rank DDR4) - 3600MHz
  • Motherboard: ROG Crosshair VIII Dark Hero (latest BIOS, voltage and current regulation set to extreme)
  • SSD: 2.5in SATA
  • PSU: 1200W Platinum

Software
  • Ubuntu 22.04
  • OpenFOAM-v2212 [Build: _c9081d5d-20230220]


Benchmark

Code:
 # cores   Wall time (s):
------------------------

Meshing Times:
1 755.06
2 492.13
3 353.34
4 311.02
5 259.87
6 256.08
7 217.3
8 209.71
9 210.58
10 196.55
11 184.98
12 180.38
13 186.5
14 172.59
15 213.75
16 180.15

Flow Calculation:
1 516.34
2 296.19
3 208.3
4 181.82
5 51.41 
6 167.92
7 166.2
8 162.63
9 167.11
10 164.64
11 166.08
12 165.67
13 168.1
14 167.56
15 170.41
16 170.9
NOTE:

The sample benchmark throws errors with the Windows binary. I am still investigating this (as a relative OpenFOAM newbie), but it seems to be related to changing the number of subdomains to anything other than 6. The controlDict also needs to be edited:

Code:
writeCompression uncompressed;
changed to

Code:
writeCompression off;
However, for the time being I am using the Ubuntu results to rerun simpleFoam for a quick OS comparison.
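
The controlDict edit described above could be scripted along these lines. This is only a sketch: the helper name `patch_write_compression` and the sample dictionary text are made up for illustration, and only the `writeCompression` line comes from the post.

```python
import re

def patch_write_compression(control_dict_text: str) -> str:
    """Replace 'writeCompression uncompressed;' with 'writeCompression off;'."""
    pattern = re.compile(r"^(\s*writeCompression\s+)uncompressed(\s*;)", re.MULTILINE)
    return pattern.sub(r"\1off\2", control_dict_text)

# Hypothetical controlDict fragment, used only to demonstrate the substitution.
sample = "writeFormat      binary;\nwriteCompression uncompressed;\n"
patched = patch_write_compression(sample)
print(patched)
```

To apply it to a real case, one would read the text of `system/controlDict`, pass it through the function, and write it back; keeping a backup copy first is advisable.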

Benchmark results:

Windows 10 Pro 22H2 Build 19045.2673
OpenFOAM-v2212-windows-mingw

Flow Calculation:
1 499.908
2 353.212
3 310.956
4 283.392
5 70.862
6 259.059
7 253.411
8 243.51
9 226.215
10 215.78
11 210.594
12 210.564
13 210.029
14 212.242
15 192.991
16 195.159
wkernkamp and Crowdion like this.

Last edited by iamchermac; March 15, 2023 at 19:05. Reason: Corrected all-core speed and added benchmark timings for Windows binary.
iamchermac is offline   Reply With Quote

Old   March 15, 2023, 17:52
Default
  #663
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14
wkernkamp is on a distinguished road
The result does not look bad. It is normal that this processor cannot take advantage of all these fast cores, because it has only two memory channels (populated with four DIMMs).
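
To make the saturation visible, here is a small sketch that turns a subset of the 5950X Ubuntu flow-calculation times from post #662 into speedup and parallel efficiency. The 5-core entry is left out because the poster reported an odd failure with 5 cores; the function name `scaling` is just an illustrative choice.

```python
# Flow-calculation wall times (s) for the Ryzen 5950X on Ubuntu, copied from post #662.
# The 5-core value is omitted because that run was reported as failing.
wall_times = {1: 516.34, 2: 296.19, 3: 208.3, 4: 181.82,
              6: 167.92, 8: 162.63, 12: 165.67, 16: 170.9}

def scaling(times):
    """Return {cores: (speedup, efficiency)} relative to the 1-core run."""
    t1 = times[1]
    return {n: (t1 / t, t1 / t / n) for n, t in sorted(times.items())}

results = scaling(wall_times)
for n, (speedup, eff) in results.items():
    print(f"{n:2d} cores: speedup {speedup:4.2f}, efficiency {eff:5.1%}")
```

With these numbers the speedup peaks at roughly 3.2x around 8 cores and drops slightly at 16 cores, which fits the two-memory-channel explanation.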
iamchermac likes this.
wkernkamp is offline   Reply With Quote

Old   March 15, 2023, 21:11
Default Ryzen 5600G and 5700G
  #664
New Member
 
Chermac Rolle
Join Date: Mar 2023
Posts: 5
Rep Power: 3
iamchermac is on a distinguished road
Edits added:

Hardware
  • CPU: Ryzen 5600G (PBO Curve Optimiser -ve 15 with 200MHz boost, SMT enabled). Single-core 4.65GHz; All-core 4.30GHz. 59W max load during benchmarkOpenFOAM. IGFX Disabled.
  • RAM: 2x 32GB Corsair CMK32GX4M2A2666C16 (Dual Rank DDR4) - overclocked to 3600MHz.
  • Motherboard: X570 Phantom Gaming-ITX/TB3
  • SSD: NVME M.2
  • PSU: 750W Platinum

Software
  • Ubuntu 20.04
  • OpenFOAM-v2012 [Patch: 210618]


Benchmark

Code:
 # cores   Wall time (s):
------------------------

Meshing Times:
1 779.15
2 541.52
3 385.36
4 334.29
5 301.73
6 277.16

Flow Calculation:
1 549.71
2 343.31
3 273.09
4 248.09
5 241.1
6 235.34
RAM speed - 3200MHz. Very similar to the 5700G results below.

Code:
Flow Calculation:
1 565.38
2 361.24
3 292.32
4 270.56
5 261.67
6 256.19
RAM speed - 2666MHz

Code:
Flow Calculation:
1 627.44
2 409.15
3 338.29
4 314.01
5 304.08
6 297.7
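
As a rough check on how strongly the 5600G result follows RAM speed, the 6-core times from the three tables above can be compared against the RAM clock ratio. This is a sketch using only the numbers posted here; the variable names are illustrative.

```python
# 6-core flow-calculation wall times (s) on the Ryzen 5600G, keyed by RAM speed
# (MHz, as quoted in the post).
times_6core = {3600: 235.34, 3200: 256.19, 2666: 297.7}

base_speed = 2666
base_time = times_6core[base_speed]

comparison = {}
for speed, t in sorted(times_6core.items()):
    ram_gain = speed / base_speed      # relative RAM speed
    perf_gain = base_time / t          # relative solver throughput
    comparison[speed] = (ram_gain, perf_gain)
    print(f"{speed} MHz: RAM x{ram_gain:.2f}, performance x{perf_gain:.2f}")
```

Going from 2666 to 3600 raises the RAM speed by about 35% and the 6-core throughput by about 26%, so most of the RAM gain shows up in the result, though not all of it.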



Hardware
  • CPU: Ryzen 5700G (PBO Curve Optimiser -ve 15 with 200MHz boost, SMT enabled). Single-core 4.80GHz. 66W max load during benchmarkOpenFOAM.
  • RAM: 4x 32GB Corsair CMW128GX4M4E3200C16 (Dual Rank DDR4) - 3200MHz.
  • Motherboard: TUF GAMING B450M-PLUS II
  • SSD: NVME M.2
  • PSU: 650W Gold

Software
  • Ubuntu 22.04
  • OpenFOAM-v2212 [Build: _c9081d5d-20230220]


Benchmark (a simple rerun of the above 5600G due to meshing issue)

Code:
 # cores   Wall time (s):
------------------------

Flow Calculation:
1 608.68
2 362.62
3 295.94
4 277.87
5 262.39
6 255.66
DVSoares and wkernkamp like this.

Last edited by iamchermac; March 16, 2023 at 20:44. Reason: Additional benchmarks provided for 5600G with adjusted RAM speeds.
iamchermac is offline   Reply With Quote

Old   March 17, 2023, 13:30
Default
  #665
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 551
Rep Power: 16
Simbelmynė is on a distinguished road
Intel 13900k (HT off), 32 GB DDR5@7200 MT/s (34-44-44-96), Ubuntu 22.04, OpenFOAM v10


Meshing (1, 2, 4, 8 cores):
# cores Wall time
------------------------
1 7m45,887s
2 5m32,672s
4 3m24,995s
8 2m16,678s

Flow calculation:
# cores Wall time (s):
------------------------
1 301.118
2 164.46
4 101.268
8 70.3852

Pretty OK results for a dual channel system.
Simbelmynė is offline   Reply With Quote

Old   March 17, 2023, 18:24
Default
  #666
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14
wkernkamp is on a distinguished road
What did the system cost you approximately? It is an amazing result!
wkernkamp is offline   Reply With Quote

Old   March 17, 2023, 18:50
Default
  #667
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 551
Rep Power: 16
Simbelmynė is on a distinguished road
Quote:
Originally Posted by wkernkamp View Post
What did the system cost you approximately? It is an amazing result!

It was close to 3000 Euro. The prices in Sweden are really high at the moment, though, and the system is not optimized for price, so by cutting back on the PSU, case, SSDs, cooling solution (I picked an absolutely massive radiator for the AiO) and GPU it could be a lot cheaper.


This will primarily be used for single core tasks, but it is fun to see how it fares in CFD as well.
wkernkamp likes this.
Simbelmynė is offline   Reply With Quote

Old   March 17, 2023, 19:52
Default
  #668
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14
wkernkamp is on a distinguished road
Quote:
Originally Posted by Simbelmynė View Post
....., but it is fun to see how it fares in CFD as well.
I know. You tried with 5800X3D, but 4800 MT/s was not enough. Now you are the champ for sure. Haha.
wkernkamp is offline   Reply With Quote

Old   March 18, 2023, 05:45
Default
  #669
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 551
Rep Power: 16
Simbelmynė is on a distinguished road
I think it is interesting to analyze the influence of bandwidth.

Older hardware (maximum number of cores available):

Threadripper 1950X: 154 s @ 102 GB/s peak memory bandwidth

8700k: 247 s @ 51.2 GB/s peak memory bandwidth

Epyc 7301: 36.8 s @ 340 GB/s peak memory bandwidth

From my last tests with recent hardware we have (all cases with 8 cores):

5800X3D: 138 s @ 51.2 GB/s peak memory bandwidth

5800X3D: 122 s @ 60.8 GB/s peak memory bandwidth

M1 Max: 84 s @ 243 GB/s peak memory bandwidth

13900k: 70 s @ 115 GB/s peak memory bandwidth
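
One crude way to look at the four 8-core results listed above is to multiply wall time by peak bandwidth. The product is not a measurement, just an illustrative metric: a lower value means the run needed fewer "bandwidth-seconds".

```python
# (label, wall time in s, peak memory bandwidth in GB/s); all four runs used
# 8 cores according to the post.
runs = [
    ("5800X3D", 138, 51.2),
    ("5800X3D", 122, 60.8),
    ("M1 Max", 84, 243),
    ("13900k", 70, 115),
]

# Sort by time x bandwidth, lowest (most bandwidth-frugal) first.
ranked = sorted(runs, key=lambda r: r[1] * r[2])
for label, seconds, gbs in ranked:
    print(f"{label:8s} {seconds:4d} s  {gbs:6.1f} GB/s  product {seconds * gbs:8.0f}")
```

The three x86 entries land in a similar range (about 7000 to 8000), while the M1 Max product is much larger, which may suggest that at 8 cores it is limited by something other than peak bandwidth.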
Simbelmynė is offline   Reply With Quote

Old   March 18, 2023, 05:54
Default
  #670
Member
 
Erik Andresen
Join Date: Feb 2016
Location: Denmark
Posts: 35
Rep Power: 10
ErikAdr is on a distinguished road
Simbelmynė. Very impressive results! Was it necessary to switch off efficient cores before you ran the test?
It's a dual channel system, but 2 x 7200 = 4.5 x 3200, so it corresponds to a 4.5-channel DDR4-3200 system. Your results show that new systems for running CFD should use DDR5.
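
The arithmetic behind that comparison can be written out as a short sketch. It assumes the conventional 64-bit (8-byte) width per memory channel, and the function name is illustrative.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers: float, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s: channels x bytes per transfer x MT/s."""
    return channels * bus_bytes * mega_transfers / 1000

ddr5 = peak_bandwidth_gbs(2, 7200)            # dual channel DDR5-7200 (post #665)
ddr4_channel = peak_bandwidth_gbs(1, 3200)    # a single DDR4-3200 channel
equivalent_channels = ddr5 / ddr4_channel

print(f"DDR5-7200 dual channel: {ddr5:.1f} GB/s")
print(f"Equivalent DDR4-3200 channels: {equivalent_channels:.1f}")
```

This gives 115.2 GB/s for the 13900k system, matching the 115 GB/s quoted in post #669, and 4.5 equivalent DDR4-3200 channels.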
ErikAdr is offline   Reply With Quote

Old   March 18, 2023, 06:17
Default
  #671
Member
 
Erik Andresen
Join Date: Feb 2016
Location: Denmark
Posts: 35
Rep Power: 10
ErikAdr is on a distinguished road
Quote:
Originally Posted by Simbelmynė View Post
I think it is interesting to analyze the influence of bandwidth.

Older hardware (maximum number of cores available):

Threadripper 1950X: 154 s @ 102 GB/s peak memory bandwidth

8700k: 247 s @ 51.2 GB/s peak memory bandwidth

Epyc 7301: 36.8 s @ 340 GB/s peak memory bandwidth

From my last tests with recent hardware we have (all cases with 8 cores):

5800X3D: 138 s @ 51.2 GB/s peak memory bandwidth

5800X3D: 122 s @ 60.8 GB/s peak memory bandwidth

M1 Max: 84 s @ 243 GB/s peak memory bandwidth

13900k: 70 s @ 115 GB/s peak memory bandwidth
From post 643:
Apple M1 Ultra: 46 s @ 800 GB/s peak memory bandwidth

ARM-based cores are not the strongest when it comes to floating point performance, but it is still an impressive result.
ErikAdr is offline   Reply With Quote

Old   March 18, 2023, 06:31
Default
  #672
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 551
Rep Power: 16
Simbelmynė is on a distinguished road
Quote:
Originally Posted by ErikAdr View Post
Simbelmynė. Very impressive results! Was it necessary to switch off efficient cores before you ran the test?
It's a dual channel system, but 2 x 7200 = 4.5 x 3200, so it corresponds to a 4.5-channel DDR4-3200 system. Your results show that new systems for running CFD should use DDR5.

I did not switch off the e-cores for this test, only HT. It seems the Linux kernel in Ubuntu 22.04.2 is reasonably good for hybrid core setups.





Quote:
Originally Posted by ErikAdr View Post
From post 643:
Apple M1 Ultra: 46 s @ 800 GB/s peak memory bandwidth

ARM-based cores are not the strongest when it comes to floating point performance, but it is still an impressive result.

Agreed, and I would not purchase any Apple silicon product for CFD (or any engineering for that matter) since compatibility is really poor. However, the power draw is a different matter, and if that is important then they can be very good. If you dislike noisy fans, for instance, then you can get a Mac Studio that performs 20% slower than the 13900k at about the same price. OpenFOAM and Comsol work well on Apple silicon.
ErikAdr likes this.
Simbelmynė is offline   Reply With Quote

Old   March 18, 2023, 07:26
Default
  #673
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura about
Quote:
However, the power draw is a different matter and if that is important then they can be very good
Efficiency sure is good with Apple's new chips.
Then again, if one wanted to tweak Intel's 13th gen desktop CPUs for efficiency, I am sure there is A LOT to be gained here. Out of the box, an i9-13900K is geared towards maximum performance at all costs. That's what competition brought us.
I would not be surprised if power draw could be halved, at the cost of maybe 10% less performance in this benchmark.
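
For what that guess would mean in performance per watt, here is a minimal sketch. The two factors are the estimates from the sentence above, not measurements, and the function name is illustrative.

```python
def efficiency_gain(power_factor: float, performance_factor: float) -> float:
    """Relative change in performance per watt for scaled power and performance."""
    return performance_factor / power_factor

# Estimated figures: half the power draw, roughly 10% less performance.
gain = efficiency_gain(power_factor=0.5, performance_factor=0.9)
print(f"Performance per watt: x{gain:.1f}")
```

If the estimate held, performance per watt would improve by a factor of about 1.8.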
iamchermac likes this.
flotus1 is offline   Reply With Quote

Old   March 18, 2023, 07:35
Default
  #674
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 551
Rep Power: 16
Simbelmynė is on a distinguished road
Yeah, that would be a very nice weekend test. The x86 based platform is so much more mature in terms of available software and instructions.
wkernkamp likes this.
Simbelmynė is offline   Reply With Quote

Old   March 20, 2023, 08:12
Default Ryzen 7950x3d and 7900x3d
  #675
New Member
 
Bruce
Join Date: Jun 2015
Posts: 2
Rep Power: 0
openfoamer999 is on a distinguished road
The Phoronix site has some benchmarks of the Ryzen 7950X3D and 7900X3D for technical applications. OpenFOAM is one of the applications.

https://www.phoronix.com/review/amd-7900x3d-7950x3d/2

The extra 3d cache seems to really help in cfd.
wkernkamp likes this.
openfoamer999 is offline   Reply With Quote

Old   March 20, 2023, 14:00
Default
  #676
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14
wkernkamp is on a distinguished road
Quote:
Originally Posted by openfoamer999 View Post
The Phoronix site has some benchmarking of ryzen 7950x3d and 7900x3d for technical applications.......


The extra 3d cache seems to really help in cfd.

This is true, but the effect gets smaller when the mesh is larger. That is a weakness of the OpenFOAM benchmarking we do here: the benchmark is a relatively small problem. So people who need to run large problems should definitely look at the Phoronix site. For large problems, a large cache will not be equivalent to more memory channels.
wkernkamp is offline   Reply With Quote

Old   March 20, 2023, 15:32
Default
  #677
New Member
 
Bruce
Join Date: Jun 2015
Posts: 2
Rep Power: 0
openfoamer999 is on a distinguished road
Quote:
Originally Posted by wkernkamp View Post
This is true, but the effect gets smaller when the mesh is larger. That is a weakness of the OpenFOAM benchmarking we do here: the benchmark is a relatively small problem. So people who need to run large problems should definitely look at the Phoronix site. For large problems, a large cache will not be equivalent to more memory channels.
That is very true. In fact you can see that in his results if you compare the small mesh and medium mesh results. He noted it himself at the bottom of the page on the OpenFOAM results, i.e. "For hobbyists or those not wanting/able to make the leap to a Milan-X or Genoa server platform, these new AMD Ryzen 9 7900X3D/7950X3D processors drive a nice bargain for having a performant desktop experience dealing with CFD workloads."

If AMD releases a new Threadripper-X cpu, then it would have the extra memory channels and larger cache, albeit at a much higher cost.

From all the posted results in the thread, it seems that once you get past 8 or so cores, then you plateau out and reach diminishing returns. Is this an artifact of the size of the OpenFOAM test problems, or is it also true with large problems on a dedicated high end workstation?

I am new to OpenFOAM, and most of my work has been neutronic analysis via Monte Carlo methods, and the speedup is mainly dependent on the number of cores. Very interesting thread, especially if I were a graduate student starting out in CFD work.
openfoamer999 is offline   Reply With Quote

Old   March 21, 2023, 02:07
Default
  #678
New Member
 
Dmitry
Join Date: Feb 2013
Posts: 29
Rep Power: 13
techtuner is on a distinguished road
Quote:
Originally Posted by openfoamer999 View Post
From all the posted results in the thread, it seems that once you get past 8 or so cores, then you plateau out and reach diminishing returns. Is this an artifact of the size of the OpenFOAM test problems, or is it also true with large problems on a dedicated high end workstation?

I am new to OpenFOAM, and most of my work has been neutronic analysis via Monte Carlo methods, and the speedup is mainly dependent on the number of cores. Very interesting thread, especially if I were a graduate student starting out in CFD work.
According to my tests in ANSYS Fluent, ANSYS CFX and Siemens Star-CCM+ (all of which are limited by RAM performance) with different RANS solvers on various hardware, simulation performance drops almost linearly as the mesh size increases from 2 to 12 million cells. That means a 2-million-cell case is already a large enough task to fill at least a 256MB L3 CPU cache.
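
A back-of-the-envelope sketch of why a 2-million-cell case would not fit in cache. The figure of about 1000 bytes per cell is an assumed rule of thumb, not a number from this thread, and the real footprint depends on solver, models and precision.

```python
def working_set_mb(cells: int, bytes_per_cell: int = 1000) -> float:
    """Estimated solver working set in MB. bytes_per_cell is an assumed rule of thumb."""
    return cells * bytes_per_cell / 1e6

L3_CACHE_MB = 256  # cache size mentioned in the post

for cells in (2_000_000, 12_000_000):
    size = working_set_mb(cells)
    print(f"{cells:>10,d} cells: ~{size:,.0f} MB, {size / L3_CACHE_MB:.0f}x the L3 cache")
```

Under that assumption even the smallest mesh in the range is several times larger than a 256MB cache, so the solver would be streaming from RAM throughout.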

Monte Carlo neutron physics tasks are highly scalable, because they compute a large number of independent tasks. Each Monte Carlo task usually performs a lot of integer additions in parallel with floating point operations. That is why SMT/HT gives about a 1.5x boost in Monte Carlo performance (as in the CPU-Z benchmark). These codes are strongly bound by CPU frequency.
techtuner is offline   Reply With Quote

Old   March 21, 2023, 02:28
Default
  #679
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14
wkernkamp is on a distinguished road
Quote:
Originally Posted by openfoamer999 View Post
That is very true......

From all the posted results in the thread, it seems that once you get past 8 or so cores, then you plateau out and reach diminishing returns. Is this an artifact of the size of the OpenFOAM test problems, or is it also true with large problems on a dedicated high end workstation?......

The point of diminishing returns depends on memory bandwidth. A Genoa CPU with 12 memory channels can make productive use of more than 8 cores.
wkernkamp is offline   Reply With Quote

Old   March 25, 2023, 04:34
Default 2x Epyc 7402
  #680
New Member
 
Kaissar Nabbout
Join Date: Feb 2022
Posts: 1
Rep Power: 0
kaissar is on a distinguished road
Here is my setup and the results:

processor: 2x Epyc 7402
memory: 16x Hynix 16GB (256GB total) DDR4 3200 MHz ECC REG HMA82GR7CJR8N-XN
Motherboard: Supermicro H11DSi-NT

# cores Wall time (s):
------------------------
48 22.93
46 23.81
40 25.44
32 27.22
24 34.18
16 41.89
12 55.09
8 80.78
4 163.07
2 384.07
1 761.34

I tested both 46 and 48 cores because I usually run on 46 cores: while the simulation is running I sometimes visualize results in ParaView, so I leave these 2 extra cores free.
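
The cost of keeping two cores free can be read directly from the table above; a short sketch with those three wall times:

```python
# Wall times (s) from the 2x Epyc 7402 table above.
t48, t46, t1 = 22.93, 23.81, 761.34

penalty = t46 / t48 - 1      # relative slowdown from leaving two cores idle
speedup_48 = t1 / t48
speedup_46 = t1 / t46

print(f"46 cores is {penalty:.1%} slower than 48 cores")
print(f"Speedup vs 1 core: {speedup_46:.1f}x (46 cores), {speedup_48:.1f}x (48 cores)")
```

On this benchmark the 46-core run is a little under 4% slower than the 48-core run, which looks like a small price for a responsive desktop.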
techtuner, wkernkamp and Crowdion like this.
kaissar is offline   Reply With Quote
