OpenFOAM benchmarks on various hardware

Old   October 29, 2021, 01:04
Default
  #421
Member
 
Yan
Join Date: Dec 2013
Location: Milano
Posts: 43
Rep Power: 13
aparangement is on a distinguished road
Quote:
Originally Posted by xuegy View Post
I can’t wait to see the rumored dual M1 Max Mac Pro next year. Given the memory bandwidth per USD, Apple is not expensive at all.
It's better to find a direct benchmark first, as there can be a huge difference between theoretical and practical performance, especially for memory bandwidth.

In the meantime, buying one just for testing is too expensive.

Old   October 29, 2021, 01:06
Default
  #422
Senior Member
 
Join Date: Jun 2016
Posts: 102
Rep Power: 10
xuegy is on a distinguished road
Maybe pack the binary and run the benchmark in an Apple Store? How many minutes will they let you use the laptop?

Old   October 29, 2021, 01:18
Default
  #423
Member
 
Yan
Join Date: Dec 2013
Location: Milano
Posts: 43
Rep Power: 13
aparangement is on a distinguished road
Quote:
Originally Posted by xuegy View Post
Maybe pack the binary and run the benchmark in apple store? How many minutes will they allow you to touch the laptop?
I am not sure. But I don't have any macOS devices, so I can't pre-test the code.

One solution is to buy the cheapest M1 Mac mini just for testing, which is still a little pricey.

Most probably I will wait for the benchmark, or turn to a friend who does have an M1 MacBook; hopefully he has switched back to Windows.

Old   October 29, 2021, 01:22
Default
  #424
Senior Member
 
Join Date: Jun 2016
Posts: 102
Rep Power: 10
xuegy is on a distinguished road
I’ve already tested the performance on an M1 Mac mini. You can find the result in this thread.

Old   October 29, 2021, 01:35
Default
  #425
Member
 
Yan
Join Date: Dec 2013
Location: Milano
Posts: 43
Rep Power: 13
aparangement is on a distinguished road
Quote:
Originally Posted by xuegy View Post
I’ve already tested the performance on a M1 Mac mini. You can find the result in this thread.
Yeah, I saw your post, very solid performance indeed.

What I'm most curious about is how fast the M1 Max is. Would it be 5~6 times faster, as a linear extrapolation from the theoretical bandwidths suggests (which would be game-changing), or are there other limiting factors that cap the speed-up?

To be sure it's better to have a direct benchmark run.
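
As a rough sketch of the linear extrapolation from theoretical bandwidths mentioned above: if the solver were purely bandwidth-bound, the ratio of peak memory bandwidths would be an upper bound on the speed-up. The bandwidth figures below are advertised peak values and are an assumption here, not something measured in this thread.

```python
# Advertised peak memory bandwidth in GB/s.
# Assumed values for illustration, not measurements from this thread.
PEAK_BANDWIDTH_GBS = {"M1": 68.25, "M1 Pro": 200.0, "M1 Max": 400.0}


def bandwidth_bound_speedup(chip, baseline="M1"):
    """Upper bound on speed-up if run time scaled purely with memory bandwidth."""
    return PEAK_BANDWIDTH_GBS[chip] / PEAK_BANDWIDTH_GBS[baseline]


if __name__ == "__main__":
    for chip in PEAK_BANDWIDTH_GBS:
        print(f"{chip:7s} {bandwidth_bound_speedup(chip):.2f}x")
```

With these assumed figures the M1 Max bound lands between 5x and 6x, which is where the 5~6 estimate comes from. Whether the CPU cores can actually draw that much bandwidth is exactly what a direct benchmark would show.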

Old   November 1, 2021, 13:11
Default
  #426
New Member
 
Join Date: Apr 2009
Posts: 13
Rep Power: 17
user is on a distinguished road
Hi Foamers,

I just tested an Apple MacBook Pro 16" (2021, 32 GB, M1 Pro, 8 high-performance cores, 2 efficiency cores).

OF-v2012, compiled natively for ARM64 thanks to the help in this thread. As xuegy wrote, it is still buggy, but I managed to run this benchmark. Every run crashes at the end with MPI_ABORT errors when the streamLine output is called.

# cores Wall time (s):
------------------------
1 458.33
2 257.38
4 145.35
6 118.88
8 98.11
10 128.56 <- performance drop due to the usage of the 2 efficiency cores
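
For anyone who wants to put numbers on the scaling, here is a minimal sketch that turns the wall times above into speedup and parallel efficiency relative to the 1-core run. The function name `scaling` is just an illustrative choice.

```python
# Wall times (s) from the M1 Pro table above, keyed by number of cores.
wall_time = {1: 458.33, 2: 257.38, 4: 145.35, 6: 118.88, 8: 98.11, 10: 128.56}


def scaling(times):
    """Return {cores: (speedup, efficiency)} relative to the 1-core run."""
    t1 = times[1]
    return {n: (t1 / t, t1 / t / n) for n, t in sorted(times.items())}


if __name__ == "__main__":
    for n, (speedup, efficiency) in scaling(wall_time).items():
        print(f"{n:2d} cores: speedup {speedup:5.2f}, efficiency {efficiency:6.1%}")
```

On this data the 4-core speedup is about 3.15 (roughly 79% efficiency), so the scaling is already sub-linear well before the efficiency cores come into play.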

Old   November 1, 2021, 14:11
Default
  #427
Senior Member
 
Join Date: Jun 2016
Posts: 102
Rep Power: 10
xuegy is on a distinguished road
Thanks for the result. You've proved that M1 Pro single core is not faster than M1.

Old   November 1, 2021, 23:32
Default
  #428
Member
 
Yan
Join Date: Dec 2013
Location: Milano
Posts: 43
Rep Power: 13
aparangement is on a distinguished road
Strange that even with 4 cores the speed-up is not linear.
I am really curious to see benchmarks on the M1 Max.

In any case, even the M1 Pro can compete with a quad-channel workstation; this should be the fastest laptop right now in terms of CFD.

Quote:
Originally Posted by user View Post
Hi Foamers,

I just tested an Apple MacBook Pro 16 '' (2021, 32 GB, M1 Pro, 8 high performance cores, 2 efficiency cores)

OF-v2012 compiled in native ARM64 thanks to the help of this thread. As xuegy wrote it is still buggy, but I also managed to run this benchmark. Crash at the end of every run with some MPI_ABORT errors when calling streamLine output.

# cores Wall time (s):
------------------------
1 458.33
2 257.38
4 145.35
6 118.88
8 98.11
10 128.56 <- performance drop due to the usage of the 2 efficiency cores

Old   January 15, 2022, 00:52
Default Dual e5 2683v4, JGINYUE X99-D8 Server
  #429
New Member
 
Alexander Kazantcev
Join Date: Sep 2019
Posts: 24
Rep Power: 7
AlexKaz is on a distinguished road
Quote:
Originally Posted by julieng View Post
Hello

2 x Intel E5-2678 v3 @ 2.50GHz with SZMZ X99 Z8 motherboard from Aliexpress default settings

128 GB (8x16) DDR4 2133 MHz
openFoam 8
Ubuntu 20.04

Same test with Hyperthreading disabled

# cores Clock time (s):
------------------------
1 622
2 339
4 161
6 118
10 82
12 70
14 64
16 58

18 56
20 53
22 51

24 53


Dual e5 2683v4, JGINYUE X99-D8 Server from Aliexpress, DDR4 RDIMM 2133 8x8 default timings
v1806, Linux Mint 19.3

HT on, NUMA off, CoD off
Code:
cores  speedup mesh  speedup flow  mesh sec.  flow sec.  power
1      1             1             1649.57    1256.06    94.77
2      1.48          1.782         1117.49    705.03     97.73
4      2.78          4.034         593.14     311.35     111.14
6      3.63          5.960         454.04     210.75     122.62
8      4.42          7.524         372.84     166.95     129.69
12     5.31          9.708         310.83     129.38     147.65
16     6.07          11.23         271.89     111.89     161.83
20     6.66          11.98         247.87     104.88     175.18
24     7.76          12.52         212.62     100.29     186.94
28     7.96          12.62         207.12     99.53      198.46
30     7.19          12.55         229.57     100.07     203.85
HT off, NUMA on, CoD on
Code:
cores  speedup mesh  speedup flow  mesh sec.  flow sec.
1      1             1             1649.57    1256.06
2
4
6
8
12
16     6.47          14.09         254.92     89.17
20     7.15          15.40         230.72     81.56
24     8.41          16.19         196.11     77.57
28     8.55          16.62         193.05     75.59
30     7.69          15.67         214.58     80.18

Old   January 15, 2022, 10:52
Default Dual e5 2683v4, JGINYUE X99-D8 Server
  #430
New Member
 
Alexander Kazantcev
Join Date: Sep 2019
Posts: 24
Rep Power: 7
AlexKaz is on a distinguished road
After resetting the BIOS to default settings: HT on, NUMA on, CoD off, timings 12-11-11-24...

Code:
cores  speedup mesh  speedup flow  mesh sec.  flow sec.  power
1.00   0.88          0.81          1455.79    1017.61    88.44
2.00
4.00
6.00
8.00
12.00
16.00  5.79          11.37         251.33     89.52      166.46
20.00  6.40          12.40         227.45     82.08      179.20
24.00  7.51          13.04         193.94     78.06      191.52
28.00  7.70          13.21         189.06     77.03      204.56
30.00  6.99          13.17         208.26     77.26      208.33

Old   January 18, 2022, 00:53
Default
  #431
Member
 
Guy
Join Date: Jun 2019
Posts: 44
Rep Power: 7
linuxguy123 is on a distinguished road
Quote:
Originally Posted by hokhay View Post
Run the benchmark on AWS EC2 C6g.12xLarge (Graviton 2)
vCPU=48 (all physical cores)
Arm64

OpenFOAM compiled with Gcc

Code:
# cores   Wall time (s):
------------------------
1 902.8
2 468.9
12 91.96
16 74.57
20 66.08
24 60.35
48 48.23
Thank you for posting this. C6g.12xlarge on demand costs $1.7856/hour in my area.

Now I can look at various hardware scenarios, figure out relative run times and calculate how much it would cost on AWS.
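
A minimal sketch of that cost calculation, assuming simple pro-rata billing at the hourly price quoted above. It ignores start-up time, storage, data transfer and any billing minimums, and the function names are made up for illustration.

```python
PRICE_PER_HOUR_USD = 1.7856  # c6g.12xlarge on-demand price quoted above


def run_cost_usd(wall_time_s, price_per_hour=PRICE_PER_HOUR_USD):
    """Pro-rata cost of occupying the instance for wall_time_s seconds."""
    return wall_time_s / 3600.0 * price_per_hour


def scaled_cost_usd(cloud_benchmark_s, local_benchmark_s, local_job_s):
    """Estimate the cloud cost of a job from its run time on a local machine.

    Assumes the job scales between machines in the same ratio as this
    benchmark does, which is only a rough approximation.
    """
    estimated_cloud_s = local_job_s * cloud_benchmark_s / local_benchmark_s
    return run_cost_usd(estimated_cloud_s)


if __name__ == "__main__":
    # 48-core benchmark run on the instance (48.23 s wall time).
    print(f"benchmark run: ${run_cost_usd(48.23):.4f}")
```

For example, `scaled_cost_usd(48.23, 96.46, 7200.0)` treats a hypothetical local machine that needs 96.46 s for the benchmark and 2 hours for a real job, and estimates about one instance-hour on AWS.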

Old   January 24, 2022, 04:32
Default
  #432
Member
 
Join Date: Sep 2010
Location: Leipzig, Germany
Posts: 96
Rep Power: 16
oswald is on a distinguished road
I recently compared my older and my new workstation.

Here are the results for the older one (Dual AMD EPYC 7281, 16x8GB DDR4@2666MHz, Ubuntu 20.04.3, OpenFOAM 9)
Code:
cores    Wall time (s)    Speedup    Iter/sec
1    1041.25    1.00    0.10
2    575.89    1.81    0.17
4    230.76    4.51    0.43
6    152.58    6.82    0.66
8    113.93    9.14    0.88
12    83.32    12.50    1.20
16    63.15    16.49    1.58
20    57.56    18.09    1.74
24    49.19    21.17    2.03
28    46.62    22.33    2.15
32    42.01    24.79    2.38
And here for the newer one (Dual AMD EPYC 7543, 16x16GB DDR4@3200MHz, Ubuntu 20.04.3, OpenFOAM 9)

Code:
cores    Wall time (s)    Speedup    Iter/sec
1    538.21    1.00    0.19
4    133    4.05    0.75
8    61.61    8.74    1.62
12    42.37    12.70    2.36
16    31.21    17.24    3.20
20    32.34    16.64    3.09
24    27.64    19.47    3.62
28    23.76    22.65    4.21
32    21.26    25.32    4.70
48    19.05    28.25    5.25
64    17.03    31.60    5.87
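
To compare the two machines like for like, here is a small sketch that takes the wall times from both tables above and prints how many times faster the newer workstation is at each core count present in both. The name `common_ratio` is just illustrative.

```python
# Wall times (s) from the two tables above, keyed by number of cores.
epyc_7281 = {1: 1041.25, 2: 575.89, 4: 230.76, 6: 152.58, 8: 113.93, 12: 83.32,
             16: 63.15, 20: 57.56, 24: 49.19, 28: 46.62, 32: 42.01}
epyc_7543 = {1: 538.21, 4: 133.0, 8: 61.61, 12: 42.37, 16: 31.21, 20: 32.34,
             24: 27.64, 28: 23.76, 32: 21.26, 48: 19.05, 64: 17.03}


def common_ratio(old, new):
    """Return {cores: old_time / new_time} for core counts present in both runs."""
    return {n: old[n] / new[n] for n in sorted(old.keys() & new.keys())}


if __name__ == "__main__":
    for n, ratio in common_ratio(epyc_7281, epyc_7543).items():
        print(f"{n:2d} cores: {ratio:.2f}x faster")
```

At 32 cores the ratio comes out at roughly 2x, and it is close to that on a single core as well.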

Old   February 1, 2022, 01:40
Default AMD 5900X Preliminary Results.
  #433
Member
 
Guy
Join Date: Jun 2019
Posts: 44
Rep Power: 7
linuxguy123 is on a distinguished road
Playing around with the openFOAM benchmark. 5900X B2 Stepping. 2 x 16GB = 32 GB 3600MHz Dual Rank RAM CL18 timing.

Ran on Linux. Kernel 5.15.18-200.fc35.x86_64 Generic openFOAM.

SMT OFF.

openfoam = /usr/lib/openfoam/openfoam2112

* Using: OpenFOAM-2112 (2112) - visit www.openfoam.com
* Build: _14aeaf8d-20211220
* Arch: label=32;scalar=64
* Platform: linux64GccDPInt32Opt (mpi=sys-openmpi)

# cores Wall time (s):
------------------------
1 470.86
2 284.32
4 173.74
6 157.86
8 155.97
12 156.63

I did not expect the 5900X to run out of bandwidth at 6 cores. I expected to see a time reduction with 8 and 12 cores. Just goes to show how crucial memory bandwidth is to openFOAM. Having said this, 156 seconds is not bad for a desktop processor with 2 channels of memory. The 5900X will make up time on some of the other processors in meshing, which needs a lot less memory bandwidth.

The 5900X has the fastest 1 core time of any processor on this thread. It is twice as fast on 1 core as many of the server architectures. But it doesn't have the memory bandwidth to keep that speed when more cores are brought into the equation.

FWIW, the memory controller in Zen3, i.e. Ryzen 5000 parts, is different from the memory controller in previous Ryzen CPUs. I am going to do some memory bandwidth testing in the near future. Should be interesting.

This isn't going to be my full time CFD computer. I have parts for an EPYC 7601 machine on the way.

BTW, Geekbench 5 for this machine is SC: 1767 MC:15,608 https://browser.geekbench.com/v5/cpu/12519107

Last edited by linuxguy123; February 4, 2022 at 02:03.

Old   February 1, 2022, 12:34
Default
  #434
Member
 
Guy
Join Date: Jun 2019
Posts: 44
Rep Power: 7
linuxguy123 is on a distinguished road
5900X B2 Stepping. 2 x 16GB = 32 GB 2133MHz Dual Rank RAM CL18 timing.


# cores Wall time (s):
------------------------
1 565.88
2 375.11
4 247.73
6 238.35
8 238.13
12 238.51


2133/3600 = 0.59, i.e. a 41% reduction in memory speed
156/238 = 0.66, i.e. a 34% reduction in throughput (run time grows by about 53%)
OpenFOAM speed is highly dependent on memory speed.
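
The percentages depend on which way round the ratio is taken, so here is a short sketch that spells out the arithmetic using the 8-core wall times from the two 5900X runs (155.97 s at 3600 and 238.13 s at 2133). The helper `relative_change` is an illustrative name.

```python
def relative_change(new, old):
    """Fractional change going from old to new (negative means a decrease)."""
    return (new - old) / old


# Memory transfer rate: 3600 -> 2133.
mem_change = relative_change(2133, 3600)
# 8-core wall time: 155.97 s (3600) -> 238.13 s (2133).
time_change = relative_change(238.13, 155.97)
# Throughput is the inverse of wall time.
throughput_change = relative_change(1 / 238.13, 1 / 155.97)

if __name__ == "__main__":
    print(f"memory speed: {mem_change:+.1%}")
    print(f"run time:     {time_change:+.1%}")
    print(f"throughput:   {throughput_change:+.1%}")
```

So a drop of about 41% in memory speed costs about 35% of the throughput, which is the same thing as the run taking about 53% longer.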

Last edited by linuxguy123; February 4, 2022 at 01:58.

Old   February 1, 2022, 14:59
Default
  #435
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 552
Rep Power: 16
Simbelmynė is on a distinguished road
I think the original benchmark was designed for v6 of OpenFOAM. There are some differences for the current versions.


Also, did you remove streamlines as instructed in post #3?

Old   February 1, 2022, 15:23
Default
  #436
Member
 
Guy
Join Date: Jun 2019
Posts: 44
Rep Power: 7
linuxguy123 is on a distinguished road
Ignore this post.

Last edited by linuxguy123; February 4, 2022 at 00:24.

Old   February 8, 2022, 19:10
Default
  #437
New Member
 
George
Join Date: Jul 2020
Location: TU Delft, The Netherlands
Posts: 18
Rep Power: 6
gpouliasis is on a distinguished road
Quote:
Originally Posted by linuxguy123 View Post
Playing around with the openFOAM benchmark. 5900X B2 Stepping. 2 x 16GB = 32 GB 3600MHz Dual Rank RAM CL18 timing.

Ran on Linux. Kernel 5.15.18-200.fc35.x86_64 Generic openFOAM.

SMP OFF.

openfoam = /usr/lib/openfoam/openfoam2112

* Using: OpenFOAM-2112 (2112) - visit www.openfoam.com
* Build: _14aeaf8d-20211220
* Arch: label=32;scalar=64
* Platform: linux64GccDPInt32Opt (mpi=sys-openmpi)

# cores Wall time (s):
------------------------
1 470.86
2 284.32
4 173.74
6 157.86
8 155.97
12 156.63

I did not expect the 5900X to run out of bandwidth at 6 cores. I expected to see a time reduction with 8 and 12 cores. Just goes to show how crucial memory bandwidth is to openFOAM. Having said this, 156 seconds is not bad for a desktop processor with 2 channels of memory. The 5900X will make up time on some of the other processors in meshing, which needs a lot less memory bandwidth.

The 5900X has the fastest 1 core time of any processor on this thread. It is twice as fast on 1 core as many of the server architectures. But it doesn't have the memory bandwidth to keep that speed when more cores are brought into the equation.

FWIW, the memory controller in Zen3, ie Ryzen 5000 parts, is different than the memory controller in previous Ryzen CPUs. I am going to do some memory bandwidth testing in the near future. Should be interesting.

This isn't going to be my full time CFD computer. I have parts for an EPYC 7601 machine on the way.

BTW, Geekbench 5 for this machine is SC: 1767 MC:15,608 https://browser.geekbench.com/v5/cpu/12519107
Great results linuxguy123, thanks for sharing. Indeed Zen3 cpus are very fast and do very well in small models.

Just for the record, yours is not the fastest single-core result in this forum; see this one: OpenFOAM benchmarks on various hardware

Although, to be honest, with lower-latency RAM you could definitely reach that result.

Cheers

Old   February 10, 2022, 03:54
Default
  #438
New Member
 
Daniel Dotson
Join Date: May 2020
Posts: 13
Rep Power: 6
Deedledot is on a distinguished road
Dual AMD EPYC 7313

OpenFOAM v2112
Ran on Fedora 35 OS

16×16 GB DDR4-3200 Dual Rank Reg ECC Memory
Supermicro H12DSi-NT6 Motherboard


Code:
# cores   Wall time (s):
------------------------
1 590.39
2 316.51
4 123.64
6 79.28
8 60.54
12 46.89
16 38.5
20 35.74
24 30.97
28 30.61
32 28.88

Old   February 11, 2022, 17:21
Default
  #439
Member
 
Guy
Join Date: Jun 2019
Posts: 44
Rep Power: 7
linuxguy123 is on a distinguished road
Quote:
Originally Posted by gpouliasis View Post
Great results linuxguy123, thanks for sharing.
Thanks. I'm happy to share. I've learned a lot from this thread.

Quote:
Indeed Zen3 cpus are very fast and do very well in small models.
If only they had 4 memory channels instead of 2!

Quote:
Just for the history, you don't have the fastest single core result in this forum check this one OpenFOAM benchmarks on various hardware

Although, if I am being honest with a lower latency RAM you can definitely arrive to that result.
I missed that run.

The 5900X should be a bit faster single core than the 5600X. He's running CL16 memory. I'm running CL18. I couldn't justify the cost of lower latency memory.

The 5900X 6 core time is better than his 5600X: ~158 versus 184. The 5900X has a better memory controller layout than the 5600X.

Today AMD announced that the Zen4/AM5 processors are going to be released in Q3 2022. DDR5 RAM, probably 5200MT/s by the looks of it. Still only 2 memory channels though.

Zen4 memory bandwidth should be ~44% better (5200 vs. 3600 MT/s). Higher IPC as well. I'm guessing the Zen4 Ryzen 9 processors will do the OpenFOAM benchmark in under 100 seconds. We'll see.
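
For reference, a rough sketch of where those bandwidth numbers come from. It assumes the usual 64-bit (8-byte) data path per memory channel and counts a dual-channel DDR5 platform as 2 x 64 bit in total; real sustained bandwidth will be lower than this theoretical peak.

```python
BYTES_PER_TRANSFER = 8  # assumes a 64-bit data bus per memory channel


def peak_bandwidth_gbs(mega_transfers_per_s, channels):
    """Theoretical peak memory bandwidth in GB/s (10^9 bytes per second)."""
    return mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


ddr4_3600 = peak_bandwidth_gbs(3600, channels=2)  # current 5900X setup
ddr5_5200 = peak_bandwidth_gbs(5200, channels=2)  # expected Zen4 setup

if __name__ == "__main__":
    print(f"DDR4-3600, 2 channels: {ddr4_3600:.1f} GB/s")
    print(f"DDR5-5200, 2 channels: {ddr5_5200:.1f} GB/s")
    print(f"gain: {ddr5_5200 / ddr4_3600 - 1:.0%}")
```

The same function with `channels=8` gives the theoretical peak for an 8-channel EPYC board, which is why the server parts keep scaling to higher core counts.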

Last edited by linuxguy123; February 12, 2022 at 19:52.

Old   February 11, 2022, 17:28
Default
  #440
Member
 
Guy
Join Date: Jun 2019
Posts: 44
Rep Power: 7
linuxguy123 is on a distinguished road
EPYC 7601 on a Supermicro H11SSL-i motherboard. 8x8GB 1R 3200MT/s RAM, running at 2666MHz. SATA spinning hard drive. openfoam-2112, from Fedora copr.

Linux 5.16.7-200.fc35.x86_64 #1 SMP PREEMPT Sun Feb 6 19:53:54 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

# cores Wall time (s):
------------------------
1 971.84
2 560.23
4 206.57
6 151.3
8 112.73
12 90.75
16 80.19
20 74.21
24 68.72
28 67
30 69.79
32 65.7

Once again the gains aren't there at higher core counts, due to memory bandwidth limits. If one wants higher core counts, one is better off using dual processors and getting twice as many memory channels at the same time.

Having said that the 7601 is much faster than my 5900X. (~66 seconds versus 156 seconds.)

My 7601 machine was a relatively inexpensive machine to throw together. Dual processor machines need a more expensive motherboard, 2 processors and twice as much RAM.

Geekbench5 results: SC: 895 MC:18,979 https://browser.geekbench.com/v5/cpu/12704656. This is the fastest GB5 result I can find for a single 7601 (32 cores).

Last edited by linuxguy123; February 12, 2022 at 19:51.
