OpenFOAM benchmarks on various hardware

Old   July 13, 2019, 19:09
Talking 2% lower meshing- and walltime just by lowering Tref and higher cachefreq.
  #201
New Member
 
Erik
Join Date: Jul 2019
Posts: 7
Rep Power: 7
I asked myself how to get more performance out of my overclocked CPU and RAM.

So I did some intensive testing on my 6600K workstation.

I used 2x8 GB 3100 MHz 16-18-18-37 DDR4 RAM with a command rate of 1T.

Some interesting correlations can be seen below.

All tests were run on Linux Mint 19.0 with OpenFOAM 7.0.

But let's show the best run first.



So, let's start at a cache frequency of 4000 MHz and a Tref RAM timing of 408.

Lower Tref values (RAM timings) and higher cache frequencies are beneficial on the Intel Skylake architecture.

The benefit of a higher cache frequency is not that big, but OK.

Finally, I tested some Tref values between 408 and 320.

As we know, lower RAM timings and higher RAM frequencies are very good for lowering the wall time, and higher CPU frequencies are very important for low meshing times. But I did not expect this.

All in all, I got about 2% lower meshing and wall times just by lowering Tref and raising the cache frequency.
flotus1 likes this.

Old   July 15, 2019, 10:53
Default
  #202
New Member
 
Erik
Join Date: Jul 2019
Posts: 7
Rep Power: 7
I also did some further testing for comparison with Skylake.

Comparison for Skylake to some older notebooks:

X200: P8600@2,4GHz DDR3@1066MHz Openfoam 6.0 ubuntu 18.10
T520: i7 2670qm@2,2GHz DDR3@1333MHz Openfoam 6.0 ubuntu 18.10
Skylake: 6600k@4,6GHz DDR4@3100MHz Openfoam 7.0 Linux Mint 19.0
T520+X200: 2 Node Cluster with Crosscable 1Gbit Ethernet
AM2: 8 Node X2 4400+ with 1Gbit Ethernet Openfoam 7.0 Linux Mint 19.0

Code:
simpleFoam, wall time (s)

 # cores  |   X200    |   T520   |  Skylake  |  T520 + X200  |   AM2
-----------------------------------------------------------------------
     1    |  2794,99  | 1242,46  |  595,08   |               |
     2    |  2099,58  |  822,6   |  344,67   |               |
     4    |           |  653,52  |  285,62   |               |
     6    |           |          |           |    625,02     |
    16    |           |          |           |               |   465
You can also see huge IPC improvements from Core 2 Duo (45nm) to Sandy Bridge (32nm) and on to Skylake (14nm) in the meshing results.
With better IMCs over the last years we also got higher possible memory speeds.

Code:
meshing, wall time (s)

 # cores  |   X200    |   T520   |  Skylake  |   AM2
-------------------------------------------------------
     1    |  3309,92  | 2244,18  |  965,1    |
     2    |  2465,63  | 1568,04  |  634,48   |
     4    |           |  982,88  |  390,06   |
    16    |           |          |           |  1184
Two cores of the 45nm Core 2 Duo meshed in about the same time as a single core of the 32nm Sandy Bridge architecture.
One overclocked Skylake core is about 3 times faster than a Core 2 Duo core and about 2 times faster than a Sandy Bridge core in the meshing part.
The older P8600 CPU turns out to be about 7 times slower in simpleFoam than the overclocked Skylake when the best run of each is compared (2099,58 s vs. 285,62 s). That's insane.
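
The ratios quoted in this post can be rechecked from the two tables. What follows is a minimal Python sketch, not anything the poster ran: the numbers are copied from the tables, the times are assumed to be seconds, and the names `to_seconds` and `ratio` as well as the dictionary layout are made up for illustration.

```python
# Wall times copied from the two tables in this post, keyed by core count.
# Comma decimals are kept as posted and converted by to_seconds().
meshing = {
    "X200": {1: "3309,92", 2: "2465,63"},
    "T520": {1: "2244,18", 2: "1568,04", 4: "982,88"},
    "Skylake": {1: "965,1", 2: "634,48", 4: "390,06"},
}
simple_foam = {
    "X200": {1: "2794,99", 2: "2099,58"},
    "Skylake": {1: "595,08", 2: "344,67", 4: "285,62"},
}


def to_seconds(text):
    """Convert a comma-decimal string such as '965,1' to a float."""
    return float(text.replace(",", "."))


def ratio(table, slow, slow_cores, fast, fast_cores):
    """How many times longer the `slow` run takes than the `fast` run."""
    return to_seconds(table[slow][slow_cores]) / to_seconds(table[fast][fast_cores])


single_core_c2d = ratio(meshing, "X200", 1, "Skylake", 1)
single_core_snb = ratio(meshing, "T520", 1, "Skylake", 1)
best_vs_best = ratio(simple_foam, "X200", 2, "Skylake", 4)

print(f"meshing, 1 core, X200 vs Skylake: {single_core_c2d:.2f}x")
print(f"meshing, 1 core, T520 vs Skylake: {single_core_snb:.2f}x")
print(f"simpleFoam, best runs, X200 vs Skylake: {best_vs_best:.2f}x")
```

With these inputs the single-core meshing ratios come out near 3.4 and 2.3. The factor of about 7 in simpleFoam only appears when the 2-core X200 run is compared with the 4-core Skylake run.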

Last edited by erik87; August 6, 2019 at 05:39. Reason: AM2 added

Old   July 29, 2019, 11:09
Default
  #203
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 551
Rep Power: 16
Ryzen 7 3700X, Single rank 2x8 GB DDR4 @ 3533 MT/s (14-15-15-35-1T)
Asus Prime X570-Pro

Solus 4.0
OpenFOAM 7

Benchmark:
Code:
# cores   Wall time (s):
------------------------
   1          583.7
   2          332.98
   4          219.99
   6          203.18
   8          202.72
Meshing:
Code:
# cores   Wall time (s):
------------------------
   1          17m 8s
   2          11m 30s
   4          7m 2s
   6          5m 14s
   8          4m 36s

Edit: With 3600 MT/s the benchmark improves
Code:
# cores   Wall time (s):
------------------------
  1   562.51
  2   312.16
  4   199.88
  6   180.35
  8   177.39
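
To put a number on "the benchmark improves", the two solver tables can be compared row by row. This is only a sketch with the values copied from the post; the helper name `percent_reduction` is an illustrative choice.

```python
# Solver wall times in seconds, copied from the two benchmark tables above.
cores = [1, 2, 4, 6, 8]
t_3533 = [583.7, 332.98, 219.99, 203.18, 202.72]   # DDR4 @ 3533 MT/s
t_3600 = [562.51, 312.16, 199.88, 180.35, 177.39]  # DDR4 @ 3600 MT/s


def percent_reduction(before, after):
    """Relative reduction in wall time, in percent."""
    return 100.0 * (before - after) / before


reductions = [percent_reduction(b, a) for b, a in zip(t_3533, t_3600)]

for n, r in zip(cores, reductions):
    print(f"{n} cores: {r:.1f}% lower wall time")
```

The reduction grows with the core count, from roughly 4% on one core to roughly 12% on eight cores.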
aparangement, hokhay and erik87 like this.

Last edited by Simbelmynė; July 29, 2019 at 20:47.

Old   September 21, 2019, 17:20
Default
  #204
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
Edit: found solution for SnappyHexMesh failing: SnappyHexMesh "Cannot open etc file"

Results with 2x Epyc 7551, 16x32GB 2Rx4 DDR4-2666 reg ECC, Openfoam-7 compiled from source along with OpenMPI; SMT disabled
OS: OpenSUSE 15.1, kernel version 4.12.14; CPU governor: performance

Solver times
Code:
# cores   Wall time (s):
------------------------
01 907.7
02 534.64
04 211.67
06 138.11
08 101.9
16 53.93
24 42.43
32 36.79
40 37.17
48 31.1
56 29.52
64 27.73
Some of these timings are not as good as they could be due to sub-optimal core binding. I might try to re-run the cases with some optimizations.
As you can see from these new components, there won't be any Epyc Rome benchmarks from me any time soon. Zero availability of proper mainboards, and substantial price cuts for Epyc Naples and DDR4-2666 kind of forced me into this upgrade path.

Meshing times for those interested:
Code:
01   24m42.449s
02   16m54.417s
04   9m28.228s
06   6m35.526s
08   5m17.711s
16   3m36.410s
24   3m9.786s
32   3m26.846s
40   2m53.971s
48   2m42.073s
56   2m40.621s
64   2m34.309s
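
The meshing times are posted in a minutes-and-seconds format, so they need converting before they can be compared with the solver times. A minimal sketch, assuming every entry has the form `<minutes>m<seconds>s`; the function name `parse_minutes_seconds` is hypothetical.

```python
import re

# Meshing times copied from the table above: core count -> time string.
meshing_times = {
    1: "24m42.449s", 2: "16m54.417s", 4: "9m28.228s", 6: "6m35.526s",
    8: "5m17.711s", 16: "3m36.410s", 24: "3m9.786s", 32: "3m26.846s",
    40: "2m53.971s", 48: "2m42.073s", 56: "2m40.621s", 64: "2m34.309s",
}


def parse_minutes_seconds(text):
    """Convert a string like '24m42.449s' to seconds as a float."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


seconds = {n: parse_minutes_seconds(t) for n, t in meshing_times.items()}
# Speedup relative to the single-core run.
speedup = {n: seconds[1] / s for n, s in seconds.items()}

for n in sorted(seconds):
    print(f"{n:2d} cores: {seconds[n]:8.3f} s, speedup {speedup[n]:.2f}")
```

On these numbers meshing scales far less than the solver: the speedup on 64 cores is a little under 10, and the 32-core run is slower than the 24-core run.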
aparangement and linuxguy123 like this.

Last edited by flotus1; September 21, 2019 at 20:34.

Old   September 22, 2019, 05:42
Default
  #205
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 551
Rep Power: 16
Hey,

Does OpenSUSE use lots of kernel backports similar to CentOS? I guess they do, but if not then 4.12 might be a tad too old.

Anyways, very impressive results.

Old   September 22, 2019, 18:26
Default
  #206
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
To be honest, I had to look up what a backport is first.
Too old for what, or how would I even begin to check whether the exact kernel of my distro is too old for my hardware? Officially, 4.12 is recent enough for Naples.

Old   September 29, 2019, 09:18
Default
  #207
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 551
Rep Power: 16
Quote:
Originally Posted by flotus1 View Post
To be honest, I had to look up what a backport is first.
Too old for what, or how would I even begin to check whether the exact kernel of my distro is too old for my hardware? Officially, 4.12 is recent enough for Naples.

There are a few improvements with each kernel version that may or may not have an impact on performance. Sometimes there are regressions as well.


Probably this is no big deal, but if you wish to get the last few % out of your system then it may be worth checking out.


The easiest way to check is to just install a newer kernel. OpenSUSE can probably do this through YaST, and if you break your system you should be able to return to the stable kernel.
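
Before and after trying a different kernel, it helps to record which one a benchmark actually ran on. A small sketch using Python's standard `platform` module; the helper `kernel_major_minor` and the example release string are hypothetical. Note that a distribution kernel with backports can carry newer patches than its version number suggests, so the number alone does not settle the question.

```python
import platform
import re


def kernel_major_minor(release):
    """Return (major, minor) parsed from a kernel release string, or None."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Hypothetical release string, shaped like a 4.12 distribution kernel.
example = "4.12.14-lp151.28-default"
print(example, "->", kernel_major_minor(example))

# The kernel release reported by the machine this script runs on.
running = platform.release()
print(running, "->", kernel_major_minor(running))
```

Logging this next to each benchmark result makes it easier to tell whether a change in timings coincides with a kernel change.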

Old   September 29, 2019, 11:03
Default
  #208
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
I have done my fair share of experiments with installing newer kernel versions. My takeaway is that I won't do it on a system I intend to actually use.
I will leave that to people who know what they are doing.

As a side-note: I tried a few different compiler optimizations. Seems like there are no significant gains here, barely above the margin of error.

Old   October 23, 2019, 18:35
Default
  #209
ctd
New Member
 
anonymous
Join Date: Oct 2019
Posts: 4
Rep Power: 7
2X EPYC 7302, 16x16GB 2Rx8 DDR4-3200 ECC, OpenFOAM v5, Ubuntu 18.04.3

Code:
# cores   Wall time (s):
------------------------
1 723.64
2 328.11
4 164.21
8 81.4
12 55.2
16 41.1
20 37.53
24 34.27
28 29.99
32 26.89

Old   October 23, 2019, 18:39
Default
  #210
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
Hot damn!

Old   October 24, 2019, 04:22
Default
  #211
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 551
Rep Power: 16
So it seems that the new architecture improves performance by more than the increased memory bandwidth alone would explain.


EPYC 7301: 36.8 s @ 2666 MT/s



EPYC 7302: 26.9 s @ 3200 MT/s


That is impressive!
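
The comparison can be made explicit with a little arithmetic. This sketch uses only the two figures quoted in this post and assumes, as a deliberate simplification, that a purely bandwidth-bound run would speed up in proportion to the memory transfer rate.

```python
# Figures quoted in the post above.
t_7301, rate_7301 = 36.8, 2666  # wall time in seconds, memory speed in MT/s
t_7302, rate_7302 = 26.9, 3200  # wall time in seconds, memory speed in MT/s

# Ratio of memory transfer rates (assumed upper bound for a
# bandwidth-only improvement; this is a simplification).
bandwidth_gain = rate_7302 / rate_7301

# Measured ratio of wall times.
speedup = t_7301 / t_7302

# Part of the speedup that the transfer rate alone does not cover.
remaining_gain = speedup / bandwidth_gain

print(f"memory transfer rate: {bandwidth_gain:.2f}x")
print(f"measured speedup:     {speedup:.2f}x")
print(f"not explained by it:  {remaining_gain:.2f}x")
```

Under that assumption the memory speed accounts for about 1.20x of the measured 1.37x, leaving roughly 1.14x for other changes.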



From Ryzen 3000 it can be seen that (all) the timings of the memory can play a huge role. Perhaps the XMP profiles and motherboard auto timings are better in tune with this release?


Edit: Can you also test this with OpenFOAM v7? Most of the benchmarks here are with v6 or v7.

Old   October 24, 2019, 23:28
Default
  #212
ctd
New Member
 
anonymous
Join Date: Oct 2019
Posts: 4
Rep Power: 7
Sure, below are the results with v7.

2X EPYC 7302, 16x16GB 2Rx8 DDR4-3200 ECC, Ubuntu 18.04.3, OF v7

Code:
# cores   Wall time (s):
------------------------
1 711.73
2 345.65
4 164.97
8 84.15
12 55.9
16 47.45
20 38.14
24 34.21
28 30.51
32 26.89
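
One way to read a table like this is parallel efficiency, i.e. the single-core time divided by (cores x wall time). A short sketch with the v7 numbers copied from above; the function name `efficiency` is an illustrative choice.

```python
# OpenFOAM v7 solver wall times in seconds, copied from the table above.
cores = [1, 2, 4, 8, 12, 16, 20, 24, 28, 32]
wall = [711.73, 345.65, 164.97, 84.15, 55.9, 47.45, 38.14, 34.21, 30.51, 26.89]


def efficiency(t_single, t_parallel, n):
    """Parallel efficiency: 1.0 means ideal linear scaling from one core."""
    return t_single / (n * t_parallel)


eff = {n: efficiency(wall[0], t, n) for n, t in zip(cores, wall)}

for n in cores:
    print(f"{n:2d} cores: efficiency {eff[n]:.2f}")
```

Efficiency stays slightly above 1 up to 12 cores, which means the single-core run is slower than ideal scaling would predict, and it is still a little above 0.8 on all 32 cores.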
Simbelmynė likes this.

Old   October 25, 2019, 14:41
Default
  #213
New Member
 
Join Date: Oct 2019
Posts: 3
Rep Power: 7
Quote:
Originally Posted by edomalley1 View Post
Yes, I modified the script. The geometry section in SHM to the new format, location of the geometry in the allmesh files, of course the run.sh file to include runs up to 32 cores... also removed #include "streamlines", etc. If I don't do that it can't find the geometry and runs through the whole thing in like 8 seconds!
I have the same problem: the script stops after a few seconds. I am trying to run this in Ubuntu 18.04 through Windows 10 (WSL) with OpenFOAM v7. Could someone elaborate on the necessary changes?

Old   October 25, 2019, 14:44
Default
  #214
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
With v7, I had to do the following modifications to the script in the original post:
SnappyHexMesh "Cannot open etc file"
and
OpenFOAM benchmarks on various hardware
Lazyjones likes this.

Old   October 25, 2019, 15:02
Default
  #215
New Member
 
Join Date: Oct 2019
Posts: 3
Rep Power: 7
Quote:
Originally Posted by flotus1 View Post
With v7, I had to do the following modifications to the script...
Thanks, now it works.
Here are my results for a 3900x running 2x16 3466 14-15-14-28 (with gear down mode and tight subtimings):

Meshing:
Code:
# cores   Wall time   Speedup
1   18m6.798s
2   11m30.075s  1.57
4   6m56.454s   2.61
6   4m53.322s   3.71
8   4m8.597s    4.37
12  4m5.024s    4.44
16  4m0.308s    4.52
20  3m43.419s   4.86
24  3m52.028s   4.60
Code:
# cores   Wall time (s)   Speedup
---------------------------------
1  865.21  
2  363.37   2.38
4  223.95   3.86 
6  190.46   4.54
8  169.12   5.12
12 160.76   5.38
16 160.24   5.40
20 158.12   5.47
24 157.46   5.49
I will run a complete test tomorrow to look at core scaling.
Edit: Now with scaling.

Last edited by Lazyjones; October 26, 2019 at 06:26.

Old   October 25, 2019, 23:01
Default
  #216
New Member
 
Michael
Join Date: Feb 2016
Posts: 1
Rep Power: 0
Quote:
Originally Posted by ctd View Post
2X EPYC 7302, 16x16GB 2Rx8 DDR4-3200 ECC, OpenFOAM v5, Ubuntu 18.04.3

Code:
# cores   Wall time (s):
------------------------
1 723.64
2 328.11
4 164.21
8 81.4
12 55.2
16 41.1
20 37.53
24 34.27
28 29.99
32 26.89

Wow, very impressive results! Could you please tell us which motherboard you used? It's quite hard to find one that officially supports Rome. I've read on Supermicro's site that some older motherboards do support Rome processors (at the full 3200 MT/s memory speed), but the motherboards have to be "revision 2". I don't know what that means; maybe it's just an updated BIOS...

Regards

Old   October 26, 2019, 03:22
Default
  #217
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
I cannot tell you where to buy compatible motherboards, since I had the same problem. But I can answer the rest of your question:

On numerous occasions, AMD reiterated their claim that all SP3 Platforms will be able to get an upgrade from Naples to Rome. A promise they could not keep.
The alleged reason (German site): https://www.planet3dnow.de/cms/49742...en-bios-chips/
Most retail SP3 motherboards shipped with a 16MB ROM. The BIOS versions for Rome require 32MB ROMs. Hence many boards of revision 1.x will never get support for Rome. There will not be a BIOS update; the hardware is incompatible.
Board revisions 2.x solve this, mainly with a bigger ROM chip. So it is a new hardware revision, not just a software update. Of course, these new revisions of older boards still lack support for some features of Epyc Rome, for example PCIe 4.0. When in the market for one of these boards, contact the retailer beforehand and make sure they ship rev. 2.x.
Actual new versions of retail boards with full feature support for Rome were announced, but have not yet been spotted in the wild.
The lack of availability is a recurring theme with AMD Epyc, unfortunately.

Old   October 26, 2019, 07:05
Default
  #218
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 551
Rep Power: 16
Quote:
Originally Posted by Lazyjones View Post
Thanks, now it works.
Here are my results for a 3900x running 2x16 3466 14-15-14-28 (with gear down mode and tight subtimings):
Code:
# cores   Wall time (s):
------------------------
1  865.21  
2  363.37   2.38
4  223.95   3.86 
6  190.46   4.54
8  169.12   5.12
12 160.76   5.38
16 160.24   5.40
20 158.12   5.47
24 157.46   5.49
I will run a complete test tomorrow to look at core scaling.
Edit: Now with scaling.

Nice results, especially considering you are on WSL in Win 10. Did you use the DRAM Calculator for your timings?

Old   October 26, 2019, 07:52
Default
  #219
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
The single-core result looks off. Especially compared to some of the other Zen and Zen2 results already posted. I would not care too much about it, but it messes up the scaling.
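
One way to see how much the single-core run distorts the scaling is to recompute the speedups against the 2-core run instead. A sketch using the 3900X solver times posted earlier in the thread; the function name `relative_speedup` is hypothetical.

```python
# 3900X solver wall times in seconds, as posted earlier in the thread.
cores = [1, 2, 4, 6, 8, 12, 16, 20, 24]
wall = [865.21, 363.37, 223.95, 190.46, 169.12, 160.76, 160.24, 158.12, 157.46]
times = dict(zip(cores, wall))


def relative_speedup(times, baseline):
    """Speedup of every run with at least `baseline` cores, relative to
    the run that used exactly `baseline` cores."""
    t_base = times[baseline]
    return {n: t_base / t for n, t in times.items() if n >= baseline}


vs_one = relative_speedup(times, 1)
vs_two = relative_speedup(times, 2)

print(f"2 cores vs 1 core:  {vs_one[2]:.2f}x (ideal 2.00x)")
print(f"4 cores vs 2 cores: {vs_two[4]:.2f}x (ideal 2.00x)")
print(f"8 cores vs 2 cores: {vs_two[8]:.2f}x (ideal 4.00x)")
```

Going from one to two cores gives about 2.4x, which is more than ideal, while doubling again from two to four cores gives about 1.6x. That pattern is consistent with a single-core time that is too high.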

Old   October 26, 2019, 08:13
Default
  #220
New Member
 
Join Date: Oct 2019
Posts: 3
Rep Power: 7
Quote:
Originally Posted by Simbelmynė View Post
Nice results, especially considering you are on WSL in Win 10. Did you use the DRAM Calculator for your timings?
Indeed, I used the "fast" preset for Samsung B-Die 3466 DR and additionally lowered the SCL timings from 3 to 2.

Quote:
Originally Posted by flotus1 View Post
The single-core result looks off. Especially compared to some of the other Zen and Zen2 results already posted. I would not care too much about it, but it messes up the scaling.
I have been wondering about that as well, especially compared to the EPYC 7302, which has a significantly lower frequency. The Ryzen 3000 series does have some boost clock issues: my CPU does not hit the advertised boost clock of 4.6 GHz and instead only runs at ~4.3 GHz during 1-core loads. I am also not on the latest BIOS, which should give around 5% more clock speed. Still, even at 4.3 GHz it should be faster than the 7302 at 3.3 GHz. It might be related to WSL.
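
A rough back-of-the-envelope check of the clock-speed argument, as a sketch only. It takes the single-core times posted in the thread (865.21 s for the 3900X, 711.73 s for the EPYC 7302 with OpenFOAM v7) and the clock speeds mentioned in this post, and it assumes that each run held that clock for its whole duration and that wall time scales inversely with clock speed. Both are simplifying assumptions.

```python
# Single-core solver wall times in seconds, as posted in the thread.
t_3900x = 865.21   # Ryzen 3900X under WSL
t_7302 = 711.73    # EPYC 7302, OpenFOAM v7 run

# Clock speeds in GHz as stated in the post (assumed constant over the run).
f_3900x = 4.3
f_7302 = 3.3

# Wall time multiplied by clock speed is proportional to the cycles spent.
cycle_ratio = (t_3900x * f_3900x) / (t_7302 * f_7302)

# Estimated 3900X time with the ~5% higher clock expected from a newer BIOS,
# assuming the wall time scales inversely with the clock.
t_with_bios = t_3900x / 1.05

print(f"3900X spends {cycle_ratio:.2f}x the cycles of the 7302")
print(f"estimated 3900X time with +5% clock: {t_with_bios:.1f} s")
```

Under these assumptions the 3900X run uses roughly 1.6x the clock cycles of the EPYC run, and a 5% higher boost clock would still leave it at about 824 s. So the missing boost clock alone does not account for the gap.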