CFD Online (www.cfd-online.com) > Forums > General Forums > Hardware

OpenFOAM benchmarks on various hardware

Old   March 8, 2018, 01:57
Default
  #21
Member
 
Hrushikesh Khadamkar
Join Date: Jul 2010
Location: Mumbai
Posts: 68
Rep Power: 16
Hrushi is on a distinguished road
2 x Intel Xeon Gold 6136, 12 * 16 GB DDR4 2666MHz, Ubuntu 16.04 LTS,
Linux OF-7820-Tower 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

OpenFOAM-5.x

I have commented out the following lines in controlDict.
Code:
#include streamlines
#include wallBoundedStreamlines
Here are the results.

Code:
# cores   Wall time (s):
------------------------
1             874.54
2             463.34
4             205.23
6             137.95
8             106.04
12             74.63
16             61.09
20             53.26
24             49.17
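As a side note, the speedup and parallel efficiency implied by a table like the one above take only a few lines to compute. This is just a sketch; the wall times are copied from the table, everything else is illustrative:

```python
# Speedup and parallel efficiency from the wall times reported above.
# speedup(n) = t(1) / t(n); efficiency(n) = speedup(n) / n
times = {
    1: 874.54, 2: 463.34, 4: 205.23, 6: 137.95,
    8: 106.04, 12: 74.63, 16: 61.09, 20: 53.26, 24: 49.17,
}

def speedup(n):
    return times[1] / times[n]

def efficiency(n):
    return speedup(n) / n

for n in sorted(times):
    print(f"{n:2d} cores: speedup {speedup(n):5.2f}, efficiency {efficiency(n):4.0%}")
```

The efficiency column makes the memory-bandwidth ceiling easy to spot: it drops steadily as more cores contend for the same channels.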
Hrushikesh
flotus1 and tin_tad like this.
Hrushi is offline   Reply With Quote

Old   March 8, 2018, 07:04
Default
  #22
Senior Member
 
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 539
Rep Power: 20
JBeilke is on a distinguished road
Quote:
Originally Posted by havref View Post
Thank you for starting this thread. I got my hands on a couple of Epyc 7601 processors this week, so figured I'd do the same tests on it for comparison. Will post results with a dual Epyc 7351 when our server arrives in a couple of weeks and a 2 x dual Epyc 7351 when I've had the time to set them up with infiniband.

2x Epyc 7601, 16x 8GB DDR4 2666MHz, 1TB SSD, running OpenFOAM 5.0 on Ubuntu 16.04.
Code:
# Cores    Wall time [s]    Speedup
------------------------------------
  1          971.64            1
  2          577.18            1.7
  4          234.01            4.2
  6          169.8             5.7
  8          132.41            7.3
 12           81.52           11.9
 16           59.65           16.3
 20           62.56           15.5
 24           54.39           17.9
 28           45.92           21.2
 32           43.42           22.4
 36           42.83           22.7
 48           40.5            24.0
 64           35              27.8
I removed streamlines and wallBoundedStreamlines from the controlDict. The rest of the case is identical to yours. Let me know if you want me to fill in the gaps between 36 and 64 cores.
I'm not sure how this compares to the 2x Epyc 7301 from Alex in post #2. The 7601 runs at a higher clock speed and uses faster memory, but is not really faster than the 7301.

Does it come from the use of different Linux kernels? 4.14-14 vs 4.4?
JBeilke is offline   Reply With Quote

Old   March 8, 2018, 08:12
Default
  #23
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura about
We used different kernel versions and different versions of OpenFOAM, so there is that...
But to be honest, the performance difference between the two processors is about what I would have expected. The 7601 is slightly faster single-core thanks to its higher turbo clock speed, but the all-core turbo clock speed (and I assume also the clock speed for slightly fewer active cores) is identical: https://www.amd.com/system/files/201...Data-Sheet.pdf
And I am pretty sure those 16x8GB DDR4-2666 are single-rank; the memory I used is dual-rank. I still cannot state with absolute certainty that dual-rank memory is better for Epyc even when clocked slightly slower, but all evidence suggests so. FYI, Supermicro is now listing some dual-rank DDR4-2666 memory modules in their QVL: http://www.supermicro.com/support/re...=0&reg=0&fbd=0
If I had to buy now, I would get Samsung DDR4-2666 dual-rank, try to run it at DDR4-2666 with the most recent BIOS, and dial it down to 2400 if it does not POST.
flotus1 is offline   Reply With Quote

Old   March 8, 2018, 09:32
Default
  #24
Senior Member
 
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 539
Rep Power: 20
JBeilke is on a distinguished road
Quote:
Originally Posted by flotus1 View Post
If I had to buy now, I would get Samsung DDR4-2666 dual-rank, try to run it at DDR4-2666 with the most recent BIOS, and dial it down to 2400 if it does not POST.
Hopefully the last question :-)

To get 128GB. Would you take 16x8GB or 8x16GB?
JBeilke is offline   Reply With Quote

Old   March 8, 2018, 09:41
Default
  #25
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura about
For dual-socket Epyc: 16x8GB. You need to populate each of the 16 memory channels. Stuff like memory speed and DIMM organization has much less influence than that. With dual-rank vs single-rank and DDR4-2666 vs 2400, we are talking about differences in the order of 5%. Populating only half of the memory channels available effectively cuts parallel performance in half.
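To illustrate the last point, here is a toy model of a bandwidth-bound solver: wall time is limited either by compute (which scales with cores) or by memory bandwidth (which scales with populated channels). The constants are made up for illustration, not measurements:

```python
# Toy model of a bandwidth-bound solver: wall time is the larger of the
# compute time (scales with cores) and the memory-bandwidth floor
# (scales with the number of populated channels). Numbers are illustrative.
def wall_time(cores, channels, compute_time=960.0, bw_time_per_channel=640.0):
    compute = compute_time / cores              # perfect compute scaling
    bandwidth = bw_time_per_channel / channels  # bandwidth floor
    return max(compute, bandwidth)

full = wall_time(32, channels=16)   # all 16 channels populated
half = wall_time(32, channels=8)    # only half populated
print(full, half, half / full)      # halving channels doubles the floor
```

Once the core count is high enough that the bandwidth term dominates, removing half the channels doubles the wall time, which is exactly the "cuts parallel performance in half" effect described above.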
flotus1 is offline   Reply With Quote

Old   March 9, 2018, 07:25
Default
  #26
Member
 
Jógvan
Join Date: Feb 2014
Posts: 32
Rep Power: 12
Jeggi is on a distinguished road
To get some more budget systems on the list, I have run the benchmark using my 5820K at stock settings. Keep in mind that this system is currently not optimized for CFD, as the RAM is set to a low frequency and I am using Windows. If needed, I can run the test again with a mild overclock on the CPU and the RAM set to 3000 MHz.

5820K, 32 (4x8) GB 2133 MHz RAM, Windows 10 (HT on), blueCFD-Core 2017-2 (OpenFOAM 5.0).
Code:
# threads   Wall time (s)  Speedup:
--------------------------------------
1             1164.35          1
2              645.95          1.8
4              363.71          3.2
6              311.11          3.7
8              304.52          3.8
12             273.59          4.3
EDIT: Overclocked results.
5820K@4Ghz, 32 (4x8) GB 2800 MHz RAM, Windows 10 (HT on), blueCFD-Core 2017-2 (OpenFOAM 5.0).
Code:
# threads   Wall time (s)  Speedup:
--------------------------------------
6              267.14         
12             231.34

Last edited by Jeggi; March 12, 2018 at 10:09.
Jeggi is offline   Reply With Quote

Old   March 9, 2018, 10:56
Default
  #27
Member
 
Giovanni Medici
Join Date: Mar 2014
Posts: 48
Rep Power: 12
giovanni.medici is on a distinguished road
I ran the benchmark on the computer I'm using, but I'm quite puzzled by the results; I was expecting better performance, in particular from parallelization (or am I wrong?).

I can't figure out whether the problem is due to an unbalanced memory configuration, a slow memory clock, some virtual machine (Docker) settings, or something else. Any help, info, or clue to help me track it down would be greatly appreciated.

2x Intel Xeon E5-2630 v3, 64 (2x32) GB DDR4-2133 RAM, OpenFOAM v1712 running through a Docker VM on a Windows Server 2012 R2 machine.

I commented out the controlDict lines:
Code:
#include streamlines
#include wallBoundedStreamlines
The results:
Code:
# cores   Wall time (s)   Speedup
---------------------------------
 1         1175.71          1.00
 2          789.62          1.49
 4          654.19          1.80
 6          673.60          1.75
 8          663.71          1.77
12          662.91          1.77
16          670.90          1.75
20          662.62          1.77
24          671.49          1.75
32          772.40          1.52
It looks like the speedup flattens beyond 4 cores.

Other settings:
  • HyperThreading is on (will try to switch it off).
  • VM runs on 32 processors, (i.e. all available threads).
  • the command lscpu confirms the setup of the virtual machine is 32 processors.
  • VM running with as much RAM as possible.
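One way to double-check what topology the Docker VM actually exposes is to parse the lscpu output rather than just reading the CPU count. The sketch below uses a made-up sample (not this machine's output); the field names follow lscpu's standard format:

```python
# Parse socket / NUMA info out of `lscpu` output to verify what the VM
# exposes. The sample text below is illustrative, not the actual machine.
def parse_lscpu(text):
    info = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        info[key.strip()] = value.strip()
    return info

sample = """\
CPU(s):                32
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             2
NUMA node(s):          2
"""
info = parse_lscpu(sample)
physical = int(info["Core(s) per socket"]) * int(info["Socket(s)"])
print(physical)  # 16 physical cores behind the 32 logical CPUs
```

If the VM reports one socket and one NUMA node while the host has two, the guest scheduler cannot place ranks sensibly, which could contribute to scaling flattening out this early.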

Thanks !
etinavid likes this.
giovanni.medici is offline   Reply With Quote

Old   March 9, 2018, 11:03
Default
  #28
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura about
Quote:
64 (2x32) Gb RAM
Assuming this means you only have one 32GB DIMM per CPU, there is your problem.
Each of your CPUs needs 4 identical DIMMs populated to maximize parallel CFD performance.
Edit: Judging by how bad scaling is already for two cores, it might even be possible that both DIMMs are on one socket. Can you confirm this?
giovanni.medici and tin_tad like this.
flotus1 is offline   Reply With Quote

Old   March 9, 2018, 11:47
Default OpenFOAM benchmarks on various hardware
  #29
Member
 
Giovanni Medici
Join Date: Mar 2014
Posts: 48
Rep Power: 12
giovanni.medici is on a distinguished road
Quote:
Originally Posted by flotus1 View Post
Assuming this means you only have one 32GB DIMM per CPU, there is your problem.
Each of your CPUs needs 4 identical DIMMs populated to maximize parallel CFD performance.
Edit: Judging by how bad scaling is already for two cores, it might even be possible that both DIMMs are on one socket. Can you confirm this?

Thank you flotus1.

Unfortunately, right now I don't have direct (physical) access to the rack (it will be the first thing I check next Monday). Nevertheless, according to the CPU-Z utility, the two DIMMs are in slots #1 and #9, which in principle could match A1 and B1 (the Dell motherboard slot numbering), the configuration the manufacturer suggests. Anyway, I will check it directly.
giovanni.medici is offline   Reply With Quote

Old   March 12, 2018, 03:57
Default
  #30
New Member
 
Join Date: Mar 2015
Posts: 13
Rep Power: 11
CFDBuddha is on a distinguished road
I was just about to buy an i7-7800X, and my plan was to upgrade later (after 2-3 years) to an i9-7940X or i9-7960X (or maybe the upcoming Cascade Lake-X), but then I saw this thread. It seems, according to all these tests, that the i7-7820X is some kind of maximum (regarding CFD) for the X299 platform: it performs the same as the i9-7940X. So a later upgrade is not possible?

Also, according to this Euler3D benchmark:

https://pctuning.tyden.cz/hardware/p...-testu?start=6

it seems that the i7-8700K is better than the i7-7800X despite the fact that the i7-8700K has just 2 memory channels. One more source confirming this (Rodinia CFD solver):

http://www.tomshardware.com/reviews/...u,5252-11.html

In the end, if we assume that X299 is not very upgradeable, what would be the best buy in that price range?

Thanks.
CFDBuddha is offline   Reply With Quote

Old   March 12, 2018, 09:09
Default
  #31
New Member
 
Join Date: Mar 2015
Posts: 13
Rep Power: 11
CFDBuddha is on a distinguished road
Jeggi, it would be very interesting if you could repeat the test on i7-5820K with slightly overclocked CPU and higher frequency RAM.

Thanks.
CFDBuddha is offline   Reply With Quote

Old   March 12, 2018, 09:14
Default
  #32
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura about
FYI, the Xeon E5-1650 v3 that was already posted is the same chip as the i7-5820K.
flotus1 is offline   Reply With Quote

Old   March 12, 2018, 10:05
Default
  #33
Member
 
Knut Erik T. Giljarhus
Join Date: Mar 2009
Location: Norway
Posts: 35
Rep Power: 22
eric will become famous soon enough
Thanks for the additional results, I've added them to the plot here:
https://openfoamwiki.net/index.php/Benchmarks
I omitted the 5820K since, as flotus pointed out, it is the same as the 1650 v3, and also the E5-2630 v3 due to the strange results.
flotus1 likes this.
eric is offline   Reply With Quote

Old   March 12, 2018, 10:11
Default
  #34
Member
 
Jógvan
Join Date: Feb 2014
Posts: 32
Rep Power: 12
Jeggi is on a distinguished road
Quote:
Originally Posted by CFDBuddha View Post
Jeggi, it would be very interesting if you could repeat the test on i7-5820K with slightly overclocked CPU and higher frequency RAM.

Thanks.
I have added the results to my first post.

Quote:
FYI, the Xeon E5-1650 v3 that was already posted is the same chip as the i7-5820K.
I was not aware of that, but now I have something to compare my setup against. Thanks!
Jeggi is offline   Reply With Quote

Old   March 12, 2018, 10:49
Default
  #35
New Member
 
Join Date: Mar 2015
Posts: 13
Rep Power: 11
CFDBuddha is on a distinguished road
Jeggi, thanks for this! It is about a 16% gain in performance with higher clocks. Quad-memory-channel processors definitely outperform the Coffee Lake i7-8700K, and it could be concluded, as flotus1 pointed out, that the Euler3D benchmark is not a good proxy for real-world CFD applications.
CFDBuddha is offline   Reply With Quote

Old   March 13, 2018, 13:15
Default
  #36
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 552
Rep Power: 16
Simbelmynė is on a distinguished road
Quote:
Originally Posted by CFDBuddha View Post
Jeggi, thanks for this! It is about a 16% gain in performance with higher clocks. Quad-memory-channel processors definitely outperform the Coffee Lake i7-8700K, and it could be concluded, as flotus1 pointed out, that the Euler3D benchmark is not a good proxy for real-world CFD applications.
Not sure I agree with that. It depends on your application. A standard Fluent license only admits 4 processes. If you have such a license, then the 8700K outperforms the 5820K by approx. 35% (assuming similar scaling as with the OpenFOAM benchmark). Sure, this thread is about OpenFOAM, but anyway.

Comparing the overclocked 5820K using 12 threads with the 8700K (stock speed) using 6 threads, the difference is about 7% in favor of the 5820K. With a similar overclock on the 8700K, I'd say it is a wash in terms of performance.

If you have an existing 5820K, then the 8700K is not an upgrade if you are only running OpenFOAM. However, if you run on a maximum of 4 cores and also wish to enjoy much faster pre- and post-processing, then the 8700K is a solid upgrade.
Simbelmynė is offline   Reply With Quote

Old   March 14, 2018, 18:04
Default
  #37
New Member
 
Join Date: Mar 2015
Posts: 13
Rep Power: 11
CFDBuddha is on a distinguished road
You forget that the 5820K is tested here with slower memory (2800 MHz in OC mode) than the 8700K (3200 MHz), so if you want to take all factors into account you should overclock the 8700K and LOWER its memory frequency; then, I believe, there would be a slight advantage for the 5820K.
CFDBuddha is offline   Reply With Quote

Old   March 15, 2018, 02:55
Default
  #38
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 552
Rep Power: 16
Simbelmynė is on a distinguished road
There are a number of variables that can be changed and they may or may not change performance in a significant manner.

The 5820K is a reasonably good CPU for CFD cases where 6 cores can be used, and to be fair, it can easily reach a 4.5 GHz overclock with decent cooling (more than 35% above stock). If you get a good price buying it used, then by all means go for it. If the price is similar to or higher than the 8700K, then it is a different story. Without having done the benchmarks, I am quite certain that the 8700K will outperform the 5820K substantially when it comes to pre- and post-processing, as well as in cases where 4 or fewer cores are used, regardless of the overclocking state of the 5820K. At stock speeds this is certainly true.

A better choice is a dual Xeon from the same Haswell-E generation; that would be superior for most cases except single- or dual-threaded simulations and pre-/post-processing. It depends on what price you can get for the workstation, and whether you accept that there is probably no warranty on the parts.

Cheers!
Simbelmynė is offline   Reply With Quote

Old   March 19, 2018, 12:02
Default
  #39
New Member
 
Join Date: Jan 2018
Posts: 7
Rep Power: 8
The_Sle is on a distinguished road
Quote:
Originally Posted by CFDBuddha View Post
I was just about to buy an i7-7800X, and my plan was to upgrade later (after 2-3 years) to an i9-7940X or i9-7960X (or maybe the upcoming Cascade Lake-X), but then I saw this thread. It seems, according to all these tests, that the i7-7820X is some kind of maximum (regarding CFD) for the X299 platform: it performs the same as the i9-7940X. So a later upgrade is not possible?

Also, according to this Euler3D benchmark:

https://pctuning.tyden.cz/hardware/p...-testu?start=6

it seems that the i7-8700K is better than the i7-7800X despite the fact that the i7-8700K has just 2 memory channels. One more source confirming this (Rodinia CFD solver):

http://www.tomshardware.com/reviews/...u,5252-11.html

In the end, if we assume that X299 is not very upgradeable, what would be the best buy in that price range?

Thanks.
The enthusiast X299 and X399 platforms can give even more performance if you want to dive down the rabbit hole of memory overclocking. The old rule of 3 processor cores per memory channel seems to apply strongly here though, so CPUs with over 12 cores are always memory-bottlenecked for sure. The higher-core-count CPUs will be faster than my 7820X if overclocked similarly. But compared to Epyc or Xeon they are always slower, though much cheaper of course.

My latest results, after tweaking memory timings for too many hours:

Code:
# cores   Wall time (s):
------------------------
 1          646.88
 2          328.93
 4          184.32
 6          149.3
 8          139.18
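The 3-cores-per-memory-channel rule of thumb mentioned above is easy to tabulate. This is a rough heuristic, not a law, and the channel counts below are common configurations given as examples:

```python
# Rough "3 cores per memory channel" heuristic from the post above.
def useful_cores(memory_channels, cores_per_channel=3):
    """Estimated core count beyond which memory bandwidth dominates."""
    return memory_channels * cores_per_channel

platforms = {
    "i7-8700K (dual channel)": 2,
    "i7-7820X (quad channel)": 4,
    "Epyc 7601 (8 channels/socket)": 8,
    "2x Epyc (16 channels)": 16,
}
for name, channels in platforms.items():
    print(f"{name}: ~{useful_cores(channels)} useful cores")
```

For a quad-channel X299 chip this lands at roughly 12 cores, consistent with the results above flattening out by 8 cores on the 7820X.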
etinavid and erik87 like this.
The_Sle is offline   Reply With Quote

Old   March 22, 2018, 11:01
Default Epyc 7351 benchmark results
  #40
Member
 
Johan Roenby
Join Date: May 2011
Location: Denmark
Posts: 93
Rep Power: 21
roenby will become famous soon enough
Just tried this benchmark on my newly built Epyc 7351 based computer.

Code:
# cores   Wall time (s)
------------------------
 1          1010.86
 2           584.25
 4           248.39
 6           176.9
 8           134.49
12           114
16           109.73

Configuration:
- Supermicro H11DSi motherboard - Socket SP3 - DDR4 RAM - Extended ATX
- 1 x AMD EPYC 7351 / 2.4 GHz CPU
- 8 x Samsung 8GB DDR4 2666MHz ECC Reg. modules (M393A1K43BB1-CTD)
- Ubuntu 16.04 LTS
- OpenFOAM 5.x
- No overclocking
- Commented out streamlines and wallBoundedStreamlines in the controlDict

This is quite disappointing compared to flotus1's Epyc 7301 results reported above (66 s on 16 cores compared to my 110 s).

Any educated guesses as to what might be the cause of this?

Have I maybe been too cheap on the RAM with only 8GB per slot?

Best,
Johan
tin_tad likes this.
roenby is offline   Reply With Quote
