OpenFOAM benchmarks on various hardware

Old   December 14, 2018, 02:38
Default
  #161
Member
 
Geir Karlsen
Join Date: Nov 2013
Location: Norway
Posts: 59
Rep Power: 14
gkarlsen is on a distinguished road
Quote:
Originally Posted by Rec View Post
I have AMD Ryzen 7 1800X 8-core, @ 3.6 GHz, 2 X 16GB Samsung DDR4, SSD M.2 950 EVO 256 Gb
I have installed UBUNTU 16.04 LTS and OpenFOAM 5.0.
When I run the test I get the following result:

# cores Wall time (s):
------------------------
1 893.2
2
4
6
8

I see the following in the log at OpenFOAM/bench_template/run_2/log.simpleFoam.
I also tried this on Ubuntu 18.04 and CentOS 7.6. Can you help me?

Code:
Starting time loop

streamLine streamLines:
    automatic track length specified through number of sub cycles : 5

[1] 
[1] 
[1] --> FOAM FATAL ERROR: 
[1] Attempt to return primitive entry ITstream : IOstream.functions.streamLines.seedSampleSet, line 0, IOstream: Version 2.0, format ASCII, line 0, OPENED, GOOD
    primitiveEntry 'seedSampleSet' comprises 
        on line 0 the word 'uniform'
 as a sub-dictionary
[1] 
[1]     From function virtual const Foam::dictionary& Foam::primitiveEntry::dict() const
[1]     in file db/dictionary/primitiveEntry/primitiveEntry.C at line 189.
[1] 
FOAM parallel run aborting
[1] 
[0] 
[0] 
[0] --> FOAM FATAL ERROR: 
[0] Attempt to return primitive entry ITstream : /home/sergey/OpenFOAM/bench_template/run_2/system/controlDict.functions.streamLines.seedSampleSet, line 45, IOstream: Version 2.0, format ASCII, line 0, OPENED, GOOD
    primitiveEntry 'seedSampleSet' comprises 
        on line 45 the word 'uniform'
 as a sub-dictionary
[0] 
[0]     From function virtual const Foam::dictionary& Foam::primitiveEntry::dict() const
[0]     in file db/dictionary/primitiveEntry/primitiveEntry.C at line 189.
[0] 
FOAM parallel run aborting
[0] 
[1] #0  Foam::error::printStack(Foam::Ostream&)[0] #0  Foam::error::printStack(Foam::Ostream&) at ??:?
 at ??:?
[1] #1  Foam::error::abort()[0] #1  Foam::error::abort() at ??:?
[1] #2  Foam::primitiveEntry::dict() const at ??:?
[0] #2  Foam::primitiveEntry::dict() const at primitiveEntry.C:?
[1] #3  Foam::functionObjects::streamLine::read(Foam::dictionary const&) at primitiveEntry.C:?
[0] #3  Foam::functionObjects::streamLine::read(Foam::dictionary const&) at ??:?
[1] #4  Foam::functionObjects::streamLine::streamLine(Foam::word const&, Foam::Time const&, Foam::dictionary const&) at ??:?
[0] #4  Foam::functionObjects::streamLine::streamLine(Foam::word const&, Foam::Time const&, Foam::dictionary const&) at ??:?
[1] #5  Foam::functionObject::adddictionaryConstructorToTable<Foam::functionObjects::streamLine>::New(Foam::word const&, Foam::Time const&, Foam::dictionary const&) at ??:?
[0] #5  Foam::functionObject::adddictionaryConstructorToTable<Foam::functionObjects::streamLine>::New(Foam::word const&, Foam::Time const&, Foam::dictionary const&) at ??:?
[1] #6  Foam::functionObject::New(Foam::word const&, Foam::Time const&, Foam::dictionary const&) at ??:?
[0] #6  Foam::functionObject::New(Foam::word const&, Foam::Time const&, Foam::dictionary const&) at ??:?
[1] #7  Foam::functionObjects::timeControl::timeControl(Foam::word const&, Foam::Time const&, Foam::dictionary const&) at ??:?
[0] #7  Foam::functionObjects::timeControl::timeControl(Foam::word const&, Foam::Time const&, Foam::dictionary const&) at ??:?
[1] #8  Foam::functionObjectList::read() at ??:?
[0] #8  Foam::functionObjectList::read() at ??:?
[1] #9  Foam::Time::loop() at ??:?
[0] #9  Foam::Time::loop() at ??:?
[1] #10  Foam::simpleControl::loop() at ??:?
[0] #10  Foam::simpleControl::loop() at ??:?
[1] #11   at ??:?
[0] #11  ?? at ??:?
[1] #12  __libc_start_main at ??:?
[0] #12  __libc_start_main in "/lib/x86_64-linux-gnu/libc.so.6"
[1] #13  ? in "/lib/x86_64-linux-gnu/libc.so.6"
[0] #13  --------------------------------------------------------------------------
MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD 
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
 at ??:?
? at ??:?
[kb-4:14244] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[kb-4:14244] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Try removing the streamline includes from controlDict:

#include streamlines
#include wallBoundedStreamlines
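
For anyone who would rather script this step, here is a minimal Python sketch that comments out the two streamline includes. The function name `disable_streamlines` is hypothetical, and the sketch assumes each include sits on its own line in the form shown above (with or without quotes), so check your own system/controlDict before relying on it.

```python
import re

# Matches lines such as `#include "streamLines"` or `#include wallBoundedStreamlines`.
# Assumption: each include sits on its own line, as in the post above.
_STREAMLINE_INCLUDE = re.compile(
    r'^(\s*)(#include\s+"?(?:wallBounded)?streamlines"?\s*)$',
    re.IGNORECASE,
)


def disable_streamlines(control_dict_text: str) -> str:
    """Return the controlDict text with the streamline includes commented out."""
    out = []
    for line in control_dict_text.splitlines():
        match = _STREAMLINE_INCLUDE.match(line)
        if match:
            # OpenFOAM dictionaries accept C++-style // comments.
            line = f"{match.group(1)}// {match.group(2).rstrip()}"
        out.append(line)
    return "\n".join(out) + "\n"
```

To apply it, read the file, pass the text through the function and write it back, for example with `pathlib.Path.read_text()` and `write_text()`. Lines that are already commented out are left alone, so running it twice does no harm.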

Old   December 14, 2018, 06:09
Default CPU AMD Ryzen 7 1800X 8-core @ 3.6 GHz
  #162
Rec
New Member
 
Sergey
Join Date: Jan 2018
Posts: 18
Rep Power: 8
Rec is on a distinguished road
Quote:
Originally Posted by gkarlsen View Post
Try removing streamlines from controlDict.
#include streamlines
#include wallBoundedStreamlines
Thanks a lot, it works!

CPU AMD Ryzen 7 1800X 8-core @ 3.6 GHz
2 x 16 GB Samsung DDR4
SSD M.2 950 EVO 256 GB
OS: Ubuntu 16.04 LTS + OpenFOAM 5.0

# cores Wall time (s):
------------------------
1 876.25
2 518.92
4 336.94
6 297.91
8 282.64

Old   December 14, 2018, 10:21
Default
  #163
Rec
New Member
 
Sergey
Join Date: Jan 2018
Posts: 18
Rep Power: 8
Rec is on a distinguished road
2 x Intel Xeon X5660 @ 2.8 GHz (6 cores each, 12 cores total)
64 GB DDR3 memory, 1333 MHz
HDD: 2 TB Hitachi, 7200 rpm
OS: Ubuntu 18.04.1, OpenFOAM 5

# cores Wall time (s):
------------------------
1 1387.79
2 815.7
4 394.51
6 327.79
8 300.67
10 287.82
12 279.58

Old   January 6, 2019, 13:28
Default
  #164
Member
 
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8
Astan is on a distinguished road
Hi guys, I have an i7 7700 processor (quad-core, 2 memory channels), running Ubuntu 18.04.1 and OpenFOAM 5.

I performed two tests:

1) 3 memory modules (2 x 16 GB + 1 x 8 GB), all 2400 MHz.
I got the following results:

# cores Wall time (s):
------------------------
1 751.25
2 567.25
3 539.83
4 529.66
5 536.26
6 538.96
7 548.8
8 549.27

2) 2 memory modules (2x 16GB)

# cores Wall time (s):
------------------------
1 637.4
2 385.94
3 335.91
4 318.02
5 331.21
6 324.82
7 324.22
8 322.47

The difference in the times is certainly because dual-channel mode was not exploited with 3 modules.

However, I have noticed that there is no significant reduction in calculation time when using 3 or 4 cores.

If I bought another 2 modules of 16 GB 2400 MHz RAM to completely populate the slots, could I improve the computation time when 3 and 4 cores are used?

Old   January 6, 2019, 13:53
Default
  #165
Senior Member
 
Simbelmynë's Avatar
 
Join Date: May 2012
Posts: 551
Rep Power: 16
Simbelmynë is on a distinguished road
Quote:
Originally Posted by Astan View Post
Hi guys, I have an i7 7700 processor (quad-core, 2 memory channels), running Ubuntu 18.04.1 and OpenFOAM 5.

I performed two tests:

1) 3 memory modules (2 x 16 GB + 1 x 8 GB), all 2400 MHz.
I got the following results:

# cores Wall time (s):
------------------------
1 751.25
2 567.25
3 539.83
4 529.66
5 536.26
6 538.96
7 548.8
8 549.27

2) 2 memory modules (2x 16GB)

# cores Wall time (s):
------------------------
1 637.4
2 385.94
3 335.91
4 318.02
5 331.21
6 324.82
7 324.22
8 322.47

The difference in the times is certainly because dual-channel mode was not exploited with 3 modules.

However, I have noticed that there is no significant reduction in calculation time when using 3 or 4 cores.

If I bought another 2 modules of 16 GB 2400 MHz RAM to completely populate the slots, could I improve the computation time when 3 and 4 cores are used?

First, I would suggest turning off hyperthreading (it will probably not matter much, but there is no point in running more than 4 threads on a 4-core CPU).


From my own benchmarks with my 7600K (single-rank memory only), I get 384 and 318 seconds for 2 and 4 cores, respectively, with my memory clocked at 2400 MHz (10-10-10-30, T1), so this is perfectly in line with what you get. At 3200 MHz (13-13-13-33, T2) I get 343 and 274 seconds, respectively.
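
To put those two memory settings side by side, here is a rough Python sketch. It uses the common rule of thumb that first-word latency in nanoseconds is CL x 2000 / data rate (MT/s), and it compares the ratio of the data rates with the measured speedups quoted above. The function names are hypothetical and the calculation ignores every timing other than CL.

```python
def cas_latency_ns(cl: int, data_rate_mts: float) -> float:
    """First-word latency in ns: CL cycles at a memory clock of data_rate / 2 MHz."""
    return cl * 2000.0 / data_rate_mts


def speedup(t_before: float, t_after: float) -> float:
    """Ratio of wall times; values above 1 mean the second run is faster."""
    return t_before / t_after


if __name__ == "__main__":
    print(f"2400 MT/s, CL10: {cas_latency_ns(10, 2400):.2f} ns")
    print(f"3200 MT/s, CL13: {cas_latency_ns(13, 3200):.2f} ns")
    print(f"data rate ratio: {3200 / 2400:.2f}")
    print(f"measured speedup, 2 cores: {speedup(384, 343):.2f}")
    print(f"measured speedup, 4 cores: {speedup(318, 274):.2f}")
```

Both settings come out at a little over 8 ns of CAS latency, which suggests that the gain comes mainly from the higher transfer rate. The measured 4-core speedup stays well below the 1.33 ratio of the data rates.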


To get better results, I suggest trying to overclock your current memory. I would not buy more memory to populate the additional slots; you will see no speed gain from that (most likely the results will be worse, since the CPU's memory controller prefers to handle only 2 modules).


I think 3200 MHz memory (dual-rank if you can find it) is the sweet spot today in terms of price, availability and likelihood of achieving the overclock (Intel does not officially support 3200 MHz memory, but as you can tell from the vast number of 3200 MHz memory kits on the market, there is a good chance that it will work anyway - but it is not guaranteed!). XMP profile support makes this very easy as well.


Good luck!

Old   January 6, 2019, 14:48
Default
  #166
Member
 
Geir Karlsen
Join Date: Nov 2013
Location: Norway
Posts: 59
Rep Power: 14
gkarlsen is on a distinguished road
Quote:
Originally Posted by Simbelmynë View Post
....

In order to reach better results I suggest trying to overclock your current memory. I would not buy more memory to populate additional banks, you will see no speed gain in that (most likely the results will be worse instead since the memory controller on the CPU rather handles 2 banks only).
What he said... and if you can't get a decent overclock to work, it is probably better to sell the RAM you have and use that money for 2 new sticks of fast RAM, rather than buying 2 new slow ones to combine with the 2 existing slow ones.

Old   January 6, 2019, 16:08
Default
  #167
Member
 
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8
Astan is on a distinguished road
Hi Simbelmynë and gkarlsen, thank you for your answers.

In general, if I increase the number of cells, is there any hope of reaching a speedup of 2 with 2 cores, and a speedup close to 4 with 4 cores?

Old   January 6, 2019, 16:33
Default
  #168
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura about
That's not how it works. In this situation using 3 cores almost saturates the available memory bandwidth. There is no way around it, other than overclocking memory (your motherboard needs to support it) or buying a different machine with 4 or more memory channels.
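
The saturation shows up clearly if the posted wall times are converted to speedup and parallel efficiency. The sketch below is only an illustration in Python: it uses the wall times from the 2 x 16 GB test posted above, and the function name `scaling_table` is hypothetical.

```python
from typing import Dict, Tuple


def scaling_table(wall_times: Dict[int, float]) -> Dict[int, Tuple[float, float]]:
    """Map core count -> (speedup, parallel efficiency) relative to the 1-core run."""
    t1 = wall_times[1]
    return {n: (t1 / t, t1 / (t * n)) for n, t in sorted(wall_times.items())}


# Wall times (s) from the 2 x 16 GB test posted above (i7 7700, 4 physical cores).
dual_channel = {1: 637.4, 2: 385.94, 3: 335.91, 4: 318.02}

for cores, (s, e) in scaling_table(dual_channel).items():
    print(f"{cores} cores: speedup {s:.2f}, efficiency {e:.0%}")
```

With these numbers the speedup goes from about 1.65 on 2 cores to about 1.90 on 3 and 2.00 on 4, so the fourth core adds very little and the efficiency drops to roughly 50%.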

Old   January 7, 2019, 15:21
Default
  #169
New Member
 
Rob
Join Date: Apr 2018
Posts: 18
Rep Power: 8
Morlind is on a distinguished road
I have a new loaner machine for benchmarking. It does pretty well, but I should populate the memory properly. It is only running 8x 16 GB DDR4 1600, which explains the 72-core time. I still don't think it'll touch the Epycs here, but it's pretty solid for a 3+ year old piece of tech.


4x E7-8890v3 2.5 GHz - 8x 16 GB DDR4 1600 - Ubuntu 18 - OpenFOAM 5



# cores Wall time (s):
------------------------
1 1051.29
2 595.72
4 226.62
6 163.77
8 123.32
12 91.01
16 75.55
20 68.75
24 65.73
36 59.81
72 76.36

Old   January 7, 2019, 15:55
Default
  #170
Senior Member
 
Simbelmynë's Avatar
 
Join Date: May 2012
Posts: 551
Rep Power: 16
Simbelmynë is on a distinguished road
If it really has 8x16 GB and not 16x8 GB then it does extremely well. Even if it has 16x8 GB it does a really good job!


Not something I would purchase, even used, since even though it is a 3+ year old piece of tech it still commands a premium price on eBay (I did a search and hoped for a miracle when you posted, but alas, it is too expensive).


If it really is 8x16 configured then it will be fantastic if you manage to run with all 16 slots populated.

Old   January 7, 2019, 16:46
Default
  #171
New Member
 
Rob
Join Date: Apr 2018
Posts: 18
Rep Power: 8
Morlind is on a distinguished road
Yes, it is indeed 8 modules of 16 GB each. I will see if I can borrow enough to reconfigure and get another result. I am very curious to see what it will be.


Absolutely this is not as cost effective as the ~$2k quad v2 machine but it's an interesting benchmark since it's still far less $$ than I can find dual Epyc machines for. It was offered so I wasn't going to say no!

Old   January 31, 2019, 10:12
Default
  #172
New Member
 
Håvard B. Refvik
Join Date: Jun 2015
Location: Norway
Posts: 17
Rep Power: 11
havref is on a distinguished road
Quote:
Originally Posted by spaceprop View Post
I saw you have 16x8GB 2666MHz RAM, but what rank is it? If it's 2R (dual rank), and the 2x 7601 was 1R, that might explain it.

havref: Is your RAM single rank?

I'm curious now, haha
Sorry for the late answer. I never received a notification and did not stay up to date on this thread. The server with the Epyc 7601 used single-rank memory. However, it was only in my hands for a couple of days, so I won't be able to modify anything further now. I'm quite sure I cleared caches etc., but it's been a while, so who knows.

I thought I had published our Epyc 7351 results already, but here they are. We have two servers which are getting the same results:
Code:
1x dual Epyc 7351 - 16x8 1R DDR4 2666MHz - Ubuntu 16.04 - OpenFOAM 6 - caches cleared
# cores   Wall time (s):
------------------------
1 	1035.30
2 	 583.49
4 	 236.28
6 	 155.03
8 	 111.10
12	  78.33
16	  58.51
20	  54.19
24	  47.13
28	  45.89
32	  36.82


2x dual Epyc 7351 InfiniBand connected - Ubuntu 16.04  - OpenFOAM 6 - caches cleared
# cores   Wall time (s):    Processors in use pr. server:
------------------------------------------------
32 	       28.54			16 
40 	       26.50  			20
48 	       22.90  			24
56 	       21.00    		28
64 	       19.16			32
The bottom results are from our two dual Epyc 7351 servers connected with InfiniBand. I have not had time to play around with binding processes etc. (tips or tutorials appreciated!), so I assume the results can be improved. The hostfile passed to mpirun looks as follows for the 64-core run:
Code:
cfdepyc1 cpu=32
cfdepyc2 cpu=32
No rankfile is specified as I do not yet know how to do this properly. Will also update to Ubuntu 18.04 shortly, in case there's some improvement there.
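
As a possible starting point for the binding question, the Python sketch below assembles an Open MPI command line that pins each rank to a core and prints the resulting layout. It is a sketch under assumptions, not a tuned setup: the helper names are hypothetical, the hostfile is written with `slots=` (the keyword used in the Open MPI documentation) where the file above uses `cpu=`, and `--map-by core` is just one reasonable choice that should be compared against other mappings on the actual machines.

```python
from typing import Dict, List


def hostfile_text(slots_per_host: Dict[str, int]) -> str:
    """Build Open MPI hostfile contents with one `host slots=N` line per server."""
    return "".join(f"{host} slots={n}\n" for host, n in slots_per_host.items())


def mpirun_command(hostfile: str, n_procs: int, solver: str = "simpleFoam") -> List[str]:
    """Assemble an mpirun call that binds each rank to a core and reports the layout."""
    return [
        "mpirun",
        "--hostfile", hostfile,
        "-np", str(n_procs),
        "--map-by", "core",      # place consecutive ranks on consecutive cores
        "--bind-to", "core",     # keep each rank on the core it was placed on
        "--report-bindings",     # print the rank-to-core layout at startup
        solver, "-parallel",
    ]


if __name__ == "__main__":
    print(hostfile_text({"cfdepyc1": 32, "cfdepyc2": 32}), end="")
    print(" ".join(mpirun_command("hostfile", 64)))
```

The output of `--report-bindings` is the quickest way to confirm that ranks are not being moved between cores before trying anything more elaborate such as a rankfile.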
Clément_G likes this.

Old   January 31, 2019, 17:03
Default
  #173
Senior Member
 
Simbelmynë's Avatar
 
Join Date: May 2012
Posts: 551
Rep Power: 16
Simbelmynë is on a distinguished road
Nice results! With regard to 18.04, I'm not sure you will gain anything. If you run Ubuntu Server, there is probably no difference at all. If you use the standard 18.04 desktop, my experience is that GNOME 3 eats resources.



Would be interesting to hear your experience though if you do try the 18.04 and manage to run some benchmarks.


Btw, we are using Mint 19.1 on our dual EPYC 7301 machine and lately we have noticed that the performance is not as good as with Mint 19.0 or 18.3 (also as noticed in this review). So Mint 19.1 should probably be avoided for the time being.

Old   February 21, 2019, 21:03
Default Sony i7 Laptop
  #174
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14
wkernkamp is on a distinguished road
Intel(R) Core(TM) i7-3632QM CPU @ 2.20GHz
stepping : 9
microcode : 0x20
cpu MHz : 1197.455
cache size : 6144 KB


Sony laptop with 16 GB
OpenFOAM 6
Docker on Ubuntu 18.04


# cores Wall time (s):
------------------------
1 835.7
2 551.6
4 506.9

Note on the labels: the run I call "1 core" used 2 threads, and "4" is all 8 threads on all 4 cores.

Old   February 21, 2019, 21:13
Default
  #175
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14
wkernkamp is on a distinguished road
Quote:
Originally Posted by Rec View Post
Tnx a lot, it works!

CPU AMD Ryzen 7 1800X 8-core @ 3.6 GHz
2 X 16GB Samsung DDR4
SSD M.2 950 EVO 256 Gb
OS: UBUNTU 16.04 LTS + OpenFOAM 5.0

# cores Wall time (s):
------------------------
1 876.25
2 518.92
4 336.94
6 297.91
8 282.64
Did you try to run out to 16? Your machine has two threads per core. The standard run.sh loads threads, not cores.
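
To check how many threads and how many physical cores a Linux machine actually exposes, something like the following Python sketch can help. It assumes the x86 layout of /proc/cpuinfo, where each logical CPU block carries `physical id` and `core id` fields; the function name `count_cpus` is hypothetical.

```python
from typing import Tuple


def count_cpus(cpuinfo_text: str) -> Tuple[int, int]:
    """Return (logical CPUs, physical cores) parsed from /proc/cpuinfo text."""
    logical = 0
    cores = set()
    physical_id = core_id = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            # A new logical CPU block starts here.
            logical += 1
            physical_id = core_id = None
        elif key == "physical id":
            physical_id = value
        elif key == "core id":
            core_id = value
        if physical_id is not None and core_id is not None:
            # Hyperthreads of one core share the same (socket, core) pair.
            cores.add((physical_id, core_id))
    return logical, len(cores)
```

Calling it on the contents of /proc/cpuinfo returns two numbers; when the first is twice the second, SMT is active and a run labeled with N processes may be filling threads and not cores.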

Old   February 27, 2019, 08:46
Default
  #176
New Member
 
davide
Join Date: Jun 2017
Posts: 9
Rep Power: 9
cfdx is on a distinguished road
Hi

Additional results with the EPYC 7301 using AMD compilers:

- dual Epyc 7301
- 16x16 DDR4 rank2 2666MHz
- CentOS 7.5 3.10.0-862
- OpenFOAM-v1812
- AOCC 1.3 compiler


Code:
# cores   Wall time (s):
------------------------
1 893.58
2 542.67
4 209.83
6 135.39
8 100.87
12 78.7
16 54.84
20 49.71
24 41.23
28 38.81
32 34.54
For comparison, the results with OpenFOAM-v1812 compiled with gcc 4.8.5:
Code:
# cores   Wall time (s):
------------------------
1 982.77
2 583.02
4 233.52
6 151.84
8 112.32
12 85.82
16 61.93
20 54.33
24 44.21
28 42.34
32 37.28
It looks like a 10% performance increase...
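
The per-core-count gain can be checked directly from the two tables above. This is a small Python sketch of that arithmetic; the function name `percent_faster` is hypothetical, and the gain is defined here as the ratio of wall times minus one.

```python
from typing import Dict

# Wall times (s) copied from the two tables above.
aocc = {1: 893.58, 2: 542.67, 4: 209.83, 6: 135.39, 8: 100.87, 12: 78.7,
        16: 54.84, 20: 49.71, 24: 41.23, 28: 38.81, 32: 34.54}
gcc = {1: 982.77, 2: 583.02, 4: 233.52, 6: 151.84, 8: 112.32, 12: 85.82,
       16: 61.93, 20: 54.33, 24: 44.21, 28: 42.34, 32: 37.28}


def percent_faster(baseline: Dict[int, float], candidate: Dict[int, float]) -> Dict[int, float]:
    """Percent by which `candidate` outruns `baseline` at each core count."""
    return {n: (baseline[n] / candidate[n] - 1.0) * 100.0 for n in sorted(baseline)}


gains = percent_faster(gcc, aocc)
for n, g in gains.items():
    print(f"{n:>2} cores: {g:.1f}% faster with AOCC")
print(f"mean: {sum(gains.values()) / len(gains):.1f}%")
```

With this definition the gain varies with core count, from roughly 7% to roughly 13%, and averages close to the 10% mentioned above.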
wyldckat and Simbelmynë like this.

Old   March 11, 2019, 16:51
Default i7 6800K 3.40 GHz
  #177
Member
 
Giovanni Medici
Join Date: Mar 2014
Posts: 48
Rep Power: 12
giovanni.medici is on a distinguished road
I performed a few tests on my PC.
Intel i7 6800K 3.40 GHz, tested in different combinations:
  1. OpenFOAM releases (1812 compiled and installed on Ubuntu, and 1612 via Docker on Windows 10).
  2. Different RAM setups: 2x16 GB DDR4 at 2133 MHz, and 4x16 GB DDR4 at 2133 MHz and 2666 MHz (the RAM is rated for 2800 MHz).
  3. HyperThreading ON and OFF.


Code:
Ubuntu 32gb 2133 MHz HT OFF | Ubuntu 64gb 2133 MHz HT OFF | Ubuntu 64gb 2666 MHz HT OFF | Ubuntu 64gb 2133 MHz HT ON |
# cores   Wall time (s):    | # cores   Wall time (s):    | # cores   Wall time (s):    | # cores   Wall time (s):   |
----------------------------+-----------------------------+-----------------------------+----------------------------+
1 897.95                    | 1 878.6                     | 1 825.96                    | 1 885.37                   |
2 526.46                    | 2 475.78                    | 2 437.07                    | 2 484.32                   |
4 361.62                    | 4 272.68                    | 4 246.12                    | 4 274.11                   |
6 337.28                    | 6 226.86                    | 6 199.83                    | 6 227.82                   |
                            |                             |                             | 8 252.65                   |
                            |                             |                             | 10 235.2                   |
                            |                             |                             | 12 223.22                  |
============================+=============================+=============================+============================+
Win 32gb 2133 MHz HT ON     | Win 64gb 2133 MHz HT OFF    | Win 64gb 2666 MHz HT OFF    |
# cores   Wall time (s):    | # cores   Wall time (s):    | # cores   Wall time (s):    |
----------------------------+-----------------------------+-----------------------------+
1 835.09                    | 1 1460.39                   | 1 1352.23                   |
2 508.23                    | 2 816.66                    | 2 740.21                    |
4 369.22                    | 4 432.79                    | 4 406.12                    |
6 447.91                    | 6 459.66                    | 6 401.81                    |
============================+=============================+=============================+
Both Ubuntu and Windows are installed on an SSD. The Windows 10 system is possibly a little more loaded, as it is the main system I use, but overall I would say there is a clear performance loss on the Windows 10 side (maybe it does not take full advantage of the 4 memory channels, or of parallelization?).
Has any of you experienced something similar?
flotus1 and erik87 like this.

Old   March 11, 2019, 17:11
Default
  #178
Senior Member
 
Simbelmynë's Avatar
 
Join Date: May 2012
Posts: 551
Rep Power: 16
Simbelmynë is on a distinguished road
Not sure why the Windows Docker version performs so poorly. You can always try the Ubuntu terminal from the Windows Store. However, there might be a penalty if you do lots of file I/O (output to screen instead of to file if possible).


A third option is to test virtualization. When using Ubuntu in VirtualBox on a Windows machine, I had about a 10% performance hit in the motorbike benchmark.

Old   March 11, 2019, 17:43
Default
  #179
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14
wkernkamp is on a distinguished road
Nice work. Interesting that the memory variations have an effect when 32 GB is already plenty. I think your fastest run would be "Ubuntu with HT, 8 threads, and memory at 2666 MHz". On my Dell PowerEdge R810 it also doesn't help to turn HT off. (I am keeping it on because I am experimenting with OpenMP and OpenACC, and the threads might help.)

Old   March 12, 2019, 01:49
Default
  #180
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura about
The difference here is not 32GB vs 64GB, it's 2 channels vs 4 channels. 32GB is more than enough to run this benchmark.
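
As a back-of-the-envelope illustration of the channel argument, the Python sketch below computes the theoretical peak bandwidth as channels x 8 bytes x transfer rate. The function name is hypothetical, and real sustained bandwidth is lower than this peak.

```python
def peak_bandwidth_gbs(channels: int, data_rate_mts: float, bus_bytes: int = 8) -> float:
    """Theoretical peak in GB/s: channels x 64-bit bus x transfers per second."""
    return channels * bus_bytes * data_rate_mts * 1e6 / 1e9


if __name__ == "__main__":
    for label, channels, rate in [
        ("2 channels, DDR4-2133", 2, 2133),
        ("4 channels, DDR4-2133", 4, 2133),
        ("4 channels, DDR4-2666", 4, 2666),
    ]:
        print(f"{label}: {peak_bandwidth_gbs(channels, rate):.1f} GB/s")
```

Going from 2 to 4 channels doubles the theoretical peak at the same memory clock, which fits the large gap between the 32 GB and 64 GB columns of the Ubuntu results at 6 cores.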
giovanni.medici likes this.
