March 24, 2020, 13:28 |
|
#261 | |
Member
Kailee
Join Date: Dec 2019
Posts: 35
Rep Power: 7 |
Quote:
Cheers, Kai. |
||
March 27, 2020, 15:24 |
|
#262 |
Senior Member
Josh McCraney
Join Date: Jun 2018
Posts: 220
Rep Power: 9 |
Ubuntu 18.04, OF6. EPYC 7281, 8 slots of MEM-DR416L-CL07-ER26 16GB DDR4 2666 RDIMM server memory.
16 cores: 90.1 seconds (average of 3 runs).
|
March 30, 2020, 04:45 |
|
#263 | |
New Member
Kurt Stuart
Join Date: Feb 2020
Location: Southern illinois
Posts: 19
Rep Power: 6 |
Quote:
This is the V1 - 2.4 GHz stuff. It's running in turbo up to 2.6 GHz. I am pretty happy with it. Of course, a week after I got it set up and started running some Forte cases, I found out the school has a 400-core cluster of v3 and v4 stuff.
||
April 2, 2020, 20:06 |
ryzen 9 3950x
|
#264 |
Member
Join Date: Sep 2013
Posts: 46
Rep Power: 13 |
Hi!
Ryzen 9 3950X, 16(32) x 4.2GHz, 2x16GB DDR4-3200, Ubuntu 18.04 LTS, M.2 SSD with min. 1900MB/s direct read/write
Costs ~1800€
Memory bandwidth seems saturated after ~8 threads.
Code:
# cores   Wall time (s):
------------------------
1         649.56
2         355.39
4         219.56
6         198.86
8         190.2
12        189.75
16        190.12
20        191.28
24        194.41
Hi! Since I struggled a lot to make the sources run on a foam-extend-4.0 build, I created a case that should basically work on every OpenFOAM or foam-extend build: https://github.com/ma-tri-x/setup_ubuntu
Also, when I ran this case, the above values changed to:
# cores   Wall time (s):
#------------------------
1         774.7
2         457.09
4         258.68
6         229.71
8         212.6
12        208.46
16        208.63
20        211.86
24        214.47
The top values were obtained with OF-7 precompiled for Ubuntu. The bottom values were obtained with self-compiled foam-extend-4.0, g++-5.
Last edited by ma-tri-x; April 3, 2020 at 20:15. Reason: values not true anymore, link for base case
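To put a number on the saturation remark, here is a minimal sketch that turns the OF-7 wall times above into speedup and parallel efficiency relative to the single-core run. The function name `scaling` is only illustrative and is not part of the benchmark scripts.

```python
# Wall times (s) of the OF-7 run on the Ryzen 9 3950X, copied from the post above.
wall_times = {1: 649.56, 2: 355.39, 4: 219.56, 6: 198.86, 8: 190.2,
              12: 189.75, 16: 190.12, 20: 191.28, 24: 194.41}


def scaling(times):
    """Return {n: (speedup, efficiency)} relative to the single-core run."""
    t1 = times[1]
    return {n: (t1 / t, t1 / (t * n)) for n, t in sorted(times.items())}


table = scaling(wall_times)
for n, (speedup, efficiency) in table.items():
    print(f"{n:>3} threads: speedup {speedup:5.2f}, efficiency {efficiency:6.1%}")
```

With these numbers the speedup reaches roughly 3.4 at 8 threads and stays there up to 24 threads, which is what one would expect if two memory channels are the limiting factor.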
|
April 4, 2020, 14:40 |
diagram
|
#265 |
Member
Join Date: Sep 2013
Posts: 46
Rep Power: 13 |
Hi everyone!
CPUs behave weirdly. Here's an example where an 8-core i7-9900 seems to beat a 16-core Ryzen when using 30 threads. What's wrong here?
best, M
https://owncloud.gwdg.de/index.php/s/784OYnGXClzydtJ
edit: ----------------------------
This made me execute the test case on the Ryzen up to 112 threads. It works. With massive speedup. No end in sight.
edit: ---------------------------------------
FATAL: Sorry, I was posting too fast. "ExecutionTime" doesn't match real execution time when threads > cores + hyperthreading. My timer confirms: CLOCKTIME is the one to go for. Changed the script. Behaviour as expected. No magic happening.
Last edited by ma-tri-x; April 4, 2020 at 18:35. Reason: falsified
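For anyone scripting the timing themselves: OpenFOAM solvers print both values on one line per time step, and as far as I know ExecutionTime is the CPU time of the process while ClockTime is elapsed wall-clock time. A minimal sketch for pulling the last pair out of a log follows. The log excerpt is made up for illustration and is not output from a real run, and `last_times` is a hypothetical helper name.

```python
import re

# Matches lines such as "ExecutionTime = 41.6 s  ClockTime = 119 s".
TIME_LINE = re.compile(
    r"ExecutionTime\s*=\s*([0-9.eE+-]+)\s*s\s+ClockTime\s*=\s*([0-9.eE+-]+)\s*s"
)


def last_times(log_text):
    """Return (execution_time, clock_time) in seconds from the last matching line."""
    matches = TIME_LINE.findall(log_text)
    if not matches:
        raise ValueError("no ExecutionTime/ClockTime line found")
    exec_s, clock_s = matches[-1]
    return float(exec_s), float(clock_s)


# Made-up excerpt for illustration only, not taken from a real run.
sample_log = """\
Time = 99
ExecutionTime = 41.2 s  ClockTime = 118 s

Time = 100
ExecutionTime = 41.6 s  ClockTime = 119 s
"""

exec_s, clock_s = last_times(sample_log)
print(f"ExecutionTime {exec_s} s, ClockTime {clock_s} s")
```

When more processes are started than the CPU has hardware threads, each process spends part of the time waiting for a core, so its CPU time can be far below the elapsed time. For benchmarking, the ClockTime value is the one to report.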
|
April 13, 2020, 12:46 |
|
#266 |
New Member
Join Date: Apr 2020
Posts: 2
Rep Power: 0 |
#DL380 Gen9: 2* 12 Core Xeon E5-2687W v4 @ 3GHz, 8*32GB 2400MHz Memory (4-Channel)
# cores   Wall time (s):
------------------------
1         880.71
2         476.69
4         222.56
6         156.74
8         122.75
12        96.56
16        83.21
20        77.61
24        74.21
#DL380 Gen10: 2* 12 Core Xeon Gold 6146 @3.2GHz 12*16GB 2666MHz Memory (6-Channel)
# cores   Wall time (s):
------------------------
1         889.47
2         431.06
4         191.42
6         128.94
8         101.64
12        77.48
16        64.62
20        58.06
24        54.68
Next month we should get our new Epyc (2*7542) based DL385 Gen10plus servers. Looking forward to seeing how fast they are compared to the Intel-based ones.
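Since the two servers differ mainly in memory configuration, a rough comparison of theoretical peak memory bandwidth against the measured 24-core wall times may be useful. This is only a sketch under two assumptions: every channel is populated (8 and 12 DIMMs on two sockets suggest so), and each DDR4 channel transfers 8 bytes per transfer. The function name `peak_bandwidth_gbs` is illustrative.

```python
def peak_bandwidth_gbs(sockets, channels_per_socket, mega_transfers_per_s):
    """Theoretical peak in GB/s, assuming 8 bytes per transfer and channel."""
    return sockets * channels_per_socket * mega_transfers_per_s * 8 / 1000


gen9 = peak_bandwidth_gbs(2, 4, 2400)    # DL380 Gen9: 4 channels of DDR4-2400 per socket
gen10 = peak_bandwidth_gbs(2, 6, 2666)   # DL380 Gen10: 6 channels of DDR4-2666 per socket

bandwidth_ratio = gen10 / gen9
measured_ratio = 74.21 / 54.68           # 24-core wall times, Gen9 / Gen10

print(f"Gen9 peak:  {gen9:.1f} GB/s")
print(f"Gen10 peak: {gen10:.1f} GB/s")
print(f"bandwidth ratio {bandwidth_ratio:.2f}, measured speed ratio {measured_ratio:.2f}")
```

The Gen10 box has about 1.67 times the theoretical bandwidth but is about 1.36 times faster at 24 cores, so the gain follows the memory upgrade in direction without reaching it in size.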
|
April 16, 2020, 12:58 |
|
#267 |
Member
alexander thierfelder
Join Date: Dec 2019
Posts: 71
Rep Power: 7 |
I run a Ryzen 2700X eight-core CPU @ stock, 32GB RAM @ 3600MHz 17-19-19-39, Ubuntu 19.10, OF1912 (but nearly the same results with OF7)
Code:
# cores   Wall time (s):
------------------------
1         823
2         525.13
4         352.57
6         330.33
8         330.59
|
April 16, 2020, 13:49 |
|
#268 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,428
Rep Power: 49 |
That's a pretty hefty overclock on the memory. Since single-core results look great, but scaling doesn't, there are a few things I would check
|
|
April 16, 2020, 16:30 |
|
#269 | |
Member
alexander thierfelder
Join Date: Dec 2019
Posts: 71
Rep Power: 7 |
Quote:
New results with Ryzen 2700X eight-core CPU @ stock, 32GB RAM (4 x 8GB in dual channel) @ 3200MHz 17-19-19-39, Ubuntu 19.10, OF1912
Code:
# cores   Wall time (s):
------------------------
1         719.94
2         428.66
4         270.53
6         238.9
8         232.88
||
April 17, 2020, 11:53 |
|
#270 |
Member
alexander thierfelder
Join Date: Dec 2019
Posts: 71
Rep Power: 7 |
I wonder how I can use the "virtual cores" that are available through SMT. In my case it does not automatically use them. I also tried the option "--use-hwthread-cpus" for mpirun:
Code:
mpirun --use-hwthread-cpus -np 12 simpleFoam -parallel
|
April 17, 2020, 12:37 |
|
#271 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,428
Rep Power: 49 |
What seems to be the problem? As soon as you use -np >8 on your 8-core CPU, mpirun should have no other choice than oversubscribing cores with more than one thread.
|
|
April 17, 2020, 13:47 |
|
#272 | |
Member
alexander thierfelder
Join Date: Dec 2019
Posts: 71
Rep Power: 7 |
Quote:
Code:
mpirun -np 12 simpleFoam -parallel | tee log.simpleFoam
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 12 slots
that were requested by the application:
  simpleFoam

Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------
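The error above is what recent Open MPI versions do by default: without a hostfile, the number of slots equals the number of physical cores, and requests beyond that are refused instead of oversubscribed. As a hedged sketch, the following helper picks between the two documented mpirun options `--use-hwthread-cpus` (count hardware threads as slots) and `--oversubscribe` (allow more processes than slots). The function name `mpirun_args` and its parameters are made up for illustration, and the behaviour may differ between Open MPI versions.

```python
def mpirun_args(n_procs, physical_cores, hw_threads, solver="simpleFoam"):
    """Build an Open MPI command line for n_procs ranks on a single machine.

    Assumes the default slot count equals the number of physical cores.
    """
    args = ["mpirun"]
    if n_procs > hw_threads:
        # More ranks than hardware threads: must be allowed explicitly.
        args.append("--oversubscribe")
    elif n_procs > physical_cores:
        # Ranks fit onto SMT threads: count hardware threads as slots.
        args.append("--use-hwthread-cpus")
    args += ["-np", str(n_procs), solver, "-parallel"]
    return args


# Ryzen 2700X: 8 cores, 16 hardware threads.
print(" ".join(mpirun_args(12, physical_cores=8, hw_threads=16)))
```

Whether running on SMT threads actually helps is a separate question; the benchmark numbers in this thread suggest the gain is small once memory bandwidth is saturated.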
||
April 17, 2020, 14:17 |
|
#273 | |
Member
alexander thierfelder
Join Date: Dec 2019
Posts: 71
Rep Power: 7 |
Quote:
Code:
mpirun --use-hwthread-cpus -np 12 simpleFoam -parallel
||
April 18, 2020, 10:33 |
A bit of nostalgia...
|
#274 |
Member
Kailee
Join Date: Dec 2019
Posts: 35
Rep Power: 7 |
Just for kicks, tried it on an X8DAE, 2x X5670 (2.93, turbo 3.2 I think), 12x4GB 1333MHz RDRAM, Ubuntu 19.10, kernel 5.3.0, OpenFOAM7 from openfoam.org repository.
Code:
CPUs   Mesh   Speedup   Runtime   Speedup   It/s
1      2305   1.00      1251.24   1.00      0.08
2      1508   1.53      717.89    1.74      0.14
4      887    2.60      334.84    3.74      0.30
6      625    3.69      274.43    4.56      0.36
8      515    4.48      252.89    4.95      0.40
12     417    5.53      241.46    5.18      0.41
No, this was not a VM...
Cheers, Kai.
|
April 19, 2020, 08:05 |
|
#275 |
Member
alexander thierfelder
Join Date: Dec 2019
Posts: 71
Rep Power: 7 |
Update with optimised memory timings:
Ryzen 2700X eight-core CPU @ stock, 32GB RAM (4 x 8GB in dual channel) @ 3200MHz 16-18-20-36 (+ optimised sub-settings from DRAM Calculator for Ryzen v1.7.0 by 1usmus), Ubuntu 19.10, OF1912
Code:
# cores   Wall time (s):
------------------------
1         699.57
2         414.2
4         257.01
6         227.03
8         221.71
********************
12        216.02
16        223.53
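For comparing the three memory settings posted for this 2700X, the first timing number (CAS latency, in memory clock cycles) can be converted into nanoseconds. A minimal sketch, assuming the quoted "MHz" figures are DDR4 data rates in MT/s, so the memory clock is half of that; `first_word_latency_ns` is an illustrative name.

```python
def first_word_latency_ns(cas_latency, data_rate_mts):
    """CAS latency in ns: CL cycles at a memory clock of data_rate / 2 MHz."""
    return cas_latency * 2000.0 / data_rate_mts


# (CAS latency, data rate in MT/s) for the three settings reported in this thread.
configs = {
    "3600 CL17": (17, 3600),
    "3200 CL17": (17, 3200),
    "3200 CL16": (16, 3200),
}

latency = {name: first_word_latency_ns(cl, rate) for name, (cl, rate) in configs.items()}
for name, ns in latency.items():
    print(f"DDR4-{name}: {ns:.2f} ns")
```

On paper the 3600 setting has the lowest latency and the highest bandwidth, yet it gave the slowest single-core time here (823 s versus 719.94 s and 699.57 s). That fits the suspicion that the overclock was not running cleanly, although single runs cannot prove it.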
|
April 20, 2020, 12:04 |
|
#276 |
Member
alexander thierfelder
Join Date: Dec 2019
Posts: 71
Rep Power: 7 |
So here is a little comparison graph for different memory configurations. Be aware that I did every run only once, so a single run is no basis for a quantitative statement. I did not change any of the latency settings, so every run was done on:
Ryzen 2700X eight-core CPU @ stock, RAM @ 16-18-20-36 (+ optimised sub-settings from DRAM Calculator for Ryzen v1.7.0 by 1usmus), Ubuntu 19.10, OF1912
|
May 23, 2020, 12:57 |
Some more nostalgia...
|
#277 |
Member
Kailee
Join Date: Dec 2019
Posts: 35
Rep Power: 7 |
HP Z840, 2x Xeon E5-2637v4, 2x4 cores, 3.5GHz, 128GB 2400MHz in 8x16GB, single rank.
Code:
Threads   Mesh   Speedup   Runtime   Speedup   It/s
1         1581   1         1048.83   1.00      0.1
2         1042   1.52      525.38    1.99      0.19
4         570    2.77      224.17    4.68      0.45
6         400    3.95      159.36    6.58      0.63
8         332    4.76      133.11    7.88      0.75
Those CPUs still have 4 memory channels, which usually form the bottleneck but might be underused by 1 thread/memory channel, so just out of curiosity...
Code:
Threads   Mesh   Speedup   Runtime   Speedup   It/s
10        449    3.52      157.40    6.66      0.64
12        376    4.20      146.59    7.15      0.68
14        354    4.47      130.91    8.01      0.76
16        337    4.69      125.14    8.38      0.80
K.
|
May 26, 2020, 13:20 |
Virtualisation comparison
|
#278 |
Member
Kailee
Join Date: Dec 2019
Posts: 35
Rep Power: 7 |
Hi all,
on the same hardware as in #274, comparing the bare-metal performance to virtualisation using bhyve and ESXi 6.5. In each case, the VM is Ubuntu 20.04 with OpenFOAM7 from the openfoam.org repository; the VMs have 24GiB RAM and are installed and run on local SSD storage.
Bhyve:
Code:
CPUs   Mesh   Speedup   Runtime   Speedup   It/s
1      2617   1.00      1676.16   1.00      0.06
2      1708   1.53      866.58    1.93      0.12
4      1020   2.57      502.86    3.33      0.20
6      710    3.69      399.49    4.20      0.25
8      592    4.42      354.14    4.73      0.28
12     481    5.44      306.78    5.46      0.33
ESXi:
Code:
CPUs   Mesh   Speedup   Runtime   Speedup   It/s
1      2509   1.00      1500.86   1.00      0.07
2      1665   1.51      847.07    1.77      0.12
4      965    2.60      425.78    3.52      0.23
6      683    3.67      365.86    4.10      0.27
8      565    4.44      320.61    4.68      0.31
It/s comparison:
Code:
Cores   Bare Metal   Bhyve   ESXi
1       0.08         0.06    0.07
2       0.14         0.12    0.12
4       0.30         0.20    0.23
6       0.36         0.25    0.27
8       0.40         0.28    0.31
12      0.41         0.33
Cheers, Kai.
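The It/s column is rounded to two digits, so the overhead is easier to read from the runtimes. Below is a minimal sketch that computes the relative slowdown of each hypervisor against the bare-metal runtimes from #274. It assumes the second VM table is the ESXi run (its It/s values match the ESXi column of the comparison), and `slowdown` is an illustrative name. Note that the bare-metal run used Ubuntu 19.10 while the VMs used 20.04, so the difference is not purely down to virtualisation.

```python
# Runtimes (s) copied from #274 (bare metal) and from the two VM tables above.
bare = {1: 1251.24, 2: 717.89, 4: 334.84, 6: 274.43, 8: 252.89, 12: 241.46}
bhyve = {1: 1676.16, 2: 866.58, 4: 502.86, 6: 399.49, 8: 354.14, 12: 306.78}
esxi = {1: 1500.86, 2: 847.07, 4: 425.78, 6: 365.86, 8: 320.61}


def slowdown(vm, reference):
    """Relative extra runtime of the VM for every core count present in both."""
    return {n: vm[n] / reference[n] - 1.0 for n in sorted(vm) if n in reference}


bhyve_over = slowdown(bhyve, bare)
esxi_over = slowdown(esxi, bare)

for n in bhyve_over:
    esxi_text = f"{esxi_over[n]:6.1%}" if n in esxi_over else "   n/a"
    print(f"{n:>2} cores: bhyve {bhyve_over[n]:6.1%}, ESXi {esxi_text}")
```

For every core count with an ESXi result, ESXi shows the smaller slowdown of the two in these single runs.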
|
May 27, 2020, 10:43 |
All benchmarking data from this thread
|
#279 |
New Member
Aoo
Join Date: Jun 2015
Posts: 1
Rep Power: 0 |
Hi everyone
I have collected the benchmark data posted in this thread in an Excel spreadsheet and plotted wall time as well as speedup for comparison. I hope it is helpful for those who plan to build a new PC for OpenFOAM.
|
May 28, 2020, 15:43 |
Nice Spreadsheet
|
#280 |
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 372
Rep Power: 14 |
Thanks for the spreadsheet. I liked the idea of coloring the result by column.
|