|
May 29, 2021, 10:03 |
|
#401 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
I read that in some places in the US, electricity has become pretty expensive. But I was not aware that they already surpassed Germany at around 32ct/kWh.
|
|
May 29, 2021, 10:19 |
|
#402 |
Senior Member
Join Date: Jun 2016
Posts: 102
Rep Power: 10 |
Sorry, I misplaced a decimal point when using my phone calculator. I will edit that post. Since the electricity bill is paid by the university anyway, I'm not very sensitive to that..
|
|
June 5, 2021, 23:54 |
Ryzen 5000 - may be wrong advice
|
#403 |
New Member
Alexander Kazantcev
Join Date: Sep 2019
Posts: 24
Rep Power: 7 |
Hi all!
Thanks for the topic and the public benchmark results. I have to say a few words about the Ryzen 5600X: this CPU has only 16 bytes per clock of write bandwidth to RAM, while the Ryzen 1800X, 2700X, 3900X, 3950X, 5900X and 5950X have 32 bytes per clock. So a 5-6 core Ryzen 3000/5000 CPU is really a bad choice for CFD (and for LS-Dyna, Code_Aster and similar FEM software with MPI). The speedup of the 5000-series Ryzens comes from their good IPC; thanks to roughly 2x the IPC, the 3900 and 5900 match the Intel 7980XE and 7960XE in benchmarks. You can find the results at https://openbenchmarking.org/test/pts/openfoam , see the 30M model, this model does fit into cache.
Second, cheap Ryzen systems are sensitive to the number of DIMMs in the system: 4 DIMMs may be better than 2, I don't know exactly. There are also other results for the 3900X in OpenFOAM at https://firepear.net/grid/ryzen3900/ , where the 3900X is two times faster than the 2700X even though the frequencies and thread counts are the same. AIDA RAM-speed measurements for most Ryzens can be found here:
https://www.hardwareluxx.ru/index.ph...n-9-3900x.html
https://www.hardwareluxx.ru/index.ph...3.html?start=3
https://www.hardwareluxx.ru/index.ph...3.html?start=6
https://www.hardwareluxx.ru/index.ph...n-9-3900x.html
https://3dnews.ru/1024662/obzor-prot...itekture-zen-3
Last edited by AlexKaz; June 7, 2021 at 10:45. |
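A quick back-of-the-envelope sketch of what 16 vs 32 bytes per clock means for peak CCD write bandwidth (my numbers, not from the post above; the Infinity Fabric clock value is an assumption):

```python
# Peak CCD -> I/O-die write bandwidth = (bytes per fabric clock) * FCLK.
# An FCLK of 1.8 GHz is assumed here (DDR4-3600 running 1:1).
fclk_hz = 1.8e9

one_ccd_gbs = 16 * fclk_hz / 1e9   # single-CCD parts, e.g. 5600X
two_ccd_gbs = 32 * fclk_hz / 1e9   # dual-CCD parts, e.g. 5900X/5950X

print(f"single CCD write: ~{one_ccd_gbs:.1f} GB/s")
print(f"dual CCD write:   ~{two_ccd_gbs:.1f} GB/s")
```

With these assumed clocks the single-CCD figure lands near the bandwidth of dual-channel DDR4 itself, which is why the limitation only shows up in write-heavy workloads.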
|
June 9, 2021, 05:58 |
|
#404 | |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
That much is true: since Zen 2, the memory write bandwidth per chiplet is only half of the read bandwidth. More precisely, what is halved is the bandwidth of the link between the compute die and the I/O die, which in turn handles memory access. AMD did that for two reasons:
1) it saves power
2) it doesn't affect the vast majority of workloads, and that includes CFD in general.
You could theoretically write a CFD code that has similar memory read and write bandwidth requirements, and there might be some carefully optimised research codes that work like this. But for most codes out in the wild, reads are much more important than writes. Thus the number of compute dies per CPU should not affect which CPU you buy: Ryzen CPUs with only one compute die are fine, and so are Epyc CPUs with only 4 compute dies. You get more L3 cache with more compute dies, but that's about it. Quote:
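A crude way to see the read/write asymmetry on your own machine (a minimal numpy sketch of my own, not a calibrated STREAM run; the array size is just chosen to exceed any L3 cache):

```python
import time
import numpy as np

N = 20_000_000                  # ~160 MB of float64, far larger than any L3
a = np.ones(N)
b = np.empty(N)

t0 = time.perf_counter()
s = a.sum()                     # streams ~N*8 bytes of reads
t_read = time.perf_counter() - t0

t0 = time.perf_counter()
b.fill(0.0)                     # streams ~N*8 bytes of writes
t_write = time.perf_counter() - t0

gb = N * 8 / 1e9
print(f"read  ~{gb / t_read:5.1f} GB/s")
print(f"write ~{gb / t_write:5.1f} GB/s")
```

On a single-CCD Zen 2/3 part you would expect the write figure to come out noticeably below the read figure; on Intel or dual-CCD parts they should be much closer.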
|
||
June 13, 2021, 18:00 |
Some more legacy hardware...
|
#405 |
Member
Kailee
Join Date: Dec 2019
Posts: 35
Rep Power: 6 |
Hi all, HP DL560 Gen8 with 2x E5-2690v2, 16x16Gb 1866 DDR3. Ubuntu 20.04, OF 7 from .org Ubuntu repository.
Code:
threads     mesh  speedup      sim  speedup   it/s
   1.00  1708.00     1.00   934.50     1.00   0.11
   2.00  1141.00     1.50   496.60     1.88   0.20
   4.00   658.00     2.60   225.90     4.14   0.44
   6.00   460.00     3.71   160.40     5.83   0.62
   8.00   383.00     4.46   131.80     7.09   0.76
  12.00   301.00     5.67   105.10     8.89   0.95
  16.00   261.00     6.54    94.20     9.92   1.06
  20.00   235.00     7.27    90.20    10.36   1.11 |
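For reference, the derived columns follow directly from the wall times, assuming the standard 100-iteration simpleFoam run of this benchmark (the helper below is my own, not part of the benchmark scripts):

```python
def derived(t1, tn, n_iter=100):
    """Speedup relative to the single-thread run, and iterations per second."""
    return round(t1 / tn, 2), round(n_iter / tn, 2)

# 16-thread sim time of 94.20 s vs 934.50 s serial:
speedup, it_per_s = derived(934.50, 94.20)
print(speedup, it_per_s)   # 9.92 1.06
```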
|
June 22, 2021, 09:02 |
HP DL560 Gen8, 2x 2690 v2, 256GB 1866 - VM results
|
#406 |
Member
Kailee
Join Date: Dec 2019
Posts: 35
Rep Power: 6 |
Hi all,
in addition to the above bare-metal results, here are some results from running in VMs of different flavours on TrueNAS Scale 21.04. KVM and Docker are self-explanatory, and since TrueNAS Scale is Debian-based, the openfoam.org OpenFOAM also installs straightforwardly on it natively. Code:
Env      Mesh     Sim      It/s
KVM      362,00   222,15   0,45
Docker   246,00    94,48   1,06
Native   230,10    94,75   1,06
Does anybody have an idea why the 560 might suffer so much from the KVM/Qemu virtualization? Does it have to do with it having 4 sockets (currently I only have 2 populated)? Cheers, Kai. |
|
July 8, 2021, 03:25 |
|
#407 | |
Member
|
10920x, ddr4 3733 (16-18-18-38) 16g*4, ofv8-commit-9603c.
Code:
# cores   Wall time (s):
------------------------
 1        760.04
 2        401.79
 3        255.95
 4        202.53
 6        142.74
 8        122.8
12        105.17
16        116.53
surfaceFeatureExtractDict has to be changed into surfaceFeaturesDict. streamline and meshDict were modified according to Quote:
|
||
July 10, 2021, 09:23 |
|
#408 | |
Member
Kailee
Join Date: Dec 2019
Posts: 35
Rep Power: 6 |
Hi all,
I tried to run this with openfoam.org OF8 as well, but I am running into issues; Quote:
Even when doing that, when I run surfaceFeatures it barfs; Code:
Reading surfaceFeaturesDict

--> FOAM FATAL IO ERROR:
keyword surfaces is undefined in dictionary "/truenas/data/preserve/OF_bench/OF8/run_20/system/surfaceFeaturesDict/motorBike.obj"

file: /truenas/data/preserve/OF_bench/OF8/run_20/system/surfaceFeaturesDict/motorBike.obj from line 20 to line 44.

    From function const Foam::entry& Foam::dictionary::lookupEntry(const Foam::word&, bool, bool) const
    in file db/dictionary/dictionary.C at line 797.

FOAM exiting
TIA Kai. |
||
July 10, 2021, 09:39 |
|
#409 | |
Member
|
It seems that OpenFOAM 8 changed the handling of the surface*Dict files.
I modified the original surfaceFeatureExtractDict according to the example in $FOAM_ETC. I didn't check the extraction results between versions, but I think they should be equivalent. Quote:
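For anyone hitting the same "keyword surfaces is undefined" error: in the openfoam.org line the utility became surfaceFeatures and reads system/surfaceFeaturesDict, which expects a top-level surfaces list instead of the old per-file sub-dictionaries. A minimal fragment in the style of the $FOAM_ETC example (the usual FoamFile header is omitted; the includedAngle value is the one from the motorBike tutorial, adjust as needed):

```
surfaces
(
    "motorBike.obj"
);

includedAngle       150;
```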
|
||
July 12, 2021, 04:28 |
|
#410 | |
Member
|
7532*2, numa NPS4, ddr4 3200 16g*16 2R, ubuntu18.04, ofv8 commit-30b264 (compiled normally, without any special gcc tuning):
Code:
# cores   Wall time (s):
------------------------
 1        730.97
 2        342.93
 4        171.78
 8         81.72
16         41.99
24         29.84
32         23
48         20.04
64         18.4
However, neither single-threaded nor multi-threaded performance quite matches the other two 7532 systems, e.g. Novel's. I hope it is down to the lack of gcc tuning and the room temperature (around 28 °C during my testing). Quote:
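To put the scaling above in numbers, here is a small helper of my own (parallel efficiency defined the usual way, as T1/(n*Tn)):

```python
def efficiency(t1, tn, n):
    """Parallel efficiency of an n-core run relative to the serial run."""
    return t1 / (n * tn)

# 64 cores on the dual 7532: 18.4 s vs 730.97 s serial
print(f"{efficiency(730.97, 18.4, 64):.0%}")   # 62%
```

Around 60% efficiency at full core count is typical for this memory-bandwidth-bound benchmark; the per-core speedup flattens once the memory channels are saturated.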
|
||
August 12, 2021, 06:41 |
Intel 6338*2
|
#411 | |
Member
|
Intel 6338*2, numa SNC2 (very similar results by SNC1), ddr4 3200 16g*16 2R, ubuntu18.04, ofv8 commit-30b264 (compiled normally without any special gcc tuning.):
Code:
# cores   Wall time (s):
------------------------
 1        732.86
 2        358.24
 4        171.29
 8         91.2
16         53.6
24         42.98
32         36.11
48         30.44
64         27.37
The clock speed of the 6338 is lower than that of the 7532; at full load it is only ~2.6 GHz (vs 3.2 GHz for the 7532), but for parallel CFD I guess this should not be the bottleneck, especially when using all 64 cores. Another reason could be that Ubuntu 18.04 with kernel 5.4 is too old for Ice Lake-SP, but I faced some difficulties installing ParaView and the ASpeed graphics driver on 20.04, so I stopped the effort and went back to 18.04. Quote:
HTML Code:
https://images.anandtech.com/doci/16594/STREAM-8380.png |
||
August 14, 2021, 14:30 |
|
#412 |
New Member
Alexander Kazantcev
Join Date: Sep 2019
Posts: 24
Rep Power: 7 |
Xeon Silver 4314 (Ice Lake-SP, 3rd gen Scalable, 10 nm), 2900 MHz on all cores, RAM 2666 with 8 DIMMs in 8 channels, NUMA on, HT on (with it off the results are the same), BIOS power profile "Power (save)".
OpenFOAM v1806, openmpi 2.1.3, Puppy Linux Fossa, mitigations=off (with them on the results are about the same).
cores    flow     mesh
 1      649.63   16m40.79s
 2      369.84   12m3.332s
 4      204      7m5.208s
 6      148.87   5m17.678s
 8      123.86   4m23.601s
12       99.57   3m41.038s
16       85.1    3m18.140s
20       94.18   4m20.113s
24       89.79   3m37.573
28       86.99   3m49.439s
32       84.81   3m50.380s |
|
August 16, 2021, 23:57 |
Dual AMD EPYC 7313 in 4 channel mode
|
#413 |
New Member
Dmitry
Join Date: Feb 2013
Posts: 29
Rep Power: 13 |
Dual AMD EPYC 7313 (2x16 cores), 8 x 32 GB DDR4 ECC 3200 MHz Micron MTA18ASF4G72PDZ-3G2E1 (in 4-channel mode, NUMA NPS4). Gigabyte MZ72-HB0 (rev. 3.0) motherboard, dual Noctua NH-U14S TR3-SP3 coolers. OpenFOAM 2106, OpenSUSE Leap 15.3.
# cores - Solver Wall time (s):
------------------------------------------
 1 - 731.2
 2 - 444.28
 4 - 199.06
 6 - 126.95
 8 - 96.3
12 - 69.26
16 - 56.43
20 - 49.85
24 - 47.59
32 - 51.07
On the same system, OpenFOAM 2106, mingw version, Windows Server 2019.
# cores - Solver Wall time (s):
------------------------------------------
16 - 127.2
32 - 66.8 |
|
October 7, 2021, 03:03 |
|
#414 | |
New Member
Alexander Kazantcev
Join Date: Sep 2019
Posts: 24
Rep Power: 7 |
Quote:
But in OpenFOAM the 3800X is faster than the 5600X. I think the 5900X will be up to 2 times faster because of its full-speed CPU-to-memory data bus. But right now there is no reason to buy a 5900X: the 5600X stays cool after overclocking to 4850 MHz, so the workstation is quiet. Terrific! Last edited by AlexKaz; October 7, 2021 at 08:31. |
||
October 7, 2021, 03:35 |
|
#415 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
I doubt the "2x faster" hypothesis for a Ryzen 5900X.
What you get compared to a 5600X is twice the amount of L3 cache (which is good) and twice the WRITE memory bandwidth, which is not as awesome as it sounds. Reads are what matters, and they are basically the same with one or two CCDs on Zen 3 Ryzen. |
|
October 11, 2021, 22:02 |
|
#416 |
Member
Kailee
Join Date: Dec 2019
Posts: 35
Rep Power: 6 |
Ok - I finally got my 4x 4627v2's... Box is the same as in June with the 2690v2's (DL560G8, 16x16Gb DDR1866, data remote via NFS over 10GbE).
Code:
Threads   Tmesh   Tsim      It/s   Wmesh   Wsim   kWh
 1        1705    1038.75   0.10   204     212    0.061
 2        1164     539.05   0.19   211     221    0.033
 4         639     215.99   0.46   314     339    0.020
 8         359     114.26   0.88   341     388    0.012
16         239      67.06   1.49   421     507    0.009
24         208      54.18   1.85   494     607    0.009
32         219      49.56   2.02   570     710    0.010
Code:
Threads   Tmesh   Tsim    It/s   Wmesh   Wsim   kWh
20         231     89.36   1.12   326     399    0.010
[EDIT] Added it/s and kWh for efficiency comparison[/EDIT] Last edited by Kailee71; October 12, 2021 at 07:24. |
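The kWh column is consistent with Wsim x Tsim converted to kilowatt-hours (my check, assuming the tabulated energy is for the solver run alone, not mesh plus solve):

```python
def sim_energy_kwh(watts, seconds):
    """Solver-only energy: W * s -> kWh."""
    return watts * seconds / 3.6e6

# single-thread row: 212 W for 1038.75 s
print(round(sim_energy_kwh(212, 1038.75), 3))   # 0.061
```

This makes the efficiency sweet spot easy to read off: per-run energy bottoms out around 16-24 threads even though 32 threads is slightly faster.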
|
October 28, 2021, 07:37 |
|
#417 | |
Member
|
That's solid performance already, considering the bandwidth limits.
Would you mind providing more details on the OF installation? By native ARM64, do you mean building from source directly on macOS, or on an arm64 Linux virtual machine? I am very curious, since the new M1 Max has much higher memory bandwidth, and the new MacBook, although quite expensive, is still much cheaper than a 2-way workstation. But I have no idea how much work is involved in porting the code.. Quote:
|
||
October 28, 2021, 15:47 |
|
#418 | |
Senior Member
Join Date: Jun 2016
Posts: 102
Rep Power: 10 |
Quote:
https://github.com/BrushXue/OpenFOAM-AppleM1 |
||
October 28, 2021, 23:49 |
|
#419 | |
Member
|
Thanks a lot xuegy.
Although I have literally zero experience with macOS.. If I manage to get my hands on an M1 Max and find a solution, I will post the benchmark results, which I am also looking for myself. Quote:
|
||
October 28, 2021, 23:59 |
|
#420 |
Senior Member
Join Date: Jun 2016
Posts: 102
Rep Power: 10 |
I can’t wait to see the rumored dual M1 Max Mac Pro next year. Given the memory bandwidth per USD, Apple is not expensive at all.
|
|
|
|