|
January 16, 2018, 02:05 |
|
#21 |
Member
Ivan
Join Date: Oct 2017
Location: 3rd planet
Posts: 34
Rep Power: 9 |
We found this benchmark:
http://www.ansys.com/solutions/solut...ntrifugal-pump So as I understand 16 cores AMD Epyc CPU has best efficiency - 100% for CFD. No need to buy CPU with more cores, better to buy second computer. For Intel - better to buy 32 cores CPUs. But I do not understand why there is no 6,8,12 cores CPUs in this benchmark. |
|
January 16, 2018, 05:50 |
|
#22 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
You might have misinterpreted the benchmark results.
Those "100%" are just the baseline parallel efficiency. The other results are normalized to the performance at this data point. Parallel efficiency below 100% is a normal result for scaling on a single node. Look at the core solver rating for a less confusing performance metric: higher numbers mean better performance, and they are comparable across all data points. All you can take from the parallel efficiency: if you pay for your licenses, do not buy the high-core-count CPU models, neither from Intel nor from AMD. Instead, get more nodes with low-to-medium core count CPUs. Overall, I would take this benchmark result with a grain of salt. The platform description is pretty minimalistic, to say the least. |
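To make the relationship between solver rating and parallel efficiency concrete, here is a small sketch; the numbers are made up for illustration and are not taken from the ANSYS benchmark:

```python
def parallel_efficiency(rating, cores, baseline_rating, baseline_cores):
    """Parallel efficiency relative to a baseline data point, in percent.

    Solver rating counts jobs per day, so higher is better and the
    rating itself is directly comparable across data points.
    """
    speedup = rating / baseline_rating
    ideal_speedup = cores / baseline_cores
    return 100.0 * speedup / ideal_speedup

# Hypothetical example: baseline of 16 cores with rating 100;
# 32 cores reach rating 180 instead of the ideal 200.
print(parallel_efficiency(180, 32, 100, 16))  # 90.0
```

This is why a sub-100% efficiency on a single node is unremarkable: the rating still went up, just not linearly with core count.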
|
March 7, 2018, 06:04 |
|
#23 |
Senior Member
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 533
Rep Power: 20 |
Hi Alex,
are there any new observations from running this machine? I plan to buy a dual-processor Epyc 7351 machine. What is your experience with the noise? Is it a machine you could place under the desk? |
|
March 7, 2018, 06:25 |
|
#24 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
I would say "so far, so good". I have nothing negative to say about AMD Epyc in general, with the exception that the CPU architecture has one or two drawbacks alongside its benefits: single-threaded workloads that require more RAM than one NUMA node has will run rather slowly. The same is generally true on dual-socket Intel machines, but there one NUMA node spans half of the total memory; for AMD it is only one eighth. An issue you should be aware of before deciding which CPU to buy.
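To put numbers on that, here is a quick sketch; "fits locally" simply means the job's memory fits within a single NUMA node:

```python
def memory_per_numa_node(total_gb, numa_nodes):
    """RAM available to each NUMA node, assuming even DIMM population."""
    return total_gb / numa_nodes

def fits_locally(job_gb, total_gb, numa_nodes):
    """True if a single-threaded job can stay within one NUMA node."""
    return job_gb <= memory_per_numa_node(total_gb, numa_nodes)

# 128GB total: a dual-socket Intel machine has 2 NUMA nodes,
# a dual-socket Epyc "Naples" machine has 8 (4 dies per socket).
print(fits_locally(40, 128, 2))  # True  (64GB per node)
print(fits_locally(40, 128, 8))  # False (16GB per node)
```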
In terms of noise, you get what you make of it. It can be just as silent or as annoying as any other workstation. I prefer quiet, so I picked the largest CPU coolers from Noctua and also their highest-quality 140mm case fans. The machine is less noisy than any of our pre-built Dell and HP workstations in the office. There are some issues with Supermicro boards and slow-spinning fans: if the fan rpm drops below 500, the board detects the fan as "stalled" and revs up all fans to maximum in a cycle of a few seconds. There is no solution from Supermicro for this (other than the recommendation to buy high-rpm fans from Supermicro), but there is a workaround that lets you lower the fan thresholds: https://calvin.me/quick-how-to-decre...fan-threshold/ Worked for me. By the way: I am planning to replace the motherboard as soon as any other brand releases dual-socket SP3 boards; ASRock Rack appears to be working on one. The reason is that I have had some issues with this board (see above), and the experiences I had with their end-user support were pretty unpleasant. And then there are some ridiculous design decisions, like using only 2 PCIe lanes for the M.2 port. But don't let that discourage you; I tend to have negative experiences with customer support from many companies when they cannot go off-script. |
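For reference, the linked workaround boils down to lowering the board's lower fan thresholds over IPMI. A sketch in Python: the sensor name "FAN1" and the threshold values are placeholders (check `ipmitool sensor list` for your board), and this assumes `ipmitool` is installed:

```python
import subprocess

def lower_fan_thresholds(sensor, lnr, lcr, lnc, dry_run=True):
    """Build (and optionally run) the ipmitool command that sets the
    lower non-recoverable, lower critical and lower non-critical fan
    thresholds, so slow-spinning fans are not flagged as stalled."""
    cmd = ["ipmitool", "sensor", "thresh", sensor, "lower",
           str(lnr), str(lcr), str(lnc)]
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd

# Example: allow quiet fans to idle at low rpm without tripping the BMC.
print(" ".join(lower_fan_thresholds("FAN1", 100, 200, 300)))
```

With `dry_run=True` the function only returns the command, which makes it easy to review before touching the BMC.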
|
March 7, 2018, 06:50 |
|
#25 |
Senior Member
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 533
Rep Power: 20 |
Many thanks for the info. I'm in contact with Delta Computer. Let's see what they can put together.
I can still use the i7-3960X for single-core jobs. Even after so many years it is still a very fast machine. I also replaced all water coolers with Noctuas some years ago. It's much, much better in terms of reliability and noise. Best regards, Jörn |
|
April 7, 2018, 19:29 |
|
#26 |
New Member
Join Date: Aug 2017
Posts: 9
Rep Power: 9 |
Hello,
Thanks Flotus1 for this interesting topic. I note that the "Samsung 2Rx4 DDR4-2133 reg ECC" is quite expensive. Do you know the performance with 8*16GB RAM? I know it would be better to install 16*8GB than 8*16GB for a total of 128GB on 16 slots, but the second option leaves the choice to upgrade easily. mathieu |
|
April 8, 2018, 05:39 |
|
#27 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
In the first post I wrote that I do not recommend buying this particular memory, but DDR4-2666 instead. The reason I used it was simply that I already had this RAM, bought it back when prices were lower.
Any DDR4 is quite expensive nowadays, but you will get significantly lower parallel performance with only 8 DIMMs installed. |
|
April 8, 2018, 10:22 |
|
#28 |
New Member
Join Date: Aug 2017
Posts: 9
Rep Power: 9 |
Ok thanks for your answer.
I suppose this is particularly true here because the Epyc 7301 has 8 memory channels, not four. I have a workstation with a 4-socket motherboard and 4 AMD Opteron 6380 CPUs, which are quad-channel. For each CPU I have 4*16GB RAM (there are 8 RAM slots per CPU), but I suppose performance on this machine will not increase if I install 8*16GB, because my CPUs are quad-channel and not eight-channel like Epyc. Do you agree? Other question: do you think performance with your Epyc 7301 would decrease with a 16*4GB RAM configuration (because that is only 2GB RAM per core, whereas 4 to 8GB per core is recommended)? Thanks for the help. mathieu |
|
April 8, 2018, 10:34 |
|
#29 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
Then again, fully populating all slots on this particular machine with dual-rank (or quad-rank?) DIMMs will probably decrease performance, because it reduces the memory speed. See the manual of your motherboard.
But one issue specific to AMD Epyc you should be aware of: single-threaded workloads will run slower if they require more memory than is available on one NUMA node. With only 64GB total in a dual-socket Epyc workstation, you will run into this problem more often, because one NUMA node then only addresses 8GB of RAM. I really would not drop down to 4GB DIMMs, because they are significantly more expensive per GB than 8GB or 16GB DIMMs. |
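The difference the channel count makes can be estimated from theoretical peak bandwidth. A back-of-the-envelope sketch (real sustained bandwidth is lower than these peak figures):

```python
def peak_bandwidth_gbs(channels, transfers_mt_s):
    """Theoretical peak memory bandwidth per socket in GB/s.

    Each DDR channel is 64 bits (8 bytes) wide per transfer.
    """
    return channels * transfers_mt_s * 8 / 1000

# Opteron 6380: 4 channels of DDR3-1866 per socket
print(peak_bandwidth_gbs(4, 1866))  # ~59.7 GB/s
# Epyc 7301: 8 channels of DDR4-2666 per socket
print(peak_bandwidth_gbs(8, 2666))  # ~170.6 GB/s
```

Since CFD solvers are typically memory-bandwidth bound, this ratio explains why leaving channels unpopulated hurts parallel performance so much.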
|
April 8, 2018, 12:26 |
|
#30 |
New Member
Join Date: Aug 2017
Posts: 9
Rep Power: 9 |
Thank you for your answer.
Other question: do you know the scalability of a cluster made of two nodes (2*(2*Epyc 7301)), and the type of interconnect you would use (min 10Gb/s, I suppose)? Thanks Mathieu |
|
April 8, 2018, 12:38 |
|
#31 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
Compare two dual-socket machines with Xeon and Epyc, both with 128GB of RAM in total. On the Xeon machine, each of the 2 NUMA nodes has 64GB of RAM, so there are no worries running a single-threaded workload that requires 40GB of RAM. On Epyc, each of the 8 NUMA nodes has 16GB of RAM. Running a single-threaded 40GB job here will result in heavy inter-node communication, slowing down the process significantly due to lower bandwidth and higher latency.
By the way, at the current price Epyc 7281 is a very attractive option if money is tight. Before sacrificing memory channels on Epyc 7301, I would consider this CPU instead. |
|
April 8, 2018, 13:40 |
|
#32 |
New Member
Join Date: Aug 2017
Posts: 9
Rep Power: 9 |
If my simulations need a lot of CPU and not too much RAM (long unsteady simulations, for example), I calculated that a 2-node cluster with 64GB RAM per node is only 20-30% more expensive than 1 node with 256GB RAM. Thanks Mathieu |
|
April 8, 2018, 14:11 |
|
#33 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
Infiniband cards: https://www.ebay.de/itm/40Gbps-low-p...-/152380156089
Cable: https://www.ebay.de/itm/Mellanox-MC2...gAAOSwWG5aoBMl This is the first cable I found; there might be cheaper ones if you search a little more. The price difference between Epyc 7301 and 7281 has become pretty substantial: it was less than 100€ when I bought, now it is more than 300€. https://geizhals.eu/amd-epyc-7301-ps...-a1743454.html https://geizhals.eu/amd-epyc-7281-ps...-a1743436.html Apart from the smaller amount of L3 cache, these CPUs are identical. If I had to buy now, I would be very tempted to get the 7281 instead. If your jobs are really small in terms of cell count, I would start with one workstation and see if it scales properly on 32 cores. Then you can decide whether a second workstation is worth it. |
|
April 9, 2018, 05:29 |
|
#34 |
New Member
Join Date: Aug 2017
Posts: 9
Rep Power: 9 |
|
|
April 9, 2018, 06:05 |
|
#35 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
There are quite a few tutorials on how to set up Infiniband interconnects with various operating systems. Just use the forum search.
I think people hesitate to go Infiniband for three reasons
If you are worried about hardware failures when buying used: in my opinion, you could easily buy additional spare parts and store them in a drawer for quick replacement. Still much cheaper and faster than waiting for a warranty replacement part. |
|
April 9, 2018, 13:21 |
|
#36 |
New Member
Join Date: Aug 2017
Posts: 9
Rep Power: 9 |
Ok thanks for your answer.
Concerning the mainboard: is there a difference between your Supermicro H11DSi and the H11DSi-NT? Is the second one better if I want to upgrade to a second node later? Thanks |
|
April 9, 2018, 13:42 |
|
#37 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
NT has built-in 10Gigabit Ethernet instead of 1Gigabit.
If you want to give Ethernet a try before going Infiniband, NT is the version you want. |
|
April 9, 2018, 16:39 |
|
#38 |
New Member
Join Date: Aug 2017
Posts: 9
Rep Power: 9 |
|
April 27, 2018, 08:19 |
|
#39 |
Member
Join Date: Jul 2011
Posts: 53
Rep Power: 15 |
I've purchased a new compute setup.
It consists of two nodes, each with dual Intel Xeon Gold 6146 CPUs. I don't normally run Fluent, but I want to benchmark the CPUs to give a comparison. See below:

System
CPU: 2x Intel Xeon Gold 6146 (12 cores, 3.9 GHz all-core turbo, 4.2 GHz single-core turbo)
RAM: 12 x 8GB DDR4-2666 ECC (single rank)
Interconnect: 10 GbE
OS: Windows 10 Pro
Fluent: 19.0

1) External Flow Over an Aircraft Wing (aircraft_2m), single precision
INTEL Single Node, 1 core, 10 iterations: 234 s
INTEL Single Node, 24 cores, 100 iterations: 107 s
INTEL Dual Node, 32 cores, 100 iterations: 87 s

2) External Flow Over an Aircraft Wing (aircraft_14m), double precision
INTEL Dual Node, 24 cores, 10 iterations: 101 s
INTEL Dual Node, 32 cores, 10 iterations: 84 s

INCORRECT BENCHMARKS, SEE UPDATED POST: AMD Epyc CFD benchmarks with Ansys Fluent

I'm a little surprised by the poor single-core performance in the aircraft_2m benchmark. Could this be a result of using single-rank memory? The systems are Dell Precision 7920 racks, and unfortunately Dell could only deliver dual-rank memory in 32GB sticks (stupidly expensive!). The memory sticks are properly distributed across the memory slots. As far as I can tell from benchmarking, the system is performing well for CFX, both compared to my old compute setup and compared to published CFX benchmark results. What do you guys think?

Last edited by SLC; May 4, 2018 at 07:47. |
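From the posted timings, one can estimate the scaling efficiency of going from the single-node to the dual-node run. A quick sketch using the aircraft_2m 100-iteration numbers from this post:

```python
def relative_efficiency(t_ref, cores_ref, t, cores):
    """Parallel efficiency of scaling from cores_ref to cores,
    computed from wall-clock times of the same job, in percent."""
    speedup = t_ref / t
    ideal = cores / cores_ref
    return 100.0 * speedup / ideal

# aircraft_2m, 100 iterations: 24 cores -> 107 s, 32 cores -> 87 s
print(round(relative_efficiency(107, 24, 87, 32)))  # 92
```

About 92% efficiency for the 24-to-32-core step across two nodes, which looks reasonable over 10GbE for this case size.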
|
April 27, 2018, 11:45 |
|
#40 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
To be honest, this performance is lower (i.e. the execution times are higher) than I would expect for the kind of hardware you have, both single-core and parallel. At least when comparing against the results in the initial post here. We used different operating systems and different software versions, so there is that...
I doubt that the poor single-threaded performance is linked to the choice of memory. A single thread usually cannot saturate the memory bandwidth on such a system. And even if that were the cause, the performance difference between single- and dual-rank memory is less than 10%. Checklist
|
|
|
|