|
128-core cluster for Ansys Fluent: E5-2697A vs E5-2667 |
|
November 30, 2017, 09:30 |
128-core cluster for Ansys Fluent: E5-2697A vs E5-2667
|
#1 |
New Member
Join Date: Nov 2017
Posts: 5
Rep Power: 8 |
Hello dear fellows,
I'm a member of the aerodynamics department at an engineering company. We perform CFD calculations using Ansys Fluent. Our computational models typically have about 10-40 million cells, but sometimes go up to 80 million. We are currently looking for a new server to speed up our calculations using up to 128 cores (i.e. 3 HPC packs). I have chosen the following configuration for our request:

CPU: 4 nodes, each with 2x Xeon E5-2697A v4 (16 cores, 2.6 GHz)
RAM: 4 x 32 GB per node, 512 GB total
Storage: 480 GB SSD per node + 2 TB HDD
Interconnect: 56 Gb/s InfiniBand
Price: around 50k USD

But then I found a topic on this forum with a similar problem: "128 core cluster E5-26xx V4 processor choice for Ansys FLUENT". The final choice there was an 8-node cluster with E5-2667 CPUs (8 cores, 3.2 GHz each). As far as I understood, that choice was based on the fact that each core then gets a larger share of the memory channels than with the 16-core 2697A. The price of an 8-node cluster with E5-2667 is around 68k USD, so 36% more expensive.

So I'm wondering whether the speedup of the 8-core configuration is really worth the extra money. Are there any test results, or does someone have first-hand experience with different clusters? Any help would be appreciated indeed.

Thank you in advance!
Kind regards, Mike.
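The per-core bandwidth argument behind the E5-2667 option can be sketched with quick arithmetic. This is a rough back-of-the-envelope estimate, not a measured figure: channel counts and DDR4-2400 speed are from Intel's public specs, both CPUs are Broadwell-EP with 4 memory channels per socket, and real-world efficiency losses are ignored.

```python
# Theoretical peak memory bandwidth per socket and per core,
# assuming 4 DDR4-2400 channels per Broadwell-EP socket,
# 8 bytes transferred per channel per cycle.
def peak_bw_gbs(channels, mega_transfers_per_s):
    return channels * mega_transfers_per_s * 8 / 1000  # GB/s per socket

socket_bw = peak_bw_gbs(4, 2400)   # 76.8 GB/s, identical for both CPUs

per_core_2697a = socket_bw / 16    # E5-2697A v4: 16 cores -> 4.8 GB/s per core
per_core_2667 = socket_bw / 8      # E5-2667 v4:   8 cores -> 9.6 GB/s per core
print(per_core_2697a, per_core_2667)
```

Since Fluent is typically memory-bandwidth bound at this scale, the 8-core chip gives each core twice the theoretical bandwidth, which is the rationale behind the more expensive 8-node layout.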
|
November 30, 2017, 11:02 |
|
#2 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,426
Rep Power: 49 |
No need to spend extra money, at least not for "outdated" CPUs. Intel already released their new "Skylake-SP" processors and they should be available from all vendors.
Their main advantage for CFD compared to the "Broadwell" predecessors: 6 instead of 4 memory channels, with support for DDR4-2666 instead of DDR4-2400. The price premium for quad-socket capable processors was also lowered: all Xeon Gold 6xxx processors have 3 UPI links, which makes them suitable for quad-socket setups.

The current "high-end" recommendation for an Intel-based 128-core cluster would be: four nodes with 4x Intel Xeon Gold 6144 (8 cores each).

Cheaper (slower) solutions are always possible, e.g.:
- Xeon Gold 6134 (8 cores) instead of 6144
- three nodes with 4x 12-core processors
- two nodes with 4x 16-core processors
- four nodes with 2x 16-core processors...

You get the picture. Which one is best for you depends on your budget and the prices your hardware vendor quotes. But to make my point once more: do not buy Xeon v4 processors any more.

Btw: Ansys publishes benchmark data for various test cases and hardware: http://www.ansys.com/solutions/solut...ent-benchmarks
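The Broadwell-to-Skylake advantage quoted above can be put in numbers with the same kind of peak-bandwidth estimate (channel counts and DDR4 speeds as stated in the post; theoretical peaks, not measurements):

```python
# Per-socket theoretical peak memory bandwidth: Broadwell-EP vs Skylake-SP.
def peak_bw_gbs(channels, mega_transfers_per_s):
    return channels * mega_transfers_per_s * 8 / 1000  # 8 bytes per transfer

broadwell = peak_bw_gbs(4, 2400)   # 76.8 GB/s
skylake = peak_bw_gbs(6, 2666)     # ~128 GB/s
gain = skylake / broadwell - 1     # ~0.67, i.e. roughly two thirds more bandwidth
print(broadwell, skylake, gain)
```

That ~67% jump in peak bandwidth per socket is why the moderator advises against buying v4 (Broadwell) parts for a bandwidth-bound code like Fluent.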
|
December 1, 2017, 06:54 |
|
#3 |
New Member
Join Date: Nov 2017
Posts: 5
Rep Power: 8 |
flotus1, thank you for your reply.
Unfortunately, hardware suppliers in my region (I live in Russia) do not yet sell servers with 4x Xeon Gold CPUs. So I think I will look for 2x Xeon Gold 6144 servers; that looks like a good, scalable solution in my case.
|
December 1, 2017, 07:26 |
|
#4 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,426
Rep Power: 49 |
Since you have a decent InfiniBand interconnect, dual-socket nodes will work just fine. In this case you could even settle for the cheaper Xeon Gold 5xxx processors, since you don't need that many UPI links. The only downside is that these CPUs also have lower clock speeds.
Make sure to get a correct memory configuration for those Skylake-SP processors, i.e. 6 DIMMs per CPU. Apparently, not all vendors recommend this by default.
|
December 1, 2017, 08:20 |
|
#5 |
New Member
Join Date: Nov 2017
Posts: 5
Rep Power: 8 |
Concerning the memory configuration: as far as I understood, it's better to have as many memory controllers as possible, to increase the total memory bandwidth. For example, with two Xeon Gold 61xx I would need to buy 24 DDR4-2666 modules (8 GB each; 192 GB per node should be sufficient, I guess). Please correct me if I'm wrong.
December 1, 2017, 08:47 |
|
#6 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,426
Rep Power: 49 |
You are partly correct: for CFD you want as many memory controllers as possible, or memory channels, to be more precise. But a single Skylake-SP CPU has only two memory controllers with three channels each, so 6 channels in total. Populating more than one DIMM per channel is useless; in fact, it might even lower the maximum supported memory speed, and thus the bandwidth, compared to one DIMM per channel, depending on the type of memory and motherboard you use.
So for a dual-socket node, you want 12 DIMMs in total. Preferably dual-rank RDIMMs if possible.
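The DIMM arithmetic above, spelled out as a quick sketch (the 16 GB module size is an illustrative assumption chosen to hit Mike's 192 GB-per-node target, not something from the thread):

```python
# Balanced memory population for a dual-socket Skylake-SP node:
# one DIMM per channel, 6 channels per CPU, 2 CPUs per node.
sockets = 2
channels_per_cpu = 6
dimms_per_channel = 1  # extra DIMMs per channel add capacity, not bandwidth

dimms_per_node = sockets * channels_per_cpu * dimms_per_channel  # 12 DIMMs
capacity_gb = dimms_per_node * 16  # 192 GB with hypothetical 16 GB RDIMMs
print(dimms_per_node, capacity_gb)
```

So the 24-module count from the earlier post would mean two DIMMs per channel: same capacity is reachable with 12 larger DIMMs without risking a memory-speed downgrade.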
|
December 1, 2017, 09:59 |
|
#7 |
New Member
Join Date: Nov 2017
Posts: 5
Rep Power: 8 |
flotus1, Thank you! It's much clearer after your explanation.
|
|
|
|
Similar Threads
Thread | Thread Starter | Forum | Replies | Last Post |
looking for a smart interface matlab fluent | chary | FLUENT | 24 | June 18, 2021 09:07 |
128 core cluster E5-26xx V4 processor choice for Ansys FLUENT | F1aerofan | Hardware | 30 | January 19, 2018 03:53 |
Problem in using parallel process in fluent 14 | Tleja | FLUENT | 3 | September 13, 2013 10:54 |
problem in using parallel process in fluent 14 | aydinkabir88 | FLUENT | 1 | July 10, 2013 02:00 |
Fluent on a Windows cluster | Erwin | FLUENT | 4 | October 22, 2002 11:39 |