February 6, 2018, 06:19 |
Xeon Gold workstation config.
|
#1 |
Senior Member
Blanco
Join Date: Mar 2009
Location: Torino, Italy
Posts: 193
Rep Power: 17 |
Hello everyone,
I'm looking for suggestions for a new CFD-3D workstation. I must use Xeon Gold processors, so there's no room for AMD Epyc here. The workstation will be used for CFD-3D; assume an unlimited number of paid licenses. The number of cells will vary between 1e6 and 10e6 and will only rarely reach 40e6. The simulations, however, will involve combustion with detailed chemistry (quite heavy, hundreds of species), RANS, and also LES/DDES in the near future. We also simulate sprays, moving meshes, etc.

As far as the CPU is concerned, I'm considering the following alternatives:
- Opt 1: 2x Intel Xeon 6136 3.0 GHz 2666 MHz 12C -- reference cost -- 24 cores
- Opt 2: 2x Intel Xeon 6148 2.4 GHz 2666 MHz 20C -- cost +16% -- 40 cores
- Opt 3: 2x Intel Xeon 6154 3.0 GHz 2666 MHz 18C -- cost +29% -- 36 cores

All of these share the same L3, 24.75 MB. Opt 2 has the highest number of cores but the lowest base frequency of the three, which I suppose is the reason for its lower cost compared to Opt 3. Opt 3 has 50% more cores than Opt 1 and costs about 30% more, which seems reasonable to me. Therefore I would go with Opt 3, with the aim of getting a high core count. Do you have any suggestions?

There is actually another option:
- Opt 4: 2x Intel Xeon 6152 2.1 GHz 2666 MHz 22C -- cost +31% -- 44 cores

The L3 in this case is larger, 30 MB, but the base frequency is very low, so I would not consider this option. Am I wrong?

Other specs for the workstation config:
- Linux OS (SUSE/Ubuntu LTS)
- 96 GB (12x 8 GB) DDR4 2666 MHz ECC Reg RAM (dual-rank DIMMs if possible)
- NVIDIA Quadro P1000 4GB
- 10GbE LAN
- 2x 1 TB 7200 RPM SATA Enterprise or SAS (15k possibly)

OT: has anyone ever used SSDs for CFD-3D? If yes, M.2 or PCIe? Did you see any clear advantage?

Do you know any supplier of 4-socket workstations? If yes, please PM.

Thanks!
Regards
Last edited by Blanco; February 6, 2018 at 10:36. |
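The cost reasoning in post #1 ("50% more cores for about 30% more cost") can be checked with a few lines of Python. This is only a sketch: it assumes the quoted percentages apply to the whole CPU option and normalises everything to Opt 1. The names `OPTIONS`, `cost_per_core`, and `summarize` are made up for this illustration.

```python
# Options from post #1: (label, cores per CPU, cost relative to Opt 1).
# Assumption: the quoted percentages are the relative cost of each CPU pair.
OPTIONS = [
    ("Opt 1: 2x Gold 6136", 12, 1.00),
    ("Opt 2: 2x Gold 6148", 20, 1.16),
    ("Opt 3: 2x Gold 6154", 18, 1.29),
    ("Opt 4: 2x Gold 6152", 22, 1.31),
]
SOCKETS = 2


def cost_per_core(cores_per_cpu, relative_cost, sockets=SOCKETS):
    """Relative cost divided by the total number of cores in the machine."""
    return relative_cost / (cores_per_cpu * sockets)


def summarize(options=OPTIONS):
    """Return (label, total cores, cost per core relative to the first option)."""
    reference = cost_per_core(options[0][1], options[0][2])
    rows = []
    for label, cores, cost in options:
        rows.append((label, cores * SOCKETS, cost_per_core(cores, cost) / reference))
    return rows


if __name__ == "__main__":
    for label, total_cores, rel in summarize():
        print(f"{label}: {total_cores} cores, cost per core {rel:.2f}x Opt 1")
```

Cost per core says nothing about how well the extra cores scale, which is the point raised in the replies below.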
|
February 6, 2018, 08:50 |
|
#2 |
Member
Knut Erik T. Giljarhus
Join Date: Mar 2009
Location: Norway
Posts: 35
Rep Power: 22 |
In my post in the thread "OpenFOAM benchmarks on various hardware" there are some benchmarks of the 6148 processor. As you will see, scaling is poor past 16 cores. So even though you didn't mention it, I would say the 6130 processor is the best compromise among the Intel processors. |
|
February 6, 2018, 09:40 |
|
#3 | |
Senior Member
Blanco
Join Date: Mar 2009
Location: Torino, Italy
Posts: 193
Rep Power: 17 |
Thanks for the post, I've just read your benchmark results. Maybe I'm getting something wrong, but the parallelization seems poor even when not all the cores were used (<20), which seems quite strange in my experience. Did you have all six RAM channels populated? What were the other workstation specs? Thanks! |
||
February 6, 2018, 10:16 |
|
#4 |
Senior Member
Blanco
Join Date: Mar 2009
Location: Torino, Italy
Posts: 193
Rep Power: 17 |
Sorry, yours are single-machine simulation times; that's why the parallel efficiency is well below 1 when increasing the number of cores above 1, right?
I would add:
- The Gold 6130 has 16C but a base frequency of 2.1 GHz, which seems quite low to me; its L3 per core is in line with the options above. Its theoretical clock is 33.6 GHz, which is the lowest among the options above (36/48/54/46).
- Could 2x Gold 6144 3.5 GHz 8C or 2x Gold 6146 3.2 GHz 12C be reasonable alternatives, even with a limited number of cores? Their theoretical clock is low, however: 28 and 38 GHz.

I'm still unsure whether a higher number of cores is the best choice, considering that I will "always" run on the highest number of cores available.
Last edited by Blanco; February 6, 2018 at 11:56. |
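The "theoretical clock" figures in post #4 are simply cores times base frequency per CPU. A small sketch that reproduces them from the core counts and base clocks quoted in this thread (the names `CPUS` and `aggregate_clock_ghz` are illustrative only):

```python
# (cores, base clock in GHz) per CPU, as quoted in posts #1 and #4.
CPUS = {
    "Gold 6130": (16, 2.1),
    "Gold 6136": (12, 3.0),
    "Gold 6144": (8, 3.5),
    "Gold 6146": (12, 3.2),
    "Gold 6148": (20, 2.4),
    "Gold 6152": (22, 2.1),
    "Gold 6154": (18, 3.0),
}


def aggregate_clock_ghz(cores, base_ghz):
    """Sum of base clocks over all cores of one CPU (the post's 'theoretical clock')."""
    return cores * base_ghz


if __name__ == "__main__":
    # Sort from lowest to highest aggregate clock.
    for name, (cores, ghz) in sorted(CPUS.items(), key=lambda kv: aggregate_clock_ghz(*kv[1])):
        print(f"{name}: {cores}C x {ghz} GHz = {aggregate_clock_ghz(cores, ghz):.1f} GHz")
```

This metric ignores memory bandwidth and turbo behaviour, so it is best read as an upper bound on compute rather than a prediction of CFD run time.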
|
February 7, 2018, 06:40 |
|
#5 |
Member
Knut Erik T. Giljarhus
Join Date: Mar 2009
Location: Norway
Posts: 35
Rep Power: 22 |
It's a single-socket machine, yes. The memory is 6 x 16 GB 2666 MHz. But as I also showed in the other thread, even on a dual socket machine you would not see a parallel efficiency of 1 due to the memory being a bottleneck.
The Gold 6142 is the same as the 6130 only with a higher base frequency. The question I guess is whether a 50% increase in price is worth it. The same for the 6144/6146, I do not think the increase in frequency (and price!) is worth it as long as you are able to utilize all the cores on the 6130. The cache difference is minimal. It would be nice to have some more benchmarks, though. |
|
February 8, 2018, 10:06 |
|
#6 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
Puget Systems put together an article that tries to make some sense of the mess that is Intel's Skylake-SP lineup: https://www.pugetsystems.com/labs/hp...rs-Guide-1077/
They somehow managed to get non-AVX all-core turbo frequencies, which are much more relevant than base clock or single-core turbo. Towards the end, they recommend a handful of processors for memory-bound applications with larger cache per core. These are the ones to pick for maximum performance, at least when paying for licenses. Performance per dollar is a different topic entirely. I would not recommend CPUs with more than ~16 cores even if you are not on a per-core licensing scheme.
In terms of storage, I think the time for 15k HDDs is up for most applications. They are expensive, loud, and still bad for smaller chunks of data. Get an SSD instead. SATA/SAS or PCIe depends on your budget; anything is faster than spinning disks. HDDs are for long-term storage of larger data after you have finished running and post-processing your results. Here 7200 rpm or even 5400 rpm drives are good enough. |
|
February 8, 2018, 11:39 |
|
#7 | ||
Senior Member
Blanco
Join Date: Mar 2009
Location: Torino, Italy
Posts: 193
Rep Power: 17 |
I've found the source for the frequencies used in the article you linked; they come from Intel: https://www.intel.com/content/www/us...ec-update.html
I think my CFD-3D app is using AVX-512. It's a commercial CFD-3D SW, and I've found a benchmark where AVX-512 was explicitly cited as "useful" for simulations performed with this SW. Considering this, if I try to create a "cost/performance" rank of the processors, computing the "performance index" as in the linked article, I get:
- 1st place: 6140 18C -- performance index w/ AVX512 604
- 2nd place: 6148 20C -- performance index w/ AVX512 704
- 3rd place: 6154 18C -- performance index w/ AVX512 777

All the other Gold processors I've checked (range 6128-6152) have a higher cost/performance ratio and a lower performance index. Among all the processors I checked, the 6154 shows the highest performance index w/ AVX512.
If, however, I consider NON-AVX performance, I get:
- 1st place: 6140 18C -- NON-AVX performance index 864
- 2nd place: 6130 16C -- NON-AVX performance index 716
- 3rd place: 6148 20C -- NON-AVX performance index 992

The 6154 is in 4th place, but it still has the second-highest NON-AVX performance index (1065).
I'll look more closely at the SW I'm using to confirm that it actually uses AVX512, but in any case it seems to me that the 6140 is the best solution in terms of cost/performance, while the 6154 is probably the best "high performance" solution. I think I'll go with the 6154 if not constrained by the total cost. This processor is also suggested in the article for both NON-AVX and AVX-512 workloads.
As a side note from the article: "The 5122 stands out as the processor with the highest AVX-512 All-Core-Turbo. It is just 4 cores but they are all running full speed. It also has the largest per core Cache. This could be a good processor for memory bound programs and/or programs that don't have good parallel scaling but do take advantage of AVX512 vectorization."
-> This could be a good solution for me if I were considering more than one workstation, but at the moment I'm looking for a single workstation.
|
|||
February 8, 2018, 12:18 |
|
#8 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
I cannot over-emphasize how useless such a "performance index", aka aggregate CPU frequency, is, especially when it comes to CFD. With high core count processors like these, you will hit the wall called the memory bandwidth bottleneck long before their theoretical compute performance becomes relevant, rendering all high core count processors more or less equal in terms of performance.
As a more realistic performance estimate I would recommend a model based on Amdahl's law with a parallel efficiency p of ~97%. It does not model the actual cause for sub-linear speedup in CFD, but fits the results pretty well: high core count processors are a waste of money. Same applies to AVX and its newer variants. Ansys/Intel also mentioned AVX512 in one of their marketing presentations surrounding Skylake-SP. But as a matter of fact, most of the performance increase they found is caused by the increase in memory performance. AVX doesn't help much in memory bandwidth bound applications. But in the end it doesn't really matter, all processors you are currently looking at have the same AVX capabilities enabled. So if you should encounter a workload that actually benefits from AVX, they are up to the task. |
|
February 8, 2018, 13:45 |
|
#9 | |
Senior Member
Blanco
Join Date: Mar 2009
Location: Torino, Italy
Posts: 193
Rep Power: 17 |
Effective cores = 1/(1 - eff + eff/real_core_n)
Then we could compute the performance_index using this effective number of cores. I've tried this, and the best CPU for cost/perf ratio (AVX and NON-AVX) from this analysis is the 6126 12C, if I did things correctly. From the performance perspective, however, the 6154 18C is still the best solution, and it is in 5th position in the cost/perf rank... Maybe I'll post the table if I have time to arrange it properly |
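The "effective cores" expression from post #9 is easy to tabulate. The sketch below assumes the 97% figure suggested in post #8 as the default; the function name `effective_cores` is made up for this example.

```python
def effective_cores(n_cores, eff=0.97):
    """Effective cores = 1 / (1 - eff + eff / n_cores), as written in post #9."""
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    if not 0.0 <= eff <= 1.0:
        raise ValueError("eff must be between 0 and 1")
    return 1.0 / ((1.0 - eff) + eff / n_cores)


if __name__ == "__main__":
    # Total core counts of the dual-socket options discussed in this thread.
    for n in (16, 24, 32, 36, 40, 44):
        print(f"{n} real cores -> {effective_cores(n):.1f} effective cores")
```

With eff = 0.97, 24 real cores behave like about 14 effective cores and 40 real cores like about 18, and the value can never exceed 1/(1 - eff), roughly 33. That saturation is why the model penalises very high core counts.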
||
February 8, 2018, 15:19 |
|
#10 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
As I said, Amdahl's law with a scaling of 97% models this behavior pretty well despite the fact that it was invented to model parallel sections of a code. For some cases it might be 98%, in most cases it is lower.
In case you are struggling with Amdahl's law: the formula to calculate an estimated performance index P_N with N cores is $P_N = \frac{f_N}{(1 - p) + p/N}$. Here f_N is the CPU frequency while running with N cores and p is the parallel efficiency, typically between 0.9 and 0.98. Last edited by flotus1; February 9, 2018 at 10:04. |
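A minimal sketch of the performance index described in post #10, assuming the usual Amdahl form P_N = f_N / ((1 - p) + p/N). The frequencies used here are the base clocks from post #1 and serve only as placeholders; the thread recommends all-core turbo frequencies for f_N, which are not reproduced here. `performance_index` and `CANDIDATES` are names invented for this example, and N is taken as the total core count of a dual-socket machine.

```python
def performance_index(f_n_ghz, n_cores, p=0.97):
    """Estimated performance index P_N = f_N / ((1 - p) + p / N)."""
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return f_n_ghz / ((1.0 - p) + p / n_cores)


# Placeholder inputs: (f_N in GHz, total cores N).
# Assumption: base clocks from post #1 stand in for the all-core turbo frequency.
CANDIDATES = {
    "2x Gold 6136 (24 cores)": (3.0, 24),
    "2x Gold 6148 (40 cores)": (2.4, 40),
}

if __name__ == "__main__":
    for p in (0.90, 0.97, 0.98):
        for name, (f_n, n) in CANDIDATES.items():
            print(f"p={p:.2f}  {name}: P_N = {performance_index(f_n, n, p):.1f}")
```

With these placeholder inputs the ranking flips with p: at p = 0.90 the 24-core, higher-clocked option comes out ahead, while at p = 0.98 the 40-core option does. The result is therefore only as good as the estimate of p for the actual solver and case.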
|
February 13, 2018, 04:34 |
|
#11 |
Senior Member
Blanco
Join Date: Mar 2009
Location: Torino, Italy
Posts: 193
Rep Power: 17 |
I finally found time to collect the info I gathered through this topic. I'm posting the summary tables I created using the CPU costs from the Intel website (my supplier has slightly different costs, but the resulting rank is still the same).
In the attachments you will find:
- Performance rank: performance.jpg
- Cost over performance rank: cost_ov_perf.jpg
- Performance over core rank: perf_ov_core.jpg |
|
February 13, 2018, 08:12 |
|
#12 |
Senior Member
Gwenael H.
Join Date: Mar 2011
Location: Switzerland
Posts: 392
Rep Power: 20 |
Thanks Blanco
I'm also currently looking for a new workstation configuration with similar requirements so this topic is a great source of information, also mentioning Alex's valuable answers in several other threads |
|
April 26, 2019, 10:23 |
|
#13 |
New Member
Sibel
Join Date: Apr 2017
Posts: 18
Rep Power: 9 |
Hello everyone
I need your help with my configuration. We will buy a workstation for our CFD group at the university. We will mostly model two-phase flow.
- CPU: 2x Intel Xeon Gold 6140 or 2x Intel Xeon Gold 6148
- RAM: 64GB LRDIMM Samsung DDR4-2666, CL19, reg. ECC (8x 64GB = 512GB) or 8x 32GB = 256GB
- GPU: NVIDIA Quadro P2000 5 GB GDDR5
- Mainboard: ASUS WS C621E Sage, Dual So. 3647; E-ATX
- SSD: 512GB Samsung 970 Pro, M.2 PCIe (MZ-V7P512BW)
- HDD: 6TB Seagate IronWolf Pro NAS, SATA3, 7200RPM (ST6000NE0023)

And some questions:
1) Do you have a suggestion for the cooling system?
2) Does the GPU provide additional performance for academic studies? Or should the money go to the CPU instead?
3) I've read about AVX512, but I don't really understand it. It seems that it has both negatives and positives?
4) Should I use an SSD for the ANSYS installation? If so, is my choice sufficient?

Many thanks |
|
May 3, 2019, 14:42 |
|
#14 | |
New Member
Joshua Brickel
Join Date: Nov 2013
Posts: 26
Rep Power: 13 |
I would actually suggest you look more closely at the CPU you want. You may get the same bang for your buck with a slightly lower number of cores. I recently posted something on this. My experience is that CFX does not gain any advantage from a high-end graphics processor for solving. But if you are going to use Fluent, then it might be different. If you are doing transient runs (and recording transient information), then it might be useful to have at least an SSD for the solution drive. You might also want to consider an SSD on an NVMe bus. |
||
June 8, 2019, 02:55 |
|
#15 | |
Member
Ivan
Join Date: Oct 2017
Location: 3rd planet
Posts: 34
Rep Power: 9 |
We bought 2x 7351 with Supermicro.
- RAM: same
- GPU: no
- Overclocking: yes, to about 2.6-2.7 GHz (at 2.8-2.9 GHz we had errors after 25-30 hours), but I know of 2 computers with the 7351 that run stable at 3.0-3.1 GHz.
- Cooling: Noctua, plus we installed a large, powerful Mitsubishi air conditioning system to keep the server room at +15 C.

P.S. Before the Mitsubishi air conditioning we had some experience with different water cooling systems, both in-case and near-case. They need too much time for maintenance, and normally once per year you need to change O-rings, add liquid, etc. |
||