CPU cache memory

Old   February 26, 2024, 05:14
Default CPU cache memory
  #1
New Member
 
Join Date: Feb 2024
Location: Spain
Posts: 19
Rep Power: 2
Sharku is on a distinguished road
Hello,
When selecting an Intel Xeon CPU for a workstation for FEA using Ansys, I have noticed that the CPU cache tends to increase with the number of cores of the model. Let's assume I have a license that does not allow using more than 8 cores for the FEA. Would it be reasonable to choose a CPU with more than 8 cores just to get more cache?

Old   February 26, 2024, 05:32
Default
  #2
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura about
Maybe not, given the way Intel currently sells their last-level cache, i.e. keeping it at a roughly constant value per core. The benefits of a slightly larger cache are small, so it's "go big or go home".
They have Xeon "MAX" CPUs with a ton of HBM inside the CPU package, but the prices are a bit excessive IMO.

The AMD Epyc 9174F is a 16-core CPU with 256 MB of L3 cache and a high core clock frequency.
And then there are the "X3D" variants, which were specifically designed for workloads that benefit from large caches, like FEA and CFD. AMD Epyc 9184X: 16 cores, 768 MB of L3 cache.

Old   February 26, 2024, 05:46
Default
  #3
New Member
 
Join Date: Feb 2024
Location: Spain
Posts: 19
Rep Power: 2
Sharku is on a distinguished road
Thank you, flotus1. Assuming only 8 cores will be used, do you think the AMD Epyc 9174F (4.4 GHz, 256 MB cache) would perform better for FEA than the Intel Xeon W7-2495X (4.8 GHz, 45 MB cache), because of the much larger cache of the former?

Old   February 26, 2024, 06:18
Default
  #4
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura about
Using anything but Intel has been a gamble with FEA solvers in the past.
But not any more it seems. Ansys is providing slides where AMD CPUs come out on top:
https://www.ansys.com/content/dam/am...tion-brief.pdf
So I would probably go with the AMD option if it was my money.

Is this "RAID 0" question still relevant here? Because for out-of-core execution, getting the scratch space fast enough matters more than the CPU. And maximizing memory capacity of course.

Old   February 26, 2024, 06:26
Default
  #5
New Member
 
Join Date: Feb 2024
Location: Spain
Posts: 19
Rep Power: 2
Sharku is on a distinguished road
Quote:
Originally Posted by flotus1 View Post
Using anything but Intel has been a gamble with FEA solvers in the past.
But not any more it seems. Ansys is providing slides where AMD CPUs come out on top:
https://www.ansys.com/content/dam/am...tion-brief.pdf
So I would probably go with the AMD option if it was my money.
Thank you for your helpful suggestions.

Quote:
Originally Posted by flotus1 View Post
Is this "RAID 0" question still relevant here? Because for out-of-core execution, getting the scratch space fast enough matters more than the CPU. And maximizing memory capacity of course.
Sorry, but I'm afraid I don't understand what you mean here. Unfortunately, I still have much to learn about hardware concepts.

Old   February 26, 2024, 07:05
Default
  #6
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura about
This thread right here, also started by you today: Workstation with Raid 0

If the solver is heavily bottlenecked by getting the data from the SSDs, faster CPUs do not help much. The first priority then would be to get the scratch space as fast as possible.
It boils down to this: do your simulations fit into physical memory, or do you want to run out-of-core, i.e. use additional space on the SSD(s) to make up for a lack of memory capacity?
Modern NVMe SSDs can be fast, but they are still at least an order of magnitude slower than RAM.
Theoretical peak memory bandwidth of an AMD Epyc Genoa CPU: 460.8 GB/s
Sequential read speed of an NVMe SSD with 4 PCIe 4.0 lanes: 7000 MB/s, i.e. 7 GB/s
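
As a rough check on the two figures above, the sketch below recomputes the theoretical peak memory bandwidth and compares it with the SSD. It assumes DDR5-4800 memory and an 8-byte data path per channel, neither of which is stated in this thread, and the 100 GB working set is purely hypothetical.

```python
# Theoretical peak memory bandwidth vs. NVMe sequential read speed.
# Assumptions (not from the thread): DDR5-4800 DIMMs, one 64-bit (8-byte)
# data path per memory channel, 12 channels populated.

def peak_memory_bandwidth_gbs(channels, mega_transfers_per_s, bytes_per_transfer=8):
    """Return the theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9

ram_gbs = peak_memory_bandwidth_gbs(channels=12, mega_transfers_per_s=4800)
ssd_gbs = 7000 / 1000  # 7000 MB/s sequential read, as quoted above

ratio = ram_gbs / ssd_gbs

# Hypothetical working set, streamed once from each medium.
working_set_gb = 100
ram_seconds = working_set_gb / ram_gbs
ssd_seconds = working_set_gb / ssd_gbs

print(f"RAM: {ram_gbs:.1f} GB/s, SSD: {ssd_gbs:.1f} GB/s, ratio: {ratio:.0f}x")
print(f"Streaming {working_set_gb} GB once: RAM {ram_seconds:.2f} s, SSD {ssd_seconds:.1f} s")
```

These are best-case sequential numbers; real solvers will not reach either peak, but the gap stays well above one order of magnitude.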

Old   February 26, 2024, 07:12
Default
  #7
New Member
 
Join Date: Feb 2024
Location: Spain
Posts: 19
Rep Power: 2
Sharku is on a distinguished road
Quote:
Originally Posted by flotus1 View Post
This thread right here, also started by you today: Workstation with Raid 0

If the solver is heavily bottlenecked by getting the data from the SSDs, faster CPUs do not help much. The first priority then would be to get the scratch space as fast as possible.
It boils down to this: do your simulations fit into physical memory, or do you want to run out-of-core, i.e. use additional space on the SSD(s) to make up for a lack of memory capacity?
Modern NVMe SSDs can be fast. But they are still at least an order of magnitude slower than RAM.
For an AMD Epyc 9174F with 12 memory channels, I would get 12*32 GB = 384 GB of RAM. I'm not sure yet whether that will be enough RAM to avoid needing significant scratch space on the SSDs for my simulation cases, though; I don't have much experience with that yet. I'm planning to run multiphysics simulations of electric machines, including electromagnetic and thermal behavior, using Ansys Maxwell and Fluent.

Old   February 28, 2024, 04:15
Default
  #8
New Member
 
Join Date: Feb 2024
Location: Spain
Posts: 19
Rep Power: 2
Sharku is on a distinguished road
Is the amount of CPU cache per core very important for complex FEA simulations?

I can see that the Epyc 9174F has 256 MB of L3 cache and a 4.1-4.4 GHz clock frequency, whereas the Epyc 9184X has 768 MB of L3 cache and a 3.55-4.2 GHz clock frequency. That is, the former is better in clock frequency, whereas the latter is better in cache size. Which option could be expected to perform better for demanding FEA simulations?
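
To put the cache question in numbers, here is a small sketch that divides the total L3 cache by the core count for the CPUs mentioned so far. The L3 sizes are the ones quoted in this thread; the 24-core count for the Xeon W7-2495X is an assumption taken from outside the thread. The simple division also ignores that L3 on Epyc is split across chiplets, so a single core cannot necessarily use all of it.

```python
# L3 cache per core for the CPUs discussed in this thread.
# Core count of the Xeon W7-2495X (24) is an assumption, not from the thread.
cpus = {
    "Epyc 9174F": {"cores": 16, "l3_mb": 256},
    "Epyc 9184X": {"cores": 16, "l3_mb": 768},
    "Xeon W7-2495X": {"cores": 24, "l3_mb": 45},
}

# Naive average: total L3 divided by total cores (ignores cache topology).
l3_per_core = {name: spec["l3_mb"] / spec["cores"] for name, spec in cpus.items()}

for name, mb in l3_per_core.items():
    print(f"{name}: {mb:.3f} MB of L3 per core")
```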

Old   February 28, 2024, 06:24
Default
  #9
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura about
The latter will be faster most of the time.
These CPUs with extra-large caches are specifically designed and marketed for this type of workload. Base clock speeds are just that: a baseline. Both CPUs will run at higher boost clock speeds anyway, even with all cores busy.
There used to be tables showing which boost clock speed to expect for which type of code, at different numbers of loaded cores. But I haven't seen those around for any modern CPU. I guess the logic for boost clocks just became too complicated to put into a table.

Old   February 28, 2024, 06:34
Default
  #10
New Member
 
Join Date: Feb 2024
Location: Spain
Posts: 19
Rep Power: 2
Sharku is on a distinguished road
Thank you again, your responses are very helpful.

Regarding the EPYC 9174F (https://www.amd.com/en/products/cpu/amd-epyc-9174f), I can see that the EPYC 9274F (https://www.amd.com/en/products/cpu/amd-epyc-9274f) has 8 more cores, practically the same clock frequency, and yet it is about 25% cheaper. Doesn't the latter seem much preferable to the former?

Old   February 28, 2024, 07:11
Default
  #11
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura about
Oh yeah, they have some weird price scaling going on with the "F" SKUs.
These are for maximum per-core performance in traditional CPU-heavy workloads.
AMD designed their prices around software licensing costs. There must be some extremely expensive and widely used software out there that charges based on cores per socket. Adding up all these costs, it turns out that money can be saved with a 16-core CPU, even if the CPU itself costs more.

None of that concerns us though. If the 16-core variant is actually more expensive on the retail market, the 24-core variant is the better option.

Old   February 28, 2024, 09:05
Default
  #12
New Member
 
Join Date: Feb 2024
Location: Spain
Posts: 19
Rep Power: 2
Sharku is on a distinguished road
Excellent. And what about the graphics card when using Ansys Fluent, assuming the CPU is the EPYC 9274F and there is 12*64 GB = 768 GB of RAM? Is it worth going for a powerful graphics card such as "4 GB PNY NVIDIA T400 GDDR6 384 cores CUDA", or even "12 GB PNY NVIDIA RTX A2000 GDDR6 3328 Cores CUDA"?

Old   February 28, 2024, 10:17
Default
  #13
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura about
General recommendations for CFD hardware [WIP]
Chapter 3

the tl;dr of that: no, probably not worth spending serious cash on a GPU for compute.
And for just rendering the image on the screen, anything remotely recent with at least 8GB of VRAM should work.

Old   February 28, 2024, 18:09
Default
  #14
New Member
 
Join Date: Feb 2024
Location: Spain
Posts: 19
Rep Power: 2
Sharku is on a distinguished road
Thank you!

I have one additional question, which hopefully will be the last one on this subject. It seems very difficult to find the Epyc 9274F in a tower case instead of a server rack case. I asked the vendor (PCSpecialist) about this, and they said that after some searching they had found a provider who could build the Epyc 9274F into a tower case, as I asked for. Do you think this could cause any problems, like decreased performance or reliability?

EDIT: I have now talked to the vendor again, and they discourage using a tower for this processor because a different motherboard would have to be used. Now, the question I have is: would there be any decrease in performance if the Epyc ran in a server that my personal PC communicates with to run the FEA simulations, compared with having the Epyc directly in the PC that I would use? Intuitively, I tend to think the communication between the PC and the server would imply greater latency. Am I wrong?

Last edited by Sharku; February 29, 2024 at 05:41.

Old   February 29, 2024, 06:02
Default
  #15
Senior Member
 
Join Date: Oct 2011
Posts: 242
Rep Power: 17
naffrancois is on a distinguished road
You clearly do not want an air-cooled rack server sitting next to you; it will be loud as hell. Rack servers are compact form factors with small fans spinning fast, and they are meant to sit in a dedicated room.

What your vendor said is not very clear; maybe you should contact another one. Some time ago a vendor gave me a quote for a water-cooled dual Genoa in a tower, and it did not seem to be that hard for them.

Old   March 1, 2024, 00:53
Default
  #16
New Member
 
Join Date: Feb 2024
Location: Spain
Posts: 19
Rep Power: 2
Sharku is on a distinguished road
Quote:
Originally Posted by naffrancois View Post
You clearly do not want an air-cooled rack server sitting next to you; it will be loud as hell. Rack servers are compact form factors with small fans spinning fast, and they are meant to sit in a dedicated room.

What your vendor said is not very clear; maybe you should contact another one. Some time ago a vendor gave me a quote for a water-cooled dual Genoa in a tower, and it did not seem to be that hard for them.
Thank you, naffrancois, for your helpful comments, and for sending me the information about that vendor. I have contacted them and I am awaiting their response. Based on your comments, I have discarded the idea of a server rack, and I am looking for alternatives that put a Zen 4 generation Epyc CPU in a tower case for use as a workstation. The vendor you suggested is one promising possibility, but after some searching I have also found some European vendors that offer workstations for Zen 4 Epyc in tower cases on their websites, like this one:
https://www.rect.coreto-europe.com/e...-96-cores.html

Does it look good to you? Note that my budget is 7800€, so it would have just one Epyc CPU, not two. Unfortunately, it does not appear to include the option of liquid cooling. Do you think this particular option would also be too noisy?

Old   March 1, 2024, 04:37
Default
  #17
Senior Member
 
Join Date: Oct 2011
Posts: 242
Rep Power: 17
naffrancois is on a distinguished road
Hello,

I am a bit out of the loop concerning prices, sorry, so I cannot tell whether it is a good deal or not. What surprises me is that the CPU list mixes single-socket (P suffix) and dual-socket processors. But I am not up to date; maybe all these Genoa CPUs can be set up on a single-socket motherboard. Anyhow, I doubt you would fit a dual-socket config within your budget.

You would still have to ask about their cooling solution, as it is not specified.

If you can estimate the RAM you need beforehand, you could save some money; 384 GB may (or may not) be a lot more than needed. See if half is a viable option, of course keeping 12 sticks to feed all the channels.
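
The capacity options with all 12 channels populated can be tabulated with a short sketch like the one below. It assumes one DIMM per channel and uses 16, 32 and 64 GB as example DIMM sizes; whether a given vendor offers each size is not covered here.

```python
# Total RAM when every memory channel is populated with identical DIMMs.
# Assumption: one DIMM per channel on a 12-channel single-socket board.
CHANNELS = 12

def total_ram_gb(dimm_gb, dimms_per_channel=1):
    """Total installed memory in GB for identical DIMMs on all channels."""
    return CHANNELS * dimms_per_channel * dimm_gb

for dimm_gb in (16, 32, 64):
    print(f"12 x {dimm_gb} GB = {total_ram_gb(dimm_gb)} GB")
```

Halving the capacity this way keeps the memory bandwidth unchanged, because the number of populated channels stays the same.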

If it were my money, I wouldn't go for less than 2 TB of NVMe storage; the price difference is not that high.

Do not forget to include a graphics card. Nothing fancy is needed as long as it has enough VRAM to display smoothly, and it does not need to be from a professional line (see the flotus1 sticky post, General recommendations for CFD hardware).

Old   March 1, 2024, 04:46
Default
  #18
New Member
 
Join Date: Feb 2024
Location: Spain
Posts: 19
Rep Power: 2
Sharku is on a distinguished road
Thank you again, naffrancois.

Please note that the link I posted was for the default settings of this workstation, not exactly for the specific settings I would select. I would indeed add a graphics card (I was thinking about the PNY Nvidia T1000 with 8 GB of GDDR6 RAM or the RTX A2000 with 12 GB of GDDR6 ECC RAM) and a larger SSD (I was thinking about a couple of 4 TB M.2 NVMe SN850X SSDs with 1,200,000 IOPS). My concern was mainly about whether the system would be as noisy as a dual-Epyc workstation. I will ask the vendor about the cooling, as you suggested.

Many thanks!

Old   March 3, 2024, 06:36
Default
  #19
New Member
 
Join Date: Feb 2024
Location: Spain
Posts: 19
Rep Power: 2
Sharku is on a distinguished road
I have now found another European vendor who offers Epyc 9004 CPUs in tower servers. I have set this configuration:

https://server-konfigurieren.de/prod...453552764/4206

In contrast to the previous German vendor, this one does specify the cooler:

"Supermicro SNK-P0084AP4 4U Active CPU Heat Sink for AMD Socket SP5 Processors"

I have looked up the cooler's part number on the Internet and apparently it only produces 42 dB, which seems reasonable for an office, doesn't it?

Old   March 4, 2024, 09:30
Default
  #20
Senior Member
 
Join Date: Oct 2011
Posts: 242
Rep Power: 17
naffrancois is on a distinguished road
This is hard to tell. What is reasonable for me may not be reasonable for you and vice versa, as noise is highly subjective. 42 dB at 5000 rpm is not what I would call silent, but there may not be better alternatives with air cooling. I am already 3 generations of EPYC behind; at that time Noctua coolers were enough to cool EPYC CPUs and were really silent, at about 20 dB and 2000 rpm. Since then TDP has increased a lot. I hope other people with a more recent EPYC config can give you more insight on that topic.
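
For a sense of scale, the sketch below converts the difference between the two quoted figures (42 dB and about 20 dB) into ratios using the standard decibel definitions. It assumes both numbers were measured the same way, which is not confirmed here; manufacturer figures often differ in distance and weighting, so treat the result as indicative only. The "twice as loud per 10 dB" line is a common rule of thumb, not a precise law.

```python
# Converting a decibel difference into physical and perceptual ratios.
# Assumption: both noise figures use the same measurement conditions.

def power_ratio(delta_db):
    """Ratio of sound power (intensity) for a given dB difference."""
    return 10 ** (delta_db / 10)

def pressure_ratio(delta_db):
    """Ratio of sound pressure for a given dB difference."""
    return 10 ** (delta_db / 20)

def perceived_loudness_ratio(delta_db):
    """Rule of thumb: every +10 dB is perceived as roughly twice as loud."""
    return 2 ** (delta_db / 10)

delta_db = 42 - 20  # quoted Supermicro cooler vs. older Noctua-class cooler

print(f"Difference: {delta_db} dB")
print(f"Sound power ratio:    {power_ratio(delta_db):.0f}x")
print(f"Sound pressure ratio: {pressure_ratio(delta_db):.1f}x")
print(f"Perceived loudness:   about {perceived_loudness_ratio(delta_db):.1f}x")
```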

