my EPYC's low performance in fluent

April 13, 2023, 09:53
  #1
New Member
 
BaiYu
Join Date: Apr 2023
Posts: 1
I recently built a workstation with two 64-core EPYC 7T83 CPUs to run Fluent, but it performs worse than my previous machine with two 48-core 7R32 CPUs. When several cases are open, CPU usage on the 7T83 system is abnormally high while memory usage stays very low, and the computing speed becomes very low once CPU usage reaches 80%. Is this normal? Is there any way to improve its performance? My main goal is to open more cases at once in Fluent while maintaining a reasonable speed.

April 13, 2023, 11:23
  #2
Member
 
Matt
Join Date: May 2011
Posts: 44
At least for DDR4, CFD solvers are starved for memory bandwidth once there are more than four cores per memory channel (EPYC Milan has 8 memory channels, so that would be around 32 cores). Clock speeds go down as core counts go up, which is why you are seeing worse performance with 64 cores vs. 48 (and a 32-core Milan CPU may even be faster than either for CFD, especially Milan-X).
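To put numbers on that rule of thumb, here is a minimal sketch that computes cores per memory channel for the per-socket core counts discussed in this thread. The four-cores-per-channel limit and the 8 channels per Milan socket are taken from the post above; the function name and the `LIMIT` constant are made up for illustration.

```python
def cores_per_channel(cores_per_socket: int, channels_per_socket: int = 8) -> float:
    """Number of cores sharing each memory channel on one socket."""
    if channels_per_socket <= 0:
        raise ValueError("channels_per_socket must be positive")
    return cores_per_socket / channels_per_socket


# Rule-of-thumb limit quoted in the post: about four cores per DDR4 channel.
LIMIT = 4.0

for cores in (32, 48, 64):
    ratio = cores_per_channel(cores)
    status = "within" if ratio <= LIMIT else "above"
    print(f"{cores} cores/socket: {ratio:.1f} cores per channel ({status} the rule of thumb)")
```

By this estimate both the 48-core and the 64-core parts sit above the limit, and the 64-core part sits further above it.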

DDR5 and 3D cache are different stories: OpenFOAM benchmarks with EPYC Genoa show good speedup up to 64 cores and near-linear speedup up to 48 cores (Genoa has 12 memory channels of 50% faster DDR5, so 225% of Milan's memory bandwidth overall). Genoa-X (3D cache) may even be able to benefit from the maximum of 96 cores, as Milan-X was able to do with 64 cores.
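The 225% figure can be checked with a quick back-of-the-envelope calculation. This is only a sketch of theoretical peak bandwidth per socket, assuming DDR4-3200 for Milan and DDR5-4800 for Genoa (the post only says "50% faster") and 8 bytes per transfer per channel; sustained bandwidth in a real solver run will be lower. The function name is hypothetical.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int,
                       bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth per socket in GB/s.

    Assumes a 64-bit (8-byte) data path per memory channel.
    """
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000.0


# Assumed memory speeds: DDR4-3200 (Milan) and DDR5-4800 (Genoa).
milan = peak_bandwidth_gbs(channels=8, mega_transfers_per_s=3200)
genoa = peak_bandwidth_gbs(channels=12, mega_transfers_per_s=4800)

print(f"Milan: {milan:.1f} GB/s, Genoa: {genoa:.1f} GB/s, ratio: {genoa / milan:.2f}")
```

The ratio is simply (12 / 8) x (4800 / 3200) = 1.5 x 1.5 = 2.25, which matches the 225% quoted above.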

Since that's an OEM CPU, I assume it's vendor-locked, so you may not be able to get a good price for those chips on the used market. Similarly, if it's an OEM-proprietary motherboard, you may not be able to upgrade to Milan-X CPUs (7773X, for instance). That said, if you are running simulations with tens of millions of cells, you may not see much benefit from 3D cache anyway. So you may just have to chalk it up to a lesson learned and do a better job of matching the core count to the available memory bandwidth in the next upgrade cycle.

April 13, 2023, 12:00
  #3
Senior Member
 
Erik
Join Date: Feb 2011
Location: Earth (Land portion)
Posts: 1,188
I'm wondering whether it's just a little lower and explainable by the previous post, or whether something is misconfigured on the system.
How much worse are we talking here?
Are all 8 memory channels populated evenly on both CPUs?

April 13, 2023, 13:22
  #4
Super Moderator
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Quote:
After multiple cases are opened, the cpu usage of 7T83 is abnormally high
What exactly is "abnormally high"?
One thread of a Fluent solver run will show 100% utilization for the core it is running on.
I.e. if you start simulations with a total of 64 threads, you should see half of all 128 cores running with 100% utilization, with the other cores being mostly idle. htop would be a good tool to check that on Linux.
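If htop is not at hand, the same check can be approximated from two snapshots of `/proc/stat` on Linux. The sketch below is not part of any Fluent tooling; the function names are made up, and it assumes the usual per-core line layout (`cpuN user nice system idle iowait irq softirq steal ...`).

```python
import time


def parse_proc_stat(text: str) -> dict:
    """Map 'cpuN' -> (busy_jiffies, total_jiffies) from /proc/stat content."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep per-core lines (cpu0, cpu1, ...) and skip the aggregate 'cpu' line.
        if not parts or not parts[0].startswith("cpu") or parts[0] == "cpu":
            continue
        values = [int(v) for v in parts[1:9]]  # user .. steal
        idle = values[3] + (values[4] if len(values) > 4 else 0)  # idle + iowait
        total = sum(values)
        counters[parts[0]] = (total - idle, total)
    return counters


def core_utilization(before: str, after: str) -> dict:
    """Percent busy per core between two /proc/stat snapshots."""
    first, second = parse_proc_stat(before), parse_proc_stat(after)
    usage = {}
    for core, (busy2, total2) in second.items():
        busy1, total1 = first.get(core, (0, 0))
        elapsed = total2 - total1
        usage[core] = 100.0 * (busy2 - busy1) / elapsed if elapsed > 0 else 0.0
    return usage


def main(interval_s: float = 1.0, threshold: float = 90.0) -> None:
    """Sample /proc/stat twice and report how many cores are busy (Linux only)."""
    with open("/proc/stat") as handle:
        before = handle.read()
    time.sleep(interval_s)
    with open("/proc/stat") as handle:
        after = handle.read()
    usage = core_utilization(before, after)
    busy = [core for core, pct in usage.items() if pct > threshold]
    print(f"{len(busy)} of {len(usage)} cores above {threshold:.0f}% utilization")
```

Calling `main()` while the simulations are running should report roughly as many busy cores as solver threads were started. A much larger number would hint at oversubscription or at other processes competing for the cores.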

There are a few things you need to check.
1) Hardware/memory: you need 16 identical DIMMs to get the best performance from these CPUs. If your motherboard has more than 16 DIMM slots, you also need to check they are in the correct slots.
2) BIOS settings: memory needs to be set to 3200MT/s, if your DIMMs support that. Memory interleaving needs to be set to NPS=4. CPU turbo has to be enabled, and SMT has to be disabled.
3) Thermals. There are a lot of things that can overheat in such a setup. The CPUs themselves are pretty low on that list. More likely: CPU VRMs, the memory modules themselves, and the memory VRMs. Check the temperatures of these components. At the very least, check that CPU core frequency is in a reasonable range.
4) Core binding: Especially when running several solver instances at the same time, you need to make absolutely sure that each thread gets pinned to its own core. Again, htop can give you a first indication of where the threads are actually running. And just to state the obvious: no oversubscribing. Running solver instances with more than 128 threads total is guaranteed to tank performance.
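For point 4, a minimal sketch of how disjoint core blocks could be planned for several solver instances. It assumes SMT is disabled and that core IDs 0..127 map to physical cores in order, which should be verified on the actual machine (with `lscpu`, for instance). The function names are hypothetical. `os.sched_setaffinity` only restricts a process (and the children it starts afterwards) to a set of cores; binding each solver rank to its own core inside that set is normally left to the MPI launcher or the solver's own affinity settings, which are not shown here.

```python
import os


def plan_core_ranges(threads_per_job: list, total_cores: int = 128) -> list:
    """Assign each job a disjoint, contiguous block of core IDs.

    Raises ValueError if the jobs together would oversubscribe the machine.
    """
    requested = sum(threads_per_job)
    if requested > total_cores:
        raise ValueError(
            f"oversubscribed: {requested} threads requested, {total_cores} cores available"
        )
    ranges = []
    next_core = 0
    for threads in threads_per_job:
        if threads <= 0:
            raise ValueError("each job needs at least one thread")
        ranges.append(list(range(next_core, next_core + threads)))
        next_core += threads
    return ranges


def pin_current_process(cores: list) -> None:
    """Restrict the calling process to the given cores (Linux only)."""
    os.sched_setaffinity(0, set(cores))


# Example: three concurrent cases on a 2 x 64-core machine.
for job_id, cores in enumerate(plan_core_ranges([32, 32, 64])):
    print(f"job {job_id}: cores {cores[0]}-{cores[-1]}")
```

With NPS=4, keeping each block aligned to whole NUMA nodes (16 cores each on a 64-core socket under the assumptions above) should help each case stay close to its own memory.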

Side note: I had to make a lot of assumptions here. So feel free to give us more information about your setup. Motherboard, memory, case/cooling, operating system...

Edit: I forgot one important part. Check that all memory channels are present; if they are not, re-seat the CPUs.
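One way to check DIMM population on Linux is to look at the output of `sudo dmidecode -t memory`. The sketch below parses that text and lists which slots report a module. It assumes the common layout of "Memory Device" blocks separated by blank lines, with `Locator:` and `Size:` fields and `No Module Installed` for empty slots; the exact wording can differ between dmidecode versions and BIOS vendors, so treat it as a starting point. The function name is made up.

```python
import re


def populated_dimms(dmidecode_text: str) -> dict:
    """Map DIMM locator -> size string, or None for an empty slot.

    Only 'Memory Device' blocks of `dmidecode -t memory` output are considered.
    """
    slots = {}
    for block in re.split(r"\n\s*\n", dmidecode_text):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if "Memory Device" not in lines:
            continue
        fields = {}
        for line in lines:
            if ":" in line:
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip()
        locator = fields.get("Locator", f"slot{len(slots)}")
        size = fields.get("Size", "")
        empty = size == "" or size.startswith("No Module")
        slots[locator] = None if empty else size
    return slots


def missing_dimms(dmidecode_text: str) -> list:
    """Locators of slots that report no module."""
    return [loc for loc, size in populated_dimms(dmidecode_text).items() if size is None]
```

On a dual-socket Milan board with 16 slots, all 16 locators should show the same size. A DIMM that is physically installed but appears in the missing list would be consistent with the seating problem mentioned above.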

Last edited by flotus1; April 14, 2023 at 07:47.