May 7, 2019, 12:11 |
Memory bandwidth problem?
|
#1 |
New Member
Join Date: Apr 2014
Location: Germany
Posts: 24
Rep Power: 12 |
Hi,
our lab has a workstation with two Intel Xeon Gold 6150 CPUs (18 cores each) and 6x 16 GB DDR4-2666 ECC. Each processor has 6 memory channels, which makes 12 in total, so only half of them are in use.
I have to run a large number of rather small cases (~40,000 cells), so I run them in parallel by starting e.g. 20 simulations at once. I noticed that the simulations get very slow (they need more than 3 times as long) if I start more than ~24 simulations at once. Our workstation has more than 24 cores, so I do not think that the processor is the bottleneck.
I have read a lot about memory bandwidth problems in this forum, so I was wondering if this is one. I therefore removed 4 of the DIMMs and ran 20 simulations at once. I expected the 20 simulations to run slower, but they did not get much slower (only 5-10%).
Is it a memory bandwidth problem? Does anyone have an idea where the bottleneck is?
Best, Moritz |
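For reference, the theoretical peak bandwidth of the configurations mentioned above can be estimated with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes one DIMM per memory channel and a 64-bit (8-byte) data bus per DDR4 channel, and real sustained bandwidth will be lower. The function name is made up for illustration.

```python
# Back-of-the-envelope peak DRAM bandwidth for DDR4.
# Assumption: every populated DIMM sits on its own memory channel.
BYTES_PER_TRANSFER = 8  # one DDR4 channel has a 64-bit data bus


def peak_bandwidth_gb_s(mega_transfers_per_s: float, channels: int) -> float:
    """Theoretical peak bandwidth in GB/s (10^9 bytes/s)."""
    return mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


if __name__ == "__main__":
    concurrent_runs = 20  # number of simulations started at once, as in the post
    for channels in (2, 6, 12):
        total = peak_bandwidth_gb_s(2666, channels)
        share = total / concurrent_runs
        print(f"{channels:2d} channels: {total:6.1f} GB/s total, "
              f"{share:5.1f} GB/s per simulation")
```

If the runs were limited by memory bandwidth, going from 6 populated channels to 2 would be expected to cost far more than 5-10%.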
|
May 7, 2019, 15:51 |
|
#2 |
Senior Member
Erik
Join Date: Feb 2011
Location: Earth (Land portion)
Posts: 1,186
Rep Power: 23 |
Get 12 identical memory modules and populate them correctly (in the correct slots per your motherboard manual).
Check your motherboard manual for the best way to balance only 6 slots (3 per CPU), but you really should be using all 12 channels. |
|
May 7, 2019, 16:16 |
|
#3 |
New Member
Joshua Brickel
Join Date: Nov 2013
Posts: 26
Rep Power: 13 |
See https://lenovopress.com/lp0742.pdf for why populating only half the memory channels is, to put it mildly, not good.
|
|
May 7, 2019, 17:56 |
|
#4 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
The fact that the simulations do not become much slower when you remove DIMMs would normally indicate that memory bandwidth is not an issue here. Then again, we don't know whether the 6 DIMMs that are populated now are populated correctly to give at least 2x triple-channel. Did you check for that?
What might also be happening with these very small cases: one computation mostly fits into L3 cache, and running more of them at once causes more and more cache misses.
Or you could have a thermal issue where stressing more cores leads to very low CPU frequencies. This would need to be checked.
Or for some reason several of your smaller simulations get scheduled on the same physical cores. Again, this needs to be investigated before drawing conclusions. Disabling Hyper-Threading could be one way to solve this, or on Linux you could use taskset to make sure each computation gets pinned to a different physical core.
After that, I would highly recommend getting 12 identical DIMMs and making sure they get populated correctly. |
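A minimal sketch of the pinning idea in Python: pick one logical CPU per physical core from the output of `lscpu -p=CPU,CORE,SOCKET`, then build one `taskset -c` command per case. The solver script `./run_case.sh` and the case directory names are hypothetical placeholders, and the demo uses a small made-up topology so that it runs anywhere.

```python
import subprocess


def read_lscpu() -> str:
    """Return the parseable topology listing from lscpu (Linux only)."""
    result = subprocess.run(["lscpu", "-p=CPU,CORE,SOCKET"],
                            capture_output=True, text=True, check=True)
    return result.stdout


def one_cpu_per_core(lscpu_text: str) -> list[int]:
    """Pick the first logical CPU of every (socket, core) pair."""
    seen = set()
    chosen = []
    for line in lscpu_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # lscpu -p emits '#' comment lines
            continue
        cpu, core, socket = (int(field) for field in line.split(",")[:3])
        if (socket, core) not in seen:
            seen.add((socket, core))
            chosen.append(cpu)
    return chosen


def pinned_commands(cpus, case_dirs, solver="./run_case.sh"):
    """Build one taskset command per case; 'solver' is a hypothetical script."""
    return [["taskset", "-c", str(cpu), solver, case]
            for cpu, case in zip(cpus, case_dirs)]


# Made-up topology for the demo: 2 cores with 2 hardware threads each.
SAMPLE_TOPOLOGY = "# CPU,Core,Socket\n0,0,0\n1,1,0\n2,0,0\n3,1,0\n"

if __name__ == "__main__":
    cpus = one_cpu_per_core(SAMPLE_TOPOLOGY)
    cases = [f"case_{i:03d}" for i in range(len(cpus))]  # hypothetical names
    for cmd in pinned_commands(cpus, cases):
        print(" ".join(cmd))
```

On the actual workstation, `one_cpu_per_core(read_lscpu())` would replace the sample topology, and each command could be started with `subprocess.Popen`.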
|
May 8, 2019, 07:22 |
|
#5 |
New Member
Join Date: Apr 2014
Location: Germany
Posts: 24
Rep Power: 12 |
I checked the following:
1. The DIMMs are populated correctly. (And as soon as there is some money I will ask for more DIMMs.)
2. The simulations are not running on the same physical core.
3. The temperature of the CPU is at 90°C (lm_sensors), but the CPU is not throttling. cat /proc/cpuinfo | grep "MHz" tells me that all cores are running at around 3400 MHz. (By the way, why is the CPU running at 3400 MHz when the model name is Gold 6150 CPU @ 2.70GHz?)
So I conclude that the bottleneck is the L3 cache size. Is there anything I can do?
Thanks to all.
Best, Moritz |
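The frequency check in item 3 can also be scripted, which makes it easier to repeat while the simulations are running. The sketch below parses the `cpu MHz` lines of `/proc/cpuinfo` and lists cores that report less than the base clock. The 2700 MHz threshold is taken from the model name quoted above; the function names and the fallback sample are made up for illustration. Idle cores also clock down, so the result only says something about throttling when read under full load.

```python
import re
from pathlib import Path
from statistics import mean

BASE_MHZ = 2700.0  # base clock from the model name "Gold 6150 CPU @ 2.70GHz"


def core_frequencies_mhz(cpuinfo_text: str) -> list[float]:
    """Extract every 'cpu MHz' value from /proc/cpuinfo-style text."""
    pattern = r"^cpu MHz\s*:\s*([\d.]+)"
    return [float(v) for v in re.findall(pattern, cpuinfo_text, flags=re.MULTILINE)]


def cores_below(freqs: list[float], threshold_mhz: float) -> list[int]:
    """Indices of cores reporting a frequency below the threshold."""
    return [i for i, f in enumerate(freqs) if f < threshold_mhz]


# Made-up fallback so the demo also runs where /proc/cpuinfo has no MHz lines.
SAMPLE = "cpu MHz\t\t: 3400.000\ncpu MHz\t\t: 3399.875\n"

if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    text = path.read_text() if path.exists() else ""
    freqs = core_frequencies_mhz(text) or core_frequencies_mhz(SAMPLE)
    print(f"{len(freqs)} cores: min {min(freqs):.0f}, "
          f"mean {mean(freqs):.0f}, max {max(freqs):.0f} MHz")
    print("cores below base clock:", cores_below(freqs, BASE_MHZ))
```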
|
May 8, 2019, 08:16 |
|
#6 | ||
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
Quote:
Quote:
Aside from that, fully populating all memory channels will help at least a bit. Last-level cache misses mean that the data has to be funnelled through memory.
Edit, I forgot one thing: another possible bottleneck could be data I/O, in case your simulations read or write a lot of data from disk or perform a high number of small reads/writes. |
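To get a feeling for the cache argument, one can compare the combined working set of all concurrent simulations with the total L3 capacity. The following is a rough sketch only. The bytes-per-cell figure is a pure placeholder (it depends on the solver and the models used), and the L3 size of 24.75 MB per socket is the commonly quoted figure for the Gold 6150 and should be verified with `lscpu` on the machine.

```python
# Rough comparison of combined working set versus total L3 capacity.
CELLS = 40_000                 # case size from the first post
BYTES_PER_CELL = 600           # placeholder assumption, solver dependent
L3_PER_SOCKET = 24.75 * 1024**2  # assumed L3 size per socket, verify with lscpu
SOCKETS = 2


def working_set_bytes(cells: int, bytes_per_cell: int) -> int:
    """Approximate memory touched per iteration by one simulation."""
    return cells * bytes_per_cell


def l3_oversubscription(n_sims: int, cells: int, bytes_per_cell: int,
                        l3_bytes_total: float) -> float:
    """Ratio of combined working set to total L3; above 1 it no longer fits."""
    return n_sims * working_set_bytes(cells, bytes_per_cell) / l3_bytes_total


if __name__ == "__main__":
    total_l3 = L3_PER_SOCKET * SOCKETS
    for n in (1, 2, 12, 24, 36):
        ratio = l3_oversubscription(n, CELLS, BYTES_PER_CELL, total_l3)
        print(f"{n:2d} simulations: combined working set = {ratio:5.2f} x total L3")
```

The larger this ratio gets, the more traffic ends up in main memory, which is where the number of populated channels starts to matter.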
|||
May 8, 2019, 08:28 |
|
#7 | |
New Member
Join Date: Apr 2014
Location: Germany
Posts: 24
Rep Power: 12 |
Quote:
Thanks for the advice. I/O is not an issue. |