Zen 2 Memory Controller

Simbelmynë · May 13, 2019, 13:45

Speculating on rumors may be silly, but it is also fun.

So, the latest rumor surrounding the Zen 2 launch talks about a much better memory controller with possible support for DDR4-5000 (!). This would be achieved by cutting the Infinity Fabric frequency in half. However, Infinity Fabric v2 should be about twice as fast, so this is not supposed to have any drawback.

With DDR4@5000 MHz we should be able to see some nice improvements in CFD even if we only have dual channel. If AMD also manages to push single-core boost close to 5 GHz, then we will have a very nice system for pre- and post-processing CFD that can also manage (some degree of) simulations.

flotus1 · May 14, 2019, 05:16

I've read that this only applies to extreme overclocking methods. Like using dry ice and whatnot. There certainly won't be official support for this kind of memory frequency.
And I highly doubt there will be affordable memory rated for these frequencies during the lifespan of Ryzen 3000.

Simbelmynë · May 14, 2019, 05:35

Don't smash my hopes now

The good news here is that the previous generation, Zen+, has not been able to hit frequencies that high, which indicates that we might see sizable headroom for hobby overclockers. Previously, Intel controllers managed 4000+ with a quite high success rate, even though they only officially support 2666.

I think Zen 2 might take the lead in 1-2 thread CFD workloads, while at the same time Rome might bring some slight improvement in the EPYC camp.

Consumer DDR5 adoption may come in 2020 or 2021.

flotus1 · May 14, 2019, 06:08

Rome will bring quite a few improvements to the table.
The biggest IMO will be only one NUMA node per CPU. This will greatly improve performance for a vast number of applications that are either lightly threaded or not NUMA-aware. At the very least it will facilitate getting decent performance out of these CPUs for any kind of application.
Provided they can keep latency low. The new layout adds at least one additional hop to communicate between cores on different chiplets.
And then there is the rumored 32MB of L3 cache per chiplet. If AMD doesn't follow Intel's example and keeps the full cache activated for lower core count models, we have ourselves a nice new CPU.
Right now I am pretty sure that I will upgrade my Epyc 7301 along with the memory.

Simbelmynë · May 14, 2019, 08:14

Sounds really nice! Any information about the Rome socket? Will old motherboards still work?

flotus1 · May 14, 2019, 08:51

Rome should be backwards compatible with current motherboards.

kyle · May 15, 2019, 11:45

Since most CFD software is NUMA aware, I would think the new memory architecture will be slightly worse for CFD than Zen 1. There has to be some latency overhead due to the extra hop to the IO chip.

Overall Rome will be an improvement, but not because they split the memory controller off of the compute dies.

flotus1 · May 15, 2019, 17:52

I had the platform in general in mind. Many of the benchmark results where Epyc Naples fell behind Xeon were due to the high amount of NUMA nodes per CPU. With that out of the way, AMD can gain some more market share and push the industry forward.
Remains to be seen how bad the latency hit will be for applications that already ran fine on Naples. I keep my fingers crossed.

Simbelmynë · June 12, 2019, 14:06

So now we start to get some more information about the performance of the Zen 2 memory controller. An overclocked 3950X manages to push memory above 5.1 GHz and now holds many of the previous Intel world records in Cinebench/Geekbench etc.

LN2 was required to push all 16 cores to 5 GHz.

Unclear if LN2 was required for the memory to achieve the 5.1 GHz speed though. The GSkill modules were rated at 4533 MHz.

This is very good news in my opinion, considering the official release is still 3 weeks away.

Duke711 · June 13, 2019, 10:22

Quote:

Originally Posted by Simbelmynë

With DDR4@5000 MHz we should be able to see some nice improvements in CFD even if we only have dual channel.

no thats wrong

Scale factor of DDR frequency:

speedup = frequency^0,66

(2500 / 1200 MHz)^0,66 -> only 1,62 (+62%) speedup by upgrade a DDR 2400 to 5000.

The speedup with channel is proportional
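
For readers who want to check the arithmetic, here is a minimal sketch of the scaling rule as stated in the post. The exponent 0.66 is the poster's own figure; the post does not say which workload or performance metric it refers to, so treat the function as an illustration of the claim rather than a validated model. The function name and the second example (DDR4-3466 as a baseline) are assumptions added for illustration.

```python
# Power-law memory scaling as claimed in the post above.
# The exponent 0.66 is the poster's figure; its source and the workload
# it applies to are not given in the thread.

def claimed_speedup(old_rate: float, new_rate: float, exponent: float = 0.66) -> float:
    """Return the claimed speedup when moving from old_rate to new_rate.

    Both rates must use the same unit (MHz or MT/s), since only their
    ratio enters the formula.
    """
    return (new_rate / old_rate) ** exponent


if __name__ == "__main__":
    # The example from the post: DDR4-2400 to DDR4-5000.
    print(f"DDR4-2400 -> DDR4-5000: {claimed_speedup(2400, 5000):.2f}x")
    # A different baseline, using the same claimed exponent.
    print(f"DDR4-3466 -> DDR4-5000: {claimed_speedup(3466, 5000):.2f}x")
```

With the claimed exponent, the first case gives about 1.62x, matching the post, and the second about 1.27x. Whether 0.66 holds for any particular CFD solver is exactly the open question in this thread.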

Simbelmynë · June 13, 2019, 11:35

Quote:

Originally Posted by Duke711

no thats wrong

Scale factor of DDR frequency:

speedup = frequency^0,66

(2500 / 1200 MHz)^0,66 -> only 1,62 (+62%) speedup by upgrade a DDR 2400 to 5000.

The speedup with channel is proportional

My point was not to compare dual- vs quad-channel memory setups. Compared with running memory @ 3466, which is the practical maximum for most Zen+ systems today, we will see some nice improvements @ 5000 MHz with Zen 2. Obviously this will not happen if we need LN2, but if the retail chips can reach these speeds without LN2 on the memory controller, then we are in for some nice improvements, I think.

flotus1 · June 13, 2019, 14:41

Quote:

Originally Posted by Duke711

no thats wrong

Great way to preface the point you are trying to make. It almost prevented me from asking any follow-up questions. But anyway:

Quote:

Originally Posted by Duke711

Scale factor of DDR frequency:
speedup = frequency^0,66

Where exactly do you pull that number from? Speedup in which kind of workload? Or speedup for which performance metric?
And which frequency is the baseline for this correlation?

flotus1 · June 23, 2019, 06:01

Turns out these high memory frequencies on Zen 2 come at a cost: https://www.anandtech.com/show/14525...d-epyc-rome/11
Reducing the infinity fabric frequency by a factor of 2.
The way these CPUs are designed -chiplets with separate I/O die- this seems like a significant drawback.

Simbelmynë · June 23, 2019, 06:20

Quote:

Originally Posted by flotus1

Turns out these high memory frequencies on Zen 2 come at a cost: https://www.anandtech.com/show/14525...d-epyc-rome/11
Reducing the infinity fabric frequency by a factor of 2.
The way these CPUs are designed -chiplets with separate I/O die- this seems like a significant drawback.

So I guess the recommendation from AMD of 3600 MHz @ 1:1 ratio is the top performing option. Will be interesting to see the effect of IF speeds on CFD performance.
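
To put numbers on the trade-off, here is a rough sketch that takes the description in this thread at face value: the fabric clock follows the memory clock, either 1:1 or halved. That is a simplification and an assumption, not AMD's documented clocking scheme, and the function name is made up for illustration. The only hard fact used is that DDR memory transfers data twice per clock, so DDR4-3600 runs a 1800 MHz memory clock.

```python
# Fabric clock under the simplified model discussed in this thread:
# fabric clock = memory clock / divider, with divider 1 (1:1) or 2 (halved).
# This is an illustrative assumption, not a description of AMD's firmware.

def fabric_clock_mhz(ddr_rate_mts: float, divider: int = 1) -> float:
    """Return the fabric clock in MHz for a given DDR data rate in MT/s."""
    memory_clock_mhz = ddr_rate_mts / 2  # DDR: two transfers per clock cycle
    return memory_clock_mhz / divider


if __name__ == "__main__":
    print(f"DDR4-3600 at 1:1  -> {fabric_clock_mhz(3600, 1):.0f} MHz fabric")
    print(f"DDR4-5000 halved  -> {fabric_clock_mhz(5000, 2):.0f} MHz fabric")
```

Under this model, DDR4-5000 with the halved ratio leaves the fabric at 1250 MHz, well below the 1800 MHz reached with DDR4-3600 at 1:1, which is why the higher memory frequency is not automatically the faster option.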

flotus1 · July 8, 2019, 17:24

https://www.anandtech.com/show/14605...sing-the-bar/2

Simbelmynë · July 9, 2019, 08:06

Good read!

Then, from what I can tell, there should be somewhat similar performance at the same frequencies and hopefully some improvement @ 3600 MHz.

Malinator · August 8, 2019, 06:53

So, at last Zen2 generation is launched.

First reviews are starting to roll out; https://www.tomshardware.com/news/am...7nm,40108.html
https://www.anandtech.com/show/14694...e-epyc-2nd-gen
Not much new info yet, aside from provisional prices.

I'm anxious to find out, though, what the performance changes relevant to CFD software are.

I'm currently planning to assemble a workstation/server for OpenFOAM calculations, and I have a couple of months to decide whether to stick with a 2x 7301/7351 setup or jump to the 2nd Zen generation.

Simbelmynë · August 8, 2019, 07:06

You can check out some CFD performance of Zen 2

Ryzen 3X00 benchmarks and memory timings

OpenFOAM benchmarks on various hardware

Malinator · August 8, 2019, 07:46

Yep, I've already studied them)
Pretty impressive results, by the way, for an 8-core system: at least on par with the pricey Core i9-9***X systems tested there.

Now the interesting part for me is whether the changes in the memory system will benefit Epyc systems overall; the outcome of trading higher memory frequencies against higher latencies is not obvious.

The most interesting info yet is probably

flotus1 · August 8, 2019, 12:26

On paper, the most obvious changes to the architecture seem like they would be detrimental to performance of NUMA-aware software. I.e. trading in "fast" close memory access for more homogeneous, but overall slower memory access times, with the introduction of a separate I/O die.
But under the hood there is quite a lot that changed for the better. L3 cache sizes were doubled, which should reduce LLC misses and thus negate some of the higher memory latency. And prefetching was improved a lot, to the same effect.
And I was pleasantly surprised to read that Rome can be configured into a "sub-NUMA clustering" (NPS4) where each core only accesses the 2 memory channels closest to it. Leading to a similar NUMA topology as in Naples with 4 nodes per CPU.
https://www.anandtech.com/show/14694...e-epyc-2nd-gen
This will decrease memory latency quite a bit, leading to better performance in NUMA-aware software.
And let's not forget DDR4-3200 versus DDR4-2666. Memory bandwidth is still important, despite all the fuss about memory latency. I wonder when these will become readily available to consumers.
And the prices are pretty spectacular. 1350$ for 24 cores and 2025$ for 32 cores respectively. I am more convinced than ever about an upgrade of my home workstation with 2xEpyc 7301.
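
As a back-of-the-envelope check on the bandwidth point, here is a small sketch of theoretical peak memory bandwidth. It relies on a DDR4 channel being 64 bits (8 bytes) wide and assumes 8 memory channels per Epyc socket for both Naples and Rome; the channel count is not stated explicitly in this thread. These are peak figures only, and sustained bandwidth in a real solver will be lower.

```python
# Theoretical peak memory bandwidth: data rate x bytes per transfer x channels.
# Assumption (not stated in the thread): 8 memory channels per Epyc socket.

BYTES_PER_TRANSFER = 8  # one DDR4 channel is 64 bits wide


def peak_bandwidth_gbs(rate_mts: float, channels: int) -> float:
    """Return theoretical peak bandwidth in GB/s (10^9 bytes per second)."""
    return rate_mts * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


if __name__ == "__main__":
    naples = peak_bandwidth_gbs(2666, channels=8)
    rome = peak_bandwidth_gbs(3200, channels=8)
    print(f"8x DDR4-2666: {naples:.1f} GB/s")
    print(f"8x DDR4-3200: {rome:.1f} GB/s")
    print(f"Increase: {100 * (rome / naples - 1):.0f}%")
```

On paper that is roughly 170.6 GB/s versus 204.8 GB/s per socket, an increase of about 20% from the memory speed alone.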
