In a new blog post on the AMD website, Robert Hallock discusses the latest AGESA (BIOS firmware) 1006. We have already discussed and tested AGESA 1006 in depth ourselves, and it certainly brings better memory timings and higher frequencies. The new blog post builds on that.
We recently posted an AGESA 1006 / Ryzen memory article ourselves, read here. The AMD blog post, however, goes more in depth on fine-tuning and tweaking, plus the performance gains that can bring.
Here’s what they analyzed:
- The impact of the new BankGroupSwap (BGS) BIOS option
- Single-rank DIMMs vs. dual-rank DIMMs
- Automatic sub-timings vs. manually-tweaked subtimings
- Max frequency vs. lower frequency at tighter timings
- Geardown Mode (GDM) on vs. off
Geardown Mode (GDM)
GDM is enabled by default for memory speeds greater than DDR4-2667 per the DDR4 spec. GDM allows the RAM to use a clock that’s one half the true DRAM frequency for the purposes of latching (storing a value) on the memory’s command or address buses. This conservative latching can potentially allow for higher clockspeeds, broader compatibility, and better stability—good for the average user.
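To make the "one half the true DRAM frequency" statement concrete, here is a minimal arithmetic sketch. It assumes that "true DRAM frequency" means the memory clock, which for DDR memory is half the advertised data rate (DDR4-3200 runs a 1600 MHz clock). The function names are made up for illustration and do not correspond to any BIOS or AMD tool.

```python
def dram_clock_mhz(data_rate_mts: float) -> float:
    """DDR memory transfers data twice per clock, so DDR4-3200 runs a 1600 MHz clock."""
    return data_rate_mts / 2


def geardown_latch_clock_mhz(data_rate_mts: float) -> float:
    """Command/address latching clock with GDM on: half the DRAM clock (assumption above)."""
    return dram_clock_mhz(data_rate_mts) / 2


def gdm_enabled_by_default(data_rate_mts: float) -> bool:
    """Per the text, GDM defaults to on for speeds greater than DDR4-2667."""
    return data_rate_mts > 2667


for rate in (2667, 3200, 3466):
    print(
        f"DDR4-{rate}: DRAM clock {dram_clock_mhz(rate):.1f} MHz, "
        f"GDM latch clock {geardown_latch_clock_mhz(rate):.1f} MHz, "
        f"GDM default on: {gdm_enabled_by_default(rate)}"
    )
```

Under that assumption, a DDR4-3200 kit with GDM enabled would latch commands and addresses on an 800 MHz clock instead of the 1600 MHz memory clock.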
But what about overclockers? For overclockers, Geardown Mode will be noteworthy because it also tells the memory subsystem to “disregard” the command rate set in the BIOS. As 1T command rates can be beneficial (though tough to maintain) for performance, the chart below is really asking whether it’s useful to run GDM if the desired memory clockspeed can be achieved.
BankGroupSwap
BankGroupSwap (BGS) is a new memory mapping option in AGESA 1.0.0.6 that alters how applications get assigned to physical locations within the memory modules; the goal of this knob is to optimize how memory requests are executed after taking DRAM architecture and your memory timings into account. The theory goes that toggling this setting can shift the balance of performance in favor of either games or synthetic apps. Our data seems to bear this out: our games got a little faster with BGS off, while AIDA64 memory bandwidth was higher with BGS on.
Single versus Dual Rank
In the BankGroupSwap section, we alluded to “single rank” memory modules; that may have left some people scratching their heads. That’s not surprising: memory ranks are largely unknown, not to mention cryptic. Starting from the top, PC enthusiasts know that a stick of memory is a circuit board with various memory chips attached. But have you ever thought about how a PC talks to those memory chips? That’s where ranks come in.
A “rank” is a group of memory chips that receive read and write commands as a group. Some memory sticks have all of their memory chips in one group, and those are single rank (SR) DIMMs. Other memory sticks split their memory chips into two groups, and those are called dual rank (DR) DIMMs.
DR modules can often be a smidge faster thanks to a capability called “rank interleaving,” wherein the second memory rank can still perform work while the first is being refreshed for use. However, DR modules are often harder for a system to drive to high frequency, which is why most high-performance memory kits use multiple 4GB or 8GB SR memory sticks. The extra frequency achievable by the SR memory modules is often enough to overcome the small performance benefit of DR DIMMs, too.
You can often tell single and dual rank memory apart by looking at the product code, which might say 1Rx4 or 1Rx8 for single rank, or 2Rx4 or 2Rx8 for dual rank. And though you should always verify with the spec sheet, it’s a decent shortcut to assume an 8GB DDR4 DIMM is single rank, whereas a 16GB DIMM is almost certainly dual rank.
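The product-code shortcut can be sketched in a few lines of Python. This is only an illustration: the label strings are hypothetical examples rather than real part numbers, the helper names are invented here, and the capacity fallback simply encodes the rule of thumb from the paragraph above, so the spec sheet remains the authority.

```python
import re
from typing import Optional

# Matches codes such as "1Rx8" or "2Rx4": <ranks>R x <chip width>
RANK_CODE = re.compile(r"\b(\d)Rx(\d+)\b")


def ranks_from_label(label: str) -> Optional[int]:
    """Return the rank count found in a module label, or None if absent."""
    match = RANK_CODE.search(label)
    return int(match.group(1)) if match else None


def guess_ranks_from_capacity(capacity_gb: int) -> Optional[int]:
    """Rule of thumb from the text for DDR4 DIMMs; not a guarantee."""
    return {8: 1, 16: 2}.get(capacity_gb)


# Hypothetical label strings, for illustration only
for label in ("8GB 1Rx8 PC4-2400T", "16GB 2Rx8 PC4-2400T", "8GB DDR4 kit"):
    ranks = ranks_from_label(label)
    print(f"{label!r}: {ranks if ranks is not None else 'no rank code found'}")
```

When no rank code is printed on the module, `guess_ranks_from_capacity(8)` falls back to the 8GB-is-probably-single-rank assumption.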
As we finally come to the data, our results lend credence to the idea that, all things being equal, DR memory configurations are a touch faster than SR configs for the purposes of PC gaming. But all things aren’t equal when it comes to overclocking memory, and we’ll explore that in the conclusion.
Automatic timings vs. manual tuning
Every overclocker knows that memory runs on “timings,” which are the various wait periods PC memory must observe as it completes a full cycle of reading or writing data. Lowering the timing values (making them more aggressive) can yield better performance by shrinking the wait periods. However, timings that are too aggressive can easily lead to instability and memory corruption as the memory struggles to accurately read and write its own data.
Motherboards generally take on all the heavy lifting of setting the complicated list of memory timings through mechanisms like SPD and XMP. These timings are configured to balance the fussy triangle of performance, compatibility, and stability. But was there something being left on the table? Sami intervened to find out, and his results couldn’t be clearer: overclockers with the wherewithal to hand-tune their memory timings can extract notably better performance in the PC games we looked at. Some games might be less sensitive to memory timings, but these tasks seem to love it.
Frequency or timings?
Last, but not least, Sami set out to find whether it was tighter timings or higher clockspeeds that mattered most on the AMD Ryzen™ processor. Sami pushed this combination of hardware up to DDR4-3520, DDR4-3466 with tighter timings, and DDR4-3200 with the tightest timings that could be achieved while maintaining stability with Memtest.
The verdict: tighter timings won. DDR4-3200 with aggressive timing adjustments outperformed the looser timings needed to hit DDR4-3520, while 3466 clearly split the difference with the right balance of timings and frequency.
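One way to see why a lower data rate with tighter timings can come out ahead is to convert timings from clock cycles into nanoseconds. The sketch below is our own illustration, not part of AMD's test method. It uses the tCL and tRFC values from the settings listed in the conclusion, and relies on the standard DDR relationship that one clock period equals 2000 divided by the data rate in MT/s, expressed in nanoseconds. The names `cycles_to_ns` and `CONFIGS` are made up for this example.

```python
def cycles_to_ns(cycles: int, data_rate_mts: int) -> float:
    """Convert a timing in clock cycles to nanoseconds.

    The memory clock is half the data rate, so one cycle lasts
    2000 / data_rate_mts nanoseconds.
    """
    return cycles * 2000 / data_rate_mts


# Values taken from the settings listed later in the post; tRFC is not
# listed for the DDR4-3466 configuration, so it is omitted here.
CONFIGS = {
    "DDR4-3200 maxed": {"rate": 3200, "tCL": 12, "tRFC": 224},
    "DDR4-3466 tuned": {"rate": 3466, "tCL": 14},
    "DDR4-3520 tuned": {"rate": 3520, "tCL": 14, "tRFC": 312},
}

for name, cfg in CONFIGS.items():
    parts = [
        f"{timing} = {cycles_to_ns(cycles, cfg['rate']):.2f} ns"
        for timing, cycles in cfg.items()
        if timing != "rate"
    ]
    print(f"{name}: " + ", ".join(parts))
```

In absolute terms, CL12 at DDR4-3200 works out to 7.5 ns, slightly shorter than CL14 at DDR4-3520 (about 7.95 ns), and the tRFC gap is larger still (140 ns versus roughly 177 ns). This only covers the timings shown; it says nothing about the effect of command rate, GDM, or BGS.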
Putting it all together
Now that we’ve picked through the data in isolation, we thought it would prove useful to take a mile-high view and draw some conclusions about what we found from our data set, and how that might impact gaming on the AMD AM4 platform.
- DDR4-3200 “maxed” settings: tCL = 12, tRCDW/R = 12, tRP = 12, tRAS = 28, tRC = 54, tWR = 12, tWCL = 9, tRFC = 224, tRTP = 8, tRDRDSCL = 2, tWRWRSCL = 2, ProcODT = 60Ω.
- DDR4-3466 “tuned” settings: tCL = 14, tRCDR/W = 14, tRP = 14, tRAS = 28, ProcODT = 60Ω, CR = 1T, GDM = Disabled, BGS = Disabled.
- DDR4-3520 “tuned” settings: tCL = 14, tRCDW/R = 14, tRP = 14, tRAS = 30, tRC = 56, tWR = 14, tWCL = 12, tRFC = 312, ProcODT = 53.3Ω.
- Conclusion #1: Dual rank DIMMs (yellow) offered the best performance amongst “set and forget” (light blue, orange, yellow) memory configured automatically by XMP profiles.
- Conclusion #1a: But the increased overclocking headroom of single rank modules was more than enough to overpower the benefits of rank interleaving, so manually-tuned single rank DDR4-3200 and 3466 won the day (dark blue and green).
- Conclusion #2: BankGroupSwap should likely be disabled for users that want the best PC gaming performance. As always, test your specific use case.
- Conclusion #3: Chasing the highest possible clockspeed required timings so relaxed that real world performance suffered versus lower frequencies with tighter timings. This is a fine balance, however, so testing on your platform is always helpful.
- Conclusion #4: Geardown Mode should likely be disabled if your overclock is stable with a 1T command rate. As always, test your specific use case.
We hope these insights prove useful, and we’re looking forward to your feedback.