Ever since AMD shipped Ryzen 7nm CPUs, the enthusiast community has been busy trying to find ways to squeeze a bit more performance out of the product stack. One of the areas of concern highlighted by users is that Ryzen Master — AMD’s utility for overclocking and tweaking its Ryzen CPUs — identifies a different “best” core for workloads than the actual CPU core often used for most single-threaded tests.
I saw this issue in action when I did the testing for our recent evaluation of the 1usmus power plan. The screenshot of Ryzen Master below shows my 3900X running Cinebench R20 in single-threaded mode. The workload is being executed on C02, not on C04, C01, or C03, which are the three best-performing cores in the CCX according to Ryzen Master, as indicated by the gold and silver stars.
While this screenshot was taken when the workload was mostly executing on C02, the most popular core for single-threaded work across the entire test run was C05. While the workload wouldn’t stick to C04 or C11 for very long, I did note slightly higher clock speeds when these cores were engaged compared with the non-starred cores. C04 appears to reliably boost a bit higher than C05, even though C05 was the most popular core for work.
This has led to concerns that the Windows 10 scheduler was harming performance by refusing to schedule work on the appropriate CPU core. AMD has now published a blog post on this topic, explaining why neither Ryzen Master nor Windows 10 was wrong, and why it’s changing the way Ryzen Master works anyway.
The UEFI maintains a ranking of the CPU cores as generated during the final test and assembly process. Both Windows and Ryzen Master use this data to determine their respective core rankings. The fact that the CPU cores are ranked roughly 3 percent apart does not mean that each CPU core is 3 percent faster than the next — AMD chose to order the core rankings this way because “arbitrarily ranking the cores ~3% apart is perfect for communicating to the OS which core(s) are fastest, without leaving room for rounding errors when CPPC2’s abstract and unit-less performance scale gets converted to CPU frequency selection for your workloads.”
CPPC stands for Collaborative Processor Performance Control, which is defined in the ACPI 6.2 specification (page 527); the “2” in CPPC2 denotes the second revision of the interface. Said specification states: “Collaborative processor performance control defines an abstracted and flexible mechanism for OSPM (OS Power Management) to collaborate with an entity in the platform to manage the performance of a logical processor.” In short, AMD ranks the cores the way it does because doing so lets the company communicate its management strategy to Windows clearly and effectively, not because each core is literally three percent faster or slower than the next.
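To make the rounding argument concrete, here is a minimal Python sketch. It is not AMD's firmware logic: the 255 top score, the core order, and the `ranked_scores` helper are all assumptions chosen for illustration. It shows only that unit-less scores spaced about 3 percent apart stay distinct after rounding to integers, while a much smaller spacing collapses into ties.

```python
# Illustrative sketch only. The top score (255), the core order, and the
# helper name are assumptions, not values taken from AMD or the ACPI spec.

def ranked_scores(core_order, top_score=255, step=0.03):
    """Give each core an integer score, best core first,
    with each score roughly `step` below the previous one."""
    return {
        core: round(top_score * (1 - step) ** position)
        for position, core in enumerate(core_order)
    }


# Hypothetical ordering, best core first.
core_order = ["C04", "C01", "C03", "C05", "C02", "C06"]

# With ~3% spacing, sorting by score recovers the intended order exactly.
scores = ranked_scores(core_order)
recovered_order = sorted(scores, key=scores.get, reverse=True)

# With 0.1% spacing, rounding merges neighbouring cores into the same score,
# so the ordering can no longer be recovered from the scores alone.
tight_scores = ranked_scores(core_order, step=0.001)
distinct_tight_scores = len(set(tight_scores.values()))
```

The ordering is the only information carried here; the size of the gap between two scores says nothing about how much faster one core actually is.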
Why Windows 10 and Ryzen Master See Two Different Things in the Same CPU
Windows 10 doesn’t choose a single “best” CPU core. Instead, Windows 10 prioritizes finding the fastest pair of CPU cores within each CCX and swaps workloads back and forth between them. I happened to capture a screenshot of what this handoff looks like in Ryzen Master when running Cinebench R20:
This screenshot was taken with 1usmus’ plan engaged, which is why most of the other CPU cores are asleep, but it shows the handoff between C04 and C05. Windows 10 attempts to keep single-threaded workloads split between a pair of CPU cores to share the electrical and thermal loading. Ryzen Master, on the other hand, awards its gold star to the single core with the highest maximum overclocking potential.
In short, Ryzen Master is checking to see which single CPU core ranks the best for overclocking purposes, while Windows is checking for something more along the lines of “Which pair of CPU cores within the same CCX offer the best overall performance?” The two evaluations are based on the same data set, but they aren’t checking for the same thing. As a result, Ryzen Master has reported a different “best” CPU core than the ones Windows favors.
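The difference between the two questions can be sketched in a few lines of Python. Everything here is hypothetical: the scores, the two-CCX layout, and above all the "highest combined score" rule for picking a pair, which is an assumption made for illustration and not a description of how the Windows scheduler actually decides.

```python
from itertools import combinations

# Hypothetical per-core scores and CCX layout (not measured data).
scores = {"C01": 255, "C02": 220, "C03": 215,
          "C04": 247, "C05": 240, "C06": 210}
ccx_of = {"C01": 0, "C02": 0, "C03": 0,
          "C04": 1, "C05": 1, "C06": 1}


def best_single_core(scores):
    """Ryzen Master-style question: which one core ranks highest?"""
    return max(scores, key=scores.get)


def best_pair_in_ccx(scores, ccx_of):
    """Windows-style question: which two cores in the same CCX rank best
    together? Summing the two scores is an assumed stand-in criterion."""
    same_ccx_pairs = [
        (a, b)
        for a, b in combinations(sorted(scores), 2)
        if ccx_of[a] == ccx_of[b]
    ]
    return max(same_ccx_pairs, key=lambda pair: scores[pair[0]] + scores[pair[1]])


single = best_single_core(scores)        # "C01"
pair = best_pair_in_ccx(scores, ccx_of)  # ("C04", "C05")
```

In this made-up data set the top-ranked core sits in a CCX with two weak neighbours, so the best pair lives in the other CCX and does not include the top-ranked core at all. Both answers come from the same table of scores.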
Going forward, AMD will adjust Ryzen Master so that it reports exactly the same information Windows 10 does, eliminating this source of confusion. The company’s consistent messaging on the matter of managing Ryzen’s clock speed has been that leaving the matter to Windows 10 results in better outcomes than attempting to manage it manually. This may not be technically true in the case of flat, all-core overclocks, but those kinds of manual configurations also tend to toss power efficiency out the window. My own experience as a reviewer backs up AMD’s guidance. I’ve made a few attempts to improve Ryzen performance using the sorts of performance enhancement options vendors typically offer in UEFI. They’ve generally resulted in slightly lower performance, not higher.
AMD’s guidance for achieving maximum performance on 7nm Ryzen CPUs remains unchanged: Update to the Windows 10 May 2019 Update (version 1903) or later, install the latest chipset drivers, and use a UEFI from AGESA 1002 or later. If you wish to make absolutely certain these capabilities are engaged, set your Global C-States and CPPC settings to AUTO ON or force-enabled.
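For readers who want to confirm the Windows requirement from a script, the following Python sketch compares a version string against build 18362, the build number of Windows 10 version 1903. The helper names are hypothetical, and the check covers only the OS version; it says nothing about chipset drivers, AGESA, or UEFI settings.

```python
import platform

# Windows 10 version 1903 (May 2019 Update) corresponds to build 18362.
MIN_BUILD_1903 = 18362


def build_number(version_string):
    """Extract the build from a string such as '10.0.18362'.
    Returns None if the string does not have that shape."""
    parts = version_string.split(".")
    if len(parts) >= 3 and parts[2].isdigit():
        return int(parts[2])
    return None


def meets_minimum(version_string, minimum=MIN_BUILD_1903):
    """True if the version string names a build at or above `minimum`."""
    build = build_number(version_string)
    return build is not None and build >= minimum


if platform.system() == "Windows":
    # On Windows, platform.version() returns a string like '10.0.18362'.
    print("1903 or later:", meets_minimum(platform.version()))
```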
Read more here: ExtremeTechComputing – ExtremeTech