Whether we’re talking dollars in your bank account or dates you’ve got lined up on Tinder, more is generally considered to be better. It’s a sentiment that also seems to hold for the number of cores in your computer’s CPU, at least if you buy into the marketing.
But hold on! Having many cores definitely gives you a boost in multi-threaded applications like rendering 3D animations, but there are situations where more cores provide no benefit whatsoever or can even hurt your system’s performance.
Understanding the Problem With Many-Core Processors
Well, to start off with, the more cores you pack onto a CPU, the more power they need and the more heat they generate. And remember that because CPU cores are crammed into a relatively small space, manufacturers end up working against some serious limits when it comes to thermal design power, or TDP.
This means that to prevent the CPU from drawing too much power and producing too much heat, the individual cores have traditionally run at lower clock frequencies to improve efficiency. And even if the advertised boost clock for a CPU with lots and lots of cores appears to be high, it’s often the case that the cores cannot maintain those clocks for long periods of time, or that they only reach them when you’re running lightly threaded applications.
A small change in the core clock frequency can have a dramatic impact on the heat load, because higher clocks usually need higher voltage as well. To keep the heat output of a CPU from getting out of hand, modern CPUs include some fairly aggressive thermal control technology that adjusts the core clock frequency on the fly. On Intel processors, for example, this is the Thermal Control Circuit, or TCC, which kicks in when the chip approaches its maximum temperature and throttles the cores until things cool down. It isn’t a heat sink and doesn’t remove any heat itself; it makes the CPU produce less heat by slowing it down. That keeps your system from overheating, but it costs you performance.
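To put rough numbers on the claim that a small clock change has an outsized effect on heat, here is a minimal Python sketch using the textbook approximation for dynamic power in CMOS chips, P ≈ C·V²·f. The function names are made up for illustration, and the assumption that voltage has to rise in step with frequency is a simplification: real chips follow their own voltage/frequency curves and also burn static (leakage) power that this ignores.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Textbook CMOS dynamic power approximation: P = C * V^2 * f."""
    return capacitance * voltage ** 2 * frequency


def relative_power(freq_scale, voltage_tracks_frequency=True):
    """Power relative to baseline when the clock is scaled by freq_scale.

    ASSUMPTION: if voltage_tracks_frequency is True, voltage is scaled by the
    same factor as frequency, so power grows roughly with the cube of the
    clock. If False, voltage is held constant and power grows linearly.
    """
    voltage_scale = freq_scale if voltage_tracks_frequency else 1.0
    baseline = dynamic_power(1.0, 1.0, 1.0)
    return dynamic_power(1.0, voltage_scale, freq_scale) / baseline


if __name__ == "__main__":
    # Compare a few clock changes against the baseline (scale 1.0).
    for scale in (0.8, 0.9, 1.0, 1.1, 1.2):
        print(f"clock x{scale:.1f} -> power x{relative_power(scale):.2f}")
```

Under these assumptions, a 10% clock bump costs roughly a third more power, and all of that ends up as heat. Run it the other way and you can see why many-core chips back off their clocks: dropping the clock by 20% roughly halves the power per core.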
Scenarios Where an Expensive Multicore Processor Fails to Be Beneficial
So if you’re using your computer mostly for applications where single-threaded performance matters more, such as games, that super expensive 18-core CPU might actually yield a worse experience than something cheaper. And if you go with a really high core count CPU, there’s another wrinkle with how processors with that many cores access the system memory.
You see, in some cases, these larger CPUs need to have their cores split into two groups, or nodes, with each group getting its own memory controller and a segment of the physical memory in a scheme called non-uniform memory access, or NUMA.
This is generally quicker than the alternative, called uniform memory access or UMA, where all the cores share one big pool of memory. But here’s the problem: neither approach is free. With one shared pool, every core competes for the same memory bandwidth, so the faster your processor is and the more cores it has, the more bandwidth it demands and the faster your system memory needs to be to keep all of those cores fed.
NUMA vs. UMA
A CPU that uses NUMA, which is better for latency-sensitive applications, can often struggle when running a single program that uses tons of threads. Memory access times differ between the nodes, and each node may have to wait on the other one to finish working on the same data, so highly multi-threaded programs such as Blender, Adobe Premiere, and Adobe Photoshop often don’t want to cross nodes, even if doing so would mean being able to take advantage of the entire CPU.
So back to UMA then, right? No. Because all memory accesses are managed to give every program equal time, rather than letting each node reach its own memory more directly as in NUMA, UMA has a built-in performance penalty that increases the more nodes your system has to manage. So using a CPU with separate groups of cores means you’re going to be subject to one of these drawbacks, and you’re going to take a performance hit either way.
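The trade-off between the two memory schemes can be sketched as a weighted average of local and remote access times. This is a toy model, not a measurement: the latency figures and the function name are invented for illustration, and treating UMA as "half of all accesses go to the other node" is an assumption about a two-node chip.

```python
def average_latency_ns(local_ns, remote_ns, remote_fraction):
    """Average memory latency when a share of accesses hit the other node."""
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns


# HYPOTHETICAL latencies, chosen only to make the arithmetic easy to follow.
LOCAL_NS = 80.0    # a core reading memory attached to its own node
REMOTE_NS = 130.0  # a core reading memory attached to the other node

# ASSUMPTION: the remote fractions below are illustrative guesses.
SCENARIOS = {
    "NUMA, program stays on one node": 0.0,
    "UMA-style, accesses spread evenly over two nodes": 0.5,
    "NUMA, threads mostly working on the other node's data": 0.9,
}

if __name__ == "__main__":
    for name, remote_fraction in SCENARIOS.items():
        latency = average_latency_ns(LOCAL_NS, REMOTE_NS, remote_fraction)
        print(f"{name}: {latency:.0f} ns")
```

With these made-up numbers, the evenly spread case always lands in the middle: never as quick as a program that stays on its own node, never as slow as one whose threads keep reaching across to the other node.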
These are problems that you simply don’t run into on smaller chips with fewer cores because you’re not dealing with multiple nodes. But getting away from memory access, sometimes the cores themselves are designed in a way that bottlenecks them more, the more of them you slap onto a chip.
Do you remember how, before Ryzen, AMD’s processors seemed to be significantly slower than Intel’s despite having more cores? Well, a big reason for this was that those old Bulldozer FX processors didn’t use full cores. Instead, FX CPUs advertised as having 8 cores would, in reality, have 8 integer units but only 4 floating-point units shared between the 8 cores. You could think of these CPUs as having 4 half-cores missing, which severely hampered their performance in some critical applications.
Now, this design allowed AMD’s processors to handle more threads for a cheaper price, but it also meant that their real-world performance lagged way behind Intel’s. The only way AMD could try to compensate was to increase clock speeds, which increased heat output and contributed to AMD’s reputation for hot-running CPUs for many years.
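The shared floating-point units can be pictured with a deliberately crude sketch. The unit counts (8 integer, 4 floating-point) come from the description above; everything else is an assumption, including the function name and the idea that each floating-point unit serves exactly one thread at a time, which is simpler than how the real hardware scheduled work.

```python
def threads_running_at_once(threads, integer_units=8, fpu_units=4, fp_heavy=False):
    """How many threads can make progress at once in this toy model.

    ASSUMPTION: integer-heavy threads each need one integer unit, while
    floating-point-heavy threads each need a whole floating-point unit.
    """
    if threads < 0:
        raise ValueError("threads must not be negative")
    units = fpu_units if fp_heavy else integer_units
    return min(threads, units)


if __name__ == "__main__":
    for workload, fp_heavy in (("integer-heavy", False), ("float-heavy", True)):
        running = threads_running_at_once(8, fp_heavy=fp_heavy)
        print(f"8 {workload} threads: {running} run at once")
```

In this model an "8-core" chip behaves like an 8-core chip for integer work but like a 4-core chip once every thread leans on floating-point math, which is the gap between the number on the box and the performance you actually got.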
What’s Our Bottom Line, Then?
These days, however, both AMD and Intel are using much wiser strategies for their multicore CPUs, along with clever boosting techniques, to give them single-threaded performance similar to that of their less costly brethren.
But if the best sales pitch for a super-premium product is that it doesn’t suffer a performance penalty in the applications that you use, well, you’d better make sure you’ve got a use case for it before spending your hard-earned cash. And no, playing Fortnite and watching Netflix on your system definitely doesn’t count.