An Explanation of CUDA Cores vs Stream Processors

When you’re trying to choose a graphics card, it’s natural to want an apples-to-apples comparison of the technical specifications between two cards. For example, if one GPU has 4GB of DDR4 RAM, and another card has 8GB DDR4 RAM, it’s easy to use that information. If you don’t understand all the technical details involved, you’ll still definitively know some things about the performance of each card. Even if all you know is what you can read on the system requirements sheet of your favorite game, the information can be put to practical use.

But unfortunately those kinds of comparisons aren’t always possible when you’re comparing cards between the world’s two foremost GPU developers, AMD and NVidia. Apples-to-apples comparisons are conditionally possible when comparing an AMD card to an AMD card, or a NVidia GPU to another NVidia GPU. But direct comparison between the two is impossible. That’s because each brand uses its own unique approach to building a graphics card. One of those differences are called CUDA Cores and Stream Processors.

What Are Cores?

If you know anything about CPUs, then you’ve probably heard about multi-core processors. The processor you’ll find in an ordinary home computer usually has between two and eight cores. That’s the same range you see with smartphones and other mobile devices. CPUs are built with a small number of cores, and each core is made to be exceptionally good at specific types of computing.

Whether you’re talking about a CPU or a GPU, these cores are what does the “thinking” in a processor. And each core is essentially like its own brain, performing sets of tasks independently. Having several cores provides your system the ability to do a special type of multitasking called parallel processing, which creates more efficiency from the processor.

Cores Built Differently Also Work Differently

For example, imagine you’re playing a videogame on a computer with a quad-core processor. Between those four cores, the game may task one core to operate enemy AI, the second core with loading tasks, the third core with the game physics engine, and so on. Being able to split those jobs several ways means less work for each core, and an overall more efficient use of computing power.

While consumer CPUs use a small handful of very powerful cores, GPUs are built with a large number of comparably less powerful cores. Those types of differences between CPUs and GPUs exists because each do very different types of processing. A GPU is great for doing parallel tasks, like working out how thousands of pixels need to appear in fractions of a second.

Technically speaking, they’re great with iterative thinking and numerical works. But when it comes to things like single-thread execution, short-running kernels, and branch-heavy code, GPU’s are slow and inept. For tasks like that, you need the kind of architecture CPUs are built around.

nvidia cuda cores

What are CUDA Cores?

Like there are essential differences between CPUs and GPUs, there are similar fundamental differences between NVidia GPUs and AMD GPUs. In the most superficial sense, CUDA Cores are simply the branded name which NVidia uses to refer to the cores used on their products. But more importantly, CUDA specifies that those cores are using a specific type of architecture, similar to the difference between an Intel and AMD processor. More than simply being a set of cores, CUDA is an interface to access those cores and communicate with the rest of your system. The cores that execute those instructions are called CUDA cores.

What are Stream Processors?

Likewise, Stream Processors are what AMD calls the cores and their associated infrastructure. If you want to put aside the branded terms, a more ubiquitous name which refers to each is “pixel pipeline.” Another term commonly used online is “shader”, which is technically a correct use of the phrase, but is vague enough to be easily confused.

flow of cuda

CUDA Cores vs. Stream Processors

When you’re comparing two cards with the same architecture, having more cores will generally mean a stronger card. But even with those conditions attached, that rule may be broken depending on a number of other factors. To understand, it would help to understand more about the architecture of a GPU.

What is GPU architecture? It refers to the way the various parts of a GPU are constructed to interrelate to one another. Technically speaking, the architecture dictates things like how the register file interacts with a specific set of cores, how those cores interact with the symmetric multi-processor, how that interacts with the L1 cache, and so on. As these components are arranged with subtle differences in construction and communication, the resulting differences in performance are difficult to characterize in simple terms.

If that sounds confusing, think of it this way. There are dozens of different ways you might build a car. The engine might be under the hood or in the trunk. You may lower or raise the frame by an inch. You might construct a massive 18-wheeler, or a compact sedan. The construction of the frame, the miles per gallon you can get from the engine, your safety in a crash, there are countless ways to try and construct the same type of machine.

Comparing Cards

AMD and NVidia make machines that try to do the same thing, but they go about it in different ways. Suppose you were comparing an AMD card with a similar GPU from NVidia. Even if one card had 500 CUDA Cores and the other had 500 Stream Processors, their differences in architecture ensure both card have a unique performance. Much like the difference between a sedan and an SUV, each type of GPU will perform in its own distinctive way.

gtx-1070

If you’re trying to compare CUDA Cores to Stream processors, you can’t think of them as equal in either their quantity or quality. Even between two GPUs with the same number of cores and clock frequencies, you will not end up with the same performance. To make matters even more complicated, sometimes you’ll see CUDA cores or Stream Processors used in addition to another type of core, like Tensor cores. And those cores, too, often defy immediate comparisons.

You can find precisely one circumstance where it’s possible to make an apples-to-apples comparison between the cores of two GPUs: when both GPUs are using the same architecture. And don’t be fooled, being simply designed by AMD or NVidia doesn’t mean every card from those brands will use the same architecture. Both brands routinely develop new architecture to based their cards upon, and those architectural advancements usually occur along generational lines.

In other words, it would be possible to make an apples-to-apples comparison between the GTX 1070 and GTX 1080. They’re both NVidia cards from the 10XX series, and they both use the same Pascal architecture. The same can be done for AMD cards. But if you really want to understand differences in core performance, the best way isn’t staring at a spec sheet, it’s turning your attention to performance tests.

Measuring The Difference

If you want to know how well a GPU performs, the best method is through a benchmarking test. GPUs are built to crank out as many frames as they can at all times, unless they’ve specifically been instructed to do otherwise, like with V-sync. For all intents and purposes, that means the winner of any given benchmark will be whatever card can produce the highest frame rates. Unless you’ve got terribly specific testing in mind, you don’t need anything like benchmark software to do your own testing. All you’ve got to do is load up your favorite game, and see how many frames come out.

However, there is one more sticking point. Suppose you have two GPUs, one from NVidia and another from AMD. Suppose both those cards are identical in their power, and have equally smart infrastructure. Even in those circumstances, a direct comparison can be perilous because from one game to the next, driver support can vary wildly. One game will have better support for NVidia than AMD, while another game might favor AMD over NVidia. Driver support can make a night-and-day difference in performance, but things like that aren’t reflected in a simple comparison of CUDA Cores to Stream Processors.

A Quick Recap

If you’re feeling lost, here are the main points you should takeaway. Stream Processors and CUDA Cores are branded names for the same thing: a multi-core processor and the set of rules for its operation. In practice, the two are fundamentally different because AMD and NVidia each use their own unique architecture. The way they carry out operations is fundamentally different.

Trying to directly compare Stream Processors to CUDA Cores to each other is like trying to determine how much gas mileage you’ll have by looking at the diameter of your gas tank. It makes much more sense to just take the care for a drive. And if you don’t want to do the testing yourself, the Internet is full of other people’s benchmarks. Looking at hard data will always be far more valuable than making comparisons between CUDA Cores and Stream Processors.

Share This Article

Leave a Reply