What are CUDA Cores?
Learn the basics of CUDA cores, plus a simple example of how they enable parallel work on NVIDIA GPUs.
CUDA Cores are the individual processing units inside NVIDIA GPUs that execute floating-point and integer arithmetic operations. A single GPU can contain thousands of CUDA cores working in parallel.
What Is CUDA Programming?
CUDA programming is how developers write code that runs on NVIDIA GPUs instead of only on the CPU. It breaks work into many parallel threads so thousands of CUDA cores can execute them at once, which is ideal for vector math and ML workloads.
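To make the thread model concrete, here is a rough pure-Python sketch of how CUDA maps one thread to one element. The sizes (`N`, `THREADS_PER_BLOCK`) are made up for illustration; real kernels launch thousands of blocks, and the loop below runs sequentially where a GPU would run every iteration in parallel.

```python
# Hypothetical sizes chosen for illustration only.
N = 10                 # number of elements to add
THREADS_PER_BLOCK = 4  # threads per block
NUM_BLOCKS = (N + THREADS_PER_BLOCK - 1) // THREADS_PER_BLOCK  # ceiling division

a = list(range(N))
b = list(range(N))
c = [0] * N

# Each (block, thread) pair computes one global index, mirroring
# `blockIdx.x * blockDim.x + threadIdx.x` in a real CUDA kernel.
for block in range(NUM_BLOCKS):
    for thread in range(THREADS_PER_BLOCK):
        i = block * THREADS_PER_BLOCK + thread
        if i < N:  # guard: the last block may have extra threads past the end
            c[i] = a[i] + b[i]

print(c)  # → [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

On a GPU there is no loop: every (block, thread) pair executes simultaneously, which is why adding a million elements takes roughly the same few kernel steps as adding ten.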
Basic CUDA Programming Example
Below is a tiny PyTorch example that runs on the GPU. The element-wise add is split into many GPU threads, which the CUDA cores execute in parallel.

```python
# A simple vector add: each element is handled by its own GPU thread
import torch

a = torch.randn(1_000_000, device="cuda")
b = torch.randn(1_000_000, device="cuda")
c = a + b  # the add is distributed across the GPU's CUDA cores
```
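The snippet above raises an error on machines without a CUDA-capable GPU. A common pattern, assuming PyTorch is installed, is to select the device at runtime and fall back to the CPU:

```python
import torch

# Fall back to the CPU when no CUDA-capable GPU is present.
device = "cuda" if torch.cuda.is_available() else "cpu"

a = torch.randn(1_000_000, device=device)
b = torch.randn(1_000_000, device=device)
c = a + b  # same code runs on either device
```

The same line `c = a + b` then works everywhere; only where the arithmetic executes changes.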
NVIDIA GPU CUDA Core Comparison
| Graphics Card | Architecture | CUDA Cores |
|---|---|---|
| RTX 3090 | Ampere | 10,496 |
| RTX 4090 | Ada Lovelace | 16,384 |
| RTX 5090 | Blackwell | 21,760 |
| RTX A6000 | Ampere | 10,752 |
| A100 | Ampere | 6,912 |
| H100 | Hopper | 14,592 |