TPU vs GPU: Which to Use for AI?

Last update: June 11, 2026

The TPU vs GPU debate isn't new, but it's definitely alive. GPUs have powered deep learning for the last decade, but Google's Tensor Processing Units are now a serious contender. This comparison covers each chip, how they differ, their intended workloads, and where to rent them in 2026.

What Is a TPU?

TPU stands for Tensor Processing Unit. It's an application-specific integrated circuit (ASIC) designed by Google to accelerate tensor and matrix math, the core operations that power neural networks.

Unlike a general-purpose CPU or a highly flexible GPU, a TPU is specialized for the data flows of machine learning. Its systolic array architecture is optimized for massive matrix and vector operations, drastically reducing memory access bottlenecks.

How Google's TPU Was Built for Deep Learning from the Ground Up

TPUs were conceived in 2013, when Google projected that demand for neural network inference would double its datacenter footprint. Building enough CPU capacity for that load would have cost billions and years of construction. The solution was a purpose-built chip for matrix operations at a fraction of the power and cost.

Google went from concept to production in just 15 months. TPU v1 quietly powered Google Search, Photos, Translate, and YouTube before anyone outside the company knew it existed; Google unveiled it publicly in 2016. The key architectural decision was the systolic array: a grid of multiply-accumulate units that streams data through in a wave, reusing intermediate results and reducing the memory reads that slow down general-purpose hardware.

What Is a GPU?

A GPU, or Graphics Processing Unit, was originally designed to render pixels. Rendering requires computing thousands of polygons simultaneously, so GPUs were built for parallelism: thousands of small cores performing relatively simple operations at once. In the late 2000s, researchers discovered that this architecture also excelled at training neural networks, giving GPUs a second career.

NVIDIA released CUDA in 2007, giving developers a programming model that could target GPU cores for general computation. Researchers built entire ecosystems of libraries, frameworks, and tools on top of it. By the time deep learning exploded in the early 2010s, NVIDIA GPUs were already the accelerator of choice.

Why NVIDIA Dominates

NVIDIA's dominance comes down to two compounding advantages: hardware and ecosystem. Modern NVIDIA GPUs combine thousands of CUDA cores with dedicated Tensor Cores for mixed-precision matrix multiplication. PyTorch, the dominant AI framework, runs natively on NVIDIA GPUs with no translation layer required.

This combination has made NVIDIA GPUs the default choice for training, fine-tuning, and inference. Most frameworks, models, and workflows run on NVIDIA hardware out of the box.

Google's TPU Product Lines: Every Generation Explained

Figure: Timeline of Google TPU generations, from v1 (2015) through v6 Trillium (2025), with release years and key milestones.

TPU v1 to v4: Early Generations

The first four generations of TPU established the foundation for everything that followed.

TPU v1 (2015) was an inference-only chip built on a 28-nanometer process, consuming just 40 watts while delivering 92 trillion 8-bit operations per second. It plugged into standard servers through PCIe and was 15 to 30 times faster than contemporary GPUs for production inference workloads, with 30 to 80 times better operations per watt.

TPU v2 (2017) added training support by replacing 8-bit integer arrays with bfloat16 floating-point units, plus high-bandwidth memory and multi-chip pod scaling. It used 128×128 matrix multiplication units and a 2D torus interconnect.
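
To make the bfloat16 format concrete: it keeps float32's sign bit and 8 exponent bits but only 7 mantissa bits, so a float32 value can be converted by keeping its top 16 bits. The sketch below uses plain truncation for simplicity (an assumption; hardware typically rounds rather than truncates), and the helper names are illustrative, not part of any TPU API.

```python
import struct

def float32_to_bfloat16_bits(x: float) -> int:
    """Return a 16-bit bfloat16 pattern for x by truncating its float32 encoding."""
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    # Keep sign (1 bit) + exponent (8 bits) + top 7 mantissa bits.
    return bits32 >> 16

def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a bfloat16 bit pattern back into a Python float."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits16 << 16))
    return value

def to_bfloat16(x: float) -> float:
    """Round-trip x through the truncated bfloat16 representation."""
    return bfloat16_bits_to_float(float32_to_bfloat16_bits(x))

if __name__ == "__main__":
    for v in (1.0, 3.14159265, 1e30, 1e-30):
        print(v, "->", to_bfloat16(v))
```

Very large and very small values survive the conversion because the exponent range is the same as float32; what is lost is precision, which neural network training tends to tolerate.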

TPU v3 (2018) introduced liquid cooling and higher clock speeds.

TPU v4 (2021) upgraded to a 3D torus topology, cutting worst-case communication latency at large pod sizes. It became the chip behind many of Google's production models.
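
One way to see why the extra dimension helps is to count worst-case hops. In a torus, each dimension is a ring, and the farthest node on a ring of size k is k // 2 hops away. The sketch below compares two hypothetical layouts of the same 4,096 chips; the layouts are assumptions for illustration, and hop count is only a rough proxy for latency.

```python
def torus_diameter(dims):
    """Worst-case hop count between two nodes in a torus with wraparound links."""
    # Along each ring of size k, the farthest node is k // 2 hops away.
    return sum(k // 2 for k in dims)

def node_count(dims):
    """Total number of nodes in a torus with the given ring sizes."""
    total = 1
    for k in dims:
        total *= k
    return total

if __name__ == "__main__":
    layout_2d = (64, 64)       # hypothetical 2D torus
    layout_3d = (16, 16, 16)   # hypothetical 3D torus, same chip count
    for name, dims in (("2D", layout_2d), ("3D", layout_3d)):
        print(name, node_count(dims), "chips, worst case",
              torus_diameter(dims), "hops")
```

Under these assumptions the 3D layout needs 24 hops in the worst case versus 64 for the 2D layout.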

TPU v5e and v5p: Cost-Efficient Training

The v5 series introduced specialization within a single generation. The v5e is the cost-optimized variant: $1.20 per chip-hour on-demand, 393 trillion INT8 operations per second per chip, and 2.5 times more throughput per dollar than TPU v4. The v5p is the high-performance variant for large-scale enterprise LLM training, priced at $4.20 per chip-hour and delivering 2.8 times faster LLM training than v4 across 4,096-chip pods.

TPU v6e (Trillium): Performance Uplift

Trillium (TPU v6e) is the sixth-generation TPU and the direct predecessor to Ironwood. It uses 256×256 systolic arrays and runs at roughly 300W TDP, and an 8-chip configuration delivers 7,344 BF16 TFLOPS. On Google Cloud it is priced at $2.70 to $3.24 per chip-hour on demand.

TPU v7 (Ironwood): Built for Inference

Ironwood (TPU v7) became generally available in late 2025 and is the most powerful custom silicon Google has built. Each chip delivers:

  • 4,614 FP8 TFLOPS
  • 192 GB of HBM3E with 7.37 TB/s of memory bandwidth
  • 9.6 Tb/s ICI bandwidth to neighboring chips

A full superpod integrates 9,216 chips for 42.5 exaFLOPS of FP8 compute, with 2 times better performance per watt and 4 times better performance per chip than Trillium (v6e).
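
The superpod figure follows from the per-chip numbers. A quick check, using only the values quoted above and decimal units (1 exaFLOPS = 1,000,000 TFLOPS, 1 PB = 1,000,000 GB); the aggregate memory figure is derived here, not quoted from Google:

```python
CHIPS_PER_SUPERPOD = 9_216
FP8_TFLOPS_PER_CHIP = 4_614
HBM_GB_PER_CHIP = 192

def superpod_exaflops(chips, tflops_per_chip):
    """Aggregate peak compute in exaFLOPS (1 exaFLOPS = 1,000,000 TFLOPS)."""
    return chips * tflops_per_chip / 1_000_000

def superpod_hbm_petabytes(chips, gb_per_chip):
    """Aggregate HBM capacity in petabytes (1 PB = 1,000,000 GB)."""
    return chips * gb_per_chip / 1_000_000

if __name__ == "__main__":
    print(round(superpod_exaflops(CHIPS_PER_SUPERPOD, FP8_TFLOPS_PER_CHIP), 1))
    print(round(superpod_hbm_petabytes(CHIPS_PER_SUPERPOD, HBM_GB_PER_CHIP), 2))
```

These are peak numbers obtained by simple multiplication; sustained throughput on a real workload will be lower.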

TPU vs GPU: Key Differences Explained

Architecture: Systolic Arrays vs CUDA Cores

The architectural difference between TPUs and GPUs explains most of the performance and flexibility trade-offs downstream.

A TPU's systolic array is a grid of multiply-accumulate units that streams data in a single direction. Each unit receives values from its neighbor, computes a result, and passes it along, maximizing data reuse and reducing expensive memory reads. Modern TPUs use 256×256 arrays (Trillium and Ironwood), so each array performs 65,536 multiply-accumulate operations per clock cycle.
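
The data-reuse idea can be sketched with a toy, cycle-by-cycle simulation. This is an output-stationary design chosen for readability (each unit keeps one output value while operands flow past it); it is an illustrative assumption, not a description of Google's actual dataflow, and the function name is hypothetical.

```python
def systolic_matmul(A, B):
    """Multiply A (n x k) by B (k x m) on a simulated n x m grid of MAC units.

    Returns (C, edge_reads), where edge_reads counts operands fetched
    from the input matrices at the edges of the array.
    """
    n, k, m = len(A), len(B), len(B[0])
    acc = [[0] * m for _ in range(n)]     # one accumulator per unit
    a_reg = [[0] * m for _ in range(n)]   # operand moving left to right
    b_reg = [[0] * m for _ in range(n)]   # operand moving top to bottom
    edge_reads = 0

    # Row i of A and column j of B enter i and j cycles late (the "skew"),
    # so matching operands meet at unit (i, j) on the same cycle.
    for t in range(n + m + k - 2):
        # Visit units from the far corner so each one still sees its
        # neighbors' values from the previous cycle.
        for i in reversed(range(n)):
            for j in reversed(range(m)):
                if j == 0:
                    s = t - i
                    if 0 <= s < k:
                        a_in = A[i][s]
                        edge_reads += 1
                    else:
                        a_in = 0
                else:
                    a_in = a_reg[i][j - 1]
                if i == 0:
                    s = t - j
                    if 0 <= s < k:
                        b_in = B[s][j]
                        edge_reads += 1
                    else:
                        b_in = 0
                else:
                    b_in = b_reg[i - 1][j]
                a_reg[i][j], b_reg[i][j] = a_in, b_in
                acc[i][j] += a_in * b_in
    return acc, edge_reads

if __name__ == "__main__":
    C, reads = systolic_matmul([[1, 2, 3], [4, 5, 6]],
                               [[7, 8], [9, 10], [11, 12]])
    print(C, reads)
```

In this small case the array fetches each input element once (12 reads), whereas 12 independent multiply-accumulates that each loaded both operands would need 24. The gap widens as the array grows.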

A GPU uses CUDA cores organized into Streaming Multiprocessors (SMs), paired with Tensor Cores for matrix operations. This architecture handles branching, diverse data types (FP64, FP32, FP16, FP8, INT8), and non-uniform workloads that a systolic array struggles with. The trade-off: general-purpose flexibility takes up chip space that could be used for raw matrix throughput.

Performance: Real-World Benchmarks

For large-batch matrix multiplication (the dominant operation in transformer training), TPUs hold a measurable advantage at pod scale. Trillium (v6e) in an 8-chip configuration delivers 7,344 BF16 TFLOPS at roughly 300W TDP, about 10% more than a quad-H100 NVL system's 6,682 TFLOPS at 700W. At pod level the advantage compounds further: TPU pods scale to 9,216 chips via purpose-built ICI interconnects, while GPU clusters rely on NVLink, NVSwitch, or InfiniBand, adding latency and configuration complexity.

For smaller workloads, single-model inference, or tasks with irregular compute patterns, GPUs typically win on raw latency. The H100 handles diverse batch sizes and data types without the tuning that TPUs require.

Energy Efficiency: Performance Per Watt

Energy efficiency is where TPUs show a structural advantage. The Trillium v6e runs at 300W TDP versus the H100's 700W, delivering 2 to 2.5 times better performance per watt for transformer training. Ironwood extends that lead with 2 times better performance per watt than Trillium, making it the most energy-efficient AI accelerator Google has shipped.

For organizations with sustainability targets or power-constrained data centers, this efficiency gap translates directly into lower operating costs.

Framework Support: TensorFlow, PyTorch, and JAX

Framework support is one of the most significant practical differences between TPUs and GPUs, and the gap has narrowed but not closed.

GPUs natively support major frameworks: PyTorch, TensorFlow, JAX, and scientific computing libraries. Almost any open-source model or research codebase runs on a GPU without modification.

TPUs require code to pass through the XLA (Accelerated Linear Algebra) compiler, which integrates well with JAX and TensorFlow but needs an additional backend layer for PyTorch (PyTorch/XLA). Migrating a standard PyTorch codebase often means debugging compilation traces, restructuring loops, and handling host-device transfer patterns.

It's worth noting that Ironwood (TPU v7) dropped TensorFlow support entirely. Only JAX and PyTorch are supported going forward.

Cost: Cloud Pricing and Total Cost of Ownership

On-demand pricing is only part of the cost story. TPU chips are rented exclusively on Google Cloud, at on-demand rates ranging from $1.20 (v5e) to $4.20 (v5p) per chip-hour. GPU pricing is more competitive: the H100 is available from AWS, Azure, GCP, Lambda, RunPod, CoreWeave, Thunder Compute, and many other providers, so vendors compete on price.

Total cost should also include engineering time. Migrating a PyTorch workflow to TPUs is a major investment, and the resulting savings must outweigh that cost for TPUs to be economically attractive.
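
A rough way to reason about this is a break-even calculation: how many accelerator-hours does it take for the hourly saving to repay a one-time migration cost? The sketch below is a simplification. The $50,000 migration cost is a hypothetical placeholder, and it assumes one TPU chip-hour does the same work as one GPU-hour, which is unlikely to hold exactly for a real workload.

```python
def break_even_hours(migration_cost_usd, gpu_rate_usd_per_hour, tpu_rate_usd_per_hour):
    """Accelerator-hours needed before hourly savings repay a one-time migration cost.

    Returns None when the TPU option is not cheaper per hour.
    Assumes equal throughput per accelerator-hour (a strong simplification).
    """
    hourly_saving = gpu_rate_usd_per_hour - tpu_rate_usd_per_hour
    if hourly_saving <= 0:
        return None
    return migration_cost_usd / hourly_saving

if __name__ == "__main__":
    # Hypothetical: $50,000 of engineering time, comparing a $6.88/hour GPU
    # rate against the $4.20/hour v5p rate.
    hours = break_even_hours(50_000, 6.88, 4.20)
    print(f"Break-even after about {hours:,.0f} accelerator-hours")
    # Against a cheaper GPU rate there is no break-even at all.
    print(break_even_hours(50_000, 1.38, 4.20))
```

If the GPU rate you can actually get is lower than the TPU rate, the function returns None: no amount of usage recovers the migration cost.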

Google TPU vs NVIDIA GPU: A Head-to-Head Comparison

Google Ironwood TPU vs NVIDIA H100: How Do They Stack Up?

| Specification | Google Ironwood (TPU v7) | NVIDIA H100 SXM |
| --- | --- | --- |
| Peak FP8 TFLOPS (per chip) | 4,614 | 3,958 |
| Memory per chip | 192 GB HBM3E | 80 GB HBM3 |
| Memory bandwidth | 7.37 TB/s | 3.35 TB/s |
| TDP (power draw) | ~350W (est.) | 700W |
| Max pod / cluster scale | 9,216 chips (ICI) | ~512 GPUs (NVLink / InfiniBand) |
| Framework support | JAX, PyTorch/XLA | PyTorch, TensorFlow, JAX, CUDA |
| Cloud access model | Google Cloud Platform | 40+ cloud providers |

TPU v7 pricing is not yet fully public; H100 pricing reflects on-demand rates across major providers as of June 2026.

Where Google TPUs Win

TPUs hold genuine advantages in specific, large-scale scenarios. For training very large transformer models at pod scale, TPU pods offer lower communication latency and higher memory bandwidth than equivalent GPU clusters. The ICI interconnect is a purpose-built fabric; GPU clusters depend on InfiniBand or NVLink, adding configuration complexity and potential bottlenecks.

Ironwood's roughly 2 times better perf/watt also creates a real cost advantage for 24/7 hyperscale inference. Organizations deep in the Google Cloud ecosystem (with JAX codebases and existing GCP commitments) additionally benefit from tight integration with Vertex AI, BigQuery, and Google's managed training infrastructure.

Where NVIDIA GPUs Win

NVIDIA GPUs win on flexibility, availability, and ecosystem. PyTorch, the dominant framework for AI research and production, runs natively on GPUs with no compiler layer. Any open-source checkpoint, any Hugging Face library, any research codebase works on an NVIDIA GPU out of the box.

GPUs are also available on-premises, on consumer hardware, and across 40+ cloud providers competing on price. That competition keeps GPU rates lower and gives developers leverage that simply does not exist with TPUs, which are locked to Google Cloud.

When to Use a TPU vs a GPU

When to Use TPUs

TPUs are a strong choice for:

  • Training very large models (10B+ parameters) on Google Cloud with JAX or TensorFlow, with enough sustained chip-hours to justify committed-use discounts.
  • High-volume batch inference at hyperscale (millions of requests per day), where v6e or Ironwood cost advantages compound over time.
  • Workflows native to Google's AI ecosystem: Vertex AI, BigQuery ML, or the Gemini APIs.
  • Transformer architectures with large, predictable matrix shapes that map cleanly to systolic arrays without extensive tuning.

Use Cases Where GPUs Win

GPUs are the better choice for the majority of AI practitioners:

  • Any workflow built on PyTorch, including fine-tuning models, custom training loops, or production inference.
  • Research and experimentation, where rapid iteration and framework flexibility are essential.
  • Smaller teams who cannot absorb the engineering cost of migrating to XLA.
  • On-premises deployments or multi-cloud strategies.
  • Any workload that is not purely matrix-heavy.

Where to Rent TPUs (and GPUs): Cloud Provider Comparison

TPUs are available exclusively through Google Cloud. There is no marketplace, no alternative provider, and no on-premises option. If you want a TPU, you are committing to GCP's pricing, tooling, support model, and data residency policies. Spot TPU instances are available at lower rates but carry interruption risk for long training runs.

For GPUs, the market is much bigger. AWS, Azure, GCP, Lambda, CoreWeave, Thunder Compute, and a long list of providers offer NVIDIA GPUs at varying price points.

If you want to minimize cost while maximizing flexibility, Thunder Compute offers NVIDIA A100 and H100 instances with on-demand pricing well below the major hyperscalers. Instances are commitment-free, with per-minute billing and persistent storage.

TPU and GPU Cloud Pricing: What to Expect in 2026

| Accelerator | Provider | On-Demand Price (per GPU or chip hour) |
| --- | --- | --- |
| TPU v5e | Google Cloud | $1.20-$1.56 |
| TPU v6e (Trillium) | Google Cloud | $2.70-$3.24 |
| TPU v5p | Google Cloud | $4.20 |
| NVIDIA A100 80 GB | Thunder Compute | $0.78 |
| NVIDIA H100 80 GB | Thunder Compute | $1.38 |
| NVIDIA A100 80 GB | AWS | $3.43 |
| NVIDIA H100 80 GB | AWS | $6.88 |
| NVIDIA A100 80 GB | Google Cloud | $11.06 |
| NVIDIA H100 80 GB | Google Cloud | ~$5.07 |

Prices are on-demand, single-GPU equivalents in US regions as of June 2026.

The pricing overview shows the access problem with TPUs: they are a single-vendor product with pricing set entirely by Google. GPU pricing, by contrast, is shaped by real competition.
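
To see what that competition means for a concrete job, the sketch below prices the same run on each provider using the H100 rates from the pricing table. The job size (8 GPUs for 72 hours) is a hypothetical example, and the calculation ignores storage, networking, and any discounts.

```python
# H100 80 GB on-demand rates (USD per GPU-hour), copied from the pricing table.
H100_ON_DEMAND_USD_PER_HOUR = {
    "Thunder Compute": 1.38,
    "Google Cloud": 5.07,  # listed as approximate in the table
    "AWS": 6.88,
}

def job_cost(rate_per_gpu_hour, gpus, hours):
    """Compute-only cost of running `gpus` accelerators for `hours` hours."""
    return rate_per_gpu_hour * gpus * hours

def rank_providers(rates, gpus, hours):
    """Return (provider, cost) pairs sorted from cheapest to most expensive."""
    costs = {name: round(job_cost(rate, gpus, hours), 2)
             for name, rate in rates.items()}
    return sorted(costs.items(), key=lambda item: item[1])

if __name__ == "__main__":
    # Hypothetical job: 8 GPUs for 72 hours.
    for provider, cost in rank_providers(H100_ON_DEMAND_USD_PER_HOUR, gpus=8, hours=72):
        print(f"{provider}: ${cost:,.2f}")
```

The same pattern works for any accelerator in the table; only the rate dictionary changes.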

For most developers building with PyTorch, fine-tuning open-source models, running inference workloads, or simply experimenting with AI, the right GPU choice on the right provider is the most practical and cost-effective path.


