Google TPU v5
Realizes: High-throughput tensor acceleration for deep learning training and inference
Google's fifth-generation TPU (v5) is a datacenter AI accelerator optimized for large matrix multiplies. Each chip carries more matrix units than its v4 predecessor, and chips assembled into TPU v5 pods deliver higher aggregate TFLOPS, with pod-scale interconnects that sustain large-language-model training and inference.
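As a rough illustration of how such an accelerator consumes a large matrix multiply, the hardware tiles the product into blocks sized to its matrix units. The 128×128 tile dimension below is an assumption for illustration only (not a published v5 spec); the sketch counts how many tile-level multiply-accumulate passes an M×K×N product requires.

```python
import math

def mxu_tile_passes(m: int, k: int, n: int, tile: int = 128) -> int:
    """Count tile-level matmul passes for an (m x k) @ (k x n) product
    when the hardware multiplies fixed tile x tile blocks.
    tile=128 is a hypothetical matrix-unit dimension, for illustration."""
    tiles = lambda d: math.ceil(d / tile)
    # One pass per (row-tile, column-tile, inner-tile) combination.
    return tiles(m) * tiles(n) * tiles(k)

# A 4096 x 4096 x 4096 matmul decomposes into 32 * 32 * 32 tile passes.
print(mxu_tile_passes(4096, 4096, 4096))  # 32768
```

More matrix units per chip mean more of these tile passes can execute concurrently, which is where the per-chip TFLOPS gain over v4 comes from.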
Examples
Google Cloud TPU v5 pods
Google Cloud's official TPU v5 documentation describes pods combining hundreds of TPU v5 chips, each with expanded matrix units and higher TFLOPS, to train massive models and serve inference at multi-exaFLOP aggregate throughput with low latency.
DENSE_MATRIX_MULTIPLY
CONVOLUTION
LLM_TRAINING
high-throughput
pod-scale
≈45 pJ per fused multiply-add (bfloat16)
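The ≈45 pJ per bfloat16 fused multiply-add quoted above can be turned into a back-of-envelope energy figure for a whole matmul. The sketch below counts only arithmetic energy (one FMA per inner-product term) and ignores memory and interconnect costs, so it is a lower bound:

```python
def matmul_energy_joules(m: int, k: int, n: int, pj_per_fma: float = 45.0) -> float:
    """Arithmetic-only energy for an (m x k) @ (k x n) matmul at
    pj_per_fma picojoules per fused multiply-add (the ~45 pJ bfloat16
    figure above). Excludes memory/interconnect energy: a lower bound."""
    fmas = m * k * n  # one fused multiply-add per inner-product term
    return fmas * pj_per_fma * 1e-12

# A 4096^3 matmul: 4096**3 FMAs at 45 pJ each, about 3.09 J of MXU energy.
print(round(matmul_energy_joules(4096, 4096, 4096), 2))  # 3.09
```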