Google TPU v5
Realizes: High-throughput tensor acceleration for deep learning training and inference
Google's fifth-generation TPU (v5) is a datacenter AI accelerator optimized for large matrix multiplies. Each chip carries more matrix units than its v4 predecessor, and chips assembled into TPU v5 pods deliver higher aggregate TFLOPS, with pod-scale interconnects that sustain large-language-model training and inference.
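As a rough illustration of how such an accelerator consumes a large matrix multiply, the hardware tiles the product into blocks sized to its matrix units. The 128×128 tile dimension below is an assumption for illustration only (not a published v5 spec); the sketch counts how many tile-level multiply-accumulate passes an M×K×N product requires.

```python
import math

def mxu_tile_passes(m: int, k: int, n: int, tile: int = 128) -> int:
    """Count tile-level matmul passes for an (m x k) @ (k x n) product
    when the hardware multiplies fixed tile x tile blocks.
    tile=128 is a hypothetical matrix-unit dimension, for illustration."""
    tiles = lambda d: math.ceil(d / tile)
    # One pass per (row-tile, column-tile, inner-tile) combination.
    return tiles(m) * tiles(n) * tiles(k)

# A 4096 x 4096 x 4096 matmul decomposes into 32 * 32 * 32 tile passes.
print(mxu_tile_passes(4096, 4096, 4096))  # 32768
```

More matrix units per chip mean more of these tile passes can execute concurrently, which is where the per-chip TFLOPS gain over v4 comes from.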
Examples
Google Cloud TPU v5 pods
Google Cloud's official TPU v5 documentation describes pods combining hundreds of TPU v5 chips, each with expanded matrix units and higher TFLOPS, to train massive models and serve inference at multi-exaFLOP aggregate throughput with low latency.
DENSE_MATRIX_MULTIPLY
CONVOLUTION
LLM_TRAINING
high-throughput
pod-scale
≈45 pJ per fused multiply-add (bfloat16)
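The ≈45 pJ per bfloat16 fused multiply-add quoted above can be turned into a back-of-envelope energy figure for a whole matmul. The sketch below counts only arithmetic energy (one FMA per inner-product term) and ignores memory and interconnect costs, so it is a lower bound:

```python
def matmul_energy_joules(m: int, k: int, n: int, pj_per_fma: float = 45.0) -> float:
    """Arithmetic-only energy for an (m x k) @ (k x n) matmul at
    pj_per_fma picojoules per fused multiply-add (the ~45 pJ bfloat16
    figure above). Excludes memory/interconnect energy: a lower bound."""
    fmas = m * k * n  # one fused multiply-add per inner-product term
    return fmas * pj_per_fma * 1e-12

# A 4096^3 matmul: 4096**3 FMAs at 45 pJ each, about 3.09 J of MXU energy.
print(round(matmul_energy_joules(4096, 4096, 4096), 2))  # 3.09
```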