NVIDIA Tesla
Realizes: dense floating-point and tensor algebra for HPC and AI training workloads
NVIDIA Tesla GPU compute cards deliver massively parallel floating-point and tensor acceleration for HPC, AI training, and inference. Each board packs thousands of CUDA cores and Tensor Cores behind blower-style cooling, combining NVLink interconnects, HBM memory, and the CUDA programming model to feed them.
Examples
Tesla V100
Tesla V100 boards combine 5,120 CUDA cores, 640 Tensor Cores, 900 GB/s of HBM2 bandwidth, and NVLink 2.0 to tackle large-scale transformer training, standing in for hundreds of CPU cores on dense matrix-multiply and convolution workloads.
Operations: MATRIX_MULTIPLY, CONVOLUTION
Timescale: milliseconds
Scale: cluster
Energy: mJ
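The millisecond timescale quoted above can be sanity-checked with a simple roofline-style estimate. This sketch uses the peak figures commonly quoted for the V100 (about 125 TFLOPS of FP16 Tensor Core throughput, and the 900 GB/s HBM2 bandwidth from the example above); the function name and matrix sizes are illustrative, not from the source.

```python
# Roofline-style estimate for one dense GEMM (C = A @ B) on a Tesla V100.
# Assumed peak figures: ~125 TFLOPS FP16 Tensor Core throughput, 900 GB/s HBM2.

def gemm_estimate(m, n, k, bytes_per_elem=2,
                  peak_flops=125e12, peak_bw=900e9):
    """Return (compute-bound ms, memory-bound ms) lower bounds."""
    flops = 2 * m * n * k                           # one multiply-add per MAC
    traffic = bytes_per_elem * (m * k + k * n + m * n)  # one pass over A, B, C
    return flops / peak_flops * 1e3, traffic / peak_bw * 1e3

compute_ms, memory_ms = gemm_estimate(8192, 8192, 8192)
print(f"compute-bound: {compute_ms:.2f} ms, memory-bound: {memory_ms:.3f} ms")
```

For an 8192-cubed FP16 GEMM the compute bound lands in the single-digit-millisecond range and dominates the memory bound, which is consistent with the millisecond timescale and with these workloads being compute-bound on Tensor Cores.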