Computers aren't just electronic! You can compute things with pasta, ropes, or light. Our goal here is to exhaustively list every system we currently know of that is capable of computing something.


204 systems
AMD CDNA f(x) = High-throughput HPC accelerator families
CDNA (CDNA 1 and CDNA 2) is AMD's compute-focused GPU architecture powering the Instinct MI100 and MI200, building on matrix cores and Infinity Fabric connectivity to link chiplets, HBM stacks, and SIMD pipelines for large-scale sparse and dense matrix workloads.
AMD GCN f(x) = Graphics Core Next compute-and-graphics pipeline
Graphics Core Next architecture is a 28nm AMD GPU microarchitecture powering Radeon HD 7000 series and FirePro W9000 boards; it uses compute units with scalar and vector execution to accelerate graphics and OpenCL workloads.
AMD RDNA f(x) = real-time graphics rendering and compute acceleration with RDNA compute units
AMD RDNA (Navi 10) is a 7nm architecture with 36 compute units and 32-wide SIMD wavefronts that deliver the rasterization and general-purpose throughput powering Radeon RX 5700-class GPUs, exposing asynchronous compute engines, command processors, and refined geometry pipelines for modern gaming and compute workloads.
AMD RDNA2 f(x) = Discrete GPU architecture for consumer and professional graphics compute
AMD RDNA2 architecture powers the Radeon RX 6000 series, combines dedicated ray accelerators with AMD Infinity Cache, and balances throughput across gaming and compute workloads.
AMD RDNA3 f(x) = High-end graphics and compute acceleration for Radeon RX 7900 series
AMD RDNA3 uses 5nm+ Navi 3x chiplets with improved shader engines, wider workgroup processors, chiplet-based Infinity Cache, and ray accelerators to deliver higher throughput and lower power for the Radeon RX 7900 series.
AMD RDNA4 f(x) = Navi 4x GPU architecture
The upcoming Navi 4x iteration of the RDNA lineage, AMD RDNA4 expands the ray accelerator fabric and AI feature suite while remaining a deterministic, irreversible, exact execution platform for advanced graphics and compute.
AMD Terascale f(x) = Unified VLIW5 shader compute and Eyefinity multi-display pipeline
The HD 5000-era Terascale architecture couples 40nm VLIW5 shader arrays with Eyefinity-aware raster and memory subsystems, providing a general-purpose GPU fabric where GPGPU shader arrays power multi-display graphics and OpenCL-style compute domains.
AMD Zen f(x) = Revives AMD desktop and server x86 compute by pairing SMT, chiplet modularity, and Infinity Fabric with SenseMI adaptivity to deliver high-IPC throughput.
The 2017 Zen architecture organizes 14nm FinFET dies into four-core CCX clusters with SenseMI telemetry, simultaneous multithreading, and Infinity Fabric links between dies, delivering responsive general-purpose compute.
AMD Zen 2 f(x) = 2019 high-throughput platforms
AMD Zen 2 is a chiplet-based x86 architecture that couples multiple 7nm CPU chiplets, increased shared cache capacity, PCIe 4.0 lanes, and a robust Infinity Fabric interconnect to deliver high throughput for its 2019 desktop and server platforms.
AMD Zen 3 f(x) = premium desktop/EPYC platforms
AMD Zen 3 increases IPC, optimizes its cache hierarchy, and leverages Precision Boost to sustain higher operating frequencies for premium desktop and EPYC platforms on the 7nm process.
AMD Zen 4 f(x) = next-generation Ryzen and EPYC throughput
AMD Zen 4 is a next-generation x86 microarchitecture powering the Ryzen 7000 and EPYC Genoa families. Built on a 5nm FinFET chiplet design, it pairs a stronger DDR5/LPDDR5 memory controller with AVX-512 support to lift Ryzen and EPYC throughput.
ARM Cortex-A15 f(x) = premium tablets/servers
An aggressive out-of-order Cortex-A15 pipeline with virtualization extensions, powering the Samsung Exynos 5 Dual 5250 and competing with Qualcomm's Krait cores in premium tablets and servers.
ARM Cortex-A53 f(x) = energy-efficient mobile workloads
64-bit in-order energy-efficient core found in mid-range smartphones built on the Snapdragon 410 and in single-board computers such as the Raspberry Pi 3, optimized for sustained mobile workloads.
ARM Cortex-A57 f(x) = high-performance mobile compute
ARM's Cortex-A57 is a high-performance, out-of-order 64-bit ARMv8-A core with a 3-wide decode/issue pipeline, aggressive micro-op reordering, and NEON/FPU arrays, routinely paired in big.LITTLE clusters with Cortex-A53 companions in platforms such as Nvidia Tegra X1 and Samsung Exynos to cover flagship smartphone and tablet workloads.
ARM Cortex-A72 f(x) = High-IPC out-of-order execution powering premium 2016 smartphones/tablets (e.g., HiSilicon Kirin 960).
ARMv8-A general-purpose CPU with a wide-issue, high-IPC pipeline, aggressive branch prediction, and NEON/FPU arrays, designed for flagship mobile SoCs around 2016.
ARM Cortex-A76 f(x) = flagship smartphone/PC compute
Arriving in 2018, the ARM Cortex-A76 brought an out-of-order, high-frequency microarchitecture to flagship smartphone and PC compute, forming the performance backbone of Snapdragon 855 and Kirin 990 platforms.
ARM Cortex-A78 f(x) = premium flagship mobile compute
The 2020 ARM core that improved IPC over the Cortex-A77; it is deployed alongside Cortex-X1 and Cortex-A55 companions in the Snapdragon 888 and in rival Dimensity 1200 platforms.
ARM Cortex-A8 f(x) = mainstream smartphones around 2010
ARM Cortex-A8 delivers an in-order dual-issue pipeline with NEON SIMD that powered Apple iPhone 3GS and other early smartphones, offering strong single-core performance for multimedia, UI, and application workflows.
ARM Cortex-A9 f(x) = high-performance 2011 smartphones
The ARM Cortex-A9 is a multi-core, out-of-order general-purpose CPU that powered flagship 2011 devices such as the Samsung Galaxy S II and Nvidia Tegra 2 platforms, delivering energy-efficient performance for mobile workloads.
ARM Cortex-M0 f(x) = simple deterministic microcontroller control
Energy-efficient 32-bit ARMv6-M Cortex-M0 core powering low-power STM32F0, NXP LPC1100, and similar microcontrollers for industrial sensors, wearables, and simple consumer peripherals, offering predictable Thumb instruction execution and deterministic response.
ARM Cortex-M0+ f(x) = ultra-low-power deterministic control
Low-power ARMv6-M microcontroller core with a shortened pipeline that improves efficiency over the Cortex-M0, powering NXP LPC800 and Microchip SAM D10 boards for wearable and sensor applications.
ARM Cortex-M3 f(x) = deterministic embedded applications requiring interrupts
32-bit mid-range MCU core used in STM32F1 and LPC1768 boards; features the Thumb-2 instruction set, nested vectored interrupt controller, and tightly coupled memory for deterministic response.
ARM Cortex-M33 f(x) = deterministic low-power IoT control with security
Security-enhanced ARM Cortex-M33 core with TrustZone, used in the STM32L5 series for constrained secure IoT control.
ARM Cortex-M4 f(x) = signal-processing and real-time control
DSP-ready embedded microcontroller core with single-precision FPU and DSP extensions, powering STM32F4 and Teensy boards for real-time sensing, audio, and control workloads.
ARM Cortex-M55 f(x) = deterministic low-power ML inference
ML-enabled Cortex-M55 with Helium vector extensions, aimed at energy-efficient inference in advanced low-power microcontrollers.
ARM Cortex-M7 f(x) = Deterministic high-speed real-time and AI acceleration
High-performance ARM Cortex-M7 core with a dual-issue pipeline, single-precision FPU, and DSP extensions, powering STMicroelectronics STM32H7, NXP i.MX RT, and Teensy 4.1 boards for physics-based sensing, motor control, and signal-processing applications.
ARM Cortex-M85 f(x) = deterministic real-time plus ML compute
ARM Cortex-M85 is a high-performance microcontroller core with 2nd-generation Helium vector extensions and dual scalar pipelines, powering microcontrollers such as the Renesas RA8 family to blend deterministic control with on-chip ML inference.
ARM Cortex-X1 f(x) = flagship-tier compute
Performance-focused core introduced in 2020 with a wide execution engine, used in Snapdragon 888 platforms to drive flagship compute workloads.
ARM Cortex-X2 f(x) = flagship smartphone/PC compute
ARM Cortex-X2 is ARM's 2022-generation performance CPU core, powering the Snapdragon 8 Gen 1 and driving flagship smartphone and PC compute workloads.
ARM Cortex-X4 f(x) = next-gen ultra-premium mobile/AI compute
ARM Cortex-X4 is ARM's 2023 high-performance flagship core for SoCs such as the Qualcomm Snapdragon 8 Gen 3, emphasizing aggressive pipelining and high IPC to sustain high frequencies for mobile and AI compute workloads.
ARM Mali G12x f(x) = upper-midrange mobile graphics and AI compute
Recent Valhall-generation GPU cluster used in flagship MediaTek Dimensity SoCs, balancing graphics, display, and AI workloads.
ARM Mali G5x f(x) = efficient midrange mobile graphics and compute
The Mali G52 (Bifrost) and G57 (Valhall) series are midrange GPUs used in chipsets like the MediaTek Dimensity 820, targeting efficient mobile graphics and compute.
ARM Mali G7x f(x) = high-end mobile graphics and compute
ARM Mali G77/G78 high-end mobile GPU lineup powering Exynos 1080 and 2100 with Valhall architecture improvements in throughput, efficiency, and feature set.
ARM Mali G9x f(x) = AI and image-processing pipelines for flagship mobile SoCs
ARM Mali G9x-generation GPUs target flagship mobile devices, combining top-tier Valhall rendering with neural front-ends for AI and image processing, accelerating multi-frame photography, computational video, and high-refresh-rate gaming workloads.
ARM Mali T6xx f(x) = mobile GPU and TV GPU acceleration
First-generation ARM Mali T-series (T600 family) graphics clusters integrated into Exynos 5 SoCs, delivering shader-based mobile and smart-TV graphics pipelines.
ARM Neoverse N1 f(x) = datacenter/server compute
2019 server-focused microarchitecture tuned for high-frequency operation, scalable mesh interconnect, and energy-efficient cores for cloud infrastructure workloads.
ARM Neoverse V1 f(x) = HPC/datacenter math
2021 vector-optimized core built for high-throughput floating-point and SVE workloads, highlighted in platforms such as AWS Graviton3.
ARM2 f(x) = early RISC embedded bridging
ARM2 is a 1986 32-bit cacheless RISC design used in early Acorn Archimedes workstations, bridging general-purpose CPUs and early RISC embedded systems.
ARM6 f(x) = mid-1990s embedded workstation
Modernized 32-bit ARM architecture derived from the Acorn lineage, powering Apple Newton PDAs and Acorn Risc PC workstations with a low-power RISC core.
ARM7TDMI f(x) = Cost-sensitive embedded and smartphone control workloads, especially those demanding the Thumb instruction set for high code density.
ARM7TDMI is a 32-bit general-purpose CPU core in the ARM lineage that powered mass-market mobile phones like the Nokia 3310 and many microcontrollers; its Thumb instruction set compression and pipelined CMOS RISC architecture deliver high code density for tight memory budgets while still handling keypad, radio, and sensor control loops.
ARM9 f(x) = Early Symbian and handheld-console workloads
ARM9 five-stage pipeline processors powered early Symbian smartphones and handheld consoles such as the Nintendo DS, balancing efficient instruction throughput with low power draw.
AWS Inferentia f(x) = deep learning inference pipelines
AWS-designed chip for deep learning inference with high throughput and low latency, used in Inferentia-based EC2 Inf1 instances and the Neuron SDK stack.
AWS Trainium f(x) = High-throughput transformer and large-scale neural network training across sparse and dense workloads on EC2 Trn1 clusters.
AWS custom training chip powering EC2 Trn1 instances with high throughput, using Elastic Fabric Adapter networking for massive multi-node synchronization while running dense and sparse machine learning workloads at scale.
Alibaba XuanTie C910 f(x) = Alibaba cloud/edge inference scenarios
The 2020 vector-plus-scalar Alibaba XuanTie C910 couples a wide vector unit with a scalar pipeline to enable edge and AI acceleration workloads.
Antikythera mechanism f(x) = astronomical positions, eclipse prediction, Metonic calendar (multi-cycle gear ratios)
A hand-cranked bronze gearwork device built around 150–100 BC — the oldest known analog computer. Turning a single input crank advances 37 meshing gears whose tooth-count ratios encode the periods of the Sun, Moon, and planets. A differential gear (rediscovered in the 16th century) models the Moon's elliptical speed variation. Front dials show the zodiac position of Sun and Moon and display lunar phase via a half-silvered sphere; rear spiral dials track the 19-year Metonic cycle (235 lunations), the 18-year Saros eclipse cycle, and the 4-year Olympiad. Setting a date predicts eclipses and planetary positions decades ahead. Speed: instantaneous (gears turn as fast as the crank). Capacity: ~10 astronomical cycles simultaneously; eclipse prediction decades in advance.
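Because the mechanism computes with tooth-count ratios, its core arithmetic can be checked with exact fractions. A minimal sketch (variable names are ours; tooth counts follow the standard modern reconstruction of the lunar gear train):

```python
from fractions import Fraction

# One crank turn = one year (one revolution of the solar input wheel).
# The Moon train multiplies that rotation through three meshing gear pairs;
# tooth counts per the published reconstruction of gears b2-c1, c2-d1, d2-e2.
moon_train = Fraction(64, 38) * Fraction(48, 24) * Fraction(127, 32)
print(moon_train)                      # 254/19 sidereal lunar revolutions per year

sidereal_in_19y = moon_train * 19      # 254 sidereal months in 19 years
synodic_in_19y = sidereal_in_19y - 19  # Moon laps minus Sun laps = 235
print(sidereal_in_19y, synodic_in_19y)
```

Cranking 19 year-turns yields 254 sidereal revolutions, and 254 − 19 = 235 synodic months: exactly the Metonic identity tracked by the rear spiral dial.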
Apple ANE (Neural Engine) f(x) = AI accelerator
Apple's Neural Engine inside A-series and M-series SoCs is a dedicated neural processing unit composed of hardware matrix multiply arrays and supporting SRAM/control pipelines that deterministically accelerate inference workloads on-device.
Apple G3 f(x) = personal computing in late 1990s
PowerPC 750 (G3) microprocessor powering early iMacs, PowerBooks, and consumer desktops with a CMOS RISC core, backside cache, and enhanced multimedia units in the late 1990s Apple lineup.
Apple G4 f(x) = media-rich consumer compute
PowerPC 7400-based Apple G4 systems such as the iMac G4 and Power Mac G4 combine a superscalar CMOS core with AltiVec vector units to accelerate media-rich applications.
Apple G5 f(x) = 64-bit desktop-class general-purpose compute
IBM PowerPC 970-based G5 microarchitecture brought 64-bit, dual-core performance to the Power Mac G5 lineup, executing PowerPC/Unix workloads with high-bandwidth caches and AltiVec acceleration while emphasizing Apple's desktop-class general-purpose computing goals.
Apple GPU f(x) = high-efficiency SoC compute (graphics & neural)
Apple integrated GPU (e.g., in M1/M2) featuring unified memory, tile-based rendering, and tight coherence with the CPU to deliver graphics and neural compute within the SoC.
Atmel/Microchip AVR ATmega f(x) = embedded control loops and peripheral interfacing
An 8-bit RISC microcontroller family with rich peripheral set and in-system programmable flash used across embedded projects, most notably as the ATmega328P on the Arduino Uno development board for robotics, sensing, and control.
Atmel/Microchip AVR ATtiny f(x) = Classic embedded
Tiny AVR microcontrollers designed for constrained devices, often found in LED badges and wearable control loops that need tiny flash footprints and low power draw.
Atmel/Microchip AVR32 f(x) = deterministic audio and industrial control
Atmel/Microchip AVR32 is a 32-bit RISC microcontroller architecture with Harvard instruction/data pipelines, targeting deterministic audio signal processing and industrial control applications via the UC3/AP7 families that integrate DMA, codecs, and peripheral controllers.
Belousov-Zhabotinsky (BZ) reaction computer f(x) = boolean logic / reaction-diffusion computation (via chemical wave collisions)
The BZ reaction is an oscillating chemical system that produces propagating excitation waves in a thin layer of reagent (typically ferroin or ruthenium catalyst in acidified bromate/malonate). Signals are encoded as wave fronts; the interaction of two colliding wave fragments implements logic at the collision site. Annihilation corresponds to AND; a wave passing through unimpeded corresponds to OR. Adamatzky demonstrated NOT, OR, AND gates in fixed channel geometries. A light-sensitive variant (with ruthenium catalyst) allows gates to be programmed by illumination patterns. A 2024 Nature Communications paper demonstrated a hybrid digital-chemical programmable array. Speed: ~1-10 mm/min wave propagation; seconds to minutes per gate. Capacity: small logic circuits; limited by wave-front geometry and reagent lifetime.
Billiard-ball computer f(x) = reversible boolean logic (Fredkin gate)
Proposed by Fredkin & Toffoli (1982). Balls travel on paths representing wires; presence/absence of a ball encodes a bit. Collisions at path intersections implement logic gates. Logically and thermodynamically reversible — no information is destroyed. Speed: nanoseconds to microseconds (ball velocity dependent). Capacity: arbitrary boolean circuits (theoretically universal).
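The gate logic is easy to state in code. A sketch of the Fredkin (controlled-swap) gate showing the two properties the entry relies on — reversibility and universality — with function names of our choosing:

```python
def fredkin(c, a, b):
    """Controlled swap: when the control ball c is present, paths a and b swap."""
    return (c, b, a) if c else (c, a, b)

# Reversible: the gate is its own inverse, so no information is destroyed.
for bits in [(c, a, b) for c in (0, 1) for a in (0, 1) for b in (0, 1)]:
    assert fredkin(*fredkin(*bits)) == bits

# Universality sketch: routing a constant-0 ball through the gate makes
# the middle output equal to x AND y.
def AND(x, y):
    _, out, _ = fredkin(x, 0, y)
    return out

print([AND(x, y) for x in (0, 1) for y in (0, 1)])  # [0, 0, 0, 1]
```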
Biological brain f(x) = general intelligence / perception, memory, reasoning, motor control
The human brain contains ~86 billion neurons connected by ~10¹⁵ synapses. Each neuron integrates thousands of synaptic inputs and fires a spike when its membrane potential crosses threshold — a leaky integrate-and-fire operation. Computation is massively parallel, spike-coded, and energy-efficient at ~20 W total. Synaptic weights are plastic: Hebbian learning and spike-timing-dependent plasticity (STDP) modify connection strengths in response to activity, implementing online learning with no separate training phase. The brain solves tasks — scene understanding, language, planning — that remain beyond engineered systems at equivalent energy budgets. Unlike every other entry, the substrate is also the substrate of the observer. Speed: ~100 Hz spike rate per neuron; millisecond reaction times; years of learning. Capacity: ~86 billion neurons; ~10¹⁵ synapses; ~20 W; general-purpose cognition.
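The leaky integrate-and-fire operation mentioned above can be sketched in a few lines (a toy model in arbitrary units, not a biophysical simulation; names and constants are ours):

```python
# Minimal leaky integrate-and-fire neuron: the membrane potential v leaks
# toward rest and jumps with each input; crossing threshold emits a spike
# and resets the membrane.
def simulate_lif(inputs, dt=1e-3, tau=20e-3, v_rest=0.0, v_thresh=1.0):
    v, spikes = v_rest, []
    for t, drive in enumerate(inputs):
        v += dt * (-(v - v_rest) / tau) + drive  # leak + synaptic drive
        if v >= v_thresh:
            spikes.append(t)
            v = v_rest                           # reset after the spike
    return spikes

# Constant drive: the neuron integrates, fires, resets - periodic spiking.
spike_times = simulate_lif([0.06] * 200)
print(spike_times)  # spikes every ~35 steps
```

With constant drive the neuron settles into periodic firing; real neurons add adaptation, refractoriness, and the synaptic plasticity described above on top of this skeleton.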
Boson sampler f(x) = sampling from the permanent of a unitary matrix (classically #P-hard)
Identical single photons enter an m-mode linear optical network (beam splitters and phase shifters implementing a unitary U). Detectors at the outputs sample from a distribution whose probabilities are proportional to |Perm(U_S)|² — the squared permanent of submatrices of U — a quantity believed to be classically intractable to compute. Aaronson & Arkhipov (2011) proved that an efficient classical simulation would collapse the polynomial hierarchy. The device does not solve a user-defined optimization problem; rather, it demonstrates quantum advantage on a specific sampling task. Gaussian boson sampling (GBS) variants use squeezed-light inputs and have been demonstrated at scale (Jiuzhang, 2020). Speed: nanoseconds per sample (photon transit time through chip). Capacity: 53+ photons demonstrated (Jiuzhang); quantum advantage claimed for n≥50.
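The classically hard quantity is the matrix permanent; Ryser's O(2ⁿ·n) formula is the best known exact method, which makes the scaling cliff tangible. A sketch (our code, not a photonics simulation) using the 2-mode 50/50 beam splitter, where the vanishing permanent reproduces Hong-Ou-Mandel photon bunching:

```python
from itertools import combinations

def permanent(M):
    """Ryser's formula: exact permanent of an n x n matrix in O(2^n * n) time."""
    n = len(M)
    total = 0j
    for r in range(1, n + 1):
        for cols in combinations(range(n), r):
            prod = 1 + 0j
            for row in M:
                prod *= sum(row[c] for c in cols)
            total += (-1) ** r * prod
    return (-1) ** n * total

# 50/50 beam splitter unitary; two photons enter, one per input mode.
s = 2 ** -0.5
U = [[s, s], [s, -s]]
# Coincidence probability (one photon per output) is |Perm(U)|^2 - zero here:
# the photons bunch, the interference effect the sampler exploits at scale.
p_coincidence = abs(permanent(U)) ** 2
print(p_coincidence)  # ~0.0
```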
BrainScaleS wafer-scale neuromorphic system f(x) = accelerated analog spiking neural network emulation
BrainScaleS wafer-scale neuromorphic system blends analog wafer-scale integrators with digital control, forming the EBRAINS neuromorphic computing service for fast emulation of spiking neural networks.
CDC 6600 (1964) f(x) = megaflop-frontier scientific floating-point workloads
The CDC 6600 (1964) paired scoreboard-driven pipelines with dedicated peripheral processors to keep its transistorized core aimed at float-heavy scientific targets, chasing the megaflop frontier while isolating I/O on a peripheral backplane.
CEVA DSP cores f(x) = DSP-based audio/5G acceleration
Programmable CEVA DSP cores such as the CEVA-TeakLite and CEVA-XC families power audio codecs, wireless basebands, and modem SoCs.
Cell Broadband Engine f(x) = high-throughput parallel workloads
The Cell Broadband Engine pairs a Power Processing Element (PPE) with multiple Synergistic Processing Elements (SPEs) to support the PlayStation 3 and supercomputers such as IBM Roadrunner, delivering a heterogeneous multi-core fabric for demanding parallel computation.
Cerebras WSE (Wafer-Scale Engine) f(x) = Large-scale deep learning training and inference
The Cerebras Wafer-Scale Engine is a wafer-scale AI accelerator with hundreds of thousands of cores interconnected through a dense on-chip fabric, delivering massive compute for large-scale model training on systems like the CS-2.
Coherent Ising machine (OPO network) f(x) = Ising Hamiltonian ground state / combinatorial optimization (MAX-CUT, QUBO)
A network of degenerate optical parametric oscillator (DOPO) pulses circulating in a fiber ring cavity. Each pulse can oscillate in one of two phase states (0 or π), encoding a spin. Measurement-feedback electronics couple the pulses according to the Ising coupling matrix programmed by the user. As the pump power increases past threshold, the network undergoes bifurcation and settles into a low-energy spin configuration. NTT's 2021 system used 100,000 DOPO pulses in a 5-km fiber loop. Unlike classical or quantum annealers, the CIM operates at room temperature and exploits optical coherence rather than thermal or quantum fluctuations. Speed: microseconds per Ising problem instance. Capacity: up to 100,000 spins (NTT 2021); competitive with quantum annealers on dense graphs.
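A mean-field caricature of the DOPO amplitude dynamics (our toy equations, not NTT's measurement-feedback hardware): each pulse amplitude grows past the pump threshold, saturates via a cubic term, and is steered by the programmed couplings; the sign of each amplitude reads out a spin.

```python
import random

# dx_i/dt = (pump - 1) x_i - x_i^3 + eps * sum_j J[i][j] x_j
# All parameter values are arbitrary illustration choices.
def cim(J, pump=1.2, eps=0.1, dt=0.05, steps=2000, seed=1):
    n = len(J)
    rng = random.Random(seed)
    x = [rng.uniform(-0.01, 0.01) for _ in range(n)]   # vacuum-noise seed
    for _ in range(steps):
        x = [xi + dt * ((pump - 1.0) * xi - xi ** 3
                        + eps * sum(J[i][j] * x[j] for j in range(n)))
             for i, xi in enumerate(x)]
    return [1 if xi > 0 else -1 for xi in x]           # spin readout

# Antiferromagnetic 4-ring (J = -1 on each edge): ground state alternates.
n = 4
J = [[0.0] * n for _ in range(n)]
for i in range(n):
    J[i][(i + 1) % n] = J[(i + 1) % n][i] = -1.0
spins = cim(J)
print(spins)  # alternating, e.g. [1, -1, 1, -1] or its global flip
```

On this easy instance the bifurcation amplifies the alternating mode fastest, so the network settles into one of the two degenerate ground states.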
Coupled oscillator network (Kuramoto / XY model) f(x) = MAX-CUT / graph partitioning (approximate)
A network of identical oscillators — pendula, LC circuits, or CMOS ring oscillators — coupled to their neighbours by springs or resistive links. The Kuramoto model describes how each oscillator's phase evolves under the pull of its neighbours. When the coupling weights encode a graph's edge weights, the system's stable phase configuration minimizes the same energy function as MAX-CUT: oscillators partition into two phase-locked clusters (0° and 180°) that approximately bisect the graph. Implemented in silicon as oscillator-based Ising machines with up to 1440 CMOS nodes; reported within 99% of optimal MAX-CUT on tested benchmarks. Speed: microseconds to milliseconds (oscillator ring-down time). Capacity: graph problems with hundreds to thousands of nodes.
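The phase dynamics above can be sketched as gradient flow on the XY energy E = Σ w·cos(θᵢ − θⱼ) (function names are ours; silicon oscillator Ising machines add sub-harmonic injection locking to binarize the phases):

```python
import math
import random

def oscillator_maxcut(n, edges, steps=2000, dt=0.05, seed=0):
    rng = random.Random(seed)
    theta = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    for _ in range(steps):
        grad = [0.0] * n
        for i, j, w in edges:
            s = w * math.sin(theta[i] - theta[j])
            grad[i] += s        # descent on E pushes coupled phases apart
            grad[j] -= s
        theta = [t + dt * g for t, g in zip(theta, grad)]
    # Round each phase to the cluster of theta[0] or its antipode.
    side = [0 if math.cos(t - theta[0]) > 0 else 1 for t in theta]
    cut = sum(w for i, j, w in edges if side[i] != side[j])
    return cut, side

# Complete bipartite K_{2,3}: every edge can be cut, so MAX-CUT = 6.
edges = [(i, j, 1.0) for i in (0, 1) for j in (2, 3, 4)]
best = max(oscillator_maxcut(5, edges, seed=s)[0] for s in range(5))
print(best)  # 6.0
```

Random restarts stand in for the noise-driven exploration of the physical network; on frustrated graphs the settled configuration is only approximately optimal, as the entry notes.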
DEC Alpha 21064 f(x) = High-frequency floating-point compute
1992 64-bit RISC design with a dual-issue pipeline and on-chip 8 KB instruction and data caches, positioned for DEC server racks and AlphaStation workstations.
DEC PDP-11 f(x) = interactive workstation time-sharing, general-purpose operating system development
The DEC PDP-11 married an unusually orthogonal instruction set with the UNIBUS, letting its transistorized backplane with microprogrammed control deliver interactive time-sharing workstations that popularized UNIX and RSX operating systems, while responsive DMA-friendly peripherals influenced a generation of general-purpose CPU designs.
DEC PDP-8 f(x) = Instrument control and laboratory automation
The DEC PDP-8 combines a 12-bit accumulator, low-cost rack packaging, and roots in real-time control applications, embodying the minicomputer lineage.
DEC VAX f(x) = general-purpose server/UNIX workloads
DEC VAX introduced a 32-bit CISC ISA with demand-paged virtual memory, vectored interrupt handling, and cache-coherent multiprocessor CPU/cache modules, enabling general-purpose server and UNIX workloads across VMS and Unix environments.
DNA computer (Adleman 1994) f(x) = Hamiltonian path via strand hybridization
Leonard Adleman's 1994 demonstration solved the directed Hamiltonian path problem using DNA strand hybridization. Cities encoded as DNA sequences, flight connections as complementary strands. Massively parallel biochemical search. Speed: hours to days (biochemical reactions). Capacity: combinatorial search problems (limited by DNA synthesis/sequencing).
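The protocol is easy to mirror in software: the chemistry generates every candidate path in parallel by hybridization, while code must enumerate. A sketch on a small illustrative digraph (not Adleman's actual 7-city instance), with the filtering steps labeled after his PCR/gel/bead separations:

```python
from itertools import product

# Illustrative directed graph; vertices play the role of DNA "city" sequences.
edges = {(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4), (2, 4)}
n, start, end = 5, 0, 4

# Step 1 (ligation): form all candidate vertex sequences of length n.
candidates = product(range(n), repeat=n)
# Steps 2-4 (PCR amplification, gel electrophoresis, bead separation):
# keep sequences that start/end correctly, use only real edges, and
# visit every city exactly once.
hamiltonian = [p for p in candidates
               if p[0] == start and p[-1] == end
               and len(set(p)) == n
               and all((a, b) in edges for a, b in zip(p, p[1:]))]
print(hamiltonian)  # [(0, 1, 2, 3, 4)]
```

The contrast is the point: the test tube performs the "generate" step in one massively parallel reaction, while the filter steps are days of wet-lab work.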
DNA strand-displacement computer f(x) = boolean logic / neural network inference (via hybridization cascades)
Single-stranded DNA molecules in solution compute via toehold-mediated strand displacement: a short single-stranded 'toehold' on a partially double-stranded gate complex allows an input strand to invade, displace, and release an output strand. Presence/absence of a strand encodes a bit. Cascades of these reactions implement AND, OR, NOT, NAND, NOR, XOR, and threshold gates without enzymes or moving parts. Qian and Winfree (2011) demonstrated a four-bit square-root circuit from 130 DNA strands; a subsequent paper (Nature, 2011) realized a 30-node Hopfield neural network entirely in DNA solution. Speed: minutes to hours per logic operation (hybridization kinetics). Capacity: ~100-gate circuits demonstrated; massively parallel (each molecule is a gate).
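The kinetics can be caricatured with well-mixed mass-action ODEs. A toy cooperative AND gate (our simplification: real seesaw AND gates use separate threshold and amplification species, and the rate constant here is arbitrary) in which the output strand is released only when both inputs occupy their toeholds:

```python
# Concentrations in arbitrary units; forward Euler integration of
# d[output]/dt = k * [in1] * [in2] * [gate].
def and_gate(in1, in2, gate=1.0, k=5.0, dt=0.001, steps=20000):
    out = 0.0
    for _ in range(steps):
        dx = k * in1 * in2 * gate * dt   # both toeholds must be occupied
        in1, in2, gate, out = in1 - dx, in2 - dx, gate - dx, out + dx
    return out

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, round(and_gate(float(a), float(b)), 2))
# high output only for input pair (1, 1)
```

The minutes-to-hours timescales in the entry correspond to these slow hybridization kinetics; the compensation is that every gate molecule in the tube runs in parallel.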
Differential analyzer f(x) = solutions to systems of ODEs (via chained mechanical integration)
Built by Vannevar Bush and Harold Hazen at MIT in 1928–1931, the differential analyzer is a general-purpose analog ODE solver. The core component is a wheel-and-disk integrator: a disk rotates at rate proportional to one variable; a wheel resting on the disk at a radial position proportional to a second variable rotates at their product — implementing ∫ y dx mechanically. Multiple integrators are chained via shafts and differential gears to represent higher-order ODEs. A torque amplifier (Bush's key innovation) prevents the tiny friction coupling from loading the computation. The MIT machine solved sixth-order ODEs; later machines solved 18th-order equations. The device is the missing link between the planimeter (single integral) and the fire-control computer (hardwired ODE). Speed: minutes per ODE solution (shaft rotation time). Capacity: up to 18th-order ODEs (later machines); ~3 significant figures.
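Chaining two integrators is enough to solve y″ = −y, the classic demonstration. A numerical stand-in for the shaft coupling (each loop pass is one small turn of the input crank; names and step size are ours):

```python
import math

# Integrator 1 accumulates y = integral of v dx; integrator 2, fed back
# through a gear reversal, accumulates v = integral of (-y) dx.
dx = 1e-4
y, v = 1.0, 0.0            # initial wheel positions: y(0) = 1, y'(0) = 0
x = 0.0
while x < math.pi:         # crank until the input shaft advances by pi
    y += v * dx            # integrator 1
    v += -y * dx           # integrator 2 (sign flip = reversing gear)
    x += dx
print(round(y, 3))         # -1.0, i.e. cos(pi)
```

The feedback loop between the two integrators is exactly the shaft-and-gear chaining the text describes; higher-order ODEs just chain more integrators.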
Diffractive deep neural network (D²NN) f(x) = neural network inference / image classification (at the speed of light)
A stack of passive, 3D-printed diffraction layers implements a trained neural network entirely in the optical domain. Each layer is a mask with pixel-wise phase or amplitude modulation, trained offline with backpropagation through a differentiable wave-optics model. During inference, light propagates through the layers via diffraction — no active computation occurs. The network function is encoded in the geometry of the passive masks. Lin et al. (2018, Science) demonstrated handwritten-digit classification at terahertz frequencies with 91.75% accuracy. Inference runs at the speed of light with zero dynamic energy consumption beyond the input illumination. Speed: picoseconds (optical propagation through ~cm of layers). Capacity: image classification at THz; scales with aperture area and layer count.
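The linear-optics character is the key point: each mask is a diagonal phase operator and propagation between layers is a fixed unitary, so the whole stack is one passive linear transform. A sketch with random (untrained) masks and a DFT standing in for free-space diffraction — not the trained THz device of Lin et al.:

```python
import cmath
import random

N = 8
rng = random.Random(0)

def dft(vec):
    """Unitary discrete Fourier transform: stand-in for diffraction between layers."""
    return [sum(v * cmath.exp(-2j * cmath.pi * k * m / N)
                for m, v in enumerate(vec)) / N ** 0.5
            for k in range(N)]

# Three passive phase masks (random here; the real device trains them offline).
masks = [[cmath.exp(2j * cmath.pi * rng.random()) for _ in range(N)]
         for _ in range(3)]

def forward(field):
    # Inference = alternately mask and diffract; every step is linear and
    # passive, so the stack could be collapsed into a single fixed matrix.
    for mask in masks:
        field = dft([m * f for m, f in zip(mask, field)])
    return field

out = forward([1.0] + [0.0] * (N - 1))
energy = sum(abs(c) ** 2 for c in out)
print(round(energy, 6))  # 1.0
```

Because every factor is unitary, output energy equals input energy; training only chooses which fixed transform the passive stack implements.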
DishBrain (in-vitro neural culture) f(x) = closed-loop sensorimotor control / game-playing (via biological learning)
~800,000 human iPSC-derived or mouse cortical neurons are plated onto a high-density multi-electrode array (HD-MEA). The DishBrain system (Kagan et al., 2022, Neuron) embeds the culture in a simulated game of Pong: electrode stimulation encodes ball position and side; the recorded neural firing pattern drives paddle movement. Motivated by the free-energy principle — cells prefer predictable stimulation over white noise — the culture learns to rally the ball within five minutes of real-time play. No explicit training algorithm runs; the biology self-organizes. The substrate is neurons-in-a-dish, making this the only entry where the substrate is alive and may be sentient. Speed: minutes to learn; milliseconds per action (neural firing rate). Capacity: closed-loop sensorimotor tasks; ~800,000 neurons, ~22,000 electrodes.
Domino computer f(x) = boolean logic (AND, OR, NOT)
Standing dominoes propagate a falling signal. Fan-outs split signals, and careful geometry implements AND and OR gates. Signal is one-shot — must reset by standing dominoes again. Speed: ~1 domino per second propagation (~10-50 seconds total). Capacity: single boolean expression evaluation (one-shot).
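A one-shot toppling simulation makes the gate semantics concrete (a toy graph model: real domino AND gates rely on timing and blocking geometry, which we abstract into a hit counter; all names are ours):

```python
from collections import deque

# chains: directed adjacency (which dominoes each domino knocks over).
# needs: gate nodes that fall only after the given number of feeder chains hit.
def run(chains, needs, start):
    fallen, frontier = set(start), deque(start)
    hits = {g: 0 for g in needs}
    while frontier:
        d = frontier.popleft()
        for nxt in chains.get(d, []):
            if nxt in needs:
                hits[nxt] += 1
                if hits[nxt] < needs[nxt] or nxt in fallen:
                    continue
            if nxt not in fallen:
                fallen.add(nxt)
                frontier.append(nxt)
    return fallen

# a AND b: the 'and' junction needs both feeder chains before it topples.
chains = {'a': ['and'], 'b': ['and'], 'and': ['out']}
print('out' in run(chains, {'and': 2}, start=['a']))       # False
print('out' in run(chains, {'and': 2}, start=['a', 'b']))  # True
```

The one-shot nature shows up naturally: `fallen` only grows, so re-evaluating means "standing the dominoes up again" by calling `run` afresh.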
ESP32-C3 f(x) = secure low-power IoT edge
Espressif's 32-bit RISC-V wireless MCU with integrated Wi-Fi and BLE connectivity, targeted at IoT deployments.
ESP32-C6 f(x) = secure IoT and smart sensing
Espressif's more capable MCU pairing a 40nm RISC-V core with Wi-Fi 6 and BLE 5.3 to harden compute at the edge for secure IoT and smart sensing.
Galton board (bean machine) f(x) = Gaussian / binomial distribution
Balls dropped through a triangular array of pegs deflect left or right at each level. The distribution of balls in the output bins converges to a Gaussian as N→∞. Each peg is an independent Bernoulli trial. Speed: minutes to hours (depending on ball count). Capacity: statistical sampling (scales with number of balls).
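The peg-by-peg Bernoulli picture is directly simulable: a ball's final bin is its count of rightward bounces, so bin occupancy follows Binomial(levels, ½), which approaches a Gaussian. A quick sketch (parameters are arbitrary):

```python
import random
from collections import Counter

# Each peg is a fair coin flip; the bin index is the number of right bounces.
def galton(balls=100_000, levels=12, seed=42):
    rng = random.Random(seed)
    return Counter(sum(rng.random() < 0.5 for _ in range(levels))
                   for _ in range(balls))

bins = galton()
peak = max(bins, key=bins.get)
print(peak)                  # 6: the central bin, mean = levels / 2
mean = sum(k * v for k, v in bins.items()) / sum(bins.values())
print(round(mean, 2))        # close to 6.0
```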
Gate-based quantum computer f(x) = unitary quantum computation / quantum algorithms (Shor factoring, Grover search, VQE)
A register of qubits — typically superconducting transmons cooled to ~10 mK — whose state is manipulated by sequences of microwave pulses implementing one- and two-qubit unitary gates. Any computation is a product of these gates, forming a universal gate set. Superposition lets a qubit represent 0 and 1 simultaneously; entanglement correlates qubits non-classically; interference is used to amplify correct answers and cancel wrong ones. Shor's algorithm factors n-bit integers in O(n³) gate operations vs. exponential classically; Grover's algorithm searches an unsorted list in O(√N). Current NISQ (noisy intermediate-scale quantum) devices have 100–1000 physical qubits with limited coherence; fault-tolerant quantum computing requires ~1000 physical qubits per logical qubit. Google's 2019 Sycamore experiment claimed quantum supremacy on a sampling task in 200 seconds vs. ~10,000 years classically. Speed: nanosecond gate times; microseconds coherence (NISQ era). Capacity: 53–1121 physical qubits (current hardware); fault-tolerant QC requires orders of magnitude more.
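Grover's quadratic speedup can be checked with a tiny statevector simulation (plain Python, no quantum SDK; the register size and marked item are arbitrary choices): each round is an oracle phase flip followed by inversion about the mean, and ~(π/4)√N rounds concentrate nearly all amplitude on the target.

```python
import math

n, target = 5, 21                            # search N = 2^5 = 32 items
N = 2 ** n
amp = [1 / math.sqrt(N)] * N                 # uniform superposition
rounds = round(math.pi / 4 * math.sqrt(N))   # optimal iteration count
for _ in range(rounds):
    amp[target] *= -1                        # oracle: phase-flip the marked item
    mean = sum(amp) / N
    amp = [2 * mean - a for a in amp]        # diffusion: invert about the mean
p_target = amp[target] ** 2
print(rounds, round(p_target, 3))            # 4 rounds, success prob ~0.999
```

Classically, finding the marked item takes ~N/2 queries on average; here 4 rounds suffice for N = 32, and the gap widens as √N.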
GigaDevice GD32VF RISC-V microcontrollers f(x) = industrial IoT control
GigaDevice's GD32VF103 family (e.g., the GD32VF103C8T6) couples a 40nm RV32 core with DSP accelerators and single-cycle MAC units, delivering deterministic real-time motor, sensor, and industrial control loops with fast ADC-to-PWM pipelines.
Google Sycamore f(x) = quantum supremacy via random circuit sampling
A 53-qubit superconducting transmon processor built by Google AI Quantum that executed a random quantum circuit sampling task in 2019 to demonstrate quantum supremacy, providing empirical evidence of a computation outside the reach of classical supercomputers at the time.
Google TPU v1 f(x) = AI accelerator
Google's first TPU, announced in 2016, ties a large 256×256 systolic array built for dense matrix multiplies to local weight memory so inference workloads across Google data centers run deterministically with predictable throughput and latency from the ASIC systolic array hardware.
Google TPU v2 f(x) = AI training and inference acceleration
Google's second-generation TPU v2 is a datacenter-scale AI accelerator built around large systolic arrays, high-bandwidth memory, and bfloat16 matrix units, forming Cloud TPU v2 pods to deliver high-throughput training and inference for deep learning workloads.
Google TPU v3 f(x) = AI training and inference acceleration
Third-generation Google TPU pairs mixed-precision matrix multiply arrays with HBM2; Cloud TPU v3 pods scale to roughly 8× the performance of the previous generation, delivering massive training and inference acceleration.
Google TPU v4 f(x) = dense matrix multiply and transformer attention pipelines
Google TPU v4 is a pod-scale accelerator from Google that deterministically realizes dense linear algebra and transformer attention via custom systolic arrays. Each TPU v4 die pairs stacked high-bandwidth memory with pod interconnect routers and liquid cooling to sustain the throughput demanded by Cloud TPU v4 pods, which stitch thousands of chips across an optically switched interconnect fabric for multi-petaflop training.
Google TPU v5 f(x) = High-throughput tensor acceleration for deep learning training and inference
Google's fifth-generation TPU (v5) is a datacenter AI accelerator optimized for massive matrix multiplies; each chip exposes more matrix units than v4, and when assembled into TPU v5 pods it delivers higher TFLOPS along with pod-scale interconnects that sustain large language model training and inference.
Graphcore IPU f(x) = machine intelligence workloads
Graphcore's Intelligence Processing Unit (IPU) is a massively parallel AI accelerator composed of thousands of SRAM-backed tile cores linked by an exchange-style interconnect, enabling sparse tensor graph processing workloads in IPU-POD16 and IPU-M2000 systems.
Groq Tensor Streaming Processor f(x) = deterministic single-cycle tensor streaming execution for deep learning inference
Groq's Tensor Streaming Processor executes tensor operations on a statically scheduled, deterministic pipeline, so ML inference workloads observe predictable cycle-accurate latency in massively pipelined flows.
HP PA-RISC f(x) = server/workstation compute
32/64-bit RISC architecture with an in-order, dual-issue pipeline, tailored to HP 9000 servers and workstations.
Hanging chain (catenary) f(x) = hyperbolic cosine / thrust line
A chain suspended from two fixed points and left to hang under gravity settles into a curve that exactly realizes the hyperbolic cosine. Gaudí used physical catenaries (inverted) to design the arches of the Sagrada Família. Speed: instantaneous (static equilibrium). Capacity: single function evaluation (hyperbolic cosine).
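The curve the chain "computes" can be checked numerically; a minimal sketch, with an illustrative value for the catenary parameter a (the ratio of horizontal tension to weight per unit length):

```python
import math

# Sample y = a*cosh(x/a), the shape a hanging chain settles into.
# The parameter value is illustrative, not measured from a real chain.
a = 2.0

def catenary(x):
    """Chain height at horizontal position x, lowest point at x = 0."""
    return a * math.cosh(x / a)

ys = [catenary(x) for x in (-1.0, 0.0, 1.0)]
sag = ys[2] - ys[1]   # rise from the lowest point out to x = 1
```

The symmetry and the minimum at x = 0 fall out of cosh being an even function.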
IBM Condor f(x) = Unconventional / Quantum
1,121-qubit superconducting processor unveiled on the IBM Quantum roadmap in December 2023, extending its gate-based quantum systems past the thousand-qubit mark.
IBM Eagle f(x) = fault-tolerant superconducting qubit arrays
IBM Eagle superconducting quantum processor (127 transmon qubits) supports gate-based quantum circuit research and Qiskit experimentation toward fault-tolerant architectures.
IBM Osprey (433-qubit gate-based quantum computer) f(x) = unitary quantum computation / quantum algorithms (Shor factoring, Grover search, VQE)
IBM's Osprey is a 433-qubit superconducting heavy-hexagon processor, accessible through Qiskit and Qiskit Runtime workflows to run unitary circuits across hundreds of qubits via microwave control pulses.
IBM POWER1 f(x) = high-performance UNIX server workloads
IBM's 1990 POWER1 architecture: a general-purpose RISC design with superscalar execution and an expanded register file that powered RS/6000 systems.
IBM POWER10 f(x) = modern enterprise/AI compute
The IBM POWER10 (2021) delivers high throughput compute with deep SMT and a focus on AI acceleration, optimizing matrix math and inference for demanding enterprise workloads.
IBM POWER7 f(x) = energy-efficient enterprise compute
IBM POWER7 (2010) introduces 8-way simultaneous multithreading, high-throughput virtualization, and energy-efficient design targeted to IBM Power Systems.
IBM System/360 f(x) = general-purpose commercial mainframe computing
IBM System/360 unified IBM's commercial mainframe line with a single instruction set architecture, establishing upward compatibility and shaping enterprise computing for decades.
IBM TrueNorth f(x) = event-driven spiking neural network inference with massively parallel neurosynaptic cores
IBM TrueNorth is a 45 nm CMOS neurosynaptic chip with one million programmable spiking neurons and 256 million configurable synapses. It realizes massively parallel, event-driven computation with asynchronous low-power inter-core communication, enabling pattern recognition and sensor fusion workloads inspired by biology.
IBM analog AI chip f(x) = Energy-efficient inference via PCM memristive crossbar arrays
IBM Research's analog AI chip uses memristive crossbar arrays with phase-change memory (PCM) elements to perform analog in-memory multiply-accumulate operations for ultra-low-power neural inference.
Imagination PowerVR f(x) = mobile graphics/AR workloads
Tile-based PowerVR SGX/Rogue GPUs deployed in Apple iPhone/iPad series, featuring deferred rendering pipelines for efficient mobile graphics and AR experiences.
Intel 286 f(x) = protected-mode segmentation
The Intel 80286 advanced the x86 lineage with protected mode and richer 16-bit enhancements, bringing segmented protection and memory beyond the 8086/88 era to early departmental servers; example: Compaq Deskpro 286 machines and 286-based IBM PS/2 Model 50/60 systems running Novell NetWare relied on the new mode to host shared files and directories with segmentation-based protection.
Intel 386 f(x) = 32-bit protected-mode CPU operations with paging enabling modern OS virtualization
Intel's 80386 microprocessor introduced 32-bit protected mode with paging and hardware multitasking support, forming the foundation for modern OS virtualization and advanced multitasking environments.
Intel 486 f(x) = Integration for mainstream PC workloads (DOS/Windows 3.1 era desktop computing with spreadsheets, CAD, and client/server applications).
The Intel 80486 fused an on-chip floating-point unit, a five-stage pipelined datapath, and an 8 KB on-chip L1 cache into one CMOS microprocessor, delivering deterministic x86 throughput for desktop applications while reducing bus traffic compared to the 386.
Intel 8051 microcontroller f(x) = deterministic real-time control loops
8-bit microcontroller featuring Harvard architecture with separate code and data spaces, integrated timers, serial UART, and parallel I/O, widely deployed in embedded appliances for deterministic control loops.
Intel 8086 f(x) = PC-era general-purpose computing
Intel's first 16-bit CISC CPU, whose 8088 variant powered the original IBM PC and compatible machines, notable for its segmented memory model that bridged 16-bit processing with a 20-bit address space.
Intel Alder Lake (hybrid) f(x) = big.LITTLE-inspired desktop/mobile tuning
Hybrid x86 microarchitecture pairing Golden Cove performance cores with Gracemont efficiency cores, guided by Thread Director for workload steering and offering DDR5 plus PCIe 5.0 support, as seen in systems like the Intel NUC 12 Extreme.
Intel Core (Yonah) f(x) = laptop-performance
Intel Core (Yonah) dual-core mobile microarchitecture introduced for 2006 laptop platforms; features paired Yonah cores with a shared L2 cache and power-efficient enhancements. It remained a 32-bit design, with Intel 64 support arriving in the Core 2 (Merom) generation.
Intel Haswell Microarchitecture f(x) = Mainstream laptop and desktop performance from the 2013 Haswell era.
Haswell's 2013 microarchitecture pairs aggressive out-of-order cores, AVX2 vector extensions, fully integrated voltage regulators, fine-grained power gating, and Gen7.5 integrated graphics to drive responsive performance in mobile and desktop systems.
Intel Itanium (IA-64 EPIC) f(x) = Enterprise/server workloads that attempted to harness VLIW/EPIC scheduling for mission-critical throughput on HP Integrity class machines.
Intel's Itanium (IA-64) combined a 64-bit EPIC/VLIW instruction set with compiler-managed parallelism, predication, and speculation to target enterprise and mission-critical workloads, primarily deployed in HP Integrity servers.
Intel Loihi 1 f(x) = Neuromorphic computing research
Intel Loihi 1 is an asynchronous digital neuromorphic research chip with 128 programmable cores connected by a packet-switched mesh, simulating roughly 130k neurons and 130M synapses per chip for robotics, vision, and sensor-fusion workloads with tens-of-microseconds spike latency and picojoule-scale synaptic updates.
Intel Loihi 2 f(x) = asynchronous event-driven spiking neural networks with integrated learning
Intel's second-generation neuromorphic research chip implements asynchronous event-driven spiking neural networks with tightly coupled memory and compute plus sparse programmable synapses for adaptive, low-power AI. Example: the Sandia National Laboratories Hala Point system deploys 1,152 Loihi 2 processors to model 1.15 billion neurons and 128 billion synapses while running continuous-learning workloads at better than 15 TOPS/W.
Intel Nehalem microarchitecture f(x) = Improved server energy efficiency through integrated memory controller, QuickPath interconnect, and Turbo Boost optimizations.
General-purpose x86 CPU microarchitecture that brings the memory controller on-die, couples cores and sockets with QuickPath Interconnect, and uses Turbo Boost to lift throughput and power efficiency.
Intel P6 (Pentium Pro) f(x) = Delivers high server and workstation throughput by combining speculative multi-issue scheduling with the integrated L2 cache to keep pipelines fed.
The Intel P6 (Pentium Pro) CPU introduced a deeply pipelined out-of-order superscalar core with an L2 cache on a separate die in the same package to accelerate enterprise workloads; example: the 200 MHz Pentium Pro powering mid-1990s servers.
Intel Pentium (P5) f(x) = Mainstream PC acceleration through dual-issue x86 execution.
The Intel Pentium (P5) was Intel's first superscalar x86 CPU, adding dual-issue pipelines and dynamic branch prediction over the 486 to deliver significantly higher general-purpose performance.
Intel Pentium 4 (NetBurst) f(x) = Desktop and server scaling is realized through the NetBurst deep pipeline CMOS microarchitecture and Hyper-Threading to maximize throughput across wide x86 workloads.
The Intel Pentium 4 (NetBurst) pairs the NetBurst microarchitecture with a very long pipeline and the first mainstream Hyper-Threading implementation to chase high clock rates across desktop and server markets; example: the 3.06 GHz Northwood chip brought Hyper-Threading to mainstream desktops.
Intel Sandy Bridge f(x) = Mainstream PC and laptop compute workloads (productivity, media, client/server) on x86-64 platforms.
Intel's Sandy Bridge microarchitecture fused its CPU cores with on-die Intel HD Graphics 2000/3000, an improved branch predictor, and first-generation AVX support into a unified CMOS design to deliver deterministic x86-64 compute for mainstream PCs and laptops.
Intel Skylake f(x) = efficient mid-2010s computing
Intel Skylake is a 14nm FinFET microarchitecture featuring a micro-op cache, improved branch prediction, Gen9 graphics, and balanced desktop and laptop deployment.
Intel Xe Low Power GPU f(x) = integrated mobile graphics
Intel Xe Low Power (Xe-LP) GPU powers Tiger Lake SoCs and the DG1 discrete card, offering up to 96 execution units (768 vector ALUs) and dedicated media engines including AV1 decode and HEVC encode/decode acceleration for thin-and-light laptops.
Intel Xe-HPC f(x) = Dense HPC GPU acceleration for AI training, scientific simulation, and matrix algebra
Ponte Vecchio GPUs combine HBM2e stacks, Xe-core vector and matrix engines, and a multi-tile design mixing Intel and external process nodes, packing thousands of wide SIMT lanes per tile and coordinating them through a scalable fabric designed for large-scale scientific and AI workloads.
Intel Xe-HPG f(x) = Discrete Intel Arc Alchemist GPU acceleration for gaming, media, and AI workflows.
The Intel Xe-HPG family packages discrete Arc Alchemist GPUs to deliver hardware ray tracing, advanced media encode/decode, and AI acceleration for consumer gaming and creative workloads.
Intel Xe2 f(x) = next-generation Xe GPU acceleration
Upcoming Xe2 architecture is positioned as a next-gen tile-based GPU platform for discrete and data center workloads, extending Intel Xe with larger tiles and AI-ready matrix engines.
IonQ f(x) = gate-based quantum circuits orchestrated with trapped ions and photonic links
Trapped ion quantum computer hosted in vacuum chambers with photonic interconnects for modular entanglement, delivered over the cloud via IonQ Harmony and Aria.
Kelvin tide-predicting machine f(x) = sum of sinusoids / tidal height (Fourier synthesis)
Designed by Lord Kelvin (William Thomson) in 1872–73, this special-purpose mechanical analog computer performs real-time Fourier synthesis. Each tidal harmonic constituent (M2, S2, N2 …) is represented by a pulley on a crank whose radius sets the amplitude and whose rotation rate is geared to the constituent's period. A single wire threads over all pulleys in series; as a hand-crank advances time, the wire's endpoint traces the sum of all cosines, drawing the predicted tide curve on a paper roll. Kelvin's final version summed 24 harmonic components and could predict a full year of tides in about four hours. Variants were built for the US, India, and other nations and remained in operational use through World War II. Speed: a full year of tidal predictions in ~4 hours of cranking. Capacity: up to 40 harmonic components (later US machines); continuous output.
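The machine's Fourier synthesis is easy to sketch in software: each pulley contributes one cosine term and the wire sums them. The M2/S2/N2 periods below are the real constituent periods, but the amplitudes and phases are illustrative placeholders, not harmonic constants for any actual port.

```python
import math

constituents = [        # (amplitude in m, period in h, phase in rad)
    (1.0, 12.42, 0.0),  # M2, principal lunar semidiurnal
    (0.5, 12.00, 1.0),  # S2, principal solar semidiurnal
    (0.2, 12.66, 2.0),  # N2, larger lunar elliptic
]

def tide_height(t_hours):
    """The sum the wire endpoint traces as the crank advances time."""
    return sum(A * math.cos(2 * math.pi * t_hours / T + phi)
               for A, T, phi in constituents)

curve = [tide_height(t) for t in range(25)]   # one day, hourly samples
```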
LEGO mechanical computer f(x) = arbitrary digital logic / sequential game state
A fully mechanical computer built from LEGO Technic with no electronics. Binary memory is stored as lever positions on a rotating drum (rod logic); a read/write head flips levers to write bits and senses them pneumatically on readback. A joystick translates direction inputs into pneumatic signals that pass through a mechanical filter preventing illegal moves, then drive a 16×16 push-rod display. Demonstrated running the game Snake entirely in hardware. Speed: ~1 Hz game-tick (limited by pneumatic signal propagation through tubing). Capacity: 16×16 display state + snake tail buffer (tens of bits of working memory).
Lightmatter Passage f(x) = photonic inference
Lightmatter Passage optical AI accelerator performs light-based matrix multiplies, driving an optical dataflow across a waveguide matrix engine for photonic inference.
Liquid marble computer f(x) = boolean logic / reversible gates (AND, XOR, OR, NOT, Toffoli, Fredkin)
Liquid marbles are millimetre-scale droplets coated with hydrophobic powder that makes them roll freely without wetting surfaces. Computation is collision-based: two marbles directed at an intersection merge if their relative speed exceeds ~0.29 m/s (AND = 1, carry output) and rebound below that threshold (AND = 0, separate outputs). The three output trajectories encode AND and XOR simultaneously, forming a half-adder in a single interaction gate. By controlling routing channels and gate geometry, all classical gates (AND, OR, NOT, NAND, NOR, XOR) and the reversible Toffoli and Fredkin gates can be constructed. The Fredkin gate conserves marble count — no information is destroyed — making this a physical substrate for reversible and potentially thermodynamically efficient computing. Speed: ~0.1–1 s per gate (marble travel time at cm/s speeds). Capacity: gate-level; multi-cycle datapath demonstrated in simulation.
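A simplified model of the interaction gate, using the ~0.29 m/s merge threshold from the description above; the routing logic is a sketch, not a fluid-dynamics simulation:

```python
# Inputs a, b are marbles (1 = present, 0 = absent) arriving at the
# intersection; rel_speed is their relative approach speed.
V_MERGE = 0.29   # m/s, approximate coalescence threshold

def half_adder(a, b, rel_speed=0.5):
    """Returns (sum, carry) from one marble-collision gate."""
    if a and b and rel_speed > V_MERGE:
        return (0, 1)     # fast head-on marbles merge: carry channel fires
    return (a ^ b, 0)     # lone or rebounding marbles encode XOR on sum path

truth = {(a, b): half_adder(a, b) for a in (0, 1) for b in (0, 1)}
```

The single gate producing both AND (carry) and XOR (sum) outputs is what makes one collision a complete half-adder.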
Luminous Computing f(x) = photonic logic for AI
Luminous Computing centers on photonic logic for AI, building coherent-light neural accelerators orchestrated via optical dataflow.
MEMS accelerometer f(x) = Newton's second law (a = F/m) — continuous analog acceleration measurement
A microfabricated proof mass (typically silicon, ~1 μg) suspended by folded-beam springs. Under acceleration, the mass displaces by x = ma/k (Hooke's law + Newton's second law in equilibrium). Displacement is read by capacitive sensing: the mass carries interdigitated comb fingers whose capacitance changes by ΔC ∝ x ∝ a. The device is a physical analog computer that continuously divides force by spring constant — realizing a = F/m at the hardware level without arithmetic. MEMS gyroscopes extend this to Coriolis-effect angular-rate sensing, and IMUs combine three-axis accelerometers and gyroscopes to integrate trajectory in 3D. Found in every smartphone, airbag controller, and inertial navigation unit. Speed: continuous real-time output (bandwidth typically 1 Hz – 10 kHz). Capacity: single scalar (or 3-axis) acceleration; sub-μg resolution in precision variants.
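The division the sensor performs in hardware reduces to two lines of arithmetic; all parameter values below are illustrative, not those of a real device:

```python
# Equilibrium k*x = m*a gives x = m*a/k, read out as a capacitance
# change dC proportional to x.
m = 1e-9          # proof mass in kg (~1 microgram)
k = 1.0           # folded-beam spring constant, N/m
dC_per_m = 1e-3   # comb-finger sensitivity, F per metre (assumed)

def displacement(a):
    """Proof-mass deflection under acceleration a (m/s^2)."""
    return m * a / k

def delta_C(a):
    """Capacitance change the readout electronics digitize."""
    return dC_per_m * displacement(a)

x_1g = displacement(9.81)   # deflection under 1 g
```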
MIPS R2000 f(x) = Early SGI and DECstation workstations
Introduced in 1985 with a five-stage RISC pipeline, the MIPS R2000 leaned on single-cycle integer operations, a concise load/store ISA, and predictable control flow to keep each stage decoding, executing, and retiring in lockstep, which made it attractive to early SGI and DEC workstation vendors.
MIPS R3000 f(x) = higher-performance workstation compute
The MIPS R3000 builds on the R2000 with 32-bit addressing, higher clock speeds, improved cache control, and expanded coprocessor support, making it the go-to processor for workstations like the SGI Indigo and, in derivative form, the R3000A-based CPU of the Sony PlayStation.
MIT Tagged-Token Dataflow Architecture f(x) = Id dataflow semantics with tag-based dynamic scheduling
MIT Tagged-Token Dataflow Architecture pairs dynamic tag-matching scheduling with tagged token contexts that encode activation frames, letting distributed execution units match tokens, dispatch operands, and fire instructions for Id programs.
MONIAC (Phillips hydraulic computer) f(x) = Keynesian macroeconomic equilibrium (ODE system)
Built by Bill Phillips (1949). Water flows through tanks and pipes representing economic sectors — income, consumption, taxation, investment. Flow rates encode economic quantities. The system settles into equilibrium representing GDP balance. 14 machines were built. Speed: minutes to hours (hydraulic equilibration). Capacity: ~10-20 economic variables (limited by physical plumbing).
Manchester Dataflow Machine f(x) = fine-grained token-driven computation
The Manchester Dataflow Machine, developed in the late 1970s and operational by the early 1980s, emphasized token-based dataflow execution, with tokens flowing through FIFO queues and a matching store and firing operations out of order as soon as operands arrived, exposing fine-grained dataflow computation across processing units.
Marble computer f(x) = binary arithmetic / boolean logic
Gravity-fed marble runs with rocker/seesaw gates implement binary arithmetic and logic operations. One marble = 1 bit. The rocker flips state on each pass, implementing half-adders and logic gates. The Digi-Comp II (1965) is the canonical plastic educational design, while K'NEX construction sets allow modular prototyping of custom layouts. Speed: ~1-10 seconds per operation (marble transit time). Capacity: 3-8 bit operations (modular, expandable).
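The rocker's toggle-and-carry behaviour is exactly a ripple counter; a minimal sketch of three cascaded rockers counting dropped marbles:

```python
# Each rocker flips on every marble and passes a marble onward only on
# every second visit -- the mechanical ripple-carry counter.
class Rocker:
    def __init__(self):
        self.state = 0
    def drop(self):
        self.state ^= 1          # flip on each marble
        return self.state == 0   # emit a carry marble when flipping back to 0

rockers = [Rocker() for _ in range(3)]   # 3-bit counter

def drop_marble():
    carry = True
    for r in rockers:
        if not carry:
            break
        carry = r.drop()

for _ in range(5):                # drop five marbles
    drop_marble()
value = sum(r.state << i for i, r in enumerate(rockers))
```

After five marbles the rocker states read 101 in binary, i.e. the count 5.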
Mechanical fire-control computer f(x) = ballistic trajectory / gun bearing and elevation (multivariate real-time ODE)
Electromechanical analog computers installed on WWII-era warships (e.g. the US Navy Mark 1) continuously computed the correct bearing and elevation for naval guns from up to 25 live inputs: target range, target bearing, own-ship speed and course, wind speed, shell muzzle velocity, and more. Seven classes of mechanism — shafts, gears, cams, differentials, component solvers, integrators, and multipliers — were combined to solve the fire-control problem in real time. Speed: continuous real-time (output updated as fast as inputs change). Capacity: ~25 input variables → 2 output variables (bearing, elevation).
Mechanical gyroscope f(x) = time-integral of angular velocity (orientation tracking)
A spinning rotor mounted in gimbals conserves angular momentum. Any external torque causes precession perpendicular to both the spin axis and the applied torque — rather than tilting directly. By reading gimbal angles, the device outputs the accumulated rotation of the platform relative to inertial space. It is a physical integrator: angular velocity in → angle out, with no arithmetic required. Inertial navigation systems chain three orthogonal gyroscopes with three accelerometers; double-integrating the accelerometer outputs (in the gyroscope-maintained inertial frame) gives position. Mechanical gyros guided Apollo missions and ICBM warheads; they have largely been replaced by MEMS and ring-laser gyroscopes but remain the conceptual anchor of inertial navigation. Speed: continuous real-time (spin-up time seconds to minutes). Capacity: 3-axis orientation; drift accumulates over time (arcseconds per hour in precision instruments).
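A one-axis digital stand-in for what the gimballed rotor does physically — integrate angular velocity into orientation, with no arithmetic unit in the real device; the input signal here is illustrative:

```python
import math

dt = 0.001
# One second of a sinusoidal angular-velocity input, rad/s.
omega = [math.sin(2 * math.pi * i * dt) for i in range(1000)]

angle = 0.0
for w in omega:
    angle += w * dt   # theta(t) = integral of omega dt

# One full sine period of omega nets out to ~zero rotation.
```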
Memristive Hopfield network optimizer f(x) = optimization via chaotic annealing / transient dynamics
Memristive circuits implementing Hopfield network topology where the intrinsic nonlinearity of memristors creates transient chaotic annealing processes. The chaotic dynamics enable escape from local minima for solving optimization problems like Max-Cut and continuous function optimization.
Memristor crossbar f(x) = analog matrix-vector multiplication
Crossbar arrays of memristors (memory resistors) perform matrix-vector operations in analog. Voltages applied to rows, currents collected from columns. Resistance values encode matrix elements. Enables in-memory computing for neural network inference. Speed: nanoseconds (electrical propagation). Capacity: large matrix operations (scales with crossbar size).
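The physics reduces to Ohm's law per device and Kirchhoff's current law per column, i.e. a matrix-vector product in one analog step. A digital sketch with illustrative conductance and voltage values:

```python
import numpy as np

G = np.array([[1e-6, 2e-6],     # memristor conductances in siemens;
              [3e-6, 4e-6]])    # rows = input lines, columns = output lines
V = np.array([0.1, 0.2])        # voltages applied to the rows

I = G.T @ V                     # column currents: each output line sums I = G*V
```

The crossbar computes all column sums simultaneously, which is where the in-memory speedup comes from.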
Microchip PIC10 f(x) = simple embedded controls
Tiny 8-bit microcontroller (PIC10F series) for cost-sensitive control loops.
Microchip PIC12 f(x) = LED and sensor control
Microchip PIC12 (PIC12F series) 8-bit CMOS microcontrollers in tiny 8-pin packages, supporting direct LED and sensor control on compact embedded boards.
Microchip PIC16 f(x) = automation devices
The 8-bit PIC16 family combines a Harvard architecture with a pipelined instruction path, making it a staple of hobbyist and professional controllers used in automation devices and embedded teaching rigs.
Microchip PIC18 f(x) = 8-bit enhanced microcontroller architecture with advanced instructions for embedded control
The PIC18 family pairs an 8-bit enhanced core, pipelined execution, and extended instruction set with rich peripherals, making it deterministic, exact, and suited to robotics and instrumentation workflows requiring tight control loops.
Microchip PIC24 f(x) = 16-bit embedded control
Microchip's PIC24 line is a 16-bit microcontroller family used in motor control, pairing a microcontroller-friendly datapath with PWM, ADC, and sensor-feedback peripherals; its dsPIC siblings add DSP extensions for heavier signal processing.
Microchip PIC32 f(x) = 32-bit embedded control
Microchip PIC32 is a 32-bit MIPS-based microcontroller line that equips advanced embedded systems with DMA-driven peripherals, caches, and large flash to anchor automation, connectivity, and real-time instrumentation workloads.
Mill Computing f(x) = experimental compiler-rich compute
Mill's architecture merges a belt machine register model with VLIW-style wide-issue and deeply pipelined stages to pursue high efficiency and sustained throughput, relying on a compiler-centric workflow to schedule operations on the belt.
NVIDIA Ampere GPUs f(x) = High-throughput floating-point, tensor, and ray-tracing compute for AI/HPC workloads
NVIDIA's Ampere family pairs third-generation tensor cores in the GA100 data-center die (A100 accelerator) with second-generation ray-tracing cores in the GA102 consumer die (GeForce RTX 3090), spanning AI/HPC and graphics workloads on 7/8 nm FinFET processes.
NVIDIA Blackwell f(x) = next-generation AI training and inference acceleration
Upcoming NVIDIA Blackwell architecture pairs upgraded Tensor cores with a next-generation Transformer Engine and links to Grace CPUs in GB200 Grace Blackwell superchips to accelerate dense matrix and sparse transformer workloads.
NVIDIA Fermi (GF100) f(x) = Massively-parallel double-precision CUDA compute and graphics shading workloads
NVIDIA Fermi GF100 architecture introduced compute capability 2.x with ECC-protected GDDR5, hardware thread scheduling, and large register files; Tesla C2050 HPC accelerators and GeForce GTX 480 gaming cards both deploy this silicon to accelerate dense linear algebra, physics simulations, and shader pipelines.
NVIDIA Hopper GPU f(x) = Transformer and dense matrix acceleration
The Hopper family (H100) is NVIDIA's GPU architecture for large-scale transformer training, pairing a new Transformer Engine with CUDA/SIMT cores and tensor cores on a 4nm/5nm FinFET node; HGX H100 cabinets tie multiple Hopper GPUs via NVLink to deliver deterministic throughput for massive AI workloads.
NVIDIA Kepler GPU microarchitecture f(x) = CUDA compute and high-throughput SIMD workloads
Kepler GK110/GK210 derivatives underpin Tesla K80 and GeForce GTX 780, delivering CUDA compute services with large SMX arrays tuned for both HPC and graphics tasks.
NVIDIA Maxwell f(x) = SIMT GPU acceleration for graphics and HPC workloads
NVIDIA Maxwell (GM204/GM200) architecture drives energy-efficient graphics and compute, powering GeForce GTX 980 and Tesla M40 with improved power efficiency and mixed-precision throughput.
NVIDIA Pascal f(x) = Pascal GPU microarchitecture (GP100/GP104)
NVIDIA Pascal GPUs deliver high-bandwidth memory and compute-dense blocks, with the GP100 powering Tesla P100 accelerators and the GP104 powering GeForce GTX 1080 cards across HPC and graphics workloads.
NVIDIA Tesla f(x) = dense floating-point and tensor algebra for HPC and AI training workloads
NVIDIA Tesla GPU compute cards deliver massively parallel floating-point and tensor acceleration for HPC, AI training, and inference, leveraging NVLink, HBM memory, and CUDA programming to pack thousands of CUDA cores and Tensor cores behind blower-style cooling.
NVIDIA Turing f(x) = Hybrid ray tracing and AI inference pipelines
NVIDIA Turing is a GPU microarchitecture that uses dedicated real-time ray tracing RT cores and Tensor cores for deep learning while powering GeForce RTX 2080, Tesla T4, and similar boards.
NVIDIA Volta f(x) = tensor-core accelerated HPC and AI compute
NVIDIA Volta GPU architecture built on the GV100 die with first-generation tensor cores, HBM2 memory, and NVLink, powering Tesla V100 accelerators and DGX-1 systems.
Neuromorphic chip (Intel Loihi / IBM TrueNorth) f(x) = spiking neural network computation
Silicon chips that mimic neural computation using spiking neurons and synaptic connections. Intel Loihi and IBM TrueNorth implement event-driven, asynchronous processing with on-chip learning capabilities. Speed: microseconds (spike propagation). Capacity: millions of neurons (parallel event-driven processing).
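The unit these chips tile by the million is the leaky integrate-and-fire neuron; a minimal sketch with illustrative time constants and input current:

```python
# Leaky integrate-and-fire: membrane potential v leaks toward rest,
# integrates input current, and emits a spike event at threshold.
v, v_thresh, v_reset = 0.0, 1.0, 0.0
tau, dt, i_in = 20.0, 1.0, 0.08   # ms, ms, arbitrary current units

spikes = []
for t in range(200):              # 200 ms of simulated time
    v += dt * (-v / tau + i_in)   # leak plus input drive
    if v >= v_thresh:
        spikes.append(t)          # emit an event, then reset
        v = v_reset
```

With constant drive the neuron fires at a regular rate; real chips route these spike events asynchronously between cores.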
Normal Computing stochastic processing units f(x) = Unconventional analog thermodynamic inference
Normal Computing's stochastic processing units leverage probabilistic analog circuits with thermodynamic noise shaping and memristive elements to accelerate AI inference workloads while embracing physical stochasticity.
Op-amp analog computer f(x) = ODE integration via Kirchhoff's laws
Operational amplifiers configured as integrators, adders, and multipliers solve differential equations in real-time. Voltages represent variables, circuit topology encodes the equation structure. Classical electronic analog computation. Speed: real-time (microseconds to seconds). Capacity: systems of ODEs (~10-100 variables typical).
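The classic analog-computer patch for x'' = -x uses two integrators and a sign inversion; this digital sketch stands in semi-implicit Euler steps for the op-amp RC integrators:

```python
dt = 1e-4
x, v = 1.0, 0.0              # initial conditions x(0) = 1, x'(0) = 0
for _ in range(62832):       # integrate for ~2*pi seconds, one period
    v += -x * dt             # integrator 1 (with inverter): v = integral of -x
    x += v * dt              # integrator 2: x = integral of v
# After one full period the solution returns near its starting point.
```

On the physical machine the same loop runs continuously at the circuit's natural time scale, with voltages standing in for x and v.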
Optical correlator (4f / VanderLugt filter) f(x) = cross-correlation / matched filtering (pattern detection in O(1) optical time)
A 4f lens system consists of two lenses separated by twice their focal length with a holographic or spatial-light-modulator (SLM) filter at the shared Fourier plane. The first lens computes the Fourier transform of the input image; the filter multiplies by the complex conjugate of the reference pattern's Fourier transform; the second lens inverse-Fourier-transforms the product, yielding the cross-correlation at the output plane. This implements a matched filter — the canonical operation for detecting a known pattern in a cluttered scene — in a single optical pass at the speed of light, regardless of image size. The system realizes the convolution theorem physically: FT(f⋆g) = F*·G. Used in optical character recognition, fingerprint identification, and radar pulse compression. Speed: picoseconds to nanoseconds (optical propagation through ~cm path). Capacity: full 2D cross-correlation of megapixel images in a single pass; filter change requires SLM reprogramming.
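The same convolution-theorem math the lenses perform optically can be run digitally with FFTs; a sketch on synthetic random data with a known 16×16 patch as the target (the zero-mean template is a standard matched-filter refinement, not part of the optical description above):

```python
import numpy as np

rng = np.random.default_rng(0)
scene = rng.random((64, 64))
ref = scene[20:36, 30:46]              # pattern known to sit at (20, 30)

# Place the zero-mean reference on the scene's grid (the filter plane).
filt = np.zeros_like(scene)
filt[:16, :16] = ref - ref.mean()

# corr = IFFT( FFT(scene) * conj(FFT(ref)) ), the matched-filter output.
corr = np.fft.ifft2(np.fft.fft2(scene) * np.conj(np.fft.fft2(filt))).real
peak = np.unravel_index(np.argmax(corr), corr.shape)
```

The correlation peak lands at the pattern's offset in the scene — the optical system produces the same surface as a bright spot at the output plane.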
POWER2 f(x) = server-class compute
A general-purpose CPU in the PowerPC/POWER lineage, POWER2 couples superscalar, multi-issue pipelines with large caches in a high-frequency multi-chip module to deliver server-class compute for IBM RS/6000 deployments.
POWER3 f(x) = HPC/server compute
A 64-bit out-of-order PowerPC architecture with multiple integer and floating-point execution units and high floating-point throughput, deployed across IBM RS/6000 and pSeries servers.
POWER4 f(x) = enterprise server compute
The dual-core POWER4 (2001) underpins IBM pSeries 690 and eServer p655 clusters, delivering 64-bit PowerPC/POWER general-purpose throughput for enterprise compute and block-storage control workloads.
POWER5 f(x) = virtualized enterprise server compute
Multi-core POWER5 processor deployed in IBM eServer pSeries machines (e.g., p5 590) for virtualization workloads.
POWER6 f(x) = enterprise large-scale computing
2007 IBM POWER6 is a high-frequency PowerPC/POWER chip with SMT and hardware virtualization that powered IBM System p 570 and 595 enterprise servers.
POWER8 f(x) = enterprise and analytics server compute
2014 IBM POWER8 many-core processor with CAPI support, deployed in Power Systems E870 servers for enterprise workloads.
POWER9 f(x) = enterprise and HPC server compute
2017 IBM POWER9 processor featuring OpenCAPI, NVLink, and SMT, powering the Summit supercomputer's AC922 nodes and IBM Power Systems servers.
Photonic integrated circuit (silicon photonics) f(x) = matrix-vector multiplication / unitary linear transforms (for neural network inference)
Arrays of Mach-Zehnder interferometers (MZIs) and microring resonators on a silicon chip implement programmable unitary matrices in the optical domain. Light encodes values as amplitude or phase; passing through a mesh of beam-splitters (MZIs) with tunable phase shifters multiplies an optical input vector by the weight matrix in a single forward pass. Because photons travel at c and interference is intrinsically parallel, a single matrix-vector multiply completes in picoseconds with energy consumption set only by modulation and detection, not arithmetic logic. MIT demonstrated a photonic processor running all key deep-learning operations on-chip. Neuromorphic silicon photonics has achieved 50 GHz tiled matrix multiplication. Speed: picoseconds per matrix-vector multiply; 50 GHz demonstrated. Capacity: 64×64 to 512×512 unitary matrices on current chips; ~4-6 bit precision.
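Each MZI acts as a programmable 2×2 mixer on neighbouring waveguides, and cascading them composes an N×N unitary; a small numerical sketch with arbitrary illustrative phase-shifter angles (real rotations only, omitting complex phases for brevity):

```python
import numpy as np

def mzi(n_modes, i, theta):
    """2x2 mixing block acting on modes i and i+1 of an n_modes-port mesh."""
    U = np.eye(n_modes, dtype=complex)
    c, s = np.cos(theta), np.sin(theta)
    U[i, i], U[i, i + 1] = c, -s
    U[i + 1, i], U[i + 1, i + 1] = s, c
    return U

# Cascade a few MZIs (Reck/Clements-style mesh on 3 modes).
U = mzi(3, 1, 0.7) @ mzi(3, 0, 0.3) @ mzi(3, 1, 1.1)

x = np.array([1.0, 0.5, -0.2])     # input optical amplitudes
y = U @ x                          # one pass of light = one matrix-vector multiply
```

Because each block is unitary, the mesh conserves optical power: the output vector has the same norm as the input.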
Physarum polycephalum (slime mold) f(x) = Steiner tree / shortest transport network (approximate)
The plasmodial slime mold extends filaments toward nutrient sources and progressively reinforces paths that carry more flow, pruning inefficient routes. Toshiyuki Nakagaki showed it reproduces the Tokyo rail network topology. Speed: hours to days (biological growth/optimization). Capacity: network optimization problems with ~10-100 nodes.
Planimeter f(x) = area enclosed by an arbitrary plane curve (∮ via Green's theorem)
A two-bar linkage with a tracing point at one end and a measuring wheel mounted on the tracer arm. When the operator traces the boundary of an arbitrary shape, the wheel rolls only in the direction perpendicular to the tracer arm — the component encoding the integrand of Green's theorem (∮ x dy). The total wheel rotation equals the enclosed area regardless of path geometry. The polar planimeter (Amsler, 1854) requires no straight guide rail and works anywhere on a flat surface. Precision versions routinely achieve 0.1% accuracy. Historically used in cartography, engineering drawing, and medical imaging to measure irregular areas from printed plans. Speed: seconds to minutes per area measurement (tracing speed). Capacity: single scalar output (area); arbitrary curve complexity.
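A discrete version of what the wheel accumulates — the Green's theorem line integral ∮ x dy around the boundary; tracing a unit circle should return π:

```python
import math

N = 10000
pts = [(math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N))
       for k in range(N)]

area = 0.0
for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
    area += 0.5 * (x0 + x1) * (y1 - y0)   # trapezoidal piece of the integral
```

Any closed curve works the same way, which is exactly why the planimeter's reading is independent of the shape being traced.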
Pneumatic logic (Coanda-effect fluidics) f(x) = boolean logic (AND, OR, NOT, NOR) via wall-attachment bistability
A jet of air entering a Y-shaped channel naturally attaches to one wall (the Coandă effect) and locks into that state by low-pressure recirculation. A small control jet on the opposite side provides enough momentum to switch the main jet to the other wall — bistable flip-flop behaviour with no moving parts. AND, OR, NOT, and NOR gates are realized by channel geometry; outputs fan out by splitting the attached jet. Developed in the early 1960s at the Harry Diamond Laboratories (Bowles, Gottron) and widely used in industrial control until PLCs displaced them. Inherently radiation-hardened (no electronics) and tolerant of dust and oil. MTBFs of 25,000–50,000 hours reported. Speed: milliseconds per gate switching (air transit time). Capacity: arbitrary boolean circuits; industrial systems ran thousands of gates.
Qualcomm Hexagon f(x) = mobile signal processing
Qualcomm Hexagon is a VLIW DSP inside Snapdragon SoCs that accelerates audio, vision, and machine learning workloads.
Qualcomm Hexagon NPU f(x) = on-device AI inference
Qualcomm Hexagon NPU is the tensor accelerator embedded in Snapdragon platforms, combining Hexagon DSP cores and tensor accelerator fabric to deliver power-efficient on-device inference.
Quantum and quantum-inspired annealers f(x) = Ising model energy minimization / QUBO optimization
Quantum and quantum-inspired systems for solving combinatorial optimization problems through annealing processes. Includes true quantum annealers (D-Wave) using superconducting qubits and quantum-inspired CMOS implementations (Fujitsu, Toshiba, Hitachi) that simulate annealing dynamics. Speed: microseconds to seconds. Capacity: hundreds to thousands of variables.
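A classical Metropolis annealer makes the objective concrete. This sketch minimizes a toy 3-spin ferromagnetic Ising energy; the linear schedule and problem size are illustrative, not any vendor's hardware behaviour:

```python
import math, random

# Toy simulated annealing on an Ising energy
# E(s) = -Σ_{i<j} J_ij s_i s_j - Σ_i h_i s_i — the same objective a
# quantum or quantum-inspired annealer minimizes in hardware.
def anneal_ising(J, h, steps=20000, T0=2.0, seed=0):
    rng = random.Random(seed)
    n = len(h)
    s = [rng.choice([-1, 1]) for _ in range(n)]
    def energy(cfg):
        e = -sum(h[i] * cfg[i] for i in range(n))
        e -= sum(J[i][j] * cfg[i] * cfg[j] for i in range(n) for j in range(i + 1, n))
        return e
    e = energy(s)
    for t in range(steps):
        T = T0 * (1 - t / steps) + 1e-3           # linear cooling schedule
        i = rng.randrange(n)
        s[i] *= -1                                # propose a single spin flip
        e_new = energy(s)
        if e_new <= e or rng.random() < math.exp((e - e_new) / T):
            e = e_new                             # accept (Metropolis rule)
        else:
            s[i] *= -1                            # reject, flip back
    return s, e

# Three ferromagnetically coupled spins: ground-state energy is -3
# (all spins aligned, either all up or all down).
J = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
best, e = anneal_ising(J, [0, 0, 0])
```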
Quantum gate computer (superconducting qubits) f(x) = unitary transformations / quantum algorithms
Superconducting qubits manipulated by microwave pulses to perform unitary operations. Quantum gates like Hadamard, CNOT, and phase gates enable quantum algorithms such as Shor's factoring and Grover's search. Speed: nanoseconds to microseconds (gate operations). Capacity: exponential in qubit count (theoretical universal quantum computation).
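The gate model can be illustrated with a two-qubit statevector by hand; the amplitude ordering |00⟩, |01⟩, |10⟩, |11⟩ is a convention chosen for this sketch. A Hadamard on qubit 0 followed by a CNOT yields the Bell state (|00⟩ + |11⟩)/√2:

```python
import math

# Hadamard on qubit 0: mixes the |0x> and |1x> halves of the state.
def apply_h_q0(state):
    s = 1 / math.sqrt(2)
    a, b, c, d = state   # amplitudes of |00>, |01>, |10>, |11>
    return [s * (a + c), s * (b + d), s * (a - c), s * (b - d)]

# CNOT with control = qubit 0, target = qubit 1: flips qubit 1 when qubit 0 is 1.
def apply_cnot(state):
    a, b, c, d = state
    return [a, b, d, c]

bell = apply_cnot(apply_h_q0([1.0, 0.0, 0.0, 0.0]))   # start in |00>
```

Measuring this state gives 00 or 11 with probability 1/2 each and never 01 or 10 — the entanglement that gate-model algorithms exploit.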
Repressilator (synthetic gene oscillator) f(x) = limit-cycle oscillation / biological clock (via negative-feedback transcription loop)
Elowitz & Leibler (2000, Nature) constructed a synthetic oscillator in E. coli from three mutual repressor genes wired in a ring: LacI represses tetR; TetR represses cI; CI represses lacI. No gene product directly activates its own production, yet the circular negative feedback drives sustained oscillations in protein concentration with a period of ~150 minutes. The repressilator is a physical implementation of a relaxation oscillator: the mathematical operation is sustained limit-cycle dynamics, the same function realized by a CMOS ring oscillator or a Van der Pol circuit — but in living cells. Demonstrates that genetic regulatory networks can be designed as analog computing substrates, encoding functions (oscillation, bistability, logic) in DNA sequence. Speed: ~150 min period (transcription/translation kinetics). Capacity: single-frequency oscillator; frequency tunable by changing promoter strength or mRNA degradation rate.
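The three-gene ring reduces to six coupled ODEs. A forward-Euler sketch of the dimensionless Elowitz–Leibler model (parameter values are illustrative choices commonly used with this model, not the paper's measured rates) shows the sustained oscillation:

```python
# Forward-Euler integration of the dimensionless repressilator:
# dm_i/dt = -m_i + alpha/(1 + p_j^n) + alpha0,  dp_i/dt = -beta (p_i - m_i),
# where gene j represses gene i around the ring.
def repressilator(alpha=216.0, alpha0=0.216, beta=5.0, n=2.0, dt=0.01, steps=60000):
    m = [1.0, 2.0, 3.0]   # mRNA of lacI, tetR, cI (asymmetric start breaks symmetry)
    p = [0.0, 0.0, 0.0]   # corresponding repressor proteins
    trace = []
    for _ in range(steps):
        rep = [p[2], p[0], p[1]]   # cI represses lacI; LacI represses tetR; TetR represses cI
        dm = [-m[i] + alpha / (1 + rep[i] ** n) + alpha0 for i in range(3)]
        dp = [-beta * (p[i] - m[i]) for i in range(3)]
        m = [m[i] + dt * dm[i] for i in range(3)]
        p = [p[i] + dt * dp[i] for i in range(3)]
        trace.append(p[0])   # record LacI protein level
    return trace
```

The recorded LacI level keeps cycling long after the transient, the limit-cycle behaviour the entry describes.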
Reservoir computer f(x) = temporal pattern recognition / dynamical system computation
Fixed nonlinear dynamical system (reservoir) coupled to a trained linear readout layer. Input drives the reservoir dynamics, output layer learns to extract desired computations. Echo state networks and liquid state machines are implementations. Speed: depends on reservoir substrate (microseconds to seconds). Capacity: temporal sequence processing (scales with reservoir size).
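A minimal echo state network shows the train-the-readout-only idea. The reservoir size, spectral radius, and delayed-sine task below are illustrative assumptions (requires NumPy):

```python
import numpy as np

# Toy echo state network: a fixed random reservoir driven by u(t) = sin(0.1 t);
# only the linear readout is trained (least squares) to recover the input
# delayed by 5 steps.
rng = np.random.default_rng(0)
N, T, delay = 100, 1000, 5
W = rng.normal(size=(N, N))
W *= 0.9 / np.abs(np.linalg.eigvals(W)).max()   # spectral radius < 1: echo state property
W_in = rng.normal(size=N)

u = np.sin(0.1 * np.arange(T))
x = np.zeros(N)
states = []
for t in range(T):
    x = np.tanh(W @ x + W_in * u[t])            # reservoir update (fixed, never trained)
    states.append(x.copy())

X = np.array(states[200:])                      # discard the initial transient
y = u[200 - delay : T - delay]                  # target: input delayed by 5 steps
w_out, *_ = np.linalg.lstsq(X, y, rcond=None)   # train the readout only
rmse = float(np.sqrt(np.mean((X @ w_out - y) ** 2)))
```

The same recipe works when the "reservoir" is a bucket of water, a photonic cavity, or an analog circuit: only the cheap linear readout changes.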
Resistive sheet (Teledeltos) Laplace solver f(x) = solutions to Laplace's equation ∇²φ = 0 (electrostatics, heat, groundwater flow)
A sheet of Teledeltos — carbon-coated resistive paper with ~6 kΩ/square resistivity — conducts current that obeys the same Laplace equation as electrostatic potential, steady-state heat conduction, inviscid fluid flow, and Darcy groundwater seepage. Boundary conditions are imposed by painting silver-loaded conductive ink in the shape of conductors or flow boundaries; a voltage is applied across them. A probe voltmeter scanned over the sheet reads the potential field directly. Complex 2D geometries that would require days of PDE numerics can be mapped in hours. Widely used from the 1930s through the 1970s in capacitor design, transformer core analysis, dam seepage studies, and aircraft aerodynamics before finite-element codes displaced it. Speed: hours for full field map (manual probe scanning); boundary setup in minutes. Capacity: 2D scalar field on arbitrary domain geometry; ~1-2% accuracy.
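The paper sheet performs in analog what relaxation methods do digitally. A small Gauss–Seidel sketch with one "painted electrode" held at 1 V on the left edge and the other edges grounded (an illustrative boundary setup) solves the same Laplace problem:

```python
# Digital analogue of the resistive sheet: Gauss–Seidel relaxation of
# Laplace's equation ∇²φ = 0 on an n×n grid.
def solve_laplace(n=20, sweeps=5000):
    phi = [[0.0] * n for _ in range(n)]
    for row in phi:
        row[0] = 1.0                       # left boundary electrode at 1 V
    for _ in range(sweeps):                # relax interior points to the local average
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                phi[i][j] = 0.25 * (phi[i-1][j] + phi[i+1][j] +
                                    phi[i][j-1] + phi[i][j+1])
    return phi
```

Scanning the converged grid along a row plays the role of the probe voltmeter: the potential falls off smoothly from the 1 V electrode toward the grounded edges.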
Rubber-band Steiner tree f(x) = Euclidean Steiner minimum tree (approximate)
Elastic bands stretched between pins hammered into a board relax under tension to a state of minimum total length. Because each band pulls with a force proportional to its extension, the equilibrium configuration satisfies the equal-angles condition at every interior junction — the defining property of a Steiner tree. The result is the shortest network connecting all pins, approximating the solution to the NP-hard Euclidean Steiner tree problem. The mechanism is combinatorially distinct from the soap-film Steiner tree (Plateau's problem in 2-D) because the topology of junctions is fixed by the discrete wiring of the bands, not by a continuous surface. Speed: instantaneous (elastic equilibration). Capacity: Steiner tree for ~5-20 pins (limited by physical layout).
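For three pins and a single free junction, the bands' equilibrium is the Fermat point, where the three segments meet at equal 120° angles. Weiszfeld's iteration is a numerical stand-in for the elastic relaxation (a sketch, not a general Steiner-tree solver):

```python
import math

# Weiszfeld iteration: the free junction moves to the distance-weighted
# average of the pins, converging to the point of minimum total band length.
def fermat_point(pins, iters=200):
    x = sum(p[0] for p in pins) / len(pins)   # start at the centroid
    y = sum(p[1] for p in pins) / len(pins)
    for _ in range(iters):
        wx = wy = wsum = 0.0
        for px, py in pins:
            d = math.hypot(x - px, y - py) or 1e-12
            wx += px / d
            wy += py / d
            wsum += 1.0 / d
        x, y = wx / wsum, wy / wsum
    return x, y

# For an equilateral triangle the Fermat point coincides with the centroid.
pt = fermat_point([(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)])
```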
SPARC f(x) = UltraSPARC/enterprise compute
SPARC (Scalable Processor Architecture) is a VLSI RISC architecture from Sun Microsystems/Oracle featuring register windows that keep deep call stacks performant and powering Sun and Oracle workstations and enterprise servers.
SambaNova RDU (Reconfigurable Dataflow Unit) f(x) = AI training and inference dataflow graphs
Reconfigurable Dataflow Units implement granular dataflow graphs by combining configurable tiles with per-tile scheduling and streaming data paths. Each tile bundles compute arrays, SRAM buffers, and intra-tile routers, enabling DataScale to keep thousands of MACs busy while mapping compiled tensor graphs to the dataflow fabric.
Samsung HBM-PIM f(x) = in-memory AI acceleration
Samsung HBM2 memory with Processing-in-Memory logic for AI, deployed in research prototypes to offload vector-heavy kernels and shorten data movement for next-generation accelerators.
SiFive U54-MC core cluster f(x) = low-power Linux-capable RISC-V compute
The SiFive U54-MC cluster combines four RV64IMAFD general-purpose cores with a supervisory management core and coherent cache fabric, delivering the low-power Linux-capable RISC-V compute used on the HiFive Unleashed development board.
SiFive U74 f(x) = Linux-capable embedded RISC-V compute
Dual-issue, in-order RV64GC application core from SiFive's U7 series, used in Linux-capable SoCs such as the StarFive JH7110 that powers the VisionFive 2 board.
SiFive X280 f(x) = vectorized RISC-V AI inference
The SiFive X280 is a RISC-V core with RVV vector extensions aimed at AI acceleration, targeting machine-learning inference workloads in SiFive Intelligence-based designs.
Simulated annealing (thermal) f(x) = argmin of energy / cost landscape
A physical system coupled to a heat bath at slowly decreasing temperature explores its energy landscape. At high temperature it escapes local minima; as T→0 it settles into a global minimum — if cooling is slow enough. Speed: minutes to hours (depends on cooling schedule). Capacity: global optimization problems (scales exponentially with problem size).
Slide rule f(x) = logarithm, multiplication, division, roots
Logarithmic scales engraved on sliding rules allow multiplication by physical addition of lengths (log a + log b = log ab). Precision is bounded by engraving quality and human reading resolution — typically 3 significant figures. Speed: seconds (human reading time). Capacity: single arithmetic operation (3 significant figures).
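The principle fits in a few lines of code: a sketch (the helper function is mine) that multiplies by adding logarithmic lengths, then truncates to the 3 significant figures a human reader could resolve on the engraved scale:

```python
import math

# Slide-rule multiplication: slide one log scale along the other
# (log a + log b = log ab), then read off 3 significant figures.
def slide_rule_multiply(a, b):
    length = math.log10(a) + math.log10(b)    # physical addition of scale lengths
    product = 10 ** length
    digits = 3                                # typical engraving/reading resolution
    scale = 10 ** (digits - 1 - math.floor(math.log10(abs(product))))
    return round(product * scale) / scale
```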
Soap film f(x) = minimal surface (Plateau's problem)
A soap film spanning a closed wire boundary settles into the surface of minimum area — the solution to Plateau's problem. For two parallel rings it realizes a catenoid. Can approximate Steiner trees for planar point sets. Speed: seconds to minutes (surface tension equilibration). Capacity: continuous optimization over infinite-dimensional space.
Spaghetti sort f(x) = total ordering of positive reals (sorting) in O(n) physical time
Cut n spaghetti strands to lengths proportional to the n values to be sorted. Gather them loosely in a fist and lower them vertically onto a flat table so all strands stand upright. Lower a flat hand from above: the first strand it touches is the maximum. Remove it, record the value, repeat — each contact extracts the next-largest in O(1) time. Preparing the rods is O(n); the n extractions are O(n); the whole sort is O(n) in physical time, exploiting the parallel nature of gravity and contact. Introduced by A. K. Dewdney in Scientific American. Illustrates how physical parallelism can circumvent the Ω(n log n) comparison-sort lower bound by using a non-comparison primitive (contact with a plane). Speed: O(n) physical steps; each step is constant time. Capacity: n positive real values; precision limited by ability to cut and measure strand lengths.
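The procedure translates directly into a simulation. Here the O(1) "palm contact" becomes a max() scan — an illustrative stand-in, since software has no parallel gravity to exploit:

```python
# Simulation of Dewdney's spaghetti sort: each "strand" has a length
# proportional to its value; the descending palm touches the tallest first.
def spaghetti_sort(values):
    strands = list(values)        # cut strands to length: O(n) preparation
    out = []
    while strands:
        tallest = max(strands)    # palm contact: O(1) physically, a scan here
        strands.remove(tallest)   # remove the strand and record its value
        out.append(tallest)
    return out[::-1]              # extracted largest-first; reverse for ascending
```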
SpiNNaker f(x) = Massively parallel ARM968 neuromorphic fabric for real-time spiking networks
SpiNNaker machines at the University of Manchester network over a million ARM968 cores via packet-switched triple-torus routers to run spiking neural networks with local plasticity and sensor-motor I/O in real time, letting cortical-scale models stay synchronized as spikes hop across the mesh.
Sun SPARC v8 f(x) = Unix workstation and multi-threaded workloads
Sun SPARC v8 is a 32-bit RISC ISA with register windows and a clean encoding, deployed in SPARCstation-class Solaris systems to accelerate interactive multi-threaded UNIX development tasks.
Sun SPARC v9 f(x) = enterprise Solaris workload
Sun SPARC v9 extends the SPARC ISA to full 64-bit operation, with wider floating-point units and server-class scaling (large caches and coherent SMP) to keep pace with Solaris enterprise services.
TI MSP430 f(x) = ultra-low-power embedded sensing and control
Texas Instruments' MSP430 family is an ultra-low-power 16-bit microcontroller platform widely used in energy-harvested sensor nodes and low-power embedded monitoring tasks, combining deep sleep modes with fast wake-up and analog integration for deterministic control loops.
TI TMS320 C2000 f(x) = deterministic control loops
The TI TMS320 C2000 family of 32-bit fixed-point DSPs optimizes deterministic motor-control loops with on-chip ADCs, PWMs, comparators, and other peripherals for real-time sensing and actuation.
TI TMS320 C5000 f(x) = Audio and voice signal processing for low-power hearing aids and embedded audio products.
Low-power 16-bit fixed-point DSP for audio and voice processing widely used in digital hearing aids.
TI TMS320 C6000 f(x) = high-throughput wireless infrastructure signal processing
The TI TMS320 C6000 family are 32-bit VLIW (very long instruction word) DSPs engineered for high-throughput signal processing, often deployed in base stations and other wireless infrastructure hardware.
Tensilica Xtensa DSP f(x) = audio/machine learning acceleration
Configurable VLIW/dual-issue DSP core used across Cadence HiFi audio DSP families such as HiFi 3 and Tensilica LX processors, enabling extensible ISA custom instructions for audio decoding and machine-learning inference.
Tenstorrent Grayskull f(x) = AI inference acceleration
Tenstorrent Grayskull is a tile-based architecture of compute tiles with systolic arrays, each paired with local SRAM buffers and backed by on-board DRAM, delivering massive data-parallel tensor math for neural-network inference and training workloads.
Tenstorrent Wormhole f(x) = AI accelerator
Tenstorrent Wormhole is a multi-chip module designed for large language models, providing a high-bandwidth interconnect and integration with the Tenstorrent software stack.
Thermodynamic computer f(x) = sampling from Boltzmann distributions
Uses thermal noise in analog circuits to sample from Boltzmann distributions. Thermal fluctuations provide natural randomness that follows statistical mechanics principles. The Normal Computing SDE (Stochastic Differential Equation) approach leverages this thermal noise for computation. Speed: microseconds to milliseconds (thermal equilibration). Capacity: probabilistic sampling problems (scales with circuit complexity).
Thermodynamic computer (Normal Computing SPU) f(x) = probabilistic sampling / linear algebra via thermal equilibration
Analog physics-based computers using thermodynamic principles for computation. Normal Computing's Stochastic Processing Unit (SPU) uses RLC circuits as unit cells with all-to-all coupling via switched capacitances, natively simulating Langevin/Ornstein-Uhlenbeck dynamics for probabilistic reasoning, generative design, and scientific computing.
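The Langevin dynamics the entry names can be sketched in a few lines: an Euler–Maruyama Ornstein–Uhlenbeck iteration whose stationary distribution is the Boltzmann law exp(−x²/2T) for the quadratic potential U(x) = x²/2. This is an illustration of the principle, not Normal Computing's circuit model:

```python
import math, random

# Overdamped Langevin / Ornstein–Uhlenbeck dynamics dx = -x dt + sqrt(2T) dW,
# discretized with Euler–Maruyama. The chain's stationary variance equals T:
# pseudorandom noise here stands in for the SPU's physical thermal noise.
def ou_samples(T=1.0, dt=0.01, steps=200000, seed=0):
    rng = random.Random(seed)
    x, samples = 0.0, []
    for _ in range(steps):
        x += -x * dt + math.sqrt(2 * T * dt) * rng.gauss(0, 1)
        samples.append(x)
    return samples
```

Letting the dynamics run and histogramming the samples recovers the Gaussian Boltzmann distribution — the "computation" is just equilibration.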
Transmeta Crusoe f(x) = low-power mobile compute
Transmeta's Crusoe family paired a 128-bit VLIW core (widened to 256 bits in the later Efficeon) with Code Morphing Software that dynamically translated x86 binaries into low-power native instructions, caching hot traces and emulating the full x86 environment to run fanless ultra-portables. The code-morphing approach delivered x86 compatibility and long battery life for thin-and-light notebooks such as the NEC MobilePro and Sony VAIO U-series.
UPMEM PIM f(x) = parallel search and graph analytics near memory
UPMEM Processing-In-Memory DIMMs combine DRAM banks with embedded RISC DPUs, enabling data-center scale parallel search and graph analytics without moving data back to host CPUs.
Ventana Veyron f(x) = data-center-class RISC-V compute
Ventana's Veyron is a high-performance multi-core RISC-V design targeted at AI and HPC workloads, built for chiplet-based scalability and domain-specific acceleration in data-center compute nodes and server fabrics.
Water (fluidic) computer f(x) = binary addition / boolean logic (AND, XOR)
Water levels in vessels encode binary digits; a siphon and slow drain combine to implement AND and XOR in a single cup-and-tube unit. A filled cup is a 1, an empty cup a 0. When two cups feed one container the siphon trips (AND = carry), while the remainder leaks out the XOR drain. These half-adder cells chain into a multi-bit ripple adder. No moving parts beyond the water itself. Speed: seconds to minutes per bit (gravity-driven flow). Capacity: 4-bit addition demonstrated; theoretically scalable.
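The cup-and-siphon cell maps directly onto a half adder. This boolean sketch (function names are mine) chains the cells into the multi-bit ripple adder the entry describes:

```python
# Cup-and-siphon half adder: the siphon trips only when both input cups
# are full (carry = AND), while a single cup's worth leaks out the drain
# (sum = XOR).
def water_half_adder(a, b):
    carry = a & b    # siphon trips: needs two cups of water
    total = a ^ b    # remainder leaks out the XOR drain
    return total, carry

# Two half-adder cells per bit position, carries rippling upward by gravity.
def water_ripple_adder(x, y, bits=4):
    result, carry = 0, 0
    for i in range(bits):
        a, b = (x >> i) & 1, (y >> i) & 1
        s1, c1 = water_half_adder(a, b)
        s2, c2 = water_half_adder(s1, carry)
        result |= s2 << i
        carry = c1 | c2
    return result
```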
Watt centrifugal governor f(x) = proportional speed regulation (continuous set-point tracking via negative feedback)
Two steel balls are mounted on hinged arms linked to a rotating vertical shaft driven by the engine. As engine speed increases, centrifugal force swings the balls outward and upward; through a collar linkage this motion partially closes the steam throttle, reducing power and slowing the engine. As speed falls the balls drop, the throttle reopens, and the cycle repeats. The system finds equilibrium where centrifugal force exactly balances gravity — and that equilibrium corresponds to the desired set speed. James Watt adapted this in 1788 from a windmill governor; James Clerk Maxwell's 1868 paper 'On Governors' analysed it as the first mathematical treatment of feedback control. The device is a physical analog computer that continuously solves the equation: throttle = f(ω − ω_set). Speed: continuous real-time (mechanical response time ~0.1–1 s). Capacity: single-variable set-point control; extends to multi-variable with additional linkages.
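A toy discrete-time model shows the feedback law the governor solves continuously. The gain, engine lag, and throttle-to-speed mapping below are illustrative parameters, not Watt's linkage geometry:

```python
# Proportional speed regulation: the throttle closes in proportion to the
# speed error (the flyball linkage), and engine speed follows the throttle
# with a first-order lag. Equilibrium is omega = omega_set.
def governor(omega_set=100.0, k=0.5, dt=0.1, steps=500):
    omega, throttle = 0.0, 1.0
    for _ in range(steps):
        # flyball + collar linkage: close throttle when omega exceeds the set point
        throttle = max(0.0, min(1.0,
                       throttle - k * dt * (omega - omega_set) / omega_set))
        # engine: speed relaxes toward the throttle-determined level (150 * throttle)
        omega += dt * (150.0 * throttle - omega)
    return omega
```

Starting from rest, the simulated speed overshoots, rings, and settles at the set point — the damped behaviour Maxwell analysed in 'On Governors'.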