Groq

Groq, Inc.

CHIP DESIGNERS · 🇺🇸 United States · Private

Key Products

GroqChip LPU, GroqCloud AI inference service


WHY IT MATTERS

Groq, Inc. is a private AI infrastructure company headquartered in Mountain View, California. It was founded in 2016 by Jonathan Ross, who was the lead engineer on Google's first TPU.

Groq's founding insight was that LLM inference, unlike training, is memory-bandwidth-bound rather than compute-bound: the bottleneck is moving model weights from memory to the processing cores fast enough to keep up with sequential, token-by-token generation. Groq designed the LPU (Language Processing Unit) as a software-programmable, deterministic dataflow processor that attacks this bottleneck with a different architectural approach.

The LPU architecture uses a systolic array design with a Temporal Instruction Set Architecture (TISA): a statically scheduled execution model in which the compiler decides at compile time exactly when each instruction executes. There is no dynamic scheduling, no cache hierarchy, and no out-of-order execution hardware. This removes the main sources of non-deterministic latency (cache misses, dynamic memory allocation, branch mispredictions) that cause high variance in GPU inference timing. The result is an inference processor that delivers deterministic, single-digit-millisecond per-token latency for large models, regardless of batch size or concurrent user load. A single LPU chip holds roughly 230 MB of on-chip SRAM and reaches up to about 80 TB/s of on-die memory bandwidth, using SRAM rather than HBM.

GroqCloud, Groq's public inference API service, became one of the most-cited benchmarks in the AI inference speed debate when it demonstrated LLaMA 2 70B inference at over 300 tokens per second per user in early 2024, approximately 4–10× faster than comparable GPU-based inference services at the time. The GroqCloud speed advantage comes from both the LPU's memory bandwidth architecture and Groq's compiler-optimized model serving pipeline.

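
The memory-bandwidth argument above can be checked with back-of-the-envelope arithmetic. The sketch below is illustrative only: it assumes every weight is read once per generated token, that weights are stored at 2 bytes per parameter, and it uses a hypothetical 3 TB/s memory system for comparison. The function names and the bandwidth figure are assumptions made for this example, not published Groq or GPU specifications.

```python
def max_tokens_per_second(param_count, bytes_per_param, bandwidth_bytes_per_s):
    """Upper bound on single-stream decode speed when each token
    requires streaming every model weight from memory once."""
    model_bytes = param_count * bytes_per_param
    return bandwidth_bytes_per_s / model_bytes


def required_bandwidth(param_count, bytes_per_param, tokens_per_second):
    """Aggregate memory bandwidth needed to sustain a target decode speed."""
    return param_count * bytes_per_param * tokens_per_second


PARAMS = 70e9               # 70B-parameter model, as in the LLaMA 2 70B example
BYTES_PER_PARAM = 2         # assumption: 16-bit weights
ASSUMED_BANDWIDTH = 3.0e12  # assumption: hypothetical 3 TB/s memory system

bound = max_tokens_per_second(PARAMS, BYTES_PER_PARAM, ASSUMED_BANDWIDTH)
needed = required_bandwidth(PARAMS, BYTES_PER_PARAM, 300)

print(f"Bandwidth-bound ceiling at 3 TB/s: {bound:.1f} tokens/s")
print(f"Bandwidth needed for 300 tokens/s: {needed / 1e12:.1f} TB/s")
```

Under these assumptions, a single 3 TB/s memory system caps out near 21 tokens per second per stream, while 300 tokens per second needs on the order of 42 TB/s of aggregate bandwidth. That gap is why the weights would have to be spread across many chips with very fast local memory.
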
Groq raised $640 million in a Series D funding round in August 2024, with participation from Samsung Ventures, Cisco, and others. The round brought total funding to approximately $1.1 billion and valued the company at $2.8 billion.

Groq's chips are fabricated by TSMC. The current GroqChip (LPU1) is on TSMC's 14nm process; subsequent generations are planned on more advanced nodes. The Samsung Ventures investment signals a potential strategic relationship with Samsung as a future fabrication alternative, though TSMC remains Groq's primary fab partner. Because the LPU's SRAM-centric design uses distributed on-chip SRAM arrays rather than HBM stacks, Groq does not depend on SK Hynix or Samsung for HBM. This differentiates it from GPU-based inference infrastructure and removes one layer of supply chain complexity.

Groq's target market is real-time AI inference where latency matters more than cost per token: voice AI, customer service agents, real-time translation, code completion, and enterprise applications requiring sub-second response times. The company is also pursuing defense and intelligence community contracts where deterministic latency is a mission-critical requirement. In that use case, the LPU's predictable timing offers a meaningful advantage over GPU-based systems with their inherent scheduling variance.

As LLM inference workloads grow faster than training workloads in the overall AI compute mix, Groq's specialized inference-only architecture positions it as a complement to, rather than a replacement for, GPU-based training infrastructure.
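
The deterministic-latency property discussed above can be illustrated with a toy model. The sketch below is not Groq's compiler or instruction set: the operation names, cycle counts, miss rate, and miss penalty are all hypothetical values chosen for the example. It contrasts a program whose start cycles are fixed ahead of time with the same program on a machine where each load may randomly miss a cache.

```python
import random

# Hypothetical per-operation cycle costs (not real LPU timings).
OP_CYCLES = {"load": 4, "matmul": 16, "add": 2, "store": 4}
PROGRAM = ["load", "matmul", "add", "load", "matmul", "add", "store"]


def static_schedule(program):
    """Assign every operation a fixed start cycle ahead of time,
    the way a statically scheduling compiler would."""
    schedule, cycle = [], 0
    for op in program:
        schedule.append((cycle, op))
        cycle += OP_CYCLES[op]
    return schedule, cycle


def dynamic_latency(program, rng, miss_rate=0.3, miss_penalty=40):
    """Run the same program on a toy machine where each load may
    miss a cache and stall for extra cycles."""
    cycle = 0
    for op in program:
        cycle += OP_CYCLES[op]
        if op == "load" and rng.random() < miss_rate:
            cycle += miss_penalty
    return cycle


schedule, static_total = static_schedule(PROGRAM)
static_runs = {static_schedule(PROGRAM)[1] for _ in range(1000)}

rng = random.Random(0)  # seeded so the experiment is repeatable
dynamic_runs = {dynamic_latency(PROGRAM, rng) for _ in range(1000)}

print("static schedule:", schedule)
print("distinct static latencies:", sorted(static_runs))
print("distinct dynamic latencies:", sorted(dynamic_runs))
```

In the static case the total latency is a property of the program itself, so every run takes the same number of cycles. In the dynamic case the latency depends on runtime events, so repeated runs spread across several values. Real hardware is far more complex, but the contrast is the same one the briefing draws between compile-time scheduling and GPU-style runtime scheduling.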

Connected companies


TSMC, Synopsys, Cadence

Critical path — raw silicon to deployment

The tightest single-source dependencies, in order.

1. FOUNDRIES: TSMC. CoWoS advanced packaging, N3/N2 logic. Chokepoint.

2. EDA TOOLS: Synopsys. Design Compiler (synthesis), PrimeTime (timing), VCS (simulation), IC Compiler 2. Chokepoint.

3. EDA TOOLS: Cadence. Virtuoso (analog), Genus/Innovus (digital synthesis), Tempus (timing signoff). Chokepoint.

4. CHIP DESIGNERS: Groq. GroqChip LPU, GroqCloud AI inference service.

About this company

Q: Who supplies Groq?

Groq relies on three upstream suppliers across the AI chip supply chain.

TSMC: the world's largest contract chip manufacturer.

Synopsys: the largest EDA software provider. Its IC design and verification tools (Design Compiler, PrimeTime, VCS) are essential for advanced chip tapeouts and are US export-controlled to Chinese entities.

Cadence: a leading EDA software provider. Its IC design tools (Virtuoso, Genus, Innovus) are widely used by advanced chip designers and have been US export-controlled to Chinese entities since 2022.

Q: What does Groq make?

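Groq is an AI inference chip company that designs LPUs (Language Processing Units). Its GroqCloud service posted some of the fastest publicly benchmarked LLM inference speeds, exceeding 300 tokens per second per user on LLaMA 2 70B in early 2024.

Key products: GroqChip LPU, GroqCloud AI inference service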