Ion Stoica's SGLang Project Becomes RadixArk with $400M Valuation from Accel
Ion Stoica has built a career turning academic research into infrastructure companies. Spark became Databricks (now worth $43 billion). Ray became Anyscale. Now SGLang, the inference optimization project from his UC Berkeley lab, is spinning out as a new company called RadixArk with a $400 million valuation and backing from Accel.
The move marks another milestone in the rapid commercialization of AI infrastructure—and signals that investors believe the real money in AI may not be in building models, but in running them efficiently.
From Research Project to $400M Company
SGLang emerged from UC Berkeley's RISELab, the successor to AMPLab where Stoica's team created Apache Spark. The project gained traction in 2024 as a fast, memory-efficient framework for serving large language models. Its core innovation: a domain-specific language for programming LLM applications that dramatically reduces inference costs through clever batching and caching techniques.
The spin-out follows a familiar Stoica playbook. Start with open-source research that solves a real problem. Build community adoption. Then commercialize with enterprise features while keeping the core free. Databricks and Anyscale both followed this path. RadixArk appears headed the same direction.
Accel's investment suggests the firm sees RadixArk competing for a piece of an inference optimization market that many analysts expect will eventually dwarf model training spend. Every company deploying AI applications needs inference—and most are discovering it's their largest ongoing cost.
The Inference Market Heats Up
RadixArk enters a crowded and rapidly evolving space. NVIDIA's TensorRT-LLM dominates among teams already locked into CUDA. vLLM, another Berkeley-originated project, has become the default for many open-source deployments. Hugging Face's Text Generation Inference serves as a simpler on-ramp.
But the market is far from settled. Inference optimization is still more art than science, with performance varying wildly across model architectures, hardware configurations, and use cases. There's room for multiple winners—and massive value for whoever can reliably cut inference costs in half, or by an order of magnitude.
SGLang's technical approach differs from competitors in important ways. Where vLLM focuses on memory efficiency through PagedAttention, SGLang takes a higher-level view—optimizing not just individual inferences but entire application flows. Its RadixAttention technique enables efficient prefix caching across requests, which becomes especially valuable for applications that share common system prompts or retrieved context.
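The prefix-caching idea behind RadixAttention can be sketched in a few lines. The code below is an illustrative toy, not SGLang's actual implementation: a real serving engine caches GPU KV tensors per token block and manages eviction, while this sketch just tracks which token prefixes have been seen in a radix-style trie. All names (`RadixNode`, `PrefixCache`) are hypothetical.

```python
# Toy sketch of radix-tree prefix caching, the idea behind RadixAttention.
# Illustrative only: real systems cache per-token KV attention state on GPU.

class RadixNode:
    def __init__(self):
        self.children = {}    # token -> RadixNode
        self.kv_cache = None  # placeholder for cached attention state

class PrefixCache:
    """Reuse computation for the longest shared token prefix."""

    def __init__(self):
        self.root = RadixNode()

    def match_prefix(self, tokens):
        """Return how many leading tokens already have cached state."""
        node, matched = self.root, 0
        for tok in tokens:
            if tok not in node.children:
                break
            node = node.children[tok]
            matched += 1
        return matched

    def insert(self, tokens):
        """Record that KV state for these tokens is now cached."""
        node = self.root
        for tok in tokens:
            node = node.children.setdefault(tok, RadixNode())
            node.kv_cache = object()  # stand-in for real KV tensors

cache = PrefixCache()
system_prompt = [1, 2, 3, 4]            # shared system prompt tokens
cache.insert(system_prompt + [10, 11])  # first request fills the cache

# A second request sharing the system prompt skips recomputing it.
request2 = system_prompt + [20, 21, 22]
reused = cache.match_prefix(request2)
print(f"reused {reused} of {len(request2)} tokens")  # reused 4 of 7
```

When many requests share a long system prompt or retrieved context, the reusable fraction of each request can be large, which is where the claimed cost savings come from.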
What This Means for SGLang Users
The open-source community will be watching closely to see how RadixArk handles the transition. Stoica's track record here is strong—both Spark and Ray remained vibrant open-source projects even as Databricks and Anyscale built commercial businesses on top.
The likely model: SGLang stays open-source and continues development, while RadixArk offers enterprise features like managed deployment, security compliance, and premium support. This "open core" approach has proven sustainable for infrastructure companies, though it requires careful management of community relations.
For teams currently using SGLang, the spin-out is probably good news. Dedicated funding means full-time engineering resources, faster development, and better documentation. The risk is that the open-source version becomes a loss leader—good enough for experimentation but missing critical features for production.
The Stoica Pattern
Ion Stoica's journey from academic researcher to serial infrastructure entrepreneur offers a template that other university labs are trying to replicate. The pattern works because academic environments can take risks on fundamental research that startups can't afford, while commercial spin-outs can solve the "last mile" problems of enterprise deployment.
RISELab's lineage now includes some of the most important AI infrastructure projects of the past decade. Apache Spark revolutionized big data processing. Ray became the backbone of distributed AI training. Modin brought Pandas to scale. Each project addressed a genuine bottleneck in the data and AI stack.
SGLang/RadixArk fits the same mold: identify where practitioners are struggling, build elegant open-source solutions, then commercialize the hard parts. It's a playbook that works because it aligns academic incentives (publish papers, have impact) with commercial ones (solve problems, make money).
The Bigger Picture
The $400M valuation—high for a pre-revenue spin-out—reflects investor conviction that inference costs will define the AI industry's economics. Training a frontier model costs hundreds of millions. But inference is where the ongoing bills pile up: every ChatGPT query, every Copilot suggestion, every AI-powered search result costs compute.
Right now, inference spending is dominated by hyperscalers running their own models. But as more companies deploy AI in production, inference optimization becomes critical infrastructure. A 2x efficiency improvement directly translates to 50% cost reduction—or 2x the capability at the same price.
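The arithmetic behind that claim is simple enough to spell out. The numbers below are invented for illustration, not real pricing.

```python
# Toy cost model: all figures are assumed, for illustration only.
monthly_queries = 10_000_000
cost_per_query = 0.002   # dollars, assumed baseline
speedup = 2.0            # a 2x efficiency improvement

baseline = monthly_queries * cost_per_query
optimized = baseline / speedup
savings = 1 - optimized / baseline

print(f"baseline:  ${baseline:,.0f}/month")   # $20,000/month
print(f"optimized: ${optimized:,.0f}/month")  # $10,000/month
print(f"savings:   {savings:.0%}")            # 50%
```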
This is why the inference market is exploding. It's not just about running models faster. It's about making AI economically viable for applications beyond the few companies that can afford massive compute budgets.
RadixArk's $400M valuation is a bet that Ion Stoica's team can capture meaningful share of that market. Given his track record, it's a bet many will be watching closely.