Paper | July 2026

Flexible Routing via Uncertainty Decomposition

Authors: Charlotte Peale†**, Siddartha Devic‡**, Parikshit Gopalan, Udi Wieder, Aravind Gollakota

This paper was accepted at the Statistical Frameworks for Uncertainty in Agentic Systems Workshop at ICML 2026.

A key strategy for balancing performance and cost in modern machine learning systems is to dynamically route queries to either a low-cost model or a more expensive oracle (such as a large pretrained model or human expert), an approach known as model routing. In this work we present a new uncertainty-aware router that (1) avoids unnecessary oracle calls on inherently ambiguous queries, and (2) adapts dynamically to different loss functions and cost parameters through simple hyperparameter changes, without retraining. Our method, applicable to any classification setting where multiple independent annotations per input are available, is based on decomposing total uncertainty into irreducible and reducible components using higher-order predictors [Ahdritz et al., 2025]. This enables a unified approach to both routing and abstention: predict with the weak model when uncertainty is low, route to the oracle when reducible uncertainty is high, and abstain when irreducible uncertainty is high. Our router comes with strong theoretical guarantees bounding regret relative to optimal task-specific routers. We conduct experiments on both synthetic and real-world datasets that demonstrate the benefits of our approach in suitable regimes—in particular, whenever reducible and irreducible uncertainty are not too correlated.
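The three-way rule described in the abstract (predict, route, or abstain) can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a higher-order prediction is given as a finite list of candidate label distributions, and it uses the common entropy-based split (expected entropy as the irreducible part, the remaining mutual information as the reducible part). The function names, the threshold values, and the choice to check abstention before routing are all hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple


def entropy(p: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def decompose(
    mixture: List[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[float, float]:
    """Split total uncertainty into (irreducible, reducible) parts.

    `mixture` is an assumed stand-in for a higher-order prediction:
    a list of candidate label distributions, optionally weighted.
    """
    k = len(mixture)
    if weights is None:
        weights = [1.0 / k] * k
    n_labels = len(mixture[0])
    # Mean label distribution under the mixture.
    mean = [sum(w * p[j] for w, p in zip(weights, mixture)) for j in range(n_labels)]
    total = entropy(mean)
    # Expected entropy of the candidates: uncertainty that remains even
    # if we knew which candidate distribution was the true one.
    irreducible = sum(w * entropy(p) for w, p in zip(weights, mixture))
    # The remainder is the disagreement between candidates.
    reducible = total - irreducible
    return irreducible, reducible


def route(
    mixture: List[Sequence[float]],
    reducible_threshold: float = 0.1,    # hypothetical value
    irreducible_threshold: float = 0.5,  # hypothetical value
) -> str:
    """Return 'predict', 'oracle', or 'abstain' for one query."""
    irreducible, reducible = decompose(mixture)
    # Assumption: abstention is checked first, so inherently ambiguous
    # queries do not trigger an oracle call.
    if irreducible > irreducible_threshold:
        return "abstain"
    if reducible > reducible_threshold:
        return "oracle"
    return "predict"


confident = [[0.95, 0.05], [0.97, 0.03]]   # candidates agree, low noise
ambiguous = [[0.5, 0.5], [0.5, 0.5]]       # candidates agree the label is a coin flip
uncertain = [[0.99, 0.01], [0.01, 0.99]]   # candidates disagree sharply

decisions = [route(m) for m in (confident, ambiguous, uncertain)]
```

In this sketch the two thresholds play the role of the hyperparameters that trade off oracle cost against loss: changing them alters the routing decisions without touching the underlying predictor.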

† Stanford University
‡ University of Southern California
** Work done while at Apple
