Video | July 20, 2023

NLU Workshop Talk: Model-Aided Human Annotation at Scale

Authors: Hadas Kotek

Related readings and updates.

LEAD: Breaking the No-Recovery Bottleneck in Long-Horizon Reasoning

July 24, 2026 | Research areas: Methods and Algorithms, Speech and Natural Language Processing

Long-horizon execution in Large Language Models (LLMs) remains unstable even when high-level strategies are provided. Evaluating on controlled algorithmic puzzles, we demonstrate that while decomposition is essential for stability, extreme decomposition creates a “no-recovery bottleneck”. We show that this bottleneck becomes critical due to highly non-uniform error distribution, where consistent errors on a few “hard” steps become irreversible…

Environment-free Synthetic Data Generation for API-Calling Agents

July 21, 2026 | Research area: Methods and Algorithms

Training API-calling large language model (LLM) agents demands massive amounts of high-quality trajectories. However, collecting such data at scale typically requires fully implemented environments with executable APIs and realistic, pre-populated backend databases, creating a major bottleneck for scalability. To overcome this, we propose an environment-free synthetic data generation approach that leverages LLMs as on-the-fly digital world…
