AgentBuilder: Exploring Scaffolds for Prototyping User Experiences of Interface Agents
Authors: Jenny T. Liang†, Titus Barik, Jeffrey Nichols, Eldon Schoop, Ruijia Cheng
Interface agents powered by generative AI models (referred to as “agents”) can automate actions based on user commands. An important aspect of developing agents is their user experience (i.e., agent experience). There is a growing need for scaffolds that let a broader set of individuals beyond AI engineers prototype agent experiences, since these individuals can contribute valuable perspectives to agent experience design. In this work, we explore the affordances agent prototyping systems should offer by conducting a requirements elicitation study with 12 participants who have varying levels of experience with agents. We identify key activities in agent experience prototyping and the desired capabilities of agent prototyping systems. We instantiate those capabilities in AgentBuilder, a design probe for agent prototyping. We then conduct an in situ agent prototyping study with 14 participants using AgentBuilder to validate the design requirements and to elicit insights on how developers prototype agents and what they need in this process.
AMUSE: Audio-Visual Benchmark and Alignment Framework for Agentic Multi-Speaker Understanding
February 24, 2026; research areas: Computer Vision, Methods and Algorithms; conference: CVPR
Recent multimodal large language models (MLLMs) such as GPT-4o and Qwen3-Omni show strong perception but struggle in multi-speaker, dialogue-centric settings that demand agentic reasoning: tracking who speaks, maintaining roles, and grounding events across time. These scenarios are central to multimodal audio-video understanding, where models must jointly reason over audio and visual streams in applications such as conversational video assistants…
Towards Learning Multi-Agent Negotiations via Self-Play
January 28, 2019; research area: Computer Vision; Workshop at ICCV
Making sophisticated, robust, and safe sequential decisions is at the heart of intelligent systems. This is especially critical for planning in complex multi-agent environments, where agents need to anticipate other agents’ intentions and possible future actions. Traditional methods formulate the problem as a Markov Decision Process, but the solutions often rely on various assumptions and become brittle when presented with corner cases. In…