Apple Workshop on Machine Learning for Health: Web3 and Decentralized AI (DecAI)
Authors: Ramesh Raskar (MIT)
VSAS-Bench: Real-Time Evaluation of Visual Streaming Assistant Models
May 22, 2026 | research area Computer Vision, research area Data Science and Annotation | conference CVPR
Streaming vision-language models (VLMs) continuously generate responses given an instruction prompt and an online stream of input frames. This is a core mechanism for real-time visual assistants. Existing VLM frameworks predominantly assess models in offline settings. In contrast, the performance of a streaming VLM depends on additional metrics beyond pure video understanding, including proactiveness, which reflects the timeliness of the model’s…
EpiCache: Episodic KV Cache Management for Long-Term Conversation on Resource-Constrained Environments
May 19, 2026 | research area Methods and Algorithms, research area Speech and Natural Language Processing | conference ICML
Modern large language models (LLMs) extend context lengths to millions of tokens, enabling coherent, personalized responses grounded in long conversational history. However, the Key-Value (KV) cache grows linearly with the extended dialogue history, causing the model’s memory footprint to quickly exceed device limits. While recent KV cache compression methods attempt to reduce memory usage, most apply cache eviction after processing the entire…
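To make the linear growth of the KV cache concrete, here is a minimal back-of-the-envelope sketch. It is not taken from the EpiCache paper: the function name `kv_cache_bytes` and the model shape used below (32 layers, 8 key-value heads, head dimension 128, 2-byte values) are hypothetical, chosen only for illustration. It assumes a standard transformer decoder that stores one key vector and one value vector per token, per layer, per key-value head, with no compression or eviction.

```python
def kv_cache_bytes(num_tokens, num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Estimate the size of an uncompressed KV cache in bytes.

    Assumes one key and one value vector of length head_dim is stored
    per token, per layer, per KV head (hence the leading factor of 2).
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * num_tokens


# Hypothetical model shape, for illustration only.
shape = dict(num_layers=32, num_kv_heads=8, head_dim=128, bytes_per_value=2)

per_token = kv_cache_bytes(1, **shape)

# The footprint scales in direct proportion to the conversation length.
for tokens in (1_000, 100_000, 1_000_000):
    gib = kv_cache_bytes(tokens, **shape) / 2**30
    print(f"{tokens:>9} tokens: {gib:.2f} GiB")
```

Under these assumed dimensions each token costs 131,072 bytes (128 KiB), so doubling the dialogue history doubles the cache, which is why long conversations can outgrow the memory available on a device.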