Apple Workshop on Machine Learning for Health: Pre-trained Model Representations and their Robustness against Noise for Speech Emotion Analysis
Authors: Vikram Mitra (Apple)
Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing
May 5, 2026 | Research areas: Methods and Algorithms; Speech and Natural Language Processing
Serving transformer language models with high throughput requires caching Key-Values (KVs) to avoid redundant computation during autoregressive generation. The memory footprint of KV caching is significant and heavily impacts serving costs. This work proposes to lessen these memory requirements. While recent work has largely addressed KV cache reduction via compression and eviction along the temporal axis, we argue that the depth dimension offers…
PORTool: Importance-Aware Policy Optimization with Rewarded Tree for Multi-Tool-Integrated Reasoning
May 4, 2026 | Research areas: Speech and Natural Language Processing; Tools, Platforms, Frameworks
Multi-tool-integrated reasoning enables LLM-empowered tool-use agents to solve complex tasks by interleaving natural-language reasoning with calls to external tools. However, training such agents using outcome-only rewards suffers from credit-assignment ambiguity, obscuring which intermediate steps (or tool-use decisions) lead to success or failure. In this paper, we propose PORTool, an importance-aware policy-optimization algorithm that…