Emergent Symbolic Structure in Health Foundation Models: Extraction, Alignment, and Cross-Modal Transfer

Authors: Gajendra Katuwal, Advait Koparkar, Salar Abbaspourazad, Anshuman Mishra, Sarvesh Kirthivasan

This paper was accepted at the Mechanistic Interpretability Workshop at ICML 2026.

Health foundation models (FMs) learn useful representations from wearable sensors, but interpreting what they encode and transferring that knowledge across modalities after training remain difficult. We present a post-training framework that decomposes frozen embeddings into interpretable directions, referred to as symbols, and uses these symbols to align embedding spaces without retraining. We evaluate the framework on three FMs for photoplethysmography (PPG) and accelerometer data, independently pretrained on ∼20M minutes of unlabeled data from ∼172K participants and analyzed on a held-out cohort of 30K subjects. We find that the extracted symbols associate selectively with health conditions and physiological attributes, and that these associations are partially shared across modalities and architectures. Cross-modal transfer via symbols retains more than 95% of in-domain performance, is nearly symmetric across transfer directions, and saturates with limited paired data. Together, these findings indicate that alignment recovers a shared low-dimensional subspace rich in physiological information. Overall, the results suggest that health FM embeddings contain an interpretable symbolic organization that is shared across modalities and supports cross-domain transfer without joint training.
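The abstract does not specify how symbols are extracted or how the embedding spaces are aligned, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes, purely for illustration, that symbols are the top principal directions of each frozen embedding space and that alignment is a least-squares linear map fitted on a small paired subset. The data are synthetic, and all names (`extract_symbols`, `activations`, `fit_alignment`) and sizes are hypothetical.

```python
import numpy as np

def extract_symbols(emb, k):
    """Assumption: use the top-k principal directions as stand-in 'symbols'."""
    mean = emb.mean(axis=0)
    _, _, vt = np.linalg.svd(emb - mean, full_matrices=False)
    return mean, vt[:k]  # rows of vt are orthonormal directions

def activations(emb, mean, directions):
    """Project centered embeddings onto the symbol directions."""
    return (emb - mean) @ directions.T

def fit_alignment(src_act, dst_act):
    """Assumption: a plain least-squares linear map between symbol activations."""
    W, *_ = np.linalg.lstsq(src_act, dst_act, rcond=None)
    return W

# Synthetic stand-ins for two frozen embedding spaces driven by a shared latent.
rng = np.random.default_rng(0)
n, k, d_a, d_b = 2000, 4, 32, 48
z = rng.normal(size=(n, k))  # shared low-dimensional factors
emb_a = z @ rng.normal(size=(k, d_a)) + 0.05 * rng.normal(size=(n, d_a))
emb_b = z @ rng.normal(size=(k, d_b)) + 0.05 * rng.normal(size=(n, d_b))

mean_a, dirs_a = extract_symbols(emb_a, k)
mean_b, dirs_b = extract_symbols(emb_b, k)
act_a = activations(emb_a, mean_a, dirs_a)
act_b = activations(emb_b, mean_b, dirs_b)

# Fit the map on a small paired subset, then score it on the remaining samples.
n_paired = 100
W = fit_alignment(act_a[:n_paired], act_b[:n_paired])
pred = act_a[n_paired:] @ W
target = act_b[n_paired:]
ss_res = ((target - pred) ** 2).sum()
ss_tot = ((target - target.mean(axis=0)) ** 2).sum()
r2 = 1.0 - ss_res / ss_tot
print(f"held-out R^2 of aligned symbol activations: {r2:.3f}")
```

In this toy setting both embedding spaces are linear views of the same latent factors, so a map fitted on few pairs generalizes well. Real embeddings need not behave this way, and the paper's actual decomposition and alignment procedures may differ substantially.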
