Embedding Atlas: Low-Friction, Interactive Embedding Visualization

Authors: Donghao Ren, Fred Hohman, Halden Lin, Dominik Moritz

Embedding projections are popular for visualizing large datasets and models. However, people often encounter “friction” when using embedding visualization tools: (1) barriers to adoption, e.g., tedious data wrangling and loading, scalability limits, and no integration of results into existing workflows, and (2) limited analyses unless the tool is integrated with external tools that show coordinated views of metadata. In this paper, we present Embedding Atlas, a scalable, interactive visualization tool designed to make interacting with large embeddings as easy as possible. Embedding Atlas uses modern web technologies and advanced algorithms, including density-based clustering and automated labeling, to provide a fast and rich data analysis experience at scale. We evaluate Embedding Atlas with a competitive analysis against other popular embedding tools, showing that its feature set specifically helps reduce friction, and we report a benchmark of its real-time rendering performance with millions of points. Embedding Atlas is available as open source to support future work in embedding-based analysis.

Figure 1: The Embedding Atlas user interface visualizing the embedding of 196,630 wine reviews. A user has filtered the embedding to show only wines from the US, France, and Italy with “points” in a specific range, and has selected one outlier from the embedding to read its review and inspect its metadata. Try the demo at: https://apple.github.io/embedding-atlas.
