Affine-based Deformable Attention and Selective Fusion for Semi-dense Matching
Authors: Hongkai Chen, Zixin Luo, Ray Tian, Xuyang Bai, Aron Wang, Lei Zhou, Mingmin Zhen, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan
This paper was accepted at the Image Matching: Local Features & Beyond workshop at CVPR 2024.
Identifying robust and accurate correspondences across images is a fundamental problem in computer vision that enables various downstream tasks. Recent semi-dense matching methods emphasize the effectiveness of fusing relevant cross-view information through Transformers. In this paper, we propose several improvements upon this paradigm. Firstly, we introduce affine-based local attention to model cross-view deformations. Secondly, we present selective fusion to merge local and global messages from cross attention. Apart from network structure, we also identify the importance of enforcing spatial smoothness in loss design, which has been overlooked by previous works. Based on these augmentations, our network demonstrates strong matching capacity under different settings. The full version of our network achieves state-of-the-art performance among semi-dense matching methods at a similar cost to LoFTR, while the slim version reaches the LoFTR baseline’s performance with only 15% of the computation cost and 18% of the parameters.
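To make the two architectural ideas in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that "affine-based local attention" can be pictured as sampling a local window whose regular grid offsets are deformed by a 2x2 matrix, and that "selective fusion" can be pictured as a scalar sigmoid gate blending a local and a global message. The function names `affine_window` and `gated_fusion`, the scalar gate, and the window parameterization are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
import math


def affine_window(center, affine, radius=1):
    """Return sampling locations for a (2*radius+1)^2 window.

    Hypothetical helper: the regular grid offsets around `center` are
    deformed by the 2x2 matrix `affine` (an assumption about how a
    local attention window could follow cross-view deformation).
    """
    cx, cy = center
    (a, b), (c, d) = affine
    points = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Apply the linear map to the grid offset, then shift to the center.
            points.append((cx + a * dx + b * dy, cy + c * dx + d * dy))
    return points


def gated_fusion(local_msg, global_msg, gate_logit):
    """Blend two message vectors with a scalar sigmoid gate.

    Hypothetical stand-in for selective fusion: a gate near 1 favors the
    local message, a gate near 0 favors the global message.
    """
    g = 1.0 / (1.0 + math.exp(-gate_logit))
    return [g * l + (1.0 - g) * m for l, m in zip(local_msg, global_msg)]


if __name__ == "__main__":
    identity = ((1.0, 0.0), (0.0, 1.0))
    zoom = ((2.0, 0.0), (0.0, 2.0))
    print(affine_window((10.0, 5.0), identity)[:3])
    print(affine_window((10.0, 5.0), zoom)[:3])
    print(gated_fusion([1.0, 2.0], [3.0, 4.0], gate_logit=0.0))
```

With the identity matrix the window is an ordinary 3x3 neighborhood; with the scaling matrix the same nine samples spread over a region twice as wide, which is the kind of scale change a fixed square window cannot follow. In a real network the matrix and the gate would be predicted from features rather than passed in by hand.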
Learning Deformable Body Interactions With Adaptive Spatial Tokenization
November 4, 2025; research areas: Data Science and Annotation, Methods and Algorithms; Workshop at NeurIPS
This paper was accepted at the AI for Science Workshop at NeurIPS 2025.
Simulating interactions between deformable bodies is vital in fields like material science, mechanical design, and robotics. While learning-based methods with Graph Neural Networks (GNNs) are effective at solving complex physical systems, they encounter scalability issues when modeling deformable body interactions. To model interactions between objects, pairwise global edges…
ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer
September 6, 2022; research area: Computer Vision; conference: ECCV
Generating robust and reliable correspondences across images is a fundamental task for a diversity of applications. To capture context at both global and local granularity, we propose ASpanFormer, a Transformer-based detector-free matcher that is built on hierarchical attention structure, adopting a novel attention operation which is capable of adjusting attention span in a self-adaptive manner. To achieve this goal, first, flow maps are…