Open Access (Green): available as soon as the postprint is submitted to ZB.
FM2: Fusing multiple foundation models for pathology image analysis via disentangled consensus-divergence representation.
Inf. Fusion 127:103840 (2026)
Foundation models (FMs) have emerged and achieved strong performance on numerous downstream tasks. However, different FMs, such as CLIP, DINOv2, and SAM, are trained on diverse datasets with varying methodologies, exhibiting model-specific characteristics and encoding scenario-specific knowledge. Efforts to unify the strengths of these different FMs through knowledge distillation show promise but remain challenging due to inconsistencies in feature distributions, which can lead to suboptimal convergence and reduced generalizability. In this paper, we propose a novel aggregation framework, FM2 (Fusing Multiple Foundation Models), which leverages disentangled representation learning to address these challenges. Specifically, our approach disentangles consensus and divergence features from multiple expert FMs and then aligns them into a unified, robust representation. Extensive experiments on datasets with over 1,000,000 pathology images across various tasks, including zero-shot and few-shot classification, cross-modal retrieval, and survival analysis, demonstrate that our method consistently outperforms state-of-the-art models, delivering superior accuracy and reliability across diverse clinical scenarios. Additionally, the visualizations offer insights into the model's ability to harmonize knowledge across different FMs, highlighting its potential for enhancing diagnostic precision in medical imaging. The advancements demonstrated in our work underscore the promise of effectively aligning FMs, showing potential for broadening their application not only in pathology but also in other medical imaging domains.
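The abstract's core idea, splitting features from multiple expert FMs into a shared consensus component and model-specific divergence residuals, can be illustrated with a deliberately simplistic sketch. The paper learns this split with trained encoders and alignment objectives; the mean-based decomposition below (function name `disentangle` and all details are illustrative, not the paper's method) only shows what "consensus plus divergence reconstructs each model's feature" means.

```python
import numpy as np

def disentangle(features):
    """Split per-model feature vectors into a shared consensus part and
    model-specific divergence residuals.

    A naive mean-based proxy for illustration only; FM2 learns this
    decomposition with dedicated encoders rather than a simple average.
    """
    F = np.stack(features)          # (n_models, feature_dim)
    consensus = F.mean(axis=0)      # component shared across models
    divergence = F - consensus      # per-model residuals
    return consensus, divergence

# Toy features standing in for three expert FMs (e.g. CLIP, DINOv2, SAM).
rng = np.random.default_rng(0)
feats = [rng.standard_normal(8) for _ in range(3)]
consensus, divergence = disentangle(feats)

# By construction the residuals cancel out, and consensus + residual
# reconstructs each model's original feature exactly.
assert np.allclose(divergence.sum(axis=0), 0.0)
assert np.allclose(consensus + divergence[0], feats[0])
```

In the actual framework the consensus branch would feed the unified representation used by downstream tasks, while divergence branches preserve the knowledge unique to each expert model.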
Publication type
Article: Journal article
Document type
Scientific Article
Keywords
Disentangled Representation ; Foundation Model ; Knowledge Distillation ; Pathology Image ; Teacher-student Network
ISSN (print) / ISBN
1566-2535
e-ISSN
1872-6305
Journal
Information Fusion
Source details
Volume: 127,
Article Number: 103840
Publisher
Elsevier
Publishing Place
Radarweg 29, 1043 NX Amsterdam, Netherlands
Reviewing status
Peer reviewed
Grants
China Postdoctoral Science Foundation
Natural Science Foundation of Shanghai
Shanghai Key Laboratory of Child Brain and Development
National Natural Science Foundation of China