Green Open Access possible once the postprint has been submitted to the ZB (central library).
FM2: Fusing multiple foundation models for pathology image analysis via disentangled consensus-divergence representation.
Inf. Fusion 127:103840 (2026)
Foundation models (FMs) have emerged and achieve strong performance on numerous downstream tasks. However, different FMs, such as CLIP, DINOv2, and SAM, are trained on diverse datasets with varying methodologies, exhibiting model-specific characteristics and encoding scenario-specific knowledge. Efforts to unify the strengths of these different FMs through knowledge distillation show promise but remain challenging due to inconsistencies in feature distributions, which can lead to suboptimal convergence and reduced generalizability. In this paper, we propose a novel aggregation framework, FM2 (Fusing Multiple Foundation Models), which leverages disentangled representation learning to address these challenges. Specifically, our approach disentangles consensus and divergence features from multiple expert FMs and then aligns them into a unified and robust representation. Extensive experiments on datasets with over 1,000,000 pathology images across various tasks, including zero-shot and few-shot classification, cross-modal retrieval, and survival analysis, demonstrate that our method consistently outperforms state-of-the-art models, delivering superior accuracy and reliability across diverse clinical scenarios. Additionally, the visualizations offer insights into the model's ability to harmonize knowledge across different FMs, highlighting its potential for enhancing diagnostic precision in medical imaging. The significant advancements demonstrated in our work underscore the promise of effectively aligning FMs, showing potential for broadening their application not only in pathology but also in other medical imaging domains.
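The core idea of the abstract, splitting each expert FM's embedding into a shared consensus part and a model-specific divergence part before merging, can be illustrated with a toy sketch. This is not the authors' implementation; the class name, dimensions, and fusion rule (average consensus, concatenate divergence) are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

class ConsensusDivergenceFusion:
    """Toy sketch of consensus/divergence fusion (illustrative, not FM2 itself).

    Each expert FM's embedding is linearly projected into a shared
    consensus subspace and a model-specific divergence subspace; the
    consensus parts are averaged and the divergence parts concatenated.
    """

    def __init__(self, in_dims, consensus_dim=64, divergence_dim=16):
        # One consensus and one divergence projection per expert FM.
        self.W_c = [rng.standard_normal((d, consensus_dim)) / np.sqrt(d)
                    for d in in_dims]
        self.W_d = [rng.standard_normal((d, divergence_dim)) / np.sqrt(d)
                    for d in in_dims]

    def __call__(self, feats):
        # Shared knowledge: average over experts in the consensus subspace.
        consensus = np.mean([f @ W for f, W in zip(feats, self.W_c)], axis=0)
        # Model-specific knowledge: keep each expert's divergence features.
        divergence = np.concatenate([f @ W for f, W in zip(feats, self.W_d)],
                                    axis=-1)
        return np.concatenate([consensus, divergence], axis=-1)

# Example: fuse a batch of 4 embeddings from three hypothetical experts
# with CLIP-, DINOv2-, and SAM-like feature widths.
fusion = ConsensusDivergenceFusion(in_dims=[512, 768, 256])
feats = [rng.standard_normal((4, d)) for d in (512, 768, 256)]
fused = fusion(feats)
print(fused.shape)  # (4, 112): 64 consensus + 3 * 16 divergence dims
```

In the paper, the projections would be learned (e.g. so that consensus features align across experts while divergence features stay distinct); here they are random, only to show the data flow.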
Publication type
Article: Journal article
Document type
Scientific article
Keywords
Disentangled Representation; Foundation Model; Knowledge Distillation; Pathology Image; Teacher-Student Network
ISSN (print) / ISBN
1566-2535
e-ISSN
1872-6305
Journal
Information Fusion
Citation details
Volume: 127,
Article number: 103840
Publisher
Elsevier
Place of publication
Radarweg 29, 1043 NX Amsterdam, Netherlands
Review status
Peer reviewed
Funding
China Postdoctoral Science Foundation
Natural Science Foundation of Shanghai
Shanghai Key Laboratory of Child Brain and Development
National Natural Science Foundation of China