PuSH - Publication Server of Helmholtz Zentrum München

Petzold, A.* ; Wessely, A.* ; Schliep, S.* ; Jiang, H. ; Tran, M. ; Koch, E.A.* ; Peng, T. ; Starz, H.* ; Berking, C.* ; Marr, C. ; Heppt, M.V.*

Weakly supervised deep learning for cutaneous squamous and basal cell carcinoma in whole-slide histopathology.

J. Pathol. Clin. Res. 12:e70082 (2026)
Open Access Gold
Creative Commons License
Distinguishing infiltrative basal cell carcinoma (BCC) from poorly differentiated cutaneous squamous cell carcinoma (cSCC) remains a significant histopathological challenge. Automated deep learning approaches hold promise for improving diagnostic reliability, yet robust external validation is essential. In this study, we developed a weakly supervised deep learning model to classify these diagnostically challenging subtypes and evaluated its generalizability across internal and external cohorts, as well as in comparison to a dermatopathology foundation model (HistoGPT). The model employed a multiple-instance learning framework (CLAM) using the histopathology-specific transformer Phikon for feature extraction from whole-slide images. Slide-level ground-truth diagnoses from the collected images (n = 335, University Hospital Erlangen) were derived from routine clinical practice and re-evaluated by two board-certified dermatopathologists. Performance was assessed on an internal test set of 84 whole-slide images (27 cSCC and 57 BCC) and two external datasets: the Queensland cohort (n = 10, curated in-distribution cases) and the COBRA cohort (n = 200, broad, partly out-of-distribution cases). Model discrimination was quantified using ROC curves, while accuracy, sensitivity, and specificity were reported alongside 95% Wilson confidence intervals (CIs). On the internal test set, the model achieved perfect classification [area under the receiver operating characteristic curve (AUC) = 1.0; 100% accuracy, sensitivity, and specificity]. Similarly, strong performance was observed in the Queensland cohort (AUC = 1.0), although limited by sample size. In the more heterogeneous COBRA cohort, discrimination remained high (AUC = 0.923, 95% CI 0.885-0.961), but a marked calibration shift required threshold adjustment (balanced accuracy 86.5% at Youden's J). Attention heatmaps highlighted histologically meaningful regions.
In zero-shot evaluation on the internal test set, HistoGPT achieved an overall accuracy of 77%, with high class-wise sensitivity for BCC (98%, 95% CI 91-100) but markedly reduced sensitivity for cSCC (33%, 95% CI 19-52). Fine-tuning a task-specific classifier on the HistoGPT backbone substantially improved performance, achieving near-perfect discrimination and 98% balanced accuracy. These findings demonstrate that weakly supervised deep learning enables highly accurate classification of diagnostically challenging BCC and cSCC subtypes. However, reliable deployment across institutions necessitates careful calibration and domain adaptation, and even powerful foundation models such as HistoGPT benefit from targeted fine-tuning to ensure robust performance in dermatopathology.
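The abstract reports sensitivities with 95% Wilson confidence intervals. As an illustration only, the sketch below computes a Wilson score interval in plain Python. The function name `wilson_interval` is hypothetical, and the counts in the usage lines (9 of 27 cSCC and 56 of 57 BCC slides correctly classified) are not stated in the record; they are back-calculated from the reported percentages and class sizes and should be read as an assumption, not as the authors' data or code.

```python
import math


def wilson_interval(successes, total, z=1.959963984540054):
    """Return the (lower, upper) Wilson score interval for a proportion.

    z defaults to the standard normal quantile for a two-sided 95% interval.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")

    p_hat = successes / total
    z2 = z * z
    denom = 1.0 + z2 / total
    centre = (p_hat + z2 / (2.0 * total)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1.0 - p_hat) / total + z2 / (4.0 * total * total))
    ) / denom

    # Clamp to [0, 1] to guard against floating-point overshoot at the extremes.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# ASSUMED counts, inferred from the reported 33% (n = 27) and 98% (n = 57).
cscc_low, cscc_high = wilson_interval(9, 27)
bcc_low, bcc_high = wilson_interval(56, 57)

print(f"cSCC sensitivity 95% CI: {cscc_low:.1%} to {cscc_high:.1%}")
print(f"BCC sensitivity 95% CI: {bcc_low:.1%} to {bcc_high:.1%}")
```

Under these assumed counts, the intervals round to 19-52% and 91-100%, which agrees with the values quoted in the abstract. Unlike the simple normal-approximation interval, the Wilson interval stays within [0, 1] and remains informative for small samples and proportions near 0 or 1.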
Publication type Article: Journal article
Document type Scientific Article
Keywords Artificial Intelligence ; Basal Cell Carcinoma ; Clinical Pathology ; Computer‐assisted Image Interpretation ; Deep Learning ; Skin Neoplasms ; Squamous Cell Carcinoma
ISSN (print) / ISBN 2056-4538
e-ISSN 2056-4538
Source Volume: 12, Issue: 2, Article Number: e70082
Publisher Wiley
Publishing Place 111 River St, Hoboken, NJ 07030-5774, USA
Reviewing status Peer reviewed
Grants European Research Council (ERC)
European Union's Horizon Research and Innovation Programme Grant
German Federal Ministry of Education and Research (BMBF)
Forschungsstiftung Medizin am Universitätsklinikum Erlangen
Hiege Stiftung
Else-Kröner Fresenius Excellence Fellowship
Clinician Scientist Programme of the IZKF (Interdisciplinary Center for Clinical Research) at the Medical Faculty of the FAU Erlangen
German Society of Dermatology (DDG)
Arbeitsgemeinschaft Dermatologische Forschung (ADF)
European Research Council (ERC) under the European Union
Projekt DEAL
Hightech Agenda Bayern
Helmholtz Munich