PuSH - Publication Server of Helmholtz Zentrum München

Sadeghi, M.*; Richer, R.*; Egger, B.*; Schindler-Gmelch, L.*; Rupp, L.H.*; Rahimi, F.*; Berking, M.*; Eskofier, B.M.

Harnessing multimodal approaches for depression detection using large language models and facial expressions.

npj Ment. Health Res. 3:66 (2024)
Free journal
Creative Commons license
Detecting depression is a critical component of mental health diagnosis, and accurate assessment is essential for effective treatment. This study introduces a novel, fully automated approach to predicting depression severity using the E-DAIC dataset. We employ Large Language Models (LLMs) to extract depression-related indicators from interview transcripts, utilizing the Patient Health Questionnaire-8 (PHQ-8) score to train the prediction model. Additionally, facial data extracted from video frames is integrated with textual data to create a multimodal model for depression severity prediction. We evaluate three approaches: text-based features, facial features, and a combination of both. Our findings show the best results are achieved by enhancing text data with speech quality assessment, with a mean absolute error of 2.85 and root mean square error of 4.02. This study underscores the potential of automated depression detection, showing text-only models as robust and effective while paving the way for multimodal analysis.
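The abstract describes regressing PHQ-8 scores from LLM-derived text features, facial features, and their combination, evaluated with mean absolute error and root mean square error. The following is a minimal illustrative sketch of that evaluation setup only, not the authors' pipeline: all feature arrays, dimensions, and the ridge regressor are hypothetical stand-ins for the E-DAIC-derived features described above.

# Minimal sketch of PHQ-8 severity regression from text and facial features.
# All arrays are synthetic placeholders; this illustrates the MAE/RMSE
# evaluation and simple early fusion, not the published implementation.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_interviews = 200                                    # hypothetical sample size
text_feats = rng.normal(size=(n_interviews, 32))      # e.g. LLM-derived indicators per transcript
face_feats = rng.normal(size=(n_interviews, 16))      # e.g. aggregated facial features per video
phq8 = rng.integers(0, 25, size=n_interviews).astype(float)  # PHQ-8 targets (0-24)

def evaluate(features, targets):
    """Fit a ridge regressor and report MAE / RMSE on a held-out split."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, targets, test_size=0.2, random_state=0
    )
    model = Ridge(alpha=1.0).fit(X_tr, y_tr)
    pred = model.predict(X_te)
    mae = mean_absolute_error(y_te, pred)
    rmse = np.sqrt(mean_squared_error(y_te, pred))
    return mae, rmse

# Text-only model vs. simple early fusion (feature concatenation).
print("text only     MAE/RMSE:", evaluate(text_feats, phq8))
print("text + facial MAE/RMSE:", evaluate(np.hstack([text_feats, face_feats]), phq8))

With real features, the same comparison would show whether adding the facial modality improves on the text-only baseline, which is the contrast the study reports.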
Publication type Article: Journal article
Document type Scientific Article
Corresponding Author
ISSN (print) / ISBN 2731-4251
e-ISSN 2731-4251
Source details Volume: 3, Issue: 1, Article Number: 66
Publisher Springer
Non-patent literature Publications
Reviewing status Peer reviewed
Institute(s) Institute of AI for Health (AIH)