Zamanian, A.* ; von Kleist, H. ; Ciora, O.A.* ; Piperno, M.* ; Lancho, G.* ; Ahmidi, N.*
Analysis of missingness scenarios for observational health data.
J. Pers. Med. 14:32 (2024)
Despite the extensive literature on missing data theory and cautionary articles emphasizing the importance of realistic analysis for healthcare data, a critical gap persists in incorporating domain knowledge into missing data methods. In this paper, we argue that the remedy is to identify the key scenarios that lead to data missingness and investigate their theoretical implications. Based on this proposal, we first introduce an analysis framework where we investigate how different observation agents, such as physicians, influence the data availability and then scrutinize each scenario with respect to the steps in the missing data analysis. We apply this framework to the case study of observational data in healthcare facilities. We identify ten fundamental missingness scenarios and show how they influence the identification step for missing data graphical models, inverse probability weighting estimation, and exponential tilting sensitivity analysis. To emphasize how domain-informed analysis can improve method reliability, we conduct simulation studies under the influence of various missingness scenarios. We compare the results of three common methods in medical data analysis: complete-case analysis, MissForest imputation, and inverse probability weighting estimation. The experiments are conducted for two objectives: variable mean estimation and classification accuracy. We advocate for our analysis approach as a reference for observational health data analysis. Beyond that, we also posit that the proposed analysis framework is applicable to other medical domains.
Publication type
Article: Journal article
Document type
Scientific article
Keywords
Missing Data Analysis; Missing Data Assumptions; Missingness Distribution Shift; Missingness Scenarios; Observational Health Data; Multiple Imputation; Risk; Prediction; Monotone; Model; Score
Keywords plus
Language
English
Publication year
2024
HGF reporting year
2024
ISSN (print) / ISBN
2075-4426
e-ISSN
2075-4426
Source details
Volume: 14
Issue: 5
Article number: 32
Publisher
MDPI
Place of publication
Basel, Switzerland
Peer review status
Peer reviewed
POF Topic(s)
30205 - Bioengineering and Digital Health
Research field(s)
Enabling and Novel Technologies
PSP element(s)
G-503800-007
Funding
Bayerisches Staatsministerium für Wirtschaft, Landesentwicklung und Energie
Date of record entry
2024-07-16