PuSH - Publication Server of Helmholtz Zentrum München

Rodemann, J.* ; Goschenhofer, J.* ; Dorigatti, E. ; Nagler, T.* ; Augustin, T.*

Approximately Bayes-Optimal Pseudo-Label Selection.

In: (Proceedings of Machine Learning Research). 2023. 1762-1773 (Proceedings of Machine Learning Research ; 216)
Publisher's version
Semi-supervised learning by self-training heavily relies on pseudo-label selection (PLS). This selection often depends on the initial model fit on labeled data. Early overfitting might thus be propagated to the final model by selecting instances with overconfident but erroneous predictions, often referred to as confirmation bias. This paper introduces BPLS, a Bayesian framework for PLS that aims to mitigate this issue. At its core lies a criterion for selecting instances to label: an analytical approximation of the posterior predictive of pseudo-samples. We derive this selection criterion by proving Bayes-optimality of the posterior predictive of pseudo-samples. We further overcome computational hurdles by approximating the criterion analytically. Its relation to the marginal likelihood allows us to come up with an approximation based on Laplace's method and the Gaussian integral. We empirically assess BPLS on simulated and real-world data. When faced with high-dimensional data prone to overfitting, BPLS outperforms traditional PLS methods.
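
The abstract's link between the posterior predictive and the marginal likelihood can be illustrated with a toy model. The following is a minimal sketch, not the paper's BPLS criterion: it assumes a one-parameter Gaussian model (observations y_i ~ N(theta, sigma^2) with prior theta ~ N(0, tau^2)), and every function name and default value is hypothetical. It applies Laplace's method, which approximates log p(D) by the log joint at the posterior mode plus a Gaussian-integral correction, and then scores a candidate pseudo-sample by the ratio p(D, y_new) / p(D). Because the log joint of this toy model is quadratic in theta, the approximation happens to be exact here, which makes it easy to check against the closed form.

```python
import math


def log_joint(theta, ys, sigma, tau):
    """Log likelihood plus log prior for the assumed toy model."""
    log_lik = sum(
        -0.5 * math.log(2 * math.pi * sigma**2) - (y - theta) ** 2 / (2 * sigma**2)
        for y in ys
    )
    log_prior = -0.5 * math.log(2 * math.pi * tau**2) - theta**2 / (2 * tau**2)
    return log_lik + log_prior


def laplace_log_evidence(ys, sigma=1.0, tau=2.0):
    """Laplace approximation of log p(D) for a single parameter theta."""
    n = len(ys)
    # Negative second derivative of the log joint (constant for this model).
    precision = n / sigma**2 + 1 / tau**2
    # Posterior mode (MAP estimate), available in closed form here.
    theta_map = (sum(ys) / sigma**2) / precision
    # Gaussian integral: log joint at the mode + 0.5*log(2*pi) - 0.5*log(precision).
    return (
        log_joint(theta_map, ys, sigma, tau)
        + 0.5 * math.log(2 * math.pi)
        - 0.5 * math.log(precision)
    )


def exact_log_evidence(ys, sigma=1.0, tau=2.0):
    """Closed-form log p(D): y ~ N(0, sigma^2 I + tau^2 11^T)."""
    n = len(ys)
    s = sum(ys)
    ss = sum(y * y for y in ys)
    log_det = n * math.log(sigma**2) + math.log(1 + n * tau**2 / sigma**2)
    quad = (ss - tau**2 * s**2 / (sigma**2 + n * tau**2)) / sigma**2
    return -0.5 * (n * math.log(2 * math.pi) + log_det + quad)


def log_posterior_predictive(y_new, ys, sigma=1.0, tau=2.0):
    """log p(y_new | D) = log p(D, y_new) - log p(D), both via Laplace."""
    augmented = list(ys) + [y_new]
    return laplace_log_evidence(augmented, sigma, tau) - laplace_log_evidence(
        ys, sigma, tau
    )


if __name__ == "__main__":
    data = [0.9, 1.1, 1.0]
    candidates = [1.0, 5.0]  # hypothetical pseudo-samples
    scores = {c: log_posterior_predictive(c, data) for c in candidates}
    best = max(scores, key=scores.get)
    print("selected candidate:", best)
```

In this sketch the candidate most consistent with the labeled data receives the highest score. For models whose log joint is not quadratic, the same formula is only an approximation, and the mode and curvature would have to be found numerically.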
Publication type Article: Conference contribution
Language English
Publication year 2023
HGF reporting year 2023
Conference title Proceedings of Machine Learning Research
Source Volume: 216, Pages: 1762-1773
POF Topic(s) 30205 - Bioengineering and Digital Health
Research field(s) Enabling and Novel Technologies
PSP element(s) G-503800-001
Scopus ID 85170035784
Date of entry 2023-10-18