To, D.* ; Quinting, J.* ; Hoshyaripour, G.A.* ; Goetz, M.* ; Streit, A.* ; Debus, C.*
Architectural insights into and training methodology optimization of Pangu-Weather.
Geosci. Model Dev. 17, 8873-8884 (2024)
Data-driven medium-range weather forecasts have recently outperformed classical numerical weather prediction models, with Pangu-Weather (PGW) being the first breakthrough model to achieve this. The Transformer-based PGW introduced novel architectural components including the three-dimensional attention mechanism (3D Transformer) in the Transformer blocks. Additionally, it features an Earth-specific positional bias term which accounts for weather states being related to the absolute position on Earth. However, the effectiveness of different architectural components is not yet well understood. Here, we reproduce the 24 h forecast model of PGW based on subsampled 6-hourly data. We then present an ablation study of PGW to better understand the sensitivity to the model architecture and training procedure. We find that using a two-dimensional attention mechanism (2D Transformer) yields a model that is more robust to training, converges faster, and produces better forecasts compared to using the 3D Transformer. The 2D Transformer reduces the overall computational requirements by 20 %-30 %. Further, the Earth-specific positional bias term can be replaced with a relative bias, reducing the model size by nearly 40 %. A sensitivity study comparing the convergence of the PGW model and the 2D-Transformer model shows large batch effects; however, the 2D-Transformer model is more robust to such effects. Lastly, we propose a new training procedure that increases the speed of convergence for the 2D-Transformer model by 30 % without any further hyperparameter tuning.
Publication type
Article: Journal article
Document type
Scientific article
Language
English
Publication year
2024
HGF reporting year
2024
ISSN (print) / ISBN
1991-959X
e-ISSN
1991-9603
Source details
Volume: 17
Issue: 23
Pages: 8873-8884
Publisher
Copernicus
Place of publication
Göttingen
Review status
Peer reviewed
Institute(s)
Helmholtz AI - KIT (HAI - KIT)
Funding
KIT Graduate School Computational and Data Science under the FAST-DREAM Bridge PhD grant
KIT Center MathSEE
Date of record entry
2025-01-08