Taylor Expansion in Neural Networks: How Higher Orders Yield Better Predictions.
Front. Artif. Intell. 392, 2983-2989 (2024)
Deep learning has become a popular tool for solving complex problems in a variety of domains. Transformers and the attention mechanism have contributed significantly to this success. We hypothesize that the enhanced predictive capabilities of the attention mechanism can be attributed to higher-order terms in the input. Expanding on this idea and taking inspiration from Taylor series approximation, we introduce “Taylor layers” as higher-order polynomial layers for universal function approximation. We evaluate Taylor layers of second and third order on the task of time series forecasting, comparing them to classical linear layers as well as the attention mechanism. Our results on two commonly used datasets demonstrate that higher expansion orders can improve prediction accuracy given the same number of trainable model weights. Interpreting higher-order terms as a form of token mixing, we further show that second-order (quadratic) Taylor layers can efficiently replace canonical dot-product attention, increasing prediction accuracy while reducing computational requirements.
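The abstract does not give the layer's exact formulation, so the following is a minimal PyTorch sketch of what a second-order (quadratic) polynomial layer in this spirit could look like. The class name TaylorLayer2, the low-rank factorization of the quadratic term, and all dimensions are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class TaylorLayer2(nn.Module):
    """Hypothetical quadratic layer: y = b + W1 x + (second-order term).

    A full quadratic form x^T W2 x would need O(d^2) weights per output,
    so this sketch factorizes it through a low-rank bottleneck (an
    assumption) to keep the parameter count near that of a linear layer.
    """

    def __init__(self, d_in: int, d_out: int, rank: int = 8):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)           # first-order term plus bias
        self.u = nn.Linear(d_in, rank, bias=False)     # left factor of the quadratic term
        self.v = nn.Linear(d_in, rank, bias=False)     # right factor of the quadratic term
        self.out = nn.Linear(rank, d_out, bias=False)  # projects quadratic features to outputs

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Elementwise product of two linear maps captures pairwise
        # (second-order) interactions between input components.
        quad = self.u(x) * self.v(x)
        return self.linear(x) + self.out(quad)

# Usage: a drop-in replacement for nn.Linear in a forecasting model,
# e.g. mapping a 96-step input window to a 24-step forecast.
layer = TaylorLayer2(d_in=96, d_out=24)
x = torch.randn(32, 96)   # batch of 32 input windows
y = layer(x)              # shape: (32, 24)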
Impact Factor
3.000
Scopus SNIP
0.475
Notes
Special publication
Hide on homepage
Publication type
Article: Journal article
Document type
Scientific article
Language
English
Year of publication
2024
HGF reporting year
2024
ISSN (print) / ISBN
2624-8212
e-ISSN
2624-8212
Journal
Frontiers in Artificial Intelligence
Citation details
Volume: 392, Pages: 2983-2989
Publisher
Frontiers
Review status
Peer reviewed
Institute(s)
Helmholtz AI - KIT (HAI - KIT)
Scopus ID
85216667330
Date recorded
2025-02-11