Linear Transformers Are Secretly Fast Weight Memory Systems (Machine Learning Paper Explained)

First published at 17:29 UTC on February 27th, 2021.

#fastweights #deeplearning #transformers

Transformers are dominating Deep Learning, but their quadratic memory and compute requirements make them expensive to train and hard to use. Many papers have attempted to linearize the core module: the atten…

Category: Science & Technology
Sensitivity: Normal - Content that is suitable for ages 16 and over