Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention (Paper Explained)

First published at 13:03 UTC on July 4th, 2020.
#ai #attention #transformer #deeplearning

Transformers are famous for two things: their superior performance and their insane compute and memory requirements. This paper reformulates the attention mechanism in terms of kernel functions and obtai…
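As a rough illustration of the kernel reformulation: softmax attention computes an N×N similarity matrix, but replacing the softmax with a feature map φ lets you regroup the multiplication as φ(Q)(φ(K)ᵀV), which is linear in sequence length. The sketch below uses the elu(x)+1 feature map from the paper; the function names and the NumPy setup are my own illustration, not the authors' code.

```python
import numpy as np

def elu_feature_map(x):
    # phi(x) = elu(x) + 1: a positive feature map, as used in the paper
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V, eps=1e-6):
    # Q, K: (N, d); V: (N, d_v). Cost is O(N * d * d_v) instead of O(N^2).
    Qf, Kf = elu_feature_map(Q), elu_feature_map(K)
    KV = Kf.T @ V                    # (d, d_v): summarize keys/values once
    Z = Qf @ Kf.sum(axis=0)          # (N,): per-query normalizer
    return (Qf @ KV) / (Z[:, None] + eps)
```

Because only the matrix-product grouping changes, this produces exactly the same output as the quadratic form `((Qf @ Kf.T) @ V)` with the same row-wise normalization, up to floating-point error.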
