Rethinking Attention with Performers (Paper Explained)

First published at 18:18 UTC on October 26th, 2020.

#ai #research #attention

Transformers have huge memory and compute requirements because they construct an attention matrix, which grows quadratically with the length of the input. The Performer is a model that uses random positive orthogonal features to approximate the softmax attention kernel, reducing the cost from quadratic to linear in the sequence length.
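To make the trick concrete, here is a minimal NumPy sketch of attention via random positive features. The function names and the num_features parameter are illustrative choices of mine, and it draws plain Gaussian features rather than the orthogonal ones the paper uses; the point is only that the n-by-n attention matrix is never materialized.

import numpy as np

def positive_random_features(x, W):
    # FAVOR+ style positive feature map: phi(x) = exp(W x - ||x||^2 / 2) / sqrt(m),
    # chosen so that E[phi(q) . phi(k)] = exp(q . k) for W rows drawn from N(0, I)
    m = W.shape[0]
    projection = x @ W.T                                  # (n, m)
    sq_norms = np.sum(x ** 2, axis=-1, keepdims=True) / 2.0
    return np.exp(projection - sq_norms) / np.sqrt(m)

def performer_attention(Q, K, V, num_features=256, seed=0):
    # Linear-time approximation of softmax(Q K^T / sqrt(d)) V
    d = Q.shape[-1]
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((num_features, d))            # Gaussian features (paper: orthogonal)
    # Fold the 1/sqrt(d) temperature into the inputs so phi(q) . phi(k) ~ exp(q . k / sqrt(d))
    Qp = positive_random_features(Q / d ** 0.25, W)       # (n, m)
    Kp = positive_random_features(K / d ** 0.25, W)       # (n, m)
    # Associativity is the whole trick: Qp @ (Kp^T V) never forms an n-by-n matrix
    numerator = Qp @ (Kp.T @ V)                           # (n, d_v)
    denominator = Qp @ Kp.sum(axis=0)[:, None]            # (n, 1), the row-wise softmax normalizer
    return numerator / denominator

# Usage: cost is O(n * m * d) instead of O(n^2 * d)
n, d = 512, 64
rng = np.random.default_rng(1)
Q, K, V = rng.standard_normal((3, n, d))
out = performer_attention(Q, K, V)                        # (512, 64)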

Category: Science & Technology
Sensitivity: Normal - Content that is suitable for ages 16 and over