Sparse is Enough in Scaling Transformers (aka Terraformer) | ML Research Paper Explained

First published at 10:21 UTC on December 2nd, 2021.

#scalingtransformers #terraformer #sparsity

Transformers keep pushing the state of the art in language and other domains, mainly due to their ability to scale to ever more parameters. However, this scaling has made it prohibitively expensive to run…

Category: Science & Technology
Sensitivity: Normal - Content that is suitable for ages 16 and over