Reformer: The Efficient Transformer

Understanding Transformer-Based Self-Supervised Architectures

Rohan Jagtap
Towards Data Science
9 min read · Aug 18, 2020


The Transformer (Vaswani et al.) is great: it attends to long contexts, it parallelizes computation in a way RNNs can't, and, most importantly, it delivers state-of-the-art results.
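To make the parallelism point concrete, here is a minimal NumPy sketch (not from the original article) of scaled dot-product attention, the core Transformer operation from Vaswani et al., softmax(QKᵀ/√d_k)V: every query is scored against every key in a single matrix product, so the whole sequence is processed at once rather than token by token as in an RNN. The L × L score matrix this produces is also the quadratic memory cost that the Reformer sets out to reduce. Shapes and inputs below are toy values chosen purely for illustration.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, computed for all positions at once."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (L, L) pairwise attention scores
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # each output is a weighted sum of values

# Toy example: a sequence of 4 tokens with 8-dimensional representations.
L, d_model = 4, 8
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(L, d_model)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```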

In this article, we’ll be covering the Reformer Model, which was proposed in the…
