Linformer: Making Transformers Linear, Efficient, and Scalable
Data Parallelism and Model Parallelism
Positional Embedding: The Real Brain of Transformers