Wider, Not Deeper: Cambridge, Oxford & ICL Challenge Conventional Transformer Design Approaches

In the new paper Wide Attention Is The Way Forward For Transformers, a research team from the University of Cambridge, Imperial College London, and the University of Oxford challenges the commonly held belief that deeper is better for transformer architectures, demonstrating that wider attention layers can deliver superior performance to deeper stacks on natural language processing tasks.
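
To make the wide-versus-deep contrast concrete, here is a minimal PyTorch sketch comparing a deep, narrow encoder stack against a single wide layer with a roughly matched parameter budget. The dimensions, head counts, and `encoder` helper are illustrative assumptions, not the paper's exact configurations; note too that in PyTorch's built-in attention the heads split `d_model`, so most of the extra width here comes from the feed-forward block rather than from attention itself.

```python
import torch
import torch.nn as nn

def encoder(num_layers: int, nhead: int, dim_feedforward: int,
            d_model: int = 256) -> nn.TransformerEncoder:
    """Builds a stack of standard transformer encoder layers.
    All sizes here are hypothetical, chosen only for illustration."""
    layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                       dim_feedforward=dim_feedforward)
    return nn.TransformerEncoder(layer, num_layers=num_layers)

# Deep-and-narrow: six stacked layers, eight heads each.
deep = encoder(num_layers=6, nhead=8, dim_feedforward=1024)

# Wide-and-shallow: one layer with more heads and a wider feed-forward
# block, sized to roughly match the deep model's parameter count.
wide = encoder(num_layers=1, nhead=32, dim_feedforward=8192)

count = lambda m: sum(p.numel() for p in m.parameters())
print(f"deep: {count(deep):,} params, wide: {count(wide):,} params")

# Both variants consume and produce tensors of the same shape,
# so one can be swapped for the other in a downstream model.
x = torch.randn(10, 4, 256)          # (sequence, batch, d_model)
print(deep(x).shape, wide(x).shape)  # both torch.Size([10, 4, 256])
```

Because the two variants are drop-in replacements for each other, the trade-off reduces to where the parameter budget lives: spread thinly across many sequential layers, or concentrated in a single wide one.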