Attention is all you need
By Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
"The foundational paper introducing the Transformer architecture. Dispenses with recurrence and convolutions, relying entirely on self-attention for sequence modeling."
Abstract
We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train.
Accelerate your workflow with Feedalyze
AI churn detection for SaaS. Know which customers will leave before they do.
Free plan available · Connects to HubSpot, Intercom, Zendesk