Manuel Gentile and Fabrizio Falchi
Transformers are a neural network architecture designed to overcome the limitations of recurrent neural networks in analysing sequences of data (in our case, words or tokens)1.
Specifically, through the self-attention mechanism, transformers can process the elements of a data sequence in parallel while capturing the dependencies between those elements and the contexts in which they occur.
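To make the mechanism concrete, the sketch below implements scaled dot-product self-attention, the core operation of the cited paper, in NumPy. The sequence length, embedding dimensions, and projection matrices are illustrative placeholders, not values taken from the original text.

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over a sequence of token embeddings.

    X: (seq_len, d_model) matrix of token embeddings.
    W_q, W_k, W_v: (d_model, d_k) learned projection matrices (random here).
    Returns a (seq_len, d_k) matrix in which every output row is a weighted
    mixture of all value vectors, so each token attends to the whole
    sequence in a single parallel step rather than one position at a time.
    """
    Q = X @ W_q                                  # queries
    K = X @ W_k                                  # keys
    V = X @ W_v                                  # values
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)              # pairwise token dependencies
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

# Illustrative usage with random embeddings for a 4-token sequence.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                      # 4 tokens, d_model = 8
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)
print(out.shape)                                 # (4, 8)
```

Because every attention weight is computed from a single matrix product over the whole sequence, no step depends on the output of the previous position, which is what allows the parallelisation that recurrent networks cannot offer.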
1 Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … & Polosukhin, I., Attention is all you need, Advances in neural information processing systems, 30, 2017.