Manuel Gentile and Fabrizio Falchi

Transformers are a neural network architecture designed to overcome the limitations of recurrent neural networks in the analysis of sequences of data (in our case, words or tokens) [1].

Specifically, through the self-attention mechanism, transformers can process all the elements of a sequence in parallel and capture the dependencies between those elements and the contexts in which they occur.
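As a concrete illustration, here is a minimal sketch of the scaled dot-product self-attention described by Vaswani et al. [1], written in NumPy. The dimensions, weight matrices, and random toy inputs are illustrative assumptions, not values from a real trained model.

```python
# A minimal sketch of scaled dot-product self-attention (toy example, not a full transformer).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # numerically stabilised softmax
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for one sequence.

    X          : (seq_len, d_model) matrix of token embeddings
    Wq, Wk, Wv : learned projection matrices, here (d_model, d_k)
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv           # project tokens to queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # pairwise token-to-token similarities
    weights = softmax(scores, axis=-1)         # each row: how much one token attends to the others
    return weights @ V                         # context-aware representation of every token

# Toy input: a "sequence" of 4 tokens with 8-dimensional embeddings (assumed sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one updated vector per token
```

Because all the token-to-token scores are obtained in a single matrix product, every position is processed at once rather than step by step as in a recurrent network, and each row of the attention weights records the dependencies between one token and every other token in the sequence.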

[1] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
