Manuel Gentile and Fabrizio Falchi

Transformers are a neural network architecture designed to overcome the limitations of recurrent neural networks in the analysis of data sequences (in our case, sequences of words or tokens)¹.

Specifically, through the self-attention mechanism, transformers can process all elements of a sequence in parallel and capture the dependencies between those elements and the contexts in which they occur.
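To make the idea concrete, here is a minimal NumPy sketch of the scaled dot-product attention at the heart of self-attention. The function names and toy data are illustrative assumptions, not part of the chapter; in a real transformer the queries, keys, and values would come from learned linear projections of the token embeddings.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attend over all positions at once: one matrix product computes
    every pairwise token-to-token similarity, which is what allows the
    sequence to be processed in parallel rather than step by step."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # pairwise similarities, scaled
    # Softmax over each row turns similarities into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy sequence: 3 tokens, each a 4-dimensional embedding (random, for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
# For simplicity we use X itself as queries, keys, and values;
# a trained model would use three separate learned projections of X.
output, weights = scaled_dot_product_attention(X, X, X)
```

Each row of `weights` sums to 1 and tells us how strongly one token attends to every token in the sequence, including itself; `output` is the resulting context-aware representation of each token.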

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.

Licence


AI for Teachers: an Open Textbook Copyright © 2024 by Manuel Gentile and Fabrizio Falchi is licensed under a Creative Commons Attribution 4.0 International License, except where otherwise noted.
