Deep Learning

Transformer

A neural network architecture based on self-attention mechanisms, powering modern language models.

In-depth explanation

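Introduced in the paper "Attention Is All You Need" (Vaswani et al., 2017), the Transformer replaced recurrent neural networks (RNNs) for many sequence tasks. Its self-attention mechanism lets every position in the input weigh every other position directly, regardless of how far apart they are. Because positions are not processed one after another as in an RNN, a whole sequence can be processed in parallel during training, and long-range dependencies are easier to capture. Transformers underlie GPT, BERT, and most modern large language models (LLMs).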
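To make the self-attention idea concrete, here is a minimal sketch of single-head scaled dot-product attention in plain Python. It is an illustration under simplifying assumptions, not a production implementation: the helper names (`softmax`, `matmul`, `self_attention`), the toy 3-token input, and the identity projection matrices are all made up for this example, and real models add multiple heads, positional information, and learned weights.

```python
import math

def softmax(row):
    # Subtract the max before exponentiating for numerical stability.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain list-of-lists matrix product: (n x m) @ (m x p) -> (n x p).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x, w_q, w_k, w_v):
    # Project the same input into queries, keys, and values.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    # Every token scores every other token, scaled by sqrt(d_k).
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    # Each row of weights sums to 1: how much one token attends to the others.
    weights = [softmax(row) for row in scores]
    # Output for each token is a weighted mix of all value vectors.
    return matmul(weights, v), weights

# Toy input: 3 tokens with 2 features each (hypothetical values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned projections
out, weights = self_attention(x, identity, identity, identity)

for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` depends only on the content of the tokens, not on how far apart they sit in the sequence, and every row can be computed independently of the others. That is the property behind the parallelism and long-range behavior described above.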

Examples

GPT-4
BERT
Vision Transformer (ViT)
