AI papers, in order.
A curated archive of AI papers in reverse chronological order, newest first: 7 entries with summaries and sources.
Chinchilla scaling laws
Showed that many large language models were undertrained for their size, reframing how a fixed compute budget should be split between model size and training data.
Scaling · DeepMind
Denoising Diffusion Probabilistic Models
A key paper establishing diffusion models as a powerful approach to high-quality image generation.
Image Generation · UC Berkeley
GPT-3
Showed that scaling a language model to 175B parameters produced strong few-shot abilities, popularizing large-scale language models.
Language Models · OpenAI
BERT
A bidirectional Transformer pretraining method that set a new standard for many language-understanding tasks.
Language Models · Google
Attention Is All You Need
Introduced the Transformer architecture, which became the foundation of nearly all modern large language models.
Architectures · Google
Deep Residual Learning (ResNet)
Introduced residual connections, enabling much deeper networks and influencing architectures across AI.
Architectures · Microsoft Research
Generative Adversarial Networks
Introduced GANs, training two networks against each other to generate realistic data, a foundational idea for generative models.
Generative · Université de Montréal