AI papers archive

Chinchilla scaling laws

Showed that many large language models were undertrained for their size, reframing how a fixed compute budget should be split between model size and training data.

Scaling · DeepMind

A key paper establishing diffusion models as a powerful approach to high-quality image generation.

Image Generation · UC Berkeley

Showed that scaling language models to 175B parameters produced strong few-shot abilities, popularizing large-scale LLMs.

Language Models · OpenAI

A bidirectional Transformer pretraining method that set a new standard for many language-understanding tasks.

Language Models · Google

Introduced the Transformer architecture, which became the foundation of nearly all modern large language models.

Architectures · Google

Introduced residual connections, enabling much deeper networks and influencing architectures across AI.

Architectures · Microsoft Research

Introduced GANs, training two networks against each other to generate realistic data, a foundational idea for generative models.

Generative · Université de Montréal