Deconstructed

Popular papers, already deconstructed.

A curated, public collection of research papers parsed with Deconstructed. Open any entry to explore its structure, equations, and AI explanations — no signup required.

  • Attention Is All You Need

    Vaswani et al. (2017) · Transformers · NLP · Deep Learning

    The Transformer paper that replaced recurrence and convolutions with pure attention for sequence modeling (a minimal attention sketch appears after this list). It became the architectural foundation for modern language models.

    arXiv
  • Language Models are Few-Shot Learners

    Brown et al. (2020) · LLMs · Few-Shot Learning · Scaling

    The GPT-3 paper showing that scaling language models dramatically improves in-context and few-shot learning. It marked a major shift away from task-specific fine-tuning toward prompt-based use.

    arXiv
  • Learning to summarize from human feedback

    Stiennon et al. (2020) · RLHF · Alignment · Summarization

    An early RLHF paper that improves summarization by training on human preferences instead of only matching reference summaries. It helped establish the practical value of preference-based optimization.

    arXiv
  • Denoising Diffusion Probabilistic Models

    Ho et al. (2020) · Diffusion · Generative Models · Computer Vision

    A foundational diffusion paper showing high-quality image synthesis from iterative denoising. It helped kick off the modern diffusion wave in generative modeling.

    arXiv
  • Deep Residual Learning for Image Recognition

    He et al. (2015) · Computer Vision · ResNets · Deep Learning

    Introduces ResNets and residual connections, making it practical to train much deeper neural networks; a residual-block sketch appears after this list. The paper became a cornerstone of modern computer vision and deep learning more broadly.

    arXiv
  • LoRA: Low-Rank Adaptation of Large Language Models

    Hu et al. (2021) · LLMs · Fine-Tuning · Parameter Efficiency

    Introduces LoRA, a parameter-efficient fine-tuning method that freezes base weights and learns small low-rank updates (see the LoRA sketch after this list). It became one of the standard techniques for adapting large models cheaply.

    arXiv
  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

    Devlin et al. (2018) · NLP · Pre-training · Transformers

    Introduces BERT, a bidirectional Transformer pre-trained with masked language modeling. It reset the standard for transfer learning in NLP and powered a huge wave of downstream fine-tuning work.

    arXiv
  • Learning Transferable Visual Models From Natural Language Supervision

    Radford et al. (2021) · CLIP · Vision-Language · Zero-Shot

    The CLIP paper showing that contrastive image-text pretraining produces highly transferable visual representations. It demonstrated strong zero-shot performance across a wide range of vision tasks.

    arXiv
  • Semi-Supervised Classification with Graph Convolutional Networks

    Kipf and Welling (2016) · Graph Neural Networks · Semi-Supervised Learning · Representation Learning

    A landmark GCN paper that brought efficient graph convolutions into mainstream machine learning. It became one of the canonical starting points for graph representation learning.

    arXiv
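
Several entries above name a concrete mechanism, so a few minimal sketches follow. First, scaled dot-product attention as introduced by Vaswani et al. (2017): each output position is a softmax-weighted sum of value vectors, with query-key similarities scaled by √d_k. This is a minimal single-head sketch in PyTorch; the function name and shapes are illustrative, not the paper's full multi-head module.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: (batch, seq_len, d_k); returns (batch, seq_len, d_k)."""
    d_k = q.size(-1)
    # Similarity of every query with every key, scaled to keep softmax gradients stable.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)  # attention distribution over positions
    return weights @ v                       # weighted sum of value vectors

q = k = v = torch.randn(2, 5, 64)
out = scaled_dot_product_attention(q, k, v)  # shape (2, 5, 64)
```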
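
Next, the residual connection from He et al. (2015): a block learns a residual F(x) and adds the input back, y = F(x) + x, so gradients can flow through the identity shortcut and very deep stacks remain trainable. The channel counts and layer choices below are illustrative, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # identity shortcut: the block only learns the residual

x = torch.randn(1, 64, 32, 32)
y = ResidualBlock(64)(x)  # same shape as x
```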
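
Finally, the LoRA update from Hu et al. (2021): the pretrained weight is frozen and only a small low-rank correction B @ A is trained, scaled by alpha / rank. The rank, scaling, and initialization below follow common convention but are illustrative, not the paper's exact setup.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank=8, alpha=16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pretrained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: update starts at zero
        self.scale = alpha / rank

    def forward(self, x):
        # Frozen pretrained path plus the learned low-rank correction.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(4, 768))  # only lora_a and lora_b receive gradients
```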