Reading List

Papers, websites, and cool people. Things I've read or am meaning to read. All entries before June 26, 2024 are undated.

Pinned

Papers With Code


All

2024-08-14 Algorithms for Decision Making

2024-08-12 Evaluating ∇f(x) is as fast as f(x)

2024-08-12 Reinforcement Learning from Human Feedback Book

2024-08-12 Mathematical Foundations of Reinforcement Learning

2024-08-11 Animated AI

2024-08-07 Learning to Move with Affordance Maps

2024-08-07 Imitation Learning

2024-08-06 Algorithms for Modern Hardware

2024-08-05 bytecode interpreters for tiny computers

2024-08-03 Latency Numbers Every Engineer Should Know

2024-07-28 The Matrix Calculus You Need For Deep Learning

2024-07-28 A Recipe for Training Neural Networks

2024-07-28 Practical Deep Learning

2024-07-25 A (Long) Peek into Reinforcement Learning

2024-07-23 Open Source AI Is the Path Forward

2024-07-23 The Llama 3 Herd of Models

2024-07-21 Mamba: The Hard Way

2024-07-18 CleanRL (Clean Implementation of RL Algorithms)

2024-07-18 Policy Gradient Demystified

2024-07-17 Prover-Verifier Games improve legibility of language model outputs

2024-07-16 Codestral Mamba

2024-07-14 An extended collection of matrix derivative results for forward and reverse mode algorithmic differentiation

2024-07-12 Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers

2024-07-12 Modularized Implementation of Deep RL Algorithms in PyTorch

2024-07-10 ML Code Challenges

2024-07-10 lucidrains

2024-07-05 From Autoencoder to Beta-VAE

2024-07-05 Tutorial on Variational Autoencoders

2024-07-01 The Super Effectiveness of Pokémon Embeddings Using Only Raw JSON and Images

2024-07-01 Popular Model-free Reinforcement Learning Algorithms

2024-06-29 Auto-Encoding Variational Bayes

2024-06-29 Stanford CS234: Reinforcement Learning Spring 2024

2024-06-29 Policy Gradient Algorithms

2024-06-28 How to Optimize a CUDA Matmul Kernel for cuBLAS-like Performance: a Worklog

2024-06-27 Meta Large Language Model Compiler: Foundation Models of Compiler Optimization

2024-06-27 KAN: Kolmogorov-Arnold Networks

2024-06-26 Higher-order Virtual Machine 2

2024-06-26 4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities

The Wi-Fi only works when it's raining

Yacine (Software @ X)

Andrej Karpathy

Lilian Weng (Safety @ OpenAI)

Einops: Clear and Reliable Tensor Manipulations with Einstein-like Notation

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Formal Algorithms for Transformers

A high-bias, low-variance introduction to Machine Learning for physicists

Attention Is All You Need

Generative Adversarial Nets

Gradient-Based Learning Applied to Document Recognition

Diffusion Models

UGeneva 14x050 Deep Learning Course

The Transformer Family Version 2.0

Algebra, Topology, Differential Calculus, and Optimization Theory For Computer Science and Machine Learning

UToronto CSC321 Lecture 10: Automatic Differentiation

ML Interviews Book

AI by Hand

Karpathy's MinBPE

index.globe.engineer

Autodidax

Ilya's 30u30

rsrch.space

Ishan's Idea List

The Annotated Transformer

The Little Book of Deep Learning

Transformers from Scratch