Reading List
Papers, websites, and cool people: things I've read or am meaning to read. All entries from before June 26, 2024 are undated.
2024-11-13 Personality Basins
2024-11-12 reflections on palantir
2024-11-12 Jason Liu
2024-11-12 Jack Morris
2024-11-12 Deep Learning, NLP, and Representations
2024-11-05 Setting Your Pet Rock Free.
2024-08-14 Algorithms for Decision Making
2024-08-12 Evaluating ∇f(x) is as fast as f(x)
2024-08-12 Reinforcement Learning from Human Feedback Book
2024-08-12 Mathematical Foundations of Reinforcement Learning
2024-08-11 Animated AI
2024-08-07 Learning to Move with Affordance Maps
2024-08-07 Imitation Learning
2024-08-06 Algorithms for Modern Hardware
2024-08-05 bytecode interpreters for tiny computers
2024-08-03 Latency Numbers Every Engineer Should Know
2024-07-28 The Matrix Calculus You Need For Deep Learning
2024-07-28 A Recipe for Training Neural Networks
2024-07-28 Practical Deep Learning
2024-07-25 A (Long) Peer into Reinforcement Learning
2024-07-23 Open Source AI Is the Path Forward
2024-07-23 The Llama 3 Herd of Models
2024-07-21 Mamba: The Hard Way
2024-07-18 CleanRL (Clean Implementation of RL Algorithms)
2024-07-18 Policy Gradient Demystified
2024-07-17 Prover-Verifier Games improve legibility of language model outputs
2024-07-16 Codestral Mamba
2024-07-12 Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
2024-07-12 Modularized Implementation of Deep RL Algorithms in PyTorch
2024-07-10 ML Code Challenges
2024-07-10 lucidrains
2024-07-05 From Autoencoder to Beta-VAE
2024-07-05 Tutorial on Variational Autoencoders
2024-07-01 The Super Effectiveness of Pokémon Embeddings Using Only Raw JSON and Images
2024-07-01 Popular Model-free Reinforcement Learning Algorithms
2024-06-29 Auto-Encoding Variational Bayes
2024-06-29 Stanford CS234: Reinforcement Learning Spring 2024
2024-06-29 Policy Gradient Algorithms
2024-06-28 How to Optimize a CUDA Matmul Kernel for cuBLAS-like Performance: a Worklog
2024-06-27 Meta Large Language Model Compiler: Foundation Models of Compiler Optimization
2024-06-27 KAN: Kolmogorov-Arnold Networks
2024-06-26 Higher-order Virtual Machine 2
2024-06-26 4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities
The Wi-Fi only works when it's raining
Einops: Clear and Reliable Tensor Manipulations with Einstein-like Notation
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Formal Algorithms for Transformers
A high-bias, low-variance introduction to Machine Learning for physicists
Gradient-Based Learning Applied to Document Recognition
UGeneva 14x050 Deep Learning Course
The Transformer Family Version 2.0
UToronto CSC321 Lecture 10: Automatic Differentiation
The Matrix Calculus You Need For Deep Learning