rlhf-book

Textbook on reinforcement learning from human feedback

Other, Emerging

Stars: not shown
Forks: not shown
Contributors: 8
Last push: 12d ago

Recent commits

Dark mode (#442)
bfac558, segyges, 13d ago

Lec 6 DPO: math-nit fixes (parens on u, factor beta on slide 38) (#441)
79cdd52, Nathan Lambert, 14d ago

Course: add Q&A 1 + Lecture 5 videos, link Lecture 5 on reasoning TOC (#440)
68c6037, Nathan Lambert, 15d ago

Add Lecture 6: Direct Preference Optimization (chapter 8) (#435)
f23f0e5, Nathan Lambert, 15d ago

Recreate RL-for-LLM diagrams in TikZ + organize diagrams/tikz by topic (#438)
4e09170, Nathan Lambert, 16d ago

Lecture 5: clarify FP32 LM head caption, link loss-aggregation to Lecture 4 (#437)
e0c2b1d, Nathan Lambert, 16d ago

Course: add Extra Resources section (#436)
b9d9c72, Nathan Lambert, 18d ago

Lecture 5: The Rise of Reasoning Models (#327)
82fc036, Nathan Lambert, 20d ago

Top contributors

Builders behind this project.