rlhf-book

Textbook on reinforcement learning from human feedback

TeX · Emerging

Stars

—

Forks

—

Contributors

8

Last push

8mo ago

Recent commits

Latest commits.

Regularization and rejection sampling tweaks (#162)
a72e591 · Zoran Medić · 8mo ago

Policy gradients and dpo typos/tweaks (#163)
c2ecfff · Zoran Medić · 8mo ago

Update metadata.yml with Pandoc workaround
6f9bdbb · Nathan Lambert · 8mo ago

Try to fix action 1 (#161)
55b973e · Nathan Lambert · 8mo ago

Clarify reward model conditioning (#160)
70ac958 · Nathan Lambert · 8mo ago

Fix typos (#159)
45e90b4 · Zoran Medić · 9mo ago

a754032 · Nathan Lambert · 9mo ago

WIP: Add completions library (#157)
d211b16 · Nathan Lambert · 9mo ago

Top contributors

Builders behind this project.

emmanuel-ferdman