
oat

🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.

Python · Emerging

Stars: —
Forks: —
Contributors: 5
Last push: 15mo ago

Recent commits

Latest commits.

Upgrade vllm for more efficient collocation (#34)
59eb01b · zclzc · 15mo ago

Fix ppo advantage computation (#33)
dfe3e8b · Qingfeng Lan · 16mo ago

add sft script (#32)
f2cb697 · zclzc · 16mo ago

chore: update deepspeed.py (#30)
4540740 · Ikko Eltociear Ashimine · 17mo ago

Update README.md (#29)
fa662b1 · zclzc · 17mo ago

Use a toy task to test R1-zero like training behaviors (#28)
56b9b57 · zclzc · 17mo ago

minor fix for offline sft (#27)
d000304 · zclzc · 17mo ago

f778278 · zclzc · 17mo ago

Top contributors

Builders behind this project.