oat

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

Other · Emerging

Stars: n/a
Forks: n/a
Contributors: 8
Last push: 5mo ago

Recent commits

Latest commits.

feat: migrate to vllm AsyncLLMEngine with async generation support (#77)
8697066 · Simon Yu · 5mo ago

fix: incorrect double use of timing variable (#71)
8727fc3 · Yannik Keller · 5mo ago

chore: minor updates on logging and resource allocation (#73)
1b52eed · zclzc · 6mo ago

feat: add fp16 training (#70)
c1a074c · zclzc · 8mo ago

fix: micro batch calculation in offline DPO (#68)
e7ce5ca · Hoang Minh Huy · 8mo ago

fix: incorrect state indexing in PPOMultiTurnLearner critic training (#67)
bc30eaf · Minzheng_Wang · 8mo ago

chore: update lora and add metrics (#66)
f3ccb7f · zclzc · 8mo ago

feat: support LoRA RL training (#64)
e1164ac · zclzc · 9mo ago

Top contributors

Builders behind this project.

emmanuel-ferdman