llm-code
LLM-related code repository | GRPO, Train Free GRPO, and more
Python
Emerging
GitHub
Stars
—
Forks
—
Contributors
6
Last push
2mo ago
Recent commits
Latest commits.
docs: update README.md, add links on KL estimator selection in RL and On-Policy Distillation
99327a9
Async
2mo ago
docs: add PPO and GRPO explainer documents
1e53670
Async
2mo ago
docs: add study documents
8dca5b5
Async
2mo ago
Merge branch 'main' of https://github.com/wyf3/llm_related
7c8bf6b
wyf3
3mo ago
new file: kimi_attnres/dataset.py
7755f47
wyf3
3mo ago
Simplify action_mask and ends calculations
05e7fc9
wyf3
4mo ago
new file: deepseek_learn/mHC.ipynb
68423c6
wyf3
4mo ago
Merge branch 'main' of https://github.com/wyf3/llm_related
e8579c4
wyf3
5mo ago
Top contributors
Builders behind this project.
wyf3
89 commits
EnableAsync
3 commits
BiboyQG
1 commit
Richard-zrx
1 commit
Hana61
1 commit
allblueee
1 commit