GRPO

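Implements the complete GRPO algorithm workflow using the TRL library and custom reward functions, giving the Qwen2.5-0.5B-Instruct model mathematical reasoning ability as well!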
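
To illustrate what a custom reward function for GRPO might look like, here is a minimal sketch in plain Python. It is not taken from the repository: the `<answer>...</answer>` output format, the helper names `extract_answer`, `correctness_reward`, and `group_relative_advantages`, and the `answer` column name are all assumptions made for illustration. The reward function follows the general shape TRL expects (it receives a list of completions plus extra dataset columns as keyword arguments and returns one float per completion), and the second function shows the group-relative normalization that gives GRPO its name, using the population standard deviation for simplicity.

```python
import re
from statistics import mean, pstdev


def extract_answer(text):
    """Return the content of the last <answer>...</answer> tag, or None.

    The tag format is an assumed convention, not one confirmed by the repo.
    """
    matches = re.findall(r"<answer>(.*?)</answer>", text, flags=re.DOTALL)
    return matches[-1].strip() if matches else None


def correctness_reward(completions, answer, **kwargs):
    """Score each completion 1.0 if its extracted answer equals the reference.

    `completions` is assumed to be a list of plain strings, and `answer`
    a list of reference strings of the same length (hypothetical column name).
    """
    return [
        1.0 if extract_answer(c) == a else 0.0
        for c, a in zip(completions, answer)
    ]


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group of samples for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so
    completions are judged relative to their siblings rather than by an
    absolute value estimate.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    group = [
        "<answer>4</answer>",
        "I think it is 5",
        "<answer> 4 </answer>",
        "<answer>3</answer>",
    ]
    rewards = correctness_reward(group, answer=["4"] * len(group))
    print(rewards)
    print(group_relative_advantages(rewards))
```

In TRL, a function with this shape can typically be passed to `GRPOTrainer` through its `reward_funcs` argument; the trainer then performs the group-wise normalization internally, so `group_relative_advantages` above is only there to make that step visible.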

Python · Emerging

Stars: 9
Forks: 1
Contributors: 1
Last push: 12 months ago

Recent commits

93e019e  Add files via upload (TangBaron, 12 months ago)
8f76abc  Initial commit (TangBaron, 12 months ago)
