

PeRL: Parameter-Efficient Reinforcement Learning

Shell · Emerging

Stars: 80
Forks: 12
Contributors: 7
Last push: 1mo ago

Recent commits

Latest commits.

Merge pull request #40 from MikaStars39/feature/async-perf-p0-p1
6c9f85a · MikaStar39 · 1mo ago

add nanoeval evaluation module with backend, reward, and utils
59e0b89 · MikaStars39 · 1mo ago

[chore] remove deprecated qwen experiment scripts
9cd8ab4 · MikaStars39 · 2mo ago

[refactor] rename hf2mgt to hf2mcore and add mcore2hf converter
e4ad447 · MikaStars39 · 2mo ago

[doc] add OOM recovery and parallelism change guide to SFT README
4114e9a · MikaStars39 · 2mo ago

[perf] increase max-tokens-per-gpu to 128k for better B300 utilization
ac3bbe3 · MikaStars39 · 2mo ago

[fix] add B300 (sm_103a) compatibility fixes to SFT script
db150bb · MikaStars39 · 2mo ago

[fix] switch wandb to offline mode in B300 SFT script
f141f9b · MikaStars39 · 2mo ago

Top contributors

Builders behind this project.

shangshang-wang