Learn an image-augmentation policy with RL (group-relative policy gradient — the GRPO trick) and honestly prove whether it beats RandAugment/TrivialAugment. Agent-drivable --json CLI; verifiable reward = held-out accuracy. RL × computer-vision.
Latest commits.
Builders behind this project.