Can policy evaluation be automated, or will it inevitably produce hallucinated AI slop? We are trying to find out.