Python
Simulated SRE incident-response benchmark for reliable tool-using agents.