evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Other · Emerging

Stars: 1
Forks: —
Contributors: 8
Last push: 38mo ago

Recent commits

Latest commits.

[evals] simplify and extend modelgraded evals (#804)
4ee127c · Shane Gu · 38mo ago
Add SVG understanding eval (#786)
ef1f0eb · Josh Gruenstein · 38mo ago
Add test for FuzzyMatch (#802)
53836bf · Alvin Wang · 38mo ago
[Evals] If ideal is not list, cast as list (#800)
004d681 · Andrew Kondrich · 38mo ago
Compare countries by area (#623)
24dae81 · Yohei Inui · 38mo ago
Update build-eval.md (#524)
ef21a17 · Joe Devon · 38mo ago
eval: categorize with distractors (contextual bias) (#551)
7c68139 · yuvalshirav · 38mo ago
add eval: knot-theory (#704)
39e98b0 · Matthew Haigh · 38mo ago

Top contributors

Builders behind this project.

logankilpatrick