mc-multiplayer-eval
daohanlu/mc-multiplayer-eval
Python
Emerging
GitHub
Stars: 2
Forks: 1
Contributors: 2
Last push: 3mo ago
Recent commits
Latest commits.
add evaluation results and supporting docs (a9938a2, Fred Lu, 3mo ago)
report_real_episode_accuracy.py (3e64e60, Fred Lu, 4mo ago)
renamed example generations (02dab36, Fred Lu, 4mo ago)
updated model names mapping; removed rounding; now uses "long" look-away evals (27d6e77, Fred Lu, 5mo ago)
eval dataset and generation match for "..._long" datasets (e5dc076, Fred Lu, 5mo ago)
added 2 new dataset types to existing handlers; results will be saved separately (de373c0, Fred Lu, 5mo ago)
make run all evals support custom models (d30be44, Georgy Savva, 5mo ago)
add VLM accuracy eval for one looks away, both look away (e321204, Georgy Savva, 5mo ago)
Top contributors
Builders behind this project.
daohanlu: 30 commits
georgysavva: 14 commits