LoreonLabsPlatform

mc-multiplayer-eval

daohanlu/mc-multiplayer-eval

Python · Emerging

Stars: 2
Forks: 1
Contributors: 2
Last push: 3mo ago

Recent commits

Latest commits.

add evaluation results and supporting docs
a9938a2 · Fred Lu · 3mo ago

report_real_episode_accuracy.py
3e64e60 · Fred Lu · 4mo ago

renamed example generations
02dab36 · Fred Lu · 4mo ago

updated model names mapping; removed rounding; now uses "long" look-away evals
27d6e77 · Fred Lu · 5mo ago

eval dataset and generation match for "..._long" datasets
e5dc076 · Fred Lu · 5mo ago

added 2 new dataset types to existing handlers; results will be saved separately
de373c0 · Fred Lu · 5mo ago

make run all evals support custom models
d30be44 · Georgy Savva · 5mo ago

add VLM accuracy eval for one looks away, both look away
e321204 · Georgy Savva · 5mo ago

Top contributors

Builders behind this project.