weicj-vLLM-2080Ti-Definitive

The definitive vLLM runtime for dual RTX 2080 Ti 22GB + NVLink, delivering 27B/31B local inference with 100+ tok/s single-request decode and native 262K context.

Other · Emerging

GitHub

Last push: 13d ago

Recent commits

docs: move chart values above bars
38ed661 · weicj · 14d ago
docs: show KV throughput charts in README
4872409 · weicj · 14d ago
docs: show values on SVG throughput charts
1e680ad · weicj · 14d ago
docs: restore clean SVG throughput charts
0d6ea65 · weicj · 14d ago
docs: use PNG throughput charts
6a7a557 · weicj · 14d ago
