vllm-jukebox

A server that multiplexes multiple LLMs through vLLM backends, with automatic model swapping, multi-GPU scheduling, and graceful request draining.

Go · Emerging · inference · vllm · vllm-jukebox

Stars: 5
Forks: 2
Contributors: 1
Last push: 28d ago

Recent commits

Latest commits.

Add first-class llama.cpp runtime support (swap mode) (#2)
4f36f30 · Eran Sandler · 28d ago

feat: add Anthropic protocol support (/v1/messages endpoint) (#1)
995904b · Eran Sandler · 6mo ago

docs: add installation section with releases download
5d4b526 · Eran Sandler · 6mo ago

fix(ci): remove coverage profile to fix test exit code
842025f · Eran Sandler · 6mo ago

ci: add GitHub Actions for CI and releases
2c911d1 · Eran Sandler · 6mo ago

test(vllm): add log file integration tests
30e4aae · Eran Sandler · 6mo ago

docs: add vLLM log file documentation
4e95ba1 · Eran Sandler · 6mo ago

docs(config): add log file examples to config files
bf15251 · Eran Sandler · 6mo ago

Top contributors

Builders behind this project.