vllm-jukebox

Server that multiplexes several LLMs through vLLM backends, with automatic model swapping, multi-GPU scheduling, and graceful request draining.

Go · Emerging · inference · vllm · vllm-jukebox
GitHub

  • Stars: 5
  • Forks: 2
  • Contributors: 1
  • Last push: 28d ago

Recent commits

Latest commits.

  • Add first-class llama.cpp runtime support (swap mode) (#2)
    4f36f30 · Eran Sandler · 28d ago
  • feat: add Anthropic protocol support (/v1/messages endpoint) (#1)
    995904b · Eran Sandler · 6mo ago
  • docs: add installation section with releases download
    5d4b526 · Eran Sandler · 6mo ago
  • fix(ci): remove coverage profile to fix test exit code
    842025f · Eran Sandler · 6mo ago
  • ci: add GitHub Actions for CI and releases
    2c911d1 · Eran Sandler · 6mo ago
  • test(vllm): add log file integration tests
    30e4aae · Eran Sandler · 6mo ago
  • docs: add vLLM log file documentation
    4e95ba1 · Eran Sandler · 6mo ago
  • docs(config): add log file examples to config files
    bf15251 · Eran Sandler · 6mo ago

Top contributors

Builders behind this project.

  • erans: 57 commits