vllm-dev

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · Emerging
Stars: 1
Forks: n/a
Contributors: 8
Last push: 22h ago

Recent commits

  • [Frontend] Remove AsyncMicrobatchTokenizer. (#45759)
    c4fd979 · wang.yuqi · 23h ago
  • [Bugfix] Prevent cuMemcpyBatchAsync segfault with MTP and KV offloading (#44784)
    7ad894c · joshua abraham · 23h ago
  • [CPU] Support Gemma Diffusion (#45690)
    a7fdfee · Li, Jiang · 1d ago
  • [Bug Fix] Allow pinned memory for WSL2 (#41496)
    8bf3749 · Jimmy Lee · 1d ago
  • [Cleanup] Remove dead env (#45777)
    9096659 · Cyrus Leung · 1d ago
  • [Misc] Added validation for Cohere /v2/embed input field exclusivity (#45640)
    81d8f4e · Taneem Ibrahim · 1d ago
  • Register parsed config classes before tokenizer init (#40299)
    a9a8a32 · Andrew Barnes · 1d ago
  • [Core] Use fastsafetensors ParallelLoader for weight loading (#40183)
    9d808e2 · gitbisector · 1d ago

Top contributors

Builders behind this project.

  • DarkLight1337: 897 commits
  • WoosukKwon: 800 commits
  • mgoin: 537 commits
  • hmellor: 501 commits
  • youkaichao: 471 commits
  • Isotr0py: 415 commits
  • njhill: 382 commits
  • yewentao256: 332 commits