vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Other · Emerging
Contributors: 8
Last push: 7mo ago

Recent commits

Latest commits.

  • [Model][Perf] Use cos and sin cache in QwenVL (#28798)
    b9489f5 · Canlin Guo · 7mo ago
  • [Bugfix] Safeguard against missing backend in AttentionBackendEnum (#28846)
    285eaa4 · Song Zhixin · 7mo ago
  • [BugFix] Fix PP/async scheduling with pooling models (#28899)
    4393684 · Nick Hill · 7mo ago
  • [CI/Build] Replace wikipedia url with local server ones (#28908)
    896e41a · Isotr0py · 7mo ago
  • [MISC] Remove format.sh (#28906)
    5bb1da5 · Kuntai Du · 7mo ago
  • [CI] Fix async scheduling + spec decoding test flake (#28902)
    5bdd155 · Nick Hill · 7mo ago
  • [Misc] Remove unnecessary parentheses from log statements (#28897)
    0168f69 · Ning Xie · 7mo ago
  • [Doc]: fix typos in various files (#28863)
    083cf32 · Didier Durand · 7mo ago

Top contributors

Builders behind this project.

  • WoosukKwon: 674 commits
  • DarkLight1337: 663 commits
  • youkaichao: 468 commits
  • mgoin: 413 commits
  • hmellor: 332 commits
  • Isotr0py: 304 commits
  • njhill: 249 commits
  • jeejeelee: 233 commits