vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python | Emerging

Stars: n/a
Forks: n/a
Contributors: 8
Last push: 9h ago

Recent commits

[Rust Frontend] Add CORS support (#45753)
cca3365 | Tahsin Tunan | 15h ago

[CI] Fix attention benchmark smoke test (#45728)
040df8f | Matthew Bonanni | 15h ago

[Perf] Add VLLM_TRITON_FORCE_FIRST_CONFIG to skip Triton autotuning (#42425)
ced32bb | Francesco Fusco | 16h ago

[Bugfix][MoE] Restore routed output unpadding before shared expert add (#45707)
c5e5c33 | Netanel Haber | 16h ago

[Quant] Support modelopt_mixed on Ampere (SM80/SM86) (#45306)
a8c86ee | Mike G | 16h ago

[ROCm][CI] Gate incompatible HF references on Transformers v5 (#41532)
7e179e4 | Andreas Karatzas | 16h ago

[ZenCPU] Add zencpu Platform Runtime Logging and Docs (#42726)
405c7cf | Lalithnarayan C | 17h ago

[Refactor] Remove `Fp8OnlineLinearMethod` as scheduled (#45463)
3f53e21 | Wentao Ye | 17h ago

Top contributors

Builders behind this project.