LoreonLabsPlatform

vllm-dev

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · Emerging

Stars: 1
Forks: n/a
Contributors: 8
Last push: 22h ago

Recent commits

Latest commits.

[Frontend] Remove AsyncMicrobatchTokenizer. (#45759)
c4fd979 · wang.yuqi · 23h ago

[Bugfix] Prevent cuMemcpyBatchAsync segfault with MTP and KV offloading (#44784)
7ad894c · joshua abraham · 23h ago

[CPU] Support Gemma Diffusion (#45690)
a7fdfee · Li, Jiang · 1d ago

[Bug Fix] Allow pinned memory for WSL2 (#41496)
8bf3749 · Jimmy Lee · 1d ago

[Cleanup] Remove dead env (#45777)
9096659 · Cyrus Leung · 1d ago

[Misc] Added validation for Cohere /v2/embed input field exclusivity (#45640)
81d8f4e · Taneem Ibrahim · 1d ago

Register parsed config classes before tokenizer init (#40299)
a9a8a32 · Andrew Barnes · 1d ago

[Core] Use fastsafetensors ParallelLoader for weight loading (#40183)
9d808e2 · gitbisector · 1d ago

Top contributors

Builders behind this project.