Loreon Labs Platform
aphrodite-engine
Large-scale LLM inference engine
C++
Emerging
Contributors: 8
Last push: 16mo ago
Recent commits
Latest commits.
fix bugs to load llama 405B by vptq quant method (0ccf11b, wejoncy, 16mo ago)
fix: invalid args passed to `_proces_request` in `encode()` (#1239) (5efd3fb, AlpinDale, 16mo ago)
CPU: add support for AWQ quants (slow) (#1238) (417cf26, AlpinDale, 16mo ago)
fix: access `get_vocab` instead of `vocab` in tool parsers (#1237) (7b317a1, AlpinDale, 16mo ago)
lora: abstract away the punica wrapper (#1236) (e21f28b, AlpinDale, 16mo ago)
lora: clean up the punica function interface (#1235) (1073b03, AlpinDale, 16mo ago)
lora: move the bias implementation to punica.py (#1234) (69e55d3, AlpinDale, 16mo ago)
fix: lora weight sharding in `ColumnParallelLinearWithLoRA` (#1233) (cdbf421, AlpinDale, 16mo ago)
Top contributors
Builders behind this project.
AlpinDale: 1.1K commits
50h100a: 52 commits
StefanGliga: 20 commits
g4rg: 9 commits
WolframRavenwolf: 4 commits
sgsdxzy: 4 commits
ahme-dev: 3 commits
Naomiusearch: 3 commits