aphrodite-engine

Large-scale LLM inference engine

C++ · Emerging
  • Stars: n/a
  • Forks: n/a
  • Contributors: 8
  • Last push: 16mo ago

Recent commits

  • fix bugs to load llama 405B by vptq quant method
    0ccf11b · wejoncy · 16mo ago
  • fix: invalid args passed to `_proces_request` in `encode()` (#1239)
    5efd3fb · AlpinDale · 16mo ago
  • CPU: add support for AWQ quants (slow) (#1238)
    417cf26 · AlpinDale · 16mo ago
  • fix: access `get_vocab` instead of `vocab` in tool parsers (#1237)
    7b317a1 · AlpinDale · 16mo ago
  • lora: abstract away the punica wrapper (#1236)
    e21f28b · AlpinDale · 16mo ago
  • lora: clean up the punica function interface (#1235)
    1073b03 · AlpinDale · 16mo ago
  • lora: move the bias implementation to punica.py (#1234)
    69e55d3 · AlpinDale · 16mo ago
  • fix: lora weight sharding in `ColumnParallelLinearWithLoRA` (#1233)
    cdbf421 · AlpinDale · 16mo ago

Top contributors

Builders behind this project.

  • AlpinDale: 1.1K commits
  • 50h100a: 52 commits
  • StefanGliga: 20 commits
  • g4rg: 9 commits
  • WolframRavenwolf: 4 commits
  • sgsdxzy: 4 commits
  • ahme-dev: 3 commits
  • Naomiusearch: 3 commits