llm-d

llm-d is a Kubernetes-native, high-performance distributed LLM inference framework.

Shell | Emerging
Stars: n/a
Forks: n/a
Contributors: 8
Last push: 2h ago

Recent commits

Latest commits.

  • Encode disaggregation guide (#1614)
    3beb52e, Alexey Roytman, 6h ago
  • Fix multimodal path references and Update Guide with embedding cache (vision encoder) cache routing (#1823)
    a34bbd8, Rahul Gurnani, 6h ago
  • DP-aware scheduling on GKE clusters (#1679)
    e45f87f, Tessa Pham, 6h ago
  • build new llm-d-cuda image with vLLM v0.22.1 (#1769)
    043e090, Tessa Pham, 10h ago
  • Add V6 tiered-prefix-cache recipe for TPU benchmark (#1815)
    0fc4f4e, Radhika Lakhtakia, 12h ago
  • agentic-serving: token-load router + offload-aware prefix cache + calibration (#1817)
    c32b4a0, kaushik mitra, 12h ago
  • removing helmfile and helmdiff plugin from deps (#1825)
    41708cd, kapil jain, 14h ago
  • Add UCCL transport to images (#1377)
    cfba82a, Pravein Govindan Kannan, 15h ago

Top contributors

Builders behind this project.

  • Gregory-Pereira: 102 commits
  • clubanderson: 71 commits
  • lionelvillard: 61 commits
  • robertgshaw2-redhat: 59 commits
  • ahg-g: 52 commits
  • diegocastanibm: 45 commits
  • maugustosilva: 44 commits
  • liu-cong: 39 commits