FlashInfer: Kernel Library for LLM Serving
Python
Emerging
Contributors: 8
Last push: 1mo ago
Recent commits
Latest commits.
ce43023 feat(cute_dsl/moe): deterministic balanced autotune profile inputs (#3286) (Lee Nau, 1mo ago)
b0ec700 feat(trace): embed runnable init() in every TraceTemplate (#3221) (eigen, 1mo ago)
de541f1 fix(CI unit tests, cute_dsl, spark): set USER env var before torch._dynamo import for unmapped UIDs (#3314) (Ka-Hyun Nam, 1mo ago)
a73c281 feat: Reland support lse in trtllm paged attn kernels (#3116) (Matt Murphy, 1mo ago)
719ee23 feat: Expose unpacked topk weights for routed moe (fp4) (#2425) (Alex Yang, 1mo ago)
a1b8a60 Use cudnn 9.23 new API to query workspace with override shape (#3291) (yanqinz2, 1mo ago)
7d1d46e feat(logging,trace): cuda-graph-compatible level-5/10 logging + fi_trace template additions/fixes (#3172) (eigen, 1mo ago)
ef98312 fix typo llama routing issue in trtllm-gen moe (#3303) (Lain, 1mo ago)
Top contributors
Builders behind this project.
yzh119: 948 commits
bkryu: 100 commits
yongwww: 60 commits
yyihuang: 53 commits
abcdabcd987: 41 commits
cyx-6: 40 commits
aleozlx: 38 commits
MasterJH5574: 37 commits