flashinfer
FlashInfer: Kernel Library for LLM Serving
Other
Emerging
Stars: not reported
Forks: not reported
Contributors: 8
Last push: 11mo ago
Recent commits
Latest commits.
c261e97  cudnn: Add native cudnn_decode for improved cudnn decode performance (#1283)  (Anerudhan Gopal, 11mo ago)
1b83168  Refactor Fused Moe Module (#1309)  (Shu Wang, 11mo ago)
ece1689  bugfix: gen_trtllm_comm_module: fix device capability detection (#1356)  (Daniele, 11mo ago)
e458896  minor: more informative error message for buffer overflow (#1357)  (Wenxuan Tan, 11mo ago)
2fe5331  cleanup: retire aot-build-utils (#1354)  (Zihao Ye, 11mo ago)
68d1608  minor: add trtllm_gen_mla benchmark (#1316)  (eigen, 11mo ago)
12d48c6  chore: remove cpp benchmarks, tests, cmake path, as they are deprecated (#1345)  (Julien Debache, 11mo ago)
bf3445f  feat: Support logits_soft_cap for Persistent attn; fix kv split limit (#1324)  (Wenxuan Tan, 11mo ago)
Top contributors
Builders behind this project.
yzh119: 819 commits
abcdabcd987: 41 commits
MasterJH5574: 33 commits
zhyncs: 33 commits
yyihuang: 25 commits
nandor: 17 commits
xslingcn: 16 commits
cyx-6: 14 commits