FlashInfer: Kernel Library for LLM Serving
Python
Emerging
Contributors: 8
Last push: 2mo ago
Recent commits
e6ac7cc  fix(cute_dsl/moe): make autotuner bucket configuration adapt to runtime input (#3216)  (Lee Nau, 2mo ago)
cb44e7d  bump version to 0.6.11 (#3245)  (Alex Yang, 2mo ago)
d885b71  fix(mla): widen page index to int64_t to avoid 32-bit overflow (#3136)  (Qi Zhang (qizh), 2mo ago)
89af11c  cute-dsl fmha prefill (cubin integration): remove front-padding, add attention_sink, and pdl support (#3181)  (Li Min, 2mo ago)
417e59f  Support Sigmoid (sigmoid+topk) routing function (#2869)  (EdalatiAli, 2mo ago)
979644f  cute_dsl/moe: drop redundant Python-side moe_sort buffer init (#3226)  (Lee Nau, 2mo ago)
55a9eea  fix: add jitter to cubin download backoff (#3169)  (Paul Luh, 2mo ago)
2bb1f85  fix: fused MoE autotuning correctness issues by filtering clusterDimZ (#3227)  (Wei Zhao, 2mo ago)
Top contributors
Builders behind this project.
yzh119: 948 commits
bkryu: 100 commits
yongwww: 60 commits
yyihuang: 51 commits
abcdabcd987: 41 commits
cyx-6: 40 commits
MasterJH5574: 37 commits
aleozlx: 36 commits