tokenspeed
TokenSpeed is a speed-of-light LLM inference engine.
Other
Emerging
Contributors: 8
Last push: 14d ago
Recent commits
Latest commits.
feat: reduce DeepSeek V4 prefix state snapshots with replay reuse (#329) | 1176731 | Simon_CQK | 14d ago
perf: add Gluon MoE kernels for GPT-OSS (#314) | 1780377 | Kyle Wang | 15d ago
perf(deepseek-v4): decode attention optimizations (#339) | b80268f | dongjiyingdjy | 15d ago
Defer perf reference failure reporting (#340) | e26e7a1 | Yineng Zhang | 15d ago
perf(kernel): optimize mha kernel for sliding window case (#336) | 94ccce6 | Pengzhan Zhao | 15d ago
Fix(spec decode): catch up trim bug (#335) | 15385b2 | Hongbin Zhong | 15d ago
fix(PD): fix PD speculative bootstrap input seeding (#286) | 58a5993 | Xuchun Shang | 15d ago
chore: use -O3 -use_fast_math for tokenspeed_kernel compilation (#285) | 1e2ab88 | Enwei Zhu | 16d ago
Top contributors
Builders behind this project.
zhyncs: 80 commits
borontion: 17 commits
syuoni: 17 commits
dongjiyingdjy: 13 commits
lightseek-bot: 12 commits
antiagainst: 11 commits
qywu: 10 commits
minedec: 9 commits