Loreon
Labs
Platform
sgl-attn
Fast and memory-efficient exact attention
Other
Emerging
GitHub
Stars: —
Forks: —
Contributors: 8
Last push: 10mo ago
Recent commits
Latest commits.
b8eb683  [Cute] Update to nvidia-cutlass-dsl==4.1.0 (Tri Dao, 11mo ago)
d6dbdaf  [Cute] Simplify some variables, be more careful about self.q_stage (Tri Dao, 11mo ago)
7337307  [Cute] Use kv_stage=3 for hdim (192,128) (Tri Dao, 11mo ago)
1b36ab1  [Cute] Support hdim (192,128) (Tri Dao, 11mo ago)
1a15733  [Cute] Support hdim_v != hdim_qk (Tri Dao, 11mo ago)
413d07e  [AMD ROCm] Fix compilation issue in gfx942 (#1787) (rocking, 11mo ago)
7321879  Bump to v2.8.2 (Tri Dao, 11mo ago)
24f0957  Revert "[BE] Better compress flash attention binaries (#1744)" (#1751) (One, 11mo ago)
Top contributors
Builders behind this project.
tridao: 727 commits
piercefreeman: 25 commits
ksivaman: 18 commits
ipiszy: 17 commits
DanFu09: 11 commits
rocking5566: 10 commits
drisspg: 6 commits
danthe3rd: 6 commits