flash-attention
Fast and memory-efficient exact attention
Other
Emerging
GitHub
Contributors: 8
Last push: 21mo ago
Recent commits
Latest commits.
30e1ef0  minify torch.torch.int32 to torch.int32 (#1237)  (Zhihao Shen, 21mo ago)
83e41b3  Add custom ops for compatibility with PT Compile (#1139)  (Antoni Viros, 21mo ago)
af314d4  Merge pull request #1182 from ipiszy/used_q  (Ying Zhang, 21mo ago)
8cbc8a0  small fixes  (Ying Zhang, 21mo ago)
cdbbe84  minor changes to unpad_input test util func  (Ying Zhang, 21mo ago)
db80387  Add seqused_q in fwd / bwd and seqused_k in bwd.  (Ying Zhang, 22mo ago)
e2182cc  Support page kvcache in AMD ROCm (#1198)  (rocking, 21mo ago)
cc1690d  [Rotary] Add test for rotary when qkv are packed an there's GQA  (Tri Dao, 21mo ago)
Top contributors
Builders behind this project.
tridao: 510 commits
piercefreeman: 25 commits
ksivaman: 13 commits
DanFu09: 11 commits
ipiszy: 11 commits
tmm1: 4 commits
drisspg: 4 commits
lucidrains: 4 commits