sgl-attn

Fast and memory-efficient exact attention

Other · Emerging

Stars: —
Forks: —
Contributors: 8
Last push: 10mo ago

Recent commits

Latest commits.

[Cute] Update to nvidia-cutlass-dsl==4.1.0
b8eb683 · Tri Dao · 11mo ago

[Cute] Simplify some variables, be more careful about self.q_stage
d6dbdaf · Tri Dao · 11mo ago

[Cute] Use kv_stage=3 for hdim (192,128)
7337307 · Tri Dao · 11mo ago

[Cute] Support hdim (192,128)
1b36ab1 · Tri Dao · 11mo ago

[Cute] Support hdim_v != hdim_qk
1a15733 · Tri Dao · 11mo ago

[AMD ROCm] Fix compilation issue in gfx942 (#1787)
413d07e · rocking · 11mo ago

7321879 · Tri Dao · 11mo ago

Revert "[BE] Better compress flash attention binaries (#1744)" (#1751)
24f0957 · One · 11mo ago

Top contributors

Builders behind this project.