sgl-attn

Fast and memory-efficient exact attention

Other · Emerging
GitHub

  • Stars: n/a
  • Forks: n/a
  • Contributors: 8
  • Last push: 10mo ago

Recent commits

Latest commits.

  • [Cute] Update to nvidia-cutlass-dsl==4.1.0
    b8eb683 · Tri Dao · 11mo ago
  • [Cute] Simplify some variables, be more careful about self.q_stage
    d6dbdaf · Tri Dao · 11mo ago
  • [Cute] Use kv_stage=3 for hdim (192,128)
    7337307 · Tri Dao · 11mo ago
  • [Cute] Support hdim (192,128)
    1b36ab1 · Tri Dao · 11mo ago
  • [Cute] Support hdim_v != hdim_qk
    1a15733 · Tri Dao · 11mo ago
  • [AMD ROCm] Fix compilation issue in gfx942 (#1787)
    413d07e · rocking · 11mo ago
  • Bump to v2.8.2
    7321879 · Tri Dao · 11mo ago
  • Revert "[BE] Better compress flash attention binaries (#1744)" (#1751)
    24f0957 · One · 11mo ago

Top contributors

Builders behind this project.

  • tridao: 727 commits
  • piercefreeman: 25 commits
  • ksivaman: 18 commits
  • ipiszy: 17 commits
  • DanFu09: 11 commits
  • rocking5566: 10 commits
  • drisspg: 6 commits
  • danthe3rd: 6 commits