flash-attention

Fast and memory-efficient exact attention

Python · Emerging
GitHub

  • Stars: —
  • Forks: —
  • Contributors: 8
  • Last push: 4mo ago

Recent commits

Latest commits.

  • basics working (#2070)
    ac9b5f1 · Driss Guessous · 6mo ago
  • [FIRST] Fix softcap scoremod kwargs typo. (#2072)
    0a5339f · Leo Dong · 6mo ago
  • [CUTE] Seeing if tvvm reduces cpu overhead (#2042)
    179f793 · Driss Guessous · 6mo ago
  • [Cute,Fwd] Extend score_mod to variable sequence length (#2043)
    fd8d5eb · Reuben Stern · 6mo ago
  • [CUTE] Allow grads to be preallocated (#2065)
    e240e0f · Driss Guessous · 6mo ago
  • Fix use-after-free in FA3 deterministic mode. The pytorch caching allocator actually saves us here, but if you turn it off, then compute-sanitizer will detect this. (#2063)
    bc0e4ac · skarupke · 6mo ago
  • fixing cute bwd func def (#2056)
    c783ab2 · liangel-02 · 6mo ago
  • [AMD ROCm] Update to latest composable_kernel to improve performance (#2052)
    6328432 · rocking · 6mo ago

Top contributors

Builders behind this project.

  • tridao: 834 commits
  • drisspg: 28 commits
  • piercefreeman: 25 commits
  • guilhermeleobas: 18 commits
  • ksivaman: 18 commits
  • ipiszy: 17 commits
  • jayhshah: 13 commits
  • DanFu09: 11 commits