cutlass
CUDA Templates for Linear Algebra Subroutines
C++
Emerging
GitHub
Contributors: 8
Last push: 1mo ago
Recent commits
Latest commits.
ef120d0  update to 4.5 (#3228) (Haicheng Wu, 1mo ago)
c775e56  fix: exclude SM70/72 from CUTLASS_NVCC_ARCHS_SUPPORTED on CUDA >= 13.0 (#3166) (Vensen, 1mo ago)
2e56847  Add Snake activation functor for EVT (#3184) (Emre Albayrak, 1mo ago)
1d9e1f6  [CuTeDSL] Fix loop carried target scope (#3200) (TungtungQia, 1mo ago)
ae6bccf  [CuTeDSL] Update atomic_max_float32 to atomic_fmax in blockscaled GEMM example (#3206) (questa-quan-wang, 1mo ago)
cb37157  v4.5 tag update (#3202) (Junkai-Wu, 2mo ago)
f74fea9  [Hopper CuTeDSL] Add FP8 GEMM with 2xAcc (#3149) (Johnsonms, 2mo ago)
7a9fe05  fix: Add missing kElementsPerAccess division in RegularTileIterator store (#3049) (Blake Ledden, 2mo ago)
Top contributors
Builders behind this project.
hwu36: 127 commits
kerrmudgeon: 69 commits
Junkai-Wu: 30 commits
fengxie: 23 commits
ANIKET-SHIVAM: 21 commits
yzhaiustc: 18 commits
reed-lau: 17 commits
dumerrill: 16 commits