Megatron-LM
Ongoing research training transformer models at scale
Other
Emerging
GitHub
Contributors: 8
Last push: 40mo ago
Recent commits
Latest commits.
285068c  Merge branch 'tridao-flashattn' into 'main' (Jared Casper, 42mo ago)
c92f10b  Merge branch 'main' into tridao-flashattn (Jared Casper, 42mo ago)
9200e43  Remove FA's check for headdim <= 128 (Tri Dao, 42mo ago)
b707199  Merge branch 'skyw/fix_zeroshot_eval_script' into 'main' (Jared Casper, 42mo ago)
e1c334b  Merge branch 'transformer_engine_rebase' into 'main' (John Kamalu, 43mo ago)
3499542  Transformer Engine Integration Rebase (John Kamalu, 43mo ago)
8ed3887  remove mpu dependency in zeroshot script (Hao Wu, 43mo ago)
d693034  Integrate FlashAttention into Megatron-LM (Tri Dao, 43mo ago)
Top contributors
Builders behind this project.
jaredcasper: 351 commits
shoeybi: 338 commits
lmcafee-nvidia: 182 commits
kvareddy: 114 commits
mpatwary: 85 commits
RPrenger: 81 commits
zliucr: 78 commits
deepakn94: 66 commits