elastic
PyTorch elastic training
Other
Emerging
GitHub
Contributors
8
Last push
68mo ago
Recent commits
Latest commits.
Add error propagation to local scheduler
a03d13c
Aliaksandr Ivanou
68mo ago
Make subprocess handler use popen instead of inherit from it
78cd3eb
Aliaksandr Ivanou
68mo ago
Refactor error-handler and torchelastic to make it return-based when errors occur rather than exception-based
d98ff60
Aliaksandr Ivanou
68mo ago
pass THRIFT_TLS_CL_* env vars to role replica subprocesses
301c317
Kiuk Chung
68mo ago
Make app_id unique for submit_dryrun rather than just returning a name template
fa526d8
Kiuk Chung
68mo ago
make default session name be tsm_[session_backend]_[username]
118c86a
Kiuk Chung
68mo ago
Enable Pyre's source-db buck builder and auto-suppress errors - batch 8.
1b8ca31
Pradeep Kumar Srinivasan
68mo ago
Fix circle CI breakage by depending on torch-1.8.0dev (nightly) (#132)
9b8a890
Kiuk Chung
68mo ago
Top contributors
Builders behind this project.
aivanou
60 commits
isunjin
7 commits
Jeffwan
5 commits
StanislavGlebik
2 commits
vreis
2 commits
amyreese
1 commit
jspisak
1 commit
mannatsingh
1 commit