toy-attention-residuals
eevaain/toy-attention-residuals
Python
Emerging
GitHub
Stars: 1
Forks: —
Contributors: 1
Last push: 3mo ago
Recent commits
Latest commits.
added prenorms to mha and mlp layers, added back toggle for standard residual connections, added sinusoidal positional embeddings, training loop refactor
a6ea022
Evan Lin
3mo ago
attnres works!
63c5c1f
Evan Lin
3mo ago
fixed mha, added another norm, next step to fully plumb in attnRes. also realized paper treats each selfattn or mlp as an individual layer. means in my fullattnreslayer class i need 2 query weight vecs
2e73b54
Evan Lin
3mo ago
added mha
e29ae5d
Evan Lin
3mo ago
layer 1 gradients increase after using attnres compared to w/o. going to add positional enc later to see how that changes stuff. see how that affects loss.
73877d0
Evan Lin
3mo ago
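The latest commit above mentions sinusoidal positional embeddings. As a rough illustration only, here is a minimal pure-Python sketch of the standard fixed sine/cosine formulation. It is not taken from the repository: the function name `sinusoidal_positions`, its parameters, and the `base` default of 10000 are assumptions made for this example, and the project itself may implement this differently (for instance with tensors).

```python
import math


def sinusoidal_positions(seq_len, d_model, base=10000.0):
    """Return a seq_len x d_model table of fixed sinusoidal position encodings.

    Hypothetical helper for illustration; not the repository's code.
    Even columns hold sin(pos / base**(i / d_model)), odd columns the
    matching cosine, where i is the even column index.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even for paired sin/cos columns")

    table = []
    for pos in range(seq_len):
        row = [0.0] * d_model
        for i in range(0, d_model, 2):
            # Wavelength grows geometrically with the column index.
            angle = pos / (base ** (i / d_model))
            row[i] = math.sin(angle)
            row[i + 1] = math.cos(angle)
        table.append(row)
    return table


if __name__ == "__main__":
    pe = sinusoidal_positions(seq_len=4, d_model=8)
    # Each row would typically be added to the token embedding at that position.
    print(len(pe), len(pe[0]))
```

In a model, such a table is usually computed once and added to the token embeddings before the first attention layer, so it carries no trainable parameters.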
Top contributors
Builders behind this project.
eevaain
5 commits