toy-attention-residuals

eevaain/toy-attention-residuals

Python · Emerging

Stars: 1
Forks: none listed
Contributors: 1
Last push: 3mo ago

Recent commits

added prenorms to mha and mlp layers, added back toggle for standard residual connections, added sinusoidal positional embeddings, training loop refactor
a6ea022 · Evan Lin · 3mo ago
attnres works!
63c5c1f · Evan Lin · 3mo ago
fixed mha, added another norm, next step is to fully plumb in attnRes. also realized the paper treats each selfattn or mlp as an individual layer, which means my fullattnreslayer class needs 2 query weight vecs
2e73b54 · Evan Lin · 3mo ago
added mha
e29ae5d · Evan Lin · 3mo ago
layer 1 gradients increase after using attnres compared to w/o. going to add positional enc later to see how that affects loss.
73877d0 · Evan Lin · 3mo ago
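
The commit messages mention a per-sublayer query weight vector and a toggle back to standard residual connections. The sketch below is one plausible reading of that idea, not the repository's code: each sublayer scores all earlier outputs with its own query vector and takes a softmax-weighted mix of them, while the toggle falls back to the plain sum that a standard residual stream produces. The function names, the RMS normalization of the keys, and the pure-Python vectors are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rms_norm(vec, eps=1e-6):
    """Scale a vector to unit root-mean-square (no learned gain)."""
    rms = math.sqrt(sum(x * x for x in vec) / len(vec) + eps)
    return [x / rms for x in vec]


def mix_previous_outputs(prev_outputs, query, use_attn_res=True):
    """Build the input of the next sublayer from all earlier outputs.

    prev_outputs: list of vectors (token embedding first, then the output
        of every earlier attention or MLP sublayer).
    query: hypothetical learned weight vector owned by the consuming sublayer.
    use_attn_res: when False, return the plain sum, which is what a
        standard residual stream accumulates.
    """
    dim = len(prev_outputs[0])
    if not use_attn_res:
        return [sum(v[d] for v in prev_outputs) for d in range(dim)]
    # Score each earlier output by the dot product of the query with its
    # normalized copy (the normalization is an assumption of this sketch).
    scores = [
        sum(q * k for q, k in zip(query, rms_norm(v))) for v in prev_outputs
    ]
    weights = softmax(scores)
    return [
        sum(w * v[d] for w, v in zip(weights, prev_outputs))
        for d in range(dim)
    ]


# A block with an attention sublayer and an MLP sublayer would need two
# query vectors, one per sublayer (both hypothetical here).
history = [[1.0, 0.0], [0.0, 1.0]]
attn_query = [0.0, 0.0]
mlp_query = [2.0, -2.0]
attn_input = mix_previous_outputs(history, attn_query)
mlp_input = mix_previous_outputs(history, mlp_query)
standard_input = mix_previous_outputs(history, attn_query, use_attn_res=False)
```

In this sketch an all-zero query gives every earlier output the same weight, so the mix is their average, whereas the standard-residual path returns their sum.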
