Deriving the gradient for the backward pass of Layer Normalization

shreyansh26.github.io

3 points by shreyansh26 a day ago