Deriving the gradient for the backward pass of Layer Normalization (shreyansh26.github.io)
3 points by shreyansh26 2 hours ago | 0 comments
31. 1 point by shaduadiale 2 hours ago | 1 comment
32. 1 point by grayfox777 2 hours ago | 0 comments
33. 2 points by nativeit 2 hours ago | 0 comments
34. 1 point by benedictowusu 2 hours ago | 0 comments
35. 2 points by aspenmayer 2 hours ago | 1 comment
36. 7 points by thunderbong 2 hours ago | 0 comments
37. 3 points by rmason 2 hours ago | 0 comments
38. 2 points by rmason 2 hours ago | 0 comments
39. 1 point by onescales 2 hours ago | 0 comments
40. 4 points by loog5566 3 hours ago | 0 comments
41. 61 points by seveibar 3 hours ago | 7 comments
42. 10 points by fazlerocks 3 hours ago | 16 comments
43. 3 points by niksmac 3 hours ago | 1 comment
44. 8 points by yawz 3 hours ago | 0 comments
45. 2 points by 12_throw_away 3 hours ago | 0 comments
46. 8 points by xqcgrek2 3 hours ago | 2 comments
47. 13 points by NaOH 3 hours ago | 5 comments
48. 14 points by ronbenton 3 hours ago | 1 comment
49. 2 points by nodesocket 3 hours ago | 5 comments
50. 5 points by rolph 3 hours ago | 1 comment
51. 3 points by azophy_2 3 hours ago | 0 comments
52. 6 points by jakamm 3 hours ago | 14 comments
53. 1 point by PaulHoule 3 hours ago | 0 comments
54. 2 points by npmipg 3 hours ago | 0 comments
55. 3 points by mooreds 3 hours ago | 2 comments
56. 1 point by mooreds 3 hours ago | 0 comments
57. 2 points by mooreds 3 hours ago | 0 comments
58. 3 points by bladeee 3 hours ago | 1 comment
59. 2 points by red369 3 hours ago | 0 comments