deep learning - Theano multiplying by zero
Can anyone explain to me the meaning behind these 2 lines of code from here: https://github.com/newmu/theano-tutorials/blob/master/4_modern_net.py
    acc = theano.shared(p.get_value() * 0.)
    acc_new = rho * acc + (1 - rho) * g ** 2
Is this a mistake? Why do we instantiate acc to 0 and then multiply it by rho in the next line? It looks like it will not achieve anything this way and will remain zero. Would there be any difference if we replaced "rho * acc" with just "acc"?
The full function is given below:
    import theano
    from theano import tensor as T

    def rmsprop(cost, params, lr=0.001, rho=0.9, epsilon=1e-6):
        grads = T.grad(cost=cost, wrt=params)
        updates = []
        for p, g in zip(params, grads):
            # one accumulator per parameter, zero-filled and shaped like p
            acc = theano.shared(p.get_value() * 0.)
            acc_new = rho * acc + (1 - rho) * g ** 2
            gradient_scaling = T.sqrt(acc_new + epsilon)
            g = g / gradient_scaling
            updates.append((acc, acc_new))
            updates.append((p, p - lr * g))
        return updates
This is just a way to tell Theano "create a shared variable and initialize its value to 0 in the same shape as p."
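The effect of multiplying by 0. can be checked without Theano. The sketch below uses a NumPy array as a stand-in for whatever p.get_value() returns; the name p_value and its contents are made up for illustration.

```python
import numpy as np

# Hypothetical parameter value, standing in for the array p.get_value() returns.
p_value = np.array([[1.5, -2.0, 3.0],
                    [0.5, 4.0, -1.0]], dtype=np.float32)

# Multiplying by 0. keeps the shape and dtype but zeroes every entry.
acc_init = p_value * 0.

print(acc_init.shape)  # (2, 3)
print(acc_init.dtype)  # float32
print(np.array_equal(acc_init, np.zeros_like(p_value)))  # True
```

For finite values this gives the same result as np.zeros_like(p_value); the multiplication is simply a compact way to get a zero array with a matching shape and dtype.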
This rmsprop method is a symbolic method. It does not compute the rmsprop parameter updates itself; it only tells Theano how the parameter updates should be computed when the eventual Theano function is executed.
If you look further down the tutorial code you linked to, you'll see that the symbolic execution graph for the parameter updates is constructed by rmsprop via a call on line 67. These updates are compiled into a Theano function called train in Python on line 69, and the train function is executed many times on line 74 within the loops of lines 72 and 73. The Python function rmsprop is called only once, irrespective of how many times the train function is called within the loops on lines 72 and 73.
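The build-once, run-many pattern can be imitated in plain Python with a closure. This is only an analogy, not Theano code: build_step, step and the two counters are hypothetical names, with build_step playing the role of rmsprop and the returned function playing the role of train.

```python
build_calls = 0  # how many times the setup function runs
step_calls = 0   # how many times the returned step function runs

def build_step(rho=0.9):
    """Plain-Python stand-in for rmsprop: runs once, returns the step to repeat."""
    global build_calls
    build_calls += 1
    state = {"acc": 0.0}  # initialization happens here, at build time only

    def step(g):
        global step_calls
        step_calls += 1
        # Reads and overwrites the stored value; never re-initializes it.
        state["acc"] = rho * state["acc"] + (1 - rho) * g ** 2
        return state["acc"]

    return step

train = build_step()                       # analogous to building and compiling once
history = [train(1.0) for _ in range(3)]   # analogous to calling train inside the loops

print(build_calls, step_calls)  # 1 3
print(history)                  # three increasing values
```

The zero assigned inside build_step is seen only by the first call to train; every later call starts from whatever the previous call stored.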
Within rmsprop, we are telling Theano that, for each parameter p, we need a new Theano variable whose initial value has the same shape as p and is 0 throughout. We then go on to tell Theano how it should update both this new variable (unnamed as far as Theano is concerned, but named acc in Python) and how to update the parameter p itself. These commands do not alter either p or acc; they just tell Theano how p and acc should be updated later, once the function has been compiled (line 69), each time it is executed (line 74).
The function executions on line 74 do not call the rmsprop Python function; they execute a compiled version of rmsprop. There is no initialization inside the compiled version, because that already happened in the Python version of rmsprop. Each train execution of the line acc_new = rho * acc + (1 - rho) * g ** 2 will use the current value of acc, not its initial value.
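To see what this means numerically, and what would change if "rho * acc" were replaced by "acc", here is a small plain-Python sketch. It assumes a scalar accumulator and a constant made-up gradient of 2.0; run_accumulator and its decay_old_value flag are hypothetical helpers, not part of the tutorial.

```python
def run_accumulator(grads, rho=0.9, decay_old_value=True):
    """Apply the accumulator update once per gradient, starting from acc = 0."""
    acc = 0.0
    history = []
    for g in grads:
        # decay_old_value=False models replacing "rho * acc" with plain "acc"
        old = rho * acc if decay_old_value else acc
        acc = old + (1 - rho) * g ** 2
        history.append(acc)
    return history

grads = [2.0, 2.0, 2.0]  # made-up constant gradient
with_decay = run_accumulator(grads, decay_old_value=True)
without_decay = run_accumulator(grads, decay_old_value=False)

print(with_decay)     # roughly 0.4, 0.76, 1.084
print(without_decay)  # roughly 0.4, 0.8, 1.2
```

The first step is identical in both cases because acc is still zero, so multiplying it by rho changes nothing. From the second step on the results differ: with rho the old value is decayed, giving a moving average of the squared gradients, while without rho the accumulator just keeps growing.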