Questions for computing of derivatives #1

Open
khanrc opened this issue Feb 16, 2018 · 1 comment
Comments

@khanrc
khanrc commented Feb 16, 2018

I cannot understand your code for computing derivatives:

#first_derivative
first_derivative = tf.exp(cost)[0][label_index]*target_conv_layer_grad 	
	
#second_derivative
second_derivative = tf.exp(cost)[0][label_index]*target_conv_layer_grad*target_conv_layer_grad 

#triple_derivative
triple_derivative = tf.exp(cost)[0][label_index]*target_conv_layer_grad*target_conv_layer_grad*target_conv_layer_grad 

My questions are,

  1. Why did you multiply by exp(cost)?
  2. How are the second and third derivatives calculated by this code? I think it should be like this:
    second derivative: tf.gradients(tf.gradients(Y, A), A)
    third derivative: tf.gradients(tf.gradients(tf.gradients(Y, A), A), A)

Can you help me?

@adityac94
Owner

Hi,
Please refer to our paper (https://arxiv.org/pdf/1710.11063.pdf) for a detailed explanation of the gradients, in particular Eq. 11, 15 and 16.
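
As a rough sketch of where the exp(cost) factor can come from: assume (this is an assumption, not something stated in this thread) that the class output is an exponential of a score, Y = exp(S), that `cost` in the code plays the role of S, that `target_conv_layer_grad` is the derivative of S with respect to an activation A_ij, and that S is locally linear in A_ij (as in a ReLU network), so its higher derivatives vanish. Under those assumptions the chain rule gives:

```latex
\begin{aligned}
Y &= \exp(S), \\
\frac{\partial Y}{\partial A_{ij}} &= \exp(S)\,\frac{\partial S}{\partial A_{ij}}, \\
\frac{\partial^2 Y}{\partial A_{ij}^2}
  &= \exp(S)\left[\left(\frac{\partial S}{\partial A_{ij}}\right)^{2}
     + \frac{\partial^2 S}{\partial A_{ij}^2}\right]
   = \exp(S)\left(\frac{\partial S}{\partial A_{ij}}\right)^{2}
   \quad\text{if } \frac{\partial^2 S}{\partial A_{ij}^2} = 0, \\
\frac{\partial^3 Y}{\partial A_{ij}^3}
  &= \exp(S)\left(\frac{\partial S}{\partial A_{ij}}\right)^{3}
   \quad\text{under the same assumption.}
\end{aligned}
```

This matches the shape of the three expressions quoted in the question: exp(cost) multiplied by the first, second, and third power of the gradient.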

The way you suggested won't work because tf.gradients() sums the partial derivatives over all output elements for each input dimension. Ideally, tf.gradients(tf.gradients(Y, A), A) would be the Hessian, of size size(tf.gradients(Y, A)) x size(A). Instead, you get a vector of size(A).
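
To make the difference concrete, here is a small standard-library Python sketch. It does not use TensorFlow. Everything in it is a hypothetical toy: a two-element activation A, a linear score S = W·A, and Y = exp(S). It compares the gradient of the summed gradient (the quantity a nested gradient call with summed outputs would return) against the Hessian diagonal, which is what the element-wise formula exp(S) * (dS/dA)^2 computes.

```python
import math

# Hypothetical toy setup (not from the repository): a score S that is
# linear in a 2-element activation A, passed through an exponential.
W = [0.5, -1.5]

def score(a):
    """S = W . A"""
    return sum(w * x for w, x in zip(W, a))

def y(a):
    """Y = exp(S)"""
    return math.exp(score(a))

def grad_y(a):
    """Analytic gradient: dY/dA_i = exp(S) * w_i."""
    e = y(a)
    return [e * w for w in W]

def hessian_y(a):
    """Analytic Hessian: H[i][j] = exp(S) * w_i * w_j."""
    e = y(a)
    return [[e * wi * wj for wj in W] for wi in W]

def grad_of_summed_grad(a, eps=1e-6):
    """Central finite-difference gradient of sum_i dY/dA_i.

    This mimics differentiating a vector-valued gradient when the
    outputs are summed before differentiation.
    """
    out = []
    for j in range(len(a)):
        hi = list(a)
        lo = list(a)
        hi[j] += eps
        lo[j] -= eps
        out.append((sum(grad_y(hi)) - sum(grad_y(lo))) / (2 * eps))
    return out

A = [0.2, 0.1]
H = hessian_y(A)
n = len(A)

# Summing the Hessian over the output index i gives one value per input j.
column_sums = [sum(H[i][j] for i in range(n)) for j in range(n)]
# The pure second derivatives d^2Y/dA_j^2 sit on the diagonal.
diagonal = [H[j][j] for j in range(n)]
# Element-wise closed form, in the style of the code quoted in the question.
elementwise = [y(A) * w * w for w in W]
nested = grad_of_summed_grad(A)

print("nested gradient :", nested)
print("Hessian col sums:", column_sums)
print("Hessian diagonal:", diagonal)
print("exp(S)*(dS/dA)^2:", elementwise)
```

In this toy case the nested gradient agrees with the Hessian column sums, which include the cross terms, while the element-wise formula agrees with the diagonal. The two only coincide when the off-diagonal entries are zero.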

Hope that clears things up. Get back to us if you have any more concerns.
