
simplify flip_gradient #27

Open
xvlvzhu opened this issue Nov 20, 2018 · 4 comments

@xvlvzhu commented Nov 20, 2018

Since TensorFlow 1.7 there is a new way to redefine the gradient:

import tensorflow as tf

@tf.custom_gradient
def flip_grad_layer(x, l):
    # Backward pass: flip the sign of the incoming gradient and scale it by l.
    def grad(dy):
        # One gradient per input: d/dx, and None for l (no gradient wanted).
        return tf.negative(dy) * l, None
    # Forward pass is the identity; grad defines the custom backward pass.
    return tf.identity(x), grad

@pumpikano (Owner)

Thanks for pointing this out! Yes, this is much cleaner and easier to understand than the old method. I will try to find time to update and test the code with this soon.

@Jasperty

How do I use this? It returns two outputs:
y1, y2 = flip_grad_layer(x, l)
What is y1 and what is y2?

@eliottbrion

I am wondering why the second return value of the grad function is None. Can someone explain it to me?

@lorenzoprincipi commented Apr 6, 2021

I know this is old, but it might be helpful for newcomers as well.

@Jasperty

How do I use this? It returns two outputs:
y1, y2 = flip_grad_layer(x, l)
What is y1 and what is y2?

According to the tf.custom_gradient doc:

Returns: a function h(x) which returns the same value as f(x)[0] and whose gradient (as calculated by tf.gradients) is determined by f(x)[1].

Simply put, y1 is the tensor computed in the forward pass (i.e. the identity of x), whereas y2 is the function that computes the custom gradient in the backward pass. Note that the @tf.custom_gradient decorator consumes this pair: the decorated flip_grad_layer returns only the forward tensor, so in practice you just write y = flip_grad_layer(x, l).
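
For instance, here is a minimal usage sketch (assuming TF 2.x eager execution; the constant values are just illustrative):

import tensorflow as tf

x = tf.constant(3.0)
with tf.GradientTape() as tape:
    tape.watch(x)                # x is a plain constant, so watch it explicitly
    y = flip_grad_layer(x, 0.5)  # the decorated call returns only the forward tensor
dy_dx = tape.gradient(y, x)

print(y.numpy())      # 3.0  -> identity in the forward pass
print(dy_dx.numpy())  # -0.5 -> incoming gradient (1.0) flipped and scaled by l = 0.5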


@eliottbrion

I am wondering why the second return value of the grad function is None. Can someone explain it to me?

Referring to the same doc page as above:

In case the function takes multiple variables as input, the grad function must also return the same number of variables.

This means that because l is passed as an input to flip_grad_layer, the grad function must also return a gradient for it. A gradient with respect to l is not wanted here (it is just a scaling hyperparameter), so its slot is filled with None, which tells TensorFlow not to propagate any gradient to that input.
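
For contrast, a hypothetical layer (my own illustration, not from this repo) that does want gradients for both inputs would return one value per input instead of None:

import tensorflow as tf

@tf.custom_gradient
def scale_layer(x, s):
    # Forward pass: y = x * s, with s a scalar.
    def grad(dy):
        # One gradient per input, in the same order as the arguments:
        #   dy * s                -> gradient w.r.t. x
        #   tf.reduce_sum(dy * x) -> gradient w.r.t. s
        return dy * s, tf.reduce_sum(dy * x)
    return x * s, grad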
