This repository has been archived by the owner on Feb 12, 2022. It is now read-only.

ForgetMult equation in code is different from the paper #22

Open
Agoniii opened this issue Sep 28, 2018 · 2 comments

Comments

@Agoniii

Agoniii commented Sep 28, 2018

In this code, ForgetMult computes a simple recurrent equation:
h_t = f_t * x_t + (1 - f_t) * h_{t-1}
but in the paper it is
h_t = f_t * h_{t-1} + (1 - f_t) * x_t
Which one is correct?

@RahulBhalley

I am confused about the same thing. Please reply.

@elmarhaussmann

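It's unfortunate that the code and the paper are not consistent, but it doesn't really matter. In both forms, h_t is a weighted average of x_t and h_{t-1}, with weights f_t and 1 - f_t. f_t lies in [0, 1], so 1 - f_t is just the remainder to 1 and also lies in [0, 1]. You can transform one equation into the other by substituting 1 - f_t for f_t; only the meaning of the gate flips (in the code it weights the new input x_t, in the paper it weights the previous state h_{t-1}). The model will learn the same thing...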
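A quick numerical check of the substitution described above, as a minimal pure-Python sketch rather than the repository's actual ForgetMult implementation. The function names `forget_mult_code` and `forget_mult_paper` are hypothetical, and it assumes the gate is a sigmoid of some learned pre-activation z_t, in which case 1 - sigmoid(z_t) = sigmoid(-z_t) and flipping the gate amounts to negating the pre-activation:

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forget_mult_code(f, x, h0=0.0):
    """Form quoted from the code: h_t = f_t * x_t + (1 - f_t) * h_{t-1}."""
    h, out = h0, []
    for f_t, x_t in zip(f, x):
        h = f_t * x_t + (1.0 - f_t) * h
        out.append(h)
    return out


def forget_mult_paper(f, x, h0=0.0):
    """Form quoted from the paper: h_t = f_t * h_{t-1} + (1 - f_t) * x_t."""
    h, out = h0, []
    for f_t, x_t in zip(f, x):
        h = f_t * h + (1.0 - f_t) * x_t
        out.append(h)
    return out


random.seed(0)
z = [random.gauss(0.0, 1.0) for _ in range(8)]  # assumed gate pre-activations
x = [random.gauss(0.0, 1.0) for _ in range(8)]  # inputs

f = [sigmoid(z_t) for z_t in z]           # gate as used by the code form
f_flipped = [sigmoid(-z_t) for z_t in z]  # equals 1 - f_t up to rounding

h_code = forget_mult_code(f, x)
h_paper = forget_mult_paper(f_flipped, x)

# Largest elementwise difference between the two hidden-state sequences.
max_diff = max(abs(a - b) for a, b in zip(h_code, h_paper))
print(max_diff)
```

Under these assumptions the two sequences agree up to floating-point rounding, which is why neither equation is "the wrong one": a trained gate in one convention corresponds to the negated pre-activation in the other.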
