This commit fixes #191 #205

dev-chauhan · 2020-02-24T20:27:04Z

Replaced outdated truncate!() with reset!()

MikeInnes · 2020-02-25T18:23:34Z

truncate! actually does something slightly different from reset!, and probably isn't necessary if using Zygote. I guess this model uses an old Flux version, though; if changing this we need to also update the manifests.

cc @dhairyagandhi96 for the state of the manifest for this model, I'm not completely up to date on how we're organising the zoo.

dev-chauhan · 2020-02-25T21:42:55Z

I made these changes without looking into how Flux works internally.
Now that I have looked at how Flux calculates gradients with Zygote, I have concluded that we do not need any truncate! function when using Zygote to calculate gradients.
The need for truncate! arose from Tracker.jl, which kept tracking gradients of the params across every input given to the RNN cell. For example, take a very long sequence a = [a1 a2] (a1 is the first half and a2 is the second half of a). If we call loss(a1); update!(); loss(a2); update!(), the result is equivalent to loss(a1); update!(); reset!(); loss([a1 a2]); update!(), because when we compute loss(a2) after loss(a1), Tracker still remembers the graph that the computation on a1 created, so the gradient is calculated as if the loss were loss([a1 a2]).
If we instead do loss(a1); update!(); truncate!(); loss(a2); update!(),
the gradient due to loss(a1) is not considered a second time, which saves computation and time, and the final outcome is the same as loss([a1 a2]); update!(), which is what we want.
With Zygote, however, we calculate the gradient for a given function and a given input, so when we do
loss(a1); update!(); loss(a2); update!()
it is the same as loss([a1 a2]); update!(). After loss(a1) the state of the RNN has changed, but during the calculation of loss(a2) the earlier computation is irrelevant while the RNN state continues from where it left off, so the final effect is the same as loss([a1 a2]); update!(), which is what we want.
So with Zygote we do not need the truncate! function.
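
To make the difference concrete, here is a minimal sketch in plain Julia, with no Flux, Zygote, or Tracker involved. Everything in it is an assumption made for illustration: a toy scalar cell, a made-up loss (sum of squared hidden states), and finite differences standing in for automatic differentiation. It compares the gradient of loss(a2) when the carried state is just a number (how I understand the Zygote case) with the gradient when the carried state still depends on the weight through a1 (roughly the Tracker case without truncate!).

```julia
# Toy scalar RNN cell (illustrative, not Flux's implementation):
# h_t = tanh(w * h_{t-1} + x_t)
rnn_step(w, h, x) = tanh(w * h + x)

# Run one chunk starting from state h0.
# Returns (sum of squared hidden states, final hidden state).
function run_chunk(w, h0, xs)
    h = h0
    loss = 0.0
    for x in xs
        h = rnn_step(w, h, x)
        loss += h^2
    end
    return loss, h
end

# Central finite difference, used here instead of an AD package.
finite_diff(f, w; d = 1e-6) = (f(w + d) - f(w - d)) / (2d)

w0 = 0.5
a1 = [0.1, -0.3, 0.7]   # first half of the sequence
a2 = [0.2, 0.4, -0.6]   # second half of the sequence

# Hidden state left behind after processing a1 with the current weight.
_, h_carried = run_chunk(w0, 0.0, a1)

# Carried state treated as a constant: the computation on a1 is not differentiated.
grad_truncated = finite_diff(w -> run_chunk(w, h_carried, a2)[1], w0)

# Carried state recomputed from w: the gradient also flows back through a1.
grad_through_a1 = finite_diff(w -> run_chunk(w, run_chunk(w, 0.0, a1)[2], a2)[1], w0)

println("carried state as constant:   ", grad_truncated)
println("gradient flowing through a1: ", grad_through_a1)
```

The two numbers differ only by the contribution that flows back through a1, which is the part truncate! was cutting off under Tracker.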
Am I correct?

MikeInnes · 2020-02-26T15:05:04Z

Yup, that's exactly right.

CarloLucibello · 2020-02-28T10:13:04Z

so you should just delete the truncate line here, right?

dev-chauhan · 2020-02-28T10:35:27Z

Yes, but the project environment (model-zoo) uses a version of Flux that relies on Tracker, so to do this we first have to update the environment and check whether every model is compatible with the new version. I was the one who tried to run char-rnn.jl on the new version of Flux; with the project environment it works as it should.
So first we have to decide how we want to manage this project, as mentioned by @MikeInnes. If we stay with the current environment, nothing needs to change regarding truncate; if we move to the current version of Flux, we have to remove the truncate line in the master branch, or in other words remove reset from this commit.
We could also create an environment for the char-rnn directory instead of changing Project.toml and Manifest.toml.

CarloLucibello · 2020-03-01T18:39:38Z

We should update this to reflect the latest Flux and Zygote changes. You can create a separate Project.toml and Manifest.toml for this example.
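
A per-example environment could be set up along these lines. This is only a sketch: `setup_example_env` and its `example_dir` argument are hypothetical names, and the package list is an assumption about what the example needs. The `Pkg` calls themselves (`activate`, `add`, `status`) are the standard ones.

```julia
using Pkg

# Hypothetical helper: `example_dir` is the example's folder in a local checkout.
function setup_example_env(example_dir::AbstractString; packages = ["Flux"])
    Pkg.activate(example_dir)  # use (or start) the Project.toml in that folder
    Pkg.add(packages)          # record the packages and write Manifest.toml
    Pkg.status()               # print what the environment now contains
end
```

Calling it with the example's folder, for instance `setup_example_env("char-rnn")` if that is the directory name, would leave a Project.toml and Manifest.toml next to the script that can be committed with it.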


CarloLucibello commented Mar 2, 2020
