[WIP] Adam learning rule #1425

JesseLivezey · 2015-03-04T17:28:38Z

This implementation is based on the [v4] arXiv version of the paper. I haven't run or tested it yet.

The paper seems to have at least one typo: \beta_2^t is used but never defined. For now I'm assuming it is just \beta_2. I'm also assuming that \beta_{1,t} is the same thing as \beta_1^t.
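For reference, here is a minimal scalar sketch of an Adam-style update. It is not the code in this PR. It assumes the reading used in the later published form of the algorithm, where \beta_1^t and \beta_2^t in the bias correction are \beta_1 and \beta_2 raised to the power t. The function name `adam_step`, its argument names, and the default values are illustrative assumptions.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One scalar Adam-style update (illustrative sketch, not the PR's code).

    t is the 1-based step count. beta1 ** t and beta2 ** t are powers,
    which is an assumption about how to read \\beta_1^t and \\beta_2^t.
    """
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction for the zero initialization of m and v.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    # Parameter update scaled by the corrected second-moment estimate.
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Usage: first step from zero-initialized moments.
p, m, v = adam_step(param=1.0, grad=2.0, m=0.0, v=0.0, t=1)
```

Under this reading, the first step has magnitude close to `lr` regardless of the gradient scale, because the bias-corrected estimates reduce to the gradient and its square.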

goodfeli · 2015-03-06T22:08:54Z

Why not just use Alec Radford's implementation?
https://gist.github.com/Newmu/acb738767acb4788bac3

I've been using that plugged into Pylearn2 in my private repo and it works well.

JesseLivezey · 2015-03-06T23:22:42Z

I don't think Alec's version is consistent with the most recent version of the paper, but I haven't really tested this implementation vs. his, so I'm not sure how different the results will be.

JesseLivezey · 2015-03-06T23:26:25Z

It might just be that Alec's version doesn't decay beta1. The betas have also been redefined, though, and I haven't checked whether the rest of the math is equivalent.
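To make "decaying beta1" concrete, here is a small sketch of one possible schedule. The form \beta_{1,t} = \beta_1 \lambda^{t-1} is an assumption for illustration and may not match the schedule in the paper version this PR follows. The names `beta1_schedule` and `lam` are hypothetical.

```python
def beta1_schedule(beta1, lam, t):
    """Decayed first-moment coefficient for 1-based step t.

    Assumed form: beta1 * lam ** (t - 1). With lam = 1.0 this is a
    constant beta1, i.e. no decay.
    """
    return beta1 * lam ** (t - 1)


# Usage: compare a constant coefficient with a slowly decaying one.
constant = [beta1_schedule(0.9, 1.0, t) for t in (1, 2, 3)]
decayed = [beta1_schedule(0.9, 0.5, t) for t in (1, 2, 3)]
```

With `lam = 1.0` the coefficient never changes, so an implementation without decay is the special case of this schedule where the decay factor is one.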

JesseLivezey added 2 commits March 3, 2015 23:17

first go

bc5adc2

changes based on paper

f15d0c9

JesseLivezey mentioned this pull request Mar 4, 2015

Implement Adam as a LearningRule #1362

Open

fixed y_hat

a6801cf
