A3C instead of actor-critic in reinforcement_learning/reinforce.py  #151

Open
@susht3

Here is the code in reinforce.py:

    for action, r in zip(self.saved_actions, rewards):
        action.reinforce(r)

And here is the code in actor-critic.py:

    for (action, value), r in zip(saved_actions, rewards):
        reward = r - value.data[0,0]
        action.reinforce(reward)
        value_loss += F.smooth_l1_loss(value, Variable(torch.Tensor([r])))

Since the second loop reinforces each action with r minus the predicted value (an advantage), I think it is Asynchronous Advantage Actor-Critic (A3C), not plain actor-critic.
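
To make the comparison concrete, here is a minimal plain-Python sketch of the two weighting schemes. It assumes that `rewards` in both snippets already holds per-step discounted returns; the helper names (`discounted_returns`, `reinforce_weights`, `advantage_weights`) and the sample numbers are hypothetical and do not come from the repository.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return the discounted return G_t for every time step."""
    returns = []
    running = 0.0
    # Walk backwards so each step adds its reward to the discounted future.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def reinforce_weights(returns):
    # reinforce.py style: each action is weighted by its return.
    return list(returns)


def advantage_weights(returns, values):
    # actor-critic.py style: subtract the critic's value estimate
    # from the return before weighting the action.
    return [g - v for g, v in zip(returns, values)]


rewards = [1.0, 1.0, 1.0]
values = [2.0, 1.5, 0.5]  # hypothetical critic outputs, one per step
returns = discounted_returns(rewards, gamma=1.0)

print(reinforce_weights(returns))          # weights used in the first loop
print(advantage_weights(returns, values))  # weights used in the second loop
```

The only difference between the two loops in this sketch is the subtraction of the value estimate; nothing here runs asynchronously or uses multiple workers.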
