Why did you need copyTargetQNetwork #5

fevemania · 2017-03-08T06:19:56Z

I don't understand the purpose of copyTargetQNetwork. Why do we need QValueT to evaluate QValue_batch? Is it to make the training process more stable?

saselovejulie · 2017-09-07T00:56:03Z

I'm confused about this function too:

    if self.timeStep % UPDATE_TIME == 0:
        self.copyTargetQNetwork()

Since this code copies QValue into QValueT every 100 steps, why do we need two of them?

FrankRouter · 2018-07-14T03:14:07Z

This is explained in the DQN Nature paper:

"We address these instabilities with a novel variant of Q-learning, which uses two key ideas. First, ... Second, we used an iterative update that adjusts the action-values (Q) towards target values that are only periodically updated, thereby reducing correlations with the target."
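
To make the second idea concrete, here is a minimal sketch of the periodic copy. It is not this repository's code: a small table stands in for the network, and `TinyQ`, `copy_target`, `train`, `GAMMA`, `ALPHA` and the two-state toy environment are all assumptions made for illustration. Only `UPDATE_TIME` and the `t % UPDATE_TIME == 0` check mirror the snippet quoted earlier in the thread.

```python
import copy
import random

UPDATE_TIME = 100  # steps between target syncs, as in the snippet quoted above
GAMMA = 0.9        # discount factor (assumed for this sketch)
ALPHA = 0.1        # learning rate (assumed for this sketch)


class TinyQ:
    """Tabular stand-in for a Q-network: q[state][action]."""

    def __init__(self, n_states, n_actions):
        self.q = [[0.0] * n_actions for _ in range(n_states)]


def copy_target(online, target):
    """Hypothetical analogue of copyTargetQNetwork: overwrite target with online."""
    target.q = copy.deepcopy(online.q)


def train(steps, seed=0):
    rng = random.Random(seed)
    online, target = TinyQ(2, 2), TinyQ(2, 2)
    syncs = 0
    for t in range(1, steps + 1):
        # Toy environment (assumed): the action chosen becomes the next state,
        # and landing in state 1 pays a reward of 1.
        s, a = rng.randrange(2), rng.randrange(2)
        s_next = a
        r = 1.0 if s_next == 1 else 0.0

        # The bootstrap target reads the frozen copy, not the table being updated.
        y = r + GAMMA * max(target.q[s_next])
        online.q[s][a] += ALPHA * (y - online.q[s][a])

        # Periodic sync: the only moment the target values change.
        if t % UPDATE_TIME == 0:
            copy_target(online, target)
            syncs += 1
    return online, target, syncs
```

Between syncs, `y` is computed from values that do not move, so each update regresses toward a fixed quantity. If `y` read `online.q` instead, every update would also shift the value it is chasing, which is the correlation with the target that the paper describes.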
