I read from here. Why does the program use only the current state and the next state? Why does using just these two states work? Thank you @songrotek
Think about it the other way: why use two states instead of just one?
The key point is that these two states are consecutive; in other words, the situation in the second state is the result of the preceding steps.
The frame before the action, the action taken, the reward, the frame after the action, and the terminal flag: these 5 elements make up one training sample. http://blog.csdn.net/songrotek/article/details/50580904 describes the components of this algorithm. I'm not entirely clear on it either; we can discuss it together.
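The 5-element training sample described above is the standard DQN transition tuple, usually stored in a replay memory. A minimal sketch (not the repo's exact code; class and method names are my own) of such a memory:

```python
from collections import deque
import random

class ReplayMemory:
    """Stores (state, action, reward, next_state, terminal) transitions."""

    def __init__(self, capacity=50000):
        # deque with maxlen drops the oldest transition automatically
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, terminal):
        self.buffer.append((state, action, reward, next_state, terminal))

    def sample(self, batch_size):
        # random minibatch used for one training step
        return random.sample(self.buffer, batch_size)

# Usage: each environment step produces one 5-element sample.
memory = ReplayMemory()
memory.add("s1", 0, 1.0, "s2", False)
memory.add("s2", 1, -1.0, "s3", True)
batch = memory.sample(2)
```

Sampling random minibatches from this buffer is what breaks the correlation between consecutive frames during training.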
@guotong1988 From what I see in the code, each action produces one new frame. currentState = [frame1, frame2, frame3, frame4], then newState = np.append(self.currentState[:,:,1:], nextObservation, axis = 2), so after the step newState = [frame2, frame3, frame4, frame5].
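The frame-stacking update above can be checked with a small NumPy sketch (shapes are assumed: 80x80 grayscale frames stacked 4-deep along the last axis, as is typical in DQN code; the variable names mirror the snippet above):

```python
import numpy as np

# Build a state of 4 stacked frames; fill each frame with its index
# so we can verify which frames survive the update.
current_state = np.zeros((80, 80, 4))
for i in range(4):
    current_state[:, :, i] = i + 1          # frames 1, 2, 3, 4

next_observation = np.full((80, 80, 1), 5.0)  # newly observed frame 5

# Drop the oldest frame and append the newest one along the channel axis.
new_state = np.append(current_state[:, :, 1:], next_observation, axis=2)

# new_state now holds frames 2, 3, 4, 5.
```

So the "state" the network sees is always the 4 most recent frames, which is how a single state captures short-term motion.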