For many real-world tasks, the environment may have hidden state or partially observable features, so the Markov assumption holds only approximately.
One way around this is frame stacking - already doable in Coach with filters.observation.observation_stacking_filter (see the sketch below). It may be even better to use an LSTM (or a bidirectional LSTM). Agents for this already exist, the widely cited DRQN being one of them.
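For reference, this is roughly how frame stacking can be wired in today - a minimal sketch assuming Coach's InputFilter API, an observation key named 'observation', and an arbitrary stack size of 4:

```python
from rl_coach.filters.filter import InputFilter
from rl_coach.filters.observation.observation_stacking_filter import ObservationStackingFilter

# Stack the last 4 observations so the agent sees a short history as its "state".
input_filter = InputFilter()
input_filter.add_observation_filter('observation', 'stacking', ObservationStackingFilter(4))

# In a preset, the filter would then be attached to the agent, e.g.:
# agent_params.input_filter = input_filter
```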
Coach currently has the LSTMMiddleware layer. However, from my reading of the source code it runs along the observation axis (for inputs such as text). TensorFlow/Keras, of course, has the TimeDistributed wrapper (together with an LSTM using return_sequences=True) to run an LSTM along the temporal axis across transitions.
Could a time-distributed LSTM be added as a middleware? (Or at the very least "hacked" in - it would be of immense benefit to my current research, where I am using a simple behavioural cloning agent.)
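For illustration, here is a plain tf.keras sketch (not Coach's middleware API) of the shape such a time-distributed LSTM could take; the conv embedder, layer sizes, and the action head are arbitrary assumptions for a behavioural-cloning-style setup:

```python
import tensorflow as tf

def build_temporal_lstm_model(stack_size=4, obs_shape=(84, 84, 1), num_actions=6):
    # Input: a window of `stack_size` consecutive observations (the temporal axis).
    frames = tf.keras.Input(shape=(stack_size,) + obs_shape)

    # Per-frame embedder, applied identically to every timestep via TimeDistributed.
    embedder = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 8, strides=4, activation='relu'),
        tf.keras.layers.Conv2D(64, 4, strides=2, activation='relu'),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(256, activation='relu'),
    ])
    per_frame_features = tf.keras.layers.TimeDistributed(embedder)(frames)

    # LSTM over the temporal axis; return_sequences=False keeps only the final state.
    temporal_features = tf.keras.layers.LSTM(256, return_sequences=False)(per_frame_features)

    # Simple head for a behavioural cloning agent: logits over discrete actions.
    logits = tf.keras.layers.Dense(num_actions)(temporal_features)
    return tf.keras.Model(inputs=frames, outputs=logits)

model = build_temporal_lstm_model()
model.summary()
```

The key point is that the embedder runs per frame while the LSTM consumes the stacked frames as a sequence, which is the temporal-axis behaviour the existing LSTMMiddleware does not seem to provide.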