Sebastian_database_shiny #26

Open
wants to merge 53 commits into
base: master
53 commits
3eee645
new replay
May 24, 2018
e271bca
Merge branch 'master' into sebastian
May 27, 2018
e93316f
Merge branch 'master' into sebastian
May 31, 2018
d3ba424
implemented db connection for replay memory and a shiny app for visua…
Jun 1, 2018
2c047fb
Moved shiny app from performance.R to new file performance_shiny.R
Jun 1, 2018
c70f1ff
small change
Jun 1, 2018
6d0195c
small changes
Jun 3, 2018
a6e9676
Merge branch 'master' into sebastian
Jun 3, 2018
e6e75bb
updated imports for shiny app
Jun 4, 2018
4431439
Merge branch 'master' into sebastian
Jun 4, 2018
875db95
updated some missing package imports for some functions
Jun 4, 2018
55ff118
Merge branch 'master' into sebastian
Jun 5, 2018
1ba0f08
added createReplayVideo
Jun 5, 2018
23d47eb
added error for replay video if state arrays are only one dimensional
Jun 7, 2018
3399462
:wq
Jun 7, 2018
276ccd5
some changes I already forgot. Probably making things perfect
Jun 8, 2018
fea4412
merge master and resolve conflict
smilesun Jun 22, 2018
075e44d
small fixes
Jun 22, 2018
8f8b67c
Merge branch 'sebastian' of github.com:smilesun/rlR into sebastian
Jun 22, 2018
219eb60
Merge branch 'master' into sebastian
smilesun Jun 23, 2018
83386cf
add unit test to replaymemdb for pong
smilesun Jun 23, 2018
42a07d8
Merge branch 'master' into sebastian
smilesun Jun 23, 2018
3c3e8f2
Merge branch 'master' into sebastian
Jun 24, 2018
d232b26
bugfixes. Reset replay memory with every new initialization. Added pr…
Jun 25, 2018
9958abe
Merge branch 'master' into sebastian
smilesun Jun 25, 2018
bc4280b
Merge branch 'master' into sebastian
smilesun Jun 25, 2018
565f604
Merge branch 'master' into sebastian
smilesun Jun 25, 2018
eb21958
Merge branch 'master' into sebastian
smilesun Jun 25, 2018
e701dd8
Merge branch 'master' into sebastian
smilesun Jun 25, 2018
5cfa421
Merge branch 'master' into sebastian
smilesun Jun 26, 2018
1bf964a
Merge branches 'master' and 'master' of github.com:smilesun/rlR into …
smilesun Jun 26, 2018
d08098f
Merge branch 'master' into sebastian
smilesun Jun 26, 2018
0499828
Merge branch 'master' into sebastian
smilesun Jun 28, 2018
2ba928b
merge master
smilesun Jul 1, 2018
dca01f5
Merge branch 'master' into sebastian
smilesun Jul 1, 2018
5c5e0fe
Merge branch 'master' into sebastian
smilesun Jul 1, 2018
8394167
Merge branch 'master' into sebastian
smilesun Jul 1, 2018
e29cb36
Merge branch 'master' into sebastian
smilesun Jul 1, 2018
5052643
Merge branch 'master' into sebastian
smilesun Jul 2, 2018
45a5a67
.
smilesun Jul 7, 2018
b13870e
Merge branch 'master' into sebastian
Jul 16, 2018
03ee345
resolved issues
Jul 16, 2018
caf9850
improved comments
Jul 23, 2018
a137f4e
merge master
smilesun Sep 29, 2018
e071363
Merge branch 'master' into sebastian
smilesun Oct 3, 2018
7887ef7
added new file performance_plot and preparation of deleting file perf…
SebGGruber Nov 12, 2018
23ec859
finished work on code - but not tested yet
SebGGruber Nov 18, 2018
42d5382
resolve merge conflict
SebGGruber Nov 26, 2018
e6a4238
solved merge conflicts and merged master
SebGGruber Nov 26, 2018
6f2fc3e
Merge branch 'sebastian' into sebastian_rm_shiny
SebGGruber Nov 26, 2018
3dfcd54
wip
SebGGruber Nov 26, 2018
a26e020
works now; pdp next todo
SebGGruber Nov 26, 2018
66f3a4f
Extracted plots from shiny app
SebGGruber Feb 23, 2019
3 changes: 3 additions & 0 deletions .directory
@@ -0,0 +1,3 @@
[Dolphin]
Timestamp=2018,11,12,17,18,47
Version=4
1 change: 1 addition & 0 deletions .gitignore
@@ -5,3 +5,4 @@ logout
.Rhistory
.RData
.Ruserdata
replay_memory
14 changes: 13 additions & 1 deletion DESCRIPTION
@@ -2,7 +2,10 @@ Package: rlR
Type: Package
Title: Reinforcement Learning in R
Version: 0.1.0
-Authors@R: person("Xudong", "Sun", email = {"[email protected]"}, role = c("aut", "cre"))
+Authors@R: c(
+person("Xudong", "Sun", email = {"[email protected]"}, role = c("aut", "cre")),
+person("Sebastian", "Gruber", email = {"[email protected]"}, role = c("ctb"))
+)
Maintainer: Xudong Sun <[email protected]>
Description: Reinforcement Learning with deep Q learning, double deep Q
learning, frozen target deep Q learning, policy gradient deep learning, policy
@@ -21,6 +24,15 @@ Imports:
logging,
ggplot2,
openssl,
+RSQLite,
+png,
+stringr,
+DT,
+plotly,
+crosstalk,
+shiny,
+shinythemes,
+imager,
magrittr,
abind
LazyData: true
3 changes: 3 additions & 0 deletions NAMESPACE
@@ -31,4 +31,7 @@ import(keras)
import(logging)
import(openssl)
import(reticulate)
import(shiny)
import(tensorflow)
importFrom(magrittr,"%<>%")
importFrom(magrittr,"%>%")
2 changes: 1 addition & 1 deletion R/interaction_observer.R
@@ -38,7 +38,7 @@ Interaction = R6::R6Class("Interaction",
self$step_in_episode = 0
private$continue_flag = TRUE
self$glogger = self$rl_agent$glogger
-self$perf = Performance$new(self$rl_agent)
+self$perf = PerformancePlots$new(self$rl_agent)
private$list_cmd = list(
"render" = self$rl_env$render,
"before.act" = function() {
306 changes: 306 additions & 0 deletions R/performance_plots.R
@@ -0,0 +1,306 @@
#' @importFrom magrittr %>% %<>%

PerformancePlots = R6::R6Class(
"PerformancePlots",
inherit = Performance,
public = list(

plot_2d_weights = function(layer = 1L, bias = FALSE) {

#' @description Interactive 2d plot of the neural network weights
#' @param layer Integer - Index of the layer to visualize (starting at "1", default "1")
#' @param bias Bool - Should the bias weights of this layer be visualized instead? (default "FALSE")
#' @return Plotly object
#' @export

# some data manipulation for successful plotting
weight_index = layer * 2L + bias - 1L
s_weights = self$agent$brain$getWeights()
dim_weights = dim(s_weights[[weight_index]])

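# flatten the weights into long format; c() on a matrix runs column by column,
# so the first expand.grid() column is the row index and the second the column index
# (a bias vector is laid out as a single row)
weights_data = {
if (length(dim_weights) == 2)
cbind(
expand.grid(seq_len(dim_weights[1]), seq_len(dim_weights[2])),
round(c(s_weights[[weight_index]]), 4)
)
else
cbind(
1,
seq_len(dim_weights),
round(c(s_weights[[weight_index]]), 4)
)
} %>%
as.data.frame()

names(weights_data) = c("row_index", "col_index", "value")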

# define the interactive plot
weights_data %>%
plotly::plot_ly(
x = ~ col_index,
y = ~ row_index,
color = ~ value,
mode = "markers",
size = I(10),
type = "scatter"
) %>%
plotly::layout(
showlegend = FALSE,
xaxis = list(title = "Column Index", titlefont = list(size = 18)),
yaxis = list(title = "Row Index", titlefont = list(size = 18))
) %>%
plotly::add_trace(
text = ~ value
)
},


plot_3d_weights = function(layer = 1L, bias = FALSE) {

#' @description Interactive 3d plot of the neural network weights
#' @param layer Integer - Index of the layer to visualize (starting at "1", default "1")
#' @param bias Bool - Should the bias weights of this layer be visualized instead? (default "FALSE")
#' @return Plotly object
#' @export

# some data manipulation for successful plotting
weight_index = layer * 2L + bias - 1L
s_weights = self$agent$brain$getWeights()
dim_weights = dim(s_weights[[weight_index]])

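# flatten the weights into long format; c() on a matrix runs column by column,
# so the first expand.grid() column is the row index and the second the column index
# (a bias vector is laid out as a single row)
weights_data = {
if (length(dim_weights) == 2)
cbind(
expand.grid(seq_len(dim_weights[1]), seq_len(dim_weights[2])),
round(c(s_weights[[weight_index]]), 4)
)
else
cbind(
1,
seq_len(dim_weights),
round(c(s_weights[[weight_index]]), 4)
)
} %>%
as.data.frame()
names(weights_data) = c("row_index", "col_index", "value")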

# define the interactive plot
weights_data %>%
plotly::plot_ly(
x = ~ col_index,
y = ~ row_index,
z = ~ value,
mode = "markers",
color = ~ value,
size = I(5),
type = "scatter3d"
) %>%
plotly::layout(
showlegend = TRUE,
# axes of a 3d plot are configured through `scene`, not the top-level layout
scene = list(
xaxis = list(title = "Column Index", titlefont = list(size = 25)),
yaxis = list(title = "Row Index", titlefont = list(size = 25)),
zaxis = list(title = "Weight Value", titlefont = list(size = 25))
)
)
},


plot_2d_action_value = function(x_axis = 1L, y_axis = 1L, input = NULL, iteration = NULL) {

#' @description Interactive 2d plot of the action value function. The points are actual values of the last batch,
#' the line is the prediction based on the trained model.
#' @param x_axis Integer - State dimension to use as x axis in the plot (default "1")
#' @param y_axis Integer - Action value dimension to use as y axis in the plot (default "1")
#' @param input Numeric vector - Fixed values of all state dimensions to calculate the predictions for.
#' The entry corresponding to the chosen "x_axis" is ignored. If "NULL" then a vector with only 0s is used. (default "NULL")
#' @param iteration Integer - Training iteration whose model should be used for the predictions.
#' This requires the models to be stored during training. If "NULL", the most recent model is used. (default "NULL")
#' @return Plotly object
#' @export

# TODO: instead of only zeros, use mean of each state dimension
if (is.null(input))
input = rep(0, ncol(self$agent$replay.x))

if (is.null(iteration))
iteration = length(self$list_models)

# set up data point generation
state_space_l = as.list(input)

min_s = min(self$agent$replay.x[, x_axis])
max_s = max(self$agent$replay.x[, x_axis])
# create a sequence from min to max (of the chosen state dimension) with 10 steps
state_space_l[[x_axis]] = seq(
from = min_s,
to = max_s,
by = (max_s - min_s) / 9
)
# create data matrix holding artificial data for the predictions
state_space = expand.grid(state_space_l) %>% as.matrix

# calculate predictions based on the created data points
predictions = array(state_space, dim = c(nrow(state_space), ncol(state_space))) %>%
self$list_models[[iteration]]$pred()

state_space = cbind(state_space, predictions) %>% as.data.frame()
names(state_space) = c(
paste0("State_Dim_", 1:ncol(self$agent$replay.x)),
paste0("ActionVal_Dim_", 1:ncol(self$agent$replay.y))
)

minibatch = cbind(self$agent$replay.x, self$agent$replay.y) %>%
data.frame()

names(minibatch) = c(
paste0("State_Dim_", 1:ncol(self$agent$replay.x)),
paste0("ActionVal_Dim_", 1:ncol(self$agent$replay.y))
)

# TODO: reimplement color and size argument
minibatch %>%
plotly::plot_ly(
x = ~ get(paste0("State_Dim_", x_axis)),
y = ~ get(paste0("ActionVal_Dim_", y_axis)),
mode = "markers",
color = I("black"), #if (input$color_2d == "-") I("black") else ~ get(input$color_2d),
size = I(8), #if (input$size_2d == "-") I(8) else ~ get(input$size_2d),
name = "Unfiltered",
type = "scatter"
) %>%
plotly::layout(
showlegend = TRUE,
xaxis = list(title = paste0("State_Dim_", x_axis), titlefont = list(size = 18)),
yaxis = list(title = paste0("ActionVal_Dim_", y_axis), titlefont = list(size = 18))
) %>%
plotly::add_trace(
data = state_space,
x = ~ get(paste0("State_Dim_", x_axis)),
y = ~ get(paste0("ActionVal_Dim_", y_axis)),
size = I(5),
name = "Prediction",
mode = "lines",
# I() passes the hex code through as a literal colour instead of mapping it as data
color = I("#e6550d")
)
},


plot_3d_action_value = function(
x_axis = 1L,
y_axis = 2L,
z_axis = 1L,
input = NULL,
iteration = NULL,
showscale = TRUE
) {

#' @description Interactive 3d plot of the action value function. The points are actual values of the last batch,
#' the surface is the prediction based on the trained model.
#' @param x_axis Integer - State dimension to use as x axis in the plot (default "1")
#' @param y_axis Integer - State dimension to use as y axis in the plot (default "2")
#' @param z_axis Integer - Action value dimension to use as z axis in the plot (default "1")
#' @param input Numeric vector - Fixed values of all state dimensions to calculate the predictions for.
#' The entries corresponding to the chosen "x_axis" and "y_axis" are ignored. If "NULL" then a vector with only 0s is used. (default "NULL")
#' @param iteration Integer - Training iteration whose model should be used for the predictions.
#' This requires the models to be stored during training. If "NULL", the most recent model is used. (default "NULL")
#' @return Plotly object
#' @export

# TODO: instead of only zeros, use mean of each state dimension
if (is.null(input))
input = rep(0, ncol(self$agent$replay.x))

if (is.null(iteration))
iteration = length(self$list_models)

# set up data point generation
state_space_l = as.list(input)

min_sx = min(self$agent$replay.x[, x_axis])
max_sx = max(self$agent$replay.x[, x_axis])
min_sy = min(self$agent$replay.x[, y_axis])
max_sy = max(self$agent$replay.x[, y_axis])

# create a sequence from min to max (of the chosen state dimension) with 10 steps
state_space_l[[x_axis]] = seq(
from = min_sx,
to = max_sx,
by = (max_sx - min_sx) / 9
)
state_space_l[[y_axis]] = seq(
from = min_sy,
to = max_sy,
by = (max_sy - min_sy) / 9
)

# create data matrix holding artificial data for the predictions
state_space = expand.grid(state_space_l) %>% as.matrix

# calculate predictions based on the created data points
predictions = array(state_space, dim = c(nrow(state_space), ncol(state_space))) %>%
self$list_models[[iteration]]$pred()

state_space = cbind(state_space, predictions) %>% as.data.frame()
names(state_space) = c(
paste0("State_Dim_", 1:ncol(self$agent$replay.x)),
paste0("ActionVal_Dim_", 1:ncol(self$agent$replay.y))
)

# create helper function for formatting the data points plus their predictions
helper = function(df) {
result = list(
x = unique(df[[x_axis]]),
y = unique(df[[y_axis]])
)

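# z must be a length(y) x length(x) matrix with z[i, j] belonging to y[i] and x[j];
# expand.grid() varies the lower-numbered state dimension fastest
for (i in seq_len(ncol(self$agent$replay.y)))
result[[paste0("ActionVal_Dim_", i)]] =
matrix(
df[[length(df) - ncol(self$agent$replay.y) + i]],
nrow = length(result$y),
ncol = length(result$x),
byrow = x_axis < y_axis
)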

return(result)
}
state_space %<>% helper

minibatch = cbind(self$agent$replay.x, self$agent$replay.y) %>%
data.frame()

names(minibatch) = c(
paste0("State_Dim_", 1:ncol(self$agent$replay.x)),
paste0("ActionVal_Dim_", 1:ncol(self$agent$replay.y))
)

# TODO: reimplement color and size argument
minibatch %>%
plotly::plot_ly(
x = ~ get(paste0("State_Dim_", x_axis)),
y = ~ get(paste0("State_Dim_", y_axis)),
z = ~ get(paste0("ActionVal_Dim_", z_axis)),
mode = "markers",
color = I("black"), #if (input$color == "-") I("black") else ~ get(input$color),
size = I(3), #if (input$size == "-") I(3) else ~ get(input$size),
showlegend = FALSE,
type = "scatter3d"
) %>%
plotly::layout(
dragmode = "turntable",
scene = list(
xaxis = list(title = paste0("State_Dim_", x_axis), titlefont = list(size = 15)),
yaxis = list(title = paste0("State_Dim_", y_axis), titlefont = list(size = 15)),
zaxis = list(title = paste0("ActionVal_Dim_", z_axis), titlefont = list(size = 15))#, tickangle = 90)
)
) %>%
plotly::add_surface(
data = state_space,
x = ~ x,
y = ~ y,
z = ~ get(paste0("ActionVal_Dim_", z_axis)),
colorscale = "Viridis",
showscale = showscale,
opacity = 0.95,
autocolorscale = FALSE
)
}
),
private = list(),
active = list()
)
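
Review note: the two index conventions this file relies on can be checked in isolation with base R. The sketch below uses toy data only; `toy_weights`, `toy_pred` and the `weight_index()` wrapper are hypothetical stand-ins and not part of rlR, and the weight list layout (kernel, bias, kernel, bias, ...) is an assumption read off the `weight_index` formula in the diff.

```r
# Sketch only: toy data stands in for the agent's weights and model.

# 1) Weight list layout assumed by `layer * 2L + bias - 1L`:
#    list(kernel_1, bias_1, kernel_2, bias_2, ...)
toy_weights = list(
  matrix(1:6, nrow = 2, ncol = 3),   # kernel of layer 1
  array(c(0.1, 0.2, 0.3), dim = 3)   # bias of layer 1
)
weight_index = function(layer, bias = FALSE) layer * 2L + bias - 1L

# c() flattens a matrix column by column, so the row index varies fastest,
# which matches the first column produced by expand.grid().
kernel = toy_weights[[weight_index(1L)]]
long = cbind(
  expand.grid(
    row_index = seq_len(nrow(kernel)),
    col_index = seq_len(ncol(kernel))
  ),
  value = c(kernel)
)

# 2) Reshape grid predictions into a z matrix for a surface trace,
#    where z[i, j] belongs to y[i] and x[j].
x = seq(0, 1, length.out = 4)
y = seq(10, 12, length.out = 3)
grid = expand.grid(x = x, y = y)        # x varies fastest
toy_pred = function(g) g$x + 100 * g$y  # hypothetical model
z = matrix(toy_pred(grid), nrow = length(y), ncol = length(x), byrow = TRUE)
```

With this layout each row of `long` pairs a weight with its true matrix position, and `z` has one row per `y` value and one column per `x` value, which is the shape a plotly surface trace expects.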