Checkpointer() saves the entire trained model with BSON. For several models I was able to recover them with BSON.@load modeladdress model, but I have now hit a case where I cannot recover the saved checkpoint. The error message is indicated below. Any hints on how to recover the saved data?
In any case, it seems advisable to change the function so that it saves the result of Flux.setup(model) rather than the model itself.
We'd need to use Flux.state instead of setup because setup is only for optimizer state (not model params), but it could be done. The bigger problem IMO is relying on BSON.jl, which is very buggy and basically unmaintained. For Flux's own docs, we've moved towards recommending JLD2.jl instead. FluxTraining should be switched to use that or the Serialization stdlib.
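A minimal sketch of what that suggestion could look like, following the pattern in Flux's own saving and loading docs: save Flux.state(model) with JLD2, then copy it back into a freshly constructed model with Flux.loadmodel!. The model architecture and the file location are placeholders for illustration and are not taken from FluxTraining.

```julia
using Flux, JLD2

# Hypothetical small model; any Flux model is handled the same way.
model = Chain(Dense(4 => 8, relu), Dense(8 => 2))

# Save only the state (nested NamedTuples of plain arrays), after moving to CPU.
model_state = Flux.state(cpu(model))
path = joinpath(mktempdir(), "checkpoint.jld2")  # placeholder location
jldsave(path; model_state)

# Restore: rebuild the same architecture in code, then copy the saved state in.
restored = Chain(Dense(4 => 8, relu), Dense(8 => 2))
Flux.loadmodel!(restored, JLD2.load(path, "model_state"))
```

Because only arrays are written, loading does not depend on the model's struct definitions being deserializable, which is the usual failure mode when a whole model is saved.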
I have made a quick-and-dirty fix by creating a callback function that can be executed after each epoch. The model is moved to the CPU before its state is saved. I have tested this with BSON, but it might work with JLD2 as well.
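using Flux, BSON, Dates

# Save only the model state (parameters and buffers), not the model struct itself.
function saveModelState(fullpathFilename, model)
    modelcpu = Flux.cpu(model)            # move to CPU so no GPU arrays are serialized
    model_state = Flux.state(modelcpu)
    BSON.@save fullpathFilename model_state
end

# Callback: write a timestamped state file into the directory `path`.
function saveModelStateCB(path, model)
    stamp = Dates.format(Dates.now(), "yyyy-mm-ddTHH-MM-SS-sss")
    # joinpath handles a missing or present trailing separator
    fpfn = joinpath(path, "model_state-" * stamp * ".bson")
    saveModelState(fpfn, model)
end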
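For completeness, a sketch of the matching restore step. It assumes the file was written with BSON.@save file model_state as above, and that the model is rebuilt in code with the same architecture before loading. The helper name loadModelState! and the small Chain are hypothetical, used only for illustration.

```julia
using Flux, BSON

# Write a state file the same way saveModelState does (placeholder model and path).
model = Chain(Dense(4 => 8, relu), Dense(8 => 2))
file = joinpath(mktempdir(), "model_state.bson")
model_state = Flux.state(cpu(model))
BSON.@save file model_state

# Hypothetical helper: read the saved state and copy it into an existing model.
function loadModelState!(model, fullpathFilename)
    saved = BSON.load(fullpathFilename)        # Dict with the key :model_state
    Flux.loadmodel!(model, saved[:model_state])
    return model
end

# The architecture must match the one that produced the saved state.
restored = loadModelState!(Chain(Dense(4 => 8, relu), Dense(8 => 2)), file)
```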
Cheers,