Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug with saving model object to a network drive with UNC path #1026

Closed
bschneidr opened this issue Aug 30, 2024 · 4 comments · Fixed by #1038
Closed

Bug with saving model object to a network drive with UNC path #1026

bschneidr opened this issue Aug 30, 2024 · 4 comments · Fixed by #1038
Labels
bug Something isn't working

Comments

@bschneidr
Copy link
Contributor

bschneidr commented Aug 30, 2024

Describe the bug
I'm working in a fairly common situation where I have to use a Windows machine and store data on a network drive accessed by UNC paths (e.g. using paths such as "\\westat.com\projects\my-project-folder\data.csv").

I'm able to call model_fit$save_output_files() to save my CSV files of MCMC draws to the network folder. But when I try to save the model object (class CmdStanMCMC) using model_fit$save_object(), I get an error message saying that the CSV files don't exist.

I suspect that cmdstanr/cmdstan are having trouble with the two slashes at the beginning of the filepath.

Likely related to #1 (comment).

To Reproduce

Create a fitted model object with the usual example:

library(cmdstanr)
fit_schools_mcmc <- cmdstanr::cmdstanr_example()

Save the CSV files to a network drive:

fit_schools_mcmc$save_output_files(
  dir = "\\\\westat.com\\projects\\my-project-folder", 
  basename = "schools-fit", timestamp = FALSE, random = FALSE
)
#> Moved 3 files and set internal paths to new locations:
#> - /westat.com/projects/my-project-folder/schools-fit-1.csv
#> - /westat.com/projects/my-project-folder/schools-fit-2.csv
#> - /westat.com/projects/my-project-folder/schools-fit-3.csv

These CSV files do save successfully, all though it's a bit suspicious that the beginnings of the filepaths shown here only contain one slash.

Then try to save the fitted model object to the network drive:

fit_schools_mcmc$save_object(file.path("\\\\westat.com\\projects\\my-project-folder", "schools-fit.rds"))
#> Error in read_cmdstan_csv(files = self$output_files(include_failed = FALSE),  : 
#>  Assertion on 'files' failed: File does not exist: '/westat.com/projects/my-project-folder/schools-fit-1.csv'

I suspect there's an issue to do with the double slashes at the beginning of the filepath. See for example the following R output:

file.exists('/westat.com/projects/my-project-folder/schools-fit-1.csv')
#> [1] FALSE

file.exists('//westat.com/projects/my-project-folder/schools-fit-1.csv')
#> [1] TRUE

Expected behavior
I expected the model object to save without issue, since the CSV files were successfully copied.

Operating system
Windows 10 Enterprise

CmdStanR version number
Your CmdStanR version number (e.g. from packageVersion("cmdstanr")).

Additional context

I think the issue has to do with the internal cmdstanr utility functions repair_path() and absolute_path():

cmdstanr/R/utils.R

Lines 125 to 136 in f2e152b

repair_path <- function(path) {
if (!length(path) || !is.character(path)) {
return(path)
}
path <- path.expand(path)
path <- gsub("\\\\", "/", path)
# WSL cmdstan path is a network path and needs the leading //
path <- gsub("//(?!wsl)", "/", path, perl = TRUE)
# remove trailing "/"
path <- gsub("/$","", path)
path
}

@bschneidr bschneidr added the bug Something isn't working label Aug 30, 2024
@bschneidr
Copy link
Contributor Author

Right now, repair_path() only preserves the leading "//" if it's followed by "wsl". The WSL use-case as well as the more general UNC path case could be handled by preserving the leading "//" if it's at the beginning of the expanded path, regardless of whether it's followed by "wsl".

This could be implemented by updating the regex at cmdstanr/R/utils.R from its current value:

path <- gsub("//(?!wsl)", "/", path, perl = TRUE) 

to the following:

path <- gsub("(?<!^)//", "/", path, perl = TRUE) 

For example, if we updated repair_path() to use the suggested regex change, here is what the function output would look like:

# Test with a valid Windows UNC path
repair_path("\\\\westat.com\\projects\\my-project-folder\\data.csv")
[1] "//westat.com/projects/my-project-folder"

# Test with a bad Windows UNC path
repair_path("\\\\westat.com\\\\projects\\my-project-folder\data.csv")
[1] "//westat.com/projects/my-project-folder"

bschneidr added a commit to bschneidr/cmdstanr that referenced this issue Aug 31, 2024
bschneidr added a commit to bschneidr/cmdstanr that referenced this issue Nov 19, 2024
jgabry added a commit that referenced this issue Nov 20, 2024
For #1026: update `repair_path()` to work with network paths, add test.
@jgabry
Copy link
Member

jgabry commented Nov 20, 2024

Closing now that #1038 is merged. Thanks for fixing this!

@jgabry jgabry closed this as completed Nov 20, 2024
@bschneidr
Copy link
Contributor Author

Thank Jonah! I appreciate you taking the time to review this and accept the PR.

@jgabry
Copy link
Member

jgabry commented Nov 20, 2024

It's always great to get contributions like this! Especially since at the moment we're a bit short on R developers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants