How to properly close (SSH) file systems? #1682
The .client object has a
I don't think that condition does anything anymore. But actually, instances in fsspec are generally cached: if you instantiate with exactly the same arguments as previously, you get back the same object as before. You can pass
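To make the caching behaviour described above concrete, here is a toy model of argument-keyed instance caching. It is a sketch and not fsspec's actual code: `CachedMeta` and `ToyFileSystem` are hypothetical names, and the `skip_instance_cache` keyword only mirrors the fsspec option of the same name. It shows why re-instantiating with the same arguments hands back the same (possibly dead) object.

```python
class CachedMeta(type):
    """Toy metaclass: one instance per distinct set of constructor arguments."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._cache = {}  # each class gets its own instance cache

    def __call__(cls, *args, skip_instance_cache=False, **kwargs):
        # The cache key is built from the constructor arguments only.
        key = (args, tuple(sorted(kwargs.items())))
        if not skip_instance_cache and key in cls._cache:
            return cls._cache[key]
        obj = super().__call__(*args, **kwargs)
        if not skip_instance_cache:
            cls._cache[key] = obj
        return obj


class ToyFileSystem(metaclass=CachedMeta):
    """Hypothetical stand-in for a remote filesystem class."""

    def __init__(self, host, username=None):
        self.host = host
        self.username = username
        self.connected = True  # stands in for a live SSH connection


first = ToyFileSystem("example.org", username="alice")
first.connected = False  # simulate the connection dropping

# Same arguments: the cache returns the same, now disconnected, object.
second = ToyFileSystem("example.org", username="alice")

# Bypassing the cache yields a new object with a new "connection".
fresh = ToyFileSystem("example.org", username="alice", skip_instance_cache=True)
```

In this model, `second` is the very object whose connection was lost, which matches the symptom reported in the issue, while `fresh` is an independent instance.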
Running into the same issue. What's the intended way to clean up after a filesystem instance is no longer needed? Is it the responsibility of fs implementations to anticipate having to reconnect automatically to their remote servers when a connection has timed out? If so, I don't see any implementations actually do that. Or should long-running applications explicitly forego the fs instance cache ( The latter works, but as fsspec does not seem to have anything in the way of automatic cleanup, old fs instances will leak file descriptors that are never explicitly closed. Having applications hardwire calls to Is this an oversight in the fsspec API design, or am I using it wrong?
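One way an application could do the cleanup itself is a small context manager, sketched below. The helper name `closing_filesystem` is hypothetical. It assumes the filesystem exposes its connection as `.client` with a `close()` method (as the SFTP implementation discussed in this issue does) and that the class offers fsspec's `clear_instance_cache()` classmethod; both are looked up defensively, so the helper does nothing for objects that lack them. `FakeClient` and `FakeFS` are stand-ins so the sketch runs without a server.

```python
from contextlib import contextmanager


@contextmanager
def closing_filesystem(fs):
    """Yield fs, then close its client and drop cached instances of its class."""
    try:
        yield fs
    finally:
        client = getattr(fs, "client", None)
        close = getattr(client, "close", None)
        if callable(close):
            close()  # release the underlying connection
        clear = getattr(type(fs), "clear_instance_cache", None)
        if callable(clear):
            clear()  # assumption: clears ALL cached instances of this class


class FakeClient:
    """Stand-in for an SSH client object."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class FakeFS:
    """Stand-in for a filesystem with a .client and an instance cache."""

    cache_cleared = False

    def __init__(self):
        self.client = FakeClient()

    @classmethod
    def clear_instance_cache(cls):
        cls.cache_cleared = True


fs = FakeFS()
with closing_filesystem(fs) as handle:
    pass  # use handle here, e.g. list or read files
```

Note that clearing the cache affects every cached instance of that class, so this only suits applications that do not share filesystem objects elsewhere.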
@erikvanzijst I also saw your comment on the paramiko performance issue. If you have not found it already, I want to mention that, in the end, I switched to asyncssh, which I found much better maintained and which also has fsspec bindings here: https://github.com/fsspec/sshfs I don't remember testing the connection loss issue with asyncssh though ... It might have the same issue, but at least the performance should be better.
Thanks for the suggestion @mxmlnkn. I've gone and switched. The performance of fsspec/sshfs is much more consistent and generally much higher than Paramiko. I did notice sshfs does not appear to implement any read-ahead buffering when downloading, penalizing many small reads (like iterating over the lines of a remote txt file), but wrapping it with a BufferedReader resolves that. This also appears to better deal with cleanup. When reading a file using
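For anyone curious why the `BufferedReader` wrapper helps with many small reads, here is a self-contained sketch. `CountingRaw` is a hypothetical stand-in for an unbuffered remote file (it is not sshfs code); it just counts how often the underlying stream is hit, where each hit would be a network round trip on a real remote file.

```python
import io


class CountingRaw(io.RawIOBase):
    """Unbuffered in-memory stream that counts low-level read calls."""

    def __init__(self, data):
        super().__init__()
        self._inner = io.BytesIO(data)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, buffer):
        self.calls += 1  # one call stands in for one remote request
        return self._inner.readinto(buffer)


data = b"line\n" * 1000

# Iterating lines directly on the raw stream issues many tiny reads.
raw_unbuffered = CountingRaw(data)
unbuffered_lines = sum(1 for _ in raw_unbuffered)

# Wrapping the raw stream lets BufferedReader fetch large chunks instead.
raw_buffered = CountingRaw(data)
reader = io.BufferedReader(raw_buffered, buffer_size=64 * 1024)
buffered_lines = sum(1 for _ in reader)

print("unbuffered read calls:", raw_unbuffered.calls)
print("buffered read calls:", raw_buffered.calls)
```

Both loops see the same 1000 lines, but the buffered variant needs only a handful of low-level reads.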
I did open an SSH file system and used it successfully:
Then I lost the SSH connection: my notebook entered suspend mode and dropped its WLAN connectivity. After that, I got this error on `o.fs.listdir`:
I was not able to get around this error without closing and restarting the Python interactive interpreter itself. Note that the error takes a while to appear. Directly after losing connectivity, or after enabling a VPN, the `listdir` call hangs for multiple minutes before I lose patience. I tried the following, unsuccessfully:
- `o.close()`
- `o.fs.close()  # no attribute "close"`
- `fsspec.open("ssh://[email protected]").fs.listdir('/')`
- `o.fs.client.close()`. This fixes the hang, but now I get the `Socket is closed` error, which I also got after ~30 minutes (timeout?), with no way to reopen the same connection again.

I am especially stumped as to why the last one did not work. It seems that SSH connections are somehow cached at some layer.
I also looked into the paramiko API specification and found this:
The fsspec SFTP implementation never calls this `close` method on the Client. As far as I can see, the client is only opened, never closed.

I would expect all other remote file systems to have similar issues, i.e., how can I ensure that the connection is properly closed?
Edit: I noticed that there is some kind of caching going on. This would explain why simply trying to reopen the SSH mount did not work.