broken pipe with lbsvc strategy and --no-inc-recursive when re-syncing #320
Comments
It is hard to remember exactly, but I believe the reason I used But I agree that there can be use cases where it is undesired. I don't know what the progress bar would look like without that flag. An easy solution could be to not set those two flags when So let me think about it - maybe exposing the whole rsync command as a raw flag and allowing it to be overridden would be a good idea, or something like that. Suggestions are welcome.
I think I can add a new flag for advanced users, --rsync-command-override, to allow overriding everything. If it is specified, other high-level flags like --delete etc. would be ignored.
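The precedence described here (a raw override wins, high-level flags are ignored) could look roughly like the sketch below. It is only an illustration, not pv-migrate's code: `build_rsync_command` and its parameters are hypothetical names, and the default arguments are placeholders rather than the flags pv-migrate actually passes.

```python
import shlex
from typing import List, Optional


def build_rsync_command(
    src: str,
    dest: str,
    delete: bool = False,
    override: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: return the rsync argv to execute."""
    if override is not None:
        # A raw override replaces everything; high-level options
        # such as `delete` are deliberately ignored in this branch.
        return shlex.split(override)

    args = ["rsync", "-a"]  # placeholder defaults, assumed for illustration
    if delete:
        args.append("--delete")
    args += [src, dest]
    return args
```

With this shape, `build_rsync_command("a/", "b/", delete=True, override="rsync -rv /x /y")` returns only the overridden command, so a user who overrides the command is responsible for every argument, including source and destination.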
@utkuozdemir the entire rsync command can already be overwritten via helm, can't it? The problem with this approach combined with load balancer strategy is that you can't know the load balancer hostname upfront.
Why don't you do it the other way around? add By the way, I tried running rsync with
Interesting update: Is it possible to overwrite ssh options?
Seems not: https://github.com/utkuozdemir/pv-migrate/blob/master/rsync/cmd.go Your best bet probably is to do your changes in the code and build your own binary for now.
Simply because these kinds of requests (specifically rsync command customization, actually) come up a lot, and I want to address this and any future issues in one go. In a similar way, in early versions of pv-migrate, the manifests it installed were not helm-based, so I had to add a new toggle/flag for each and every new use case. When I migrated it to helm, it addressed all future requests in a generic way. I aim to get the same effect for the rsync args, so it'll never be an issue again :)
@utkuozdemir the root cause is related to the default AWS load balancer idle timeout (60s) versus the time the ssh session remains idle while rsync calculates files to be transferred with pv-migrate configures the ssh server with the following:
This is greater than the ELB idle timeout setting and doesn't help with my case. What helps is to set Another option is to increase the ELB idle timeout. I also confirmed that changing the ELB timeout to a higher value works without tweaks on the ssh client. I set it to 330 (slightly higher than the ssh server ClientAliveInterval of 300).
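The timing argument can be written down as a small check. This is a simplified model, assuming that any keepalive packet resets the load balancer's idle timer; the function name is hypothetical and the numbers are the ones quoted in this comment.

```python
def keepalive_beats_idle_timeout(keepalive_interval_s: int, idle_timeout_s: int) -> bool:
    """Return True when a keepalive is sent before the load balancer's
    idle timer would expire (simplified model, see assumptions above)."""
    return keepalive_interval_s < idle_timeout_s


# Keepalive every 300s against the default 60s idle timeout:
print(keepalive_beats_idle_timeout(300, 60))   # False
# Same keepalive after raising the idle timeout to 330s:
print(keepalive_beats_idle_timeout(300, 330))  # True
```

Either side of the inequality can be changed: shorten the keepalive interval below the idle timeout, or raise the idle timeout above the keepalive interval.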
Describe the bug
I'm trying to migrate data between clusters. The source and dest PVCs use EFS storage class and contain hundreds of thousands of small files. The initial sync operation worked fine with
However, when I try to re-sync with the same command, rsync keeps failing with broken pipe messages.
I believe it could be due to SSH timeouts combined with no incremental recursion. If I manually try to sync with incremental recursion, it works just fine. The problem is that apparently, --no-inc-recursive can't be dynamically disabled unless you overwrite the entire rsync command, which doesn't seem to be possible with load balancer strategy as you don't know the load balancer hostname upfront.
By the way, what's the purpose of using --no-inc-recursive by default? Afaik rsync's default is to use incremental recursion. It splits the file list into manageable chunks, reducing the memory and network pressure. It’s the default for a reason... the way I see it, unless there's a very specific need for --no-inc-recursive, it’s generally better to avoid it (especially when dealing with thousands of small files).
Maybe it makes sense to change this default behavior with an option to overwrite it?
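One way the requested option could be shaped is a single toggle that decides whether the flag is emitted at all. This is a sketch only: `recursion_args` and its `incremental` parameter are hypothetical, and the default of `False` simply mirrors the behavior described in this report.

```python
from typing import List


def recursion_args(incremental: bool = False) -> List[str]:
    """Hypothetical helper: rsync arguments controlling incremental recursion.

    rsync recurses incrementally unless told otherwise, so enabling it
    means emitting no flag; disabling it means passing --no-inc-recursive.
    """
    return [] if incremental else ["--no-inc-recursive"]
```

The returned list would be spliced into the rest of the rsync arguments, so a user with many small files could opt back into rsync's own default without replacing the whole command.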
To Reproduce
Steps to reproduce the behavior:
Using the load balancer strategy, sync a pvc containing hundreds of thousands of small files and then attempt to re-sync the changes.
Expected behavior
I expected a subsequent rsync operation to work as the 1st one did.
Console output
Version
, dest k8s version:
v1.29.10-eks-7f9249a`containerd://1.4.4-k3s2
,docker://19.3.6
]ReadWriteMany, 8G, kubernetes.io/gce-pd -> ReadWriteOnce, N/A, rancher.io/local-path