Weekly snapshots not being retained #1094
Comments
I found a potential fix for this problem. I added some "logger.debug" calls and changed the cron table entry to create log files so that I could see what was happening. The function "smart_remove_keep_first" keeps the newest snapshot in the specified date/time range, since the snapshots are sorted in descending order of date/time. I added a function "smart_remove_keep_last" that reverses the order of the snapshots and thus keeps the oldest snapshot. In the code in the previous post, I replaced the call to "smart_remove_keep_first" with a call to "smart_remove_keep_last". So far it seems to be working, but it needs further testing and analysis. If this is actually a bug, I don't understand why it hasn't been reported before. Can someone else review the code/algorithm and provide an opinion?
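The difference between the two selection strategies described in this comment can be illustrated with a minimal sketch. This is not the code from common/snapshots.py: snapshots are modeled as plain datetime values, and the function names and signatures below are simplified stand-ins chosen for illustration.

```python
from datetime import datetime


def keep_first(snapshots, min_date, max_date):
    """Return the first snapshot in list order with min_date <= s < max_date.

    With a list sorted newest-first, this is the newest snapshot in the range.
    """
    for snap in snapshots:
        if min_date <= snap < max_date:
            return snap
    return None


def keep_last(snapshots, min_date, max_date):
    """Run the same search on the reversed list, so the oldest snapshot wins."""
    return keep_first(list(reversed(snapshots)), min_date, max_date)


# Hypothetical snapshots, sorted in descending order of date/time.
snapshots = [
    datetime(2024, 1, 7, 12, 0),
    datetime(2024, 1, 5, 12, 0),
    datetime(2024, 1, 2, 12, 0),
]
start, end = datetime(2024, 1, 1), datetime(2024, 1, 8)

print(keep_first(snapshots, start, end))  # 2024-01-07 12:00:00
print(keep_last(snapshots, start, end))   # 2024-01-02 12:00:00
```

Under these assumptions, both functions see the same date range and differ only in which end of the range they prefer.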
This can be fixed with a one-line change in common/snapshots.py: In But it would probably be "cleaner" to rename
I have created a testbed with the following shell code in a VM: In extreme cases, it appears to work as expected: Keeping "one snapshot per week for the last 52 weeks" really does leave one year's worth of weekly snapshots. However, I'm not so sure about shorter timespans. Calculating dates is hard! My current guess is that we may be dealing with an off-by-one problem here, where a setting of "one snapshot per week for the last n weeks" actually keeps one only for the last n–1 weeks. This would be especially significant for the default configuration of:
I once had a tool called
I think I'm very close to cracking this case. In this function: backintime/common/snapshots.py Lines 1686 to 1696 in 22f468c
… smartRemoveKeepFirst is called with a max_date of d + datetime.timedelta(days=8), not days=7!
Therefore, in a routine that says "keep only the youngest snapshot of these 8 days to consider", it will also delete the youngest snapshot of the previous week, but only if it's caught by the last in a series of "keep one per week" calls. In effect, of "one snapshot per week for the last n weeks", the n-th week has its youngest snapshot thrown away, even though it shouldn't. I'll need to investigate if changing the call of P.S.: The official soundtrack for this bug is Only For The Week.
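The effect of the extra day can be shown with a small sketch. It assumes a "keep the newest snapshot in the window" rule and a half-open window that starts at the beginning of a week. The helper name and the dates are hypothetical, and snapshots are reduced to plain dates rather than real snapshot IDs.

```python
from datetime import date, timedelta


def newest_in_window(snapshots, start, days):
    """Return the newest snapshot with start <= s < start + days, or None."""
    end = start + timedelta(days=days)
    for snap in sorted(snapshots, reverse=True):
        if start <= snap < end:
            return snap
    return None


# Hypothetical start of the oldest week that should still be considered.
week_start = date(2024, 1, 7)
# One snapshot inside that week, one on the first day of the following week.
snapshots = [date(2024, 1, 14), date(2024, 1, 10)]

print(newest_in_window(snapshots, week_start, 8))  # 2024-01-14
print(newest_in_window(snapshots, week_start, 7))  # 2024-01-10
```

With an 8-day window, the snapshot chosen for the oldest week actually lies in the following week, so the snapshot from 2024-01-10 is not protected by any window. With 7 days it is the one that is kept.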
There are some unit tests in
It occurred to me that what makes the "smart remove" behavior confusing is that the snapshots are named using only the timestamp. The names don't contain "hourly", "daily", "weekly", etc. I wonder if adding an index file that identifies daily, weekly, monthly and yearly snapshots would help. Or maybe I'm wrong. :)
I'm confident that I can fix this bug, and to make sure I don't introduce any new ones, I'm doing some manual testing in a VM. I'm totally unfamiliar with automatic testing, though. I'll take a look, but I'd rather not put this bugfix on the "waiting list" until I've cracked the automatic testing of Smart Remove, because that would probably take a very long time ;)
I see what you mean, but BiT doesn't work like that. Snapshots are not created as "weekly" or "monthly" snapshots. In fact, they are only characterized by their date+time of saving (plus a random three-digit number, which is called the "tag"). Together, this makes a "Snapshot ID" (SID for short): The only time when a snapshot might be characterized as "weekly" or "monthly" is when the Smart Remove routine determines: "this snapshot should be kept, because it's the only one from that week/month, and it should be kept according to the removal rules". Therefore, any single snapshot can start out as an "hourly" snapshot, then become a "weekly" one, and maybe even a "monthly" one later.
I know that. To figure out the workaround I posted above, I added logging code in the smart-remove processing to log the "kept" and "removed" snapshots, so I could see what was happening.
I see, and I've also understood why your workaround is successful. It's hard to put into words ;) What happens is that the code considers snapshots from 8 days (instead of 7) when determining which to keep for a specific week. But in a one-snapshot-per-week environment, 8 days will likely contain two snapshots. Since This has no consequences as long as the previous week is also considered before deletion. But it fails for the last of the weeks to be considered (resulting in the n–1 situation I described above). Your workaround reverses the list of snapshots, so in the "8-day-week" that is the last to be considered, the older of the two snapshots is kept, and your problem disappears. However, it is my understanding that the problem disappears completely when
After verifying that the bug was still present in version 1.4.3 (the latest version available for Ubuntu 24.04), I'm testing your fix (days=8 -> days=7) by making that one-line change in that version. It will take a few more weeks. I want to verify that the monthly snapshots are retained too.
Cool! Real time testing is the best. 😄
Ideally, I'd like to write a simulator for the code in snapshots.py that could do the testing in seconds. But I'm too lazy and senile to accomplish that now. :)
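A toy version of such a simulator might look like the sketch below. It does not reproduce the logic in snapshots.py. It assumes one snapshot per day, applies only a "keep one per week" rule, anchors each window on the Monday of the week, and keeps the newest snapshot per window. All names and parameters are hypothetical.

```python
from datetime import date, timedelta


def weekly_keep(snapshots, today, weeks, window_days):
    """Keep the newest snapshot in each of the last `weeks` weekly windows."""
    keep = set()
    # Assumption: windows start on the Monday of the current week.
    start = today - timedelta(days=today.weekday())
    for _ in range(weeks):
        end = start + timedelta(days=window_days)
        in_window = [s for s in snapshots if start <= s < end]
        if in_window:
            keep.add(max(in_window))
        start -= timedelta(days=7)
    return keep


def simulate(days, weeks, window_days, first_day=date(2024, 1, 1)):
    """Take one snapshot per day and prune after every run."""
    snapshots = set()
    for offset in range(days):
        today = first_day + timedelta(days=offset)
        snapshots.add(today)
        snapshots = weekly_keep(snapshots, today, weeks, window_days)
    return sorted(snapshots)


# 71 daily runs starting on Monday 2024-01-01, so the last run is on a Monday.
print(len(simulate(71, weeks=4, window_days=7)))  # 4
print(len(simulate(71, weeks=4, window_days=8)))  # 3
```

In this toy model, the run on a Monday leaves four snapshots with a 7-day window and only three with an 8-day window. Whether the real algorithm behaves the same way depends on details that the sketch leaves out, such as time of day and the other retention rules.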
Hello guys, I was not able to reproduce the problem in my testing. I observed the actual behavior in the context of "keep one snapshot per week for the last N weeks".
It's been 4-1/2 years since I posted the original problem, and I don't remember the exact details. The fix I posted above works for me. I'm going to use the released version of BackInTime with my fix applied.
Thank you for the reply. It would be nice to have example dates to reproduce the problem.
Ideally, the relevant code should be simple and clear enough that its correctness, or lack of correctness, is obvious. If code is too complex to understand easily, then it's too complex. But I'm not volunteering to rewrite it. ;)
Hello Dave, going by your gut feeling, does the original "keep one per N weeks" setting include the current, still incomplete week or not? From your perspective as a user, what would you expect?
It seems my assumption was not correct. The algorithm tries to take the current week into account as well. But that does not work in all situations. I am on it...
Since Oct 10, I've been using BackInTime 1.4.3 with the one-line change in this PR manually applied: BackInTime was run hourly, with the same Auto-remove settings as in the screenshot I posted above on Mar 20. Monthly and hourly snapshots are being retained, but I think that the number of weekly backups retained is one less than it should be. For that reason, I recently changed the weekly setting to: "Keep one snapshot per week for the last 5 weeks". If there is ambiguity in the meaning of the setting, I think that a user would prefer to retain one more, rather than one less, backup. You can remove an extra backup; you can't restore a deleted one. Here is the current list of snapshots: The same issue may apply to the daily snapshots. It seems like the number retained is smaller than it should be. When I was using BackInTime 1.2.1, with the patch I posted above on Sep 25, 2022, I didn't observe any problems.
On reflection, I consider "keep one for N weeks" to mean:
Is 18. to 24. the "Last week"?
In my opinion, "last week's backup" is the snapshot taken approximately seven days ago. It's hard to define precisely. It's like art. It's hard to define, but you know when you like it.
I see, you define "a week" as a timedelta and not as a calendar element starting on Monday.
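The two readings of "a week" mentioned here can give different answers for the same snapshot. The sketch below contrasts them. Both helper functions are hypothetical and are not taken from BackInTime.

```python
from datetime import date, timedelta


def weeks_ago_rolling(snapshot, today):
    """Week index when a week is a 7-day timedelta counted back from today."""
    return (today - snapshot).days // 7


def weeks_ago_calendar(snapshot, today):
    """Week index when a week is a calendar week starting on Monday."""
    monday_today = today - timedelta(days=today.weekday())
    monday_snap = snapshot - timedelta(days=snapshot.weekday())
    return (monday_today - monday_snap).days // 7


today = date(2024, 3, 13)     # a Wednesday
snapshot = date(2024, 3, 10)  # the Sunday before

print(weeks_ago_rolling(snapshot, today))   # 0: only three days old
print(weeks_ago_calendar(snapshot, today))  # 1: previous calendar week
```

Which reading a retention setting uses determines whether a three-day-old Sunday snapshot counts as "this week" or "last week".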
In this context, I suppose so. I haven't given the matter much thought.
I'm using the default values for Smart Remove. Hourly and daily snapshots are kept or deleted as expected, but no weekly snapshots are kept.
This seems to be the relevant code in /usr/share/backintime/common/snapshots.py:
I'm trying to figure it out, but I thought I'd ask first if there's a known issue.