make "tahoe backup" avoid "piling up" if the backup job takes longer than the period #2053

Open
opened 2013-08-07 22:42:53 +00:00 by zooko · 4 comments

<myers> the backup cronjob should be built to not pile up as I'm guessing it's going to take more than 24 hours for this first run to finish

<myers> and if you have that built in you should put that in the documents
zooko added the unknown, normal, enhancement, 1.10.0 labels 2013-08-07 22:42:53 +00:00
zooko added this to the undecided milestone 2013-08-07 22:42:53 +00:00
daira commented 2013-08-08 11:37:48 +00:00
Owner

The tricky design decision here is how to detect that another backup is running (since they will be in different processes, currently). I can think of several possibilities:

  1. Move the backup job into the gateway.
  2. Have a longer-running process (perhaps the gateway but not necessarily) schedule backups and ensure mutual exclusion.
  3. Have the backup process acquire a lock (perhaps just by opening the backupdb in exclusive mode) while it runs.

3 seems simplest starting from where we are.
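
To make option 3 concrete, here is a minimal sketch of a non-blocking lock that a backup run could take before doing any work. It is an illustration only, not how `tahoe backup` currently behaves: the function name `try_acquire_backup_lock` and the idea of a separate lock file are assumptions, it uses `fcntl.flock` (Unix-only) rather than opening the backupdb in exclusive mode, and it is written for Python 3.

```python
import fcntl


def try_acquire_backup_lock(lock_path):
    """Hypothetical helper: return an open file object that holds an
    exclusive advisory lock on lock_path, or None if the lock is
    already held elsewhere (for example by an earlier cron run)."""
    lock_file = open(lock_path, "a")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting,
        # so a second cron invocation can simply exit.
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        lock_file.close()
        return None
    return lock_file
```

A caller would keep the returned file object open for the whole backup and exit early when it gets `None`. The lock is released when the file is closed or the process dies, so a crashed backup should not leave a stale lock behind.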

Author

See also #2062.

Author

We've been assuming that the user is backing up the *same* directory multiple times. In that case it makes sense to abort or delay the second backup until the first one completes. But in #2285, a real live user is hitting a bug because he's running `tahoe backup` concurrently on two *different* directories. That use case, which we apparently hadn't considered before, should probably not force one of the backups to wait until the other has completely finished before it begins!
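
One way to keep backups of different directories from blocking each other would be to derive the lock name from the source directory, so that only runs on the same directory contend. The sketch below is hypothetical: `lock_path_for`, the `lock_dir` parameter, and the file naming scheme are all made up for illustration, and it only addresses lock granularity, not whatever is behind the bug in #2285.

```python
import hashlib
import os


def lock_path_for(source_dir, lock_dir):
    """Hypothetical helper: map a backup source directory to a lock
    file path, so each source directory gets its own lock."""
    # Resolve symlinks and normalize, so "/a/b" and "/a/b/" agree.
    canonical = os.path.realpath(source_dir)
    digest = hashlib.sha256(os.fsencode(canonical)).hexdigest()[:16]
    return os.path.join(lock_dir, "backup-%s.lock" % digest)
```

With this scheme, two runs on the same directory map to the same lock file, while runs on unrelated directories get distinct lock files and can proceed concurrently.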

daira commented 2014-08-28 15:14:08 +00:00
Owner

Well, it's quite possible that the directory trees overlap (and if we follow symlinks then we won't know this in advance of starting the second backup).
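
For the simple case where one backup root is nested inside the other, the overlap can be checked up front. The sketch below uses a hypothetical helper name (`trees_overlap`) and assumes POSIX-style absolute paths. It resolves symlinks only on the two root paths themselves, so it does not catch a symlink deeper inside one tree that points into the other, which is the case described above that can't be known in advance.

```python
import os


def trees_overlap(dir_a, dir_b):
    """Hypothetical helper: return True if one directory is the same
    as, or nested inside, the other after resolving the root paths."""
    a = os.path.realpath(dir_a)
    b = os.path.realpath(dir_b)
    # commonpath compares whole path components, so "/x/a" and
    # "/x/ab" are not mistaken for parent and child.
    common = os.path.commonpath([a, b])
    return common == a or common == b
```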

Reference: tahoe-lafs/trac-2024-07-25#2053