automatically upload a file when it is put in a given local directory #1429
Labels
No Label
0.2.0
0.3.0
0.4.0
0.5.0
0.5.1
0.6.0
0.6.1
0.7.0
0.8.0
0.9.0
1.0.0
1.1.0
1.10.0
1.10.1
1.10.2
1.10a2
1.11.0
1.12.0
1.12.1
1.13.0
1.14.0
1.15.0
1.15.1
1.2.0
1.3.0
1.4.1
1.5.0
1.6.0
1.6.1
1.7.0
1.7.1
1.7β
1.8.0
1.8.1
1.8.2
1.8.3
1.8β
1.9.0
1.9.0-s3branch
1.9.0a1
1.9.0a2
1.9.0b1
1.9.1
1.9.2
1.9.2a1
LeastAuthority.com automation
blocker
cannot reproduce
cloud-branch
code
code-dirnodes
code-encoding
code-frontend
code-frontend-cli
code-frontend-ftp-sftp
code-frontend-magic-folder
code-frontend-web
code-mutable
code-network
code-nodeadmin
code-peerselection
code-storage
contrib
critical
defect
dev-infrastructure
documentation
duplicate
enhancement
fixed
invalid
major
minor
n/a
normal
operational
packaging
somebody else's problem
supercritical
task
trivial
unknown
was already fixed
website
wontfix
worksforme
No Milestone
No Assignees
4 Participants
Notifications
Due Date
No due date set.
Reference: tahoe-lafs/trac-2024-07-25#1429
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
During the Tahoe-LAFS summit, I (David-Sarah) implemented a prototype of a dropbox-like uploader: if you write a file into a given directory, it will upload it to a given Tahoe mutable directory (with the same name as its name in the local filesystem).
Its current limitations are:
the behaviour when multiple files are put in the directory (or the same file more than once) has not been well-tested so farit doesn't have unit tests or docs (I will write those)Attachment drop-upload.darcs.patch (18988 bytes) added
Prototype implementation of drop-upload from Tahoe-LAFS summit. No tests or docs. (Corrected to include new file.)
(http://tahoe-lafs.org/trac/tahoe-lafs/attachment/ticket/1435/dependency-updates.darcs.patch) updates the dependency for Twisted to 10.1, which ensures that
twisted.internet.inotify
is available. (Note that it conflicts with drop-upload.darcs.patch. I'll rebase the latter when I have tests and docs.)In changeset:8b4082677477daf1:
Here is a review of drop-upload.darcs.patch .
First of all: yay! This feature is neat. :-)
Exception
--was it just because we might have Twisted < 10.1? We should remove any checks for if we have Twisted < 10.1 (except, of course, for the checks we do for all dependencies in source:src/allmydata/init.py).dirnode.add_file()
andUploader.upload()
and I guess it just redundantly uploads it a second time. (See also #952, which is about what happens when you upload the bytewise-identical file more than once at a time through the WAPI.) What should it do? Well, what should the user experience look like? You drag and drop a file into this directory, and then you drag and drop a new version of it one second later, and then what? Maybe the visual display of the directory should have been updated as soon as you dropped the first one to show that it is in the middle of uploading that file. Then, if you drop another version on it, the display should update to show that there is another update scheduled to happen after this one completes. So I guess we need a new ticket for a GUI frontend to this functionality (integrated into some extant GUI like Nautilus, I suppose). But at this layer, I guess what it should do is schedule another upload for as soon as this upload is finished unless another such upload is already scheduled, in which case it should do nothing.IN_CLOSE_WRITE | IN_CREATE | IN_MOVE_SELF | IN_MOVE_TO
Other than that, it looks good. It is neat how few lines of code are needed to add this functionality! :-)
Replying to zooko:
Thanks :-)
tahoe.cfg
is encoded in UTF-8, but the path wants to be insys.getfilesystemencoding()
.If there is any exception here, the node will silently fail to start. (The same is true of exceptions during initialization of the other frontends.) We should fix that in a better way -- it seems relevant to #355 and #1360 for example -- but I wanted to avoid this failure mode while I was debugging.
We should raise an exception with a proper message (and ensure that gets logged) rather than failing an assert.
Yes, it was just easier to see the events that way while prototyping. Logging to the Twisted log is sufficient, and the message could be shorter.
Yes, I think this is fairly harmless, modulo the fact that we should probably be using a more restrictive event mask.
Right, I'll open another ticket about suppressing redundant uploads.
That looks about right, although IN_MOVE_SELF should be treated differently, since it's moving the local directory away from its current path. I don't know what that should do -- probably the safest thing is to disable the drop-upload feature entirely until the node is restarted (and say "don't do that" in the docs).
Yeah,
twisted.internet.inotify
rocks!Replying to davidsarah:
Also, in
_notify
the conversion to a Tahoe Unicode filename is wrong (unicode(filepath.basename())
will use the default ASCII encoding, when it should be converting fromsys.getfilesystemencoding()
).Attachment drop-upload-2.darcs.patch (30700 bytes) added
Drop-upload frontend, with tests (but no documentation). refs #1429
The tests could do with covering a few more corner cases and error conditions, but I'm pretty sure there will be enough time for that before the beta.
This is pretty cool stuff. I'm hesitant about landing it in Tahoe core, though.. at least in its present form, it feels more like something that wants to be in a plugin, or in an extensions/ sort of directory. If it were a more complete Dropbox-ish replacment, I'd feel better about it: watching all directories under a root, handling modification of existing files not just new ones, and ideally some kind of multiple-client support. (as is, it's more like an automatic backup tool than a sync-a-virtual-directory-across-multiple-machines tool).
Can you envision maintaining its current UI for a couple years? Or do you think you'll look at it in six months and go "oh, that should really look like X instead".
Replying to warner:
We don't have a plugin mechanism. In any case, I'm not encouraged by the fate of things that were previously out-of-core, like the FUSE implementations. I'd prefer it to be in the core, tested by default, and available for all users to try out without installing anything else.
If it introduced new dependencies that would be a different matter, but it doesn't (and we'd decided that it was OK to depend on Twisted 10.1, which has other advantages).
I didn't intend it to be a sync-a-virtual-directory-across-multiple-machines tool. It just uploads things that you drop into a particular directory. Perhaps the name 'drop-upload' suggests that it is more similar to Dropbox than it is. We have time before the 1.9 release to rename it, if a better name is suggested.
It already handles modification of existing files (I'll add that to the tests). Watching all directories under a root is #1433, and is a relatively straightforward evolution of the current code.
Yes, absolutely. It'll be easy to maintain compatibility with the current UI, that's only a few lines of code and documentation. It also needs to be in core to get sufficient feedback on the UI to improve it.
Attachment drop-upload-3.darcs.patch (34494 bytes) added
attachment:drop-upload-3.darcs.patch fixes some bugs in error paths, and has better test coverage. The tests also pass on Windows now.
Attachment drop-upload-4.darcs.patch (42570 bytes) added
drop-upload: make counts visible on the statistics page, and disable some debugging. refs #1429
There are some lines > 80 characters in drop-upload-docs.darcs.patch; I'll rewrap those before committing it.
I just read over these ticket comments and I'm about to review each patch in sequence. On success I can say: "This code makes sense to me and the tests work and the docs are clear.", but I cannot say: "We should / should not include this in trunk."
I will bug repository writers in IRC about that policy decision after reviewing the ticket, but before I change "review-needed" to "reviewed" to avoid an unintended 1.9 merge.
Attachment requirements-comment-merge.darcs.patch (20736 bytes) added
Fix requirement justification comments for Twisted >= 10.1.0
There's a comment collision for drop-upload.darcs.patch in
_auto_deps.py
which causes no change, just more justifications, for the requirement "Twisted>=10.1.0". I've attached requirements-comment-merge.darcs.patch with the merge of the requirements comments.When applying drop-upload-2.darcs.patch there are merge conflicts in
./src/allmydata/client.py
,./src/allmydata/scripts/create_node.py
, and./src/allmydata/test/test_runner.py
.I'm going to punt understanding and resolving these conflicts back to davidsarah and move on to reviewing other patches.
drop-upload-2.darcs.patch is obsolete. Use drop-upload-4.darcs.patch .
The wording in requirements-comment-merge.darcs.patch isn't quite equivalent, because the inotify requirement is specific to Linux and the FTP server one is cross-platform. I think the wording as drop-upload-4.darcs.patch leaves it is fine.
I am resuming this ticket after I realized later patch attachments supercede earlier ones.
In the
DropUploader
constructor there may be a race condition between:-and later-
What happens if
self._local_path
is deleted between these lines?Would it be better to catch and handle the error of
self._local_path
not existing in the call to.watch
? (Would that call signal the error?)I do not consider this issue significant enough to require another iteration on this ticket; but a new ticket may be necessary. I haven't thought of any security problems as of yet, and only a rare usability problem.
Replying to nejucomo:
If the directory doesn't exist, the call to
watch
seems to succeed but not work as intended (even if the directory is later created). Similarly if the path points to a file rather than a directory. So theisdir()
check is necessary for proper error reporting. In general I agree that it is "Better To Ask Forgiveness Than Permission" as opposed to "Looking Before You Leap", but it doesn't seem to be possible to do that here.There is no security problem if the directory is deleted, that will just cause there to be no further notifications.
Reassigning to warner to decide whether this goes in the beta release.
I concur that this is reviewed. davidsarah answered the question about the
_local_path
race condition in IRC: The inotify interface will not indicate if the target directory does not exist.It still may be possible to remove the race condition by calling
isdir()
after the call towatch()
but this detail does not seem important enough to prevent inclusion.Proposed NEWS:
Brian mentioned on IRC that he was uncertain about including this feature in 1.9. His reasoning persuaded me that we might want to be a bit conservative about distributing this feature in a way that makes people think of it as being a fully supported feature of Tahoe-LAFS. It is new and we haven't thought about it that much, and we don't want to take on the burden of backward compatibility for this feature the way we do for all of the supported features in Tahoe-LAFS.
I think maybe we should go ahead and include this feature in Tahoe-LAFS v1.9 but mark it as an experimental feature which we don't necessarily commit to supporting at this time.
Having a new feature in a release seems like a great way to learn about whether we want to promote it to a fully supported feature in a future release.
If this is agreeable to everyone, we should make sure to flag it as experimental in the release announcement and docs.
(http://tahoe-lafs.org/trac/tahoe-lafs/attachment/ticket/1431/drop-upload-docs-including-windows.darcs.patch) is an alternative docs patch if the Windows implementation is included in 1.9.
Attachment drop-upload-docs.darcs.patch (35586 bytes) added
Documentation for drop-upload frontend. refs #1429
Attachment drop-upload-5a.darcs.patch (50509 bytes) added
Drop-upload frontend, rerecorded for 1.9 beta (and correcting a minor mistake). Includes some fixes for Windows but not the Windows inotify implementation. fixes #1429
In changeset:32a7717205ed824a:
In changeset:667b086b59ee37d3:
In changeset:08af9cea50c3c3cd:
Attachment rename-upload-uri-to-dircap.darcs.patch (37632 bytes) added
drop-upload: rename the 'upload.uri' parameter to 'upload.dircap', and a couple of cleanups to error messages. refs #1429
Attachment 1429-rerecord-and-add-test-of-absent-localdir.darcs.patch (39023 bytes) added
Reviewed attachment:rename-upload-uri-to-dircap.darcs.patch. There are some issues. I'm attaching patches that fix a couple of them but not all.
Out of curiosity I rerecorded the patch with
darcs record
; this helped me find issues in the patch; see below. The resulting patches are in the latest attachment -- they redo the renaming withdarcs replace
and would be an acceptable variant to commit to trunk (also a suitably fixed version with hunks instead ofdarcs replace
would be acceptable to me).issues fixed by the latest attachment:
upload_uri
toupload_cap
, the way this patch changed such names elsewhere. This is changed by thedarcs record
rename in the attached patches.FilePath.is_dir
as if it will returnFalse
when then thing doesn't exist or the thing is a non-directory, butFilePath.is_dir
actually raises exception when the thing doesn't exist. The attached patches fix it to useFilePath.exists
and change it to report those two cases separately.issues not fixed:
Client.init_drop_uploader
isn't exercised by unit tests.Specifically it is the code inside the "if drop-upload is configured" block in
Client.init_drop_uploader
. The unit tests instantiate aDropUploader
instance directly, but don't test what happens when you instantiate aClient
instance which is configured to create aDropUploader
.I use the following bash script (executed by emacs) to generate code coverage results for just
test_drop_upload
:In changeset:720bc2433b9bd16d:
In changeset:5633375d267e5728:
In changeset:b7683d9b83a23cdd:
In changeset:612abca271703508:
In changeset:369e30b1dfa2b77f:
In changeset:f157b733676b33f5:
In changeset:10ee22f50e5bdfd3:
In changeset:c102056ac1df1784:
In changeset:db22fdc20dc93a3e:
In changeset:ab9eb12f7006322f:
In [5148/ticket393-MDMF-2]:
In [5149/ticket393-MDMF-2]:
In [5150/ticket393-MDMF-2]:
In [5155/ticket393-MDMF-2]:
In [5156/ticket393-MDMF-2]:
In [5157/ticket393-MDMF-2]:
In [5158/ticket393-MDMF-2]:
In [5159/ticket393-MDMF-2]:
In [5160/ticket393-MDMF-2]:
In [5161/ticket393-MDMF-2]:
In [5162/ticket393-MDMF-2]:
In [5163/ticket393-MDMF-2]:
In [5167/ticket393-MDMF-2]:
Although this was auto-closed for the wrong reason (a commit of the original patch on a branch), it is in fact fixed.
just to be clear, this made it into 1.9.0 (released yesterday). It's labeled as "experimental", which means we aren't committed to supporting it long-term yet, and it may get pulled out if it misbehaves :).