consider changes to webapi "Move" API before release #1732
Labels
No Label
0.2.0
0.3.0
0.4.0
0.5.0
0.5.1
0.6.0
0.6.1
0.7.0
0.8.0
0.9.0
1.0.0
1.1.0
1.10.0
1.10.1
1.10.2
1.10a2
1.11.0
1.12.0
1.12.1
1.13.0
1.14.0
1.15.0
1.15.1
1.2.0
1.3.0
1.4.1
1.5.0
1.6.0
1.6.1
1.7.0
1.7.1
1.7β
1.8.0
1.8.1
1.8.2
1.8.3
1.8β
1.9.0
1.9.0-s3branch
1.9.0a1
1.9.0a2
1.9.0b1
1.9.1
1.9.2
1.9.2a1
LeastAuthority.com automation
blocker
cannot reproduce
cloud-branch
code
code-dirnodes
code-encoding
code-frontend
code-frontend-cli
code-frontend-ftp-sftp
code-frontend-magic-folder
code-frontend-web
code-mutable
code-network
code-nodeadmin
code-peerselection
code-storage
contrib
critical
defect
dev-infrastructure
documentation
duplicate
enhancement
fixed
invalid
major
minor
n/a
normal
operational
packaging
somebody else's problem
supercritical
task
trivial
unknown
was already fixed
website
wontfix
worksforme
No Milestone
No Assignees
3 Participants
Notifications
Due Date
No due date set.
Reference: tahoe-lafs/trac-2024-07-25#1732
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
I just finished reviewing and landing #1579, and it looks nice. One thing we should probably consider though, before committing to the API, is whether we're happy with the split "subdirname-vs-dircap" API. In the current code, you can either move a child to a subdir by name:
or to a (possibly unrelated) subdir by its dircap:
That feels a bit weird. I'm kind of thinking that we should accept
to_dirname=
orto_dircap=
(and throw an error if you provide both), instead of switching the interpretation ofto_dir=
according to the presence of thattarget_type=
argument.thoughts? note this is a blocker for 1.10 (or whatever release is next done on trunk), since we don't want to wind up supporting the wrong API forever.
I agree that
to_dirname=
/to_dircap=
is a better API thanto_dir=&target_type=
.I agree as well (in general I don't like overloading of the interpretation of one argument based on another argument, in any API).
Should
to_dircap
accept a path after the dircap? Note that supporting that would allow changingtahoe mv
to use this API, which is a prerequisite to fixing #943 in the way proposed in /tahoe-lafs/trac-2024-07-25/issues/6005#comment:-1 (since that fix depends on knowing the path that was used in thetahoe mv
command to specify the destination). If it does accept a path then it should be called something other thanto_dircap
.Unix
mv
can rename a file at the same time as moving it. Should there be ato_name
argument (defaulting tofrom_name
)?[fix wrong bug number for #943]edit:
This looks good; the current API is simply a result of the URI target functionality being tacked onto the subdir option. I like warner's API better.
Think I'll change this from "defect" to "enhancement" because my API did in fact work, even if not optimal :p
Milestone: 1.10.0 is correct because the 'Move' feature is only in trunk, not on the 1.9.2 branch.
I'll try to write a patch for this over the weekend.
Replying to davidsarah:
There this thing that we use a lot in the WUI/WAPI: a dircap optionally followed by a list of childnames. The list of childnames is clearly a unix (relative) path. But there's no word for "dircap+relpath". I think we need to coin a word for it. I can't think of anything clever and clear, so how about something awkward and explicit and clear, like "dircappluspath"?
Hm, or actually I think we can extend the vocabulary of unix here. Unix already has "absolute paths" and "relative paths". How about if we call these things "rooted paths"?
I like "rooted path" because it's consistent with the use of "rooted" in graph theory.
Great! I think "rooted path" is a good word for this. So in the WAPI, we currently have:
And the proposal is to change it to:
and
?
Would
to_relative
andto_rooted
be better names for the parameters? (to_dirname
seems inaccurate for the case where the target is in the same directory as before.)Actually, if we allow rooted paths, is it any longer necessary to have distinct parameters? A relative path and a rooted path can be distinguished exactly as they are by the CLI.
(except for not allowing aliases)
Replying to davidsarah:
Well, here is the form that uses a rooted path:
The only use of the other form --
POST /uri/$DIRCAP?t=move&from_name=OLD&to_dirname=SUBDIRNAME
would be to more succinctly express when you want the "to_dirname" to be relative to$DIRCAP
instead of relative to a (different) root. So, I guess it is a tiny optimization and not worth the added complexity of API, IMO.So I think David-Sarah is right, and a single API (
POST /uri/$DIRCAP?t=move&from_name=OLD&to=NEWROOTEDPATH
) is sufficient.Rooted path seems to be the correct behavior that we would want to use for a programmatic interface. The only reason to not require the rooted path would be to protect humans from another DIRCAP or a copy paste error. Because we're not trying to protect humans from this, there was agreement on the weekly dev chat that this is the technically correct approach.
We made sure that the WUI does not display or use a move button in 1.9.x, so there are no supported WUI's out there that would be broken. This means that there shouldn't be anybody left in the cold because of a change to this API.
My one thing that I think I understand but want to explicitly state is:
Thanks a lot for the review, ClashTheBunny! I think your 5-point rules for handling this API call are right. Brian: does that look like a good API to you? Reassigning this ticket to Brian.
One more question, too: I earlier, in comment:16, wrote
POST /uri/$DIRCAP?t=move&from_name=OLD&to=NEWROOTEDPATH
Should it actually be:
POST /uri/$ROOTEDPATH_TO_DIR?t=move&from_name=OLD&to=NEWROOTEDPATH
? That is: does the caller have to supply the actual dircap of the original (source) directory, or can they use a rooted path to that directory?
The rooted path is the most general representation of a file or directory, as it could just be a dircap or a dircap with subdirectories. As such, the correct representation is as zooko mentions:
It then adds this slight complication to my 5 point list above:
I was thinking that zooko's suggested change might be useful to fix #943 (as suggested in /tahoe-lafs/trac-2024-07-25/issues/6005#comment:-1), but it isn't actually necessary -- knowing the full path to the destination is sufficient to fix that ticket. OTOH, we agreed in the Dev Chat that this change was a good one anyway, just for generality and convenience.
Replying to ClashTheBunny:
Hey that was a good comment.
Proposed doc:
This includes the "replace=only-files" option to avoid accidentally clobbering a directory, as discussed on the Dev Chat.
Note that this description doesn't define or use the term "rooted path". I've opened #1929 for that.
Improvements to clarify that metadata is preserved and the crash/failure behaviour.
Some possible improvements included below:
The correct response code would be 404 Not Found if the source doesn't exist, or 403 Forbidden if it is a file. (You could also make a case for 418 ;-)
We agreed in the dev chat of 2013-03-14 that this API won't support an implicit final name, so you can't say
POST /uri/$OLDDIRCAP/$OLDNAME/?t=move&from_name=$OLD&to=$NEWDIRCAP/$NEWNAME/
to mean that the new name should be$NEWDIRCAP/$NEWNAME/$OLD
. This is something that unixmv
supports ("mv foo/bar baz/
"), and tahoe mv might or might not support it, but this API won't. The reason not to is that if it did, then it becomes harder to disambiguate if the user means to add to the directory$NEWDIRCAP/$NEWNAME
a link under the name$OLD
, or to add to the directory$NEWDIRCAP
a link under the name$NEWNAME
.Now I'm confused about why we spell it
$CAP/$NAME/&arg=$FINALNAME
in the source but$CAP/$NAME/$FINALNAME
in the destination? It's not consistent. In the interests of consistency, how about one of these two APIs:or
Between these two, the first one has the advantage of being less wordy.
I can think of two arguments for the second alternative:
a) Consistency with
?t=rename
.b) The object modified by the POST is
$DIRCAP/SUBDIRS../
.$DIRCAP/SUBDIRS../OLD
is not modified (unless it happens to be aliased to the source or destination directory).a) is not a strong argument; I would prefer conciseness over consistency with
?t=rename
.b) is a stronger argument, although the operation also modifies the destination directory which is not the object of the request. Still, I suppose this does mean that the second alternative is more consistent with REST~~, so I'm -0 on switching to the first alternative~~. I'm not sure which is more important, that or internal consistency.
BTW,
?t=move
has the potential to leak filenames and/or cap URIs into logs, similar to?t=rename
and?t=rename-form
in #1904.Talking with zooko at pycon, we think the following might be best:
Zooko's first spelling (
.../OLD?t=move&to=...NEW
) would be nice(r), but the webapi code uses twisted.web Resource traversal to find the dir/file object referenced by the URL, and to accomodate.../OLD
) we'd need to change that code to remember the penultimate Resource (the directory), which would be a lot of work. (parsingto=
is easier: we could just regexp off the last slash).Replying to warner:
Ah yes, I see the consistency argument for that. In that case perhaps
to_name
should default tofrom_name
after all.That seems like a compelling argument. So the choice is now between comment:25 and comment:88441.
Oh, I've had another idea. What if we just generalize
t=rename
so that it can take an optionalto_dir
parameter? I.e.:IRC conversation, edited for formatting:
davidsarah: did you see my suggestion on #1732 ?
warner: yeah
warner: I'm uncertain about changing t=rename. Basically t=rename currently addresses a directory, and tells it to do something to itself. It's a single atomic operation (well, as atomic as changing any mutable file)
davidsarah: well, the operation we want is a strict generalization of rename
warner: whereas t=move addresses one directory and tells it to talk to a (possibly different) directory
warner: from a filesystem point of view, yeah
davidsarah: but only possibly different
warner: but in terms of dirnodes, that generalization would increase the number of actors from one to two
davidsarah: if the destination directory for t=move is the same as the source, then it should be atomic (from the filesystem point of view)
warner: true
davidsarah: so the complexity is still there
warner: yeah. I guess I think of a lot of the webapi in terms of speaking to a specific node and telling it to do something
davidsarah: basically the only difference between t=rename in my suggestion and t=move in (@@https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1732#comment:88441@@) is the defaults for to_dir and to_name. I.e. to_dir defaults to the source dir, and to_name defaults to from_name
warner: from the outside, yeah
davidsarah: I'm all about eliminating redundancy :-)
warner: but the current t=rename is quasi-atomic (there's no way to wind up with multiple copies of the child), and that would change it to be sometimes non-atomic (if interrupted, it might add to the target but not remove from the original)
warner: I dunno
davidsarah: [would be non-atomic]It only if to_dir is provided
warner: I guess we could argue that having a t=move with the proposed syntax means it'd be cleaner to modify t=rename instead
davidsarah: besides, there was a suggestion to deprecate t=rename if we added t=move
warner: hm. ok, so I think "rename" shouldn't be used to move something long distances, whereas "move" is a fine verb for that. And I guess my attitude suggests that, if we only have one command, then it should be "move"
davidsarah: hmm. I don't place much significance on the operation name
warner: I dunno. [...] let's pester zooko to think about this too
davidsarah: well, we don't have the option of adding move and immediately removing rename... if we want to retain backward compatibility
warner: yeah, true. So if we keep rename around, then it suggests that "move" should be something different
davidsarah: I think we need Zooko as a casting vote :-) (it is actually quite useful having 3 of us deciding these things)
davidsarah: here's another advantage of my suggestion: the existing tests for t=rename will test the same-directory case, and so the new tests only need to cover the different-directory case
marking "critical" to indicate this is a blocker for 1.10
I mentioned to zooko this morning that I don't think I'm picky about this API, and am happy go along with whatever everyone else likes. But then I thought of two constraints that I think are important:
t=rename
must stick around)The latter constraint precludes adding some new argument to
t=rename
that would change behavior, because then a new client (who knows about the new argument) who submits such a request to an older gateway (who ignores the new argument) will get a very different behavior than they were expecting. It's the same reason that python functions reject unrecognized keyword-arguments: the use of kwargs allows new callees to add new kwargs over time, but still get a clear error when a new caller calls an older callee, instead of silently ignoring the new arg and doing something completely different.So I guess that means I am picky about it, and I'm voting for a new
t=move
. :)Zooko and I talked about this during the dev talk and I wanted to share my two points.
I first felt that not having the full rooted cap to the file to be moved was punting in that it was hard to implement the other way, so we would do it this way. Zooko helped me understand that the real operation happens at the directory, in that the directory is a file allocation table that is being edited. He told me about the history with the command line 'rm' being confusing in that it doesn't remove a file, but just removes the entry in the directory's file allocation table. Because of this, I'm good with the new verbage of doing a move operation on the containing directory. It's also a win for people who want to just replace "rename" with the new API and things will just work.
The second thing we talked about is that his new API should really be called "relink" in that it's not operating on (moving) a file, but it's creating a link for a file in a new place, and then deleting the old link (or editing the link in the current directory). That brings me to my proposal for a slight alteration to comment 30's definition of the new API:
This way people understand that the operation doesn't change a file in any way, but it actually edits links to the file. It operates on the directory, and not the file too, which is reflected in this. It also leave rename the same, but allows for people who want to update to the new API from 'rename' to just replace 'rename' with 'relink'.
I'm cool with that.
Replying to warner:
[...]
Oh, I am dense for not seeing that. OK, that rules out just generalizing
t=rename
. +1 for calling itt=relink
.OK, I think this is what we have now:
(I just added the bit about #943.)
I believe from_name is always required, which would be nice to include in the text. Maybe the first paragraph could look like this:
and again as a quote:
But even without that, I think this is great. +1
I've thought about this. I'm +1 on comment:88452 and comment:40, with the following comments:
to_name
could be optional, and if omitted it defaults tofrom_name
. The reason we earlier agreed to require it was that it could be ambiguous what was the name and what was the path. Now that theto_name
lives in a separate field from theto_dir
, there's no ambiguity.rename
at this time, but I'd like to revisit that in the future…I'm vaguely -0 on anything being optional, but it's a really weak -0.. I wouldn't object to it landing in that form.
My personal feeling is that this API would benefit from having a total compatibility with rename such that if rename were deprecated in the future, it would be a one word change from 'rename' to 'relink'. You could even do it with a hex editor on a binary since it's the same number of bits. I don't care as much about things being optional as I care about things being compatible. If we require to_dir, if people want to jump from rename to relink, that much more work will be required and that much more chance of people messing up.
Philosophically, I'm for APIs that easy to use, but only if they are unambiguous and easy to not misuse. I don't think that having either to_dir or to_name be required makes the API ambiguous or easy to misuse. A good programmer with an attention to details would probably fill them both in anyhow.
Replying to ClashTheBunny:
Well, if that were a concern then we could just not deprecate
rename
. It's not really imposing significant complexity, since it will be implementable by delegating torelink
.I'm with warner: -0 on having either
to_dir
orto_name
be optional. Do we have concensus on the comment:88452 spec then? If so I will start implementing it tomorrow.Ok, I think we have consensus.
to_name=
is optional, per zooko,to_dir=
is optional, per ClashTheBunny, and we leavet=rename
untouched.Thanks davidsarah for implementing this!
Replying to warner:
Agreed!
This means we need a sensible error message if both
to_name=
andto_dir=
are omitted. ☺I posted a patch review: https://github.com/davidsarah/tahoe-lafs/commit/f75e9c7a3aff04204df0977d37e3b130549a506c#commitcomment-2941334
Please review https://github.com/davidsarah/tahoe-lafs/commit/05d19771b3abf158a282f748198cfc2fcf849f53.
+1! ☺
Fixed in changeset:35f37cc5b8646540.
Replying to [zooko]comment:46:
If both are omitted then the operation succeeds with no effect.
If I understand the github WUI correctly, Brian made some docs changes that I don't agree with:
https://github.com/tahoe-lafs/tahoe-lafs/commit/f9335892f2008198bd8210b98dd953665d8a5e08
Here's an attempt at fixing the issues I have with these docs while preserving Brian's apparent idea of making it discoverable to people who think in terms of "moving" the child: https://github.com/tahoe-lafs/tahoe-lafs/pull/36 Please review!
I like those changes.. landed.
Reviewed, +1 for zooko's changes.