WebAPI: GET /uri/$FILECAP?t=json doesn't return size for mutable files, but the HTML version does #677
Labels
No Label
0.2.0
0.3.0
0.4.0
0.5.0
0.5.1
0.6.0
0.6.1
0.7.0
0.8.0
0.9.0
1.0.0
1.1.0
1.10.0
1.10.1
1.10.2
1.10a2
1.11.0
1.12.0
1.12.1
1.13.0
1.14.0
1.15.0
1.15.1
1.2.0
1.3.0
1.4.1
1.5.0
1.6.0
1.6.1
1.7.0
1.7.1
1.7β
1.8.0
1.8.1
1.8.2
1.8.3
1.8β
1.9.0
1.9.0-s3branch
1.9.0a1
1.9.0a2
1.9.0b1
1.9.1
1.9.2
1.9.2a1
LeastAuthority.com automation
blocker
cannot reproduce
cloud-branch
code
code-dirnodes
code-encoding
code-frontend
code-frontend-cli
code-frontend-ftp-sftp
code-frontend-magic-folder
code-frontend-web
code-mutable
code-network
code-nodeadmin
code-peerselection
code-storage
contrib
critical
defect
dev-infrastructure
documentation
duplicate
enhancement
fixed
invalid
major
minor
n/a
normal
operational
packaging
somebody else's problem
supercritical
task
trivial
unknown
was already fixed
website
wontfix
worksforme
No Milestone
No Assignees
3 Participants
Notifications
Due Date
No due date set.
Reference: tahoe-lafs/trac-2024-07-25#677
Loading…
Reference in New Issue
No description provided.
Delete Branch "%!s(<nil>)"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Compare:
HTML version
JSON version
The JSON version is returning "?" as the file size, while the HTML version returns the correct size. This only happens on mutable files.
Attachment fix_mutable_size_in_json.patch (239702 bytes) added
Patch that appears to fix the problem
This one is causing me some trouble, so I did some investigation.
The immediate culprit is them implementation of MutableFileNode.get_size(), in mutable/filenode.py, line 195. The implementation is supremely simple, and not very useful:
return "?"
I'm assuming this is because finding out the size of a mutable node isn't easy. The implementation of the HTML "More Info" page, makes a deferred call to get_size_of_best_version(), so I presume that's what's needed for the JSON as well.
I've attached a patch that seems to fix the problem, by doing a deferred call to get_size_of_best_version(), but I'm not very familiar with this code so I don't know it's the best way to handle the issue. I'll create a test case, too.
yeah, get_size_of_best_version() is fairly new, and the JSON-rendering code was written before it was available. Also, I seem to remember that requiring a Deferred to get that size was annoying, and I didn't want to slow down a directory listing by doing a mapupdate for every single mutable file therein.
For CHK files, the size is stored in the filecap, so it's trivial to get it. For mutable files, you have to ask around and find out what versions are available. This "mapupdate" operation isn't as expensive as actually retrieving the file's contents, but it's close (a roundtrip or two).
Your approach sounds reasonable. Be sure to think about whether it's t=JSON on a single file that gets this fix, and/or t=JSON on a whole directory (which might not be a good idea, if there are lots of mutable files in it). It'd probably be better to leave the "size" field out of the elements of a directory fetch than to include "?" strings.
Is it worth getting the size(s) only when asked for, for example using
t=JSON+size
?This patch conflicts with the current code. Here is the conflict:
A quick fix for that conflict (with ugly nested function) is:
There is another conflict in the definition of FileJSONMetadata:
The previous conflicting code passes some metadata to this function so that (IIRC) the 'tahoe ls' command woks correctly. Perhaps a fix here would be to take both edge metadata and best size as parameters?
Also, this fix should have a test written.
When these issues are fixed, I'll be happy to re-review the patch.
Replying to kevan:
I'm totally flummoxed by this code. (I even had to look up whether d in
_get_best_size
will shadow the outer d or reference it. According to PEP 227, this changed between versions of Python.) Also using(sz, md)
in one place and(md, sz)
in another looks wrong.I suspect that using two separate deferreds with different variable names would be clearer.
Replying to [davidsarah]comment:7:
Indeed, the ordering of
md
andsz
is wrong -- I'd fixed it in my sandbox, but not in that comment. Sorry for the confusion! I also agree on the names of the deferreds.FWIW, I usually use "d2" when nested callbacks want to use Deferreds too.
Python will shadow the outer name, so it's safe to use "d" everywhere, but
the time spent remembering that is worse than the time spent typing the "2"
:).
Also, there's a new method named
get_current_size
that is better to usehere. It's defined on both mutable and immutable files, and always returns a
Deferred. So try something like the following instead (this includes a little
bit of cleanup, and keeps the filenode-specific code inside
!FileJSONMetadata.. needs a test, of course):
Needs to address comments before further review.
Replying to swillden:
Note that this was changed in changeset:e046744f40d59e70. It now returns the last cached size, initially None.
Out of time.
Attachment current-size-in-mutable-file-json.dpatch (2427 bytes) added
web.filenode: include the current (rather than cached) file size in the JSON metadata for a mutable file. fixes #677
Attachment omit-size-from-dir-json-if-unknown.dpatch (1424 bytes) added
web.directory: omit the 'size' field for a mutable file in the JSON rendering of a directory if it is not known.
current-size-in-mutable-file-json.dpatch is an adaptation of Brian's code in comment:70577 to current trunk. It only affects
t=json
for a mutable file, not for directory listings.omit-size-from-dir-json-if-unknown.dpatch omits the "
size
" field from directory listings if it is not currently cached. I'm not sure that this is necessary; the current behaviour of including"size": null
does not seem too unreasonable.Hey, you've done this again. I don't understand what it means for a ticket to be both review-needed and test-needed. Shouldn't the reviewer just point out that tests are needed and then unset the review-needed flag?
Replying to zooko:
Yes. Unfortunately I forgot about this ticket, and we're out of time for 1.8.
I forgot about this ticket again :-( Still needs a test.
Updated to current trunk (taking into account MDMF) and gitified: https://github.com/davidsarah/tahoe-lafs/commits/677-current-size-in-mutable-json. Tests still needed.