make docs/performance.rst more precise and accurate #1398
Reference: tahoe-lafs/trac-2024-07-25#1398
In comment:13:/tahoe-lafs/trac-2024-07-25/issues/6457, Brian wrote, concerning the "Performing a file-verify on an A-byte file":
to be "N/K*S times a small multiple". I think the multiple is currently about 2 or 3. During encryption, we hold both a plaintext segment and a ciphertext segment in RAM at the same time (so 2S), then we drop the plaintext. During erasure-coding, we hold a whole S of ciphertext in memory at the same time as the N/K*S shares, then we drop the ciphertext before pushing. We also pipeline the sends a little bit, I think 10kB or 50kB per server, to get better utilization out of a non-zero-latency wire.
Also Python's memory-management strategy interacts weirdly. Dropping the plaintext segment may not be enough: Python might not re-use that memory space for anything else right away. Although I'd expect it to de-fragment or coalesce free blocks before asking the OS for so much memory that it crashed.
Although I wonder if Brian was thinking of repair rather than verify since he talks about encrypting, which is not done in verify.
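The arithmetic in the quoted estimate can be sketched as follows. This is only an illustration of the two phases Brian describes (2S while encrypting, S plus N/K*S while erasure-coding); the function name and the example parameter values are hypothetical, and the sketch ignores the per-server send pipeline and Python's allocator behaviour.

```python
def upload_memory_estimate(segment_size, k, n):
    """Rough per-segment memory estimate in bytes for the two phases
    described in the quoted comment. Hypothetical helper, not Tahoe-LAFS code.
    """
    # Encryption: a plaintext segment and a ciphertext segment coexist (2S).
    encrypt_phase = 2 * segment_size
    # Erasure coding: one ciphertext segment plus N/K*S of encoded output.
    encoded_output = -(-segment_size * n // k)  # ceiling of S*N/K
    encode_phase = segment_size + encoded_output
    return {
        "encrypt": encrypt_phase,
        "encode": encode_phase,
        "peak": max(encrypt_phase, encode_phase),
    }


# Illustrative values only: a 128 KiB segment with 3-of-10 encoding.
estimate = upload_memory_estimate(128 * 1024, 3, 10)
print(estimate["peak"] / (128 * 1024))  # peak as a multiple of S
```

Under these assumptions the erasure-coding phase dominates whenever N/K is greater than 1, which is where a (1+N/K)*S figure comes from.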
Subsequently I reviewed the document and I see a bunch of things I'm not sure are right. (Note that I myself am mostly responsible for the current state of this document.)
"Publishing an A-byte immutable file" / "when the file is already uploaded": `memory footprint: N/K*S`. Shouldn't that just be `memory footprint: S`? All it does is read each `S`-byte segment in turn and hash it. `K` and `N` shouldn't come into it. This is probably just a cut-and-paste error or think-o on my part originally, so unless someone else knows of a better reason why I wrote that, I'm going to change it to `memory footprint: S`.
"Publishing an A-byte immutable file" / "when the file is not already uploaded": if we're going to make the `memory footprint` more precise as Brian suggests above then this one should be changed too. Also `network: ~N + ~A` should actually be `network: N/K*~A`, right?
"Downloading B bytes of an A-byte immutable file": `cpu: ~A`. What? The CPU usage for downloading `B` bytes of an `A`-byte immutable file is `~A`? I really hope it is actually `~B` (plus an amount of CPU logarithmic in `A` for Merkle Tree verification, but I'm not sure we should try to include that much precision in this document). Unless someone tells me I'm wrong now and was right then, I'm going to change this to `cpu: ~B`.
"Repairing an A-byte file": `network: variable; up to around ~A`. Surely that should say `network: variable; up to around N/K*A`.
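As a quick check on the proposed corrections, the estimates can be written out as a small calculation. This is a sketch under stated assumptions: the function and key names are hypothetical, the Merkle-tree term is left out as the ticket suggests, and treating the cheapest repair as costing about A bytes of network is one reading of "as cheap as a download", not something the document states.

```python
from fractions import Fraction


def estimate_costs(a, b, k, n):
    """Approximate costs for an A-byte immutable file with K-of-N encoding.

    a: file size in bytes, b: bytes requested by a partial download.
    Hypothetical helper for illustration; not part of Tahoe-LAFS.
    """
    expansion = Fraction(n, k)  # N/K, the erasure-coding expansion factor
    return {
        # Uploading a file that is not already stored sends N/K * A bytes.
        "upload_network": expansion * a,
        # Downloading B bytes should cost CPU proportional to B, not A.
        "download_cpu": b,
        # Repair ranges from roughly a download up to a full upload.
        "repair_network_min": a,  # assumption: a full download costs ~A
        "repair_network_max": expansion * a,
    }


# Illustrative values only: a 3000-byte file, 500 bytes read, 3-of-10 encoding.
costs = estimate_costs(3000, 500, 3, 10)
print(costs["upload_network"], costs["repair_network_max"])
```

Using `Fraction` keeps N/K exact, so the upper bound for repair and the upload cost come out identical rather than differing by floating-point rounding.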
Attachment update-performance.rst.darcs.patch (13143 bytes) added
Please review update-performance.rst.darcs.patch . It fixes all of the issues that I raised in the original ticket description except for the ones that I'm not sure how precise Brian wants it to get or what exact numbers would be correct for the finer precision. It also fixed another issue: the costs of repair have been updated to show both lower bounds (just as cheap as a download) and upper (just as expensive as a full initial upload).
Should this go into 1.9?
+1 from me for this to go into 1.9.
It looks OK to me, but Brian is more familiar with the performance characteristics.
I think I was thinking of encryption/upload in that original description, not verify: good catch.
`N/K*A`, not just `A` (SDMF has one segment, not one share). Also, this should probably refer specifically to SDMF, since
Other than that, it looks accurate. Some entries are missing memory footprints, and most memory footprints could take the constant-multiple `(1+N/K)*S` overlap into account if you want to get that detailed.
Note to self:
Replying to zooko:
Note that FTP does not support mutables at all (#680).
In changeset:3dc491758daad9df:
I will fix the points in comment:83490.
In [5506/ticket999-S3-backend]:
the remainder of this won't happen in time for 1.9, bumping
looks like this is currently owned by davidsarah, and the next step is to edit the patch (to incorporate both my comments and zooko's additional notes). Removing the review-needed flag.
davidsarah: please assign it to me once you've updated the doc or decided that you won't prioritize doing it. Once it is assigned to me, I'll go through all the notes I wrote to myself saying "note to self: do $X; do $Y".
I'm not prioritizing this.
Brian:
I don't entirely understand the more precise estimate of memory footprint during repair that you've suggested (original ticket contents), and I'm not sure it is worth trying to document that level of precision if the absolute difference is going to be only a hundred kilobytes or so.
Would you please either post a patch (or just new text suitable for pasting into source:docs/performance.rst with your suggested improvement), or tell me that the current docs of that are good enough?
In changeset:e850b54772f4303d:
I added a line about constant factors, that should cover what I mentioned without getting into unnecessary detail.
No complaints from zooko, so I'm considering this one closed.
Looks good to me -- thanks!