HTTP protocol is significantly slower than Foolscap protocol #3939
Reference: tahoe-lafs/trac-2024-07-25#3939
Specifically, in https://github.com/tahoe-lafs/tahoe-lafs/pull/1225, test_filesystem (from test_system.py) takes 40 seconds with HTTP, compared to 20 with Foolscap. Of those extra 20 seconds, some of the time is spent in get_spki_hash() in the HTTP client validation code. Or rather, the real bottleneck is the call to get_public_bytes(), which is a little surprising but OK.
The test setup doesn't use persistent connections, so this is arguably unrealistic, and maybe the tests should use persistent connections. This still seems... excessive. Perhaps we're doing way too many HTTP requests. E.g. maybe chunk sizes that are OK for Foolscap are too small for HTTP.
If I switch back to persistent HTTP connections, the test takes 20 seconds. So this is perhaps not a blocker, provided I can figure out how to keep the tests from leaving a dirty reactor with persistent HTTP.
It still seems worth fixing though; it suggests the HTTP protocol is doing far too many requests.
Looking at an immutable, here are the write sizes for a (presumably small) object, as recorded via layout.py: 36 bytes, 1400, 32, 32, 32, 170, 320. These will get batched via pipeline.py (see #3787), and for Foolscap maybe that's fine. But I suspect HTTP/1.1 has higher per-query overhead than Foolscap, and I imagine that even with HTTP/2 it's still rather higher.
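To make the per-query overhead concern concrete, here is a rough back-of-the-envelope sketch using the write sizes quoted above. The per-request overhead figure is purely an assumption for illustration, not a measurement of Foolscap or HTTP, and the helper name is hypothetical.

```python
# Write sizes quoted above, as recorded via layout.py.
WRITE_SIZES = [36, 1400, 32, 32, 32, 170, 320]

# ASSUMPTION: illustrative per-request overhead in bytes (headers, framing).
# This is a placeholder, not a measured value for HTTP or Foolscap.
ASSUMED_OVERHEAD_PER_REQUEST = 500


def overhead_fraction(num_requests, payload_bytes, overhead_per_request):
    """Fraction of bytes on the wire that is overhead rather than payload."""
    overhead = num_requests * overhead_per_request
    return overhead / (overhead + payload_bytes)


payload = sum(WRITE_SIZES)

# One request per write versus a single request carrying the whole payload.
separate = overhead_fraction(len(WRITE_SIZES), payload, ASSUMED_OVERHEAD_PER_REQUEST)
coalesced = overhead_fraction(1, payload, ASSUMED_OVERHEAD_PER_REQUEST)

print(f"payload bytes: {payload}, writes: {len(WRITE_SIZES)}")
print(f"overhead fraction, one request per write: {separate:.2f}")
print(f"overhead fraction, single request: {coalesced:.2f}")
```

Whatever the real overhead number turns out to be, the fraction grows with the number of requests, which is why many tiny writes hurt more on a protocol with a higher fixed cost per request.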
So possibly one strategy is an alternative to pipeline.py where the logic isn't generic "batch API queries" but instead relies on the fact that we're doing writes without any holes, so we can coalesce writes semantically. (Previously there would have been holes, so this would've been harder.)
There is the problem that the API currently doesn't force the code to do writes in order, but that's solvable. In practice the immutables path does actually have the writes happening in the correct order, so possibly that should just work.
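A minimal sketch of what semantic coalescing could look like, assuming writes arrive in order with no holes. The class name, the send callback, and the max_buffered threshold are all hypothetical and are not taken from pipeline.py or the actual branch. Rejecting out-of-order writes is just one possible way to handle the ordering problem mentioned above.

```python
from typing import Callable, List


class CoalescingWriter:
    """Hypothetical helper: buffer contiguous writes, flush them as one write."""

    def __init__(self, send: Callable[[int, bytes], None],
                 max_buffered: int = 65536, start_offset: int = 0):
        self._send = send                  # performs the real (e.g. HTTP) write
        self._max_buffered = max_buffered  # ASSUMPTION: arbitrary flush threshold
        self._flushed_upto = start_offset  # offset where the buffered data begins
        self._chunks: List[bytes] = []
        self._size = 0

    def write(self, offset: int, data: bytes) -> None:
        expected = self._flushed_upto + self._size
        if offset != expected:
            # No holes and no reordering: every write must continue the last one.
            raise ValueError(f"expected write at offset {expected}, got {offset}")
        self._chunks.append(data)
        self._size += len(data)
        if self._size >= self._max_buffered:
            self.flush()

    def flush(self) -> None:
        if not self._chunks:
            return
        self._send(self._flushed_upto, b"".join(self._chunks))
        self._flushed_upto += self._size
        self._chunks = []
        self._size = 0


# Usage with the write sizes quoted earlier in this ticket.
sent = []
writer = CoalescingWriter(lambda offset, data: sent.append((offset, len(data))))
offset = 0
for size in [36, 1400, 32, 32, 32, 170, 320]:
    writer.write(offset, b"x" * size)
    offset += size
writer.flush()
print(sent)  # [(0, 2022)]
```

With this shape, the seven small writes become a single 2022-byte write, and a write that would leave a hole fails loudly instead of being silently batched.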
TODO for branch-in-progress:
Did all the above (mutable uploads seem more reasonable a priori? They at least don't do a write per tiny bit of metadata). Some quantitative results: for allmydata.test.test_system.HTTPSystemTest.test_filesystem, the number of writes during immutable upload goes from 530 to 60. So that's good! It does not, however, make a meaningful dent in run time... so there are likely other overly chatty interactions.
Initial thought is that downloads are maybe using too small a chunk size, so I will investigate that next.
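As a rough illustration of why download chunk size matters, the number of read requests scales inversely with the chunk size. The share size and chunk sizes below are assumptions chosen for illustration, not values taken from the Tahoe-LAFS code.

```python
def requests_needed(total_bytes: int, chunk_size: int) -> int:
    """Number of fixed-size read requests needed to fetch total_bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Integer ceiling division.
    return -(-total_bytes // chunk_size)


# ASSUMPTION: a 1 MiB share and two candidate chunk sizes, for illustration only.
SHARE_SIZE = 1024 * 1024
for chunk_size in (1024, 64 * 1024):
    print(chunk_size, requests_needed(SHARE_SIZE, chunk_size))
```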
In 1eba202c/trunk: