cloud backend uses lots of expensive LIST requests #2346
Reference: tahoe-lafs/trac-2024-07-25#2346
When it is backed by an Amazon S3 bucket, the cloud backend issues many expensive LIST requests because it makes heavy use of GET Bucket. A GET Bucket request is billed as a LIST request and is 10 times more expensive than a GET Object request.
These LIST requests can be a large portion of the cost of running a storage node with an S3 backend. For example, my logs show 1.5 times as many GET Bucket requests as GET Object requests (with two storage nodes, one S3 bucket and one desktop computer), and the cost of those LIST requests exceeds the storage, transfer, and EC2 costs.
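To make the two ratios above concrete, here is a small worked calculation. It uses only the figures quoted in this ticket (a 10x price ratio and 1.5 GET Bucket requests per GET Object request) and no actual S3 prices; the function name is just for illustration.

```python
# Relative request cost, using only the two ratios quoted in this ticket.
LIST_PRICE_RATIO = 10   # one LIST (GET Bucket) is billed at 10x one GET Object
LIST_PER_GET = 1.5      # observed: 1.5 GET Bucket requests per GET Object request


def list_share_of_request_cost(list_per_get, price_ratio):
    """Fraction of the combined GET + LIST request cost that is due to LIST."""
    list_cost = list_per_get * price_ratio  # in units of one GET Object price
    get_cost = 1.0                          # one GET Object request
    return list_cost / (list_cost + get_cost)


relative_list_cost = LIST_PER_GET * LIST_PRICE_RATIO
share = list_share_of_request_cost(LIST_PER_GET, LIST_PRICE_RATIO)
print("LIST cost relative to GET cost:", relative_list_cost)
print("LIST share of GET + LIST request cost:", share)
```

Under these two ratios, the LIST requests cost 15 times as much as the GET Object requests, which is 15/16 of the combined GET and LIST request bill.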
Here is some relevant code:
https://github.com/LeastAuthority/tahoe-lafs/blob/cloud-rebased/src/allmydata/storage/backends/cloud/cloud_common.py#L426
And relevant chat on IRC:
the list of shares is stored in a local database called the leasedb. that was added recently on the cloud branch, so I suspect we're not making optimal use of it yet
ISTR that zooko was arguing for treating the leasedb as authoritative as to whether a share exists, and I was arguing against for a reason that I can't remember right now. there's a ticket about it
Yes, the arguments about the trade-offs of treating leasedb as authoritative vs. advisory are encoded into tickets.
I seem to recall that treating leasedb as authoritative gets nice performance, including for this particular aspect, while trading off some other values.
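The difference between the two approaches discussed above can be sketched roughly as follows. This is not the leasedb schema or the API in `cloud_common.py`: `CountingContainer`, `ShareIndex`, `find_shares`, and the `shares/<storage index>/<share number>` key layout are all hypothetical names invented for this illustration. The sketch only shows why trusting a local table avoids a billed LIST request.

```python
import sqlite3


class CountingContainer:
    """Hypothetical stand-in for an S3 bucket that counts LIST requests."""

    def __init__(self, keys):
        self._keys = set(keys)
        self.list_requests = 0

    def list_keys(self, prefix):
        self.list_requests += 1  # each call here would be billed as a LIST
        return sorted(k for k in self._keys if k.startswith(prefix))


class ShareIndex:
    """Hypothetical local table of known shares (not the real leasedb schema)."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE shares ("
            "storage_index TEXT, shnum INTEGER, "
            "PRIMARY KEY (storage_index, shnum))"
        )

    def add_share(self, storage_index, shnum):
        self._db.execute(
            "INSERT OR IGNORE INTO shares VALUES (?, ?)", (storage_index, shnum)
        )

    def get_shnums(self, storage_index):
        rows = self._db.execute(
            "SELECT shnum FROM shares WHERE storage_index = ? ORDER BY shnum",
            (storage_index,),
        )
        return [row[0] for row in rows]


def find_shares(container, index, storage_index, authoritative):
    """Return share numbers for storage_index under either policy."""
    if authoritative:
        # Trust the local table: no request is sent to the bucket.
        return index.get_shnums(storage_index)
    # Advisory: ask the bucket, which costs one LIST request per lookup.
    keys = container.list_keys("shares/%s/" % storage_index)
    return sorted(int(key.rsplit("/", 1)[1]) for key in keys)


container = CountingContainer(["shares/abc/0", "shares/abc/3"])
index = ShareIndex()
index.add_share("abc", 0)
index.add_share("abc", 3)

advisory_result = find_shares(container, index, "abc", authoritative=False)
authoritative_result = find_shares(container, index, "abc", authoritative=True)
print(advisory_result, authoritative_result, container.list_requests)
```

In this sketch both lookups return the same share numbers, but only the advisory one touches the bucket. The authoritative path returns whatever the local table says, so it stays correct only as long as the table and the bucket agree.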
Title changed from "cloud backend uses losts of expensive LIST requests" to "cloud backend uses lots of expensive LIST requests"
Milestone renamed
renaming milestone
Moving open issues out of closed milestones.
The established line of development on the "cloud backend" branch has been abandoned. This ticket is being closed as part of a batch-ticket cleanup for "cloud backend"-related tickets.
If this is a bug, it is probably genuinely no longer relevant. The "cloud backend" branch is too large and unwieldy to ever be merged into the main line of development (particularly now that the Python 3 porting effort is significantly underway).
If this is a feature, it may be relevant to some future efforts - if they are sufficiently similar to the "cloud backend" effort - but I am still closing it because there are no immediate plans for a new development effort in such a direction.
Tickets related to the "leasedb" are included in this set because the "leasedb" code is in the "cloud backend" branch and is fairly well intertwined with the "cloud backend". If there is interest in changing the lease implementation at some future time, then that effort will essentially have to be restarted as well.