upload: tolerate lost or unacceptably slow servers #873
Reference: tahoe-lafs/trac-2024-07-25#873
As with download in #287, we'd like upload to gracefully handle servers that silently disconnect during the upload process. This is more difficult than for download, because we don't have the option of switching to a different server: giving up on a server during upload means giving up on the whole share, which reduces reliability. "Shares of happiness" is the current threshold used to decide how much this abandon-the-share event matters.
To implement this, the upload code needs to use a timeout (to distinguish between slow-server and silently-lost-server) and we need some way to decide what that timeout should be.
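A minimal sketch of the idea, not the Tahoe-LAFS uploader itself: each per-share write gets a timeout, servers that miss it are abandoned along with their share, and the upload continues only while the surviving share count still meets the happiness threshold. It uses `asyncio` for brevity, whereas the real code is built on Twisted and foolscap. The names `send_with_timeout`, `upload_segment`, and `senders` are hypothetical, and choosing the value of `timeout_s` is exactly the open question this ticket raises.

```python
import asyncio


async def send_with_timeout(send_block, timeout_s):
    """Return True if the (hypothetical) per-server write finished in time."""
    try:
        await asyncio.wait_for(send_block(), timeout=timeout_s)
        return True
    except asyncio.TimeoutError:
        # Too slow or silently lost: from the uploader's side these look the same.
        return False


async def upload_segment(senders, timeout_s, shares_of_happiness):
    """Send one segment to every share holder; drop the ones that time out.

    senders maps a share number to a zero-argument coroutine function that
    performs the write (an assumption made for this sketch).
    Returns the senders that are still usable for the next segment.
    """
    results = await asyncio.gather(
        *(send_with_timeout(send, timeout_s) for send in senders.values())
    )
    surviving = {
        num: send for (num, send), ok in zip(senders.items(), results) if ok
    }
    # Abandoning a server abandons its share, so re-check the threshold.
    if len(surviving) < shares_of_happiness:
        raise RuntimeError(
            "only %d shares left, need %d" % (len(surviving), shares_of_happiness)
        )
    return surviving
```

Counting surviving shares is a simplification of how the happiness threshold would be evaluated; the point is only that a timeout turns a silent hang into an explicit abandon-or-fail decision.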
Attachment logs.tgz (24156 bytes) added
Contents of Kyle's .tahoe/logs directory after noticing two hung tahoe backup operations.
I noticed two 'tahoe backup' operations hang on my node, and attached my .tahoe/logs directory as logs.tgz. Here are my versions:
allmydata-tahoe: 1.5.0, foolscap: 0.4.2, pycryptopp: 0.5.17, zfec: 1.4.5, Twisted: 8.2.0, Nevow: 0.9.33-r17222, zope.interface: 3.5.2, python: 2.6.2, platform: OpenBSD-4.6-amd64-Genuine_Intel-R-CPU_000@_2.93GHz-64bit-ELF, sqlite: 3.6.13, simplejson: 2.0.9, argparse: 0.9.1, pyOpenSSL: 0.9, pyutil: 1.3.34, zbase32: 1.1.1, setuptools: 0.6c12dev, pysqlite: 2.4.1
Kyle wrote:
It was impulsive of me to put this ticket into the 1.8 Milestone. This ticket will probably get fixed in a complete rewrite of the upload code at some point.
Title changed from "upload: tolerate lost or missing servers" to "upload: tolerate lost or unacceptably slow servers"
#1394 is a near-duplicate for the server selection stage of upload. There's a tension between this ticket and #362 ('enhance upload to search longer and more completely for shares'), which I'm not sure how to resolve.
Kevan: does #1382 affect this ticket? Also, if you know how to close tickets or clarify the relationships mentioned in comment:74190, that would be good.