UncoordinatedWriteError on prod grid #899
Reference: tahoe-lafs/trac-2024-07-25#899
Kyle Markley reported this on the tahoe-dev list:
http://allmydata.org/pipermail/tahoe-dev/2010-January/003554.html
It could be related to #540, #877, or #893.
I'll ask Kyle to supply more diagnostic info on this ticket.
Attachment logs.tgz (1300514 bytes) added
UncoordinatedWriteError log
allmydata-tahoe: 1.5.0, foolscap: 0.4.2, pycryptopp: 0.5.17, zfec: 1.4.5, Twisted: 8.2.0, Nevow: 0.9.33-r17222, zope.interface: 3.5.2, python: 2.6.2, platform: OpenBSD-4.6-amd64-Genuine_Intel-R-CPU_000@_2.93GHz-64bit-ELF, sqlite: 3.6.13, simplejson: 2.0.9, argparse: 0.9.1, pyOpenSSL: 0.9, pyutil: 1.3.34, zbase32: 1.1.1, setuptools: 0.6c12dev, pysqlite: 2.4.1
Mutable File Publish Status
Retrieve Results
Andrej Falout couldn't attach his incident reports to this ticket because trac doesn't let you upload attachments larger than 1,000,000 bytes. I bunzip2'ed them and 7z'ed them and they came out half as big, so here they are.
Attachment tahoeIncident.7z (477142 bytes) added
Oh, and I reconfigured trac to allow attachments of up to 10 MB.
I'm continuing to hit this UncoordinatedWriteError very frequently on the production grid. I think it happens most often when creating directories. I can provide lots of additional incident reports if that would be useful.
This has made it almost impossible for me to run a 'tahoe backup' command to the production grid; should the priority of this ticket be raised?
allmydata.com is continuing to repair servers and configuration issues on the allmydata.com prod grid, so that might be the way that your problem gets solved. However, at the very least your Tahoe-LAFS client is reporting the problem with the wrong error message. It may also be buggy in some way that leads to this problem.
One thing that you could do that would help is to try the same thing with a newer version of Tahoe-LAFS. Could you try installing the latest version from http://allmydata.org/source/tahoe/tarballs/?C=M;O=D by following these install instructions: http://allmydata.org/source/tahoe/trunk/docs/install.html
I haven't seen one of these errors since upgrading from tahoe 1.5.0 to 1.5.0-r4160. Between that and general repair of the grid, the problem has gone away for me.
I glanced through a couple of these Incidents, and all the ones I looked at were the artifact that we already fixed, in which a DeadReferenceError is accidentally logged too severely (the one where a ServerFailure wrapped the DeadReferenceError, preventing the errback code from identifying it as a DeadReferenceError). This got fixed with the overhaul of the add-lease code.
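To illustrate the wrapping problem described above, here is a minimal sketch in plain Python. It is not the actual Tahoe-LAFS or foolscap code: the two exception classes are hypothetical stand-ins that only borrow the names from this ticket, and the severity labels "benign" and "weird" are assumptions made for the example. It only shows why a type check on the outer error misses the wrapped one, and how unwrapping first avoids that.

```python
# Hypothetical stand-ins: NOT the real foolscap / Tahoe-LAFS classes.
class DeadReferenceError(Exception):
    """Stand-in for the error seen when the connection to a server is lost."""


class ServerFailure(Exception):
    """Stand-in wrapper that remembers which server produced an error."""

    def __init__(self, original, server_id):
        super().__init__("%s: %r" % (server_id, original))
        self.original = original
        self.server_id = server_id


def severity_naive(err):
    # Checks only the outermost type, so a wrapped DeadReferenceError
    # is not recognized and gets the more severe label.
    if isinstance(err, DeadReferenceError):
        return "benign"
    return "weird"


def severity_unwrapping(err):
    # Peel off any ServerFailure wrappers before checking the type.
    while isinstance(err, ServerFailure):
        err = err.original
    if isinstance(err, DeadReferenceError):
        return "benign"
    return "weird"


wrapped = ServerFailure(DeadReferenceError("connection lost"), "server-1")
print(severity_naive(wrapped))
print(severity_unwrapping(wrapped))
```

In this sketch the naive check labels the wrapped error as severe, while the unwrapping check recognizes it as an ordinary lost connection, which is the kind of distinction that decides whether an Incident gets logged.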