stop respecting the pre-v1.3.0 configuration files (improve error message and tests) #1385
Reference: tahoe-lafs/trac-2024-07-25#1385
As documented in [configuration.rst]source:trunk/docs/configuration.rst?rev=5006#backwards-compatibility-files, Tahoe-LAFS before release v1.3.0 (2009-02-13) had a different way to control configuration. Until now if that older way is present then it [overrides]source:trunk/src/allmydata/node.py?annotate=blame&rev=4852#L115 the newer way. Let's remove any detection of the old way and just leave docs (in source:trunk/docs/historical) for anybody who needs to upgrade. (There are probably zero users of Tahoe-LAFS < 1.3.)
The advantage is to simplify the docs and the configuration code and reduce the number of ways that things can be configured.
Replying to zooko:
There might actually be someone who still has config files that were generated by Tahoe-LAFS < 1.3. Certainly I don't. I don't think Brian does. Anyone else?
In any case, I'm writing this patch and I'm making it emit a warning message if any old-style configuration files are detected.
Replying to zooko (comment:1):
Another option would be to print the warning but still use the old settings for one major release, and then stop using the old settings in the next major release.
OTOH, I'm fine with going straight to ignoring these files, but in that case we should probably announce the change on tahoe-dev.
Done: http://tahoe-lafs.org/pipermail/tahoe-dev/2011-April/006243.html
By the way I like the principle of warning in a major release before ignoring in a later major release, but in this case I don't think it is worth it since I doubt there are any users with old-style configuration files, and the failure mode if there are is pretty safe.
There's one issue I've found: foolscap (v0.6.1) can't accept a set of log gatherer furls through its Python API—it can accept at most one that way and it can accept any number by reading a file and finding one furl per line in that file. I've opened foolscap ticket 176 to request that a future version of foolscap accept any number of log gatherer furls through its Python API.
In the meantime we could either:

1. remove the ability to have multiple log gatherers from Tahoe-LAFS, which would be a regression (albeit this is probably a feature that nobody currently uses), or
2. preserve the file $BASEDIR/log_gatherer.furl for another major release (unioning with the contents of the singleton log_gatherer.furl key in $BASEDIR/tahoe.cfg), or
3. extend the tahoe.cfg key to accept multiple furls (whitespace-separated), treat $BASEDIR/log_gatherer.furl like all the other old-style configuration files by warning about its existence and ignoring its contents, and use a different filename such as $BASEDIR/foolscap/log_gatherer_furls.txt to transmit the set of furls from tahoe.cfg to foolscap.

The advantage of approach 3 is that the user configures log gatherer furls just like she configures everything else: in $BASEDIR/tahoe.cfg. $BASEDIR/foolscap/log_gatherer_furls.txt is documented as being "internal use only" and not for users to read or edit. (We might rm it after letting foolscap read it, just to drive the point home.)

Someday, when all users have upgraded to a version of foolscap that provides foolscap ticket 176, we could stop using the temporary-file hack to communicate the set of furls from tahoe.cfg into foolscap.

So, I'm currently implementing approach 3, but I'll listen if anybody has a strong opinion to the contrary.
I'd like some sort of warning in a release or two, but Zooko says he's writing code to detect-and-complain-about the old files, which I'm happy with as long as we have a plan to remove it eventually (probably around the 2.0 timeframe), so we don't accumulate old cruft forever. Note that it'd probably be sufficient to do a fatal complaint about the lack of a tahoe.cfg, because the nodes that were configured with individual files probably won't have one, and that sort of complaint could be kept around forever.

I'm ok with regressing on multiple-log-gatherers until Foolscap has an API to handle that. I'd prefer the whitespace-separated tahoe.cfg key over having a separate file with a funny name.
(Note that part of the reason for having discrete files for things like log-gatherers was to make it easy to set up or modify a whole bunch of Allmydata servers with a batch of scp commands: just stuff the log-gatherer.furl file into all of them and then bounce them all. To do that with tahoe.cfg requires editing files, so it needs more complex tooling. But I don't think this is an important feature these days, and I prefer the simplicity of a single config file.)

Just to be clear, are you also okay with my plan 3, which supports multiple log gatherers in Tahoe-LAFS by putting them into a file to give to foolscap?
eh, yeah, if you want to do that, I'm ok with it, but it feels a touch complex. I'd be just as happy with the simpler plan 1.
Replying to warner:
I see. Yes, I think we can push this complexity off because nowadays sysadmins and their tools like puppet are getting good at editing config files in place.
Here are all the old-style config files that I've found (from [configuration.rst]source:trunk/docs/configuration.rst@5006#backwards-compatibility-files):
BASEDIR/nickname
BASEDIR/webport
BASEDIR/client.port
BASEDIR/introducer.port
BASEDIR/advertised_ip_addresses
BASEDIR/log_gatherer.furl
BASEDIR/keepalive_timeout
BASEDIR/disconnect_timeout
BASEDIR/introducer.furl
BASEDIR/helper.furl
BASEDIR/key_generator.furl
BASEDIR/stats_gatherer.furl
BASEDIR/no_storage
BASEDIR/readonly_storage
BASEDIR/sizelimit
BASEDIR/debug_discard_storage
BASEDIR/run_helper
Here is the NEWS file entry from the Tahoe-LAFS v1.3.0 release which announced the new config file format: [NEWS]source:trunk/NEWS?annotate=blame&rev=3620#L333.

Replying to warner:
Okay, I would still prefer plan 3 (simple for users, more complicated for implementors, allows multiple log gatherers), but only if someone else does the work of implementing it. :-) Since I'm doing the work, I've changed my mind to plan 1 (simple for users, simple for implementors, doesn't allow multiple log gatherers).
Replying to warner:
I'd also be just as happy with plan 1.
Incidentally, the old-file-detection code doesn't need to be particularly complex (it just needs a list of old filenames, not separate code for each), so I don't think it will be a problem to keep it until the next significant compatibility break, say 2.0.
Okay, I have this patch almost finished -- I'm just writing the source:NEWS and patch description. I was writing "This will fail safe and fail loud if an old-style config is found", and then I started to wonder if we shouldn't ensure that it fails safe by stopping the node if it detects an old-style config file. If the node goes ahead and starts up and runs, then it will (after emitting a warning) be operating with different values than it was using in Tahoe-LAFS v1.8.2. I'm looking at the list of old-style config files, wondering if it would be unsafe for the user if their node goes ahead and switches to the tahoe.cfg value from the old-style value for any of them. I don't like to wonder about things like that. (Even though there are few or no users who have any old-style config files left.)

So I'm going to go back and change this patch to make the node emit a warning and then stop itself, if it detects an old-style config file.
Replying to zooko:
+1.
Thanks for the design review!
Planning to work on this and #1363 on the car ride home tomorrow (about ten hours, with one co-driver and two children in the car). In order to make the deadline for new-feature patches for v1.9, which is tomorrow.
Attachment reject-old-style-config-files.darcs.patch (54290 bytes) added
reject-old-style-config-files.darcs.patch has three patches in it: the one that actually rejects old-style config files, a tidy-up patch that it depends on, and a whitespace-cleanup patch that it does not depend on. Please review! :-)
In the changes to docs/configuration.rst:

- "N servers, up to N-k servers can be offline without losing the file" implicitly assumes that the shares are all stored on different servers.
- "NODEDIR/tahoe.cfg file..." -- say which fields of the file.

Otherwise +1.
In changeset:e5c4e83f4cfe3769:
T_X discovered the following regression: Introducer nodes create a file named introducer.furl in their base directory. On the second and subsequent runs after the introducer.furl file has already been created, the introducer will fail to start, because it will see that file and think that it is an old config file.

Our tests failed to detect this for two reasons: test_runner.RunNode.test_introducer was switched off, presumably because it was causing too many false positives.

In changeset:e74387f4f15e6839:
Replying to davidsarah:
Actually, test_runner.RunNode.test_introducer did start and then restart an introducer, but it deleted the introducer.furl in-between. I've changed it to use the mtime of introducer.furl to detect when it has been rewritten.

The "no noise" check being switched off was a red herring; the test would have failed anyway if it hadn't been deleting the introducer.furl.

Hmm, maybe there are filesystems on which mtime is too coarse for the new test to work without hanging (because the new mtime of introducer.furl might be the same as the old one). Can anyone think of a better way to tell when the introducer process has restarted?

In changeset:2d16a16ee3d99482:
The regression seems to be fixed, but there's still the issue in comment:83234 about the new test. Also the way in which the existence of old config files is reported is quite ugly, with an unnecessary traceback (which initially made me miss the message at the top):
Changed title from "stop respecting the pre-v1.3.0 configuration files" to "stop respecting the pre-v1.3.0 configuration files (improve error message)".

In changeset:f45bfeb3df62df17:
Attachment fix-introducer-test.darcs.patch (29637 bytes) added
test_runner.py: fix RunNode.test_introducer to not rely on the mtime of introducer.furl to detect when the node has restarted. Instead we detect when node.url has been written. refs #1385
Attachment improve-old-config-error-message.darcs.patch (27545 bytes) added
Further improve error message about old config files. refs #1385
With improve-old-config-error-message.darcs.patch, the error looks like:
The stack trace is still there, but at least it prints a sensible message at the end.
+1 on changeset:2d16a16ee3d99482, changeset:f45bfeb3df62df17, fix-introducer-test.darcs.patch, and improve-old-config-error-message.darcs.patch. Thanks for the nice usability and testing improvements!
In changeset:80300ea7a3c582ea:
In changeset:521754b5062cfadd:
In changeset:b6cfbbeb234cd8c9:
In changeset:b9eb0235ea38ce37:
Changed title from "stop respecting the pre-v1.3.0 configuration files (improve error message)" to "stop respecting the pre-v1.3.0 configuration files (improve error message and tests)".