Make the paths of the different folders configurable #2045
Reference: tahoe-lafs/trac-2024-07-25#2045
The objective is to be able to run Tahoe-LAFS as a daemon on a Unix system. For that, each part of the node needs to live in its standard Unix location instead of inside the node folder: configuration files go to /etc, storage to /var/lib, log files to /var/log, ...
The proposed solution is to have a set of fields in tahoe.cfg that set where each folder should go, and to add a matching set of options to the tahoe command line tools.
The topic was discussed in the past on the mailing list:
[//pipermail/tahoe-dev/2011-May/006324.html]
I started an implementation at:
https://github.com/meskio/tahoe-lafs/tree/meskio
It adds storedir and logdir to the configuration file, --pidfile to the start/stop CLI, and --logdir and --storedir to the create-* CLI.
This puppet module already has initscripts:
https://github.com/ctrlaltdel/puppet-tahoe
Re: the --pidfile option, see also #1546 and #1533.
When the new Debian package (1.10-1) is uploaded, this bug will be a blocker for its migration into testing.
With the initscript feature implemented in this package, tahoe-lafs now becomes a real Unix daemon, and as such should respect the FHS. Otherwise it will be considered RC-buggy and won't migrate into the upcoming Debian stable release.
Jessie's freeze is planned for 2014-11-15, hopefully far enough away for this bug to be closed by then. Thanks to meskio's preliminary work, that shouldn't be difficult.
I've reviewed meskio's patches regarding FHS compliance. They look close to ready, and test adaptations are also included.
One issue might remain: nodedir/tahoe.cfg (and probably some others, like private keys) should probably live in /etc/tahoe-lafs/nodedir/ as they are configuration files.
I plan to merge meskio's branch into a feature branch of the Debian package repo and see whether this last piece can be implemented. This could land in Debian's experimental suite if people are willing to try it. That'd be a good candidate for a 1.11-1, now that git.debian.org is up again. :)
Thank you for reviewing my patches, bertagaz. I've been too busy to follow up on this, but I'd love to see it in tahoe.
I'm not sure I understand the remaining issue. Is your proposal to be able to pass the path of the cfg file to the tahoe command?
How do you envision running it from the /etc/init.d/ script? Something like 'tahoe start --cfg=/etc/tahoe-lafs/nodedir/tahoe.cfg /var/lib/tahoe-lafs/nodedir/'?
I'm -1 on splitting out tahoe.cfg etc. from the nodedir, since I think that's too much of a divergence from the current behaviour, and would complicate the code and our support burden significantly.
Would everyone interested in this ticket (#2045) please also go read the entire discussion on #1310 carefully?
Thanks for the #1310 pointer. These two issues certainly have a lot in common.
I've taken time to think and to read more of the code regarding this FHS issue.
Sure, the way tahoe is built doesn't really fit the proposed changes regarding configuration files.
But maybe that's because once they are removed from basedir, there's almost nothing left in there when meskio's patches have been applied. Well, apart from three things I've seen, maybe more as I have surely missed some:
tmp/: This one was fine to lay in basedir if this directory was under /var/
tmpfile: Same as previous
picklefile: Stats_gatherer data
So, another approach for tahoe-lafs to be FHS compliant in vendors would be to make basedir a subdir of /etc/tahoe-lafs/ rather than /var/lib/tahoe-lafs/.
Then the two tmp contents should probably be in a subdir of /var/tmp/tahoe-lafs/. Regarding the picklefile, a subdir of /var/lib/tahoe-lafs/ might be the right place.
In the end, this approach would require very few changes on top of meskio's patches, and not many in Debian's packaging either.
Does it sound realistic? Would it be more acceptable regarding support burden? Is there any other stuff that should move out of basedir once it is considered a configuration directory?
FHS aspires to be universal, but I view it as Linux-centric; BSD has hier(7) which is similar but not quite the same. That argues for making paths configurable, so that people (packages) can just pass --foo-dir=/bar and do the locally-correct thing.
I +1 gdt's point about making them freely configurable and letting each OS packager decide. Command line seems the best/simplest option.
End-users are not going to want to type options specifying paths on every tahoe command they use. Please can we have a bit more practical consideration of usability issues on this ticket and #1310?
End users won't have to do this. In the proposal, they would still use the genuine historical way of using tahoe-lafs: basedir and all its content will reside in ${HOME}/.tahoe. This is already how meskio did it.
Still, what we're trying to achieve is helping sysadmins deploy tahoe-lafs instances system-wide, by having it better integrated in distributions and with other tools. Usually that means respecting standards, like FHS and hier. +1 on gdt's and amontero's proposal btw.
What is proposed is to let sysadmins easily manage nodes on their systems. Regarding the 'multitude' of options, as sysadmins they should know how to do that. Good documentation, and maybe a tahoe create-node helper, is certainly something to ship...
Still, they would just have to use the full options on node creation; after that the relevant directories will be found in the tahoe.cfg file.
As bertagaz clarified, adding "--foo-dir=/bar" options would be a way of overriding the default/current behaviour of using ~/.tahoe, which still applies if nothing is specified. As Daira already pointed out, doing it otherwise would be a serious usability regression. Overriding dirs on the command line would bring the best of both worlds.
I'm not sure if/how adding the config option in the .cfg file fits.
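To make the "override only when asked" idea concrete, here is a minimal sketch. It uses argparse purely for illustration (the real tahoe CLI is built on Twisted's usage.Options), and the function names and the storage/ and logs/ default subdirectories are assumptions for the example, not meskio's actual patch:

```python
import argparse
import os


def build_parser():
    # Hypothetical parser: option names mirror the ones discussed in this
    # ticket, but this is not how the tahoe CLI is actually implemented.
    parser = argparse.ArgumentParser(prog="create-node-sketch")
    parser.add_argument("basedir", nargs="?",
                        default=os.path.expanduser("~/.tahoe"))
    parser.add_argument("--storedir", default=None)
    parser.add_argument("--logdir", default=None)
    return parser


def resolve_dirs(argv):
    """Use explicit options when given, else fall back to basedir-relative paths."""
    args = build_parser().parse_args(argv)
    basedir = os.path.abspath(args.basedir)

    def pick(value, default_subdir):
        # Assumed defaults: <basedir>/storage and <basedir>/logs.
        if value is not None:
            return os.path.abspath(value)
        return os.path.join(basedir, default_subdir)

    return {
        "basedir": basedir,
        "storedir": pick(args.storedir, "storage"),
        "logdir": pick(args.logdir, "logs"),
    }
```

With no options at all, everything stays under ~/.tahoe, so an end user never has to type a path; a packager's init script can pass the extra options once.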
I've worked a bit toward this and sent 3 small patches to meskio, who merged them into his repo (link in the first comment).
tempdir was already configurable in tahoe.cfg, so I've just made it so that it can be defined on the command line during node creation.
I've also added an option to specify an alternative directory for the picklefile, if one wants to put this file in a different path from basedir. This is useful when using the initscript provided in Debian, where a node's basedir will lie in a subdir of /etc/tahoe-lafs/.
It would be great if some core dev could have a look and see if it is good enough to be merged. I think these are the last bits needed to have Tahoe-LAFS follow the FHS.
I sent a pull request with the changes:
https://github.com/tahoe-lafs/tahoe-lafs/pull/79
Replying to meskio:
test_log_dir needs to be updated for the change from logdir to incident_dir
incidents_dir (plural) might be better
storedir should be storage_dir? I think whole words and underscores are preferable. s/tempdir/temp_dir/ too.
tahoe.cfg is not able to configure the path to twistd.log because that path is needed in scripts/startstop_node.py when the config file has not yet been read. A simple solution for properly debianizing tahoe is to just have the --logdir option passed on the commandline by the init script, perhaps configured in /etc/default/tahoe-lafs or something like that. (The pull request includes a --logdir commandline option, not to be confused with the incident_dir config option which was previously called logdir.)
--logdir option needs to be documented
--logdir option should actually be --logfile (as the twistd option derived from it is) so that the log file doesn't have to be called twistd.log. That way it could log to something like /var/log/tahoe-lafs.log.
Replying to [leif] comment:20:
Ack.
Are there several directories to store a node's incident logs?
Ack, that makes sense.
That is the conclusion meskio and I also ended on. I'm willing to implement this in the Debian initscript as soon as the patches are accepted.
Ack.
In case sysadmins run a lot of nodes, this might be a bit confusing, or hard to dig through in a one-log-file-rules-them-all setup. Having one logfile per node would probably be much more usable in this regard. That would also require more code changes to maintain backward compatibility, as the twistd.log filename is appended to the logdir option at the moment. But I admit I'm unsure about this, maybe I missed something.
The --logdir parameter is also used by create-node to initialize the 'incident_dir' config field: if it's present, incident_dir is configured as $logdir/incidents. If we rename --logdir to --logfile, I think this behaviour won't make sense anymore, and I can remove it.
I agree with the rest of leif's comments. I'll fix them soon.
The configuration field 'tempdir' was already in tahoe; what my patches add is a way to configure it through the 'create-node' command. I'm not sure how much sense it makes to rename it in my patches.
leif: will you please shepherd this patch to merge?
I see there have been some more commits on meskio's branch, implementing what leif commented in the first review.
Meskio: do you consider it ready for another review, or do you need more time/help?
Sorry if I bug you too much :)
Replying to meskio:
I agree this shouldn't be renamed.
The remaining parts are:
No problem about the bugging. I'm traveling these days and not paying much attention to it, sorry.
Replying to meskio:
I'm confused, I said 'tempdir' shouldn't be renamed. (That's because renaming it would break compatibility with existing config files, so we'd need to support both names which is unnecessarily complex.)
Oh, I misread it, sorry. I won't rename it then.
Replying to meskio:
Please correct me if I'm wrong, but as I understand it, introducers don't store any data in their basedir, so I don't think we need to have a specific option for them.
Replying to [bertagaz]comment:31:
I don't know where is my brain :( I wanted to write 'incidents_dir', sorry.
Replying to [meskio]comment:32:
Travels often have that effect :)
Actually you might be raising an interesting point in favor of keeping logdir (which was the historical name in tahoe) rather than switching to logfile, as incidents_dir is using the value of the first one if set to compute its own value. That is how I understand it. Switching to logfile might introduce more complexity into the patch (and probably elsewhere), for not so much gain IMHO. I think the patches are quite good at the moment.
So is the logfile switch (not so related to FHS btw) very much needed in the end?
Replying to [bertagaz]comment:33:
The parameter --logdir is introduced by my patches. There was already some internal variables with name 'logdir', that is why I used the name.
I think --logdir is simpler than --logfile, because it allows us to reuse the concept to compute 'incidents_dir'. But it's true that if you have several nodes you will need different folders for the logs. Using --logfile (so specifying the file name) would allow the log files of all the nodes to live in the same folder, just with different names. But then we might need to add an extra parameter to create-node to be able to set the incidents_dir.
I don't have a strong opinion about it. bertagaz, you did more thinking about how all that will fit in debian, do you prefer one option over the other? What is your opinion, leif?
In case that we go for --logdir the patches are ready for another review.
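For readers comparing the two options, a rough sketch of the trade-off. The twistd.log filename and the $logdir/incidents convention come from the comments above; the function names are made up for the example and are not part of the patches:

```python
import os


def log_paths_from_logdir(logdir):
    """--logdir style: one directory per node, both paths derived from it."""
    logdir = os.path.abspath(logdir)
    return {
        # The twistd.log filename is appended to the logdir.
        "logfile": os.path.join(logdir, "twistd.log"),
        # incidents_dir is computed as $logdir/incidents.
        "incidents_dir": os.path.join(logdir, "incidents"),
    }


def log_paths_from_logfile(logfile, incidents_dir):
    """--logfile style: the file name is free, so nodes can share a directory,
    but the incidents directory must be given separately."""
    return {
        "logfile": os.path.abspath(logfile),
        "incidents_dir": os.path.abspath(incidents_dir),
    }
```

The first form needs a single option per node; the second allows something like /var/log/tahoe-lafs.log at the cost of an extra parameter for incidents.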
I think that, given that users will probably want separate incidents directories per node anyway, there's not much advantage in being able to specify --logfile to be in the same directory. Therefore, I suggest keeping --logdir (if that is considered sufficiently compatible with FHS -- but I think it is not very different from what some other Linux software does).
IIUC, that means the branch is ready for another round of review.
When I tried to use the patches (at 6f5fe46c8ffd1e74b27e0b6caf0a4ead1e7680be) I found I was unable to create a new node without specifying the logdir.
I think overloading the logdir option to mean "parent of 'incidents' directory path to be written to tahoe.cfg" during node creation but "directory where twistd.log will be written" when starting an existing node is confusing.
I've pushed some commits to https://github.com/meskio/tahoe-lafs/pull/1 which add an incidents_dir option to the create node scripts and also change the behavior of the runtime logdir option (making it relative to the user's current directory rather than the node's basedir). I also slipped in a change of allowing web.port to be set by create-introducer because it seemed to fit with the new CreateNodeCommonOptions class I made for the incidents_dir parameter.
I've merged the pull-request.
I don't have a strong opinion on the logdir option. I'm fine with the new incidents_dir option you added to create-node.
Is there something else missing? Do you think we are ready to merge?
After re-reading the whole thread and my pull-request I think it makes sense to implement --logfile. Only one log file will be written in this directory, it is no longer reused for the incidents_dir, and when running as a daemon it can be used to place a log at something like /var/log/tahoe.log.
I implemented it in the last commit.
I'd like to be careful with ambiguous argument/configuration options. In particular, what happens if a configuration file specifies separate directories, while the commandline also includes --node-directory?
My plan is to:
--node-directory in conflict with the configuration file.
I see the following unittest failures on meskio's pull request revision b7f3585.
I recall Daira mentioning some failures in the meeting, so these may be the same. I'm switching to other tasks.
Replying to nejucomo:
I'm not sure why that should be an error. I would expect that --node-directory operates as usual to change the node directory, and that any explicitly specified config entries from the tahoe.cfg in that directory override the node-directory-relative defaults.
The errors in comment:92923 are indeed the ones I saw in the meeting.
I suspect that there may be additional errors on Windows but my computer crashed before I could look at them.
Replying to [daira]comment:41:
I do agree with daira, and I believe that's what actually happens with my code.
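The precedence daira describes (the node directory sets the defaults, explicit tahoe.cfg entries override them) could look roughly like the sketch below. The [node] section, the option names, and both helper functions are assumptions for illustration; this is not the code in the pull request:

```python
import configparser
import os


def resolve_path(config, basedir, option, default_subdir, section="node"):
    """Return the configured path for `option`, else <basedir>/<default_subdir>."""
    if config.has_option(section, option):
        value = os.path.expanduser(config.get(section, option))
        # Relative entries are taken relative to the node directory;
        # absolute entries pass through os.path.join unchanged.
        return os.path.abspath(os.path.join(basedir, value))
    return os.path.abspath(os.path.join(basedir, default_subdir))


def load_node_paths(basedir):
    """Read <basedir>/tahoe.cfg (if any) and resolve the node's directories."""
    config = configparser.ConfigParser()
    # ConfigParser.read() silently skips a missing file, so defaults apply.
    config.read(os.path.join(basedir, "tahoe.cfg"))
    return {
        "storage_dir": resolve_path(config, basedir, "storage_dir", "storage"),
        "incidents_dir": resolve_path(
            config, basedir, "incidents_dir", os.path.join("logs", "incidents")),
    }
```

Under this reading, --node-directory never conflicts with the config file: it only selects which tahoe.cfg is read and what the fallback locations are.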
Replying to nejucomo:
I'm working on fixing them. I thought I ran the tests on the latest commit, but maybe I didn't.
Replying to nejucomo:
I discovered that b7f3585 introduced a bug, which I fixed in a new commit.
Replying to daira:
I don't have a Windows machine to test it on; it would be nice if someone could test it and report back.
I'm working on a new clean branch that addresses the issues mentioned.
I have a work in progress branch that addresses the problems commented on the review and has a clean history:
https://github.com/meskio/tahoe-lafs/tree/2045_configurable_paths
New pull-request:
https://github.com/tahoe-lafs/tahoe-lafs/pull/99
Note: The use case of a system wide daemon with multiple user clients connecting is similar to the #1665 use case (of a public/shared web gateway) as well as with #1283 use case (running a system wide Windows service).
We need to adopt a cross-platform consistent usability story for all of these use cases. For example, a system-wide daemon should not have an aliases file, which should be local to users.
Replying to nejucomo:
Yes, this is something that would be nice to consider, but it will take more planning than we have done here up to now, and more knowledge of other platforms than I have myself. It can be done step by step: introduce the changes in this ticket now and think about the broader picture for future changes.
New pull-request fixing the comments up to now:
https://github.com/tahoe-lafs/tahoe-lafs/pull/104
(sorry to be coming late to the party here)
My current opinion: I'm happy to make it easier to be FHS-compliant, I absolutely want Tahoe in Debian, I'm +0 on enabling system-wide installs, I'm -1 on adding a bunch of --blahdir= options, and I'm -0 on extending tahoe.cfg to express the necessary directory paths.
I guess I'd assumed that getting Tahoe into debian would just mean making /usr/bin/tahoe work, enabling users to create their own nodes (in ~/.tahoe) and use them independently. I can see how that's not an ideal assumption: there are almost-reasonable ways to share access to a system-wide node among multiple users. Currently the main problem is that they'll all be forced to use the same grid, convergence secret, and encoding parameters. In the future, as we start building Agents and control panels and stuff, this will get more challenging. But right now I can see it working well enough.
One option would be to merely change the Tahoe codebase to put the locations of all these directories into a single source file. The Debian/FHS-compliant version could patch this file to look in the appropriate places for that platform. The trunk version could stick with the existing NODEDIR-relative approach. That might be easiest for everybody, although it would mean a non-upstreamable patch for the debian folks.
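As an illustration of the "single source file" option only, such a module might look something like this. Every name here (NodeLayout, nodedir_layout, fhs_layout) is hypothetical, and the FHS-style locations are just the ones floated earlier in this thread, not an agreed layout:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeLayout:
    """All node directory locations, gathered in one place."""
    config_dir: str
    storage_dir: str
    log_dir: str
    tmp_dir: str


def nodedir_layout(basedir):
    """Trunk behaviour: everything lives under the node directory."""
    basedir = os.path.abspath(os.path.expanduser(basedir))
    return NodeLayout(
        config_dir=basedir,
        storage_dir=os.path.join(basedir, "storage"),
        log_dir=os.path.join(basedir, "logs"),
        tmp_dir=os.path.join(basedir, "tmp"),
    )


def fhs_layout(node_name):
    """What a distribution patch might substitute (illustrative paths only)."""
    return NodeLayout(
        config_dir=os.path.join("/etc/tahoe-lafs", node_name),
        storage_dir=os.path.join("/var/lib/tahoe-lafs", node_name),
        log_dir=os.path.join("/var/log/tahoe-lafs", node_name),
        tmp_dir=os.path.join("/var/tmp/tahoe-lafs", node_name),
    )
```

The rest of the code would only ever ask a NodeLayout for paths, so a packager's patch would touch this one file and nothing else.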
Another would be to have tahoe.cfg settings for these directories, and do something in the debian world to populate it appropriately for the FHS. Maybe ship a separate create-node script that puts everything into tahoe.cfg? Maybe add a tahoe create-node --fhs and have the debian postinst script run that if /etc/tahoe doesn't already exist?
I guess one critical question is: to what extent is it appropriate to automatically create a node at install time? Tahoe nodes are useless without a specific grid to connect to, so the answer is probably "no, it's not appropriate". Which means really the package needs to point admins at the right tool to create this system-wide shared node some time after installation. How do we want them to do this? I like writing the docs first, before tests and code.. what instructions could we feel comfortable giving sysadmins to create their system-wide node, and how would it differ from what users would use to create a personal node?
That said, how technically difficult would it be to use tahoe.cfg to drive this? Suppose we define BASEDIR as the thing that holds tahoe.cfg, and therefore put it in /etc/tahoe or something. Until we fix #1159 we also need the .tac file in there, but that's config-like and static, so that's fine. We read the config file before doing anything else. Logs are easy enough to write elsewhere. There are some FURLs that are generated on each node startup, but tahoe.cfg can instruct the node to use other directories for them (/var/lib ?) first. So probably doable.
Or, should we go even narrower and just add a tahoe --fhs global option? One boolean that the node knows should change the configuration of all these paths to something that matches FHS? I'm less confident about that, because then running multiple nodes on a single box becomes really hard (you need to coordinate the various subdirs of /etc/tahoe , /var/lib/something, etc).
I don't exactly understand what Brian is suggesting in comment:92935. I don't understand why not to add a bunch of --blahdir= options, and I don't really understand how any other way of making the directory structure Debian-compatible wouldn't be worse.
Replying to warner:
All the --blahdir= options and the tahoe.cfg configuration fields are optional and backward compatible. You can leave them unused and everything will work as it does right now. I understand that adding options always complicates things and is not great.
We want to support both possibilities in debian: running it as a daemon, and running it as a user with all the configuration in ~/.tahoe as it is right now. I think the daemon setup can be a requirement for other unixes too, and is not debian or gnu/linux specific. Hardcoding a 'FHS' setup somewhere in tahoe or in debian won't help other unixes to daemonize tahoe.
I don't care about install time. As a sysadmin, I care about being able to run tahoe from init when my system boots without hacking debian to fit the tahoe model, but the other way around: having tahoe fit in with the rest of the daemons on my system.
We are putting all that we can in tahoe.cfg, but things like logs start being written before tahoe.cfg is read, so we need to configure the logging without tahoe.cfg. Or modify tahoe a lot to change the logging system.
No, FHS is too specific to a bunch of linux distributions, and not general enough for other unixes to daemonize tahoe.
I think meskio is right that being able to configure each particular directory, like logs, storage, and tmp, is necessary for the Debian use case. I still don't really understand what alternative Brian is proposing. Is Brian suggesting a different way to configure those three directories instead of commandline options like --logs-dir=BLAH?
I intend to work on this in time for 1.11 (but not the upcoming 1.10.3).
I have some of the work on this ticket in my truckee branch at https://github.com/leif/tahoe-lafs
Great. leif, tell me if you need any help from me. You can find my latest pull-req at:
https://github.com/tahoe-lafs/tahoe-lafs/pull/104
Milestone renamed
moving most tickets from 1.12 to 1.13 so we can release 1.12 with magic-folders
I did a review and had some questions/comments.
Regarding overall discussion, and as another sysadmin:
it is undoubtedly good to have an option to configure all the paths and may be critical for some installs. Thanks for the pull request!
I don't think that just sticking everything into something like /var/lib/tahoe is against FHS, especially given that tahoe is not a system-wide but a device-wide service (I suspect it is very common to have one node per physical storage device).
Also, I think FHS is dead and hardly makes sense in the modern world of docker and static builds. Sadly it will take Debian and other traditional distros quite a while to figure this out.
As a sysadmin, I'd be pissed if Debian ships tahoe with very non-standard mangled configuration just for FHS sake and makes magic behind my back to automatically create and start a node.
Replying to rvs:
We could argue about that, I'm not sure I agree. But anyway, I think we agree that it is useful to be able to configure the path of things like storage or logs, which is what matters for this issue.
I'm not the debian packager, but what makes sense to me is to package tahoe as it is (no magic), so that the intended use case of tahoe being run by each user separately is still the default one, and to also ship a systemd unit and/or /etc/init.d/tahoe script that sysadmins can enable to daemonize tahoe using all these configuration paths.
In ee20a69/trunk:
Moving open issues out of closed milestones.
Ticket retargeted after milestone closed