Cli fs operations refactoring #640
Each of the CLI commands implements its own logic for doing fs operations, both local-side and Tahoe-side. I think that most of the command modules should be refactored to use common code, for many reasons, among them:
The steps needed to do that, as far as I can see:
Identify the needs of each command in terms of fs manipulation;
Look at the best implementation available and, if necessary, improve it;
Maybe design a minimal layer that helps with handling alias/path/cap-based operations and other cases (see the rough sketch at the end of this comment).
Other interesting points to discuss:
Let's discuss it!
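To make the "minimal layer" idea above a bit more concrete, here is a rough, purely hypothetical sketch of what a shared path-resolution helper might look like. The function name, the aliases mapping, and the exact behaviour are illustrative assumptions, not the existing allmydata.scripts API.

# Hypothetical sketch of a shared helper for alias/path/cap handling.
# Names and behaviour are illustrative, not the existing Tahoe-LAFS API.

def resolve_location(arg, aliases):
    """Split a CLI argument into a (root_cap, relative_path) pair.

    `arg` may be a raw capability ("URI:..."), an "alias:path" form,
    or a plain path interpreted relative to the default "tahoe:" alias.
    `aliases` maps alias names to root capabilities.
    """
    if arg.startswith("URI:"):
        # A bare capability is its own root.
        return arg, ""
    alias, sep, rest = arg.partition(":")
    if sep and alias in aliases:
        return aliases[alias], rest
    # No recognized alias prefix: treat the whole argument as a path
    # under the default alias.
    return aliases.get("tahoe", ""), arg

Every command (cp, mv, ls, ...) could then go through one such helper instead of re-implementing the alias/path/cap parsing itself.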
azazel: Those are good reasons! We already have reasonable test coverage for most cli commands, so after doing this refactoring it should be possible to look at the per-line code coverage output and see that the tests are thoroughly exercising the code in the new refactored version.
I don't have time to do this myself (see my blog for notes about what I'm doing), but I would happily review and accept patches that do this (once the buildbot is green).
That should probably be a separate ticket.
I read over the code to understand how to get from my shell into ./src/allmydata/scripts/* and it seems overly complex:
./bin/tahoe is a python script generated from ./bin/tahoe-script.template.
Q1: Why is this generated from a template? That seems unpythonic and unnecessary. When I diff the template and the generated script on my Mac, the only change is to replace "#!/bin/false" with "#!/usr/bin/python".
./bin/tahoe does a bunch of platform difference wrangling and eventually calls ./support/bin/tahoe.
Q2: Can any of the complexity in ./bin/tahoe be removed?
./support/bin/tahoe is very short and simply calls: load_entry_point('allmydata-tahoe==1.8.0', 'console_scripts', 'tahoe')()
Q3: Why does ./support/bin/tahoe exist? Why doesn't ./bin/tahoe call load_entry_point directly?
Q4: Why is the load_entry_point mechanism used? Is it necessary?
load_entry_point() resolves a mapping for "console_scripts" for tahoe to ./src/allmydata/scripts/runner.py's run() function. This mapping is configured in ./setup.py (a simplified example of such a declaration appears after this comment).
Q5: Why doesn't ./bin/tahoe call runner.py's run() directly?
Q6: Why are there two separate packaging processes necessary to make the commandline work? One being script template generation, the other being entry_point registration.
My perspective probably doesn't account for use cases, but I hope by asking these questions we might notice potential simplifications.
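For reference, the console_scripts mapping discussed in the comment above is declared in setup.py roughly as follows. This is a simplified, illustrative excerpt (the real setup.py contains much more), but the "tahoe = allmydata.scripts.runner:run" line is the mapping that load_entry_point() resolves.

# Simplified, illustrative excerpt of a console_scripts declaration in setup.py.
from setuptools import setup

setup(
    name="allmydata-tahoe",
    # ... other metadata and dependencies omitted ...
    entry_points={
        "console_scripts": [
            # "tahoe" on the command line maps to run() in
            # src/allmydata/scripts/runner.py
            "tahoe = allmydata.scripts.runner:run",
        ],
    },
)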
Replying to nejucomo:
Re Q1: The intention is that the script runs with the same Python interpreter that ran setup.py build. A shebang line must specify an absolute path. It would be possible to use "/usr/bin/env python", but note that some of the built dependencies contain compiled code that is specific to a Python version, so upgrading the system Python (the one first on the PATH) would then cause an existing installation to fail.
Re Q2: Yes. Most of the complexity is just due to the fact that we have two scripts (this one and support/bin/tahoe). The reason we need two scripts is that the one generated by setuptools doesn't set up the sys.path as we want, and setting the PYTHONPATH environment variable is the documented way to change the initial sys.path. The Windows-specific code is necessary in order to pass on Unicode arguments correctly to the second script, which would not be needed if we only had one script.
Re Q3: support/bin/tahoe is the entry script generated by setuptools. I presume that originally we considered the contents of that script to be an implementation detail of setuptools, and therefore didn't want to duplicate it. Since then, we have ended up changing how setuptools generates scripts, and we would probably be better off just setting sys.path ourselves. That is what Brian's "unsuck branch" does, IIUC.
Re Q4: Sort of. Some mechanism to put the right copy of Tahoe and its dependent libraries on the sys.path is necessary, and load_entry_point is the setuptools way of doing that.
Re Q5: The sys.path would be wrong.
Re Q6: A combination of overcomplication in the setuptools design, and our workarounds for setuptools bugs.
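To illustrate the "setting sys.path ourselves" alternative mentioned in the answer to Q3, here is a minimal hypothetical launcher sketch. It is not the actual bin/tahoe nor the unsuck branch; the directory layout and the assumption that runner.run() can simply be called like this are illustrative.

#!/usr/bin/env python
# Hypothetical single launcher script: set sys.path directly and call
# the runner, instead of going through a setuptools-generated entry
# script. Paths and the exact call into runner are assumptions.
import os
import sys

base = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
# Put the in-tree source (and any locally built dependencies) first on sys.path.
sys.path.insert(0, os.path.join(base, "src"))

from allmydata.scripts import runner

if __name__ == "__main__":
    runner.run()

This trades the setuptools entry-point indirection for an explicit, visible sys.path setup, at the cost of having to keep that path logic correct ourselves.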