build/install should be able to refrain from getting dependencies #1220
In a managed package system, each program's dependencies are expressed in control files and provided before the package builds. If the package has more dependencies than expressed, the right behavior is failure so that this can be fixed, and it is unhelpful to download/install code either from included eggs or especially from the net.
There are two parts to this problem. One is downloading and installing things like py-cryptopp. The other is that tahoe seems to need modified versions of standard tools and ships included eggs. This kind of divergence should be resolved.
I realize that this complaint is perhaps directed at setuptools, but tahoe-lafs inherits responsibility.
A reasonable solution would be to have a switch that packaging systems can add.
I put this in the packaging component even though the bug is in tahoe-lafs itself, not in any packaging of it.
I just remembered that there is the --single-version-externally-managed flag. If you pass that flag as an argument to python setup.py install, then it will suppress all automated fetching of dependencies. We test the use of this flag on all of our buildbots -- look at the buildsteps called "install-to-prefix" and "test-from-prefixdir" (e.g. the ones on NetBSD). "install-to-prefix" does an install using --single-version-externally-managed to suppress automated resolution of dependencies, and "test-from-prefixdir" runs the unit tests in the resulting target directory where it was installed to. Please try adding --single-version-externally-managed and see if that is sufficient to close this ticket.
I don't see how a flag passed at install time would really fix the issue. What I would like is to tell the build step to not install missing dependencies.
Well, can you (either of you) show me a script that is used to package Python applications for your system? I imagine that you could do something like this:
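A minimal sketch of the kind of step being suggested, using the $TARGETDIR staging directory referred to below (the exact flags beyond --single-version-externally-managed are illustrative, not prescribed):

    # Install into a staging directory without any automated fetching of deps.
    TARGETDIR=/tmp/tahoe-staging
    python setup.py install \
        --single-version-externally-managed \
        --prefix="$TARGETDIR" \
        --record=installed-files.txt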
Then collect all the files that got written into $TARGETDIR and put them into your newly created package. This should work with any setuptools-built Python package.
But, if that's not how you do it, then show me how you do it and I'll see if I can help make it so that the setuptools automatic resolution of dependencies gets out of your way.
Here's a log of building under pkgsrc. You can see that it's basically setup.py build (with a presetup step that creates a symlink tree of allowed libraries, so that only expressed dependencies are available). build doesn't have --single-version-externally-managed but install does. So are you saying that I should pass --single-version-externally-managed to the build phase as well?
pkgsrc-build-log.txt
Okay, thanks for the log! My current thought is: do we need the build step for anything? What happens if you just comment out that step and head straight for the install step? As far as I know, that will work, and will also completely avoid any automated downloading of any dependencies (since the install step already has --single-version-externally-managed). Tahoe-LAFS doesn't have any native code modules that need to be compiled, but even if it did (or if you used this same script for a different Python package which did have native code modules) then I think running python setup.py install would automatically build those native code modules, so I don't think you really need to invoke python setup.py build directly.
I just ran a quick manual test locally, and python setup.py build --single-version-externally-managed gives an error message saying that "--single-version-externally-managed" is not a recognized option for "build", but python setup.py install --single-version-externally-managed --prefix=instdir --record=list-of-installed-files.txt correctly builds and installs without downloading any dependencies.
Replying to zooko:
No, because pkgsrc requires that the build phase do all the things that feel like what "make" should do, and stay within the working directory. Then install does what "make install" should do and puts the compiled bits in a staging area. Then the package phase tars up that staging area.
It seems odd to me that --single-version-externally-managed suppresses dependencies and is only valid at install. I had thought --svem was about changing the way the egg file is created, and the dep suppression seems to be a side effect.
The real question for me is whether a build/install attempt would fail, and refrain from getting dependencies, in the case where they didn't already exist.
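A rough sketch of the pkgsrc-style phase separation described above (WRKSRC and DESTDIR are placeholders here, not literal pkgsrc variable handling):

    # build phase: behave like "make", staying inside the working directory
    cd "$WRKSRC" && python setup.py build
    # install phase: behave like "make install", writing only into a staging area
    python setup.py install --single-version-externally-managed \
        --root="$DESTDIR" --record=installed-files.txt
    # package phase: bundle up the staging area
    tar -C "$DESTDIR" -czf tahoe-lafs-binary-package.tgz .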
Replying to gdt (comment:9):
But there aren't any compiled bits, so as far as I can tell if we force the build phase to be a no-op then we still satisfy the pkgsrc protocol. Alternately, if you let the build phase be python setup.py build (just like it currently is) instead of a no-op, then we are still satisfying the protocol, because it keeps all of the deps that it acquires within its working directory.
But maybe there is another requirement for the build phase besides what you wrote above, such as "no open connections to remote hosts" or, perhaps even more importantly, "no printing out messages that make the human think that you are installing deps".
Is one or both of those a requirement? Am I missing some other requirements on what the build phase is allowed/required to do?
Why do you find this to be odd? Perhaps it is because you think of python setup.py build as the step that would create an egg if an egg were going to be created? It is not -- if an egg were going to be created, that would be done in python setup.py install.
Replying to gdt:
Oh, I see, so the requirement that I was missing on the "build" step is: "return non-zero exit code if any of the deps are missing".
Waitaminute, that's not truly a requirement. None of your C programs, for example, reliably do that do they? Or maybe some of them do nowadays by using a tool like pkg-config?
So, I'm still not 100% certain what you mean by "refrain from getting dependencies". Does my buildstep fail if it opens a TCP or HTTP connection but doesn't download any large files? Does it fail if it downloads a large file but that file isn't a dependency? What if it downloads a dependency as a .zip or a .tar but doesn't unpack it? What if it unpacks it but only into the current working directory (this is the one that it currently does)? What if it writes it into /usr/lib/python2.6/site-packages and then edits your /usr/lib/python2.6/site-packages/site.py script to change the way Python imports modules (this is the one that it would do if you ran sudo python setup.py install)? Does it matter whether it prints out messages describing what it is doing versus if it stays quiet? Does it matter how long it takes to finish the build step?
Replying to zooko (comment:12), following up to myself:
Although we could potentially do better than C programs and actually satisfy this requirement of reliably exiting with a non-zero exit code if all of the deps aren't already present. Is that what we should do? It sounds like we would be going over and above the normal requirements of a pkgsrc build step, and if we were going to go that direction then we should try to generalize the hack so that all Python programs that are being built by pkgsrc would do the same. :-)
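A hedged sketch of what such a check could look like; the module names are illustrative examples, not an authoritative list of Tahoe-LAFS dependencies:

    # Fail the build step with a non-zero exit code if a required Python
    # module is not already importable on the build machine.
    for mod in pycryptopp zfec foolscap; do
        python -c "import $mod" >/dev/null 2>&1 || {
            echo "missing dependency: $mod" >&2
            exit 1
        }
    done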
You raise good points about unarticulated requirements; a lot of them are captured in "what 'make' is supposed to do". So specifically, the build phase:
An underlying goal is that building a package should have a deterministic outcome, with the same bits produced regardless of which dependencies or other programs were already installed. This allows the use of the resulting binary packages on other systems. If a program has an optional dependency foo, then the pkgsrc entry has to require foo (and thus depend on the foo package), or disable use of foo, or have a pkgsrc option to control it. Having the built package be built differently depending on whether foo is present is considered a packaging bug (and perhaps an upstream bug, if there's no --disable-foo switch/method).
It's also a goal to be able to 'make fetch-list|sh' on a net-connected machine and grab all distfiles but not build, and then to be able to build offline.
I see that there are .pyc files installed, but not produced during build. This seems wrong, but not important or causing an actual problem, and it seems to be the python way.
Basically, there's a huge difference in approach between large-scale package management systems and the various language-specific packaging systems. I suspect debian/ubuntu and rpms are much more like pkgsrc than not in their requirements. But there seems not to be a culture of bulk building all rpms in Linux; it seems the maintainers build them and upload them.
I ran 'python2.6 setup.py install --single-version-externally-managed --root ../.destdir' without having run build, after uninstalling nevow. The install completed, and then running that tahoe failed on importing tahoe.
Having read setup.py and _auto_deps.py, I think the problem is in hand-written setup code in tahoe-lafs which needs a switch to require/fail vs require/fetch.
[The problem isn't causing me lots of trouble; I simply check the build output when updating the package and manually consider it broken if it uses the net.]
Should we close this ticket, due to the existence of the --single-version-externally-managed flag for python setup.py install?
No, because a) --svem isn't usable during a build phase (install writes to the destination) and b) it doesn't check dependencies and fail. (This gives me the impression install is only supposed to be used after build.)
I don't mean to demand that anyone spend time on this, but I still think the setup.py code is incorrect compared to longstanding open source norms.
I would be curious to hear about how people who work on packaging for other systems deal with this issue.
This problem is an annoyance and increases the risk of packaging errors, but the resulting packages are ok. Therefore dropping to minor, which it probably should have been already.
Replying to gdt:
'setup.py install' and 'setup.py build' are alternatives. As far as I understand, it isn't intended that both be used.
I don't dispute that, but I favour making sure that a replacement for setuptools -- probably Brian's "unsuck" branch -- follows those norms by default, rather than continuing to hack at zetuptoolz. zooko's efforts with the latter are appreciated, but that approach has consumed an enormous amount of development effort, and is still causing obscure and often irreproducible bugs on our buildslaves and for our users.
I was just hacking at zetuptoolz and I noticed that there is already a method named url_ok() which implements the feature of excluding certain domain names from the set that you will download from. If we hack it to always return False (when the user has specified "no downloads") then this would be our implementation of this ticket. Here is the url_ok() method in zetuptoolz. Here is the current body of it:
Replying to gdt:
How about this. I'm going to propose a build step and you have to tell me if you would accept any code that passes that build step or whether you have other requirements.
The buildstep starts with a pristine tarball of tahoe-lafs and unpacks it, then runs python setup.py justbuild. If the code under test emits any lines to stdout or stderr which have the phrase "Downloading http" then it is marked as red by this buildstep. (The implementation of this test is visible in [misc/build_helpers/check-build.py]source:trunk/misc/build_helpers/check-build.py?annotate=blame&rev=4434#L15, which is invoked from the [Makefile]source:trunk/Makefile?annotate=blame&rev=4847#L278.)
Then the buildstep runs python setup.py justinstall --prefix=$PREFIXDIR. Then it executes $PREFIXDIR/bin/tahoe --version-and-path, and if the code under test emits the right version and path then it is marked as green by this buildstep, else it is marked as red.
Now, one thing that this buildstep does not require of the code under test is that it detect missing dependencies or that it find and download missing dependencies. That would be cool, and you have requested it in this ticket, and I know how to implement it, but since that is above and beyond the standard packaging functionality that we're trying to emulate, perhaps we should open a separate ticket and finish fixing the basic functionality first.
This means that the test can't give the code under test a fair chance of going green unless it is run on a system where all of the dependencies are already installed. As far as I understand, that's standard for this sort of packaging.
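A rough sketch of the red/green check proposed above; justbuild and justinstall are the hypothetical setup.py commands named in this proposal, and the real implementation is the check-build.py script linked above:

    # Red if the build output ever mentions downloading something over HTTP.
    python setup.py justbuild 2>&1 | tee build-output.log
    if grep -q "Downloading http" build-output.log; then
        echo "red: build step tried to download a dependency" >&2
        exit 1
    fi
    # Green iff the installed tahoe reports the right version and path.
    python setup.py justinstall --prefix="$PREFIXDIR"
    "$PREFIXDIR/bin/tahoe" --version-and-path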
If you like this ticket, you might also like #1270 (have a separate build target to download any missing deps but not to compile or install them).
I don't consider this a minor issue, because the downloading from potentially insecure sites is a significant vulnerability (as we were recently reminded by SourceForge being compromised -- and setuptools will happily download from far less secure sites than SourceForge).
People were just wishing for related (but not identical) functionality on the distutils-sig mailing list, and Barry Warsaw settled on patching the setup.cfg of each Python project that he is building to add this stanza:
http://mail.python.org/pipermail/distutils-sig/2011-February/017400.html
But I still feel like this ticket is underspecified. Before I make further progress on this ticket I want someone who cares a lot about this issue to tell me whether the test procedure (which is a Buildbot "build step") in comment:21 would be sufficient.
As Kyle mentioned on a mailing list thread, it would be nice if, when the build system detects that it already has everything it needs locally, then it doesn't look at the net at all. If this ticket were fixed, and we had the ability to refrain from getting dependencies, then we could also implement this added feature of "don't look at the net if you already have everything you need". I guess that should really be a separate ticket, but I honestly don't feel like going to all the effort to open a separate ticket.
I'll just re-iterate that if you want me, or anyone else, to make progress on this ticket, then please start by answering my questions from comment:21.
The "allow_hosts=None" configuration that Barry Warsaw was using (mentioned in comment:80450) is documented here:
pip has the following relevant options:
These seem very comprehensive and useful!
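For illustration (this is not necessarily the exact option list the commenter had in mind), pip supports options such as --no-deps, --no-index, and --find-links that address the same need:

    # Install without touching the network and without pulling in dependencies,
    # using only a local directory of already-fetched distfiles.
    pip install --no-index --no-deps --find-links=/path/to/local/distfiles allmydata-tahoe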
I don't fully understand Zooko's suggestion in /tahoe-lafs/trac-2024-07-25/issues/6282#comment:21 above, probably because I know very little about python packaging. Here's what I would want:
that implies:
Replying to jmalcolm:
jmalcolm: what you wrote there seems consistent with my proposal from comment:21.
On #2055, dstufft wrote:
I think a good next-step on this is #2473 (stop using setup_requires).
Another good next step on this is to take the "Desert Island" test (https://github.com/tahoe-lafs/tahoe-lafs/blame/15a1550ced5c3691061f4f07d3597078fef8814f/Makefile#L285) and copy it to make this test. The change from the "Desert Island" test to this test is that this test runs python setup.py justbuild; the Desert Island test runs python setup.py build.
I think this should be resolved, now that we're using pip/virtualenv and do not have a setup_requires= anymore. Packagers can use python setup.py install --single-version-externally-managed with a --root that points into a new directory, then turn that directory into a package. I believe this is how Debian currently does things, and by changing Tahoe to behave like every other Python package, we should be able to take advantage of that machinery.
gdt, please feel free to re-open this if you disagree.
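A short sketch of the flow described in this closing comment, with $DESTDIR standing in for the packaging system's staging directory:

    # Stage the install into a fresh directory, then package that directory
    # with the packaging system's own tooling.
    python setup.py install --single-version-externally-managed \
        --root="$DESTDIR" --record=installed-files.txt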