the test-from-egg buildstep has started failing on all builders #2378
Reference: tahoe-lafs/trac-2024-07-25#2378
It was passing in [5218e87ed148e92353e03f86e8949d227b5af731/trunk] and has started failing since e4149496d267d39b61d93ececcb014010235f850/trunk. (There are 14 non-merge commits between those revisions, but it should be an easy bisection.)
This is caused by cf0f64be98c42b4c18c2dccf297d67b44de53a0d/trunk. (I hadn't noticed that source:misc/build_helpers/run_trial.py was still used by the test-from-egg step. Mea culpa.)
We changed the buildbot config to run `sys.executable tahoe debug trial testsuite`. Unfortunately that runs into the bug described in #2242, in which a function inside `pkg_resources` is interpreting a `__requires__` version specification incorrectly. The symptom is:

So I think we must fix #2242 first, either by fixing our zetuptoolz fork, or getting rid of it altogether (#2044).
I was able to reproduce this locally, eventually, with the following (paraphrased slightly):
The PYTHONPATH in step 5 is necessary to appease `easy_install`, which is trying to protect you against writing into a directory that you wouldn't normally read from. The PYTHONPATH in step 7 does two things: it uses tahoe's zetuptoolz fork, and it makes the dependencies available.

Step 7 produces the "Missing distribution spec" exception.
The `egginst` directory, as populated by `easy_install`, contains an .egg for every dependency, an `easy-install.pth` which would add them to the path (if you included `egginst` on PYTHONPATH), and an executable entry-point script for everything that produced one (tahoe, flappserver, flogtool, some zfec ones, etc). It is this `tahoe` entrypoint script that we execute. On my machine, it contains the following:

(this matches what I'm seeing on freestorm's buildslave).
If I omit tahoe's zetuptoolz egg from the PYTHONPATH in step 7, the problem goes away, and the tests run normally.
Note that step 5 is using my system-supplied `easy_install`, which reports its `--version` as `setuptools 12.1`. I think the problem here is a mismatch between the version of setuptools (zetuptoolz) that produced the .egg file in step 2, the version of easy_install that installed the egg (setuptools-12.1) in step 5, and the version of setuptools (zetuptoolz again) which provides the `pkg_resources` used for execution in step 7.

Our toolchain takes great pains to work even if you don't have setuptools installed on your system (perhaps due to my early complaints about not liking setuptools :-), by providing a setuptools/zetuptoolz .egg in the source tree, and importing it early in `setup.py`. However, step 5 in the test-from-egg process depends upon having a system-supplied `easy_install` script. (Tahoe includes setuptools as an egg, but not easy_install).

Digging into the zetuptoolz `pkg_resources`, it seems that the single-string `__requires__` in the entrypoint script is, sometimes, being interpreted as a list, causing it to parse each letter of the string as a separate requirement string. Most of these ("a", "l", "l", "m", etc) are ignored, but when it finally gets to "=", the parser throws an exception, because "=" is supposed to have a package name on the left. However, in other places within `pkg_resources`, the string is correctly interpreted as a string.

I've seen other entrypoint scripts that provide a list for `__requires__`, like the one tahoe's `setup.py build` installs into `support/bin/tahoe`:

And if I make `easy_install` use tahoe's zetuptoolz egg, by replacing step 5 with:

5b. PYTHONPATH=~/tahoe/setuptools.egg:egginst easy_install -d egginst ~/tahoe/dist/tahoe.egg

then I get the list-`__requires__` entrypoint script like the one in `support/bin/tahoe`.

So my hunches are:

- `__requires__` list
- `pkg_resources` accepts a string or a list
- `easy_install` produces a string

and the mismatch is between the version of easy_install that creates the entrypoint script, and the version of pkg_resources that gets loaded by that script.
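
The string-versus-list hunch can be illustrated without touching `pkg_resources` at all. The following is a minimal sketch, not the zetuptoolz code: `naive_iter_requirements`, `iter_requirements`, and `check_spec` are hypothetical helpers invented for illustration, and the toy validator only mimics the "operator needs a name on its left" rule described above. It shows how iterating a bare string yields one character at a time, so the failure only appears once the loop reaches "=".

```python
def naive_iter_requirements(requires):
    """Buggy variant: assumes `requires` is already a list of strings.

    If a single string is passed instead, iteration yields one
    character at a time.
    """
    for item in requires:
        yield item


def iter_requirements(requires):
    """Tolerant variant: accepts a single string or an iterable of strings."""
    if isinstance(requires, str):
        # Wrap a bare string so it is treated as exactly one requirement.
        requires = [requires]
    for item in requires:
        yield item


def check_spec(spec):
    """Toy validator (hypothetical, not the pkg_resources parser).

    Rejects a spec that starts with a comparison operator, because an
    operator is supposed to have a package name on its left.
    """
    if spec.startswith(("=", "<", ">", "!")):
        raise ValueError("Missing distribution spec: %r" % (spec,))
    return spec
```

With a made-up requirement such as `"pkg==1.0"`, the naive variant passes "p", "k", "g" through the validator and then raises on "=", while the tolerant variant hands the whole string over as one requirement.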
Solutions
So one solution for this ticket might be to use zetuptoolz for the `easy_install`, like step 5b. I'm experimenting with this to see if it would work.

A second would be to fix zetuptoolz to look more like modern setuptools.
But I suspect this test is no longer exercising the kind of functionality we really care about. Nobody installing a tahoe .egg is going to use 5b: they'll use `easy_install tahoe.egg`. And they won't include tahoe's zetuptoolz in PYTHONPATH when they run tahoe, so they'll get their pkg_resources from the system setuptools. To be relevant, our test should match real-world usage.

So a third fix is to remove the non-realistic PYTHONPATH= from step 7. The `easy-install.pth` in egginstalldir means we could probably use:

7b. PYTHONPATH=~/tmp/egginst python tahoe debug trial
(This reflects reality, since eggs are usually installed into something on the built-in PYTHONPATH, like /usr/local/lib).
Doing that would make the test dependent upon a system-installed setuptools, but it was already depending on that for easy_install.
I'm going to try this third option and see if it works.
Hm, one potential problem with that third option: I think the Twisted egg comes from the source directory (is it used as a `setup_requires=`? or was that a workaround for some dependency's packaging bug?), and doesn't seem to be installed to `egginstdir` along with everything else. So without a PYTHONPATH that points back to the source dir, this might not work. Haven't tried it yet.

Indeed that failed, as it couldn't find the Twisted egg.
Our setup.py explains why it uses `setup_requires=` on Twisted (it involves bugs in Nevow packaging). Since removing our dependency on Nevow doesn't sound feasible for this release, I'm going to make the test slightly less realistic and include the Twisted .egg in the PYTHONPATH.

I think that bug has been fixed in nevow:
https://github.com/twisted/nevow/issues/7
Cool. I'm hesitant to change our setup.py too much right now, but after the release let's remove that stuff and see what happens.
I've experimented some more, and I don't know why easy_install isn't putting a Twisted egg into the egginstdir. It might have to do with the buildslave using distribute-0.6.10 (I'm using setuptools-12.1 here and it installs a twisted egg when Twisted isn't already installed in /usr/lib). It might be that running easy_install from the source directory, where there's a twisted.egg sitting, might trick it into thinking that twisted is generally available. It might also involve an apparent bug in buildbot, since I see the `InstallToEgg` buildstep is using:

but easy_install sees `PYTHONPATH=egginstalldir:` (probably the result of a lazy join.. `egginstalldir+":"+os.environ.get("PYTHONPATH","")`). The extra colon (or, rather, the empty component once you `pp.split(":")`) causes the current directory to be added to the PYTHONPATH. And maybe it's somehow seeing the twisted .egg there, so not bothering to install a new one into the install dir.

I'm going to try doing the install from a temporary directory, to see if that helps.
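
The suspected lazy join is easy to reproduce in isolation. The sketch below is an assumption about what such a join might look like, not the actual buildbot code: `naive_prepend` and `safe_prepend` are hypothetical names, and the environment is passed in as a plain dict so the example does not depend on the real `os.environ`. It shows how an unset PYTHONPATH leaves a trailing colon, and one way to avoid the empty component.

```python
import os


def naive_prepend(new_dir, environ):
    """Mirror the suspected pattern: always join with ':'.

    When PYTHONPATH is unset, the result ends with a trailing colon,
    which splits into an extra empty component.
    """
    return new_dir + ":" + environ.get("PYTHONPATH", "")


def safe_prepend(new_dir, environ):
    """Prepend new_dir, dropping any empty components.

    Uses os.pathsep so the separator matches the platform
    (':' on Unix, ';' on Windows).
    """
    existing = environ.get("PYTHONPATH", "")
    parts = [new_dir] + [p for p in existing.split(os.pathsep) if p]
    return os.pathsep.join(parts)
```

With an empty environment, `naive_prepend("egginstalldir", {})` gives `"egginstalldir:"`, whose split on ":" contains an empty string, whereas `safe_prepend` returns just `"egginstalldir"`.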
Hey, that fixed it. Ok, time to pull out all the temporary debug stuff and make sure it all works.
The tempdir-on-easy_install seems to have fixed it.. the orange buildslaves are turning green. Calling this one fixed.
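
For reference, the tempdir workaround amounts to running the install command with an empty working directory, so that nothing sitting in the source tree can be picked up through the current directory. A minimal sketch for Python 3.7+, with `run_in_tempdir` as a hypothetical helper rather than the actual buildstep change:

```python
import subprocess
import tempfile


def run_in_tempdir(argv, env=None):
    """Run argv with a fresh, empty temporary directory as the cwd.

    Returns the command's stdout as text. Raises
    subprocess.CalledProcessError if the command exits non-zero.
    The temporary directory is removed once the command finishes.
    """
    with tempfile.TemporaryDirectory() as workdir:
        result = subprocess.run(
            argv,
            cwd=workdir,          # nothing from the source tree is in the cwd
            env=env,              # None means inherit the parent's environment
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout
```

Any paths passed in `argv` (the install directory, the .egg file) would need to be absolute, since relative paths now resolve inside the temporary directory.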
Replying to zooko:
It has, and removing the workaround will work on Unix. It won't work on Windows, because on Windows we still require an unfixed version of Nevow (because the fixed version depends on Twisted >= 13 and that in turn depends on pywin32).