Avoid Popen() of executables that don't exist #109
No reviewers
Labels
No Label
Benchmarking and Performance
HTTP Storage Protocol
Nevow Removal
Python 3 Porting
not-for-merge
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: tahoe-lafs/tahoe-lafs#109
Loading…
Reference in New Issue
No description provided.
Delete Branch "2023-caution"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
The stdlib 'subprocess' module in python-2.7.4 through 2.7.7 suffers
from http://bugs.python.org/issue18851 which causes unrelated file
descriptors to be closed when
subprocess.call()
fails theexec()
,such as when the executable being invoked does not actually exist. There
appears to be some randomness involved. This was fixed in python-2.7.8.
Tahoe's iputil.py uses subprocess.call on many different "ifconfig"-type
executables, most of which don't exist on any given platform (added in
git commit
8e31d66cd0
). This results in a lot of file-descriptorclosing, which (at least during unit tests) tends to clobber important
things like Tub TCP sockets. This seems to be the root cause behind
ticket:2121, in which normal code tries to close already-closed sockets,
crashing the unit tests. Since different platforms have different
ifconfigs, some platforms will experience more failed execs than others,
so this bug could easily behave differently on linux vs freebsd, as well
as working normally on python-2.7.8 or 2.7.4.
This patch inserts a guard to make sure that os.path.isfile() is true
before allowing Popen.call() to try executing the target. This ought to
be enough to avoid the bug. It changes both iputil.py and
allmydata.init (which uses Popen for calling "lsb_release"), which
are all the places where 'subprocess' is used outside of unit tests.
Other potential fixes: use the 'subprocess32' module from PyPI (which is
a bug-free backport of the Python3 stdlib subprocess module, but would
introduce a new dependency), or require python >= 2.7.8 (but this would
rule out development/deployment on the current OS-X 10.9 release, which
ships with 2.7.5, as well as other distributions like Ubuntu 14.04 LTS).
I believe this closes ticket:2121, and given the apparent relationship
between 2121 and 2023, I think it also closes ticket:2023 (although
since 2023 doesn't have copies of the failing log files, it's hard to
tell). I'm hoping that this will tide us over until 1.11 is released, at
which point we can execute on the plan to remove iputil.py entirely by
changing the way that nodes learn their externally-facing IP address.
Coverage decreased (-0.01%) when pulling 1a07dfdedef4747784ad1798542a2f4c0b9c4be3 on warner:2023-caution into
6a38a3c54e
on tahoe-lafs:master.not sure why that failed (but I filed https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2298 about it. probably intermittent. I'll re-push a rebased form to trigger a new build.
Coverage decreased (-0.02%) when pulling 5e70aa818e5c06c5f50d1c1ba4d8f4d4f9ffdf34 on warner:2023-caution into
528364a421
on tahoe-lafs:master.That travis build hit the same intermittent failure. The actual iputil.py stuff passed. Don't be put off by the red X :)
ok, I think we fixed the intermittent failure (https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2290), so hopefully this next rebase will look less scary.
Coverage decreased (-0.0%) when pulling
d27a57cb49
on warner:2023-caution into38668c9e35
on tahoe-lafs:master.Reviewing. Coverage is not actually decreasing btw, that's due to https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2295 .
yeah, I figure "lsb_release" is probably specified to live in a pretty stable spot, so looking at either /bin or /usr/bin seemed sufficient.
thanks!