document in what ways Tahoe-LAFS builds are not currently verifiable #2357
A long-term goal, ticketed as #2057, is to enable end-users to verify that the package of Tahoe-LAFS that they are using was generated from the exact same source code that a security auditor examined.
In order to explain the verifiable build concept, consider this simple diagram:
source code ➾ binary package
Here we use “➾” to mean “build” — the process that produces usable packages out of source code.
Now consider a security auditor who does a source-code-based examination (as opposed to binary-based, which is called “reverse engineering”). This security auditor will start with the source code, and examine it for vulnerabilities or backdoors.
How can the user who receives a binary package know whether that package was built from the source that the auditor examined?
The “verifiable build” approach attempts to answer that question by having the security auditor perform the “source code ➾ binary package” step on their own trusted system and then take a fingerprint (secure hash) of the resulting binary package:
The auditor then publishes that fingerprint along with their report about their security audit. Users who receive the binary package can take a fingerprint of that package and compare it to the fingerprint in the published report.
This approach can work only if the ➾ operation performed by the distributor produces a binary that is bytewise identical to the one produced by the ➾ operation performed by the security auditor.
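The user-side check described above could look roughly like the following sketch. The ticket only says "secure hash", so the choice of SHA-256 is an assumption, and the function names `fingerprint` and `matches_published` are hypothetical helpers, not part of Tahoe-LAFS.

```python
import hashlib


def fingerprint(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of the file at `path`.

    SHA-256 is an assumption; the ticket does not name a hash function.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large packages do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, published_hex: str) -> bool:
    """Compare a package's fingerprint with the one from the audit report."""
    return fingerprint(path) == published_hex.strip().lower()
```

A single differing byte in the package changes the digest, which is why the comparison is only meaningful when the distributor's build and the auditor's build are bytewise identical.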
Here is a news article from LWN.net about the concept of verifiable builds (prompted in part by an open letter that we wrote): “Security software verifiability”. Here is a [post on the tahoe-dev mailing list](//pipermail/tahoe-dev/2013-August/008684.html) about our desire to have verifiable builds for Tahoe-LAFS.
The goal of this ticket is to have documentation of the ways in which Tahoe-LAFS builds are not currently verifiable. Its scope includes:
but does not include Tahoe-LAFS as packaged by an operating system distribution or package management system.
It may be useful to consider how existing projects have approached this problem: Debian, Tor, Bitcoin, and the recent ad-hoc [reproduction of the TrueCrypt Windows binaries](https://madiba.encs.concordia.ca/~x_decarn/truecrypt-binaries-analysis/).
OpenITP meeting 5 January 2014
note: nondeterminism that results in obvious build failures is ok
different build targets can have different fingerprints
what counts as a build target?
NONDET: operating system versions, patches, variants, and distribution, if these are counted as the same target
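One way to make the "what counts as a build target?" question concrete is to write down an explicit target descriptor and publish a fingerprint per descriptor. The sketch below is only an illustration: which fields belong in a target is exactly the open question in these notes, so the field selection is an assumption, and `build_target_descriptor` and `descriptor_id` are hypothetical names.

```python
import hashlib
import json
import platform


def build_target_descriptor() -> dict:
    """Collect properties of the current system that might define a build target.

    The choice of fields is an assumption made for illustration.
    """
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "os_version": platform.version(),
        "machine": platform.machine(),
        "python_implementation": platform.python_implementation(),
        "python_version": platform.python_version(),
    }


def descriptor_id(descriptor: dict) -> str:
    """Hash a canonical JSON form so equal descriptors always get the same id."""
    canonical = json.dumps(descriptor, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two systems that report the same descriptor would be expected to produce the same package fingerprint. Any field left out of the descriptor (for example a patch level) becomes a potential source of nondeterminism within a single target.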
quickstart build flow:
install Python if necessary
download the allmydata-tahoe-*.zip file (for a given build target)
unzip it
NONDET: unzip programs might vary in e.g. permissions of unzipped files
NONDET: file timestamps may depend on the clock of the build system
NONDET: order of files/subdirs in directories, if filesystem does not sort them
run setup.py build in a command prompt
NONDET: which Python version runs setup.py?
NONDET: other installed Python versions might affect the build?
NONDET: which setuptools/pkg_resources/virtualenv version?
NONDET: system or virtualenv?
NONDET: which other Python packages installed on system and in virtualenv?
NONDET: PYTHONPATH
it has some set of URLs where it looks for package distributions ("dists")
NONDET: using the net at all is hopeless wrt determinism
which dists it chooses can influence further choices of dist for other dependencies
try to build each dist
NONDET: order of builds? not sure what algorithm is used
dists are either pure Python or have C/C++ code
NONDET: buildchain for C/C++ code (includes many non-obvious dependencies)
NONDET: build process for C/C++ code
NONDET: distutils properties that affect compilation
NONDET: environment vars that affect compilation
NONDET: execution of Python code for building a dist (e.g. dict order)
NONDET: do any dependencies rely on entropy sources (e.g. os.urandom)?
NONDET: can operations like running tests affect the built copy of Tahoe?
sources of nondeterminism from builds of dependencies
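Three of the sources noted above (file timestamps, file permissions, and directory listing order) can be neutralized at the point where an archive is written. The following is a minimal sketch of that idea using only the Python standard library. It is not how Tahoe-LAFS packages are actually built; `deterministic_zip` is a hypothetical helper, and the fixed timestamp and mode are arbitrary choices.

```python
import io
import os
import zipfile

# Earliest timestamp the zip format can represent; any fixed value would do.
FIXED_TIME = (1980, 1, 1, 0, 0, 0)
FIXED_MODE = 0o644


def deterministic_zip(src_dir: str) -> bytes:
    """Pack the files under src_dir into zip bytes that depend only on
    relative paths and file contents, not on mtimes, modes, or listing order."""
    entries = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            path = os.path.join(root, name)
            arcname = os.path.relpath(path, src_dir).replace(os.sep, "/")
            entries.append((arcname, path))
    entries.sort()  # do not rely on the filesystem's listing order

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for arcname, path in entries:
            info = zipfile.ZipInfo(arcname, date_time=FIXED_TIME)
            info.external_attr = FIXED_MODE << 16  # same permissions everywhere
            info.create_system = 3  # record "Unix" regardless of the host OS
            # Store uncompressed so the output does not depend on the zlib version.
            info.compress_type = zipfile.ZIP_STORED
            with open(path, "rb") as f:
                zf.writestr(info, f.read())
    return buf.getvalue()
```

This addresses only the archive layer. The remaining items in the notes (Python and setuptools versions, network downloads, C/C++ toolchains, entropy sources) need separate treatment, for example by pinning and recording the build environment.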
Fixed; the report is at https://github.com/LeastAuthority/openitp-good-packaging-proposal/blob/master/openitp-good-packaging-for-LAFS_sources-of-nondeterminism.rst.