The CI Docker image builders are hard to test and are happy to push broken images #3484
Reference: tahoe-lafs/trac-2024-07-25#3484
A lot of the current CI configuration uses Docker images as a basis for the testing environment. There is also CI configuration to build these Docker images. These images are pre-loaded with as much software as we can manage so that they bear most of the environment setup cost. Then individual CI jobs that are for running tests only have to get Tahoe-LAFS itself in place. This lets them run more quickly.
Docker images are built on CircleCI by a cron-driven nightly workflow, so images are typically only built from what's in master. This makes it hard to test any changes to how the images are built, since those changes don't happen on master. If you want to test them at all, you have to do it manually outside of CI (which is error-prone, since you cannot reproduce the exact CI environment) or you have to mess with the configuration to make the images build on your branch (and then un-mess it afterwards).
It would be a nicer developer experience if changes to the image building code were testable in essentially the same way changes to any other code are testable - push changes to a branch, let CI run against that branch, see if CI succeeds or not.
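One possible way to get there, sketched below purely as an illustration, is to have branch builds push to a branch-specific image tag so they never overwrite the images master relies on. The function name `image_tag`, its parameters, and the tag naming scheme are all hypothetical; nothing here reflects the existing CircleCI configuration.

```python
import re


def image_tag(distro, branch, default_branch="master"):
    """Pick a Docker image tag for a build (hypothetical naming scheme).

    Builds from the default branch keep the plain distro tag. Builds from
    any other branch get a branch-specific suffix so they cannot replace
    the image that regular test jobs pull.
    """
    if branch == default_branch:
        return distro
    # Docker tags allow only letters, digits, '_', '.' and '-',
    # so replace anything else (such as '/') with '-'.
    safe_branch = re.sub(r"[^A-Za-z0-9_.-]", "-", branch)
    # Docker tags are limited to 128 characters.
    return f"{distro}-{safe_branch}"[:128]
```

With a scheme like this, the test jobs on a branch would also need to pull the branch-specific tag, which is a separate piece of configuration not shown here.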
Apart from those issues, another issue is that the image-building CI jobs will push the images they build as long as that build succeeds. The build may succeed and still include an incompatible version of some dependency (e.g. because it was just released and the builders pull the latest version of many dependencies).
It would be nice if new images were only pushed if they worked at least as well as the image they were replacing. This would let normal Tahoe-LAFS development continue undisturbed when a dependency publishes an incompatible release. When a developer has a chance to look at the issue, they can then address the problem. Once the problem is resolved in Tahoe-LAFS the image builder would be unblocked to push new images.
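The "at least as well as the image they were replacing" rule could be expressed as a comparison of test failures between the two images. The following is a minimal sketch under that assumption; `should_push` and its inputs are hypothetical, and how the failure sets would be collected from CI is not addressed.

```python
def should_push(baseline_failures, candidate_failures):
    """Decide whether a newly built image may replace the current one.

    Both arguments are collections of failing test names, gathered by
    running the same test suite against the current (baseline) image and
    the newly built (candidate) image. The candidate is acceptable only
    if it introduces no failure that the baseline does not already have.
    """
    return set(candidate_failures) <= set(baseline_failures)
```

For example, `should_push({"test_a"}, {"test_a"})` is true because the candidate is no worse, while `should_push(set(), {"test_b"})` is false because the candidate introduces a new failure.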
These two things may really be independent problems with independent solutions and if so then this ticket should be split in half (if not further). I describe both of the problems here because they seem very interconnected to me, partly due to the constraints placed on us by the capabilities of the CI systems we rely on.
Some pieces of the description are now out-of-date. The images are no longer built on a schedule. They are only built when a developer explicitly requests this using .circleci/rebuild-images.sh. The other parts are still relevant, though.