Run the test suite with some concurrency on CircleCI #3138

Open
opened 2019-07-01 14:57:18 +00:00 by exarkun · 3 comments

CircleCI provides runners with two vCPUs. If a single CircleCI job can split its work across two Python processes we can take better advantage of the runner configuration and hopefully get a much faster CI run.

trial supports spreading test execution over multiple local processes with `--jobs`. We can already pass extra arguments to trial through tox thanks to #2894.

Coverage collection has some problems when combined with `--jobs`. There is a Twisted ticket to deal with this, https://twistedmatrix.com/trac/ticket/9663, but there is also https://pypi.org/project/coverage_enable_subprocess/ which basically solves the problem externally.

So, add coverage_enable_subprocess and `--jobs 2` to our CircleCI configuration, which should get us a ~50% reduction in CircleCI runtime.
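For illustration, here is a rough sketch of what the combined invocation might look like when assembled from Python. The helper names, the `.coveragerc` path, and the assumption that the rcfile sets `parallel = True` are hypothetical and not taken from our configuration. The one part that comes from coverage.py itself is the `COVERAGE_PROCESS_START` environment variable, which has to point at the rcfile before child processes will start measuring.

```python
import os
import sys


def build_trial_command(package="allmydata", jobs=2):
    """Return an argv list that runs trial under coverage with --jobs.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    return [
        sys.executable, "-m", "coverage", "run",
        "-m", "twisted.trial",
        "--jobs", str(jobs),
        package,
    ]


def build_environment(rcfile=".coveragerc", base_env=None):
    """Return a copy of the environment with subprocess coverage enabled.

    Assumes coverage_enable_subprocess is installed in the environment and
    that the rcfile (an assumed path) sets `parallel = True`.
    """
    env = dict(os.environ if base_env is None else base_env)
    # coverage.py starts measuring in child processes only when this
    # variable names a configuration file.
    env["COVERAGE_PROCESS_START"] = os.path.abspath(rcfile)
    return env


if __name__ == "__main__":
    print(" ".join(build_trial_command()))
    print(build_environment(base_env={})["COVERAGE_PROCESS_START"])
```

With `parallel = True` each process writes its own data file, so a `coverage combine` step would be needed afterwards to merge them before reporting.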

exarkun added the dev-infrastructure, normal, defect, n/a labels 2019-07-01 14:57:18 +00:00
exarkun added this to the undecided milestone 2019-07-01 14:57:18 +00:00
exarkun self-assigned this 2019-07-01 14:57:18 +00:00
Author

Unfortunately the test suite hangs when run under coverage with coverage_enable_subprocess installed. Presumably this is fixable but the goal isn't achieved just by setting up the test environment as described above ...

Author

Also, before the hang, it doesn't look like it is on its way to being fast. It seems to take about 6x as long to run the test suite this way: ~18 minutes until it hung, seemingly pretty close to the end. For comparison, `coverage run -m twisted.trial --jobs 2 allmydata` takes 3 minutes and `coverage run -m twisted.trial allmydata` takes 6 minutes on master (which of course misses a lot of coverage measurement).

The result is arguably more correct but it can't be said that it's any faster...
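To make the arithmetic behind those figures explicit, here is a small sketch using only the approximate timings quoted above. The dictionary keys and function names are made up for illustration, and the 18 minute figure is a lower bound because that run hung before finishing.

```python
# Approximate wall-clock minutes quoted in this comment.
minutes = {
    "subprocess_coverage_jobs_2": 18,  # hung at roughly this point
    "plain_jobs_2": 3,
    "plain_serial": 6,
}


def slowdown(slow, fast):
    """How many times longer `slow` takes than `fast`."""
    return slow / fast


def reduction(before, after):
    """Fractional runtime saved when going from `before` to `after`."""
    return (before - after) / before


if __name__ == "__main__":
    print(slowdown(minutes["subprocess_coverage_jobs_2"], minutes["plain_jobs_2"]))
    print(slowdown(minutes["subprocess_coverage_jobs_2"], minutes["plain_serial"]))
    print(reduction(minutes["plain_serial"], minutes["plain_jobs_2"]))
```

So `--jobs 2` alone does give the hoped-for halving, but turning on subprocess measurement costs at least 6x relative to that run and at least 3x relative to the serial run.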

Author

Another option would be to apply this change only to the integration tests where it may be more useful to collect the additional information. As I understand it, the integration tests essentially only provide coverage via subprocesses so all of their value goes unmeasured. The "unit test" suite should, perhaps, be penalized for providing coverage in child processes since these are hardly "units" (but this is an opinionated stance that probably needs to be evaluated in the context of tahoe-lafs and the goals of the different parts of the automated test suite).

exarkun changed title from Run the test suite concurrency on CircleCI to Run the test suite with some concurrency on CircleCI 2019-07-25 12:40:55 +00:00
Reference: tahoe-lafs/trac-2024-07-25#3138