readiness status API for clients #2844

Open
opened 2016-11-14 19:36:29 +00:00 by dawuud · 4 comments
dawuud commented 2016-11-14 19:36:29 +00:00
Owner

We need an API that can be used to determine when a Tahoe client is connected to enough storage servers to do something useful and retrieve a file from the storage grid.

Similarly there should be a way to determine the readiness of storage nodes and introducers.

There are potentially many uses for such an API, for instance the integration tests should use it as should any Tahoe GUI we build.

tahoe-lafs added the unknown, normal, defect, 1.11.0 labels 2016-11-14 19:36:29 +00:00
tahoe-lafs added this to the undecided milestone 2016-11-14 19:36:29 +00:00
dawuud commented 2016-11-15 19:08:56 +00:00
Author
Owner

I don't know much about writing HTTP APIs, but I was thinking of writing a polling HTTP API that outputs JSON; it looks like I could copy a lot of code from the magic folder status HTTP API. Anyway, I wrote this is_ready method for clients, which determines whether at least k storage servers are connected:
https://github.com/david415/tahoe-lafs/commits/2844.add_readiness_api.0

Maybe this notion of readiness doesn't make sense for storage servers? Or would they be ready once they have announced themselves to at least one introducer, or to all introducers?

Also, what would it mean for an introducer to be ready? They are always ready, so perhaps this isn't meaningful for introducers.
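
To make the client-side check concrete, here is a minimal sketch of an "at least k storage servers connected" test. It is not the code on the linked branch: `ServerStatus` and the standalone `is_ready` signature are hypothetical names chosen for illustration, and `k` is assumed to be the number of shares needed to reconstruct a file.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ServerStatus:
    """Hypothetical record of one known storage server (illustration only)."""
    server_id: str
    connected: bool


def is_ready(servers: Iterable[ServerStatus], k: int) -> bool:
    """Return True when at least k of the known storage servers are connected."""
    connected = sum(1 for server in servers if server.connected)
    return connected >= k
```

With two of three servers connected, `is_ready(servers, 2)` is true and `is_ready(servers, 3)` is false. A stricter policy (for example, requiring enough servers for an upload rather than a download) would only change the threshold passed in.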

Owner

The following is purely for the integration-test use case (but I think it also applies to e.g. Docker images etc).

The "is ready" is mostly about "are all the ports open and listening now" (for all non-client servers). That is, when we `spawnProcess`, that returns fairly rapidly -- such that if we spawn an Introducer and (without waiting) spawn a storage server, it will fail to connect to the Introducer. There are lots of different ways of tackling this (e.g. the integration tests could keep polling the ports they care about on the Introducer). Another way is what I'm doing currently: waiting for the "introducer running" (or similar) to appear in the stdout logs.

Maybe there's really no better way than trolling stdout (in "some" defense this is similar to one way systemd works -- waiting for the launched process to send `READY=1`, although systemd receives that over a notification socket rather than stdout). I'm also kind of hoping there's "A Good Way" that people do this already...

In any case, it would be good to standardize on something (the "introducer running" is a bit fragile-feeling, and the string changes depending on what kind of tahoe you launched).

A JSON API would let us return more-interesting statuses -- and then e.g. for an integration-test use-case, the tests would have to rapidly poll the API (and keep re-trying on connection errors etc).
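
As a sketch of what that polling might look like from the test side, the loop below retries on connection errors and on undecodable responses until the status document reports readiness. Everything here is an assumption for illustration: the `"ready"` key, the retry counts, and the helper names `fetch_status` and `wait_until_ready` do not come from any existing Tahoe API.

```python
import json
import time
import urllib.request


def fetch_status(url, timeout=2.0):
    """Fetch and decode a JSON status document from a (hypothetical) readiness URL."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_until_ready(fetch, attempts=50, delay=0.1, sleep=time.sleep):
    """Call fetch() repeatedly until it returns a dict with a truthy "ready" key.

    Connection failures (OSError, which covers urllib's URLError) and bad JSON
    (ValueError) are treated as "not ready yet" and retried.
    """
    for _ in range(attempts):
        try:
            status = fetch()
        except (OSError, ValueError):
            status = None  # the node is probably not listening yet
        if isinstance(status, dict) and status.get("ready"):
            return status
        sleep(delay)
    raise TimeoutError("node did not become ready after %d attempts" % attempts)
```

A test fixture could then call something like `wait_until_ready(lambda: fetch_status(url))` right after spawning the process, where `url` points at whatever status path the node ends up exposing. Passing `fetch` and `sleep` in as parameters keeps the loop testable without a running node.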

Owner

We just discussed this in Ops meeting Nov 29, and concluded:

  • we just need a "twisted" API for e.g. "spawn_client"
  • this takes the requested configuration and returns a Deferred
  • the Deferred either errbacks, or succeeds when the client is "ready"
  • to determine "readiness" we add an HTTP API
  • under the hood, the "spawn_client()" talks via the above HTTP API (polling until it connects)
  • the API has a /v0/ready or similar path, which only returns (ends the request) when the client is ready (i.e. a "put" will "probably" work -- the client has connected to enough storage-servers)
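
The "only returns when the client is ready" behaviour of the last bullet can be sketched with a blocking handler. This is an illustration using Python's standard-library `http.server` and a `threading.Event`, not Twisted and not the implementation discussed in the meeting; `make_ready_server`, the JSON body, and the use of an event to signal readiness are all assumptions.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_ready_server(ready_event, port=0):
    """Build a server whose /v0/ready request completes only once ready_event is set."""

    class ReadyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/v0/ready":
                self.send_error(404)
                return
            # Hold the request open until the node signals readiness.
            ready_event.wait()
            body = json.dumps({"ready": True}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet; a real node would log requests

    # One thread per request, so a blocked /v0/ready call does not stall others.
    return ThreadingHTTPServer(("127.0.0.1", port), ReadyHandler)
```

In this sketch the node would call `ready_event.set()` once it has connected to enough storage servers, and a `spawn_client()`-style helper would keep retrying the connection until the port is open, then wait for the single request to complete. A production version would also want a server-side timeout so that abandoned requests do not pile up.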
dawuud commented 2016-11-30 00:59:40 +00:00
Author
Owner

OK, I got the HTTP API working, with unit tests:
https://github.com/david415/tahoe-lafs/tree/2844.add_readiness_api.1

I have not yet written the client side of the HTTP API; I'll do that next.

Reference: tahoe-lafs/trac-2024-07-25#2844