Extend external interfaces for operation monitoring. #83

Closed
opened 2007-07-10 18:19:49 +00:00 by nejucomo · 4 comments

I'd like to see the external interfaces (REST, XML-RPC, Foolscap) support operation monitoring, so that external clients can display operation progress and perhaps control the operations (cancel them, for instance).

The REST API could include an operation ID in the response headers for new operations. Separate URLs could be used to query progress or list current operations.

nejucomo added the code, minor, enhancement, and 0.4.0 labels 2007-07-10 18:19:49 +00:00
nejucomo added this to the eventually milestone 2007-07-10 18:19:49 +00:00

Currently you can see progress of e.g. an upload (tahoe put) by observing how much of the file you have been able to upload. The server (which ideally is running on your localhost anyway) will not accept the file faster than it can encrypt, encode, and distribute the shares.

Likewise, the progress of download is apparent by how much of the leading cleartext segments of the files have been delivered to you. :-)

What do you think?

Oh, cancellation is implemented by closing the HTTP connection before you've finished the upload or download.

I'm pretty pleased with this design so far...
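
The in-band approach described above can be sketched on the client side as a thin wrapper around the file being uploaded. This is only an illustration, not Tahoe code: `ProgressReader` and `demo` are hypothetical names, and the sketch assumes an HTTP client that pulls the request body from a file-like object as fast as the server accepts it.

```python
import io


class ProgressReader:
    """Wrap a binary file object and report how many bytes have been read.

    Hypothetical helper: an HTTP client that reads the request body from
    this object only asks for more data as the server accepts it, so the
    byte count doubles as an upload progress indicator.
    """

    def __init__(self, fileobj, total, on_progress):
        self._fileobj = fileobj
        self._total = total
        self._on_progress = on_progress
        self.sent = 0

    def read(self, size=-1):
        chunk = self._fileobj.read(size)
        self.sent += len(chunk)
        if chunk:
            # Report cumulative bytes handed to the client so far.
            self._on_progress(self.sent, self._total)
        return chunk


def demo():
    """Simulate a client pulling the body in 4096-byte reads."""
    payload = b"x" * 10000
    seen = []
    reader = ProgressReader(
        io.BytesIO(payload), len(payload), lambda sent, total: seen.append((sent, total))
    )
    while reader.read(4096):
        pass
    return seen
```

Python's `http.client`, for instance, accepts a file-like object as the request body, so a wrapper like this could be passed in place of the raw file. Cancelling is then just a matter of closing the connection before the body has been fully read.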

zooko added the 0.6.1 label and removed the 0.4.0 label 2007-10-20 21:05:46 +00:00
warner modified the milestone from eventually to undecided 2008-06-01 21:19:51 +00:00
davidsarah commented 2009-12-13 01:47:26 +00:00
Owner

Replying to zooko:

> Currently you can see progress of e.g. an upload (tahoe put) by observing how much of the file you have been able to upload. ...
> Likewise, the progress of download is apparent by how much of the leading cleartext segments of the files have been delivered to you. :-)
> Oh, cancellation is implemented by closing the HTTP connection before you've finished up/down load.

+1. The request in this ticket seems like unnecessary complexity. I suggest wontfix.

(Note that #92 is about showing upload progress/completion to the user in the WUI; that would still be useful.)


So I definitely would have preferred the simplicity of using in-band progress indicators and cancellation as described in comment:60816, but Brian persuaded me that this just wasn't good enough. The part of his argument that I remember being unable to counter was that we have some operations that take longer than an HTTP connection can reliably last. For example, suppose you want to do a deep-verify-and-repair, which is going to walk a large directory structure, download every bit of every share of every file and, if necessary, upload replacement shares. This could take days or weeks or months, and if your control of the process is a single HTTP connection then you're quite likely to suffer a network glitch which closes your TCP connection, or encounter some kind of stupid timeout in an HTTP proxy or something.

(The way I like to think of this is that the comms abstraction of TCP is insufficiently robust -- there isn't a widely understood and implemented way to force your HTTP transaction to outlive temporary disconnections of the underlying TCP connection. That means that HTTP, while a wonderful lingua franca for some protocols, can't be used for long-running operations or for operations which cannot be safely retried when the first try might or might not have failed to get through.)

So, Brian went ahead and invented "operation handles", documented here: source:docs/frontends/webapi.txt@4112#L203.
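
The reason a handle helps can be sketched as a polling loop that survives dropped connections. This is a rough illustration rather than the documented client procedure: `poll_operation` and `fetch_status` are hypothetical names, and the sketch assumes the status document is a JSON object with a boolean `finished` field. `fetch_status` stands in for whatever code issues a fresh HTTP request against the handle's URL and parses the reply.

```python
import time


def poll_operation(fetch_status, interval=5.0, max_polls=None, sleep=time.sleep):
    """Poll an operation handle until it reports completion.

    fetch_status: hypothetical zero-argument callable that performs one
        fresh HTTP request for the handle's status and returns a dict.
    The "finished" key is an assumption about the status format.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        try:
            status = fetch_status()
        except OSError:
            # A network glitch only costs us one poll: the operation
            # keeps running on the node, so we just try again later.
            sleep(interval)
            continue
        if status.get("finished"):
            return status
        sleep(interval)
    raise TimeoutError("operation did not finish within max_polls polls")
```

Each poll is a short, independent request, so no single TCP connection has to last as long as the operation itself.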

Hm, reading those docs again, I see this new text:

Many "slow" operations can begin to use unacceptable amounts of memory when
operating on large directory structures. The memory usage increases when the
ophandle is polled, as the results must be copied into a JSON string, sent
over the wire, then parsed by a client. So, as an alternative, many "slow"
operations have streaming equivalents. These equivalents do not use operation
handles. Instead, they emit line-oriented status results immediately. Client
code can cancel the operation by simply closing the HTTP connection.
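
A client for such a streaming equivalent might look roughly like the sketch below. It assumes each non-empty line of the response is a self-contained JSON object, which is an assumption about the wire format and not something the quoted text states; `iter_stream_results`, `consume`, and the `path` field in the usage example are hypothetical.

```python
import io
import json


def iter_stream_results(response):
    """Yield one parsed result per non-empty line of a binary response.

    Assumes (hypothetically) that every line is a complete JSON object,
    so only one result needs to be held in memory at a time.
    """
    for raw_line in response:
        line = raw_line.strip()
        if not line:
            continue
        yield json.loads(line)


def consume(response, stop_after=None):
    """Count results, optionally stopping early.

    Closing the response is how the client cancels the operation.
    """
    seen = 0
    try:
        for _unit in iter_stream_results(response):
            seen += 1
            if stop_after is not None and seen >= stop_after:
                break
    finally:
        response.close()
    return seen
```

For example, `consume(io.BytesIO(b'{"path": ["a"]}\n{"path": ["b"]}\n'), stop_after=1)` processes one result and then closes the stream. Against a real node the `response` would be the HTTP response object, and closing it would drop the connection.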

Oh dear, so it appears that neither the operation-handles nor the single HTTP connection is really good enough in all dimensions. Hm.

So what shall we do with this ticket? I guess we'll close it as "fixed", and then maybe open a new ticket saying "Make operation-handle-querying use only a little memory" and maybe open a new ticket saying "Invent robust HTTP so that streaming operations handles can be used on operations that last longer than a TCP connection lasts".

I'm not actually going to open either of those two tickets right now. I just took painkillers for my knee (recuperating from surgery).

If Brian, Nathan, or David-Sarah (or anyone) has any ideas on how to follow up on this, by all means post to the list or comment on this or some other ticket.

zooko added the fixed label 2009-12-13 03:59:45 +00:00
zooko closed this issue 2009-12-13 03:59:45 +00:00
davidsarah commented 2009-12-13 08:07:53 +00:00
Owner

Replying to zooko:

> So what shall we do with this ticket? I guess we'll close it as "fixed", and then maybe open a new ticket saying "Make operation-handle-querying use only a little memory"

This is #857.

> and maybe open a new ticket saying "Invent robust HTTP so that streaming operations handles can be used on operations that last longer than a TCP connection lasts".

Not a Tahoe bug :-)

> I'm not actually going to open either of those two tickets right now. I just took painkillers for my knee (recuperating from surgery).

Get well soon!

Reference: tahoe-lafs/trac-2024-07-25#83