command-line: do things in an incremental fashion and accept stdin as input #113

Open
opened 2007-08-17 22:09:21 +00:00 by zooko · 10 comments

The "put" command-line currently can't take stdin as its input, because it needs to find the file size (Content-Length) before it starts. Fix this! Details: maybe use chunked transfer encoding? Maybe twisted.web2 client already does this? See if tahoe_put-web2ish.py already does the right thing.

Alternately, maybe our web server could be trained to recognize everything between the header and the (half-)close of the connection as being body?
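
As a rough illustration of the chunked transfer encoding idea, here is a minimal Python 3 sketch of how a body of unknown length could be framed without a Content-Length. The helper name `encode_chunked` and the chunk size are assumptions made for this example; this is not code from Tahoe, twisted.web2, or tahoe_put-web2ish.py.

```python
import io


def encode_chunked(stream, chunk_size=4096):
    """Yield HTTP/1.1 chunked-transfer frames for a binary stream of unknown length.

    Hypothetical helper for illustration only.
    """
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        # Each chunk is: payload length in hex, CRLF, payload, CRLF.
        yield b"%x\r\n" % len(data) + data + b"\r\n"
    # A zero-length chunk marks the end of the body, so the sender
    # never needs to know the total size up front.
    yield b"0\r\n\r\n"


if __name__ == "__main__":
    # In a real client the stream would be sys.stdin.buffer.
    for frame in encode_chunked(io.BytesIO(b"hello world"), chunk_size=5):
        print(frame)
```

The request would carry `Transfer-Encoding: chunked` in place of Content-Length; whether the web server side accepts such a body is a separate question.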

zooko added the unknown, minor, enhancement, 0.4.0 labels 2007-08-17 22:09:21 +00:00
zooko added this to the 0.6.0 milestone 2007-08-17 22:09:21 +00:00
zooko self-assigned this 2007-08-17 22:09:21 +00:00
Author

This is part of the "improved command-line" task. I would like to see it done for v0.6.

warner added code-frontend and removed unknown labels 2007-08-20 18:55:18 +00:00
zooko modified the milestone from 0.6.0 to 0.7.0 2007-09-19 22:59:02 +00:00
Author

I'm interested in working on a few tickets which all have to do with improving the cmdline, for v0.6.2. This is one of them.

zooko added 0.6.1 and removed 0.4.0 labels 2007-10-19 23:15:15 +00:00
Author

We're focussing on an imminent v0.7.0 (see the roadmap: http://allmydata.org/trac/tahoe/roadmap) which hopefully has #197 (Small Distributed Mutable Files) and also a fix for #199 (bad SHA-256). So I'm bumping less urgent tickets to v0.7.1.

Author

We need to choose a manageable subset of desired improvements for v0.7.1, scheduled for two weeks hence, so I'm bumping this one into v0.7.2, scheduled for mid-December.

zooko added 0.7.0 and removed 0.6.1 labels 2007-11-13 18:29:32 +00:00
zooko added code-frontend-cli and removed code-frontend labels 2008-01-15 21:37:44 +00:00
zooko added this to the undecided milestone 2008-01-23 04:21:35 +00:00
davidsarah commented 2009-12-13 05:02:02 +00:00
Owner

Accepting a half-close as end of file would be quite error-prone.

davidsarah commented 2011-05-21 15:54:37 +00:00
Owner

Related to #320 (add streaming (on-line) upload to HTTP interface).

tahoe-lafs added major and removed minor labels 2011-08-26 23:53:09 +00:00
davidsarah commented 2011-09-02 03:22:36 +00:00
Owner

Note that `tahoe put` never uses streaming, even when its input is from a file rather than stdin. This results in memory usage proportional to the file size (which would be expected for SDMF files, but not for immutable or MDMF files).

Note that the increase in memory usage of the gateway process seems to be at least double the file size; for example, when uploading a 191 MiB MDMF file in 1.9alpha using `tahoe put --mutable --mutable-type=mdmf`, the peak RSS of the gateway (which was also a storage server) was about 510 MiB greater than when updating the same file using SFTP. I think that counts as a defect.
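
To make the contrast concrete, here is a minimal Python 3 sketch of what streaming means in this context: the input is consumed in fixed-size blocks, so peak memory is bounded by the block size rather than the file size. The names `iter_blocks` and `stream_digest` and the 128 KiB block size are assumptions for illustration; this is not Tahoe's upload code, and hashing merely stands in for whatever per-block processing an upload would do.

```python
import hashlib
import io


def iter_blocks(f, block_size=128 * 1024):
    """Yield successive blocks from a binary file object (hypothetical helper)."""
    while True:
        block = f.read(block_size)
        if not block:
            return
        yield block


def stream_digest(f, block_size=128 * 1024):
    """Return (total_bytes, sha256_hex), holding at most one block in memory."""
    h = hashlib.sha256()
    total = 0
    for block in iter_blocks(f, block_size):
        h.update(block)  # stand-in for per-block upload work
        total += len(block)
    return total, h.hexdigest()


if __name__ == "__main__":
    size, digest = stream_digest(io.BytesIO(b"x" * 1000), block_size=64)
    print(size, digest)
```

The same loop works whether the input is a regular file or a pipe, which is why the stdin case and the memory-usage case are related.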

tahoe-lafs added defect and removed enhancement labels 2011-09-02 03:22:36 +00:00
davidsarah commented 2011-09-02 03:26:55 +00:00
Owner

BTW, I'm much less concerned about whether `tahoe put` accepts input from stdin than about whether uploads are memory-efficient when the file size is known in advance. The latter case happens much more frequently (also for other commands like `tahoe cp`).

Author

Replying to davidsarah:

> Note that the increase in memory usage of the gateway process seems to be at least double the file size; for example, when uploading a 191 MiB MDMF file in 1.9alpha using `tahoe put --mutable --mutable-type=mdmf`, the peak RSS of the gateway (which was also a storage server) was about 510 MiB greater than when updating the same file using SFTP. I think that counts as a defect.

Agreed. And it should occupy its own new ticket.

davidsarah commented 2011-09-02 15:55:16 +00:00
Owner

Replying to zooko:

> Replying to davidsarah:
>
> > Note that the increase in memory usage of the gateway process seems to be at least double the file size... I think that counts as a defect.
>
> Agreed. And it should occupy its own new ticket.

Filed as #1523.

tahoe-lafs added enhancement and removed defect labels 2011-09-02 15:55:16 +00:00
Reference: tahoe-lafs/trac-2024-07-25#113