there isn't a doc that says "which operations are efficient" #757

Closed
opened 2009-07-14 03:46:57 +00:00 by zooko · 19 comments

Jonathan Ellis says that he was telling a co-worker: "There are a lot of docs on Tahoe, but there's nothing that says which operations are efficient."

Jonathan Ellis says that he was telling a co-worker: "There are a lot of docs on Tahoe, but there's nothing that says *which operations are efficient*."
zooko added the
documentation
major
enhancement
1.4.1
labels 2009-07-14 03:46:57 +00:00
zooko added this to the undecided milestone 2009-07-14 03:46:57 +00:00
Author

To close this ticket, add notes to the Performance page about the general efficiency of all of the major operations, and link to the Performance page from [the webapi.txt doc]source:docs/frontends/webapi.txt.

To close this ticket, add notes to [the Performance page](wiki/Performance) about the general efficiency of all of the major operations, and link to the Performance page from [the webapi.txt doc]source:docs/frontends/webapi.txt.

Also, include a copy of these notes in the source tree somewhere. Adding docs to the wiki is great and all, but I'd like people to be able to grab a source tree and know that they don't need to get anything else.

The biggest benefits (in my mind) of having docs on the wiki is that 1) everybody can edit them, add questions, etc, and 2) you can reference them from other wiki pages with nice short WikiNames. The benefit of having docs in the source tree are 1) long-term persistence, 2) no version skew between the tahoe version you're looking at and the docs, and 3) local availability despite allmydata.org and/or trac being down.

Maybe each file in docs/ could have a corresponding wiki page, and the file could have a note at the bottom saying "for the latest version of this document, please see URL".
And we could write a script that would fetch all the docs and copy them into the source tree.

Also, include a copy of these notes in the source tree somewhere. Adding docs to the wiki is great and all, but I'd like people to be able to grab a source tree and know that they don't need to get anything else. The biggest benefits (in my mind) of having docs on the wiki is that 1) everybody can edit them, add questions, etc, and 2) you can reference them from other wiki pages with nice short WikiNames. The benefit of having docs in the source tree are 1) long-term persistence, 2) no version skew between the tahoe version you're looking at and the docs, and 3) local availability despite allmydata.org and/or trac being down. Maybe each file in docs/ could have a corresponding wiki page, and the file could have a note at the bottom saying "for the latest version of this document, please see URL". And we could write a script that would fetch all the docs and copy them into the source tree.
Author

#878 (warn users about the performance issues of mutable files) was a subset of this ticket. By the way, Jonathan Ellis is an engineer at Rackspace and recently posted this: http://www.rackspacecloud.com/blog/2009/09/23/the-cassandra-project/

When we mentioned that comment 6 months ago I guess they were internally evaluating Tahoe-LAFS as a candidate technology for Rackspace's cloud offerings. Since I haven't heard that much from him since, I guess they are not using it. :-(

Anyway, documenting the approximate efficiency of the various Tahoe-LAFS operations should help people like that evaluate Tahoe-LAFS in the future.

#878 (warn users about the performance issues of mutable files) was a subset of this ticket. By the way, Jonathan Ellis is an engineer at Rackspace and recently posted this: <http://www.rackspacecloud.com/blog/2009/09/23/the-cassandra-project/> When we mentioned that comment 6 months ago I guess they were internally evaluating Tahoe-LAFS as a candidate technology for Rackspace's cloud offerings. Since I haven't heard that much from him since, I guess they are not using it. :-( Anyway, documenting the approximate efficiency of the various Tahoe-LAFS operations should help people like that evaluate Tahoe-LAFS in the future.
Author

Kevan: since you did #878 you might also be interested in this ticket. If so, accept it.

Kevan: since you did #878 you might also be interested in this ticket. If so, accept it.
kevan commented 2010-01-15 04:59:38 +00:00
Owner

(summarizing a discussion on IRC with zooko)

This file will live at docs/performance.txt. It will contain a list of operations performed by Tahoe-LAFS, and an approximation of the cost of each operation in whatever units are appropriate.

(summarizing a discussion on IRC with zooko) This file will live at docs/performance.txt. It will contain a list of operations performed by Tahoe-LAFS, and an approximation of the cost of each operation in whatever units are appropriate.
kevan commented 2010-01-29 04:41:36 +00:00
Owner

Okay, so what operations would be good to have here?

My list so far:

  • Publishing a mutable file.
  • Publishing an immutable file.
  • Downloading a mutable file.
  • Downloading an immutable file.
  • Replacing a mutable file
  • Modifying a mutable file (in the sense of, say, adding an entry to a directory)
  • Using tahoe backup
  • Verification
  • Checking/Repair
  • Listing a directory

The only code in that list that I'm intimately familiar with is the upload code, so if I'm separating operations that aren't very different into different categories, that's why (and please let me know if you think I am!).

What other operations might I think of including?

Okay, so what operations would be good to have here? My list so far: * Publishing a mutable file. * Publishing an immutable file. * Downloading a mutable file. * Downloading an immutable file. * Replacing a mutable file * Modifying a mutable file (in the sense of, say, adding an entry to a directory) * Using `tahoe backup` * Verification * Checking/Repair * Listing a directory The only code in that list that I'm intimately familiar with is the upload code, so if I'm separating operations that aren't very different into different categories, that's why (and please let me know if you think I am!). What other operations might I think of including?

I'd break out some operations that people might (or might not) expect to cost proportional to the size of the data they touch, especially when those operations actually cost proportional to the size of the overall file. So things like:

  • publishing A-byte immutable file
  • downloading B bytes (possibly all) of an A-byte immutable file
  • initial publish of an A-byte mutable file
  • modify B bytes of an existing A-byte mutable file
  • insert/delete B bytes in the middle of an existing A-byte mutable file
  • downloading B bytes (possibly all) of an A-byte mutable file
  • adding one directory entry to an existing A-entry directory
  • listing an A-entry directory

The only surprises for our current formats is that any mutable-file operation costs O(A) (i.e. proportional to filesize, even if you're only modifying or downloading a single byte), likewise for directories. Each directory entry tends to use about 300-330 bytes. And of course mutable-file creation requires a relatively slow (1 or 2 second) RSA keypair generation.

I'd break out some operations that people might (or might not) expect to cost proportional to the size of the data they touch, especially when those operations actually cost proportional to the size of the overall file. So things like: * publishing A-byte immutable file * downloading B bytes (possibly all) of an A-byte immutable file * initial publish of an A-byte mutable file * modify B bytes of an existing A-byte mutable file * insert/delete B bytes in the middle of an existing A-byte mutable file * downloading B bytes (possibly all) of an A-byte mutable file * adding one directory entry to an existing A-entry directory * listing an A-entry directory The only surprises for our current formats is that any mutable-file operation costs O(A) (i.e. proportional to filesize, even if you're only modifying or downloading a single byte), likewise for directories. Each directory entry tends to use about 300-330 bytes. And of course mutable-file creation requires a relatively slow (1 or 2 second) RSA keypair generation.
kevan commented 2010-01-30 22:54:55 +00:00
Owner

I'm attaching a first stab at this.

If I understand correctly, mutable files can be modified in two ways:

  • Overwriting their contents with those of a new file. This is what tools like the WUI and CLI do when updating a mutable file.
  • download-modify-upload, which is done internally to directories when entries are added or removed to/from them.

When I read "insert/delete B bytes in the middle of an existing A-byte mutable file", I thought of download-modify-upload (otherwise, this just seems like a special case of "modify B bytes of an existing A-byte mutable file"). I then wondered if it was appropriate to have an explanation of that in there. If performance.txt is intended for end-users and people who maintain grids, and if they wouldn't ever directly modify mutable files with download-modify-upload (since none of the end-user interfaces do that directly), then it seems like we'd do fine to just move the explanation for "insert/delete B bytes in the middle of an existing A-byte mutable file" into the explanation about modifying directories, since that is a special case of the same operation, and something that end users can do directly.

I also removed the existing blurb about mutable files, since it seemed redundant now that there are blurbs about mutable files elsewhere. If anyone misses it, let me know and I'll put it back.

I still want to add blurbs for the various verification operations, but I probably won't get around to that until tomorrow or Monday.

I'm attaching a first stab at this. If I understand correctly, mutable files can be modified in two ways: * Overwriting their contents with those of a new file. This is what tools like the WUI and CLI do when updating a mutable file. * download-modify-upload, which is done internally to directories when entries are added or removed to/from them. When I read "insert/delete B bytes in the middle of an existing A-byte mutable file", I thought of download-modify-upload (otherwise, this just seems like a special case of "modify B bytes of an existing A-byte mutable file"). I then wondered if it was appropriate to have an explanation of that in there. If performance.txt is intended for end-users and people who maintain grids, and if they wouldn't ever directly modify mutable files with download-modify-upload (since none of the end-user interfaces do that directly), then it seems like we'd do fine to just move the explanation for "insert/delete B bytes in the middle of an existing A-byte mutable file" into the explanation about modifying directories, since that is a special case of the same operation, and something that end users can do directly. I also removed the existing blurb about mutable files, since it seemed redundant now that there are blurbs about mutable files elsewhere. If anyone misses it, let me know and I'll put it back. I still want to add blurbs for the various verification operations, but I probably won't get around to that until tomorrow or Monday.
kevan commented 2010-02-02 01:10:16 +00:00
Owner

I'm attaching an updated patch with the costs of checking, verifying, and repairing files.

I collapsed computational cost and network cost into one measurement where I thought it was appropriate (which was in most places), though I wonder if that was the right thing to do. It is nicer to read if we have just one rough cost for each operation, and doing so achieves the goal of telling users about the surprises associated with some operations, but it is imprecise. Granted, this file is imprecise -- people who really wanted precise information would probably go read the code -- but I wonder if we're losing too much.

I'm attaching an updated patch with the costs of checking, verifying, and repairing files. I collapsed computational cost and network cost into one measurement where I thought it was appropriate (which was in most places), though I wonder if that was the right thing to do. It is nicer to read if we have just one rough cost for each operation, and doing so achieves the goal of telling users about the surprises associated with some operations, but it is imprecise. Granted, this file is imprecise -- people who really wanted precise information would probably go read the code -- but I wonder if we're losing too much.
kevan commented 2010-02-02 01:11:45 +00:00
Owner

Attachment performance.darcspatch.txt (53748 bytes) added

**Attachment** performance.darcspatch.txt (53748 bytes) added
kevan commented 2010-02-02 01:13:42 +00:00
Owner

If you're looking at this to review it:

  • Am I doing a good job of compressing notes and justifications into digestible chunks without losing a lot of information? (alt: are there any glaring inaccuracies? I looked over the code for the operations that are in there to try to make sure there wouldn't be, but I could have missed something)
  • Do you think there are any operations that should be in there and aren't?
If you're looking at this to review it: * Am I doing a good job of compressing notes and justifications into digestible chunks without losing a lot of information? (alt: are there any glaring inaccuracies? I looked over the code for the operations that are in there to try to make sure there wouldn't be, but I could have missed something) * Do you think there are any operations that should be in there and aren't?

When I read "insert/delete B bytes in the middle of an existing A-byte mutable file", I thought of download-modify-upload (otherwise, this just seems like a special case of "modify B bytes of an existing A-byte mutable file").

I mentioned that just because we have plans to make it fast: O(B) instead of O(A). As far as I know, no existing filesystem can do this right now, even local disk filesystems. This is the "LDMF" proposal. It's entirely reasonable to leave it out until we actually get around to building LDMF.

> When I read "insert/delete B bytes in the middle of an existing A-byte mutable file", I thought of download-modify-upload (otherwise, this just seems like a special case of "modify B bytes of an existing A-byte mutable file"). I mentioned that just because we have plans to make it fast: O(B) instead of O(A). As far as I know, no existing filesystem can do this right now, even local disk filesystems. This is the "LDMF" proposal. It's entirely reasonable to leave it out until we actually get around to building LDMF.
Author
  • Yes, I think we should break out RAM, CPU, bandwidth, and round-trips. Without the units, the asymptotic measurements aren't useful enough. O(A) CPU usage is typically not a problem. O(A) RAM usage is either no problem at all or a huge problem, depending on whether A is greater than your physical RAM. Bandwidth and round-trips are always a problem. :-)
* Yes, I think we should break out RAM, CPU, bandwidth, and round-trips. Without the units, the asymptotic measurements aren't useful enough. O(A) CPU usage is typically not a problem. O(A) RAM usage is either no problem at all or a huge problem, depending on whether A is greater than your physical RAM. Bandwidth and round-trips are always a problem. :-)
davidsarah commented 2010-02-02 04:14:17 +00:00
Owner

"cost: O(A) + a large constant for RSA + memory usage."

Since there's no way of using an algorithm other than RSA at the moment, this could just say "a large constant".

"Modifying B bytes of an A-byte mutable file" and "Adding/Removing B bytes in an A-byte mutable file" could probably be merged. The point about possibly being able to implement the latter efficiently even though local filesystems can't, could be made in the notes without requiring two sections.

"cost: O(A) + a large constant for RSA + memory usage." Since there's no way of using an algorithm other than RSA at the moment, this could just say "a large constant". "Modifying B bytes of an A-byte mutable file" and "Adding/Removing B bytes in an A-byte mutable file" could probably be merged. The point about possibly being able to implement the latter efficiently even though local filesystems can't, could be made in the notes without requiring two sections.
Author

There is no way currently in the cli, wui or the wapi to upload a file without using convergent encryption. (The confidentiality risk of convergent encryption is solved by adding in a separate "added convergence secret", not by skipping the step of hashing the cleartext to generate a symmetric key.) Therefore, all uploads of immutable files take two passes over the file. If you're uploading through the wui/wapi, this means your client (i.e. web browser, or a wapi-using client) first reads the entire file from disk while streaming it to the gateway, then the gateway writes it out to a temporary directory on disk while hashing it to generate the symmetric encryption key, then the gateway reads it again from the beginning from its temporary location on disk while encrypting it, erasure-coding it, and uploading the shares to the storage servers.

If you're uploading from within the tahoe node process itself (i.e. you've extended your tahoe node with your own code instead of using the wapi) then it will make two consecutive passes of reading the entire file from its original location on disk and then encrypt, erasure-code, and upload during the second pass.

#329 (add streaming (on-line) upload to HTTP interface) is about allowing one-pass "streaming" upload, so for example the web gateway would no longer write a temporary copy of the file to disk at all but would instead process it incrementally in (a small amount of) RAM. Shawn Willden contributed the first step of #320, which is code to use a random encryption key instead of to hash the file and generate a convergent encryption key. See: source:src/allmydata/immutable/upload.py@4164#L1099. However, as far as I can tell from the wapi code and the docs, there is no way to access this feature through the wapi. (I thought that Shawn contributed a patch to make this feature available. Maybe that patch never got accepted into trunk?)

There is no way currently in the cli, wui or the wapi to upload a file *without* using convergent encryption. (The confidentiality risk of convergent encryption is solved by adding in a separate "added convergence secret", not by skipping the step of hashing the cleartext to generate a symmetric key.) Therefore, all uploads of immutable files take two passes over the file. If you're uploading through the wui/wapi, this means your client (i.e. web browser, or a wapi-using client) first reads the entire file from disk while streaming it to the gateway, then the gateway writes it out to a temporary directory on disk while hashing it to generate the symmetric encryption key, then the gateway reads it again from the beginning from its temporary location on disk while encrypting it, erasure-coding it, and uploading the shares to the storage servers. If you're uploading from within the tahoe node process itself (i.e. you've extended your tahoe node with your own code instead of using the wapi) then it will make two consecutive passes of reading the entire file from its original location on disk and then encrypt, erasure-code, and upload during the second pass. #329 (add streaming (on-line) upload to HTTP interface) is about allowing one-pass "streaming" upload, so for example the web gateway would no longer write a temporary copy of the file to disk at all but would instead process it incrementally in (a small amount of) RAM. Shawn Willden contributed the first step of #320, which is code to use a random encryption key instead of to hash the file and generate a convergent encryption key. See: source:src/allmydata/immutable/upload.py@4164#L1099. However, as far as I can tell from the wapi code and the docs, there is no way to access this feature through the wapi. (I thought that Shawn contributed a patch to make this feature available. Maybe that patch never got accepted into trunk?)
davidsarah commented 2010-02-02 04:22:11 +00:00
Owner

Replying to zooko:

  • Yes, I think we should break out RAM, CPU, bandwidth, and round-trips. Without the units, the asymptotic measurements aren't useful enough. O(A) CPU usage is typically not a problem. O(A) RAM usage is either no problem at all or a huge problem, depending on whether A is greater than your physical RAM. Bandwidth and round-trips are always a problem. :-)

If we broke them out, they'd still all end up as O(A) at the moment, no? (even though it would be possible to use only O(1) memory). I think it would be sufficient just to say "O(A) memory, time, and total bandwidth". Round-trips are interesting, but I'm not sure we have time to work that out if we want to commit this for 1.6.

Replying to [zooko](/tahoe-lafs/trac-2024-07-25/issues/757#issuecomment-72026): > * Yes, I think we should break out RAM, CPU, bandwidth, and round-trips. Without the units, the asymptotic measurements aren't useful enough. O(A) CPU usage is typically not a problem. O(A) RAM usage is either no problem at all or a huge problem, depending on whether A is greater than your physical RAM. Bandwidth and round-trips are always a problem. :-) If we broke them out, they'd still all end up as O(A) at the moment, no? (even though it would be possible to use only O(1) memory). I think it would be sufficient just to say "O(A) memory, time, and total bandwidth". Round-trips are interesting, but I'm not sure we have time to work that out if we want to commit this for 1.6.
Author

I meant "#320" in all places in comment:72028, not "#329".

I meant "#320" in all places in [comment:72028](/tahoe-lafs/trac-2024-07-25/issues/757#issuecomment-72028), not "#329".
Author

Replying to [davidsarah]comment:19:

If we broke them out, they'd still all end up as O(A) at the moment, no? (even though it would be possible to use only O(1) memory). I think it would be sufficient just to say "O(A) memory, time, and total bandwidth". Round-trips are interesting, but I'm not sure we have time to work that out if we want to commit this for 1.6.

Well, all operations on immutable files take O(1) RAM (although as described in comment:72028 upload takes O(N) temporary disk space). Yeah, you may be right. I'm not sure!

Replying to [davidsarah]comment:19: > If we broke them out, they'd still all end up as O(A) at the moment, no? (even though it would be possible to use only O(1) memory). I think it would be sufficient just to say "O(A) memory, time, and total bandwidth". Round-trips are interesting, but I'm not sure we have time to work that out if we want to commit this for 1.6. Well, all operations on immutable files take O(1) RAM (although as described in [comment:72028](/tahoe-lafs/trac-2024-07-25/issues/757#issuecomment-72028) upload takes O(N) temporary disk space). Yeah, you may be right. I'm not sure!
tahoe-lafs added the
fixed
label 2010-02-02 05:34:48 +00:00
davidsarah closed this issue 2010-02-02 05:34:48 +00:00
Author

fixed by changeset:26c6b806d7922da1, changeset:7094f11a287a94f0, and changeset:be1dac0e563db575

fixed by changeset:26c6b806d7922da1, changeset:7094f11a287a94f0, and changeset:be1dac0e563db575
Sign in to join this conversation.
No Milestone
No Assignees
3 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Reference: tahoe-lafs/trac-2024-07-25#757
No description provided.