old helper vs new client: sharemap is messed up #609

Closed
opened 2009-02-07 21:50:17 +00:00 by warner · 6 comments

The UploadResults.sharemap field was changed recently, from being a mapping of shnum-to-peerid into a mapping of shnum-to-set-of-peerids. The allmydata.com helpers are still running the old code (which produces shnum-to-peerid mappings), so when they are used by a new client (which expects shnum-to-set mappings), the client will display the sharemap incorrectly.

Those new clients will display a 20-ish long list of 2-character "peerid" strings, whereas they should really be listing a single 8-character long peerid.

Since this is a version skew problem, I'm not sure how to best fix it. The new shnum-to-set mapping is arguably the more correct one, so I'd like to find a way to stick with the new format. This field isn't very important, it's just used for debugging, so we may just be able to ignore it.

The `UploadResults.sharemap` field was changed recently, from being a mapping of shnum-to-peerid into a mapping of shnum-to-set-of-peerids. The allmydata.com helpers are still running the old code (which produces shnum-to-peerid mappings), so when they are used by a new client (which expects shnum-to-set mappings), the client will display the sharemap incorrectly. Those new clients will display a 20-ish long list of 2-character "peerid" strings, whereas they should really be listing a single 8-character long peerid. Since this is a version skew problem, I'm not sure how to best fix it. The new shnum-to-set mapping is arguably the more correct one, so I'd like to find a way to stick with the new format. This field isn't very important, it's just used for debugging, so we may just be able to ignore it.
warner added the
code-encoding
minor
defect
1.2.0
labels 2009-02-07 21:50:17 +00:00
warner added this to the undecided milestone 2009-02-07 21:50:17 +00:00
Author

Hm, I should have made the comment in UploadResults more specific.. the entire body of UploadResults is an external interface, not just the typeToCopy string, so any changes that are made to it should be done in a backwards-compatible fashion.

Hm, I should have made the comment in [UploadResults](wiki/UploadResults) more specific.. the entire body of `UploadResults` is an external interface, not just the `typeToCopy` string, so any changes that are made to it should be done in a backwards-compatible fashion.
Author

Here's an example of what a new (trunk) client's upload/download results show when used against an old (1.2.0) helper:

File Upload Status

* Started: 13:09:20 07-Feb-2009
* Storage Index: (None)
* Helper?: Yes
* Total Size: 6089815
* Progress (Hash): 0.0%
* Progress (Ciphertext): 100.0%
* Progress (Encode+Push): 0.0%
* Status: Done

Upload Results

* Shares Pushed: 9
* Shares Already Present: 1
* Sharemap:
o 0 -> placed on [iy, n4, ou, ny, mq, ea, n4, ny, ea, lm, oa, gy, gy, gi, ne, na, my, gq, lu]
o 1 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, g4, oy, ne, gi, nq, mm, ny, ne, lu]
o 2 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, mu, om, gm, mu, ne, m4, me, mq, lu]
o 3 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, oe, ny, oy, my, oi, gq, gi, om, lu]
o 4 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, nm, ou, pi, pe, me, gy, pi, pa, lu]
o 5 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, gm, my, mi, mm, nu, pe, nq, oi, lu]
o 6 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, o4, nq, mm, pa, mm, gm, my, pe, lu]
o 7 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, g4, g4, ni, mi, nq, o4, pe, o4, lu]
o 8 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, oa, my, me, oy, my, nu, oy, gm, lu]
o 9 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, my, ni, om, me, om, nu, nq, nq, lu]

...

I'm not sure why it says "Storage Index: (None)", that may be an unrelated
bug.

Here's an example of what a new (trunk) client's upload/download results show when used against an old (1.2.0) helper: ``` File Upload Status * Started: 13:09:20 07-Feb-2009 * Storage Index: (None) * Helper?: Yes * Total Size: 6089815 * Progress (Hash): 0.0% * Progress (Ciphertext): 100.0% * Progress (Encode+Push): 0.0% * Status: Done Upload Results * Shares Pushed: 9 * Shares Already Present: 1 * Sharemap: o 0 -> placed on [iy, n4, ou, ny, mq, ea, n4, ny, ea, lm, oa, gy, gy, gi, ne, na, my, gq, lu] o 1 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, g4, oy, ne, gi, nq, mm, ny, ne, lu] o 2 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, mu, om, gm, mu, ne, m4, me, mq, lu] o 3 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, oe, ny, oy, my, oi, gq, gi, om, lu] o 4 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, nm, ou, pi, pe, me, gy, pi, pa, lu] o 5 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, gm, my, mi, mm, nu, pe, nq, oi, lu] o 6 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, o4, nq, mm, pa, mm, gm, my, pe, lu] o 7 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, g4, g4, ni, mi, nq, o4, pe, o4, lu] o 8 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, oa, my, me, oy, my, nu, oy, gm, lu] o 9 -> placed on [ka, nq, me, mm, mu, mq, ea, n4, ny, ea, lm, my, ni, om, me, om, nu, nq, nq, lu] ... ``` I'm not sure why it says "Storage Index: (None)", that may be an unrelated bug.
Author

I want to look at this briefly before 1.3.0 is tagged.. we might want to change the code to make it more compatible. I'll try to finish this monday morning.

I want to look at this briefly before 1.3.0 is tagged.. we might want to change the code to make it more compatible. I'll try to finish this monday morning.
warner modified the milestone from undecided to 1.3.0 2009-02-09 09:20:12 +00:00
warner self-assigned this 2009-02-09 09:20:12 +00:00
Author

The compatibility matrix to pay attention to is:

A: old helper, old client B: old helper, new client
C: new helper, old client D: new helper, new client

The current code causes B (old helper, new client) to display a mangled
sharemap (list of tiny strings), and C (new helper, old client) to show a
differently mangled sharemap
(set(['\x1b\x1a\xa2\xbaE\xd8\x07\xf2\xb6N\x1eC\x7f\xc5X\x8d?\x96\xa8\xde'])).

So, the way I see it, we've got three options:

  1. ignore it, just tolerate the mangled display
  2. revert the change on the server (helper) side, and go back to sending a
    single string (well, a dict mapping share number to a single string).
    This is the only option which will fix the C (new helper, old client) case
  3. add a version identifier to the UploadResults structure, change the
    client to look for the version number, treat a missing version number as
    "version 0", translate version 0 sharemap attributes from a single string
    to a one-element set of strings. This will fix B but leave C broken. An
    alternative implementation would be to modify the client code to accept
    either a single string or a set of strings.

Eh, alright, it's not that big a deal. I'll implement the alternate form of
3, by making the client code accept either a single string or a set of
strings.

The compatibility matrix to pay attention to is: | | | |---|---| |A: old helper, old client|B: old helper, new client| |C: new helper, old client|D: new helper, new client| The current code causes B (old helper, new client) to display a mangled sharemap (list of tiny strings), and C (new helper, old client) to show a differently mangled sharemap (`set(['\x1b\x1a\xa2\xbaE\xd8\x07\xf2\xb6N\x1eC\x7f\xc5X\x8d?\x96\xa8\xde'])`). So, the way I see it, we've got three options: 1. ignore it, just tolerate the mangled display 2. revert the change on the server (helper) side, and go back to sending a single string (well, a dict mapping share number to a single string). This is the only option which will fix the C (new helper, old client) case 3. add a version identifier to the `UploadResults` structure, change the client to look for the version number, treat a missing version number as "version 0", translate version 0 sharemap attributes from a single string to a one-element set of strings. This will fix B but leave C broken. An alternative implementation would be to modify the client code to accept either a single string or a set of strings. Eh, alright, it's not that big a deal. I'll implement the alternate form of 3, by making the client code accept either a single string or a set of strings.
Author

sigh, it's weirder than that.

The old helper, when the file was already found in the grid (i.e. if there
were N distinct shares), returns an upload_results.sharemap that has a
mapping from shnum to a human-readable string: "Found on shortpeerid1,shortpeerid2..". If the file needed to be uploaded, it
returns the same sharemap that upload produces.

The 1.2.0 uploader sharemap looks like:

  • for shares that were found already present on servers during the
    peer-selection phase, the values are a human-readable string "Found on shortpeerid" (indicating only the last-queried server)
  • for shares that were uploaded to a server (even if that share was
    previously found on some other server), the value is a human-readable
    string "Placed on shortpeerid".

The 1.2.0 web "upload status" page just dumps the sharemap values out to
html, rather than trying to interpret them in any way.

The servermap is less human-readable and more consistently machine-readable,
and the web status page for it does more formatting and interpretation.

The helper code was not changed significantly, but the upload code changed
the way it produces the sharemap: now the sharemap only contains data for
newly-uploaded shares (and does not contain information about pre-existing
shares), and the sharemap values are a set of binary peerids (instead of a
human-readable string). As a result, the helper-generated sharemap might have
values which are either human-readable Found on x,y,z (for files that
are already healthy in the grid), or a set of binary peerids.

So I think what needs to happen is that the helper must change to produce a
set of binary peerids, and the client-side uploader should look carefully at
the sharemap it gets back from the helper and simply delete it if it's in the
old (human-readable) format. We could conceivably try to parse the "Placed
on../Found on.." strings, but it may be easier to just pretend the old
sharemaps don't exist.

sigh, it's weirder than that. The old helper, when the file was already found in the grid (i.e. if there were N distinct shares), returns an upload_results.sharemap that has a mapping from shnum to a human-readable string: "```Found on shortpeerid1,shortpeerid2..```". If the file needed to be uploaded, it returns the same sharemap that upload produces. The 1.2.0 uploader sharemap looks like: * for shares that were found already present on servers during the peer-selection phase, the values are a human-readable string "```Found on shortpeerid```" (indicating only the last-queried server) * for shares that were uploaded to a server (even if that share was previously found on some other server), the value is a human-readable string "`Placed on shortpeerid`". The 1.2.0 web "upload status" page just dumps the sharemap values out to html, rather than trying to interpret them in any way. The servermap is less human-readable and more consistently machine-readable, and the web status page for it does more formatting and interpretation. The helper code was not changed significantly, but the upload code changed the way it produces the sharemap: now the sharemap only contains data for newly-uploaded shares (and does not contain information about pre-existing shares), and the sharemap values are a set of binary peerids (instead of a human-readable string). As a result, the helper-generated sharemap might have values which are either human-readable `Found on x,y,z` (for files that are already healthy in the grid), or a set of binary peerids. So I think what needs to happen is that the helper must change to produce a set of binary peerids, and the client-side uploader should look carefully at the sharemap it gets back from the helper and simply delete it if it's in the old (human-readable) format. We could conceivably try to parse the "Placed on../Found on.." strings, but it may be easier to just pretend the old sharemaps don't exist.
Author

changeset:a5ab6c060df282ea fixes this, by detecting old helper UploadResults (by virtue of having string values) and ignoring them altogether (setting .sharemap=None). The "C" old-client/new-helper direction is still mangled, but at least it doesn't cause a crash.

In the future, we need to be more careful about these compatibility boundaries.

changeset:a5ab6c060df282ea fixes this, by detecting old helper `UploadResults` (by virtue of having string values) and ignoring them altogether (setting .sharemap=None). The "C" old-client/new-helper direction is still mangled, but at least it doesn't cause a crash. In the future, we need to be more careful about these compatibility boundaries.
warner added the
fixed
label 2009-02-09 20:48:23 +00:00
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Reference: tahoe-lafs/trac-2024-07-25#609
No description provided.