checker for mutable files: make "check" button on wui work #205

Closed
opened 2007-11-10 22:00:45 +00:00 by zooko · 7 comments

Currently if you click the "check" button on a mutable file it raises an exception saying it doesn't know how to check that kind of file.

zooko added the
code
major
defect
0.6.1
labels 2007-11-10 22:00:45 +00:00
zooko added this to the 0.7.0 milestone 2007-11-10 22:00:45 +00:00
zooko added
0.7.0
and removed
0.6.1
labels 2007-11-13 18:17:10 +00:00
Author

I don't think we plan to do this before the end of December, so I'm putting it in Milestone 1.0.

Zooko and I came up with a refactoring of the mutable.Retrieve class that
would make it fairly easy to re-use the same code between Retrieve, Checker,
and Verifier.

The bulk of the code is in a new class, tentatively named
mutable.CiphertextRetrieval. This class is given a verifier capability,
a peer getter (i.e. client.get_permuted_peers), and a value of epsilon
(which might be 'infinity'). It might also get some indication of how much
data should be fetched: as little as possible, or full verification of all
outstanding shares. Likewise it might need to know if we want to get
ciphertext from all available versions, or just the most recent. This class
returns a MutableRetrievalResults instance.

This CiphertextRetrieval class then does the queries and readvs, and
populates the results instance with several things:

  • timing information (like what upload.UploadResults gets)
  • version_map: dict mapping (seqnum, R, encodingparams) to a set of
    (shnum, peerid, good/bad) tuples. This is most useful for the checker.
  • retrieveable_versions: a set of (seqnum, R, encodingparams) tuples,
    one for each version that appears to be retrievable
  • ciphertext_map: dict mapping (seqnum, R, encodingparams) to a (salt/IV,
    ciphertext_string) tuple.
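
As a rough illustration of the results instance described above, the four fields could be held in a small container like the one below. This is only a sketch: the field types, the `(k, N)` shape assumed for encodingparams, and the `add_share` helper are assumptions made for the example, not part of the proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

# Assumed shapes (not specified in the ticket):
#   version key  = (seqnum, R, encodingparams), with encodingparams = (k, N)
#   share record = (shnum, peerid, is_good)
VersionKey = Tuple[int, bytes, Tuple[int, int]]
ShareRecord = Tuple[int, bytes, bool]


@dataclass
class MutableRetrievalResults:
    """Hypothetical container for what CiphertextRetrieval would report."""
    timings: Dict[str, float] = field(default_factory=dict)
    version_map: Dict[VersionKey, Set[ShareRecord]] = field(default_factory=dict)
    retrieveable_versions: Set[VersionKey] = field(default_factory=set)
    ciphertext_map: Dict[VersionKey, Tuple[bytes, bytes]] = field(default_factory=dict)

    def add_share(self, version, shnum, peerid, is_good):
        """Record one share observation under its version key (hypothetical helper)."""
        self.version_map.setdefault(version, set()).add((shnum, peerid, is_good))


# Example: two shares of version 34 were seen, one of them bad.
results = MutableRetrievalResults()
v34 = (34, b"roothash", (3, 10))
results.add_share(v34, 0, b"peer-a", True)
results.add_share(v34, 1, b"peer-b", False)
results.retrieveable_versions.add(v34)
results.ciphertext_map[v34] = (b"salt", b"ciphertext")
```

Keying all three collections by the same version tuple means the checker and the retrieval path can be compared directly, without any translation between them.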

The MutableFileNode would then be responsible for doing two things with
the MutableRetrievalResults object:

  • extract the most recent version, by sorting ciphertext_map and
    taking the value associated with the last key
  • decrypt that with the readkey and the salt/IV.
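
The two steps above can be sketched as follows. The function names are hypothetical, and the decryption routine is passed in as a callable because the ticket does not specify one; `toy_decrypt` is a XOR stand-in used only to make the example runnable, not real cryptography.

```python
def newest_ciphertext(ciphertext_map):
    """Return (version_key, salt, ciphertext) for the highest-sorting key.

    Keys are (seqnum, R, encodingparams) tuples, so sorting orders them by
    seqnum first.
    """
    if not ciphertext_map:
        raise ValueError("no ciphertext was retrieved")
    newest = sorted(ciphertext_map)[-1]
    salt, ciphertext = ciphertext_map[newest]
    return newest, salt, ciphertext


def read_newest(ciphertext_map, readkey, decrypt):
    """Decrypt the newest version with a caller-supplied
    decrypt(readkey, salt, ciphertext) callable."""
    _version, salt, ciphertext = newest_ciphertext(ciphertext_map)
    return decrypt(readkey, salt, ciphertext)


def toy_decrypt(readkey, salt, ciphertext):
    """XOR stand-in for a real cipher (self-inverse, for illustration only)."""
    pad = readkey + salt
    return bytes(c ^ pad[i % len(pad)] for i, c in enumerate(ciphertext))


# Example: two versions are present; the one with seqnum 34 wins.
ciphertext_map = {
    (33, b"r33", (3, 10)): (b"s1", toy_decrypt(b"key", b"s1", b"old")),
    (34, b"r34", (3, 10)): (b"s2", toy_decrypt(b"key", b"s2", b"new")),
}
plaintext = read_newest(ciphertext_map, b"key", toy_decrypt)
```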

The checker would be more interested in the version_map: the most useful
checker results would be "version 34 is retrievable, and there are 9 out of
10 shares available".
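
A checker report of that kind could be derived from version_map roughly as shown here. This is a sketch under assumptions: `summarize` is a hypothetical name, encodingparams is assumed to be a `(k, N)` pair, and "available" is taken to mean distinct share numbers with at least one good copy.

```python
def summarize(version_map, retrieveable_versions):
    """Build one human-readable line per version found in version_map."""
    lines = []
    for version in sorted(version_map):
        seqnum, _root, (_k, n) = version  # assumes encodingparams == (k, N)
        # Count each share number once, and only if some copy of it is good.
        good = {shnum for (shnum, _peerid, ok) in version_map[version] if ok}
        state = "retrievable" if version in retrieveable_versions else "not retrievable"
        lines.append(
            "version %d is %s, and there are %d out of %d shares available"
            % (seqnum, state, len(good), n)
        )
    return lines


# Example: nine good shares of version 34, plus one corrupt copy of share 9.
v34 = (34, b"roothash", (3, 10))
shares = {(shnum, "peer-%d" % shnum, True) for shnum in range(9)}
shares.add((9, "peer-9", False))
report = summarize({v34: shares}, {v34})
```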

A more sophisticated app (some variant of MutableFileNode) could expose
multiple versions to the application, instead of only picking the most recent
one. Single-version apps might benefit from knowing that there are
validly-signed shares of a newer version out there, even if that version has
too few shares to be retrieved (i.e. version_map has a key which is greater
than the highest key of retrieveable_versions).
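
The comparison described in that parenthetical might look like the following sketch. The function name is hypothetical, and a version is only reported if at least one of its shares was marked good, which is an assumption about what "validly-signed" would mean in version_map.

```python
def newer_unretrievable_versions(version_map, retrieveable_versions):
    """Return version keys that sort above every retrievable version and
    have at least one good share."""
    def has_good_share(version):
        return any(ok for (_shnum, _peerid, ok) in version_map[version])

    candidates = [v for v in version_map if has_good_share(v)]
    if retrieveable_versions:
        best = max(retrieveable_versions)
        candidates = [v for v in candidates if v > best]
    return sorted(candidates)


# Example: version 35 exists on one server only, so it cannot be retrieved,
# but its presence is still worth reporting.
v34 = (34, b"r34", (3, 10))
v35 = (35, b"r35", (3, 10))
version_map = {
    v34: {(shnum, "peer-%d" % shnum, True) for shnum in range(10)},
    v35: {(0, "peer-0", True)},
}
newer = newer_unretrievable_versions(version_map, {v34})
```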

There are lots of intermediate points here. A really basic checker would just
care if any version is retrievable. A full verifier would want to download
all shares and check all signatures (signalling errors along the way). An
appropriate balance between bandwidth+CPU versus results will lie somewhere
in between.

warner changed title from checker for mutable files to checker for mutable files / refactor Retrieve to show multiple versions 2008-03-10 20:34:40 +00:00
warner added
code-mutable
and removed
code
labels 2008-04-24 23:27:03 +00:00
warner modified the milestone from 1.1.0 to 1.2.0 2008-05-29 22:27:38 +00:00

The mutable file code has been refactored to make multiple versions visible to the application layer. It has not been modified to separate ciphertext retrieval from plaintext retrieval.

However, there is new code in source:src/allmydata/mutable/checker.py which implements the Verifier in a different way. The regular download code is designed to do as little work as possible, while the verifier wants to download (and verify) as much data as possible. So the two goals are not particularly well aligned, making it more appropriate to use two separate classes.

The remaining pieces of this ticket are to make the 'check' button on the webapi work properly. This either means nailing down what the return value from filenode.check() ought to be, or incorporating the mutable checker results into the Checker service's results table.

warner self-assigned this 2008-07-14 22:35:53 +00:00
warner changed title from checker for mutable files / refactor Retrieve to show multiple versions to checker for mutable files: make "check" button on wui work 2008-07-14 22:35:53 +00:00

My current plan:

  • filenode.check() should return an object that understands
    ICheckerResults, and stores a small amount of data into the checker
    service (indexed by verify-cap)
  • the webapi has a POST t=check (and t=verify) operation which creates a
    filenode, invokes a check/verify, and uses results.to_html() to render the
    output.
    • This output will contain detailed information about the file: which
      shares were where, request latency information, maybe pretty pictures
      showing the distribution of servers and shares around the peer-selection
      ring. If the check triggered a repair, the results of the repair
      operation will be included on this page.
    • The webapi operation will accept a when_done= argument, in which case
      the detailed output is ignored.
    • It will also accept a 'back_to=' argument, which will add a link to the
      resulting page (to bring the user back to the parent directory).
  • we add ImmutableVerifierNode and MutableVerifierNode, which
    have check/verify/repair methods.
  • the client.create_node_from_uri() will be updated to create verifier nodes
    from verifier caps
  • the Checker service will lose its check()/verify() methods, and will become
    merely a repository for checker results
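
To make the planned flow concrete, here is a minimal synchronous sketch of a POST t=check handler. Everything except the `check()` and `to_html()` method names and the `when_done`/`back_to` arguments is hypothetical: the `Sketch*` classes, the `handle_check_post` function, its tuple return value, and the HTML layout are assumptions, and a real node would most likely return its results asynchronously.

```python
from html import escape


class SketchCheckerResults:
    """Hypothetical results object; only to_html() is named in the plan."""

    def __init__(self, healthy, share_locations):
        self.healthy = healthy
        self.share_locations = share_locations  # {shnum: [peerid, ...]}

    def to_html(self):
        rows = "".join(
            "<li>share %d: %s</li>" % (shnum, escape(", ".join(peers)))
            for shnum, peers in sorted(self.share_locations.items())
        )
        status = "healthy" if self.healthy else "unhealthy"
        return "<p>%s</p><ul>%s</ul>" % (status, rows)


class SketchFileNode:
    """Hypothetical node whose check() returns a prepared results object."""

    def __init__(self, results):
        self._results = results

    def check(self):
        return self._results


def handle_check_post(node, when_done=None, back_to=None):
    """Run the check, then either redirect or render the detailed page."""
    results = node.check()
    if when_done is not None:
        # The detailed output is ignored when when_done= is supplied.
        return ("redirect", when_done)
    page = results.to_html()
    if back_to is not None:
        page += '<a href="%s">Return to parent directory</a>' % escape(back_to)
    return ("html", page)


node = SketchFileNode(SketchCheckerResults(True, {0: ["peer-a"], 1: ["peer-b"]}))
kind, page = handle_check_post(node, back_to="/parent")
```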

We've now got ICheckerResults, the Checker service is gone, and we've removed results-storage altogether. The new scheme is to invoke the .check() method on a filenode or dirnode, instead of passing a URI to the Checker.

warner added this to the 1.3.0 milestone 2008-09-03 01:19:26 +00:00

Ok, I'm now happy with the way the checker is driven. The web pages look good.

warner added the
fixed
label 2008-09-18 05:19:22 +00:00
Reference: tahoe-lafs/trac-2024-07-25#205