add some questions to allmydata.interfaces
parent 0d31acf113
commit 61760047cf
@@ -115,17 +115,47 @@ class ICodecEncoder(Interface):
 """Encode some data. This may be called multiple times. Each call is
 independent.
 
-inshares is a sequence of length required_shares, containing buffers,
-where each buffer contains the next contiguous non-overlapping
-segment of the input data. Each buffer is required to be the same
-length, and the sum of the lengths of the buffers is required to be
-exactly the data_size promised by set_params(). (This implies that
-the data has to be padded before being passed to encode(), unless of
-course it already happens to be an even multiple of required_shares in
-length.)
-
-'desired_share_ids', if provided, is required to be a sequence of ints,
-each of which is required to be >= 0 and < max_shares.
+inshares is a sequence of length required_shares, containing buffers
+(i.e. strings), where each buffer contains the next contiguous
+non-overlapping segment of the input data. Each buffer is required to
+be the same length, and the sum of the lengths of the buffers is
+required to be exactly the data_size promised by set_params(). (This
+implies that the data has to be padded before being passed to
+encode(), unless of course it already happens to be an even multiple
+of required_shares in length.)
+
+QUESTION for zooko: that implies that 'data_size' must be an
+integral multiple of 'required_shares', right? Which means these
+restrictions should be documented in set_params() rather than (or in
+addition to) encode(), since that's where they must really be
+honored. This restriction feels like an abstraction leak, but maybe
+it is cleaner to enforce constraints on 'data_size' rather than
+quietly implement internal padding. I dunno.
+
+ALSO: the requirement to break up your data into 'required_shares'
+chunks before calling encode() feels a bit surprising, at least from
+the point of view of a user who doesn't know how FEC works. It feels
+like an implementation detail that has leaked outside the
+abstraction barrier. Can you imagine a use case in which the data to
+be encoded might already be available in pre-segmented chunks, such
+that it is faster or less work to make encode() take a list rather
+than splitting a single string?
+
+ALSO ALSO: I think 'inshares' is a misleading term, since encode()
+is supposed to *produce* shares, so what it *accepts* should be
+something other than shares. Other places in this interface use the
+word 'data' for that-which-is-not-shares.. maybe we should use that
+term?
+
+ALSO*3: given that we need to keep share0+shareid0 attached from
+encode() to the eventual decode(), would it be better to return and
+accept a zip() of these two lists? i.e. [(share0,shareid0),
+(share1,shareid1),...]
+
+'desired_share_ids', if provided, is required to be a sequence of
+ints, each of which is required to be >= 0 and < max_shares. If not
+provided, encode() will produce 'max_shares' shares, as if
+'desired_share_ids' were set to range(max_shares).
 
 For each call, encode() will return a Deferred that fires with two
 lists, one containing shares and the other containing the shareids.
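To make the constraints in this docstring concrete, here is a minimal sketch of what a caller would have to do before and after encode(). None of these helpers exist in allmydata.interfaces: the names pad_and_split, resolve_share_ids and pair_shares are hypothetical, and zero-byte padding is an assumption, since the docstring only says the data must be padded and not how. The sketch uses Python 3 bytes where the docstring says strings.

```python
def pad_and_split(data, required_shares, pad_byte=b"\x00"):
    """Hypothetical helper: build the 'inshares' list described above.

    Pads data (with zero bytes, an assumption) so its length is a
    multiple of required_shares, then cuts it into required_shares
    contiguous, non-overlapping, equal-length buffers.
    """
    if required_shares <= 0:
        raise ValueError("required_shares must be positive")
    remainder = len(data) % required_shares
    if remainder:
        data += pad_byte * (required_shares - remainder)
    chunk_len = len(data) // required_shares
    return [data[i * chunk_len:(i + 1) * chunk_len]
            for i in range(required_shares)]


def resolve_share_ids(max_shares, desired_share_ids=None):
    """Hypothetical helper: apply the documented default and range check."""
    if desired_share_ids is None:
        # Documented default: behave as if range(max_shares) was passed.
        return list(range(max_shares))
    ids = list(desired_share_ids)
    for share_id in ids:
        if not 0 <= share_id < max_shares:
            raise ValueError("share id %r out of range" % (share_id,))
    return ids


def pair_shares(shares, shareids):
    """Hypothetical helper: the zip() form floated in the ALSO*3 question."""
    if len(shares) != len(shareids):
        raise ValueError("shares and shareids must have the same length")
    return list(zip(shares, shareids))


if __name__ == "__main__":
    buffers = pad_and_split(b"abcdefgh", 3)
    # The padded total length is what set_params() would need as data_size.
    print(buffers, sum(len(b) for b in buffers))
    print(resolve_share_ids(4))
    print(pair_shares([b"s0", b"s1"], [0, 1]))
```

Note that the padded length, not the original length, is what would have to be promised as data_size, which is the point of the first QUESTION: the multiple-of-required_shares restriction is really a constraint on set_params().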