Uploading a 20MB mutable fails in HTTP, but succeeds in Foolscap #3956

Closed
opened 2022-12-14 16:53:25 +00:00 by itamarst · 7 comments

To reproduce: use benchmarks/upload_download.py from https://github.com/tahoe-lafs/tahoe-lafs/tree/3952-benchmarks, with a 20MB file. It'll run with Foolscap, but in HTTP:

>               result = await self.clients[0].create_mutable_file(MutableData(DATA))
E               allmydata.mutable.common.NotEnoughServersError: ("Publish ran out of good servers, last failure was: [Failure instance: Traceback: <class 'allmydata.storage.http_client.ClientException'>: (413, b'')\n/home/itamarst/devel/tahoe-lafs/venv/lib/python3.10/site-packages/twisted/internet/defer.py:734:errback\n/home/itamarst/devel/tahoe-lafs/venv/lib/python3.10/site-packages/twisted/internet/defer.py:797:_startRunCallbacks\n/home/itamarst/devel/tahoe-lafs/venv/lib/python3.10/site-packages/twisted/internet/defer.py:891:_runCallbacks\n/home/itamarst/devel/tahoe-lafs/venv/lib/python3.10/site-packages/twisted/internet/defer.py:1791:gotResult\n--- <exception caught here> ---\n/home/itamarst/devel/tahoe-lafs/venv/lib/python3.10/site-packages/twisted/internet/defer.py:1692:_inlineCallbacks\n/home/itamarst/devel/tahoe-lafs/venv/lib/python3.10/site-packages/twisted/python/failure.py:518:throwExceptionIntoGenerator\n/home/itamarst/devel/tahoe-lafs/src/allmydata/storage_client.py:1445:slot_testv_and_readv_and_writev\n/home/itamarst/devel/tahoe-lafs/venv/lib/python3.10/site-packages/twisted/internet/defer.py:1696:_inlineCallbacks\n/home/itamarst/devel/tahoe-lafs/src/allmydata/storage/http_client.py:868:read_test_write_chunks\n]", None)

That is, we're hitting a hard-coded size limit in the client code which presumably shouldn't be enforced given this works for Foolscap?

itamarst added the unknown, normal, defect, n/a labels 2022-12-14 16:53:25 +00:00
itamarst added this to the HTTP Storage Protocol milestone 2022-12-14 16:53:25 +00:00
Author

Mutable uploads get sent as CBOR, and we limit CBOR messages to 1MB because we don't have streaming parsing or streaming validation at the moment.

Solutions might be:

  1. Figure out streaming parsing + validation.
  2. Break up uploads into chunks in the HTTP client, which is... semantically dangerous possibly.
  3. Change HTTP protocol so it doesn't use CBOR for the actual data.
  4. ...
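To make the failure mode concrete, here is a minimal sketch of a size-limited body read that rejects anything over the limit with a 413. This is not the actual Tahoe-LAFS code: `read_limited`, `MessageTooLarge`, and the exact byte value of the limit are hypothetical names and assumptions for illustration only.

```python
from io import BytesIO

# Assumption: the thread only says "1MB"; the exact byte count is a guess.
MAX_CBOR_MESSAGE_BYTES = 1024 * 1024
REQUEST_ENTITY_TOO_LARGE = 413


class MessageTooLarge(Exception):
    """Hypothetical error for a body that exceeds the configured limit."""

    def __init__(self, limit):
        super().__init__(f"body exceeds {limit} bytes")
        self.code = REQUEST_ENTITY_TOO_LARGE


def read_limited(stream, limit=MAX_CBOR_MESSAGE_BYTES, chunk_size=64 * 1024):
    """Read all of `stream`, failing as soon as more than `limit` bytes arrive."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > limit:
            # Stop early rather than buffering the rest of an oversized body.
            raise MessageTooLarge(limit)
        chunks.append(chunk)
    return b"".join(chunks)
```

With a limit like this, a 20MB mutable upload sent as a single CBOR message can never fit, which matches the `(413, b'')` in the traceback.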
Author

cbor2 can parse from file, at least.

In theory one could do terrible things with mmap() such that the Rust CDDL library (which requires a byte slice) can validate a file.

But first... does the very flexible Foolscap mutable writing API actually use all that flexibility? If not, we might be able to simplify the HTTP protocol a lot.
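For reference, a rough sketch of the mmap() idea using only the Python standard library: the file is mapped read-only and handed to a callback as a bytes-like object, so the contents are not copied into a Python `bytes` up front. `with_mapped_file` is a hypothetical helper, and whether the CDDL validator's Python binding would accept such a buffer is an assumption that has not been checked here.

```python
import mmap


def with_mapped_file(path, callback):
    """Call callback(view) with a read-only memory map of the file at `path`.

    The map is closed when the callback returns, so the callback must not
    keep a reference to `view`. Note that mapping an empty file raises
    ValueError.
    """
    with open(path, "rb") as f:
        # Length 0 means "map the whole file".
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return callback(view)
```

Usage would look like `with_mapped_file(upload_path, validate)`, where `validate` is whatever function accepts a buffer.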

Author

The test vectors are "check strings", there's only one, and it looks like check strings are always short. But it's hard to be quite sure :( But plausibly saying "validation checks need to fit in an HTTP header" is actually a reasonable thing to try.

Digging deeper:

  • The only places that generate new calls to slot_testv_and_readv_and_writev (as opposed to proxying) are the two styles of mutable in mutable/layout.py.
  • Reads are only ever some sort of header info, never actual user data.
  • There is only ever one test.
  • Most of the time the checkstring is just composed of a sequence number, hash, and salt. So in this case it's small.
  • The other time you have a checkstring is for bad shares (line 508 in mutable/publish.py). Bad share checkstrings are set via mark_bad_share() in mutable/servermap.py. As far as I can tell these are always "prefixes" of length 75 bytes.

So: seems like moving all the test vectors and read vectors into an HTTP header would work just fine. And then the body can be just the new data, and does not need to be CBOR validated... assuming there is only ever one write in the write vector. So next I will look into that.
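A minimal sketch of what "vectors in a header" could look like, purely to check that a 75-byte checkstring fits comfortably. Everything here is an assumption for illustration: the header name, the size cap, the JSON-plus-base64 encoding, and the simplified vector shapes (`(offset, specimen)` for tests, `(offset, length)` for reads) are hypothetical, not the protocol's actual design.

```python
import base64
import json

# Hypothetical header name and size cap, for illustration only.
HEADER_NAME = "X-Example-Test-And-Read-Vectors"
MAX_HEADER_VALUE_BYTES = 4096


def encode_vectors_header(test_vector, read_vector):
    """Pack small test and read vectors into an ASCII-safe header value."""
    payload = {
        "test": [
            [offset, base64.b64encode(specimen).decode("ascii")]
            for offset, specimen in test_vector
        ],
        "read": [[offset, length] for offset, length in read_vector],
    }
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    value = base64.urlsafe_b64encode(raw).decode("ascii")
    if len(value) > MAX_HEADER_VALUE_BYTES:
        raise ValueError("vectors too large to send as a header")
    return value


def decode_vectors_header(value):
    """Inverse of encode_vectors_header."""
    payload = json.loads(base64.urlsafe_b64decode(value.encode("ascii")))
    test_vector = [
        (offset, base64.b64decode(specimen))
        for offset, specimen in payload["test"]
    ]
    read_vector = [(offset, length) for offset, length in payload["read"]]
    return test_vector, read_vector
```

With one 75-byte check string and a couple of header reads, the encoded value stays in the low hundreds of bytes, well under typical header limits.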

Author

MDMF has multiple entries in the write vector. So options are:

  1. Coalesce them into a single write (likely to be possible, but need to investigate).
  2. Continue to do body as CBOR, and fudge the schema validation.
  3. Use the actual relevant HTTP feature... which isn't that different than option 2, really.

Next I will look into coalescing.

Author

Coalescing is mostly OK, except! It's possible to do a write at an offset. So there will be holes, and when first creating a file a hole means null bytes (zeros), but later on it means "don't overwrite".
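To illustrate the problem, here is a small sketch that merges adjacent writes and reports the gaps that remain. It assumes a write vector is a list of `(offset, data)` pairs and that writes do not overlap; `coalesce_writes` is a hypothetical helper, not existing Tahoe-LAFS code.

```python
def coalesce_writes(write_vector):
    """Merge adjacent (offset, data) writes and report the holes left over.

    Returns (runs, holes): `runs` is a list of (offset, bytes) with adjacent
    writes joined, and `holes` is a list of (start, end) byte ranges that no
    write covers, counting from offset 0.
    """
    runs = []
    holes = []
    prev_end = 0
    for offset, data in sorted(write_vector, key=lambda w: w[0]):
        if not data:
            continue
        if offset < prev_end:
            raise ValueError("overlapping writes are not handled in this sketch")
        if offset == prev_end and runs:
            # Directly follows the previous write: extend that run.
            runs[-1][1].extend(data)
        else:
            if offset > prev_end:
                holes.append((prev_end, offset))
            runs.append([offset, bytearray(data)])
        prev_end = offset + len(data)
    return [(offset, bytes(data)) for offset, data in runs], holes
```

Any non-empty `holes` list means the writes cannot be sent as one contiguous body without deciding what the gap means, and that meaning differs between creating a share (zeros) and updating one (leave existing bytes alone).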

Author

Given all the above, the best bet is just:

  1. Increase the size limit of CBOR for the mutable upload case.
  2. Do our best to not use too much memory for CBOR parsing and CDDL validation.
GitHub <noreply@github.com> commented 2023-01-10 20:53:42 +00:00
Owner

In 7ef1c020/trunk:

Merge pull request #1244 from tahoe-lafs/3956-mutable-uploads

Fix mutable uploads over HTTP above a certain size

Fixes ticket:3956
tahoe-lafs added the fixed label 2023-01-10 20:53:42 +00:00
GitHub <noreply@github.com> closed this issue 2023-01-10 20:53:42 +00:00
Reference: tahoe-lafs/trac-2024-07-25#3956