replace all use of pickles with JSON #1290

Open
opened 2011-01-04 02:13:07 +00:00 by davidsarah · 3 comments
davidsarah commented 2011-01-04 02:13:07 +00:00
Owner

The pickle format is specific to Python. Loading pickles allows arbitrary code execution (by design) and has been subject to memory corruption bugs.

The security exposure in Tahoe-LAFS is in practice not too bad because we only use pickles as private state, and it could be argued that a storage server has security problems anyway if an attacker can write to the filesystem under its node directory. Still, the potential for memory corruption is not nice.

We currently read and write pickles:

  • in PickleStatsGatherer at source:src/allmydata/stats.py#L245
  • in ShareCrawler in source:src/allmydata/storage/crawler.py
  • in LeaseCheckingCrawler (subclass of ShareCrawler) in source:src/allmydata/storage/expirer.py
  • in source:misc/operations_helpers/cpu-watcher.tac

If all of these uses of pickles were simply replaced with JSON, the state of crawls in progress at the time of the upgrade would be lost. This seems acceptable to me; I don't see any need to support resuming an interrupted crawl from a pickle written by a previous version.

See also #1280 and #561.

The [pickle](http://docs.python.org/library/pickle.html) format is specific to Python. Loading pickles allows arbitrary code execution (by design) and has been subject to [memory corruption bugs](http://scarybeastsecurity.blogspot.com/2008/10/some-python-bugs.html). The security exposure in Tahoe-LAFS is in practice not too bad because we only use pickles as private state, and it could be argued that a storage server has security problems anyway if an attacker can write to the filesystem under its node directory. Still, the potential for memory corruption is not nice. We currently read and write pickles: * ~~in `PickleStatsGatherer` at source:src/allmydata/stats.py#L245~~ * in `ShareCrawler` in source:src/allmydata/storage/crawler.py * in `LeaseCheckingCrawler` (subclass of `ShareCrawler`) in source:src/allmydata/storage/expirer.py * in source:misc/operations_helpers/cpu-watcher.tac If all of these uses of pickles were simply replaced with JSON, the state of crawls in progress at the time of the upgrade would be lost. This seems acceptable to me; I don't see any need to support resuming an interrupted crawl from a pickle written by a previous version. See also #1280 and #561.
tahoe-lafs added the
code
major
defect
1.8.1
labels 2011-01-04 02:13:07 +00:00
tahoe-lafs added this to the undecided milestone 2011-01-04 02:13:07 +00:00
davidsarah commented 2011-01-04 02:29:14 +00:00
Author
Owner

Twisted's plugin system also uses pickles.

Twisted's plugin system [also uses pickles](http://twistedmatrix.com/trac/browser/trunk/twisted/plugin.py).
davidsarah commented 2011-01-04 02:32:06 +00:00
Author
Owner

Replying to davidsarah:

it could be argued that a storage server has security problems anyway if an attacker can write to the filesystem under its node directory.

Not least because they could change the .tac file, which also allows arbitrary code execution.

Replying to [davidsarah](/tahoe-lafs/trac-2024-07-25/issues/6352): > it could be argued that a storage server has security problems anyway if an attacker can write to the filesystem under its node directory. Not least because they could change the `.tac` file, which also allows arbitrary code execution.

PickleStatsGatherer is gone, replaced by JSONStatsGatherer, in c9047b1 (which is after tahoe-lafs-1.11.0, and should be in 1.12.0).

`PickleStatsGatherer` is gone, replaced by `JSONStatsGatherer`, in c9047b1 (which is after tahoe-lafs-1.11.0, and should be in 1.12.0).
Sign in to join this conversation.
No Milestone
No Assignees
2 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Reference: tahoe-lafs/trac-2024-07-25#1290
No description provided.