Compare commits

...

153 Commits

Author SHA1 Message Date
Daira Hopwood c45f66772b Adds test_stats to test_storage.py
This is used to test the new bucket_count reported by the server,
as well as the new stats total_leased_sharecount and
total_leased_used_space.

Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-05-05 19:45:17 +01:00
Mark Berger b9f1d00fad Removes BucketCounter 2014-05-05 19:29:43 +01:00
Mark Berger 50a617f7c6 Expands leasedb tests 2014-05-05 19:29:43 +01:00
Mark Berger 006a04976e Uses LeaseDB for share count instead of BucketCounter 2014-05-05 19:29:43 +01:00
Daira Hopwood cc916917f9 docs/configuration.rst: delete redundant description of backend = cloud.openstack.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-05-05 19:29:43 +01:00
Daira Hopwood a68bf91dd8 Add total size output to 'tahoe admin ls-container'.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-26 21:26:56 +01:00
Daira Hopwood 803fee48e3 Add Namespace class.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-26 21:25:28 +01:00
Daira Hopwood 1cddcb6bc3 Leasedb and cloud backend merge is scheduled for v1.12.0.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 05:34:08 +01:00
Daira Hopwood 05fb4039ab tests: rename *WithMockCloudBackend to *WithCloudBackendAndMockContainer.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:58:07 +01:00
Daira Hopwood 3df9bb7934 Add the 'tahoe admin ls-container' command and tests. fixes #1759
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:57:52 +01:00
Daira Hopwood 64d259c142 Add backend support for listing container contents. refs #1759
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:57:52 +01:00
Daira Hopwood 6c4f51056c Refactoring to move ContainerItem and ContainerListing.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood 32547bbd10 Move a utility function into allmydata.util.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood e0ffe8bfcd S3Container: fix an oversight in the parent constructor call.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood de8b40188b Fix #2206. Includes refactoring of ContainerRetryMixin into CommonContainerMixin, and rearrangement of the code to initialize HTTPClientMixin.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood ea3111d24a S3 container: treat an empty GET bucket response as an error.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood a09ffad7d2 Fix wrong arguments to CorruptStoredShareError constructor in immutable share classes.
(Bug pointed out by Mark_B.)

Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:57:51 +01:00
Mark Berger 78d8ed2e05 Fixes fd leak in leasedb (ticket #2015)
Brian correctly diagnosed this issue and suggested the fix in Weekly Dev
Chat of 2013-07-09. This patch has been manually tested by Zooko using his
fdleakfinder and a unit test has been added to ensure that database connections
are properly closed. Without this patch, the 1819-cloud-merge branch uses
more than 1020 fds and will start failing on operating systems with a low fd
limit.

The portions affecting leasedb.py were written by Zooko and Daira.
2014-04-09 01:57:51 +01:00
Daira Hopwood 5b31c24535 Accounting crawler: re-enable share deletion. refs #1987
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood a389ee7196 Cloud backend: log URLs for HTTP responses, as well as requests.
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood 47e487ce8a Cloud backend: suppress unhandled TaskStopped exceptions from FileBodyProducer.
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood f1662b8162 Cloud backend: change the ChunkCache's replacement policy to LRU. fixes #1885
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood 95182fe4ec test_storage.py: clean up arguments to FakeAccount methods.
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood c23e01c2c0 Accounting crawler: make share deletion conditional (defaulting to False for now). refs #1987, #1921
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood efd86e32a1 Azure: change 'container_name' config entry to 'container' for consistency with OpenStack.
Also fix a hidden bug in a test.

Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood b71637b2ce docs/backends/cloud.rst: cosmetics and caveat about support for 'create-container'. refs #1971
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood d7fa405462 Add tests for 'tahoe admin create-container'. refs #1971
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood 1e8bedb444 Refactor mock_cloud methods to simplify and make them more like CommonContainerMixin.
Also make sure that stripping of data arguments for logging is correct for all containers.

Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood dd809272a0 Move failure handling for create_container to make it more testable. refs #1971
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood 722b26c711 Fix incorrect calls to create and delete S3 buckets in S3Container. refs #1971
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood 0f840c4514 Add and document "tahoe admin create-container" command (rebased). refs #1971
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:57:51 +01:00
Daira Hopwood 5800e5e5b2 A missing basedir should cause an error if we try to read the config. refs #1971
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:44:25 +01:00
Daira Hopwood 11087d6dfa Refactoring to make node config accessible without actually creating a Node. refs #1971
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:44:25 +01:00
Daira Hopwood e786fcf1f7 Refactoring to make backend configuration accessible from outside Client. refs #1971
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:44:25 +01:00
Daira Hopwood 194935ba86 Help for admin commands: cosmetics and new tests.
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:44:25 +01:00
Itamar Turner-Trauring 92599d01b6 Actually run the creation code against real Azure service, and corresponding bug fix. 2014-04-09 01:44:25 +01:00
Itamar Turner-Trauring fb9065a99f Implement method to create containers in Microsoft Azure storage backend. 2014-04-09 01:44:25 +01:00
Itamar Turner-Trauring 0c9fab28d6 Add documentation for Microsoft Azure Blob Service cloud storage backend config. 2014-04-09 01:44:25 +01:00
Daira Hopwood 51cc8907ef Implement dumping of chunked shares and fix system tests. fixes #1959
Also, change 'assert' to '_assert' in debug.py.

Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:44:25 +01:00
Daira Hopwood b897936045 Cleanup: improve error reporting for DataTooLargeError
(and fix an off-by-one error in MutableDiskShare._write_share_data).

Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:44:25 +01:00
Daira Hopwood 181c91e747 Make the cloud backend report corrupted shares correctly. fixes #1566
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:44:25 +01:00
Daira Hopwood 634cce0773 Add CorruptStoredShareError as a superclass for all corrupt share errors raised by a storage server (rebased). refs #1566
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:44:25 +01:00
Daira Hopwood 64bf33bdb6 Improve skip messages for Azure tests.
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:44:25 +01:00
Daira Hopwood bfa799347e Use 'fail*' rather than 'assert*' methods for consistency with other tests.
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:44:25 +01:00
Daira Hopwood ea6d988b52 Test that share methods are called with the shareset locked. refs #1869
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:44:25 +01:00
Daira Hopwood 48f58ac8d2 Fix a pyflakes warning.
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:44:25 +01:00
Daira Hopwood 8ae9169e5b Lock remote operations on sharesets. fixes #1869
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:44:25 +01:00
Daira Hopwood 220de5d355 Cosmetic: fix trailing whitespace.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:44:11 +01:00
Itamar Turner-Trauring 6d46d15e38 Add Google and Azure backends to setup.py. 2014-04-09 01:33:56 +01:00
Itamar Turner-Trauring db68ab19d7 Some improvements and bug fixes.
1. Discard body even if response code indicates a problem, when doing cloud backend HTTP requests. I believe this was triggering a bug in Twisted.
2. Google backend retries on 403 and other 4xx codes, not just 401.
3. More logging.
2014-04-09 01:33:56 +01:00
Itamar Turner-Trauring 5d774342af Fix bug when using oauth2client 1.1 instead of 1.0 (returned HTTP header was unicode rather than the expected bytes). 2014-04-09 01:33:56 +01:00
Daira Hopwood e913dad7b9 Fix pyflakes warnings.
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:56 +01:00
Daira Hopwood 564f6c799a Cleanup to declare not_import_versionable and ignorable packages in _auto_deps.py
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:56 +01:00
Daira Hopwood 1b70eb448f Fix version check warnings for httplib2 and python-gflags (used by oauth2client).
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:56 +01:00
Itamar Turner-Trauring 3ca2abce8a Add oauth2client to requirements. 2014-04-09 01:33:56 +01:00
Itamar Turner-Trauring 98baf80223 Retry cloud HTTP requests on *any* exception (the list is long, and hard to make complete, so easier to just handle all exceptions). 2014-04-09 01:33:56 +01:00
Daira Hopwood 99b98c8535 Retry on timeouts, and increase number of persistent HTTP connections.
Author: Itamar Turner-Trauring <itamar@futurefoundries.com>
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:56 +01:00
Itamar Turner-Trauring e3fdc479c9 Fix PUTs. 2014-04-09 01:33:56 +01:00
Itamar Turner-Trauring b3bd6a1279 Fix prefix inclusion, so authentication works. 2014-04-09 01:33:55 +01:00
Daira Hopwood 3303b94ab3 Configuration for MS Azure.
Author: Itamar Turner-Trauring <itamar@futurefoundries.com>
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
Itamar Turner-Trauring d091b735cf First pass at implementing the Azure GET/PUT/DELETE. 2014-04-09 01:33:55 +01:00
Itamar Turner-Trauring 74b796d939 Address review comments from Daira.
1. Fix typo.
2. Rename config item googlestorage.bucket_name to googlestorage.bucket for
   consistency.
2014-04-09 01:33:55 +01:00
Itamar Turner-Trauring afdbce1569 Add documentation for Google Cloud Storage backend. 2014-04-09 01:33:55 +01:00
Daira Hopwood df3fc111b1 msazure_container.py: Implement authentication signature scheme.
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
Daira Hopwood f6dd94465c Fix pyflakes errors.
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
Daira Hopwood 18c5fda670 Unconditionally use HTTPConnectionPool, and depend on a Twisted that provides it.
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
Daira Hopwood d070ec0e9c googlestorage_container.py: Use Amazon S3 namespace, since Google insists on using it.
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
Itamar Turner-Trauring 88c5c01081 If oauth2client isn't present, skip Google Storage tests rather than blowing up. 2014-04-09 01:33:55 +01:00
Itamar Turner-Trauring 47110710ea Configuration support for Google Cloud Storage backend. 2014-04-09 01:33:55 +01:00
Itamar Turner-Trauring 7eec2f4fc9 googlestorage_container.py: Implement PUT and listing of bucket contents. 2014-04-09 01:33:55 +01:00
Itamar Turner-Trauring 96f3c65f14 googlestorage_container.py: Implement DELETE object. 2014-04-09 01:33:55 +01:00
Itamar Turner-Trauring 72612ea906 googlestorage_container.py: Implement GET object. 2014-04-09 01:33:55 +01:00
Itamar Turner-Trauring b82146a0cb Refactor useful functionality out of OpenStackContainer and into utility class. 2014-04-09 01:33:55 +01:00
Itamar Turner-Trauring f1ca398ca6 More tests for the Google Storage container, and fixes to the tests. 2014-04-09 01:33:55 +01:00
Itamar Turner-Trauring 24ed626678 Start of tests for the Google Storage container. 2014-04-09 01:33:55 +01:00
Itamar Turner-Trauring 818648cbf5 Tests for googlestorage_container.AuthenticationClient.
Author: Itamar Turner-Trauring <itamar@futurefoundries.com>
2014-04-09 01:33:55 +01:00
Itamar Turner-Trauring d42b232e6a Sketch of working Google Cloud Storage authentication, with some demo code. 2014-04-09 01:33:55 +01:00
David-Sarah Hopwood 5304c0b152 docs/backends/cloud.rst: clarify how to get to API Access in the Rackspace console.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood 2e38b3912a OpenStack: fix a type error introduced by the fix to #1921.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood 512cc28663 OpenStack: support HP Cloud Object Storage.
Also make the choice of auth protocol for Rackspace configurable via
openstack.provider, and change the reauth period to 11 hours.

Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood 9f6d12691e leasedb/accounting crawler: only treat stable shares as disappeared or unleased.
fixes #1921

Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood 76e7c5b97a Cloud backend: move potentially reusable HTTP request utilities to cloud_common.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood 81b396767e OpenStack: if we get a 401 Unauthorized response, reauthenticate immediately.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood 7fdb015e0c cloud_common.py: generalize ContainerRetryMixin to allow the container class to specify what to retry.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood e0f8942abd openstack_container.py: remove a superfluous argument to get_auth_info_locked.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood 46b00cfbd6 accounting_crawler.py: disable removing leasedb entries for disappeared shares.
This works around ticket #1921 for now.

Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood 96291392d3 openstack_container.py: avoid logging secrets in request headers.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood 70b8d5ac67 docs: add references to OpenStack/cloud backend in configuration.rst and running.rst.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood fa2cf092e3 OpenStack: generalize to support multiple auth protocols, and add V2 protocol.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood f311418382 OpenStack: add _http_request helper.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood 4e04008e75 openstack_container.py: factor out HTTP response code checking.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood ea8ec9c137 docs/backends/cloud.rst: add documentation for OpenStack config parameters.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood 956601dd16 openstack_container.py: improve name of _auth_lock; simplify by using DeferredLock.run.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood 27d4810349 OpenStack: change provider names to rackspace.com and rackspace.co.uk.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood 7db6c7f028 test_storage.py: add tests for OpenStackContainer.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:55 +01:00
David-Sarah Hopwood f53ef0baf1 openstack_container.py: disable or remove debug prints.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood 3a76e63690 openstack_container.py: fix a bug in type of ContainerListing.is_truncated.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood 4f1b51a26c Move classes common to mock and OpenStack cloud services, to cloud_common.py.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood 18a30d4d25 test_storage.py: refactor OpenStackCloudBackend to make it easier to add new tests.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood ed6ee84786 OpenStack: mostly complete implementation of OpenStackContainer.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood b9a9f9f30b OpenStack: improve logging in openstack_container.py.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood e85b97b253 OpenStack: add openstack.container config parameter.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood 5792a602a5 Add test for OpenStack authentication client.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood b20f10ee10 openstack_container.py: add shutdown() to avoid unclean reactor errors in tests.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood a4d66b49d0 openstack_container.py: add _ prefix to private attributes.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood 6a4f26456c setup.py: add allmydata.storage.backends.cloud.openstack module.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood ff8cd14fac test_client.py: add OpenStack config tests.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood 0da49ed0d7 test_client.py: cleanups to S3 config tests.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood 0e7e3bc51e OpenStack service: add AuthenticationClient.
Configure using properties relevant to OpenStack.

Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood 2dc48bc8c5 Add stub OpenStack container impl. as a copy of S3 container impl.
Generalize the container instantiation to work for either.

Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood 3c54924ecd Fix interface violations introduced in cloud merge.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
Daira Hopwood 9160181d83 Make backupdb use dbutil.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:33:54 +01:00
Daira Hopwood d29cfe15a5 Comment changes for ticket ref #1784
Signed-off-by: Daira Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood 598cd91f70 Makefile: have 'make tmpfstest' unmount and remove stale temp directories.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood 5438f4b35b Makefile: the timing for 'make tmpfstest' should exclude filesystem
mounting/unmounting and entering the password if needed.

Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
Daira Hopwood d6d759f590 Makefile: allow tmpfs size to be more easily overridden, and use 500 MiB by default (rebased).
(The kernel will only allocate space that is used; the limit is just in case
tests write more than expected.)

Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood cd67298d66 test_runner.py: add test for 'tahoe debug trial'.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
Daira Hopwood 9b18949c91 Fixes to tests. Some tests are applied to multiple backends.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood c896cfc2c1 Fixes to test infrastructure.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood 03e02eeece Miscellaneous corrections and additions.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood b2b91a6aaf Changes to crawler classes (ShareCrawler and AccountingCrawler).
Pass in a Clock to allow (in theory) deterministic testing, although this isn't used yet by tests.
Simplify the generic ShareCrawler code by not attempting to track state during processing
of a single prefix.

Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
Daira Hopwood a79d3d69fb Changes to fileutil.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood 97268cc95f Fix bugs in Accountant.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
David-Sarah Hopwood d0d17ff152 Simplify Account.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:54 +01:00
Daira Hopwood 230e57906d Changes to debug.py.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:33:53 +01:00
Daira Hopwood cdbc1bcf36 Changes to node classes (Node, Client and StorageServer).
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:33:53 +01:00
Daira Hopwood 26aa98b9f4 Changes to Bucket{Reader,Writer} and disk backend (rebased).
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:33:53 +01:00
David-Sarah Hopwood 5a5622ce1d Changes and additions to interface documentation.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 01:33:53 +01:00
Daira Hopwood 7202791c3f Add new files for cloud merge (rebased).
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 01:33:33 +01:00
David-Sarah Hopwood 8faca7bc72 Move BucketWriter and BucketReader to storage/bucket.py.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:47:02 +01:00
David-Sarah Hopwood 434f781432 Move code around and add new directories for cloud backend merge.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:47:02 +01:00
David-Sarah Hopwood 8c92b50a33 Add dependency on our fork of txAWS (0.2.1.post5).
Add 'six' to ignorable package list because it is a dependency of txAWS via python-dateutil.

Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:47:02 +01:00
Daira Hopwood 61727bf2ec .gitignore: changes to facilitate cloud backend merge (rebased).
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 00:46:57 +01:00
Daira Hopwood c24a0b8270 Add documentation for each storage backend (rebased).
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 00:43:30 +01:00
David-Sarah Hopwood 8834a34a7e Add test_leasedb.py.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:33:34 +01:00
David-Sarah Hopwood d99a093168 leasedb: use a semantic primary key (storage_index, shnum, account_id), rather than an integer, for the leases table.
Take advantage of this to simplify add_or_renew_leases.
Fix a bug in add_starter_lease (which is not used yet) when a starter lease already exists.
Clean up leftover accesses to self._dirty.

Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:33:34 +01:00
David-Sarah Hopwood 1f61319128 Rename 'buckets' to 'sharesets' on storage status page.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:33:34 +01:00
David-Sarah Hopwood 68cb1b4e74 Remove unused files storage/lease.py and storage/expirer.py.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:33:34 +01:00
Daira Hopwood 0162c8bf69 Main leasedb changes (rebased).
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2014-04-09 00:33:21 +01:00
David-Sarah Hopwood dbd6321f37 Remove the 'original-*' and 'configured-*' lease crawler output that won't be supported by the accounting crawler.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:26:05 +01:00
David-Sarah Hopwood 74d6f4a19b Cosmetics.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:26:05 +01:00
David-Sarah Hopwood 503a9dfa82 test_storage.py: ss -> server for cases that will remain a server after the server/account split.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:26:05 +01:00
David-Sarah Hopwood a17fe86d69 Asyncify crawlers. Note that this breaks tests for the LeaseCrawler
(which is going away, to be replaced by the AccountingCrawler).

Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:26:05 +01:00
David-Sarah Hopwood a67b54662e scripts/debug.py: remove display of lease information and secrets.
This version replaces the expiration field with '-' instead of '0', per Zooko's comments.

Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:26:04 +01:00
David-Sarah Hopwood 79e4766b22 Remove the [storage]expire.{mutable,immutable} options.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:26:04 +01:00
David-Sarah Hopwood 91c66e8867 Remove support for [storage]debug_discard option.
(BucketWriter.throw_out_all_data is kept because it might still be useful.)

Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:26:04 +01:00
David-Sarah Hopwood f05887e558 Add new files for leasedb.
Authors: Brian Warner <warner@lothar.com> and David-Sarah Hopwood
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:26:04 +01:00
David-Sarah Hopwood f3f8d3fd7b Changes to specification of add_lease and renew_lease in RIStorageServer.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:26:04 +01:00
David-Sarah Hopwood 5da509a7c2 docs/garbage-collection.rst: update text for leasedb changes.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:26:04 +01:00
David-Sarah Hopwood d4184cffc2 Use "PRAGMA synchronous = OFF" for dbutil.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:26:04 +01:00
David-Sarah Hopwood dfeb188d32 Add util/dbutil.py: open/create/update sqlite databases given some schema.
Author: Brian Warner <warner@lothar.com>
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:26:04 +01:00
David-Sarah Hopwood 814620206a util/fileutil.py: add get_used_space. This version does not use FilePath.
Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:26:04 +01:00
David-Sarah Hopwood 7d120c3c58 Some useful Deferred utilities. Addresses Zooko's review comments.
HookMixin and WaitForDelayedCallsMixin have been added for leasedb,
the rest are from the cloud backend branch, with minor improvements.

Signed-off-by: David-Sarah Hopwood <david-sarah@jacaranda.org>
2014-04-09 00:26:04 +01:00
94 changed files with 12739 additions and 6239 deletions

.gitignore (4 changes)

@@ -3,9 +3,12 @@
*~
*.DS_Store
.*.kate-swp
*.orig
*.rej
/build/
/support/
/_darcs/
# these are generated at build time, and never checked in
/src/allmydata/_version.py
@@ -33,3 +36,4 @@ zope.interface-*.egg
/coverage-html/
/miscaptures.txt
/violations.txt
/*.diff

View File

@@ -62,16 +62,21 @@ quicktest: make-version
$(TAHOE) debug trial $(TRIALARGS) $(TEST)
# "make tmpfstest" may be a faster way of running tests on Linux. It works best when you have
# at least 330 MiB of free physical memory (to run the whole test suite). Since it uses sudo
# to mount/unmount the tmpfs filesystem, it might prompt for your password.
# at least $(TMPFS_SIZE) of free physical memory (to run the whole test suite). Since it uses
# sudo to mount/unmount the tmpfs filesystem, it might prompt for your password.
TMPFS_SIZE = 500m
tmpfstest:
time make _tmpfstest 'TMPDIR=$(shell mktemp -d --tmpdir=.)'
make _tmpfstest 'TMPDIR=$(shell mktemp -d --tmpdir=.)'
_tmpfstest: make-version
sudo mount -t tmpfs -o size=400m tmpfs '$(TMPDIR)'
-$(TAHOE) debug trial --rterrors '--temp-directory=$(TMPDIR)/_trial_temp' $(TRIALARGS) $(TEST)
sudo mount -t tmpfs -o size=$(TMPFS_SIZE) tmpfs '$(TMPDIR)'
-time $(TAHOE) debug trial --rterrors '--temp-directory=$(TMPDIR)/_trial_temp' $(TRIALARGS) $(TEST)
sudo umount '$(TMPDIR)'
rmdir '$(TMPDIR)'
-sudo umount tmp.* 2>/dev/null
-rmdir --ignore-fail-on-non-empty tmp.* 2>/dev/null
# code-coverage: install the "coverage" package from PyPI, do "make
# quicktest-coverage" to do a unit test run with coverage-gathering enabled,

docs/backends/cloud.rst (new file, 234 lines)

@@ -0,0 +1,234 @@
================================
Storing Shares on Cloud Services
================================
The Tahoe-LAFS storage server can be configured to store its shares on a
cloud storage service, rather than on the local filesystem.
All cloud storage services store the data in a particular container (also
called a "bucket" in some storage services). You can create this container
using the "tahoe admin create-container" command, once you have a correctly
configured Tahoe-LAFS node as described below. That is, configure the node
with the container name you decided to use (e.g. "tahoedata"), then run the
command.
(Currently, "tahoe admin create-container" works only for the S3 and
Azure services. For Rackspace Cloud Files, HP Cloud Object Storage and
Google Cloud Storage, it is necessary to use the respective web interfaces
to create a container for the time being.)
Amazon Simple Storage Service (S3)
==================================
S3 is a commercial storage service provided by Amazon, described at
`<https://aws.amazon.com/s3/>`__.
To enable storing shares on S3, add the following keys to the server's
``tahoe.cfg`` file:
``[storage]``
``backend = cloud.s3``
This turns off the local filesystem backend and enables use of the cloud
backend with S3.
``s3.access_key_id = (string, required)``
This identifies your Amazon Web Services access key. The access key id is
not secret, but there is a secret key associated with it. The secret key
is stored in a separate file named ``private/s3secret``.
``s3.bucket = (string, required)``
This controls which bucket will be used to hold shares. The Tahoe-LAFS
storage server will only modify and access objects in the configured S3
bucket. Multiple storage servers cannot share the same bucket.
``s3.url = (URL string, optional)``
This URL tells the storage server how to access the S3 service. It
defaults to ``http://s3.amazonaws.com``, but by setting it to something
else, you may be able to use some other S3-like service if it is
sufficiently compatible.
The system time of the storage server must be correct to within 15 minutes
in order for S3 to accept the authentication provided with requests.
DevPay
------
Optionally, Amazon `DevPay`_ may be used to delegate billing for a service
based on Tahoe-LAFS and S3 to Amazon Payments.
If DevPay is to be used, the user token and product token (in base64 form)
must be stored in the files ``private/s3usertoken`` and ``private/s3producttoken``
respectively. DevPay-related request headers will be sent only if these files
are present when the server is started. It is currently assumed that only one
user and product token pair is needed by a given storage server.
.. _DevPay: http://docs.amazonwebservices.com/AmazonDevPay/latest/DevPayGettingStartedGuide/
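
For illustration only, the following rough Python sketch (standard library only) writes a minimal ``[storage]`` section for the S3 backend together with the separate secret-key file described above. The node directory, bucket name, and access key id are made-up placeholders, and a real ``tahoe.cfg`` contains other sections besides ``[storage]``::

    # Illustrative sketch only; not code from the Tahoe-LAFS tree.
    import os
    from configparser import ConfigParser

    node_dir = os.path.expanduser("~/.tahoe")        # hypothetical node basedir
    os.makedirs(os.path.join(node_dir, "private"), exist_ok=True)

    cfg = ConfigParser()
    cfg["storage"] = {
        "enabled": "true",
        "backend": "cloud.s3",                       # use the cloud backend with S3
        "s3.access_key_id": "AKIAEXAMPLEEXAMPLE",    # placeholder; the id is not secret
        "s3.bucket": "tahoedata",                    # the bucket/container name
    }
    with open(os.path.join(node_dir, "tahoe.cfg"), "w") as f:
        cfg.write(f)          # note: a real tahoe.cfg has other sections too

    # The secret key is deliberately kept out of tahoe.cfg, in private/s3secret.
    with open(os.path.join(node_dir, "private", "s3secret"), "w") as f:
        f.write("REPLACE-WITH-YOUR-AWS-SECRET-KEY\n")
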
OpenStack
=========
`OpenStack`_ is an open standard for cloud services, including cloud storage.
The cloud backend currently supports two OpenStack storage providers:
* Rackspace ( `<https://www.rackspace.com>`__ and `<https://www.rackspace.co.uk>`__ )
provides a service called `Cloud Files`_.
* HP ( `<https://www.hpcloud.com/>`__ ) provides a service called
`HP Cloud Object Storage`_.
Other OpenStack storage providers may be supported in future.
.. _OpenStack: https://www.openstack.org/
.. _Cloud Files: http://www.rackspace.com/cloud/files/
.. _HP Cloud Object Storage: https://www.hpcloud.com/products/object-storage
To enable storing shares on one of these services, add the following keys to
the server's ``tahoe.cfg`` file:
``[storage]``
``backend = cloud.openstack``
This turns off the local filesystem backend and enables use of the cloud
backend with OpenStack.
``openstack.provider = (string, optional, case-insensitive)``
The supported providers are ``rackspace.com``, ``rackspace.co.uk``,
``hpcloud.com west``, and ``hpcloud.com east``. For Rackspace, use the
site on which the Rackspace user account was created. For HP, "west"
and "east" refer to the two storage regions in the United States.
The default is ``rackspace.com``.
``openstack.container = (string, required)``
This controls which container will be used to hold shares. The Tahoe-LAFS
storage server will only modify and access objects in the configured
container. Multiple storage servers cannot share the same container.
``openstack.url = (URL string, optional)``
This overrides the URL used to access the authentication service. It
does not need to be set when using Rackspace or HP accounts, because the
correct service is chosen based on ``openstack.provider`` by default.
Authentication is less precisely specified than other parts of the OpenStack
standards, and so the two supported providers require slightly different user
credentials, described below.
*If using Rackspace:*
``openstack.username = (string, required)``
This identifies the Rackspace user account.
An API key for the account is also needed. It can be generated by
logging in at `<https://manage.rackspacecloud.com>`__ and selecting
"Your Account" followed by "API Access" in the left-hand menu, then
clicking the Show Key button.
The API key should be stored in a separate file named
``private/openstack_api_key``.
*If using HP:*
``openstack.access_key_id = (string, required)``
``openstack.tenant_id = (string, required)``
These are the Access Key ID and Tenant ID (not the tenant name) obtained
by logging in at `<https://console.hpcloud.com/account/api_keys>`__.
The secret key, obtained from the same page by clicking SHOW, should
be stored in a separate file named ``private/openstack_secret_key``.
Google Cloud Storage
====================
`Google Cloud Storage`_ is a block-based storage system provided by Google. To
access the storage system, you will need to create a project at the `Google
APIs Console`_, and then generate a Service Account client ID in the "API
Access" section. You will store the private key that will be downloaded by
your browser in your Tahoe configuration file; see below.
.. _Google Cloud Storage: https://cloud.google.com/products/cloud-storage
.. _Google APIs Console: https://code.google.com/apis/console/
To enable storing shares on one of these services, add the following keys to
the server's ``tahoe.cfg`` file:
``[storage]``
``backend = cloud.googlestorage``
This turns off the local filesystem backend and enables use of the cloud
backend with Google Storage.
``googlestorage.account_email = (string, required)``
This is the email on the Service Account you created,
e.g. ``123456@developer.gserviceaccount.com``.
``googlestorage.project_id = (string, required)``
This is the project number of the project you created,
e.g. ``123456``. You can find this number in the Google Cloud Storage
section of the APIs console (the number following `x-goog-project-id`).
``googlestorage.bucket = (string, required)``
This controls which bucket (a.k.a. container) will be used to hold
shares. The Tahoe-LAFS storage server will only modify and access objects
in the configured container. Multiple storage servers cannot share the
same container. Buckets can be created using a command-line tool (gsutil)
or a web UI; see the Google Cloud Storage section of the APIs console.
The private key you downloaded is stored in a separate file named
``private/googlestorage_private_key``.
Microsoft Azure Blob Storage
============================
`Microsoft Azure Blob Storage`_ is a block-based storage system provided by
Microsoft. To access the storage system, you will need to `create a storage
account`_. The DNS prefix you choose will be the account name, and either the
resulting primary or secondary keys can be used as the account key; you can
get them by using the "Manage Keys" button at the bottom of the storage
management page.
.. _Microsoft Azure Blob Storage: http://www.windowsazure.com/en-us/manage/services/storage/
.. _create a storage account: http://www.windowsazure.com/en-us/develop/python/how-to-guides/blob-service/#create-account
To enable storing shares on this service, add the following keys to the
server's ``tahoe.cfg`` file:
``[storage]``
``backend = cloud.msazure``
This turns off the local filesystem backend and enables use of the cloud
backend with Microsoft Azure.
``msazure.account_name = (string, required)``
This is the account name (subdomain) you chose when creating the account,
e.g. ``mydomain``.
``msazure.container = (string, required)``
This controls which container will be used to hold shares. The Tahoe-LAFS
storage server will only modify and access objects in the configured
container. Multiple storage servers cannot share the same container.
The account key obtained from the "Manage Keys" page should be stored in a
separate file named ``private/msazure_account_key``.

docs/backends/disk.rst (new file, 43 lines)

@@ -0,0 +1,43 @@
====================================
Storing Shares on a Local Filesystem
====================================
The "disk" backend stores shares on the local filesystem. Versions of
Tahoe-LAFS before v1.12.0 always stored shares in this way.
``[storage]``
``backend = disk``
This enables use of the disk backend, and is the default.
``readonly = (boolean, optional)``
If ``True``, the node will run a storage server but will not accept any
shares, making it effectively read-only. Use this for storage servers
that are being decommissioned: the ``storage/`` directory could be
mounted read-only, while shares are moved to other servers. Note that
this currently only affects immutable shares. Mutable shares will be
written and modified anyway. See ticket `#390
<http://tahoe-lafs.org/trac/tahoe-lafs/ticket/390>`__ for the current
status of this bug. The default value is ``False``.
``reserved_space = (quantity of space, optional)``
If provided, this value defines how much disk space is reserved: the
storage server will not accept any share that causes the amount of free
disk space to drop below this value. (The free space is measured by a
call to ``statvfs(2)`` on Unix, or ``GetDiskFreeSpaceEx`` on Windows, and
is the space available to the user account under which the storage server
runs.)
This string contains a number, with an optional case-insensitive scale
suffix, optionally followed by "B" or "iB". The supported scale suffixes
are "K", "M", "G", "T", "P" and "E", and a following "i" indicates to use
powers of 1024 rather than 1000. So "100MB", "100 M", "100000000B",
"100000000", and "100000kb" all mean the same thing. Likewise, "1MiB",
"1024KiB", "1024 Ki", and "1048576 B" all mean the same thing.
"``tahoe create-node``" generates a tahoe.cfg with
"``reserved_space=1G``", but you may wish to raise, lower, or remove the
reservation to suit your needs.
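
The ``reserved_space`` syntax above can be captured in a few lines. The following is a rough, self-contained sketch of that rule; it is not the parser the node actually uses, and for simplicity it accepts whole numbers only::

    # Rough sketch of the size syntax described above: an integer, an optional
    # case-insensitive scale suffix (K/M/G/T/P/E), and an optional "B" or "iB".
    import re

    _EXPONENTS = {"": 0, "K": 1, "M": 2, "G": 3, "T": 4, "P": 5, "E": 6}

    def parse_reserved_space(s):
        m = re.match(r"^\s*(\d+)\s*([KMGTPE]?)(I?)B?\s*$", s.upper())
        if m is None:
            raise ValueError("unparseable size %r" % (s,))
        digits, suffix, iec = m.groups()
        base = 1024 if iec else 1000    # a trailing "i" selects powers of 1024
        return int(digits) * base ** _EXPONENTS[suffix]

    assert parse_reserved_space("100MB") == parse_reserved_space("100 M") == 100 * 10**6
    assert parse_reserved_space("1MiB") == parse_reserved_space("1024 Ki") == 2**20
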

View File

@@ -4,7 +4,7 @@
Configuring a Tahoe-LAFS node
=============================
1. `Node Types`_
1. `Node Types`_
2. `Overall Node Configuration`_
3. `Client Configuration`_
4. `Storage Server Configuration`_
@@ -443,35 +443,30 @@ Storage Server Configuration
for clients who do not wish to provide storage service. The default value
is ``True``.
``readonly = (boolean, optional)``
``backend = (string, optional)``
If ``True``, the node will run a storage server but will not accept any
shares, making it effectively read-only. Use this for storage servers
that are being decommissioned: the ``storage/`` directory could be
mounted read-only, while shares are moved to other servers. Note that
this currently only affects immutable shares. Mutable shares (used for
directories) will be written and modified anyway. See ticket `#390`_ for
the current status of this bug. The default value is ``False``.
Storage servers can store the data into different "backends". Clients
need not be aware of which backend is used by a server. The default
value is ``disk``.
``reserved_space = (str, optional)``
``backend = disk``
If provided, this value defines how much disk space is reserved: the
storage server will not accept any share that causes the amount of free
disk space to drop below this value. (The free space is measured by a
call to ``statvfs(2)`` on Unix, or ``GetDiskFreeSpaceEx`` on Windows, and
is the space available to the user account under which the storage server
runs.)
The storage server stores shares on the local filesystem (in
BASEDIR/storage/shares/). For configuration details (including how to
reserve a minimum amount of free space), see `<backends/disk.rst>`__.
This string contains a number, with an optional case-insensitive scale
suffix, optionally followed by "B" or "iB". The supported scale suffixes
are "K", "M", "G", "T", "P" and "E", and a following "i" indicates to use
powers of 1024 rather than 1000. So "100MB", "100 M", "100000000B",
"100000000", and "100000kb" all mean the same thing. Likewise, "1MiB",
"1024KiB", "1024 Ki", and "1048576 B" all mean the same thing.
``backend = cloud.<service>``
"``tahoe create-node``" generates a tahoe.cfg with
"``reserved_space=1G``", but you may wish to raise, lower, or remove the
reservation to suit your needs.
The storage server stores all shares to the cloud service specified
by <service>. For supported services and configuration details, see
`<backends/cloud.rst>`__. For backward compatibility, ``backend = s3``
is equivalent to ``backend = cloud.s3``.
``backend = debug_discard``
The storage server stores all shares in /dev/null. This is actually used
for testing. It is not recommended for storage of data that you might
want to retrieve in the future.
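
The mapping from ``backend`` values to backend families is simple prefix dispatch, as the ``configure_backend`` code in the ``client.py`` diff further down also shows. Here is a condensed, standalone sketch of that rule, with the error type and message simplified::

    # Sketch of the prefix dispatch described above: "s3" is normalized to
    # "cloud.s3" for backward compatibility, and the part before the first "."
    # selects the backend family.
    def backend_family(backendtype):
        if backendtype == "s3":              # backward-compatibility alias
            backendtype = "cloud.s3"
        prefix = backendtype.partition(".")[0]
        if prefix not in ("disk", "cloud", "mock_cloud", "debug_discard"):
            raise ValueError("[storage]backend = %s is not supported" % backendtype)
        return prefix

    assert backend_family("s3") == "cloud"
    assert backend_family("disk") == "disk"
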
``expire.enabled =``
@@ -481,16 +476,9 @@ Storage Server Configuration
``expire.cutoff_date =``
``expire.immutable =``
``expire.mutable =``
These settings control garbage collection, in which the server will
delete shares that no longer have an up-to-date lease on them. Please see
garbage-collection.rst_ for full details.
.. _#390: https://tahoe-lafs.org/trac/tahoe-lafs/ticket/390
.. _garbage-collection.rst: garbage-collection.rst
`<garbage-collection.rst>`__ for full details.
Running A Helper

View File

@@ -138,6 +138,12 @@ same way as "``tahoe run``".
is most often used by developers who have just modified the code and want to
start using their changes.
Some less frequently used administration commands, for key generation/derivation
and for creating and listing the contents of cloud backend containers, are
grouped as subcommands of "``tahoe admin``". For a list of these, use
"``tahoe admin --help``", or for more detailed help on a particular command,
use "``tahoe admin COMMAND --help``".
Filesystem Manipulation
=======================

View File

@@ -203,39 +203,28 @@ The ``tahoe.cfg`` file uses the following keys to control lease expiration:
"expire.mode = cutoff-date"). It will be rejected if age-based expiration
is in use.
expire.immutable = (boolean, optional)
If this is False, then immutable shares will never be deleted, even if
their leases have expired. This can be used in special situations to
perform GC on mutable files but not immutable ones. The default is True.
expire.mutable = (boolean, optional)
If this is False, then mutable shares will never be deleted, even if
their leases have expired. This can be used in special situations to
perform GC on immutable files but not mutable ones. The default is True.
In previous versions, the ``expire.immutable`` and ``expire.mutable`` keys
could be used to selectively expire only mutable or only immutable shares.
As of Tahoe-LAFS v1.12.0, these are no longer supported and will cause an
error if set to ``False``.
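
To make the relationship between these keys concrete, here is a hypothetical sketch of a cutoff-date configuration being read and checked with Python's standard ``configparser``; the key names follow the text above, while the date value is only an assumed example::

    # Hypothetical sketch; key names follow the documentation above.
    from configparser import ConfigParser

    cfg = ConfigParser()
    cfg.read_string("""
    [storage]
    enabled = true
    expire.enabled = true
    expire.mode = cutoff-date
    expire.cutoff_date = 2014-01-01
    """)

    # As of v1.12.0, setting either of these keys to False is a config error:
    for key in ("expire.immutable", "expire.mutable"):
        if not cfg.getboolean("storage", key, fallback=True):
            raise ValueError("[storage]%s = False is no longer supported" % key)
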
Expiration Progress
===================
In the current release, leases are stored as metadata in each share file, and
no separate database is maintained. As a result, checking and expiring leases
on a large server may require multiple reads from each of several million
share files. This process can take a long time and be very disk-intensive, so
a "share crawler" is used. The crawler limits the amount of time looking at
shares to a reasonable percentage of the storage server's overall usage: by
default it uses no more than 10% CPU, and yields to other code after 100ms. A
typical server with 1.1M shares was observed to take 3.5 days to perform this
rate-limited crawl through the whole set of shares, with expiration disabled.
It is expected to take perhaps 4 or 5 days to do the crawl with expiration
turned on.
As of Tahoe-LAFS v1.12.0, leases are stored in a database that can be queried
and updated quickly, rather than in share files. However, an "accounting
crawler" is still needed to discover shares when upgrading from a previous
version, and to actually delete expired shares. The crawler limits the amount
of time looking at shares to a reasonable percentage of the storage server's
overall usage: by default it uses no more than 10% CPU, and yields to other
code after 100ms.
The crawler's status is displayed on the "Storage Server Status Page", a web
page dedicated to the storage server. This page resides at $NODEURL/storage,
and there is a link to it from the front "welcome" page. The "Lease
Expiration crawler" section of the status page shows the progress of the
current crawler cycle, expected completion time, amount of space recovered,
and details of how many shares have been examined.
and there is a link to it from the front "welcome" page. The "Accounting
Crawler" section of the status page shows the progress of the current crawler
cycle, expected completion time, amount of space recovered, and details of how
many shares have been examined.
The crawler's state is persistent: restarting the node will not cause it to
lose significant progress. The state file is located in two files
@@ -275,19 +264,12 @@ nevertheless be consuming extra disk space (and might be charged or otherwise
held accountable for it) until the ex-file's leases finally expire on their
own.
In the current release, these leases are each associated with a single "node
secret" (stored in $BASEDIR/private/secret), which is used to generate
renewal-secrets for each lease. Two nodes with different secrets
will produce separate leases, and will not be able to renew each
others' leases.
Once the Accounting project is in place, leases will be scoped by a
sub-delegatable "account id" instead of a node secret, so clients will be able
to manage multiple leases per file. In addition, servers will be able to
identify which shares are leased by which clients, so that clients can safely
reconcile their idea of which files/directories are active against the
server's list, and explicitly cancel leases on objects that aren't on the
active list.
Once more of the Accounting project has been implemented, leases will be
scoped by an "account id", and clients will be able to manage multiple
leases per file. In addition, servers will be able to identify which shares
are leased by which clients, so that clients can safely reconcile their
idea of which files/directories are active against the server's list, and
explicitly cancel leases on objects that aren't on the active list.
By reducing the size of the "lease scope", the coordination problem is made
easier. In general, mark-and-sweep is easier to implement (it requires mere

View File

@@ -5,7 +5,7 @@ Lease database design
=====================
The target audience for this document is developers who wish to understand
the new lease database (leasedb) planned to be added in Tahoe-LAFS v1.11.0.
the new lease database (leasedb) planned to be added in Tahoe-LAFS v1.12.0.
Introduction
@@ -113,9 +113,9 @@ The accounting crawler may perform the following functions (but see ticket
corrupted. This is handled in the same way as upgrading from a previous
version.
- Detect shares that have unexpectedly disappeared from storage. The
disappearance of a share is logged, and its entry and leases are removed
from the leasedb.
- Detect shares with stable entries in the leasedb that have unexpectedly
disappeared from storage. The disappearance of a share is logged, and its
entry and leases are removed from the leasedb.
Accounts
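
The commit "leasedb: use a semantic primary key (storage_index, shnum, account_id)" in the list above pins down the shape of the leases table. The sketch below illustrates that idea in plain sqlite; every column other than the three key columns is chosen purely for illustration::

    # Rough illustration of a leases table keyed by (storage_index, shnum,
    # account_id) rather than by an integer id; non-key columns are guesses.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE leases (
            storage_index   TEXT    NOT NULL,
            shnum           INTEGER NOT NULL,
            account_id      INTEGER NOT NULL,
            renewal_time    INTEGER,          -- illustrative lease metadata
            expiration_time INTEGER,
            PRIMARY KEY (storage_index, shnum, account_id)
        )
    """)

    # With a semantic key, "add or renew" collapses to a single statement,
    # because the key itself identifies the lease being renewed.
    conn.execute("INSERT OR REPLACE INTO leases VALUES (?, ?, ?, ?, ?)",
                 ("exampleexamplestorageindex", 0, 1, 1396990000, 1399582000))
    conn.commit()
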

View File

@@ -145,6 +145,14 @@ webapi.rst_.
.. _webapi.rst: frontends/webapi.rst
The Cloud Storage backend
-------------------------
By default, a Tahoe-LAFS storage server will store its shares on the
local filesystem. To store shares on a cloud storage service (for example
Amazon S3 or Rackspace Cloud Files) instead, see `<backends/cloud.rst>`__.
Socialize
=========

View File

@@ -440,6 +440,14 @@ setup(name=APPNAME,
'allmydata.mutable',
'allmydata.scripts',
'allmydata.storage',
'allmydata.storage.backends',
'allmydata.storage.backends.cloud',
'allmydata.storage.backends.cloud.s3',
'allmydata.storage.backends.cloud.openstack',
'allmydata.storage.backends.cloud.googlestorage',
'allmydata.storage.backends.cloud.msazure',
'allmydata.storage.backends.disk',
'allmydata.storage.backends.null',
'allmydata.test',
'allmydata.util',
'allmydata.web',

View File

@@ -280,10 +280,12 @@ def cross_check_pkg_resources_versus_import():
def cross_check(pkg_resources_vers_and_locs, imported_vers_and_locs_list):
"""This function returns a list of errors due to any failed cross-checks."""
from _auto_deps import not_import_versionable_packages, ignorable_packages
errors = []
not_pkg_resourceable = set(['python', 'platform', __appname__.lower()])
not_import_versionable = set(['zope.interface', 'mock', 'pyasn1'])
ignorable = set(['argparse', 'pyutil', 'zbase32', 'distribute', 'twisted-web', 'twisted-core', 'twisted-conch'])
not_import_versionable = set(not_import_versionable_packages)
ignorable = set(ignorable_packages)
for name, (imp_ver, imp_loc, imp_comment) in imported_vers_and_locs_list:
name = name.lower()

View File

@@ -24,12 +24,14 @@ install_requires = [
# the drop-upload frontend.
# * We also need Twisted 10.1 for the FTP frontend in order for Twisted's
# FTP server to support asynchronous close.
# * When the cloud backend lands, it will depend on Twisted 10.2.0 which
# includes the fix to https://twistedmatrix.com/trac/ticket/411
# * The cloud backend depends on Twisted 10.2.0 which includes the fix to
# https://twistedmatrix.com/trac/ticket/411
# * The SFTP frontend depends on Twisted 11.0.0 to fix the SSH server
# rekeying bug http://twistedmatrix.com/trac/ticket/4395
# * The cloud backend depends on Twisted 12.1.0 for HTTPConnectionPool.
# * IPv6 support will also depend on Twisted 12.1.0.
#
"Twisted >= 11.0.0",
"Twisted >= 12.1.0",
# * foolscap < 0.5.1 had a performance bug which spent O(N**2) CPU for
# transferring large mutable files of size N.
@@ -64,8 +66,13 @@ install_requires = [
# pycryptopp-0.6.0 includes ed25519
"pycryptopp >= 0.6.0",
# needed for cloud backend
"txAWS == 0.2.1.post5",
"oauth2client == 1.1.0",
# Will be needed to test web apps, but not yet. See #1001.
#"windmill >= 1.3",
]
# Includes some indirect dependencies, but does not include allmydata.
@@ -85,8 +92,20 @@ package_imports = [
('pycrypto', 'Crypto'),
('pyasn1', 'pyasn1'),
('mock', 'mock'),
('txAWS', 'txaws'),
('oauth2client', 'oauth2client'),
('python-dateutil', 'dateutil'),
('httplib2', 'httplib2'),
('python-gflags', 'gflags'),
]
# Packages we cannot find a version number for by importing.
not_import_versionable_packages = ('zope.interface', 'mock', 'pyasn1', 'python-gflags')
# Packages that pkg_resources might report, but we don't care about checking their version.
ignorable_packages = ('argparse', 'pyutil', 'zbase32', 'distribute', 'twisted-web', 'twisted-core', 'twisted-conch', 'six')
def require_more():
import sys

View File

@@ -8,15 +8,20 @@ from twisted.application.internet import TimerService
from pycryptopp.publickey import rsa
import allmydata
from allmydata.node import InvalidValueError
from allmydata.storage.server import StorageServer
from allmydata.storage.backends.null.null_backend import configure_null_backend
from allmydata.storage.backends.disk.disk_backend import configure_disk_backend
from allmydata.storage.backends.cloud.cloud_backend import configure_cloud_backend
from allmydata.storage.backends.cloud.mock_cloud import configure_mock_cloud_backend
from allmydata.storage.expiration import ExpirationPolicy
from allmydata import storage_client
from allmydata.immutable.upload import Uploader
from allmydata.immutable.offloaded import Helper
from allmydata.control import ControlServer
from allmydata.introducer.client import IntroducerClient
from allmydata.util import hashutil, base32, pollmixin, log, keyutil, idlib
from allmydata.util.encodingutil import get_filesystem_encoding
from allmydata.util.abbreviate import parse_abbreviated_size
from allmydata.util.encodingutil import get_filesystem_encoding, quote_output
from allmydata.util.time_format import parse_duration, parse_date
from allmydata.stats import StatsProvider
from allmydata.history import History
@@ -165,7 +170,8 @@ class Client(node.Node, pollmixin.PollMixin):
self.init_web(webport) # strports string
def _sequencer(self):
seqnum_s = self.get_config_from_file("announcement-seqnum")
seqnum_path = os.path.join(self.basedir, "announcement-seqnum")
seqnum_s = self.get_optional_config_from_file(seqnum_path)
if not seqnum_s:
seqnum_s = "0"
seqnum = int(seqnum_s.strip())
@@ -216,6 +222,7 @@ class Client(node.Node, pollmixin.PollMixin):
def _make_key():
sk_vs,vk_vs = keyutil.make_keypair()
return sk_vs+"\n"
sk_vs = self.get_or_create_private_config("node.privkey", _make_key)
sk,vk_vs = keyutil.parse_privkey(sk_vs.strip())
self.write_config("node.pubkey", vk_vs+"\n")
@@ -230,11 +237,10 @@ return idlib.nodeid_b2a(self.nodeid)
return idlib.nodeid_b2a(self.nodeid)
def _init_permutation_seed(self, ss):
seed = self.get_config_from_file("permutation-seed")
seed = self.get_optional_private_config("permutation-seed")
if not seed:
have_shares = ss.have_shares()
if have_shares:
# if the server has shares but not a recorded
if ss.backend.must_use_tubid_as_permutation_seed():
# If a server using a disk backend has shares but not a recorded
# permutation-seed, then it has been around since pre-#466
# days, and the clients who uploaded those shares used our
# TubID as a permutation-seed. We should keep using that same
@@ -250,25 +256,42 @@ self.write_config("permutation-seed", seed+"\n")
self.write_config("permutation-seed", seed+"\n")
return seed.strip()
@classmethod
def configure_backend(cls, config):
"""This is also called directly by the implementation of 'tahoe admin create-container'."""
storedir = os.path.join(config.basedir, cls.STOREDIR)
# What sort of backend?
backendtype = config.get_config("storage", "backend", "disk")
if backendtype == "s3":
backendtype = "cloud.s3"
backendprefix = backendtype.partition('.')[0]
backend_configurators = {
'disk': configure_disk_backend,
'cloud': configure_cloud_backend,
'mock_cloud': configure_mock_cloud_backend,
'debug_discard': configure_null_backend,
}
if backendprefix not in backend_configurators:
raise InvalidValueError("%s is not supported; it must start with one of %s"
% (quote_output("[storage]backend = " + backendtype),
backend_configurators.keys()) )
return (backend_configurators[backendprefix](storedir, config), storedir)
def init_storage(self):
# should we run a storage server (and publish it for others to use)?
self.accountant = None
# Should we run a storage server (and publish it for others to use)?
if not self.get_config("storage", "enabled", True, boolean=True):
return
readonly = self.get_config("storage", "readonly", False, boolean=True)
storedir = os.path.join(self.basedir, self.STOREDIR)
(backend, storedir) = self.configure_backend(self)
data = self.get_config("storage", "reserved_space", None)
try:
reserved = parse_abbreviated_size(data)
except ValueError:
log.msg("[storage]reserved_space= contains unparseable value %s"
% data)
raise
if reserved is None:
reserved = 0
discard = self.get_config("storage", "debug_discard", False,
boolean=True)
if self.get_config("storage", "debug_discard", False, boolean=True):
raise OldConfigOptionError("[storage]debug_discard = True is no longer supported.")
expire = self.get_config("storage", "expire.enabled", False, boolean=True)
if expire:
@ -285,31 +308,29 @@ class Client(node.Node, pollmixin.PollMixin):
cutoff_date = self.get_config("storage", "expire.cutoff_date")
cutoff_date = parse_date(cutoff_date)
sharetypes = []
if self.get_config("storage", "expire.immutable", True, boolean=True):
sharetypes.append("immutable")
if self.get_config("storage", "expire.mutable", True, boolean=True):
sharetypes.append("mutable")
expiration_sharetypes = tuple(sharetypes)
if not self.get_config("storage", "expire.immutable", True, boolean=True):
raise OldConfigOptionError("[storage]expire.immutable = False is no longer supported.")
if not self.get_config("storage", "expire.mutable", True, boolean=True):
raise OldConfigOptionError("[storage]expire.mutable = False is no longer supported.")
ss = StorageServer(storedir, self.nodeid,
reserved_space=reserved,
discard_storage=discard,
readonly_storage=readonly,
stats_provider=self.stats_provider,
expiration_enabled=expire,
expiration_mode=mode,
expiration_override_lease_duration=o_l_d,
expiration_cutoff_date=cutoff_date,
expiration_sharetypes=expiration_sharetypes)
expiration_policy = ExpirationPolicy(enabled=expire, mode=mode, override_lease_duration=o_l_d,
cutoff_date=cutoff_date)
statedir = storedir
ss = StorageServer(self.nodeid, backend, statedir,
stats_provider=self.stats_provider)
self.accountant = ss.get_accountant()
self.accountant.set_expiration_policy(expiration_policy)
self.storage_server = ss
self.add_service(ss)
d = self.when_tub_ready()
# we can't do registerReference until the Tub is ready
def _publish(res):
furl_file = os.path.join(self.basedir, "private", "storage.furl").encode(get_filesystem_encoding())
furl = self.tub.registerReference(ss, furlFile=furl_file)
ann = {"anonymous-storage-FURL": furl,
anonymous_account = self.accountant.get_anonymous_account()
anonymous_account_furlfile = os.path.join(self.basedir, "private", "storage.furl").encode(get_filesystem_encoding())
anonymous_account_furl = self.tub.registerReference(anonymous_account, furlFile=anonymous_account_furlfile)
ann = {"anonymous-storage-FURL": anonymous_account_furl,
"permutation-seed-base32": self._init_permutation_seed(ss),
}
self.introducer_client.publish("storage", ann, self._node_key)
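For illustration, the ExpirationPolicy constructed in this hunk for a node with lease expiration enabled in "age" mode would look roughly like this (values hypothetical; ExpirationPolicy is imported from allmydata.storage.expiration at the top of this file, and the configuration knobs are described in docs/garbage-collection.rst):

policy = ExpirationPolicy(enabled=True, mode="age",
                          override_lease_duration=None, cutoff_date=None)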
@ -317,6 +338,9 @@ class Client(node.Node, pollmixin.PollMixin):
d.addErrback(log.err, facility="tahoe.init",
level=log.BAD, umid="aLGBKw")
def get_accountant(self):
return self.accountant
def init_client(self):
helper_furl = self.get_config("client", "helper.furl", None)
if helper_furl in ("None", ""):

@ -776,7 +776,7 @@ class Checker(log.PrefixingLogMixin):
unrecoverable = 1
# The file needs rebalancing if the set of servers that have at least
# one share is less than the number of uniquely-numbered shares
# one share is less than the number of uniquely-numbered good shares
# available.
# TODO: this may be wrong, see ticket #1115 comment:27 and ticket #1784.
needs_rebalancing = bool(good_share_hosts < len(verifiedshares))

@ -337,7 +337,7 @@ class ReadBucketProxy:
return self._read(0, 0x44)
def _parse_offsets(self, data):
precondition(len(data) >= 0x4)
precondition(len(data) >= 0x4, len(data))
self._offsets = {}
(version,) = struct.unpack(">L", data[0:4])
if version != 1 and version != 2:

@ -815,8 +815,26 @@ class EncryptAnUploadable:
self._status.set_progress(1, progress)
return cryptdata
def get_plaintext_hashtree_leaves(self, first, last, num_segments):
"""OBSOLETE; Get the leaf nodes of a merkle hash tree over the
plaintext segments, i.e. get the tagged hashes of the given segments.
The segment size is expected to be generated by the
IEncryptedUploadable before any plaintext is read or ciphertext
produced, so that the segment hashes can be generated with only a
single pass.
This returns a Deferred that fires with a sequence of hashes, using:
tuple(segment_hashes[first:last])
'num_segments' is used to assert that the number of segments that the
IEncryptedUploadable handled matches the number of segments that the
encoder was expecting.
This method must not be called until the final byte has been read
from read_encrypted(). Once this method is called, read_encrypted()
can never be called again.
"""
# this is currently unused, but will live again when we fix #453
if len(self._plaintext_segment_hashes) < num_segments:
# close out the last one
@ -835,6 +853,12 @@ class EncryptAnUploadable:
return defer.succeed(tuple(self._plaintext_segment_hashes[first:last]))
def get_plaintext_hash(self):
"""OBSOLETE; Get the hash of the whole plaintext.
This returns a Deferred that fires with a tagged SHA-256 hash of the
whole plaintext, obtained from hashutil.plaintext_hash(data).
"""
# this is currently unused, but will live again when we fix #453
h = self._plaintext_hasher.digest()
return defer.succeed(h)

@ -27,8 +27,8 @@ Number = IntegerConstraint(8) # 2**(8*8) == 16EiB ~= 18e18 ~= 18 exabytes
Offset = Number
ReadSize = int # the 'int' constraint is 2**31 == 2Gib -- large files are processed in not-so-large increments
WriteEnablerSecret = Hash # used to protect mutable share modifications
LeaseRenewSecret = Hash # used to protect lease renewal requests
LeaseCancelSecret = Hash # was used to protect lease cancellation requests
LeaseRenewSecret = Hash # previously used to protect lease renewal requests; for backward compatibility
LeaseCancelSecret = Hash # previously used to protect lease cancellation requests; for backward compatibility
class RIBucketWriter(RemoteInterface):
@ -99,23 +99,29 @@ class RIStorageServer(RemoteInterface):
sharenums=SetOf(int, maxLength=MAX_BUCKETS),
allocated_size=Offset, canary=Referenceable):
"""
@param storage_index: the index of the bucket to be created or
increfed.
@param sharenums: these are the share numbers (probably between 0 and
99) that the sender is proposing to store on this
server.
@param renew_secret: This is the secret used to protect bucket refresh
This secret is generated by the client and
stored for later comparison by the server. Each
server is given a different secret.
@param cancel_secret: This no longer allows lease cancellation, but
must still be a unique value identifying the
lease. XXX stop relying on it to be unique.
@param canary: If the canary is lost before close(), the bucket is
deleted.
Allocate BucketWriters for a set of shares on this server.
renew_secret and cancel_secret are ignored as of Tahoe-LAFS v1.12.0,
but for backward compatibility with older servers, should be
calculated in the same way as previous clients (see
allmydata.util.hashutil.file_{renewal,cancel}_secret_hash).
Servers that ignore renew_secret and cancel_secret in methods
of this interface will advertise a true value for the
'ignores-lease-renewal-and-cancel-secrets' key (under
'http://allmydata.org/tahoe/protocols/storage/v1') in their
version information.
@param storage_index: the index of the shares to be created.
@param sharenums: the share numbers that the sender is proposing to store
on this server.
@param renew_secret: previously used to authorize lease renewal.
@param cancel_secret: previously used to authorize lease cancellation.
@param canary: if the canary is lost before close(), the writes are
abandoned.
@return: tuple of (alreadygot, allocated), where alreadygot is what we
already have and allocated is what we hereby agree to accept.
New leases are added for shares in both lists.
Leases are added if necessary for shares in both lists.
"""
return TupleOf(SetOf(int, maxLength=MAX_BUCKETS),
DictOf(int, RIBucketWriter, maxKeys=MAX_BUCKETS))
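Although the server now ignores them, clients still derive renew_secret and cancel_secret per (file, server) pair for compatibility. A sketch of that derivation chain, using the same hashutil calls that appear in the removed _dump_secrets() helper near the end of this diff:

from allmydata.util import hashutil

def lease_secrets_for(client_secret, storage_index, server_nodeid):
    # client-wide secrets -> per-file secrets -> per-(file, server) secrets
    crs = hashutil.my_renewal_secret_hash(client_secret)
    frs = hashutil.file_renewal_secret_hash(crs, storage_index)
    renew_secret = hashutil.bucket_renewal_secret_hash(frs, server_nodeid)
    ccs = hashutil.my_cancel_secret_hash(client_secret)
    fcs = hashutil.file_cancel_secret_hash(ccs, storage_index)
    cancel_secret = hashutil.bucket_cancel_secret_hash(fcs, server_nodeid)
    return (renew_secret, cancel_secret)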
@ -124,33 +130,48 @@ class RIStorageServer(RemoteInterface):
renew_secret=LeaseRenewSecret,
cancel_secret=LeaseCancelSecret):
"""
Add a new lease on the given bucket. If the renew_secret matches an
existing lease, that lease will be renewed instead. If there is no
bucket for the given storage_index, return silently. (note that in
tahoe-1.3.0 and earlier, IndexError was raised if there was no
bucket)
Add or renew a lease for this account on every share with the given
storage index held by the server. If there are no shares held by the
server with the given storage_index, return silently. (In Tahoe-LAFS
v1.3.0 and earlier, IndexError was raised in that case.)
The duration of leases is set to 31 days (unless there is already a
longer lease), but expiration behaviour also depends on the server's
configured policy (see docs/garbage-collection.rst).
renew_secret and cancel_secret are ignored as of Tahoe-LAFS v1.12.0,
but for backward compatibility with older servers, should be
calculated in the same way as previous clients (see
allmydata.util.hashutil.file_{renewal,cancel}_secret_hash).
"""
return Any() # returns None now, but future versions might change
return Any() # always None
def renew_lease(storage_index=StorageIndex, renew_secret=LeaseRenewSecret):
"""
Renew the lease on a given bucket, resetting the timer to 31 days.
Some networks will use this, some will not. If there is no bucket for
the given storage_index, IndexError will be raised.
Add or renew a lease for this account on every share with the given
storage index held by the server. If there are no shares held by the
server with the given storage_index, raise IndexError.
For mutable shares, if the given renew_secret does not match an
existing lease, IndexError will be raised with a note listing the
server-nodeids on the existing leases, so leases on migrated shares
can be renewed. For immutable shares, IndexError (without the note)
will be raised.
The duration of leases is set to 31 days (unless there is already a
longer lease), but expiration behaviour also depends on the server's
configured policy (see docs/garbage-collection.rst).
renew_secret is ignored as of Tahoe-LAFS v1.12.0, but for backward
compatibility with older servers, should be calculated in the same
way as previous clients (see
allmydata.util.hashutil.file_renewal_secret_hash). In versions
prior to v1.12.0, this method would only renew leases with the given
renew_secret.
Note that as of Tahoe-LAFS v1.12.0, the lease database does not retain
information about the node ids of lease holders, so if an IndexError
is raised for a mutable share, it no longer includes that information.
"""
return Any()
return Any() # always None
def get_buckets(storage_index=StorageIndex):
return DictOf(int, RIBucketReader, maxKeys=MAX_BUCKETS)
def slot_readv(storage_index=StorageIndex,
shares=ListOf(int), readv=ReadVector):
"""Read a vector from the numbered shares associated with the given
@ -181,24 +202,23 @@ class RIStorageServer(RemoteInterface):
This method is, um, large. The goal is to allow clients to update all
the shares associated with a mutable file in a single round trip.
@param storage_index: the index of the bucket to be created or
increfed.
@param storage_index: the index of the shareset to be operated on.
@param write_enabler: a secret that is stored along with the slot.
Writes are accepted from any caller who can
present the matching secret. A different secret
should be used for each slot*server pair.
@param renew_secret: This is the secret used to protect bucket refresh
This secret is generated by the client and
stored for later comparison by the server. Each
server is given a different secret.
@param cancel_secret: This no longer allows lease cancellation, but
must still be a unique value identifying the
lease. XXX stop relying on it to be unique.
@param renew_secret: previously used to authorize lease renewal.
@param cancel_secret: previously used to authorize lease cancellation.
The 'secrets' argument is a tuple of (write_enabler, renew_secret,
cancel_secret). The first is required to perform any write. The
latter two are used when allocating new shares. To simply acquire a
new lease on existing shares, use an empty testv and an empty writev.
cancel_secret). The first is required to perform any write.
renew_secret and cancel_secret are ignored as of Tahoe-LAFS v1.12.0,
but for backward compatibility with older servers, should be
calculated in the same way as previous clients (see
allmydata.util.hashutil.file_{renewal,cancel}_secret_hash).
To simply acquire a new lease on existing shares, use an empty testv
and an empty writev.
Each share can have a separate test vector (i.e. a list of
comparisons to perform). If all vectors for all shares pass, then all
@ -291,6 +311,277 @@ class RIStorageServer(RemoteInterface):
"""
class IStorageBackend(Interface):
"""
Objects of this kind live on the server side and are used by the
storage server object.
"""
def get_available_space():
"""
Returns available space for share storage in bytes, or
None if this information is not available or if the available
space is unlimited.
If the backend is configured for read-only mode then this will
return 0.
"""
def get_sharesets_for_prefix(prefix):
"""
Return a Deferred that fires with an iterable of IShareSet objects
for all storage indices matching the given base-32 prefix, for
which this backend holds shares.
A caller will typically perform operations that take locks on some
of the sharesets returned by this method. Nothing prevents sharesets
matching the prefix from being deleted or added between listing the
sharesets and taking any such locks; callers must be able to tolerate
this.
"""
def get_shareset(storageindex):
"""
Get an IShareSet object for the given storage index.
This method is synchronous.
"""
def fill_in_space_stats(stats):
"""
Fill in the 'stats' dict with space statistics for this backend, in
'storage_server.*' keys.
"""
def must_use_tubid_as_permutation_seed():
"""
Is this a disk backend with existing shares? If True, then the server
must assume that it was around before #466, so must use its TubID as a
permutation-seed.
"""
def create_container():
"""
Create a container for the configured backend, if necessary. Return a
Deferred that fires with False if no container is needed for this backend
type, or something other than False if a container has been successfully
created. It is an error to attempt to create a container that already exists.
"""
def list_container(prefix=str):
"""
Return a Deferred that fires with a list of ContainerItems for all
objects in the backend container. If prefix is given, restrict the
list to objects having keys with the given prefix.
"""
class IShareSet(Interface):
def get_storage_index():
"""
Returns the storage index for this shareset.
"""
def get_storage_index_string():
"""
Returns the base32-encoded storage index for this shareset.
"""
def get_overhead():
"""
Returns an estimate of the storage overhead, in bytes, of this shareset
(exclusive of the space used by its shares).
"""
def get_shares():
"""
Returns a Deferred that fires with a pair
(list of IShareBase objects, set of corrupted shnums).
The share objects include only completed shares in this shareset.
"""
def get_share(shnum):
"""
Returns a Deferred that fires with an IShareBase object if the given
share exists, or fails with IndexError otherwise.
"""
def delete_share(shnum):
"""
Delete a stored share. Returns a Deferred that fires when complete.
This does not delete incoming shares.
"""
def has_incoming(shnum):
"""
Returns True if this shareset has an incoming (partial) share with this
number, otherwise False.
"""
def make_bucket_writer(account, shnum, allocated_data_length, canary):
"""
Create a bucket writer that can be used to write data to a given share.
@param account=Account
@param shnum=int: A share number in this shareset
@param allocated_data_length=int: The maximum space allocated for the
share, in bytes
@param canary=Referenceable: If the canary is lost before close(), the
bucket is deleted.
@return an IStorageBucketWriter for the given share
"""
def make_bucket_reader(account, share):
"""
Create a bucket reader that can be used to read data from a given share.
@param account=Account
@param share=IShareForReading
@return an IStorageBucketReader for the given share
"""
def readv(wanted_shnums, read_vector):
"""
Read a vector from the numbered shares in this shareset. An empty
wanted_shnums list means to return data from all known shares.
Return a Deferred that fires with a dict mapping the share number
to the corresponding ReadData.
@param wanted_shnums=ListOf(int)
@param read_vector=ReadVector
@return DeferredOf(DictOf(int, ReadData)): shnum -> results, with one key per share
"""
def testv_and_readv_and_writev(write_enabler, test_and_write_vectors, read_vector,
expiration_time, account):
"""
General-purpose atomic test-read-and-set operation for mutable slots.
Perform a bunch of comparisons against the existing shares in this
shareset. If they all pass: use the read vectors to extract data from
all the shares, then apply a bunch of write vectors to those shares.
Return a Deferred that fires with a pair consisting of a boolean that is
True iff the test vectors passed, and a dict mapping the share number
to the corresponding ReadData. Reads do not include any modifications
made by the writes.
See the similar method in RIStorageServer for more detail.
@param write_enabler=WriteEnablerSecret
@param test_and_write_vectors=TestAndWriteVectorsForShares
@param read_vector=ReadVector
@param expiration_time=int
@param account=Account
@return DeferredOf(TupleOf(bool, DictOf(int, ReadData)))
"""
class IShareBase(Interface):
"""
I represent an immutable or mutable share stored by a particular backend.
I may hold some, all, or none of the share data in memory.
"""
def get_storage_index():
"""
Returns the storage index.
"""
def get_storage_index_string():
"""
Returns the base32-encoded storage index.
"""
def get_shnum():
"""
Returns the share number.
"""
def get_data_length():
"""
Returns the data length in bytes.
"""
def get_size():
"""
Returns the size of the share in bytes.
"""
def get_used_space():
"""
Returns the amount of backend storage including overhead (which may
have to be estimated), in bytes, used by this share.
"""
def unlink():
"""
Signal that this share can be removed from the backend storage. This does
not guarantee that the share data will be immediately inaccessible, or
that it will be securely erased.
Returns a Deferred that fires after the share has been removed.
This may be called on a share that is being written and is not closed.
"""
class IShareForReading(IShareBase):
"""
I represent an immutable share that can be read from.
"""
def read_share_data(offset, length):
"""
Return a Deferred that fires with the read result.
"""
def readv(read_vector):
"""
Given a list of (offset, length) pairs, return a Deferred that fires with
a list of read results.
"""
class IShareForWriting(IShareBase):
"""
I represent an immutable share that is being written.
"""
def get_allocated_data_length():
"""
Returns the allocated data length of the share in bytes. This is the maximum
amount of data that can be written (not including headers and leases).
"""
def write_share_data(offset, data):
"""
Write data at the given offset. Return a Deferred that fires when we
are ready to accept the next write.
Data must be written with no backtracking, i.e. offset must not be before
the previous end-of-data.
"""
def close():
"""
Complete writing to this share.
"""
class IMutableShare(IShareBase):
"""
I represent a mutable share.
"""
def check_write_enabler(write_enabler):
"""
@param write_enabler=WriteEnablerSecret
"""
def check_testv(test_vector):
"""
@param test_vector=TestVector
"""
def writev(datav, new_length):
"""
@param datav=DataVector
@param new_length=ChoiceOf(None, Offset)
"""
class IStorageBucketWriter(Interface):
"""
Objects of this kind live on the client side.
@ -326,7 +617,7 @@ class IStorageBucketWriter(Interface):
of plaintext, crypttext, and shares), as well as encoding parameters
that are necessary to recover the data. This is a serialized dict
mapping strings to other strings. The hash of this data is kept in
the URI and verified before any of the data is used. All buckets for
the URI and verified before any of the data is used. All shares for
a given file contain identical copies of this data.
The serialization format is specified with the following pseudocode:
@ -339,10 +630,9 @@ class IStorageBucketWriter(Interface):
"""
def close():
"""Finish writing and close the bucket. The share is not finalized
until this method is called: if the uploading client disconnects
before calling close(), the partially-written share will be
discarded.
"""
Finish writing and finalize the share. If the uploading client disconnects
before calling close(), the partially-written share will be discarded.
@return: a Deferred that fires (with None) when the operation completes
"""
@ -1212,7 +1502,7 @@ class IDirectoryNode(IFilesystemNode):
is empty, the metadata will be an empty dictionary.
"""
def set_uri(name, writecap, readcap=None, metadata=None, overwrite=True):
def set_uri(name, writecap, readcap, metadata=None, overwrite=True):
"""I add a child (by writecap+readcap) at the specific name. I return
a Deferred that fires when the operation finishes. If overwrite= is
True, I will replace any existing child of the same name, otherwise

@ -742,8 +742,9 @@ class MutableFileVersion:
"""
return self._version[0] # verinfo[0] == the sequence number
def get_servermap(self):
return self._servermap
# TODO: Terminology?
def get_writekey(self):
"""
I return a writekey or None if I don't have a writekey.

@ -71,6 +71,11 @@ HEADER_LENGTH = struct.calcsize(HEADER)
OFFSETS = ">LLLLQQ"
OFFSETS_LENGTH = struct.calcsize(OFFSETS)
# our sharefiles start with a recognizable string, plus some random
# binary data to reduce the chance that a regular text file will look
# like a sharefile.
MUTABLE_MAGIC = "Tahoe mutable container v1\n" + "\x75\x09\x44\x03\x8e"
MAX_MUTABLE_SHARE_SIZE = 69105*1000*1000*1000*1000 # 69105 TB, kind of arbitrary
@ -91,7 +96,7 @@ def unpack_header(data):
return (version, seqnum, root_hash, IV, k, N, segsize, datalen, o)
def unpack_share(data):
assert len(data) >= HEADER_LENGTH
assert len(data) >= HEADER_LENGTH, len(data)
o = {}
(version,
seqnum,

@ -1,3 +1,4 @@
import datetime, os.path, re, types, ConfigParser, tempfile
from base64 import b32decode, b32encode
@ -12,6 +13,8 @@ from allmydata.util import fileutil, iputil, observer
from allmydata.util.assertutil import precondition, _assert
from allmydata.util.fileutil import abspath_expanduser_unicode
from allmydata.util.encodingutil import get_filesystem_encoding, quote_output
from allmydata.util.abbreviate import parse_abbreviated_size
# Add our application versions to the data that Foolscap's LogPublisher
# reports.
@ -37,9 +40,12 @@ such as private keys. On Unix-like systems, the permissions on this directory
are set to disallow users other than its owner from reading the contents of
the files. See the 'configuration.rst' documentation file for details."""
class _None: # used as a marker in get_config()
class _None: # used as a marker in get_config() and get_or_create_private_config()
pass
class InvalidValueError(Exception):
""" The configured value was not valid. """
class MissingConfigEntry(Exception):
""" A required config entry was not found. """
@ -56,7 +62,156 @@ class OldConfigOptionError(Exception):
pass
class Node(service.MultiService):
class ConfigMixin:
def get_config(self, section, option, default=_None, boolean=False):
try:
if boolean:
return self.config.getboolean(section, option)
return self.config.get(section, option)
except (ConfigParser.NoOptionError, ConfigParser.NoSectionError):
if default is _None:
fn = os.path.join(self.basedir, u"tahoe.cfg")
raise MissingConfigEntry("%s is missing the [%s]%s entry"
% (quote_output(fn), section, option))
return default
def get_config_size(self, section, option, default=_None):
data = self.get_config(section, option, default)
if data is None:
return None
try:
return parse_abbreviated_size(data)
except ValueError:
raise InvalidValueError("[%s]%s= contains unparseable size value %s"
% (section, option, quote_output(data)) )
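The abbreviated-size syntax accepted here is the same one used for [storage]reserved_space; a couple of illustrative values, assuming the usual convention that decimal suffixes are powers of 10 and "iB" suffixes are powers of 2:

from allmydata.util.abbreviate import parse_abbreviated_size

parse_abbreviated_size("1G")     # -> 1000000000 (decimal suffix)
parse_abbreviated_size("10MiB")  # -> 10485760   (binary 'iB' suffix)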
def set_config(self, section, option, value):
if not self.config.has_section(section):
self.config.add_section(section)
self.config.set(section, option, value)
assert self.config.get(section, option) == value
def read_config(self):
self.error_about_old_config_files()
self.config = ConfigParser.SafeConfigParser()
tahoe_cfg = os.path.join(self.basedir, "tahoe.cfg")
try:
f = open(tahoe_cfg, "rb")
try:
# Skip any initial Byte Order Mark. Since this is an ordinary file, we
# don't need to handle incomplete reads, and can assume seekability.
if f.read(3) != '\xEF\xBB\xBF':
f.seek(0)
self.config.readfp(f)
finally:
f.close()
except EnvironmentError:
if os.path.exists(tahoe_cfg):
raise
if not os.path.isdir(self.basedir):
raise MissingConfigEntry("%s is missing or not a directory." % quote_output(self.basedir))
def error_about_old_config_files(self):
""" If any old configuration files are detected, raise OldConfigError. """
oldfnames = set()
for name in [
'nickname', 'webport', 'keepalive_timeout', 'log_gatherer.furl',
'disconnect_timeout', 'advertised_ip_addresses', 'introducer.furl',
'helper.furl', 'key_generator.furl', 'stats_gatherer.furl',
'no_storage', 'readonly_storage', 'sizelimit',
'debug_discard_storage', 'run_helper']:
if name not in self.GENERATED_FILES:
fullfname = os.path.join(self.basedir, name)
if os.path.exists(fullfname):
oldfnames.add(fullfname)
if oldfnames:
e = OldConfigError(oldfnames)
twlog.msg(e)
raise e
def get_optional_config_from_file(self, path):
"""Read the (string) contents of a file. Any leading or trailing
whitespace will be stripped from the data. If the file does not exist,
return None."""
try:
value = fileutil.read(path)
except EnvironmentError:
if os.path.exists(path):
raise
return None
return value.strip()
def _get_private_config_path(self, name):
return os.path.join(self.basedir, "private", name)
def write_private_config(self, name, value):
"""Write the (string) contents of a private config file (which is a
config file that resides within the subdirectory named 'private').
"""
fileutil.write(self._get_private_config_path(name), value, mode="")
def get_optional_private_config(self, name):
"""Try to get the (string) contents of a private config file (which
is a config file that resides within the subdirectory named
'private'), and return it. Any leading or trailing whitespace will be
stripped from the data. If the file does not exist, return None.
"""
return self.get_optional_config_from_file(self._get_private_config_path(name))
def get_private_config(self, name):
"""Read the (string) contents of a private config file (which is a
config file that resides within the subdirectory named 'private'),
and return it. Raise an error if the file was not found.
"""
return self.get_or_create_private_config(name)
def get_or_create_private_config(self, name, default=_None):
"""Try to get the (string) contents of a private config file (which
is a config file that resides within the subdirectory named
'private'), and return it. Any leading or trailing whitespace will be
stripped from the data.
If the file does not exist, and default is not given, report an error.
If the file does not exist and a default is specified, try to create
it using that default, and then return the value that was written.
If 'default' is a string, use it as a default value. If not, treat it
as a zero-argument callable that is expected to return a string.
"""
value = self.get_optional_private_config(name)
if value is None:
privpath = self._get_private_config_path(name)
if default is _None:
raise MissingConfigEntry("The required configuration file %s is missing."
% (quote_output(privpath),))
elif isinstance(default, basestring):
value = default.strip()
else:
value = default().strip()
fileutil.write(privpath, value, mode="")
return value
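A usage sketch with hypothetical names, mirroring the node.privkey pattern in client.py above: the callable default runs only when private/<name> does not exist yet, and the stripped value is written back and returned.

import os
from allmydata.util import base32

def init_lease_secret(node):
    # 'node' is any ConfigMixin (e.g. a Client); the file name "secret" matches
    # the private/secret file referenced elsewhere in this branch.
    def _make_secret():
        return base32.b2a(os.urandom(32)) + "\n"
    return node.get_or_create_private_config("secret", _make_secret)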
def write_config(self, name, value, mode=""):
"""Write a string to a config file."""
fn = os.path.join(self.basedir, name)
try:
fileutil.write(fn, value, mode=mode)
except EnvironmentError, e:
self.log("Unable to write config file '%s'" % fn)
self.log(e)
class ConfigOnly(object, ConfigMixin):
GENERATED_FILES = []
def __init__(self, basedir=u"."):
self.basedir = abspath_expanduser_unicode(unicode(basedir))
self.read_config()
class Node(service.MultiService, ConfigMixin):
# this implements common functionality of both Client nodes and Introducer
# nodes.
NODETYPE = "unknown NODETYPE"
@ -70,10 +225,23 @@ class Node(service.MultiService):
self._portnumfile = os.path.join(self.basedir, self.PORTNUMFILE)
self._tub_ready_observerlist = observer.OneShotObserverList()
fileutil.make_dirs(os.path.join(self.basedir, "private"), 0700)
open(os.path.join(self.basedir, "private", "README"), "w").write(PRIV_README)
fileutil.write(os.path.join(self.basedir, "private", "README"), PRIV_README, mode="")
# creates self.config
self.read_config()
cfg_tubport = self.get_config("node", "tub.port", "")
if not cfg_tubport:
# For 'tub.port', tahoe.cfg overrides the individual file on
# disk. So only read self._portnumfile if tahoe.cfg doesn't
# provide a value.
try:
file_tubport = fileutil.read(self._portnumfile).strip()
self.set_config("node", "tub.port", file_tubport)
except EnvironmentError:
if os.path.exists(self._portnumfile):
raise
nickname_utf8 = self.get_config("node", "nickname", "<unspecified>")
self.nickname = nickname_utf8.decode("utf-8")
assert type(self.nickname) is unicode
@ -101,74 +269,6 @@ class Node(service.MultiService):
test_name = tempfile.mktemp()
_assert(os.path.dirname(test_name) == tempdir, test_name, tempdir)
def get_config(self, section, option, default=_None, boolean=False):
try:
if boolean:
return self.config.getboolean(section, option)
return self.config.get(section, option)
except (ConfigParser.NoOptionError, ConfigParser.NoSectionError):
if default is _None:
fn = os.path.join(self.basedir, u"tahoe.cfg")
raise MissingConfigEntry("%s is missing the [%s]%s entry"
% (quote_output(fn), section, option))
return default
def set_config(self, section, option, value):
if not self.config.has_section(section):
self.config.add_section(section)
self.config.set(section, option, value)
assert self.config.get(section, option) == value
def read_config(self):
self.error_about_old_config_files()
self.config = ConfigParser.SafeConfigParser()
tahoe_cfg = os.path.join(self.basedir, "tahoe.cfg")
try:
f = open(tahoe_cfg, "rb")
try:
# Skip any initial Byte Order Mark. Since this is an ordinary file, we
# don't need to handle incomplete reads, and can assume seekability.
if f.read(3) != '\xEF\xBB\xBF':
f.seek(0)
self.config.readfp(f)
finally:
f.close()
except EnvironmentError:
if os.path.exists(tahoe_cfg):
raise
cfg_tubport = self.get_config("node", "tub.port", "")
if not cfg_tubport:
# For 'tub.port', tahoe.cfg overrides the individual file on
# disk. So only read self._portnumfile if tahoe.cfg doesn't
# provide a value.
try:
file_tubport = fileutil.read(self._portnumfile).strip()
self.set_config("node", "tub.port", file_tubport)
except EnvironmentError:
if os.path.exists(self._portnumfile):
raise
def error_about_old_config_files(self):
""" If any old configuration files are detected, raise OldConfigError. """
oldfnames = set()
for name in [
'nickname', 'webport', 'keepalive_timeout', 'log_gatherer.furl',
'disconnect_timeout', 'advertised_ip_addresses', 'introducer.furl',
'helper.furl', 'key_generator.furl', 'stats_gatherer.furl',
'no_storage', 'readonly_storage', 'sizelimit',
'debug_discard_storage', 'run_helper']:
if name not in self.GENERATED_FILES:
fullfname = os.path.join(self.basedir, name)
if os.path.exists(fullfname):
oldfnames.add(fullfname)
if oldfnames:
e = OldConfigError(oldfnames)
twlog.msg(e)
raise e
def create_tub(self):
certfile = os.path.join(self.basedir, "private", self.CERTFILE)
self.tub = Tub(certFile=certfile)
@ -209,81 +309,6 @@ class Node(service.MultiService):
# TODO: merge this with allmydata.get_package_versions
return dict(app_versions.versions)
def get_config_from_file(self, name, required=False):
"""Get the (string) contents of a config file, or None if the file
did not exist. If required=True, raise an exception rather than
returning None. Any leading or trailing whitespace will be stripped
from the data."""
fn = os.path.join(self.basedir, name)
try:
return fileutil.read(fn).strip()
except EnvironmentError:
if not required:
return None
raise
def write_private_config(self, name, value):
"""Write the (string) contents of a private config file (which is a
config file that resides within the subdirectory named 'private'), and
return it.
"""
privname = os.path.join(self.basedir, "private", name)
open(privname, "w").write(value)
def get_private_config(self, name, default=_None):
"""Read the (string) contents of a private config file (which is a
config file that resides within the subdirectory named 'private'),
and return it. Return a default, or raise an error if one was not
given.
"""
privname = os.path.join(self.basedir, "private", name)
try:
return fileutil.read(privname)
except EnvironmentError:
if os.path.exists(privname):
raise
if default is _None:
raise MissingConfigEntry("The required configuration file %s is missing."
% (quote_output(privname),))
return default
def get_or_create_private_config(self, name, default=_None):
"""Try to get the (string) contents of a private config file (which
is a config file that resides within the subdirectory named
'private'), and return it. Any leading or trailing whitespace will be
stripped from the data.
If the file does not exist, and default is not given, report an error.
If the file does not exist and a default is specified, try to create
it using that default, and then return the value that was written.
If 'default' is a string, use it as a default value. If not, treat it
as a zero-argument callable that is expected to return a string.
"""
privname = os.path.join(self.basedir, "private", name)
try:
value = fileutil.read(privname)
except EnvironmentError:
if os.path.exists(privname):
raise
if default is _None:
raise MissingConfigEntry("The required configuration file %s is missing."
% (quote_output(privname),))
if isinstance(default, basestring):
value = default
else:
value = default()
fileutil.write(privname, value)
return value.strip()
def write_config(self, name, value, mode="w"):
"""Write a string to a config file."""
fn = os.path.join(self.basedir, name)
try:
fileutil.write(fn, value, mode)
except EnvironmentError, e:
self.log("Unable to write config file '%s'" % fn)
self.log(e)
def startService(self):
# Note: this class can be started and stopped at most once.
self.log("Node.startService")

@ -1,16 +1,18 @@
import os
from twisted.python import usage
from allmydata.scripts.common import BaseOptions
from allmydata.util.encodingutil import quote_output
from allmydata.scripts.common import BaseOptions, BasedirOptions
class GenerateKeypairOptions(BaseOptions):
def getSynopsis(self):
return "Usage: tahoe [global-opts] admin generate-keypair"
return "Usage: %s [global-opts] admin generate-keypair" % (self.command_name,)
def getUsage(self, width=None):
t = BaseOptions.getUsage(self, width)
t += """
Generate a public/private keypair, dumped to stdout as two lines of ASCII..
Generate a public/private keypair, dumped to stdout as two lines of ASCII.
"""
return t
@ -26,14 +28,13 @@ class DerivePubkeyOptions(BaseOptions):
self.privkey = privkey
def getSynopsis(self):
return "Usage: tahoe [global-opts] admin derive-pubkey PRIVKEY"
return "Usage: %s [global-opts] admin derive-pubkey PRIVKEY" % (self.command_name,)
def getUsage(self, width=None):
t = BaseOptions.getUsage(self, width)
t += """
Given a private (signing) key that was previously generated with
generate-keypair, derive the public key and print it to stdout.
"""
return t
@ -46,18 +47,145 @@ def derive_pubkey(options):
print >>out, "public:", pubkey_vs
return 0
class CreateContainerOptions(BasedirOptions):
def getSynopsis(self):
return "Usage: %s [global-opts] admin create-container [NODEDIR]" % (self.command_name,)
def getUsage(self, width=None):
t = BasedirOptions.getUsage(self, width)
t += """
Create a storage container, using the name and credentials configured in
tahoe.cfg. This is needed only for the cloud backend, and only if the
container has not already been created. See <docs/backends/cloud.rst>
for more details.
"""
return t
def create_container(options):
from twisted.internet import reactor, defer
d = defer.maybeDeferred(do_create_container, options)
d.addCallbacks(lambda ign: os._exit(0), lambda ign: os._exit(1))
reactor.run()
def do_create_container(options):
from twisted.internet import defer
from allmydata.node import ConfigOnly
from allmydata.client import Client
out = options.stdout
err = options.stderr
d = defer.succeed(None)
def _do_create(ign):
config = ConfigOnly(options['basedir'])
(backend, _) = Client.configure_backend(config)
d2 = backend.create_container()
def _done(res):
if res is False:
print >>out, ("It is not necessary to create a container for this backend type (%s)."
% (backend.__class__.__name__,))
else:
print >>out, "The container was successfully created."
print >>out
d2.addCallback(_done)
return d2
d.addCallback(_do_create)
def _failed(f):
print >>err, "Container creation failed."
print >>err, "%s: %s" % (f.value.__class__.__name__, f.value)
print >>err
return f
d.addErrback(_failed)
return d
class ListContainerOptions(BasedirOptions):
def getSynopsis(self):
return "Usage: %s [global-opts] admin ls-container [NODEDIR]" % (self.command_name,)
def getUsage(self, width=None):
t = BasedirOptions.getUsage(self, width)
t += """
List the contents of a storage container, using the name and credentials
configured in tahoe.cfg. This currently works only for the cloud backend.
"""
return t
def ls_container(options):
from twisted.internet import reactor, defer
d = defer.maybeDeferred(do_ls_container, options)
d.addCallbacks(lambda ign: os._exit(0), lambda ign: os._exit(1))
reactor.run()
def format_date(date):
datestr = str(date)
if datestr.endswith('+00:00'):
datestr = datestr[: -6] + 'Z'
return datestr
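For example, format_date() only stringifies its argument and rewrites a trailing UTC offset as a 'Z' suffix, so:

format_date("2014-04-26 21:26:56+00:00")   # -> '2014-04-26 21:26:56Z'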
def do_ls_container(options):
from twisted.internet import defer
from allmydata.node import ConfigOnly
from allmydata.client import Client
from allmydata.util.namespace import Namespace
out = options.stdout
err = options.stderr
d = defer.succeed(None)
def _do_create(ign):
config = ConfigOnly(options['basedir'])
if not config.get_config("storage", "enabled", True, boolean=True):
raise AssertionError("'tahoe admin ls-container' is intended for administration of nodes running a storage service.\n"
"The node with base directory %s is not configured to provide storage."
% quote_output(options['basedir']))
(backend, _) = Client.configure_backend(config)
ns = Namespace()
ns.total_size = 0
d2 = backend.list_container()
def _done(items):
print >>out, "Listing %d object(s):" % len(items)
print >>out, " Size Last modified Key"
for item in items:
print >>out, "% 8s %20s %s" % (item.size, format_date(item.modification_date), item.key)
ns.total_size += int(item.size)
print >>out
print >>out, "Total size: %d bytes" % (ns.total_size,)
d2.addCallback(_done)
return d2
d.addCallback(_do_create)
def _failed(f):
print >>err, "Container listing failed."
print >>err, "%s: %s" % (f.value.__class__.__name__, f.value)
print >>err
return f
d.addErrback(_failed)
return d
class AdminCommand(BaseOptions):
subCommands = [
("generate-keypair", None, GenerateKeypairOptions,
"Generate a public/private keypair, write to stdout."),
("derive-pubkey", None, DerivePubkeyOptions,
"Derive a public key from a private key."),
("create-container", None, CreateContainerOptions,
"Create a container for the configured cloud backend."),
("ls-container", None, ListContainerOptions,
"List the contents of the configured backend container."),
]
def postOptions(self):
if not hasattr(self, 'subOptions'):
raise usage.UsageError("must specify a subcommand")
def getSynopsis(self):
return "Usage: tahoe [global-opts] admin SUBCOMMAND"
return "Usage: %s [global-opts] admin SUBCOMMAND" % (self.command_name,)
def getUsage(self, width=None):
t = BaseOptions.getUsage(self, width)
t += """
@ -69,6 +197,8 @@ each subcommand.
subDispatch = {
"generate-keypair": print_keypair,
"derive-pubkey": derive_pubkey,
"create-container": create_container,
"ls-container": ls_container,
}
def do_admin(options):

@ -6,6 +6,8 @@ from allmydata.util.hashutil import backupdb_dirhash
from allmydata.util import base32
from allmydata.util.fileutil import abspath_expanduser_unicode
from allmydata.util.encodingutil import to_str
from allmydata.util.dbutil import get_db, DBError
DAY = 24*60*60
MONTH = 30*DAY
@ -58,47 +60,22 @@ UPDATE_v1_to_v2 = TABLE_DIRECTORY + """
UPDATE version SET version=2;
"""
UPDATERS = {
2: UPDATE_v1_to_v2,
}
def get_backupdb(dbfile, stderr=sys.stderr,
create_version=(SCHEMA_v2, 2), just_create=False):
# open or create the given backupdb file. The parent directory must
# Open or create the given backupdb file. The parent directory must
# exist.
import sqlite3
must_create = not os.path.exists(dbfile)
try:
db = sqlite3.connect(dbfile)
except (EnvironmentError, sqlite3.OperationalError), e:
print >>stderr, "Unable to create/open backupdb file %s: %s" % (dbfile, e)
return None
c = db.cursor()
if must_create:
schema, version = create_version
c.executescript(schema)
c.execute("INSERT INTO version (version) VALUES (?)", (version,))
db.commit()
try:
c.execute("SELECT version FROM version")
version = c.fetchone()[0]
except sqlite3.DatabaseError, e:
# this indicates that the file is not a compatible database format.
# Perhaps it was created with an old version, or it might be junk.
print >>stderr, "backupdb file is unusable: %s" % e
return None
if just_create: # for tests
return True
if version == 1:
c.executescript(UPDATE_v1_to_v2)
db.commit()
version = 2
if version == 2:
(sqlite3, db) = get_db(dbfile, stderr, create_version, updaters=UPDATERS,
just_create=just_create, dbname="backupdb")
return BackupDB_v2(sqlite3, db)
print >>stderr, "Unable to handle backupdb version %s" % version
return None
except DBError, e:
print >>stderr, e
return None
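A usage sketch (hypothetical driver code; the path is assumed here for illustration): open or create the backup database that 'tahoe backup' keeps under the node's private/ directory, using the refactored get_backupdb() above.

import os, sys

def open_backupdb(basedir):
    dbfile = os.path.join(basedir, "private", "backupdb.sqlite")
    bdb = get_backupdb(dbfile, stderr=sys.stderr)
    if bdb is None:
        sys.exit(1)   # get_backupdb() has already reported the DBError to stderr
    return bdb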
class FileResult:
def __init__(self, bdb, filecap, should_check,
@ -127,6 +104,7 @@ class FileResult:
def did_check_healthy(self, results):
self.bdb.did_check_file_healthy(self.filecap, results)
class DirectoryResult:
def __init__(self, bdb, dirhash, dircap, should_check):
self.bdb = bdb
@ -148,6 +126,7 @@ class DirectoryResult:
def did_check_healthy(self, results):
self.bdb.did_check_directory_healthy(self.dircap, results)
class BackupDB_v2:
VERSION = 2
NO_CHECK_BEFORE = 1*MONTH
@ -180,7 +159,7 @@ class BackupDB_v2:
is not healthy, please upload the file and call r.did_upload(filecap)
when you're done.
I use_timestamps=True (the default), I will compare ctime and mtime
If use_timestamps=True (the default), I will compare ctime and mtime
of the local file against an entry in my database, and consider the
file to be unchanged if ctime, mtime, and filesize are all the same
as the earlier version. If use_timestamps=False, I will not trust the

@ -44,9 +44,14 @@ class BasedirOptions(BaseOptions):
]
def parseArgs(self, basedir=None):
if self.parent['node-directory'] and self['basedir']:
# This finds the node-directory option correctly even if we are in a subcommand.
root = self.parent
while root.parent is not None:
root = root.parent
if root['node-directory'] and self['basedir']:
raise usage.UsageError("The --node-directory (or -d) and --basedir (or -C) options cannot both be used.")
if self.parent['node-directory'] and basedir:
if root['node-directory'] and basedir:
raise usage.UsageError("The --node-directory (or -d) option and a basedir argument cannot both be used.")
if self['basedir'] and basedir:
raise usage.UsageError("The --basedir (or -C) option and a basedir argument cannot both be used.")
@ -55,8 +60,8 @@ class BasedirOptions(BaseOptions):
b = argv_to_abspath(basedir)
elif self['basedir']:
b = argv_to_abspath(self['basedir'])
elif self.parent['node-directory']:
b = argv_to_abspath(self.parent['node-directory'])
elif root['node-directory']:
b = argv_to_abspath(root['node-directory'])
elif self.default_nodedir:
b = self.default_nodedir
else:

@ -2,20 +2,122 @@
# do not import any allmydata modules at this level. Do that from inside
# individual functions instead.
import struct, time, os, sys
from collections import deque
from twisted.python import usage, failure
from twisted.internet import defer
from twisted.scripts import trial as twisted_trial
from foolscap.logging import cli as foolscap_cli
from allmydata.util.assertutil import _assert
from allmydata.scripts.common import BaseOptions
class ChunkedShare(object):
def __init__(self, filename, preferred_chunksize):
self._filename = filename
self._position = 0
self._chunksize = os.stat(filename).st_size
self._total_size = self._chunksize
chunknum = 1
while True:
chunk_filename = self._get_chunk_filename(chunknum)
if not os.path.exists(chunk_filename):
break
size = os.stat(chunk_filename).st_size
_assert(size <= self._chunksize, size=size, chunksize=self._chunksize)
self._total_size += size
chunknum += 1
if self._chunksize == self._total_size:
# There is only one chunk, so we are at liberty to make the chunksize larger
# than that chunk, but not smaller.
self._chunksize = max(self._chunksize, preferred_chunksize)
def __repr__(self):
return "<ChunkedShare at %r>" % (self._filename,)
def seek(self, offset):
self._position = offset
def read(self, length):
data = self.pread(self._position, length)
self._position += len(data)
return data
def write(self, data):
self.pwrite(self._position, data)
self._position += len(data)
def pread(self, offset, length):
if offset + length > self._total_size:
length = max(0, self._total_size - offset)
pieces = deque()
chunknum = offset / self._chunksize
read_offset = offset % self._chunksize
remaining = length
while remaining > 0:
read_length = min(remaining, self._chunksize - read_offset)
_assert(read_length > 0, read_length=read_length)
pieces.append(self.read_from_chunk(chunknum, read_offset, read_length))
remaining -= read_length
read_offset = 0
chunknum += 1
return ''.join(pieces)
def _get_chunk_filename(self, chunknum):
if chunknum == 0:
return self._filename
else:
return "%s.%d" % (self._filename, chunknum)
def read_from_chunk(self, chunknum, offset, length):
f = open(self._get_chunk_filename(chunknum), "rb")
try:
f.seek(offset)
data = f.read(length)
_assert(len(data) == length, len_data = len(data), length=length)
return data
finally:
f.close()
def pwrite(self, offset, data):
if offset > self._total_size:
# fill the gap with zeroes
data = "\x00"*(offset + len(data) - self._total_size) + data
offset = self._total_size
self._total_size = max(self._total_size, offset + len(data))
chunknum = offset / self._chunksize
write_offset = offset % self._chunksize
data_offset = 0
remaining = len(data)
while remaining > 0:
write_length = min(remaining, self._chunksize - write_offset)
_assert(write_length > 0, write_length=write_length)
self.write_to_chunk(chunknum, write_offset, data[data_offset : data_offset + write_length])
remaining -= write_length
data_offset += write_length
write_offset = 0
chunknum += 1
def write_to_chunk(self, chunknum, offset, data):
f = open(self._get_chunk_filename(chunknum), "rw+b")
try:
f.seek(offset)
f.write(data)
finally:
f.close()
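A worked example of the chunk arithmetic used by pread() and pwrite() above, assuming a chunk size of 1000 bytes: a 120-byte read at offset 950 spans the tail of chunk 0 and the head of chunk 1.

chunksize = 1000
offset, length = 950, 120
chunknum    = offset // chunksize                    # 0 -> the base file itself ('/' in the code is integer division in Python 2)
read_offset = offset % chunksize                     # 950
first_len   = min(length, chunksize - read_offset)   # 50 bytes read from chunk 0
second_len  = length - first_len                     # 70 bytes read from chunk 1 ("<filename>.1")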
class DumpOptions(BaseOptions):
def getSynopsis(self):
return "Usage: tahoe [global-opts] debug dump-share SHARE_FILENAME"
optFlags = [
["offsets", None, "Display a table of section offsets."],
["leases-only", None, "Dump leases but not CHK contents."],
]
def getUsage(self, width=None):
@ -35,35 +137,37 @@ verify-cap for the file that uses the share.
from allmydata.util.encodingutil import argv_to_abspath
self['filename'] = argv_to_abspath(filename)
def dump_share(options):
from allmydata.storage.mutable import MutableShareFile
from allmydata.util.encodingutil import quote_output
from allmydata.mutable.layout import MUTABLE_MAGIC, MAX_MUTABLE_SHARE_SIZE
out = options.stdout
filename = options['filename']
# check the version, to see if we have a mutable or immutable share
print >>out, "share filename: %s" % quote_output(options['filename'])
print >>out, "share filename: %s" % quote_output(filename)
f = open(options['filename'], "rb")
prefix = f.read(32)
f.close()
if prefix == MutableShareFile.MAGIC:
return dump_mutable_share(options)
# otherwise assume it's immutable
return dump_immutable_share(options)
share = ChunkedShare(filename, MAX_MUTABLE_SHARE_SIZE)
prefix = share.pread(0, len(MUTABLE_MAGIC))
def dump_immutable_share(options):
from allmydata.storage.immutable import ShareFile
if prefix == MUTABLE_MAGIC:
return dump_mutable_share(options, share)
else:
return dump_immutable_share(options, share)
def dump_immutable_share(options, share):
from allmydata.storage.backends.disk.immutable import ImmutableDiskShare
share.DATA_OFFSET = ImmutableDiskShare.DATA_OFFSET
out = options.stdout
f = ShareFile(options['filename'])
if not options["leases-only"]:
dump_immutable_chk_share(f, out, options)
dump_immutable_lease_info(f, out)
dump_immutable_chk_share(share, out, options)
print >>out
return 0
def dump_immutable_chk_share(f, out, options):
def dump_immutable_chk_share(share, out, options):
from allmydata import uri
from allmydata.util import base32
from allmydata.immutable.layout import ReadBucketProxy
@ -71,13 +175,17 @@ def dump_immutable_chk_share(f, out, options):
# use a ReadBucketProxy to parse the bucket and find the uri extension
bp = ReadBucketProxy(None, None, '')
offsets = bp._parse_offsets(f.read_share_data(0, 0x44))
def read_share_data(offset, length):
return share.pread(share.DATA_OFFSET + offset, length)
offsets = bp._parse_offsets(read_share_data(0, 0x44))
print >>out, "%20s: %d" % ("version", bp._version)
seek = offsets['uri_extension']
length = struct.unpack(bp._fieldstruct,
f.read_share_data(seek, bp._fieldsize))[0]
read_share_data(seek, bp._fieldsize))[0]
seek += bp._fieldsize
UEB_data = f.read_share_data(seek, length)
UEB_data = read_share_data(seek, length)
unpacked = uri.unpack_extension_readable(UEB_data)
keys1 = ("size", "num_segments", "segment_size",
@ -137,25 +245,13 @@ def dump_immutable_chk_share(f, out, options):
if options['offsets']:
print >>out
print >>out, " Section Offsets:"
print >>out, "%20s: %s" % ("share data", f._data_offset)
print >>out, "%20s: %s" % ("share data", share.DATA_OFFSET)
for k in ["data", "plaintext_hash_tree", "crypttext_hash_tree",
"block_hashes", "share_hashes", "uri_extension"]:
name = {"data": "block data"}.get(k,k)
offset = f._data_offset + offsets[k]
offset = share.DATA_OFFSET + offsets[k]
print >>out, " %20s: %s (0x%x)" % (name, offset, offset)
print >>out, "%20s: %s" % ("leases", f._lease_offset)
def dump_immutable_lease_info(f, out):
# display lease information too
print >>out
leases = list(f.get_leases())
if leases:
for i,lease in enumerate(leases):
when = format_expiration_time(lease.expiration_time)
print >>out, " Lease #%d: owner=%d, expire in %s" \
% (i, lease.owner_num, when)
else:
print >>out, " No leases."
def format_expiration_time(expiration_time):
now = time.time()
@ -168,49 +264,32 @@ def format_expiration_time(expiration_time):
return when
def dump_mutable_share(options):
from allmydata.storage.mutable import MutableShareFile
def dump_mutable_share(options, m):
from allmydata.util import base32, idlib
from allmydata.storage.backends.disk.mutable import MutableDiskShare
out = options.stdout
m = MutableShareFile(options['filename'])
f = open(options['filename'], "rb")
WE, nodeid = m._read_write_enabler_and_nodeid(f)
num_extra_leases = m._read_num_extra_leases(f)
data_length = m._read_data_length(f)
extra_lease_offset = m._read_extra_lease_offset(f)
container_size = extra_lease_offset - m.DATA_OFFSET
leases = list(m._enumerate_leases(f))
m.DATA_OFFSET = MutableDiskShare.DATA_OFFSET
WE, nodeid = MutableDiskShare._read_write_enabler_and_nodeid(m)
data_length = MutableDiskShare._read_data_length(m)
container_size = MutableDiskShare._read_container_size(m)
share_type = "unknown"
f.seek(m.DATA_OFFSET)
version = f.read(1)
version = m.pread(m.DATA_OFFSET, 1)
if version == "\x00":
# this slot contains an SMDF share
share_type = "SDMF"
elif version == "\x01":
share_type = "MDMF"
f.close()
print >>out
print >>out, "Mutable slot found:"
print >>out, " share_type: %s" % share_type
print >>out, " write_enabler: %s" % base32.b2a(WE)
print >>out, " WE for nodeid: %s" % idlib.nodeid_b2a(nodeid)
print >>out, " num_extra_leases: %d" % num_extra_leases
print >>out, " container_size: %d" % container_size
print >>out, " data_length: %d" % data_length
if leases:
for (leasenum, lease) in leases:
print >>out
print >>out, " Lease #%d:" % leasenum
print >>out, " ownerid: %d" % lease.owner_num
when = format_expiration_time(lease.expiration_time)
print >>out, " expires in %s" % when
print >>out, " renew_secret: %s" % base32.b2a(lease.renew_secret)
print >>out, " cancel_secret: %s" % base32.b2a(lease.cancel_secret)
print >>out, " secrets are for nodeid: %s" % idlib.nodeid_b2a(lease.nodeid)
else:
print >>out, "No leases."
print >>out
if share_type == "SDMF":
@ -220,6 +299,7 @@ def dump_mutable_share(options):
return 0
def dump_SDMF_share(m, length, options):
from allmydata.mutable.layout import unpack_share, unpack_header
from allmydata.mutable.common import NeedMoreDataError
@ -227,24 +307,16 @@ def dump_SDMF_share(m, length, options):
from allmydata.uri import SSKVerifierURI
from allmydata.util.encodingutil import quote_output, to_str
offset = m.DATA_OFFSET
out = options.stdout
f = open(options['filename'], "rb")
f.seek(offset)
data = f.read(min(length, 2000))
f.close()
data = m.pread(m.DATA_OFFSET, min(length, 2000))
try:
pieces = unpack_share(data)
except NeedMoreDataError, e:
# retry once with the larger size
size = e.needed_bytes
f = open(options['filename'], "rb")
f.seek(offset)
data = f.read(min(length, size))
f.close()
data = m.pread(m.DATA_OFFSET, min(length, size))
pieces = unpack_share(data)
(seqnum, root_hash, IV, k, N, segsize, datalen,
@ -283,12 +355,12 @@ def dump_SDMF_share(m, length, options):
if options['offsets']:
# NOTE: this offset-calculation code is fragile, and needs to be
# merged with MutableShareFile's internals.
# merged with MutableDiskShare's internals.
print >>out
print >>out, " Section Offsets:"
def printoffset(name, value, shift=0):
print >>out, "%s%20s: %s (0x%x)" % (" "*shift, name, value, value)
printoffset("first lease", m.HEADER_SIZE)
printoffset("end of header", m.DATA_OFFSET)
printoffset("share data", m.DATA_OFFSET)
o_seqnum = m.DATA_OFFSET + struct.calcsize(">B")
printoffset("seqnum", o_seqnum, 2)
@ -301,30 +373,27 @@ def dump_SDMF_share(m, length, options):
"EOF": "end of share data"}.get(k,k)
offset = m.DATA_OFFSET + offsets[k]
printoffset(name, offset, 2)
f = open(options['filename'], "rb")
printoffset("extra leases", m._read_extra_lease_offset(f) + 4)
f.close()
print >>out
def dump_MDMF_share(m, length, options):
from allmydata.mutable.layout import MDMFSlotReadProxy
from allmydata.util import base32, hashutil
from allmydata.uri import MDMFVerifierURI
from allmydata.util.encodingutil import quote_output, to_str
from allmydata.storage.backends.disk.mutable import MutableDiskShare
DATA_OFFSET = MutableDiskShare.DATA_OFFSET
offset = m.DATA_OFFSET
out = options.stdout
f = open(options['filename'], "rb")
storage_index = None; shnum = 0
class ShareDumper(MDMFSlotReadProxy):
def _read(self, readvs, force_remote=False, queue=False):
data = []
for (where,length) in readvs:
f.seek(offset+where)
data.append(f.read(length))
data.append(m.pread(DATA_OFFSET + where, length))
return defer.succeed({shnum: data})
p = ShareDumper(None, storage_index, shnum)
@ -342,7 +411,6 @@ def dump_MDMF_share(m, length, options):
pubkey = extract(p.get_verification_key)
block_hash_tree = extract(p.get_blockhashes)
share_hash_chain = extract(p.get_sharehashes)
f.close()
(seqnum, root_hash, salt_to_use, segsize, datalen, k, N, prefix,
offsets) = verinfo
@ -377,13 +445,13 @@ def dump_MDMF_share(m, length, options):
if options['offsets']:
# NOTE: this offset-calculation code is fragile, and needs to be
# merged with MutableShareFile's internals.
# merged with MutableDiskShare's internals.
print >>out
print >>out, " Section Offsets:"
def printoffset(name, value, shift=0):
print >>out, "%s%.20s: %s (0x%x)" % (" "*shift, name, value, value)
printoffset("first lease", m.HEADER_SIZE, 2)
printoffset("end of header", m.DATA_OFFSET, 2)
printoffset("share data", m.DATA_OFFSET, 2)
o_seqnum = m.DATA_OFFSET + struct.calcsize(">B")
printoffset("seqnum", o_seqnum, 4)
@ -398,24 +466,16 @@ def dump_MDMF_share(m, length, options):
"EOF": "end of share data"}.get(k,k)
offset = m.DATA_OFFSET + offsets[k]
printoffset(name, offset, 4)
f = open(options['filename'], "rb")
printoffset("extra leases", m._read_extra_lease_offset(f) + 4, 2)
f.close()
print >>out
class DumpCapOptions(BaseOptions):
def getSynopsis(self):
return "Usage: tahoe [global-opts] debug dump-cap [options] FILECAP"
optParameters = [
["nodeid", "n",
None, "Specify the storage server nodeid (ASCII), to construct WE and secrets."],
["client-secret", "c", None,
"Specify the client's base secret (ASCII), to construct secrets."],
["client-dir", "d", None,
"Specify the client's base directory, from which a -c secret will be read."],
None, "Specify the storage server nodeid (ASCII), to construct the write enabler."],
]
def parseArgs(self, cap):
self.cap = cap
@ -433,16 +493,14 @@ This may be useful to determine if a read-cap and a write-cap refer to the
same file, or to extract the storage-index from a file-cap (to then use with
find-shares)
If additional information is provided (storage server nodeid and/or client
base secret), this command will compute the shared secrets used for the
write-enabler and for lease-renewal.
For mutable write-caps, if the storage server nodeid is provided, this command
will compute the write enabler.
"""
return t
def dump_cap(options):
from allmydata import uri
from allmydata.util import base32
from base64 import b32decode
import urlparse, urllib
@ -451,47 +509,18 @@ def dump_cap(options):
nodeid = None
if options['nodeid']:
nodeid = b32decode(options['nodeid'].upper())
secret = None
if options['client-secret']:
secret = base32.a2b(options['client-secret'])
elif options['client-dir']:
secretfile = os.path.join(options['client-dir'], "private", "secret")
try:
secret = base32.a2b(open(secretfile, "r").read().strip())
except EnvironmentError:
pass
if cap.startswith("http"):
scheme, netloc, path, params, query, fragment = urlparse.urlparse(cap)
assert path.startswith("/uri/")
_assert(path.startswith("/uri/"), path=path)
cap = urllib.unquote(path[len("/uri/"):])
u = uri.from_string(cap)
print >>out
dump_uri_instance(u, nodeid, secret, out)
dump_uri_instance(u, nodeid, out)
def _dump_secrets(storage_index, secret, nodeid, out):
from allmydata.util import hashutil
from allmydata.util import base32
if secret:
crs = hashutil.my_renewal_secret_hash(secret)
print >>out, " client renewal secret:", base32.b2a(crs)
frs = hashutil.file_renewal_secret_hash(crs, storage_index)
print >>out, " file renewal secret:", base32.b2a(frs)
if nodeid:
renew = hashutil.bucket_renewal_secret_hash(frs, nodeid)
print >>out, " lease renewal secret:", base32.b2a(renew)
ccs = hashutil.my_cancel_secret_hash(secret)
print >>out, " client cancel secret:", base32.b2a(ccs)
fcs = hashutil.file_cancel_secret_hash(ccs, storage_index)
print >>out, " file cancel secret:", base32.b2a(fcs)
if nodeid:
cancel = hashutil.bucket_cancel_secret_hash(fcs, nodeid)
print >>out, " lease cancel secret:", base32.b2a(cancel)
def dump_uri_instance(u, nodeid, secret, out, show_header=True):
def dump_uri_instance(u, nodeid, out, show_header=True):
from allmydata import uri
from allmydata.storage.server import si_b2a
from allmydata.util import base32, hashutil
@ -505,7 +534,6 @@ def dump_uri_instance(u, nodeid, secret, out, show_header=True):
print >>out, " size:", u.size
print >>out, " k/N: %d/%d" % (u.needed_shares, u.total_shares)
print >>out, " storage index:", si_b2a(u.get_storage_index())
_dump_secrets(u.get_storage_index(), secret, nodeid, out)
elif isinstance(u, uri.CHKFileVerifierURI):
if show_header:
print >>out, "CHK Verifier URI:"
@ -531,7 +559,6 @@ def dump_uri_instance(u, nodeid, secret, out, show_header=True):
we = hashutil.ssk_write_enabler_hash(u.writekey, nodeid)
print >>out, " write_enabler:", base32.b2a(we)
print >>out
_dump_secrets(u.get_storage_index(), secret, nodeid, out)
elif isinstance(u, uri.ReadonlySSKFileURI):
if show_header:
print >>out, "SDMF Read-only URI:"
@ -556,7 +583,6 @@ def dump_uri_instance(u, nodeid, secret, out, show_header=True):
we = hashutil.ssk_write_enabler_hash(u.writekey, nodeid)
print >>out, " write_enabler:", base32.b2a(we)
print >>out
_dump_secrets(u.get_storage_index(), secret, nodeid, out)
elif isinstance(u, uri.ReadonlyMDMFFileURI):
if show_header:
print >>out, "MDMF Read-only URI:"
@ -569,45 +595,45 @@ def dump_uri_instance(u, nodeid, secret, out, show_header=True):
print >>out, " storage index:", si_b2a(u.get_storage_index())
print >>out, " fingerprint:", base32.b2a(u.fingerprint)
elif isinstance(u, uri.ImmutableDirectoryURI): # CHK-based directory
if show_header:
print >>out, "CHK Directory URI:"
dump_uri_instance(u._filenode_uri, nodeid, secret, out, False)
dump_uri_instance(u._filenode_uri, nodeid, out, False)
elif isinstance(u, uri.ImmutableDirectoryURIVerifier):
if show_header:
print >>out, "CHK Directory Verifier URI:"
dump_uri_instance(u._filenode_uri, nodeid, secret, out, False)
dump_uri_instance(u._filenode_uri, nodeid, out, False)
elif isinstance(u, uri.DirectoryURI): # SDMF-based directory
if show_header:
print >>out, "Directory Writeable URI:"
dump_uri_instance(u._filenode_uri, nodeid, secret, out, False)
dump_uri_instance(u._filenode_uri, nodeid, out, False)
elif isinstance(u, uri.ReadonlyDirectoryURI):
if show_header:
print >>out, "Directory Read-only URI:"
dump_uri_instance(u._filenode_uri, nodeid, secret, out, False)
dump_uri_instance(u._filenode_uri, nodeid, out, False)
elif isinstance(u, uri.DirectoryURIVerifier):
if show_header:
print >>out, "Directory Verifier URI:"
dump_uri_instance(u._filenode_uri, nodeid, secret, out, False)
dump_uri_instance(u._filenode_uri, nodeid, out, False)
elif isinstance(u, uri.MDMFDirectoryURI): # MDMF-based directory
if show_header:
print >>out, "Directory Writeable URI:"
dump_uri_instance(u._filenode_uri, nodeid, secret, out, False)
dump_uri_instance(u._filenode_uri, nodeid, out, False)
elif isinstance(u, uri.ReadonlyMDMFDirectoryURI):
if show_header:
print >>out, "Directory Read-only URI:"
dump_uri_instance(u._filenode_uri, nodeid, secret, out, False)
dump_uri_instance(u._filenode_uri, nodeid, out, False)
elif isinstance(u, uri.MDMFDirectoryURIVerifier):
if show_header:
print >>out, "Directory Verifier URI:"
dump_uri_instance(u._filenode_uri, nodeid, secret, out, False)
dump_uri_instance(u._filenode_uri, nodeid, out, False)
else:
print >>out, "unknown cap type"
class FindSharesOptions(BaseOptions):
def getSynopsis(self):
return "Usage: tahoe [global-opts] debug find-shares STORAGE_INDEX NODEDIRS.."
@ -622,7 +648,7 @@ class FindSharesOptions(BaseOptions):
t += """
Locate all shares for the given storage index. This command looks through one
or more node directories to find the shares. It returns a list of filenames,
one per line, for each share file found.
one per line, for the initial chunk of each share found.
tahoe debug find-shares 4vozh77tsrw7mdhnj7qvp5ky74 testgrid/node-*
@ -644,24 +670,23 @@ def find_shares(options):
/home/warner/testnet/node-1/storage/shares/44k/44kai1tui348689nrw8fjegc8c/9
/home/warner/testnet/node-2/storage/shares/44k/44kai1tui348689nrw8fjegc8c/2
"""
from allmydata.storage.server import si_a2b, storage_index_to_dir
from allmydata.util.encodingutil import listdir_unicode
from allmydata.storage.common import si_a2b, NUM_RE
from allmydata.storage.backends.disk.disk_backend import si_si2dir
from allmydata.util import fileutil
from allmydata.util.encodingutil import quote_output
out = options.stdout
sharedir = storage_index_to_dir(si_a2b(options.si_s))
for d in options.nodedirs:
d = os.path.join(d, "storage/shares", sharedir)
if os.path.exists(d):
for shnum in listdir_unicode(d):
print >>out, os.path.join(d, shnum)
si = si_a2b(options.si_s)
for nodedir in options.nodedirs:
sharedir = si_si2dir(os.path.join(nodedir, "storage", "shares"), si)
for shnumstr in fileutil.listdir(sharedir, filter=NUM_RE):
sharefile = os.path.join(sharedir, shnumstr)
print >>out, quote_output(sharefile, quotemarks=False)
return 0
class CatalogSharesOptions(BaseOptions):
"""
"""
def parseArgs(self, *nodedirs):
from allmydata.util.encodingutil import argv_to_abspath
self.nodedirs = map(argv_to_abspath, nodedirs)
@ -681,18 +706,20 @@ of each share. Run it like this:
The lines it emits will look like the following:
CHK $SI $k/$N $filesize $UEB_hash $expiration $abspath_sharefile
SDMF $SI $k/$N $filesize $seqnum/$roothash $expiration $abspath_sharefile
CHK $SI $k/$N $filesize $UEB_hash - $abspath_sharefile
SDMF $SI $k/$N $filesize $seqnum/$roothash - $abspath_sharefile
MDMF $SI $k/$N $filesize $seqnum/$roothash - $abspath_sharefile
UNKNOWN $abspath_sharefile
This command can be used to build up a catalog of shares from many storage
servers and then sort the results to compare all shares for the same file. If
you see shares with the same SI but different parameters/filesize/UEB_hash,
then something is wrong. The misc/find-share/anomalies.py script may be
useful for purpose.
useful for that purpose.
"""
return t
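For concreteness, one catalogued immutable share in the new format described by the help text above might look like the line below. Every value is hypothetical: the storage index is borrowed from the find-shares example earlier in this file, and the UEB hash and path are placeholders.
CHK 4vozh77tsrw7mdhnj7qvp5ky74 3/10 131073 <base32-ueb-hash> - /home/warner/testnet/node-1/storage/shares/4v/4vozh77tsrw7mdhnj7qvp5ky74/0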
def call(c, *args, **kwargs):
# take advantage of the fact that ImmediateReadBucketProxy returns
# Deferreds that are already fired
@ -704,32 +731,27 @@ def call(c, *args, **kwargs):
failures[0].raiseException()
return results[0]
def describe_share(abs_sharefile, si_s, shnum_s, now, out):
from allmydata import uri
from allmydata.storage.mutable import MutableShareFile
from allmydata.storage.immutable import ShareFile
from allmydata.mutable.layout import unpack_share
from allmydata.storage.backends.disk.immutable import ImmutableDiskShare
from allmydata.storage.backends.disk.mutable import MutableDiskShare
from allmydata.mutable.layout import unpack_share, MUTABLE_MAGIC, MAX_MUTABLE_SHARE_SIZE
from allmydata.mutable.common import NeedMoreDataError
from allmydata.immutable.layout import ReadBucketProxy
from allmydata.util import base32
from allmydata.util.encodingutil import quote_output
import struct
f = open(abs_sharefile, "rb")
prefix = f.read(32)
share = ChunkedShare(abs_sharefile, MAX_MUTABLE_SHARE_SIZE)
prefix = share.pread(0, len(MUTABLE_MAGIC))
if prefix == MutableShareFile.MAGIC:
# mutable share
m = MutableShareFile(abs_sharefile)
WE, nodeid = m._read_write_enabler_and_nodeid(f)
data_length = m._read_data_length(f)
expiration_time = min( [lease.expiration_time
for (i,lease) in m._enumerate_leases(f)] )
expiration = max(0, expiration_time - now)
if prefix == MUTABLE_MAGIC:
share.DATA_OFFSET = MutableDiskShare.DATA_OFFSET
WE, nodeid = MutableDiskShare._read_write_enabler_and_nodeid(share)
data_length = MutableDiskShare._read_data_length(share)
share_type = "unknown"
f.seek(m.DATA_OFFSET)
version = f.read(1)
version = share.pread(share.DATA_OFFSET, 1)
if version == "\x00":
# this slot contains an SDMF share
share_type = "SDMF"
@ -737,25 +759,23 @@ def describe_share(abs_sharefile, si_s, shnum_s, now, out):
share_type = "MDMF"
if share_type == "SDMF":
f.seek(m.DATA_OFFSET)
data = f.read(min(data_length, 2000))
data = share.pread(share.DATA_OFFSET, min(data_length, 2000))
try:
pieces = unpack_share(data)
except NeedMoreDataError, e:
# retry once with the larger size
size = e.needed_bytes
f.seek(m.DATA_OFFSET)
data = f.read(min(data_length, size))
data = share.pread(share.DATA_OFFSET, min(data_length, size))
pieces = unpack_share(data)
(seqnum, root_hash, IV, k, N, segsize, datalen,
pubkey, signature, share_hash_chain, block_hash_tree,
share_data, enc_privkey) = pieces
print >>out, "SDMF %s %d/%d %d #%d:%s %d %s" % \
print >>out, "SDMF %s %d/%d %d #%d:%s - %s" % \
(si_s, k, N, datalen,
seqnum, base32.b2a(root_hash),
expiration, quote_output(abs_sharefile))
quote_output(abs_sharefile))
elif share_type == "MDMF":
from allmydata.mutable.layout import MDMFSlotReadProxy
fake_shnum = 0
@ -764,8 +784,7 @@ def describe_share(abs_sharefile, si_s, shnum_s, now, out):
def _read(self, readvs, force_remote=False, queue=False):
data = []
for (where,length) in readvs:
f.seek(m.DATA_OFFSET+where)
data.append(f.read(length))
data.append(share.pread(share.DATA_OFFSET + where, length))
return defer.succeed({fake_shnum: data})
p = ShareDumper(None, "fake-si", fake_shnum)
@ -781,32 +800,33 @@ def describe_share(abs_sharefile, si_s, shnum_s, now, out):
verinfo = extract(p.get_verinfo)
(seqnum, root_hash, salt_to_use, segsize, datalen, k, N, prefix,
offsets) = verinfo
print >>out, "MDMF %s %d/%d %d #%d:%s %d %s" % \
print >>out, "MDMF %s %d/%d %d #%d:%s - %s" % \
(si_s, k, N, datalen,
seqnum, base32.b2a(root_hash),
expiration, quote_output(abs_sharefile))
quote_output(abs_sharefile))
else:
print >>out, "UNKNOWN mutable %s" % quote_output(abs_sharefile)
elif struct.unpack(">L", prefix[:4]) == (1,):
else:
# immutable
share.DATA_OFFSET = ImmutableDiskShare.DATA_OFFSET
#version = struct.unpack(">L", share.pread(0, struct.calcsize(">L")))
#if version != 1:
# print >>out, "UNKNOWN really-unknown %s" % quote_output(abs_sharefile)
# return
class ImmediateReadBucketProxy(ReadBucketProxy):
def __init__(self, sf):
self.sf = sf
def __init__(self, share):
self.share = share
ReadBucketProxy.__init__(self, None, None, "")
def __repr__(self):
return "<ImmediateReadBucketProxy>"
def _read(self, offset, size):
return defer.succeed(sf.read_share_data(offset, size))
return defer.maybeDeferred(self.share.pread, share.DATA_OFFSET + offset, size)
# use a ReadBucketProxy to parse the bucket and find the uri extension
sf = ShareFile(abs_sharefile)
bp = ImmediateReadBucketProxy(sf)
expiration_time = min( [lease.expiration_time
for lease in sf.get_leases()] )
expiration = max(0, expiration_time - now)
bp = ImmediateReadBucketProxy(share)
UEB_data = call(bp.get_uri_extension)
unpacked = uri.unpack_extension_readable(UEB_data)
@ -816,63 +836,54 @@ def describe_share(abs_sharefile, si_s, shnum_s, now, out):
filesize = unpacked["size"]
ueb_hash = unpacked["UEB_hash"]
print >>out, "CHK %s %d/%d %d %s %d %s" % (si_s, k, N, filesize,
ueb_hash, expiration,
quote_output(abs_sharefile))
print >>out, "CHK %s %d/%d %d %s - %s" % (si_s, k, N, filesize, ueb_hash,
quote_output(abs_sharefile))
else:
print >>out, "UNKNOWN really-unknown %s" % quote_output(abs_sharefile)
f.close()
def catalog_shares(options):
from allmydata.util.encodingutil import listdir_unicode, quote_output
from allmydata.util import fileutil
from allmydata.util.encodingutil import quote_output
out = options.stdout
err = options.stderr
now = time.time()
for d in options.nodedirs:
d = os.path.join(d, "storage/shares")
for node_dir in options.nodedirs:
shares_dir = os.path.join(node_dir, "storage", "shares")
try:
abbrevs = listdir_unicode(d)
prefixes = fileutil.listdir(shares_dir)
except EnvironmentError:
# ignore nodes that have storage turned off altogether
pass
else:
for abbrevdir in sorted(abbrevs):
if abbrevdir == "incoming":
for prefix in sorted(prefixes):
if prefix == "incoming":
continue
abbrevdir = os.path.join(d, abbrevdir)
prefix_dir = os.path.join(shares_dir, prefix)
# this tool may get run against bad disks, so we can't assume
# that listdir_unicode will always succeed. Try to catalog as much
# that fileutil.listdir will always succeed. Try to catalog as much
# as possible.
try:
sharedirs = listdir_unicode(abbrevdir)
for si_s in sorted(sharedirs):
si_dir = os.path.join(abbrevdir, si_s)
catalog_shares_one_abbrevdir(si_s, si_dir, now, out,err)
share_dirs = fileutil.listdir(prefix_dir)
for si_s in sorted(share_dirs):
si_dir = os.path.join(prefix_dir, si_s)
catalog_shareset(si_s, si_dir, now, out, err)
except:
print >>err, "Error processing %s" % quote_output(abbrevdir)
print >>err, "Error processing %s" % quote_output(prefix_dir)
failure.Failure().printTraceback(err)
return 0
def _as_number(s):
try:
return int(s)
except ValueError:
return "not int"
def catalog_shares_one_abbrevdir(si_s, si_dir, now, out, err):
from allmydata.util.encodingutil import listdir_unicode, quote_output
def catalog_shareset(si_s, si_dir, now, out, err):
from allmydata.storage.common import NUM_RE
from allmydata.util import fileutil
from allmydata.util.encodingutil import quote_output
try:
for shnum_s in sorted(listdir_unicode(si_dir), key=_as_number):
for shnum_s in sorted(fileutil.listdir(si_dir, filter=NUM_RE), key=int):
abs_sharefile = os.path.join(si_dir, shnum_s)
assert os.path.isfile(abs_sharefile)
_assert(os.path.isfile(abs_sharefile), "%r is not a file" % (abs_sharefile,))
try:
describe_share(abs_sharefile, si_s, shnum_s, now,
out)
describe_share(abs_sharefile, si_s, shnum_s, now, out)
except:
print >>err, "Error processing %s" % quote_output(abs_sharefile)
failure.Failure().printTraceback(err)
@ -880,6 +891,7 @@ def catalog_shares_one_abbrevdir(si_s, si_dir, now, out, err):
print >>err, "Error processing %s" % quote_output(si_dir)
failure.Failure().printTraceback(err)
class CorruptShareOptions(BaseOptions):
def getSynopsis(self):
return "Usage: tahoe [global-opts] debug corrupt-share SHARE_FILENAME"
@ -903,63 +915,64 @@ to flip a single random bit of the block data.
Obviously, this command should not be used in normal operation.
"""
return t
def parseArgs(self, filename):
self['filename'] = filename
def corrupt_share(options):
do_corrupt_share(options.stdout, options['filename'], options['offset'])
def do_corrupt_share(out, filename, offset="block-random"):
import random
from allmydata.storage.mutable import MutableShareFile
from allmydata.storage.immutable import ShareFile
from allmydata.mutable.layout import unpack_header
from allmydata.storage.backends.disk.immutable import ImmutableDiskShare
from allmydata.storage.backends.disk.mutable import MutableDiskShare
from allmydata.mutable.layout import unpack_header, MUTABLE_MAGIC, MAX_MUTABLE_SHARE_SIZE
from allmydata.immutable.layout import ReadBucketProxy
out = options.stdout
fn = options['filename']
assert options["offset"] == "block-random", "other offsets not implemented"
# first, what kind of share is it?
_assert(offset == "block-random", "other offsets not implemented")
def flip_bit(start, end):
offset = random.randrange(start, end)
bit = random.randrange(0, 8)
print >>out, "[%d..%d): %d.b%d" % (start, end, offset, bit)
f = open(fn, "rb+")
f.seek(offset)
d = f.read(1)
d = chr(ord(d) ^ 0x01)
f.seek(offset)
f.write(d)
f.close()
f = open(filename, "rb+")
try:
f.seek(offset)
d = f.read(1)
d = chr(ord(d) ^ 0x01)
f.seek(offset)
f.write(d)
finally:
f.close()
f = open(fn, "rb")
prefix = f.read(32)
f.close()
if prefix == MutableShareFile.MAGIC:
# mutable
m = MutableShareFile(fn)
f = open(fn, "rb")
f.seek(m.DATA_OFFSET)
data = f.read(2000)
# what kind of share is it?
share = ChunkedShare(filename, MAX_MUTABLE_SHARE_SIZE)
prefix = share.pread(0, len(MUTABLE_MAGIC))
if prefix == MUTABLE_MAGIC:
data = share.pread(MutableDiskShare.DATA_OFFSET, 2000)
# make sure this slot contains an SDMF share
assert data[0] == "\x00", "non-SDMF mutable shares not supported"
f.close()
_assert(data[0] == "\x00", "non-SDMF mutable shares not supported")
(version, ig_seqnum, ig_roothash, ig_IV, ig_k, ig_N, ig_segsize,
ig_datalen, offsets) = unpack_header(data)
assert version == 0, "we only handle v0 SDMF files"
start = m.DATA_OFFSET + offsets["share_data"]
end = m.DATA_OFFSET + offsets["enc_privkey"]
_assert(version == 0, "we only handle v0 SDMF files")
start = MutableDiskShare.DATA_OFFSET + offsets["share_data"]
end = MutableDiskShare.DATA_OFFSET + offsets["enc_privkey"]
flip_bit(start, end)
else:
# otherwise assume it's immutable
f = ShareFile(fn)
bp = ReadBucketProxy(None, None, '')
offsets = bp._parse_offsets(f.read_share_data(0, 0x24))
start = f._data_offset + offsets["data"]
end = f._data_offset + offsets["plaintext_hash_tree"]
header = share.pread(ImmutableDiskShare.DATA_OFFSET, 0x24)
offsets = bp._parse_offsets(header)
start = ImmutableDiskShare.DATA_OFFSET + offsets["data"]
end = ImmutableDiskShare.DATA_OFFSET + offsets["plaintext_hash_tree"]
flip_bit(start, end)
class ReplOptions(BaseOptions):
def getSynopsis(self):
return "Usage: tahoe [global-opts] debug repl"

View File

@ -95,7 +95,7 @@ def check_location(options, where):
stdout.write(" corrupt shares:\n")
for (serverid, storage_index, sharenum) in corrupt:
stdout.write(" %s\n" % _quote_serverid_index_share(serverid, storage_index, sharenum))
return 0;
def check(options):
@ -106,7 +106,7 @@ def check(options):
return 0
for location in options.locations:
errno = check_location(options, location)
if errno != 0:
if errno != 0:
return errno
return 0
@ -320,7 +320,6 @@ class DeepCheckStreamer(LineOnlyReceiver):
if not self.options["raw"]:
output.done()
return 0
def run(self, options):
if len(options.locations) == 0:
@ -331,7 +330,7 @@ class DeepCheckStreamer(LineOnlyReceiver):
for location in options.locations:
errno = self.deepcheck_location(options, location)
if errno != 0:
return errno
return errno
return self.rc
def deepcheck(options):

View File

@ -0,0 +1,150 @@
"""
This file contains the client-facing interface for manipulating shares, named
"Account". It implements RIStorageServer. Each Account instance contains an
owner_num that is used for all operations that touch leases. In the current
version of the code, clients will receive a special 'anonymous' instance of
this class with owner_num=0. In a future version each client will get a
different instance, with a dedicated owner_num.
"""
import time
from foolscap.api import Referenceable
from zope.interface import implements
from allmydata.interfaces import RIStorageServer
from allmydata.storage.common import si_b2a
class Account(Referenceable):
implements(RIStorageServer)
def __init__(self, owner_num, pubkey_vs, server, leasedb):
self.owner_num = owner_num
self.server = server
self._leasedb = leasedb
# for static accounts ("starter", "anonymous"), pubkey_vs is None
self.pubkey_vs = pubkey_vs
self.debug = False
def is_static(self):
return self.owner_num in (0,1)
# these methods are called by StorageServer
def get_owner_num(self):
return self.owner_num
def get_renewal_and_expiration_times(self):
renewal_time = time.time()
return (renewal_time, renewal_time + 31*24*60*60)
# immutable.BucketWriter.close() does:
# add_share(), add_or_renew_lease(), mark_share_as_stable()
# mutable writev() does:
# deleted shares: mark_share_as_going(), remove_share_and_leases()
# new shares: add_share(), add_or_renew_lease(), mark_share_as_stable()
# changed shares: change_share_space(), add_or_renew_lease()
def add_share(self, storage_index, shnum, used_space, sharetype):
if self.debug: print "ADD_SHARE", si_b2a(storage_index), shnum, used_space, sharetype
self._leasedb.add_new_share(storage_index, shnum, used_space, sharetype)
def add_or_renew_default_lease(self, storage_index, shnum):
renewal_time, expiration_time = self.get_renewal_and_expiration_times()
return self.add_or_renew_lease(storage_index, shnum, renewal_time, expiration_time)
def add_or_renew_lease(self, storage_index, shnum, renewal_time, expiration_time):
if self.debug: print "ADD_OR_RENEW_LEASE", si_b2a(storage_index), shnum
self._leasedb.add_or_renew_leases(storage_index, shnum, self.owner_num,
renewal_time, expiration_time)
def change_share_space(self, storage_index, shnum, used_space):
if self.debug: print "CHANGE_SHARE_SPACE", si_b2a(storage_index), shnum, used_space
self._leasedb.change_share_space(storage_index, shnum, used_space)
def mark_share_as_stable(self, storage_index, shnum, used_space):
if self.debug: print "MARK_SHARE_AS_STABLE", si_b2a(storage_index), shnum, used_space
self._leasedb.mark_share_as_stable(storage_index, shnum, used_space)
def mark_share_as_going(self, storage_index, shnum):
if self.debug: print "MARK_SHARE_AS_GOING", si_b2a(storage_index), shnum
self._leasedb.mark_share_as_going(storage_index, shnum)
def remove_share_and_leases(self, storage_index, shnum):
if self.debug: print "REMOVE_SHARE_AND_LEASES", si_b2a(storage_index), shnum
self._leasedb.remove_deleted_share(storage_index, shnum)
# remote_add_lease() and remote_renew_lease() do this
def add_lease_for_bucket(self, storage_index):
if self.debug: print "ADD_LEASE_FOR_BUCKET", si_b2a(storage_index)
renewal_time, expiration_time = self.get_renewal_and_expiration_times()
self._leasedb.add_or_renew_leases(storage_index, None,
self.owner_num, renewal_time, expiration_time)
# The following RIStorageServer methods are called by remote clients
def remote_get_version(self):
return self.server.client_get_version(self)
# all other RIStorageServer methods should pass through to self.server
# but add the account as a final argument.
def remote_allocate_buckets(self, storage_index, renew_secret, cancel_secret,
sharenums, allocated_size, canary):
if self.debug: print "REMOTE_ALLOCATE_BUCKETS", si_b2a(storage_index)
return self.server.client_allocate_buckets(storage_index,
sharenums, allocated_size,
canary, self)
def remote_add_lease(self, storage_index, renew_secret, cancel_secret):
if self.debug: print "REMOTE_ADD_LEASE", si_b2a(storage_index)
self.add_lease_for_bucket(storage_index)
return None
def remote_renew_lease(self, storage_index, renew_secret):
self.add_lease_for_bucket(storage_index)
return None
def remote_get_buckets(self, storage_index):
return self.server.client_get_buckets(storage_index, self)
def remote_slot_testv_and_readv_and_writev(self, storage_index, secrets,
test_and_write_vectors, read_vector):
write_enabler = secrets[0]
return self.server.client_slot_testv_and_readv_and_writev(
storage_index, write_enabler, test_and_write_vectors, read_vector, self)
def remote_slot_readv(self, storage_index, shares, readv):
return self.server.client_slot_readv(storage_index, shares, readv, self)
def remote_advise_corrupt_share(self, share_type, storage_index, shnum, reason):
return self.server.client_advise_corrupt_share(share_type, storage_index, shnum,
reason, self)
def get_account_creation_time(self):
return self._leasedb.get_account_creation_time(self.owner_num)
def get_id(self):
return self.pubkey_vs
def get_leases(self, storage_index):
return self._leasedb.get_leases(storage_index, self.owner_num)
def get_stats(self):
return self.server.get_stats()
def get_accounting_crawler(self):
return self.server.get_accounting_crawler()
def get_expiration_policy(self):
return self.server.get_expiration_policy()
def get_bucket_counter(self):
return self.server.get_bucket_counter()
def get_serverid(self):
return self.server.get_serverid()
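To make the lease bookkeeping described in the comment block earlier in this class (the immutable.BucketWriter.close() sequence) concrete, here is a minimal sketch rather than code from this patch. It assumes `account` is a live Account instance, that a SHARETYPE_IMMUTABLE constant exists in leasedb alongside SHARETYPE_MUTABLE, and uses made-up values for the storage index, share number, and size.
# Hypothetical sketch of the immutable-close sequence documented above.
si = "\x00"*16        # a fake 16-byte binary storage index
shnum = 0
used_space = 1000     # made-up size in bytes
account.add_share(si, shnum, used_space, SHARETYPE_IMMUTABLE)   # SHARETYPE_IMMUTABLE is assumed
account.add_or_renew_default_lease(si, shnum)
account.mark_share_as_stable(si, shnum, used_space)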

View File

@ -0,0 +1,65 @@
"""
This file contains the cross-account management code. It creates per-client
Account objects, as well as the "anonymous account" for use until a future
version of Tahoe-LAFS implements the FURLification dance. It also provides
usage statistics and reports for the status UI. This will also implement the
backend of the control UI (once we figure out how to express that: maybe a
CLI command, or tahoe.cfg settings, or a web frontend), for things like
enabling/disabling accounts and setting quotas.
"""
import weakref
from twisted.application import service
from allmydata.storage.leasedb import LeaseDB
from allmydata.storage.accounting_crawler import AccountingCrawler
from allmydata.storage.account import Account
class Accountant(service.MultiService):
def __init__(self, storage_server, dbfile, statefile, clock=None):
service.MultiService.__init__(self)
self._storage_server = storage_server
self._leasedb = LeaseDB(dbfile)
self._leasedb.setServiceParent(self)
self._active_accounts = weakref.WeakValueDictionary()
self._anonymous_account = Account(LeaseDB.ANONYMOUS_ACCOUNTID, None,
self._storage_server, self._leasedb)
self._starter_account = Account(LeaseDB.STARTER_LEASE_ACCOUNTID, None,
self._storage_server, self._leasedb)
crawler = AccountingCrawler(self._storage_server.backend, statefile, self._leasedb, clock=clock)
self._accounting_crawler = crawler
crawler.setServiceParent(self)
def get_leasedb(self):
return self._leasedb
def set_expiration_policy(self, policy):
self._accounting_crawler.set_expiration_policy(policy)
def get_anonymous_account(self):
return self._anonymous_account
def get_starter_account(self):
return self._starter_account
def get_accounting_crawler(self):
return self._accounting_crawler
# methods used by admin interfaces
def get_all_accounts(self):
for ownerid, pubkey_vs in self._leasedb.get_all_accounts():
if pubkey_vs in self._active_accounts:
yield self._active_accounts[pubkey_vs]
else:
yield Account(ownerid, pubkey_vs,
self._storage_server, self._leasedb)
def get_total_leased_sharecount_and_used_space(self):
return self._leasedb.get_total_leased_sharecount_and_used_space()
def get_number_of_sharesets(self):
return self._leasedb.get_number_of_sharesets()
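A minimal wiring sketch for this class, not taken from the patch: it assumes `server` is a StorageServer-like object exposing a .backend attribute (as the AccountingCrawler constructor call above requires) and acting as a Twisted service parent, and both filenames are hypothetical.
# Hypothetical set-up of an Accountant; `server` and both paths are assumptions.
accountant = Accountant(server, "leasedb.sqlite", "accounting_crawler.state")
accountant.setServiceParent(server)              # assumes `server` is a MultiService parent
anonymous = accountant.get_anonymous_account()   # the owner_num=0 account handed to remote clients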

View File

@ -0,0 +1,373 @@
import time
from twisted.internet import defer
from allmydata.util.deferredutil import for_items
from allmydata.util.assertutil import _assert
from allmydata.util import log
from allmydata.storage.crawler import ShareCrawler
from allmydata.storage.common import si_a2b
from allmydata.storage.leasedb import SHARETYPES, SHARETYPE_UNKNOWN, SHARETYPE_CORRUPTED, \
STATE_STABLE
class AccountingCrawler(ShareCrawler):
"""
I perform the following functions:
- Remove leases that are past their expiration time.
- Delete objects containing unleased shares.
- Discover shares that have been manually added to storage.
- Discover shares that are present when a storage server is upgraded from
a pre-leasedb version, and give them "starter leases".
- Recover from a situation where the leasedb is lost or detectably
corrupted. This is handled in the same way as upgrading.
- Detect shares that have unexpectedly disappeared from storage.
See ticket #1834 for a proposal to greatly reduce the scope of what I am
responsible for, and the times when I might do work.
"""
slow_start = 600 # don't start crawling for 10 minutes after startup
minimum_cycle_time = 12*60*60 # not more than twice per day
def __init__(self, backend, statefile, leasedb, clock=None):
ShareCrawler.__init__(self, backend, statefile, clock=clock)
self._leasedb = leasedb
self._enable_share_deletion = True
def process_prefix(self, cycle, prefix, start_slice):
# Assume that we can list every prefixdir in this prefix quickly.
# Otherwise we would have to retain more state between timeslices.
d = self.backend.get_sharesets_for_prefix(prefix)
def _got_sharesets(sharesets):
stored_sharemap = {} # (SI string, shnum) -> (used_space, sharetype)
d2 = defer.succeed(None)
for shareset in sharesets:
d2.addCallback(lambda ign, shareset=shareset: shareset.get_shares())
def _got_some_shares( (valid, corrupted) ):
for share in valid:
shareid = (share.get_storage_index_string(), share.get_shnum())
sharetype = SHARETYPE_UNKNOWN # FIXME
stored_sharemap[shareid] = (share.get_used_space(), sharetype)
for share in corrupted:
shareid = (share.get_storage_index_string(), share.get_shnum())
sharetype = SHARETYPE_CORRUPTED
stored_sharemap[shareid] = (share.get_used_space(), sharetype)
d2.addCallback(_got_some_shares)
d2.addCallback(lambda ign: stored_sharemap)
return d2
d.addCallback(_got_sharesets)
def _got_stored_sharemap(stored_sharemap):
# now check the database for everything in this prefix
db_sharemap = self._leasedb.get_shares_for_prefix(prefix)
rec = self.state["cycle-to-date"]["space-recovered"]
examined_sharesets = [set() for st in xrange(len(SHARETYPES))]
# The lease crawler used to calculate the lease age histogram while
# crawling shares, and tests currently rely on that, but it would be
# more efficient to maintain the histogram as leases are added,
# updated, and removed.
for key, value in db_sharemap.iteritems():
(si_s, shnum) = key
(used_space, sharetype, state) = value
examined_sharesets[sharetype].add(si_s)
for age in self._leasedb.get_lease_ages(si_a2b(si_s), shnum, start_slice):
self.add_lease_age_to_histogram(age)
self.increment(rec, "examined-shares", 1)
self.increment(rec, "examined-sharebytes", used_space)
self.increment(rec, "examined-shares-" + SHARETYPES[sharetype], 1)
self.increment(rec, "examined-sharebytes-" + SHARETYPES[sharetype], used_space)
self.increment(rec, "examined-buckets", sum([len(s) for s in examined_sharesets]))
for st in SHARETYPES:
self.increment(rec, "examined-buckets-" + SHARETYPES[st], len(examined_sharesets[st]))
stored_shares = set(stored_sharemap)
db_shares = set(db_sharemap)
# Add new shares to the DB.
new_shares = stored_shares - db_shares
for shareid in new_shares:
(si_s, shnum) = shareid
(used_space, sharetype) = stored_sharemap[shareid]
self._leasedb.add_new_share(si_a2b(si_s), shnum, used_space, sharetype)
self._leasedb.add_starter_lease(si_s, shnum)
# Remove disappeared shares from the DB. Note that only shares in STATE_STABLE
# should be considered "disappeared", since otherwise it would be possible for
# this to delete shares that are in the process of being created (see ticket #1921).
potentially_disappeared_shares = db_shares - stored_shares
for shareid in potentially_disappeared_shares:
(used_space, sharetype, state) = db_sharemap[shareid]
if state == STATE_STABLE:
(si_s, shnum) = shareid
log.msg(format="share SI=%(si_s)s shnum=%(shnum)s unexpectedly disappeared",
si_s=si_s, shnum=shnum, level=log.WEIRD)
if self._enable_share_deletion:
self._leasedb.remove_deleted_share(si_a2b(si_s), shnum)
recovered_sharesets = [set() for st in xrange(len(SHARETYPES))]
def _delete_share(ign, key, value):
(si_s, shnum) = key
(used_space, sharetype, state) = value
_assert(state == STATE_STABLE, state=state)
storage_index = si_a2b(si_s)
d3 = defer.succeed(None)
def _mark_and_delete(ign):
self._leasedb.mark_share_as_going(storage_index, shnum)
return self.backend.get_shareset(storage_index).delete_share(shnum)
d3.addCallback(_mark_and_delete)
def _deleted(ign):
self._leasedb.remove_deleted_share(storage_index, shnum)
recovered_sharesets[sharetype].add(si_s)
self.increment(rec, "actual-shares", 1)
self.increment(rec, "actual-sharebytes", used_space)
self.increment(rec, "actual-shares-" + SHARETYPES[sharetype], 1)
self.increment(rec, "actual-sharebytes-" + SHARETYPES[sharetype], used_space)
def _not_deleted(f):
log.err(format="accounting crawler could not delete share SI=%(si_s)s shnum=%(shnum)s",
si_s=si_s, shnum=shnum, failure=f, level=log.WEIRD)
try:
self._leasedb.mark_share_as_stable(storage_index, shnum)
except Exception, e:
log.err(e)
# discard the failure
d3.addCallbacks(_deleted, _not_deleted)
return d3
d2 = defer.succeed(None)
if self._enable_share_deletion:
# This only includes stable unleased shares (see ticket #1921).
unleased_sharemap = self._leasedb.get_unleased_shares_for_prefix(prefix)
d2.addCallback(lambda ign: for_items(_delete_share, unleased_sharemap))
def _inc_recovered_sharesets(ign):
self.increment(rec, "actual-buckets", sum([len(s) for s in recovered_sharesets]))
for st in SHARETYPES:
self.increment(rec, "actual-buckets-" + SHARETYPES[st], len(recovered_sharesets[st]))
d2.addCallback(_inc_recovered_sharesets)
return d2
d.addCallback(_got_stored_sharemap)
return d
# these methods are for outside callers to use
def set_expiration_policy(self, policy):
self._expiration_policy = policy
def get_expiration_policy(self):
return self._expiration_policy
def is_expiration_enabled(self):
return self._expiration_policy.is_enabled()
def db_is_incomplete(self):
# don't bother looking at the sqlite database: it's certainly not
# complete.
return self.state["last-cycle-finished"] is None
def increment(self, d, k, delta=1):
if k not in d:
d[k] = 0
d[k] += delta
def add_lease_age_to_histogram(self, age):
bin_interval = 24*60*60
bin_number = int(age/bin_interval)
bin_start = bin_number * bin_interval
bin_end = bin_start + bin_interval
k = (bin_start, bin_end)
self.increment(self.state["cycle-to-date"]["lease-age-histogram"], k, 1)
def convert_lease_age_histogram(self, lah):
# convert { (minage,maxage) : count } into [ (minage,maxage,count) ]
# since the former is not JSON-safe (JSON dictionaries must have
# string keys).
json_safe_lah = []
for k in sorted(lah):
(minage,maxage) = k
json_safe_lah.append( (minage, maxage, lah[k]) )
return json_safe_lah
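As a worked illustration of the binning and JSON-safe conversion above (all numbers are made up):
# A lease that is 2.5 days (216000 s) old falls into the third one-day bin:
#   bin_number = int(216000 / 86400) = 2, so the key is (172800, 259200).
# convert_lease_age_histogram({(0, 86400): 5, (172800, 259200): 1}) then
# returns [(0, 86400, 5), (172800, 259200, 1)], which JSON can serialize.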
def add_initial_state(self):
# we fill ["cycle-to-date"] here (even though they will be reset in
# self.started_cycle) just in case someone grabs our state before we
# get started: unit tests do this
so_far = self.create_empty_cycle_dict()
self.state.setdefault("cycle-to-date", so_far)
# in case we upgrade the code while a cycle is in progress, update
# the keys individually
for k in so_far:
self.state["cycle-to-date"].setdefault(k, so_far[k])
def create_empty_cycle_dict(self):
recovered = self.create_empty_recovered_dict()
so_far = {"corrupt-shares": [],
"space-recovered": recovered,
"lease-age-histogram": {}, # (minage,maxage)->count
}
return so_far
def create_empty_recovered_dict(self):
recovered = {}
for a in ("actual", "examined"):
for b in ("buckets", "shares", "diskbytes"):
recovered["%s-%s" % (a, b)] = 0
for st in SHARETYPES:
recovered["%s-%s-%s" % (a, b, SHARETYPES[st])] = 0
return recovered
def started_cycle(self, cycle):
self.state["cycle-to-date"] = self.create_empty_cycle_dict()
current_time = time.time()
self._expiration_policy.remove_expired_leases(self._leasedb, current_time)
def finished_cycle(self, cycle):
# add to our history state, prune old history
h = {}
start = self.state["current-cycle-start-time"]
now = time.time()
h["cycle-start-finish-times"] = (start, now)
ep = self.get_expiration_policy()
h["expiration-enabled"] = ep.is_enabled()
h["configured-expiration-mode"] = ep.get_parameters()
s = self.state["cycle-to-date"]
# state["lease-age-histogram"] is a dictionary (mapping
# (minage,maxage) tuple to a sharecount), but we report
# self.get_state()["lease-age-histogram"] as a list of
# (min,max,sharecount) tuples, because JSON can handle that better.
# We record the list-of-tuples form into the history for the same
# reason.
lah = self.convert_lease_age_histogram(s["lease-age-histogram"])
h["lease-age-histogram"] = lah
h["corrupt-shares"] = s["corrupt-shares"][:]
# note: if ["shares-recovered"] ever acquires an internal dict, this
# copy() needs to become a deepcopy
h["space-recovered"] = s["space-recovered"].copy()
self._leasedb.add_history_entry(cycle, h)
def get_state(self):
"""In addition to the crawler state described in
ShareCrawler.get_state(), I return the following keys which are
specific to the lease-checker/expirer. Note that the non-history keys
(with 'cycle' in their names) are only present if a cycle is currently
running. If the crawler is between cycles, it is appropriate to show
the latest item in the 'history' key instead. Also note that each
history item has all the data in the 'cycle-to-date' value, plus
cycle-start-finish-times.
cycle-to-date:
expiration-enabled
configured-expiration-mode
lease-age-histogram (list of (minage,maxage,sharecount) tuples)
corrupt-shares (list of (si_b32,shnum) tuples, minimal verification)
space-recovered
estimated-remaining-cycle:
# Values may be None if not enough data has been gathered to
# produce an estimate.
space-recovered
estimated-current-cycle:
# cycle-to-date plus estimated-remaining. Values may be None if
# not enough data has been gathered to produce an estimate.
space-recovered
history: maps cyclenum to a dict with the following keys:
cycle-start-finish-times
expiration-enabled
configured-expiration-mode
lease-age-histogram
corrupt-shares
space-recovered
The 'space-recovered' structure is a dictionary with the following
keys:
# 'examined' is what was looked at
examined-buckets, examined-buckets-$SHARETYPE
examined-shares, examined-shares-$SHARETYPE
examined-diskbytes, examined-diskbytes-$SHARETYPE
# 'actual' is what was deleted
actual-buckets, actual-buckets-$SHARETYPE
actual-shares, actual-shares-$SHARETYPE
actual-diskbytes, actual-diskbytes-$SHARETYPE
Note that the preferred terminology has changed since these keys
were defined; "buckets" refers to what are now called sharesets,
and "diskbytes" refers to bytes of used space on the storage backend,
which is not necessarily the disk backend.
The 'original-*' and 'configured-*' keys that were populated in
pre-leasedb versions are no longer supported.
The 'leases-per-share-histogram' is also no longer supported.
"""
progress = self.get_progress()
state = ShareCrawler.get_state(self) # does a shallow copy
state["history"] = self._leasedb.get_history()
if not progress["cycle-in-progress"]:
del state["cycle-to-date"]
return state
so_far = state["cycle-to-date"].copy()
state["cycle-to-date"] = so_far
lah = so_far["lease-age-histogram"]
so_far["lease-age-histogram"] = self.convert_lease_age_histogram(lah)
so_far["expiration-enabled"] = self._expiration_policy.is_enabled()
so_far["configured-expiration-mode"] = self._expiration_policy.get_parameters()
so_far_sr = so_far["space-recovered"]
remaining_sr = {}
remaining = {"space-recovered": remaining_sr}
cycle_sr = {}
cycle = {"space-recovered": cycle_sr}
if progress["cycle-complete-percentage"] > 0.0:
pc = progress["cycle-complete-percentage"] / 100.0
m = (1-pc)/pc
for a in ("actual", "examined"):
for b in ("buckets", "shares", "diskbytes"):
k = "%s-%s" % (a, b)
remaining_sr[k] = m * so_far_sr[k]
cycle_sr[k] = so_far_sr[k] + remaining_sr[k]
for st in SHARETYPES:
k = "%s-%s-%s" % (a, b, SHARETYPES[st])
remaining_sr[k] = m * so_far_sr[k]
cycle_sr[k] = so_far_sr[k] + remaining_sr[k]
else:
for a in ("actual", "examined"):
for b in ("buckets", "shares", "diskbytes"):
k = "%s-%s" % (a, b)
remaining_sr[k] = None
cycle_sr[k] = None
for st in SHARETYPES:
k = "%s-%s-%s" % (a, b, SHARETYPES[st])
remaining_sr[k] = None
cycle_sr[k] = None
state["estimated-remaining-cycle"] = remaining
state["estimated-current-cycle"] = cycle
return state
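To illustrate the extrapolation above with hypothetical numbers:
# If the cycle is 25% complete and 10 buckets have been examined so far,
# then pc = 0.25 and m = (1 - 0.25) / 0.25 = 3.0, so
# estimated-remaining-cycle reports 3.0 * 10 = 30 more buckets and
# estimated-current-cycle reports 10 + 30 = 40 buckets in total.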

View File

@ -0,0 +1,297 @@
from weakref import WeakValueDictionary
from twisted.application import service
from twisted.internet import defer
from allmydata.util.assertutil import precondition
from allmydata.util.deferredutil import async_iterate, gatherResults
from allmydata.storage.common import si_b2a
from allmydata.storage.bucket import BucketReader
from allmydata.storage.leasedb import SHARETYPE_MUTABLE
class Backend(service.MultiService):
def __init__(self):
service.MultiService.__init__(self)
self._lock_table = WeakValueDictionary()
def _get_lock(self, storage_index):
# Getting a shareset ensures that a lock exists for that storage_index.
# The _lock_table won't let go of an entry while the ShareSet (or any
# other objects that reference the lock) are live, or while it is locked.
lock = self._lock_table.get(storage_index, None)
if lock is None:
lock = defer.DeferredLock()
self._lock_table[storage_index] = lock
return lock
def must_use_tubid_as_permutation_seed(self):
# New backends cannot have been around before #466, and so have no backward
# compatibility requirements for permutation seeds. The disk backend overrides this.
return False
def create_container(self):
# Backends for which it is necessary to create a container should override this
# and return a Deferred that fires with something other than False when the
# container has been created.
return defer.succeed(False)
class ShareSet(object):
"""
This class implements shareset logic that could work for all backends, but
might be useful to override for efficiency.
"""
def __init__(self, storage_index, lock):
self.storage_index = storage_index
self.lock = lock
def get_storage_index(self):
return self.storage_index
def get_storage_index_string(self):
return si_b2a(self.storage_index)
def make_bucket_reader(self, account, share):
return BucketReader(account, share)
def get_shares(self):
return self.lock.run(self._locked_get_shares)
def get_share(self, shnum):
return self.lock.run(self._locked_get_share, shnum)
def delete_share(self, shnum):
return self.lock.run(self._locked_delete_share, shnum)
def testv_and_readv_and_writev(self, write_enabler,
test_and_write_vectors, read_vector,
expiration_time, account):
return self.lock.run(self._locked_testv_and_readv_and_writev, write_enabler,
test_and_write_vectors, read_vector,
expiration_time, account)
def _locked_testv_and_readv_and_writev(self, write_enabler,
test_and_write_vectors, read_vector,
expiration_time, account):
# The implementation here depends on the following helper methods,
# which must be provided by subclasses:
#
# def _clean_up_after_unlink(self):
# """clean up resources associated with the shareset after some
# shares might have been deleted"""
#
# def _create_mutable_share(self, account, shnum, write_enabler):
# """create a mutable share with the given shnum and write_enabler"""
sharemap = {}
d = self._locked_get_shares()
def _got_shares( (shares, corrupted) ):
d2 = defer.succeed(None)
for share in shares:
assert not isinstance(share, defer.Deferred), share
# XXX is it correct to ignore immutable shares? Maybe get_shares should
# have a parameter saying what type it's expecting.
if share.sharetype == "mutable":
d2.addCallback(lambda ign, share=share: share.check_write_enabler(write_enabler))
sharemap[share.get_shnum()] = share
shnums = sorted(sharemap.keys())
# if d2 does not fail, write_enabler is good for all existing shares
# now evaluate test vectors
def _check_testv(shnum):
(testv, datav, new_length) = test_and_write_vectors[shnum]
if shnum in sharemap:
d3 = sharemap[shnum].check_testv(testv)
elif shnum in corrupted:
# a corrupted share does not match any test vector
d3 = defer.succeed(False)
else:
# compare the vectors against an empty share, in which all
# reads return empty strings
d3 = defer.succeed(empty_check_testv(testv))
def _check_result(res):
if not res:
account.server.log("testv failed: [%d] %r" % (shnum, testv))
return res
d3.addCallback(_check_result)
return d3
d2.addCallback(lambda ign: async_iterate(_check_testv, test_and_write_vectors))
def _gather(testv_is_good):
# Gather the read vectors, before we do any writes. This ignores any
# corrupted shares.
d3 = gatherResults([sharemap[shnum].readv(read_vector) for shnum in shnums])
def _do_writes(reads):
read_data = {}
for i in range(len(shnums)):
read_data[shnums[i]] = reads[i]
d4 = defer.succeed(None)
if testv_is_good:
if len(set(test_and_write_vectors.keys()) & corrupted) > 0:
# XXX think of a better exception to raise
raise AssertionError("You asked to write share numbers %r of storage index %r, "
"but one or more of those is corrupt (numbers %r)"
% (list(sorted(test_and_write_vectors.keys())),
self.get_storage_index_string(),
list(sorted(corrupted))) )
# now apply the write vectors
for shnum in test_and_write_vectors:
(testv, datav, new_length) = test_and_write_vectors[shnum]
if new_length == 0:
if shnum in sharemap:
d4.addCallback(lambda ign, shnum=shnum:
sharemap[shnum].unlink())
d4.addCallback(lambda ign, shnum=shnum:
account.remove_share_and_leases(self.storage_index, shnum))
else:
if shnum not in sharemap:
# allocate a new share
d4.addCallback(lambda ign, shnum=shnum:
self._create_mutable_share(account, shnum,
write_enabler))
def _record_share(share, shnum=shnum):
sharemap[shnum] = share
account.add_share(self.storage_index, shnum, share.get_used_space(),
SHARETYPE_MUTABLE)
d4.addCallback(_record_share)
d4.addCallback(lambda ign, shnum=shnum, datav=datav, new_length=new_length:
sharemap[shnum].writev(datav, new_length))
def _update_lease(ign, shnum=shnum):
account.add_or_renew_default_lease(self.storage_index, shnum)
account.mark_share_as_stable(self.storage_index, shnum,
sharemap[shnum].get_used_space())
d4.addCallback(_update_lease)
if new_length == 0:
d4.addCallback(lambda ign: self._clean_up_after_unlink())
d4.addCallback(lambda ign: (testv_is_good, read_data))
return d4
d3.addCallback(_do_writes)
return d3
d2.addCallback(_gather)
return d2
d.addCallback(_got_shares)
return d
def readv(self, wanted_shnums, read_vector):
return self.lock.run(self._locked_readv, wanted_shnums, read_vector)
def _locked_readv(self, wanted_shnums, read_vector):
"""
Read a vector from the numbered shares in this shareset. An empty
shares list means to return data from all known shares.
@param wanted_shnums=ListOf(int)
@param read_vector=ReadVector
@return DictOf(int, ReadData): shnum -> results, with one key per share
"""
shnums = []
dreads = []
d = self._locked_get_shares()
def _got_shares( (shares, corrupted) ):
# We ignore corrupted shares.
for share in shares:
assert not isinstance(share, defer.Deferred), share
shnum = share.get_shnum()
if not wanted_shnums or shnum in wanted_shnums:
shnums.append(share.get_shnum())
dreads.append(share.readv(read_vector))
return gatherResults(dreads)
d.addCallback(_got_shares)
def _got_reads(reads):
datavs = {}
for i in range(len(shnums)):
datavs[shnums[i]] = reads[i]
return datavs
d.addCallback(_got_reads)
return d
def testv_compare(a, op, b):
assert op in ("lt", "le", "eq", "ne", "ge", "gt")
if op == "lt":
return a < b
if op == "le":
return a <= b
if op == "eq":
return a == b
if op == "ne":
return a != b
if op == "ge":
return a >= b
if op == "gt":
return a > b
# never reached
def empty_check_testv(testv):
test_good = True
for (offset, length, operator, specimen) in testv:
data = ""
if not testv_compare(data, operator, specimen):
test_good = False
break
return test_good
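A small illustration of the two helpers above (Python 2, assuming it runs in the same module so the functions are in scope; the test vectors are made up):
# An empty share satisfies a test vector that expects empty data at offset 0.
testv = [(0, 2, "eq", "")]
print empty_check_testv(testv)                   # True
# It fails a test vector that expects specific bytes to already be present.
print empty_check_testv([(0, 2, "eq", "ab")])    # False
print testv_compare("ab", "ge", "aa")            # True: plain string comparison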
# Originally from txaws.s3.model (under different class names), which was under the MIT / Expat licence.
class ContainerItem(object):
"""
An item in a listing of cloud objects.
"""
def __init__(self, key, modification_date, etag, size, storage_class,
owner=None):
self.key = key
self.modification_date = modification_date
self.etag = etag
self.size = size
self.storage_class = storage_class
self.owner = owner
def __repr__(self):
return "<ContainerItem %r>" % ({
"key": self.key,
"modification_date": self.modification_date,
"etag": self.etag,
"size": self.size,
"storage_class": self.storage_class,
"owner": self.owner,
},)
class ContainerListing(object):
def __init__(self, name, prefix, marker, max_keys, is_truncated,
contents=None, common_prefixes=None):
precondition(isinstance(is_truncated, str))
self.name = name
self.prefix = prefix
self.marker = marker
self.max_keys = max_keys
self.is_truncated = is_truncated
self.contents = contents
self.common_prefixes = common_prefixes
def __repr__(self):
return "<ContainerListing %r>" % ({
"name": self.name,
"prefix": self.prefix,
"marker": self.marker,
"max_keys": self.max_keys,
"is_truncated": self.is_truncated,
"contents": self.contents,
"common_prefixes": self.common_prefixes,
})

View File

@ -0,0 +1,189 @@
import sys
from twisted.internet import defer
from zope.interface import implements
from allmydata.interfaces import IStorageBackend, IShareSet
from allmydata.node import InvalidValueError
from allmydata.util.assertutil import _assert
from allmydata.util.dictutil import NumDict
from allmydata.util.encodingutil import quote_output
from allmydata.storage.common import si_a2b, NUM_RE, CorruptStoredShareError
from allmydata.storage.bucket import BucketWriter
from allmydata.storage.backends.base import Backend, ShareSet
from allmydata.storage.backends.cloud.immutable import ImmutableCloudShareForReading, ImmutableCloudShareForWriting
from allmydata.storage.backends.cloud.mutable import MutableCloudShare
from allmydata.storage.backends.cloud.cloud_common import get_share_key, delete_chunks
from allmydata.mutable.layout import MUTABLE_MAGIC
CLOUD_INTERFACES = ("cloud.s3", "cloud.openstack", "cloud.googlestorage", "cloud.msazure")
def get_cloud_share(container, storage_index, shnum, total_size):
key = get_share_key(storage_index, shnum)
d = container.get_object(key)
def _make_share(first_chunkdata):
if first_chunkdata.startswith(MUTABLE_MAGIC):
return MutableCloudShare(container, storage_index, shnum, total_size, first_chunkdata)
else:
# assume it's immutable
return ImmutableCloudShareForReading(container, storage_index, shnum, total_size, first_chunkdata)
d.addCallback(_make_share)
return d
def configure_cloud_backend(storedir, config):
if config.get_config("storage", "readonly", False, boolean=True):
raise InvalidValueError("[storage]readonly is not supported by the cloud backend; "
"make the container read-only instead.")
backendtype = config.get_config("storage", "backend", "disk")
if backendtype == "s3":
backendtype = "cloud.s3"
if backendtype not in CLOUD_INTERFACES:
raise InvalidValueError("%s is not supported by the cloud backend; it must be one of %s"
% (quote_output("[storage]backend = " + backendtype), CLOUD_INTERFACES) )
pkgname = "allmydata.storage.backends." + backendtype
__import__(pkgname)
container = sys.modules[pkgname].configure_container(storedir, config)
return CloudBackend(container)
class CloudBackend(Backend):
implements(IStorageBackend)
def __init__(self, container):
Backend.__init__(self)
self._container = container
# set of (storage_index, shnum) of incoming shares
self._incomingset = set()
def get_sharesets_for_prefix(self, prefix):
d = self._container.list_objects(prefix='shares/%s/' % (prefix,))
def _get_sharesets(res):
# XXX this enumerates all shares to get the set of SIs.
# Is there a way to enumerate SIs more efficiently?
si_strings = set()
for item in res.contents:
# XXX better error handling
path = item.key.split('/')
_assert(path[0:2] == ["shares", prefix], path=path, prefix=prefix)
si_strings.add(path[2])
# XXX we want this to be deterministic, so we return the sharesets sorted
# by their si_strings, but we shouldn't need to explicitly re-sort them
# because list_objects returns a sorted list.
return [self.get_shareset(si_a2b(s)) for s in sorted(si_strings)]
d.addCallback(_get_sharesets)
return d
def get_shareset(self, storage_index):
return CloudShareSet(storage_index, self._get_lock(storage_index),
self._container, self._incomingset)
def fill_in_space_stats(self, stats):
# TODO: query space usage of container if supported.
# TODO: query whether the container is read-only and set
# accepting_immutable_shares accordingly.
stats['storage_server.accepting_immutable_shares'] = 1
def get_available_space(self):
# TODO: query space usage of container if supported.
return 2**64
def create_container(self):
return self._container.create()
def list_container(self, prefix=''):
d = self._container.list_objects(prefix)
d.addCallback(lambda listing: listing.contents)
return d
class CloudShareSet(ShareSet):
implements(IShareSet)
def __init__(self, storage_index, lock, container, incomingset):
ShareSet.__init__(self, storage_index, lock)
self._container = container
self._incomingset = incomingset
self._key = get_share_key(storage_index)
def get_overhead(self):
return 0
def _locked_get_shares(self):
d = self._container.list_objects(prefix=self._key)
def _get_shares(res):
si = self.get_storage_index()
shnum_to_total_size = NumDict()
for item in res.contents:
key = item.key
_assert(key.startswith(self._key), key=key, self_key=self._key)
path = key.split('/')
if len(path) == 4:
(shnumstr, _, chunknumstr) = path[3].partition('.')
chunknumstr = chunknumstr or '0'
if NUM_RE.match(shnumstr) and NUM_RE.match(chunknumstr):
# The size is taken as the sum of sizes for all chunks, but for simplicity
# we don't check here that the individual chunk sizes match expectations.
# If they don't, that will cause an error on reading.
shnum_to_total_size.add_num(int(shnumstr), int(item.size))
return defer.DeferredList([get_cloud_share(self._container, si, shnum, total_size)
for (shnum, total_size) in shnum_to_total_size.items_sorted_by_key()],
consumeErrors=True)
d.addCallback(_get_shares)
def _got_list(outcomes):
# DeferredList gives us a list of (success, result) pairs, which we
# convert to a pair (list of shares, set of corrupt shnums).
shares = [share for (success, share) in outcomes if success]
corrupted = set([f.value.shnum for (success, f) in outcomes
if not success and isinstance(f.value, CorruptStoredShareError)])
return (shares, corrupted)
d.addCallback(_got_list)
return d
def _locked_get_share(self, shnum):
key = "%s%d" % (self._key, shnum)
d = self._container.list_objects(prefix=key)
def _get_share(res):
total_size = 0
for item in res.contents:
total_size += item.size
return get_cloud_share(self._container, self.get_storage_index(), shnum, total_size)
d.addCallback(_get_share)
return d
def _locked_delete_share(self, shnum):
key = "%s%d" % (self._key, shnum)
return delete_chunks(self._container, key)
def has_incoming(self, shnum):
return (self.get_storage_index(), shnum) in self._incomingset
def make_bucket_writer(self, account, shnum, allocated_data_length, canary):
immsh = ImmutableCloudShareForWriting(self._container, self.get_storage_index(), shnum,
allocated_data_length, self._incomingset)
d = defer.succeed(None)
d.addCallback(lambda ign: BucketWriter(account, immsh, canary))
return d
def _create_mutable_share(self, account, shnum, write_enabler):
serverid = account.server.get_serverid()
return MutableCloudShare.create_empty_share(self._container, serverid, write_enabler,
self.get_storage_index(), shnum, parent=account.server)
def _clean_up_after_unlink(self):
pass
def _get_sharedir(self):
# For use by tests, only with the mock cloud backend.
# It is OK that _get_path doesn't exist on real container objects.
return self._container._get_path(self._key)


@@ -0,0 +1,694 @@
from collections import deque
from cStringIO import StringIO
import urllib
from twisted.internet import defer, reactor, task
from twisted.python.failure import Failure
from twisted.web.error import Error
from twisted.web.client import FileBodyProducer, ResponseDone, Agent, HTTPConnectionPool
from twisted.web.http_headers import Headers
from twisted.internet.protocol import Protocol
from zope.interface import Interface, implements
from allmydata.interfaces import IShareBase
from allmydata.util import log
from allmydata.util.assertutil import precondition, _assert
from allmydata.util.deferredutil import eventually_callback, eventually_errback, eventual_chain, gatherResults
from allmydata.util.listutil import concat
from allmydata.storage.common import si_b2a, NUM_RE
# The container has keys of the form shares/$PREFIX/$STORAGEINDEX/$SHNUM.$CHUNK
def get_share_key(si, shnum=None):
sistr = si_b2a(si)
if shnum is None:
return "shares/%s/%s/" % (sistr[:2], sistr)
else:
return "shares/%s/%s/%d" % (sistr[:2], sistr, shnum)
def get_chunk_key(share_key, chunknum):
precondition(chunknum >= 0, chunknum=chunknum)
if chunknum == 0:
return share_key
else:
return "%s.%d" % (share_key, chunknum)
DEFAULT_PREFERRED_CHUNK_SIZE = 512*1024
PREFERRED_CHUNK_SIZE = DEFAULT_PREFERRED_CHUNK_SIZE
PIPELINE_DEPTH = 5
CACHED_CHUNKS = 5
ZERO_CHUNKDATA = "\x00"*PREFERRED_CHUNK_SIZE
def get_zero_chunkdata(size):
if size <= PREFERRED_CHUNK_SIZE:
return ZERO_CHUNKDATA[: size]
else:
return "\x00"*size
class IContainer(Interface):
"""
I represent a cloud container.
"""
def create():
"""
Create this container.
"""
def delete():
"""
Delete this container.
The cloud service may require the container to be empty before it can be deleted.
"""
def list_objects(prefix=''):
"""
Get a ContainerListing that lists objects in this container.
prefix: (str) limit the returned keys to those starting with prefix.
"""
def put_object(object_name, data, content_type=None, metadata={}):
"""
Put an object in this bucket.
Any existing object of the same name will be replaced.
"""
def get_object(object_name):
"""
Get an object from this container.
"""
def head_object(object_name):
"""
Retrieve object metadata only.
"""
def delete_object(object_name):
"""
Delete an object from this container.
Once deleted, there is no method to restore or undelete an object.
"""
def delete_chunks(container, share_key, from_chunknum=0):
d = container.list_objects(prefix=share_key)
def _delete(res):
def _suppress_404(f):
# trap() returns the matching exception *class*; the instance is f.value.
f.trap(container.ServiceError)
if f.value.get_error_code() != 404:
return f
d2 = defer.succeed(None)
for item in res.contents:
key = item.key
_assert(key.startswith(share_key), key=key, share_key=share_key)
path = key.split('/')
if len(path) == 4:
(_, _, chunknumstr) = path[3].partition('.')
chunknumstr = chunknumstr or "0"
if NUM_RE.match(chunknumstr) and int(chunknumstr) >= from_chunknum:
d2.addCallback(lambda ign, key=key: container.delete_object(key))
d2.addErrback(_suppress_404)
return d2
d.addCallback(_delete)
return d
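# --- Illustrative example (not part of the original module; keys assumed) ---
# For share_key "shares/aa/<si>/2", delete_chunks(container, share_key) removes
# "shares/aa/<si>/2", "shares/aa/<si>/2.1", "shares/aa/<si>/2.2", ... while
# delete_chunks(container, share_key, from_chunknum=1) leaves chunk 0
# ("shares/aa/<si>/2") in place. 404 responses for already-missing chunks are
# suppressed rather than treated as failures.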
class CloudShareBase(object):
implements(IShareBase)
"""
Attributes:
_container: (IContainer) the cloud container that stores this share
_storage_index: (str) binary storage index
_shnum: (integer) share number
_key: (str) the key prefix under which this share will be stored (no .chunknum suffix)
_data_length: (integer) length of data excluding headers and leases
_total_size: (integer) total size of the sharefile
Methods:
_discard(self): object will no longer be used; discard references to potentially large data
"""
def __init__(self, container, storage_index, shnum):
precondition(IContainer.providedBy(container), container=container)
precondition(isinstance(storage_index, str), storage_index=storage_index)
precondition(isinstance(shnum, int), shnum=shnum)
# These are always known immediately.
self._container = container
self._storage_index = storage_index
self._shnum = shnum
self._key = get_share_key(storage_index, shnum)
# Subclasses must set _data_length and _total_size.
def __repr__(self):
return ("<%s at %r key %r>" % (self.__class__.__name__, self._container, self._key,))
def get_storage_index(self):
return self._storage_index
def get_storage_index_string(self):
return si_b2a(self._storage_index)
def get_shnum(self):
return self._shnum
def get_data_length(self):
return self._data_length
def get_size(self):
return self._total_size
def get_used_space(self):
# We're not charged for any per-object overheads in supported cloud services, so
# total object data sizes are what we're interested in for statistics and accounting.
return self.get_size()
def unlink(self):
self._discard()
return delete_chunks(self._container, self._key)
def _get_path(self):
"""
When used with the mock cloud container, this returns the path of the file containing
the first chunk. For a real cloud container, it raises an error.
"""
# It is OK that _get_path doesn't exist on real cloud container objects.
return self._container._get_path(self._key)
class CloudShareReaderMixin:
"""
Attributes:
_data_length: (integer) length of data excluding headers and leases
_chunksize: (integer) size of each chunk possibly excluding the last
_cache: (ChunkCache) the cache used to read chunks
DATA_OFFSET: (integer) offset to the start-of-data from start of the sharefile
"""
def readv(self, readv):
sorted_readv = sorted(zip(readv, xrange(len(readv))))
datav = [None]*len(readv)
for (v, i) in sorted_readv:
(offset, length) = v
datav[i] = self.read_share_data(offset, length)
return gatherResults(datav)
def read_share_data(self, offset, length):
precondition(offset >= 0)
# Reads beyond the end of the data are truncated.
# Reads that start beyond the end of the data return an empty string.
seekpos = self.DATA_OFFSET + offset
actuallength = max(0, min(length, self._data_length - offset))
if actuallength == 0:
return defer.succeed("")
lastpos = seekpos + actuallength - 1
_assert(lastpos > 0, seekpos=seekpos, actuallength=actuallength, lastpos=lastpos)
start_chunknum = seekpos / self._chunksize
start_offset = seekpos % self._chunksize
last_chunknum = lastpos / self._chunksize
last_offset = lastpos % self._chunksize
_assert(start_chunknum <= last_chunknum, start_chunknum=start_chunknum, last_chunknum=last_chunknum)
parts = deque()
def _load_part(ign, chunknum):
# determine which part of this chunk we need
start = 0
end = self._chunksize
if chunknum == start_chunknum:
start = start_offset
if chunknum == last_chunknum:
end = last_offset + 1
#print "LOAD", get_chunk_key(self._key, chunknum), start, end
# d2 fires when we should continue loading the next chunk; chunkdata_d fires with the actual data.
chunkdata_d = defer.Deferred()
d2 = self._cache.get(chunknum, chunkdata_d)
if start > 0 or end < self._chunksize:
chunkdata_d.addCallback(lambda chunkdata: chunkdata[start : end])
parts.append(chunkdata_d)
return d2
d = defer.succeed(None)
for i in xrange(start_chunknum, last_chunknum + 1):
d.addCallback(_load_part, i)
d.addCallback(lambda ign: gatherResults(parts))
d.addCallback(lambda pieces: ''.join(pieces))
return d
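# --- Illustrative sketch (not part of the original module) ---
# Worked example of the chunk arithmetic above, with assumed values
# DATA_OFFSET=12, chunksize=10, and a read at offset=15 of length=12:
def _example_read_chunk_arithmetic():
    DATA_OFFSET, chunksize, offset, length = 12, 10, 15, 12
    seekpos = DATA_OFFSET + offset      # 27
    lastpos = seekpos + length - 1      # 38
    assert (seekpos / chunksize, seekpos % chunksize) == (2, 7)
    assert (lastpos / chunksize, lastpos % chunksize) == (3, 8)
    # so chunk 2 contributes its bytes [7:10] and chunk 3 its bytes [0:9],
    # 3 + 9 = 12 bytes in total.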
class CloudError(Exception):
pass
class CloudServiceError(Error):
"""
An error class similar to txaws' S3Error.
"""
def __init__(self, xml_bytes, status, message=None, response=None, request_id="", host_id=""):
Error.__init__(self, status, message, response)
self.original = xml_bytes
self.status = str(status)
self.message = str(message)
self.request_id = request_id
self.host_id = host_id
def get_error_code(self):
return self.status
def get_error_message(self):
return self.message
def parse(self, xml_bytes=""):
raise NotImplementedError
def has_error(self, errorString):
raise NotImplementedError
def get_error_codes(self):
raise NotImplementedError
def get_error_messages(self):
raise NotImplementedError
BACKOFF_SECONDS_BEFORE_RETRY = (0, 2, 10)
class CommonContainerMixin:
"""
Base class for cloud storage providers with similar APIs.
I provide a helper method for performing an operation on a cloud container that will retry up to
len(BACKOFF_SECONDS_BEFORE_RETRY) times (not including the initial try). If the initial try fails, a
single incident will be triggered after the operation has finally succeeded or failed.
Subclasses should define:
ServiceError:
An exception class with meaningful status codes that can be
filtered using _react_to_error. Other exceptions will cause
unconditional retries.
and can override:
_react_to_error(self, response_code):
Returns True if the error should be retried. May perform side effects before the retry.
(An example override is sketched just after this class.)
"""
def __init__(self, container_name, override_reactor=None):
self._container_name = container_name
self._reactor = override_reactor or reactor
self.ServiceError = CloudServiceError
def __repr__(self):
return ("<%s %r>" % (self.__class__.__name__, self._container_name,))
def _make_container_url(self, public_storage_url):
return "%s/%s" % (public_storage_url, urllib.quote(self._container_name, safe=''))
def _make_object_url(self, public_storage_url, object_name):
return "%s/%s/%s" % (public_storage_url, urllib.quote(self._container_name, safe=''),
urllib.quote(object_name))
def _react_to_error(self, response_code):
"""
The default policy is to retry on 5xx errors.
"""
return response_code >= 500 and response_code < 600
def _strip_data(self, args):
# Retain only one argument, object_name, for logging (we want to avoid logging data).
return args[:1]
def _do_request(self, description, operation, *args, **kwargs):
d = defer.maybeDeferred(operation, *args, **kwargs)
def _retry(f):
d2 = self._handle_error(f, 1, None, description, operation, *args, **kwargs)
def _trigger_incident(res):
log.msg(format="error(s) on cloud container operation: %(description)s %(arguments)s %(kwargs)s %(res)s",
arguments=self._strip_data(args), kwargs=kwargs, description=description, res=res,
level=log.WEIRD)
return res
d2.addBoth(_trigger_incident)
return d2
d.addErrback(_retry)
return d
def _handle_error(self, f, trynum, first_err_and_tb, description, operation, *args, **kwargs):
# Don't use f.getTracebackObject() since a fake traceback will not do for the 3-arg form of 'raise'.
# tb can be None (which is acceptable for 3-arg raise) if we don't have a traceback.
tb = getattr(f, 'tb', None)
fargs = f.value.args
if len(fargs) > 2 and fargs[2] and '<code>signaturedoesnotmatch</code>' in fargs[2].lower():
fargs = fargs[:2] + ("SignatureDoesNotMatch response redacted",) + fargs[3:]
args_without_data = self._strip_data(args)
msg = "try %d failed: %s %s %s" % (trynum, description, args_without_data, kwargs)
err = CloudError(msg, *fargs)
# This should not trigger an incident; we want to do that at the end.
log.msg(format="try %(trynum)d failed: %(description)s %(arguments)s %(kwargs)s %(ftype)s %(fargs)s",
trynum=trynum, arguments=args_without_data, kwargs=kwargs, description=description, ftype=str(f.value.__class__), fargs=repr(fargs),
level=log.INFREQUENT)
if first_err_and_tb is None:
first_err_and_tb = (err, tb)
if trynum > len(BACKOFF_SECONDS_BEFORE_RETRY):
# If we run out of tries, raise the error we got on the first try (which *may* have
# a more useful traceback).
(first_err, first_tb) = first_err_and_tb
raise first_err.__class__, first_err, first_tb
retry = True
if f.check(self.ServiceError):
fargs = f.value.args
if len(fargs) > 0:
retry = self._react_to_error(int(fargs[0]))
else:
retry = False
if retry:
log.msg("Rescheduling failed task for retry in %d seconds." % (BACKOFF_SECONDS_BEFORE_RETRY[trynum-1],))
d = task.deferLater(self._reactor, BACKOFF_SECONDS_BEFORE_RETRY[trynum-1], operation, *args, **kwargs)
d.addErrback(self._handle_error, trynum+1, first_err_and_tb, description, operation, *args, **kwargs)
return d
# If we get an error response for which _react_to_error says we should not retry,
# raise that error even if the request was itself a retry.
log.msg("Giving up, no retry for %s" % (err,))
raise err.__class__, err, tb
def create(self):
return self._do_request('create container', self._create)
def delete(self):
return self._do_request('delete container', self._delete)
def list_objects(self, prefix=''):
return self._do_request('list objects', self._list_objects, prefix)
def put_object(self, object_name, data, content_type='application/octet-stream', metadata={}):
return self._do_request('PUT object', self._put_object, object_name, data, content_type, metadata)
def get_object(self, object_name):
return self._do_request('GET object', self._get_object, object_name)
def head_object(self, object_name):
return self._do_request('HEAD object', self._head_object, object_name)
def delete_object(self, object_name):
return self._do_request('DELETE object', self._delete_object, object_name)
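# --- Illustrative sketch (not part of the original module) ---
# Example of the _react_to_error override mentioned in the class docstring.
# The 429 status code here is an assumed example of a retryable throttling error.
class _ExampleThrottlingAwareMixin(CommonContainerMixin):
    def _react_to_error(self, response_code):
        return (response_code == 429 or
                CommonContainerMixin._react_to_error(self, response_code))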
class ContainerListMixin:
"""
S3 has a limitation of 1000 object entries returned on each list (GET Bucket) request.
I provide a helper method to repeat the call as many times as necessary to get a full
listing. The container is assumed to implement:
def list_some_objects(self, **kwargs):
# kwargs may include 'prefix' and 'marker' parameters as documented at
# <http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTBucketGET.html>.
# returns Deferred ContainerListing
Note that list_some_objects is assumed to be reliable; so, if retries are needed,
the container class should also inherit from CommonContainerMixin and list_some_objects
should make the request via _do_request.
The 'delimiter' parameter of the GET Bucket API is not supported.
(A worked pagination example is sketched just after this class.)
"""
def list_objects(self, prefix=''):
kwargs = {'prefix': prefix}
all_contents = deque()
def _list_some():
d2 = self.list_some_objects(**kwargs)
def _got_listing(res):
all_contents.append(res.contents)
if res.is_truncated == "true":
_assert(len(res.contents) > 0)
marker = res.contents[-1].key
_assert('marker' not in kwargs or marker > kwargs['marker'],
"Not making progress in list_objects", kwargs=kwargs, marker=marker)
kwargs['marker'] = marker
return _list_some()
else:
_assert(res.is_truncated == "false", is_truncated=res.is_truncated)
return res
d2.addCallback(_got_listing)
return d2
d = _list_some()
d.addCallback(lambda res: res.__class__(res.name, res.prefix, res.marker, res.max_keys,
"false", concat(all_contents)))
def _log(f):
log.msg(f, level=log.WEIRD)
return f
d.addErrback(_log)
return d
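# --- Illustrative note (not part of the original module) ---
# Worked pagination example, assuming a page size of 2 and keys k1..k5 under
# the requested prefix p. list_some_objects is called three times:
#   {'prefix': p}               -> [k1, k2], is_truncated "true"
#   {'prefix': p, 'marker': k2} -> [k3, k4], is_truncated "true"
#   {'prefix': p, 'marker': k4} -> [k5],     is_truncated "false"
# and list_objects fires with a single listing whose contents are the
# concatenation [k1, k2, k3, k4, k5].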
class BackpressurePipeline(object):
"""
I manage a pipeline of Deferred operations that allows the data source to feel backpressure
when the pipeline is "full". I do not actually limit the number of operations in progress.
"""
OPEN = 0
CLOSING = 1
CLOSED = 2
def __init__(self, capacity):
self._capacity = capacity # how full we can be before causing calls to 'add' to block
self._gauge = 0 # how full we are
self._waiting = [] # callers of add() who are blocked
self._unfinished = 0 # number of pending operations
self._result_d = defer.Deferred()
self._state = self.OPEN
def add(self, _size, _func, *args, **kwargs):
if self._state == self.CLOSED:
msg = "add() called on closed BackpressurePipeline"
log.err(msg, level=log.WEIRD)
def _already_closed(): raise AssertionError(msg)
return defer.execute(_already_closed)
self._gauge += _size
self._unfinished += 1
fd = defer.maybeDeferred(_func, *args, **kwargs)
fd.addBoth(self._call_finished, _size)
fd.addErrback(log.err, "BackpressurePipeline._call_finished raised an exception")
if self._gauge < self._capacity:
return defer.succeed(None)
d = defer.Deferred()
self._waiting.append(d)
return d
def fail(self, f):
if self._state != self.CLOSED:
self._state = self.CLOSED
eventually_errback(self._result_d)(f)
def flush(self):
if self._state == self.CLOSED:
return defer.succeed(self._result_d)
d = self.close()
d.addBoth(self.reopen)
return d
def close(self):
if self._state != self.CLOSED:
if self._unfinished == 0:
self._state = self.CLOSED
eventually_callback(self._result_d)(None)
else:
self._state = self.CLOSING
return self._result_d
def reopen(self, res=None):
_assert(self._state == self.CLOSED, state=self._state)
self._result_d = defer.Deferred()
self._state = self.OPEN
return res
def _call_finished(self, res, size):
self._unfinished -= 1
self._gauge -= size
if isinstance(res, Failure):
self.fail(res)
if self._state == self.CLOSING:
# repeat the unfinished == 0 check
self.close()
if self._state == self.CLOSED:
while self._waiting:
d = self._waiting.pop(0)
eventual_chain(self._result_d, d)
elif self._gauge < self._capacity:
while self._waiting:
d = self._waiting.pop(0)
eventually_callback(d)(None)
return None
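# --- Illustrative usage sketch (not part of the original module) ---
# The container, keys and data below are assumed. With capacity 2, the second
# add() returns a Deferred that does not fire until enough earlier operations
# complete to bring the gauge back under capacity:
#   pipeline = BackpressurePipeline(capacity=2)
#   d1 = pipeline.add(1, container.put_object, key1, data1)  # fires immediately
#   d2 = pipeline.add(1, container.put_object, key2, data2)  # gauge == capacity, so waits
#   d3 = pipeline.close()   # fires once every queued operation has finished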
class ChunkCache(object):
"""I cache chunks for a specific share object."""
def __init__(self, container, key, chunksize, cached_chunks=CACHED_CHUNKS, initial_cachemap={}):
self._container = container
self._key = key
self._chunksize = chunksize
self._cached_chunks = cached_chunks
# chunknum -> deferred data
self._cachemap = initial_cachemap
self._lru = deque(sorted(initial_cachemap.keys()))
self._pipeline = BackpressurePipeline(PIPELINE_DEPTH)
def _load_chunk(self, chunknum, chunkdata_d):
d = self._container.get_object(get_chunk_key(self._key, chunknum))
eventual_chain(source=d, target=chunkdata_d)
return d
def _discard(self):
while len(self._lru) > self._cached_chunks:
self.flush_chunk(self._lru.popleft())
def get(self, chunknum, result_d):
if chunknum in self._cachemap:
# cache hit; never stall
self._lru.remove(chunknum) # takes O(cached_chunks) time, but that's fine
self._lru.append(chunknum)
eventual_chain(source=self._cachemap[chunknum], target=result_d)
return defer.succeed(None)
# cache miss; stall when the pipeline is full
chunkdata_d = defer.Deferred()
d = self._pipeline.add(1, self._load_chunk, chunknum, chunkdata_d)
def _check(res):
_assert(res is not None)
return res
chunkdata_d.addCallback(_check)
self._cachemap[chunknum] = chunkdata_d
self._lru.append(chunknum)
self._discard()
eventual_chain(source=chunkdata_d, target=result_d)
return d
def flush_chunk(self, chunknum):
if chunknum in self._cachemap:
del self._cachemap[chunknum]
def close(self):
self._cachemap = None
return self._pipeline.close()
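# --- Illustrative usage sketch (not part of the original module; the
# container object and key are assumed) ---
# get() takes the Deferred that will receive the chunk data and returns a
# separate Deferred that fires when it is safe to request the next chunk:
#   cache = ChunkCache(container, "shares/aa/<si>/3", chunksize=PREFERRED_CHUNK_SIZE)
#   chunkdata_d = defer.Deferred()
#   ready_d = cache.get(0, chunkdata_d)
#   chunkdata_d.addCallback(lambda data: len(data))  # fires with the bytes of chunk 0
# A cache hit re-fires from the cached Deferred without stalling; a miss goes
# through the BackpressurePipeline and may evict the least recently used chunk.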
class Discard(Protocol):
# see http://twistedmatrix.com/trac/ticket/5488
def makeConnection(self, producer):
producer.stopProducing()
class DataCollector(Protocol):
def __init__(self, ServiceError):
self._data = deque()
self._done = defer.Deferred()
self.ServiceError = ServiceError
def dataReceived(self, bytes):
self._data.append(bytes)
def connectionLost(self, reason):
if reason.check(ResponseDone):
eventually_callback(self._done)("".join(self._data))
else:
def _failed(): raise self.ServiceError(None, 0, message=reason.getErrorMessage())
eventually_errback(self._done)(defer.execute(_failed))
def when_done(self):
"""CAUTION: this always returns the same Deferred."""
return self._done
class QuieterFileBodyProducer(FileBodyProducer):
"""
Workaround for a minor bug in Twisted: losing a connection may result in stopProducing
being called twice, causing a spurious unhandled TaskStopped exception to be logged.
"""
def stopProducing(self):
try:
FileBodyProducer.stopProducing(self)
except task.TaskStopped:
log.msg("ignoring a harmless TaskStopped exception", level=log.OPERATIONAL)
class HTTPClientMixin:
"""
I implement helper methods for making HTTP requests and getting response headers.
Subclasses should define:
_agent:
The instance of twisted.web.client.Agent to be used.
USER_AGENT:
User agent string.
ServiceError:
The error class to trap (CloudServiceError or similar).
"""
def _init_agent(self):
pool = HTTPConnectionPool(self._reactor)
pool.maxPersistentPerHost = 20
self._agent = Agent(self._reactor, connectTimeout=10, pool=pool)
def _http_request(self, what, method, url, request_headers, body=None, need_response_body=False):
# Agent.request adds a Host header automatically based on the URL.
request_headers['User-Agent'] = [self.USER_AGENT]
if body is None:
bodyProducer = None
else:
bodyProducer = QuieterFileBodyProducer(StringIO(body))
# We don't need to explicitly set Content-Length because FileBodyProducer knows the length
# (and if we do it won't work, because in that case Content-Length would be duplicated).
log.msg(format="%(what)s request: %(method)s %(url)s %(header_keys)s",
what=what, method=method, url=url, header_keys=repr(request_headers.keys()), level=log.OPERATIONAL)
d = defer.maybeDeferred(self._agent.request, method, url, Headers(request_headers), bodyProducer)
def _got_response(response):
log.msg(format="%(what)s response for %(url)s: %(code)d %(phrase)s",
what=what, url=url, code=response.code, phrase=response.phrase, level=log.OPERATIONAL)
if response.code < 200 or response.code >= 300:
response.deliverBody(Discard())
raise self.ServiceError(None, response.code,
message="unexpected response code %r %s" % (response.code, response.phrase))
if need_response_body:
collector = DataCollector(self.ServiceError)
response.deliverBody(collector)
d2 = collector.when_done()
d2.addCallback(lambda body: (response, body))
return d2
else:
response.deliverBody(Discard())
return (response, None)
d.addCallback(_got_response)
return d
def _get_header(self, response, name):
# getRawHeaders returns None when the header is absent, so pass an empty default.
hs = response.headers.getRawHeaders(name, [])
if len(hs) == 0:
raise self.ServiceError(None, response.code,
message="missing response header %r" % (name,))
return hs[0]


@@ -0,0 +1,3 @@
from allmydata.storage.backends.cloud.googlestorage.googlestorage_container import configure_googlestorage_container
configure_container = configure_googlestorage_container


@@ -0,0 +1,276 @@
"""
This requires the oauth2client library:
http://code.google.com/p/google-api-python-client/downloads/list
"""
import urllib
try:
from xml.etree import cElementTree as ElementTree
ElementTree # hush pyflakes
except ImportError:
from xml.etree import ElementTree
# Maybe we can make a thing that looks like httplib2.Http but actually uses
# Twisted?
import httplib2
from twisted.internet.defer import DeferredLock
from twisted.internet.threads import deferToThread
try:
from oauth2client.client import SignedJwtAssertionCredentials
SignedJwtAssertionCredentials # hush pyflakes
oauth2client_available = True
except ImportError:
oauth2client_available = False
SignedJwtAssertionCredentials = None
from zope.interface import implements
from allmydata.util import log
from allmydata.storage.backends.base import ContainerItem, ContainerListing
from allmydata.storage.backends.cloud.cloud_common import IContainer, \
CommonContainerMixin, HTTPClientMixin
class AuthenticationClient(object):
"""
Retrieve access tokens for the Google Storage API, using OAuth 2.0.
See https://developers.google.com/accounts/docs/OAuth2ServiceAccount for
more details.
"""
def __init__(self, account_name, private_key, private_key_password='notasecret',
_credentialsClass=SignedJwtAssertionCredentials,
_deferToThread=deferToThread):
# Google ships pkcs12 private keys encrypted with "notasecret" as the
# password. In order for automated running to work we'd need to
# include the password in the config file, so it adds no extra
security even if someone chooses a different password. So it seems
# simplest to hardcode it for now and it'll work with unmodified
# private keys issued by Google.
self._credentials = _credentialsClass(
account_name, private_key,
"https://www.googleapis.com/auth/devstorage.read_write",
private_key_password = private_key_password,
)
self._deferToThread = _deferToThread
self._need_first_auth = True
self._lock = DeferredLock()
# Get initial token:
self._refresh_if_necessary(force=True)
def _refresh_if_necessary(self, force=False):
"""
Get a new authorization token, if necessary.
"""
def run():
if force or self._credentials.access_token_expired:
# Generally using a task-specific thread pool is better than using
# the reactor one. However, this particular call will only run
# once an hour, so it's not likely to tie up all the threads.
log.msg("Reauthenticating against Google Cloud Storage.")
def finished(result):
log.msg("Done reauthenticating against Google Cloud Storage.")
return result
d = self._deferToThread(self._credentials.refresh, httplib2.Http())
d.addBoth(finished)
return d
return self._lock.run(run)
def get_authorization_header(self):
"""
Return a Deferred that fires with the value to use for the
Authorization header in HTTP requests.
"""
d = self._refresh_if_necessary()
def refreshed(ignore):
headers = {}
self._credentials.apply(headers)
result = headers['Authorization']
# The value was bytes in oauth2client 1.0, unicode in 1.1, maybe
# they'll change it again in 1.2...
if isinstance(result, unicode):
result = result.encode("ascii")
return result
d.addCallback(refreshed)
return d
class GoogleStorageContainer(CommonContainerMixin, HTTPClientMixin):
implements(IContainer)
USER_AGENT = "Tahoe-LAFS Google Storage client"
URI = "https://storage.googleapis.com"
NAMESPACE = "{http://doc.s3.amazonaws.com/2006-03-01}"
# I can't get Google to actually use their own namespace?!
#NAMESPACE="{http://doc.storage.googleapis.com/2010-04-03}"
def __init__(self, auth_client, project_id, bucket_name, override_reactor=None):
CommonContainerMixin.__init__(self, bucket_name, override_reactor)
self._init_agent()
self._auth_client = auth_client
self._project_id = project_id # Only need for bucket creation/deletion
def _react_to_error(self, response_code):
if response_code >= 400 and response_code < 500:
# Unauthorized/forbidden/etc.: we should retry; reauthentication will
# eventually happen before a later attempt:
return True
else:
return CommonContainerMixin._react_to_error(self, response_code)
def _get_object(self, object_name):
"""
Get an object from this container.
"""
d = self._auth_client.get_authorization_header()
def _do_get(auth_header):
request_headers = {
'Authorization': [auth_header],
"x-goog-api-version": ["2"],
}
url = self._make_object_url(self.URI, object_name)
return self._http_request("Google Storage GET object", 'GET', url, request_headers,
body=None,
need_response_body=True)
d.addCallback(_do_get)
d.addCallback(lambda (response, body): body)
return d
def _delete_object(self, object_name):
"""
Delete an object from this container.
"""
d = self._auth_client.get_authorization_header()
def _do_delete(auth_header):
request_headers = {
'Authorization': [auth_header],
"x-goog-api-version": ["2"],
}
url = self._make_object_url(self.URI, object_name)
return self._http_request("Google Storage DELETE object", 'DELETE', url, request_headers,
body=None,
need_response_body=False)
d.addCallback(_do_delete)
d.addCallback(lambda (response, body): body)
return d
def _put_object(self, object_name, data, content_type, metadata):
"""
Put an object into this container.
"""
d = self._auth_client.get_authorization_header()
def _do_put(auth_header):
request_headers = {
'Authorization': [auth_header],
"x-goog-api-version": ["2"],
"Content-Type": [content_type],
}
for key, value in metadata.items():
request_headers["x-goog-meta-" + key] = [value]
url = self._make_object_url(self.URI, object_name)
return self._http_request("Google Storage PUT object", 'PUT', url, request_headers,
body=data,
need_response_body=False)
d.addCallback(_do_put)
d.addCallback(lambda (response, body): body)
return d
def _parse_item(self, element):
"""
Parse a <Contents> XML element into a ContainerItem.
"""
key = element.find(self.NAMESPACE + "Key").text
last_modified = element.find(self.NAMESPACE + "LastModified").text
etag = element.find(self.NAMESPACE + "ETag").text
size = int(element.find(self.NAMESPACE + "Size").text)
storage_class = "STANDARD"
owner = None # Don't bother parsing this at the moment
return ContainerItem(key, last_modified, etag, size, storage_class,
owner)
def _parse_list(self, data, prefix):
"""
Parse the XML response, converting it into a ContainerListing.
"""
name = self._container_name
marker = None
max_keys = None
is_truncated = "false"
common_prefixes = []
contents = []
# Sigh.
ns_len = len(self.NAMESPACE)
root = ElementTree.fromstring(data)
if root.tag != self.NAMESPACE + "ListBucketResult":
raise ValueError("Unknown root XML element %s" % (root.tag,))
for element in root:
tag = element.tag[ns_len:]
if tag == "Marker":
marker = element.text
elif tag == "IsTruncated":
is_truncated = element.text
elif tag == "Contents":
contents.append(self._parse_item(element))
elif tag == "CommonPrefixes":
common_prefixes.append(element.find(self.NAMESPACE + "Prefix").text)
return ContainerListing(name, prefix, marker, max_keys, is_truncated,
contents, common_prefixes)
def _list_objects(self, prefix):
"""
List objects in this container with the given prefix.
"""
d = self._auth_client.get_authorization_header()
def _do_list(auth_header):
request_headers = {
'Authorization': [auth_header],
"x-goog-api-version": ["2"],
"x-goog-project-id": [self._project_id],
}
url = self._make_container_url(self.URI)
url += "?prefix=" + urllib.quote(prefix, safe='')
return self._http_request("Google Storage list objects", 'GET', url, request_headers,
body=None,
need_response_body=True)
d.addCallback(_do_list)
d.addCallback(lambda (response, body): self._parse_list(body, prefix))
return d
def configure_googlestorage_container(storedir, config):
"""
Configure the Google Cloud Storage container.
"""
account_email = config.get_config("storage", "googlestorage.account_email")
private_key = config.get_private_config("googlestorage_private_key")
bucket_name = config.get_config("storage", "googlestorage.bucket")
# Only necessary if we do bucket creation/deletion, otherwise can be
# removed:
project_id = config.get_config("storage", "googlestorage.project_id")
authclient = AuthenticationClient(account_email, private_key)
return GoogleStorageContainer(authclient, project_id, bucket_name)
if __name__ == '__main__':
from twisted.internet import reactor
import sys
auth = AuthenticationClient(sys.argv[1], file(sys.argv[2]).read())
gsc = GoogleStorageContainer(auth, sys.argv[3], sys.argv[4])
def println(result):
for item in result.contents:
print "Bucket has key", item.key
reactor.stop()
def gotAuth(value):
gsc.list_objects().addCallback(println)
auth.get_authorization_header().addCallback(gotAuth)
reactor.run()


@@ -0,0 +1,190 @@
import struct
from cStringIO import StringIO
from twisted.internet import defer
from zope.interface import implements
from allmydata.interfaces import IShareForReading, IShareForWriting
from allmydata.util.assertutil import precondition, _assert
from allmydata.storage.common import CorruptStoredShareError, UnknownImmutableContainerVersionError, \
DataTooLargeError
from allmydata.storage.backends.cloud import cloud_common
from allmydata.storage.backends.cloud.cloud_common import get_chunk_key, \
BackpressurePipeline, ChunkCache, CloudShareBase, CloudShareReaderMixin
# Each share file (stored in the chunks with keys 'shares/$PREFIX/$STORAGEINDEX/$SHNUM.$CHUNK')
# contains lease information [currently inaccessible] and share data. The share data is
# accessed by RIBucketWriter.write and RIBucketReader.read .
# The share file has the following layout:
# 0x00: share file version number, four bytes, current version is 1
# 0x04: always zero (was share data length prior to Tahoe-LAFS v1.3.0)
# 0x08: number of leases, four bytes big-endian
# 0x0c: beginning of share data (see immutable.layout.WriteBucketProxy)
# data_length + 0x0c: first lease. Each lease record is 72 bytes. (not used)
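# --- Illustrative check of the layout above (not part of the original module) ---
assert struct.calcsize(">LLL") == 12   # header ends at 0x0c, where share data begins
# so an immutable cloud share holding N bytes of share data occupies 12 + N bytes
# in total; this backend never appends lease records.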
class ImmutableCloudShareMixin:
sharetype = "immutable"
LEASE_SIZE = struct.calcsize(">L32s32sL") # for compatibility
HEADER = ">LLL"
HEADER_SIZE = struct.calcsize(HEADER)
DATA_OFFSET = HEADER_SIZE
class ImmutableCloudShareForWriting(CloudShareBase, ImmutableCloudShareMixin):
implements(IShareForWriting)
def __init__(self, container, storage_index, shnum, allocated_data_length, incomingset):
"""
I won't allow more than allocated_data_length to be written to me.
"""
precondition(isinstance(allocated_data_length, (int, long)), allocated_data_length)
CloudShareBase.__init__(self, container, storage_index, shnum)
self._chunksize = cloud_common.PREFERRED_CHUNK_SIZE
self._allocated_data_length = allocated_data_length
self._buf = StringIO()
# The second field, which was the four-byte share data length in
# Tahoe-LAFS versions prior to 1.3.0, is not used; we always write 0.
# We also write 0 for the number of leases.
self._buf.write(struct.pack(self.HEADER, 1, 0, 0) )
self._set_size(self._buf.tell())
self._current_chunknum = 0
self._incomingset = incomingset
self._incomingset.add( (storage_index, shnum) )
self._pipeline = BackpressurePipeline(cloud_common.PIPELINE_DEPTH)
def _set_size(self, size):
self._total_size = size
self._data_length = size - self.DATA_OFFSET # no leases
def get_allocated_data_length(self):
return self._allocated_data_length
def write_share_data(self, offset, data):
"""Write 'data' at position 'offset' past the end of the header."""
seekpos = self.DATA_OFFSET + offset
precondition(seekpos >= self._total_size, offset=offset, seekpos=seekpos, total_size=self._total_size)
if offset + len(data) > self._allocated_data_length:
raise DataTooLargeError(self._shnum, self._allocated_data_length, offset, len(data))
self._set_size(self._total_size + len(data))
return self._store_or_buffer( (seekpos, data, 0) )
def close(self):
chunkdata = self._buf.getvalue()
self._discard()
d = self._pipeline_store_next_chunk(chunkdata)
d.addCallback(lambda ign: self._pipeline.close())
return d
def _store_or_buffer(self, (seekpos, b, b_offset) ):
"""
Helper method that stores the next complete chunk to the container or buffers
an incomplete chunk. The data still to be written is b[b_offset:], but we may
only process part of it in this call.
"""
chunknum = seekpos / self._chunksize
offset_in_chunk = seekpos % self._chunksize
_assert(chunknum >= self._current_chunknum, seekpos=seekpos, chunknum=chunknum,
current_chunknum=self._current_chunknum)
if chunknum > self._current_chunknum or offset_in_chunk + (len(b) - b_offset) >= self._chunksize:
if chunknum > self._current_chunknum:
# The write left a gap that spans a chunk boundary. Fill with zeroes to the end
# of the current chunk and store it.
# TODO: test this case
self._buf.seek(self._chunksize - 1)
self._buf.write("\x00")
else:
# Store a complete chunk.
writelen = self._chunksize - offset_in_chunk
self._buf.seek(offset_in_chunk)
self._buf.write(b[b_offset : b_offset + writelen])
seekpos += writelen
b_offset += writelen
chunkdata = self._buf.getvalue()
self._buf = StringIO()
_assert(len(chunkdata) == self._chunksize, len_chunkdata=len(chunkdata), chunksize=self._chunksize)
d2 = self._pipeline_store_next_chunk(chunkdata)
d2.addCallback(lambda ign: self._store_or_buffer( (seekpos, b, b_offset) ))
return d2
else:
# Buffer an incomplete chunk.
if b_offset > 0:
b = b[b_offset :]
self._buf.seek(offset_in_chunk)
self._buf.write(b)
return defer.succeed(None)
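# --- Worked example of the buffering above (not part of the original module;
# assumes a chunksize of 16 bytes rather than the real PREFERRED_CHUNK_SIZE) ---
# After __init__, _buf holds the 12-byte header. A write_share_data(0, <10 bytes>)
# call arrives at seekpos 12: the first 4 bytes complete chunk 0, which is stored
# via the pipeline, and the remaining 6 bytes are buffered at the start of chunk 1
# until either another write completes that chunk or close() flushes it.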
def _pipeline_store_next_chunk(self, chunkdata):
chunkkey = get_chunk_key(self._key, self._current_chunknum)
self._current_chunknum += 1
#print "STORING", chunkkey, len(chunkdata)
# We'd like to stream writes, but the supported service containers
# (and the IContainer interface) don't support that yet. For txaws, see
# https://bugs.launchpad.net/txaws/+bug/767205 and
# https://bugs.launchpad.net/txaws/+bug/783801
return self._pipeline.add(1, self._container.put_object, chunkkey, chunkdata)
def _discard(self):
self._buf = None
self._incomingset.discard( (self.get_storage_index(), self.get_shnum()) )
class ImmutableCloudShareForReading(CloudShareBase, ImmutableCloudShareMixin, CloudShareReaderMixin):
implements(IShareForReading)
def __init__(self, container, storage_index, shnum, total_size, first_chunkdata):
CloudShareBase.__init__(self, container, storage_index, shnum)
precondition(isinstance(total_size, (int, long)), total_size=total_size)
precondition(isinstance(first_chunkdata, str), type(first_chunkdata))
precondition(len(first_chunkdata) <= total_size, len_first_chunkdata=len(first_chunkdata), total_size=total_size)
chunksize = len(first_chunkdata)
if chunksize < self.HEADER_SIZE:
msg = "%r had incomplete header (%d bytes)" % (self, chunksize)
raise UnknownImmutableContainerVersionError(shnum, msg)
self._total_size = total_size
self._chunksize = chunksize
initial_cachemap = {0: defer.succeed(first_chunkdata)}
self._cache = ChunkCache(container, self._key, chunksize, initial_cachemap=initial_cachemap)
#print "ImmutableCloudShareForReading", total_size, chunksize, self._key
header = first_chunkdata[:self.HEADER_SIZE]
try:
(version, unused, num_leases) = struct.unpack(self.HEADER, header)
except struct.error, e:
raise CorruptStoredShareError(shnum, "invalid immutable share header for shnum %d: %s" % (shnum, e))
if version != 1:
msg = "%r had version %d but we wanted 1" % (self, version)
raise UnknownImmutableContainerVersionError(shnum, msg)
# We cannot write leases in share files, but allow them to be present
# in case a share file is copied from a disk backend, or in case we
# need them in future.
self._data_length = total_size - self.DATA_OFFSET - (num_leases * self.LEASE_SIZE)
if self._data_length < 0:
raise CorruptStoredShareError(shnum, "calculated data length for shnum %d is %d" % (shnum, self._data_length))
# Boilerplate is in CloudShareBase, read implementation is in CloudShareReaderMixin.
# So nothing to implement here. Yay!
def _discard(self):
pass


@@ -0,0 +1,136 @@
import os.path
from twisted.internet import defer, reactor
from allmydata.util.deferredutil import async_iterate
from zope.interface import implements
from allmydata.util.assertutil import _assert
from allmydata.storage.backends.base import ContainerItem, ContainerListing
from allmydata.storage.backends.cloud.cloud_common import IContainer, \
CloudServiceError, CommonContainerMixin, ContainerListMixin
from allmydata.util.time_format import iso_utc
from allmydata.util import fileutil
MAX_KEYS = 1000
def configure_mock_cloud_backend(storedir, config):
from allmydata.storage.backends.cloud.cloud_backend import CloudBackend
container = MockContainer(storedir)
return CloudBackend(container)
def _not_implemented():
raise NotImplementedError()
def hook_create_container():
return defer.execute(_not_implemented)
class MockContainer(ContainerListMixin, CommonContainerMixin):
implements(IContainer)
"""
I represent a mock cloud container that stores its data in the local filesystem.
I also keep track of the number of loads and stores.
"""
def __init__(self, storagedir):
self._storagedir = storagedir
self.container_name = "MockContainer"
self.ServiceError = CloudServiceError
self._load_count = 0
self._store_count = 0
self._reactor = reactor
fileutil.make_dirs(os.path.join(self._storagedir, "shares"))
def __repr__(self):
return ("<%s at %r>" % (self.__class__.__name__, self._storagedir,))
def _create(self):
return hook_create_container()
def _delete(self):
return defer.execute(_not_implemented)
def _iterate_dirs(self):
shares_dir = os.path.join(self._storagedir, "shares")
for prefixstr in sorted(fileutil.listdir(shares_dir)):
prefixkey = "shares/%s" % (prefixstr,)
prefixdir = os.path.join(shares_dir, prefixstr)
for sistr in sorted(fileutil.listdir(prefixdir)):
sikey = "%s/%s" % (prefixkey, sistr)
sidir = os.path.join(prefixdir, sistr)
for shnumstr in sorted(fileutil.listdir(sidir)):
sharefile = os.path.join(sidir, shnumstr)
yield (sharefile, "%s/%s" % (sikey, shnumstr))
def list_some_objects(self, **kwargs):
return self._do_request('list objects', self._list_some_objects, **kwargs)
def _list_some_objects(self, prefix='', marker=None, max_keys=None):
if max_keys is None:
max_keys = MAX_KEYS
contents = []
def _next_share(res):
if res is None:
return
(sharefile, sharekey) = res
# note that all strings are > None
if sharekey.startswith(prefix) and sharekey > marker:
stat_result = os.stat(sharefile)
mtime_utc = iso_utc(stat_result.st_mtime, sep=' ')+'+00:00'
item = ContainerItem(key=sharekey, modification_date=mtime_utc, etag="",
size=stat_result.st_size, storage_class="STANDARD")
contents.append(item)
return len(contents) < max_keys
d = async_iterate(_next_share, self._iterate_dirs())
def _done(completed):
contents.sort(key=lambda item: item.key)
return ContainerListing(self.container_name, '', '', max_keys,
is_truncated=str(not completed).lower(), contents=contents)
d.addCallback(_done)
return d
def _get_path(self, object_name, must_exist=False):
# This method is also called by tests.
sharefile = os.path.join(self._storagedir, object_name)
if must_exist and not os.path.exists(sharefile):
raise self.ServiceError("", 404, "not found")
return sharefile
def _put_object(self, object_name, data, content_type, metadata):
_assert(content_type == 'application/octet-stream', content_type=content_type)
_assert(metadata == {}, metadata=metadata)
sharefile = self._get_path(object_name)
fileutil.make_dirs(os.path.dirname(sharefile))
fileutil.write(sharefile, data)
self._store_count += 1
return defer.succeed(None)
def _get_object(self, object_name):
self._load_count += 1
data = fileutil.read(self._get_path(object_name, must_exist=True))
return defer.succeed(data)
def _head_object(self, object_name):
return defer.execute(_not_implemented)
def _delete_object(self, object_name):
fileutil.remove(self._get_path(object_name, must_exist=True))
return defer.succeed(None)
def reset_load_store_counts(self):
self._load_count = 0
self._store_count = 0
def get_load_count(self):
return self._load_count
def get_store_count(self):
return self._store_count


@@ -0,0 +1,3 @@
from allmydata.storage.backends.cloud.msazure.msazure_container import configure_msazure_container
configure_container = configure_msazure_container


@@ -0,0 +1,276 @@
"""
Storage backend using Microsoft Azure Blob Storage service.
See http://msdn.microsoft.com/en-us/library/windowsazure/dd179428.aspx for
details on the authentication scheme.
"""
import urlparse
import base64
import hmac
import hashlib
import urllib
try:
from xml.etree import cElementTree as ElementTree
ElementTree # hush pyflakes
except ImportError:
from xml.etree import ElementTree
import time
from zope.interface import implements
from twisted.web.http_headers import Headers
from twisted.web.http import datetimeToString
from allmydata.storage.backends.base import ContainerItem, ContainerListing
from allmydata.storage.backends.cloud.cloud_common import IContainer, \
CommonContainerMixin, HTTPClientMixin
class MSAzureStorageContainer(CommonContainerMixin, HTTPClientMixin):
implements(IContainer)
USER_AGENT = "Tahoe-LAFS Microsoft Azure client"
_time = time.time
def __init__(self, account_name, account_key, container_name,
override_reactor=None):
CommonContainerMixin.__init__(self, container_name, override_reactor)
self._init_agent()
self._account_name = account_name
self._account_key = base64.b64decode(account_key)
self.URI = "https://%s.blob.core.windows.net" % (account_name, )
def _calculate_presignature(self, method, url, headers):
"""
Calculate the value to be signed for the given request information.
We only implement a subset of the standard. In particular, we assume
an x-ms-date header has been provided, so we don't include any Date header.
The HMAC, and formatting into HTTP header, is not done in this layer.
"""
headers = Headers(headers)
parsed_url = urlparse.urlparse(url)
result = method + "\n"
# Add standard headers:
for header in ['content-encoding', 'content-language',
'content-length', 'content-md5',
'content-type', 'date', 'if-modified-since',
'if-match', 'if-none-match',
'if-unmodified-since', 'range']:
value = headers.getRawHeaders(header, [""])[0]
if header == "date":
value = ""
result += value + "\n"
# Add x-ms headers:
x_ms_headers = []
x_ms_date = False
for name, values in headers.getAllRawHeaders():
name = name.lower()
if name.startswith("x-ms"):
x_ms_headers.append("%s:%s" % (name, values[0]))
if name == "x-ms-date":
x_ms_date = True
x_ms_headers.sort()
if x_ms_headers:
result += "\n".join(x_ms_headers) + "\n"
if not x_ms_date:
raise ValueError("x-ms-date must be included")
# Add path:
result += "/%s%s" % (self._account_name, parsed_url.path)
# Add query args:
query_args = urlparse.parse_qs(parsed_url.query).items()
query_args.sort()
for name, value in query_args:
result += "\n%s:%s" % (name, ",".join(value))
return result
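# --- Illustrative string-to-sign (not part of the original module) ---
# Assuming account "acct", container "tub", a GET listing with
# ?comp=list&restype=container, and only the x-ms-* headers set when signing,
# the value returned above is:
#   "GET\n" + "\n"*11                      # the eleven standard headers, all blank
#   + "x-ms-date:<http date>\n"
#   + "x-ms-version:2012-02-12\n"
#   + "/acct/tub"
#   + "\ncomp:list\nrestype:container"     # query args, sorted by name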
def _calculate_signature(self, method, url, headers):
"""
Calculate the signature for the given request information.
This includes base64ing and HMACing.
headers is a twisted.web.http_headers.Headers instance.
The returned value is suitable for use as an Authorization header.
"""
data = self._calculate_presignature(method, url, headers)
signature = hmac.HMAC(self._account_key, data, hashlib.sha256).digest()
return "SharedKey %s:%s" % (self._account_name, base64.b64encode(signature))
def _authorized_http_request(self, what, method, url, request_headers,
body=None, need_response_body=False):
"""
Do an HTTP request with the addition of an authorization header.
"""
request_headers["x-ms-date"] = [datetimeToString(self._time())]
request_headers["x-ms-version"] = ["2012-02-12"]
request_headers["Authorization"] = [
self._calculate_signature(method, url, request_headers)]
return self._http_request(what, method, url, request_headers, body=body,
need_response_body=need_response_body)
def _parse_item(self, element):
"""
Parse a <Blob> XML element into a ContainerItem.
"""
key = element.find("Name").text
element = element.find("Properties")
last_modified = element.find("Last-Modified").text
etag = element.find("Etag").text
size = int(element.find("Content-Length").text)
storage_class = "STANDARD" # not sure what it means in this context
owner = None # Don't bother parsing this at the moment
return ContainerItem(key, last_modified, etag, size, storage_class,
owner)
def _parse_list(self, data, prefix):
"""
Parse the XML response, converting it into a ContainerListing.
"""
name = self._container_name
marker = None
max_keys = None
is_truncated = "false"
common_prefixes = []
contents = []
root = ElementTree.fromstring(data)
if root.tag != "EnumerationResults":
raise ValueError("Unknown root XML element %s" % (root.tag,))
for element in root:
tag = element.tag
if tag == "NextMarker":
marker = element.text
elif tag == "Blobs":
for subelement in element:
if subelement.tag == "Blob":
contents.append(self._parse_item(subelement))
return ContainerListing(name, prefix, marker, max_keys, is_truncated,
contents, common_prefixes)
def _list_objects(self, prefix):
"""
List objects in this container with the given prefix.
"""
url = self._make_container_url(self.URI)
url += "?comp=list&restype=container"
if prefix:
url += "&prefix=" + urllib.quote(prefix, safe='')
d = self._authorized_http_request("MS Azure list objects", 'GET',
url, {},
body=None,
need_response_body=True)
d.addCallback(lambda (response, body): self._parse_list(body, prefix))
return d
def _put_object(self, object_name, data, content_type, metadata):
"""
Put an object into this container.
"""
url = self._make_object_url(self.URI, object_name)
# In theory Agent will add the content length for us, but we need it
# at this layer in order for the HMAC authorization to be calculated
# correctly:
request_headers = {'Content-Length': ["%d" % (len(data),)],
'Content-Type': [content_type],
"x-ms-blob-type": ["BlockBlob"],
}
for key, value in metadata.items():
request_headers["x-ms-meta-%s" % (key,)] = [value]
d = self._authorized_http_request("MS Azure PUT object", 'PUT', url,
request_headers,
body=data, need_response_body=False)
d.addCallback(lambda (response, body): body)
return d
def _get_object(self, object_name):
"""
Get an object from this container.
"""
url = self._make_object_url(self.URI, object_name)
d = self._authorized_http_request("MS Azure GET object", 'GET',
url, {},
body=None,
need_response_body=True)
d.addCallback(lambda (response, body): body)
return d
def _delete_object(self, object_name):
"""
Delete an object from this container.
"""
url = self._make_object_url(self.URI, object_name)
d = self._authorized_http_request("MS Azure DELETE object", 'DELETE',
url, {},
body=None,
need_response_body=False)
d.addCallback(lambda (response, body): body)
return d
def _create(self):
"""
Create the container.
"""
url = self._make_container_url(self.URI)
url += "?restype=container"
d = self._authorized_http_request("MS Azure PUT container", 'PUT',
url, {'Content-length': ['0']},
body=None,
need_response_body=False)
d.addCallback(lambda (response, body): body)
return d
def configure_msazure_container(storedir, config):
"""
Configure the MS Azure storage container.
"""
account_name = config.get_config("storage", "msazure.account_name")
container_name = config.get_config("storage", "msazure.container")
account_key = config.get_private_config("msazure_account_key")
return MSAzureStorageContainer(account_name, account_key, container_name)
if __name__ == '__main__':
from twisted.internet import reactor, defer
from twisted.python import log
import sys
msc = MSAzureStorageContainer(sys.argv[1], sys.argv[2], sys.argv[3])
@defer.inlineCallbacks
def testtransactions():
print "Creating container...",
try:
yield msc.create()
except Exception, e:
print "failed:", e
else:
print "succeeded."
yield msc.put_object("key", "the value")
print "Uploaded 'key', with value 'the value'"
print
print "Get contents:",
result = yield msc.list_objects()
print [item.key for item in result.contents]
print "Get key, value is:"
print (yield msc.get_object("key"))
print
print "Delete item..."
yield msc.delete_object("key")
print
print "Get contents:",
result = yield msc.list_objects()
print [item.key for item in result.contents]
reactor.stop()
testtransactions().addErrback(log.err)
reactor.run()


@@ -0,0 +1,470 @@
import struct
from collections import deque
from twisted.internet import defer
from allmydata.util.deferredutil import gatherResults, async_iterate
from zope.interface import implements
from allmydata.interfaces import IMutableShare, BadWriteEnablerError
from allmydata.util import idlib, log
from allmydata.util.assertutil import precondition, _assert
from allmydata.util.namespace import Namespace
from allmydata.util.mathutil import div_ceil
from allmydata.util.hashutil import timing_safe_compare
from allmydata.storage.common import CorruptStoredShareError, UnknownMutableContainerVersionError, \
DataTooLargeError
from allmydata.storage.backends.base import testv_compare
from allmydata.mutable.layout import MUTABLE_MAGIC, MAX_MUTABLE_SHARE_SIZE
from allmydata.storage.backends.cloud import cloud_common
from allmydata.storage.backends.cloud.cloud_common import get_chunk_key, get_zero_chunkdata, \
delete_chunks, BackpressurePipeline, ChunkCache, CloudShareBase, CloudShareReaderMixin
# Mutable shares have a different layout to immutable shares. See docs/mutable.rst
# for more details.
# # offset size name
# 1 0 32 magic verstr "tahoe mutable container v1" plus binary
# 2 32 20 write enabler's nodeid
# 3 52 32 write enabler
# 4 84 8 data size (actual share data present) (a)
# 5 92 8 offset of (8) count of extra leases (after data)
# 6 100 368 four leases, 92 bytes each, unused
# 7 468 (a) data
# 8 ?? 4 count of extra leases
# 9 ?? n*92 extra leases
# The struct module doc says that L's are 4 bytes in size, and that Q's are
# 8 bytes in size. Since compatibility depends upon this, double-check it.
assert struct.calcsize(">L") == 4, struct.calcsize(">L")
assert struct.calcsize(">Q") == 8, struct.calcsize(">Q")
class MutableCloudShare(CloudShareBase, CloudShareReaderMixin):
implements(IMutableShare)
sharetype = "mutable"
DATA_LENGTH_OFFSET = struct.calcsize(">32s20s32s")
EXTRA_LEASE_OFFSET = DATA_LENGTH_OFFSET + 8
HEADER = ">32s20s32sQQ"
HEADER_SIZE = struct.calcsize(HEADER) # doesn't include leases
LEASE_SIZE = struct.calcsize(">LL32s32s20s")
assert LEASE_SIZE == 92, LEASE_SIZE
DATA_OFFSET = HEADER_SIZE + 4*LEASE_SIZE
assert DATA_OFFSET == 468, DATA_OFFSET
NUM_EXTRA_LEASES_SIZE = struct.calcsize(">L")
MAGIC = MUTABLE_MAGIC
assert len(MAGIC) == 32
MAX_SIZE = MAX_MUTABLE_SHARE_SIZE
def __init__(self, container, storage_index, shnum, total_size, first_chunkdata, parent=None):
CloudShareBase.__init__(self, container, storage_index, shnum)
precondition(isinstance(total_size, (int, long)), total_size=total_size)
precondition(isinstance(first_chunkdata, str), type(first_chunkdata))
precondition(len(first_chunkdata) <= total_size, "total size is smaller than first chunk",
len_first_chunkdata=len(first_chunkdata), total_size=total_size)
if len(first_chunkdata) < self.HEADER_SIZE:
msg = "%r had incomplete header (%d bytes)" % (self, len(first_chunkdata))
raise UnknownMutableContainerVersionError(shnum, msg)
header = first_chunkdata[:self.HEADER_SIZE]
try:
(magic, write_enabler_nodeid, real_write_enabler,
data_length, extra_lease_offset) = struct.unpack(self.HEADER, header)
except struct.error, e:
raise CorruptStoredShareError(shnum, "invalid mutable share header for shnum %d: %s" % (shnum, e))
if magic != self.MAGIC:
msg = "%r had magic %r but we wanted %r" % (self, magic, self.MAGIC)
raise UnknownMutableContainerVersionError(shnum, msg)
self._write_enabler_nodeid = write_enabler_nodeid
self._real_write_enabler = real_write_enabler
# We want to support changing PREFERRED_CHUNK_SIZE without breaking compatibility,
# but without "rechunking" any existing shares. Also, existing shares created by
# the pre-chunking code should be handled correctly.
# If there is more than one chunk, the chunksize must be equal to the size of the
# first chunk, to avoid rechunking.
self._chunksize = len(first_chunkdata)
if self._chunksize == total_size:
# There is only one chunk, so we are at liberty to make the chunksize larger
# than that chunk, but not smaller.
self._chunksize = max(self._chunksize, cloud_common.PREFERRED_CHUNK_SIZE)
self._zero_chunkdata = get_zero_chunkdata(self._chunksize)
initial_cachemap = {0: defer.succeed(first_chunkdata)}
self._cache = ChunkCache(container, self._key, self._chunksize, initial_cachemap=initial_cachemap)
#print "CONSTRUCT %s with %r" % (object.__repr__(self), self._cache)
self._data_length = data_length
self._set_total_size(self.DATA_OFFSET + data_length + self.NUM_EXTRA_LEASES_SIZE)
# The initial total size may not be less than the size of header + data + extra lease count.
# TODO: raise a better exception.
_assert(total_size >= self._total_size, share=repr(self),
total_size=total_size, self_total_size=self._total_size, data_length=data_length)
self._is_oversize = total_size > self._total_size
self._pipeline = BackpressurePipeline(cloud_common.PIPELINE_DEPTH)
self.parent = parent # for logging
def _set_total_size(self, total_size):
self._total_size = total_size
self._nchunks = div_ceil(self._total_size, self._chunksize)
def log(self, *args, **kwargs):
if self.parent:
return self.parent.log(*args, **kwargs)
@classmethod
def create_empty_share(cls, container, serverid, write_enabler, storage_index=None, shnum=None, parent=None):
# Unlike the disk backend, we don't check that the cloud object does not exist;
# we assume that it does not because create was used, and no-one else should be
# writing to the bucket.
# There are no extra leases, but for compatibility, the offset they would have
# still needs to be stored in the header.
data_length = 0
extra_lease_offset = cls.DATA_OFFSET + data_length
header = struct.pack(cls.HEADER, cls.MAGIC, serverid, write_enabler,
data_length, extra_lease_offset)
leases = "\x00"*(cls.LEASE_SIZE * 4)
extra_lease_count = struct.pack(">L", 0)
first_chunkdata = header + leases + extra_lease_count
share = cls(container, storage_index, shnum, len(first_chunkdata), first_chunkdata, parent=parent)
d = share._raw_writev(deque([(0, first_chunkdata)]), 0, 0)
d.addCallback(lambda ign: share)
return d
def _discard(self):
# TODO: discard read cache
pass
def check_write_enabler(self, write_enabler):
# avoid a timing attack
if not timing_safe_compare(write_enabler, self._real_write_enabler):
# accommodate share migration by reporting the nodeid used for the
# old write enabler.
self.log(format="bad write enabler on SI %(si)s,"
" recorded by nodeid %(nodeid)s",
facility="tahoe.storage",
level=log.WEIRD, umid="DF2fCR",
si=self.get_storage_index_string(),
nodeid=idlib.nodeid_b2a(self._write_enabler_nodeid))
msg = "The write enabler was recorded by nodeid '%s'." % \
(idlib.nodeid_b2a(self._write_enabler_nodeid),)
raise BadWriteEnablerError(msg)
return defer.succeed(None)
def check_testv(self, testv):
def _test( (offset, length, operator, specimen) ):
d = self.read_share_data(offset, length)
d.addCallback(lambda data: testv_compare(data, operator, specimen))
return d
return async_iterate(_test, sorted(testv))
def writev(self, datav, new_length):
precondition(new_length is None or new_length >= 0, new_length=new_length)
raw_datav, preserved_size, new_data_length = self._prepare_writev(datav, new_length)
return self._raw_writev(raw_datav, preserved_size, new_data_length)
def _prepare_writev(self, datav, new_length):
# Translate the client's write vector and 'new_length' into a "raw" write vector
# and new total size. This has no side effects to make it easier to test.
preserved_size = self.DATA_OFFSET + self._data_length
# chunk containing the byte after the current end-of-data
endofdata_chunknum = preserved_size / self._chunksize
# Whether we need to add a dummy write to zero-extend the end-of-data chunk.
ns = Namespace()
ns.need_zeroextend_write = preserved_size % self._chunksize != 0
raw_datav = deque()
def _add_write(seekpos, data):
#print "seekpos =", seekpos
raw_datav.append( (seekpos, data) )
lastpos = seekpos + len(data) - 1
start_chunknum = seekpos / self._chunksize
last_chunknum = lastpos / self._chunksize
if start_chunknum <= endofdata_chunknum and endofdata_chunknum <= last_chunknum:
# If any of the client's writes overlaps the end-of-data chunk, we should not
# add the zero-extending dummy write.
ns.need_zeroextend_write = False
#print "need_zeroextend_write =", ns.need_zeroextend_write
new_data_length = self._data_length
# Validate the write vector and translate its offsets into seek positions from
# the start of the share.
for (offset, data) in datav:
length = len(data)
precondition(offset >= 0, offset=offset)
if offset + length > self.MAX_SIZE:
raise DataTooLargeError(self._shnum, self.MAX_SIZE, offset, length)
if new_length is not None and new_length < offset + length:
length = max(0, new_length - offset)
data = data[: length]
new_data_length = max(new_data_length, offset + length)
if length > 0:
_add_write(self.DATA_OFFSET + offset, data)
# new_length can only be used to truncate, not extend.
if new_length is not None:
new_data_length = min(new_length, new_data_length)
# If the data length has changed, include additional raw writes to the data length
# field in the header, and to the extra lease count field after the data.
#
# Also do this if there were extra leases (e.g. if this was a share copied from a
# disk backend), so that they will be deleted. If the size hasn't changed and there
# are no extra leases, we don't bother to ensure that the extra lease count field is
# zero; it is ignored anyway.
if new_data_length != self._data_length or self._is_oversize:
extra_lease_offset = self.DATA_OFFSET + new_data_length
# Don't preserve old data past the new end-of-data.
preserved_size = min(preserved_size, extra_lease_offset)
# These are disjoint with any ranges already in raw_datav.
_add_write(self.DATA_LENGTH_OFFSET, struct.pack(">Q", new_data_length))
_add_write(extra_lease_offset, struct.pack(">L", 0))
#print "need_zeroextend_write =", ns.need_zeroextend_write
# If the data length is being increased and there are no other writes to the
# current end-of-data chunk (including the two we just added), add a dummy write
# of one zero byte at the end of that chunk. This will cause that chunk to be
# zero-extended to the full chunk size, which would not otherwise happen.
if new_data_length > self._data_length and ns.need_zeroextend_write:
_add_write((endofdata_chunknum + 1)*self._chunksize - 1, "\x00")
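# Worked example (hypothetical 1024-byte chunksize, positions measured from the start of
# the share): if the preserved data ends at position 1500, the end-of-data chunk is chunk 1
# and need_zeroextend_write starts out True. A client write covering positions 3000..3100
# touches only chunks 2 and 3, so the dummy write above stores one zero byte at position
# 2*1024 - 1 = 2047, forcing chunk 1 to be rewritten zero-padded to the full chunk size.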
# Sorting the writes simplifies things (and we need all the simplification we can get :-)
raw_datav = deque(sorted(raw_datav, key=lambda (offset, data): offset))
# Complain if write vector elements overlap, that's too hard in general.
(last_seekpos, last_data) = (0, "")
have_duplicates = False
for (i, (seekpos, data)) in enumerate(raw_datav):
# The MDMF publisher in 1.9.0 and 1.9.1 produces duplicated writes to the MDMF header.
# If this is an exactly duplicated write, skip it.
if seekpos == last_seekpos and data == last_data:
raw_datav[i] = None
have_duplicates = True
else:
last_endpos = last_seekpos + len(last_data)
_assert(seekpos >= last_endpos, "overlapping write vector elements",
seekpos=seekpos, last_seekpos=last_seekpos, last_endpos=last_endpos)
(last_seekpos, last_data) = (seekpos, data)
if have_duplicates:
raw_datav.remove(None)
# Return a vector of writes to ranges in the share, the size of previous contents to
# be preserved, and the final data length.
return (raw_datav, preserved_size, new_data_length)
def _raw_writev(self, raw_datav, preserved_size, new_data_length):
#print "%r._raw_writev(%r, %r, %r)" % (self, raw_datav, preserved_size, new_data_length)
old_nchunks = self._nchunks
# The _total_size and _nchunks attributes are updated as each write is applied.
self._set_total_size(preserved_size)
final_size = self.DATA_OFFSET + new_data_length + self.NUM_EXTRA_LEASES_SIZE
d = self._raw_write_share_data(None, raw_datav, final_size)
def _resize(ign):
self._data_length = new_data_length
self._set_total_size(final_size)
if self._nchunks < old_nchunks or self._is_oversize:
self._is_oversize = False
#print "DELETING chunks from", self._nchunks
return delete_chunks(self._container, self._key, from_chunknum=self._nchunks)
d.addCallback(_resize)
d.addCallback(lambda ign: self._pipeline.flush())
return d
def _raw_write_share_data(self, ign, raw_datav, final_size):
"""
raw_datav: (deque of (integer, str)) the remaining raw write vector
final_size: (integer) the size the file will be after all writes in the writev
"""
#print "%r._raw_write_share_data(%r, %r)" % (self, (seekpos, data), final_size)
precondition(final_size >= 0, final_size=final_size)
d = defer.succeed(None)
if not raw_datav:
return d
(seekpos, data) = raw_datav.popleft()
_assert(seekpos >= 0 and len(data) > 0, seekpos=seekpos, len_data=len(data),
len_raw_datav=len(raw_datav), final_size=final_size)
# We *may* need to read the start chunk and/or last chunk before rewriting them.
# (If they are the same chunk, that's fine, the cache will ensure we don't
# read the cloud object twice.)
lastpos = seekpos + len(data) - 1
_assert(lastpos > 0, seekpos=seekpos, len_data=len(data), lastpos=lastpos)
start_chunknum = seekpos / self._chunksize
start_chunkpos = start_chunknum*self._chunksize
start_offset = seekpos % self._chunksize
last_chunknum = lastpos / self._chunksize
last_chunkpos = last_chunknum*self._chunksize
last_offset = lastpos % self._chunksize
_assert(start_chunknum <= last_chunknum, start_chunknum=start_chunknum, last_chunknum=last_chunknum)
#print "lastpos =", lastpos
#print "len(data) =", len(data)
#print "start_chunknum =", start_chunknum
#print "start_offset =", start_offset
#print "last_chunknum =", last_chunknum
#print "last_offset =", last_offset
#print "_total_size =", self._total_size
#print "_chunksize =", self._chunksize
#print "_nchunks =", self._nchunks
start_chunkdata_d = defer.Deferred()
last_chunkdata_d = defer.Deferred()
# Is the first byte of the start chunk preserved?
if start_chunknum*self._chunksize < self._total_size and start_offset > 0:
# Yes, so we need to read it first.
d.addCallback(lambda ign: self._cache.get(start_chunknum, start_chunkdata_d))
else:
start_chunkdata_d.callback("")
# Is any byte of the last chunk preserved?
if last_chunkpos < self._total_size and lastpos < min(self._total_size, last_chunkpos + self._chunksize) - 1:
# Yes, so we need to read it first.
d.addCallback(lambda ign: self._cache.get(last_chunknum, last_chunkdata_d))
else:
last_chunkdata_d.callback("")
d.addCallback(lambda ign: gatherResults( (start_chunkdata_d, last_chunkdata_d) ))
def _got( (start_chunkdata, last_chunkdata) ):
#print "start_chunkdata =", len(start_chunkdata), repr(start_chunkdata)
#print "last_chunkdata =", len(last_chunkdata), repr(last_chunkdata)
d2 = defer.succeed(None)
# Zero any chunks from self._nchunks (i.e. after the last currently valid chunk)
# to before the start chunk of the write.
for zero_chunknum in xrange(self._nchunks, start_chunknum):
d2.addCallback(self._pipeline_store_chunk, zero_chunknum, self._zero_chunkdata)
# start_chunkdata and last_chunkdata may need to be truncated and/or zero-extended.
start_preserved = max(0, min(len(start_chunkdata), self._total_size - start_chunkpos, start_offset))
last_preserved = max(0, min(len(last_chunkdata), self._total_size - last_chunkpos))
start_chunkdata = (start_chunkdata[: start_preserved] +
self._zero_chunkdata[: max(0, start_offset - start_preserved)] +
data[: self._chunksize - start_offset])
# last_slice_len = len(last_chunkdata[last_offset + 1 : last_preserved])
last_slice_len = max(0, last_preserved - (last_offset + 1))
last_chunksize = min(final_size - last_chunkpos, self._chunksize)
last_chunkdata = (last_chunkdata[last_offset + 1 : last_preserved] +
self._zero_chunkdata[: max(0, last_chunksize - (last_offset + 1) - last_slice_len)])
# This loop eliminates redundant reads and writes, by merging the contents of writes
# after this one into last_chunkdata as far as possible. It ensures that we never need
# to read a chunk twice in the same writev (which is needed for correctness; see below).
while raw_datav:
# Does the next write start in the same chunk as this write ends (last_chunknum)?
(next_seekpos, next_chunkdata) = raw_datav[0]
next_start_chunknum = next_seekpos / self._chunksize
next_start_offset = next_seekpos % self._chunksize
next_lastpos = next_seekpos + len(next_chunkdata) - 1
if next_start_chunknum != last_chunknum:
break
_assert(next_start_offset > last_offset,
next_start_offset=next_start_offset, last_offset=last_offset)
# Cut next_chunkdata at the end of next_start_chunknum.
next_cutpos = (next_start_chunknum + 1)*self._chunksize
last_chunkdata = (last_chunkdata[: next_start_offset - (last_offset + 1)] +
next_chunkdata[: next_cutpos - next_seekpos] +
last_chunkdata[next_lastpos - lastpos :])
# Does the next write extend beyond that chunk?
if next_lastpos >= next_cutpos:
# The part after the cut will be processed in the next call to _raw_write_share_data.
raw_datav[0] = (next_cutpos, next_chunkdata[next_cutpos - next_seekpos :])
break
else:
# Discard the write that has already been processed.
raw_datav.popleft()
# start_chunknum and last_chunknum are going to be written, so need to be flushed
# from the read cache in case the new contents are needed by a subsequent readv
# or writev. (Due to the 'while raw_datav' loop above, we won't need to read them
# again in *this* writev. That property is needed for correctness because we don't
# flush the write pipeline until the end of the writev.)
d2.addCallback(lambda ign: self._cache.flush_chunk(start_chunknum))
d2.addCallback(lambda ign: self._cache.flush_chunk(last_chunknum))
# Now do the current write.
if last_chunknum == start_chunknum:
d2.addCallback(self._pipeline_store_chunk, start_chunknum,
start_chunkdata + last_chunkdata)
else:
d2.addCallback(self._pipeline_store_chunk, start_chunknum,
start_chunkdata)
for middle_chunknum in xrange(start_chunknum + 1, last_chunknum):
d2.addCallback(self._pipeline_store_chunk, middle_chunknum,
data[middle_chunknum*self._chunksize - seekpos
: (middle_chunknum + 1)*self._chunksize - seekpos])
d2.addCallback(self._pipeline_store_chunk, last_chunknum,
data[last_chunkpos - seekpos :] + last_chunkdata)
return d2
d.addCallback(_got)
d.addCallback(self._raw_write_share_data, raw_datav, final_size) # continue the iteration
return d
def _pipeline_store_chunk(self, ign, chunknum, chunkdata):
precondition(len(chunkdata) <= self._chunksize, len_chunkdata=len(chunkdata), chunksize=self._chunksize)
chunkkey = get_chunk_key(self._key, chunknum)
#print "STORING", chunkkey, len(chunkdata), repr(chunkdata)
endpos = chunknum*self._chunksize + len(chunkdata)
if endpos > self._total_size:
self._set_total_size(endpos)
# We'd like to stream writes, but the supported service containers
# (and the IContainer interface) don't support that yet. For txaws, see
# https://bugs.launchpad.net/txaws/+bug/767205 and
# https://bugs.launchpad.net/txaws/+bug/783801
return self._pipeline.add(1, self._container.put_object, chunkkey, chunkdata)
def close(self):
# FIXME: 'close' doesn't exist in IMutableShare
self._discard()
d = self._pipeline.close()
d.addCallback(lambda ign: self._cache.close())
return d
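The write path above comes down to mapping byte ranges onto fixed-size chunks. The following standalone sketch is not part of the backend; the names and the 1024-byte chunksize are chosen purely for illustration. It mirrors the seekpos / chunksize and div_ceil(total_size, chunksize) arithmetic used by _raw_write_share_data and _set_total_size.

def div_ceil(n, d):
    # same behaviour as the div_ceil helper used above, for positive integers
    return (n + d - 1) // d

def chunk_span(seekpos, length, chunksize):
    # Return (start_chunknum, start_offset, last_chunknum, last_offset) for a write
    # of `length` bytes at absolute position `seekpos` within the stored share.
    lastpos = seekpos + length - 1
    return (seekpos // chunksize, seekpos % chunksize,
            lastpos // chunksize, lastpos % chunksize)

# A 100-byte write at position 1000 touches chunks 0 and 1, ending at offset 75 of chunk 1.
assert chunk_span(1000, 100, 1024) == (0, 1000, 1, 75)
# A share whose total size is 2500 bytes occupies div_ceil(2500, 1024) == 3 chunks.
assert div_ceil(2500, 1024) == 3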

View File

@ -0,0 +1,4 @@
from allmydata.storage.backends.cloud.openstack.openstack_container import configure_openstack_container
configure_container = configure_openstack_container

View File

@ -0,0 +1,380 @@
import urllib, simplejson
from twisted.internet import defer, reactor
from twisted.web.client import Agent
from twisted.web.http import UNAUTHORIZED
from zope.interface import implements, Interface
from allmydata.util import log
from allmydata.node import InvalidValueError
from allmydata.storage.backends.base import ContainerItem, ContainerListing
from allmydata.storage.backends.cloud.cloud_common import IContainer, \
CloudServiceError, CommonContainerMixin, HTTPClientMixin
# Enabling this will cause secrets to be logged.
UNSAFE_DEBUG = False
DEFAULT_AUTH_URLS = {
"rackspace.com v1": "https://identity.api.rackspacecloud.com/v1.0",
"rackspace.co.uk v1": "https://lon.identity.api.rackspacecloud.com/v1.0",
"rackspace.com": "https://identity.api.rackspacecloud.com/v2.0/tokens",
"rackspace.co.uk": "https://lon.identity.api.rackspacecloud.com/v2.0/tokens",
"hpcloud.com west": "https://region-a.geo-1.identity.hpcloudsvc.com:35357/v2.0/tokens",
"hpcloud.com east": "https://region-b.geo-1.identity.hpcloudsvc.com:35357/v2.0/tokens",
}
def configure_openstack_container(storedir, config):
provider = config.get_config("storage", "openstack.provider", "rackspace.com").lower()
if provider not in DEFAULT_AUTH_URLS:
raise InvalidValueError("[storage]openstack.provider %r is not recognized\n"
"Valid providers are: %s" % (provider, ", ".join(sorted(DEFAULT_AUTH_URLS.keys()))))
auth_service_url = config.get_config("storage", "openstack.url", DEFAULT_AUTH_URLS[provider])
container_name = config.get_config("storage", "openstack.container")
reauth_period = 11*60*60 #seconds
access_key_id = config.get_config("storage", "openstack.access_key_id", None)
if access_key_id is None:
username = config.get_config("storage", "openstack.username")
api_key = config.get_private_config("openstack_api_key")
if auth_service_url.endswith("/v1.0"):
authenticator = AuthenticatorV1(auth_service_url, username, api_key)
else:
authenticator = AuthenticatorV2(auth_service_url, {
'RAX-KSKEY:apiKeyCredentials': {
'username': username,
'apiKey': api_key,
}
})
else:
tenant_id = config.get_config("storage", "openstack.tenant_id")
secret_key = config.get_private_config("openstack_secret_key")
authenticator = AuthenticatorV2(auth_service_url, {
'apiAccessKeyCredentials': {
'accessKey': access_key_id,
'secretKey': secret_key,
},
'tenantId': tenant_id,
})
auth_client = AuthenticationClient(authenticator, reauth_period)
return OpenStackContainer(auth_client, container_name)
class AuthenticationInfo(object):
def __init__(self, auth_token, public_storage_url, internal_storage_url=None):
self.auth_token = auth_token
self.public_storage_url = public_storage_url
self.internal_storage_url = internal_storage_url
class IAuthenticator(Interface):
def make_auth_request():
"""Returns (method, url, headers, body, need_response_body)."""
def parse_auth_response(response, get_header, body):
"""Returns AuthenticationInfo."""
class AuthenticatorV1(object):
implements(IAuthenticator)
"""
Authenticates according to V1 protocol as documented by Rackspace:
<http://docs.rackspace.com/files/api/v1/cf-devguide/content/Authentication-d1e639.html>.
"""
def __init__(self, auth_service_url, username, api_key):
self._auth_service_url = auth_service_url
self._username = username
self._api_key = api_key
def make_auth_request(self):
request_headers = {
'X-Auth-User': [self._username],
'X-Auth-Key': [self._api_key],
}
return ('GET', self._auth_service_url, request_headers, None, False)
def parse_auth_response(self, response, get_header, body):
auth_token = get_header(response, 'X-Auth-Token')
storage_url = get_header(response, 'X-Storage-Url')
#cdn_management_url = get_header(response, 'X-CDN-Management-Url')
return AuthenticationInfo(auth_token, storage_url)
class AuthenticatorV2(object):
implements(IAuthenticator)
"""
Authenticates according to V2 protocol as documented by Rackspace:
<http://docs.rackspace.com/auth/api/v2.0/auth-client-devguide/content/POST_authenticate_v2.0_tokens_.html>.
This is also compatible with HP's protocol (using different credentials):
<https://docs.hpcloud.com/api/identity#authenticate-jumplink-span>.
"""
def __init__(self, auth_service_url, credentials):
self._auth_service_url = auth_service_url
self._credentials = credentials
def make_auth_request(self):
request = {'auth': self._credentials}
json = simplejson.dumps(request)
request_headers = {
'Content-Type': ['application/json'],
}
return ('POST', self._auth_service_url, request_headers, json, True)
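# For the username/API-key path set up in configure_openstack_container above, the JSON
# body produced here looks like (illustrative values):
#   {"auth": {"RAX-KSKEY:apiKeyCredentials": {"username": "u", "apiKey": "k"}}}
# and for the access-key path:
#   {"auth": {"apiAccessKeyCredentials": {"accessKey": "a", "secretKey": "s"}, "tenantId": "t"}}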
def parse_auth_response(self, response, get_header, body):
try:
decoded_body = simplejson.loads(body)
except simplejson.decoder.JSONDecodeError, e:
raise CloudServiceError(None, response.code,
message="could not decode auth response: %s" % (e,))
try:
# Scrabble around in the annoyingly complicated response body for the credentials we need.
access = decoded_body['access']
token = access['token']
auth_token = token['id']
user = access['user']
default_region = user.get('RAX-AUTH:defaultRegion', '')
serviceCatalog = access['serviceCatalog']
for service in serviceCatalog:
if service['type'] == 'object-store':
endpoints = service['endpoints']
for endpoint in endpoints:
if not default_region or endpoint['region'] == default_region:
public_storage_url = endpoint['publicURL']
internal_storage_url = endpoint.get('internalURL', None)
return AuthenticationInfo(auth_token, public_storage_url, internal_storage_url)
except KeyError, e:
raise CloudServiceError(None, response.code,
message="missing field in auth response: %s" % (e,))
raise CloudServiceError(None, response.code,
message="could not find a suitable storage endpoint in auth response")
class AuthenticationClient(HTTPClientMixin):
"""
I implement a generic authentication client.
The construction of the auth request and parsing of the response is delegated to an authenticator.
"""
USER_AGENT = "Tahoe-LAFS OpenStack authentication client"
def __init__(self, authenticator, reauth_period, override_reactor=None):
self._authenticator = authenticator
self._reauth_period = reauth_period
self._reactor = override_reactor or reactor
self._agent = Agent(self._reactor)
self._shutdown = False
self.ServiceError = CloudServiceError
# Not authorized yet.
self._auth_info = None
self._auth_lock = defer.DeferredLock()
self._reauthenticate()
def get_auth_info(self):
# It is intentional that this returns the previous auth_info while a reauthentication is in progress.
if self._auth_info is not None:
return defer.succeed(self._auth_info)
else:
return self.get_auth_info_locked()
def get_auth_info_locked(self):
d = self._auth_lock.run(self._authenticate)
d.addCallback(lambda ign: self._auth_info)
return d
def invalidate(self):
self._auth_info = None
self._reauthenticate()
def _authenticate(self):
(method, url, request_headers, body, need_response_body) = self._authenticator.make_auth_request()
d = self._http_request("OpenStack auth", method, url, request_headers, body, need_response_body)
def _got_response( (response, body) ):
self._auth_info = self._authenticator.parse_auth_response(response, self._get_header, body)
if UNSAFE_DEBUG:
print "Auth response is %s %s" % (self._auth_info.auth_token, self._auth_info.public_storage_url)
if not self._shutdown:
if self._delayed:
self._delayed.cancel()
self._delayed = self._reactor.callLater(self._reauth_period, self._reauthenticate)
d.addCallback(_got_response)
def _failed(f):
self._auth_info = None
# do we need to retry?
log.err(f)
return f
d.addErrback(_failed)
return d
def _reauthenticate(self):
self._delayed = None
d = self.get_auth_info_locked()
d.addBoth(lambda ign: None)
return d
def shutdown(self):
"""Used by unit tests to avoid unclean reactor errors."""
self._shutdown = True
if self._delayed:
self._delayed.cancel()
class OpenStackContainer(CommonContainerMixin, HTTPClientMixin):
implements(IContainer)
USER_AGENT = "Tahoe-LAFS OpenStack storage client"
def __init__(self, auth_client, container_name, override_reactor=None):
CommonContainerMixin.__init__(self, container_name, override_reactor)
self._init_agent()
self._auth_client = auth_client
def _react_to_error(self, response_code):
if response_code == UNAUTHORIZED:
# Invalidate auth_info and retry.
self._auth_client.invalidate()
return True
else:
return CommonContainerMixin._react_to_error(self, response_code)
def _create(self):
"""
Create this container.
"""
raise NotImplementedError
def _delete(self):
"""
Delete this container.
The cloud service may require the container to be empty before it can be deleted.
"""
raise NotImplementedError
def _list_objects(self, prefix=''):
"""
Get a ContainerListing that lists objects in this container.
prefix: (str) limit the returned keys to those starting with prefix.
"""
d = self._auth_client.get_auth_info()
def _do_list(auth_info):
request_headers = {
'X-Auth-Token': [auth_info.auth_token],
}
url = self._make_container_url(auth_info.public_storage_url)
# _parse_list always expects a JSON body, so request JSON even when no prefix is given.
url += "?format=json"
if prefix:
url += "&prefix=%s" % (urllib.quote(prefix, safe=''),)
return self._http_request("OpenStack list objects", 'GET', url, request_headers,
need_response_body=True)
d.addCallback(_do_list)
d.addCallback(lambda (response, json): self._parse_list(response, json, prefix))
return d
def _parse_list(self, response, json, prefix):
try:
items = simplejson.loads(json)
except simplejson.decoder.JSONDecodeError, e:
raise self.ServiceError(None, response.code,
message="could not decode list response: %s" % (e,))
log.msg(format="OpenStack list read %(length)d bytes, parsed as %(items)d items",
length=len(json), items=len(items), level=log.OPERATIONAL)
def _make_containeritem(item):
try:
key = item['name']
size = item['bytes']
modification_date = item['last_modified']
etag = item['hash']
storage_class = 'STANDARD'
except KeyError, e:
raise self.ServiceError(None, response.code,
message="missing field in list response: %s" % (e,))
else:
return ContainerItem(key, modification_date, etag, size, storage_class)
contents = map(_make_containeritem, items)
return ContainerListing(self._container_name, prefix, None, 10000, "false", contents=contents)
def _put_object(self, object_name, data, content_type='application/octet-stream', metadata={}):
"""
Put an object in this bucket.
Any existing object of the same name will be replaced.
"""
d = self._auth_client.get_auth_info()
def _do_put(auth_info):
request_headers = {
'X-Auth-Token': [auth_info.auth_token],
'Content-Type': [content_type],
}
url = self._make_object_url(auth_info.public_storage_url, object_name)
return self._http_request("OpenStack put object", 'PUT', url, request_headers, data)
d.addCallback(_do_put)
d.addCallback(lambda ign: None)
return d
def _get_object(self, object_name):
"""
Get an object from this container.
"""
d = self._auth_client.get_auth_info()
def _do_get(auth_info):
request_headers = {
'X-Auth-Token': [auth_info.auth_token],
}
url = self._make_object_url(auth_info.public_storage_url, object_name)
return self._http_request("OpenStack get object", 'GET', url, request_headers,
need_response_body=True)
d.addCallback(_do_get)
d.addCallback(lambda (response, body): body)
return d
def _head_object(self, object_name):
"""
Retrieve object metadata only.
"""
d = self._auth_client.get_auth_info()
def _do_head(auth_info):
request_headers = {
'X-Auth-Token': [auth_info.auth_token],
}
url = self._make_object_url(auth_info.public_storage_url, object_name)
return self._http_request("OpenStack head object", 'HEAD', url, request_headers)
d.addCallback(_do_head)
def _got_head_response( (response, body) ):
print response
raise NotImplementedError
d.addCallback(_got_head_response)
return d
def _delete_object(self, object_name):
"""
Delete an object from this container.
Once deleted, there is no method to restore or undelete an object.
"""
d = self._auth_client.get_auth_info()
def _do_delete(auth_info):
request_headers = {
'X-Auth-Token': [auth_info.auth_token],
}
url = self._make_object_url(auth_info.public_storage_url, object_name)
return self._http_request("OpenStack delete object", 'DELETE', url, request_headers)
d.addCallback(_do_delete)
d.addCallback(lambda ign: None)
return d
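# Example configuration (illustrative; the backend selector line is an assumption, since
# it is not read by this module): the username/API-key path above corresponds to a
# tahoe.cfg fragment like
#   [storage]
#   backend = cloud.openstack
#   openstack.provider = rackspace.com
#   openstack.container = tahoe-shares
#   openstack.username = exampleuser
# with the API key kept in private/openstack_api_key. The access-key path instead sets
# openstack.access_key_id and openstack.tenant_id, with the secret stored in
# private/openstack_secret_key.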

View File

@ -0,0 +1,4 @@
from allmydata.storage.backends.cloud.s3.s3_container import configure_s3_container
configure_container = configure_s3_container

View File

@ -0,0 +1,109 @@
from zope.interface import implements
try:
from xml.etree.ElementTree import ParseError
except ImportError:
from elementtree.ElementTree import ParseError
from allmydata.node import InvalidValueError
from allmydata.storage.backends.cloud.cloud_common import IContainer, \
CommonContainerMixin, ContainerListMixin
def configure_s3_container(storedir, config):
accesskeyid = config.get_config("storage", "s3.access_key_id")
secretkey = config.get_or_create_private_config("s3secret")
usertoken = config.get_optional_private_config("s3usertoken")
producttoken = config.get_optional_private_config("s3producttoken")
if producttoken and not usertoken:
raise InvalidValueError("If private/s3producttoken is present, private/s3usertoken must also be present.")
url = config.get_config("storage", "s3.url", "http://s3.amazonaws.com")
container_name = config.get_config("storage", "s3.bucket")
return S3Container(accesskeyid, secretkey, url, container_name, usertoken, producttoken)
class S3Container(ContainerListMixin, CommonContainerMixin):
implements(IContainer)
"""
I represent a real S3 container (bucket), accessed using the txaws library.
"""
def __init__(self, access_key, secret_key, url, container_name, usertoken=None, producttoken=None, override_reactor=None):
CommonContainerMixin.__init__(self, container_name, override_reactor)
# We only depend on txaws when this class is actually instantiated.
from txaws.credentials import AWSCredentials
from txaws.service import AWSServiceEndpoint
from txaws.s3.client import S3Client, Query
from txaws.s3.exception import S3Error
creds = AWSCredentials(access_key=access_key, secret_key=secret_key)
endpoint = AWSServiceEndpoint(uri=url)
query_factory = None
if usertoken is not None:
def make_query(*args, **kwargs):
amz_headers = kwargs.get("amz_headers", {})
if producttoken is not None:
amz_headers["security-token"] = (usertoken, producttoken)
else:
amz_headers["security-token"] = usertoken
kwargs["amz_headers"] = amz_headers
return Query(*args, **kwargs)
query_factory = make_query
self.client = S3Client(creds=creds, endpoint=endpoint, query_factory=query_factory)
self.ServiceError = S3Error
def _create(self):
return self.client.create(self._container_name)
def _delete(self):
return self.client.delete(self._container_name)
def list_some_objects(self, **kwargs):
return self._do_request('list objects', self._list_some_objects, **kwargs)
def _list_some_objects(self, **kwargs):
d = self.client.get_bucket(self._container_name, **kwargs)
def _err(f):
f.trap(ParseError)
raise self.ServiceError("", 500, "list objects: response body is not valid XML (possibly empty)\n" + f)
d.addErrback(_err)
return d
def _put_object(self, object_name, data, content_type='application/octet-stream', metadata={}):
return self.client.put_object(self._container_name, object_name, data, content_type, metadata)
def _get_object(self, object_name):
return self.client.get_object(self._container_name, object_name)
def _head_object(self, object_name):
return self.client.head_object(self._container_name, object_name)
def _delete_object(self, object_name):
return self.client.delete_object(self._container_name, object_name)
def put_policy(self, policy):
"""
Set access control policy on a bucket.
"""
query = self.client.query_factory(
action='PUT', creds=self.client.creds, endpoint=self.client.endpoint,
bucket=self._container_name, object_name='?policy', data=policy)
return self._do_request('PUT policy', query.submit)
def get_policy(self):
query = self.client.query_factory(
action='GET', creds=self.client.creds, endpoint=self.client.endpoint,
bucket=self._container_name, object_name='?policy')
return self._do_request('GET policy', query.submit)
def delete_policy(self):
query = self.client.query_factory(
action='DELETE', creds=self.client.creds, endpoint=self.client.endpoint,
bucket=self._container_name, object_name='?policy')
return self._do_request('DELETE policy', query.submit)
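# Example configuration (illustrative; the backend selector line is an assumption, since
# it is not read by this module): configure_s3_container above corresponds to a tahoe.cfg
# fragment like
#   [storage]
#   backend = cloud.s3
#   s3.bucket = tahoe-shares
#   s3.access_key_id = <access key id>
# with the secret key in private/s3secret. s3.url defaults to http://s3.amazonaws.com, and
# the optional DevPay tokens live in private/s3usertoken and private/s3producttoken
# (a product token is only accepted together with a user token, per the check above).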

View File

@ -0,0 +1,187 @@
import os.path
from twisted.internet import defer
from zope.interface import implements
from allmydata.interfaces import IStorageBackend, IShareSet
from allmydata.util import fileutil, log
from allmydata.storage.common import si_b2a, si_a2b, NUM_RE, CorruptStoredShareError
from allmydata.storage.bucket import BucketWriter
from allmydata.storage.backends.base import Backend, ShareSet
from allmydata.storage.backends.disk.immutable import load_immutable_disk_share, create_immutable_disk_share
from allmydata.storage.backends.disk.mutable import load_mutable_disk_share, create_mutable_disk_share
from allmydata.mutable.layout import MUTABLE_MAGIC
# storage/
# storage/shares/incoming
# incoming/ holds temp dirs named $PREFIX/$STORAGEINDEX/$SHNUM which will
# be moved to storage/shares/$PREFIX/$STORAGEINDEX/$SHNUM upon success
# storage/shares/$PREFIX/$STORAGEINDEX
# storage/shares/$PREFIX/$STORAGEINDEX/$SHNUM
# where "$PREFIX" denotes the first 10 bits worth of $STORAGEINDEX (that's 2
# base-32 chars).
def si_si2dir(startdir, storage_index):
sia = si_b2a(storage_index)
return os.path.join(startdir, sia[:2], sia)
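# For example, a storage index whose base-32 form starts with "ab" (illustrative) is
# placed under <sharedir>/ab/<full base-32 SI>/, with one file per share number inside.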
def get_disk_share(home, storage_index=None, shnum=None):
f = open(home, 'rb')
try:
prefix = f.read(len(MUTABLE_MAGIC))
finally:
f.close()
if prefix == MUTABLE_MAGIC:
return load_mutable_disk_share(home, storage_index, shnum)
else:
# assume it's immutable
return load_immutable_disk_share(home, storage_index, shnum)
def configure_disk_backend(storedir, config):
readonly = config.get_config("storage", "readonly", False, boolean=True)
reserved_space = config.get_config_size("storage", "reserved_space", "0")
return DiskBackend(storedir, readonly, reserved_space)
class DiskBackend(Backend):
implements(IStorageBackend)
def __init__(self, storedir, readonly=False, reserved_space=0):
Backend.__init__(self)
self._storedir = storedir
self._readonly = readonly
self._reserved_space = reserved_space
self._sharedir = os.path.join(self._storedir, 'shares')
fileutil.make_dirs(self._sharedir)
self._incomingdir = os.path.join(self._sharedir, 'incoming')
self._clean_incomplete()
if self._reserved_space and (self.get_available_space() is None):
log.msg("warning: [storage]reserved_space= is set, but this platform does not support an API to get disk statistics (statvfs(2) or GetDiskFreeSpaceEx), so this reservation cannot be honored",
umid="0wZ27w", level=log.UNUSUAL)
def _clean_incomplete(self):
fileutil.rm_dir(self._incomingdir)
fileutil.make_dirs(self._incomingdir)
def get_sharesets_for_prefix(self, prefix):
prefixdir = os.path.join(self._sharedir, prefix)
sharesets = [self.get_shareset(si_a2b(si_s))
for si_s in sorted(fileutil.listdir(prefixdir))]
return defer.succeed(sharesets)
def get_shareset(self, storage_index):
sharehomedir = si_si2dir(self._sharedir, storage_index)
incominghomedir = si_si2dir(self._incomingdir, storage_index)
return DiskShareSet(storage_index, self._get_lock(storage_index), sharehomedir, incominghomedir)
def fill_in_space_stats(self, stats):
stats['storage_server.reserved_space'] = self._reserved_space
try:
disk = fileutil.get_disk_stats(self._sharedir, self._reserved_space)
writeable = disk['avail'] > 0
# spacetime predictors should use disk_avail / (d(disk_used)/dt)
stats['storage_server.disk_total'] = disk['total']
stats['storage_server.disk_used'] = disk['used']
stats['storage_server.disk_free_for_root'] = disk['free_for_root']
stats['storage_server.disk_free_for_nonroot'] = disk['free_for_nonroot']
stats['storage_server.disk_avail'] = disk['avail']
except AttributeError:
writeable = True
except EnvironmentError:
log.msg("OS call to get disk statistics failed", level=log.UNUSUAL)
writeable = False
if self._readonly:
stats['storage_server.disk_avail'] = 0
writeable = False
stats['storage_server.accepting_immutable_shares'] = int(writeable)
def get_available_space(self):
if self._readonly:
return 0
try:
return fileutil.get_available_space(self._sharedir, self._reserved_space)
except EnvironmentError:
return 0
def must_use_tubid_as_permutation_seed(self):
# A disk backend with existing shares must assume that it was around before #466,
# so must use its TubID as a permutation-seed.
return bool(set(fileutil.listdir(self._sharedir)) - set(["incoming"]))
def list_container(self, prefix=''):
def _not_implemented():
raise NotImplementedError("the disk backend does not support listing container contents.\n" +
"Use 'tahoe debug catalog-shares' instead.")
return defer.execute(_not_implemented)
class DiskShareSet(ShareSet):
implements(IShareSet)
def __init__(self, storage_index, lock, sharehomedir, incominghomedir=None):
ShareSet.__init__(self, storage_index, lock)
self._sharehomedir = sharehomedir
self._incominghomedir = incominghomedir
def get_overhead(self):
return (fileutil.get_used_space(self._sharehomedir) +
fileutil.get_used_space(self._incominghomedir))
def _locked_get_shares(self):
si = self.get_storage_index()
shares = {}
corrupted = set()
for shnumstr in fileutil.listdir(self._sharehomedir, filter=NUM_RE):
shnum = int(shnumstr)
sharefile = os.path.join(self._sharehomedir, shnumstr)
try:
shares[shnum] = get_disk_share(sharefile, si, shnum)
except CorruptStoredShareError:
corrupted.add(shnum)
valid = [shares[shnum] for shnum in sorted(shares.keys())]
return defer.succeed( (valid, corrupted) )
def _locked_get_share(self, shnum):
return get_disk_share(os.path.join(self._sharehomedir, str(shnum)),
self.get_storage_index(), shnum)
def _locked_delete_share(self, shnum):
fileutil.remove(os.path.join(self._sharehomedir, str(shnum)))
return defer.succeed(None)
def has_incoming(self, shnum):
if self._incominghomedir is None:
return False
return os.path.exists(os.path.join(self._incominghomedir, str(shnum)))
def make_bucket_writer(self, account, shnum, allocated_data_length, canary):
finalhome = os.path.join(self._sharehomedir, str(shnum))
incominghome = os.path.join(self._incominghomedir, str(shnum))
immsh = create_immutable_disk_share(incominghome, finalhome, allocated_data_length,
self.get_storage_index(), shnum)
bw = BucketWriter(account, immsh, canary)
return bw
def _create_mutable_share(self, account, shnum, write_enabler):
fileutil.make_dirs(self._sharehomedir)
sharehome = os.path.join(self._sharehomedir, str(shnum))
serverid = account.server.get_serverid()
return create_mutable_disk_share(sharehome, serverid, write_enabler,
self.get_storage_index(), shnum, parent=account.server)
def _clean_up_after_unlink(self):
fileutil.rmdir_if_empty(self._sharehomedir)
def _get_sharedir(self):
return self._sharehomedir

View File

@ -0,0 +1,211 @@
import os, os.path, struct
from twisted.internet import defer
from zope.interface import implements
from allmydata.interfaces import IShareForReading, IShareForWriting
from allmydata.util import fileutil
from allmydata.util.assertutil import precondition, _assert
from allmydata.storage.common import si_b2a, CorruptStoredShareError, UnknownImmutableContainerVersionError, \
DataTooLargeError
# Each share file (in storage/shares/$PREFIX/$STORAGEINDEX/$SHNUM) contains
# share data that can be accessed by RIBucketWriter.write and RIBucketReader.read .
# The share file has the following layout:
# 0x00: share file version number, four bytes, current version is 1
# 0x04: share data length, four bytes big-endian # Footnote 1
# 0x08: number of leases, four bytes big-endian = N # Footnote 2
# 0x0c: beginning of share data (see immutable.layout.WriteBucketProxy)
# filesize - 72*N: leases (ignored). Each lease is 72 bytes.
# Footnote 1: as of Tahoe v1.3.0 this field is not used by storage servers.
# Footnote 2: as of Tahoe v1.12.0 this field is not used by storage servers.
# New shares will have a 0 here. Old shares will have whatever value was left
# over when the server was upgraded. All lease information is now kept in the
# leasedb.
class ImmutableDiskShare(object):
implements(IShareForReading, IShareForWriting)
sharetype = "immutable"
LEASE_SIZE = struct.calcsize(">L32s32sL")
HEADER = ">LLL"
HEADER_SIZE = struct.calcsize(HEADER)
DATA_OFFSET = HEADER_SIZE
def __init__(self, home, storage_index, shnum, finalhome=None, allocated_data_length=None):
"""
If allocated_data_length is not None then I won't allow more than allocated_data_length
to be written to me.
If finalhome is not None (meaning that we are creating the share) then allocated_data_length
must not be None.
Clients should use the load_immutable_disk_share and create_immutable_disk_share
factory functions rather than creating instances directly.
"""
precondition((allocated_data_length is not None) or (finalhome is None),
allocated_data_length=allocated_data_length, finalhome=finalhome)
self._storage_index = storage_index
self._allocated_data_length = allocated_data_length
# If we are creating the share, _finalhome refers to the final path and
# _home to the incoming path. Otherwise, _finalhome is None.
self._finalhome = finalhome
self._home = home
self._shnum = shnum
if self._finalhome is not None:
# Touch the file, so later callers will see that we're working on
# it. Also construct the metadata.
_assert(not os.path.exists(self._finalhome), finalhome=self._finalhome)
fileutil.make_dirs(os.path.dirname(self._home))
# The second field -- the four-byte share data length -- is no
# longer used as of Tahoe v1.3.0, but we continue to write it in
# there in case someone downgrades a storage server from >=
# Tahoe-1.3.0 to < Tahoe-1.3.0, or moves a share file from one
# server to another, etc. We do saturation -- a share data length
# larger than 2**32-1 (what can fit into the field) is marked as
# the largest length that can fit into the field. That way, even
# if this does happen, the old < v1.3.0 server will still allow
# clients to read the first part of the share.
fileutil.write(self._home, struct.pack(self.HEADER, 1, min(2**32-1, allocated_data_length), 0))
self._data_length = allocated_data_length
else:
f = open(self._home, 'rb')
try:
(version, unused, num_leases) = struct.unpack(self.HEADER, f.read(self.HEADER_SIZE))
except struct.error, e:
raise CorruptStoredShareError(shnum, "invalid immutable share header for shnum %d: %s" % (shnum, e))
finally:
f.close()
if version != 1:
msg = "sharefile %r had version %d but we wanted 1" % (self._home, version)
raise UnknownImmutableContainerVersionError(shnum, msg)
filesize = os.stat(self._home).st_size
self._data_length = filesize - self.DATA_OFFSET - (num_leases * self.LEASE_SIZE)
if self._data_length < 0:
raise CorruptStoredShareError(shnum, "calculated data length for shnum %d is %d" % (shnum, self._data_length))
def __repr__(self):
return ("<ImmutableDiskShare %s:%r at %r>"
% (si_b2a(self._storage_index or ""), self._shnum, self._home))
def close(self):
fileutil.make_dirs(os.path.dirname(self._finalhome))
fileutil.move_into_place(self._home, self._finalhome)
# self._home is like storage/shares/incoming/ab/abcde/4 .
# We try to delete the parent (.../ab/abcde) to avoid leaving
# these directories lying around forever, but the delete might
# fail if we're working on another share for the same storage
# index (like ab/abcde/5). The alternative approach would be to
# use a hierarchy of objects (PrefixHolder, BucketHolder,
# ShareWriter), each of which is responsible for a single
# directory on disk, and have them use reference counting of
# their children to know when they should do the rmdir. This
# approach is simpler, but relies on os.rmdir (used by
# rmdir_if_empty) refusing to delete a non-empty directory.
# Do *not* use fileutil.remove() here!
parent = os.path.dirname(self._home)
fileutil.rmdir_if_empty(parent)
# we also delete the grandparent (prefix) directory, .../ab ,
# again to avoid leaving directories lying around. This might
# fail if there is another bucket open that shares a prefix (like
# ab/abfff).
fileutil.rmdir_if_empty(os.path.dirname(parent))
# we leave the great-grandparent (incoming/) directory in place.
self._home = self._finalhome
self._finalhome = None
return defer.succeed(None)
def get_used_space(self):
return (fileutil.get_used_space(self._finalhome) +
fileutil.get_used_space(self._home))
def get_storage_index(self):
return self._storage_index
def get_storage_index_string(self):
return si_b2a(self._storage_index)
def get_shnum(self):
return self._shnum
def unlink(self):
fileutil.remove(self._home)
return defer.succeed(None)
def get_allocated_data_length(self):
return self._allocated_data_length
def get_size(self):
return os.stat(self._home).st_size
def get_data_length(self):
return self._data_length
def readv(self, readv):
datav = []
f = open(self._home, 'rb')
try:
for (offset, length) in readv:
datav.append(self._read_share_data(f, offset, length))
finally:
f.close()
return defer.succeed(datav)
def _get_path(self):
return self._home
def _read_share_data(self, f, offset, length):
precondition(offset >= 0)
# Reads beyond the end of the data are truncated. Reads that start
# beyond the end of the data return an empty string.
seekpos = self.DATA_OFFSET + offset
actuallength = max(0, min(length, self._data_length - offset))
if actuallength == 0:
return ""
f.seek(seekpos)
return f.read(actuallength)
def read_share_data(self, offset, length):
f = open(self._home, 'rb')
try:
return defer.succeed(self._read_share_data(f, offset, length))
finally:
f.close()
def write_share_data(self, offset, data):
length = len(data)
precondition(offset >= 0, offset)
if self._allocated_data_length is not None and offset+length > self._allocated_data_length:
raise DataTooLargeError(self._shnum, self._allocated_data_length, offset, length)
f = open(self._home, 'rb+')
try:
real_offset = self.DATA_OFFSET + offset
f.seek(real_offset)
_assert(f.tell() == real_offset)
f.write(data)
return defer.succeed(None)
finally:
f.close()
def load_immutable_disk_share(home, storage_index=None, shnum=None):
return ImmutableDiskShare(home, storage_index=storage_index, shnum=shnum)
def create_immutable_disk_share(home, finalhome, allocated_data_length, storage_index=None, shnum=None):
return ImmutableDiskShare(home, finalhome=finalhome, allocated_data_length=allocated_data_length,
storage_index=storage_index, shnum=shnum)
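As a standalone illustration (not part of the module above, and relying only on the header layout documented at the top of this file), the fixed 12-byte header of an immutable share file can be decoded like this:

import struct

def read_immutable_header(path):
    # Returns (version, legacy_data_length, legacy_num_leases) for a share file laid
    # out as described above; version is expected to be 1.
    with open(path, 'rb') as f:
        return struct.unpack(">LLL", f.read(struct.calcsize(">LLL")))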

View File

@ -0,0 +1,290 @@
import os, struct
from twisted.internet import defer
from zope.interface import implements
from allmydata.interfaces import IMutableShare, BadWriteEnablerError
from allmydata.util import fileutil, idlib, log
from allmydata.util.assertutil import precondition, _assert
from allmydata.util.hashutil import timing_safe_compare
from allmydata.storage.common import si_b2a, CorruptStoredShareError, UnknownMutableContainerVersionError, \
DataTooLargeError
from allmydata.storage.backends.base import testv_compare
from allmydata.mutable.layout import MUTABLE_MAGIC, MAX_MUTABLE_SHARE_SIZE
# MutableDiskShare is like ImmutableDiskShare, but used for mutable data.
# Mutable shares have a different layout. See docs/mutable.rst for more details.
# #   offset   size    name
# 1   0        32      magic verstr "tahoe mutable container v1" plus binary
# 2   32       20      write enabler's nodeid
# 3   52       32      write enabler
# 4   84       8       data size (actual share data present) (a)
# 5   92       8       offset of (8) count of extra leases (after data)
# 6   100      368     four leases, 92 bytes each (ignored)
# 7   468      (a)     data
# 8   ??       4       count of extra leases
# 9   ??       n*92    extra leases (ignored)
# The struct module doc says that L's are 4 bytes in size, and that Q's are
# 8 bytes in size. Since compatibility depends upon this, double-check it.
assert struct.calcsize(">L") == 4, struct.calcsize(">L")
assert struct.calcsize(">Q") == 8, struct.calcsize(">Q")
class MutableDiskShare(object):
implements(IMutableShare)
sharetype = "mutable"
DATA_LENGTH_OFFSET = struct.calcsize(">32s20s32s")
EXTRA_LEASE_COUNT_OFFSET = DATA_LENGTH_OFFSET + 8
HEADER = ">32s20s32sQQ"
HEADER_SIZE = struct.calcsize(HEADER) # doesn't include leases
LEASE_SIZE = struct.calcsize(">LL32s32s20s")
assert LEASE_SIZE == 92, LEASE_SIZE
DATA_OFFSET = HEADER_SIZE + 4*LEASE_SIZE
assert DATA_OFFSET == 468, DATA_OFFSET
MAGIC = MUTABLE_MAGIC
assert len(MAGIC) == 32
MAX_SIZE = MAX_MUTABLE_SHARE_SIZE
def __init__(self, home, storage_index, shnum, parent=None):
"""
Clients should use the load_mutable_disk_share and create_mutable_disk_share
factory functions rather than creating instances directly.
"""
self._storage_index = storage_index
self._shnum = shnum
self._home = home
if os.path.exists(self._home):
# we don't cache anything, just check the magic
f = open(self._home, 'rb')
try:
data = f.read(self.HEADER_SIZE)
(magic,
_write_enabler_nodeid, _write_enabler,
_data_length, _extra_lease_count_offset) = struct.unpack(self.HEADER, data)
if magic != self.MAGIC:
msg = "sharefile %r had magic '%r' but we wanted '%r'" % \
(self._home, magic, self.MAGIC)
raise UnknownMutableContainerVersionError(shnum, msg)
except struct.error, e:
raise CorruptStoredShareError(shnum, "invalid mutable share header for shnum %d: %s" % (shnum, e))
finally:
f.close()
self.parent = parent # for logging
def log(self, *args, **kwargs):
if self.parent:
return self.parent.log(*args, **kwargs)
def create(self, serverid, write_enabler):
_assert(not os.path.exists(self._home), "%r already exists and should not" % (self._home,))
data_length = 0
extra_lease_count_offset = (self.HEADER_SIZE
+ 4 * self.LEASE_SIZE
+ data_length)
assert extra_lease_count_offset == self.DATA_OFFSET # true at creation
num_extra_leases = 0
f = open(self._home, 'wb')
try:
header = struct.pack(self.HEADER,
self.MAGIC, serverid, write_enabler,
data_length, extra_lease_count_offset,
)
leases = ("\x00"*self.LEASE_SIZE) * 4
f.write(header + leases)
# data goes here, empty after creation
f.write(struct.pack(">L", num_extra_leases))
# extra leases go here, none at creation
finally:
f.close()
return self
def __repr__(self):
return ("<MutableDiskShare %s:%r at %r>"
% (si_b2a(self._storage_index or ""), self._shnum, self._home))
def get_size(self):
return os.stat(self._home).st_size
def get_data_length(self):
f = open(self._home, 'rb')
try:
data_length = self._read_data_length(f)
finally:
f.close()
return data_length
def get_used_space(self):
return fileutil.get_used_space(self._home)
def get_storage_index(self):
return self._storage_index
def get_storage_index_string(self):
return si_b2a(self._storage_index)
def get_shnum(self):
return self._shnum
def unlink(self):
fileutil.remove(self._home)
return defer.succeed(None)
def _get_path(self):
return self._home
@classmethod
def _read_data_length(cls, f):
f.seek(cls.DATA_LENGTH_OFFSET)
(data_length,) = struct.unpack(">Q", f.read(8))
return data_length
@classmethod
def _read_container_size(cls, f):
f.seek(cls.EXTRA_LEASE_COUNT_OFFSET)
(extra_lease_count_offset,) = struct.unpack(">Q", f.read(8))
return extra_lease_count_offset - cls.DATA_OFFSET
@classmethod
def _write_data_length(cls, f, data_length):
extra_lease_count_offset = cls.DATA_OFFSET + data_length
f.seek(cls.DATA_LENGTH_OFFSET)
f.write(struct.pack(">QQ", data_length, extra_lease_count_offset))
f.seek(extra_lease_count_offset)
f.write(struct.pack(">L", 0))
def _read_share_data(self, f, offset, length):
precondition(offset >= 0, offset=offset)
data_length = self._read_data_length(f)
if offset + length > data_length:
# reads beyond the end of the data are truncated. Reads that
# start beyond the end of the data return an empty string.
length = max(0, data_length - offset)
if length == 0:
return ""
precondition(offset + length <= data_length)
f.seek(self.DATA_OFFSET+offset)
data = f.read(length)
return data
def _write_share_data(self, f, offset, data):
length = len(data)
precondition(offset >= 0, offset=offset)
if offset + length > self.MAX_SIZE:
raise DataTooLargeError(self._shnum, self.MAX_SIZE, offset, length)
data_length = self._read_data_length(f)
if offset+length >= data_length:
# They are expanding their data size. We must write
# their new data and modify the recorded data size.
# Fill any newly exposed empty space with 0's.
if offset > data_length:
f.seek(self.DATA_OFFSET + data_length)
f.write('\x00'*(offset - data_length))
f.flush()
new_data_length = offset + length
self._write_data_length(f, new_data_length)
# an interrupt here will result in a corrupted share
# now all that's left to do is write out their data
f.seek(self.DATA_OFFSET + offset)
f.write(data)
return
@classmethod
def _read_write_enabler_and_nodeid(cls, f):
f.seek(0)
data = f.read(cls.HEADER_SIZE)
(magic,
write_enabler_nodeid, write_enabler,
_data_length, _extra_lease_count_offset) = struct.unpack(cls.HEADER, data)
assert magic == cls.MAGIC
return (write_enabler, write_enabler_nodeid)
def readv(self, readv):
datav = []
f = open(self._home, 'rb')
try:
for (offset, length) in readv:
datav.append(self._read_share_data(f, offset, length))
finally:
f.close()
return defer.succeed(datav)
def check_write_enabler(self, write_enabler):
f = open(self._home, 'rb+')
try:
(real_write_enabler, write_enabler_nodeid) = self._read_write_enabler_and_nodeid(f)
finally:
f.close()
# avoid a timing attack
if not timing_safe_compare(write_enabler, real_write_enabler):
# accommodate share migration by reporting the nodeid used for the
# old write enabler.
def _bad_write_enabler():
nodeid_s = idlib.nodeid_b2a(write_enabler_nodeid)
self.log(format="bad write enabler on SI %(si)s,"
" recorded by nodeid %(nodeid)s",
facility="tahoe.storage",
level=log.WEIRD, umid="cE1eBQ",
si=self.get_storage_index_string(),
nodeid=nodeid_s)
raise BadWriteEnablerError("The write enabler was recorded by nodeid '%s'."
% (nodeid_s,))
return defer.execute(_bad_write_enabler)
return defer.succeed(None)
def check_testv(self, testv):
test_good = True
f = open(self._home, 'rb+')
try:
for (offset, length, operator, specimen) in testv:
data = self._read_share_data(f, offset, length)
if not testv_compare(data, operator, specimen):
test_good = False
break
finally:
f.close()
return defer.succeed(test_good)
def writev(self, datav, new_length):
precondition(new_length is None or new_length >= 0, new_length=new_length)
for (offset, data) in datav:
precondition(offset >= 0, offset=offset)
if offset + len(data) > self.MAX_SIZE:
raise DataTooLargeError(self._shnum, self.MAX_SIZE, offset, len(data))
f = open(self._home, 'rb+')
try:
for (offset, data) in datav:
self._write_share_data(f, offset, data)
if new_length is not None:
cur_length = self._read_data_length(f)
if new_length < cur_length:
self._write_data_length(f, new_length)
# TODO: shrink the share file.
finally:
f.close()
return defer.succeed(None)
def close(self):
return defer.succeed(None)
def load_mutable_disk_share(home, storage_index=None, shnum=None, parent=None):
return MutableDiskShare(home, storage_index, shnum, parent)
def create_mutable_disk_share(home, serverid, write_enabler, storage_index=None, shnum=None, parent=None):
ms = MutableDiskShare(home, storage_index, shnum, parent)
return ms.create(serverid, write_enabler)
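The offsets in the layout table at the top of this file can be re-derived from the struct formats alone. A standalone sanity check, in the same spirit as the module-level asserts above:

import struct

assert struct.calcsize(">32s20s32s") == 84                # DATA_LENGTH_OFFSET (field 4)
assert struct.calcsize(">32s20s32sQ") == 92               # EXTRA_LEASE_COUNT_OFFSET (field 5)
assert struct.calcsize(">32s20s32sQQ") + 4*struct.calcsize(">LL32s32s20s") == 468  # DATA_OFFSET (field 7)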

View File

@ -0,0 +1,203 @@
from twisted.internet import defer
from zope.interface import implements
from allmydata.interfaces import IStorageBackend, IShareSet, IShareBase, \
IShareForReading, IShareForWriting, IMutableShare
from allmydata.util.assertutil import precondition
from allmydata.util.listutil import concat
from allmydata.storage.backends.base import Backend, ShareSet, empty_check_testv
from allmydata.storage.bucket import BucketWriter
from allmydata.storage.common import si_b2a
from allmydata.storage.backends.base import ContainerItem
def configure_null_backend(storedir, config):
return NullBackend()
class NullBackend(Backend):
implements(IStorageBackend)
"""
I am a test backend that records (in memory) which shares exist, but not their contents, leases,
or write-enablers.
"""
def __init__(self):
Backend.__init__(self)
# mapping from storage_index to NullShareSet
self._sharesets = {}
def get_available_space(self):
return None
def get_sharesets_for_prefix(self, prefix):
sharesets = []
for (si, shareset) in self._sharesets.iteritems():
if si_b2a(si).startswith(prefix):
sharesets.append(shareset)
def _by_base32si(b):
return b.get_storage_index_string()
sharesets.sort(key=_by_base32si)
return defer.succeed(sharesets)
def get_shareset(self, storage_index):
shareset = self._sharesets.get(storage_index, None)
if shareset is None:
shareset = NullShareSet(storage_index, self._get_lock(storage_index))
self._sharesets[storage_index] = shareset
return shareset
def fill_in_space_stats(self, stats):
pass
def list_container(self, prefix=''):
# get_sharesets_for_prefix returns a Deferred, so unwrap it before iterating.
d = self.get_sharesets_for_prefix(prefix)
d.addCallback(lambda sharesets: concat([s._list_items() for s in sharesets]))
return d
class NullShareSet(ShareSet):
implements(IShareSet)
def __init__(self, storage_index, lock):
ShareSet.__init__(self, storage_index, lock)
self._incoming_shnums = set()
self._immutable_shnums = set()
self._mutable_shnums = set()
def close_shnum(self, shnum):
self._incoming_shnums.remove(shnum)
self._immutable_shnums.add(shnum)
return defer.succeed(None)
def get_overhead(self):
return 0
def _list_items(self):
sistr = si_b2a(self.storage_index)
return [ContainerItem("shares/%s/%s/%d" % (sistr[:2], sistr, shnum), None, "", 0, "STANDARD", None)
for shnum in set.union(self._immutable_shnums, self._mutable_shnums)]
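# For example, share 3 of a storage index whose base-32 form starts with "ab"
# (illustrative) is listed under the key "shares/ab/<full base-32 SI>/3", matching the
# layout described for the disk backend.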
def _locked_get_shares(self):
shares = {}
for shnum in self._immutable_shnums:
shares[shnum] = ImmutableNullShare(self, shnum)
for shnum in self._mutable_shnums:
shares[shnum] = MutableNullShare(self, shnum)
# This backend never has any corrupt shares.
return defer.succeed( ([shares[shnum] for shnum in sorted(shares.keys())], set()) )
def _locked_get_share(self, shnum):
if shnum in self._immutable_shnums:
return defer.succeed(ImmutableNullShare(self, shnum))
elif shnum in self._mutable_shnums:
return defer.succeed(MutableNullShare(self, shnum))
else:
def _not_found(): raise IndexError("no such share %d" % (shnum,))
return defer.execute(_not_found)
def _locked_delete_share(self, shnum, include_incoming=False):
if include_incoming and (shnum in self._incoming_shnums):
self._incoming_shnums.remove(shnum)
if shnum in self._immutable_shnums:
self._immutable_shnums.remove(shnum)
if shnum in self._mutable_shnums:
self._mutable_shnums.remove(shnum)
return defer.succeed(None)
def has_incoming(self, shnum):
return shnum in self._incoming_shnums
def get_storage_index(self):
return self.storage_index
def get_storage_index_string(self):
return si_b2a(self.storage_index)
def make_bucket_writer(self, account, shnum, allocated_data_length, canary):
self._incoming_shnums.add(shnum)
immutableshare = ImmutableNullShare(self, shnum)
bw = BucketWriter(account, immutableshare, canary)
bw.throw_out_all_data = True
return bw
class NullShareBase(object):
implements(IShareBase)
def __init__(self, shareset, shnum):
self.shareset = shareset
self.shnum = shnum
def get_storage_index(self):
return self.shareset.get_storage_index()
def get_storage_index_string(self):
return self.shareset.get_storage_index_string()
def get_shnum(self):
return self.shnum
def get_data_length(self):
return 0
def get_size(self):
return 0
def get_used_space(self):
return 0
def unlink(self):
return self.shareset.delete_share(self.shnum, include_incoming=True)
def readv(self, readv):
datav = []
for (offset, length) in readv:
datav.append("")
return defer.succeed(datav)
def get_leases(self):
pass
def add_lease(self, lease):
pass
def renew_lease(self, renew_secret, new_expire_time):
raise IndexError("unable to renew non-existent lease")
def add_or_renew_lease(self, lease_info):
pass
class ImmutableNullShare(NullShareBase):
implements(IShareForReading, IShareForWriting)
sharetype = "immutable"
def read_share_data(self, offset, length):
precondition(offset >= 0)
return defer.succeed("")
def get_allocated_data_length(self):
return 0
def write_share_data(self, offset, data):
return defer.succeed(None)
def close(self):
return self.shareset.close_shnum(self.shnum)
class MutableNullShare(NullShareBase):
implements(IMutableShare)
sharetype = "mutable"
def check_write_enabler(self, write_enabler):
# Null backend doesn't check write enablers.
return defer.succeed(None)
def check_testv(self, testv):
return defer.succeed(empty_check_testv(testv))
def writev(self, datav, new_length):
return defer.succeed(None)

View File

@ -0,0 +1,131 @@
import time
from foolscap.api import Referenceable
from twisted.internet import defer
from zope.interface import implements
from allmydata.interfaces import RIBucketWriter, RIBucketReader
from allmydata.util import base32, log
from allmydata.util.assertutil import precondition
from allmydata.storage.leasedb import SHARETYPE_IMMUTABLE
class BucketWriter(Referenceable):
implements(RIBucketWriter)
def __init__(self, account, share, canary):
self.ss = account.server
self._account = account
self._share = share
self._canary = canary
self._disconnect_marker = canary.notifyOnDisconnect(self._disconnected)
self.closed = False
self.throw_out_all_data = False
self._account.add_share(share.get_storage_index(), share.get_shnum(),
share.get_allocated_data_length(), SHARETYPE_IMMUTABLE)
def allocated_size(self):
return self._share.get_allocated_data_length()
def _add_latency(self, res, name, start):
self.ss.add_latency(name, time.time() - start)
self.ss.count(name)
return res
def remote_write(self, offset, data):
start = time.time()
precondition(not self.closed)
if self.throw_out_all_data:
return defer.succeed(None)
d = self._share.write_share_data(offset, data)
d.addBoth(self._add_latency, "write", start)
return d
def remote_close(self):
precondition(not self.closed)
start = time.time()
d = defer.succeed(None)
d.addCallback(lambda ign: self._share.close())
d.addCallback(lambda ign: self._share.get_used_space())
def _got_used_space(used_space):
storage_index = self._share.get_storage_index()
shnum = self._share.get_shnum()
self._share = None
self.closed = True
self._canary.dontNotifyOnDisconnect(self._disconnect_marker)
self.ss.bucket_writer_closed(self, used_space)
self._account.add_or_renew_default_lease(storage_index, shnum)
self._account.mark_share_as_stable(storage_index, shnum, used_space)
d.addCallback(_got_used_space)
d.addBoth(self._add_latency, "close", start)
return d
def _disconnected(self):
if not self.closed:
return self._abort()
return defer.succeed(None)
def remote_abort(self):
log.msg("storage: aborting write to share %r" % self._share,
facility="tahoe.storage", level=log.UNUSUAL)
if not self.closed:
self._canary.dontNotifyOnDisconnect(self._disconnect_marker)
d = self._abort()
def _count(ign):
self.ss.count("abort")
d.addBoth(_count)
return d
def _abort(self):
d = defer.succeed(None)
if self.closed:
return d
d.addCallback(lambda ign: self._share.unlink())
def _unlinked(ign):
self._share = None
# We are now considered closed for further writing. We must tell
# the storage server about this so that it stops expecting us to
# use the space it allocated for us earlier.
self.closed = True
self.ss.bucket_writer_closed(self, 0)
d.addCallback(_unlinked)
return d
class BucketReader(Referenceable):
implements(RIBucketReader)
def __init__(self, account, share):
self.ss = account.server
self._account = account
self._share = share
self.storage_index = share.get_storage_index()
self.shnum = share.get_shnum()
def __repr__(self):
return "<%s %s %s>" % (self.__class__.__name__,
base32.b2a_l(self.storage_index[:8], 60),
self.shnum)
def _add_latency(self, res, name, start):
self.ss.add_latency(name, time.time() - start)
self.ss.count(name)
return res
def remote_read(self, offset, length):
start = time.time()
d = self._share.read_share_data(offset, length)
d.addBoth(self._add_latency, "read", start)
return d
def remote_advise_corrupt_share(self, reason):
return self._account.remote_advise_corrupt_share("immutable",
self.storage_index,
self.shnum,
reason)
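The writer above charges space to the account when it is constructed, and then either marks the share stable (remote_close) or releases it (remote_abort / _abort). The sketch below drives that sequence end to end with stand-in account, share, and canary objects; the three Fake* classes and the allmydata.storage.bucket import path are assumptions made only so the example runs, not part of this branch.

from twisted.internet import defer
from allmydata.storage.bucket import BucketWriter   # module path assumed; file names are not shown on this page

class FakeCanary(object):
    # Stands in for the foolscap canary; BucketWriter only registers/unregisters a callback.
    def notifyOnDisconnect(self, cb):
        return object()
    def dontNotifyOnDisconnect(self, marker):
        pass

class FakeServer(object):
    # Receives the latency/accounting calls BucketWriter makes via account.server.
    def add_latency(self, name, elapsed): pass
    def count(self, name): pass
    def bucket_writer_closed(self, bw, used_space):
        print "bucket_writer_closed, used_space =", used_space

class FakeShare(object):
    # Minimal stub with just the share methods BucketWriter calls.
    def get_storage_index(self): return "\x00" * 16
    def get_shnum(self): return 0
    def get_allocated_data_length(self): return 1000
    def get_used_space(self): return 1000
    def write_share_data(self, offset, data): return defer.succeed(None)
    def close(self): return defer.succeed(None)
    def unlink(self): return defer.succeed(None)

class FakeAccount(object):
    server = FakeServer()
    def add_share(self, si, shnum, length, sharetype): pass
    def add_or_renew_default_lease(self, si, shnum): pass
    def mark_share_as_stable(self, si, shnum, used_space): pass

bw = BucketWriter(FakeAccount(), FakeShare(), FakeCanary())   # charges the allocation to the account
d = bw.remote_write(0, "some share data")
d.addCallback(lambda ign: bw.remote_close())                  # adds/renews the lease and marks the share stable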


@ -1,12 +1,39 @@
import os.path
import os, re
from allmydata.util import base32
# Share numbers match this regex:
NUM_RE=re.compile("^[0-9]+$")
PREFIX = re.compile("^[%s]{2}$" % (base32.z_base_32_alphabet,))
class DataTooLargeError(Exception):
def __init__(self, shnum, allocated_data_length, offset, length):
self.shnum = shnum
self.allocated_data_length = allocated_data_length
self.offset = offset
self.length = length
def __str__(self):
return ("attempted write to shnum %d of %d bytes at offset %d exceeds allocated data length of %d bytes"
% (self.__class__.__name__, self.shnum, self.length, self.offset, self.allocated_data_length))
class CorruptStoredShareError(Exception):
def __init__(self, shnum, *rest):
Exception.__init__(self, shnum, *rest)
self.shnum = shnum
class UnknownContainerVersionError(CorruptStoredShareError):
pass
class UnknownMutableContainerVersionError(Exception):
class UnknownMutableContainerVersionError(UnknownContainerVersionError):
pass
class UnknownImmutableContainerVersionError(Exception):
class UnknownImmutableContainerVersionError(UnknownContainerVersionError):
pass
@ -16,6 +43,10 @@ def si_b2a(storageindex):
def si_a2b(ascii_storageindex):
return base32.a2b(ascii_storageindex)
def storage_index_to_prefix(storageindex):
sia = si_b2a(storageindex)
return sia[:2]
def storage_index_to_dir(storageindex):
sia = si_b2a(storageindex)
return os.path.join(sia[:2], sia)
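As a small illustration of the helpers above (assuming this branch's source tree is importable), the share layout is keyed by the first two z-base-32 characters of the encoded storage index; the 16-byte storage index value used here is arbitrary:

from allmydata.storage.common import si_b2a, storage_index_to_prefix, storage_index_to_dir

storage_index = "\x00" * 16                        # storage indexes are 16-byte strings
sistr = si_b2a(storage_index)                      # 26-character z-base-32 encoding
prefix = storage_index_to_prefix(storage_index)    # the first two characters of sistr
reldir = storage_index_to_dir(storage_index)       # os.path.join(prefix, sistr)

assert prefix == sistr[:2]
assert reldir.endswith(sistr)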


@ -1,82 +1,100 @@
import os, time, struct
import time, struct
import cPickle as pickle
from twisted.internet import reactor
from twisted.internet import defer, reactor
from twisted.application import service
from allmydata.interfaces import IStorageBackend
from allmydata.storage.common import si_b2a
from allmydata.util import fileutil
from allmydata.util.assertutil import precondition
from allmydata.util.deferredutil import HookMixin, async_iterate
class TimeSliceExceeded(Exception):
pass
class ShareCrawler(service.MultiService):
"""A ShareCrawler subclass is attached to a StorageServer, and
periodically walks all of its shares, processing each one in some
fashion. This crawl is rate-limited, to reduce the IO burden on the host,
since large servers can easily have a terabyte of shares, in several
million files, which can take hours or days to read.
class ShareCrawler(HookMixin, service.MultiService):
"""
An instance of a subclass of ShareCrawler is attached to a storage
backend, and periodically walks the backend's shares, processing them
in some fashion. This crawl is rate-limited to reduce the I/O burden on
the host, since large servers can easily have a terabyte of shares in
several million files, which can take hours or days to read.
Once the crawler starts a cycle, it will proceed at a rate limited by the
allowed_cpu_percentage= and cpu_slice= parameters: yielding the reactor
allowed_cpu_proportion= and cpu_slice= parameters: yielding the reactor
after it has worked for 'cpu_slice' seconds, and not resuming right away,
always trying to use less than 'allowed_cpu_percentage'.
always trying to use less than 'allowed_cpu_proportion'.
Once the crawler finishes a cycle, it will put off starting the next one
long enough to ensure that 'minimum_cycle_time' elapses between the start
of two consecutive cycles.
We assume that the normal upload/download/get_buckets traffic of a tahoe
We assume that the normal upload/download/DYHB traffic of a Tahoe-LAFS
grid will cause the prefixdir contents to be mostly cached in the kernel,
or that the number of buckets in each prefixdir will be small enough to
load quickly. A 1TB allmydata.com server was measured to have 2.56M
buckets, spread into the 1024 prefixdirs, with about 2500 buckets per
prefix. On this server, each prefixdir took 130ms-200ms to list the first
or that the number of sharesets in each prefixdir will be small enough to
load quickly. A 1TB allmydata.com server was measured to have 2.56 million
sharesets, spread into the 1024 prefixes, with about 2500 sharesets per
prefix. On this server, each prefix took 130ms-200ms to list the first
time, and 17ms to list the second time.
To use a crawler, create a subclass which implements the process_bucket()
method. It will be called with a prefixdir and a base32 storage index
string. process_bucket() must run synchronously. Any keys added to
self.state will be preserved. Override add_initial_state() to set up
initial state keys. Override finished_cycle() to perform additional
processing when the cycle is complete. Any status that the crawler
produces should be put in the self.state dictionary. Status renderers
(like a web page which describes the accomplishments of your crawler)
will use crawler.get_state() to retrieve this dictionary; they can
present the contents as they see fit.
To implement a crawler, create a subclass that implements the
process_prefix() method. This method may be asynchronous. It will be
called with a string prefix. Any keys that it adds to self.state will be
preserved. Override add_initial_state() to set up initial state keys.
Override finished_cycle() to perform additional processing when the cycle
is complete. Any status that the crawler produces should be put in the
self.state dictionary. Status renderers (like a web page describing the
accomplishments of your crawler) will use crawler.get_state() to retrieve
this dictionary; they can present the contents as they see fit.
Then create an instance, with a reference to a StorageServer and a
filename where it can store persistent state. The statefile is used to
keep track of how far around the ring the process has travelled, as well
as timing history to allow the pace to be predicted and controlled. The
statefile will be updated and written to disk after each time slice (just
before the crawler yields to the reactor), and also after each cycle is
finished, and also when stopService() is called. Note that this means
that a crawler which is interrupted with SIGKILL while it is in the
middle of a time slice will lose progress: the next time the node is
started, the crawler will repeat some unknown amount of work.
Then create an instance, with a reference to a backend object providing
the IStorageBackend interface, and a filename where it can store
persistent state. The statefile is used to keep track of how far around
the ring the process has travelled, as well as timing history to allow
the pace to be predicted and controlled. The statefile will be updated
and written to disk after each time slice (just before the crawler yields
to the reactor), and also after each cycle is finished, and also when
stopService() is called. Note that this means that a crawler that is
interrupted with SIGKILL while it is in the middle of a time slice will
lose progress: the next time the node is started, the crawler will repeat
some unknown amount of work.
The crawler instance must be started with startService() before it will
do any work. To make it stop doing work, call stopService().
do any work. To make it stop doing work, call stopService(). A crawler
is usually a child service of a StorageServer, although it should not
depend on that.
For historical reasons, some dictionary key names use the term "bucket"
for what is now preferably called a "shareset" (the set of shares that a
server holds under a given storage index).
Subclasses should measure time using self.clock.seconds(), rather than
time.time(), in order to make themselves deterministically testable.
"""
slow_start = 300 # don't start crawling for 5 minutes after startup
# all three of these can be changed at any time
allowed_cpu_percentage = .10 # use up to 10% of the CPU, on average
allowed_cpu_proportion = .10 # use up to 10% of the CPU, on average
cpu_slice = 1.0 # use up to 1.0 seconds before yielding
minimum_cycle_time = 300 # don't run a cycle faster than this
def __init__(self, server, statefile, allowed_cpu_percentage=None):
def __init__(self, backend, statefile, allowed_cpu_proportion=None, clock=None):
precondition(IStorageBackend.providedBy(backend), backend)
service.MultiService.__init__(self)
if allowed_cpu_percentage is not None:
self.allowed_cpu_percentage = allowed_cpu_percentage
self.server = server
self.sharedir = server.sharedir
self.backend = backend
self.statefile = statefile
if allowed_cpu_proportion is not None:
self.allowed_cpu_proportion = allowed_cpu_proportion
self.clock = clock or reactor
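# all 2**10 = 1024 possible two-character prefixes (each base32 character encodes 5 bits)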
self.prefixes = [si_b2a(struct.pack(">H", i << (16-10)))[:2]
for i in range(2**10)]
self.prefixes.sort()
self.timer = None
self.bucket_cache = (None, [])
self.current_sleep_time = None
self.next_wake_time = None
self.last_prefix_finished_time = None
@ -85,6 +103,9 @@ class ShareCrawler(service.MultiService):
self.last_cycle_elapsed_time = None
self.load_state()
# used by tests
self._hooks = {'after_prefix': None, 'after_cycle': None, 'yield': None}
def minus_or_none(self, a, b):
if a is None:
return None
@ -109,9 +130,9 @@ class ShareCrawler(service.MultiService):
remaining-sleep-time: float, seconds from now when we do more work
estimated-cycle-complete-time-left:
float, seconds remaining until the current cycle is finished.
TODO: this does not yet include the remaining time left in
the current prefixdir, and it will be very inaccurate on fast
crawlers (which can process a whole prefix in a single tick)
This does not include the remaining time left in the current
prefix, and it will be very inaccurate on fast crawlers
(which can process a whole prefix in a single tick)
estimated-time-per-cycle: float, seconds required to do a complete
cycle
@ -123,47 +144,40 @@ class ShareCrawler(service.MultiService):
cycle
"""
d = {}
p = {}
if self.state["current-cycle"] is None:
d["cycle-in-progress"] = False
d["next-crawl-time"] = self.next_wake_time
d["remaining-wait-time"] = self.minus_or_none(self.next_wake_time,
p["cycle-in-progress"] = False
p["next-crawl-time"] = self.next_wake_time
p["remaining-wait-time"] = self.minus_or_none(self.next_wake_time,
time.time())
else:
d["cycle-in-progress"] = True
p["cycle-in-progress"] = True
pct = 100.0 * self.last_complete_prefix_index / len(self.prefixes)
d["cycle-complete-percentage"] = pct
p["cycle-complete-percentage"] = pct
remaining = None
if self.last_prefix_elapsed_time is not None:
left = len(self.prefixes) - self.last_complete_prefix_index
remaining = left * self.last_prefix_elapsed_time
# TODO: remainder of this prefix: we need to estimate the
# per-bucket time, probably by measuring the time spent on
# this prefix so far, divided by the number of buckets we've
# processed.
d["estimated-cycle-complete-time-left"] = remaining
p["estimated-cycle-complete-time-left"] = remaining
# it's possible to call get_progress() from inside a crawler's
# finished_prefix() function
d["remaining-sleep-time"] = self.minus_or_none(self.next_wake_time,
time.time())
p["remaining-sleep-time"] = self.minus_or_none(self.next_wake_time,
self.clock.seconds())
per_cycle = None
if self.last_cycle_elapsed_time is not None:
per_cycle = self.last_cycle_elapsed_time
elif self.last_prefix_elapsed_time is not None:
per_cycle = len(self.prefixes) * self.last_prefix_elapsed_time
d["estimated-time-per-cycle"] = per_cycle
return d
p["estimated-time-per-cycle"] = per_cycle
return p
def get_state(self):
"""I return the current state of the crawler. This is a copy of my
state dictionary.
If we are not currently sleeping (i.e. get_state() was called from
inside the process_prefixdir, process_bucket, or finished_cycle()
methods, or if startService has not yet been called on this crawler),
these two keys will be None.
Subclasses can override this to add computed keys to the return value,
but don't forget to start with the upcall.
"""
@ -171,10 +185,10 @@ class ShareCrawler(service.MultiService):
return state
def load_state(self):
# we use this to store state for both the crawler's internals and
# We use this to store state for both the crawler's internals and
# anything the subclass-specific code needs. The state is stored
# after each bucket is processed, after each prefixdir is processed,
# and after a cycle is complete. The internal keys we use are:
# after each prefix is processed, and after a cycle is complete.
# The internal keys we use are:
# ["version"]: int, always 1
# ["last-cycle-finished"]: int, or None if we have not yet finished
# any cycle
@ -187,21 +201,18 @@ class ShareCrawler(service.MultiService):
# are sleeping between cycles, or if we
# have not yet finished any prefixdir since
# a cycle was started
# ["last-complete-bucket"]: str, base32 storage index bucket name
# of the last bucket to be processed, or
# None if we are sleeping between cycles
try:
f = open(self.statefile, "rb")
state = pickle.load(f)
f.close()
pickled = fileutil.read(self.statefile)
except Exception:
state = {"version": 1,
"last-cycle-finished": None,
"current-cycle": None,
"last-complete-prefix": None,
"last-complete-bucket": None,
}
state.setdefault("current-cycle-start-time", time.time()) # approximate
else:
state = pickle.loads(pickled)
state.setdefault("current-cycle-start-time", self.clock.seconds()) # approximate
self.state = state
lcp = state["last-complete-prefix"]
if lcp == None:
@ -229,19 +240,16 @@ class ShareCrawler(service.MultiService):
else:
last_complete_prefix = self.prefixes[lcpi]
self.state["last-complete-prefix"] = last_complete_prefix
tmpfile = self.statefile + ".tmp"
f = open(tmpfile, "wb")
pickle.dump(self.state, f)
f.close()
fileutil.move_into_place(tmpfile, self.statefile)
pickled = pickle.dumps(self.state)
fileutil.write(self.statefile, pickled)
def startService(self):
# arrange things to look like we were just sleeping, so
# status/progress values work correctly
self.sleeping_between_cycles = True
self.current_sleep_time = self.slow_start
self.next_wake_time = time.time() + self.slow_start
self.timer = reactor.callLater(self.slow_start, self.start_slice)
self.next_wake_time = self.clock.seconds() + self.slow_start
self.timer = self.clock.callLater(self.slow_start, self.start_slice)
service.MultiService.startService(self)
def stopService(self):
@ -252,44 +260,56 @@ class ShareCrawler(service.MultiService):
return service.MultiService.stopService(self)
def start_slice(self):
start_slice = time.time()
start_slice = self.clock.seconds()
self.timer = None
self.sleeping_between_cycles = False
self.current_sleep_time = None
self.next_wake_time = None
try:
self.start_current_prefix(start_slice)
finished_cycle = True
except TimeSliceExceeded:
finished_cycle = False
self.save_state()
if not self.running:
# someone might have used stopService() to shut us down
return
# either we finished a whole cycle, or we ran out of time
now = time.time()
this_slice = now - start_slice
# this_slice/(this_slice+sleep_time) = percentage
# this_slice/percentage = this_slice+sleep_time
# sleep_time = (this_slice/percentage) - this_slice
sleep_time = (this_slice / self.allowed_cpu_percentage) - this_slice
# if the math gets weird, or a timequake happens, don't sleep
# forever. Note that this means that, while a cycle is running, we
# will process at least one bucket every 5 minutes, no matter how
# long that bucket takes.
sleep_time = max(0.0, min(sleep_time, 299))
if finished_cycle:
# how long should we sleep between cycles? Don't run faster than
# allowed_cpu_percentage says, but also run faster than
# minimum_cycle_time
self.sleeping_between_cycles = True
sleep_time = max(sleep_time, self.minimum_cycle_time)
else:
self.sleeping_between_cycles = False
self.current_sleep_time = sleep_time # for status page
self.next_wake_time = now + sleep_time
self.yielding(sleep_time)
self.timer = reactor.callLater(sleep_time, self.start_slice)
d = self.start_current_prefix(start_slice)
def _err(f):
f.trap(TimeSliceExceeded)
return False
def _ok(ign):
return True
d.addCallbacks(_ok, _err)
def _done(finished_cycle):
self.save_state()
if not self.running:
# someone might have used stopService() to shut us down
return
# Either we finished a whole cycle, or we ran out of time.
now = self.clock.seconds()
this_slice = now - start_slice
# this_slice/(this_slice+sleep_time) = percentage
# this_slice/percentage = this_slice+sleep_time
# sleep_time = (this_slice/percentage) - this_slice
sleep_time = (this_slice / self.allowed_cpu_proportion) - this_slice
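# (e.g. with allowed_cpu_proportion = 0.10 and a 1.0s slice: 1.0/0.10 - 1.0 = 9.0s of sleep)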
# If the math gets weird, or a timequake happens, don't sleep
# forever. Note that this means that, while a cycle is running, we
# will process at least one prefix every 5 minutes, provided prefixes
# do not take more than 5 minutes to process.
sleep_time = max(0.0, min(sleep_time, 299))
if finished_cycle:
# how long should we sleep between cycles? Don't run faster than
# allowed_cpu_proportion says, but also run faster than
# minimum_cycle_time
self.sleeping_between_cycles = True
sleep_time = max(sleep_time, self.minimum_cycle_time)
else:
self.sleeping_between_cycles = False
self.current_sleep_time = sleep_time # for status page
self.next_wake_time = now + sleep_time
self.yielding(sleep_time)
self.timer = self.clock.callLater(sleep_time, self.start_slice)
d.addCallback(_done)
d.addBoth(self._call_hook, 'yield')
return d
def start_current_prefix(self, start_slice):
state = self.state
@ -303,21 +323,35 @@ class ShareCrawler(service.MultiService):
self.started_cycle(state["current-cycle"])
cycle = state["current-cycle"]
for i in range(self.last_complete_prefix_index+1, len(self.prefixes)):
# if we want to yield earlier, just raise TimeSliceExceeded()
prefix = self.prefixes[i]
prefixdir = os.path.join(self.sharedir, prefix)
if i == self.bucket_cache[0]:
buckets = self.bucket_cache[1]
else:
try:
buckets = os.listdir(prefixdir)
buckets.sort()
except EnvironmentError:
buckets = []
self.bucket_cache = (i, buckets)
self.process_prefixdir(cycle, prefix, prefixdir,
buckets, start_slice)
def _prefix_loop(i):
d2 = self._do_prefix(cycle, i, start_slice)
d2.addBoth(self._call_hook, 'after_prefix')
d2.addCallback(lambda ign: True)
return d2
d = async_iterate(_prefix_loop, xrange(self.last_complete_prefix_index + 1, len(self.prefixes)))
def _cycle_done(ign):
# yay! we finished the whole cycle
self.last_complete_prefix_index = -1
self.last_prefix_finished_time = None # don't include the sleep
now = time.time()
if self.last_cycle_started_time is not None:
self.last_cycle_elapsed_time = now - self.last_cycle_started_time
state["last-complete-bucket"] = None
state["last-cycle-finished"] = cycle
state["current-cycle"] = None
self.finished_cycle(cycle)
self.save_state()
return cycle
d.addCallback(_cycle_done)
d.addBoth(self._call_hook, 'after_cycle')
return d
def _do_prefix(self, cycle, i, start_slice):
prefix = self.prefixes[i]
d = defer.maybeDeferred(self.process_prefix, cycle, prefix, start_slice)
def _done(ign):
self.last_complete_prefix_index = i
now = time.time()
@ -327,40 +361,19 @@ class ShareCrawler(service.MultiService):
self.last_prefix_finished_time = now
self.finished_prefix(cycle, prefix)
if time.time() >= start_slice + self.cpu_slice:
raise TimeSliceExceeded()
# yay! we finished the whole cycle
self.last_complete_prefix_index = -1
self.last_prefix_finished_time = None # don't include the sleep
now = time.time()
if self.last_cycle_started_time is not None:
self.last_cycle_elapsed_time = now - self.last_cycle_started_time
state["last-complete-bucket"] = None
state["last-cycle-finished"] = cycle
state["current-cycle"] = None
self.finished_cycle(cycle)
self.save_state()
return prefix
d.addCallback(_done)
return d
def process_prefixdir(self, cycle, prefix, prefixdir, buckets, start_slice):
"""This gets a list of bucket names (i.e. storage index strings,
base32-encoded) in sorted order.
You can override this if your crawler doesn't care about the actual
shares, for example a crawler which merely keeps track of how many
buckets are being managed by this server.
Subclasses which *do* care about actual buckets should leave this
method alone, and implement process_bucket() instead.
def process_prefix(self, cycle, prefix, start_slice):
"""
for bucket in buckets:
if bucket <= self.state["last-complete-bucket"]:
continue
self.process_bucket(cycle, prefix, prefixdir, bucket)
self.state["last-complete-bucket"] = bucket
if time.time() >= start_slice + self.cpu_slice:
raise TimeSliceExceeded()
Called for each prefix.
"""
return defer.succeed(None)
# the remaining methods are explicitly for subclasses to implement.
@ -371,29 +384,6 @@ class ShareCrawler(service.MultiService):
"""
pass
def process_bucket(self, cycle, prefix, prefixdir, storage_index_b32):
"""Examine a single bucket. Subclasses should do whatever they want
to do to the shares therein, then update self.state as necessary.
If the crawler is never interrupted by SIGKILL, this method will be
called exactly once per share (per cycle). If it *is* interrupted,
then the next time the node is started, some amount of work will be
duplicated, according to when self.save_state() was last called. By
default, save_state() is called at the end of each timeslice, and
after finished_cycle() returns, and when stopService() is called.
To reduce the chance of duplicate work (i.e. to avoid adding multiple
records to a database), you can call save_state() at the end of your
process_bucket() method. This will reduce the maximum duplicated work
to one bucket per SIGKILL. It will also add overhead, probably 1-20ms
per bucket (and some disk writes), which will count against your
allowed_cpu_percentage, and which may be considerable if
process_bucket() runs quickly.
This method is for subclasses to override. No upcall is necessary.
"""
pass
def finished_prefix(self, cycle, prefix):
"""Notify a subclass that the crawler has just finished processing a
prefix directory (all buckets with the same two-character/10bit
@ -407,8 +397,9 @@ class ShareCrawler(service.MultiService):
pass
def finished_cycle(self, cycle):
"""Notify subclass that a cycle (one complete traversal of all
prefixdirs) has just finished. 'cycle' is the number of the cycle
"""
Notify subclass that a cycle (one complete traversal of all
prefixes) has just finished. 'cycle' is the number of the cycle
that just finished. This method should perform summary work and
update self.state to publish information to status displays.
@ -424,65 +415,11 @@ class ShareCrawler(service.MultiService):
pass
def yielding(self, sleep_time):
"""The crawler is about to sleep for 'sleep_time' seconds. This
"""
The crawler is about to sleep for 'sleep_time' seconds. This
method is mostly for the convenience of unit tests.
This method is for subclasses to override. No upcall is necessary.
"""
pass
class BucketCountingCrawler(ShareCrawler):
"""I keep track of how many buckets are being managed by this server.
This is equivalent to the number of distributed files and directories for
which I am providing storage. The actual number of files+directories in
the full grid is probably higher (especially when there are more servers
than 'N', the number of generated shares), because some files+directories
will have shares on other servers instead of me. Also note that the
number of buckets will differ from the number of shares in small grids,
when more than one share is placed on a single server.
"""
minimum_cycle_time = 60*60 # we don't need this more than once an hour
def __init__(self, server, statefile, num_sample_prefixes=1):
ShareCrawler.__init__(self, server, statefile)
self.num_sample_prefixes = num_sample_prefixes
def add_initial_state(self):
# ["bucket-counts"][cyclenum][prefix] = number
# ["last-complete-cycle"] = cyclenum # maintained by base class
# ["last-complete-bucket-count"] = number
# ["storage-index-samples"][prefix] = (cyclenum,
# list of SI strings (base32))
self.state.setdefault("bucket-counts", {})
self.state.setdefault("last-complete-bucket-count", None)
self.state.setdefault("storage-index-samples", {})
def process_prefixdir(self, cycle, prefix, prefixdir, buckets, start_slice):
# we override process_prefixdir() because we don't want to look at
# the individual buckets. We'll save state after each one. On my
# laptop, a mostly-empty storage server can process about 70
# prefixdirs in a 1.0s slice.
if cycle not in self.state["bucket-counts"]:
self.state["bucket-counts"][cycle] = {}
self.state["bucket-counts"][cycle][prefix] = len(buckets)
if prefix in self.prefixes[:self.num_sample_prefixes]:
self.state["storage-index-samples"][prefix] = (cycle, buckets)
def finished_cycle(self, cycle):
last_counts = self.state["bucket-counts"].get(cycle, [])
if len(last_counts) == len(self.prefixes):
# great, we have a whole cycle.
num_buckets = sum(last_counts.values())
self.state["last-complete-bucket-count"] = num_buckets
# get rid of old counts
for old_cycle in list(self.state["bucket-counts"].keys()):
if old_cycle != cycle:
del self.state["bucket-counts"][old_cycle]
# get rid of old samples too
for prefix in list(self.state["storage-index-samples"].keys()):
old_cycle,buckets = self.state["storage-index-samples"][prefix]
if old_cycle != cycle:
del self.state["storage-index-samples"][prefix]
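To make the subclassing contract described in the ShareCrawler docstring above concrete, here is a minimal sketch of a crawler written against the new asynchronous process_prefix() API. The class name, the statefile path, and the per-prefix counting are hypothetical and exist only for illustration.

from twisted.internet import defer
from allmydata.storage.crawler import ShareCrawler

class PrefixCountingCrawler(ShareCrawler):
    """Hypothetical example: count how many prefixes were visited in each cycle."""
    minimum_cycle_time = 60*60          # at most one full cycle per hour

    def add_initial_state(self):
        # Keys added to self.state are pickled into the statefile along with the base-class state.
        self.state.setdefault("prefixes-seen", {})     # cyclenum -> count

    def process_prefix(self, cycle, prefix, start_slice):
        # May return a Deferred; the base class handles rate limiting and state saving.
        counts = self.state["prefixes-seen"]
        counts[cycle] = counts.get(cycle, 0) + 1
        return defer.succeed(None)

    def finished_cycle(self, cycle):
        # Summarize into self.state so status renderers can pick it up via get_state().
        self.state["last-cycle-prefix-count"] = self.state["prefixes-seen"].get(cycle, 0)

# Usage sketch: the backend must provide IStorageBackend (see the constructor precondition).
#   crawler = PrefixCountingCrawler(backend, statefile="prefix_crawler.state")
#   crawler.setServiceParent(storage_server)    # or call crawler.startService() directly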


@ -0,0 +1,71 @@
import time
from types import NoneType
from allmydata.util.assertutil import precondition
from allmydata.util import time_format
from allmydata.web.common import abbreviate_time
class ExpirationPolicy(object):
def __init__(self, enabled=False, mode="age", override_lease_duration=None,
cutoff_date=None):
precondition(isinstance(enabled, bool), enabled=enabled)
precondition(mode in ("age", "cutoff-date"),
"GC mode %r must be 'age' or 'cutoff-date'" % (mode,))
precondition(isinstance(override_lease_duration, (int, NoneType)),
override_lease_duration=override_lease_duration)
precondition(isinstance(cutoff_date, int) or (mode != "cutoff-date" and cutoff_date is None),
cutoff_date=cutoff_date)
self._enabled = enabled
self._mode = mode
self._override_lease_duration = override_lease_duration
self._cutoff_date = cutoff_date
def remove_expired_leases(self, leasedb, current_time):
if not self._enabled:
return
if self._mode == "age":
if self._override_lease_duration is not None:
leasedb.remove_leases_by_renewal_time(current_time - self._override_lease_duration)
else:
leasedb.remove_leases_by_expiration_time(current_time)
else:
# self._mode == "cutoff-date"
leasedb.remove_leases_by_renewal_time(self._cutoff_date)
def get_parameters(self):
"""
Return the parameters as represented in the "configured-expiration-mode" field
of a history entry.
"""
return (self._mode,
self._override_lease_duration,
self._cutoff_date,
self._enabled and ("mutable", "immutable") or ())
def is_enabled(self):
return self._enabled
def describe_enabled(self):
if self.is_enabled():
return "Enabled: expired leases will be removed"
else:
return "Disabled: scan-only mode, no leases will be removed"
def describe_expiration(self):
if self._mode == "age":
if self._override_lease_duration is None:
return ("Leases will expire naturally, probably 31 days after "
"creation or renewal.")
else:
return ("Leases created or last renewed more than %s ago "
"will be considered expired."
% abbreviate_time(self._override_lease_duration))
else:
localizedutcdate = time.strftime("%d-%b-%Y", time.gmtime(self._cutoff_date))
isoutcdate = time_format.iso_utc_date(self._cutoff_date)
return ("Leases created or last renewed before %s (%s) UTC "
"will be considered expired." % (isoutcdate, localizedutcdate))


@ -1,424 +0,0 @@
import time, os, pickle, struct
from allmydata.storage.crawler import ShareCrawler
from allmydata.storage.shares import get_share_file
from allmydata.storage.common import UnknownMutableContainerVersionError, \
UnknownImmutableContainerVersionError
from twisted.python import log as twlog
class LeaseCheckingCrawler(ShareCrawler):
"""I examine the leases on all shares, determining which are still valid
and which have expired. I can remove the expired leases (if so
configured), and the share will be deleted when the last lease is
removed.
I collect statistics on the leases and make these available to a web
status page, including::
Space recovered during this cycle-so-far:
actual (only if expiration_enabled=True):
num-buckets, num-shares, sum of share sizes, real disk usage
('real disk usage' means we use stat(fn).st_blocks*512 and include any
space used by the directory)
what it would have been with the original lease expiration time
what it would have been with our configured expiration time
Prediction of space that will be recovered during the rest of this cycle
Prediction of space that will be recovered by the entire current cycle.
Space recovered during the last 10 cycles <-- saved in separate pickle
Shares/buckets examined:
this cycle-so-far
prediction of rest of cycle
during last 10 cycles <-- separate pickle
start/finish time of last 10 cycles <-- separate pickle
expiration time used for last 10 cycles <-- separate pickle
Histogram of leases-per-share:
this-cycle-to-date
last 10 cycles <-- separate pickle
Histogram of lease ages, buckets = 1day
cycle-to-date
last 10 cycles <-- separate pickle
All cycle-to-date values remain valid until the start of the next cycle.
"""
slow_start = 360 # wait 6 minutes after startup
minimum_cycle_time = 12*60*60 # not more than twice per day
def __init__(self, server, statefile, historyfile,
expiration_enabled, mode,
override_lease_duration, # used if expiration_mode=="age"
cutoff_date, # used if expiration_mode=="cutoff-date"
sharetypes):
self.historyfile = historyfile
self.expiration_enabled = expiration_enabled
self.mode = mode
self.override_lease_duration = None
self.cutoff_date = None
if self.mode == "age":
assert isinstance(override_lease_duration, (int, type(None)))
self.override_lease_duration = override_lease_duration # seconds
elif self.mode == "cutoff-date":
assert isinstance(cutoff_date, int) # seconds-since-epoch
assert cutoff_date is not None
self.cutoff_date = cutoff_date
else:
raise ValueError("GC mode '%s' must be 'age' or 'cutoff-date'" % mode)
self.sharetypes_to_expire = sharetypes
ShareCrawler.__init__(self, server, statefile)
def add_initial_state(self):
# we fill ["cycle-to-date"] here (even though they will be reset in
# self.started_cycle) just in case someone grabs our state before we
# get started: unit tests do this
so_far = self.create_empty_cycle_dict()
self.state.setdefault("cycle-to-date", so_far)
# in case we upgrade the code while a cycle is in progress, update
# the keys individually
for k in so_far:
self.state["cycle-to-date"].setdefault(k, so_far[k])
# initialize history
if not os.path.exists(self.historyfile):
history = {} # cyclenum -> dict
f = open(self.historyfile, "wb")
pickle.dump(history, f)
f.close()
def create_empty_cycle_dict(self):
recovered = self.create_empty_recovered_dict()
so_far = {"corrupt-shares": [],
"space-recovered": recovered,
"lease-age-histogram": {}, # (minage,maxage)->count
"leases-per-share-histogram": {}, # leasecount->numshares
}
return so_far
def create_empty_recovered_dict(self):
recovered = {}
for a in ("actual", "original", "configured", "examined"):
for b in ("buckets", "shares", "sharebytes", "diskbytes"):
recovered[a+"-"+b] = 0
recovered[a+"-"+b+"-mutable"] = 0
recovered[a+"-"+b+"-immutable"] = 0
return recovered
def started_cycle(self, cycle):
self.state["cycle-to-date"] = self.create_empty_cycle_dict()
def stat(self, fn):
return os.stat(fn)
def process_bucket(self, cycle, prefix, prefixdir, storage_index_b32):
bucketdir = os.path.join(prefixdir, storage_index_b32)
s = self.stat(bucketdir)
would_keep_shares = []
wks = None
for fn in os.listdir(bucketdir):
try:
shnum = int(fn)
except ValueError:
continue # non-numeric means not a sharefile
sharefile = os.path.join(bucketdir, fn)
try:
wks = self.process_share(sharefile)
except (UnknownMutableContainerVersionError,
UnknownImmutableContainerVersionError,
struct.error):
twlog.msg("lease-checker error processing %s" % sharefile)
twlog.err()
which = (storage_index_b32, shnum)
self.state["cycle-to-date"]["corrupt-shares"].append(which)
wks = (1, 1, 1, "unknown")
would_keep_shares.append(wks)
sharetype = None
if wks:
# use the last share's sharetype as the buckettype
sharetype = wks[3]
rec = self.state["cycle-to-date"]["space-recovered"]
self.increment(rec, "examined-buckets", 1)
if sharetype:
self.increment(rec, "examined-buckets-"+sharetype, 1)
try:
bucket_diskbytes = s.st_blocks * 512
except AttributeError:
bucket_diskbytes = 0 # no stat().st_blocks on windows
if sum([wks[0] for wks in would_keep_shares]) == 0:
self.increment_bucketspace("original", bucket_diskbytes, sharetype)
if sum([wks[1] for wks in would_keep_shares]) == 0:
self.increment_bucketspace("configured", bucket_diskbytes, sharetype)
if sum([wks[2] for wks in would_keep_shares]) == 0:
self.increment_bucketspace("actual", bucket_diskbytes, sharetype)
def process_share(self, sharefilename):
# first, find out what kind of a share it is
sf = get_share_file(sharefilename)
sharetype = sf.sharetype
now = time.time()
s = self.stat(sharefilename)
num_leases = 0
num_valid_leases_original = 0
num_valid_leases_configured = 0
expired_leases_configured = []
for li in sf.get_leases():
num_leases += 1
original_expiration_time = li.get_expiration_time()
grant_renew_time = li.get_grant_renew_time_time()
age = li.get_age()
self.add_lease_age_to_histogram(age)
# expired-or-not according to original expiration time
if original_expiration_time > now:
num_valid_leases_original += 1
# expired-or-not according to our configured age limit
expired = False
if self.mode == "age":
age_limit = original_expiration_time
if self.override_lease_duration is not None:
age_limit = self.override_lease_duration
if age > age_limit:
expired = True
else:
assert self.mode == "cutoff-date"
if grant_renew_time < self.cutoff_date:
expired = True
if sharetype not in self.sharetypes_to_expire:
expired = False
if expired:
expired_leases_configured.append(li)
else:
num_valid_leases_configured += 1
so_far = self.state["cycle-to-date"]
self.increment(so_far["leases-per-share-histogram"], num_leases, 1)
self.increment_space("examined", s, sharetype)
would_keep_share = [1, 1, 1, sharetype]
if self.expiration_enabled:
for li in expired_leases_configured:
sf.cancel_lease(li.cancel_secret)
if num_valid_leases_original == 0:
would_keep_share[0] = 0
self.increment_space("original", s, sharetype)
if num_valid_leases_configured == 0:
would_keep_share[1] = 0
self.increment_space("configured", s, sharetype)
if self.expiration_enabled:
would_keep_share[2] = 0
self.increment_space("actual", s, sharetype)
return would_keep_share
def increment_space(self, a, s, sharetype):
sharebytes = s.st_size
try:
# note that stat(2) says that st_blocks is 512 bytes, and that
# st_blksize is "optimal file sys I/O ops blocksize", which is
# independent of the block-size that st_blocks uses.
diskbytes = s.st_blocks * 512
except AttributeError:
# the docs say that st_blocks is only on linux. I also see it on
# MacOS. But it isn't available on windows.
diskbytes = sharebytes
so_far_sr = self.state["cycle-to-date"]["space-recovered"]
self.increment(so_far_sr, a+"-shares", 1)
self.increment(so_far_sr, a+"-sharebytes", sharebytes)
self.increment(so_far_sr, a+"-diskbytes", diskbytes)
if sharetype:
self.increment(so_far_sr, a+"-shares-"+sharetype, 1)
self.increment(so_far_sr, a+"-sharebytes-"+sharetype, sharebytes)
self.increment(so_far_sr, a+"-diskbytes-"+sharetype, diskbytes)
def increment_bucketspace(self, a, bucket_diskbytes, sharetype):
rec = self.state["cycle-to-date"]["space-recovered"]
self.increment(rec, a+"-diskbytes", bucket_diskbytes)
self.increment(rec, a+"-buckets", 1)
if sharetype:
self.increment(rec, a+"-diskbytes-"+sharetype, bucket_diskbytes)
self.increment(rec, a+"-buckets-"+sharetype, 1)
def increment(self, d, k, delta=1):
if k not in d:
d[k] = 0
d[k] += delta
def add_lease_age_to_histogram(self, age):
bucket_interval = 24*60*60
bucket_number = int(age/bucket_interval)
bucket_start = bucket_number * bucket_interval
bucket_end = bucket_start + bucket_interval
k = (bucket_start, bucket_end)
self.increment(self.state["cycle-to-date"]["lease-age-histogram"], k, 1)
def convert_lease_age_histogram(self, lah):
# convert { (minage,maxage) : count } into [ (minage,maxage,count) ]
# since the former is not JSON-safe (JSON dictionaries must have
# string keys).
json_safe_lah = []
for k in sorted(lah):
(minage,maxage) = k
json_safe_lah.append( (minage, maxage, lah[k]) )
return json_safe_lah
def finished_cycle(self, cycle):
# add to our history state, prune old history
h = {}
start = self.state["current-cycle-start-time"]
now = time.time()
h["cycle-start-finish-times"] = (start, now)
h["expiration-enabled"] = self.expiration_enabled
h["configured-expiration-mode"] = (self.mode,
self.override_lease_duration,
self.cutoff_date,
self.sharetypes_to_expire)
s = self.state["cycle-to-date"]
# state["lease-age-histogram"] is a dictionary (mapping
# (minage,maxage) tuple to a sharecount), but we report
# self.get_state()["lease-age-histogram"] as a list of
# (min,max,sharecount) tuples, because JSON can handle that better.
# We record the list-of-tuples form into the history for the same
# reason.
lah = self.convert_lease_age_histogram(s["lease-age-histogram"])
h["lease-age-histogram"] = lah
h["leases-per-share-histogram"] = s["leases-per-share-histogram"].copy()
h["corrupt-shares"] = s["corrupt-shares"][:]
# note: if ["shares-recovered"] ever acquires an internal dict, this
# copy() needs to become a deepcopy
h["space-recovered"] = s["space-recovered"].copy()
history = pickle.load(open(self.historyfile, "rb"))
history[cycle] = h
while len(history) > 10:
oldcycles = sorted(history.keys())
del history[oldcycles[0]]
f = open(self.historyfile, "wb")
pickle.dump(history, f)
f.close()
def get_state(self):
"""In addition to the crawler state described in
ShareCrawler.get_state(), I return the following keys which are
specific to the lease-checker/expirer. Note that the non-history keys
(with 'cycle' in their names) are only present if a cycle is
currently running. If the crawler is between cycles, it is appropriate
to show the latest item in the 'history' key instead. Also note that
each history item has all the data in the 'cycle-to-date' value, plus
cycle-start-finish-times.
cycle-to-date:
expiration-enabled
configured-expiration-mode
lease-age-histogram (list of (minage,maxage,sharecount) tuples)
leases-per-share-histogram
corrupt-shares (list of (si_b32,shnum) tuples, minimal verification)
space-recovered
estimated-remaining-cycle:
# Values may be None if not enough data has been gathered to
# produce an estimate.
space-recovered
estimated-current-cycle:
# cycle-to-date plus estimated-remaining. Values may be None if
# not enough data has been gathered to produce an estimate.
space-recovered
history: maps cyclenum to a dict with the following keys:
cycle-start-finish-times
expiration-enabled
configured-expiration-mode
lease-age-histogram
leases-per-share-histogram
corrupt-shares
space-recovered
The 'space-recovered' structure is a dictionary with the following
keys:
# 'examined' is what was looked at
examined-buckets, examined-buckets-mutable, examined-buckets-immutable
examined-shares, -mutable, -immutable
examined-sharebytes, -mutable, -immutable
examined-diskbytes, -mutable, -immutable
# 'actual' is what was actually deleted
actual-buckets, -mutable, -immutable
actual-shares, -mutable, -immutable
actual-sharebytes, -mutable, -immutable
actual-diskbytes, -mutable, -immutable
# would have been deleted, if the original lease timer was used
original-buckets, -mutable, -immutable
original-shares, -mutable, -immutable
original-sharebytes, -mutable, -immutable
original-diskbytes, -mutable, -immutable
# would have been deleted, if our configured max_age was used
configured-buckets, -mutable, -immutable
configured-shares, -mutable, -immutable
configured-sharebytes, -mutable, -immutable
configured-diskbytes, -mutable, -immutable
"""
progress = self.get_progress()
state = ShareCrawler.get_state(self) # does a shallow copy
history = pickle.load(open(self.historyfile, "rb"))
state["history"] = history
if not progress["cycle-in-progress"]:
del state["cycle-to-date"]
return state
so_far = state["cycle-to-date"].copy()
state["cycle-to-date"] = so_far
lah = so_far["lease-age-histogram"]
so_far["lease-age-histogram"] = self.convert_lease_age_histogram(lah)
so_far["expiration-enabled"] = self.expiration_enabled
so_far["configured-expiration-mode"] = (self.mode,
self.override_lease_duration,
self.cutoff_date,
self.sharetypes_to_expire)
so_far_sr = so_far["space-recovered"]
remaining_sr = {}
remaining = {"space-recovered": remaining_sr}
cycle_sr = {}
cycle = {"space-recovered": cycle_sr}
if progress["cycle-complete-percentage"] > 0.0:
pc = progress["cycle-complete-percentage"] / 100.0
m = (1-pc)/pc
for a in ("actual", "original", "configured", "examined"):
for b in ("buckets", "shares", "sharebytes", "diskbytes"):
for c in ("", "-mutable", "-immutable"):
k = a+"-"+b+c
remaining_sr[k] = m * so_far_sr[k]
cycle_sr[k] = so_far_sr[k] + remaining_sr[k]
else:
for a in ("actual", "original", "configured", "examined"):
for b in ("buckets", "shares", "sharebytes", "diskbytes"):
for c in ("", "-mutable", "-immutable"):
k = a+"-"+b+c
remaining_sr[k] = None
cycle_sr[k] = None
state["estimated-remaining-cycle"] = remaining
state["estimated-current-cycle"] = cycle
return state


@ -1,321 +0,0 @@
import os, stat, struct, time
from foolscap.api import Referenceable
from zope.interface import implements
from allmydata.interfaces import RIBucketWriter, RIBucketReader
from allmydata.util import base32, fileutil, log
from allmydata.util.assertutil import precondition
from allmydata.util.hashutil import timing_safe_compare
from allmydata.storage.lease import LeaseInfo
from allmydata.storage.common import UnknownImmutableContainerVersionError, \
DataTooLargeError
# each share file (in storage/shares/$SI/$SHNUM) contains lease information
# and share data. The share data is accessed by RIBucketWriter.write and
# RIBucketReader.read . The lease information is not accessible through these
# interfaces.
# The share file has the following layout:
# 0x00: share file version number, four bytes, current version is 1
# 0x04: share data length, four bytes big-endian = A # See Footnote 1 below.
# 0x08: number of leases, four bytes big-endian
# 0x0c: beginning of share data (see immutable.layout.WriteBucketProxy)
# A+0x0c = B: first lease. Lease format is:
# B+0x00: owner number, 4 bytes big-endian, 0 is reserved for no-owner
# B+0x04: renew secret, 32 bytes (SHA256)
# B+0x24: cancel secret, 32 bytes (SHA256)
# B+0x44: expiration time, 4 bytes big-endian seconds-since-epoch
# B+0x48: next lease, or end of record
# Footnote 1: as of Tahoe v1.3.0 this field is not used by storage servers,
# but it is still filled in by storage servers in case the storage server
# software gets downgraded from >= Tahoe v1.3.0 to < Tahoe v1.3.0, or the
# share file is moved from one storage server to another. The value stored in
# this field is truncated, so if the actual share data length is >= 2**32,
# then the value stored in this field will be the actual share data length
# modulo 2**32.
class ShareFile:
LEASE_SIZE = struct.calcsize(">L32s32sL")
sharetype = "immutable"
def __init__(self, filename, max_size=None, create=False):
""" If max_size is not None then I won't allow more than max_size to be written to me. If create=True and max_size must not be None. """
precondition((max_size is not None) or (not create), max_size, create)
self.home = filename
self._max_size = max_size
if create:
# touch the file, so later callers will see that we're working on
# it. Also construct the metadata.
assert not os.path.exists(self.home)
fileutil.make_dirs(os.path.dirname(self.home))
f = open(self.home, 'wb')
# The second field -- the four-byte share data length -- is no
# longer used as of Tahoe v1.3.0, but we continue to write it in
# there in case someone downgrades a storage server from >=
# Tahoe-1.3.0 to < Tahoe-1.3.0, or moves a share file from one
# server to another, etc. We do saturation -- a share data length
# larger than 2**32-1 (what can fit into the field) is marked as
# the largest length that can fit into the field. That way, even
# if this does happen, the old < v1.3.0 server will still allow
# clients to read the first part of the share.
f.write(struct.pack(">LLL", 1, min(2**32-1, max_size), 0))
f.close()
self._lease_offset = max_size + 0x0c
self._num_leases = 0
else:
f = open(self.home, 'rb')
filesize = os.path.getsize(self.home)
(version, unused, num_leases) = struct.unpack(">LLL", f.read(0xc))
f.close()
if version != 1:
msg = "sharefile %s had version %d but we wanted 1" % \
(filename, version)
raise UnknownImmutableContainerVersionError(msg)
self._num_leases = num_leases
self._lease_offset = filesize - (num_leases * self.LEASE_SIZE)
self._data_offset = 0xc
def unlink(self):
os.unlink(self.home)
def read_share_data(self, offset, length):
precondition(offset >= 0)
# reads beyond the end of the data are truncated. Reads that start
# beyond the end of the data return an empty string.
seekpos = self._data_offset+offset
actuallength = max(0, min(length, self._lease_offset-seekpos))
if actuallength == 0:
return ""
f = open(self.home, 'rb')
f.seek(seekpos)
return f.read(actuallength)
def write_share_data(self, offset, data):
length = len(data)
precondition(offset >= 0, offset)
if self._max_size is not None and offset+length > self._max_size:
raise DataTooLargeError(self._max_size, offset, length)
f = open(self.home, 'rb+')
real_offset = self._data_offset+offset
f.seek(real_offset)
assert f.tell() == real_offset
f.write(data)
f.close()
def _write_lease_record(self, f, lease_number, lease_info):
offset = self._lease_offset + lease_number * self.LEASE_SIZE
f.seek(offset)
assert f.tell() == offset
f.write(lease_info.to_immutable_data())
def _read_num_leases(self, f):
f.seek(0x08)
(num_leases,) = struct.unpack(">L", f.read(4))
return num_leases
def _write_num_leases(self, f, num_leases):
f.seek(0x08)
f.write(struct.pack(">L", num_leases))
def _truncate_leases(self, f, num_leases):
f.truncate(self._lease_offset + num_leases * self.LEASE_SIZE)
def get_leases(self):
"""Yields a LeaseInfo instance for all leases."""
f = open(self.home, 'rb')
(version, unused, num_leases) = struct.unpack(">LLL", f.read(0xc))
f.seek(self._lease_offset)
for i in range(num_leases):
data = f.read(self.LEASE_SIZE)
if data:
yield LeaseInfo().from_immutable_data(data)
def add_lease(self, lease_info):
f = open(self.home, 'rb+')
num_leases = self._read_num_leases(f)
self._write_lease_record(f, num_leases, lease_info)
self._write_num_leases(f, num_leases+1)
f.close()
def renew_lease(self, renew_secret, new_expire_time):
for i,lease in enumerate(self.get_leases()):
if timing_safe_compare(lease.renew_secret, renew_secret):
# yup. See if we need to update the owner time.
if new_expire_time > lease.expiration_time:
# yes
lease.expiration_time = new_expire_time
f = open(self.home, 'rb+')
self._write_lease_record(f, i, lease)
f.close()
return
raise IndexError("unable to renew non-existent lease")
def add_or_renew_lease(self, lease_info):
try:
self.renew_lease(lease_info.renew_secret,
lease_info.expiration_time)
except IndexError:
self.add_lease(lease_info)
def cancel_lease(self, cancel_secret):
"""Remove a lease with the given cancel_secret. If the last lease is
cancelled, the file will be removed. Return the number of bytes that
were freed (by truncating the list of leases, and possibly by
deleting the file). Raise IndexError if there was no lease with the
given cancel_secret.
"""
leases = list(self.get_leases())
num_leases_removed = 0
for i,lease in enumerate(leases):
if timing_safe_compare(lease.cancel_secret, cancel_secret):
leases[i] = None
num_leases_removed += 1
if not num_leases_removed:
raise IndexError("unable to find matching lease to cancel")
if num_leases_removed:
# pack and write out the remaining leases. We write these out in
# the same order as they were added, so that if we crash while
# doing this, we won't lose any non-cancelled leases.
leases = [l for l in leases if l] # remove the cancelled leases
f = open(self.home, 'rb+')
for i,lease in enumerate(leases):
self._write_lease_record(f, i, lease)
self._write_num_leases(f, len(leases))
self._truncate_leases(f, len(leases))
f.close()
space_freed = self.LEASE_SIZE * num_leases_removed
if not len(leases):
space_freed += os.stat(self.home)[stat.ST_SIZE]
self.unlink()
return space_freed
class BucketWriter(Referenceable):
implements(RIBucketWriter)
def __init__(self, ss, incominghome, finalhome, max_size, lease_info, canary):
self.ss = ss
self.incominghome = incominghome
self.finalhome = finalhome
self._max_size = max_size # don't allow the client to write more than this
self._canary = canary
self._disconnect_marker = canary.notifyOnDisconnect(self._disconnected)
self.closed = False
self.throw_out_all_data = False
self._sharefile = ShareFile(incominghome, create=True, max_size=max_size)
# also, add our lease to the file now, so that other ones can be
# added by simultaneous uploaders
self._sharefile.add_lease(lease_info)
def allocated_size(self):
return self._max_size
def remote_write(self, offset, data):
start = time.time()
precondition(not self.closed)
if self.throw_out_all_data:
return
self._sharefile.write_share_data(offset, data)
self.ss.add_latency("write", time.time() - start)
self.ss.count("write")
def remote_close(self):
precondition(not self.closed)
start = time.time()
fileutil.make_dirs(os.path.dirname(self.finalhome))
fileutil.rename(self.incominghome, self.finalhome)
try:
# self.incominghome is like storage/shares/incoming/ab/abcde/4 .
# We try to delete the parent (.../ab/abcde) to avoid leaving
# these directories lying around forever, but the delete might
# fail if we're working on another share for the same storage
# index (like ab/abcde/5). The alternative approach would be to
# use a hierarchy of objects (PrefixHolder, BucketHolder,
# ShareWriter), each of which is responsible for a single
# directory on disk, and have them use reference counting of
# their children to know when they should do the rmdir. This
# approach is simpler, but relies on os.rmdir refusing to delete
# a non-empty directory. Do *not* use fileutil.rm_dir() here!
os.rmdir(os.path.dirname(self.incominghome))
# we also delete the grandparent (prefix) directory, .../ab ,
# again to avoid leaving directories lying around. This might
# fail if there is another bucket open that shares a prefix (like
# ab/abfff).
os.rmdir(os.path.dirname(os.path.dirname(self.incominghome)))
# we leave the great-grandparent (incoming/) directory in place.
except EnvironmentError:
# ignore the "can't rmdir because the directory is not empty"
# exceptions, those are normal consequences of the
# above-mentioned conditions.
pass
self._sharefile = None
self.closed = True
self._canary.dontNotifyOnDisconnect(self._disconnect_marker)
filelen = os.stat(self.finalhome)[stat.ST_SIZE]
self.ss.bucket_writer_closed(self, filelen)
self.ss.add_latency("close", time.time() - start)
self.ss.count("close")
def _disconnected(self):
if not self.closed:
self._abort()
def remote_abort(self):
log.msg("storage: aborting sharefile %s" % self.incominghome,
facility="tahoe.storage", level=log.UNUSUAL)
if not self.closed:
self._canary.dontNotifyOnDisconnect(self._disconnect_marker)
self._abort()
self.ss.count("abort")
def _abort(self):
if self.closed:
return
os.remove(self.incominghome)
# if we were the last share to be moved, remove the incoming/
# directory that was our parent
parentdir = os.path.split(self.incominghome)[0]
if not os.listdir(parentdir):
os.rmdir(parentdir)
self._sharefile = None
# We are now considered closed for further writing. We must tell
# the storage server about this so that it stops expecting us to
# use the space it allocated for us earlier.
self.closed = True
self.ss.bucket_writer_closed(self, 0)
class BucketReader(Referenceable):
implements(RIBucketReader)
def __init__(self, ss, sharefname, storage_index=None, shnum=None):
self.ss = ss
self._share_file = ShareFile(sharefname)
self.storage_index = storage_index
self.shnum = shnum
def __repr__(self):
return "<%s %s %s>" % (self.__class__.__name__,
base32.b2a_l(self.storage_index[:8], 60),
self.shnum)
def remote_read(self, offset, length):
start = time.time()
data = self._share_file.read_share_data(offset, length)
self.ss.add_latency("read", time.time() - start)
self.ss.count("read")
return data
def remote_advise_corrupt_share(self, reason):
return self.ss.remote_advise_corrupt_share("immutable",
self.storage_index,
self.shnum,
reason)


@ -1,47 +0,0 @@
import struct, time
class LeaseInfo:
def __init__(self, owner_num=None, renew_secret=None, cancel_secret=None,
expiration_time=None, nodeid=None):
self.owner_num = owner_num
self.renew_secret = renew_secret
self.cancel_secret = cancel_secret
self.expiration_time = expiration_time
if nodeid is not None:
assert isinstance(nodeid, str)
assert len(nodeid) == 20
self.nodeid = nodeid
def get_expiration_time(self):
return self.expiration_time
def get_grant_renew_time_time(self):
# hack, based upon fixed 31day expiration period
return self.expiration_time - 31*24*60*60
def get_age(self):
return time.time() - self.get_grant_renew_time_time()
def from_immutable_data(self, data):
(self.owner_num,
self.renew_secret,
self.cancel_secret,
self.expiration_time) = struct.unpack(">L32s32sL", data)
self.nodeid = None
return self
def to_immutable_data(self):
return struct.pack(">L32s32sL",
self.owner_num,
self.renew_secret, self.cancel_secret,
int(self.expiration_time))
def to_mutable_data(self):
return struct.pack(">LL32s32s20s",
self.owner_num,
int(self.expiration_time),
self.renew_secret, self.cancel_secret,
self.nodeid)
def from_mutable_data(self, data):
(self.owner_num,
self.expiration_time,
self.renew_secret, self.cancel_secret,
self.nodeid) = struct.unpack(">LL32s32s20s", data)
return self

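The old LeaseInfo above packs each immutable lease into a fixed 72-byte record. A minimal round-trip sketch, assuming the LeaseInfo class and the struct import from the listing above (the field values here are made up for illustration):

li = LeaseInfo(owner_num=1, renew_secret="\x00"*32, cancel_secret="\xff"*32,
               expiration_time=12345)
packed = li.to_immutable_data()
assert len(packed) == struct.calcsize(">L32s32sL") == 72   # 4 + 32 + 32 + 4 bytes
li2 = LeaseInfo().from_immutable_data(packed)
assert (li2.owner_num, li2.expiration_time) == (1, 12345)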

@@ -0,0 +1,391 @@
import time, simplejson
from allmydata.util.assertutil import _assert
from allmydata.util import dbutil
from allmydata.storage.common import si_b2a
from twisted.application import service
class NonExistentShareError(Exception):
def __init__(self, si_s, shnum):
Exception.__init__(self, si_s, shnum)
self.si_s = si_s
self.shnum = shnum
def __str__(self):
return "can't find SI=%r shnum=%r in `shares` table" % (self.si_s, self.shnum)
class LeaseInfo(object):
def __init__(self, storage_index, shnum, owner_num, renewal_time, expiration_time):
self.storage_index = storage_index
self.shnum = shnum
self.owner_num = owner_num
self.renewal_time = renewal_time
self.expiration_time = expiration_time
def int_or_none(s):
if s is None:
return s
return int(s)
SHARETYPE_IMMUTABLE = 0
SHARETYPE_MUTABLE = 1
SHARETYPE_CORRUPTED = 2
SHARETYPE_UNKNOWN = 3
SHARETYPES = { SHARETYPE_IMMUTABLE: 'immutable',
SHARETYPE_MUTABLE: 'mutable',
SHARETYPE_CORRUPTED: 'corrupted',
SHARETYPE_UNKNOWN: 'unknown' }
STATE_COMING = 0
STATE_STABLE = 1
STATE_GOING = 2
LEASE_SCHEMA_V1 = """
CREATE TABLE `version`
(
version INTEGER -- contains one row, set to 1
);
CREATE TABLE `shares`
(
`storage_index` VARCHAR(26) not null,
`shnum` INTEGER not null,
`prefix` VARCHAR(2) not null,
`backend_key` VARCHAR, -- not used by current backends; NULL means '$prefix/$storage_index/$shnum'
`used_space` INTEGER not null,
`sharetype` INTEGER not null, -- SHARETYPE_*
`state` INTEGER not null, -- STATE_*
PRIMARY KEY (`storage_index`, `shnum`)
);
CREATE INDEX `prefix` ON `shares` (`prefix`);
-- CREATE UNIQUE INDEX `share_id` ON `shares` (`storage_index`,`shnum`);
CREATE TABLE `leases`
(
`storage_index` VARCHAR(26) not null,
`shnum` INTEGER not null,
`account_id` INTEGER not null,
`renewal_time` INTEGER not null, -- duration is implicit: expiration-renewal
`expiration_time` INTEGER, -- seconds since epoch; NULL means the end of time
FOREIGN KEY (`storage_index`, `shnum`) REFERENCES `shares` (`storage_index`, `shnum`),
 FOREIGN KEY (`account_id`) REFERENCES `accounts` (`id`),
PRIMARY KEY (`storage_index`, `shnum`, `account_id`)
);
CREATE INDEX `account_id` ON `leases` (`account_id`);
CREATE INDEX `expiration_time` ON `leases` (`expiration_time`);
CREATE TABLE accounts
(
`id` INTEGER PRIMARY KEY AUTOINCREMENT,
`pubkey_vs` VARCHAR(52),
`creation_time` INTEGER
);
CREATE UNIQUE INDEX `pubkey_vs` ON `accounts` (`pubkey_vs`);
CREATE TABLE account_attributes
(
`id` INTEGER PRIMARY KEY AUTOINCREMENT,
`account_id` INTEGER,
`name` VARCHAR(20),
`value` VARCHAR(20) -- actually anything: usually string, unicode, integer
);
CREATE UNIQUE INDEX `account_attr` ON `account_attributes` (`account_id`, `name`);
INSERT INTO `accounts` VALUES (0, "anonymous", 0);
INSERT INTO `accounts` VALUES (1, "starter", 0);
CREATE TABLE crawler_history
(
`cycle` INTEGER,
`json` TEXT
);
CREATE UNIQUE INDEX `cycle` ON `crawler_history` (`cycle`);
"""
DAY = 24*60*60
MONTH = 30*DAY
class LeaseDB(service.Service):
ANONYMOUS_ACCOUNTID = 0
STARTER_LEASE_ACCOUNTID = 1
STARTER_LEASE_DURATION = 2*MONTH
def __init__(self, dbfile):
self.debug = False
self.retained_history_entries = 10
self._dbfile = dbfile
self._db = None
self._open_db()
def _open_db(self):
if self._db is None:
# For the reasoning behind WAL and NORMAL, refer to
# <https://tahoe-lafs.org/pipermail/tahoe-dev/2012-December/007877.html>.
(self._sqlite,
self._db) = dbutil.get_db(self._dbfile, create_version=(LEASE_SCHEMA_V1, 1),
journal_mode="WAL",
synchronous="NORMAL")
self._cursor = self._db.cursor()
def _close_db(self):
try:
self._cursor.close()
finally:
self._cursor = None
self._db.close()
self._db = None
def startService(self):
self._open_db()
def stopService(self):
self._close_db()
def get_shares_for_prefix(self, prefix):
"""
Returns a dict mapping (si_s, shnum) pairs to (used_space, sharetype, state) triples
for shares with this prefix.
"""
self._cursor.execute("SELECT `storage_index`,`shnum`, `used_space`, `sharetype`, `state`"
" FROM `shares`"
" WHERE `prefix` == ?",
(prefix,))
db_sharemap = dict([((str(si_s), int(shnum)), (int(used_space), int(sharetype), int(state)))
for (si_s, shnum, used_space, sharetype, state) in self._cursor.fetchall()])
return db_sharemap
def add_new_share(self, storage_index, shnum, used_space, sharetype):
si_s = si_b2a(storage_index)
prefix = si_s[:2]
if self.debug: print "ADD_NEW_SHARE", prefix, si_s, shnum, used_space, sharetype
backend_key = None
# This needs to be an INSERT OR REPLACE because it is possible for add_new_share
# to be called when this share is already in the database (but not on disk).
self._cursor.execute("INSERT OR REPLACE INTO `shares`"
" VALUES (?,?,?,?,?,?,?)",
(si_s, shnum, prefix, backend_key, used_space, sharetype, STATE_COMING))
def add_starter_lease(self, storage_index, shnum):
si_s = si_b2a(storage_index)
if self.debug: print "ADD_STARTER_LEASE", si_s, shnum
renewal_time = time.time()
self.add_or_renew_leases(storage_index, shnum, self.STARTER_LEASE_ACCOUNTID,
int(renewal_time), int(renewal_time + self.STARTER_LEASE_DURATION))
def mark_share_as_stable(self, storage_index, shnum, used_space=None, backend_key=None):
"""
Call this method after adding a share to backend storage.
"""
si_s = si_b2a(storage_index)
if self.debug: print "MARK_SHARE_AS_STABLE", si_s, shnum, used_space
if used_space is not None:
self._cursor.execute("UPDATE `shares` SET `state`=?, `used_space`=?, `backend_key`=?"
" WHERE `storage_index`=? AND `shnum`=? AND `state`!=?",
(STATE_STABLE, used_space, backend_key, si_s, shnum, STATE_GOING))
else:
_assert(backend_key is None, backend_key=backend_key)
self._cursor.execute("UPDATE `shares` SET `state`=?"
" WHERE `storage_index`=? AND `shnum`=? AND `state`!=?",
(STATE_STABLE, si_s, shnum, STATE_GOING))
self._db.commit()
if self._cursor.rowcount < 1:
raise NonExistentShareError(si_s, shnum)
def mark_share_as_going(self, storage_index, shnum):
"""
Call this method and commit before deleting a share from backend storage,
then call remove_deleted_share.
"""
si_s = si_b2a(storage_index)
if self.debug: print "MARK_SHARE_AS_GOING", si_s, shnum
self._cursor.execute("UPDATE `shares` SET `state`=?"
" WHERE `storage_index`=? AND `shnum`=? AND `state`!=?",
(STATE_GOING, si_s, shnum, STATE_COMING))
self._db.commit()
if self._cursor.rowcount < 1:
raise NonExistentShareError(si_s, shnum)
def remove_deleted_share(self, storage_index, shnum):
si_s = si_b2a(storage_index)
if self.debug: print "REMOVE_DELETED_SHARE", si_s, shnum
# delete leases first to maintain integrity constraint
self._cursor.execute("DELETE FROM `leases`"
" WHERE `storage_index`=? AND `shnum`=?",
(si_s, shnum))
try:
self._cursor.execute("DELETE FROM `shares`"
" WHERE `storage_index`=? AND `shnum`=?",
(si_s, shnum))
except Exception:
self._db.rollback() # roll back the lease deletion
raise
else:
self._db.commit()
def change_share_space(self, storage_index, shnum, used_space):
si_s = si_b2a(storage_index)
if self.debug: print "CHANGE_SHARE_SPACE", si_s, shnum, used_space
self._cursor.execute("UPDATE `shares` SET `used_space`=?"
" WHERE `storage_index`=? AND `shnum`=?",
(used_space, si_s, shnum))
self._db.commit()
if self._cursor.rowcount < 1:
raise NonExistentShareError(si_s, shnum)
# lease management
def add_or_renew_leases(self, storage_index, shnum, ownerid,
renewal_time, expiration_time):
"""
shnum=None means renew leases on all shares; do nothing if there are no shares for this storage_index in the `shares` table.
Raises NonExistentShareError if a specific shnum is given and that share does not exist in the `shares` table.
"""
si_s = si_b2a(storage_index)
if self.debug: print "ADD_OR_RENEW_LEASES", si_s, shnum, ownerid, renewal_time, expiration_time
if shnum is None:
self._cursor.execute("SELECT `storage_index`, `shnum` FROM `shares`"
" WHERE `storage_index`=?",
(si_s,))
rows = self._cursor.fetchall()
else:
self._cursor.execute("SELECT `storage_index`, `shnum` FROM `shares`"
" WHERE `storage_index`=? AND `shnum`=?",
(si_s, shnum))
rows = self._cursor.fetchall()
if not rows:
raise NonExistentShareError(si_s, shnum)
for (found_si_s, found_shnum) in rows:
_assert(si_s == found_si_s, si_s=si_s, found_si_s=found_si_s)
# Note that unlike the pre-LeaseDB code, this allows leases to be backdated.
# There is currently no way for a client to specify lease duration, and so
# backdating can only happen in normal operation if there is a timequake on
# the server and time goes backward by more than 31 days. This needs to be
# revisited for ticket #1816, which would allow the client to request a lease
# duration.
self._cursor.execute("INSERT OR REPLACE INTO `leases` VALUES (?,?,?,?,?)",
(si_s, found_shnum, ownerid, renewal_time, expiration_time))
self._db.commit()
def get_leases(self, storage_index, ownerid):
si_s = si_b2a(storage_index)
self._cursor.execute("SELECT `shnum`, `account_id`, `renewal_time`, `expiration_time` FROM `leases`"
" WHERE `storage_index`=? AND `account_id`=?",
(si_s, ownerid))
rows = self._cursor.fetchall()
def _to_LeaseInfo(row):
(shnum, account_id, renewal_time, expiration_time) = tuple(row)
return LeaseInfo(storage_index, int(shnum), int(account_id), float(renewal_time), float(expiration_time))
return map(_to_LeaseInfo, rows)
def get_lease_ages(self, storage_index, shnum, now):
si_s = si_b2a(storage_index)
self._cursor.execute("SELECT `renewal_time` FROM `leases`"
" WHERE `storage_index`=? AND `shnum`=?",
(si_s, shnum))
rows = self._cursor.fetchall()
def _to_age(row):
return now - float(row[0])
return map(_to_age, rows)
def get_unleased_shares_for_prefix(self, prefix):
"""
Returns a dict mapping (si_s, shnum) pairs to (used_space, sharetype, state) triples
for stable, unleased shares with this prefix.
"""
if self.debug: print "GET_UNLEASED_SHARES_FOR_PREFIX", prefix
# This would be simpler, but it doesn't work because 'NOT IN' doesn't support multiple columns.
#query = ("SELECT `storage_index`, `shnum`, `used_space`, `sharetype`, `state` FROM `shares`"
# " WHERE `state` = STATE_STABLE "
# " AND (`storage_index`, `shnum`) NOT IN (SELECT DISTINCT `storage_index`, `shnum` FROM `leases`)")
# This "negative join" should be equivalent.
self._cursor.execute("SELECT DISTINCT s.storage_index, s.shnum, s.used_space, s.sharetype, s.state"
" FROM `shares` s LEFT JOIN `leases` l"
" ON (s.storage_index = l.storage_index AND s.shnum = l.shnum)"
" WHERE s.prefix = ? AND s.state = ? AND l.storage_index IS NULL",
(prefix, STATE_STABLE))
db_sharemap = dict([((str(si_s), int(shnum)), (int(used_space), int(sharetype), int(state)))
for (si_s, shnum, used_space, sharetype, state) in self._cursor.fetchall()])
return db_sharemap
def remove_leases_by_renewal_time(self, renewal_cutoff_time):
if self.debug: print "REMOVE_LEASES_BY_RENEWAL_TIME", renewal_cutoff_time
self._cursor.execute("DELETE FROM `leases` WHERE `renewal_time` < ?",
(renewal_cutoff_time,))
self._db.commit()
def remove_leases_by_expiration_time(self, expiration_cutoff_time):
if self.debug: print "REMOVE_LEASES_BY_EXPIRATION_TIME", expiration_cutoff_time
self._cursor.execute("DELETE FROM `leases` WHERE `expiration_time` IS NOT NULL AND `expiration_time` < ?",
(expiration_cutoff_time,))
self._db.commit()
# history
def add_history_entry(self, cycle, entry):
if self.debug: print "ADD_HISTORY_ENTRY", cycle, entry
json = simplejson.dumps(entry)
self._cursor.execute("SELECT `cycle` FROM `crawler_history`")
rows = self._cursor.fetchall()
if len(rows) >= self.retained_history_entries:
first_cycle_to_retain = list(sorted(rows))[-(self.retained_history_entries - 1)][0]
self._cursor.execute("DELETE FROM `crawler_history` WHERE `cycle` < ?",
(first_cycle_to_retain,))
self._db.commit()
try:
self._cursor.execute("INSERT OR REPLACE INTO `crawler_history` VALUES (?,?)",
(cycle, json))
except Exception:
self._db.rollback() # roll back the deletion of unretained entries
raise
else:
self._db.commit()
def get_history(self):
self._cursor.execute("SELECT `cycle`,`json` FROM `crawler_history`")
rows = self._cursor.fetchall()
decoded = [(row[0], simplejson.loads(row[1])) for row in rows]
return dict(decoded)
def get_account_creation_time(self, owner_num):
self._cursor.execute("SELECT `creation_time` from `accounts`"
" WHERE `id`=?",
(owner_num,))
row = self._cursor.fetchone()
if row:
return row[0]
return None
def get_all_accounts(self):
self._cursor.execute("SELECT `id`,`pubkey_vs`"
" FROM `accounts` ORDER BY `id` ASC")
return self._cursor.fetchall()
def get_total_leased_sharecount_and_used_space(self):
self._cursor.execute("SELECT COUNT(*), SUM(`used_space`)"
" FROM (SELECT `used_space`"
" FROM `shares` s JOIN `leases` l"
" ON (s.`storage_index` = l.`storage_index` AND s.`shnum` = l.`shnum`)"
" GROUP BY s.`storage_index`, s.`shnum`)")
share_count, used_space = self._cursor.fetchall()[0]
if share_count == 0 and used_space is None:
used_space = 0
return share_count, used_space
def get_number_of_sharesets(self):
self._cursor.execute("SELECT COUNT(DISTINCT `storage_index`) AS si_num FROM `shares`")
return self._cursor.fetchall()[0][0]

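To make the share/lease pairing and the "negative join" used by get_unleased_shares_for_prefix() concrete, here is a minimal, self-contained sketch against a stripped-down two-table schema. It uses plain sqlite3 rather than the dbutil wrapper, and the table rows are invented for illustration:

import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE shares (storage_index VARCHAR(26), shnum INTEGER, prefix VARCHAR(2),
                     used_space INTEGER, state INTEGER,
                     PRIMARY KEY (storage_index, shnum));
CREATE TABLE leases (storage_index VARCHAR(26), shnum INTEGER, account_id INTEGER,
                     PRIMARY KEY (storage_index, shnum, account_id));
""")
STATE_STABLE = 1
db.execute("INSERT INTO shares VALUES ('aaxxx', 0, 'aa', 1000, ?)", (STATE_STABLE,))
db.execute("INSERT INTO shares VALUES ('aayyy', 0, 'aa', 2000, ?)", (STATE_STABLE,))
db.execute("INSERT INTO leases VALUES ('aaxxx', 0, 0)")    # only 'aaxxx' holds a lease

unleased = db.execute("SELECT s.storage_index, s.shnum, s.used_space"
                      " FROM shares s LEFT JOIN leases l"
                      " ON (s.storage_index = l.storage_index AND s.shnum = l.shnum)"
                      " WHERE s.prefix = ? AND s.state = ? AND l.storage_index IS NULL",
                      ('aa', STATE_STABLE)).fetchall()
assert unleased == [('aayyy', 0, 2000)]    # the stable share with no lease row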

@@ -1,462 +0,0 @@
import os, stat, struct
from allmydata.interfaces import BadWriteEnablerError
from allmydata.util import idlib, log
from allmydata.util.assertutil import precondition
from allmydata.util.hashutil import timing_safe_compare
from allmydata.storage.lease import LeaseInfo
from allmydata.storage.common import UnknownMutableContainerVersionError, \
DataTooLargeError
from allmydata.mutable.layout import MAX_MUTABLE_SHARE_SIZE
# the MutableShareFile is like the ShareFile, but used for mutable data. It
# has a different layout. See docs/mutable.txt for more details.
# # offset size name
# 1 0 32 magic verstr "tahoe mutable container v1" plus binary
# 2 32 20 write enabler's nodeid
# 3 52 32 write enabler
# 4 84 8 data size (actual share data present) (a)
# 5 92 8 offset of (8) count of extra leases (after data)
# 6 100 368 four leases, 92 bytes each
# 0 4 ownerid (0 means "no lease here")
# 4 4 expiration timestamp
# 8 32 renewal token
# 40 32 cancel token
# 72 20 nodeid which accepted the tokens
# 7 468 (a) data
# 8 ?? 4 count of extra leases
# 9 ?? n*92 extra leases
# The struct module doc says that L's are 4 bytes in size, and that Q's are
# 8 bytes in size. Since compatibility depends upon this, double-check it.
assert struct.calcsize(">L") == 4, struct.calcsize(">L")
assert struct.calcsize(">Q") == 8, struct.calcsize(">Q")
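# A quick sanity sketch of the layout table above (illustrative only; the names below
# are local to this sketch and rely on the struct import at the top of the file):
_sketch_header = struct.calcsize(">32s20s32sQQ")   # fields 1-5: 32 + 20 + 32 + 8 + 8 = 100
_sketch_lease = struct.calcsize(">LL32s32s20s")    # one lease record: 4 + 4 + 32 + 32 + 20 = 92
assert (_sketch_header, _sketch_lease) == (100, 92)
assert _sketch_header + 4*_sketch_lease == 468     # field 7 (share data) starts at offset 468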
class MutableShareFile:
sharetype = "mutable"
DATA_LENGTH_OFFSET = struct.calcsize(">32s20s32s")
EXTRA_LEASE_OFFSET = DATA_LENGTH_OFFSET + 8
HEADER_SIZE = struct.calcsize(">32s20s32sQQ") # doesn't include leases
LEASE_SIZE = struct.calcsize(">LL32s32s20s")
assert LEASE_SIZE == 92
DATA_OFFSET = HEADER_SIZE + 4*LEASE_SIZE
assert DATA_OFFSET == 468, DATA_OFFSET
    # our sharefiles start with a recognizable string, plus some random
# binary data to reduce the chance that a regular text file will look
# like a sharefile.
MAGIC = "Tahoe mutable container v1\n" + "\x75\x09\x44\x03\x8e"
assert len(MAGIC) == 32
MAX_SIZE = MAX_MUTABLE_SHARE_SIZE
# TODO: decide upon a policy for max share size
def __init__(self, filename, parent=None):
self.home = filename
if os.path.exists(self.home):
# we don't cache anything, just check the magic
f = open(self.home, 'rb')
data = f.read(self.HEADER_SIZE)
(magic,
write_enabler_nodeid, write_enabler,
data_length, extra_least_offset) = \
struct.unpack(">32s20s32sQQ", data)
if magic != self.MAGIC:
msg = "sharefile %s had magic '%r' but we wanted '%r'" % \
(filename, magic, self.MAGIC)
raise UnknownMutableContainerVersionError(msg)
self.parent = parent # for logging
def log(self, *args, **kwargs):
return self.parent.log(*args, **kwargs)
def create(self, my_nodeid, write_enabler):
assert not os.path.exists(self.home)
data_length = 0
extra_lease_offset = (self.HEADER_SIZE
+ 4 * self.LEASE_SIZE
+ data_length)
assert extra_lease_offset == self.DATA_OFFSET # true at creation
num_extra_leases = 0
f = open(self.home, 'wb')
header = struct.pack(">32s20s32sQQ",
self.MAGIC, my_nodeid, write_enabler,
data_length, extra_lease_offset,
)
leases = ("\x00"*self.LEASE_SIZE) * 4
f.write(header + leases)
# data goes here, empty after creation
f.write(struct.pack(">L", num_extra_leases))
# extra leases go here, none at creation
f.close()
def unlink(self):
os.unlink(self.home)
def _read_data_length(self, f):
f.seek(self.DATA_LENGTH_OFFSET)
(data_length,) = struct.unpack(">Q", f.read(8))
return data_length
def _write_data_length(self, f, data_length):
f.seek(self.DATA_LENGTH_OFFSET)
f.write(struct.pack(">Q", data_length))
def _read_share_data(self, f, offset, length):
precondition(offset >= 0)
data_length = self._read_data_length(f)
if offset+length > data_length:
# reads beyond the end of the data are truncated. Reads that
# start beyond the end of the data return an empty string.
length = max(0, data_length-offset)
if length == 0:
return ""
precondition(offset+length <= data_length)
f.seek(self.DATA_OFFSET+offset)
data = f.read(length)
return data
def _read_extra_lease_offset(self, f):
f.seek(self.EXTRA_LEASE_OFFSET)
(extra_lease_offset,) = struct.unpack(">Q", f.read(8))
return extra_lease_offset
def _write_extra_lease_offset(self, f, offset):
f.seek(self.EXTRA_LEASE_OFFSET)
f.write(struct.pack(">Q", offset))
def _read_num_extra_leases(self, f):
offset = self._read_extra_lease_offset(f)
f.seek(offset)
(num_extra_leases,) = struct.unpack(">L", f.read(4))
return num_extra_leases
def _write_num_extra_leases(self, f, num_leases):
extra_lease_offset = self._read_extra_lease_offset(f)
f.seek(extra_lease_offset)
f.write(struct.pack(">L", num_leases))
def _change_container_size(self, f, new_container_size):
if new_container_size > self.MAX_SIZE:
raise DataTooLargeError()
old_extra_lease_offset = self._read_extra_lease_offset(f)
new_extra_lease_offset = self.DATA_OFFSET + new_container_size
if new_extra_lease_offset < old_extra_lease_offset:
# TODO: allow containers to shrink. For now they remain large.
return
num_extra_leases = self._read_num_extra_leases(f)
f.seek(old_extra_lease_offset)
leases_size = 4 + num_extra_leases * self.LEASE_SIZE
extra_lease_data = f.read(leases_size)
# Zero out the old lease info (in order to minimize the chance that
# it could accidentally be exposed to a reader later, re #1528).
f.seek(old_extra_lease_offset)
f.write('\x00' * leases_size)
f.flush()
# An interrupt here will corrupt the leases.
f.seek(new_extra_lease_offset)
f.write(extra_lease_data)
self._write_extra_lease_offset(f, new_extra_lease_offset)
def _write_share_data(self, f, offset, data):
length = len(data)
precondition(offset >= 0)
data_length = self._read_data_length(f)
extra_lease_offset = self._read_extra_lease_offset(f)
if offset+length >= data_length:
# They are expanding their data size.
if self.DATA_OFFSET+offset+length > extra_lease_offset:
# TODO: allow containers to shrink. For now, they remain
# large.
# Their new data won't fit in the current container, so we
# have to move the leases. With luck, they're expanding it
# more than the size of the extra lease block, which will
# minimize the corrupt-the-share window
self._change_container_size(f, offset+length)
extra_lease_offset = self._read_extra_lease_offset(f)
# an interrupt here is ok.. the container has been enlarged
# but the data remains untouched
assert self.DATA_OFFSET+offset+length <= extra_lease_offset
# Their data now fits in the current container. We must write
# their new data and modify the recorded data size.
# Fill any newly exposed empty space with 0's.
if offset > data_length:
f.seek(self.DATA_OFFSET+data_length)
f.write('\x00'*(offset - data_length))
f.flush()
new_data_length = offset+length
self._write_data_length(f, new_data_length)
# an interrupt here will result in a corrupted share
# now all that's left to do is write out their data
f.seek(self.DATA_OFFSET+offset)
f.write(data)
return
def _write_lease_record(self, f, lease_number, lease_info):
extra_lease_offset = self._read_extra_lease_offset(f)
num_extra_leases = self._read_num_extra_leases(f)
if lease_number < 4:
offset = self.HEADER_SIZE + lease_number * self.LEASE_SIZE
elif (lease_number-4) < num_extra_leases:
offset = (extra_lease_offset
+ 4
+ (lease_number-4)*self.LEASE_SIZE)
else:
# must add an extra lease record
self._write_num_extra_leases(f, num_extra_leases+1)
offset = (extra_lease_offset
+ 4
+ (lease_number-4)*self.LEASE_SIZE)
f.seek(offset)
assert f.tell() == offset
f.write(lease_info.to_mutable_data())
def _read_lease_record(self, f, lease_number):
# returns a LeaseInfo instance, or None
extra_lease_offset = self._read_extra_lease_offset(f)
num_extra_leases = self._read_num_extra_leases(f)
if lease_number < 4:
offset = self.HEADER_SIZE + lease_number * self.LEASE_SIZE
elif (lease_number-4) < num_extra_leases:
offset = (extra_lease_offset
+ 4
+ (lease_number-4)*self.LEASE_SIZE)
else:
raise IndexError("No such lease number %d" % lease_number)
f.seek(offset)
assert f.tell() == offset
data = f.read(self.LEASE_SIZE)
lease_info = LeaseInfo().from_mutable_data(data)
if lease_info.owner_num == 0:
return None
return lease_info
def _get_num_lease_slots(self, f):
# how many places do we have allocated for leases? Not all of them
# are filled.
num_extra_leases = self._read_num_extra_leases(f)
return 4+num_extra_leases
def _get_first_empty_lease_slot(self, f):
# return an int with the index of an empty slot, or None if we do not
# currently have an empty slot
for i in range(self._get_num_lease_slots(f)):
if self._read_lease_record(f, i) is None:
return i
return None
def get_leases(self):
"""Yields a LeaseInfo instance for all leases."""
f = open(self.home, 'rb')
for i, lease in self._enumerate_leases(f):
yield lease
f.close()
def _enumerate_leases(self, f):
for i in range(self._get_num_lease_slots(f)):
try:
data = self._read_lease_record(f, i)
if data is not None:
yield i,data
except IndexError:
return
def add_lease(self, lease_info):
precondition(lease_info.owner_num != 0) # 0 means "no lease here"
f = open(self.home, 'rb+')
num_lease_slots = self._get_num_lease_slots(f)
empty_slot = self._get_first_empty_lease_slot(f)
if empty_slot is not None:
self._write_lease_record(f, empty_slot, lease_info)
else:
self._write_lease_record(f, num_lease_slots, lease_info)
f.close()
def renew_lease(self, renew_secret, new_expire_time):
accepting_nodeids = set()
f = open(self.home, 'rb+')
for (leasenum,lease) in self._enumerate_leases(f):
if timing_safe_compare(lease.renew_secret, renew_secret):
# yup. See if we need to update the owner time.
if new_expire_time > lease.expiration_time:
# yes
lease.expiration_time = new_expire_time
self._write_lease_record(f, leasenum, lease)
f.close()
return
accepting_nodeids.add(lease.nodeid)
f.close()
# Return the accepting_nodeids set, to give the client a chance to
# update the leases on a share which has been migrated from its
# original server to a new one.
msg = ("Unable to renew non-existent lease. I have leases accepted by"
" nodeids: ")
msg += ",".join([("'%s'" % idlib.nodeid_b2a(anid))
for anid in accepting_nodeids])
msg += " ."
raise IndexError(msg)
def add_or_renew_lease(self, lease_info):
precondition(lease_info.owner_num != 0) # 0 means "no lease here"
try:
self.renew_lease(lease_info.renew_secret,
lease_info.expiration_time)
except IndexError:
self.add_lease(lease_info)
def cancel_lease(self, cancel_secret):
"""Remove any leases with the given cancel_secret. If the last lease
is cancelled, the file will be removed. Return the number of bytes
that were freed (by truncating the list of leases, and possibly by
        deleting the file). Raise IndexError if there was no lease with the
given cancel_secret."""
accepting_nodeids = set()
modified = 0
remaining = 0
blank_lease = LeaseInfo(owner_num=0,
renew_secret="\x00"*32,
cancel_secret="\x00"*32,
expiration_time=0,
nodeid="\x00"*20)
f = open(self.home, 'rb+')
for (leasenum,lease) in self._enumerate_leases(f):
accepting_nodeids.add(lease.nodeid)
if timing_safe_compare(lease.cancel_secret, cancel_secret):
self._write_lease_record(f, leasenum, blank_lease)
modified += 1
else:
remaining += 1
if modified:
freed_space = self._pack_leases(f)
f.close()
if not remaining:
freed_space += os.stat(self.home)[stat.ST_SIZE]
self.unlink()
return freed_space
msg = ("Unable to cancel non-existent lease. I have leases "
"accepted by nodeids: ")
msg += ",".join([("'%s'" % idlib.nodeid_b2a(anid))
for anid in accepting_nodeids])
msg += " ."
raise IndexError(msg)
def _pack_leases(self, f):
# TODO: reclaim space from cancelled leases
return 0
def _read_write_enabler_and_nodeid(self, f):
f.seek(0)
data = f.read(self.HEADER_SIZE)
(magic,
write_enabler_nodeid, write_enabler,
data_length, extra_least_offset) = \
struct.unpack(">32s20s32sQQ", data)
assert magic == self.MAGIC
return (write_enabler, write_enabler_nodeid)
def readv(self, readv):
datav = []
f = open(self.home, 'rb')
for (offset, length) in readv:
datav.append(self._read_share_data(f, offset, length))
f.close()
return datav
# def remote_get_length(self):
# f = open(self.home, 'rb')
# data_length = self._read_data_length(f)
# f.close()
# return data_length
def check_write_enabler(self, write_enabler, si_s):
f = open(self.home, 'rb+')
(real_write_enabler, write_enabler_nodeid) = \
self._read_write_enabler_and_nodeid(f)
f.close()
# avoid a timing attack
#if write_enabler != real_write_enabler:
if not timing_safe_compare(write_enabler, real_write_enabler):
            # accommodate share migration by reporting the nodeid used for the
# old write enabler.
self.log(format="bad write enabler on SI %(si)s,"
" recorded by nodeid %(nodeid)s",
facility="tahoe.storage",
level=log.WEIRD, umid="cE1eBQ",
si=si_s, nodeid=idlib.nodeid_b2a(write_enabler_nodeid))
msg = "The write enabler was recorded by nodeid '%s'." % \
(idlib.nodeid_b2a(write_enabler_nodeid),)
raise BadWriteEnablerError(msg)
def check_testv(self, testv):
test_good = True
f = open(self.home, 'rb+')
for (offset, length, operator, specimen) in testv:
data = self._read_share_data(f, offset, length)
if not testv_compare(data, operator, specimen):
test_good = False
break
f.close()
return test_good
def writev(self, datav, new_length):
f = open(self.home, 'rb+')
for (offset, data) in datav:
self._write_share_data(f, offset, data)
if new_length is not None:
cur_length = self._read_data_length(f)
if new_length < cur_length:
self._write_data_length(f, new_length)
# TODO: if we're going to shrink the share file when the
# share data has shrunk, then call
# self._change_container_size() here.
f.close()
def testv_compare(a, op, b):
assert op in ("lt", "le", "eq", "ne", "ge", "gt")
if op == "lt":
return a < b
if op == "le":
return a <= b
if op == "eq":
return a == b
if op == "ne":
return a != b
if op == "ge":
return a >= b
if op == "gt":
return a > b
# never reached
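# A small illustrative sketch of how test vectors are evaluated: each entry is
# (offset, length, operator, specimen), and it passes when the share data read at
# that range compares true against the specimen. For example:
assert testv_compare("abc", "eq", "abc")
assert testv_compare("", "le", "x")          # EmptyShare compares every specimen against ""
assert not testv_compare("abc", "gt", "abd")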
class EmptyShare:
def check_testv(self, testv):
test_good = True
for (offset, length, operator, specimen) in testv:
data = ""
if not testv_compare(data, operator, specimen):
test_good = False
break
return test_good
def create_mutable_sharefile(filename, my_nodeid, write_enabler, parent):
ms = MutableShareFile(filename, parent)
ms.create(my_nodeid, write_enabler)
del ms
return MutableShareFile(filename, parent)


@@ -1,78 +1,54 @@
import os, re, weakref, struct, time
from foolscap.api import Referenceable
import os, weakref
from twisted.application import service
from twisted.internet import defer, reactor
from zope.interface import implements
from allmydata.interfaces import RIStorageServer, IStatsProducer
from allmydata.interfaces import IStatsProducer, IStorageBackend
from allmydata.util.assertutil import precondition
from allmydata.util import fileutil, idlib, log, time_format
import allmydata # for __full_version__
from allmydata.storage.common import si_b2a, si_a2b, storage_index_to_dir
_pyflakes_hush = [si_b2a, si_a2b, storage_index_to_dir] # re-exported
from allmydata.storage.lease import LeaseInfo
from allmydata.storage.mutable import MutableShareFile, EmptyShare, \
create_mutable_sharefile
from allmydata.mutable.layout import MAX_MUTABLE_SHARE_SIZE
from allmydata.storage.immutable import ShareFile, BucketWriter, BucketReader
from allmydata.storage.crawler import BucketCountingCrawler
from allmydata.storage.expirer import LeaseCheckingCrawler
# storage/
# storage/shares/incoming
# incoming/ holds temp dirs named $START/$STORAGEINDEX/$SHARENUM which will
# be moved to storage/shares/$START/$STORAGEINDEX/$SHARENUM upon success
# storage/shares/$START/$STORAGEINDEX
# storage/shares/$START/$STORAGEINDEX/$SHARENUM
# Where "$START" denotes the first 10 bits worth of $STORAGEINDEX (that's 2
# base-32 chars).
# $SHARENUM matches this regex:
NUM_RE=re.compile("^[0-9]+$")
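# To make the "$START" convention above concrete, a small illustrative sketch (the
# helper name is invented; si_b2a is imported above): the prefix directory is just
# the first two base-32 characters of the storage index.
def _example_prefix(storage_index):
    return si_b2a(storage_index)[:2]
assert _example_prefix("\x00"*16) == "aa"   # an all-zero 16-byte SI encodes to "aaa...", so prefix "aa"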
from allmydata.storage.accountant import Accountant
from allmydata.storage.expiration import ExpirationPolicy
class StorageServer(service.MultiService, Referenceable):
implements(RIStorageServer, IStatsProducer)
class StorageServer(service.MultiService):
implements(IStatsProducer)
name = 'storage'
LeaseCheckerClass = LeaseCheckingCrawler
DEFAULT_EXPIRATION_POLICY = ExpirationPolicy(enabled=False)
def __init__(self, storedir, nodeid, reserved_space=0,
discard_storage=False, readonly_storage=False,
def __init__(self, serverid, backend, statedir,
stats_provider=None,
expiration_enabled=False,
expiration_mode="age",
expiration_override_lease_duration=None,
expiration_cutoff_date=None,
expiration_sharetypes=("mutable", "immutable")):
expiration_policy=None,
clock=None):
service.MultiService.__init__(self)
assert isinstance(nodeid, str)
assert len(nodeid) == 20
self.my_nodeid = nodeid
self.storedir = storedir
sharedir = os.path.join(storedir, "shares")
fileutil.make_dirs(sharedir)
self.sharedir = sharedir
# we don't actually create the corruption-advisory dir until necessary
self.corruption_advisory_dir = os.path.join(storedir,
"corruption-advisories")
self.reserved_space = int(reserved_space)
self.no_storage = discard_storage
self.readonly_storage = readonly_storage
precondition(IStorageBackend.providedBy(backend), backend)
precondition(isinstance(serverid, str), serverid)
precondition(len(serverid) == 20, serverid)
self._serverid = serverid
self.clock = clock or reactor
self.stats_provider = stats_provider
if self.stats_provider:
self.stats_provider.register_producer(self)
self.incomingdir = os.path.join(sharedir, 'incoming')
self._clean_incomplete()
fileutil.make_dirs(self.incomingdir)
self._active_writers = weakref.WeakKeyDictionary()
log.msg("StorageServer created", facility="tahoe.storage")
if reserved_space:
if self.get_available_space() is None:
log.msg("warning: [storage]reserved_space= is set, but this platform does not support an API to get disk statistics (statvfs(2) or GetDiskFreeSpaceEx), so this reservation cannot be honored",
umin="0wZ27w", level=log.UNUSUAL)
self.backend = backend
self.backend.setServiceParent(self)
self._active_writers = weakref.WeakKeyDictionary()
self._statedir = statedir
fileutil.make_dirs(self._statedir)
# we don't actually create the corruption-advisory dir until necessary
self._corruption_advisory_dir = os.path.join(self._statedir,
"corruption-advisories")
log.msg("StorageServer created", facility="tahoe.storage")
self.latencies = {"allocate": [], # immutable
"write": [],
@@ -85,30 +61,30 @@ class StorageServer(service.MultiService, Referenceable):
"renew": [],
"cancel": [],
}
self.add_bucket_counter()
statefile = os.path.join(self.storedir, "lease_checker.state")
historyfile = os.path.join(self.storedir, "lease_checker.history")
klass = self.LeaseCheckerClass
self.lease_checker = klass(self, statefile, historyfile,
expiration_enabled, expiration_mode,
expiration_override_lease_duration,
expiration_cutoff_date,
expiration_sharetypes)
self.lease_checker.setServiceParent(self)
self.init_accountant(expiration_policy or self.DEFAULT_EXPIRATION_POLICY)
def init_accountant(self, expiration_policy):
dbfile = os.path.join(self._statedir, "leasedb.sqlite")
statefile = os.path.join(self._statedir, "accounting_crawler.state")
self.accountant = Accountant(self, dbfile, statefile, clock=self.clock)
self.accountant.set_expiration_policy(expiration_policy)
self.accountant.setServiceParent(self)
def get_accountant(self):
return self.accountant
def get_accounting_crawler(self):
return self.accountant.get_accounting_crawler()
def get_expiration_policy(self):
return self.accountant.get_accounting_crawler().get_expiration_policy()
def get_serverid(self):
return self._serverid
def __repr__(self):
return "<StorageServer %s>" % (idlib.shortnodeid_b2a(self.my_nodeid),)
def have_shares(self):
# quick test to decide if we need to commit to an implicit
# permutation-seed or if we should use a new one
return bool(set(os.listdir(self.sharedir)) - set(["incoming"]))
def add_bucket_counter(self):
statefile = os.path.join(self.storedir, "bucket_counter.state")
self.bucket_counter = BucketCountingCrawler(self, statefile)
self.bucket_counter.setServiceParent(self)
return "<StorageServer %s>" % (idlib.shortnodeid_b2a(self.get_serverid()),)
def count(self, name, delta=1):
if self.stats_provider:
@@ -120,11 +96,15 @@ class StorageServer(service.MultiService, Referenceable):
if len(a) > 1000:
self.latencies[category] = a[-1000:]
def _add_latency(self, res, category, start):
self.add_latency(category, self.clock.seconds() - start)
return res
def get_latencies(self):
"""Return a dict, indexed by category, that contains a dict of
latency numbers for each category. If there are sufficient samples
for unambiguous interpretation, each dict will contain the
following keys: mean, 01_0_percentile, 10_0_percentile,
following keys: samplesize, mean, 01_0_percentile, 10_0_percentile,
50_0_percentile (median), 90_0_percentile, 95_0_percentile,
99_0_percentile, 99_9_percentile. If there are insufficient
samples for a given percentile to be interpreted unambiguously
@@ -165,52 +145,25 @@ class StorageServer(service.MultiService, Referenceable):
kwargs["facility"] = "tahoe.storage"
return log.msg(*args, **kwargs)
def _clean_incomplete(self):
fileutil.rm_dir(self.incomingdir)
def get_stats(self):
# remember: RIStatsProvider requires that our return dict
# contains numeric values.
# contains numeric, or None values.
stats = { 'storage_server.allocated': self.allocated_size(), }
stats['storage_server.reserved_space'] = self.reserved_space
for category,ld in self.get_latencies().items():
for name,v in ld.items():
stats['storage_server.latencies.%s.%s' % (category, name)] = v
try:
disk = fileutil.get_disk_stats(self.sharedir, self.reserved_space)
writeable = disk['avail'] > 0
self.backend.fill_in_space_stats(stats)
# spacetime predictors should use disk_avail / (d(disk_used)/dt)
stats['storage_server.disk_total'] = disk['total']
stats['storage_server.disk_used'] = disk['used']
stats['storage_server.disk_free_for_root'] = disk['free_for_root']
stats['storage_server.disk_free_for_nonroot'] = disk['free_for_nonroot']
stats['storage_server.disk_avail'] = disk['avail']
except AttributeError:
writeable = True
except EnvironmentError:
log.msg("OS call to get disk statistics failed", level=log.UNUSUAL)
writeable = False
if self.readonly_storage:
stats['storage_server.disk_avail'] = 0
writeable = False
stats['storage_server.accepting_immutable_shares'] = int(writeable)
s = self.bucket_counter.get_state()
bucket_count = s.get("last-complete-bucket-count")
if bucket_count:
stats['storage_server.total_bucket_count'] = bucket_count
sharecount = self.accountant.get_number_of_sharesets()
leased_share_count, leased_used_space = self.accountant.get_total_leased_sharecount_and_used_space()
stats['storage_server.total_bucket_count'] = sharecount
stats["storage_server.total_leased_sharecount"] = leased_share_count
stats["storage_server.total_leased_used_space"] = leased_used_space
return stats
def get_available_space(self):
"""Returns available space for share storage in bytes, or None if no
API to get this information is available."""
if self.readonly_storage:
return 0
return fileutil.get_available_space(self.sharedir, self.reserved_space)
return self.backend.get_available_space()
def allocated_size(self):
space = 0
@@ -218,8 +171,10 @@ class StorageServer(service.MultiService, Referenceable):
space += bw.allocated_size()
return space
def remote_get_version(self):
remaining_space = self.get_available_space()
# these methods can be invoked by our callers
def client_get_version(self, account):
remaining_space = self.backend.get_available_space()
if remaining_space is None:
# We're on a platform that has no API to get disk stats.
remaining_space = 2**64
@@ -231,316 +186,154 @@ class StorageServer(service.MultiService, Referenceable):
"delete-mutable-shares-with-zero-length-writev": True,
"fills-holes-with-zero-bytes": True,
"prevents-read-past-end-of-share-data": True,
"ignores-lease-renewal-and-cancel-secrets": True,
"has-immutable-readv": True,
},
"application-version": str(allmydata.__full_version__),
}
return version
def remote_allocate_buckets(self, storage_index,
renew_secret, cancel_secret,
sharenums, allocated_size,
canary, owner_num=0):
# owner_num is not for clients to set, but rather it should be
# curried into the PersonalStorageServer instance that is dedicated
# to a particular owner.
start = time.time()
def client_allocate_buckets(self, storage_index,
sharenums, allocated_data_length,
canary, account):
start = self.clock.seconds()
self.count("allocate")
alreadygot = set()
bucketwriters = {} # k: shnum, v: BucketWriter
si_dir = storage_index_to_dir(storage_index)
si_s = si_b2a(storage_index)
log.msg("storage: allocate_buckets %s" % si_s)
# in this implementation, the lease information (including secrets)
# goes into the share files themselves. It could also be put into a
# separate database. Note that the lease should not be added until
# the BucketWriter has been closed.
expire_time = time.time() + 31*24*60*60
lease_info = LeaseInfo(owner_num,
renew_secret, cancel_secret,
expire_time, self.my_nodeid)
max_space_per_bucket = allocated_size
remaining_space = self.get_available_space()
limited = remaining_space is not None
if limited:
# this is a bit conservative, since some of this allocated_size()
# has already been written to disk, where it will show up in
# This is a bit conservative, since some of this allocated_size()
# has already been written to the backend, where it will show up in
# get_available_space.
remaining_space -= self.allocated_size()
# self.readonly_storage causes remaining_space <= 0
# If the backend is read-only, remaining_space will be <= 0.
# fill alreadygot with all shares that we have, not just the ones
# they asked about: this will save them a lot of work. Add or update
# leases for all of them: if they want us to hold shares for this
# file, they'll want us to hold leases for this file.
for (shnum, fn) in self._get_bucket_shares(storage_index):
alreadygot.add(shnum)
sf = ShareFile(fn)
sf.add_or_renew_lease(lease_info)
# Fill alreadygot with all shares that we have, not just the ones
# they asked about: this will save them a lot of work. Leases will
# be added or updated for all of them.
alreadygot = set()
shareset = self.backend.get_shareset(storage_index)
d = shareset.get_shares()
def _got_shares( (shares, corrupted) ):
remaining = remaining_space
for share in shares:
# XXX do we need to explicitly add a lease here?
alreadygot.add(share.get_shnum())
for shnum in sharenums:
incominghome = os.path.join(self.incomingdir, si_dir, "%d" % shnum)
finalhome = os.path.join(self.sharedir, si_dir, "%d" % shnum)
if os.path.exists(finalhome):
# great! we already have it. easy.
pass
elif os.path.exists(incominghome):
# Note that we don't create BucketWriters for shnums that
# have a partial share (in incoming/), so if a second upload
# occurs while the first is still in progress, the second
# uploader will use different storage servers.
pass
elif (not limited) or (remaining_space >= max_space_per_bucket):
# ok! we need to create the new share file.
bw = BucketWriter(self, incominghome, finalhome,
max_space_per_bucket, lease_info, canary)
if self.no_storage:
bw.throw_out_all_data = True
bucketwriters[shnum] = bw
self._active_writers[bw] = 1
if limited:
remaining_space -= max_space_per_bucket
else:
# bummer! not enough space to accept this bucket
pass
d2 = defer.succeed(None)
if bucketwriters:
fileutil.make_dirs(os.path.join(self.sharedir, si_dir))
# We don't create BucketWriters for shnums where we have a share
# that is corrupted. Is that right, or should we allow the corrupted
# share to be clobbered? Note that currently the disk share classes
# have assertions that prevent them from clobbering existing files.
for shnum in set(sharenums) - alreadygot - corrupted:
if shareset.has_incoming(shnum):
# Note that we don't create BucketWriters for shnums that
# have an incoming share, so if a second upload occurs while
# the first is still in progress, the second uploader will
# use different storage servers.
pass
elif (not limited) or remaining >= allocated_data_length:
if limited:
remaining -= allocated_data_length
self.add_latency("allocate", time.time() - start)
return alreadygot, bucketwriters
d2.addCallback(lambda ign, shnum=shnum:
shareset.make_bucket_writer(account, shnum, allocated_data_length,
canary))
def _record_writer(bw, shnum=shnum):
bucketwriters[shnum] = bw
self._active_writers[bw] = 1
d2.addCallback(_record_writer)
else:
# not enough space to accept this share
pass
def _iter_share_files(self, storage_index):
for shnum, filename in self._get_bucket_shares(storage_index):
f = open(filename, 'rb')
header = f.read(32)
f.close()
if header[:32] == MutableShareFile.MAGIC:
sf = MutableShareFile(filename, self)
# note: if the share has been migrated, the renew_lease()
# call will throw an exception, with information to help the
# client update the lease.
elif header[:4] == struct.pack(">L", 1):
sf = ShareFile(filename)
else:
continue # non-sharefile
yield sf
def remote_add_lease(self, storage_index, renew_secret, cancel_secret,
owner_num=1):
start = time.time()
self.count("add-lease")
new_expire_time = time.time() + 31*24*60*60
lease_info = LeaseInfo(owner_num,
renew_secret, cancel_secret,
new_expire_time, self.my_nodeid)
for sf in self._iter_share_files(storage_index):
sf.add_or_renew_lease(lease_info)
self.add_latency("add-lease", time.time() - start)
return None
def remote_renew_lease(self, storage_index, renew_secret):
start = time.time()
self.count("renew")
new_expire_time = time.time() + 31*24*60*60
found_buckets = False
for sf in self._iter_share_files(storage_index):
found_buckets = True
sf.renew_lease(renew_secret, new_expire_time)
self.add_latency("renew", time.time() - start)
if not found_buckets:
raise IndexError("no such lease to renew")
d2.addCallback(lambda ign: (alreadygot, bucketwriters))
return d2
d.addCallback(_got_shares)
d.addBoth(self._add_latency, "allocate", start)
return d
def bucket_writer_closed(self, bw, consumed_size):
if self.stats_provider:
self.stats_provider.count('storage_server.bytes_added', consumed_size)
del self._active_writers[bw]
def _get_bucket_shares(self, storage_index):
"""Return a list of (shnum, pathname) tuples for files that hold
shares for this storage_index. In each tuple, 'shnum' will always be
the integer form of the last component of 'pathname'."""
storagedir = os.path.join(self.sharedir, storage_index_to_dir(storage_index))
try:
for f in os.listdir(storagedir):
if NUM_RE.match(f):
filename = os.path.join(storagedir, f)
yield (int(f), filename)
except OSError:
# Commonly caused by there being no buckets at all.
pass
def remote_get_buckets(self, storage_index):
start = time.time()
def client_get_buckets(self, storage_index, account):
start = self.clock.seconds()
self.count("get")
si_s = si_b2a(storage_index)
log.msg("storage: get_buckets %s" % si_s)
bucketreaders = {} # k: sharenum, v: BucketReader
for shnum, filename in self._get_bucket_shares(storage_index):
bucketreaders[shnum] = BucketReader(self, filename,
storage_index, shnum)
self.add_latency("get", time.time() - start)
return bucketreaders
def get_leases(self, storage_index):
"""Provide an iterator that yields all of the leases attached to this
bucket. Each lease is returned as a LeaseInfo instance.
shareset = self.backend.get_shareset(storage_index)
d = shareset.get_shares()
def _make_readers( (shares, corrupted) ):
# We don't create BucketReaders for corrupted shares.
for share in shares:
assert not isinstance(share, defer.Deferred), share
bucketreaders[share.get_shnum()] = shareset.make_bucket_reader(account, share)
return bucketreaders
d.addCallback(_make_readers)
d.addBoth(self._add_latency, "get", start)
return d
This method is not for client use.
"""
# since all shares get the same lease data, we just grab the leases
# from the first share
try:
shnum, filename = self._get_bucket_shares(storage_index).next()
sf = ShareFile(filename)
return sf.get_leases()
except StopIteration:
return iter([])
def remote_slot_testv_and_readv_and_writev(self, storage_index,
secrets,
def client_slot_testv_and_readv_and_writev(self, storage_index,
write_enabler,
test_and_write_vectors,
read_vector):
start = time.time()
read_vector, account):
start = self.clock.seconds()
self.count("writev")
si_s = si_b2a(storage_index)
log.msg("storage: slot_writev %s" % si_s)
si_dir = storage_index_to_dir(storage_index)
(write_enabler, renew_secret, cancel_secret) = secrets
# shares exist if there is a file for them
bucketdir = os.path.join(self.sharedir, si_dir)
shares = {}
if os.path.isdir(bucketdir):
for sharenum_s in os.listdir(bucketdir):
try:
sharenum = int(sharenum_s)
except ValueError:
continue
filename = os.path.join(bucketdir, sharenum_s)
msf = MutableShareFile(filename, self)
msf.check_write_enabler(write_enabler, si_s)
shares[sharenum] = msf
# write_enabler is good for all existing shares.
# Now evaluate test vectors.
testv_is_good = True
for sharenum in test_and_write_vectors:
(testv, datav, new_length) = test_and_write_vectors[sharenum]
if sharenum in shares:
if not shares[sharenum].check_testv(testv):
self.log("testv failed: [%d]: %r" % (sharenum, testv))
testv_is_good = False
break
else:
# compare the vectors against an empty share, in which all
# reads return empty strings.
if not EmptyShare().check_testv(testv):
self.log("testv failed (empty): [%d] %r" % (sharenum,
testv))
testv_is_good = False
break
shareset = self.backend.get_shareset(storage_index)
expiration_time = start + 31*24*60*60 # one month from now
# now gather the read vectors, before we do any writes
read_data = {}
for sharenum, share in shares.items():
read_data[sharenum] = share.readv(read_vector)
d = shareset.testv_and_readv_and_writev(write_enabler, test_and_write_vectors,
read_vector, expiration_time, account)
d.addBoth(self._add_latency, "writev", start)
return d
ownerid = 1 # TODO
expire_time = time.time() + 31*24*60*60 # one month
lease_info = LeaseInfo(ownerid,
renew_secret, cancel_secret,
expire_time, self.my_nodeid)
if testv_is_good:
# now apply the write vectors
for sharenum in test_and_write_vectors:
(testv, datav, new_length) = test_and_write_vectors[sharenum]
if new_length == 0:
if sharenum in shares:
shares[sharenum].unlink()
else:
if sharenum not in shares:
# allocate a new share
allocated_size = 2000 # arbitrary, really
share = self._allocate_slot_share(bucketdir, secrets,
sharenum,
allocated_size,
owner_num=0)
shares[sharenum] = share
shares[sharenum].writev(datav, new_length)
# and update the lease
shares[sharenum].add_or_renew_lease(lease_info)
if new_length == 0:
# delete empty bucket directories
if not os.listdir(bucketdir):
os.rmdir(bucketdir)
# all done
self.add_latency("writev", time.time() - start)
return (testv_is_good, read_data)
def _allocate_slot_share(self, bucketdir, secrets, sharenum,
allocated_size, owner_num=0):
(write_enabler, renew_secret, cancel_secret) = secrets
my_nodeid = self.my_nodeid
fileutil.make_dirs(bucketdir)
filename = os.path.join(bucketdir, "%d" % sharenum)
share = create_mutable_sharefile(filename, my_nodeid, write_enabler,
self)
return share
def remote_slot_readv(self, storage_index, shares, readv):
start = time.time()
def client_slot_readv(self, storage_index, shares, readv, account):
start = self.clock.seconds()
self.count("readv")
si_s = si_b2a(storage_index)
lp = log.msg("storage: slot_readv %s %s" % (si_s, shares),
facility="tahoe.storage", level=log.OPERATIONAL)
si_dir = storage_index_to_dir(storage_index)
# shares exist if there is a file for them
bucketdir = os.path.join(self.sharedir, si_dir)
if not os.path.isdir(bucketdir):
self.add_latency("readv", time.time() - start)
return {}
datavs = {}
for sharenum_s in os.listdir(bucketdir):
try:
sharenum = int(sharenum_s)
except ValueError:
continue
if sharenum in shares or not shares:
filename = os.path.join(bucketdir, sharenum_s)
msf = MutableShareFile(filename, self)
datavs[sharenum] = msf.readv(readv)
log.msg("returning shares %s" % (datavs.keys(),),
facility="tahoe.storage", level=log.NOISY, parent=lp)
self.add_latency("readv", time.time() - start)
return datavs
log.msg("storage: slot_readv %s %s" % (si_s, shares),
facility="tahoe.storage", level=log.OPERATIONAL)
def remote_advise_corrupt_share(self, share_type, storage_index, shnum,
reason):
fileutil.make_dirs(self.corruption_advisory_dir)
shareset = self.backend.get_shareset(storage_index)
d = shareset.readv(shares, readv)
d.addBoth(self._add_latency, "readv", start)
return d
def client_advise_corrupt_share(self, share_type, storage_index, shnum, reason, account):
fileutil.make_dirs(self._corruption_advisory_dir)
now = time_format.iso_utc(sep="T")
si_s = si_b2a(storage_index)
owner_num = account.get_owner_num()
# windows can't handle colons in the filename
fn = os.path.join(self.corruption_advisory_dir,
fn = os.path.join(self._corruption_advisory_dir,
"%s--%s-%d" % (now, si_s, shnum)).replace(":","")
f = open(fn, "w")
f.write("report: Share Corruption\n")
f.write("type: %s\n" % share_type)
f.write("storage_index: %s\n" % si_s)
f.write("share_number: %d\n" % shnum)
f.write("\n")
f.write(reason)
f.write("\n")
f.close()
log.msg(format=("client claims corruption in (%(share_type)s) " +
try:
f.write("report: Share Corruption\n")
f.write("type: %s\n" % (share_type,))
f.write("storage_index: %s\n" % (si_s,))
f.write("share_number: %d\n" % (shnum,))
f.write("owner_num: %s\n" % (owner_num,))
f.write("\n")
f.write(reason)
f.write("\n")
finally:
f.close()
log.msg(format=("client #%(owner_num)d claims corruption in (%(share_type)s) " +
"%(si)s-%(shnum)d: %(reason)s"),
share_type=share_type, si=si_s, shnum=shnum, reason=reason,
owner_num=owner_num, share_type=share_type, si=si_s, shnum=shnum, reason=reason,
level=log.SCARY, umid="SGx2fA")
return None


@@ -1,14 +0,0 @@
#! /usr/bin/python
from allmydata.storage.mutable import MutableShareFile
from allmydata.storage.immutable import ShareFile
def get_share_file(filename):
f = open(filename, "rb")
prefix = f.read(32)
f.close()
if prefix == MutableShareFile.MAGIC:
return MutableShareFile(filename)
# otherwise assume it's immutable
return ShareFile(filename)


@@ -75,7 +75,7 @@ class StorageFarmBroker:
# these two are used in unit tests
def test_add_rref(self, serverid, rref, ann):
s = NativeStorageServer(serverid, ann.copy())
s = NativeStorageServer(None, ann.copy())
s.rref = rref
s._is_connected = True
self.servers[serverid] = s


@@ -188,17 +188,13 @@ class SystemFramework(pollmixin.PollMixin):
"shares.happy = 1\n"
"[storage]\n"
% (self.introducer_furl,))
# the only tests for which we want the internal nodes to actually
# retain shares are the ones where somebody's going to download
# them.
if self.mode in ("download", "download-GET", "download-GET-slow"):
# retain shares
pass
else:
# for these tests, we tell the storage servers to pretend to
# accept shares, but really just throw them out, since we're
# only testing upload and not download.
f.write("debug_discard = true\n")
# We used to set the [storage]debug_discard option to discard
# shares when they will not be needed, i.e. when self.mode not in
# ("download", "download-GET", "download-GET-slow").
# But debug_discard is no longer supported. It should be OK to
# retain the shares anyway.
if self.mode in ("receive",):
# for this mode, the client-under-test gets all the shares,
# so our internal nodes can refuse requests
@@ -249,8 +245,7 @@ this file are ignored.
else:
# don't accept any shares
f.write("readonly = true\n")
## also, if we do receive any shares, throw them away
#f.write("debug_discard = true")
if self.mode == "upload-self":
pass
f.close()


@@ -76,7 +76,7 @@ class SpeedTest:
return d
def measure_rtt(self, res):
# use RIClient.get_nodeid() to measure the foolscap-level RTT
# measure the foolscap-level RTT
d = self.client_rref.callRemote("measure_peer_response_time")
def _got(res):
assert len(res) # need at least one peer


@@ -17,7 +17,7 @@ from allmydata.check_results import CheckResults, CheckAndRepairResults, \
from allmydata.storage_client import StubServer
from allmydata.mutable.layout import unpack_header
from allmydata.mutable.publish import MutableData
from allmydata.storage.mutable import MutableShareFile
from allmydata.storage.backends.disk.mutable import MutableDiskShare
from allmydata.util import hashutil, log, fileutil, pollmixin
from allmydata.util.assertutil import precondition
from allmydata.util.consumer import download_to_data
@@ -35,11 +35,32 @@ def flush_but_dont_ignore(res):
d.addCallback(_done)
return d
class DummyProducer:
implements(IPullProducer)
def resumeProducing(self):
pass
class Marker:
pass
class FakeCanary:
def __init__(self, ignore_disconnectors=False):
self.ignore = ignore_disconnectors
self.disconnectors = {}
def notifyOnDisconnect(self, f, *args, **kwargs):
if self.ignore:
return
m = Marker()
self.disconnectors[m] = (f, args, kwargs)
return m
def dontNotifyOnDisconnect(self, marker):
if self.ignore:
return
del self.disconnectors[marker]
class FakeCHKFileNode:
"""I provide IImmutableFileNode, but all of my data is stored in a
class-level dictionary."""
@@ -430,6 +451,34 @@ def create_mutable_filenode(contents, mdmf=False, all_contents=None):
return filenode
class CrawlerTestMixin:
def _wait_for_yield(self, res, crawler):
"""
Wait for the crawler to yield. This should be called at the end of a test
so that we leave a clean reactor.
"""
if isinstance(res, failure.Failure):
print res
d = crawler.set_hook('yield')
d.addCallback(lambda ign: res)
return d
def _after_prefix(self, prefix, target_prefix, crawler):
"""
Wait for the crawler to reach a given target_prefix. Return a deferred
for the crawler state at that point.
"""
if prefix != target_prefix:
d = crawler.set_hook('after_prefix')
d.addCallback(self._after_prefix, target_prefix, crawler)
return d
crawler.save_state()
state = crawler.get_state()
self.failUnlessEqual(prefix, state["last-complete-prefix"])
return defer.succeed(state)
class LoggingServiceParent(service.MultiService):
def log(self, *args, **kwargs):
return log.msg(*args, **kwargs)
@@ -457,6 +506,9 @@ class SystemTestMixin(pollmixin.PollMixin, testutil.StallMixin):
d.addBoth(flush_but_dont_ignore)
return d
def workdir(self, name):
return os.path.join("system", self.__class__.__name__, name)
def getdir(self, subdir):
return os.path.join(self.basedir, subdir)
@@ -575,11 +627,10 @@ class SystemTestMixin(pollmixin.PollMixin, testutil.StallMixin):
else:
config += nodeconfig
fileutil.write(os.path.join(basedir, 'tahoe.cfg'), config)
# give subclasses a chance to append lines to the nodes' tahoe.cfg files.
config += self._get_extra_config(i)
# give subclasses a chance to append lines to the node's tahoe.cfg
# files before they are launched.
self._set_up_nodes_extra_config()
fileutil.write(os.path.join(basedir, 'tahoe.cfg'), config)
        # start clients[0], wait for its tub to be ready (at which point it
# will have registered the helper furl).
@@ -617,9 +668,9 @@ class SystemTestMixin(pollmixin.PollMixin, testutil.StallMixin):
d.addCallback(_connected)
return d
def _set_up_nodes_extra_config(self):
def _get_extra_config(self, i):
# for overriding by subclasses
pass
return ""
def _grab_stats(self, res):
d = self.stats_gatherer.poll()
@@ -1279,8 +1330,8 @@ def _corrupt_offset_of_uri_extension_to_force_short_read(data, debug=False):
def _corrupt_mutable_share_data(data, debug=False):
prefix = data[:32]
assert prefix == MutableShareFile.MAGIC, "This function is designed to corrupt mutable shares of v1, and the magic number doesn't look right: %r vs %r" % (prefix, MutableShareFile.MAGIC)
data_offset = MutableShareFile.DATA_OFFSET
assert prefix == MutableDiskShare.MAGIC, "This function is designed to corrupt mutable shares of v1, and the magic number doesn't look right: %r vs %r" % (prefix, MutableDiskShare.MAGIC)
data_offset = MutableDiskShare.DATA_OFFSET
sharetype = data[data_offset:data_offset+1]
assert sharetype == "\x00", "non-SDMF mutable shares not supported"
(version, ig_seqnum, ig_roothash, ig_IV, ig_k, ig_N, ig_segsize,


@ -51,8 +51,9 @@ class WebRenderingMixin:
ctx = self.make_context(req)
return page.renderSynchronously(ctx)
def failUnlessIn(self, substring, s):
self.failUnless(substring in s, s)
def render_json(self, page):
d = self.render1(page, args={"t": ["json"]})
return d
def remove_tags(self, s):
s = re.sub(r'<[^>]*>', ' ', s)


@ -13,23 +13,28 @@
# Tubs, so it is not useful for tests that involve a Helper, a KeyGenerator,
# or the control.furl .
import os.path
import os.path, shutil
from zope.interface import implements
from twisted.application import service
from twisted.internet import defer, reactor
from twisted.python.failure import Failure
from foolscap.api import Referenceable, fireEventually, RemoteException
from base64 import b32encode
from allmydata import uri as tahoe_uri
from allmydata.client import Client
from allmydata.storage.server import StorageServer, storage_index_to_dir
from allmydata.util import fileutil, idlib, hashutil
from allmydata.storage.server import StorageServer
from allmydata.storage.backends.disk.disk_backend import DiskBackend
from allmydata.util import fileutil, idlib, hashutil, log
from allmydata.util.hashutil import sha1
from allmydata.test.common_web import HTTPClientGETFactory
from allmydata.interfaces import IStorageBroker, IServer
from allmydata.test.common import TEST_RSA_KEY_SIZE
PRINT_TRACEBACKS = False
class IntentionalError(Exception):
pass
@ -87,23 +92,34 @@ class LocalWrapper:
return d2
return _really_call()
if PRINT_TRACEBACKS:
import traceback
tb = traceback.extract_stack()
d = fireEventually()
d.addCallback(lambda res: _call())
def _wrap_exception(f):
if PRINT_TRACEBACKS and not f.check(NameError):
print ">>>" + ">>>".join(traceback.format_list(tb))
print "+++ %s%r %r: %s" % (methname, args, kwargs, f)
#f.printDetailedTraceback()
return Failure(RemoteException(f))
d.addErrback(_wrap_exception)
def _return_membrane(res):
# rather than complete the difficult task of building a
# Rather than complete the difficult task of building a
# fully-general Membrane (which would locate all Referenceable
# objects that cross the simulated wire and replace them with
# wrappers), we special-case certain methods that we happen to
# know will return Referenceables.
# The outer return value of such a method may be Deferred, but
# its components must not be.
if methname == "allocate_buckets":
(alreadygot, allocated) = res
for shnum in allocated:
assert not isinstance(allocated[shnum], defer.Deferred), (methname, allocated)
allocated[shnum] = LocalWrapper(allocated[shnum])
if methname == "get_buckets":
for shnum in res:
assert not isinstance(res[shnum], defer.Deferred), (methname, res)
res[shnum] = LocalWrapper(res[shnum])
return res
d.addCallback(_return_membrane)
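# Illustrative note (not part of the diff): because of the special-casing
# above, a caller of the wrapped server still receives plain share numbers as
# keys, but each value is itself a LocalWrapper, so follow-up remote calls on
# the returned readers also pass through the simulated wire:
#
#     d = wrapped_server.callRemote("get_buckets", storage_index)
#     def _got(buckets):
#         # buckets maps shnum -> LocalWrapper around a bucket reader
#         return buckets[0].callRemote("read", 0, 100)
#     d.addCallback(_got)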
@ -168,11 +184,20 @@ class NoNetworkStorageBroker:
seed = server.get_permutation_seed()
return sha1(peer_selection_index + seed).digest()
return sorted(self.get_connected_servers(), key=_permuted)
def get_connected_servers(self):
return self.client._servers
def get_nickname_for_serverid(self, serverid):
return None
def get_known_servers(self):
return self.get_connected_servers()
def get_all_serverids(self):
return self.client.get_all_serverids()
class NoNetworkClient(Client):
def create_tub(self):
pass
@ -234,8 +259,8 @@ class NoNetworkGrid(service.MultiService):
self.clients = []
for i in range(num_servers):
ss = self.make_server(i)
self.add_server(i, ss)
server = self.make_server(i)
self.add_server(i, server)
self.rebuild_serverlist()
for i in range(num_clients):
@ -266,23 +291,25 @@ class NoNetworkGrid(service.MultiService):
def make_server(self, i, readonly=False):
serverid = hashutil.tagged_hash("serverid", str(i))[:20]
serverdir = os.path.join(self.basedir, "servers",
idlib.shortnodeid_b2a(serverid), "storage")
fileutil.make_dirs(serverdir)
ss = StorageServer(serverdir, serverid, stats_provider=SimpleStats(),
readonly_storage=readonly)
ss._no_network_server_number = i
return ss
storagedir = os.path.join(self.basedir, "servers",
idlib.shortnodeid_b2a(serverid), "storage")
def add_server(self, i, ss):
# The backend will make the storage directory and any necessary parents.
backend = DiskBackend(storagedir, readonly=readonly)
server = StorageServer(serverid, backend, storagedir, stats_provider=SimpleStats())
server._no_network_server_number = i
return server
def add_server(self, i, server):
# to deal with the fact that all StorageServers are named 'storage',
# we interpose a middleman
middleman = service.MultiService()
middleman.setServiceParent(self)
ss.setServiceParent(middleman)
serverid = ss.my_nodeid
self.servers_by_number[i] = ss
wrapper = wrap_storage_server(ss)
server.setServiceParent(middleman)
serverid = server.get_serverid()
self.servers_by_number[i] = server
aa = server.get_accountant().get_anonymous_account()
wrapper = wrap_storage_server(aa)
self.wrappers_by_id[serverid] = wrapper
self.proxies_by_id[serverid] = NoNetworkServer(serverid, wrapper)
self.rebuild_serverlist()
@ -298,14 +325,14 @@ class NoNetworkGrid(service.MultiService):
def remove_server(self, serverid):
# it's enough to remove the server from c._servers (we don't actually
# have to detach and stopService it)
for i,ss in self.servers_by_number.items():
if ss.my_nodeid == serverid:
for i, server in self.servers_by_number.items():
if server.get_serverid() == serverid:
del self.servers_by_number[i]
break
del self.wrappers_by_id[serverid]
del self.proxies_by_id[serverid]
self.rebuild_serverlist()
return ss
return server
def break_server(self, serverid, count=True):
# mark the given server as broken, so it will throw exceptions when
@ -315,23 +342,25 @@ class NoNetworkGrid(service.MultiService):
def hang_server(self, serverid):
# hang the given server
ss = self.wrappers_by_id[serverid]
assert ss.hung_until is None
ss.hung_until = defer.Deferred()
server = self.wrappers_by_id[serverid]
assert server.hung_until is None
server.hung_until = defer.Deferred()
def unhang_server(self, serverid):
# unhang the given server
ss = self.wrappers_by_id[serverid]
assert ss.hung_until is not None
ss.hung_until.callback(None)
ss.hung_until = None
server = self.wrappers_by_id[serverid]
assert server.hung_until is not None
server.hung_until.callback(None)
server.hung_until = None
def nuke_from_orbit(self):
""" Empty all share directories in this grid. It's the only way to be sure ;-) """
"""Empty all share directories in this grid. It's the only way to be sure ;-)
This works only for a disk backend."""
for server in self.servers_by_number.values():
for prefixdir in os.listdir(server.sharedir):
sharedir = server.backend._sharedir
for prefixdir in os.listdir(sharedir):
if prefixdir != 'incoming':
fileutil.rm_dir(os.path.join(server.sharedir, prefixdir))
fileutil.rm_dir(os.path.join(sharedir, prefixdir))
class GridTestMixin:
@ -358,66 +387,117 @@ class GridTestMixin:
def get_clientdir(self, i=0):
return self.g.clients[i].basedir
def get_server(self, i):
return self.g.servers_by_number[i]
def get_serverdir(self, i):
return self.g.servers_by_number[i].storedir
return self.g.servers_by_number[i].backend._storedir
def remove_server(self, i):
self.g.remove_server(self.g.servers_by_number[i].get_serverid())
def iterate_servers(self):
for i in sorted(self.g.servers_by_number.keys()):
ss = self.g.servers_by_number[i]
yield (i, ss, ss.storedir)
server = self.g.servers_by_number[i]
yield (i, server, server.backend._storedir)
def find_uri_shares(self, uri):
si = tahoe_uri.from_string(uri).get_storage_index()
prefixdir = storage_index_to_dir(si)
shares = []
for i,ss in self.g.servers_by_number.items():
serverid = ss.my_nodeid
basedir = os.path.join(ss.sharedir, prefixdir)
if not os.path.exists(basedir):
continue
for f in os.listdir(basedir):
try:
shnum = int(f)
shares.append((shnum, serverid, os.path.join(basedir, f)))
except ValueError:
pass
return sorted(shares)
sharelist = []
d = defer.succeed(None)
for i, server in self.g.servers_by_number.items():
d.addCallback(lambda ign, server=server: server.backend.get_shareset(si).get_shares())
def _append_shares( (shares_for_server, corrupted), server=server):
assert len(corrupted) == 0, (shares_for_server, corrupted)
for share in shares_for_server:
assert not isinstance(share, defer.Deferred), share
sharelist.append( (share.get_shnum(), server.get_serverid(), share._get_path()) )
d.addCallback(_append_shares)
d.addCallback(lambda ign: sorted(sharelist))
return d
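# Illustrative note (not part of the diff): find_uri_shares() now returns a
# Deferred firing with a sorted list of (shnum, serverid, sharefile) tuples,
# so callers chain a callback instead of iterating the result directly:
#
#     d = self.find_uri_shares(self.uri)
#     def _got_shares(shares):
#         self.failUnlessEqual(len(shares), 10)
#         (shnum, serverid, sharefile) = shares[0]
#     d.addCallback(_got_shares)
#     return d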
def add_server(self, server_number, readonly=False):
assert self.g, "I tried to find a grid at self.g, but failed"
ss = self.g.make_server(server_number, readonly)
log.msg("just created a server, number: %s => %s" % (server_number, ss,))
self.g.add_server(server_number, ss)
def add_server_with_share(self, uri, server_number, share_number=None,
readonly=False):
self.add_server(server_number, readonly)
if share_number is not None:
self.copy_share_to_server(uri, server_number, share_number)
def copy_share_to_server(self, uri, server_number, share_number):
ss = self.g.servers_by_number[server_number]
self.copy_share(self.shares[share_number], uri, ss)
def copy_shares(self, uri):
shares = {}
for (shnum, serverid, sharefile) in self.find_uri_shares(uri):
shares[sharefile] = open(sharefile, "rb").read()
return shares
d = self.find_uri_shares(uri)
def _got_shares(sharelist):
for (shnum, serverid, sharefile) in sharelist:
shares[sharefile] = fileutil.read(sharefile)
return shares
d.addCallback(_got_shares)
return d
def copy_share(self, from_share, uri, to_server):
si = tahoe_uri.from_string(uri).get_storage_index()
(i_shnum, i_serverid, i_sharefile) = from_share
shares_dir = to_server.backend.get_shareset(si)._get_sharedir()
new_sharefile = os.path.join(shares_dir, str(i_shnum))
fileutil.make_dirs(shares_dir)
if os.path.normpath(i_sharefile) != os.path.normpath(new_sharefile):
shutil.copy(i_sharefile, new_sharefile)
def restore_all_shares(self, shares):
for sharefile, data in shares.items():
open(sharefile, "wb").write(data)
fileutil.write(sharefile, data)
def delete_share(self, (shnum, serverid, sharefile)):
os.unlink(sharefile)
fileutil.remove(sharefile)
def delete_shares_numbered(self, uri, shnums):
for (i_shnum, i_serverid, i_sharefile) in self.find_uri_shares(uri):
if i_shnum in shnums:
os.unlink(i_sharefile)
d = self.find_uri_shares(uri)
def _got_shares(sharelist):
for (i_shnum, i_serverid, i_sharefile) in sharelist:
if i_shnum in shnums:
fileutil.remove(i_sharefile)
d.addCallback(_got_shares)
return d
def corrupt_share(self, (shnum, serverid, sharefile), corruptor_function):
sharedata = open(sharefile, "rb").read()
corruptdata = corruptor_function(sharedata)
open(sharefile, "wb").write(corruptdata)
def delete_all_shares(self, uri):
d = self.find_uri_shares(uri)
def _got_shares(shares):
for sh in shares:
self.delete_share(sh)
d.addCallback(_got_shares)
return d
def corrupt_share(self, (shnum, serverid, sharefile), corruptor_function, debug=False):
sharedata = fileutil.read(sharefile)
corruptdata = corruptor_function(sharedata, debug=debug)
fileutil.write(sharefile, corruptdata)
def corrupt_shares_numbered(self, uri, shnums, corruptor, debug=False):
for (i_shnum, i_serverid, i_sharefile) in self.find_uri_shares(uri):
if i_shnum in shnums:
sharedata = open(i_sharefile, "rb").read()
corruptdata = corruptor(sharedata, debug=debug)
open(i_sharefile, "wb").write(corruptdata)
d = self.find_uri_shares(uri)
def _got_shares(sharelist):
for (i_shnum, i_serverid, i_sharefile) in sharelist:
if i_shnum in shnums:
self.corrupt_share((i_shnum, i_serverid, i_sharefile), corruptor, debug=debug)
d.addCallback(_got_shares)
return d
def corrupt_all_shares(self, uri, corruptor, debug=False):
for (i_shnum, i_serverid, i_sharefile) in self.find_uri_shares(uri):
sharedata = open(i_sharefile, "rb").read()
corruptdata = corruptor(sharedata, debug=debug)
open(i_sharefile, "wb").write(corruptdata)
d = self.find_uri_shares(uri)
def _got_shares(sharelist):
for (i_shnum, i_serverid, i_sharefile) in sharelist:
self.corrupt_share((i_shnum, i_serverid, i_sharefile), corruptor, debug=debug)
d.addCallback(_got_shares)
return d
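# Illustrative sketch (not part of the diff): corruptor functions passed to
# corrupt_share() / corrupt_shares_numbered() / corrupt_all_shares() take the
# raw share bytes plus a debug flag and return the corrupted bytes. A trivial
# example that flips one byte at an arbitrary offset:
def _flip_byte_corruptor(sharedata, debug=False):
    offset = 0x44   # offset chosen only for illustration
    flipped = chr(ord(sharedata[offset]) ^ 0xFF)
    return sharedata[:offset] + flipped + sharedata[offset+1:]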
def GET(self, urlpath, followRedirect=False, return_response=False,
method="GET", clientnum=0, **kwargs):


@ -1,20 +1,20 @@
import simplejson
import os.path, shutil
from twisted.trial import unittest
from twisted.internet import defer
from allmydata import check_results, uri
from allmydata import uri as tahoe_uri
from allmydata.util import base32
from allmydata.web import check_results as web_check_results
from allmydata.storage_client import StorageFarmBroker, NativeStorageServer
from allmydata.storage.server import storage_index_to_dir
from allmydata.monitor import Monitor
from allmydata.test.no_network import GridTestMixin
from allmydata.immutable.upload import Data
from allmydata.test.common_web import WebRenderingMixin
from allmydata.mutable.publish import MutableData
class FakeClient:
def get_storage_broker(self):
return self.storage_broker
@ -41,7 +41,7 @@ class WebResultsRendering(unittest.TestCase, WebRenderingMixin):
"my-version": "ver",
"oldest-supported": "oldest",
}
s = NativeStorageServer(key_s, ann)
s = NativeStorageServer(key_s, ann, None)
sb.test_add_server(peerid, s) # XXX: maybe use key_s?
c = FakeClient()
c.storage_broker = sb
@ -315,53 +315,23 @@ class WebResultsRendering(unittest.TestCase, WebRenderingMixin):
class BalancingAct(GridTestMixin, unittest.TestCase):
# test for #1115 regarding the 'count-good-share-hosts' metric
def add_server(self, server_number, readonly=False):
assert self.g, "I tried to find a grid at self.g, but failed"
ss = self.g.make_server(server_number, readonly)
#log.msg("just created a server, number: %s => %s" % (server_number, ss,))
self.g.add_server(server_number, ss)
def add_server_with_share(self, server_number, uri, share_number=None,
readonly=False):
self.add_server(server_number, readonly)
if share_number is not None:
self.copy_share_to_server(uri, share_number, server_number)
def copy_share_to_server(self, uri, share_number, server_number):
ss = self.g.servers_by_number[server_number]
# Copy share i from the directory associated with the first
# storage server to the directory associated with this one.
assert self.g, "I tried to find a grid at self.g, but failed"
assert self.shares, "I tried to find shares at self.shares, but failed"
old_share_location = self.shares[share_number][2]
new_share_location = os.path.join(ss.storedir, "shares")
si = tahoe_uri.from_string(self.uri).get_storage_index()
new_share_location = os.path.join(new_share_location,
storage_index_to_dir(si))
if not os.path.exists(new_share_location):
os.makedirs(new_share_location)
new_share_location = os.path.join(new_share_location,
str(share_number))
if old_share_location != new_share_location:
shutil.copy(old_share_location, new_share_location)
shares = self.find_uri_shares(uri)
# Make sure that the storage server has the share.
self.failUnless((share_number, ss.my_nodeid, new_share_location)
in shares)
def _pretty_shares_chart(self, uri):
def _print_pretty_shares_chart(self, res):
# Servers are labeled A-Z, shares are labeled 0-9
letters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
assert len(self.g.servers_by_number) < len(letters), \
"This little printing function is only meant for < 26 servers"
shares_chart = {}
names = dict(zip([ss.my_nodeid
names = dict(zip([ss.get_serverid()
for _,ss in self.g.servers_by_number.iteritems()],
letters))
for shnum, serverid, _ in self.find_uri_shares(uri):
shares_chart.setdefault(shnum, []).append(names[serverid])
return shares_chart
d = self.find_uri_shares(self.uri)
def _got(shares):
shares_chart = {}
for shnum, serverid, _ in shares:
shares_chart.setdefault(shnum, []).append(names[serverid])
print shares_chart
return res
d.addCallback(_got)
return d
def test_good_share_hosts(self):
self.basedir = "checker/BalancingAct/1115"
@ -383,18 +353,18 @@ class BalancingAct(GridTestMixin, unittest.TestCase):
self.shares = shares
d.addCallback(_store_shares)
def add_three(_, i):
# Add a new server with just share 3
self.add_server_with_share(i, self.uri, 3)
#print self._pretty_shares_chart(self.uri)
for i in range(1,5):
d.addCallback(add_three, i)
def _check_and_repair(_):
return self.imm.check_and_repair(Monitor())
def _layout(ign):
# Add servers with just share 3
for i in range(1, 5):
self.add_server_with_share(self.uri, server_number=i, share_number=3)
d.addCallback(_layout)
#d.addCallback(self._print_pretty_shares_chart)
def _check_and_repair(ign):
d2 = self.imm.check_and_repair(Monitor())
#d2.addCallback(self._print_pretty_shares_chart)
return d2
def _check_counts(crr, shares_good, good_share_hosts):
prr = crr.get_post_repair_results()
#print self._pretty_shares_chart(self.uri)
self.failUnlessEqual(prr.get_share_counter_good(), shares_good)
self.failUnlessEqual(prr.get_host_counter_good_shares(),
good_share_hosts)
@ -418,6 +388,7 @@ class BalancingAct(GridTestMixin, unittest.TestCase):
d.addCallback(_check_counts, 0, 0)
return d
class AddLease(GridTestMixin, unittest.TestCase):
# test for #875, in which failures in the add-lease call cause
# false-negatives in the checker
@ -454,8 +425,9 @@ class AddLease(GridTestMixin, unittest.TestCase):
def broken_add_lease(*args, **kwargs):
really_did_break.append(1)
raise KeyError("intentional failure, should be ignored")
assert self.g.servers_by_number[0].remote_add_lease
self.g.servers_by_number[0].remote_add_lease = broken_add_lease
ss = self.g.servers_by_number[0].get_accountant().get_anonymous_account()
assert ss.remote_add_lease
ss.remote_add_lease = broken_add_lease
d.addCallback(_break_add_lease)
# and confirm that the files still look healthy


@ -18,10 +18,10 @@ import allmydata.scripts.common_http
from pycryptopp.publickey import ed25519
# Test that the scripts can be imported.
from allmydata.scripts import create_node, debug, keygen, startstop_node, \
from allmydata.scripts import create_node, admin, debug, keygen, startstop_node, \
tahoe_add_alias, tahoe_backup, tahoe_check, tahoe_cp, tahoe_get, tahoe_ls, \
tahoe_manifest, tahoe_mkdir, tahoe_mv, tahoe_put, tahoe_unlink, tahoe_webopen
_hush_pyflakes = [create_node, debug, keygen, startstop_node,
_hush_pyflakes = [create_node, admin, debug, keygen, startstop_node,
tahoe_add_alias, tahoe_backup, tahoe_check, tahoe_cp, tahoe_get, tahoe_ls,
tahoe_manifest, tahoe_mkdir, tahoe_mv, tahoe_put, tahoe_unlink, tahoe_webopen]
@ -107,10 +107,6 @@ class CLI(CLITestMixin, unittest.TestCase):
self.failUnless("k/N: 25/100" in output, output)
self.failUnless("storage index: hdis5iaveku6lnlaiccydyid7q" in output, output)
output = self._dump_cap("--client-secret", "5s33nk3qpvnj2fw3z4mnm2y6fa",
u.to_string())
self.failUnless("client renewal secret: znxmki5zdibb5qlt46xbdvk2t55j7hibejq3i5ijyurkr6m6jkhq" in output, output)
output = self._dump_cap(u.get_verify_cap().to_string())
self.failIf("key: " in output, output)
self.failUnless("UEB hash: nf3nimquen7aeqm36ekgxomalstenpkvsdmf6fplj7swdatbv5oa" in output, output)
@ -145,31 +141,8 @@ class CLI(CLITestMixin, unittest.TestCase):
self.failUnless("storage index: nt4fwemuw7flestsezvo2eveke" in output, output)
self.failUnless("fingerprint: 737p57x6737p57x6737p57x6737p57x6737p57x6737p57x6737a" in output, output)
output = self._dump_cap("--client-secret", "5s33nk3qpvnj2fw3z4mnm2y6fa",
u.to_string())
self.failUnless("file renewal secret: arpszxzc2t6kb4okkg7sp765xgkni5z7caavj7lta73vmtymjlxq" in output, output)
fileutil.make_dirs("cli/test_dump_cap/private")
fileutil.write("cli/test_dump_cap/private/secret", "5s33nk3qpvnj2fw3z4mnm2y6fa\n")
output = self._dump_cap("--client-dir", "cli/test_dump_cap",
u.to_string())
self.failUnless("file renewal secret: arpszxzc2t6kb4okkg7sp765xgkni5z7caavj7lta73vmtymjlxq" in output, output)
output = self._dump_cap("--client-dir", "cli/test_dump_cap_BOGUS",
u.to_string())
self.failIf("file renewal secret:" in output, output)
output = self._dump_cap("--nodeid", "tqc35esocrvejvg4mablt6aowg6tl43j",
u.to_string())
output = self._dump_cap("--nodeid", "tqc35esocrvejvg4mablt6aowg6tl43j", u.to_string())
self.failUnless("write_enabler: mgcavriox2wlb5eer26unwy5cw56elh3sjweffckkmivvsxtaknq" in output, output)
self.failIf("file renewal secret:" in output, output)
output = self._dump_cap("--nodeid", "tqc35esocrvejvg4mablt6aowg6tl43j",
"--client-secret", "5s33nk3qpvnj2fw3z4mnm2y6fa",
u.to_string())
self.failUnless("write_enabler: mgcavriox2wlb5eer26unwy5cw56elh3sjweffckkmivvsxtaknq" in output, output)
self.failUnless("file renewal secret: arpszxzc2t6kb4okkg7sp765xgkni5z7caavj7lta73vmtymjlxq" in output, output)
self.failUnless("lease renewal secret: 7pjtaumrb7znzkkbvekkmuwpqfjyfyamznfz4bwwvmh4nw33lorq" in output, output)
u = u.get_readonly()
output = self._dump_cap(u.to_string())
@ -196,31 +169,8 @@ class CLI(CLITestMixin, unittest.TestCase):
self.failUnless("storage index: nt4fwemuw7flestsezvo2eveke" in output, output)
self.failUnless("fingerprint: 737p57x6737p57x6737p57x6737p57x6737p57x6737p57x6737a" in output, output)
output = self._dump_cap("--client-secret", "5s33nk3qpvnj2fw3z4mnm2y6fa",
u.to_string())
self.failUnless("file renewal secret: arpszxzc2t6kb4okkg7sp765xgkni5z7caavj7lta73vmtymjlxq" in output, output)
fileutil.make_dirs("cli/test_dump_cap/private")
fileutil.write("cli/test_dump_cap/private/secret", "5s33nk3qpvnj2fw3z4mnm2y6fa\n")
output = self._dump_cap("--client-dir", "cli/test_dump_cap",
u.to_string())
self.failUnless("file renewal secret: arpszxzc2t6kb4okkg7sp765xgkni5z7caavj7lta73vmtymjlxq" in output, output)
output = self._dump_cap("--client-dir", "cli/test_dump_cap_BOGUS",
u.to_string())
self.failIf("file renewal secret:" in output, output)
output = self._dump_cap("--nodeid", "tqc35esocrvejvg4mablt6aowg6tl43j",
u.to_string())
output = self._dump_cap("--nodeid", "tqc35esocrvejvg4mablt6aowg6tl43j", u.to_string())
self.failUnless("write_enabler: mgcavriox2wlb5eer26unwy5cw56elh3sjweffckkmivvsxtaknq" in output, output)
self.failIf("file renewal secret:" in output, output)
output = self._dump_cap("--nodeid", "tqc35esocrvejvg4mablt6aowg6tl43j",
"--client-secret", "5s33nk3qpvnj2fw3z4mnm2y6fa",
u.to_string())
self.failUnless("write_enabler: mgcavriox2wlb5eer26unwy5cw56elh3sjweffckkmivvsxtaknq" in output, output)
self.failUnless("file renewal secret: arpszxzc2t6kb4okkg7sp765xgkni5z7caavj7lta73vmtymjlxq" in output, output)
self.failUnless("lease renewal secret: 7pjtaumrb7znzkkbvekkmuwpqfjyfyamznfz4bwwvmh4nw33lorq" in output, output)
u = u.get_readonly()
output = self._dump_cap(u.to_string())
@ -257,10 +207,6 @@ class CLI(CLITestMixin, unittest.TestCase):
self.failUnless("k/N: 25/100" in output, output)
self.failUnless("storage index: hdis5iaveku6lnlaiccydyid7q" in output, output)
output = self._dump_cap("--client-secret", "5s33nk3qpvnj2fw3z4mnm2y6fa",
u.to_string())
self.failUnless("file renewal secret: csrvkjgomkyyyil5yo4yk5np37p6oa2ve2hg6xmk2dy7kaxsu6xq" in output, output)
u = u.get_verify_cap()
output = self._dump_cap(u.to_string())
self.failUnless("CHK Directory Verifier URI:" in output, output)
@ -285,21 +231,8 @@ class CLI(CLITestMixin, unittest.TestCase):
output)
self.failUnless("fingerprint: 737p57x6737p57x6737p57x6737p57x6737p57x6737p57x6737a" in output, output)
output = self._dump_cap("--client-secret", "5s33nk3qpvnj2fw3z4mnm2y6fa",
u.to_string())
self.failUnless("file renewal secret: arpszxzc2t6kb4okkg7sp765xgkni5z7caavj7lta73vmtymjlxq" in output, output)
output = self._dump_cap("--nodeid", "tqc35esocrvejvg4mablt6aowg6tl43j",
u.to_string())
output = self._dump_cap("--nodeid", "tqc35esocrvejvg4mablt6aowg6tl43j", u.to_string())
self.failUnless("write_enabler: mgcavriox2wlb5eer26unwy5cw56elh3sjweffckkmivvsxtaknq" in output, output)
self.failIf("file renewal secret:" in output, output)
output = self._dump_cap("--nodeid", "tqc35esocrvejvg4mablt6aowg6tl43j",
"--client-secret", "5s33nk3qpvnj2fw3z4mnm2y6fa",
u.to_string())
self.failUnless("write_enabler: mgcavriox2wlb5eer26unwy5cw56elh3sjweffckkmivvsxtaknq" in output, output)
self.failUnless("file renewal secret: arpszxzc2t6kb4okkg7sp765xgkni5z7caavj7lta73vmtymjlxq" in output, output)
self.failUnless("lease renewal secret: 7pjtaumrb7znzkkbvekkmuwpqfjyfyamznfz4bwwvmh4nw33lorq" in output, output)
u = u.get_readonly()
output = self._dump_cap(u.to_string())
@ -329,21 +262,8 @@ class CLI(CLITestMixin, unittest.TestCase):
output)
self.failUnless("fingerprint: 737p57x6737p57x6737p57x6737p57x6737p57x6737p57x6737a" in output, output)
output = self._dump_cap("--client-secret", "5s33nk3qpvnj2fw3z4mnm2y6fa",
u.to_string())
self.failUnless("file renewal secret: arpszxzc2t6kb4okkg7sp765xgkni5z7caavj7lta73vmtymjlxq" in output, output)
output = self._dump_cap("--nodeid", "tqc35esocrvejvg4mablt6aowg6tl43j",
u.to_string())
output = self._dump_cap("--nodeid", "tqc35esocrvejvg4mablt6aowg6tl43j", u.to_string())
self.failUnless("write_enabler: mgcavriox2wlb5eer26unwy5cw56elh3sjweffckkmivvsxtaknq" in output, output)
self.failIf("file renewal secret:" in output, output)
output = self._dump_cap("--nodeid", "tqc35esocrvejvg4mablt6aowg6tl43j",
"--client-secret", "5s33nk3qpvnj2fw3z4mnm2y6fa",
u.to_string())
self.failUnless("write_enabler: mgcavriox2wlb5eer26unwy5cw56elh3sjweffckkmivvsxtaknq" in output, output)
self.failUnless("file renewal secret: arpszxzc2t6kb4okkg7sp765xgkni5z7caavj7lta73vmtymjlxq" in output, output)
self.failUnless("lease renewal secret: 7pjtaumrb7znzkkbvekkmuwpqfjyfyamznfz4bwwvmh4nw33lorq" in output, output)
u = u.get_readonly()
output = self._dump_cap(u.to_string())
@ -687,6 +607,26 @@ class Help(unittest.TestCase):
subhelp = str(oClass())
self.failUnlessIn(" [global-opts] debug flogtool %s " % (option,), subhelp)
def test_create_admin(self):
help = str(admin.AdminCommand())
self.failUnlessIn(" [global-opts] admin SUBCOMMAND", help)
def test_create_admin_generate_keypair(self):
help = str(admin.GenerateKeypairOptions())
self.failUnlessIn(" [global-opts] admin generate-keypair", help)
def test_create_admin_derive_pubkey(self):
help = str(admin.DerivePubkeyOptions())
self.failUnlessIn(" [global-opts] admin derive-pubkey", help)
def test_create_admin_create_container(self):
help = str(admin.CreateContainerOptions())
self.failUnlessIn(" [global-opts] admin create-container [NODEDIR]", help)
def test_create_admin_ls_container(self):
help = str(admin.ListContainerOptions())
self.failUnlessIn(" [global-opts] admin ls-container [NODEDIR]", help)
class CreateAlias(GridTestMixin, CLITestMixin, unittest.TestCase):
@ -2992,20 +2932,19 @@ class Check(GridTestMixin, CLITestMixin, unittest.TestCase):
d.addCallback(lambda ign: self.do_cli("check", "--raw", self.lit_uri))
d.addCallback(_check_lit_raw)
def _clobber_shares(ignored):
d.addCallback(lambda ign: self.find_uri_shares(self.uri))
def _clobber_shares(shares):
# delete one, corrupt a second
shares = self.find_uri_shares(self.uri)
self.failUnlessReallyEqual(len(shares), 10)
os.unlink(shares[0][2])
cso = debug.CorruptShareOptions()
cso.stdout = StringIO()
cso.parseOptions([shares[1][2]])
fileutil.remove(shares[0][2])
stdout = StringIO()
sharefile = shares[1][2]
storage_index = uri.from_string(self.uri).get_storage_index()
self._corrupt_share_line = " server %s, SI %s, shnum %d" % \
(base32.b2a(shares[1][1]),
base32.b2a(storage_index),
shares[1][0])
debug.corrupt_share(cso)
debug.do_corrupt_share(stdout, sharefile)
d.addCallback(_clobber_shares)
d.addCallback(lambda ign: self.do_cli("check", "--verify", self.uri))
@ -3131,22 +3070,23 @@ class Check(GridTestMixin, CLITestMixin, unittest.TestCase):
self.failUnlessIn(" 317-1000 : 1 (1000 B, 1000 B)", lines)
d.addCallback(_check_stats)
def _clobber_shares(ignored):
shares = self.find_uri_shares(self.uris[u"g\u00F6\u00F6d"])
d.addCallback(lambda ign: self.find_uri_shares(self.uris[u"g\u00F6\u00F6d"]))
def _clobber_shares(shares):
self.failUnlessReallyEqual(len(shares), 10)
os.unlink(shares[0][2])
fileutil.remove(shares[0][2])
d.addCallback(_clobber_shares)
shares = self.find_uri_shares(self.uris["mutable"])
cso = debug.CorruptShareOptions()
cso.stdout = StringIO()
cso.parseOptions([shares[1][2]])
d.addCallback(lambda ign: self.find_uri_shares(self.uris["mutable"]))
def _clobber_mutable_shares(shares):
stdout = StringIO()
sharefile = shares[1][2]
storage_index = uri.from_string(self.uris["mutable"]).get_storage_index()
self._corrupt_share_line = " corrupt: server %s, SI %s, shnum %d" % \
(base32.b2a(shares[1][1]),
base32.b2a(storage_index),
shares[1][0])
debug.corrupt_share(cso)
d.addCallback(_clobber_shares)
debug.do_corrupt_share(stdout, sharefile)
d.addCallback(_clobber_mutable_shares)
# root
# root/g\u00F6\u00F6d [9 shares]
@ -3305,11 +3245,11 @@ class Check(GridTestMixin, CLITestMixin, unittest.TestCase):
c0 = self.g.clients[0]
d = c0.create_dirnode()
def _stash_uri(n):
self.uriList.append(n.get_uri())
self.uriList.append(n.get_uri())
d.addCallback(_stash_uri)
d = c0.create_dirnode()
d.addCallback(_stash_uri)
d.addCallback(lambda ign: self.do_cli("check", self.uriList[0], self.uriList[1]))
def _check((rc, out, err)):
self.failUnlessReallyEqual(rc, 0)
@ -3318,7 +3258,7 @@ class Check(GridTestMixin, CLITestMixin, unittest.TestCase):
self.failUnlessIn("Healthy", out[:len(out)/2])
self.failUnlessIn("Healthy", out[len(out)/2:])
d.addCallback(_check)
d.addCallback(lambda ign: self.do_cli("check", self.uriList[0], "nonexistent:"))
def _check2((rc, out, err)):
self.failUnlessReallyEqual(rc, 1)
@ -3326,7 +3266,6 @@ class Check(GridTestMixin, CLITestMixin, unittest.TestCase):
self.failUnlessIn("error:", err)
self.failUnlessIn("nonexistent", err)
d.addCallback(_check2)
return d


@ -3,9 +3,11 @@ from twisted.trial import unittest
from twisted.application import service
import allmydata
from allmydata.node import OldConfigError, OldConfigOptionError, MissingConfigEntry
from allmydata.node import OldConfigError, OldConfigOptionError, InvalidValueError, MissingConfigEntry
from allmydata import client
from allmydata.storage_client import StorageFarmBroker
from allmydata.storage.backends.disk.disk_backend import DiskBackend
from allmydata.storage.backends.cloud.cloud_backend import CloudBackend
from allmydata.util import base32, fileutil
from allmydata.interfaces import IFilesystemNode, IFileNode, \
IImmutableFileNode, IMutableFileNode, IDirectoryNode
@ -14,6 +16,7 @@ import allmydata.test.common_util as testutil
import mock
BASECONFIG = ("[client]\n"
"introducer.furl = \n"
)
@ -26,9 +29,11 @@ class Basic(testutil.ReallyEqualMixin, unittest.TestCase):
def test_loadable(self):
basedir = "test_client.Basic.test_loadable"
os.mkdir(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"), \
BASECONFIG)
client.Client(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG)
c = client.Client(basedir)
server = c.getServiceNamed("storage")
self.failUnless(isinstance(server.backend, DiskBackend), server.backend)
@mock.patch('twisted.python.log.msg')
def test_error_on_old_config_files(self, mock_log_msg):
@ -68,8 +73,8 @@ class Basic(testutil.ReallyEqualMixin, unittest.TestCase):
def test_secrets(self):
basedir = "test_client.Basic.test_secrets"
os.mkdir(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"), \
BASECONFIG)
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG)
c = client.Client(basedir)
secret_fname = os.path.join(basedir, "private", "secret")
self.failUnless(os.path.exists(secret_fname), secret_fname)
@ -97,66 +102,405 @@ class Basic(testutil.ReallyEqualMixin, unittest.TestCase):
def test_reserved_1(self):
basedir = "client.Basic.test_reserved_1"
os.mkdir(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"), \
BASECONFIG + \
"[storage]\n" + \
"enabled = true\n" + \
"reserved_space = 1000\n")
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"reserved_space = 1000\n")
c = client.Client(basedir)
self.failUnlessEqual(c.getServiceNamed("storage").reserved_space, 1000)
server = c.getServiceNamed("storage")
self.failUnlessReallyEqual(server.backend._reserved_space, 1000)
def test_reserved_2(self):
basedir = "client.Basic.test_reserved_2"
os.mkdir(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"), \
BASECONFIG + \
"[storage]\n" + \
"enabled = true\n" + \
"reserved_space = 10K\n")
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"reserved_space = 10K\n")
c = client.Client(basedir)
self.failUnlessEqual(c.getServiceNamed("storage").reserved_space, 10*1000)
server = c.getServiceNamed("storage")
self.failUnlessReallyEqual(server.backend._reserved_space, 10*1000)
def test_reserved_3(self):
basedir = "client.Basic.test_reserved_3"
os.mkdir(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"), \
BASECONFIG + \
"[storage]\n" + \
"enabled = true\n" + \
"reserved_space = 5mB\n")
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"reserved_space = 5mB\n")
c = client.Client(basedir)
self.failUnlessEqual(c.getServiceNamed("storage").reserved_space,
5*1000*1000)
server = c.getServiceNamed("storage")
self.failUnlessReallyEqual(server.backend._reserved_space, 5*1000*1000)
def test_reserved_4(self):
basedir = "client.Basic.test_reserved_4"
os.mkdir(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"), \
BASECONFIG + \
"[storage]\n" + \
"enabled = true\n" + \
"reserved_space = 78Gb\n")
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"reserved_space = 78Gb\n")
c = client.Client(basedir)
self.failUnlessEqual(c.getServiceNamed("storage").reserved_space,
78*1000*1000*1000)
server = c.getServiceNamed("storage")
self.failUnlessReallyEqual(server.backend._reserved_space, 78*1000*1000*1000)
def test_reserved_default(self):
# This is testing the default when 'reserved_space' is not present, not
# the default for a newly created node.
basedir = "client.Basic.test_reserved_default"
os.mkdir(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n")
c = client.Client(basedir)
server = c.getServiceNamed("storage")
self.failUnlessReallyEqual(server.backend._reserved_space, 0)
def test_reserved_bad(self):
basedir = "client.Basic.test_reserved_bad"
os.mkdir(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"reserved_space = bogus\n")
self.failUnlessRaises(InvalidValueError, client.Client, basedir)
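# Illustrative summary (not part of the diff): taken together, the
# reserved_space tests above pin down the expected parsing of the
# [storage]reserved_space option (decimal SI units, case-insensitive suffix).
# Expressed as data, with the name below being hypothetical:
_EXPECTED_RESERVED_SPACE_PARSING = {
    "1000": 1000,
    "10K":  10*1000,
    "5mB":  5*1000*1000,
    "78Gb": 78*1000*1000*1000,
    # "bogus" is rejected with InvalidValueError at client startup
}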
def _write_secret(self, basedir, filename, secret="dummy"):
fileutil.make_dirs(os.path.join(basedir, "private"))
fileutil.write(os.path.join(basedir, "private", filename), secret)
@mock.patch('allmydata.storage.backends.cloud.s3.s3_container.S3Container')
def test_s3_config_good_defaults(self, mock_S3Container):
basedir = "client.Basic.test_s3_config_good_defaults"
os.mkdir(basedir)
self._write_secret(basedir, "s3secret")
config = (BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"backend = cloud.s3\n" +
"s3.access_key_id = keyid\n" +
"s3.bucket = test\n")
fileutil.write(os.path.join(basedir, "tahoe.cfg"), config)
c = client.Client(basedir)
mock_S3Container.assert_called_with("keyid", "dummy", "http://s3.amazonaws.com", "test", None, None)
server = c.getServiceNamed("storage")
self.failUnless(isinstance(server.backend, CloudBackend), server.backend)
mock_S3Container.reset_mock()
self._write_secret(basedir, "s3producttoken", secret="{ProductToken}")
self.failUnlessRaises(InvalidValueError, client.Client, basedir)
mock_S3Container.reset_mock()
self._write_secret(basedir, "s3usertoken", secret="{UserToken}")
fileutil.write(os.path.join(basedir, "tahoe.cfg"), config + "s3.url = http://s3.example.com\n")
c = client.Client(basedir)
mock_S3Container.assert_called_with("keyid", "dummy", "http://s3.example.com", "test",
"{UserToken}", "{ProductToken}")
def test_s3_readonly_bad(self):
basedir = "client.Basic.test_s3_readonly_bad"
os.mkdir(basedir)
self._write_secret(basedir, "s3secret")
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"readonly = true\n" +
"backend = cloud.s3\n" +
"s3.access_key_id = keyid\n" +
"s3.bucket = test\n")
self.failUnlessRaises(InvalidValueError, client.Client, basedir)
def test_s3_config_no_access_key_id(self):
basedir = "client.Basic.test_s3_config_no_access_key_id"
os.mkdir(basedir)
self._write_secret(basedir, "s3secret")
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"backend = cloud.s3\n" +
"s3.bucket = test\n")
self.failUnlessRaises(MissingConfigEntry, client.Client, basedir)
def test_s3_config_no_bucket(self):
basedir = "client.Basic.test_s3_config_no_bucket"
os.mkdir(basedir)
self._write_secret(basedir, "s3secret")
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"backend = cloud.s3\n" +
"s3.access_key_id = keyid\n")
self.failUnlessRaises(MissingConfigEntry, client.Client, basedir)
def test_s3_config_no_s3secret(self):
basedir = "client.Basic.test_s3_config_no_s3secret"
os.mkdir(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"backend = cloud.s3\n" +
"s3.access_key_id = keyid\n" +
"s3.bucket = test\n")
self.failUnlessRaises(MissingConfigEntry, client.Client, basedir)
@mock.patch('allmydata.storage.backends.cloud.openstack.openstack_container.AuthenticatorV2')
@mock.patch('allmydata.storage.backends.cloud.openstack.openstack_container.AuthenticationClient')
@mock.patch('allmydata.storage.backends.cloud.openstack.openstack_container.OpenStackContainer')
def test_openstack_config_good_defaults(self, mock_OpenStackContainer, mock_AuthenticationClient,
mock_Authenticator):
basedir = "client.Basic.test_openstack_config_good_defaults"
os.mkdir(basedir)
self._write_secret(basedir, "openstack_api_key")
config = (BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"backend = cloud.openstack\n" +
"openstack.provider = rackspace.com\n" +
"openstack.username = alex\n" +
"openstack.container = test\n")
fileutil.write(os.path.join(basedir, "tahoe.cfg"), config)
c = client.Client(basedir)
mock_Authenticator.assert_called_with("https://identity.api.rackspacecloud.com/v2.0/tokens",
{'RAX-KSKEY:apiKeyCredentials': {'username': 'alex', 'apiKey': 'dummy'}})
authclient_call_args = mock_AuthenticationClient.call_args_list
self.failUnlessEqual(len(authclient_call_args), 1)
self.failUnlessEqual(authclient_call_args[0][0][1:], (11*60*60,))
container_call_args = mock_OpenStackContainer.call_args_list
self.failUnlessEqual(len(container_call_args), 1)
self.failUnlessEqual(container_call_args[0][0][1:], ("test",))
server = c.getServiceNamed("storage")
self.failUnless(isinstance(server.backend, CloudBackend), server.backend)
def test_openstack_readonly_bad(self):
basedir = "client.Basic.test_openstack_readonly_bad"
os.mkdir(basedir)
self._write_secret(basedir, "openstack_api_key")
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"readonly = true\n" +
"backend = cloud.openstack\n" +
"openstack.provider = rackspace.com\n" +
"openstack.username = alex\n" +
"openstack.container = test\n")
self.failUnlessRaises(InvalidValueError, client.Client, basedir)
def test_openstack_config_no_username(self):
basedir = "client.Basic.test_openstack_config_no_username"
os.mkdir(basedir)
self._write_secret(basedir, "openstack_api_key")
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"backend = cloud.openstack\n" +
"openstack.provider = rackspace.com\n" +
"openstack.container = test\n")
self.failUnlessRaises(MissingConfigEntry, client.Client, basedir)
def test_openstack_config_no_container(self):
basedir = "client.Basic.test_openstack_config_no_container"
os.mkdir(basedir)
self._write_secret(basedir, "openstack_api_key")
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"backend = cloud.openstack\n" +
"openstack.provider = rackspace.com\n" +
"openstack.username = alex\n")
self.failUnlessRaises(MissingConfigEntry, client.Client, basedir)
def test_openstack_config_no_api_key(self):
basedir = "client.Basic.test_openstack_config_no_api_key"
os.mkdir(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"backend = cloud.openstack\n" +
"openstack.provider = rackspace.com\n" +
"openstack.username = alex\n" +
"openstack.container = test\n")
self.failUnlessRaises(MissingConfigEntry, client.Client, basedir)
def test_googlestorage_config_required(self):
"""
account_email, bucket and project_id are all required by
googlestorage configuration.
"""
configs = ["googlestorage.account_email = u@example.com",
"googlestorage.bucket = bucket",
"googlestorage.project_id = 456"]
for i in range(len(configs)):
basedir = self.mktemp()
os.mkdir(basedir)
bad_config = configs[:]
del bad_config[i]
self._write_secret(basedir, "googlestorage_private_key")
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"backend = cloud.googlestorage\n" +
"\n".join(bad_config) + "\n")
self.failUnlessRaises(MissingConfigEntry, client.Client, basedir)
def test_googlestorage_config_required_private_key(self):
"""
googlestorage_private_key secret is required by googlestorage
configuration.
"""
basedir = self.mktemp()
os.mkdir(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"backend = cloud.googlestorage\n" +
"googlestorage.account_email = u@example.com\n" +
"googlestorage.bucket = bucket\n" +
"googlestorage.project_id = 456\n")
self.failUnlessRaises(MissingConfigEntry, client.Client, basedir)
@mock.patch('allmydata.storage.backends.cloud.googlestorage.googlestorage_container.AuthenticationClient')
@mock.patch('allmydata.storage.backends.cloud.googlestorage.googlestorage_container.GoogleStorageContainer')
def test_googlestorage_config(self, mock_OpenStackContainer, mock_AuthenticationClient):
"""
Given good configuration, we correctly configure a good GoogleStorageContainer.
"""
basedir = self.mktemp()
os.mkdir(basedir)
self._write_secret(basedir, "googlestorage_private_key", "sekrit")
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"backend = cloud.googlestorage\n" +
"googlestorage.account_email = u@example.com\n" +
"googlestorage.bucket = bucket\n" +
"googlestorage.project_id = 456\n")
c = client.Client(basedir)
server = c.getServiceNamed("storage")
self.failUnless(isinstance(server.backend, CloudBackend), server.backend)
# Protect against typos with isinstance(), because mock is dangerous.
self.assertFalse(isinstance(mock_AuthenticationClient.assert_called_once_with,
mock.Mock))
mock_AuthenticationClient.assert_called_once_with("u@example.com", "sekrit")
self.assertFalse(isinstance(mock_OpenStackContainer.assert_called_once_with,
mock.Mock))
mock_OpenStackContainer.assert_called_once_with(mock_AuthenticationClient.return_value,
"456", "bucket")
def test_msazure_config_required(self):
"""
account_name and container are all required by MS Azure configuration.
"""
configs = ["mszure.account_name = theaccount",
"msazure.container = bucket"]
for i in range(len(configs)):
basedir = self.mktemp()
os.mkdir(basedir)
bad_config = configs[:]
del bad_config[i]
self._write_secret(basedir, "msazure_account_key")
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"backend = cloud.msazure\n" +
"\n".join(bad_config) + "\n")
self.failUnlessRaises(MissingConfigEntry, client.Client, basedir)
def test_msazure_config_required_private_key(self):
"""
msazure_account_key secret is required by MS Azure configuration.
"""
basedir = self.mktemp()
os.mkdir(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"backend = cloud.msazure\n" +
"msazure.account_name = theaccount\n" +
"msazure.container = bucket\n")
self.failUnlessRaises(MissingConfigEntry, client.Client, basedir)
@mock.patch('allmydata.storage.backends.cloud.msazure.msazure_container.MSAzureStorageContainer')
def test_msazure_config(self, mock_MSAzureStorageContainer):
"""
Given good configuration, we correctly configure a good MSAzureStorageContainer.
"""
basedir = self.mktemp()
os.mkdir(basedir)
self._write_secret(basedir, "msazure_account_key", "abc")
fileutil.write(os.path.join(basedir, "tahoe.cfg"),
BASECONFIG +
"[storage]\n" +
"enabled = true\n" +
"backend = cloud.msazure\n" +
"msazure.account_name = theaccount\n" +
"msazure.container = bucket\n")
c = client.Client(basedir)
server = c.getServiceNamed("storage")
self.failUnless(isinstance(server.backend, CloudBackend), server.backend)
# Protect against typos with isinstance(), because mock is dangerous.
self.assertFalse(isinstance(
mock_MSAzureStorageContainer.assert_called_once_with, mock.Mock))
mock_MSAzureStorageContainer.assert_called_once_with(
"theaccount", "abc", "bucket")
def test_expire_mutable_false_unsupported(self):
basedir = "client.Basic.test_expire_mutable_false_unsupported"
os.mkdir(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"), \
BASECONFIG + \
"[storage]\n" + \
"enabled = true\n" + \
"reserved_space = bogus\n")
self.failUnlessRaises(ValueError, client.Client, basedir)
BASECONFIG + \
"[storage]\n" + \
"enabled = true\n" + \
"expire.mutable = False\n")
self.failUnlessRaises(OldConfigOptionError, client.Client, basedir)
def test_expire_immutable_false_unsupported(self):
basedir = "client.Basic.test_expire_immutable_false_unsupported"
os.mkdir(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"), \
BASECONFIG + \
"[storage]\n" + \
"enabled = true\n" + \
"expire.immutable = False\n")
self.failUnlessRaises(OldConfigOptionError, client.Client, basedir)
def test_debug_discard_true_unsupported(self):
basedir = "client.Basic.test_debug_discard_true_unsupported"
os.mkdir(basedir)
fileutil.write(os.path.join(basedir, "tahoe.cfg"), \
BASECONFIG + \
"[storage]\n" + \
"enabled = true\n" + \
"debug_discard = true\n")
self.failUnlessRaises(OldConfigOptionError, client.Client, basedir)
def _permute(self, sb, key):
return [ s.get_longname() for s in sb.get_servers_for_psi(key) ]
return [ base32.a2b(s.get_longname()) for s in sb.get_servers_for_psi(key) ]
def test_permute(self):
sb = StorageFarmBroker(None, True)
for k in ["%d" % i for i in range(5)]:
ann = {"anonymous-storage-FURL": "pb://abcde@nowhere/fake",
ann = {"anonymous-storage-FURL": "pb://%s@nowhere/fake" % base32.b2a(k),
"permutation-seed-base32": base32.b2a(k) }
sb.test_add_rref(k, "rref", ann)
@ -173,8 +517,9 @@ class Basic(testutil.ReallyEqualMixin, unittest.TestCase):
"[storage]\n" + \
"enabled = true\n")
c = client.Client(basedir)
ss = c.getServiceNamed("storage")
verdict = ss.remote_get_version()
server = c.getServiceNamed("storage")
aa = server.get_accountant().get_anonymous_account()
verdict = aa.remote_get_version()
self.failUnlessReallyEqual(verdict["application-version"],
str(allmydata.__full_version__))
self.failIfEqual(str(allmydata.__version__), "unknown")
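# Illustrative note (not part of the diff): with the accounting changes in
# this branch, tests no longer call remote_* methods on the StorageServer
# itself; they go through the anonymous account, as in the test above and
# elsewhere in this diff:
#
#     server = c.getServiceNamed("storage")
#     aa = server.get_accountant().get_anonymous_account()
#     version = aa.remote_get_version()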


@ -1,55 +1,43 @@
import time
import os.path
from twisted.trial import unittest
from twisted.application import service
from twisted.internet import defer
from foolscap.api import eventually, fireEventually
from foolscap.api import fireEventually
from allmydata.util.deferredutil import gatherResults
from allmydata.util import fileutil, hashutil, pollmixin
from allmydata.util import hashutil
from allmydata.storage.server import StorageServer, si_b2a
from allmydata.storage.crawler import ShareCrawler, TimeSliceExceeded
from allmydata.storage.backends.disk.disk_backend import DiskBackend
from allmydata.storage.backends.cloud.cloud_backend import CloudBackend
from allmydata.storage.backends.cloud.mock_cloud import MockContainer
from allmydata.test.test_storage import FakeCanary
from allmydata.test.common import CrawlerTestMixin, FakeCanary
from allmydata.test.common_util import StallMixin
class BucketEnumeratingCrawler(ShareCrawler):
cpu_slice = 500 # make sure it can complete in a single slice
slow_start = 0
def __init__(self, *args, **kwargs):
ShareCrawler.__init__(self, *args, **kwargs)
self.all_buckets = []
self.finished_d = defer.Deferred()
def process_bucket(self, cycle, prefix, prefixdir, storage_index_b32):
self.all_buckets.append(storage_index_b32)
def finished_cycle(self, cycle):
eventually(self.finished_d.callback, None)
class PacedCrawler(ShareCrawler):
class EnumeratingCrawler(ShareCrawler):
cpu_slice = 500 # make sure it can complete in a single slice
slow_start = 0
def __init__(self, *args, **kwargs):
ShareCrawler.__init__(self, *args, **kwargs)
self.countdown = 6
self.all_buckets = []
self.finished_d = defer.Deferred()
self.yield_cb = None
def process_bucket(self, cycle, prefix, prefixdir, storage_index_b32):
self.all_buckets.append(storage_index_b32)
self.countdown -= 1
if self.countdown == 0:
# force a timeout. We restore it in yielding()
self.cpu_slice = -1.0
def yielding(self, sleep_time):
self.cpu_slice = 500
if self.yield_cb:
self.yield_cb()
def finished_cycle(self, cycle):
eventually(self.finished_d.callback, None)
self.sharesets = []
def process_prefix(self, cycle, prefix, start_slice):
d = self.backend.get_sharesets_for_prefix(prefix)
def _got_sharesets(sharesets):
self.sharesets += [s.get_storage_index_string() for s in sharesets]
d.addCallback(_got_sharesets)
return d
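# Illustrative sketch (not part of the diff): under the new crawler interface,
# subclasses implement process_prefix(cycle, prefix, start_slice) and use the
# backend's get_sharesets_for_prefix(), which fires with shareset objects. A
# minimal hypothetical subclass that only counts sharesets, mirroring
# EnumeratingCrawler above:
class _CountingCrawler(ShareCrawler):
    cpu_slice = 500   # make sure it can complete in a single slice
    slow_start = 0
    def __init__(self, *args, **kwargs):
        ShareCrawler.__init__(self, *args, **kwargs)
        self.shareset_count = 0
    def process_prefix(self, cycle, prefix, start_slice):
        d = self.backend.get_sharesets_for_prefix(prefix)
        def _count(sharesets):
            self.shareset_count += len(sharesets)
        d.addCallback(_count)
        return d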
class ConsumingCrawler(ShareCrawler):
cpu_slice = 0.5
allowed_cpu_percentage = 0.5
allowed_cpu_proportion = 0.5
minimum_cycle_time = 0
slow_start = 0
@ -58,31 +46,31 @@ class ConsumingCrawler(ShareCrawler):
self.accumulated = 0.0
self.cycles = 0
self.last_yield = 0.0
def process_bucket(self, cycle, prefix, prefixdir, storage_index_b32):
start = time.time()
time.sleep(0.05)
elapsed = time.time() - start
self.accumulated += elapsed
self.last_yield += elapsed
def process_prefix(self, cycle, prefix, start_slice):
# XXX I don't know whether this behaviour makes sense for the test
# that uses it any more.
d = self.backend.get_sharesets_for_prefix(prefix)
def _got_sharesets(sharesets):
for shareset in sharesets:
start = time.time()
time.sleep(0.05)
elapsed = time.time() - start
self.accumulated += elapsed
self.last_yield += elapsed
if self.clock.seconds() >= start_slice + self.cpu_slice:
raise TimeSliceExceeded()
d.addCallback(_got_sharesets)
return d
def finished_cycle(self, cycle):
self.cycles += 1
def yielding(self, sleep_time):
self.last_yield = 0.0
class OneShotCrawler(ShareCrawler):
cpu_slice = 500 # make sure it can complete in a single slice
slow_start = 0
def __init__(self, *args, **kwargs):
ShareCrawler.__init__(self, *args, **kwargs)
self.counter = 0
self.finished_d = defer.Deferred()
def process_bucket(self, cycle, prefix, prefixdir, storage_index_b32):
self.counter += 1
def finished_cycle(self, cycle):
self.finished_d.callback(None)
self.disownServiceParent()
class Basic(unittest.TestCase, StallMixin, pollmixin.PollMixin):
class CrawlerTest(StallMixin, CrawlerTestMixin):
def setUp(self):
self.s = service.MultiService()
self.s.startService()
@ -97,258 +85,101 @@ class Basic(unittest.TestCase, StallMixin, pollmixin.PollMixin):
def cs(self, i, serverid):
return hashutil.bucket_cancel_secret_hash(str(i), serverid)
def write(self, i, ss, serverid, tail=0):
def create(self, name):
self.basedir = os.path.join("crawler", self.__class__.__name__, name)
self.serverid = "\x00" * 20
backend = self.make_backend(self.basedir)
server = StorageServer(self.serverid, backend, self.basedir)
server.setServiceParent(self.s)
return server
def write(self, i, aa, serverid, tail=0):
si = self.si(i)
si = si[:-1] + chr(tail)
had,made = ss.remote_allocate_buckets(si,
self.rs(i, serverid),
self.cs(i, serverid),
set([0]), 99, FakeCanary())
made[0].remote_write(0, "data")
made[0].remote_close()
return si_b2a(si)
def test_immediate(self):
self.basedir = "crawler/Basic/immediate"
fileutil.make_dirs(self.basedir)
serverid = "\x00" * 20
ss = StorageServer(self.basedir, serverid)
ss.setServiceParent(self.s)
sis = [self.write(i, ss, serverid) for i in range(10)]
statefile = os.path.join(self.basedir, "statefile")
c = BucketEnumeratingCrawler(ss, statefile, allowed_cpu_percentage=.1)
c.load_state()
c.start_current_prefix(time.time())
self.failUnlessEqual(sorted(sis), sorted(c.all_buckets))
# make sure the statefile has been returned to the starting point
c.finished_d = defer.Deferred()
c.all_buckets = []
c.start_current_prefix(time.time())
self.failUnlessEqual(sorted(sis), sorted(c.all_buckets))
# check that a new crawler picks up on the state file properly
c2 = BucketEnumeratingCrawler(ss, statefile)
c2.load_state()
c2.start_current_prefix(time.time())
self.failUnlessEqual(sorted(sis), sorted(c2.all_buckets))
def test_service(self):
self.basedir = "crawler/Basic/service"
fileutil.make_dirs(self.basedir)
serverid = "\x00" * 20
ss = StorageServer(self.basedir, serverid)
ss.setServiceParent(self.s)
sis = [self.write(i, ss, serverid) for i in range(10)]
statefile = os.path.join(self.basedir, "statefile")
c = BucketEnumeratingCrawler(ss, statefile)
c.setServiceParent(self.s)
# it should be legal to call get_state() and get_progress() right
# away, even before the first tick is performed. No work should have
# been done yet.
s = c.get_state()
p = c.get_progress()
self.failUnlessEqual(s["last-complete-prefix"], None)
self.failUnlessEqual(s["current-cycle"], None)
self.failUnlessEqual(p["cycle-in-progress"], False)
d = c.finished_d
def _check(ignored):
self.failUnlessEqual(sorted(sis), sorted(c.all_buckets))
d.addCallback(_check)
d = defer.succeed(None)
d.addCallback(lambda ign: aa.remote_allocate_buckets(si,
self.rs(i, serverid),
self.cs(i, serverid),
set([0]), 99, FakeCanary()))
def _allocated( (had, made) ):
d2 = defer.succeed(None)
d2.addCallback(lambda ign: made[0].remote_write(0, "data"))
d2.addCallback(lambda ign: made[0].remote_close())
d2.addCallback(lambda ign: si_b2a(si))
return d2
d.addCallback(_allocated)
return d
def test_paced(self):
self.basedir = "crawler/Basic/paced"
fileutil.make_dirs(self.basedir)
serverid = "\x00" * 20
ss = StorageServer(self.basedir, serverid)
ss.setServiceParent(self.s)
def test_service(self):
server = self.create("test_service")
aa = server.get_accountant().get_anonymous_account()
# put four buckets in each prefixdir
sis = []
for i in range(10):
for tail in range(4):
sis.append(self.write(i, ss, serverid, tail))
d = gatherResults([self.write(i, aa, self.serverid) for i in range(10)])
def _writes_done(sis):
statefile = os.path.join(self.basedir, "statefile")
c = EnumeratingCrawler(server.backend, statefile)
c.setServiceParent(self.s)
statefile = os.path.join(self.basedir, "statefile")
# it should be legal to call get_state() and get_progress() right
# away, even before the first tick is performed. No work should have
# been done yet.
s = c.get_state()
p = c.get_progress()
self.failUnlessEqual(s["last-complete-prefix"], None)
self.failUnlessEqual(s["current-cycle"], None)
self.failUnlessEqual(p["cycle-in-progress"], False)
c = PacedCrawler(ss, statefile)
c.load_state()
try:
c.start_current_prefix(time.time())
except TimeSliceExceeded:
pass
# that should stop in the middle of one of the buckets. Since we
# aren't using its normal scheduler, we have to save its state
# manually.
c.save_state()
c.cpu_slice = PacedCrawler.cpu_slice
self.failUnlessEqual(len(c.all_buckets), 6)
c.start_current_prefix(time.time()) # finish it
self.failUnlessEqual(len(sis), len(c.all_buckets))
self.failUnlessEqual(sorted(sis), sorted(c.all_buckets))
# make sure the statefile has been returned to the starting point
c.finished_d = defer.Deferred()
c.all_buckets = []
c.start_current_prefix(time.time())
self.failUnlessEqual(sorted(sis), sorted(c.all_buckets))
del c
# start a new crawler, it should start from the beginning
c = PacedCrawler(ss, statefile)
c.load_state()
try:
c.start_current_prefix(time.time())
except TimeSliceExceeded:
pass
# that should stop in the middle of one of the buckets. Since we
# aren't using its normal scheduler, we have to save its state
# manually.
c.save_state()
c.cpu_slice = PacedCrawler.cpu_slice
# a third crawler should pick up from where it left off
c2 = PacedCrawler(ss, statefile)
c2.all_buckets = c.all_buckets[:]
c2.load_state()
c2.countdown = -1
c2.start_current_prefix(time.time())
self.failUnlessEqual(len(sis), len(c2.all_buckets))
self.failUnlessEqual(sorted(sis), sorted(c2.all_buckets))
del c, c2
# now stop it at the end of a bucket (countdown=4), to exercise a
# different place that checks the time
c = PacedCrawler(ss, statefile)
c.load_state()
c.countdown = 4
try:
c.start_current_prefix(time.time())
except TimeSliceExceeded:
pass
# that should stop at the end of one of the buckets. Again we must
# save state manually.
c.save_state()
c.cpu_slice = PacedCrawler.cpu_slice
self.failUnlessEqual(len(c.all_buckets), 4)
c.start_current_prefix(time.time()) # finish it
self.failUnlessEqual(len(sis), len(c.all_buckets))
self.failUnlessEqual(sorted(sis), sorted(c.all_buckets))
del c
# stop it again at the end of the bucket, check that a new checker
# picks up correctly
c = PacedCrawler(ss, statefile)
c.load_state()
c.countdown = 4
try:
c.start_current_prefix(time.time())
except TimeSliceExceeded:
pass
# that should stop at the end of one of the buckets.
c.save_state()
c2 = PacedCrawler(ss, statefile)
c2.all_buckets = c.all_buckets[:]
c2.load_state()
c2.countdown = -1
c2.start_current_prefix(time.time())
self.failUnlessEqual(len(sis), len(c2.all_buckets))
self.failUnlessEqual(sorted(sis), sorted(c2.all_buckets))
del c, c2
def test_paced_service(self):
self.basedir = "crawler/Basic/paced_service"
fileutil.make_dirs(self.basedir)
serverid = "\x00" * 20
ss = StorageServer(self.basedir, serverid)
ss.setServiceParent(self.s)
sis = [self.write(i, ss, serverid) for i in range(10)]
statefile = os.path.join(self.basedir, "statefile")
c = PacedCrawler(ss, statefile)
did_check_progress = [False]
def check_progress():
c.yield_cb = None
try:
d2 = self._after_prefix(None, 'sg', c)
def _after_sg_prefix(state):
p = c.get_progress()
self.failUnlessEqual(p["cycle-in-progress"], True)
pct = p["cycle-complete-percentage"]
# after 6 buckets, we happen to be at 76.17% complete. As
# long as we create shares in deterministic order, this will
# continue to be true.
# After the 'sg' prefix, we happen to be 76.17% complete and to
# have processed 6 sharesets. As long as we create shares in
# deterministic order, this will continue to be true.
self.failUnlessEqual(int(pct), 76)
left = p["remaining-sleep-time"]
self.failUnless(isinstance(left, float), left)
self.failUnless(left > 0.0, left)
except Exception, e:
did_check_progress[0] = e
else:
did_check_progress[0] = True
c.yield_cb = check_progress
self.failUnlessEqual(len(c.sharesets), 6)
c.setServiceParent(self.s)
# that should get through 6 buckets, pause for a little while (and
# run check_progress()), then resume
return c.set_hook('after_cycle')
d2.addCallback(_after_sg_prefix)
d = c.finished_d
def _check(ignored):
if did_check_progress[0] is not True:
raise did_check_progress[0]
self.failUnless(did_check_progress[0])
self.failUnlessEqual(sorted(sis), sorted(c.all_buckets))
# at this point, the crawler should be sitting in the inter-cycle
# timer, which should be pegged at the minimum cycle time
self.failUnless(c.timer)
self.failUnless(c.sleeping_between_cycles)
self.failUnlessEqual(c.current_sleep_time, c.minimum_cycle_time)
d2.addCallback(lambda ign: self.failUnlessEqual(sorted(sis), sorted(c.sharesets)))
d2.addBoth(self._wait_for_yield, c)
p = c.get_progress()
self.failUnlessEqual(p["cycle-in-progress"], False)
naptime = p["remaining-wait-time"]
self.failUnless(isinstance(naptime, float), naptime)
# min-cycle-time is 300, so this is basically testing that it took
# less than 290s to crawl
self.failUnless(naptime > 10.0, naptime)
soon = p["next-crawl-time"] - time.time()
self.failUnless(soon > 10.0, soon)
# Check that a new crawler picks up on the state file correctly.
def _new_crawler(ign):
c_new = EnumeratingCrawler(server.backend, statefile)
c_new.setServiceParent(self.s)
d.addCallback(_check)
d3 = c_new.set_hook('after_cycle')
d3.addCallback(lambda ign: self.failUnlessEqual(sorted(sis), sorted(c_new.sharesets)))
d3.addBoth(self._wait_for_yield, c_new)
return d3
d2.addCallback(_new_crawler)
d.addCallback(_writes_done)
return d
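For reference, a minimal sketch of how the progress dict asserted above can be interpreted; the key names ("cycle-in-progress", "cycle-complete-percentage", "remaining-wait-time") are the ones used in these tests, everything else here is illustrative:

def describe_progress(p):
    # p is the dict returned by the crawler's get_progress().
    if p["cycle-in-progress"]:
        return "crawling, %.1f%% of the current cycle done" % p["cycle-complete-percentage"]
    # Between cycles the crawler sleeps until "next-crawl-time".
    return "idle, next cycle in about %.0f seconds" % p["remaining-wait-time"]

# Hypothetical example:
#   describe_progress({"cycle-in-progress": False, "remaining-wait-time": 290.0})
#   -> "idle, next cycle in about 290 seconds"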
def OFF_test_cpu_usage(self):
# this test can't actually assert anything, because too many
# This test can't actually assert anything, because too many
# buildslave machines are slow. But on a fast developer machine, it
# can produce interesting results. So if you care about how well the
# Crawler is accomplishing its run-slowly goals, re-enable this test
# and read the stdout when it runs.
self.basedir = "crawler/Basic/cpu_usage"
fileutil.make_dirs(self.basedir)
serverid = "\x00" * 20
ss = StorageServer(self.basedir, serverid)
ss.setServiceParent(self.s)
# FIXME: it should be possible to make this test run deterministically
# by passing a Clock into the crawler.
server = self.create("test_cpu_usage")
aa = server.get_accountant().get_anonymous_account()
for i in range(10):
self.write(i, ss, serverid)
self.write(i, aa, self.serverid)
statefile = os.path.join(self.basedir, "statefile")
c = ConsumingCrawler(ss, statefile)
c = ConsumingCrawler(server.backend, statefile)
c.setServiceParent(self.s)
# this will run as fast as it can, consuming about 50ms per call to
# This will run as fast as it can, consuming about 50ms per call to
# process_bucket(), limited by the Crawler to about 50% cpu. We let
# it run for a few seconds, then compare how much time
# process_bucket() got vs wallclock time. It should get between 10%
@ -377,57 +208,47 @@ class Basic(unittest.TestCase, StallMixin, pollmixin.PollMixin):
print "crawler: got %d%% percent when trying for 50%%" % percent
print "crawler: got %d full cycles" % c.cycles
d.addCallback(_done)
d.addBoth(self._wait_for_yield, c)
return d
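A rough sketch of the percentage computation described in the comment above; the function name and the example numbers are illustrative, not taken from the crawler code:

def cpu_usage_percent(accumulated_seconds, elapsed_wallclock_seconds):
    # The crawler aims for roughly 50% CPU: about 50ms of work per
    # process_bucket() call, interleaved with comparable sleeps.
    return 100.0 * accumulated_seconds / elapsed_wallclock_seconds

# e.g. 2.6 seconds spent inside process_bucket() over 5.0 wallclock seconds
# reports 52%, close to the 50% target mentioned above.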
def test_empty_subclass(self):
self.basedir = "crawler/Basic/empty_subclass"
fileutil.make_dirs(self.basedir)
serverid = "\x00" * 20
ss = StorageServer(self.basedir, serverid)
ss.setServiceParent(self.s)
server = self.create("test_empty_subclass")
aa = server.get_accountant().get_anonymous_account()
for i in range(10):
self.write(i, ss, serverid)
self.write(i, aa, self.serverid)
statefile = os.path.join(self.basedir, "statefile")
c = ShareCrawler(ss, statefile)
c = ShareCrawler(server.backend, statefile)
c.slow_start = 0
c.setServiceParent(self.s)
# we just let it run for a while, to get figleaf coverage of the
# empty methods in the base class
# We just let it run for a while, to get coverage of the
# empty methods in the base class.
def _check():
return bool(c.state["last-cycle-finished"] is not None)
d = self.poll(_check)
def _done(ignored):
state = c.get_state()
self.failUnless(state["last-cycle-finished"] is not None)
d.addCallback(_done)
d = defer.succeed(None)
d.addBoth(self._wait_for_yield, c)
return d
def test_oneshot(self):
self.basedir = "crawler/Basic/oneshot"
fileutil.make_dirs(self.basedir)
serverid = "\x00" * 20
ss = StorageServer(self.basedir, serverid)
ss.setServiceParent(self.s)
server = self.create("test_oneshot")
aa = server.get_accountant().get_anonymous_account()
for i in range(30):
self.write(i, ss, serverid)
self.write(i, aa, self.serverid)
statefile = os.path.join(self.basedir, "statefile")
c = OneShotCrawler(ss, statefile)
c = EnumeratingCrawler(server.backend, statefile)
c.setServiceParent(self.s)
d = c.finished_d
def _finished_first_cycle(ignored):
return fireEventually(c.counter)
d.addCallback(_finished_first_cycle)
d = c.set_hook('after_cycle')
def _after_first_cycle(ignored):
c.disownServiceParent()
return fireEventually(len(c.sharesets))
d.addCallback(_after_first_cycle)
def _check(old_counter):
# the crawler should do any work after it's been stopped
self.failUnlessEqual(old_counter, c.counter)
# The crawler shouldn't do any work after it has been stopped.
self.failUnlessEqual(old_counter, len(c.sharesets))
self.failIf(c.running)
self.failIf(c.timer)
self.failIf(c.current_sleep_time)
@ -437,3 +258,12 @@ class Basic(unittest.TestCase, StallMixin, pollmixin.PollMixin):
d.addCallback(_check)
return d
class CrawlerTestWithDiskBackend(CrawlerTest, unittest.TestCase):
def make_backend(self, basedir):
return DiskBackend(basedir)
class CrawlerTestWithCloudBackendAndMockContainer(CrawlerTest, unittest.TestCase):
def make_backend(self, basedir):
return CloudBackend(MockContainer(basedir))


@ -20,6 +20,8 @@ from allmydata.test.common import ErrorMixin, _corrupt_mutable_share_data, \
ShouldFailMixin
from allmydata.test.common_util import StallMixin
from allmydata.test.no_network import GridTestMixin
from allmydata.scripts import debug
timeout = 2400 # One of these took 1046.091s on Zandr's ARM box.
@ -63,8 +65,8 @@ class MutableChecker(GridTestMixin, unittest.TestCase, ErrorMixin):
def _stash_and_corrupt(node):
self.node = node
self.fileurl = "uri/" + urllib.quote(node.get_uri())
self.corrupt_shares_numbered(node.get_uri(), [0],
_corrupt_mutable_share_data)
return self.corrupt_shares_numbered(node.get_uri(), [0],
_corrupt_mutable_share_data)
d.addCallback(_stash_and_corrupt)
# now make sure the webapi verifier notices it
d.addCallback(lambda ign: self.GET(self.fileurl+"?t=check&verify=true",
@ -898,8 +900,6 @@ class DeepCheckWebBad(DeepCheckBase, unittest.TestCase):
d.addErrback(self.explain_error)
return d
def set_up_damaged_tree(self):
# 6.4s
@ -984,24 +984,20 @@ class DeepCheckWebBad(DeepCheckBase, unittest.TestCase):
return d
def _run_cli(self, argv):
stdout, stderr = StringIO(), StringIO()
# this can only do synchronous operations
assert argv[0] == "debug"
runner.runner(argv, run_by_human=False, stdout=stdout, stderr=stderr)
return stdout.getvalue()
def _delete_some_shares(self, node):
self.delete_shares_numbered(node.get_uri(), [0,1])
return self.delete_shares_numbered(node.get_uri(), [0,1])
def _corrupt_some_shares(self, node):
for (shnum, serverid, sharefile) in self.find_uri_shares(node.get_uri()):
if shnum in (0,1):
self._run_cli(["debug", "corrupt-share", sharefile])
d = self.find_uri_shares(node.get_uri())
def _got_shares(sharelist):
for (shnum, serverid, sharefile) in sharelist:
if shnum in (0,1):
debug.do_corrupt_share(StringIO(), sharefile)
d.addCallback(_got_shares)
return d
def _delete_most_shares(self, node):
self.delete_shares_numbered(node.get_uri(), range(1,10))
return self.delete_shares_numbered(node.get_uri(), range(1,10))
def check_is_healthy(self, cr, where):
try:


@ -4,10 +4,13 @@
# shares from a previous version.
import os
from twisted.trial import unittest
from twisted.internet import defer, reactor
from foolscap.eventual import eventually, fireEventually, flushEventualQueue
from allmydata.util.deferredutil import async_iterate
from allmydata import uri
from allmydata.storage.server import storage_index_to_dir
from allmydata.util import base32, fileutil, spans, log, hashutil
from allmydata.util.consumer import download_to_data, MemoryConsumer
from allmydata.immutable import upload, layout
@ -20,7 +23,7 @@ from allmydata.immutable.downloader.common import BadSegmentNumberError, \
from allmydata.immutable.downloader.status import DownloadStatus
from allmydata.immutable.downloader.fetcher import SegmentFetcher
from allmydata.codec import CRSDecoder
from foolscap.eventual import eventually, fireEventually, flushEventualQueue
plaintext = "This is a moderate-sized file.\n" * 10
mutable_plaintext = "This is a moderate-sized mutable file.\n" * 10
@ -84,90 +87,71 @@ class _Base(GridTestMixin, ShouldFailMixin):
u = upload.Data(plaintext, None)
d = self.c0.upload(u)
f = open("stored_shares.py", "w")
def _write_py(uri):
si = uri.from_string(uri).get_storage_index()
def _each_server( (i,ss,ssdir) ):
sharemap = {}
shareset = ss.backend.get_shareset(si)
d2 = shareset.get_shares()
def _got_shares( (shares, corrupted) ):
assert len(corrupted) == 0, (shares, corrupted)
for share in shares:
sharedata = fileutil.read(share._get_path())
sharemap[share.get_shnum()] = sharedata
fileutil.remove(shareset._get_sharedir())
if sharemap:
f.write(' %d: { # client[%d]\n' % (i, i))
for shnum in sorted(sharemap.keys()):
f.write(' %d: base32.a2b("%s"),\n' %
(shnum, base32.b2a(sharemap[shnum])))
f.write(' },\n')
return True
d2.addCallback(_got_shares)
return d2
d = async_iterate(_each_server, self.iterate_servers())
d.addCallback(lambda ign: f.write('}\n'))
return d
def _created_immutable(ur):
# write the generated shares and URI to a file, which can then be
# incorporated into this one next time.
f.write('immutable_uri = "%s"\n' % ur.get_uri())
f.write('immutable_shares = {\n')
si = uri.from_string(ur.get_uri()).get_storage_index()
si_dir = storage_index_to_dir(si)
for (i,ss,ssdir) in self.iterate_servers():
sharedir = os.path.join(ssdir, "shares", si_dir)
shares = {}
for fn in os.listdir(sharedir):
shnum = int(fn)
sharedata = open(os.path.join(sharedir, fn), "rb").read()
shares[shnum] = sharedata
fileutil.rm_dir(sharedir)
if shares:
f.write(' %d: { # client[%d]\n' % (i, i))
for shnum in sorted(shares.keys()):
f.write(' %d: base32.a2b("%s"),\n' %
(shnum, base32.b2a(shares[shnum])))
f.write(' },\n')
f.write('}\n')
f.write('\n')
return _write_py(ur.get_uri())
d.addCallback(_created_immutable)
d.addCallback(lambda ignored:
self.c0.create_mutable_file(mutable_plaintext))
def _created_mutable(n):
f.write('\n')
f.write('mutable_uri = "%s"\n' % n.get_uri())
f.write('mutable_shares = {\n')
si = uri.from_string(n.get_uri()).get_storage_index()
si_dir = storage_index_to_dir(si)
for (i,ss,ssdir) in self.iterate_servers():
sharedir = os.path.join(ssdir, "shares", si_dir)
shares = {}
for fn in os.listdir(sharedir):
shnum = int(fn)
sharedata = open(os.path.join(sharedir, fn), "rb").read()
shares[shnum] = sharedata
fileutil.rm_dir(sharedir)
if shares:
f.write(' %d: { # client[%d]\n' % (i, i))
for shnum in sorted(shares.keys()):
f.write(' %d: base32.a2b("%s"),\n' %
(shnum, base32.b2a(shares[shnum])))
f.write(' },\n')
f.write('}\n')
f.close()
return _write_py(n.get_uri())
d.addCallback(_created_mutable)
def _done(ignored):
f.close()
d.addCallback(_done)
d.addBoth(_done)
return d
def _write_shares(self, fileuri, shares):
si = uri.from_string(fileuri).get_storage_index()
for i in shares:
shares_for_server = shares[i]
for shnum in shares_for_server:
share_dir = self.get_server(i).backend.get_shareset(si)._get_sharedir()
fileutil.make_dirs(share_dir)
fileutil.write(os.path.join(share_dir, str(shnum)), shares_for_server[shnum])
def load_shares(self, ignored=None):
# this uses the data generated by create_shares() to populate the
# storage servers with pre-generated shares
si = uri.from_string(immutable_uri).get_storage_index()
si_dir = storage_index_to_dir(si)
for i in immutable_shares:
shares = immutable_shares[i]
for shnum in shares:
dn = os.path.join(self.get_serverdir(i), "shares", si_dir)
fileutil.make_dirs(dn)
fn = os.path.join(dn, str(shnum))
f = open(fn, "wb")
f.write(shares[shnum])
f.close()
si = uri.from_string(mutable_uri).get_storage_index()
si_dir = storage_index_to_dir(si)
for i in mutable_shares:
shares = mutable_shares[i]
for shnum in shares:
dn = os.path.join(self.get_serverdir(i), "shares", si_dir)
fileutil.make_dirs(dn)
fn = os.path.join(dn, str(shnum))
f = open(fn, "wb")
f.write(shares[shnum])
f.close()
self._write_shares(immutable_uri, immutable_shares)
self._write_shares(mutable_uri, mutable_shares)
def download_immutable(self, ignored=None):
n = self.c0.create_node_from_uri(immutable_uri)
@ -188,6 +172,7 @@ class _Base(GridTestMixin, ShouldFailMixin):
d.addCallback(_got_data)
return d
class DownloadTest(_Base, unittest.TestCase):
timeout = 2400 # It takes longer than 240 seconds on Zandr's ARM box.
def test_download(self):
@ -210,7 +195,6 @@ class DownloadTest(_Base, unittest.TestCase):
self.load_shares()
si = uri.from_string(immutable_uri).get_storage_index()
si_dir = storage_index_to_dir(si)
n = self.c0.create_node_from_uri(immutable_uri)
d = download_to_data(n)
@ -222,13 +206,15 @@ class DownloadTest(_Base, unittest.TestCase):
# find the three shares that were used, and delete them. Then
# download again, forcing the downloader to fail over to other
# shares
d2 = defer.succeed(None)
for s in n._cnode._node._shares:
for clientnum in immutable_shares:
for shnum in immutable_shares[clientnum]:
if s._shnum == shnum:
fn = os.path.join(self.get_serverdir(clientnum),
"shares", si_dir, str(shnum))
os.unlink(fn)
d2.addCallback(lambda ign, clientnum=clientnum, shnum=shnum:
self.get_server(clientnum).backend.get_shareset(si).get_share(shnum))
d2.addCallback(lambda share: share.unlink())
return d2
d.addCallback(_clobber_some_shares)
d.addCallback(lambda ign: download_to_data(n))
d.addCallback(_got_data)
@ -237,27 +223,29 @@ class DownloadTest(_Base, unittest.TestCase):
# delete all but one of the shares that are still alive
live_shares = [s for s in n._cnode._node._shares if s.is_alive()]
save_me = live_shares[0]._shnum
d2 = defer.succeed(None)
for clientnum in immutable_shares:
for shnum in immutable_shares[clientnum]:
if shnum == save_me:
continue
fn = os.path.join(self.get_serverdir(clientnum),
"shares", si_dir, str(shnum))
if os.path.exists(fn):
os.unlink(fn)
d2.addCallback(lambda ign, clientnum=clientnum, shnum=shnum:
self.get_server(clientnum).backend.get_shareset(si).get_share(shnum))
def _eb(f):
f.trap(EnvironmentError)
d2.addCallbacks(lambda share: share.unlink(), _eb)
# now the download should fail with NotEnoughSharesError
return self.shouldFail(NotEnoughSharesError, "1shares", None,
download_to_data, n)
d2.addCallback(lambda ign: self.shouldFail(NotEnoughSharesError, "1shares", None,
download_to_data, n))
return d2
d.addCallback(_clobber_most_shares)
def _clobber_all_shares(ign):
# delete the last remaining share
for clientnum in immutable_shares:
for shnum in immutable_shares[clientnum]:
fn = os.path.join(self.get_serverdir(clientnum),
"shares", si_dir, str(shnum))
if os.path.exists(fn):
os.unlink(fn)
share_dir = self.get_server(clientnum).backend.get_shareset(si)._get_sharedir()
fileutil.remove(os.path.join(share_dir, str(shnum)))
# now a new download should fail with NoSharesError. We want a
# new ImmutableFileNode so it will forget about the old shares.
# If we merely called create_node_from_uri() without first
@ -834,22 +822,22 @@ class DownloadTest(_Base, unittest.TestCase):
# will report two shares, and the ShareFinder will handle the
# duplicate by attaching both to the same CommonShare instance.
si = uri.from_string(immutable_uri).get_storage_index()
si_dir = storage_index_to_dir(si)
sh0_file = [sharefile
for (shnum, serverid, sharefile)
in self.find_uri_shares(immutable_uri)
if shnum == 0][0]
sh0_data = open(sh0_file, "rb").read()
for clientnum in immutable_shares:
if 0 in immutable_shares[clientnum]:
continue
cdir = self.get_serverdir(clientnum)
target = os.path.join(cdir, "shares", si_dir, "0")
outf = open(target, "wb")
outf.write(sh0_data)
outf.close()
d = self.download_immutable()
d = defer.succeed(None)
d.addCallback(lambda ign: self.find_uri_shares(immutable_uri))
def _duplicate(sharelist):
sh0_file = [sharefile for (shnum, serverid, sharefile) in sharelist
if shnum == 0][0]
sh0_data = fileutil.read(sh0_file)
for clientnum in immutable_shares:
if 0 in immutable_shares[clientnum]:
continue
cdir = self.get_server(clientnum).backend.get_shareset(si)._get_sharedir()
fileutil.make_dirs(cdir)
fileutil.write(os.path.join(cdir, str(shnum)), sh0_data)
d.addCallback(_duplicate)
d.addCallback(lambda ign: self.download_immutable())
return d
def test_verifycap(self):
@ -934,13 +922,13 @@ class Corruption(_Base, unittest.TestCase):
log.msg("corrupt %d" % which)
def _corruptor(s, debug=False):
return s[:which] + chr(ord(s[which])^0x01) + s[which+1:]
self.corrupt_shares_numbered(imm_uri, [0], _corruptor)
return self.corrupt_shares_numbered(imm_uri, [0], _corruptor)
def _corrupt_set(self, ign, imm_uri, which, newvalue):
log.msg("corrupt %d" % which)
def _corruptor(s, debug=False):
return s[:which] + chr(newvalue) + s[which+1:]
self.corrupt_shares_numbered(imm_uri, [0], _corruptor)
return self.corrupt_shares_numbered(imm_uri, [0], _corruptor)
def test_each_byte(self):
# Setting catalog_detection=True performs an exhaustive test of the
@ -951,6 +939,7 @@ class Corruption(_Base, unittest.TestCase):
# (since we don't need every byte of the share). That takes 50s to
# run on my laptop and doesn't have any actual asserts, so we don't
# normally do that.
# XXX this has bitrotted (before v1.8.2) and gives an AttributeError.
self.catalog_detection = False
self.basedir = "download/Corruption/each_byte"
@ -999,12 +988,10 @@ class Corruption(_Base, unittest.TestCase):
d.addCallback(_got_data)
return d
d = self.c0.upload(u)
def _uploaded(ur):
imm_uri = ur.get_uri()
self.shares = self.copy_shares(imm_uri)
d = defer.succeed(None)
# 'victims' is a list of corruption tests to run. Each one flips
# the low-order bit of the specified offset in the share file (so
# offset=0 is the MSB of the container version, offset=15 is the
@ -1048,23 +1035,32 @@ class Corruption(_Base, unittest.TestCase):
[(i, "need-4th") for i in need_4th_victims])
if self.catalog_detection:
corrupt_me = [(i, "") for i in range(len(self.sh0_orig))]
for i,expected in corrupt_me:
# All these tests result in a successful download. What we're
# measuring is how many shares the downloader had to use.
d.addCallback(self._corrupt_flip, imm_uri, i)
d.addCallback(_download, imm_uri, i, expected)
d.addCallback(lambda ign: self.restore_all_shares(self.shares))
d.addCallback(fireEventually)
corrupt_values = [(3, 2, "no-sh0"),
(15, 2, "need-4th"), # share looks v2
]
for i,newvalue,expected in corrupt_values:
d.addCallback(self._corrupt_set, imm_uri, i, newvalue)
d.addCallback(_download, imm_uri, i, expected)
d.addCallback(lambda ign: self.restore_all_shares(self.shares))
d.addCallback(fireEventually)
return d
d2 = defer.succeed(None)
d2.addCallback(lambda ign: self.copy_shares(imm_uri))
def _copied(copied_shares):
d3 = defer.succeed(None)
for i, expected in corrupt_me:
# All these tests result in a successful download. What we're
# measuring is how many shares the downloader had to use.
d3.addCallback(self._corrupt_flip, imm_uri, i)
d3.addCallback(_download, imm_uri, i, expected)
d3.addCallback(lambda ign: self.restore_all_shares(copied_shares))
d3.addCallback(fireEventually)
corrupt_values = [(3, 2, "no-sh0"),
(15, 2, "need-4th"), # share looks v2
]
for i, newvalue, expected in corrupt_values:
d3.addCallback(self._corrupt_set, imm_uri, i, newvalue)
d3.addCallback(_download, imm_uri, i, expected)
d3.addCallback(lambda ign: self.restore_all_shares(copied_shares))
d3.addCallback(fireEventually)
return d3
d2.addCallback(_copied)
return d2
d.addCallback(_uploaded)
def _show_results(ign):
print
print ("of [0:%d], corruption ignored in %s" %
@ -1100,8 +1096,6 @@ class Corruption(_Base, unittest.TestCase):
d = self.c0.upload(u)
def _uploaded(ur):
imm_uri = ur.get_uri()
self.shares = self.copy_shares(imm_uri)
corrupt_me = [(48, "block data", "Last failure: None"),
(600+2*32, "block_hashes[2]", "BadHashError"),
(376+2*32, "crypttext_hash_tree[2]", "BadHashError"),
@ -1115,25 +1109,31 @@ class Corruption(_Base, unittest.TestCase):
assert not n._cnode._node._shares
return download_to_data(n)
d = defer.succeed(None)
for i,which,substring in corrupt_me:
# All these tests result in a failed download.
d.addCallback(self._corrupt_flip_all, imm_uri, i)
d.addCallback(lambda ign, which=which, substring=substring:
self.shouldFail(NoSharesError, which,
substring,
_download, imm_uri))
d.addCallback(lambda ign: self.restore_all_shares(self.shares))
d.addCallback(fireEventually)
return d
d.addCallback(_uploaded)
d2 = defer.succeed(None)
d2.addCallback(lambda ign: self.copy_shares(imm_uri))
def _copied(copied_shares):
d3 = defer.succeed(None)
for i, which, substring in corrupt_me:
# All these tests result in a failed download.
d3.addCallback(self._corrupt_flip_all, imm_uri, i)
d3.addCallback(lambda ign, which=which, substring=substring:
self.shouldFail(NoSharesError, which,
substring,
_download, imm_uri))
d3.addCallback(lambda ign: self.restore_all_shares(copied_shares))
d3.addCallback(fireEventually)
return d3
d2.addCallback(_copied)
return d2
d.addCallback(_uploaded)
return d
def _corrupt_flip_all(self, ign, imm_uri, which):
def _corruptor(s, debug=False):
return s[:which] + chr(ord(s[which])^0x01) + s[which+1:]
self.corrupt_all_shares(imm_uri, _corruptor)
return self.corrupt_all_shares(imm_uri, _corruptor)
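For clarity, the single-bit-flip corruption used by these tests can be written as a standalone helper (illustrative only; the tests inline it as the _corruptor closures above):

def flip_low_bit(share_data, which):
    # Flip the low-order bit of the byte at offset 'which' (Python 2 str).
    return share_data[:which] + chr(ord(share_data[which]) ^ 0x01) + share_data[which+1:]

# e.g. flipping offset 0 toggles the most significant byte of the share
# container's version field, as noted in the 'victims' comment above.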
class DownloadV2(_Base, unittest.TestCase):
# tests which exercise v2-share code. They first upload a file with
@ -1203,17 +1203,17 @@ class DownloadV2(_Base, unittest.TestCase):
d = self.c0.upload(u)
def _uploaded(ur):
imm_uri = ur.get_uri()
def _do_corrupt(which, newvalue):
def _corruptor(s, debug=False):
return s[:which] + chr(newvalue) + s[which+1:]
self.corrupt_shares_numbered(imm_uri, [0], _corruptor)
_do_corrupt(12+3, 0x00)
n = self.c0.create_node_from_uri(imm_uri)
d = download_to_data(n)
def _got_data(data):
self.failUnlessEqual(data, plaintext)
d.addCallback(_got_data)
return d
which = 12+3
newvalue = 0x00
def _corruptor(s, debug=False):
return s[:which] + chr(newvalue) + s[which+1:]
d2 = defer.succeed(None)
d2.addCallback(lambda ign: self.corrupt_shares_numbered(imm_uri, [0], _corruptor))
d2.addCallback(lambda ign: self.c0.create_node_from_uri(imm_uri))
d2.addCallback(lambda n: download_to_data(n))
d2.addCallback(lambda data: self.failUnlessEqual(data, plaintext))
return d2
d.addCallback(_uploaded)
return d


@ -131,7 +131,7 @@ class FakeBucketReaderWriterProxy:
d.addCallback(_try)
return d
def get_share_hashes(self, at_least_these=()):
def get_share_hashes(self):
d = self._start()
def _try(unused=None):
if self.mode == "bad sharehash":


@ -1,14 +1,13 @@
# -*- coding: utf-8 -*-
import os, shutil
import os
from twisted.trial import unittest
from twisted.internet import defer
from allmydata import uri
from allmydata.util.consumer import download_to_data
from allmydata.immutable import upload
from allmydata.mutable.common import UnrecoverableFileError
from allmydata.mutable.publish import MutableData
from allmydata.storage.common import storage_index_to_dir
from allmydata.test.no_network import GridTestMixin
from allmydata.test.common import ShouldFailMixin
from allmydata.util.pollmixin import PollMixin
@ -17,9 +16,10 @@ from allmydata.interfaces import NotEnoughSharesError
immutable_plaintext = "data" * 10000
mutable_plaintext = "muta" * 10000
class HungServerDownloadTest(GridTestMixin, ShouldFailMixin, PollMixin,
unittest.TestCase):
# Many of these tests take around 60 seconds on François's ARM buildslave:
# Many of these tests take around 60 seconds on Franc,ois's ARM buildslave:
# http://tahoe-lafs.org/buildbot/builders/FranXois%20lenny-armv5tel
# allmydata.test.test_hung_server.HungServerDownloadTest.test_2_good_8_broken_duplicate_share_fail
# once ERRORed after 197 seconds on Midnight Magic's NetBSD buildslave:
@ -29,16 +29,16 @@ class HungServerDownloadTest(GridTestMixin, ShouldFailMixin, PollMixin,
timeout = 240
def _break(self, servers):
for (id, ss) in servers:
self.g.break_server(id)
for ss in servers:
self.g.break_server(ss.original.get_serverid())
def _hang(self, servers, **kwargs):
for (id, ss) in servers:
self.g.hang_server(id, **kwargs)
for ss in servers:
self.g.hang_server(ss.original.get_serverid(), **kwargs)
def _unhang(self, servers, **kwargs):
for (id, ss) in servers:
self.g.unhang_server(id, **kwargs)
for ss in servers:
self.g.unhang_server(ss.original.get_serverid(), **kwargs)
def _hang_shares(self, shnums, **kwargs):
# hang all servers who are holding the given shares
@ -50,46 +50,29 @@ class HungServerDownloadTest(GridTestMixin, ShouldFailMixin, PollMixin,
hung_serverids.add(i_serverid)
def _delete_all_shares_from(self, servers):
serverids = [id for (id, ss) in servers]
serverids = [ss.original.get_serverid() for ss in servers]
for (i_shnum, i_serverid, i_sharefile) in self.shares:
if i_serverid in serverids:
os.unlink(i_sharefile)
def _corrupt_all_shares_in(self, servers, corruptor_func):
serverids = [id for (id, ss) in servers]
serverids = [ss.original.get_serverid() for ss in servers]
for (i_shnum, i_serverid, i_sharefile) in self.shares:
if i_serverid in serverids:
self._corrupt_share((i_shnum, i_sharefile), corruptor_func)
self.corrupt_share((i_shnum, i_serverid, i_sharefile), corruptor_func)
def _copy_all_shares_from(self, from_servers, to_server):
serverids = [id for (id, ss) in from_servers]
serverids = [ss.original.get_serverid() for ss in from_servers]
for (i_shnum, i_serverid, i_sharefile) in self.shares:
if i_serverid in serverids:
self._copy_share((i_shnum, i_sharefile), to_server)
self.copy_share((i_shnum, i_serverid, i_sharefile), self.uri,
to_server.original.server)
def _copy_share(self, share, to_server):
(sharenum, sharefile) = share
(id, ss) = to_server
shares_dir = os.path.join(ss.original.storedir, "shares")
si = uri.from_string(self.uri).get_storage_index()
si_dir = os.path.join(shares_dir, storage_index_to_dir(si))
if not os.path.exists(si_dir):
os.makedirs(si_dir)
new_sharefile = os.path.join(si_dir, str(sharenum))
shutil.copy(sharefile, new_sharefile)
self.shares = self.find_uri_shares(self.uri)
# Make sure that the storage server has the share.
self.failUnless((sharenum, ss.original.my_nodeid, new_sharefile)
in self.shares)
def _corrupt_share(self, share, corruptor_func):
(sharenum, sharefile) = share
data = open(sharefile, "rb").read()
newdata = corruptor_func(data)
os.unlink(sharefile)
wf = open(sharefile, "wb")
wf.write(newdata)
wf.close()
d = self.find_uri_shares(self.uri)
def _got_shares(shares):
self.shares = shares
d.addCallback(_got_shares)
return d
def _set_up(self, mutable, testdir, num_clients=1, num_servers=10):
self.mutable = mutable
@ -102,8 +85,8 @@ class HungServerDownloadTest(GridTestMixin, ShouldFailMixin, PollMixin,
self.c0 = self.g.clients[0]
nm = self.c0.nodemaker
self.servers = sorted([(s.get_serverid(), s.get_rref())
for s in nm.storage_broker.get_connected_servers()])
unsorted = [(s.get_serverid(), s.get_rref()) for s in nm.storage_broker.get_connected_servers()]
self.servers = [ss for (id, ss) in sorted(unsorted)]
self.servers = self.servers[5:] + self.servers[:5]
if mutable:
@ -111,15 +94,18 @@ class HungServerDownloadTest(GridTestMixin, ShouldFailMixin, PollMixin,
d = nm.create_mutable_file(uploadable)
def _uploaded_mutable(node):
self.uri = node.get_uri()
self.shares = self.find_uri_shares(self.uri)
d.addCallback(_uploaded_mutable)
else:
data = upload.Data(immutable_plaintext, convergence="")
d = self.c0.upload(data)
def _uploaded_immutable(upload_res):
self.uri = upload_res.get_uri()
self.shares = self.find_uri_shares(self.uri)
d.addCallback(_uploaded_immutable)
d.addCallback(lambda ign: self.find_uri_shares(self.uri))
def _got_shares(shares):
self.shares = shares
d.addCallback(_got_shares)
return d
def _start_download(self):
@ -264,7 +250,7 @@ class HungServerDownloadTest(GridTestMixin, ShouldFailMixin, PollMixin,
# stuck-but-not-overdue, and 4 live requests. All 4 live requests
# will retire before the download is complete and the ShareFinder
# is shut off. That will leave 4 OVERDUE and 1
# stuck-but-not-overdue, for a total of 5 requests in in
# stuck-but-not-overdue, for a total of 5 requests in
# _sf.pending_requests
for t in self._sf.overdue_timers.values()[:4]:
t.reset(-1.0)


@ -237,7 +237,7 @@ class Test(GridTestMixin, unittest.TestCase, common.ShouldFailMixin):
d = self.startup("download_from_only_3_shares_with_good_crypttext_hash")
def _corrupt_7(ign):
c = common._corrupt_offset_of_block_hashes_to_truncate_crypttext_hashes
self.corrupt_shares_numbered(self.uri, self._shuffled(7), c)
return self.corrupt_shares_numbered(self.uri, self._shuffled(7), c)
d.addCallback(_corrupt_7)
d.addCallback(self._download_and_check_plaintext)
return d
@ -264,7 +264,7 @@ class Test(GridTestMixin, unittest.TestCase, common.ShouldFailMixin):
d = self.startup("download_abort_if_too_many_corrupted_shares")
def _corrupt_8(ign):
c = common._corrupt_sharedata_version_number
self.corrupt_shares_numbered(self.uri, self._shuffled(8), c)
return self.corrupt_shares_numbered(self.uri, self._shuffled(8), c)
d.addCallback(_corrupt_8)
def _try_download(ign):
start_reads = self._count_reads()


@ -4,12 +4,10 @@ import re, errno, subprocess, os
from twisted.trial import unittest
from allmydata.util import iputil
from allmydata.util.namespace import Namespace
import allmydata.test.common_util as testutil
class Namespace:
pass
DOTTED_QUAD_RE=re.compile("^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$")
MOCK_IPADDR_OUTPUT = """\


@ -0,0 +1,150 @@
import os
from twisted.trial import unittest
from allmydata.util import fileutil
from allmydata.util import dbutil
from allmydata.util.dbutil import IntegrityError
from allmydata.storage.leasedb import LeaseDB, LeaseInfo, NonExistentShareError, \
SHARETYPE_IMMUTABLE
BASE_ACCOUNTS = set([(0, u"anonymous"), (1, u"starter")])
class DB(unittest.TestCase):
def make(self, testname):
basedir = os.path.join("leasedb", "DB", testname)
fileutil.make_dirs(basedir)
dbfilename = os.path.join(basedir, "leasedb.sqlite")
return dbfilename
def test_create(self):
dbfilename = self.make("create")
l = LeaseDB(dbfilename)
l.startService()
self.failUnlessEqual(set(l.get_all_accounts()), BASE_ACCOUNTS)
# should be able to open an existing one too
l2 = LeaseDB(dbfilename)
l2.startService()
self.failUnlessEqual(set(l2.get_all_accounts()), BASE_ACCOUNTS)
def test_basic(self):
dbfilename = self.make("create")
l = LeaseDB(dbfilename)
l.startService()
l.add_new_share('si1', 0, 12345, SHARETYPE_IMMUTABLE)
# lease for a non-existent share
self.failUnlessRaises(IntegrityError, l._cursor.execute,
"INSERT INTO `leases` VALUES(?,?,?,?,?)",
('si2', 0, LeaseDB.ANONYMOUS_ACCOUNTID, 0, 0))
self.failUnlessRaises(NonExistentShareError, l.add_starter_lease,
'si2', 0)
self.failUnlessRaises(NonExistentShareError, l.add_or_renew_leases,
'si2', 0, LeaseDB.ANONYMOUS_ACCOUNTID, 0, 0)
l.add_starter_lease('si1', 0)
# updating the lease should succeed
l.add_starter_lease('si1', 0)
leaseinfo = l.get_leases('si1', LeaseDB.STARTER_LEASE_ACCOUNTID)
self.failUnlessEqual(len(leaseinfo), 1)
self.failUnlessIsInstance(leaseinfo[0], LeaseInfo)
self.failUnlessEqual(leaseinfo[0].storage_index, 'si1')
self.failUnlessEqual(leaseinfo[0].shnum, 0)
self.failUnlessEqual(leaseinfo[0].owner_num, LeaseDB.STARTER_LEASE_ACCOUNTID)
# adding a duplicate entry directly should fail
self.failUnlessRaises(IntegrityError, l._cursor.execute,
"INSERT INTO `leases` VALUES(?,?,?,?,?)",
('si1', 0, LeaseDB.ANONYMOUS_ACCOUNTID, 0, 0))
# same for add_or_renew_leases
l.add_or_renew_leases('si1', 0, LeaseDB.ANONYMOUS_ACCOUNTID, 0, 0)
# updating the lease should succeed
l.add_or_renew_leases('si1', 0, LeaseDB.ANONYMOUS_ACCOUNTID, 1, 2)
leaseinfo = l.get_leases('si1', LeaseDB.ANONYMOUS_ACCOUNTID)
self.failUnlessEqual(len(leaseinfo), 1)
self.failUnlessIsInstance(leaseinfo[0], LeaseInfo)
self.failUnlessEqual(leaseinfo[0].storage_index, 'si1')
self.failUnlessEqual(leaseinfo[0].shnum, 0)
self.failUnlessEqual(leaseinfo[0].owner_num, LeaseDB.ANONYMOUS_ACCOUNTID)
self.failUnlessEqual(leaseinfo[0].renewal_time, 1)
self.failUnlessEqual(leaseinfo[0].expiration_time, 2)
# adding a duplicate entry directly should fail
self.failUnlessRaises(IntegrityError, l._cursor.execute,
"INSERT INTO `leases` VALUES(?,?,?,?,?)",
('si1', 0, LeaseDB.ANONYMOUS_ACCOUNTID, 0, 0))
num_shares, total_leased_used_space = l.get_total_leased_sharecount_and_used_space()
num_sharesets = l.get_number_of_sharesets()
self.failUnlessEqual(total_leased_used_space, 12345)
self.failUnlessEqual(num_shares, 1)
self.failUnlessEqual(num_sharesets, 1)
l.add_new_share('si1', 1, 12345, SHARETYPE_IMMUTABLE)
l.add_starter_lease('si1', 1)
num_shares, total_leased_used_space = l.get_total_leased_sharecount_and_used_space()
num_sharesets = l.get_number_of_sharesets()
self.failUnlessEqual(total_leased_used_space, 24690)
self.failUnlessEqual(num_shares, 2)
self.failUnlessEqual(num_sharesets, 1)
l.add_new_share('si2', 0, 12345, SHARETYPE_IMMUTABLE)
l.add_starter_lease('si2', 0)
num_sharesets = l.get_number_of_sharesets()
num_shares, total_leased_used_space = l.get_total_leased_sharecount_and_used_space()
num_sharesets = l.get_number_of_sharesets()
self.failUnlessEqual(total_leased_used_space, 37035)
self.failUnlessEqual(num_shares, 3)
self.failUnlessEqual(num_sharesets, 2)
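The expected totals above follow directly from the share sizes; a sketch of the arithmetic (not leasedb code):

# Each share added above is 12345 bytes, and leases exist on
# ('si1', 0), ('si1', 1) and ('si2', 0):
share_size = 12345
assert 1 * share_size == 12345   # one leased share, one shareset
assert 2 * share_size == 24690   # two leased shares, still one shareset
assert 3 * share_size == 37035   # three leased shares across two sharesets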
class MockCursor:
def __init__(self):
self.closed = False
def close(self):
self.closed = True
class MockDB:
def __init__(self):
self.closed = False
def cursor(self):
return MockCursor()
def close(self):
self.closed = True
class FD_Leak(unittest.TestCase):
def create_leasedb(self, testname):
basedir = os.path.join("leasedb", "FD_Leak", testname)
fileutil.make_dirs(basedir)
dbfilename = os.path.join(basedir, "leasedb.sqlite")
return dbfilename
def test_basic(self):
# This test ensures that the db connection is closed by leasedb after
# the service stops.
def _call_get_db(*args, **kwargs):
return None, MockDB()
self.patch(dbutil, 'get_db', _call_get_db)
dbfilename = self.create_leasedb("test_basic")
l = LeaseDB(dbfilename)
l.startService()
db = l._db
cursor = l._cursor
l.stopService()
self.failUnless(db.closed)
self.failUnless(cursor.closed)
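The assertions above imply that LeaseDB.stopService() closes both its cursor and its connection; a minimal sketch of that cleanup pattern, assuming a Twisted service holding a DB-API connection (illustrative only, not the actual leasedb implementation):

from twisted.application import service

class ClosesDBOnStop(service.Service):
    # Hold a connection and cursor while running; close both on stop,
    # so that no file descriptor is leaked when the service goes away.
    def __init__(self, db):
        self._db = db
        self._cursor = db.cursor()

    def stopService(self):
        self._cursor.close()
        self._db.close()
        return service.Service.stopService(self)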


@ -8,7 +8,7 @@ from allmydata.util import base32, consumer, fileutil, mathutil
from allmydata.util.hashutil import tagged_hash, ssk_writekey_hash, \
ssk_pubkey_fingerprint_hash
from allmydata.util.consumer import MemoryConsumer
from allmydata.util.deferredutil import gatherResults
from allmydata.util.deferredutil import gatherResults, WaitForDelayedCallsMixin
from allmydata.interfaces import IRepairResults, ICheckAndRepairResults, \
NotEnoughSharesError, SDMF_VERSION, MDMF_VERSION, DownloadStopped
from allmydata.monitor import Monitor
@ -17,7 +17,6 @@ from allmydata.test.no_network import GridTestMixin
from foolscap.api import eventually, fireEventually
from foolscap.logging import log
from allmydata.storage_client import StorageFarmBroker
from allmydata.storage.common import storage_index_to_dir
from allmydata.scripts import debug
from allmydata.mutable.filenode import MutableFileNode, BackoffAgent
@ -229,14 +228,14 @@ def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
def make_storagebroker(s=None, num_peers=10):
if not s:
s = FakeStorage()
peerids = [tagged_hash("peerid", "%d" % i)[:20]
for i in range(num_peers)]
serverids = [tagged_hash("peerid", "%d" % i)[:20]
for i in range(num_peers)]
storage_broker = StorageFarmBroker(None, True)
for peerid in peerids:
fss = FakeStorageServer(peerid, s)
ann = {"anonymous-storage-FURL": "pb://%s@nowhere/fake" % base32.b2a(peerid),
"permutation-seed-base32": base32.b2a(peerid) }
storage_broker.test_add_rref(peerid, fss, ann)
for serverid in serverids:
fss = FakeStorageServer(serverid, s)
ann = {"anonymous-storage-FURL": "pb://%s@nowhere/fake" % base32.b2a(serverid),
"permutation-seed-base32": base32.b2a(serverid) }
storage_broker.test_add_rref(serverid, fss, ann)
return storage_broker
def make_nodemaker(s=None, num_peers=10, keysize=TEST_RSA_KEY_SIZE):
@ -250,7 +249,7 @@ def make_nodemaker(s=None, num_peers=10, keysize=TEST_RSA_KEY_SIZE):
{"k": 3, "n": 10}, SDMF_VERSION, keygen)
return nodemaker
class Filenode(unittest.TestCase, testutil.ShouldFailMixin):
class Filenode(unittest.TestCase, testutil.ShouldFailMixin, WaitForDelayedCallsMixin):
# this used to be in Publish, but we removed the limit. Some of
# these tests test whether the new code correctly allows files
# larger than the limit.
@ -842,6 +841,7 @@ class Filenode(unittest.TestCase, testutil.ShouldFailMixin):
return d
d.addCallback(_created)
d.addBoth(self.wait_for_delayed_calls)
return d
def test_upload_and_download_full_size_keys(self):
@ -894,6 +894,7 @@ class Filenode(unittest.TestCase, testutil.ShouldFailMixin):
d.addCallback(_created)
d.addCallback(lambda ignored:
self.failUnlessEqual(self.n.get_size(), 9))
d.addBoth(self.wait_for_delayed_calls)
return d
@ -1903,7 +1904,7 @@ class Checker(unittest.TestCase, CheckerMixin, PublishMixin):
class Repair(unittest.TestCase, PublishMixin, ShouldFailMixin):
def get_shares(self, s):
def get_all_shares(self, s):
all_shares = {} # maps (peerid, shnum) to share data
for peerid in s._peers:
shares = s._peers[peerid]
@ -1913,7 +1914,7 @@ class Repair(unittest.TestCase, PublishMixin, ShouldFailMixin):
return all_shares
def copy_shares(self, ignored=None):
self.old_shares.append(self.get_shares(self._storage))
self.old_shares.append(self.get_all_shares(self._storage))
def test_repair_nop(self):
self.old_shares = []
@ -2692,7 +2693,6 @@ class Problems(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin):
nm.create_mutable_file, MutableData("contents"))
return d
def test_privkey_query_error(self):
# when a servermap is updated with MODE_WRITE, it tries to get the
# privkey. Something might go wrong during this query attempt.
@ -2810,12 +2810,10 @@ class Problems(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin):
for share, shnum in [(TEST_1654_SH0, 0), (TEST_1654_SH1, 1)]:
sharedata = base64.b64decode(share)
storedir = self.get_serverdir(shnum)
storage_path = os.path.join(storedir, "shares",
storage_index_to_dir(si))
fileutil.make_dirs(storage_path)
fileutil.write(os.path.join(storage_path, "%d" % shnum),
sharedata)
# This must be a disk backend.
storage_dir = self.get_server(shnum).backend.get_shareset(si)._get_sharedir()
fileutil.make_dirs(storage_dir)
fileutil.write(os.path.join(storage_dir, str(shnum)), sharedata)
nm = self.g.clients[0].nodemaker
n = nm.create_from_cap(TEST_1654_CAP)
@ -3110,7 +3108,7 @@ class Version(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin, \
fso = debug.FindSharesOptions()
storage_index = base32.b2a(n.get_storage_index())
fso.si_s = storage_index
fso.nodedirs = [unicode(os.path.dirname(os.path.abspath(storedir)))
fso.nodedirs = [os.path.dirname(storedir)
for (i,ss,storedir)
in self.iterate_servers()]
fso.stdout = StringIO()
@ -3118,7 +3116,8 @@ class Version(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin, \
debug.find_shares(fso)
sharefiles = fso.stdout.getvalue().splitlines()
expected = self.nm.default_encoding_parameters["n"]
self.failUnlessEqual(len(sharefiles), expected)
self.failUnlessEqual(len(sharefiles), expected,
str((fso.stdout.getvalue(), fso.stderr.getvalue())))
do = debug.DumpOptions()
do["filename"] = sharefiles[0]
@ -3128,7 +3127,6 @@ class Version(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin, \
lines = set(output.splitlines())
self.failUnless("Mutable slot found:" in lines, output)
self.failUnless(" share_type: MDMF" in lines, output)
self.failUnless(" num_extra_leases: 0" in lines, output)
self.failUnless(" MDMF contents:" in lines, output)
self.failUnless(" seqnum: 1" in lines, output)
self.failUnless(" required_shares: 3" in lines, output)
@ -3144,6 +3142,7 @@ class Version(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin, \
cso.stderr = StringIO()
debug.catalog_shares(cso)
shares = cso.stdout.getvalue().splitlines()
self.failIf(len(shares) < 1, shares)
oneshare = shares[0] # all shares should be MDMF
self.failIf(oneshare.startswith("UNKNOWN"), oneshare)
self.failUnless(oneshare.startswith("MDMF"), oneshare)
@ -3729,6 +3728,7 @@ class Interoperability(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixi
sdmf_old_shares[9] = "VGFob2UgbXV0YWJsZSBjb250YWluZXIgdjEKdQlEA47ESLbTdKdpLJXCpBxd5OH239tl5hvAiz1dvGdE5rIOpf8cbfxbPcwNF+Y5dM92uBVbmV6KAAAAAAAAB/wAAAAAAAAJ0AAAAAFOWSw7jSx7WXzaMpdleJYXwYsRCV82jNA5oex9m2YhXSnb2POh+vvC1LE1NAfRc9GOb2zQG84Xdsx1Jub2brEeKkyt0sRIttN0p2kslcKkHF3k4fbf22XmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABamJprL6ecrsOoFKdrXUmWveLq8nzEGDOjFnyK9detI3noX3uyK2MwSnFdAfyN0tuAwoAAAAAAAAAFQAAAAAAAAAVAAABjwAAAo8AAAMXAAADNwAAAAAAAAM+AAAAAAAAB/wwggEgMA0GCSqGSIb3DQEBAQUAA4IBDQAwggEIAoIBAQC1IkainlJF12IBXBQdpRK1zXB7a26vuEYqRmQM09YjC6sQjCs0F2ICk8n9m/2Kw4l16eIEboB2Au9pODCE+u/dEAakEFh4qidTMn61rbGUbsLK8xzuWNW22ezzz9/nPia0HDrulXt51/FYtfnnAuD1RJGXJv/8tDllE9FL/18TzlH4WuB6Fp8FTgv7QdbZAfWJHDGFIpVCJr1XxOCsSZNFJIqGwZnD2lsChiWw5OJDbKd8otqN1hIbfHyMyfMOJ/BzRzvZXaUt4Dv5nf93EmQDWClxShRwpuX/NkZ5B2K9OFonFTbOCexm/MjMAdCBqebKKaiHFkiknUCn9eJQpZ5bAgERgV50VKj+AVTDfgTpqfO2vfo4wrufi6ZBb8QV7hllhUFBjYogQ9C96dnS7skv0s+cqFuUjwMILr5/rsbEmEMGvl0T0ytyAbtlXuowEFVj/YORNknM4yjY72YUtEPTlMpk0Cis7aIgTvu5qWMPER26PMApZuRqiwRsGIkaJIvOVOTHHjFYe3/YzdMkc7OZtqRMfQLtwVl2/zKQQV8b/a9vaT6q3mRLRd4P3esaAFe/+7sR/t+9tmB+a8kxtKM6kmaVQJMbXJZ4aoHGfeLX0m35Rcvu2Bmph7QfSDjk/eaE3q55zYSoGWShmlhlw4Kwg84sMuhmcVhLvo0LovR8bKmbdgABUSzNKiMx0E91q51/WH6ASL0fDEOLef9oxuyBX5F5cpoABojmWkDX3k3FKfgNHIeptE3lxB8HHzxDfSD250psyfNCAAwGsKbMxbmI2NpdTozZ3SICrySwgGkatA1gsDOJmOnTzgAXVnLiODzHiLFAI/MsXcR71fmvb7UghLA1b8pq66KAyl+aopjsD29AKG5hrXt9hLIp6shvfrzaPGIid5C8IxYIrjgBj1YohGgDE0Wua7Lx6Bnad5n91qmHAnwSEJE5YIhQM634omd6cq9Wk4seJCUIn+ucoknrpxp0IR9QMxpKSMRHRUg2K8ZegnY3YqFunRZKCfsq9ufQEKgjZN12AFqi551KPBdn4/3V5HK6xTv0P4robSsE/BvuIfByvRf/W7ZrDx+CFC4EEcsBOACOZCrkhhqd5TkYKbe9RA+vs56+9N5qZGurkxcoKviiyEncxvTuShD65DK/6x6kMDMgQv/EdZDI3x9GtHTnRBYXwDGnPJ19w+q2zC3e2XarbxTGYQIPEC5mYx0gAA0sbjf018NGfwBhl6SB54iGsa8uLvR3jHv6OSRJgwxL6j7P0Ts4Hv2EtO12P0Lv21pwi3JC1O/WviSrKCvrQD5lMHL9Uym3hwFi2zu0mqwZvxOAbGy7kfOPXkLYKOHTZLthzKj3PsdjeceWBfYIvPGKYcd6wDr36d1aXSYS4IWeApTS2AQ2lu0DUcgSefAvsA8NkgOklvJY1cjTMSg6j6cxQo48Bvl8RAWGLbr4h2S/8KwDGxwLsSv0Gop/gnFc3GzCsmL0EkEyHHWkCA8YRXCghfW80KLDV495ff7yF5oiwK56GniqowZ3RG9Jxp5MXoJQgsLV1VMQFMAmsY69yz8eoxRH3wl9L0dMyndLulhWWzNwPMQ2I0yAWdzA/pksVmwTJTFenB3MHCiWc5rEwJ3yofe6NZZnZQrYyL9r1TNnVwfTwRUiykPiLSk4x9Mi6DX7RamDAxc8u3gDVfjPsTOTagBOEGUWlGAL54KE/E6sgCQ5DEAt12chk8AxbjBFLPgV+/idrzS0lZHOL+IVBI9D0i3Bq1yZcSIqcjZB0M3IbxbPm4gLAYOWEiTUN2ecsEHHg9nt6rhgffVoqSbCCFPbpC0xf7WOC3+BQORIZECOCC7cUAciXq3xn+GuxpFE40RWRJeKAK7bBQ21X89ABIXlQFkFddZ9kRvlZ2Pnl0oeF+2pjnZu0Yc2czNfZEQF2P7BKIdLrgMgxG89snxAY8qAYTCKyQw6xTG87wkjDcpy1wzsZLP3WsOuO7cAm7b27xU0jRKq8Cw4d1hDoyRG+RdS53F8RFJzVMaNNYgxU2tfRwUvXpTRXiOheeRVvh25+YGVnjakUXjx/dSDnOw4ETHGHD+7styDkeSfc3BdSZxswzc6OehgMI+xsCxeeRym15QUm9hxvg8X7Bfz/0WulgFwgzrm11TVynZYOmvyHpiZKoqQyQyKahIrfhwuchCr7lMsZ4a+umIkNkKxCLZnI+T7jd+eGFMgKItjz3kTTxRl3IhaJG3LbPmwRUJynMxQKdMi4Uf0qy0U7+i8hIJ9m50QXc+3tw2bwDSbx22XYJ9Wf14gxx5G5SPTb1JVCbhe4fxNt91xIxCow2zk62tzbYfRe6dfmDmgYHkv2PIEtMJZK8iKLDjFfu2ZUxsKT2A5g1q17og6o9MeXeuFS3mzJXJYFQZd+3UzlFR9qwkFkby9mg5y4XSeMvRLOHPt/H/r5SpEqBE6a9MadZYt61FBV152CUEzd43ihXtrAa0XH9HdsiySBcWI1SpM3mv9rRP0DiLjMUzHw/K1D8TE2f07zW4t/9kvE11tFj/NpICixQAAAAA="
sdmf_old_cap = "URI:SSK:gmjgofw6gan57gwpsow6gtrz3e:5adm6fayxmu3e4lkmfvt6lkkfix34ai2wop2ioqr4bgvvhiol3kq"
sdmf_old_contents = "This is a test file.\n"
def copy_sdmf_shares(self):
# We'll basically be short-circuiting the upload process.
servernums = self.g.servers_by_number.keys()
@ -3740,28 +3740,33 @@ class Interoperability(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixi
si = cap.get_storage_index()
# Now execute each assignment by writing the storage.
for (share, servernum) in assignments:
sharedata = base64.b64decode(self.sdmf_old_shares[share])
storedir = self.get_serverdir(servernum)
storage_path = os.path.join(storedir, "shares",
storage_index_to_dir(si))
fileutil.make_dirs(storage_path)
fileutil.write(os.path.join(storage_path, "%d" % share),
sharedata)
for (shnum, servernum) in assignments:
sharedata = base64.b64decode(self.sdmf_old_shares[shnum])
# This must be a disk backend.
storage_dir = self.get_server(servernum).backend.get_shareset(si)._get_sharedir()
fileutil.make_dirs(storage_dir)
fileutil.write(os.path.join(storage_dir, str(shnum)), sharedata)
# ...and verify that the shares are there.
shares = self.find_uri_shares(self.sdmf_old_cap)
assert len(shares) == 10
d = self.find_uri_shares(self.sdmf_old_cap)
def _got_shares(shares):
assert len(shares) == 10
d.addCallback(_got_shares)
return d
def test_new_downloader_can_read_old_shares(self):
self.basedir = "mutable/Interoperability/new_downloader_can_read_old_shares"
self.set_up_grid()
self.copy_sdmf_shares()
nm = self.g.clients[0].nodemaker
n = nm.create_from_cap(self.sdmf_old_cap)
d = n.download_best_version()
d.addCallback(self.failUnlessEqual, self.sdmf_old_contents)
d = self.copy_sdmf_shares()
def _create_node(ign):
nm = self.g.clients[0].nodemaker
return nm.create_from_cap(self.sdmf_old_cap)
d.addCallback(_create_node)
d.addCallback(lambda n: n.download_best_version())
d.addCallback(lambda res: self.failUnlessEqual(res, self.sdmf_old_contents))
return d
class DifferentEncoding(unittest.TestCase):
def setUp(self):
self._storage = s = FakeStorage()


@ -97,7 +97,6 @@ class TestCase(testutil.SignalMixin, unittest.TestCase):
n = TestNode(basedir)
self.failUnlessEqual(n.get_private_config("already"), "secret")
self.failUnlessEqual(n.get_private_config("not", "default"), "default")
self.failUnlessRaises(MissingConfigEntry, n.get_private_config, "not")
value = n.get_or_create_private_config("new", "start")
self.failUnlessEqual(value, "start")


@ -1,21 +1,26 @@
# -*- coding: utf-8 -*-
import random
from twisted.internet import defer
from twisted.trial import unittest
from allmydata.test import common
from allmydata.monitor import Monitor
from allmydata import check_results
from allmydata.interfaces import NotEnoughSharesError
from allmydata.check_results import CheckAndRepairResults
from allmydata.immutable import upload
from allmydata.util import fileutil
from allmydata.util.consumer import download_to_data
from twisted.internet import defer
from twisted.trial import unittest
import random
from allmydata.test.no_network import GridTestMixin
# We'll allow you to pass this test even if you trigger eighteen times as
# many disk reads and block fetches as would be optimal.
READ_LEEWAY = 18
MAX_DELTA_READS = 10 * READ_LEEWAY # N = 10
timeout=240 # François's ARM box timed out after 120 seconds of Verifier.test_corrupt_crypttext_hashtree
timeout=240 # Franc,ois's ARM box timed out after 120 seconds of Verifier.test_corrupt_crypttext_hashtree
class RepairTestMixin:
def failUnlessIsInstance(self, x, xtype):
@ -86,10 +91,7 @@ class Verifier(GridTestMixin, unittest.TestCase, RepairTestMixin):
self.failIfBigger(delta_reads, 0)
d.addCallback(_check)
def _remove_all(ignored):
for sh in self.find_uri_shares(self.uri):
self.delete_share(sh)
d.addCallback(_remove_all)
d.addCallback(lambda ign: self.delete_all_shares(self.uri))
d.addCallback(lambda ignored: self._stash_counts())
d.addCallback(lambda ignored:
@ -175,6 +177,7 @@ class Verifier(GridTestMixin, unittest.TestCase, RepairTestMixin):
self.basedir = "repairer/Verifier/corrupt_file_verno"
return self._help_test_verify(common._corrupt_file_version_number,
self.judge_visible_corruption)
test_corrupt_file_verno.todo = "Behaviour changed for corrupted shares; test is probably now invalid."
def judge_share_version_incompatibility(self, vr):
# corruption of the share version (inside the container, the 1/2
@ -401,25 +404,22 @@ class Repairer(GridTestMixin, unittest.TestCase, RepairTestMixin,
Monitor(), verify=False))
# test share corruption
def _test_corrupt(ignored):
d.addCallback(lambda ign: self.find_uri_shares(self.uri))
def _test_corrupt(shares):
olddata = {}
shares = self.find_uri_shares(self.uri)
for (shnum, serverid, sharefile) in shares:
olddata[ (shnum, serverid) ] = open(sharefile, "rb").read()
olddata[ (shnum, serverid) ] = fileutil.read(sharefile)
for sh in shares:
self.corrupt_share(sh, common._corrupt_uri_extension)
for (shnum, serverid, sharefile) in shares:
newdata = open(sharefile, "rb").read()
newdata = fileutil.read(sharefile)
self.failIfEqual(olddata[ (shnum, serverid) ], newdata)
d.addCallback(_test_corrupt)
def _remove_all(ignored):
for sh in self.find_uri_shares(self.uri):
self.delete_share(sh)
d.addCallback(_remove_all)
d.addCallback(lambda ignored: self.find_uri_shares(self.uri))
d.addCallback(lambda shares: self.failUnlessEqual(shares, []))
d.addCallback(lambda ign: self.delete_all_shares(self.uri))
d.addCallback(lambda ign: self.find_uri_shares(self.uri))
d.addCallback(lambda shares: self.failUnlessEqual(shares, []))
return d
def test_repair_from_deletion_of_1(self):
@ -445,13 +445,12 @@ class Repairer(GridTestMixin, unittest.TestCase, RepairTestMixin,
self.failIfBigger(delta_allocates, DELTA_WRITES_PER_SHARE)
self.failIf(pre.is_healthy())
self.failUnless(post.is_healthy())
# Now we inspect the filesystem to make sure that it has 10
# shares.
shares = self.find_uri_shares(self.uri)
self.failIf(len(shares) < 10)
d.addCallback(_check_results)
# Now we inspect the filesystem to make sure that it has 10 shares.
d.addCallback(lambda ign: self.find_uri_shares(self.uri))
d.addCallback(lambda shares: self.failIf(len(shares) < 10))
d.addCallback(lambda ignored:
self.c0_filenode.check(Monitor(), verify=True))
d.addCallback(lambda vr: self.failUnless(vr.is_healthy()))
@ -491,12 +490,12 @@ class Repairer(GridTestMixin, unittest.TestCase, RepairTestMixin,
self.failIfBigger(delta_allocates, (DELTA_WRITES_PER_SHARE * 7))
self.failIf(pre.is_healthy())
self.failUnless(post.is_healthy(), post.as_dict())
# Make sure we really have 10 shares.
shares = self.find_uri_shares(self.uri)
self.failIf(len(shares) < 10)
d.addCallback(_check_results)
# Now we inspect the filesystem to make sure that it has 10 shares.
d.addCallback(lambda ign: self.find_uri_shares(self.uri))
d.addCallback(lambda shares: self.failIf(len(shares) < 10))
d.addCallback(lambda ignored:
self.c0_filenode.check(Monitor(), verify=True))
d.addCallback(lambda vr: self.failUnless(vr.is_healthy()))
@ -526,7 +525,7 @@ class Repairer(GridTestMixin, unittest.TestCase, RepairTestMixin,
# happiness setting.
def _delete_some_servers(ignored):
for i in xrange(7):
self.g.remove_server(self.g.servers_by_number[i].my_nodeid)
self.remove_server(i)
assert len(self.g.servers_by_number) == 3
@ -619,8 +618,8 @@ class Repairer(GridTestMixin, unittest.TestCase, RepairTestMixin,
# are two shares that it should upload, if the server fails
# to serve the first share.
self.failIf(after_repair_allocates - before_repair_allocates > (DELTA_WRITES_PER_SHARE * 2), (after_repair_allocates, before_repair_allocates))
self.failIf(prerepairres.is_healthy(), (prerepairres.data, corruptor_func))
self.failUnless(postrepairres.is_healthy(), (postrepairres.data, corruptor_func))
self.failIf(prerepairres.is_healthy(), (prerepairres.get_data(), corruptor_func))
self.failUnless(postrepairres.is_healthy(), (postrepairres.get_data(), corruptor_func))
# Now we inspect the filesystem to make sure that it has 10
# shares.
@ -699,29 +698,40 @@ class Repairer(GridTestMixin, unittest.TestCase, RepairTestMixin,
return d
def test_servers_responding(self):
# This test exercises a bug (ticket #1739) in which the servers-responding list
# did not include servers that responded to the Repair, but not the pre-repair
# filecheck.
self.basedir = "repairer/Repairer/servers_responding"
self.set_up_grid(num_clients=2)
d = self.upload_and_stash()
# now cause one of the servers to not respond during the pre-repair
# filecheck, but then *do* respond to the post-repair filecheck
def _then(ign):
# Cause one of the servers to not respond during the pre-repair
# filecheck, but then *do* respond to the post-repair filecheck.
ss = self.g.servers_by_number[0]
self.g.break_server(ss.my_nodeid, count=1)
self.g.break_server(ss.get_serverid(), count=1)
return self.find_uri_shares(self.uri)
d.addCallback(_then)
def _got_shares(shares):
self.failUnlessEqual(len(shares), 10)
self.delete_shares_numbered(self.uri, [9])
return self.c0_filenode.check_and_repair(Monitor())
d.addCallback(_then)
d.addCallback(_got_shares)
def _check(rr):
# this exercises a bug in which the servers-responding list did
# not include servers that responded to the Repair, but which did
# not respond to the pre-repair filecheck
self.failUnlessIsInstance(rr, CheckAndRepairResults)
prr = rr.get_post_repair_results()
# We expect the repair to have restored all shares...
self.failUnlessEqual(prr.get_share_counter_good(), 10)
# ... and all the servers should be in servers-responding.
expected = set(self.g.get_all_serverids())
responding_set = frozenset([s.get_serverid() for s in prr.get_servers_responding()])
self.failIf(expected - responding_set, expected - responding_set)
self.failIf(responding_set - expected, responding_set - expected)
self.failUnlessEqual(expected,
set([s.get_serverid()
for s in prr.get_servers_responding()]))
responding = set([s.get_serverid() for s in prr.get_servers_responding()])
self.failUnlessEqual(expected, responding,
("\nexpected - responding = %r"
"\nresponding - expected = %r")
% (expected - responding, responding - expected))
d.addCallback(_check)
return d


@ -220,6 +220,43 @@ class BinTahoe(common_util.SignalMixin, unittest.TestCase, RunBinTahoeMixin):
d.addCallback(_cb)
return d
def test_debug_trial(self):
def _check_for_line(lines, result, test):
for l in lines:
if result in l and test in l:
return
self.fail("output (prefixed with '##') does not have a line containing both %r and %r:\n## %s"
% (result, test, "\n## ".join(lines)))
def _check_for_outcome(lines, out, outcome):
self.failUnlessIn(outcome, out, "output (prefixed with '##') does not contain %r:\n## %s"
% (outcome, "\n## ".join(lines)))
d = self.run_bintahoe(['debug', 'trial', '--reporter=verbose',
'allmydata.test.trialtest'])
def _check_failure( (out, err, rc) ):
self.failUnlessEqual(rc, 1)
lines = out.split('\n')
_check_for_line(lines, "[SKIPPED]", "test_skip")
_check_for_line(lines, "[TODO]", "test_todo")
_check_for_line(lines, "[FAIL]", "test_fail")
_check_for_line(lines, "[ERROR]", "test_deferred_error")
_check_for_line(lines, "[ERROR]", "test_error")
_check_for_outcome(lines, out, "FAILED")
d.addCallback(_check_failure)
# the --quiet argument regression-tests a problem in finding which arguments to pass to trial
d.addCallback(lambda ign: self.run_bintahoe(['--quiet', 'debug', 'trial', '--reporter=verbose',
'allmydata.test.trialtest.Success']))
def _check_success( (out, err, rc) ):
self.failUnlessEqual(rc, 0)
lines = out.split('\n')
_check_for_line(lines, "[SKIPPED]", "test_skip")
_check_for_line(lines, "[TODO]", "test_todo")
_check_for_outcome(lines, out, "PASSED")
d.addCallback(_check_success)
return d
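The helpers above do plain substring matching on trial's verbose reporter output; the sample lines below are an assumed illustration of that format, not output captured from this diff:

    # Assumed shape of --reporter=verbose lines (test id and bracketed result on one line):
    sample = [
        "allmydata.test.trialtest.TestCase.test_skip ... [SKIPPED]",
        "allmydata.test.trialtest.TestCase.test_todo ... [TODO]",
    ]
    assert any(("[SKIPPED]" in l and "test_skip" in l) for l in sample)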
class CreateNode(unittest.TestCase):
# exercise "tahoe create-node", create-introducer,

File diff suppressed because it is too large


@ -8,24 +8,28 @@ from twisted.internet import threads # CLI tests use deferToThread
import allmydata
from allmydata import uri
from allmydata.storage.mutable import MutableShareFile
from allmydata.storage.backends.cloud import cloud_common, mock_cloud
from allmydata.storage.server import si_a2b
from allmydata.immutable import offloaded, upload
from allmydata.immutable.literal import LiteralFileNode
from allmydata.immutable.filenode import ImmutableFileNode
from allmydata.util import idlib, mathutil
from allmydata.util import idlib, mathutil, fileutil
from allmydata.util import log, base32
from allmydata.util.verlib import NormalizedVersion
from allmydata.util.encodingutil import quote_output, unicode_to_argv, get_filesystem_encoding
from allmydata.util.fileutil import abspath_expanduser_unicode
from allmydata.util.consumer import MemoryConsumer, download_to_data
from allmydata.scripts import runner
from allmydata.scripts.debug import ChunkedShare
from allmydata.interfaces import IDirectoryNode, IFileNode, \
NoSuchChildError, NoSharesError
from allmydata.monitor import Monitor
from allmydata.mutable.common import NotWriteableError
from allmydata.mutable import layout as mutable_layout
from allmydata.mutable.publish import MutableData
from allmydata.mutable.layout import MAX_MUTABLE_SHARE_SIZE
from allmydata.storage.common import NUM_RE
from allmydata.storage.backends.disk.mutable import MutableDiskShare
import foolscap
from foolscap.api import DeadReferenceError, fireEventually
@ -56,11 +60,12 @@ class CountingDataUploadable(upload.Data):
self.interrupt_after_d.callback(self)
return upload.Data.read(self, length)
class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
class SystemTest(SystemTestMixin, RunBinTahoeMixin):
timeout = 3600 # It takes longer than 960 seconds on Zandr's ARM box.
def test_connections(self):
self.basedir = "system/SystemTest/test_connections"
self.basedir = self.workdir("test_connections")
d = self.set_up_nodes()
self.extra_node = None
d.addCallback(lambda res: self.add_extra_node(self.numclients))
@ -88,11 +93,11 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
del test_connections
def test_upload_and_download_random_key(self):
self.basedir = "system/SystemTest/test_upload_and_download_random_key"
self.basedir = self.workdir("test_upload_and_download_random_key")
return self._test_upload_and_download(convergence=None)
def test_upload_and_download_convergent(self):
self.basedir = "system/SystemTest/test_upload_and_download_convergent"
self.basedir = self.workdir("test_upload_and_download_convergent")
return self._test_upload_and_download(convergence="some convergence string")
def _test_upload_and_download(self, convergence):
@ -358,7 +363,8 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
(bytes_sent, len(DATA)))
n = self.clients[1].create_node_from_uri(cap)
return download_to_data(n)
d.addCallback(_uploaded)
# FIXME: re-enable
#d.addCallback(_uploaded)
def _check(newdata):
self.failUnlessEqual(newdata, DATA)
@ -370,7 +376,8 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
self.failUnlessEqual(files, [])
files = os.listdir(os.path.join(basedir, "CHK_incoming"))
self.failUnlessEqual(files, [])
d.addCallback(_check)
# FIXME: re-enable
#d.addCallback(_check)
return d
d.addCallback(_upload_resumable)
@ -416,19 +423,24 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
storage_index_s = pieces[-1]
storage_index = si_a2b(storage_index_s)
for sharename in filenames:
shnum = int(sharename)
filename = os.path.join(dirpath, sharename)
data = (client_num, storage_index, filename, shnum)
shares.append(data)
# If the share is chunked, only pay attention to the first chunk here.
if '.' not in sharename:
shnum = int(sharename)
filename = os.path.join(dirpath, sharename)
data = (client_num, storage_index, filename, shnum)
shares.append(data)
if not shares:
self.fail("unable to find any share files in %s" % basedir)
return shares
def _corrupt_mutable_share(self, filename, which):
msf = MutableShareFile(filename)
datav = msf.readv([ (0, 1000000) ])
final_share = datav[0]
def _corrupt_mutable_share(self, ign, what, which):
(storageindex, filename, shnum) = what
# Avoid chunking a share that isn't already chunked when using ChunkedShare.pwrite.
share = ChunkedShare(filename, MAX_MUTABLE_SHARE_SIZE)
final_share = share.pread(MutableDiskShare.DATA_OFFSET, 1000000)
assert len(final_share) < 1000000 # ought to be truncated
pieces = mutable_layout.unpack_share(final_share)
(seqnum, root_hash, IV, k, N, segsize, datalen,
verification_key, signature, share_hash_chain, block_hash_tree,
@ -465,11 +477,12 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
block_hash_tree,
share_data,
enc_privkey)
msf.writev( [(0, final_share)], None)
share.pwrite(MutableDiskShare.DATA_OFFSET, final_share)
MutableDiskShare._write_data_length(share, len(final_share))
def test_mutable(self):
self.basedir = "system/SystemTest/test_mutable"
self.basedir = self.workdir("test_mutable")
DATA = "initial contents go here." # 25 bytes % 3 != 0
DATA_uploadable = MutableData(DATA)
NEWDATA = "new contents yay"
@ -504,26 +517,24 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
filename],
stdout=out, stderr=err)
output = out.getvalue()
self.failUnlessEqual(err.getvalue(), "")
self.failUnlessEqual(rc, 0)
try:
self.failUnless("Mutable slot found:\n" in output)
self.failUnless("share_type: SDMF\n" in output)
self.failUnlessIn("Mutable slot found:\n", output)
self.failUnlessIn("share_type: SDMF\n", output)
peerid = idlib.nodeid_b2a(self.clients[client_num].nodeid)
self.failUnless(" WE for nodeid: %s\n" % peerid in output)
self.failUnless(" num_extra_leases: 0\n" in output)
self.failUnless(" secrets are for nodeid: %s\n" % peerid
in output)
self.failUnless(" SDMF contents:\n" in output)
self.failUnless(" seqnum: 1\n" in output)
self.failUnless(" required_shares: 3\n" in output)
self.failUnless(" total_shares: 10\n" in output)
self.failUnless(" segsize: 27\n" in output, (output, filename))
self.failUnless(" datalen: 25\n" in output)
self.failUnlessIn(" WE for nodeid: %s\n" % peerid, output)
self.failUnlessIn(" SDMF contents:\n", output)
self.failUnlessIn(" seqnum: 1\n", output)
self.failUnlessIn(" required_shares: 3\n", output)
self.failUnlessIn(" total_shares: 10\n", output)
self.failUnlessIn(" segsize: 27\n", output)
self.failUnlessIn(" datalen: 25\n", output)
# the exact share_hash_chain nodes depends upon the sharenum,
# and is more of a hassle to compute than I want to deal with
# now
self.failUnless(" share_hash_chain: " in output)
self.failUnless(" block_hash_tree: 1 nodes\n" in output)
self.failUnlessIn(" share_hash_chain: ", output)
self.failUnlessIn(" block_hash_tree: 1 nodes\n", output)
expected = (" verify-cap: URI:SSK-Verifier:%s:" %
base32.b2a(storage_index))
self.failUnless(expected in output)
@ -599,11 +610,13 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
shares = self._find_all_shares(self.basedir)
## sort by share number
#shares.sort( lambda a,b: cmp(a[3], b[3]) )
where = dict([ (shnum, filename)
for (client_num, storage_index, filename, shnum)
where = dict([ (shnum, (storageindex, filename, shnum))
for (client_num, storageindex, filename, shnum)
in shares ])
assert len(where) == 10 # this test is designed for 3-of-10
for shnum, filename in where.items():
d2 = defer.succeed(None)
for shnum, what in where.items():
# shares 7,8,9 are left alone. read will check
# (share_hash_chain, block_hash_tree, share_data). New
# seqnum+R pairs will trigger a check of (seqnum, R, IV,
@ -611,23 +624,23 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
if shnum == 0:
# read: this will trigger "pubkey doesn't match
# fingerprint".
self._corrupt_mutable_share(filename, "pubkey")
self._corrupt_mutable_share(filename, "encprivkey")
d2.addCallback(self._corrupt_mutable_share, what, "pubkey")
d2.addCallback(self._corrupt_mutable_share, what, "encprivkey")
elif shnum == 1:
# triggers "signature is invalid"
self._corrupt_mutable_share(filename, "seqnum")
d2.addCallback(self._corrupt_mutable_share, what, "seqnum")
elif shnum == 2:
# triggers "signature is invalid"
self._corrupt_mutable_share(filename, "R")
d2.addCallback(self._corrupt_mutable_share, what, "R")
elif shnum == 3:
# triggers "signature is invalid"
self._corrupt_mutable_share(filename, "segsize")
d2.addCallback(self._corrupt_mutable_share, what, "segsize")
elif shnum == 4:
self._corrupt_mutable_share(filename, "share_hash_chain")
d2.addCallback(self._corrupt_mutable_share, what, "share_hash_chain")
elif shnum == 5:
self._corrupt_mutable_share(filename, "block_hash_tree")
d2.addCallback(self._corrupt_mutable_share, what, "block_hash_tree")
elif shnum == 6:
self._corrupt_mutable_share(filename, "share_data")
d2.addCallback(self._corrupt_mutable_share, what, "share_data")
# other things to correct: IV, signature
# 7,8,9 are left alone
@ -643,8 +656,8 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
# for one failure mode at a time.
# when we retrieve this, we should get three signature
# failures (where we've mangled seqnum, R, and segsize). The
# pubkey mangling
# failures (where we've mangled seqnum, R, and segsize).
return d2
d.addCallback(_corrupt_shares)
d.addCallback(lambda res: self._newnode3.download_best_version())
@ -719,7 +732,7 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
# plaintext_hash check.
def test_filesystem(self):
self.basedir = "system/SystemTest/test_filesystem"
self.basedir = self.workdir("test_filesystem")
self.data = LARGE_DATA
d = self.set_up_nodes(use_stats_gatherer=True)
def _new_happy_semantics(ign):
@ -754,6 +767,11 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
# P/s2-ro -> /subdir1/subdir2/ (read-only)
d.addCallback(self._check_publish_private)
d.addCallback(self.log, "did _check_publish_private")
# Put back the default PREFERRED_CHUNK_SIZE, because these tests have
# pathologically bad performance with small chunks.
d.addCallback(lambda ign: self._restore_chunk_size())
d.addCallback(self._test_web)
d.addCallback(self._test_control)
d.addCallback(self._test_cli)
@ -765,6 +783,63 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
d.addCallback(self._test_checker)
return d
def test_simple(self):
"""
This test is redundant with test_filesystem, but it is simpler, much shorter, and easier for debugging.
It creates a directory containing a subdirectory, and then puts & gets (immutable, SDMF, MDMF) files in
the subdirectory.
"""
self.basedir = self.workdir("test_simple")
d = self.set_up_nodes(NUMCLIENTS=1, use_stats_gatherer=True)
def _set_happy_and_nodeargs(ign):
for c in self.clients:
# TODO: this hangs with k = n = 10; figure out why.
c.DEFAULT_ENCODING_PARAMETERS['k'] = 3
c.DEFAULT_ENCODING_PARAMETERS['happy'] = 1
c.DEFAULT_ENCODING_PARAMETERS['n'] = 3
self.nodeargs = [
"--node-directory", self.getdir("client0"),
]
d.addCallback(_set_happy_and_nodeargs)
def _publish(ign):
c0 = self.clients[0]
d2 = c0.create_dirnode()
def _made_root(new_dirnode):
self._root_directory_uri = new_dirnode.get_uri()
return c0.create_node_from_uri(self._root_directory_uri)
d2.addCallback(_made_root)
d2.addCallback(lambda root: root.create_subdirectory(u"subdir"))
return d2
d.addCallback(_publish)
formats = ([], ["--format=SDMF"], ["--format=MDMF"])
def _put_and_get(ign, i):
name = "file%d" % i
tahoe_path = "%s/subdir/%s" % (self._root_directory_uri, name)
format_options = formats[i]
fn = os.path.join(self.basedir, name)
data = "%s%d\n" % (LARGE_DATA, i)
fileutil.write(fn, data)
d2 = defer.succeed(None)
d2.addCallback(lambda ign: self._run_cli(self.nodeargs + ["put"] + format_options + [fn, tahoe_path]))
def _check_put( (out, err) ):
self.failUnlessIn("201 Created", err)
self.failUnlessIn("URI:", out)
d2.addCallback(_check_put)
d2.addCallback(lambda ign: self._run_cli(self.nodeargs + ["get"] + [tahoe_path]))
def _check_get( (out, err) ):
self.failUnlessEqual(err, "")
self.failUnlessEqual(out, data)
d2.addCallback(_check_get)
return d2
for i in range(len(formats)):
d.addCallback(_put_and_get, i)
return d
def _test_introweb(self, res):
d = getPage(self.introweb_url, method="GET", followRedirect=True)
def _check(res):
@ -1240,16 +1315,12 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
# exercise more code paths
workdir = os.path.join(self.getdir("client0"), "helper")
incfile = os.path.join(workdir, "CHK_incoming", "spurious")
f = open(incfile, "wb")
f.write("small file")
f.close()
fileutil.write(incfile, "small file")
then = time.time() - 86400*3
now = time.time()
os.utime(incfile, (now, then))
encfile = os.path.join(workdir, "CHK_encoding", "spurious")
f = open(encfile, "wb")
f.write("less small file")
f.close()
fileutil.write(encfile, "less small file")
os.utime(encfile, (now, then))
d.addCallback(_got_helper_status)
# and that the json form exists
@ -1322,8 +1393,9 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
if (len(pieces) >= 4
and pieces[-4] == "storage"
and pieces[-3] == "shares"):
# we're sitting in .../storage/shares/$START/$SINDEX , and there
# are sharefiles here
# We're sitting in .../storage/shares/$START/$SINDEX , and there
# are sharefiles here. Choose one that is an initial chunk.
filenames = filter(NUM_RE.match, filenames)
filename = os.path.join(dirpath, filenames[0])
# peek at the magic to see if it is a chk share
magic = open(filename, "rb").read(4)
@ -1339,24 +1411,25 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
unicode_to_argv(filename)],
stdout=out, stderr=err)
output = out.getvalue()
self.failUnlessEqual(err.getvalue(), "")
self.failUnlessEqual(rc, 0)
# we only upload a single file, so we can assert some things about
# its size and shares.
self.failUnlessIn("share filename: %s" % quote_output(abspath_expanduser_unicode(filename)), output)
self.failUnlessIn("size: %d\n" % len(self.data), output)
self.failUnlessIn("num_segments: 1\n", output)
self.failUnlessIn(" file_size: %d\n" % len(self.data), output)
self.failUnlessIn(" num_segments: 1\n", output)
# segment_size is always a multiple of needed_shares
self.failUnlessIn("segment_size: %d\n" % mathutil.next_multiple(len(self.data), 3), output)
self.failUnlessIn("total_shares: 10\n", output)
self.failUnlessIn(" segment_size: %d\n" % mathutil.next_multiple(len(self.data), 3), output)
self.failUnlessIn(" total_shares: 10\n", output)
# keys which are supposed to be present
for key in ("size", "num_segments", "segment_size",
for key in ("file_size", "num_segments", "segment_size",
"needed_shares", "total_shares",
"codec_name", "codec_params", "tail_codec_params",
#"plaintext_hash", "plaintext_root_hash",
"crypttext_hash", "crypttext_root_hash",
"share_root_hash", "UEB_hash"):
self.failUnlessIn("%s: " % key, output)
self.failUnlessIn(" %s: " % key, output)
self.failUnlessIn(" verify-cap: URI:CHK-Verifier:", output)
# now use its storage index to find the other shares using the
@ -1368,6 +1441,7 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
nodedirs = [self.getdir("client%d" % i) for i in range(self.numclients)]
cmd = ["debug", "find-shares", storage_index_s] + nodedirs
rc = runner.runner(cmd, stdout=out, stderr=err)
self.failUnlessEqual(err.getvalue(), "")
self.failUnlessEqual(rc, 0)
out.seek(0)
sharefiles = [sfn.strip() for sfn in out.readlines()]
@ -1378,10 +1452,11 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
nodedirs = [self.getdir("client%d" % i) for i in range(self.numclients)]
cmd = ["debug", "catalog-shares"] + nodedirs
rc = runner.runner(cmd, stdout=out, stderr=err)
self.failUnlessEqual(err.getvalue(), "")
self.failUnlessEqual(rc, 0)
out.seek(0)
descriptions = [sfn.strip() for sfn in out.readlines()]
self.failUnlessEqual(len(descriptions), 30)
self.failUnlessEqual(len(descriptions), 30, repr((cmd, descriptions)))
matching = [line
for line in descriptions
if line.startswith("CHK %s " % storage_index_s)]
@ -1504,12 +1579,12 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
files = []
datas = []
for i in range(10):
for i in range(11):
fn = os.path.join(self.basedir, "file%d" % i)
files.append(fn)
data = "data to be uploaded: file%d\n" % i
datas.append(data)
open(fn,"wb").write(data)
fileutil.write(fn, data)
def _check_stdout_against((out,err), filenum=None, data=None):
self.failUnlessEqual(err, "")
@ -1523,7 +1598,7 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
d.addCallback(run, "put", files[0], "tahoe-file0")
def _put_out((out,err)):
self.failUnless("URI:LIT:" in out, out)
self.failUnless("201 Created" in err, err)
self.failUnlessIn("201 Created", err)
uri0 = out.strip()
return run(None, "get", uri0)
d.addCallback(_put_out)
@ -1532,13 +1607,23 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
d.addCallback(run, "put", files[1], "subdir/tahoe-file1")
# tahoe put bar tahoe:FOO
d.addCallback(run, "put", files[2], "tahoe:file2")
d.addCallback(run, "put", "--format=SDMF", files[3], "tahoe:file3")
def _check_put_mutable((out,err)):
def _check_put_sdmf((out,err)):
self.failUnlessIn("201 Created", err)
self._mutable_file3_uri = out.strip()
d.addCallback(_check_put_mutable)
d.addCallback(_check_put_sdmf)
d.addCallback(run, "get", "tahoe:file3")
d.addCallback(_check_stdout_against, 3)
d.addCallback(run, "put", "--format=MDMF", files[10], "tahoe:file10")
def _check_put_mdmf((out,err)):
self.failUnlessIn("201 Created", err)
self._mutable_file10_uri = out.strip()
d.addCallback(_check_put_mdmf)
d.addCallback(run, "get", "tahoe:file10")
d.addCallback(_check_stdout_against, 10)
# tahoe put FOO
STDIN_DATA = "This is the file to upload from stdin."
d.addCallback(run, "put", "-", "tahoe-file-stdin", stdin=STDIN_DATA)
@ -1547,7 +1632,7 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
stdin="Other file from stdin.")
d.addCallback(run, "ls")
d.addCallback(_check_ls, ["tahoe-file0", "file2", "file3", "subdir",
d.addCallback(_check_ls, ["tahoe-file0", "file2", "file3", "file10", "subdir",
"tahoe-file-stdin", "from-stdin"])
d.addCallback(run, "ls", "subdir")
d.addCallback(_check_ls, ["tahoe-file1"])
@ -1588,7 +1673,7 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
if "tahoe-file-stdin" in l:
self.failUnless(l.startswith("-r-- "), l)
self.failUnless(" %d " % len(STDIN_DATA) in l)
if "file3" in l:
if "file3" in l or "file10" in l:
self.failUnless(l.startswith("-rw- "), l) # mutable
d.addCallback(_check_ls_l)
@ -1598,6 +1683,8 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
for l in lines:
if "file3" in l:
self.failUnless(self._mutable_file3_uri in l)
if "file10" in l:
self.failUnless(self._mutable_file10_uri in l)
d.addCallback(_check_ls_uri)
d.addCallback(run, "ls", "--readonly-uri")
@ -1606,9 +1693,13 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
for l in lines:
if "file3" in l:
rw_uri = self._mutable_file3_uri
u = uri.from_string_mutable_filenode(rw_uri)
ro_uri = u.get_readonly().to_string()
self.failUnless(ro_uri in l)
elif "file10" in l:
rw_uri = self._mutable_file10_uri
else:
break
u = uri.from_string_mutable_filenode(rw_uri)
ro_uri = u.get_readonly().to_string()
self.failUnless(ro_uri in l)
d.addCallback(_check_ls_rouri)
@ -1676,13 +1767,13 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
# recursive copy: setup
dn = os.path.join(self.basedir, "dir1")
os.makedirs(dn)
open(os.path.join(dn, "rfile1"), "wb").write("rfile1")
open(os.path.join(dn, "rfile2"), "wb").write("rfile2")
open(os.path.join(dn, "rfile3"), "wb").write("rfile3")
fileutil.write(os.path.join(dn, "rfile1"), "rfile1")
fileutil.write(os.path.join(dn, "rfile2"), "rfile2")
fileutil.write(os.path.join(dn, "rfile3"), "rfile3")
sdn2 = os.path.join(dn, "subdir2")
os.makedirs(sdn2)
open(os.path.join(sdn2, "rfile4"), "wb").write("rfile4")
open(os.path.join(sdn2, "rfile5"), "wb").write("rfile5")
fileutil.write(os.path.join(sdn2, "rfile4"), "rfile4")
fileutil.write(os.path.join(sdn2, "rfile5"), "rfile5")
# from disk into tahoe
d.addCallback(run, "cp", "-r", dn, "tahoe:dir1")
@ -1760,7 +1851,7 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
def test_filesystem_with_cli_in_subprocess(self):
# We do this in a separate test so that test_filesystem doesn't skip if we can't run bin/tahoe.
self.basedir = "system/SystemTest/test_filesystem_with_cli_in_subprocess"
self.basedir = self.workdir("test_filesystem_with_cli_in_subprocess")
d = self.set_up_nodes()
def _new_happy_semantics(ign):
for c in self.clients:
@ -1803,43 +1894,6 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
d.addCallback(_check_ls)
return d
def test_debug_trial(self):
def _check_for_line(lines, result, test):
for l in lines:
if result in l and test in l:
return
self.fail("output (prefixed with '##') does not have a line containing both %r and %r:\n## %s"
% (result, test, "\n## ".join(lines)))
def _check_for_outcome(lines, out, outcome):
self.failUnlessIn(outcome, out, "output (prefixed with '##') does not contain %r:\n## %s"
% (outcome, "\n## ".join(lines)))
d = self.run_bintahoe(['debug', 'trial', '--reporter=verbose',
'allmydata.test.trialtest'])
def _check_failure( (out, err, rc) ):
self.failUnlessEqual(rc, 1)
lines = out.split('\n')
_check_for_line(lines, "[SKIPPED]", "test_skip")
_check_for_line(lines, "[TODO]", "test_todo")
_check_for_line(lines, "[FAIL]", "test_fail")
_check_for_line(lines, "[ERROR]", "test_deferred_error")
_check_for_line(lines, "[ERROR]", "test_error")
_check_for_outcome(lines, out, "FAILED")
d.addCallback(_check_failure)
# the --quiet argument regression-tests a problem in finding which arguments to pass to trial
d.addCallback(lambda ign: self.run_bintahoe(['--quiet', 'debug', 'trial', '--reporter=verbose',
'allmydata.test.trialtest.Success']))
def _check_success( (out, err, rc) ):
self.failUnlessEqual(rc, 0)
lines = out.split('\n')
_check_for_line(lines, "[SKIPPED]", "test_skip")
_check_for_line(lines, "[TODO]", "test_todo")
_check_for_outcome(lines, out, "PASSED")
d.addCallback(_check_success)
return d
def _run_cli(self, argv, stdin=""):
#print "CLI:", argv
stdout, stderr = StringIO(), StringIO()
@ -1888,6 +1942,32 @@ class SystemTest(SystemTestMixin, RunBinTahoeMixin, unittest.TestCase):
return d
class SystemWithDiskBackend(SystemTest, unittest.TestCase):
# The disk backend can use default options.
def _restore_chunk_size(self):
pass
class SystemWithCloudBackendAndMockContainer(SystemTest, unittest.TestCase):
def setUp(self):
SystemTest.setUp(self)
# A smaller chunk size causes the tests to exercise more cases in the chunking implementation.
self.patch(cloud_common, 'PREFERRED_CHUNK_SIZE', 500)
# This causes ContainerListMixin to be exercised.
self.patch(mock_cloud, 'MAX_KEYS', 2)
def _restore_chunk_size(self):
self.patch(cloud_common, 'PREFERRED_CHUNK_SIZE', cloud_common.DEFAULT_PREFERRED_CHUNK_SIZE)
def _get_extra_config(self, i):
# all nodes are storage servers
return ("[storage]\n"
"backend = mock_cloud\n")
class Connections(SystemTestMixin, unittest.TestCase):
def test_rref(self):
if NormalizedVersion(foolscap.__version__) < NormalizedVersion('0.6.4'):


@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
import os, shutil
import os
from cStringIO import StringIO
from twisted.trial import unittest
from twisted.python.failure import Failure
@ -11,7 +11,7 @@ import allmydata # for __full_version__
from allmydata import uri, monitor, client
from allmydata.immutable import upload, encode
from allmydata.interfaces import FileTooLargeError, UploadUnhappinessError
from allmydata.util import log, base32
from allmydata.util import base32, fileutil
from allmydata.util.assertutil import precondition
from allmydata.util.deferredutil import DeferredListShouldSucceed
from allmydata.test.no_network import GridTestMixin
@ -19,7 +19,6 @@ from allmydata.test.common_util import ShouldFailMixin
from allmydata.util.happinessutil import servers_of_happiness, \
shares_by_server, merge_servers
from allmydata.storage_client import StorageFarmBroker
from allmydata.storage.server import storage_index_to_dir
from allmydata.client import Client
MiB = 1024*1024
@ -755,7 +754,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
servertoshnums = {} # k: server, v: set(shnum)
for i, c in self.g.servers_by_number.iteritems():
for (dirp, dirns, fns) in os.walk(c.sharedir):
for (dirp, dirns, fns) in os.walk(c.backend._sharedir):
for fn in fns:
try:
sharenum = int(fn)
@ -815,40 +814,17 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
h = self.g.clients[0].DEFAULT_ENCODING_PARAMETERS['happy']
return is_happy_enough(servertoshnums, h, k)
# for compatibility, before we refactor this class to use the methods in GridTestMixin
def _add_server(self, server_number, readonly=False):
assert self.g, "I tried to find a grid at self.g, but failed"
ss = self.g.make_server(server_number, readonly)
log.msg("just created a server, number: %s => %s" % (server_number, ss,))
self.g.add_server(server_number, ss)
def _add_server_with_share(self, server_number, share_number=None,
readonly=False):
self._add_server(server_number, readonly)
if share_number is not None:
self._copy_share_to_server(share_number, server_number)
return self.add_server(server_number, readonly=readonly)
def _add_server_with_share(self, server_number, share_number=None, readonly=False):
self.add_server_with_share(self.uri, server_number=server_number,
share_number=share_number, readonly=readonly)
def _copy_share_to_server(self, share_number, server_number):
ss = self.g.servers_by_number[server_number]
# Copy share i from the directory associated with the first
# storage server to the directory associated with this one.
assert self.g, "I tried to find a grid at self.g, but failed"
assert self.shares, "I tried to find shares at self.shares, but failed"
old_share_location = self.shares[share_number][2]
new_share_location = os.path.join(ss.storedir, "shares")
si = uri.from_string(self.uri).get_storage_index()
new_share_location = os.path.join(new_share_location,
storage_index_to_dir(si))
if not os.path.exists(new_share_location):
os.makedirs(new_share_location)
new_share_location = os.path.join(new_share_location,
str(share_number))
if old_share_location != new_share_location:
shutil.copy(old_share_location, new_share_location)
shares = self.find_uri_shares(self.uri)
# Make sure that the storage server has the share.
self.failUnless((share_number, ss.my_nodeid, new_share_location)
in shares)
self.copy_share_to_server(self.uri, server_number=server_number,
share_number=share_number)
def _setup_grid(self):
"""
@ -999,8 +975,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
readonly=True))
# Remove the first share from server 0.
def _remove_share_0_from_server_0():
share_location = self.shares[0][2]
os.remove(share_location)
fileutil.remove(self.shares[0][2])
d.addCallback(lambda ign:
_remove_share_0_from_server_0())
# Set happy = 4 in the client.
@ -1129,8 +1104,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
self._copy_share_to_server(i, 2)
d.addCallback(_copy_shares)
# Remove the first server, and add a placeholder with share 0
d.addCallback(lambda ign:
self.g.remove_server(self.g.servers_by_number[0].my_nodeid))
d.addCallback(lambda ign: self.remove_server(0))
d.addCallback(lambda ign:
self._add_server_with_share(server_number=4, share_number=0))
# Now try uploading.
@ -1161,8 +1135,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
d.addCallback(lambda ign:
self._add_server(server_number=4))
d.addCallback(_copy_shares)
d.addCallback(lambda ign:
self.g.remove_server(self.g.servers_by_number[0].my_nodeid))
d.addCallback(lambda ign: self.remove_server(0))
d.addCallback(_reset_encoding_parameters)
d.addCallback(lambda client:
client.upload(upload.Data("data" * 10000, convergence="")))
@ -1224,8 +1197,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
self._copy_share_to_server(i, 2)
d.addCallback(_copy_shares)
# Remove server 0, and add another in its place
d.addCallback(lambda ign:
self.g.remove_server(self.g.servers_by_number[0].my_nodeid))
d.addCallback(lambda ign: self.remove_server(0))
d.addCallback(lambda ign:
self._add_server_with_share(server_number=4, share_number=0,
readonly=True))
@ -1266,8 +1238,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
for i in xrange(1, 10):
self._copy_share_to_server(i, 2)
d.addCallback(_copy_shares)
d.addCallback(lambda ign:
self.g.remove_server(self.g.servers_by_number[0].my_nodeid))
d.addCallback(lambda ign: self.remove_server(0))
def _reset_encoding_parameters(ign, happy=4):
client = self.g.clients[0]
client.DEFAULT_ENCODING_PARAMETERS['happy'] = happy
@ -1303,10 +1274,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
# remove the original server
# (necessary to ensure that the Tahoe2ServerSelector will distribute
# all the shares)
def _remove_server(ign):
server = self.g.servers_by_number[0]
self.g.remove_server(server.my_nodeid)
d.addCallback(_remove_server)
d.addCallback(lambda ign: self.remove_server(0))
# This should succeed; we still have 4 servers, and the
# happiness of the upload is 4.
d.addCallback(lambda ign:
@ -1318,7 +1286,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
d.addCallback(lambda ign:
self._setup_and_upload())
d.addCallback(_do_server_setup)
d.addCallback(_remove_server)
d.addCallback(lambda ign: self.remove_server(0))
d.addCallback(lambda ign:
self.shouldFail(UploadUnhappinessError,
"test_dropped_servers_in_encoder",
@ -1340,14 +1308,14 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
self._add_server_with_share(4, 7, readonly=True)
self._add_server_with_share(5, 8, readonly=True)
d.addCallback(_do_server_setup_2)
d.addCallback(_remove_server)
d.addCallback(lambda ign: self.remove_server(0))
d.addCallback(lambda ign:
self._do_upload_with_broken_servers(1))
d.addCallback(_set_basedir)
d.addCallback(lambda ign:
self._setup_and_upload())
d.addCallback(_do_server_setup_2)
d.addCallback(_remove_server)
d.addCallback(lambda ign: self.remove_server(0))
d.addCallback(lambda ign:
self.shouldFail(UploadUnhappinessError,
"test_dropped_servers_in_encoder",
@ -1561,8 +1529,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
for i in xrange(1, 10):
self._copy_share_to_server(i, 1)
d.addCallback(_copy_shares)
d.addCallback(lambda ign:
self.g.remove_server(self.g.servers_by_number[0].my_nodeid))
d.addCallback(lambda ign: self.remove_server(0))
def _prepare_client(ign):
client = self.g.clients[0]
client.DEFAULT_ENCODING_PARAMETERS['happy'] = 4
@ -1584,7 +1551,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
def _setup(ign):
for i in xrange(1, 11):
self._add_server(server_number=i)
self.g.remove_server(self.g.servers_by_number[0].my_nodeid)
self.remove_server(0)
c = self.g.clients[0]
# We set happy to an unsatisfiable value so that we can check the
# counting in the exception message. The same progress message
@ -1611,7 +1578,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
self._add_server(server_number=i)
self._add_server(server_number=11, readonly=True)
self._add_server(server_number=12, readonly=True)
self.g.remove_server(self.g.servers_by_number[0].my_nodeid)
self.remove_server(0)
c = self.g.clients[0]
c.DEFAULT_ENCODING_PARAMETERS['happy'] = 45
return c
@ -1639,8 +1606,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
# the first one that the selector sees.
for i in xrange(10):
self._copy_share_to_server(i, 9)
# Remove server 0, and its contents
self.g.remove_server(self.g.servers_by_number[0].my_nodeid)
self.remove_server(0)
# Make happiness unsatisfiable
c = self.g.clients[0]
c.DEFAULT_ENCODING_PARAMETERS['happy'] = 45
@ -1660,7 +1626,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
def _then(ign):
for i in xrange(1, 11):
self._add_server(server_number=i, readonly=True)
self.g.remove_server(self.g.servers_by_number[0].my_nodeid)
self.remove_server(0)
c = self.g.clients[0]
c.DEFAULT_ENCODING_PARAMETERS['k'] = 2
c.DEFAULT_ENCODING_PARAMETERS['happy'] = 4
@ -1696,8 +1662,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
self._add_server(server_number=4, readonly=True))
d.addCallback(lambda ign:
self._add_server(server_number=5, readonly=True))
d.addCallback(lambda ign:
self.g.remove_server(self.g.servers_by_number[0].my_nodeid))
d.addCallback(lambda ign: self.remove_server(0))
def _reset_encoding_parameters(ign, happy=4):
client = self.g.clients[0]
client.DEFAULT_ENCODING_PARAMETERS['happy'] = happy
@ -1732,7 +1697,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
d.addCallback(lambda ign:
self._add_server(server_number=2))
def _break_server_2(ign):
serverid = self.g.servers_by_number[2].my_nodeid
serverid = self.get_server(2).get_serverid()
self.g.break_server(serverid)
d.addCallback(_break_server_2)
d.addCallback(lambda ign:
@ -1741,8 +1706,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
self._add_server(server_number=4, readonly=True))
d.addCallback(lambda ign:
self._add_server(server_number=5, readonly=True))
d.addCallback(lambda ign:
self.g.remove_server(self.g.servers_by_number[0].my_nodeid))
d.addCallback(lambda ign: self.remove_server(0))
d.addCallback(_reset_encoding_parameters)
d.addCallback(lambda client:
self.shouldFail(UploadUnhappinessError, "test_selection_exceptions",
@ -1853,8 +1817,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
# Copy shares
self._copy_share_to_server(1, 1)
self._copy_share_to_server(2, 1)
# Remove server 0
self.g.remove_server(self.g.servers_by_number[0].my_nodeid)
self.remove_server(0)
client = self.g.clients[0]
client.DEFAULT_ENCODING_PARAMETERS['happy'] = 3
return client
@ -1887,9 +1850,9 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
self._copy_share_to_server(3, 1)
storedir = self.get_serverdir(0)
# remove the storedir, wiping out any existing shares
shutil.rmtree(storedir)
fileutil.rm_dir(storedir)
# create an empty storedir to replace the one we just removed
os.mkdir(storedir)
fileutil.make_dirs(storedir)
client = self.g.clients[0]
client.DEFAULT_ENCODING_PARAMETERS['happy'] = 4
return client
@ -1928,9 +1891,9 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
self._copy_share_to_server(3, 1)
storedir = self.get_serverdir(0)
# remove the storedir, wiping out any existing shares
shutil.rmtree(storedir)
fileutil.rm_dir(storedir)
# create an empty storedir to replace the one we just removed
os.mkdir(storedir)
fileutil.make_dirs(storedir)
client = self.g.clients[0]
client.DEFAULT_ENCODING_PARAMETERS['happy'] = 4
return client
@ -1968,8 +1931,7 @@ class EncodingParameters(GridTestMixin, unittest.TestCase, SetDEPMixin,
readonly=True)
self._add_server_with_share(server_number=4, share_number=3,
readonly=True)
# Remove server 0.
self.g.remove_server(self.g.servers_by_number[0].my_nodeid)
self.remove_server(0)
# Set the client appropriately
c = self.g.clients[0]
c.DEFAULT_ENCODING_PARAMETERS['happy'] = 4


@ -2,7 +2,9 @@
def foo(): pass # keep the line number constant
import os, time, sys
from collections import deque
from StringIO import StringIO
from twisted.trial import unittest
from twisted.internet import defer, reactor
from twisted.python.failure import Failure
@ -12,7 +14,7 @@ from pycryptopp.hash.sha256 import SHA256 as _hash
from allmydata.util import base32, idlib, humanreadable, mathutil, hashutil
from allmydata.util import assertutil, fileutil, deferredutil, abbreviate
from allmydata.util import limiter, time_format, pollmixin, cachedir
from allmydata.util import statistics, dictutil, pipeline
from allmydata.util import statistics, dictutil, listutil, pipeline
from allmydata.util import log as tahoe_log
from allmydata.util.spans import Spans, overlap, DataSpans
@ -557,7 +559,7 @@ class PollMixinTests(unittest.TestCase):
d.addCallbacks(_suc, _err)
return d
class DeferredUtilTests(unittest.TestCase):
class DeferredUtilTests(unittest.TestCase, deferredutil.WaitForDelayedCallsMixin):
def test_gather_results(self):
d1 = defer.Deferred()
d2 = defer.Deferred()
@ -597,6 +599,21 @@ class DeferredUtilTests(unittest.TestCase):
self.failUnless(isinstance(f, Failure))
self.failUnless(f.check(ValueError))
def test_wait_for_delayed_calls(self):
"""
This tests that 'wait_for_delayed_calls' does in fact wait for a
delayed call that is active when the test returns. If it didn't,
Trial would report an unclean reactor error for this test.
"""
def _trigger():
#print "trigger"
pass
reactor.callLater(0.1, _trigger)
d = defer.succeed(None)
d.addBoth(self.wait_for_delayed_calls)
return d
class HashUtilTests(unittest.TestCase):
def test_random_key(self):
@ -1416,6 +1433,13 @@ class DictUtil(unittest.TestCase):
self.failUnlessEqual(d["one"], 1)
self.failUnlessEqual(d.get_aux("one"), None)
class ListUtil(unittest.TestCase):
def test_concat(self):
x = deque([[1, 2], (), xrange(3, 6)])
self.failUnlessEqual(listutil.concat(x), [1, 2, 3, 4, 5])
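listutil.concat itself does not appear in this diff; the test above pins down its behaviour (flatten one level of nesting into a list). A minimal sketch consistent with that test:

    def concat(seqs):
        # Sketch only: accept any iterable of iterables and return one flat list.
        result = []
        for seq in seqs:
            result.extend(seq)
        return result

    # concat(deque([[1, 2], (), xrange(3, 6)])) -> [1, 2, 3, 4, 5]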
class Pipeline(unittest.TestCase):
def pause(self, *args, **kwargs):
d = defer.Deferred()


@ -15,7 +15,6 @@ from nevow.util import escapeToXML
from nevow import rend
from allmydata import interfaces, uri, webish, dirnode
from allmydata.storage.shares import get_share_file
from allmydata.storage_client import StorageFarmBroker, StubServer
from allmydata.immutable import upload
from allmydata.immutable.downloader.status import DownloadStatus
@ -39,6 +38,8 @@ from allmydata.test.common_web import HTTPClientGETFactory, \
HTTPClientHEADFactory
from allmydata.client import Client, SecretHolder
from allmydata.introducer import IntroducerNode
from allmydata.storage.expiration import ExpirationPolicy
# create a fake uploader/downloader, and a couple of fake dirnodes, then
# create a webserver that works against them
@ -200,7 +201,7 @@ class FakeBucketCounter(object):
"cycle-in-progress": False,
"remaining-wait-time": 0}
class FakeLeaseChecker(object):
class FakeAccountingCrawler(object):
def __init__(self):
self.expiration_enabled = False
self.mode = "age"
@ -217,12 +218,26 @@ class FakeStorageServer(service.MultiService):
name = 'storage'
def __init__(self, nodeid, nickname):
service.MultiService.__init__(self)
self.my_nodeid = nodeid
self.serverid = nodeid
self.nickname = nickname
self.bucket_counter = FakeBucketCounter()
self.lease_checker = FakeLeaseChecker()
self.accounting_crawler = FakeAccountingCrawler()
self.accountant = FakeAccountant()
self.expiration_policy = ExpirationPolicy(enabled=False)
def get_stats(self):
return {"storage_server.accepting_immutable_shares": False}
def get_serverid(self):
return self.serverid
def get_bucket_counter(self):
return self.bucket_counter
def get_accounting_crawler(self):
return self.accounting_crawler
def get_expiration_policy(self):
return self.expiration_policy
class FakeAccountant:
def get_all_accounts(self):
return []
class FakeClient(Client):
def __init__(self):
@ -252,7 +267,9 @@ class FakeClient(Client):
None, None, None)
self.nodemaker.all_contents = self.all_contents
self.mutable_file_default = SDMF_VERSION
self.addService(FakeStorageServer(self.nodeid, self.nickname))
server = FakeStorageServer(self.nodeid, self.nickname)
self.accountant = server.accountant
self.addService(server)
def get_long_nodeid(self):
return "v0-nodeid"
@ -4548,20 +4565,22 @@ class Grid(GridTestMixin, WebErrorMixin, ShouldFailMixin, testutil.ReallyEqualMi
self.fileurls[which] = "uri/" + urllib.quote(self.uris[which])
d.addCallback(_compute_fileurls)
def _clobber_shares(ignored):
good_shares = self.find_uri_shares(self.uris["good"])
self.failUnlessReallyEqual(len(good_shares), 10)
sick_shares = self.find_uri_shares(self.uris["sick"])
os.unlink(sick_shares[0][2])
dead_shares = self.find_uri_shares(self.uris["dead"])
d.addCallback(lambda ign: self.find_uri_shares(self.uris["good"]))
d.addCallback(lambda good_shares: self.failUnlessReallyEqual(len(good_shares), 10))
d.addCallback(lambda ign: self.find_uri_shares(self.uris["sick"]))
d.addCallback(lambda sick_shares: fileutil.remove(sick_shares[0][2]))
d.addCallback(lambda ign: self.find_uri_shares(self.uris["dead"]))
def _remove_dead_shares(dead_shares):
for i in range(1, 10):
os.unlink(dead_shares[i][2])
c_shares = self.find_uri_shares(self.uris["corrupt"])
fileutil.remove(dead_shares[i][2])
d.addCallback(_remove_dead_shares)
d.addCallback(lambda ign: self.find_uri_shares(self.uris["corrupt"]))
def _corrupt_shares(c_shares):
cso = CorruptShareOptions()
cso.stdout = StringIO()
cso.parseOptions([c_shares[0][2]])
corrupt_share(cso)
d.addCallback(_clobber_shares)
d.addCallback(_corrupt_shares)
d.addCallback(self.CHECK, "good", "t=check")
def _got_html_good(res):
@ -4688,20 +4707,22 @@ class Grid(GridTestMixin, WebErrorMixin, ShouldFailMixin, testutil.ReallyEqualMi
self.fileurls[which] = "uri/" + urllib.quote(self.uris[which])
d.addCallback(_compute_fileurls)
def _clobber_shares(ignored):
good_shares = self.find_uri_shares(self.uris["good"])
self.failUnlessReallyEqual(len(good_shares), 10)
sick_shares = self.find_uri_shares(self.uris["sick"])
os.unlink(sick_shares[0][2])
dead_shares = self.find_uri_shares(self.uris["dead"])
d.addCallback(lambda ign: self.find_uri_shares(self.uris["good"]))
d.addCallback(lambda good_shares: self.failUnlessReallyEqual(len(good_shares), 10))
d.addCallback(lambda ign: self.find_uri_shares(self.uris["sick"]))
d.addCallback(lambda sick_shares: fileutil.remove(sick_shares[0][2]))
d.addCallback(lambda ign: self.find_uri_shares(self.uris["dead"]))
def _remove_dead_shares(dead_shares):
for i in range(1, 10):
os.unlink(dead_shares[i][2])
c_shares = self.find_uri_shares(self.uris["corrupt"])
fileutil.remove(dead_shares[i][2])
d.addCallback(_remove_dead_shares)
d.addCallback(lambda ign: self.find_uri_shares(self.uris["corrupt"]))
def _corrupt_shares(c_shares):
cso = CorruptShareOptions()
cso.stdout = StringIO()
cso.parseOptions([c_shares[0][2]])
corrupt_share(cso)
d.addCallback(_clobber_shares)
d.addCallback(_corrupt_shares)
d.addCallback(self.CHECK, "good", "t=check&repair=true")
def _got_html_good(res):
@ -4757,10 +4778,8 @@ class Grid(GridTestMixin, WebErrorMixin, ShouldFailMixin, testutil.ReallyEqualMi
self.fileurls[which] = "uri/" + urllib.quote(self.uris[which])
d.addCallback(_compute_fileurls)
def _clobber_shares(ignored):
sick_shares = self.find_uri_shares(self.uris["sick"])
os.unlink(sick_shares[0][2])
d.addCallback(_clobber_shares)
d.addCallback(lambda ign: self.find_uri_shares(self.uris["sick"]))
d.addCallback(lambda sick_shares: fileutil.remove(sick_shares[0][2]))
d.addCallback(self.CHECK, "sick", "t=check&repair=true&output=json")
def _got_json_sick(res):
@ -5073,9 +5092,7 @@ class Grid(GridTestMixin, WebErrorMixin, ShouldFailMixin, testutil.ReallyEqualMi
future_node = UnknownNode(unknown_rwcap, unknown_rocap)
d.addCallback(lambda ign: self.rootnode.set_node(u"future", future_node))
def _clobber_shares(ignored):
self.delete_shares_numbered(self.uris["sick"], [0,1])
d.addCallback(_clobber_shares)
d.addCallback(lambda ign: self.delete_shares_numbered(self.uris["sick"], [0,1]))
# root
# root/good
@ -5247,21 +5264,22 @@ class Grid(GridTestMixin, WebErrorMixin, ShouldFailMixin, testutil.ReallyEqualMi
#d.addCallback(lambda fn: self.rootnode.set_node(u"corrupt", fn))
#d.addCallback(_stash_uri, "corrupt")
def _clobber_shares(ignored):
good_shares = self.find_uri_shares(self.uris["good"])
self.failUnlessReallyEqual(len(good_shares), 10)
sick_shares = self.find_uri_shares(self.uris["sick"])
os.unlink(sick_shares[0][2])
#dead_shares = self.find_uri_shares(self.uris["dead"])
#for i in range(1, 10):
# os.unlink(dead_shares[i][2])
#c_shares = self.find_uri_shares(self.uris["corrupt"])
#cso = CorruptShareOptions()
#cso.stdout = StringIO()
#cso.parseOptions([c_shares[0][2]])
#corrupt_share(cso)
d.addCallback(_clobber_shares)
d.addCallback(lambda ign: self.find_uri_shares(self.uris["good"]))
d.addCallback(lambda good_shares: self.failUnlessReallyEqual(len(good_shares), 10))
d.addCallback(lambda ign: self.find_uri_shares(self.uris["sick"]))
d.addCallback(lambda sick_shares: fileutil.remove(sick_shares[0][2]))
#d.addCallback(lambda ign: self.find_uri_shares(self.uris["dead"]))
#def _remove_dead_shares(dead_shares):
# for i in range(1, 10):
# fileutil.remove(dead_shares[i][2])
#d.addCallback(_remove_dead_shares)
#d.addCallback(lambda ign: self.find_uri_shares(self.uris["corrupt"]))
#def _corrupt_shares(c_shares):
# cso = CorruptShareOptions()
# cso.stdout = StringIO()
# cso.parseOptions([c_shares[0][2]])
# corrupt_share(cso)
#d.addCallback(_corrupt_shares)
# root
# root/good CHK, 10 shares
@ -5310,25 +5328,24 @@ class Grid(GridTestMixin, WebErrorMixin, ShouldFailMixin, testutil.ReallyEqualMi
d.addErrback(self.explain_web_error)
return d
def _count_leases(self, ignored, which):
def _assert_leasecount(self, ign, which, expected):
u = self.uris[which]
shares = self.find_uri_shares(u)
lease_counts = []
for shnum, serverid, fn in shares:
sf = get_share_file(fn)
num_leases = len(list(sf.get_leases()))
lease_counts.append( (fn, num_leases) )
return lease_counts
si = uri.from_string(u).get_storage_index()
num_leases = 0
for server in self.g.servers_by_number.values():
ss = server.get_accountant().get_anonymous_account()
ss2 = server.get_accountant().get_starter_account()
num_leases += len(ss.get_leases(si)) + len(ss2.get_leases(si))
def _assert_leasecount(self, lease_counts, expected):
for (fn, num_leases) in lease_counts:
if num_leases != expected:
self.fail("expected %d leases, have %d, on %s" %
(expected, num_leases, fn))
if num_leases != expected:
self.fail("expected %d leases, have %d, on '%s'" %
(expected, num_leases, which))
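The expected values in these lease tests are now grid-wide totals rather than per-share-file counts; a short worked note on where N comes from, under the assumption of the default 3-of-10 encoding with one share per server:

    # Assumed accounting: one upload leaves one lease per share-holding server for that file,
    # so with N servers the total is N. The commented-out 2*N assertions below correspond to a
    # second client's account adding its own lease on each server.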
def test_add_lease(self):
self.basedir = "web/Grid/add_lease"
self.set_up_grid(num_clients=2)
N = 10
self.set_up_grid(num_clients=2, num_servers=N)
c0 = self.g.clients[0]
self.uris = {}
DATA = "data" * 100
@ -5352,12 +5369,9 @@ class Grid(GridTestMixin, WebErrorMixin, ShouldFailMixin, testutil.ReallyEqualMi
self.fileurls[which] = "uri/" + urllib.quote(self.uris[which])
d.addCallback(_compute_fileurls)
d.addCallback(self._count_leases, "one")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._count_leases, "two")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._count_leases, "mutable")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._assert_leasecount, "one", N)
d.addCallback(self._assert_leasecount, "two", N)
d.addCallback(self._assert_leasecount, "mutable", N)
d.addCallback(self.CHECK, "one", "t=check") # no add-lease
def _got_html_good(res):
@ -5365,63 +5379,45 @@ class Grid(GridTestMixin, WebErrorMixin, ShouldFailMixin, testutil.ReallyEqualMi
self.failIfIn("Not Healthy", res)
d.addCallback(_got_html_good)
d.addCallback(self._count_leases, "one")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._count_leases, "two")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._count_leases, "mutable")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._assert_leasecount, "one", N)
d.addCallback(self._assert_leasecount, "two", N)
d.addCallback(self._assert_leasecount, "mutable", N)
# this CHECK uses the original client, which uses the same
# lease-secrets, so it will just renew the original lease
d.addCallback(self.CHECK, "one", "t=check&add-lease=true")
d.addCallback(_got_html_good)
d.addCallback(self._count_leases, "one")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._count_leases, "two")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._count_leases, "mutable")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._assert_leasecount, "one", N)
d.addCallback(self._assert_leasecount, "two", N)
d.addCallback(self._assert_leasecount, "mutable", N)
# this CHECK uses an alternate client, which adds a second lease
d.addCallback(self.CHECK, "one", "t=check&add-lease=true", clientnum=1)
d.addCallback(_got_html_good)
d.addCallback(self._count_leases, "one")
d.addCallback(self._assert_leasecount, 2)
d.addCallback(self._count_leases, "two")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._count_leases, "mutable")
d.addCallback(self._assert_leasecount, 1)
# XXX why are the checks below commented out? --Zooko 2012-11-27
d.addCallback(self.CHECK, "mutable", "t=check&add-lease=true")
d.addCallback(_got_html_good)
#d.addCallback(self._assert_leasecount, "one", 2*N)
d.addCallback(self._count_leases, "one")
d.addCallback(self._assert_leasecount, 2)
d.addCallback(self._count_leases, "two")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._count_leases, "mutable")
d.addCallback(self._assert_leasecount, 1)
#d.addCallback(self.CHECK, "mutable", "t=check&add-lease=true")
#d.addCallback(_got_html_good)
#d.addCallback(self._assert_leasecount, "mutable", N)
d.addCallback(self.CHECK, "mutable", "t=check&add-lease=true",
clientnum=1)
d.addCallback(_got_html_good)
d.addCallback(self._count_leases, "one")
d.addCallback(self._assert_leasecount, 2)
d.addCallback(self._count_leases, "two")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._count_leases, "mutable")
d.addCallback(self._assert_leasecount, 2)
#d.addCallback(self._assert_leasecount, "mutable", 2*N)
d.addErrback(self.explain_web_error)
return d
def test_deep_add_lease(self):
self.basedir = "web/Grid/deep_add_lease"
self.set_up_grid(num_clients=2)
N = 10
self.set_up_grid(num_clients=2, num_servers=N)
c0 = self.g.clients[0]
self.uris = {}
self.fileurls = {}
@ -5456,33 +5452,24 @@ class Grid(GridTestMixin, WebErrorMixin, ShouldFailMixin, testutil.ReallyEqualMi
self.failUnlessReallyEqual(len(units), 4+1)
d.addCallback(_done)
d.addCallback(self._count_leases, "root")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._count_leases, "one")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._count_leases, "mutable")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._assert_leasecount, "root", N)
d.addCallback(self._assert_leasecount, "one", N)
d.addCallback(self._assert_leasecount, "mutable", N)
d.addCallback(self.CHECK, "root", "t=stream-deep-check&add-lease=true")
d.addCallback(_done)
d.addCallback(self._count_leases, "root")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._count_leases, "one")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._count_leases, "mutable")
d.addCallback(self._assert_leasecount, 1)
d.addCallback(self._assert_leasecount, "root", N)
d.addCallback(self._assert_leasecount, "one", N)
d.addCallback(self._assert_leasecount, "mutable", N)
d.addCallback(self.CHECK, "root", "t=stream-deep-check&add-lease=true",
clientnum=1)
d.addCallback(_done)
d.addCallback(self._count_leases, "root")
d.addCallback(self._assert_leasecount, 2)
d.addCallback(self._count_leases, "one")
d.addCallback(self._assert_leasecount, 2)
d.addCallback(self._count_leases, "mutable")
d.addCallback(self._assert_leasecount, 2)
#d.addCallback(self._assert_leasecount, "root", 2*N)
#d.addCallback(self._assert_leasecount, "one", 2*N)
#d.addCallback(self._assert_leasecount, "mutable", 2*N)
d.addErrback(self.explain_web_error)
return d


@ -0,0 +1,66 @@
import os, sys
import sqlite3
from sqlite3 import IntegrityError
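# The no-op list on the next line presumably just references IntegrityError so that it can be
# re-exported from this module without pyflakes flagging the import as unused.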
[IntegrityError]
class DBError(Exception):
pass
def get_db(dbfile, stderr=sys.stderr,
create_version=(None, None), updaters={}, just_create=False, dbname="db",
journal_mode=None, synchronous=None):
"""Open or create the given db file. The parent directory must exist.
create_version=(SCHEMA, VERNUM), and SCHEMA must have a 'version' table.
Updaters is a {newver: commands} mapping, where e.g. updaters[2] is used
to get from ver=1 to ver=2. Returns a (sqlite3,db) tuple, or raises
DBError.
"""
must_create = not os.path.exists(dbfile)
try:
db = sqlite3.connect(dbfile)
except (EnvironmentError, sqlite3.OperationalError), e:
raise DBError("Unable to create/open %s file %s: %s" % (dbname, dbfile, e))
schema, target_version = create_version
c = db.cursor()
# Enabling foreign keys allows stricter integrity checking.
# The default is unspecified according to <http://www.sqlite.org/foreignkeys.html#fk_enable>.
c.execute("PRAGMA foreign_keys = ON;")
if journal_mode is not None:
c.execute("PRAGMA journal_mode = %s;" % (journal_mode,))
if synchronous is not None:
c.execute("PRAGMA synchronous = %s;" % (synchronous,))
if must_create:
c.executescript(schema)
c.execute("INSERT INTO version (version) VALUES (?)", (target_version,))
db.commit()
try:
c.execute("SELECT version FROM version")
version = c.fetchone()[0]
except sqlite3.DatabaseError, e:
# this indicates that the file is not a compatible database format.
# Perhaps it was created with an old version, or it might be junk.
raise DBError("%s file is unusable: %s" % (dbname, e))
if just_create: # for tests
return (sqlite3, db)
while version < target_version and version+1 in updaters:
c.executescript(updaters[version+1])
db.commit()
version = version+1
if version != target_version:
raise DBError("Unable to handle %s version %s" % (dbname, version))
return (sqlite3, db)
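A hedged usage sketch for get_db(); the schema, updater SQL, and filenames below are illustrative only, but they follow the (SCHEMA, VERNUM) / updaters convention described in the docstring:

    # Illustrative schema; version 2 is the target and must include a 'version' table.
    SCHEMA_V2 = """
    CREATE TABLE version (version INTEGER);
    CREATE TABLE shares (storage_index VARCHAR(26), shnum INTEGER, used_space INTEGER);
    """
    # updaters[2] upgrades an existing version-1 database in place.
    UPDATERS = {2: "ALTER TABLE shares ADD COLUMN used_space INTEGER;"}

    (sqlite3_module, db) = get_db("example.db", create_version=(SCHEMA_V2, 2),
                                  updaters=UPDATERS, dbname="exampledb",
                                  journal_mode="WAL", synchronous="NORMAL")
    c = db.cursor()
    c.execute("SELECT version FROM version")   # -> 2, whether freshly created or upgraded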


@ -1,4 +1,12 @@
from twisted.internet import defer
import time
from foolscap.api import eventually, fireEventually
from twisted.internet import defer, reactor
from allmydata.util import log
from allmydata.util.pollmixin import PollMixin
# utility wrapper for DeferredList
def _check_deferred_list(results):
@ -9,6 +17,7 @@ def _check_deferred_list(results):
if not success:
return f
return [r[1] for r in results]
def DeferredListShouldSucceed(dl):
d = defer.DeferredList(dl)
d.addCallback(_check_deferred_list)
@ -33,3 +42,143 @@ def gatherResults(deferredList):
d.addCallbacks(_parseDListResult, _unwrapFirstError)
return d
def _with_log(op, res):
"""
The default behaviour on firing an already-fired Deferred is unhelpful for
debugging, because the AlreadyCalledError can easily get lost or be raised
in a context that results in a different error. So make sure it is logged
(for the abstractions defined here). If we are in a test, log.err will cause
the test to fail.
"""
try:
op(res)
except defer.AlreadyCalledError, e:
log.err(e, op=repr(op), level=log.WEIRD)
def eventually_callback(d):
def _callback(res):
eventually(_with_log, d.callback, res)
return res
return _callback
def eventually_errback(d):
def _errback(res):
eventually(_with_log, d.errback, res)
return res
return _errback
def eventual_chain(source, target):
source.addCallbacks(eventually_callback(target), eventually_errback(target))
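A small usage sketch for the eventual-firing helpers above (names are illustrative): the target Deferred receives the source's result, but only on a later reactor turn via eventually().

    results = []
    source = defer.Deferred()
    target = defer.Deferred()
    eventual_chain(source, target)
    target.addCallback(results.append)
    source.callback("value")
    # 'results' is still [] at this point; it becomes ["value"] on a subsequent reactor turn.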
class HookMixin:
"""
I am a helper mixin that maintains a collection of named hooks, primarily
for use in tests. Each hook is set to an unfired Deferred using 'set_hook',
and can then be fired exactly once at the appropriate time by '_call_hook'.
I assume a '_hooks' attribute that should be set by the class constructor to
a dict mapping each valid hook name to None.
"""
def set_hook(self, name, d=None):
"""
Called by the hook observer (e.g. by a test).
If d is not given, an unfired Deferred is created and returned.
The hook must not already be set.
"""
if d is None:
d = defer.Deferred()
assert self._hooks[name] is None, self._hooks[name]
assert isinstance(d, defer.Deferred), d
self._hooks[name] = d
return d
def _call_hook(self, res, name):
"""
Called to trigger the hook, with argument 'res'. This is a no-op if the
hook is unset. Otherwise, the hook will be unset, and then its Deferred
will be fired synchronously.
The expected usage is "deferred.addBoth(self._call_hook, 'hookname')".
This ensures that if 'res' is a failure, the hook will be errbacked,
which will typically cause the test to also fail.
'res' is returned so that the current result or failure will be passed
through.
"""
d = self._hooks[name]
if d is None:
return res  # no hook set: pass the result through, as documented above
self._hooks[name] = None
_with_log(d.callback, res)
return res
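A sketch of the intended test pattern (the class, hook name, and assertion below are hypothetical; the mixin only requires that _hooks maps each valid hook name to None):

from twisted.internet import defer

class FakeUploader(HookMixin):
    def __init__(self):
        self._hooks = {"upload_done": None}

    def upload(self, data):
        d = defer.succeed(len(data))        # stand-in for a real asynchronous upload
        d.addBoth(self._call_hook, "upload_done")
        return d

# In a test:
#   uploader = FakeUploader()
#   hook_d = uploader.set_hook("upload_done")   # unfired Deferred
#   uploader.upload("some bytes")
#   hook_d.addCallback(lambda res: self.failUnlessEqual(res, 10))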
def async_iterate(process, iterable, *extra_args, **kwargs):
"""
I iterate over the elements of 'iterable' (each of which may itself be a
Deferred), eventually applying 'process' to each one, optionally with
'extra_args' and 'kwargs'.
'process' should return a (possibly deferred) boolean: True to continue the
iteration, False to stop.
I return a Deferred that fires with True if all elements of the iterable
were processed (i.e. 'process' only returned True values); with False if
the iteration was stopped by 'process' returning False; or that fails with
the first failure of either 'process' or the iterator.
"""
iterator = iter(iterable)
d = defer.succeed(None)
def _iterate(ign):
d2 = defer.maybeDeferred(iterator.next)
def _cb(item):
d3 = defer.maybeDeferred(process, item, *extra_args, **kwargs)
def _maybe_iterate(res):
if res:
d4 = fireEventually()
d4.addCallback(_iterate)
return d4
return False
d3.addCallback(_maybe_iterate)
return d3
def _eb(f):
f.trap(StopIteration)
return True
d2.addCallbacks(_cb, _eb)
return d2
d.addCallback(_iterate)
return d
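For example (names below are hypothetical), checking a sequence of share numbers and stopping at the first failure:

from twisted.internet import defer

def check_share(shnum):
    # Hypothetical asynchronous check returning a Deferred that fires with a bool.
    return defer.succeed(shnum < 5)

def all_shares_ok(share_numbers):
    # Fires with True if check_share returned True for every element, or with
    # False as soon as one check returns False (later elements are not checked).
    return async_iterate(check_share, share_numbers)

# all_shares_ok([1, 2, 3])  -> Deferred firing with True
# all_shares_ok([1, 7, 3])  -> Deferred firing with False (stops at 7)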
def for_items(cb, mapping):
"""
For each (key, value) pair in a mapping, I add a callback that invokes
cb(None, key, value) to a Deferred that fires immediately. I return that Deferred.
"""
d = defer.succeed(None)
for k, v in mapping.items():
d.addCallback(lambda ign, k=k, v=v: cb(None, k, v))
return d
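Illustrative usage with a made-up handler:

results = []

def _record(ign, storage_index, used_space):
    results.append((storage_index, used_space))

d = for_items(_record, {"si_a": 1024, "si_b": 2048})
# d fires after every (key, value) pair has been passed to _record.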
class WaitForDelayedCallsMixin(PollMixin):
def _delayed_calls_done(self):
# We're done when the only remaining DelayedCalls fire after threshold.
# (These will be associated with the test timeout, or else they *should*
# cause an unclean reactor error because the test should have waited for
# them.)
threshold = time.time() + 10
for delayed in reactor.getDelayedCalls():
if delayed.getTime() < threshold:
return False
return True
def wait_for_delayed_calls(self, res=None):
"""
Use like this at the end of a test:
d.addBoth(self.wait_for_delayed_calls)
"""
d = self.poll(self._delayed_calls_done)
d.addErrback(log.err, "error while waiting for delayed calls")
d.addBoth(lambda ign: res)
return d

View File

@ -2,7 +2,7 @@
Futz with files like a pro.
"""
import sys, exceptions, os, stat, tempfile, time, binascii
import errno, sys, exceptions, os, re, stat, tempfile, time, binascii
from twisted.python import log
@ -202,10 +202,12 @@ def rm_dir(dirname):
else:
remove(fullname)
os.rmdir(dirname)
except Exception, le:
# Ignore "No such file or directory"
if (not isinstance(le, OSError)) or le.args[0] != 2:
except EnvironmentError, le:
# Ignore "No such file or directory", collect any other exception.
if le.args[0] not in (2, 3):  # neither ENOENT nor Windows 'path not found'
excs.append(le)
except Exception, le:
excs.append(le)
# Okay, now we've recursively removed everything, ignoring any "No
# such file or directory" errors, and collecting any other errors.
@ -217,13 +219,33 @@ def rm_dir(dirname):
raise OSError, "Failed to remove dir for unknown reason."
raise OSError, excs
def remove_if_possible(f):
try:
remove(f)
except:
pass
def rmdir_if_empty(path):
""" Remove the directory if it is empty. """
try:
os.rmdir(path)
except OSError, e:
if e.errno != errno.ENOTEMPTY:
raise
ASCII = re.compile(r'^[\x00-\x7F]*$')
def listdir(path, filter=ASCII):
try:
children = os.listdir(path)
except OSError, e:
if e.errno != errno.ENOENT:
raise
return []
else:
return [str(child) for child in children if filter.match(child)]
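Illustrative behaviour of the filtered listdir above (paths and patterns are made up):

# A missing directory yields [] rather than raising, and names that fail the
# filter are silently dropped:
#   listdir("/no/such/dir")                          -> []
#   listdir(".", filter=re.compile(r'.*\.sqlite$'))  -> only the *.sqlite entries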
def open_or_create(fname, binarymode=True):
try:
return open(fname, binarymode and "r+b" or "r+")
@ -255,8 +277,8 @@ def write_atomically(target, contents, mode="b"):
f.close()
move_into_place(target+".tmp", target)
def write(path, data, mode="wb"):
wf = open(path, mode)
def write(path, data, mode="b"):
wf = open(path, "w"+mode)
try:
wf.write(data)
finally:
@ -426,3 +448,38 @@ def get_available_space(whichdir, reserved_space):
except EnvironmentError:
log.msg("OS call to get disk statistics failed")
return 0
def get_used_space(path):
if path is None:
return 0
try:
s = os.stat(path)
except EnvironmentError:
if not os.path.exists(path):
return 0
raise
else:
# POSIX defines st_blocks (originally a BSDism):
# <http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html>
# but does not require stat() to give it a "meaningful value"
# <http://pubs.opengroup.org/onlinepubs/009695399/functions/stat.html>
# and says:
# "The unit for the st_blocks member of the stat structure is not defined
# within IEEE Std 1003.1-2001. In some implementations it is 512 bytes.
# It may differ on a file system basis. There is no correlation between
# values of the st_blocks and st_blksize, and the f_bsize (from <sys/statvfs.h>)
# structure members."
#
# The Linux docs define it as "the number of blocks allocated to the file,
# [in] 512-byte units." It is also defined that way on MacOS X. Python does
# not set the attribute on Windows.
#
# This code relies on the underlying platform to either define st_blocks in
# units of 512 bytes or else leave st_blocks undefined. See also
# <http://bugs.python.org/issue12350>.
if hasattr(s, 'st_blocks'):
return s.st_blocks * 512
else:
return s.st_size
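A rough worked example of the distinction above (numbers are illustrative):

# For a sparse 1 GiB file with only 8 KiB actually allocated:
#   s.st_size   == 1073741824
#   s.st_blocks == 16            -> get_used_space() returns 16 * 512 = 8192
# On platforms that do not expose st_blocks (e.g. Windows), the apparent
# size s.st_size is returned instead.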

View File

@ -0,0 +1,20 @@
from allmydata.util.assertutil import _assert
def concat(seqs):
"""
O(n), rather than O(n^2), concatenation of list-like things, returning a list.
I can't believe this isn't built in.
"""
total_len = 0
for seq in seqs:
total_len += len(seq)
result = [None]*total_len
i = 0
for seq in seqs:
for x in seq:
result[i] = x
i += 1
_assert(i == total_len, i=i, total_len=total_len)
return result
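Illustrative behaviour:

# concat([[1, 2], (3, 4), "ab"])  -> [1, 2, 3, 4, 'a', 'b']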

View File

@ -0,0 +1,3 @@
class Namespace(object):
pass

View File

@ -3,7 +3,7 @@ import time, simplejson
from nevow import rend, tags as T, inevow
from allmydata.web.common import getxmlfile, abbreviate_time, get_arg
from allmydata.util.abbreviate import abbreviate_space
from allmydata.util import time_format, idlib
from allmydata.util import idlib
def remove_prefix(s, prefix):
if not s.startswith(prefix):
@ -28,17 +28,18 @@ class StorageStatus(rend.Page):
def render_JSON(self, req):
req.setHeader("content-type", "text/plain")
accounting_crawler = self.storage.get_accounting_crawler()
d = {"stats": self.storage.get_stats(),
"bucket-counter": self.storage.bucket_counter.get_state(),
"lease-checker": self.storage.lease_checker.get_state(),
"lease-checker-progress": self.storage.lease_checker.get_progress(),
"bucket-counter": None,
"lease-checker": accounting_crawler.get_state(),
"lease-checker-progress": accounting_crawler.get_progress(),
}
return simplejson.dumps(d, indent=1) + "\n"
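For reference (not part of the diff), the top-level shape of the JSON this renderer now emits; nested contents come from get_stats() and the accounting crawler and are elided here:

{
 "stats": { ... },
 "bucket-counter": null,
 "lease-checker": { ... },
 "lease-checker-progress": { ... }
}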
def data_nickname(self, ctx, storage):
return self.nickname
def data_nodeid(self, ctx, storage):
return idlib.nodeid_b2a(self.storage.my_nodeid)
return idlib.nodeid_b2a(self.storage.get_serverid())
def render_storage_running(self, ctx, storage):
if storage:
@ -93,15 +94,13 @@ class StorageStatus(rend.Page):
return d
def data_last_complete_bucket_count(self, ctx, data):
s = self.storage.bucket_counter.get_state()
count = s.get("last-complete-bucket-count")
if count is None:
s = self.storage.get_stats()
if "storage_server.total_bucket_count" not in s:
return "Not computed yet"
return count
return s['storage_server.total_bucket_count']
def render_count_crawler_status(self, ctx, storage):
p = self.storage.bucket_counter.get_progress()
return ctx.tag[self.format_crawler_progress(p)]
return ctx.tag
def format_crawler_progress(self, p):
cycletime = p["estimated-time-per-cycle"]
@ -129,31 +128,12 @@ class StorageStatus(rend.Page):
cycletime_s]
def render_lease_expiration_enabled(self, ctx, data):
lc = self.storage.lease_checker
if lc.expiration_enabled:
return ctx.tag["Enabled: expired leases will be removed"]
else:
return ctx.tag["Disabled: scan-only mode, no leases will be removed"]
ep = self.storage.get_expiration_policy()
return ctx.tag[ep.describe_enabled()]
def render_lease_expiration_mode(self, ctx, data):
lc = self.storage.lease_checker
if lc.mode == "age":
if lc.override_lease_duration is None:
ctx.tag["Leases will expire naturally, probably 31 days after "
"creation or renewal."]
else:
ctx.tag["Leases created or last renewed more than %s ago "
"will be considered expired."
% abbreviate_time(lc.override_lease_duration)]
else:
assert lc.mode == "cutoff-date"
localizedutcdate = time.strftime("%d-%b-%Y", time.gmtime(lc.cutoff_date))
isoutcdate = time_format.iso_utc_date(lc.cutoff_date)
ctx.tag["Leases created or last renewed before %s (%s) UTC "
"will be considered expired." % (isoutcdate, localizedutcdate, )]
if len(lc.mode) > 2:
ctx.tag[" The following sharetypes will be expired: ",
" ".join(sorted(lc.sharetypes_to_expire)), "."]
ep = self.storage.get_expiration_policy()
ctx.tag[ep.describe_expiration()]
return ctx.tag
def format_recovered(self, sr, a):
@ -161,7 +141,7 @@ class StorageStatus(rend.Page):
if d is None:
return "?"
return "%d" % d
return "%s shares, %s buckets (%s mutable / %s immutable), %s (%s / %s)" % \
return "%s shares, %s sharesets (%s mutable / %s immutable), %s (%s / %s)" % \
(maybe(sr["%s-shares" % a]),
maybe(sr["%s-buckets" % a]),
maybe(sr["%s-buckets-mutable" % a]),
@ -172,16 +152,16 @@ class StorageStatus(rend.Page):
)
def render_lease_current_cycle_progress(self, ctx, data):
lc = self.storage.lease_checker
p = lc.get_progress()
ac = self.storage.get_accounting_crawler()
p = ac.get_progress()
return ctx.tag[self.format_crawler_progress(p)]
def render_lease_current_cycle_results(self, ctx, data):
lc = self.storage.lease_checker
p = lc.get_progress()
ac = self.storage.get_accounting_crawler()
p = ac.get_progress()
if not p["cycle-in-progress"]:
return ""
s = lc.get_state()
s = ac.get_state()
so_far = s["cycle-to-date"]
sr = so_far["space-recovered"]
er = s["estimated-remaining-cycle"]
@ -197,7 +177,7 @@ class StorageStatus(rend.Page):
if d is None:
return "?"
return "%d" % d
add("So far, this cycle has examined %d shares in %d buckets"
add("So far, this cycle has examined %d shares in %d sharesets"
% (sr["examined-shares"], sr["examined-buckets"]),
" (%d mutable / %d immutable)"
% (sr["examined-buckets-mutable"], sr["examined-buckets-immutable"]),
@ -208,22 +188,11 @@ class StorageStatus(rend.Page):
if so_far["expiration-enabled"]:
add("The remainder of this cycle is expected to recover: ",
self.format_recovered(esr, "actual"))
add("The whole cycle is expected to examine %s shares in %s buckets"
add("The whole cycle is expected to examine %s shares in %s sharesets"
% (maybe(ecr["examined-shares"]), maybe(ecr["examined-buckets"])))
add("and to recover: ", self.format_recovered(ecr, "actual"))
else:
add("If expiration were enabled, we would have recovered: ",
self.format_recovered(sr, "configured"), " by now")
add("and the remainder of this cycle would probably recover: ",
self.format_recovered(esr, "configured"))
add("and the whole cycle would probably recover: ",
self.format_recovered(ecr, "configured"))
add("if we were strictly using each lease's default 31-day lease lifetime "
"(instead of our configured behavior), "
"this cycle would be expected to recover: ",
self.format_recovered(ecr, "original"))
add("Expiration was not enabled.")
if so_far["corrupt-shares"]:
add("Corrupt shares:",
@ -234,8 +203,8 @@ class StorageStatus(rend.Page):
return ctx.tag["Current cycle:", p]
def render_lease_last_cycle_results(self, ctx, data):
lc = self.storage.lease_checker
h = lc.get_state()["history"]
ac = self.storage.get_accounting_crawler()
h = ac.get_state()["history"]
if not h:
return ""
last = h[max(h.keys())]
@ -255,9 +224,7 @@ class StorageStatus(rend.Page):
add("and saw a total of ", saw)
if not last["expiration-enabled"]:
rec = self.format_recovered(last["space-recovered"], "configured")
add("but expiration was not enabled. If it had been, "
"it would have recovered: ", rec)
add("but expiration was not enabled.")
if last["corrupt-shares"]:
add("Corrupt shares:",

View File

@ -59,7 +59,7 @@
<li>Server Nodeid: <span class="nodeid mine data-chars" n:render="string" n:data="nodeid" /></li>
<li n:data="stats">Accepting new shares:
<span n:render="bool" n:data="accepting_immutable_shares" /></li>
<li>Total buckets:
<li>Total sharesets:
<span n:render="string" n:data="last_complete_bucket_count" />
(the number of files and directories for which this server is holding
a share)