Welcome to the Allmydata-Tahoe project. This project implements a secure,
distributed, fault-tolerant storage grid. All of the source code is available
under a Free Software licence.
The basic idea is that the data in this storage grid is spread over all
participating nodes, using an algorithm that can recover the data even if a
majority of the nodes are no longer available.
The interface to the storage grid allows you to store and fetch files, either
by self-authenticating cryptographic identifier or by filename and path.
See the web site for all kinds of information, news, and community
contributions, and prebuilt packages for Debian-like systems:
http://allmydata.org
LICENCE:
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version, with the added permission that, if you become obligated
to release a derived work under this licence (as per section 2.b), you may
delay the fulfillment of this obligation for up to 12 months. If you are
obligated to release code under section 2.b of this licence, you are
obligated to release it under these same terms, including the 12-month grace
period clause. See the COPYING file for details.
GETTING PRECOMPILED BINARIES:
See http://allmydata.org . Currently pre-compiled binaries are available
only for Debian or Ubuntu. For any other platform you have to build it
yourself from source, which is what this text file is all about.
GETTING THE SOURCE CODE:
The code is available via darcs by running the following command:
darcs get http://allmydata.org/source/tahoe/trunk tahoe
This will create a directory named "tahoe" in the current working directory
and put a copy of the latest source code into it. Later, if you want to get
any new changes, then cd into that directory and run the command "darcs
pull".
Tarballs of sources are available at:
http://allmydata.org/source/tahoe/
DEPENDENCIES:
Note: All of the following dependencies can probably be installed through
your standard package management tool if you are running on a modern Unix
operating system.
For example, on a Debian-like system, you can run "sudo apt-get install
gcc make python-dev python-twisted python-nevow python-pyopenssl".
+ a C compiler (language)
+ GNU make (build tool)
+ Python 2.4 or newer (tested against 2.4 and 2.5.1), including
development headers (language)
http://python.org/
+ Twisted Python (tested against 2.2.0, 2.4.0, and 2.5.0) (network and
operating system integration library)
http://twistedmatrix.com/
You need the following subpackages, which are included in the default
Twisted distribution:
* core (the standard Twisted package)
* web, trial, conch
Twisted requires zope.interface, a copy of which is included in the
Twisted distribution. Note that Twisted does *not* require the entire Zope
distribution, merely the much smaller zope.interface component.
+ Python Nevow (0.6.0 or later) (web presentation language)
http://divmod.org/trac/wiki/DivmodNevow
Note that the current version of Nevow (0.9.18) requires Twisted 2.4.0 or
later.
+ Python setuptools (build and distribution tool)
Note: The build process will automatically download and install setuptools
if it is not present. However, if an old, incompatible version of
setuptools is present (< v0.6c6 on Cygwin, or < v0.6a9 on other
platforms), then the build will fail.
So if the build fails due to setuptools not being compatible, you can
either upgrade or uninstall your version of setuptools and try again.
http://peak.telecommunity.com/DevCenter/EasyInstall#installation-instructions
+ Python PyOpenSSL (0.6 or later) (secure transport layer)
http://pyopenssl.sourceforge.net
To install PyOpenSSL on Windows-native, download this:
http://allmydata.org/source/pyOpenSSL-0.6.win32-py2.5.exe
To install PyOpenSSL on Windows-cygwin, install the OpenSSL development
libraries with the cygwin package management tool, then get the pyOpenSSL
source code, cd into it, and run "python ./setup.py install".
+ the pywin32 package: only required on Windows
http://sourceforge.net/projects/pywin32/
(Tested with build 210, and known to not work with build 204.
Feedback with details of other builds is greatly appreciated)
BUILDING:
Just type 'make' in the top-level tahoe directory. This works on Windows
too, provided that you have the dependencies mentioned above. (Either a
normal cygwin build or a mingw-style native build will be done by the
makefile, depending on whether the version of python that you have installed
is the Windows-native python or the cygwin python.)
If the desired version of 'python' is not already on your PATH, then type
'make PYTHON=/path/to/your/preferred/python'.
'make test' runs the unit test suites. (This can take a long time on
slow computers. There are a lot of tests and some of them do a lot of
public-key cryptography.)
INSTALLING:
There are four ways to do it: The Debian Way, The Setuptools Way, The
easy_install Way, and The Running-In-Place Way. Choose one:
The Debian Way:
The Debian Way is to build .deb files which you can then install with
"dpkg".
This requires certain debian packages (build-essential, fakeroot,
devscripts, debhelper, cdbs) to be installed first, since they are used to
construct the tahoe .deb files. A full list of these required packages can
be found in the "Build-Depends" line in the misc/DIST/debian/control in the
top-level tahoe directory (replacing the word DIST with etch, dapper, edgy,
or feisty as appropriate).
If you're running on a Debian system, run 'make deb-etch', 'make deb-sid',
'make deb-edgy', or 'make deb-feisty' from within the tahoe top-level
directory to construct a Debian package named 'allmydata-tahoe' which you
can then install with dpkg.
The Setuptools Way:
Just run 'python setup.py install'. This will compile and install the Tahoe
code to a system-specific standard location (somewhere inside /usr/lib/ on
unix). It will also acquire and install many of the necessary dependencies
in the same place. (The dependency-checking can handle foolscap, nevow, and
zfec, but you are still responsible for ensuring that twisted and pyopenssl
are installed).
To install it to a non-standard location, learn about the
"--single-version-externally-managed" flag, and visit
http://allmydata.org/trac/tahoe/wiki/Installing .
The easy_install Way:
Tahoe is registered with the Python Package Index (PyPI), so the
'easy_install' tool can download and install it for you. Just type
'easy_install allmydata-tahoe' from any shell. That will download the most
recent Tahoe source tarball, unpack it in a temporary directory, install it
to the standard location, then download and install any easy_install-able
dependencies (like zfec and foolscap) that you need.
The Running-In-Place Way:
You can use Tahoe without installing it. Once you've built Tahoe then you
can execute "./bin/allmydata-tahoe". (When the allmydata-tahoe script is in
a Tahoe source distribution, it adds the necessary directory to the Python
"sys.path".)
TESTING THAT IT IS PROPERLY INSTALLED:
To test that all the modules got installed properly, cd to the root
directory of the tahoe source distribution (the directory which contains
this README file), start a python interpreter and import modules as follows.
If each one imports successfully instead of raising ImportError then it is
correctly installed.
% python
Python 2.4.4 (#2, Jan 13 2007, 17:50:26)
[GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import zfec
>>> import allmydata.Crypto
>>> import foolscap
>>> import allmydata.interfaces
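The same check can be scripted instead of typed by hand. The following is
only a sketch: the helper name find_missing_modules is made up for this
example and is not part of Tahoe. The module names are the ones from the
interpreter session above.

```python
def find_missing_modules(names):
    """Return the names from `names` that fail to import."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


# Module names taken from the interpreter session above.
TAHOE_MODULES = ["zfec", "allmydata.Crypto", "foolscap", "allmydata.interfaces"]

if __name__ == "__main__":
    missing = find_missing_modules(TAHOE_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All modules imported successfully.")
```

Run it from the root directory of the tahoe source distribution, just like
the interactive session.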
RUNNING:
If you installed one of the Debian packages constructed by "make deb-*", or
installed via The Setuptools Way or The easy_install Way, then it creates an
'allmydata-tahoe' executable, usually in /usr/bin . Otherwise, you can find
allmydata-tahoe in ./instdir/bin/ .
This tool is used to create, start, and stop nodes. Each node lives in a
separate base directory, inside of which you can add files to configure and
control the node. Nodes also read and write files within that directory.
A grid consists of a single central 'introducer and vdrive' node and one or
more 'client' nodes. If you are joining an existing grid, the
introducer-and-vdrive node will already be running, and you'll just need to
create a client node. If you're creating a brand new grid, you'll need to
create both an introducer-and-vdrive and a client (and then invite other
people to create their own client nodes and join your grid).
The introducer (-and-vdrive) node is constructed by running 'allmydata-tahoe
create-introducer --basedir $HERE'. Once constructed, you can start the
introducer by running 'allmydata-tahoe start --basedir $HERE' (or, if you
are already in the introducer's base directory, just type 'allmydata-tahoe
start'). Inside that base directory, there will be a pair of files
'introducer.furl' and 'vdrive.furl'. Make a copy of these, as they'll be
needed on the client nodes.
To construct a client node, pick a new working directory for it, then run
'allmydata-tahoe create-client --basedir $HERE'. Copy the two .furl files
from the introducer into this new directory, then run 'allmydata-tahoe start
--basedir $HERE'. After that, the client node should be off and running.
The first thing it will do is connect to the introducer and introduce itself
to all other nodes on the grid. You can follow its progress by looking at
the $HERE/logs/twistd.log file.
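The client setup steps above can be scripted. This is a minimal sketch, not
part of Tahoe: the function names are hypothetical, and it assumes that
'allmydata-tahoe' is on your PATH and that you have a local directory
holding copies of the two .furl files from the introducer.

```python
import os
import shutil
import subprocess

# The two files that the introducer's base directory provides.
FURL_FILES = ("introducer.furl", "vdrive.furl")


def create_client_command(basedir):
    """Build the argv for 'allmydata-tahoe create-client --basedir DIR'."""
    return ["allmydata-tahoe", "create-client", "--basedir", basedir]


def start_command(basedir):
    """Build the argv for 'allmydata-tahoe start --basedir DIR'."""
    return ["allmydata-tahoe", "start", "--basedir", basedir]


def copy_furls(furl_dir, client_dir):
    """Copy both .furl files from furl_dir into the client's base directory."""
    for name in FURL_FILES:
        shutil.copy(os.path.join(furl_dir, name),
                    os.path.join(client_dir, name))


def set_up_client(furl_dir, client_dir):
    """Create the client node, give it the .furl files, then start it."""
    subprocess.check_call(create_client_command(client_dir))
    copy_furls(furl_dir, client_dir)
    subprocess.check_call(start_command(client_dir))
```

Calling set_up_client("/path/to/furls", "/path/to/client") runs the same
three steps in the order given above. Both paths are placeholders.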
To actually use the client, enable the web interface by writing a port
number (like "8080") into a file named $HERE/webport and then restarting the
node with 'allmydata-tahoe restart --basedir $HERE'. This will prompt the
client node to run a webserver on the desired port, through which you can
view, upload, download, and delete files. This 'webport' file is actually a
"strports specification", defined in
http://twistedmatrix.com/documents/current/api/twisted.application.strports.html
, so you can have it only listen on a local interface by writing
"tcp:8080:interface=127.0.0.1" to this file, or make it use SSL by writing
"ssl:8443:privateKey=mykey.pem:certKey=cert.pem" instead.
A client node directory can also be created without installing the code
first. Just use 'make create-client', and a new directory named 'CLIENTDIR'
will be created inside the top of the source tree. Copy the relevant .furl
files in, set the webport, then start the node by using 'make start-client'.
To stop it again, use 'make stop-client'. Similar makefile targets exist
for making and running an introducer node.
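Setting the webport amounts to writing one strports specification into the
'webport' file in the node's base directory. A small sketch of that step
follows. The helper name write_webport is hypothetical, and the trailing
newline is an assumption (it is what a shell 'echo' would write).

```python
import os


def write_webport(basedir, spec):
    """Write a strports specification into BASEDIR/webport; return the path."""
    path = os.path.join(basedir, "webport")
    with open(path, "w") as f:
        # Assumption: a trailing newline is harmless, as with 'echo'.
        f.write(spec + "\n")
    return path


# Example specifications taken from the text above:
#   "8080"                                            plain TCP port
#   "tcp:8080:interface=127.0.0.1"                    local interface only
#   "ssl:8443:privateKey=mykey.pem:certKey=cert.pem"  SSL
```

After changing the file, restart the node so that the new setting takes
effect.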
If you are behind a firewall and you can configure your firewall to forward
TCP connections on a port to the computer running your Tahoe node, then you
can configure the Tahoe node to announce itself as being available on that
IP address and port. The way to do this is to create a file named
$HERE/advertised_ip_addresses, in which you can put IP addresses and port numbers in
"dotted-quad:port" form, e.g. "209.97.232.113:1345". You can put multiple
IP-address-and-port-number entries into this file, on separate lines.
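As an illustration of the file format, the following sketch writes one
"dotted-quad:port" entry per line. The function names are made up for this
example, and the address is the sample value used above.

```python
import os


def format_entry(ip, port):
    """Format one entry in "dotted-quad:port" form."""
    return "%s:%d" % (ip, port)


def write_advertised_addresses(basedir, entries):
    """Write (ip, port) pairs to BASEDIR/advertised_ip_addresses, one per line."""
    path = os.path.join(basedir, "advertised_ip_addresses")
    with open(path, "w") as f:
        for ip, port in entries:
            f.write(format_entry(ip, port) + "\n")
    return path
```

For example, write_advertised_addresses(basedir, [("209.97.232.113", 1345)])
produces a file containing the single line "209.97.232.113:1345".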
There is a public grid available for testing. Look at the wiki page
(http://allmydata.org) for the necessary .furl data.