encrypted cloud database #2472
Labels
No Label
0.2.0
0.3.0
0.4.0
0.5.0
0.5.1
0.6.0
0.6.1
0.7.0
0.8.0
0.9.0
1.0.0
1.1.0
1.10.0
1.10.1
1.10.2
1.10a2
1.11.0
1.12.0
1.12.1
1.13.0
1.14.0
1.15.0
1.15.1
1.2.0
1.3.0
1.4.1
1.5.0
1.6.0
1.6.1
1.7.0
1.7.1
1.7β
1.8.0
1.8.1
1.8.2
1.8.3
1.8β
1.9.0
1.9.0-s3branch
1.9.0a1
1.9.0a2
1.9.0b1
1.9.1
1.9.2
1.9.2a1
LeastAuthority.com automation
blocker
cannot reproduce
cloud-branch
code
code-dirnodes
code-encoding
code-frontend
code-frontend-cli
code-frontend-ftp-sftp
code-frontend-magic-folder
code-frontend-web
code-mutable
code-network
code-nodeadmin
code-peerselection
code-storage
contrib
critical
defect
dev-infrastructure
documentation
duplicate
enhancement
fixed
invalid
major
minor
n/a
normal
operational
packaging
somebody else's problem
supercritical
task
trivial
unknown
was already fixed
website
wontfix
worksforme
No Milestone
No Assignees
1 Participants
Notifications
Due Date
No due date set.
Reference: tahoe-lafs/trac-2024-07-25#2472
Loading…
Reference in New Issue
No description provided.
Delete Branch "%!s(<nil>)"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Tahoe-LAFS does a reasonable job of flat-file storage, and of directories structure. But, kids these days (for the last few decades, I mean) are really into structured storage, i.e. relational databases, queriable nosql databases, etc.
Here's a proposal for a stab at a "Minimum Viable Product" for an end-to-end encrypted cloud database. It's extremely simple: store a sqlite db in Tahoe-LAFS.
This would immediately give off-site storage (possibly even peer-to-peer if the underlying Tahoe-LAFS grid is a peer-to-peer grid), erasure-coding for redundancy, and it would also immediately give Tahoe-LAFS's nice access-control semantics: you can give people read-only access to your sqlitedb.
A potentially interesting use for this would be to store Tahoe-LAFS caps in the sqlitedb so that you can query them out. ☺
There are a few important details about how to map sqlite's storage needs to Tahoe-LAFS's storage offerings for best performance and to retain Tahoe-LAFS's guarantees about access control and atomicity and so forth. I looked into it at one point about a year ago, and unfortunately didn't post notes to the trac so I don't remember precisely, what I decided, but I think it was that the sqlitedb should be in write-ahead-logging
WAL
mode (https://www.sqlite.org/wal.html), and with exclusive locking mode, and should be stored a single MDMF file with its segment size set to be the same as the sqlitedb's page size.The
-wal
file should probably also be an MDMF, although it would be cool if sqlite happened to use it in write-once mode, in which case maybe it could be an immutable.There's an open issue about whether read-only access to such a DB would work without
PRAGMA journal_mode=DELETE
. Read https://www.sqlite.org/wal.html#readonly to see what I mean, and keep in mind that because we're telling the user that they have to set exclusive locking mode: https://www.sqlite.org/wal.html#noshmWith this setup, the cap to the database has to be a cap to the directory containing the sqlitedb file, not a cap to the sqlitedb file itself. That's because sqlite needs to access the
-wal
file adjacent to the sqlitedb file itself.A different approach would be to use the older rollback-log functionality of sqlite instead of WAL. The trade-offs listed in https://www.sqlite.org/wal.html make it sound like maybe that would fit better into Tahoe-LAFS. It might require experimentation and benchmarking to understand.
But also it requires careful study of things like https://www.sqlite.org/lockingv3.html#how_to_corrupt and https://www.sqlite.org/atomiccommit.html#sect_9_0 to figure out if Tahoe-LAFS could provide the guarantees that sqlite needs. I think we can, and I tentatively think that the
WAL
is easier to guarantee than the rollback journal, because with the rollback journal there is a positive requirement to preserve and make available any hot journal, or else corruption can result, whereas with aWAL
a failure of preservation or availability of the-wal
just results in rollback, not corruption. I think.