Add file-with-metadata caps #947
Reference: tahoe-lafs/trac-2024-07-25#947
Web architecture expects that a resource has a Content-Type. Modern filesystems have "extended attributes" per file as well as metadata such as modification time, and it is desirable to back these things up. Both of these things point to the idea that there ought to be addressable (has-a-URL) objects which designate the metadata as well as the file data (binary blob). My understanding of current Tahoe architecture is that all metadata is instead stored in the rows of the directory objects.
Additionally, metadata should be mutable iff the file is, so that it can be updated in-place without access to every directory which might contain it.
I imagine that these objects would contain a current-design file cap, rather than themselves containing the file data, so that we still get the convergent encryption space advantage even if file metadata differs among separately-created instances.
This idea was raised at friam on 2010-02-12.
Hm, the way I imagined implementing this at first was to have the client first fetch the associated metadata and then fetch the file. One way to envision the implementation would simply be to define a kind of directory which can only have one child link in it. Then take the cap to that directory and wrap it in a different cap type which means "fetch this directory then fetch the file it points to, applying all of the metadata that it contains".
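A minimal sketch of the indirection-node idea above, assuming an in-memory stand-in for the grid. The "URI:META:" prefix, the SingleChildDir class, and the fetch_dir/fetch_file callables are hypothetical names invented for illustration; none of them are existing Tahoe-LAFS cap types or APIs.

```python
from dataclasses import dataclass, field

# Hypothetical cap prefix for this sketch; not a real Tahoe-LAFS cap type.
WRAPPER_PREFIX = "URI:META:"


@dataclass
class SingleChildDir:
    """A directory restricted to exactly one child link plus its metadata."""
    child_cap: str
    metadata: dict = field(default_factory=dict)


def wrap_cap(dir_cap):
    """Wrap a directory cap so readers know to follow it to the file."""
    return WRAPPER_PREFIX + dir_cap


def resolve(cap, fetch_dir, fetch_file):
    """Return (file_data, metadata) for either a plain or a wrapped cap.

    fetch_dir and fetch_file are caller-supplied callables standing in
    for whatever actually retrieves objects from the grid.
    """
    if not cap.startswith(WRAPPER_PREFIX):
        # Plain file cap: no metadata object to consult.
        return fetch_file(cap), {}
    # Wrapped cap: fetch the one-child directory first, then the file.
    node = fetch_dir(cap[len(WRAPPER_PREFIX):])
    return fetch_file(node.child_cap), dict(node.metadata)
```

Because the wrapper only points at an ordinary file cap, two wrappers with different metadata can still share one stored copy of the file data.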
But, we could also consider bundling some metadata along with the cap itself. For example, if the cap is being embedded into a URL, then include the metadata in the URL, along with the cap. Spelling out the content type in standard text format, e.g. image/svg+xml, would add significantly to the length of the URL, but perhaps we could define a custom compression scheme which could represent the most common types in only a character or two while falling back to uncompressed form for types that we haven't included in our compression definition.
One reason that I am thinking about this is the "security-related extra metadata" that I've been ticketing about tonight: highest-known-version-number (#955), petrification-marker (#954), LAFS 301 Moved Permanently marker (not yet ticketed), etc. It would be cool if, when I send you a URL containing a Tahoe-LAFS cap to a mutable file, I automatically include in that URL the highest version number of that file that I have ever seen, thus empowering you to reject rollback attacks which present an older file to you when you try to read it.
That one, at least, can't really be implemented in the indirection-node way (because if someone is going to rollback the file, they might also rollback the indirection-node), but would have to be in the bundled-with-the-original-URL way.
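The bundled-in-URL approach could be sketched roughly as follows. This is an illustration under assumptions, not a proposed format: the short-code table, the "-" fallback marker, and the ct and minver query parameters are all hypothetical, since the ticket defines no concrete encoding.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical short codes for common types; the ticket defines no table.
TYPE_CODES = {"image/svg+xml": "s", "text/html": "h", "image/png": "p"}
CODE_TYPES = {code: ctype for ctype, code in TYPE_CODES.items()}


def encode_type(content_type):
    """Use a short code if one exists, else "-" plus the full type."""
    code = TYPE_CODES.get(content_type)
    return code if code is not None else "-" + content_type


def decode_type(token):
    """Invert encode_type."""
    if token.startswith("-"):
        return token[1:]
    return CODE_TYPES[token]


def make_url(base, cap, content_type, highest_seen_version):
    """Bundle the content type and highest seen version with the cap.

    The ct and minver parameter names are assumptions of this sketch.
    """
    query = urlencode({"ct": encode_type(content_type),
                       "minver": highest_seen_version})
    return f"{base}/uri/{cap}?{query}"


def parse_url(url):
    """Return (cap, content_type, highest_seen_version) from such a URL."""
    parts = urlsplit(url)
    cap = parts.path.split("/uri/", 1)[1]
    query = parse_qs(parts.query)
    return cap, decode_type(query["ct"][0]), int(query["minver"][0])


def check_version(fetched_version, highest_seen_version):
    """Reject a fetched version older than the one bundled in the URL."""
    if fetched_version < highest_seen_version:
        raise ValueError("possible rollback: fetched version is older "
                         "than the version recorded in the URL")
```

The recipient runs check_version against the number carried in the URL itself, so an attacker who can roll back stored objects cannot also roll back the expected version.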
If you like this ticket, you might also like #956 (embed security metadata in parent directory) and #957 (embed security metadata in URL).
Is this the same as #307 (maybe add node metadata? (in addition to edge metadata))?
as zooko correctly pointed out there, this is relevant for #1325 (make tahoe backup useable as a replacement for rsync).
personally i'd go for storing the file metadata in the directory. this does require the relevant data (mime type) to be included in the url in order to be used in connection with the file, but think about it this way: that's even true for the file name.
other reasons supporting metadata-in-directory are
[to me as a reminder to explain why the Content-Type-in-direntry feature can't be implemented on its own without the Content-Type-in-URL feature]
assigning
Ticket retargeted after milestone closed (editing milestones)