include SI's of files in logs? #367

Closed
opened 2008-03-24 22:20:12 +00:00 by zooko · 1 comment

I think it would be good to include information about files added, linked, and unlinked, in the logs, including the file's storage index ("SI").

This might help analyze behavior after the fact.

Brian is uncomfortable with this idea. I think that his objections are two-fold:

  1. Parsing logs is an error-prone way to infer filesystem structure. There are better ways to track the evolution of filesystem structure that we want to implement eventually: manifests and/or deep-verify capabilities. Also, people can already share their read-capabilities with us so that we can examine their filesystem directly.

  2. If you want to share your logs with someone else so that they can debug, then including SIs in the logs exposes more information to them about the structure (but not the content) of your filesystem.

My counter-arguments to Brian's objections are:

  1. Putting SIs in logs makes the logs more informative than deep-verify caps or sharing your read-caps -- the logs also record the timing of filesystem changes and the events related to them.

1.b. Putting SIs in logs doesn't require us to parse those logs and infer filesystem structure -- it is useful for simpler, flatter, lossier kinds of monitoring, too.

1.c. Putting SIs in logs doesn't preclude us from implementing something good like deep verify caps -- it is too lossy to succeed at that. ;-)

  2. Users of Tahoe are not gaining strong protection against traffic analysis, anyway, so SIs are not especially sensitive. Also, your logs are not public.
zooko added the code-nodeadmin, major, defect, 0.9.0 labels 2008-03-24 22:20:12 +00:00
zooko added this to the 1.0.0 milestone 2008-03-24 22:20:12 +00:00
warner was assigned by zooko 2008-03-24 22:20:12 +00:00

Native upload (when clients speak directly to storage servers) now includes
this message. The log message looks like this:

self.log(format="plaintext_hash=%(plaintext_hash)s, SI=%(SI)s, size=%(size)d",
         plaintext_hash=base32.b2a(plaintext_hash),
         SI=storage.si_b2a(self._storage_index),
         size=self.file_size)

Clients which upload through a Helper will not emit this message on their
own; however, the helper will emit one on their behalf.

Not the easiest thing in the world to gather and work with, but it's a start.
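
For anyone who wants to gather these messages, here is a minimal sketch of pulling the fields back out of a rendered log line. It assumes the message appears as text in exactly the layout of the format string above and that both hashes are printed as lowercase base32; `parse_upload_message`, `UPLOAD_RE`, and the sample values are hypothetical names and data for illustration, not part of Tahoe.

```python
import re

# Format string copied from the log call above.
LOG_FORMAT = "plaintext_hash=%(plaintext_hash)s, SI=%(SI)s, size=%(size)d"

# Hypothetical pattern: assumes the rendered message keeps the layout above
# and that both hashes are printed as lowercase base32 (a-z, 2-7).
UPLOAD_RE = re.compile(
    r"plaintext_hash=(?P<plaintext_hash>[a-z2-7]+), "
    r"SI=(?P<SI>[a-z2-7]+), "
    r"size=(?P<size>\d+)"
)


def parse_upload_message(line):
    """Return a dict of fields from one rendered upload message, or None."""
    match = UPLOAD_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["size"] = int(fields["size"])  # size is logged with %d
    return fields


# Made-up sample values, for illustration only.
sample = LOG_FORMAT % {
    "plaintext_hash": "mfrggzdfmztwq2lk",
    "SI": "orsxg5dtnfxgo33s",
    "size": 1234,
}
print(parse_upload_message(sample))
```

This only covers the flat, lossy kind of monitoring: it recovers per-upload fields, and says nothing about filesystem structure.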

warner added the fixed label 2008-03-25 18:56:14 +00:00
Reference: tahoe-lafs/trac-2024-07-25#367