Corrections and clarifications to remote-to-local-sync.rst.

Signed-off-by: Daira Hopwood <daira@jacaranda.org>
Daira Hopwood 2015-12-28 19:27:45 +00:00
parent b5222e3679
commit 41cf600820
1 changed files with 105 additions and 48 deletions

@@ -174,7 +174,7 @@ collapsed into the same DMD, which could get quite large. In practice a
 single DMD can easily handle the number of files expected to be written
 by a client, so this is unlikely to be a significant issue.
 123 : In these designs, the set of files in a Magic Folder is
 represented as the union of the files in all client DMDs. However,
 when a file is modified by more than one client, it will be linked
 from multiple client DMDs. We therefore need a mechanism, such as a
@@ -231,11 +231,11 @@ leave some corner cases of the write coordination problem unsolved.
 +------------------------------------------------+------+------+------+------+------+------+
 | Can result in large DMDs | | | | | | |
 +------------------------------------------------+------+------+------+------+------+------+
 | Need version number to determine priority | | | | | | |
 +------------------------------------------------+------+------+------+------+------+------+
 | Must traverse immutable directory structure | | | | | | |
 +------------------------------------------------+------+------+------+------+------+------+
 | Must traverse mutable directory structure | | | | | | |
 +------------------------------------------------+------+------+------+------+------+------+
 | Must suppress duplicate representation changes | | | | | | |
 +------------------------------------------------+------+------+------+------+------+------+
@@ -350,6 +350,9 @@ remote change has been initially classified as an overwrite.
 .. _`Fire Dragons`: #fire-dragons-distinguishing-conflicts-from-overwrites
+Note that writing a file that does not already have an entry in
+the `magic folder db`_ is initially classed as an overwrite.
 A *write/download collision* occurs when another program writes
 to ``foo`` in the local filesystem, concurrently with the new
 version being written by the Magic Folder client. We need to
@@ -372,7 +375,12 @@ this procedure for an overwrite in response to a remote change:
 conflict. (This takes as input the ``last_downloaded_uri``
 field from the directory entry of the changed ``foo``.)
 3. Set the ``mtime`` of the replacement file to be *T* seconds
-before the current local time.
+before the current local time. Stat the replacement file
+to obtain its ``mtime`` and ``ctime`` as stored in the local
+filesystem, and update the file's last-seen statinfo in
+the magic folder db with this information. (Note that the
+retrieved ``mtime`` may differ from the one that was set due
+to rounding.)
 4. Perform a ''file replacement'' operation (explained below)
 with backup filename ``foo.backup``, replaced file ``foo``,
 and replacement file ``.foo.tmp``. If any step of this
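The revised step 3 above (set the ``mtime``, then stat the file again) can be sketched as follows. This is only an illustration under stated assumptions: the function name ``backdate_and_stat`` and the ``t_seconds`` parameter are hypothetical, and writing the result to the magic folder db is left out.

```python
import os
import time


def backdate_and_stat(path, t_seconds):
    """Set the mtime of `path` to t_seconds before now, then re-stat it.

    Returns (size, mtime, ctime) as actually stored by the filesystem.
    The returned mtime may differ slightly from the requested one,
    because the filesystem may round timestamps.
    """
    target_mtime = time.time() - t_seconds
    before = os.stat(path)
    # os.utime takes (atime, mtime); the access time is left unchanged.
    os.utime(path, (before.st_atime, target_mtime))
    after = os.stat(path)
    return (after.st_size, after.st_mtime, after.st_ctime)
```

The tuple that comes back from the second stat, not the requested value, is what would be recorded as the last-seen statinfo, so that later comparisons against the on-disk metadata match exactly.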
@@ -384,11 +392,16 @@ To reclassify as a conflict, attempt to rename ``.foo.tmp`` to
 The implementation of file replacement differs between Unix
 and Windows. On Unix, it can be implemented as follows:
-* 4a. Set the permissions of the replacement file to be the
-same as the replaced file, bitwise-or'd with octal 600
-(``rw-------``).
+* 4a. Stat the replaced path, and set the permissions of the
+replacement file to be the same as the replaced file,
+bitwise-or'd with octal 600 (``rw-------``). If the replaced
+file does not exist, set the permissions according to the
+user's umask. If there is a directory at the replaced path,
+fail.
 * 4b. Attempt to move the replaced file (``foo``) to the
-backup filename (``foo.backup``).
+backup filename (``foo.backup``). If an ``ENOENT`` error
+occurs because the replaced file does not exist, ignore this
+error and continue with steps 4c and 4d.
 * 4c. Attempt to create a hard link at the replaced filename
 (``foo``) pointing to the replacement file (``.foo.tmp``).
 * 4d. Attempt to unlink the replacement file (``.foo.tmp``),
@@ -396,24 +409,30 @@ and Windows. On Unix, it can be implemented as follows:
 Note that, if there is no conflict, the entry for ``foo``
 recorded in the `magic folder db`_ will reflect the ``mtime``
-set in step 3. The link operation in step 4c will cause an
-``IN_CREATE`` event for ``foo``, but this will not trigger an
-upload, because the metadata recorded in the database entry
-will exactly match the metadata for the file's inode on disk.
-(The two hard links — ``foo`` and, while it still exists,
-``.foo.tmp`` — share the same inode and therefore the same
-metadata.)
+set in step 3. The move operation in step 4b will cause a
+``MOVED_FROM`` event for ``foo``, and the link operation in
+step 4c will cause an ``IN_CREATE`` event for ``foo``.
+However, these events will not trigger an upload, because they
+are guaranteed to be processed only after the file replacement
+has finished, at which point the last-seen statinfo recorded
+in the database entry will exactly match the metadata for the
+file's inode on disk. (The two hard links — ``foo`` and, while
+it still exists, ``.foo.tmp`` — share the same inode and
+therefore the same metadata.)
 .. _`magic folder db`: filesystem_integration.rst#local-scanning-and-database
-On Windows, file replacement can be implemented as a single
-call to the `ReplaceFileW`_ API (with the
-``REPLACEFILE_IGNORE_MERGE_ERRORS`` flag).
+On Windows, file replacement can be implemented by a call to
+the `ReplaceFileW`_ API (with the
+``REPLACEFILE_IGNORE_MERGE_ERRORS`` flag). If an error occurs
+because the replaced file does not exist, then we ignore this
+error and attempt to move the replacement file to the replaced
+file.
 Similar to the Unix case, the `ReplaceFileW`_ operation will
-cause a change notification for ``foo``. The replaced ``foo``
-has the same ``mtime`` as the replacement file, and so this
-notification will not trigger an unwanted upload.
+cause one or more change notifications for ``foo``. The replaced
+``foo`` has the same ``mtime`` as the replacement file, and so any
+such notification(s) will not trigger an unwanted upload.
 .. _`ReplaceFileW`: https://msdn.microsoft.com/en-us/library/windows/desktop/aa365512%28v=vs.85%29.aspx
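The Unix file replacement in the revised steps 4a to 4d can be sketched roughly as below. This is a minimal illustration, not the Magic Folder implementation: the function name ``replace_file_unix`` is hypothetical, and the reading of "according to the user's umask" as ``0o666 & ~umask`` is an assumption.

```python
import os
import stat


def replace_file_unix(replaced, replacement, backup):
    """Sketch of steps 4a-4d. Raises OSError if a step fails."""
    # 4a. Stat the replaced path and derive the replacement's permissions.
    try:
        st = os.stat(replaced)
    except FileNotFoundError:
        # Replaced file is absent: fall back to the umask (assumed to
        # mean 0o666 masked by the umask). os.umask can only be read by
        # setting it, so this is not safe against concurrent threads.
        umask = os.umask(0)
        os.umask(umask)
        mode = 0o666 & ~umask
    else:
        if stat.S_ISDIR(st.st_mode):
            raise IsADirectoryError(replaced)
        mode = stat.S_IMODE(st.st_mode) | 0o600
    os.chmod(replacement, mode)

    # 4b. Move the replaced file to the backup name; ENOENT is ignored.
    try:
        os.rename(replaced, backup)
    except FileNotFoundError:
        pass

    # 4c. Hard-link the replacement at the replaced name. This raises
    # FileExistsError if another process has recreated the file in the
    # meantime, which the caller would treat as a conflict.
    os.link(replacement, replaced)

    # 4d. Remove the temporary name; the data stays reachable via `replaced`.
    os.unlink(replacement)
```

Using a hard link in 4c rather than a second rename is what lets a concurrent writer be detected: ``os.link`` refuses to overwrite an existing destination, whereas ``os.rename`` would silently replace it.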
@@ -425,7 +444,7 @@ operations performed by the Magic Folder client and the other process.
 (Note that atomic operations on a directory are totally ordered.)
 The set of possible interleavings differs between Windows and Unix.
-On Unix, we have:
+On Unix, for the case where the replaced file already exists, we have:
 * Interleaving A: the other process' rename precedes our rename in
 step 4b, and we get an ``IN_MOVED_TO`` event for its rename by
@@ -457,6 +476,14 @@ On Unix, we have:
 Therefore, an upload will be triggered for ``foo`` after its
 change, which is correct and avoids data loss.
+If the replaced file did not already exist, an ``ENOENT`` error
+occurs at step 4b, and we continue with steps 4c and 4d. The other
+process' rename races with our link operation in step 4c. If the
+other process wins the race then the effect is similar to
+Interleaving C, and if we win the race then it is similar to
+Interleaving D. Either case avoids data loss.
 On Windows, the internal implementation of `ReplaceFileW`_ is similar
 to what we have described above for Unix; it works like this:
@@ -477,7 +504,11 @@ step 4c. (If there is a failure at steps 4c after step 4b has
 completed, the `ReplaceFileW`_ call will fail with return code
 ``ERROR_UNABLE_TO_MOVE_REPLACEMENT_2``. However, it is still
 preferable to use this API over two `MoveFileExW`_ calls, because
-it retains the attributes and ACLs of ``foo`` where possible.)
+it retains the attributes and ACLs of ``foo`` where possible.
+Also note that if the `ReplaceFileW`_ call fails with
+``ERROR_FILE_NOT_FOUND`` because the replaced file does not exist,
+then the replacement operation ignores this error and continues with
+the equivalent of step 4c, as on Unix.)
 However, on Windows the other application will not be able to
 directly rename ``foo.other`` onto ``foo`` (which would fail because
@@ -486,7 +517,10 @@ the destination already exists); it will have to rename or delete
 deleted. This complicates the interleaving analysis, because we
 have two operations done by the other process interleaving with
 three done by the magic folder process (rather than one operation
-interleaving with four as on Unix). The cases are:
+interleaving with four as on Unix).
+So on Windows, for the case where the replaced file already exists,
+we have:
 * Interleaving A: the other process' deletion of ``foo`` and its
 rename of ``foo.other`` to ``foo`` both precede our rename in
@@ -504,10 +538,14 @@ interleaving with four as on Unix). The cases are:
 our rename of ``foo`` to ``foo.backup`` done by `ReplaceFileW`_,
 but its rename of ``foo.other`` to ``foo`` does not, so we get
 an ``ERROR_FILE_NOT_FOUND`` error from `ReplaceFileW`_ indicating
-that the replaced file does not exist. Then we reclassify as a
-conflict; the other process' changes end up at ``foo`` (after
-it has renamed ``foo.other`` to ``foo``) and our changes end up
-at ``foo.conflicted``. This avoids data loss.
+that the replaced file does not exist. We ignore this error and
+attempt to move ``foo.tmp`` to ``foo``, racing with the other
+process which is attempting to move ``foo.other`` to ``foo``.
+If we win the race, then our changes end up at ``foo``, and the
+other process' move fails. If the other process wins the race,
+then its changes end up at ``foo``, our move fails, and we
+reclassify as a conflict, so that our changes end up at
+``foo.conflicted``. Either possibility avoids data loss.
 * Interleaving D: the other process' deletion and/or rename happen
 during the call to `ReplaceFileW`_, causing the latter to fail.
@@ -540,6 +578,11 @@ interleaving with four as on Unix). The cases are:
 .. _`MoveFileExW`: https://msdn.microsoft.com/en-us/library/windows/desktop/aa365240%28v=vs.85%29.aspx
+If the replaced file did not already exist, we get an
+``ERROR_FILE_NOT_FOUND`` error from `ReplaceFileW`_, and attempt to
+move ``foo.tmp`` to ``foo``. This is similar to Interleaving C, and
+either possibility for the resulting race avoids data loss.
 We also need to consider what happens if another process opens ``foo``
 and writes to it directly, rather than renaming another file onto it:
@@ -652,9 +695,9 @@ Fire Dragons: Distinguishing conflicts from overwrites
 When synchronizing a file that has changed remotely, the Magic Folder
 client needs to distinguish between overwrites, in which the remote
-side was aware of your most recent version and overwrote it with a
-new version, and conflicts, in which the remote side was unaware of
-your most recent version when it published its new version. Those two
+side was aware of your most recent version (if any) and overwrote it
+with a new version, and conflicts, in which the remote side was unaware
+of your most recent version when it published its new version. Those two
 cases have to be handled differently — the latter needs to be raised
 to the user as an issue the user will have to resolve and the former
 must not bother the user.
@@ -662,8 +705,9 @@ must not bother the user.
 For example, suppose that Alice's Magic Folder client sees a change
 to ``foo`` in Bob's DMD. If the version it downloads from Bob's DMD
 is "based on" the version currently in Alice's local filesystem at
-the time Alice's client attempts to write the downloaded file, then
-it is an overwrite. Otherwise it is initially classified as a
+the time Alice's client attempts to write the downloaded file or if
+there is no existing version in Alice's local filesystem at that time
+then it is an overwrite. Otherwise it is initially classified as a
 conflict.
 This initial classification is used by the procedure for writing a
@@ -729,6 +773,9 @@ metadata. This will have the effect of making other clients treat
 this change as a conflict whenever they already have a copy of the
 file.
+Conflict/overwrite decision algorithm
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Now we are ready to describe the algorithm for determining whether a
 download for the file ``foo`` is an overwrite or a conflict (refining
 step 2 of the procedure from the `Earth Dragons`_ section).
@@ -737,27 +784,27 @@ Let ``last_downloaded_uri`` be the field of that name obtained from
 the directory entry metadata for ``foo`` in Bob's DMD (this field
 may be absent). Then the algorithm is:
-* 2a. If Alice has no local copy of ``foo``, classify as an overwrite.
-* 2b. Otherwise, "stat" ``foo`` to get its *current statinfo* (size
-in bytes, ``mtime``, and ``ctime``).
-* 2c. Read the following information for the path ``foo`` from the
+* 2a. Attempt to "stat" ``foo`` to get its *current statinfo* (size
+in bytes, ``mtime``, and ``ctime``). If Alice has no local copy
+of ``foo``, classify as an overwrite.
+* 2b. Read the following information for the path ``foo`` from the
 local magic folder db:
-* the *last-uploaded statinfo*, if any (this is the size in
+* the *last-seen statinfo*, if any (this is the size in
 bytes, ``mtime``, and ``ctime`` stored in the ``local_files``
 table when the file was last uploaded);
-* the ``filecap`` field of the ``caps`` table for this file,
-which is the URI under which the file was last uploaded.
-Call this ``last_uploaded_uri``.
+* the ``last_uploaded_uri`` field of the ``local_files`` table
+for this file, which is the URI under which the file was last
+uploaded.
-* 2d. If any of the following are true, then classify as a conflict:
+* 2c. If any of the following are true, then classify as a conflict:
-* there are pending notifications of changes to ``foo``;
-* the last-uploaded statinfo is either absent, or different
-from the current statinfo;
-* either ``last_downloaded_uri`` or ``last_uploaded_uri``
+* i. there are pending notifications of changes to ``foo``;
+* ii. the last-seen statinfo is either absent (i.e. there is
+no entry in the database for this path), or different from the
+current statinfo;
+* iii. either ``last_downloaded_uri`` or ``last_uploaded_uri``
 (or both) are absent, or they are different.
 Otherwise, classify as an overwrite.
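The revised decision procedure (steps 2a to 2c above) can be expressed as a pure function, sketched below. The names ``StatInfo`` and ``classify_download`` and the string results are hypothetical choices for illustration; a real client would obtain the inputs from a stat call, the magic folder db, and its notification queue.

```python
from typing import NamedTuple, Optional


class StatInfo(NamedTuple):
    size: int
    mtime: float
    ctime: float


def classify_download(
    current: Optional[StatInfo],         # None if foo does not exist locally
    last_seen: Optional[StatInfo],       # None if there is no db entry
    last_uploaded_uri: Optional[str],    # from the local magic folder db
    last_downloaded_uri: Optional[str],  # from the remote directory entry
    pending_notifications: bool,
) -> str:
    # 2a. No local copy: nothing can be lost, so this is an overwrite.
    if current is None:
        return "overwrite"
    # 2c.i. Unprocessed local changes to foo.
    if pending_notifications:
        return "conflict"
    # 2c.ii. The file on disk is not the one last recorded in the db.
    if last_seen is None or last_seen != current:
        return "conflict"
    # 2c.iii. The remote change is not based on our last upload.
    if last_downloaded_uri is None or last_uploaded_uri is None:
        return "conflict"
    if last_downloaded_uri != last_uploaded_uri:
        return "conflict"
    return "overwrite"
```

Keeping the classification free of I/O makes each condition straightforward to unit-test; the race between the stat in 2a and the later write is handled separately by the file replacement procedure.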
@@ -857,10 +904,20 @@ take this as a signal to rename their copies to the backup filename.
 Note that the entry for this zero-length file has a version number as
 usual, and later versions may restore the file.
+When the downloader deletes a file (or renames it to a filename
+ending in ``.backup``) in response to a remote change, a local
+filesystem notification will occur, and we must make sure that this
+is not treated as a local change. To do this we have the downloader
+set the ``size`` field in the magic folder db to ``None`` (SQL NULL)
+just before deleting the file, and suppress notifications for which
+the local file does not exist, and the recorded ``size`` field is
+``None``.
 When a Magic Folder client restarts, we can detect files that had
 been downloaded but were deleted while it was not running, because
 their paths will have last-downloaded records in the magic folder db
-without any corresponding local file.
+with a ``size`` other than ``None``, and without any corresponding
+local file.
 Deletion of a directory
 ~~~~~~~~~~~~~~~~~~~~~~~