SFTP: put an approximation of grid capacity and available space in the 'df' output #1285

Open
opened 2010-12-29 21:28:25 +00:00 by davidsarah · 4 comments
davidsarah commented 2010-12-29 21:28:25 +00:00
Owner

Ticket #648 is about collecting server capacities and putting them on the welcome page. This might also allow a better approximation of available space than what SFTP currently outputs for 'df'.

Because the SFTP protocol did not originally have a way of implementing 'df', clients either use an extension (implemented in `SftpUserHandler.extendedRequest`, source:src/allmydata/frontends/sftpd.py@4545#L1757) or try to log in to the server and issue a 'df' command (implemented in `ShellSession.execCommand`, source:src/allmydata/frontends/sftpd.py@4545#L1879). sshfs does the latter, and will not mount the filesystem if it cannot do so. Currently we always report 314159265 KiB free, and a total capacity of double that, to keep sshfs happy.
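
For illustration, the placeholder numbers above could be rendered as 'df' output roughly like this. This is only a sketch, not the code in sftpd.py: it assumes the client expects the POSIX `df -P -k` layout, and `format_df_output`, the filesystem name `tahoe` and the mount point `/` are hypothetical.

```python
# Placeholder figures quoted in this ticket: 314159265 KiB free,
# and double that as the total capacity.
PLACEHOLDER_AVAIL_KIB = 314159265
PLACEHOLDER_TOTAL_KIB = 2 * PLACEHOLDER_AVAIL_KIB


def format_df_output(total_kib, avail_kib, filesystem="tahoe", mount_point="/"):
    """Return 'df -P -k' style text for one filesystem (hypothetical helper)."""
    used_kib = total_kib - avail_kib
    denom = used_kib + avail_kib
    # POSIX rounds the used percentage up to the next integer;
    # negating a floor division gives the ceiling.
    capacity = 0 if denom == 0 else -(-100 * used_kib // denom)
    header = "Filesystem 1024-blocks Used Available Capacity Mounted on"
    row = "%s %d %d %d %d%% %s" % (
        filesystem, total_kib, used_kib, avail_kib, capacity, mount_point)
    return header + "\n" + row + "\n"


if __name__ == "__main__":
    print(format_df_output(PLACEHOLDER_TOTAL_KIB, PLACEHOLDER_AVAIL_KIB), end="")
```

Replacing the two placeholder constants with real estimates is the subject of the rest of this ticket.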

However, even given the space available on each server, it is not entirely obvious what to report as the total space available on the filesystem. Brian wrote in ticket #648:

> Wait, what? What's the relationship between server-space available and the number that SFTP reports as available to any given client? Not trivial, I think.
>
> If we do this, let's make it clear that we're providing only a very rough approximation of the client-side space. Adding together all of the raw server space and dividing by the expansion factor is pretty rough, especially with the servers-of-happiness change (e.g. one server has 14TB free, but you can't upload anything because everyone else is full: SFTP should announce 0).
>
> Also let's make room for Accounting APIs to generate this data (since really it's a function of accounting: how much space an individual "user" is allowed to consume, which may be far less than the sum of all server capacities). At least let's be thinking in that direction when we name the functions.

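To make Brian's objection concrete, here is a minimal sketch of the naive estimate next to a servers-of-happiness check. The function names are hypothetical, and the encoding parameters (3-of-10 with happy = 7, so an expansion factor of 10/3) are assumed for the example rather than taken from this ticket.

```python
TB = 10 ** 12


def naive_available(free_per_server, expansion):
    """Sum of raw free space divided by the expansion factor."""
    return sum(free_per_server) / expansion


def can_place_shares(free_per_server, happy, share_size=1):
    """True if at least `happy` servers can still accept a share."""
    return sum(1 for free in free_per_server if free >= share_size) >= happy


# One server with 14TB free, nine full servers (the example from the quote).
free = [14 * TB] + [0] * 9
expansion = 10 / 3   # assumed: shares.total / shares.needed = 10 / 3
happy = 7            # assumed: shares.happy = 7

print(naive_available(free, expansion))   # about 4.2 TB is "available"
print(can_place_shares(free, happy))      # False: no upload can succeed
```

The naive figure reports terabytes of space on a grid where no upload can succeed, which is why any estimate has to take `shares.happy` into account.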
tahoe-lafs added the code-frontend, major, defect, 1.8.1 labels 2010-12-29 21:28:25 +00:00
tahoe-lafs added this to the undecided milestone 2010-12-29 21:28:25 +00:00
davidsarah commented 2010-12-30 02:32:59 +00:00
Author
Owner

Proposed method of calculating the grid capacity and available space:

H = shares.happy
if number_of_servers < H:
    total_available = 0
else:
    available = get_available_space_on_each_server()
    available.sort(reverse=True)
    for i in range(0, H-1):
        # don't count the H-1 servers with most available space
        # as having any more space than the H'th
        available[i] = available[H-1]

    total_available = sum(available)/expansion

total_capacity = total_available + sum(get_used_space_on_each_server())/expansion

The rationale here is that the H servers with most available space are likely to fill up last, and when the H'th of those servers fills up, any extra space on the remaining H-1 that have more space will be unusable. (Implementing #872 would improve this situation.)

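The proposal above can be written as a self-contained function. This is a sketch only: `estimate_grid_space` and its arguments are hypothetical names, the per-server lists stand in for `get_available_space_on_each_server()` and `get_used_space_on_each_server()`, and `happy` is assumed to be at least 1.

```python
def estimate_grid_space(available, used, happy, expansion):
    """Return (total_capacity, total_available) for a grid.

    available, used: per-server byte counts (hypothetical inputs).
    happy: the shares.happy setting; expansion: the expansion factor.
    """
    if len(available) < happy:
        total_available = 0
    else:
        ranked = sorted(available, reverse=True)
        # Space on the happy'th best server; the happy-1 servers ahead of
        # it are not counted as having any more space than this.
        cap = ranked[happy - 1]
        clamped = [min(space, cap) for space in ranked]
        total_available = sum(clamped) / expansion
    total_capacity = total_available + sum(used) / expansion
    return total_capacity, total_available


# Four servers, happy = 3, expansion = 2: the 14 and 5 are clamped to 3.
print(estimate_grid_space([14, 5, 3, 1], [2, 2, 2, 2], happy=3, expansion=2.0))
# → (9.0, 5.0)
```

With one server holding 14TB and nine full servers, `cap` is 0 for `happy = 7`, so the function reports 0 available, which matches the behaviour Brian asked for in #648.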
davidsarah commented 2010-12-30 04:44:29 +00:00
Author
Owner

Hmm, do we have any way of implementing get_used_space_on_each_server()? I don't think we do (but that would be part of #648).

davidsarah commented 2010-12-30 04:52:08 +00:00
Author
Owner

Attachment delegate-grid-stats-calculation-to-client.darcs.patch (11898 bytes) added

Move responsibility for calculating the estimated total/used/available space on a grid as used by SFTP to client.py.


Replying to davidsarah:

> Hmm, do we have any way of implementing get_used_space_on_each_server()? I don't think we do (but that would be part of #648).

Figuring out how much space your local storage server is using is #671.

tahoe-lafs added normal and removed major labels 2012-04-01 05:02:05 +00:00
warner added code-frontend-ftp-sftp and removed code-frontend labels 2014-12-02 19:48:28 +00:00

Reference: tahoe-lafs/trac-2024-07-25#1285