SFTP and FTP: support for non-UTF-8 charsets (error message "Path could not be decoded as UTF-8") #1089
Reference: tahoe-lafs/trac-2024-07-25#1089
I'm opening a new ticket because I didn't find any reported problem with UTF-8 (except #704, which is afaik something different).
When I create a file or directory over the web frontend, everything goes well. But any attempt to create a directory or file with national characters in its name over the FTP/SFTP frontend fails.
WinSCP tells me "Path could not be decoded as UTF-8"; Total Commander just reports a general error.
Example directory name which fails: žluťoučký kůň úpěl ďábelské ódy (meaning: "a yellow horse moaned devilish odes"), but it also fails with any other reasonable name such as "dovolená" (holiday).
Reproducibility: always
Python version: 2.6
And when I create a directory through the web interface, I cannot enter it using an FTP client (TC tells me "Internal server error").
This error will occur if the file or directory name is not valid UTF-8. Polish systems often use ISO-Latin-2 locales -- is that what your filesystem uses?
If so, the SFTP specification implies that it is the responsibility of the client to convert names to UTF-8, and apparently the clients you tried aren't doing that.
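The failure mode described above can be reproduced outside Tahoe-LAFS. The following is a minimal sketch in Python 3 (the ticket was filed against Python 2.6), assuming the client sends names in ISO-8859-2; `decode_path` is a hypothetical helper for illustration, not the frontend's actual code.

```python
# Hypothetical helper, NOT the Tahoe-LAFS implementation: it only mimics a
# server that insists on UTF-8 path names.
def decode_path(raw: bytes) -> str:
    """Decode a path received from a client, accepting only valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        raise ValueError("Path could not be decoded as UTF-8") from None


name = "dovolená"

# A client that sends the name in ISO-8859-2 (assumed here) produces bytes
# that are not valid UTF-8, so the server rejects them.
latin2_bytes = name.encode("iso-8859-2")
try:
    decode_path(latin2_bytes)
    rejected = False
except ValueError:
    rejected = True

# A client configured for UTF-8 sends bytes the server can decode.
utf8_bytes = name.encode("utf-8")
accepted = decode_path(utf8_bytes)

# The conversion the client is expected to perform: interpret the bytes in
# the local encoding, then re-encode them as UTF-8 before sending.
converted = latin2_bytes.decode("iso-8859-2").encode("utf-8")
```

The last step is the conversion that, per the comment above, is the client's responsibility.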
Alternatives:
tahoe.cfg or the FTP/SFTP accounts file to specify another encoding.
The FTP frontend does not support Unicode at all; that is ticket #682. However, if #682 were fixed by implementing RFC 2640, then the FTP frontend would also only support UTF-8.
We can improve the error message for clients that display it.
It is a bug in Total Commander that it doesn't display the message. Also, if "Internal server error" was reported for SFTP then that description is inaccurate -- FX_FAILURE just means an error that has no more specific code. In this case we could arguably report FX_BAD_MESSAGE, though.
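As a rough illustration of the two status codes mentioned above, here is a small Python sketch. The numeric values are the SSH_FX_* codes from the SFTP protocol drafts; the function `status_for_path` and its `use_bad_message` flag are hypothetical and do not reflect how the Tahoe-LAFS frontend is structured.

```python
# SSH_FX_* status codes as defined in the SFTP protocol drafts.
FX_FAILURE = 4      # generic error with no more specific code
FX_BAD_MESSAGE = 5  # badly formatted packet or protocol incompatibility

ERROR_TEXT = "Path could not be decoded as UTF-8"


def status_for_path(raw: bytes, use_bad_message: bool = False):
    """Return (status_code, message) for an undecodable path, else None.

    Hypothetical helper: `use_bad_message` switches between the generic
    code and the arguably more specific FX_BAD_MESSAGE.
    """
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        code = FX_BAD_MESSAGE if use_bad_message else FX_FAILURE
        return code, ERROR_TEXT
    return None
```

Either way the human-readable message travels with the status code, so whether the user sees it depends on the client displaying it.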
Title changed from "Path could not be decoded as UTF-8" to "SFTP and FTP: Path could not be decoded as UTF-8".
wiki/SftpFrontend#Unicodefilenames already documented this problem, but I added a reference to this ticket.
Replying to slush:
This problem (listing a directory containing non-ASCII names in FTP) is part of #682.
@David:
You are right, I fixed the UTF-8 problems in WinSCP by setting the correct charset. I think this should be mentioned in the SFTP frontend docs (because many other SFTP servers I'm using work without any settings changes).
Replying to slush:
Please change the 'Unicode filenames' section of wiki/SftpFrontend to explain how to do this.
Replying to davidsarah (comment:8):
Done.
I'm unconvinced that supporting non-UTF-8 encodings is worth the hassle and complexity. Does anyone want to argue in favour of it?
Note that neither SFTP nor FTP has any standard by which a specific non-UTF-8 encoding could be automatically negotiated. So, this would have to be manually configured. But clients that do a reasonable job of supporting non-ASCII characters at all usually have an option to select UTF-8. So I think that support for other encodings would benefit very few users.
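To see why a non-UTF-8 encoding would have to be configured by hand rather than guessed, consider this short Python sketch. The choice of ISO-8859-2, ISO-8859-1 and cp1250 is an assumption made for illustration; the point is only that single-byte encodings cannot be told apart from the bytes alone.

```python
name = "žluťoučký kůň"

# Bytes as an ISO-8859-2 client would send them.
raw = name.encode("iso-8859-2")

# Decoding with the right single-byte encoding recovers the name.
as_latin2 = raw.decode("iso-8859-2")

# Decoding with the wrong one also "succeeds", but silently yields a
# different name, so the server gets no error to act on.
as_latin1 = raw.decode("iso-8859-1")

# Another common Central European encoding represents the same name with
# different bytes, so the byte sequence does not identify the encoding.
raw_cp1250 = name.encode("cp1250")
```

UTF-8 differs in that invalid input is usually detectable, which is what lets the server reject such names with an error instead of storing a mangled one.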
Title changed from "SFTP and FTP: Path could not be decoded as UTF-8" to "SFTP and FTP: support for non-UTF-8 charsets (error message "Path could not be decoded as UTF-8")".
I'm wontfixing this, because SFTP and FTP simply have no support for negotiating non-UTF-8 encodings.