
Re: [RFC] Unicode/UTF-8 support for XFS

To: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Subject: Re: [RFC] Unicode/UTF-8 support for XFS
From: Olaf Weber <olaf@xxxxxxx>
Date: Mon, 15 Sep 2014 09:16:24 +0200
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, Ben Myers <bpm@xxxxxxx>, <tinguely@xxxxxxx>, <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140912205528.GB11717@xxxxxxxxxxxxx>
Organization: SGI
References: <20140911203735.GA19952@xxxxxxx> <20140912100230.GB4267@dastard> <5412DF37.9030005@xxxxxxx> <20140912205528.GB11717@xxxxxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.1.1
On 12-09-14 22:55, Christoph Hellwig wrote:
> On Fri, Sep 12, 2014 at 01:55:35PM +0200, Olaf Weber wrote:
>> I looked up those discussions in the archives.  For example, here's
>> Christoph about rejecting filenames if they're not well-formed unicode.
>> And Jamie Lokier making a similar point:
>
> And I might now disagree with my past self.  While non-utf8 characters
> are perfectly valid unix filenames, and I think everyone's life is easier
> if we generally stay out of the utf8 business, it seems that for this
> particular use case (shared filesystem with Windows, right?) just
> accepting utf8 should be fine.  ZFS is doing so, MacOS X apparently is,
> and NFSv4 requires it, although as far as I know most implementations
> ignore that requirement.

One issue is working in environments that are not UTF-8 clean. For example, unpacking a tarball with non-UTF-8 filenames in it. The names would have to be transcoded, which is only really possible if you know the original character set. And if the filesystem flat out rejects non-UTF-8 filenames, then you'd be unable to unpack the tarball at all.
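
To make the transcoding problem concrete, here is a rough sketch in Python of what an unpacking tool would have to do. The function name to_utf8_name and the default charset are hypothetical, chosen only for illustration; nothing here is part of the proposed XFS patches.

```python
def to_utf8_name(raw: bytes, assumed_charset: str = "latin-1") -> bytes:
    """Return a UTF-8 filename for the raw bytes of a name.

    Hypothetical helper: if raw is already well-formed UTF-8 it is
    returned unchanged; otherwise it is transcoded from assumed_charset,
    which the caller has to guess or know from context.
    """
    try:
        raw.decode("utf-8")  # strict decode; raises on ill-formed input
        return raw
    except UnicodeDecodeError:
        return raw.decode(assumed_charset).encode("utf-8")


# The same bytes give different names depending on the guessed charset.
name = b"caf\xe9.txt"
print(to_utf8_name(name, "latin-1"))
print(to_utf8_name(name, "cp1251"))
```

Both calls succeed, but they produce different filenames, and nothing in the tarball says which one is right. That is the sense in which transcoding only really works when the original character set is known.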

Olaf Weber                 SGI               Phone:  +31(0)30-6696796
                           Veldzigt 2b       Fax:    +31(0)30-6696799
Technical Lead             3454 PW de Meern  Vnet:   955-6796
Storage Software           The Netherlands   Email:  olaf@xxxxxxx
