Re: RFC: Case-insensitive support for XFS

To: "Nicholas Miell" <nmiell@xxxxxxxxxxx>, "Christoph Hellwig" <hch@xxxxxxxxxxxxx>
Subject: Re: RFC: Case-insensitive support for XFS
From: "Barry Naujok" <bnaujok@xxxxxxx>
Date: Mon, 08 Oct 2007 15:07:48 +1000
Cc: "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx, urban@xxxxxxxxxxxxxx
In-reply-to: <1191610338.2695.8.camel@entropy>
Organization: SGI
References: <op.ty6361ut3jf8g2@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <op.tzpbqspl3jf8g2@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <20071005154442.GA6432@xxxxxxxxxxxxx> <1191610338.2695.8.camel@entropy>
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Opera Mail/9.10 (Win32)
On Sat, 06 Oct 2007 04:52:18 +1000, Nicholas Miell <nmiell@xxxxxxxxxxx> wrote:

On Fri, 2007-10-05 at 16:44 +0100, Christoph Hellwig wrote:
[Adding -fsdevel because some of the things touched here might be of
 broader interest and Urban because his name is on nls_utf8.c]

On Fri, Oct 05, 2007 at 11:57:54AM +1000, Barry Naujok wrote:
>
> On its own, Linux only provides case conversion for old-style
> character sets - 8-bit sequences only. A lot of distros are
> now defaulting to UTF-8, and the Linux NLS code does not support
> case conversion for any Unicode sets.

The lack of case tables in nls_utf8.c definitely seems odd to me.
Urban, is there a reason for that?  The only thing that comes to
mind is that these tables might be quite large.


Case conversion in Unicode is locale dependent. The legacy 8-bit
character encodings don't code for enough characters to run into the
ambiguities, so they can get away with fixed case conversion tables.
Unicode can't.

Based on http://www.unicode.org/reports/tr21/tr21-5.html and
http://www.unicode.org/Public/UNIDATA/CaseFolding.txt, doing
case comparison using that table should cater for most
circumstances, with a few exceptions. It should be enough
to satisfy a locale-independent case-insensitive filesystem
(i.e. the C + F case folding option).
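
As a rough illustration of what C + F folding means in practice,
here is a Python sketch (not kernel code). The function names and
the embedded excerpt are made up for this example; the excerpt only
holds a handful of lines in the CaseFolding.txt format, and the
locale-specific T entries are deliberately skipped.

```python
# A few lines in the CaseFolding.txt format: code; status; mapping; # name
# (illustrative excerpt only; the real file has many more entries)
EXCERPT = """\
0041; C; 0061; # LATIN CAPITAL LETTER A
0042; C; 0062; # LATIN CAPITAL LETTER B
0049; C; 0069; # LATIN CAPITAL LETTER I
0049; T; 0131; # LATIN CAPITAL LETTER I
00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
0130; F; 0069 0307; # LATIN CAPITAL LETTER I WITH DOT ABOVE
0130; T; 0069; # LATIN CAPITAL LETTER I WITH DOT ABOVE
"""


def parse_case_folding(text, statuses=("C", "F")):
    """Build {code point: folded string} from the wanted status classes."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        code, status, mapping = [f.strip() for f in line.split(";")[:3]]
        if status in statuses:
            folded = "".join(chr(int(cp, 16)) for cp in mapping.split())
            table[int(code, 16)] = folded
    return table


def fold_name(name, table):
    """Fold each character; characters without an entry map to themselves."""
    return "".join(table.get(ord(ch), ch) for ch in name)


def names_equal(a, b, table):
    """Locale-independent comparison: compare the folded forms."""
    return fold_name(a, table) == fold_name(b, table)


FOLD_TABLE = parse_case_folding(EXCERPT)
```

With this excerpt, names_equal("BI\u00df", "biss", FOLD_TABLE) holds,
because the F entry expands the sharp s to "ss". Note that F entries
can change the length of a name, which a fixed one-to-one table
cannot express.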

Is normalization required after case-folding? What I read
implies it is not necessary for this purpose (and would
slow things down and bloat the code more).

Now, I suppose it's just a question of whether to use a fixed
table in the kernel driver (HFS+ style) or data stored in a
special on-disk inode (NTFS style, with identical tables shared
and refcounted in memory). With the on-disk approach, the table
can be generated by mkfs.xfs.
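
The "shared, refcounted in memory when the same" idea could look
roughly like the Python sketch below. Everything in it is
hypothetical: the class name, keying by a SHA-256 digest of the
on-disk bytes, and the acquire/release interface are assumptions for
illustration, not how NTFS or any XFS patch does it.

```python
import hashlib


class SharedFoldTables:
    """Hypothetical cache: mounts with identical tables share one copy."""

    def __init__(self):
        self._entries = {}  # digest -> [table bytes, reference count]

    def acquire(self, on_disk_bytes):
        """Called at mount time with the table read from the special inode."""
        key = hashlib.sha256(on_disk_bytes).digest()
        entry = self._entries.get(key)
        if entry is None:
            entry = [bytes(on_disk_bytes), 0]
            self._entries[key] = entry
        entry[1] += 1
        return key, entry[0]

    def release(self, key):
        """Called at unmount; frees the table when the last user is gone."""
        entry = self._entries[key]
        entry[1] -= 1
        if entry[1] == 0:
            del self._entries[key]

    def refcount(self, key):
        entry = self._entries.get(key)
        return entry[1] if entry else 0
```

Two filesystems made with the same table would then get the same
in-memory object back from acquire(), while a filesystem with a
different table gets its own copy.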

Regards,
Barry.


