Re: RFC: Case-insensitive support for XFS

To: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Subject: Re: RFC: Case-insensitive support for XFS
From: Nicholas Miell <nmiell@xxxxxxxxxxx>
Date: Fri, 05 Oct 2007 11:52:18 -0700
Cc: Barry Naujok <bnaujok@xxxxxxx>, "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx, urban@xxxxxxxxxxxxxx
In-reply-to: <20071005154442.GA6432@xxxxxxxxxxxxx>
References: <op.ty6361ut3jf8g2@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <op.tzpbqspl3jf8g2@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <20071005154442.GA6432@xxxxxxxxxxxxx>
Sender: xfs-bounce@xxxxxxxxxxx
On Fri, 2007-10-05 at 16:44 +0100, Christoph Hellwig wrote:
> [Adding -fsdevel because some of the things touched here might be of
>  broader interest and Urban because his name is on nls_utf8.c]
> 
> On Fri, Oct 05, 2007 at 11:57:54AM +1000, Barry Naujok wrote:
> > 
> > On its own, Linux only provides case conversion for old-style
> > character sets - 8 bit sequences only. A lot of distros are
> > now defaulting to UTF-8 and Linux NLS stuff does not support
> > case conversion for any Unicode sets.
> 
> The lack of case tables in nls_utf8.c definitely seems odd to me.
> Urban, is there a reason for that?  The only thing that comes to
> mind is that these tables might be quite large.
> 

Case conversion in Unicode is locale-dependent. The legacy 8-bit
character encodings don't encode enough characters to run into the
ambiguities, so they can get away with fixed case conversion tables.
Unicode can't.
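
The usual illustration of this is the Turkish/Azeri "I": in most
languages uppercase I lowercases to i, but in Turkish it lowercases to
dotless ı (U+0131), and dotted İ (U+0130) lowercases to i. A minimal
sketch in Python follows. The helper name lower_for_locale and its
language-tag argument are made up for this example, and it only covers
the unconditional part of the Turkic tailoring, not the full set of
special-casing rules:

```python
# Hypothetical helper, for illustration only: locale-tailored lowercasing.
# Python's str.lower() applies the locale-independent default mapping,
# so the Turkic tailoring has to be applied by hand before calling it.
def lower_for_locale(text, lang):
    if lang in ("tr", "az"):
        # Dotted capital I (U+0130) -> i, then plain capital I -> dotless i (U+0131).
        text = text.replace("\u0130", "i").replace("I", "\u0131")
    return text.lower()


print(lower_for_locale("TITLE", "en"))  # default mapping
print(lower_for_locale("TITLE", "tr"))  # Turkic tailoring gives a different string
```

The same input lowercases to two different strings depending on the
language, which is why a single fixed table cannot be correct for
everyone.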

I'd point you to the Unicode technical report which explains how to do
it, but unicode.org seems to be offline right now.

> > NTFS in Linux also implements its own dcache and NTFS also
> 
>                                         ^^^^^^ dentry operations?
> 
> > stores its Unicode case table on disk. This allows the filesystem
> > to migrate to newer forms of Unicode at the time of formatting
> > the filesystem. E.g. Windows Vista now supports Unicode 5.0
> > while older versions would support an earlier version of
> > Unicode. Linux's version of the NTFS case table is implemented
> > in fs/ntfs/upcase.c, defined as default_upcase.
> 
> Because ntfs uses 16bit wide chars it prefers to use its own tables.
> I'm not sure that's such a good idea.

Well, Windows uses those on-disk tables, so the Linux driver has to
as well. I don't see how that's a bad idea, or any way to avoid it and
remain compatible.
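
To make the compatibility point concrete, here is a rough sketch of
what comparing names through a per-volume table looks like: every
16-bit code unit of both names is mapped through the table and the
results are compared. This is an illustration in Python, not the
driver's code. All function names are invented, and the demo table is
generated from Python's own case data as a stand-in for whatever table
the volume actually carries:

```python
import struct


def build_demo_upcase_table():
    """Stand-in for an on-disk table: one 16-bit entry per UTF-16 code unit."""
    table = list(range(0x10000))  # identity mapping by default
    for cu in range(0x10000):
        if 0xD800 <= cu <= 0xDFFF:
            continue  # leave surrogate code units mapped to themselves
        up = chr(cu).upper()
        # Keep only one-to-one mappings that still fit in 16 bits.
        if len(up) == 1 and ord(up) < 0x10000:
            table[cu] = ord(up)
    return table


def to_utf16_units(name):
    """Split a name into its little-endian UTF-16 code units."""
    raw = name.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


def names_equal(a, b, upcase):
    """Case-insensitive comparison, one code unit at a time, via the table."""
    ua, ub = to_utf16_units(a), to_utf16_units(b)
    return len(ua) == len(ub) and all(
        upcase[x] == upcase[y] for x, y in zip(ua, ub)
    )


table = build_demo_upcase_table()
print(names_equal("Readme.TXT", "README.txt", table))
```

Whether two names collide depends entirely on the table's contents, so
a driver that substituted a different table could disagree with Windows
about which names are the same file.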

-- 
Nicholas Miell <nmiell@xxxxxxxxxxx>

