On Fri, 2007-10-05 at 16:44 +0100, Christoph Hellwig wrote:
> [Adding -fsdevel because some of the things touched here might be of
> broader interest and Urban because his name is on nls_utf8.c]
> On Fri, Oct 05, 2007 at 11:57:54AM +1000, Barry Naujok wrote:
> > On its own, Linux only provides case conversion for old-style
> > character sets - 8 bit sequences only. A lot of distros are
> > now defaulting to UTF-8 and Linux NLS stuff does not support
> > case conversion for any Unicode sets.
> The lack of case tables in nls_utf8.c definitely seems odd to me.
> Urban, is there a reason for that? The only thing that comes to
> mind is that these tables might be quite large.
Case conversion in Unicode is locale dependent. The legacy 8-bit
character encodings don't code for enough characters to run into the
ambiguities, so they can get away with fixed case conversion tables.
I'd point you to the Unicode technical report which explains how to do
it, but unicode.org seems to be offline right now.
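
To make the ambiguity concrete, here is a rough sketch in Python (not kernel
code). The function names are made up for illustration, and it covers only
the Turkish/Azeri dotted and dotless i, which is the usual example of a
locale-dependent mapping. Python's built-in str.upper()/str.lower() apply the
locale-independent default mappings, so the Turkish variants have to special
case the i's by hand.

```python
# Hypothetical helpers for illustration only; not part of any kernel or NLS API.

DOTTED_CAPITAL_I = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"    # LATIN SMALL LETTER DOTLESS I


def upper_default(s):
    """Locale-independent uppercasing (Unicode default case mapping)."""
    return s.upper()


def upper_turkish(s):
    """Uppercasing under Turkish/Azeri rules: i -> U+0130, U+0131 -> I."""
    # Handle the two locale-specific letters first, then fall back to
    # the default mapping for everything else.
    s = s.replace("i", DOTTED_CAPITAL_I).replace(DOTLESS_SMALL_I, "I")
    return s.upper()


def lower_turkish(s):
    """Lowercasing under Turkish/Azeri rules: U+0130 -> i, I -> U+0131."""
    s = s.replace(DOTTED_CAPITAL_I, "i").replace("I", DOTLESS_SMALL_I)
    return s.lower()


if __name__ == "__main__":
    # The same input has two different "correct" uppercase forms.
    print(upper_default("title"))
    print(upper_turkish("title"))
    print(lower_turkish("TITLE"))
```

The same byte sequence therefore has two different "correct" uppercase
forms depending on the locale, which a single fixed table cannot express.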
> > NTFS in Linux also implements its own dcache and NTFS also
>                                         ^^^^^^ dentry operations?
> > stores its unicode case table on disk. This allows the filesystem
> > to migrate to newer forms of Unicode at the time of formatting
> > the filesystem. E.g. Windows Vista now supports Unicode 5.0
> > while older versions would support an earlier version of
> > Unicode. Linux's version of NTFS case table is implemented
> > in fs/ntfs/upcase.c defined as default_upcase.
> Because ntfs uses 16-bit wide chars it prefers to use its own tables.
> I'm not sure that's such a good idea.
Well, Windows uses those on-disk tables, so the Linux driver has to
also. I don't see how that's a bad idea or any way to not do it and
Nicholas Miell <nmiell@xxxxxxxxxxx>