
Re: Tomorrow

To: Michael Sinz <msinz@xxxxxxxxx>
Subject: Re: Tomorrow
From: Andi Kleen <ak@xxxxxxx>
Date: Tue, 27 May 2003 14:06:50 +0200
Cc: Andi Kleen <ak@xxxxxxx>, linux-xfs@xxxxxxxxxxx
In-reply-to: <3ED344C0.1010700@xxxxxxxxx>
References: <1053694002.2887.1.camel@xxxxxxxxxxxxxxxxxxxxx> <1053697162.21472.51.camel@xxxxxxxxxxxxxxxxxxxx> <20030523134438.GC30288@xxxxxxxxxxxxx> <20030523150530.A31022@xxxxxxxxxxxxx> <20030524071709.GK27626@xxxxxxxxxxxxxxx> <20030524095245.A24074@xxxxxxxxxxxxx> <20030524091516.GM27626@xxxxxxxxxxxxxxx> <20030524093103.GA12181@xxxxxxxxxxxxx> <3ED344C0.1010700@xxxxxxxxx>
Sender: linux-xfs-bounce@xxxxxxxxxxx
On Tue, May 27, 2003 at 06:58:08AM -0400, Michael Sinz wrote:
> When we did this for the Amiga (oh so many years ago) it was a royal
> PITA.  We ended up punting for the most part on anything that was
> outside of the ISO-Latin-1 code page and even there we had a problem
> due to some "differences" of opinion by certain language groups what
> was supposed to happen.

I wrote a C library for the Amiga a long time ago, and in the end I
left all of this to locale.library because it was too nasty to do myself.

> This gets worse when you look at behavior patterns due to the fact that
> a file, especially one accessed over the network, may be accessed by
> a machine with different locale settings and thus have slightly different
> rules as to what is the lowercase form of an uppercase letter or wordform.

AFAIK the SMB protocol handles this.
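
To make the quoted concern concrete, here is a minimal sketch of how the
lowercase form of one name can differ by locale. The helper
lower_for_locale and its language tags are hypothetical, written only for
illustration; it special-cases the Turkish/Azeri dotted and dotless "i",
which is the best-known example, and is not taken from any kernel or SMB
code.

```python
# Hypothetical helper for illustration only; not an API from any library.
# Turkish and Azeri lowercase "I" to dotless "ı" (U+0131) and
# dotted "İ" (U+0130) to plain "i".
TURKIC_LOWER = str.maketrans({"I": "\u0131", "\u0130": "i"})


def lower_for_locale(name: str, lang: str) -> str:
    """Lowercase `name` using the rules of the given language tag."""
    if lang in ("tr", "az"):
        # Apply the language-specific mappings before the default ones.
        name = name.translate(TURKIC_LOWER)
    # Default Unicode lowercasing, which ignores the process locale.
    return name.lower()


if __name__ == "__main__":
    # The same stored name lowercases differently on the two clients.
    print(lower_for_locale("TITLE.TXT", "en"))
    print(lower_for_locale("TITLE.TXT", "tr"))
```

Two clients applying different rules would disagree on whether
"TITLE.TXT" and "title.txt" name the same file.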

> >You either only support UTF-8 Unicode (shifting the burden of conversion
> >to user space) or you need to store a "codepage" per filesystem.  Linux
> >seems to go towards the UTF-8 route.  The kernel already has some code for
> >this (JFS does it), but it will not be pretty.
> I have not looked at the JFS code at all but this can not be very pretty
> if they supported the locale preferences.  (Unless, in the last 10 years

JFS takes a code page as a mount option, or you can use UTF-8. The locale
code that supports this is a generic kernel subsystem, also used by VFAT.

> there was some new agreement such that case conversions for all locales
> are consistent with each other)

Yes, there is: Unicode/UTF-8. That is where all the Linux distributions are
going, too. For legacy SMB support you will still need to support code pages,
but that could be done by Samba. For XFS I guess it would be enough to just
support UTF-8. Supporting different code pages is probably not too useful.
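
A rough user-space sketch of the UTF-8-only route, under the assumption
that names are stored as UTF-8 bytes and that a legacy client's code page
is known to whoever does the conversion (Samba, in the SMB case). The
function names legacy_to_utf8 and same_name_ignoring_case are made up for
this example, and Python's casefold() stands in for whatever
locale-independent folding a real implementation would use.

```python
def legacy_to_utf8(raw: bytes, codepage: str) -> bytes:
    """Convert a name from a legacy code page to UTF-8 (done in user space)."""
    return raw.decode(codepage).encode("utf-8")


def same_name_ignoring_case(a: bytes, b: bytes) -> bool:
    """Compare two UTF-8 names with Unicode case folding, independent of locale."""
    return a.decode("utf-8").casefold() == b.decode("utf-8").casefold()


if __name__ == "__main__":
    # "café" as sent by a Latin-1 client and by a CP437 (DOS) client.
    from_latin1 = legacy_to_utf8(b"caf\xe9", "latin-1")
    from_cp437 = legacy_to_utf8(b"caf\x82", "cp437")
    # Both conversions yield the same UTF-8 bytes for the stored name.
    print(from_latin1 == from_cp437)
    print(same_name_ignoring_case("CAFÉ".encode("utf-8"), from_latin1))
```

With the conversion kept outside the filesystem, the stored name has one
encoding regardless of which client created it.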

