On Tue, May 27, 2003 at 06:58:08AM -0400, Michael Sinz wrote:
> When we did this for the Amiga (oh so many years ago) it was a royal
> PITA. We ended up punting for the most part on anything that was
> outside of the ISO-Latin-1 code page and even there we had a problem
> due to some "differences" of opinion by certain language groups what
> was supposed to happen.
I wrote a C library for the Amiga a long time ago, and in the end I
left it all to locale.library because it was too nasty to do myself.
> This gets worse when you look at behavior patterns due to the fact that
> a file, especially one accessed over the network, may be accessed by
> a machine with different locale settings and thus have slightly different
> rules as to what is the lowercase form of an uppercase letter or wordform.
AFAIK the SMB protocol handles this.
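To make the locale problem concrete, here is a minimal sketch in Python
(not kernel code). It assumes a hand-written Turkish case table;
TURKISH_UPPER, upper_default and upper_turkish are names made up for this
illustration. Two machines applying different rules to the same name end
up with different uppercase forms, so a case-insensitive compare can
disagree between them:

```python
# Illustrative only: the same file name uppercased under two different
# case-mapping rules. The Turkish table below is a hand-written assumption
# for this sketch, not taken from any locale database.
TURKISH_UPPER = {
    "i": "\u0130",  # dotted lowercase i -> LATIN CAPITAL LETTER I WITH DOT ABOVE
    "\u0131": "I",  # dotless lowercase i -> plain capital I
}


def upper_default(name: str) -> str:
    """Uppercase using the locale-independent default Unicode mapping."""
    return name.upper()


def upper_turkish(name: str) -> str:
    """Uppercase using the Turkish rule for i, default mapping otherwise."""
    return "".join(TURKISH_UPPER.get(ch, ch.upper()) for ch in name)


name = "file.txt"
default_form = upper_default(name)
turkish_form = upper_turkish(name)

# The two forms differ in the second character, so a lookup keyed on the
# uppercased name would miss on one of the two machines.
print(default_form == turkish_form)
```
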
> >You either only support UTF-8 Unicode (shifting the burden of conversion
> >to user space) or you need to store a "codepage" per filesystem. Linux
> >seems to go towards the UTF-8 route. The kernel already has some code
> >for this (JFS does it), but it will not be pretty.
>
> I have not looked at the JFS code at all but this can not be very pretty
> if they supported the locale preferences. (Unless, in the last 10 years
JFS has a code page as mount option or you can use UTF-8. The locale
code to support this is a generic kernel subsystem, also used by VFAT.
> there was some new agreement such that case conversion for all locales
> are consistent with each other)
Yes there is: Unicode/UTF-8. That is where all the Linux distributions are
going too. For legacy SMB support you will still need to support codepages,
but that could be done by Samba. For XFS I guess it would be enough to just
support UTF-8. Supporting different code pages is probably not too useful
anymore.
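
A small sketch of why per-filesystem code pages are painful, again in
Python and only as an illustration (the byte value and variable names are
my own choice, not anything from JFS or VFAT). The same on-disk byte means
a different character depending on which code page the reader assumes,
while the UTF-8 encoding of a character is the same everywhere:

```python
# One raw byte as it might appear in a legacy file name.
raw = b"\xe9"

# Interpreted under two different legacy code pages.
as_latin1 = raw.decode("latin-1")  # U+00E9, e with acute accent
as_cp437 = raw.decode("cp437")     # U+0398, Greek capital theta

# The UTF-8 form of U+00E9 is a fixed two-byte sequence; no code page
# setting is needed to interpret it.
utf8_name = "\u00e9".encode("utf-8")

print(as_latin1 == as_cp437)  # the two readers disagree on the character
print(utf8_name)
```
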
-Andi