
Re: Tomorrow

To: Andi Kleen <ak@xxxxxxx>
Subject: Re: Tomorrow
From: Michael Sinz <msinz@xxxxxxxxx>
Date: Tue, 27 May 2003 09:01:59 -0400
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: <20030527120650.GA22306@xxxxxxxxxxxxx>
References: <1053694002.2887.1.camel@xxxxxxxxxxxxxxxxxxxxx> <1053697162.21472.51.camel@xxxxxxxxxxxxxxxxxxxx> <20030523134438.GC30288@xxxxxxxxxxxxx> <20030523150530.A31022@xxxxxxxxxxxxx> <20030524071709.GK27626@xxxxxxxxxxxxxxx> <20030524095245.A24074@xxxxxxxxxxxxx> <20030524091516.GM27626@xxxxxxxxxxxxxxx> <20030524093103.GA12181@xxxxxxxxxxxxx> <3ED344C0.1010700@xxxxxxxxx> <20030527120650.GA22306@xxxxxxxxxxxxx>
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4b) Gecko/20030507
Andi Kleen wrote:
On Tue, May 27, 2003 at 06:58:08AM -0400, Michael Sinz wrote:

When we did this for the Amiga (oh so many years ago) it was a royal
PITA.  We ended up punting for the most part on anything that was
outside of the ISO-Latin-1 code page and even there we had a problem
due to some "differences" of opinion among certain language groups
about what was supposed to happen.

I wrote a C Library for the Amiga a long time ago and in the end I left it all for locale.library because it was too nasty to do by itself.

That is why we wrote the locale.library - it was nasty.
Even worse when you add in the sorting issues.

This gets worse when you look at behavior patterns due to the fact that
a file, especially one accessed over the network, may be accessed by
a machine with different locale settings and thus have slightly different
rules as to what is the lowercase form of an uppercase letter or wordform.

AFAIK the SMB protocol handles this.

I would have to look at how it deals with uniqueness vs non-uniqueness
between different clients.  That was the really hard problem for us.
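
To make the uniqueness problem concrete, here is a minimal sketch in Python. It is not how SMB, XFS, or locale.library handle it. The helper names (lower_default, lower_turkish, same_name) are hypothetical, and the Turkish-style rule is hand-written for illustration rather than taken from any real locale database. It only shows that two clients with different lowercase rules can disagree on whether two names refer to the same file.

```python
# Hypothetical sketch: two clients fold file names with different case rules.

def lower_default(name: str) -> str:
    # Locale-independent lowercasing as provided by Python's str.lower().
    return name.lower()

# Hand-written Turkish-style rule (an assumption for illustration):
# dotless capital I maps to dotless small i, dotted capital I maps to "i".
_TURKISH_MAP = str.maketrans({"I": "\u0131", "\u0130": "i"})

def lower_turkish(name: str) -> str:
    # Apply the Turkish-specific pairs first, then lowercase the rest.
    return name.translate(_TURKISH_MAP).lower()

def same_name(a: str, b: str, fold) -> bool:
    # Two names collide on a case-insensitive lookup if they fold identically.
    return fold(a) == fold(b)

if __name__ == "__main__":
    for fold in (lower_default, lower_turkish):
        print(fold.__name__, same_name("FILE.TXT", "file.txt", fold))
```

Under the default rule the two names are the same file; under the Turkish-style rule they are distinct, so a directory that looks consistent to one client can hold what another client sees as duplicates.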

there was some new agreement such that case conversion for all locales
is consistent with each other)

Yes there is: Unicode/UTF-8. That is where all the Linux distributions
are going too. For legacy SMB support you will still need to support codepages,
but that could be done by samba. For XFS I guess it would be enough to just
support UTF-8. Supporting different code pages is probably not too useful

Does UNICODE actually define the case-ness of characters now?  I
have been out of UNICODE stuff for some time (working at different
levels of system design - not the OS guru I used to be :-()

It used to just define the glyphs and gave no semantic meaning to
them.  In fact, the 16-bit UNICODE had the problem of not even
keeping all of the glyphs for a locale together.  It was just a
way of enumerating glyphs and some "compatibility" stuff for
ASCII and ECMA/ISO Latin-1
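
One way to check the case question is to query the Unicode Character Database through Python's standard unicodedata module. This is a minimal sketch that assumes whatever Unicode version ships with the running Python interpreter, and it says nothing about what a filesystem should do with the answer. The helper name case_info is hypothetical.

```python
import unicodedata

def case_info(ch: str) -> dict:
    # Report the general category ("Lu" = uppercase letter, "Ll" = lowercase
    # letter) together with the locale-independent case mappings.
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        "lower": ch.lower(),
        "upper": ch.upper(),
        "casefold": ch.casefold(),
    }

if __name__ == "__main__":
    # Plain ASCII, Latin-1 sharp s, and Greek capital sigma.
    for ch in ("A", "\u00df", "\u03a3"):
        print(case_info(ch))
```

The data shows that characters do carry case properties and default mappings, and that some mappings are not one-to-one: the sharp s uppercases to two characters, so a round trip through upper and lower does not always return the original string.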

Michael Sinz -- Director, Systems Engineering -- Worldgate Communications
A master's secrets are only as good as
        the master's ability to explain them to others.
