Re: [PATCH 07/10] xfs: add trie generator and supporting code for UTF-8.

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH 07/10] xfs: add trie generator and supporting code for UTF-8.
From: Ben Myers <bpm@xxxxxxx>
Date: Tue, 23 Sep 2014 13:57:21 -0500
Cc: linux-fsdevel@xxxxxxxxxxxxxxx, tinguely@xxxxxxx, olaf@xxxxxxx, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140922205714.GN4267@dastard>
References: <20140918195650.GI19952@xxxxxxx> <20140918201518.GJ4482@xxxxxxx> <20140922205714.GN4267@dastard>
User-agent: Mutt/1.5.20 (2009-06-14)
On Tue, Sep 23, 2014 at 06:57:14AM +1000, Dave Chinner wrote:
> On Thu, Sep 18, 2014 at 03:15:19PM -0500, Ben Myers wrote:
> > From: Olaf Weber <olaf@xxxxxxx>
> > 
> > mkutf8data.c is the source for a program that generates utf8data.h, which
> > contains the trie that utf8norm.c uses. The trie is generated from the
> > Unicode 7.0.0 data files. The format of the utf8data[] table is described
> > in utf8norm.c.
> > 
> > Supporting functions for UTF-8 normalization are in utf8norm.c with the
> > header utf8norm.h. Two normalization forms are supported: nfkdi and nfkdicf.
> > 
> >   nfkdi:
> >    - Apply unicode normalization form NFKD.
> >    - Remove any Default_Ignorable_Code_Point.
> > 
> >   nfkdicf:
> >    - Apply unicode normalization form NFKD.
> >    - Remove any Default_Ignorable_Code_Point.
> >    - Apply a full casefold (C + F).
> > 
> > For the purposes of the code, a string is valid UTF-8 if:
> > 
> >  - The values encoded are 0x1..0x10FFFF.
> >  - The surrogate codepoints 0xD800..0xDFFF are not encoded.
> >  - The shortest possible encoding is used for all values.
> > 
> > The supporting functions work on null-terminated strings (utf8 prefix) and
> > on length-limited strings (utf8n prefix).
> > 
> > Signed-off-by: Olaf Weber <olaf@xxxxxxx>
> > 
> > ---
> > [v2: the trie is now separated into utf8norm.ko;
> >      utf8version is now a function and exported;
> >      introduced CONFIG_XFS_UTF8. -bpm]
> > ---
> >  fs/xfs/Kconfig               |    8 +
> >  fs/xfs/Makefile              |    2 +-
> >  fs/xfs/utf8norm/Makefile     |   37 +
> >  fs/xfs/utf8norm/mkutf8data.c | 3239 ++++++++++++++++++++++++++++++++++++++++++
> >  fs/xfs/utf8norm/utf8norm.c   |  649 +++++++++
> >  fs/xfs/utf8norm/utf8norm.h   |  116 ++
> 
> Again, nothing XFS specific here. It's being built as a separate
> module and the only thing that XFS uses are exported functions, so
> it really should be generic library code....

I'll get this moved to lib/ as you suggested elsewhere in the
thread.

Thanks,
        Ben
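
For reference, the three validity rules in the quoted patch description
can be sketched as a length-limited check. This is only a minimal
illustration under those stated rules, not the code from utf8norm.c; the
function name sketch_utf8n_valid is hypothetical and is not one of the
exported functions.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: returns 1 if the first
 * len bytes of s are valid UTF-8 under the three rules in the patch
 * description, 0 otherwise.
 */
static int sketch_utf8n_valid(const unsigned char *s, size_t len)
{
	size_t i = 0;

	while (i < len) {
		unsigned char c = s[i];
		uint32_t cp;	/* decoded value */
		uint32_t min;	/* smallest value that needs n bytes */
		size_t n, k;

		if (c < 0x80) {
			cp = c; n = 1; min = 0x1;	/* 0x0 is not encoded */
		} else if ((c & 0xE0) == 0xC0) {
			cp = c & 0x1F; n = 2; min = 0x80;
		} else if ((c & 0xF0) == 0xE0) {
			cp = c & 0x0F; n = 3; min = 0x800;
		} else if ((c & 0xF8) == 0xF0) {
			cp = c & 0x07; n = 4; min = 0x10000;
		} else {
			return 0;	/* stray continuation or invalid lead byte */
		}

		if (n > len - i)
			return 0;	/* sequence runs past the length limit */

		for (k = 1; k < n; k++) {
			if ((s[i + k] & 0xC0) != 0x80)
				return 0;	/* not a continuation byte */
			cp = (cp << 6) | (s[i + k] & 0x3F);
		}

		if (cp < min)
			return 0;	/* not the shortest encoding (or 0x0) */
		if (cp > 0x10FFFF)
			return 0;	/* outside 0x1..0x10FFFF */
		if (cp >= 0xD800 && cp <= 0xDFFF)
			return 0;	/* surrogate codepoint */

		i += n;
	}
	return 1;
}
```

A null-terminated variant would take the same shape, stopping at the
first 0x0 byte instead of at a length limit.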
