On Thu, Sep 18, 2014 at 03:15:19PM -0500, Ben Myers wrote:
> From: Olaf Weber <olaf@xxxxxxx>
>
> mkutf8data.c is the source for a program that generates utf8data.h, which
> contains the trie that utf8norm.c uses. The trie is generated from the
> Unicode 7.0.0 data files. The format of the utf8data[] table is described
> in utf8norm.c.
>
> Supporting functions for UTF-8 normalization are in utf8norm.c with the
> header utf8norm.h. Two normalization forms are supported: nfkdi and nfkdicf.
>
> nfkdi:
> - Apply Unicode normalization form NFKD.
> - Remove any Default_Ignorable_Code_Point.
>
> nfkdicf:
> - Apply Unicode normalization form NFKD.
> - Remove any Default_Ignorable_Code_Point.
> - Apply a full casefold (C + F).
>
> For the purposes of the code, a string is valid UTF-8 if:
>
> - The values encoded are 0x1..0x10FFFF.
> - The surrogate codepoints 0xD800..0xDFFF are not encoded.
> - The shortest possible encoding is used for all values.
>
> The supporting functions work on null-terminated strings (utf8 prefix) and
> on length-limited strings (utf8n prefix).
>
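
As a rough illustration of the three validity rules quoted above, and of
the utf8/utf8n naming split, here is a minimal sketch in plain C. The
names sketch_utf8n_valid() and sketch_utf8_valid() are hypothetical and
are not taken from utf8norm.c; the code only checks validity and does
no normalization.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not from the patch: return 1 if the first len
 * bytes of s are valid UTF-8 under the three quoted rules, else 0.
 */
static int sketch_utf8n_valid(const char *s, size_t len)
{
	const unsigned char *p = (const unsigned char *)s;
	size_t i = 0;

	while (i < len) {
		unsigned char c = p[i];
		unsigned long cp, min;
		size_t n, k;

		if (c == 0)
			return 0;	/* 0x0 is outside 0x1..0x10FFFF */
		if (c < 0x80) {		/* single-byte (ASCII) value */
			i++;
			continue;
		}
		/* Sequence length, payload bits, and smallest legal value. */
		if ((c & 0xE0) == 0xC0) {
			n = 2; cp = c & 0x1F; min = 0x80;
		} else if ((c & 0xF0) == 0xE0) {
			n = 3; cp = c & 0x0F; min = 0x800;
		} else if ((c & 0xF8) == 0xF0) {
			n = 4; cp = c & 0x07; min = 0x10000;
		} else {
			return 0;	/* stray continuation or bad lead byte */
		}
		if (len - i < n)
			return 0;	/* sequence truncated by the length limit */
		for (k = 1; k < n; k++) {
			if ((p[i + k] & 0xC0) != 0x80)
				return 0;	/* not a continuation byte */
			cp = (cp << 6) | (p[i + k] & 0x3F);
		}
		if (cp < min)
			return 0;	/* not the shortest possible encoding */
		if (cp > 0x10FFFF)
			return 0;	/* above the Unicode range */
		if (cp >= 0xD800 && cp <= 0xDFFF)
			return 0;	/* surrogate codepoint */
		i += n;
	}
	return 1;
}

/* Null-terminated variant, mirroring the utf8 vs. utf8n prefix split. */
static int sketch_utf8_valid(const char *s)
{
	return sketch_utf8n_valid(s, strlen(s));
}
```

The null-terminated variant is a thin wrapper over the length-limited
one, which is one plausible way to keep the two entry points in sync;
whether the patch structures them this way is not stated here.
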
> Signed-off-by: Olaf Weber <olaf@xxxxxxx>
>
> ---
> [v2: the trie is now separated into utf8norm.ko;
> utf8version is now a function and exported;
> introduced CONFIG_XFS_UTF8. -bpm]
> ---
> fs/xfs/Kconfig | 8 +
> fs/xfs/Makefile | 2 +-
> fs/xfs/utf8norm/Makefile | 37 +
> fs/xfs/utf8norm/mkutf8data.c | 3239 ++++++++++++++++++++++++++++++++++++++++++
> fs/xfs/utf8norm/utf8norm.c | 649 +++++++++
> fs/xfs/utf8norm/utf8norm.h | 116 ++

Again, nothing XFS specific here. It's being built as a separate
module and the only things XFS uses are the exported functions, so
it really should be generic library code.

Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx