Re: [PATCH 04/16] lib/utf8norm.c: reduce the size of utf8data[]

To: Ben Myers <bpm@xxxxxxx>
Subject: Re: [PATCH 04/16] lib/utf8norm.c: reduce the size of utf8data[]
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Mon, 6 Oct 2014 08:52:44 +1100
Cc: linux-fsdevel@xxxxxxxxxxxxxxx, olaf@xxxxxxx, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20141003215455.GC1865@xxxxxxx>
References: <20141003214758.GY1865@xxxxxxx> <20141003215455.GC1865@xxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Oct 03, 2014 at 04:54:55PM -0500, Ben Myers wrote:
> From: Olaf Weber <olaf@xxxxxxx>
> 
> Remove the Hangul decompositions from the utf8data trie, and do
> algorithmic decomposition to calculate them on the fly. To store
> the decomposition the caller of utf8lookup()/utf8nlookup() must
> provide a 12-byte buffer, which is used to synthesize a leaf with
> the decomposition. Trie size is reduced from 245kB to 90kB.
> 
> This change also contains a number of robustness fixes to the
> trie generator mkutf8data.c.
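
For context, the algorithmic decomposition mentioned in the patch
description can be sketched roughly as follows. This is a minimal sketch
based on the Hangul syllable arithmetic given in the Unicode Standard,
not the code from the patch: the function name hangul_decompose() and
the HANGUL_* macro names are hypothetical, and it works on code points
rather than on the UTF-8 leaf format the trie uses.

```c
#include <stddef.h>

/* Constants from the Unicode Standard's Hangul syllable decomposition. */
#define HANGUL_SBASE	0xAC00	/* first precomposed syllable */
#define HANGUL_LBASE	0x1100	/* first leading consonant jamo */
#define HANGUL_VBASE	0x1161	/* first vowel jamo */
#define HANGUL_TBASE	0x11A7	/* one before the first trailing jamo */
#define HANGUL_VCOUNT	21
#define HANGUL_TCOUNT	28
#define HANGUL_NCOUNT	(HANGUL_VCOUNT * HANGUL_TCOUNT)	/* 588 */
#define HANGUL_SCOUNT	(19 * HANGUL_NCOUNT)		/* 11172 */

/*
 * Hypothetical helper: decompose a precomposed Hangul syllable into
 * its jamo. Returns the number of code points written to out[] (2 or
 * 3), or 0 if cp is not a precomposed Hangul syllable.
 */
static int
hangul_decompose(unsigned int cp, unsigned int out[3])
{
	unsigned int si, t;

	if (cp < HANGUL_SBASE || cp >= HANGUL_SBASE + HANGUL_SCOUNT)
		return 0;

	si = cp - HANGUL_SBASE;
	out[0] = HANGUL_LBASE + si / HANGUL_NCOUNT;
	out[1] = HANGUL_VBASE + (si % HANGUL_NCOUNT) / HANGUL_TCOUNT;

	/* A trailing index of 0 means the syllable has no final consonant. */
	t = si % HANGUL_TCOUNT;
	if (t == 0)
		return 2;
	out[2] = HANGUL_TBASE + t;
	return 3;
}
```

With this sketch, U+D55C decomposes to U+1112 U+1161 U+11AB, and U+AC00
to U+1100 U+1161. Each of those jamo encodes to three bytes in UTF-8, so
a full decomposition needs at most nine bytes of string data, which fits
within the 12-byte buffer described above.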

Please separate out the robustness fixes, or merge them back into the
original patch. For example, bulk renaming of code like this:


>  static int
> -utf8key(unsigned int key, char keyval[])
> -{
> -     int keylen;
> -
> -     if (key < 0x80) {
> -             keyval[0] = key;
> -             keylen = 1;
> -     } else if (key < 0x800) {
> -             keyval[1] = key & UTF8_V_MASK;
> -             keyval[1] |= UTF8_N_BITS;
> -             key >>= UTF8_V_SHIFT;
....
> +utf8encode(char *str, unsigned int val)
> +{
> +     int len;
> +
> +     if (val < 0x80) {
> +             str[0] = val;
> +             len = 1;
> +     } else if (val < 0x800) {
> +             str[1] = val & UTF8_V_MASK;
> +             str[1] |= UTF8_N_BITS;
> +             val >>= UTF8_V_SHIFT;

...doesn't belong in a patch that introduces special Hangul character
handling.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
