Re: [PATCH v3, 03/16] xfsprogs: metadump: drop unneeded use of a random

To: Alex Elder <aelder@xxxxxxx>
Subject: Re: [PATCH v3, 03/16] xfsprogs: metadump: drop unneeded use of a random character
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 24 Feb 2011 12:31:35 +1100
Cc: xfs@xxxxxxxxxxx
In-reply-to: <201102182121.p1ILL1x7029051@xxxxxxxxxxxxxxxxxxxxxx>
References: <201102182121.p1ILL1x7029051@xxxxxxxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Fri, Feb 18, 2011 at 03:21:01PM -0600, Alex Elder wrote:
> With the exception of the last five bytes, an obfuscated name is
> simply a random string of (acceptable) characters.  The last five
> bytes are chosen, based on the random portion before them, such that
> the resulting obfuscated name has the same hash value as the
> original name.
>
> This is done by essentially working backwards from the difference
> between the original hash and the hash value computed for the
> obfuscated name so far, picking final bytes based on how that
> difference gets manipulated by completing the hash computation.
>
> Of those last 5 bytes, all but the upper half of the first one are
> completely determined by this process.  The upper part of the first
> one is currently computed as four random bits, just like all the
> earlier bytes in the obfuscated name.
>
> It is not actually necessary to randomize these four upper bits,
> and we can simply make them 0.
>
> Here's why:
> - The final bytes are pulled directly from the hash difference
>   mentioned above, with the lowest-order byte of the hash
>   determining the last character used in the name.
> - The upper nibble of the 5th-to-last byte in a name will affect the
>   lowest 4 bits of hash value and therefore the last byte of the
>   name.  Those four bits are combined with the hash computed from
>   the random characters generated earlier.
> - Because those earlier bytes were random, their hash value will
>   also be random, and in particular, the lowest-order four bits of
>   the hash will be random.
> - So it doesn't matter whether we choose all 0 bits or some other
>   random value for that upper nibble of the byte at offset
>   (namelen - 5).  When it's combined with the hash, the last byte of
>   the name will be random either way.
>
> Therefore we will choose to use all 0's for that upper nibble.
>
> Doing this simplifies the generation of two of the final five
> characters, and makes all five of them get computed in a consistent
> way.  We'll still get some small bit of obfuscation for even
> 5-character names, since the upper bits of the first character will
> generally be cleared and likely different from the original.
>
> Add the use of a mask in the one case it wasn't used to be even more
> consistent.
>
> Signed-off-by: Alex Elder <aelder@xxxxxxx>
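
For anyone following along, the "working backwards" step described above
can be sketched roughly as below.  This is an illustration only, not the
code from the patch.  It assumes the name hash can be written one byte at
a time as hash = byte ^ rol32(hash, 7), which is my understanding of how
xfs_da_hashname() behaves.  The names name_hash() and fix_last_five() are
made up for this sketch.  It also ignores the fact that a computed byte
may come out as '\0' or '/', which a real obfuscated name cannot contain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (0 < n < 32). */
static uint32_t rol32(uint32_t v, unsigned n)
{
	return (v << n) | (v >> (32 - n));
}

/* Assumed byte-at-a-time form of the name hash (hypothetical helper). */
static uint32_t name_hash(const uint8_t *name, size_t len)
{
	uint32_t hash = 0;

	for (size_t i = 0; i < len; i++)
		hash = name[i] ^ rol32(hash, 7);
	return hash;
}

/*
 * Overwrite the last five bytes of buf so that the whole name hashes
 * to target.  Hashing five more bytes rotates the prefix hash by 35
 * bits in total, which is the same as rotating it by 3, so the bits
 * the last five bytes must supply are rol32(prefix_hash, 3) ^ target.
 */
static void fix_last_five(uint8_t *buf, size_t len, uint32_t target)
{
	uint32_t diff;

	assert(len >= 5);
	diff = rol32(name_hash(buf, len - 5), 3) ^ target;

	/* 4 + 7 + 7 + 7 + 7 = 32 bits; the first byte's upper nibble is 0. */
	buf[len - 5] = (diff >> 28) & 0x7f;
	buf[len - 4] = (diff >> 21) & 0x7f;
	buf[len - 3] = (diff >> 14) & 0x7f;
	buf[len - 2] = (diff >> 7) & 0x7f;
	buf[len - 1] = (diff >> 0) & 0x7f;
}
```

Fill the first (len - 5) bytes of a buffer with random characters, call
fix_last_five() with the hash of the original name, and name_hash() of
the result should match the original.  With the upper nibble of the byte
at (len - 5) fixed at 0, every final byte is taken from diff with the
same shift-and-mask pattern.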

Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>

Dave Chinner