On Fri, Feb 18, 2011 at 03:21:01PM -0600, Alex Elder wrote:
> With the exception of the last five bytes, an obfuscated name is
> simply a random string of (acceptable) characters. The last five
> bytes are chosen, based on the random portion before them, such that
> the resulting obfuscated name has the same hash value as the
> original name.
>
> This is done by essentially working backwards from the difference
> between the original hash and the hash value computed for the
> obfuscated name so far, picking final bytes based on how that
> difference gets manipulated by completing the hash computation.
>
> Of those last 5 bytes, all but the upper half of the first one are
> completely determined by this process. The upper part of the first
> one is currently computed as four random bits, just like all the
> earlier bytes in the obfuscated name.
>
> It is not actually necessary to randomize these four upper bits,
> and we can simply make them 0.
>
> Here's why:
> - The final bytes are pulled directly from the hash difference
> mentioned above, with the lowest-order byte of the hash
> determining the last character used in the name.
> - The upper nibble of the 5th-to-last byte in a name will affect the
> lowest 4 bits of the hash value and therefore the last byte of the
> name. Those four bits are combined with the hash computed from
> the random characters generated earlier.
> - Because those earlier bytes were random, their hash value will
> also be random, and in particular, the lowest-order four bits of
> the hash will be random.
> - So it doesn't matter whether we choose all 0 bits or some other
> random value for that upper nibble of the byte at offset
> (namelen - 5). When it's combined with the hash, the last byte of
> the name will be random either way.
>
> Therefore we will choose to use all 0's for that upper nibble.
>
> Doing this simplifies the generation of two of the final five
> characters, and makes all five of them get computed in a consistent
> way. We'll still get some small bit of obfuscation for even
> 5-character names, since the upper bits of the first character will
> generally be cleared and likely different from the original.
>
> Add the use of a mask in the one case it wasn't used to be even more
> consistent.
>
> Signed-off-by: Alex Elder <aelder@xxxxxxx>
Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
--
Dave Chinner
david@xxxxxxxxxxxxx
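
The construction described in the quoted commit message can be sketched
roughly as follows. This is a minimal illustration, not the code from the
patch: it assumes a rotate-and-XOR hash of the form h = byte ^ rol32(h, 7),
which the message does not spell out, and the names rol32, name_hash and
obfuscate are hypothetical. It also skips the filtering for acceptable
characters and leaves names shorter than five bytes untouched; both are
simplifying assumptions.

```python
import os

MASK32 = 0xFFFFFFFF


def rol32(x, n):
    """Rotate a 32-bit value left by n bits (0 < n < 32)."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def name_hash(name):
    """Assumed hash: fold each byte in with a 7-bit left rotate and XOR."""
    h = 0
    for b in name:
        h = b ^ rol32(h, 7)
    return h


def obfuscate(name, rand=os.urandom):
    """Return a same-length name with the same hash as `name`.

    Everything except the last five bytes is random. The last five bytes
    are read straight out of the hash difference, 7 bits at a time.
    """
    if len(name) < 5:
        return name  # assumption: too short to fix up, leave unchanged

    target = name_hash(name)
    prefix = rand(len(name) - 5)

    # Appending five more bytes rotates the prefix hash by 5 * 7 = 35 bits,
    # which is the same as rotating by 3. What remains to be supplied by
    # the five final bytes is therefore this difference.
    diff = rol32(name_hash(prefix), 3) ^ target

    # The first byte takes bits 28..31 of the difference, so its upper
    # nibble is always 0. The same mask is applied in every case, even
    # though it is redundant for the shift by 28.
    tail = bytes((diff >> shift) & 0x7F for shift in (28, 21, 14, 7, 0))
    return prefix + tail
```

With the upper nibble of the byte at offset (namelen - 5) left at 0, all
five trailing bytes come from the same shift-and-mask expression, and
name_hash(obfuscate(name)) == name_hash(name) holds for any name of five
or more bytes under the assumed hash.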