Christoph Hellwig wrote:
> On Thu, Sep 17, 2009 at 11:06:16AM -0500, Eric Sandeen wrote:
>> A couple of people reported that xfs_repair hangs after printing
>> "Traversing filesystem ...". This happens when all slots in the
>> cache are full and referenced, and the loop in cache_node_get()
>> which tries to shake unused entries never finds any - it just
>> keeps upping the priority and loops forever.
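
As an aside, here is a minimal standalone C sketch of that failure mode.
All of the names (cache_entry, shake_unused, get_entry, NSLOTS) are
invented for the example - this is not the libxfs cache code, only the
shape of a shake-and-retry loop that spins when every cached entry is
still referenced:

#include <stdbool.h>
#include <stdio.h>

#define NSLOTS 4

struct cache_entry {
	bool in_use;      /* slot holds a cached object            */
	int  refcount;    /* >0 means a caller still references it */
};

static struct cache_entry slots[NSLOTS];

/* Free any cached entries that nobody references any more. */
static int shake_unused(int priority)
{
	int freed = 0;

	for (int i = 0; i < NSLOTS; i++) {
		if (slots[i].in_use && slots[i].refcount == 0) {
			slots[i].in_use = false;
			freed++;
		}
	}
	(void)priority;		/* a real shaker would widen its search */
	return freed;
}

/* Find a free slot, shaking the cache at ever higher priority. */
static struct cache_entry *get_entry(void)
{
	int priority = 0;

	for (;;) {
		for (int i = 0; i < NSLOTS; i++) {
			if (!slots[i].in_use) {
				slots[i].in_use = true;
				slots[i].refcount = 1;
				return &slots[i];
			}
		}
		/*
		 * Nothing free: shake and retry.  If every entry is still
		 * referenced, shake_unused() frees nothing, the priority
		 * just keeps climbing, and this loop never exits - the
		 * hang seen after "Traversing filesystem ...".
		 */
		if (shake_unused(priority) == 0)
			priority++;
	}
}

int main(void)
{
	/* Pin every slot, then ask for one more: that call never returns. */
	for (int i = 0; i < NSLOTS; i++)
		get_entry();
	printf("about to hang...\n");
	get_entry();
	return 0;
}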
>>
>> This can be worked around by restarting xfs_repair with -P,
>> and/or with "-o bhash=<largersize>" on older xfs_repair.
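
(For example - the device name and the size are purely illustrative -
"xfs_repair -P /dev/sdX" disables prefetching, and
"xfs_repair -o bhash=32768 /dev/sdX" enlarges the buffer cache hash;
the number shown is just a sample, not a recommendation.)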
>>
>> I started down the path of increasing the number of hash buckets
>> on the fly, but Barry suggested simply increasing the max allowed
>> depth, which is much simpler (thanks!).
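
Again with invented names rather than the real libxfs ones, a sketch of
the bound that approach relaxes: with a fixed number of buckets the
cache is effectively capped at roughly buckets * max_depth entries, and
raising max_depth lifts that cap without rehashing every cached object
into a larger bucket array mid-run.

#include <stddef.h>
#include <stdlib.h>

/* Sketch only, invented names - not the libxfs cache structures. */
struct toy_cache {
	unsigned int count;      /* entries currently cached               */
	unsigned int buckets;    /* number of hash buckets (fixed at init) */
	unsigned int max_depth;  /* allowed average entries per bucket     */
};

/*
 * Refuse to grow past buckets * max_depth; the caller falls back to
 * shaking and retrying.  Raising max_depth is the simpler fix, versus
 * resizing (and rehashing) the bucket array while repair is running.
 */
static void *toy_cache_alloc(struct toy_cache *c, size_t size)
{
	if (c->count >= c->buckets * c->max_depth)
		return NULL;
	c->count++;
	return malloc(size);
}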
>>
>> Raising the allowed hash depth does mean that cache_report ends up
>> with most buckets in the "greater-than" category:
>>
>> ...
>> Hash buckets with 23 entries 3 ( 3%)
>> Hash buckets with 24 entries 3 ( 3%)
>> Hash buckets with >24 entries 50 ( 85%)
>>
>> but I think I'll save that fix for another patch unless there's
>> real concern right now.
>>
>> I tested this on the metadump image provided by Tomek.
>
> How large is that image? I really think we need to start collecting
> these images for regression testing.
The zipped metadump is 170M; unzipped, 1.1G.

Crafting a special test fs somehow might be better; maybe with an
artificially low bhash size or something ... yeah, I know. I'm not
sure how to manage the regression testing. Working backwards to a
minimal testcase on these would be extremely time-consuming and/or
impossible, I'm afraid.
> The patch looks good to me,
Thanks for the review.
-Eric
>
> Reviewed-by: Christoph Hellwig <hch@xxxxxx>
>