On Fri, Jun 27, 2008 at 03:43:49PM +0530, Sagar Borikar wrote:
> Dave Chinner wrote:
>> Yes, but all the same pattern of corruption, so it is likely
>> that it is one problem.
>>
>> All I can suggest is working out a reproducible test case in your
>> development environment, attaching a debugger and start digging around
>> in memory when the problem is hit and try to find out exactly what
>> is corrupted. If you can't reproduce it or work out what is
>> occurring to trigger the problem, then we're not going to be able to
>> find the cause...
>>
> Thanks Dave
> I did some experiments today with the corrupted filesystem.
> Setup: the NAS box contains one volume, /share, with 10 subdirectories.
> In the first subdirectory, sh1, I kept a 512MB file. Through a script I
> continuously copy this file
> simultaneously into subdirectories sh2 through sh10.
> The script looks like
> ....
> while [ 1 ]
> do
> cp $1 $2
> done
....
> uninterruptible sleep state continuously. Ran xfs_repair with -n option
> on filesystem mounted on JBOD
> Here is the output :
....
> entry "iozone_68.tst" in shortform directory 67108993 references free
> inode 67108995
....
> entry "iozone_68.tst" in shortform directory 100663425 references free
> inode 100663427
....
> entry "iozone_68.tst" in shortform directory 301990016 references free
> inode 301990019
....
> entry "iozone_68.tst" in shortform directory 335544448 references free
> inode 335544451
....
> entry "iozone_68.tst" in shortform directory 402653313 references free
> inode 402653318
....
And so on. There's a pattern here. Can you try to find out what
part of your workload is producing these errors?
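
One way to look at the numbers, as a rough sketch only: print each
directory/inode pair from the xfs_repair output above in hex, along with
the distance between them. The helper name show_pairs is made up for
illustration, and nothing here assumes anything about the filesystem
geometry; it is plain arithmetic on the reported numbers.

```shell
#!/bin/sh
# Directory:inode pairs copied from the xfs_repair -n output quoted above.
pairs="67108993:67108995
100663425:100663427
301990016:301990019
335544448:335544451
402653313:402653318"

# show_pairs is a hypothetical helper: it prints each pair in hex plus
# the distance from the directory inode to the referenced free inode.
show_pairs() {
    for p in $pairs; do
        dir=${p%%:*}    # number before the colon
        ino=${p##*:}    # number after the colon
        printf 'dir 0x%08x  inode 0x%08x  delta %d\n' \
            "$dir" "$ino" $((ino - dir))
    done
}

show_pairs
```

In hex, every directory number ends in 0x80 or 0x81 and only the high
bits differ, and each referenced free inode sits within a few numbers of
its directory.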
Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx