On Sat, Dec 04, 2010 at 09:49:25AM +0530, Ajeet Yadav wrote:
> Our test case is automated:
> 1. Create a large number of 6KB files (6KB was chosen because we wanted to
> increase journal load and use a file size that is not a multiple of the
> file system block size).
> 2. Set the target to reboot after a random number of seconds.
> 3. Next boot do "ls" of all files in XFS partition.
> 4. Remove all files in XFS.
> 5. Go back to step 1
> The purpose of this test is to test the journal and the stability of the
> XFS filesystem.
> Do you think we should consider this test case?
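
For reference, step 1 of the quoted procedure could be sketched roughly as below. This is only an illustration, not the script actually used in the test: the function name, the file naming scheme and the choice to skip fsync are assumptions, and the 4KB block size mentioned in the comment is the usual mkfs.xfs default rather than something stated in the report.

```python
import os

# 6KB per file, as in the test description. Assumption: with a 4KB
# filesystem block size this is not a whole number of blocks.
FILE_SIZE = 6 * 1024


def create_files(directory, count, size=FILE_SIZE):
    """Create `count` files of `size` bytes each and return their paths."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for i in range(count):
        # Hypothetical naming scheme, chosen only for this sketch.
        path = os.path.join(directory, "file_%06d" % i)
        with open(path, "wb") as f:
            # No fsync here (assumption): data and metadata are left
            # dirty so that the random reboot exercises log recovery.
            f.write(b"\0" * size)
        paths.append(path)
    return paths
```

Calling create_files() on the XFS mount point with a large count, then rebooting at a random time, corresponds to steps 1 and 2.
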
Are you running with barriers enabled? What are your mkfs and mount
options?

Also, does the problem exist on a current kernel? We've fixed lots
of writeback related problems since 2.6.30, so I'd suggest that you
need to reproduce this on a current kernel before anyone will spend
large amounts of time trying to track it down. Especially as
xfstests 136-140 do similar testing (just without the reboots) and
don't show any problems.
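
A minimal sketch of how the kernel version and XFS mount options could be collected for a report follows. It assumes a Linux system with /proc/mounts, and it assumes a kernel of this era where disabled barriers show up as "nobarrier" in the option list; the function names are made up for the example.

```python
import os
import platform


def xfs_mounts(mounts_text):
    """Return (device, mountpoint, options) for each XFS line of /proc/mounts text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mountpoint, fstype, options, dump, pass
        if len(fields) >= 4 and fields[2] == "xfs":
            entries.append((fields[0], fields[1], fields[3].split(",")))
    return entries


def main():
    print("kernel:", platform.release())
    if not os.path.exists("/proc/mounts"):
        return
    with open("/proc/mounts") as f:
        for device, mountpoint, options in xfs_mounts(f.read()):
            print(device, mountpoint, ",".join(options))
            # Assumption: barriers are on by default and "nobarrier"
            # appears here only when they were turned off at mount time.
            if "nobarrier" in options:
                print("  barriers are disabled on this mount")


if __name__ == "__main__":
    main()
```

The mkfs geometry is not in /proc/mounts; running xfs_info on the mount point reports it.
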
> The other question is when we should run xfs_repair, because if mount fails
> and the journal contains a dirty log then xfs_repair does not run. We are
> forced to use the (-L) option, but its description says that (-L) can
> corrupt the file system.
Yes, it can.
> In the other case, even if XFS mounts successfully, accessing some
> files gives an I/O (input/output) error.
Which means something got corrupted. Look in dmesg for reasons why.
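
Something along these lines could pull the relevant lines out of the kernel log. It is a rough sketch: the keyword list is an assumption about what is worth looking at, not a complete filter, and dmesg may need root on some systems.

```python
import subprocess


def xfs_messages(log_text):
    """Return the kernel log lines that mention XFS or an I/O error."""
    keywords = ("xfs", "i/o error")  # assumed keywords, extend as needed
    return [line for line in log_text.splitlines()
            if any(k in line.lower() for k in keywords)]


def read_dmesg():
    """Run dmesg and return whatever it printed to stdout."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    try:
        for line in xfs_messages(read_dmesg()):
            print(line)
    except OSError as err:
        # dmesg is missing or cannot be executed on this system.
        print("could not run dmesg:", err)
```

Saving this output before any xfs_repair run keeps the evidence of what went wrong.
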
> 1. I recommend the following usage for xfs_repair so that we do not come
> across these problems:
> Mount Success -> Umount -> run xfs_repair -> mount
> Mount fails -> try xfs_repair -> xfs_repair fails -> finally xfs_repair
> -L -> mount
> Adding the above mount + xfs_repair procedure to the script makes the file
> system stable. But other members of my team do not agree as it increases
> mount time.
I agree with your team members. All you are proposing to do is to hide
failures that need further investigation...