On Tue, Dec 11, 2007 at 06:26:55PM +0000, David Greaves wrote:
> I've been having problems with this filesystem for a while now.
> I upgraded to 2.6.23 to see if it's improved (no).
> Once every 2 or 3 cold boots I get this in dmesg as the user logs in and
> accesses the /scratch filesystem. If the error doesn't occur as the user logs
> in then it won't happen at all.
> Filesystem "dm-0": XFS internal error xfs_btree_check_sblock at line 334 of
> fs/xfs/xfs_btree.c. Caller 0xc01b7bc1
> [<c010511a>] show_trace_log_lvl+0x1a/0x30
> [<c0105d72>] show_trace+0x12/0x20
> [<c0105d95>] dump_stack+0x15/0x20
> [<c01dd34f>] xfs_error_report+0x4f/0x60
> [<c01cfcb6>] xfs_btree_check_sblock+0x56/0xd0
> [<c01b7bc1>] xfs_alloc_lookup+0x181/0x390
> [<c01b7e23>] xfs_alloc_lookup_eq+0x13/0x20
> [<c01b5594>] xfs_free_ag_extent+0x2f4/0x690
> [<c01b7164>] xfs_free_extent+0xb4/0xd0
> [<c01c1979>] xfs_bmap_finish+0x119/0x170
> [<c0209aa7>] xfs_remove+0x247/0x4f0
> [<c0211cc2>] xfs_vn_unlink+0x22/0x50
> [<c0172f28>] vfs_unlink+0x68/0xa0
> [<c01751e9>] do_unlinkat+0xb9/0x140
> [<c0175280>] sys_unlink+0x10/0x20
> [<c010420a>] syscall_call+0x7/0xb
> xfs_force_shutdown(dm-0,0x8) called from line 4274 of file fs/xfs/xfs_bmap.c.
> Return address = 0xc0214dae
> Filesystem "dm-0": Corruption of in-memory data detected. Shutting down
> filesystem: dm-0
> Please umount the filesystem, and rectify the problem(s)

So there's a corrupted freespace btree block.

> I ssh in as root, umount, mount, umount and run xfs_repair.
> This is what I got this time:
> Phase 2 - using internal log
> - zero log...
> - scan filesystem freespace and inode maps...
> ir_freecount/free mismatch, inode chunk 59/5027968, freecount 27 nfree 26
> - found root inode chunk
> All the rest was clean.

xfs_repair doesn't check the freespace btrees; it just rebuilds them from
scratch. Use xfs_check to tell you what is wrong with the filesystem, then
use xfs_repair to fix it.

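
As a rough sketch of that order of operations (check first, then repair),
something like the following only builds and prints the commands; it does
not run them. The device path is a placeholder I made up, since your report
only names "dm-0"; the /scratch mount point is taken from your mail. The
helper names are hypothetical too.

```python
def check_then_repair_plan(device, mountpoint):
    """Return the commands, in order, as argv lists (nothing is executed)."""
    return [
        ["umount", mountpoint],   # both tools expect an unmounted filesystem
        ["xfs_check", device],    # reports inconsistencies, changes nothing
        ["xfs_repair", device],   # fixes metadata, rebuilds freespace btrees
    ]


def format_plan(plan):
    """Render the plan as one shell-style command per line."""
    return "\n".join(" ".join(argv) for argv in plan)


if __name__ == "__main__":
    # "/dev/mapper/EXAMPLE" is a placeholder; substitute the real device.
    print(format_plan(check_then_repair_plan("/dev/mapper/EXAMPLE", "/scratch")))
```

Keeping the xfs_check output from before the repair is the useful part: once
xfs_repair has rebuilt the btrees, the evidence of what was wrong is gone.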
> It is possible this fs suffered in the 2.6.17 timeframe.
> It is also possible something got broken whilst I was having lots of issues
> with hibernate (which is still unreliable).

Suspend does not quiesce filesystems safely, so you risk filesystem
corruption every time you suspend and resume, no matter what filesystem
you use.

SGI Australian Software Group