Re: xfsprogs 3.1.0 repair problems?

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: xfsprogs 3.1.0 repair problems?
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Fri, 15 Jan 2010 14:47:50 -0600
Cc: xfs-oss <xfs@xxxxxxxxxxx>
In-reply-to: <20100115055628.GD28498@xxxxxxxxxxxxxxxx>
References: <4B4FDC3E.6030205@xxxxxxxxxxx> <20100115055628.GD28498@xxxxxxxxxxxxxxxx>
User-agent: Thunderbird 2.0.0.23 (Macintosh/20090812)

Dave Chinner wrote:
> On Thu, Jan 14, 2010 at 09:08:46PM -0600, Eric Sandeen wrote:
>> It looks like maybe the freespace checking isn't quite up to par:
>>
>> test 073 is dying with:
>>
>> _check_xfs_filesystem: filesystem on /mnt/test/14309.image is inconsistent
>> *** xfs_repair -n output ***
>> Phase 1 - find and verify superblock...
>> Phase 2 - using internal log
>>         - scan filesystem freespace and inode maps...
>> sb_fdblocks 26156829, counted 26157853
>>         - found root inode chunk
> 
> This is caused by the remount,ro done in the test - the superblock
> is written to disk with the reserved blocks considered used. At
> unmount time those reserve blocks are "freed" before the superblock
> is written and so the total is correct at that time.
> 
> I'm going to go look at the kernel code now...
> 
> Cheers,
> 
> Dave.
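
For scale, the mismatch in the xfs_repair -n output quoted above works out to exactly 1024 blocks (26157853 - 26156829). That would fit the reserved-block explanation if the reserve pool on this filesystem happens to be that size, which is an assumption; nothing in this thread states the reserve size. A minimal sketch of the arithmetic, where fdblocks_delta is a hypothetical helper and not anything in xfsprogs:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of xfsprogs): difference between the
 * free-block count that repair computed and the value stored in the
 * superblock. A positive result means the superblock undercounts
 * free space.
 */
static int64_t
fdblocks_delta(uint64_t sb_fdblocks, uint64_t counted)
{
	return (int64_t)(counted - sb_fdblocks);
}
```

Calling fdblocks_delta(26156829, 26157853) with the two numbers from the report gives the 1024-block difference.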

Today, I'm intermittently getting xfs_repair segfaults, in btree_get_prev,
but this seems odd:

Program terminated with signal 11, Segmentation fault.
#0  btree_get_prev (key=0x9fc330, root=0x9fc300) at btree.c:190
190            if (cur->index > 0) {
(gdb) list btree.c:190
185     {
186             struct btree_cursor     *cur = root->cursor;
187             int                     level = 0;
188             struct btree_node       *node;
189     
190             if (cur->index > 0) {
191                     if (key)
192                             *key = cur->node->keys[cur->index - 1];
193                     return cur->node->ptrs[cur->index - 1];
194             }
(gdb) p *root
$1 = {root_node = 0x7fca8400fac0, cursor = 0x7fca8400e0e0, height = 1, 
keys_valid = 1, cur_key = 6792640, next_key = 0, next_value = 0x0, prev_key = 
0, prev_value = 0x0}
(gdb) p cur
$2 = (struct btree_cursor *) 0x0
(gdb) quit


root->cursor is valid, but cur is not?
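
If cur really were NULL at line 190, the cur->index dereference would fault right there. As a debugging aid only, a guarded version of the fast path from the listing might look like the sketch below. The struct layouts, the fixed slot count, and the name prev_in_node are hypothetical stand-ins, not the xfsprogs definitions, and the guard says nothing about why gdb shows two different values for the cursor.

```c
#include <stddef.h>

#define SLOTS	4	/* assumed node size, for illustration only */

/* Hypothetical stand-ins; the real structures have more fields. */
struct btree_node {
	unsigned long		keys[SLOTS];
	void			*ptrs[SLOTS];
};

struct btree_cursor {
	struct btree_node	*node;
	int			index;
};

struct btree_root {
	struct btree_cursor	*cursor;
};

/*
 * Return the entry just before the cursor position, or NULL when the
 * cursor is missing or already at the first slot. Only the fast path
 * shown in the listing is modelled here.
 */
static void *
prev_in_node(struct btree_root *root, unsigned long *key)
{
	struct btree_cursor	*cur = root ? root->cursor : NULL;

	if (!cur || !cur->node)
		return NULL;
	if (cur->index > 0 && cur->index <= SLOTS) {
		if (key)
			*key = cur->node->keys[cur->index - 1];
		return cur->node->ptrs[cur->index - 1];
	}
	return NULL;
}
```

With a guard like this, a NULL cursor comes back as a NULL return that the caller can log, instead of a segfault.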

Still looking....

But maybe related to the initial problem... this is cropping up in
the post-test check & repair that xfstests does.

I imaged the filesystem before the checking in hopes of getting a
reproducer.  When the testsuite runs, repair looks clean, but
then it segfaults.

If I then point repair at the filesystem image I created prior
to the segfault, it lights up with tons of errors.  I'm confused
by that.  One possibly relevant difference: when xfstests runs,
repair is checking a mounted, read-only filesystem.

-Eric
