[PATCH] repair: fix wrong logic when validating node magic number
Zorro Lang
zorro.lang at gmail.com
Wed Aug 26 22:35:20 CDT 2015
2015-08-13 15:15 GMT+08:00 Eryu Guan <eguan at redhat.com>:
> On Thu, Aug 13, 2015 at 03:01:16PM +0800, Eryu Guan wrote:
>> Magic number is wrong only when != XFS_DA_NODE_MAGIC and
>> != XFS_DA3_NODE_MAGIC.
>>
>> This is triggered by shared/002 when testing 512 block size XFS.
>>
>> Phase 1 - find and verify superblock...
>> Phase 2 - using internal log
>> - scan filesystem freespace and inode maps...
>> - found root inode chunk
>> Phase 3 - for each AG...
>> - scan (but don't clear) agi unlinked lists...
>> - process known inodes and perform inode discovery...
>> - agno = 0
>> bad magic number febe in block 64 (108) for directory inode 35
>> ......
>>
>> Fix it by changing "||" to "&&".
>>
>> Signed-off-by: Eryu Guan <eguan at redhat.com>
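To spell out why "||" was wrong here: a value can never equal both magic
constants at once, so "differs from one OR differs from the other" is true
for every block, including valid ones (hence the "bad magic number febe"
line above). Below is a minimal shell sketch of just that boolean logic. It
is not the xfs_repair source; the function names are made up for
illustration, and the two constants are the values I believe the XFS
headers define.

```shell
#!/bin/sh
# Illustration only: the function names are hypothetical, not from xfsprogs.
XFS_DA_NODE_MAGIC=0xfebe    # node block magic, non-CRC format
XFS_DA3_NODE_MAGIC=0x3ebe   # node block magic, CRC-enabled format

# Old logic: "differs from A OR differs from B" holds for every value,
# because no number can equal both constants at the same time.
is_bad_magic_old() {
    [ $(( $1 != $XFS_DA_NODE_MAGIC || $1 != $XFS_DA3_NODE_MAGIC )) -ne 0 ]
}

# Fixed logic: the magic is bad only when it matches neither constant.
is_bad_magic_new() {
    [ $(( $1 != $XFS_DA_NODE_MAGIC && $1 != $XFS_DA3_NODE_MAGIC )) -ne 0 ]
}

for magic in 0xfebe 0x3ebe 0x1234; do
    is_bad_magic_old "$magic" && old=bad || old=ok
    is_bad_magic_new "$magic" && new=bad || new=ok
    printf '%s old=%s new=%s\n' "$magic" "$old" "$new"
done
```

The old check flags all three values as bad, while the fixed check flags
only 0x1234.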
>
> With this patch applied, shared/002 still fails on 512 block size XFS,
This failure isn't limited to 512 block size XFS. When I increase the
stress of shared/002, the bug can be reproduced on XFS with any block size.
For example, on my test machine, raising $num_attrs to 6000 reproduces it
on 1k block size XFS, and raising $num_attrs to 80k reproduces it on 4k
block size XFS.
So this isn't a block size related bug.
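A rough sketch of how those block size / xattr count pairs could be driven
from one script, reusing the setfattr loop from the simplified reproducer
quoted further down. Assumptions: "80k" is taken as 80000, the helper names
are made up, SCRATCH_DEV and SCRATCH_MNT are the usual xfstests variables,
and the pairs are simply the ones reported in this thread. The counts above
came from shared/002 itself, so whether this simplified loop hits the bug
at the larger block sizes is untested, and newer mkfs.xfs may need extra
options to accept 512 byte blocks.

```shell
#!/bin/sh
# Sketch only. Helper names are hypothetical; the (block size, xattr count)
# pairs come from this thread, with "80k" assumed to mean 80000.

# Number of xattrs reported to trigger the problem for a given block size.
attrs_for_bsize() {
    case "$1" in
        512)  echo 577 ;;
        1024) echo 6000 ;;
        4096) echo 80000 ;;
        *)    return 1 ;;
    esac
}

# Same naming scheme as the quoted reproducer: user.attr_0001, ...
attr_name() {
    printf 'user.attr_%04d' "$1"
}

run_one() {
    bsize=$1
    num_attrs=$(attrs_for_bsize "$bsize") || return 1
    mkfs.xfs -f -b size="$bsize" "$SCRATCH_DEV" || return 1
    mount "$SCRATCH_DEV" "$SCRATCH_MNT" || return 1
    touch "$SCRATCH_MNT/foo"
    i=1
    while [ "$i" -le "$num_attrs" ]; do
        "${SETFATTR_PROG:-setfattr}" -n "$(attr_name "$i")" \
            -v "val_$(printf '%04d' "$i")" "$SCRATCH_MNT/foo"
        i=$((i + 1))
    done
    umount "$SCRATCH_MNT"
    xfs_repair -n "$SCRATCH_DEV"
}

# Only touch a device when both variables are set explicitly.
if [ -n "$SCRATCH_DEV" ] && [ -n "$SCRATCH_MNT" ]; then
    for bsize in 512 1024 4096; do
        run_one "$bsize"
    done
fi
```

With SCRATCH_DEV and SCRATCH_MNT unset the script only defines the helpers
and does nothing, so it is safe to source.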
BTW, xfs_repair can repair this corruption. And from the shared/002 output
we can see that shared/002 uses getfattr to make sure all xattrs have been
written to the device correctly.
So this corruption may be due to XFS log problems.
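For reference, a getfattr-based check of the kind I mean could look roughly
like the sketch below. This is not the code from shared/002; the function
names are made up, and it only counts user.* lines in the "getfattr -d"
dump and compares that with the expected number.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not taken from shared/002.

# Count user.* attribute lines in "getfattr -d" style output on stdin.
count_user_attrs() {
    grep -c '^user\.'
}

# check_attrs FILE EXPECTED: succeed when FILE carries exactly EXPECTED
# user.* attributes according to getfattr.
check_attrs() {
    found=$("${GETFATTR_PROG:-getfattr}" --absolute-names -d "$1" \
        2>/dev/null | count_user_attrs)
    [ "$found" -eq "$2" ]
}
```

Usage would be something like "check_attrs $SCRATCH_MNT/foo 577" while the
filesystem is still mounted, followed by xfs_repair -n after unmount.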
Thanks,
Zorro Lang
> full xfs_repair -n output is
>
> *** xfs_repair -n output ***
> Phase 1 - find and verify superblock...
> Phase 2 - using internal log
> - scan filesystem freespace and inode maps...
> - found root inode chunk
> Phase 3 - for each AG...
> - scan (but don't clear) agi unlinked lists...
> - process known inodes and perform inode discovery...
> - agno = 0
> problem with attribute contents in inode 35
> would clear attr fork
> bad nblocks 67 for inode 35, would reset to 0
> bad anextents 5 for inode 35, would reset to 0
> - agno = 1
> - agno = 2
> - agno = 3
> - process newly discovered inodes...
> Phase 4 - check for duplicate blocks...
> - setting up duplicate extent list...
> - check for inodes claiming duplicate blocks...
> - agno = 0
> - agno = 1
> - agno = 2
> - agno = 3
> No modify flag set, skipping phase 5
> Phase 6 - check inode connectivity...
> - traversing filesystem ...
> - traversal finished ...
> - moving disconnected inodes to lost+found ...
> Phase 7 - verify link counts...
> No modify flag set, skipping filesystem flush and exiting.
> *** end xfs_repair output
>
> And a simplified reproducer is just adding >= 577 xattrs to file foo on
> 512 block size XFS, no dmflaky is needed.
>
> num_xattrs=577
> for ((i = 1; i <= $num_xattrs; i++)); do
>     name="user.attr_$(printf "%04d" $i)"
>     $SETFATTR_PROG -n $name -v "val_$(printf "%04d" $i)" $SCRATCH_MNT/foo
> done
>
> And it's easily reproduced.
>
> Thanks,
> Eryu