
Re: mount XFS partition fail after repair when uquota and gquota are used

To: Ben Myers <bpm@xxxxxxx>
Subject: Re: mount XFS partition fail after repair when uquota and gquota are used
From: Guillaume Anciaux <guillaume.anciaux@xxxxxxx>
Date: Mon, 18 Mar 2013 20:51:17 +0100
Cc: linux-xfs@xxxxxxxxxxx
Delivered-to: linux-xfs@xxxxxxxxxxx
In-reply-to: <20130318164743.GC22182@xxxxxxx>
References: <1363600796196-34996.post@xxxxxxxxxxxxx> <20130318164743.GC22182@xxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130308 Thunderbird/17.0.4

On 18/03/2013 17:47, Ben Myers wrote:
Hi anciaux,

On Mon, Mar 18, 2013 at 02:59:56AM -0700, anciaux wrote:
I have been struggling to repair a partition after a RAID disk set failure.

Apparently the data is accessible with no problem since I can mount the
partition.

The problem is ONLY when I use the uquota and gquota mount option (which I
was using freely before the disk failure).

The syslog shows:

Mar 18 09:35:50 storage kernel: [  417.885430] XFS (sdb1): Internal error
xfs_iformat(1) at line 319 of file
   ^^^^^^^^^^^^^^ Matches the corruption error below.

/build/buildd/linux-3.2.0/fs/xfs/xfs_inode.c.  Caller 0xffffffffa0308502
I believe this is the relevant code, although I'm pasting from the latest
codebase so the line numbers won't match:

500 STATIC int
501 xfs_iformat(
502         xfs_inode_t             *ip,
503         xfs_dinode_t            *dip)
504 {
505         xfs_attr_shortform_t    *atp;
506         int                     size;
507         int                     error = 0;
508         xfs_fsize_t             di_size;
509
510         if (unlikely(be32_to_cpu(dip->di_nextents) +
511                      be16_to_cpu(dip->di_anextents) >
512                      be64_to_cpu(dip->di_nblocks))) {
513                 xfs_warn(ip->i_mount,
514                         "corrupt dinode %Lu, extent total = %d, nblocks = %Lu.",
515                         (unsigned long long)ip->i_ino,
516                         (int)(be32_to_cpu(dip->di_nextents) +
517                               be16_to_cpu(dip->di_anextents)),
518                         (unsigned long long)
519                                 be64_to_cpu(dip->di_nblocks));
520                 XFS_CORRUPTION_ERROR("xfs_iformat(1)", XFS_ERRLEVEL_LOW,
521                                      ip->i_mount, dip);
522                 return XFS_ERROR(EFSCORRUPTED);
523         }

Mar 18 09:35:50 storage kernel: [  417.885634]  [<ffffffffa02c26cf>]
xfs_error_report+0x3f/0x50 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885651]  [<ffffffffa0308502>] ?
xfs_iread+0x172/0x1c0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885663]  [<ffffffffa02c273e>]
xfs_corruption_error+0x5e/0x90 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885680]  [<ffffffffa030826c>]
xfs_iformat+0x42c/0x550 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885697]  [<ffffffffa0308502>] ?
xfs_iread+0x172/0x1c0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885714]  [<ffffffffa0308502>]
xfs_iread+0x172/0x1c0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885729]  [<ffffffffa02c71e4>]
xfs_iget_cache_miss+0x64/0x230 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885740]  [<ffffffffa02c74d9>]
xfs_iget+0x129/0x1b0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885763]  [<ffffffffa0323c46>]
xfs_qm_dqusage_adjust+0x86/0x2a0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885774]  [<ffffffffa02bfda1>] ?
xfs_buf_rele+0x51/0x130 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885787]  [<ffffffffa02ccf83>]
xfs_bulkstat+0x413/0x800 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885811]  [<ffffffffa0323bc0>] ?
xfs_qm_quotacheck_dqadjust+0x190/0x190 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885826]  [<ffffffffa02d66d5>] ?
kmem_free+0x35/0x40 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885843]  [<ffffffffa03246b5>]
xfs_qm_quotacheck+0xe5/0x1c0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885862]  [<ffffffffa031de3c>] ?
xfs_qm_dqdestroy+0x1c/0x30 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885880]  [<ffffffffa0324a94>]
xfs_qm_mount_quotas+0x124/0x1b0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885897]  [<ffffffffa0310990>]
xfs_mountfs+0x5f0/0x690 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885910]  [<ffffffffa02ce322>] ?
xfs_mru_cache_create+0x162/0x190 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885923]  [<ffffffffa02d053e>]
xfs_fs_fill_super+0x1de/0x290 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885939]  [<ffffffffa02d0360>] ?
xfs_parseargs+0xbc0/0xbc0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.885953]  [<ffffffffa02ce665>]
xfs_fs_mount+0x15/0x20 [xfs]

I fear the filesystem may be corrupted in a way that xfs_repair cannot
detect, at least as far as the quota information is concerned.  Does
anyone have a hint about what the problem could be?
Have you tried xfs_repair?  I'm not clear on that.
Sorry, I was not clear enough in my message: yes, I did run xfs_repair -L. That allowed me to mount the partition, but ONLY when the quota options are not set. If quota is activated, then a corruption message (see below for the complete message) is printed in syslog.

Any hint on how I could fix/regenerate the quota information?
It looks like you're hitting the corruption during quotacheck, which is in the
process of regenerating the quota information.  Your paste seems to be missing
the output that would be printed by xfs_warn at line 513, which would include
the inode number, total nextents, and the number of blocks used.  Is that info
available?
Sorry, I ran the previous log through "| grep -i xfs". The complete log is below:

Mar 18 09:35:50 storage kernel: [  417.883817] XFS (sdb1): corrupt dinode 3224608213, extent total = 1, nblocks = 0.
Mar 18 09:35:50 storage kernel: [  417.883822] ffff880216304500: 49 4e 81 a4 01 02 00 01 00 00 03 f4 00 00 03 f5  IN..............
Mar 18 09:35:50 storage kernel: [  417.883926] XFS (sdb1): Internal error xfs_iformat(1) at line 319 of file /build/buildd/linux-3.2.0/fs/xfs/xfs_inode.c.  Caller 0xffffffffa0308502
Mar 18 09:35:50 storage kernel: [  417.883928]
Mar 18 09:35:50 storage kernel: [ 417.884103] Pid: 2947, comm: mount Tainted: P O 3.2.0-38-generic #61-Ubuntu
Mar 18 09:35:50 storage kernel: [  417.884105] Call Trace:
Mar 18 09:35:50 storage kernel: [  417.884137]  [<ffffffffa02c26cf>] xfs_error_report+0x3f/0x50 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884155]  [<ffffffffa0308502>] ? xfs_iread+0x172/0x1c0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884166]  [<ffffffffa02c273e>] xfs_corruption_error+0x5e/0x90 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884183]  [<ffffffffa030826c>] xfs_iformat+0x42c/0x550 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884200]  [<ffffffffa0308502>] ? xfs_iread+0x172/0x1c0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884217]  [<ffffffffa0308502>] xfs_iread+0x172/0x1c0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884223]  [<ffffffff81193612>] ? inode_init_always+0x102/0x1c0
Mar 18 09:35:50 storage kernel: [  417.884235]  [<ffffffffa02c71e4>] xfs_iget_cache_miss+0x64/0x230 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884247]  [<ffffffffa02c74d9>] xfs_iget+0x129/0x1b0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884250]  [<ffffffff81193e9a>] ? evict+0x12a/0x1c0
Mar 18 09:35:50 storage kernel: [  417.884269]  [<ffffffffa0323c46>] xfs_qm_dqusage_adjust+0x86/0x2a0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884300]  [<ffffffffa02bfda1>] ? xfs_buf_rele+0x51/0x130 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884314]  [<ffffffffa02ccf83>] xfs_bulkstat+0x413/0x800 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884338]  [<ffffffffa0323bc0>] ? xfs_qm_quotacheck_dqadjust+0x190/0x190 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884358]  [<ffffffffa02d66d5>] ? kmem_free+0x35/0x40 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884382]  [<ffffffffa03246b5>] xfs_qm_quotacheck+0xe5/0x1c0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884406]  [<ffffffffa031de3c>] ? xfs_qm_dqdestroy+0x1c/0x30 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884430]  [<ffffffffa0324a94>] xfs_qm_mount_quotas+0x124/0x1b0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884452]  [<ffffffffa0310990>] xfs_mountfs+0x5f0/0x690 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884470]  [<ffffffffa02ce322>] ? xfs_mru_cache_create+0x162/0x190 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884490]  [<ffffffffa02d053e>] xfs_fs_fill_super+0x1de/0x290 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884499]  [<ffffffff8117c366>] mount_bdev+0x1c6/0x210
Mar 18 09:35:50 storage kernel: [  417.884518]  [<ffffffffa02d0360>] ? xfs_parseargs+0xbc0/0xbc0 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884537]  [<ffffffffa02ce665>] xfs_fs_mount+0x15/0x20 [xfs]
Mar 18 09:35:50 storage kernel: [  417.884547]  [<ffffffff8117cef3>] mount_fs+0x43/0x1b0
Mar 18 09:35:50 storage kernel: [  417.884555]  [<ffffffff8119783a>] vfs_kern_mount+0x6a/0xc0
Mar 18 09:35:50 storage kernel: [  417.884564]  [<ffffffff81198d44>] do_kern_mount+0x54/0x110
Mar 18 09:35:50 storage kernel: [  417.884573]  [<ffffffff8119a8a4>] do_mount+0x1a4/0x260
Mar 18 09:35:50 storage kernel: [  417.884581]  [<ffffffff8119ad80>] sys_mount+0x90/0xe0
Mar 18 09:35:50 storage kernel: [  417.884591]  [<ffffffff81665982>] system_call_fastpath+0x16/0x1b
Mar 18 09:35:50 storage kernel: [  417.884596] XFS (sdb1): Corruption detected. Unmount and run xfs_repair

Could you provide a metadump?  This bug report isn't ringing any bells for me
yet, but maybe it will for someone else.
I wish I could, but the result of "xfs_metadump /dev/sdb1" for a partition containing 6.9T of data promises to be quite large. Are there special options I should use to extract only the information you need to investigate my problem?
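For what it's worth, xfs_metadump copies only the filesystem metadata, not file contents, so the image is usually far smaller than the data size suggests. A commonly suggested invocation (a sketch assuming a reasonably recent xfsprogs; check your man page for the exact options) might be:

```shell
# Dump metadata only (no file data) to stdout and compress it.
# By default xfs_metadump obfuscates file and attribute names; the
# device should not be mounted while the dump is taken.
xfs_metadump /dev/sdb1 - | bzip2 > sdb1.metadump.bz2
```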

Thanks again for your concern.

Guillaume Anciaux
