
Re: xfs_repair deleting realtime files.

To: Eric Sandeen <sandeen@xxxxxxxxxxx>
Subject: Re: xfs_repair deleting realtime files.
From: Anand Tiwari <tiwarikanand@xxxxxxxxx>
Date: Fri, 21 Sep 2012 09:51:10 -0600
Cc: xfs@xxxxxxxxxxx
In-reply-to: <505BF45D.5050909@xxxxxxxxxxx>
References: <CAHt31_9K_vrzoqwSVsz-6VNVmMUzMyGCFEZfviRV-xPcUqv8-w@xxxxxxxxxxxxxx> <505BF45D.5050909@xxxxxxxxxxx>


On Thu, Sep 20, 2012 at 11:00 PM, Eric Sandeen <sandeen@xxxxxxxxxxx> wrote:
On 9/20/12 7:40 PM, Anand Tiwari wrote:
> Hi All,
>
> I have been looking into an issue with xfs_repair on a realtime subvolume. Sometimes while running xfs_repair I see the following errors:
>
> ----------------------------
> data fork in rt inode 134 claims used rt block 19607
> bad data fork in inode 134
> would have cleared inode 134
> data fork in rt inode 135 claims used rt block 29607
> bad data fork in inode 135
> would have cleared inode 135
>         - agno = 1
>         - agno = 2
>         - agno = 3
>         - process newly discovered inodes...
> Phase 4 - check for duplicate blocks...
>         - setting up duplicate extent list...
>         - check for inodes claiming duplicate blocks...
>         - agno = 0
>         - agno = 1
>         - agno = 2
>         - agno = 3
> entry "test-011" in shortform directory 128 references free inode 134
> would have junked entry "test-011" in directory inode 128
> entry "test-0" in shortform directory 128 references free inode 135
> would have junked entry "test-0" in directory inode 128
> data fork in rt ino 134 claims dup rt extent,off - 0, start - 7942144, count 2097000
> bad data fork in inode 134
> would have cleared inode 134
> data fork in rt ino 135 claims dup rt extent,off - 0, start - 13062144, count 2097000
> bad data fork in inode 135
> would have cleared inode 135
> No modify flag set, skipping phase 5
> ------------------------
>
> Here is the bmap for both inodes.
>
> xfs_db> inode 135
> xfs_db> bmap
> data offset 0 startblock 13062144 (12/479232) count 2097000 flag 0
> data offset 2097000 startblock 15159144 (14/479080) count 2097000 flag 0
> data offset 4194000 startblock 17256144 (16/478928) count 2097000 flag 0
> data offset 6291000 startblock 19353144 (18/478776) count 2097000 flag 0
> data offset 8388000 startblock 21450144 (20/478624) count 2097000 flag 0
> data offset 10485000 startblock 23547144 (22/478472) count 2097000 flag 0
> data offset 12582000 startblock 25644144 (24/478320) count 2097000 flag 0
> data offset 14679000 startblock 27741144 (26/478168) count 2097000 flag 0
> data offset 16776000 startblock 29838144 (28/478016) count 2097000 flag 0
> data offset 18873000 startblock 31935144 (30/477864) count 1607000 flag 0
> xfs_db> inode 134
> xfs_db> bmap
> data offset 0 startblock 7942144 (7/602112) count 2097000 flag 0
> data offset 2097000 startblock 10039144 (9/601960) count 2097000 flag 0
> data offset 4194000 startblock 12136144 (11/601808) count 926000 flag 0

It's been a while since I thought about realtime, but -

That all seems fine; I don't see anything overlapping there. They are
all perfectly adjacent, though of interesting size.

>
> By looking into the xfs_repair code, it looks like repair does not handle
> the case where we have more than one extent in a real-time extent.
> The following is code from repair/dinode.c:process_rt_rec:

"more than one extent in a real-time extent?"  I'm not sure what that means.

Every extent above is 2097000 blocks long, and they are adjacent.
But you say your realtime extent size is 512 blocks ... which doesn't
divide 2097000 evenly.  So that's odd, at least.


Well, let's look at the first two extents:
> data offset 0 startblock 13062144 (12/479232) count 2097000 flag 0
> data offset 2097000 startblock 15159144 (14/479080) count 2097000 flag 0
The startblock is aligned, and its rt extent is 25512. Since the block count is not a multiple of 512, the last realtime extent covered (25512 + 4095 = 29607) is only partially used: 360 blocks.
The second extent starts inside realtime extent 29607 (i.e. 25512 + 4095). So yes, the extents do not overlap, but realtime extent 29607 is shared by two data extents.
Once xfs_repair detects this case in phase 2, it bails out and clears the inode. The search for duplicate extents is done in phase 4, but by then the inode is already marked.


Can you provide your xfs_info output for this fs?
Or maybe better yet an xfs_metadump image.


I will do that soon.  The other thing I noticed is that if I try to delete this file (inode 135), I get the following assertion failure:

[75669.291000] Assertion failed: prev.br_state == XFS_EXT_NORM, file: fs/xfs/xfs_bmap.c, line: 5187
[75669.300000] [<8024dc44>] assfail+0x28/0x2c
[75669.300000] [<801d8864>] xfs_bunmapi+0x1288/0x14e4
[75669.300000] [<8020aff4>] xfs_itruncate_finish+0x344/0x738
[75669.300000] [<802363b0>] xfs_inactive+0x4a8/0x51c
[75669.300000] [<80109124>] evict+0x28/0xd0
[75669.300000] [<80109c14>] iput+0x19c/0x2d8
[75669.300000] [<800fe58c>] do_unlinkat+0x10c/0x19c
[75669.300000] [<80011b7c>] stack_done+0x20/0x40

> -----
>     for (b = irec->br_startblock;
>          b < irec->br_startblock + irec->br_blockcount;
>          b += mp->m_sb.sb_rextsize)  {
>             ext = (xfs_drtbno_t) b / mp->m_sb.sb_rextsize;
>             pwe = xfs_sb_version_hasextflgbit(&mp->m_sb) &&
>                   irec->br_state == XFS_EXT_UNWRITTEN &&
>                   (b % mp->m_sb.sb_rextsize != 0);
> -----
>
> In my case rextsize is 512 blocks (512 * 4096 bytes = 2 MB). So when we
> have multiple extents (written extents, to be precise; thanks dchinner
> for that), the value of "ext" will be the same for all of them, and
> xfs_repair does not like it; hence the error message "data fork in rt
> inode XX claims used rt block XX".

"ext" should not be the same for all of them; ext is the realtime extent
number in the fs, based on the physical start, br_startblock,
divided by the rt extent size.  There shouldn't be duplicate values
of "ext" based on the bmaps above.

The error comes from search_rt_dup_extent() which looks for overlaps
elsewhere in the fs...

If you can provide a metadump of the fs it might be easier to see what's going on.

-Eric

> If I ignore this failure condition, xfs_repair seems to be happy.
> (FYI: this file system was cleanly unmounted.) But in my opinion it's
> not good, as these multiple extents can overlap too.

> Should we be using XR_E_MULT to flag and keep track of duplicated
> real-time extents (maybe using the present API for adding/detecting
> duplicate extents)?
>
> I am open to suggestions or comments on how to fix this.
>
> xfs_repair version is 3.1.8 and kernel 2.6.37.
>
> thanks,
> Anand
>
>
>
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs
>

