On Tue, Sep 25, 2012 at 07:26:32PM -0600, Anand Tiwari wrote:
> On Mon, Sep 24, 2012 at 6:51 AM, Anand Tiwari <tiwarikanand@xxxxxxxxx> wrote:
>
> >
> >
> > On Mon, Sep 24, 2012 at 1:55 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> >
> >> On Fri, Sep 21, 2012 at 12:00:13AM -0500, Eric Sandeen wrote:
> >> > On 9/20/12 7:40 PM, Anand Tiwari wrote:
> >> > > Hi All,
> >> > >
> >> > > I have been looking into an issue with xfs_repair on a realtime
> >> > > subvolume. Sometimes while running xfs_repair I see the following errors:
> >> > >
> >> > > ----------------------------
> >> > > data fork in rt inode 134 claims used rt block 19607
> >> > > bad data fork in inode 134
> >> > > would have cleared inode 134
> >> > > data fork in rt inode 135 claims used rt block 29607
> >> > > bad data fork in inode 135
> >> > > would have cleared inode 135
> >> .....
> >> > > xfs_db> inode 135
> >> > > xfs_db> bmap
> >> > > data offset 0 startblock 13062144 (12/479232) count 2097000 flag 0
> >> > > data offset 2097000 startblock 15159144 (14/479080) count 2097000 flag 0
> >> > > data offset 4194000 startblock 17256144 (16/478928) count 2097000 flag 0
> >> > > data offset 6291000 startblock 19353144 (18/478776) count 2097000 flag 0
> >> > > data offset 8388000 startblock 21450144 (20/478624) count 2097000 flag 0
> >> > > data offset 10485000 startblock 23547144 (22/478472) count 2097000 flag 0
> >> > > data offset 12582000 startblock 25644144 (24/478320) count 2097000 flag 0
> >> > > data offset 14679000 startblock 27741144 (26/478168) count 2097000 flag 0
> >> > > data offset 16776000 startblock 29838144 (28/478016) count 2097000 flag 0
> >> > > data offset 18873000 startblock 31935144 (30/477864) count 1607000 flag 0
> >> > > xfs_db> inode 134
> >> > > xfs_db> bmap
> >> > > data offset 0 startblock 7942144 (7/602112) count 2097000 flag 0
> >> > > data offset 2097000 startblock 10039144 (9/601960) count 2097000 flag 0
> >> > > data offset 4194000 startblock 12136144 (11/601808) count 926000 flag 0
> >> >
> >> > It's been a while since I thought about realtime, but -
> >> >
> >> > That all seems fine, I don't see anything overlapping there, they are
> >> > all perfectly adjacent, though of interesting size.
> >>
> >> Yeah, the size is the problem.
> >>
> >> ....
> >> > Every extent above is length 2097000 blocks, and they are adjacent.
> >> > But you say your realtime extent size is 512 blocks ... which doesn't go
> >> > into 2097000 evenly. So that's odd, at least.
> >>
> >> Once you realise that the bmapbt is recording multiples of FSB (4k)
> >> rather than rtextsz (2MB), it becomes more obvious what the problem
> >> is: rounding of the extent size at MAXEXTLEN - 2097000 is only 152
> >> blocks short of 2^21 (2097152).
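
As a rough sanity check on the numbers quoted above, here is a small Python
sketch (not taken from xfs_repair; the function and table names are made up
for illustration). It assumes the "rt block" number in the repair output is
the realtime extent index, i.e. startblock / 512, and uses the first few
(startblock, count) pairs from the bmap output. It reports which realtime
extents are touched by both an extent and the one following it.

```python
RTEXTSIZE = 512  # realtime extent size in 4k filesystem blocks, as stated in the thread

# (startblock, count) pairs copied from the xfs_db bmap output quoted above.
EXTENTS = {
    134: [(7942144, 2097000), (10039144, 2097000), (12136144, 926000)],
    135: [(13062144, 2097000), (15159144, 2097000)],
}


def shared_rt_extents(extents, rtextsize=RTEXTSIZE):
    """Return the rt extent indexes touched by both an extent and its successor."""
    shared = []
    for (start_a, count_a), (start_b, _count_b) in zip(extents, extents[1:]):
        last_a = (start_a + count_a - 1) // rtextsize  # rt extent holding a's last block
        first_b = start_b // rtextsize                 # rt extent holding b's first block
        if last_a == first_b:
            shared.append(first_b)
    return shared


for ino, extents in EXTENTS.items():
    print("inode", ino, "shared rt extents:", shared_rt_extents(extents))
```

Under that assumption, the first shared realtime extent comes out as 19607 for
inode 134 and 29607 for inode 135, which are the numbers in the repair
messages. That is consistent with the extents being adjacent but split in the
middle of a realtime extent.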
> >>
> >> I haven't looked at the kernel code yet to work out why it is
> >> rounding to a non-rtextsz multiple, but that is the source of the
> >> problem.
> >>
> >> The repair code is detecting that extents are not of the
> >> correct granularity, but the error message indicates that this was
> >> only ever expected for duplicate blocks occurring rather than a
> >> kernel bug. So "fixing repair" is not what is needed here - finding
> >> and fixing the kernel bug is what you should be looking at.
> >>
> >> Cheers,
> >>
> >> Dave.
> >> --
> >> Dave Chinner
> >> david@xxxxxxxxxxxxx
> >>
> >
> >
> > Thanks, I have started looking at the allocator code and will report if I
> > see something.
> >
>
>
> I think this is what is happening. It needs the following conditions:
> 1) we have more than 8GB of contiguous space available to allocate (i.e.
> more than 2^21 4k blocks)
> 2) only one file is open for writing in the real-time volume.
>
> To satisfy the first condition, I just took an empty file-system.
>
> Now let's start allocating, say in chunks of 25000 blocks. The realtime
> allocator will have no problem allocating the "exact" block while searching
> forward (xfs_rtfind_forw()). It will allocate 49 "real-time extents", where
> the 49th "real-time extent" is partially full (25000/512 is about 48.8).
>
> Everything is fine for the first 83 allocations, as we are able to grow the
> extent. Now we have 2075000 (25000*83) blocks in the first extent, i.e. 4053
> "real-time extents" (where the last "real-time extent" is partially full).
>
> For the 84th allocation, the real-time allocator will allocate another 49
> "real-time extents", as it does not know about the maximum extent size, but
> we cannot grow the extent in xfs_bmap_add_extent_unwritten_real(), so we
> insert a new extent (case BMAP_LEFT_FILLING). The new extent starts at
> 2075000, which is not aligned with rextsize (512 in this case).
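
The arithmetic in this walk-through can be checked with a short Python sketch.
This is only a model of the counting, not of the allocator: it assumes the
largest extent length a single record can hold is 2^21 - 1 blocks (my
understanding of MAXEXTLEN), and first_split() is a hypothetical helper name.

```python
MAXEXTLEN = (1 << 21) - 1  # assumed maximum extent length in blocks (2^21 - 1)
RTEXTSIZE = 512            # realtime extent size in 4k blocks, from the thread
CHUNK = 25000              # blocks requested per allocation in the example


def first_split(chunk=CHUNK, maxlen=MAXEXTLEN):
    """Return (allocation number, file offset) where a new extent must start."""
    length = 0  # blocks accumulated in the current extent
    n = 0
    while True:
        n += 1
        if length + chunk > maxlen:
            # The existing extent cannot grow any further, so this
            # allocation starts a new extent at the current offset.
            return n, length
        length += chunk


n, offset = first_split()
print("allocation", n, "starts a new extent at offset", offset)
print("offset modulo rtextsize:", offset % RTEXTSIZE)
```

With these inputs the split happens at the 84th allocation, at offset 2075000,
and 2075000 modulo 512 is 376, so the new extent begins partway through a
realtime extent as described above.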