
Re: TAKE - turn on delayed allocation

To: linux-xfs@xxxxxxxxxxx
Subject: Re: TAKE - turn on delayed allocation
From: Russell Cattelan <cattelan@xxxxxxxxxxx>
Date: Wed, 10 May 2000 15:51:29 -0500
References: <200005101926.MAA17364@dbear.engr.sgi.com> <3919BB0C.87D22D9B@thebarn.com> <3919BE06.CC3C76F1@sgi.com> <3919C0F3.E8F97889@thebarn.com> <3919C36D.1B01981A@sgi.com>
Sender: owner-linux-xfs@xxxxxxxxxxx
Rajagopal Ananthanarayanan wrote:

> Russell Cattelan wrote:
> >
> > Rajagopal Ananthanarayanan wrote:
> >
> > > Russell Cattelan wrote:
> > > >
> > > > Ananth Ananthanarayanan wrote:
> > > >
> > > > > Since delay_alloc ON and PAGEBUF_META off
> > > > > seems to not hit the corruption problems,
> > > > > I'm turning delay_alloc on by default.
> > > >
> > > > The corruption problem occurs with PAGEBUF_META off as well
> > > > as on.
> > > >
> > > > Please don't throw more variables into the mix just yet.
> > >
> > > Well, it may not have to do with PAGEBUF_META.
> > > The corruption does seem to go away with delay_alloc ON.
> >
> > Can you verify that?
> > I'm looking at the pagebuf_file_write path ...
> > delayed alloc short-circuits a lot of code ... makes me
> > wonder if the problem isn't someplace within the code that
> > is then turned off.
>
> I know one thing for sure: with delay_alloc ON
> and PAGEBUF_META off I've never seen corruption.

It probably changes the timing enough.

>
>
> With a 3/30/2000 snapshot (before the days of
> delalloc):
>
>         (a) Corruption with PAGEBUF_META ON
>         (b) NO corruption with PAGEBUF_META OFF
>
> >
> > >
> > > I'd like to get as much exposure with delay allocation:
> > > it's been in there for 3 weeks now ... and like I said
> > > earlier the corruption has been there since 3/30/2000
> >
> > I'm not convinced it hasn't always been there.
>
> No. A 3/30/2000 tree easily produces corruption
> with PAGEBUF_META ON.

Pagebuf metadata had leakage problems at that time, which would
cause other forms of corruption, so it is hard to say what was
causing what at that point.

Corruption doesn't occur as often with PAGEBUF_META off, but
with enough kernel makes running concurrently (typically 5),
corruption will occur with PAGEBUF_META off too.

Pagebuf metadata is probably using the same broken interface that the
user data path uses; the increased activity would probably cause
the corruption to show up sooner.

What we know for sure at this point: pages are getting reused by
other processes before they are written out to disk. It may have
something to do with the delayed write path, but that is just a
guess at this point.
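
To make the failure mode concrete, here is a toy model in Python of a page
frame being reused before its delayed write runs. This is only a sketch:
the Page class, the pending list and the simulate() function are all
made up for illustration and have nothing to do with the real pagebuf or
XFS code paths.

```python
# Toy model only: NOT XFS or pagebuf code. All names are hypothetical.

class Page:
    """One physical page frame holding some bytes."""
    def __init__(self):
        self.data = b""
        self.dirty = False


def simulate(reuse_before_flush):
    """Return what ends up on 'disk' for m.out at offset 0."""
    disk = {}          # (file name, offset) -> bytes written
    page = Page()      # a single page frame
    pending = []       # delayed writes: (file name, offset, page)

    # Process A writes into the page; the write-out is deferred.
    page.data = b"A: make output"
    page.dirty = True
    pending.append(("m.out", 0, page))

    if reuse_before_flush:
        # The behaviour being modelled: the frame is handed to another
        # process while it is still dirty and waiting to be written.
        page.data = b"B: unrelated data"

    # The delayed write finally runs and copies whatever the frame holds now.
    for name, offset, p in pending:
        disk[(name, offset)] = p.data
        p.dirty = False

    return disk[("m.out", 0)]


if __name__ == "__main__":
    print(simulate(reuse_before_flush=False))
    print(simulate(reuse_before_flush=True))
```

In the second call the file ends up holding the other process's bytes,
which is the kind of damage described above.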


The best way at this point to get corruption:

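# csh syntax
cd <xfs partition>
mkdir 1 2 3 4 5

# make five copies of the source tree
foreach i ( 1 2 3 4 5 )
    rsync -av <xfs source tree> $i/
end

# run five kernel builds concurrently, each logging to its own m.out
foreach i ( 1 2 3 4 5 )
    (cd $i/<xfs source tree> ; make clean depend bzImage modules >& m.out ) &
end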

The m.out files will get corrupted.

The fact that the pages of the m.out files fill slowly causes the
corruption to show up.
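
A rough way to scan the m.out files after the builds finish is sketched
below in Python. It rests on an assumption: that make output is plain
ASCII text, so any other byte (for example a run of NULs) is treated as
suspect. The real corruption may look different. The function names and
the choice of searching under directories 1 to 5 for files named m.out
are mine, taken from the recipe above.

```python
import string
from pathlib import Path

# Assumption: build logs contain only printable ASCII and whitespace.
PRINTABLE = set(string.printable.encode("ascii"))


def suspicious_offsets(data, limit=10):
    """Return up to `limit` offsets of bytes that are not printable ASCII."""
    hits = []
    for offset, byte in enumerate(data):
        if byte not in PRINTABLE:
            hits.append(offset)
            if len(hits) >= limit:
                break
    return hits


def check_logs(root=".", dirs=("1", "2", "3", "4", "5"), name="m.out"):
    """Map each suspect log file path to its first few suspicious offsets."""
    report = {}
    for d in dirs:
        top = Path(root) / d
        if not top.is_dir():
            continue
        # m.out lives inside the copied source tree, so search recursively.
        for path in top.rglob(name):
            hits = suspicious_offsets(path.read_bytes())
            if hits:
                report[str(path)] = hits
    return report


if __name__ == "__main__":
    for path, offsets in check_logs().items():
        print(path, "suspicious bytes at offsets", offsets)
```

Run it from the top of the xfs partition. An empty result only means this
heuristic found nothing, not that the files are good.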

>
>
> [ Trimming replies to only TO: linux-xfs;
> I get enough mail already ... getting three copies of
> the same thing doesn't make it easy ... Thanks! ]
>
> --------------------------------------------------------------------------
> Rajagopal Ananthanarayanan ("ananth")
> Member Technical Staff, SGI.
> --------------------------------------------------------------------------

