On Thu, May 23, 2013 at 08:09:33AM +1000, Dave Chinner wrote:
> On Wed, May 22, 2013 at 12:19:46PM -0400, Dave Jones wrote:
> > On Wed, May 22, 2013 at 10:22:52AM -0400, Dave Jones wrote:
> > > On Wed, May 22, 2013 at 03:51:47PM +1000, Dave Chinner wrote:
> > >
> > > > > Tomorrow I'll also try running some older kernels with the same
> > > > > options to see if it's something new, or an older bug. This is a
> > > > > new machine, so it may be something that's been around for a
> > > > > while, and for whatever reason, my other machines don't hit
> > > > > this.
> > > >
> > > > Another thing that just occurred to me - what compiler are you
> > > > using? We had a report last week on #xfs that xfsdump was failing
> > > > with bad checksums because of link time optimisation (LTO) in
> > > > gcc-4.8.0. When they turned that off, everything worked fine. So if
> > > > you are using 4.8.0, perhaps trying a different compiler might be a
> > > > good idea, too.
> > >
> > > Yeah, this is 4.8.0. This box is running F19-beta.
> > > I managed to shoehorn the gcc-4.7 from f18 on there though.
> > > Bug reproduced instantly, so I think we can rule out compiler.
> > >
> > > I ran 3.9 with the same debug options. Seems stable.
> > > I'll do a bisect.
> > Good news. It wasn't until I started bisecting that I realised I was still
> > carrying this patch from you to fix slab corruption I was seeing.
> > It seems to be the culprit (or is masking another problem -- I had to apply
> > it at each step of the bisect to get past the slab corruption bug).
> That doesn't make a whole lot of sense to me. The fix in the xfsdev
> tree is a little different:
> but I can't see how this makes any difference to the problem at all.
> See my previous post about the fact that 0xa068 is actually a valid
> mask and should not be tripping the assert....
Hmm, I did git bisect fs/xfs/, so maybe it's something outside of that subdir
that's the cause. I'll start over on the whole tree once I'm done bisecting
another entirely different bug.
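
For reference, a minimal sketch of why a path-limited bisect can miss the
culprit. It builds a throwaway repository (the file names, commit messages
and the `good` tag are made up for illustration, not taken from the kernel
tree) and counts how many commits each kind of bisect would consider.

```shell
set -e

# Throwaway repository; nothing here touches a real kernel tree.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Helper so commits work without any global git identity configured.
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

mkdir -p fs/xfs mm

# Known-good starting point.
echo a > fs/xfs/a.c
g add .
g commit -q -m "good base"
g tag good

# One change outside fs/xfs/, one change inside it.
echo b > mm/slab.c
g add .
g commit -q -m "change outside fs/xfs"
echo c >> fs/xfs/a.c
g add .
g commit -q -m "change inside fs/xfs"

# Commits a bisect started with "-- fs/xfs/" would consider,
# versus commits a whole-tree bisect would consider.
limited=$(git rev-list --count good..HEAD -- fs/xfs/)
whole=$(git rev-list --count good..HEAD)

echo "path-limited candidates: $limited"
echo "whole-tree candidates:   $whole"
```

On a real tree the two variants are `git bisect start -- fs/xfs/` and plain
`git bisect start`, each followed by `git bisect bad` and `git bisect good
v3.9`. The path-limited form converges faster but can only ever blame a
commit that touched fs/xfs/.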