
Re: Xfs Access to block zero exception and system crash

To: Sagar Borikar <sagar_borikar@xxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
Subject: Re: Xfs Access to block zero exception and system crash
From: Sagar Borikar <sagar_borikar@xxxxxxxxxxxxxx>
Date: Wed, 02 Jul 2008 11:05:41 +0530
In-reply-to: <20080702051337.GX29319@disturbed>
Organization: PMC Sierra Inc
References: <20080626070215.GI11558@disturbed> <4864BD5D.1050202@xxxxxxxxxxxxxx> <4864C001.2010308@xxxxxxxxxxxxxx> <20080628000516.GD29319@disturbed> <340C71CD25A7EB49BFA81AE8C8392667028A1CA7@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> <20080629215647.GJ29319@disturbed> <20080630034112.055CF18904C4@xxxxxxxxxxxxxxxxxxxxxxxxxx> <4868B46C.9000200@xxxxxxxxxxxxxx> <20080701064437.GR29319@disturbed> <486B01A6.4030104@xxxxxxxxxxxxxx> <20080702051337.GX29319@disturbed>
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Thunderbird (X11/20080421)

Dave Chinner wrote:
On Wed, Jul 02, 2008 at 09:48:46AM +0530, Sagar Borikar wrote:
Dave Chinner wrote:
On Mon, Jun 30, 2008 at 03:54:44PM +0530, Sagar Borikar wrote:
Sure - just like any other workload that generates enough
extents. Like I said originally, we've fixed so many problems
in this code since 2.6.18 I'd suggest that your only sane
hope for us to help you track down the problem is to upgrade
to a current kernel and go from there....
Thanks again, Dave. But we can't upgrade the kernel, as it is already in production and in the field.

Yes, but you can run it in your test environment where you are
reproducing this problem, right?

Unfortunately, the architecture is a customized MIPS for which no standard kernel port is available, so we would have to port the new kernel in order to try this, which is why I was
hesitating to do it.
So do you think periodic cleaning of the file system using xfs_fsr can solve the issue?

No, at best it would only delay the problem (whatever it is).

If not, could you
kindly point me to the patches that fixed similar problems? I can try backporting them.

I don't have time to try to identify some set of changes from the
past 3-4 years that might fix your problem. There may not even be a
patch that fixes your problem, which is one of the reasons why I've
asked if you can reproduce it on a current kernel....

I pointed you to the files that the bug could lie in earlier in the
thread. You can find the history of changes to those files via the
mainline git repository or via the XFS CVS repository. You'd
probably do best to look at the git tree because all the changes are
well described in the commit logs and you should be able to isolate
ones that fix btree problems fairly easily...
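
As a rough illustration of the changelog search suggested above, here is a minimal sketch that builds a git log query for commits newer than a base tag whose messages mention a keyword. The tag v2.6.18, the directory fs/xfs, and the keyword "btree" are assumptions chosen for illustration; the specific files were named earlier in the thread and are not repeated here, so substitute them for the directory. The helper names are hypothetical.

```python
import subprocess


def btree_log_command(base="v2.6.18", path="fs/xfs", pattern="btree"):
    """Build a git command listing commits after `base` that touch `path`
    and whose commit message matches `pattern` (case-insensitive).

    All three defaults are illustrative assumptions, not values taken
    from the thread.
    """
    return [
        "git", "log", "--oneline", "--regexp-ignore-case",
        f"--grep={pattern}",
        f"{base}..HEAD",
        "--", path,
    ]


def list_candidate_commits(repo_dir, **kwargs):
    """Run the query inside `repo_dir` (a clone of the mainline kernel
    tree) and return one line per matching commit."""
    result = subprocess.run(
        btree_log_command(**kwargs),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

Calling list_candidate_commits("/path/to/linux") from a script would print nothing by itself; it returns the one-line summaries, which can then be inspected individually with git show to judge whether a change is worth backporting.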


Sure, I'll go through these changelogs. Thanks for all your help; I really appreciate your
time. I hope you won't mind helping me in the future if I find something new :)

