
Re: XFS related hang (was Re: How to send a break? - dump from frozen 64

To: Janos Haar <djani22@xxxxxxxxxxxx>
Subject: Re: XFS related hang (was Re: How to send a break? - dump from frozen 64bit linux)
From: Nathan Scott <nathans@xxxxxxx>
Date: Fri, 2 Jun 2006 09:43:25 +1000
Cc: linux-kernel@xxxxxxxxxxxxxxx, linux-xfs@xxxxxxxxxxx
In-reply-to: <01ed01c685c8$b46ec6f0$1800a8c0@dcccs>; from djani22@netcenter.hu on Fri, Jun 02, 2006 at 12:14:04AM +0200
References: <004501c68225$00add170$1800a8c0@dcccs> <9a8748490605280917l73f5751cmf40674fc22726c43@mail.gmail.com> <01d801c6827c$fba04ca0$1800a8c0@dcccs> <01a801c683d2$e7a79c10$1800a8c0@dcccs> <200605301903.k4UJ3xQU008919@turing-police.cc.vt.edu> <1149038431.21827.20.camel@localhost.localdomain> <20060531143849.C478554@wobbly.melbourne.sgi.com> <00f501c68488$4d10c080$1800a8c0@dcccs> <20060602075826.B530100@wobbly.melbourne.sgi.com> <01ed01c685c8$b46ec6f0$1800a8c0@dcccs>
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.2.5i
On Fri, Jun 02, 2006 at 12:14:04AM +0200, Janos Haar wrote:
> ---- Original Message ----- 
> > On Wed, May 31, 2006 at 10:00:33AM +0200, Janos Haar wrote:
> > >
> > > Hey, i think i found something.
> > > My quota on my huge device is broken.
> > > (inferno   -- 18014398504855404       0       0 18446744073709551519 0     0)
> >
> > Hmm, that is interesting.  I guess you don't know whether this
> > accounting problem happened before you rebooted or whether it
> > only just got this way (after journal recovery)?
> 
> In my system, this huge device is difficult.

Can you describe your hardware a bit more?  (and send xfs_info
output too please).

> I often need to reboot, and run xfs_repair, to make it clean. (nodes hangs,
> reboots, etc...)

Ehrm, hmm, that smells fishy... does this device have a write
cache enabled by any chance?

> Now is my default reboot option is xfs_repair -L, so i dont know, this
> happens before, or after, sorry.

Oh, that's bad, all bets are off then - you really can't go doing
that routinely, that's an "in emergency only" big red button -
it throws away the contents of the journal, and will pretty much
guarantee filesystem corruption.
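
For contrast with a routine "xfs_repair -L", here is a rough sketch of a more conservative order of operations: mount so the kernel replays the journal, unmount, run a read-only check, and only then run a normal repair. The device name /dev/sdX1 and mount point /mnt/scratch are placeholders, not taken from this thread, and the helper repair_plan is hypothetical. The script only prints the commands, it does not run them.

```python
def repair_plan(device, mountpoint):
    """Return a conservative command sequence as argv lists.

    Nothing is executed here; the caller decides what to run.
    The device and mountpoint arguments are placeholders.
    """
    return [
        # Mounting lets the kernel replay the journal instead of discarding it.
        ["mount", "-t", "xfs", device, mountpoint],
        # xfs_repair needs the filesystem unmounted.
        ["umount", mountpoint],
        # -n is no-modify mode: report problems, change nothing.
        ["xfs_repair", "-n", device],
        # A normal repair; note there is no -L (log zeroing) in this plan.
        ["xfs_repair", device],
    ]


if __name__ == "__main__":
    for argv in repair_plan("/dev/sdX1", "/mnt/scratch"):
        print(" ".join(argv))
```

If the mount step itself fails, that is the point where -L becomes a last resort, accepting that whatever was in the journal is lost.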

But, it sounds a lot like you may have a big hardware reliability
issue there, which is going to make it difficult to distinguish
any software problems.  However, if you find a way to reproduce
that quota accounting problem (above), I'm all ears.
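
As an aside on the two odd numbers in the quoted quota line: one possible reading, assuming the counters are unsigned 64-bit values that were decremented below zero, is that both are small negative numbers that wrapped around. The file count sits just under 2^64, and the block count sits just under 2^54, which is what a wrapped 64-bit count would look like after being scaled down by 1024 for reporting. The scaling factor and the helper name as_signed are assumptions for illustration, not something confirmed in this thread.

```python
def as_signed(value, bits):
    """Reinterpret an unsigned counter of the given bit width as a
    two's-complement signed number."""
    if value >= 1 << (bits - 1):
        return value - (1 << bits)
    return value


# Values copied from the quoted quota report.
blocks_used = 18014398504855404
files_used = 18446744073709551519

# File count read as a wrapped 64-bit counter.
files_signed = as_signed(files_used, 64)

# Block count read as a wrapped 64-bit counter divided by 1024
# (assumption), which leaves 54 significant bits.
blocks_signed = as_signed(blocks_used, 54)

print(files_signed)   # -97
print(blocks_signed)  # -4626580
```

Under that reading the user would be "owed" 97 inodes and roughly 4.4 GiB of blocks, which would fit an accounting underflow rather than random garbage, though it says nothing about when the underflow happened.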

cheers.

-- 
Nathan

