Re: vanilla, project quota enabled and process stuck in D state

To: Arkadiusz Miskiewicz <arekm@xxxxxxxx>
Subject: Re: vanilla, project quota enabled and process stuck in D state (repeatable every time)
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 4 Dec 2008 08:30:28 +1100
Cc: xfs@xxxxxxxxxxx
In-reply-to: <200812031406.41882.arekm@xxxxxxxx>
Mail-followup-to: Arkadiusz Miskiewicz <arekm@xxxxxxxx>, xfs@xxxxxxxxxxx
References: <200812021949.55463.arekm@xxxxxxxx> <20081203032013.GS18236@disturbed> <200812031406.41882.arekm@xxxxxxxx>
User-agent: Mutt/1.5.18 (2008-05-17)
On Wed, Dec 03, 2008 at 02:06:41PM +0100, Arkadiusz Miskiewicz wrote:
> On Wednesday 03 of December 2008, Dave Chinner wrote:
> > On Tue, Dec 02, 2008 at 07:49:55PM +0100, Arkadiusz Miskiewicz wrote:
> > > Hello,
> > >
> > > I'm trying to use xfs project quota on kernel (vanilla, no
> > > additional patches), x86_64 UP machine (SMP kernel).
> > >
> > > Now some processes that are using /home/users/arekm/rpm are hanging in
> > > D-state like:
> [arekm@farm ~]$ zgrep LOCKDEP /proc/config.gz
> I don't see anything strictly lockdep related in dmesg so it doesn't seem to 
> be triggered.

Which implies that something holding a lock is itself blocked.

> D-state lock is also happening if I drop usrquota,prjquota, reboot and retry 
> the test. I assume something was written on disk that triggers the problem.

Unlikely - locking doesn't generally get stuck due to on disk
corruption. Are there any other blocked processes in the machine?
i.e. what is the entire output of 'echo w > /proc/sysrq-trigger'?
Are there any other signs of general unwellness (e.g. a CPU running
at 100% when it shouldn't be)?
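
As a rough illustration of gathering that information (a sketch only:
the function names are made up for this example, and the first one
simply reads /proc as an unprivileged complement to the sysrq dump,
not a replacement for it):

```shell
# List tasks in uninterruptible sleep (state D) as "pid<TAB>name<TAB>wchan".
# The optional argument is the proc root; it defaults to /proc.
list_d_state() {
    lds_root=${1:-/proc}
    for lds_dir in "$lds_root"/[0-9]*; do
        [ -r "$lds_dir/status" ] || continue
        lds_state=$(awk '/^State:/ {print $2; exit}' "$lds_dir/status" 2>/dev/null)
        [ "$lds_state" = "D" ] || continue
        lds_name=$(sed -n 's/^Name:[[:space:]]*//p' "$lds_dir/status" 2>/dev/null)
        # wchan names the kernel function the task sleeps in, if exposed.
        lds_wchan=$(cat "$lds_dir/wchan" 2>/dev/null)
        printf '%s\t%s\t%s\n' "${lds_dir##*/}" "$lds_name" "${lds_wchan:-?}"
    done
    return 0
}

# Ask the kernel to log stack traces of all blocked tasks, then show the log.
# Needs root and sysrq enabled; the traces land in the kernel log.
dump_blocked_tasks() {
    echo w > /proc/sysrq-trigger
    dmesg
}
```

Run list_d_state as any user to see which tasks are stuck, then
dump_blocked_tasks as root to capture where each of them is blocked.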

> Note that now I'm testing on a second machine (UP i686, SMP kernel), so this
> isn't a unique problem.

Can you identify the inode that the unlink is hanging on and get
an xfs_db dump of the contents of that inode? Also a dump of the
parent directory inode would be useful, too.
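
Roughly along these lines (a sketch, not a recipe: the function names
and the device and path arguments are placeholders for illustration,
and xfs_db is opened read-only with -r; on a mounted filesystem the
on-disk contents may lag behind what is in memory):

```shell
# Print the inode number of a path.
inode_of() {
    ls -id "$1" | awk '{print $1}'
}

# Dump an inode and the inode of its parent directory with xfs_db.
# $1 = block device holding the filesystem (placeholder, e.g. /dev/sdXN)
# $2 = path of the file the unlink hangs on
dump_inode_and_parent() {
    dip_dev=$1
    dip_path=$2
    dip_ino=$(inode_of "$dip_path")
    dip_parent=$(inode_of "$(dirname "$dip_path")")
    # -r opens the device read-only; each -c runs one xfs_db command.
    xfs_db -r -c "inode $dip_ino" -c "print" "$dip_dev"
    xfs_db -r -c "inode $dip_parent" -c "print" "$dip_dev"
}
```

dump_inode_and_parent needs read access to the block device, so it
will normally have to be run as root.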

FWIW, if you are seeing this on two hosts, can you try to build
a reproducible test case using a minimal data set and a simple
set of commands? If you can do this and supply us with an
xfs_metadump image of the filesystem plus the commands to reproduce
it, we'll be able to find the problem pretty quickly.
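
Something like the following would produce the image (a sketch under
assumptions: the function names and arguments are invented for this
example, and it takes the conservative route of refusing any mounted
device, although a read-only or frozen filesystem can also be dumped):

```shell
# Succeed if the device is listed as a mount source.
# $1 = device, $2 = mounts table to consult (defaults to /proc/mounts)
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Write a metadata-only image of an unmounted XFS filesystem.
# $1 = block device (placeholder, e.g. /dev/sdXN), $2 = output file
metadump_for_report() {
    mfr_dev=$1
    mfr_out=$2
    if is_mounted "$mfr_dev"; then
        echo "$mfr_dev is mounted; unmount it first" >&2
        return 1
    fi
    # File and attribute names are obfuscated by default; no file data
    # is copied, only metadata.
    xfs_metadump "$mfr_dev" "$mfr_out"
}
```

The resulting file can be compressed and sent along with the list of
commands that trigger the hang.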


Dave Chinner