Re: XFS crash?

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: XFS crash?
From: Austin Schuh <austin@xxxxxxxxxxxxxxxx>
Date: Tue, 13 May 2014 10:11:11 -0700
Cc: xfs <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140513090321.GR26353@dastard>
References: <CANGgnMYPLF+8616Rs9eQOXUc9He2NSgFnNrvHvepV-x+pWS6oQ@xxxxxxxxxxxxxx> <20140305233551.GK6851@dastard> <CANGgnMb=2dYGQO4K36pQ9LEb8E4rT6S_VskLF+n=ndd0_kJr_g@xxxxxxxxxxxxxx> <CANGgnMa80WwQ8zSkL52yYegmQURVQeZiBFv41=FQXMZJ_NaEDw@xxxxxxxxxxxxxx> <20140513034647.GA5421@dastard> <CANGgnMZ0q9uE3NHj2i0SBK1d0vdKLx7QBJeFNb+YwP-5EAmejQ@xxxxxxxxxxxxxx> <20140513063943.GQ26353@dastard> <CANGgnMYn++1++UyX+D2d9GxPxtytpQJv0ThFwdxM-yX7xDWqiA@xxxxxxxxxxxxxx> <20140513090321.GR26353@dastard>
On Tue, May 13, 2014 at 2:03 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Tue, May 13, 2014 at 12:02:18AM -0700, Austin Schuh wrote:
>> On Mon, May 12, 2014 at 11:39 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> > On Mon, May 12, 2014 at 09:03:48PM -0700, Austin Schuh wrote:
>> >> On Mon, May 12, 2014 at 8:46 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> >> > On Mon, May 12, 2014 at 06:29:28PM -0700, Austin Schuh wrote:
>> >> >> On Wed, Mar 5, 2014 at 4:53 PM, Austin Schuh <austin@xxxxxxxxxxxxxxxx> wrote:
>> >> >> > Hi Dave,
>> >> >> >
>> >> >> > On Wed, Mar 5, 2014 at 3:35 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> >> >> >> On Wed, Mar 05, 2014 at 03:08:16PM -0800, Austin Schuh wrote:
>> >> >> >>> Howdy,
>> >> >> >>>
>> >> >> >>> I'm running a config_preempt_rt patched version of the 3.10.11 kernel,
>> >> >> >>> and I'm seeing a couple lockups and crashes which I think are related
>> >> >> >>> to XFS.
>> >> >> >>
>> >> >> >> I think they are more likely related to RT issues....
>> >> >> >>
>> >> >> >
>> >> >> > That very well may be true.
>> >> >> >
>> >> >> >> Cheers,
>> >> >> >>
>> >> >> >> Dave.
>> >> >> >> --
>> >> >> >> Dave Chinner
>> >> >>
>> >> >> I had the issue reproduce itself today with just the main SSD
>> >> >> installed.  This was on a new machine that was built this morning.
>> >> >> There is a lot less going on in this trace than the previous one.
>> >> >
>> >> > The three blocked threads:
>> >> >
>> >> >         1. kworker running IO completion waiting on an inode lock,
>> >> >            holding locked pages.
>> >> >         2. kworker running writeback flusher work waiting for a
>> >> >            page lock
>> >> >         3. direct flush work waiting for allocation, holding page
>> >> >            locks and the inode lock.
>> >> >
>> >> > What's the kworker thread running the allocation work doing?
>> >> >
>> >> > You might need to run `echo w > /proc/sysrq-trigger` to get this
>> >> > information...
>> >>
>> >> I was able to reproduce the lockup.  I ran `echo w >
>> >> /proc/sysrq-trigger` per your suggestion.  I don't know how to figure
>> >> out what the kworker thread is doing, but I'll happily do it if you
>> >> can give me some guidance.
>> >
>> > There isn't a worker thread blocked doing an allocation in that
>> > dump, so it doesn't shed any light on the problem at all. Try
>> > `echo l > /proc/sysrq-trigger`, followed by `echo t >
>> > /proc/sysrq-trigger` so we can see all the processes running on CPUs
>> > and all the processes in the system...
>> >
>> > Cheers,
>> >
>> > Dave.
>>
>> Attached is the output of the two commands you asked for.
>
> Nothing there. There's lots of processes waiting for allocation to
> run, and no kworkers running allocation work. This looks more
> like a rt-kernel workqueue issue, not an XFS problem.
>
> FWIW, it would be really helpful if you compiled your kernels with
> frame pointers enabled - the stack traces are much more precise and
> readable (i.e. it gets rid of all the false/stale entries) and that
> helps immensely in understanding where things are stuck.
>
> Cheers,
>
> Dave.

Thanks Dave.

I'll go check with the rt-kernel guys and take it from there.  Thanks
for the frame pointers suggestion.  I'll make that change the next
time I build a kernel.
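
For anyone following along, here is a minimal sketch of how one might check whether a given kernel config already has frame pointers enabled. `CONFIG_FRAME_POINTER` is the kernel's Kconfig symbol; the function name `has_frame_pointers` and the config file path in the usage line are assumptions for illustration, not something from this thread.

```shell
# Hypothetical helper: succeed if the kernel config file given as $1
# enables frame pointers (CONFIG_FRAME_POINTER=y), fail otherwise.
has_frame_pointers() {
    grep -q '^CONFIG_FRAME_POINTER=y$' "$1"
}

# Usage sketch. The path is an assumption: many distributions install the
# running kernel's config as /boot/config-<release>, but yours may differ.
# has_frame_pointers "/boot/config-$(uname -r)" && echo "frame pointers enabled"
```

If the check fails, the option would need to be turned on in the kernel configuration before rebuilding.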

Austin
