XFS crash?
Austin Schuh
austin at peloton-tech.com
Mon May 12 23:03:48 CDT 2014
On Mon, May 12, 2014 at 8:46 PM, Dave Chinner <david at fromorbit.com> wrote:
> On Mon, May 12, 2014 at 06:29:28PM -0700, Austin Schuh wrote:
>> On Wed, Mar 5, 2014 at 4:53 PM, Austin Schuh <austin at peloton-tech.com> wrote:
>> > Hi Dave,
>> >
>> > On Wed, Mar 5, 2014 at 3:35 PM, Dave Chinner <david at fromorbit.com> wrote:
>> >> On Wed, Mar 05, 2014 at 03:08:16PM -0800, Austin Schuh wrote:
>> >>> Howdy,
>> >>>
>> >>> I'm running a CONFIG_PREEMPT_RT-patched version of the 3.10.11 kernel,
>> >>> and I'm seeing a couple of lockups and crashes which I think are related
>> >>> to XFS.
>> >>
>> >> I think they are more likely related to RT issues....
>> >>
>> >
>> > That very well may be true.
>> >
>> >> Your USB device has disconnected and gone down the device
>> >> removal/invalidate partition route, and it's trying to flush the
>> >> device, which is stuck on IO completion which is stuck waiting for
>> >> the device error handling to error them out.
>> >>
>> >> So, this is a block device error handling problem caused by
>> >> device unplug getting stuck because it's decided to ask the
>> >> filesystem to complete operations that can't be completed until the
>> >> device error handling progresses far enough to error out the IOs that
>> >> the filesystem is waiting for completion on.
>> >>
>> >> Cheers,
>> >>
>> >> Dave.
>> >> --
>> >> Dave Chinner
>> >> david at fromorbit.com
>>
>> I had the issue reproduce itself today with just the main SSD
>> installed. This was on a new machine that was built this morning.
>> There is a lot less going on in this trace than the previous one.
>
> The three blocked threads:
>
> 1. kworker running IO completion waiting on an inode lock,
> holding locked pages.
> 2. kworker running writeback flusher work waiting for a page lock
> 3. direct flush work waiting for allocation, holding page
> locks and the inode lock.
>
> What's the kworker thread running the allocation work doing?
>
> You might need to run `echo w > /proc/sysrq-trigger` to get this
> information...
I was able to reproduce the lockup. I ran
`echo w > /proc/sysrq-trigger` per your suggestion. I don't know how to
figure out what the kworker thread is doing, but I'll happily do it if
you can give me some guidance.
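
Besides the sysrq output, one possible way to see what a blocked
kworker is doing is to read its kernel stack from /proc/<pid>/stack.
Below is a minimal sketch, not a tested diagnostic tool: it assumes it
is run as root on a kernel built with CONFIG_STACKTRACE, and the names
`blocked_tasks` and `proc_root` are made up for illustration. It only
scans the top-level /proc/<pid> entries, which covers kernel threads
such as kworkers but not the individual threads of a user process.

```python
import os


def blocked_tasks(proc_root="/proc"):
    """Return (pid, name, stack) for each task in uninterruptible sleep (state D).

    proc_root is a hypothetical parameter so the function can be pointed at
    a copy of /proc; on a live system the default is what you want.
    """
    results = []
    for entry in sorted(os.listdir(proc_root)):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        task_dir = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(task_dir, "stat")) as f:
                stat = f.read()
        except OSError:
            continue  # the task exited while we were scanning
        # stat looks like "pid (comm) state ..."; comm may itself contain
        # spaces or parentheses, so locate the outermost pair.
        lparen = stat.find("(")
        rparen = stat.rfind(")")
        if lparen == -1 or rparen == -1:
            continue
        name = stat[lparen + 1:rparen]
        fields = stat[rparen + 1:].split()
        if not fields or fields[0] != "D":
            continue  # only interested in blocked (D state) tasks
        try:
            with open(os.path.join(task_dir, "stack")) as f:
                stack = f.read()
        except OSError:
            # Assumption: reading the stack needs root and CONFIG_STACKTRACE.
            stack = "(stack unavailable)\n"
        results.append((int(entry), name, stack))
    return results


if __name__ == "__main__":
    for pid, name, stack in blocked_tasks():
        print(f"{pid} {name}")
        print(stack)
```

Run while the machine is locked up, this should print every D-state
task with its kernel stack, so the kworker running the allocation work
can be picked out by name and by the functions in its trace.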
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmesg
Type: application/octet-stream
Size: 33299 bytes
Desc: not available
URL: <http://oss.sgi.com/pipermail/xfs/attachments/20140512/d60c30be/attachment-0001.obj>