Re: Storage server, hung tasks and tracebacks

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: Storage server, hung tasks and tracebacks
From: Brian Candler <B.Candler@xxxxxxxxx>
Date: Sun, 9 Sep 2012 10:47:56 +0100
Cc: Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
In-reply-to: <20120521095830.GC8283@xxxxxxxx>
References: <20120502184450.GA2557@xxxxxxxx> <4FA27EF8.6040002@xxxxxxxxxxxxxxxxx> <20120503204157.GC4387@xxxxxxxx> <4FA3047D.8060908@xxxxxxxxxxxxxxxxx> <20120504163237.GA6128@xxxxxxxx> <4FA4C321.2070105@xxxxxxxxxxxxxxxxx> <20120515140237.GA3630@xxxxxxxx> <20120520235903.GB25351@dastard> <20120521095830.GC8283@xxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, May 21, 2012 at 10:58:30AM +0100, Brian Candler wrote:
> On Mon, May 21, 2012 at 09:59:03AM +1000, Dave Chinner wrote:
> > You need to provide the output of sysrq-W at this point ('echo w >
> > /proc/sysrq-trigger') so we can see where these are hung. the entire
> > dmesg would also be useful....
> Thank you for this advice Dave.
> Attached is the full dmesg output after another hang. The sysrq output is
> near the end, at timestamp 250695.

It has recently been pointed out to me that the original attachment was
incomplete and didn't include the sysrq output.  Attached is the dmesg file
*with* the sysrq data at the given timestamp.

Unfortunately, I have been trying to reproduce this problem on an identical
box, and over 5 days the problem didn't recur.  (Aside: that was until one
of the RAID0 drives failed, which is turning out to be a common occurrence
with Seagates; but that's not the failure I was seeing before, which I used
to be able to replicate in less than an hour.  In this case XFS shut down
gracefully in the face of I/O errors.)

This isn't one of the three actual boxes which had locked up before, but
it's another box with an identical chassis and motherboard, bought from the
same supplier.  This one, however, has an up-to-date Ubuntu 12.04 kernel on
it, so it's possible there has been some driver fix since my original tests.

I will continue to investigate.

Attachment: storage3-sysreq.txt.gz
Description: application/gunzip
