Re: xfsdump INTERRUPT issue

To: "J. Ellis" <jellis@xxxxxxxx>
Subject: Re: xfsdump INTERRUPT issue
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Fri, 21 Dec 2012 14:20:26 +1100
Cc: Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>, "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>
In-reply-to: <CCF8A128.B0C13%jellis@xxxxxxxx>
References: <50D2A038.2040501@xxxxxxxxxxxxxxxxx> <CCF8A128.B0C13%jellis@xxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Thu, Dec 20, 2012 at 11:04:08AM -0500, J. Ellis wrote:
> on 12/20/12 12:2 AM, Stan Hoeppner at stan@xxxxxxxxxxxxxxxxx wrote:
> 
> > On 12/19/2012 4:12 PM, Jeffrey Ellis wrote:
> >> Dave, is there a way of piping dmesg to a file?
> > 
> > ~$ dmesg > /var/tmp/somefile.txt
> > 
> > You can write the file anywhere.  This path is an example.
> 
> Thanks, Stan. That saved me a good 10 min. of copying and pasting.
> 
> Ok, here's the output of dmesg after echoing /proc/sysrq-trigger:

.....
> [  935.496565] XFS (sda2): Mounting Filesystem
> [  935.566619] XFS (sda2): Starting recovery (logdev: internal)
> [  935.742295] XFS (sda2): Ending recovery (logdev: internal)
> [ 1014.810155] BUG: unable to handle kernel NULL pointer dereference at 00000070
> [ 1014.810163] IP: [<c1037a58>] __ticket_spin_lock+0x8/0x30
....
> [ 1014.810259] Call Trace:
> [ 1014.810265]  [<c15ca1cd>] _raw_spin_lock+0xd/0x10
> [ 1014.810289]  [<f94a0dbc>] _xfs_buf_find+0x6c/0x240 [xfs]
> [ 1014.810304]  [<f94a1062>] xfs_buf_get+0x32/0x190 [xfs]
> [ 1014.810319]  [<f94a1bf6>] xfs_buf_read+0x26/0xd0 [xfs]
> [ 1014.810340]  [<f94fa58f>] xfs_trans_read_buf+0x22f/0x380 [xfs]
> [ 1014.810361]  [<f9502145>] xfs_rtbuf_get+0xe5/0x110 [xfs]
> [ 1014.810379]  [<f94b6760>] ? kmem_zone_zalloc+0x30/0x40 [xfs]
> [ 1014.810400]  [<f94f2554>] ? xfs_trans_add_item+0x24/0x60 [xfs]
> [ 1014.810421]  [<f9502ea9>] xfs_rtcheck_range.constprop.3+0x59/0x360 [xfs]
> [ 1014.810441]  [<f9502145>] ? xfs_rtbuf_get+0xe5/0x110 [xfs]
> [ 1014.810462]  [<f9503bf7>] xfs_rtallocate_extent_block+0xd7/0x2d0 [xfs]
> [ 1014.810483]  [<f95028c7>] ? xfs_rtget_summary+0x87/0x120 [xfs]
> [ 1014.810504]  [<f9503ecc>] xfs_rtallocate_extent_size+0xdc/0x310 [xfs]

And therein lies the problem. The kernel is crashing while trying to
allocate an extent on the real-time device, so the IO issued by
xfs_restore never completes, and xfs_restore does not detect that
one of its threads has been terminated in this manner.

I posted a patch a couple of days ago that would fix this oops:

http://oss.sgi.com/pipermail/xfs/2012-December/023257.html

but that wouldn't solve your problem, I think, because the crash is
occurring when a block beyond the end of the data device is being
asked for. So something else has already gone wrong by this
stage....

Can you run xfs_repair on the new filesystem after a failure like
this has occurred and post the output?
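
Something along these lines should capture it. This is only a sketch:
I'm assuming the filesystem is /dev/sda2 as in your dmesg output, the
output path is just an example, and check_fs is a made-up helper name.
The -n option puts xfs_repair in no-modify mode, so it reports what it
finds without changing anything on disk.

```shell
# Hypothetical helper: check an XFS filesystem without modifying it
# and save everything xfs_repair prints to a file.
#   $1 = block device holding the filesystem (assumed: /dev/sda2)
#   $2 = file to write the report to (example path only)
check_fs() {
    # xfs_repair wants the filesystem unmounted; ignore the error
    # if it is not currently mounted.
    umount "$1" 2>/dev/null

    # -n: no-modify mode, report problems only.
    # Capture stdout and stderr together so nothing is lost.
    xfs_repair -n "$1" > "$2" 2>&1
}

# Usage, as root:
#   check_fs /dev/sda2 /var/tmp/xfs_repair.txt
```

Then post the contents of the output file to the list.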

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
