
Re: NULL mp->m_log in 2.6.31 xfs_log_move_tail

To: linux-xfs@xxxxxxxxxxx
Subject: Re: NULL mp->m_log in 2.6.31 xfs_log_move_tail
From: Ed Cashin <ecashin@xxxxxxxxxx>
Date: Mon, 09 Nov 2009 16:41:58 -0500
Cc: ecashin@xxxxxxxxxx
References: <87ws1z8mbf.fsf@xxxxxxxxxx> <20091109211620.GA22777@xxxxxxxxxxxxx>
Sender: news <news@xxxxxxxxxxxxx>
User-agent: Gnus/5.110002 (No Gnus v0.2) Emacs/21.3 (gnu/linux)

Christoph Hellwig <hch@xxxxxxxxxxxxx> writes:

> On Mon, Nov 09, 2009 at 03:39:32PM -0500, Ed Cashin wrote:
>> A colleague has seen oopses in 2.6.31 when an XFS filesystem is mounted
>> on an AoE target that becomes unresponsive and is marked as "down" by
>> the aoe driver.  Once the device is down, the aoe driver fails all
>> outstanding requests and then fails every new I/O request.
>> 
>> I looked at the trace (included below) and put in the following check:
>
> Given that you seem to be able to reproduce it, can you see if the patch
> below helps:
>
> Index: linux-2.6/fs/xfs/xfs_log.c
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_log.c   2009-11-09 22:09:08.858026060 +0100
> +++ linux-2.6/fs/xfs/xfs_log.c        2009-11-09 22:13:13.958255857 +0100
> @@ -1602,6 +1602,8 @@ xlog_dealloc_log(xlog_t *log)
>       xlog_in_core_t  *iclog, *next_iclog;
>       int             i;
>  
> +     xfs_flush_buftarg(log->l_mp->m_logdev_targp, 1);
> +
>       iclog = log->l_iclog;
>       for (i=0; i<log->l_iclog_bufs; i++) {
>               sv_destroy(&iclog->ic_force_wait);
>

Thanks.  I am not sure when we'll be able to try it, because the person
who discovered this issue is not currently available, but I'll try to
fit it in one way or another.  A lot of folks using AoE use XFS.

-- 
  Ed Cashin <ecashin@xxxxxxxxxx>
  http://www.coraid.com/
  http://noserose.net/e/
