Re: xfslogd-spinlock bug?

To: Haar János <djani22@xxxxxxxxxxxx>
Subject: Re: xfslogd-spinlock bug?
From: David Chinner <dgc@xxxxxxx>
Date: Mon, 18 Dec 2006 09:44:57 +1100
Cc: linux-xfs@xxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
In-reply-to: <000d01c72127$3d7509b0$0400a8c0@dcccs>
References: <003701c71d78$33ed28d0$0400a8c0@dcccs> <Pine.LNX.4.64.0612120932220.19050@p34.internal.lan> <00ab01c71e53$942af2f0$0400a8c0@dcccs> <000d01c72127$3d7509b0$0400a8c0@dcccs>
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.4.2.1i

On Sat, Dec 16, 2006 at 12:19:45PM +0100, Haar János wrote:
> Hi
> 
> I have some news.
> 
> I dont know there is a context between 2 messages, but i can see, the
> spinlock bug comes always on cpu #3.
> 
> Somebody have any idea?

Your disk interrupts are directed to CPU 3, and so log I/O completion
occurs on that CPU.

> Dec 16 12:08:36 dy-base BUG: spinlock bad magic on CPU#3, xfslogd/3/317
> Dec 16 12:08:36 dy-base general protection fault: 0000 [1]
> Dec 16 12:08:36 dy-base SMP
> Dec 16 12:08:36 dy-base
> Dec 16 12:08:36 dy-base CPU 3
> Dec 16 12:08:36 dy-base
> Dec 16 12:08:36 dy-base Modules linked in:
> Dec 16 12:08:36 dy-base  nbd

Are you using XFS on a NBD?

> Dec 16 12:08:36 dy-base  rd
> Dec 16 12:08:36 dy-base  netconsole
> Dec 16 12:08:36 dy-base  e1000
> Dec 16 12:08:36 dy-base  video
> Dec 16 12:08:36 dy-base
> Dec 16 12:08:36 dy-base Pid: 317, comm: xfslogd/3 Not tainted 2.6.19 #1
> Dec 16 12:08:36 dy-base RIP: 0010:[<ffffffff803f3aba>]
> Dec 16 12:08:36 dy-base  [<ffffffff803f3aba>] spin_bug+0x69/0xdf
> Dec 16 12:08:36 dy-base RSP: 0018:ffff81011fdedbc0  EFLAGS: 00010002
> Dec 16 12:08:36 dy-base RAX: 0000000000000033 RBX: 6b6b6b6b6b6b6b6b RCX:
                                                     ^^^^^^^^^^^^^^^^
Anyone recognise that pattern?

> Dec 16 12:08:36 dy-base Call Trace:
> Dec 16 12:08:36 dy-base  [<ffffffff803f3bdc>] _raw_spin_lock+0x23/0xf1
> Dec 16 12:08:36 dy-base  [<ffffffff805e7f2b>] _spin_lock_irqsave+0x11/0x18
> Dec 16 12:08:36 dy-base  [<ffffffff80222aab>] __wake_up+0x22/0x50
> Dec 16 12:08:36 dy-base  [<ffffffff803c97f9>] xfs_buf_unpin+0x21/0x23
> Dec 16 12:08:36 dy-base  [<ffffffff803970a4>] xfs_buf_item_unpin+0x2e/0xa6

This implies a spinlock inside a wait_queue_head_t is corrupt.
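
For reference on the RBX value above: 0x6b is the byte the kernel's slab
debugging code writes over freed objects (POISON_FREE), so a register full
of 0x6b suggests the wait queue head being locked lives in memory that had
already been freed. The following is only a user-space sketch of that check;
the helper and macro names are hypothetical and not taken from the kernel
source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte written over freed slab objects when slab poisoning is enabled. */
#define FREE_POISON_BYTE 0x6bu

/* Hypothetical helper: true when every byte of a 64-bit value is 0x6b. */
bool looks_like_freed_slab(uint64_t value)
{
    for (int i = 0; i < 8; i++) {
        if (((value >> (8 * i)) & 0xffu) != FREE_POISON_BYTE)
            return false;
    }
    return true;
}

/* Check the two register values visible in the oops above. */
void check_oops_registers(void)
{
    assert(looks_like_freed_slab(0x6b6b6b6b6b6b6b6bULL));  /* RBX */
    assert(!looks_like_freed_slab(0x0000000000000033ULL)); /* RAX */
}
```

A match on a whole register is a strong hint rather than proof; the usual
next step is to find out which object was freed while still in use.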

What type of system do you have, and what sort of
workload are you running?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

