Re: 3.9.2: xfstests triggered panic

To: CAI Qian <caiqian@xxxxxxxxxx>
Subject: Re: 3.9.2: xfstests triggered panic
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Wed, 22 May 2013 19:53:00 +1000
Cc: LKML <linux-kernel@xxxxxxxxxxxxxxx>, stable@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <1805266998.4499261.1369211998387.JavaMail.root@xxxxxxxxxx>
References: <40971621.4497871.1369211701112.JavaMail.root@xxxxxxxxxx> <1805266998.4499261.1369211998387.JavaMail.root@xxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, May 22, 2013 at 04:39:58AM -0400, CAI Qian wrote:
> Reproduced on almost all s390x guests by running xfstests.
> 
> 14634.396658 XFS (dm-1): Mounting Filesystem
> 14634.525522 XFS (dm-1): Ending clean mount
> 14640.413007  <000000000017c6d4> idle_balance+0x1a0/0x340
> 14640.413010  <000000000063303e> __schedule+0xa22/0xaf0
> 14640.428279  <0000000000630da6> schedule_timeout+0x186/0x2c0
> 14640.428289  <00000000001cf864> rcu_gp_kthread+0x1bc/0x298
> 14640.428300  <0000000000158c5a> kthread+0xe6/0xec
> 14640.428304  <0000000000634de6> kernel_thread_starter+0x6/0xc
> 14640.428308  <0000000000634de0> kernel_thread_starter+0x0/0xc
> 14640.428311 Last Breaking-Event-Address:
> 14640.428314  <000000000016bd76> walk_tg_tree_from+0x3a/0xf4
> 14640.428319  list_add corruption. next->prev should be prev (0000000000000918), but was (null). (next=(null)).

Where's XFS in this? walk_tg_tree_from() is part of the scheduler
code. This kind of implies a stack corruption....

> Sometimes, this pops up,
> [16907.275002] WARNING: at kernel/rcutree.c:1960 
> 
> or this,
> 15316.154171 XFS (dm-1): Mounting Filesystem
> 15316.255796 XFS (dm-1): Ending clean mount
> 15320.364246            00000000006367a2: e310b0080004        lg      %r1,8(%r11)
> 15320.364249            00000000006367a8: 41101010            la      %r1,16(%r1)
> 15320.364251            00000000006367ac: e33010000004        lg      %r3,0(%r1)
> 15320.364252 Call Trace:
> 15320.364252 Last Breaking-Event-Address:
> 15320.364253  <0000000000000000> Kernel stack overflow.
> 15320.364308 CPU: 0 Tainted: GF       W    3.9.2 #1
> 15320.364309 Process rhts-test-runne (pid: 625, task: 000000003dccc890, ksp:
> 0 

.... and there you go - a stack overflow. Your kernel stack size is
too small.

I'd suggest that you need 16k stacks on s390 - IIRC every function
call has a 128-byte stack frame, and there are call chains 70-80
functions deep in the storage stack...
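
As a rough check of the numbers above, the sketch below multiplies the
quoted frame size by the quoted call depths. It is a lower bound only: it
assumes every frame is exactly 128 bytes (the figure recalled above, not
verified against the s390 ABI) and ignores local variables and interrupt
usage. The 8 KiB comparison point is an assumption for illustration; the
report does not state the actual stack size in use.

```python
# Back-of-the-envelope stack budget; all inputs are estimates, not measurements.
FRAME_BYTES = 128          # per-call frame size as recalled in the message ("IIRC")
CHAIN_DEPTHS = (70, 80)    # call-chain depths mentioned in the message
STACK_SIZES_KIB = (8, 16)  # 8 KiB is an assumed comparison point; 16 KiB is the suggestion


def min_stack_bytes(depth, frame_bytes=FRAME_BYTES):
    """Lower bound on stack use: fixed-size frames only, no locals or interrupts."""
    return depth * frame_bytes


def fits(depth, stack_kib, frame_bytes=FRAME_BYTES):
    """True if the frame-only lower bound fits in a stack of stack_kib KiB."""
    return min_stack_bytes(depth, frame_bytes) <= stack_kib * 1024


if __name__ == "__main__":
    for depth in CHAIN_DEPTHS:
        for stack_kib in STACK_SIZES_KIB:
            verdict = "fits" if fits(depth, stack_kib) else "overflows"
            print(f"{depth} calls need >= {min_stack_bytes(depth)} bytes: "
                  f"{verdict} in {stack_kib} KiB")
```

Under these assumptions the frames alone come to 8960-10240 bytes, which
would not fit in an 8 KiB stack but leaves headroom in 16 KiB.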

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
