Re: 3.9-rc2 xfs panic

To: CAI Qian <caiqian@xxxxxxxxxx>
Subject: Re: 3.9-rc2 xfs panic
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Wed, 13 Mar 2013 15:43:07 +1100
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <1063371063.191362.1363142676803.JavaMail.root@xxxxxxxxxx>
References: <20130312074608.GL21651@dastard> <1063371063.191362.1363142676803.JavaMail.root@xxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Tue, Mar 12, 2013 at 10:44:36PM -0400, CAI Qian wrote:
> Eek, got another NULL pointer on an x64 system also. Looks like from
> xfstests case 110. Same user-space version as the one in the ppc64
> case. Still trying to reproduce and without more debugging options
> enabled if possible.
> 
> Swap Size               = 7983 MB
> Mem Size                = 7852 MB
> Number of Processors    = 16
> 
> meta-data=/dev/loop0             isize=256    agcount=4, agsize=655360 blks
>          =                       sectsz=512   attr=2, projid32bit=0
> data     =                       bsize=2048   blocks=2621440, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal log           bsize=2048   blocks=5120, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> 
> CAI Qian
> 
> [30706.240701] XFS (loop1): xfs_trans_ail_delete_bulk: attempting to delete a 
> log item that is not in the AIL 

What happens prior to this message? This is the first indication of
a problem.

> [30706.242124] XFS (loop1): xfs_do_force_shutdown(0x2) called from line 743 
> of file fs/xfs/xfs_trans_ail.c.  Return address = 0xffffffffa03c03ef 
> [30706.245280] XFS (loop1): Log I/O Error Detected.  Shutting down filesystem 
> [30706.246311] XFS (loop1): Please umount the filesystem and rectify the 
> problem(s) 
> [30707.279880] XFS (loop0): Mounting Filesystem 
> [30707.290512] XFS (loop0): Ending clean mount 
> [30708.966751] XFS (loop1): xfs_log_force: error 5 returned. 
> [30708.977075] XFS (loop1): xfs_log_force: error 5 returned. 
> [30708.978074] BUG: unable to handle kernel NULL pointer dereference at 
> 0000000000000230 
> [30708.979629] IP: [<ffffffffa03655e7>] xfs_bdstrat_cb+0x27/0xd0 [xfs] 

And that indicates that the buftarg attached to the buffer has a
NULL xfs_mount pointer, so it's probably related to the above issue.
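
To illustrate why a small fault address such as 0x230 points at a NULL
base pointer, here is a minimal sketch. The struct names, the padding,
and the field m_field are hypothetical; they are picked only so the
member lands at offset 0x230 and do not reflect the real struct
xfs_mount or xfs_buftarg layout. The idea is simply that reading a
member through a NULL struct pointer faults at the member's offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a large kernel structure. The padding size is
 * chosen only so that m_field lands at offset 0x230; it does not reflect
 * the real struct xfs_mount layout. */
struct demo_mount {
    char pad[0x230];
    int  m_field;
};

/* Hypothetical stand-in for a buffer target that points back at its mount. */
struct demo_buftarg {
    struct demo_mount *bt_mount;
};

/* Address the CPU would try to read for target->bt_mount->m_field,
 * computed arithmetically so nothing is actually dereferenced. */
static uintptr_t predicted_fault_address(const struct demo_buftarg *target)
{
    return (uintptr_t)target->bt_mount + offsetof(struct demo_mount, m_field);
}
```

With bt_mount set to NULL, predicted_fault_address() returns 0x230 on
platforms where a null pointer converts to integer 0, which matches the
shape of the address reported in the oops.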

As it is, none of my machines see this problem, so I'm wondering if
this is related to the way you are using loop devices. Can you
reproduce it on a different type of storage device (like LVM or
physical disk partitions)?

Also, can you turn on CONFIG_XFS_DEBUG and all the memory
leak/poisoning checks and see if that catches anything?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
