Inode lockdep problem observed on 2.6.37.6 xfs with RT subvolume
kdasu
kdasu.kdev at gmail.com
Thu Feb 9 21:27:11 CST 2012
Christoph,

I would like to share an update, with more data, on the issue I reported with the RT subvolume.

I have backported the following three patches to 2.6.37:
> > xfs: only lock the rt bitmap inode once per allocation
> > xfs: fix xfs_get_extsz_hint for a zero extent size hint
> > xfs: add lockdep annotations for the rt inodes
With the patches the situation is slightly better; however, there appears to be a recursive deadlock as part of xfs_fs_evict_inode when multiple extents are associated with the same inode.

This is a stack trace taken during a mount after a reboot, while the log is replayed; exactly the same path fails and deadlocks when the evict operation is attempted before a reboot. xfs_ilock(ip, XFS_ILOCK_EXCL) is acquired twice by the same task, producing the deadlock:
#0  xfs_ilock (ip=0xcf879980, lock_flags=33554436) at fs/xfs/xfs_iget.c:498
#1  0x801ee674 in xfs_iget_cache_hit (mp=0xcf640400, tp=0xcf0c0e58, ino=<value optimized out>, flags=0, lock_flags=33554436, ipp=0xcf60f950) at fs/xfs/xfs_iget.c:238
#2  xfs_iget (mp=0xcf640400, tp=0xcf0c0e58, ino=<value optimized out>, flags=0, lock_flags=33554436, ipp=0xcf60f950) at fs/xfs/xfs_iget.c:391
#3  0x80215b50 in xfs_trans_iget (mp=<value optimized out>, tp=0xcf0c0e58, ino=<value optimized out>, flags=0, lock_flags=33554436, ipp=0xcf60f950) at fs/xfs/xfs_trans_inode.c:60
#4  0x801a7044 in xfs_rtfree_extent (tp=0xcf0c0e58, bno=<value optimized out>, len=9) at fs/xfs/xfs_rtalloc.c:2166
#5  0x801c05d0 in xfs_bmap_del_extent (ip=0xcf879380, tp=<value optimized out>, idx=0, flist=0xcf60fbb0, cur=0x0, del=0xcf60fad0, logflagsp=0xcf60fac0, whichfork=0, rsvd=0) at fs/xfs/xfs_bmap.c:2892
#6  0x801c5460 in xfs_bunmapi (tp=0xcf0c0e58, ip=0xcf879380, bno=2303, len=4294967297, flags=0, nexts=2, firstblock=0xcf60fba8, flist=0xcf60fbb0, done=0xcf60fba0) at fs/xfs/xfs_bmap.c:5256
#7  0x801f0a88 in xfs_itruncate_finish (tp=0xcf60fc14, ip=0xcf879380, new_size=<value optimized out>, fork=0, sync=1) at fs/xfs/xfs_inode.c:1585
#8  0x80218428 in xfs_inactive (ip=0xcf879380) at fs/xfs/xfs_vnodeops.c:1102
#9  0x800e2be4 in evict (inode=0xcf8794c0) at fs/inode.c:450
#10 0x800e3300 in iput_final (inode=0xcf8794c0) at fs/inode.c:1401
#11 iput (inode=0xcf8794c0) at fs/inode.c:1423
#12 0x80208740 in xlog_recover_process_one_iunlink (mp=0xcf640400, agno=<value optimized out>, agino=<value optimized out>, bucket=29) at fs/xfs/xfs_log_recover.c:3212
#13 0x8020884c in xlog_recover_process_iunlinks (log=<value optimized out>) at fs/xfs/xfs_log_recover.c:3289
#14 0x80209928 in xlog_recover_finish (log=0xcf638000) at fs/xfs/xfs_log_recover.c:3926
#15 0x8020de74 in xfs_mountfs (mp=0xcf640400) at fs/xfs/xfs_mount.c:1386
#16 0x8022d228 in xfs_fs_fill_super (sb=0xcf5ff400, data=<value optimized out>, silent=<value optimized out>) at fs/xfs/linux-2.6/xfs_super.c:1539
#17 0x800cbe68 in mount_bdev (fs_type=<value optimized out>, flags=32768, dev_name=<value optimized out>, data=0xcfc52000, fill_super=0x8022d04c <xfs_fs_fill_super>) at fs/super.c:820
#18 0x8022a6a4 in xfs_fs_mount (fs_type=<value optimized out>, flags=<value optimized out>, dev_name=<value optimized out>, data=<value optimized out>) at fs/xfs/linux-2.6/xfs_super.c:1616
#19 0x800ca6e0 in vfs_kern_mount (type=0x80597e10, flags=<value optimized out>, name=<value optimized out>, data=<value optimized out>) at fs/super.c:986
#20 0x800ca888 in do_kern_mount (fstype=0xcff42580 "xfs", flags=<value optimized out>, name=<value optimized out>, data=<value optimized out>) at fs/super.c:1155
#21 0x800e9f08 in do_new_mount (dev_name=0xcf600100 "/dev/sda2", dir_name=<value optimized out>, type_page=0xcff42580 "xfs", flags=32768, data_page=0xcfc52000) at fs/namespace.c:1746
#22 do_mount (dev_name=0xcf600100 "/dev/sda2", dir_name=<value optimized out>, type_page=0xcff42580 "xfs", flags=32768, data_page=0xcfc52000) at fs/namespace.c:2066
#23 0x800ea9d0 in sys_mount (dev_name=0x46e5d4 "/dev/sda2", dir_name=<value optimized out>, type=<value optimized out>, flags=33792, data=0x4700b0) at fs/namespace.c:2210
#24 0x800117bc in handle_sys () at arch/mips/kernel/scall32-o32.S:59
#25 0x0041ff1c in ?? ()
warning: GDB can't find the start of the function at 0x41ff1b.
The code deadlocks here:

xfs_iget.c
515             if (lock_flags & XFS_ILOCK_EXCL)
516                     mrupdate_nested(&ip->i_lock,

On 2.6.37, xfs_iget_cache_hit() tries to take the lock again during the evict. I fixed the locking by detecting whether the inode is already locked and already part of the transaction tp, and in that case skipping the lock and the xfs_trans_ijoin() call. I can post the patch, but I would first like to know whether this deadlock makes sense to you.
I suspect the same occurs with 2.6.39 as well: although xfs_trans_iget() was replaced with xfs_ilock(), the deadlock can still happen in xfs_rtfree_extent().
Code on 2.6.37:

xfs_rt
int xfs_rtfree_extent() {
	...
	/*
	 * Synchronize by locking the bitmap inode.
	 */
	error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
			XFS_ILOCK_EXCL | XFS_ILOCK_RTBITMAP, &ip);
	...
}
Code on 2.6.39:

int xfs_rtfree_extent() {
	...
	/*
	 * Synchronize by locking the bitmap inode.
	 */
	/* called from the upstream calling function's while loop */
	xfs_ilock(mp->m_rbmip, XFS_ILOCK_EXCL);
	xfs_trans_ijoin_ref(tp, mp->m_rbmip, XFS_ILOCK_EXCL);
	...
}
Kamal
Christoph Hellwig wrote:
>
> On Thu, Feb 02, 2012 at 11:26:28AM -0500, Kamal Dasu wrote:
>> > xfs: only lock the rt bitmap inode once per allocation
>> > xfs: fix xfs_get_extsz_hint for a zero extent size hint
>> > xfs: add lockdep annotations for the rt inodes
>> >
>> > But in general the RT subvolume code is not regularly tested and only
>> > fixed when issues arise.
>>
>>
>> Thanks for the quick reply and for clarifying this. If upgrading the
>> kernel is not an option, should I be considering backporting changes to
>> 2.6.37? Should I use the entire 2.6.39 or 3.0 xfs implementation as is,
>> or cherry-pick the above three changes?
>
> I don't remember if we have other changes in that area. If backporting
> the changes is easy enough, go for it, if not stick to your original
> workaround. Either way make sure you don't introduce other regressions
> by running xfstests.
>