
2.6.23 kdb in xfs_bmbt_get_block with unwritten extents

To: <xfs@xxxxxxxxxxx>
Subject: 2.6.23 kdb in xfs_bmbt_get_block with unwritten extents
From: "Richard Troxell (rtroxell)" <rtroxell@xxxxxxxxx>
Date: Thu, 21 Jan 2010 17:41:48 -0800
Cc: "Richard Troxell (rtroxell)" <rtroxell@xxxxxxxxx>

Hello All,

I am randomly dropping into kdb when creating preallocated files that are excessively 'holey' (e.g., a 500MB+ file with alternating 4K written and 4K unwritten extents). Creating such files is not my intention, and it is being addressed in the userspace writer. That said, I am still concerned about the crash itself.
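For anyone who wants to picture the file layout, here is a rough sketch of a writer that produces it. This is not the actual userspace writer and not a confirmed reproducer: the function names are made up for illustration, and posix_fallocate() is an assumption standing in for whatever preallocation call the real writer uses. On a filesystem without native preallocation support, posix_fallocate() may simply write zeros, in which case no unwritten extents are created at all.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: number of blocks written when every even-numbered
 * block is written and every odd-numbered block is left untouched. */
static long written_block_count(off_t total, off_t blk)
{
    off_t nblocks = total / blk;
    return (long)((nblocks + 1) / 2);
}

/* Hypothetical helper: preallocate `total` bytes, then write every other
 * `blk`-sized block so written and preallocated-only regions alternate.
 * Returns the number of blocks written, or -1 on error. */
static long make_holey_prealloc(int fd, off_t total, off_t blk)
{
    char buf[4096];
    long written = 0;

    if (blk <= 0 || blk > (off_t)sizeof(buf))
        return -1;
    memset(buf, 0xab, sizeof(buf));

    /* Assumption: generic POSIX preallocation, not an XFS-specific call. */
    int err = posix_fallocate(fd, 0, total);
    if (err != 0) {
        fprintf(stderr, "posix_fallocate: %s\n", strerror(err));
        return -1;
    }

    /* Write blocks 0, 2, 4, ...; blocks 1, 3, 5, ... stay unwritten. */
    for (off_t off = 0; off + blk <= total; off += 2 * blk) {
        if (pwrite(fd, buf, (size_t)blk, off) != (ssize_t)blk)
            return -1;
        written++;
    }
    return written;
}
```

For a 500MB file with 4K blocks this works out to 128000 blocks, 64000 of them written, so if each block really ends up as its own extent, the extent list is far too large to stay outside the B+tree format.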

I am currently running 2.6.23.9 and have dug through the changelogs, but can't seem to find a matching fix. Also, 2.6.24 seems to have a massive rewrite in this area, which significantly limits the scope that I can search.

The cause of the crash is a straightforward NULL dereference in xfs_bmap_btree.c:xfs_bmbt_get_block(), but I suspect the root cause is going to be some complex condition that corrupts the cursor...

if (level < cur->bc_nlevels - 1) {
        *bpp = cur->bc_bufs[level];           <----- cur->bc_bufs[level] == NULL
        rval = XFS_BUF_TO_BMBT_BLOCK(*bpp);   <----- BAM! NULL dereferenced
}

Scanning the source, I see numerous instances of this same unchecked dereference from bc_bufs, but so far I have only hit this one condition.
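To make the failure mode concrete, here is a small userspace mock of the pattern with a defensive check added. It is only a sketch: mock_cursor, mock_buf, mock_get_block() and MAX_LEVELS are invented stand-ins, not the kernel structures, and only the branch quoted above is modeled.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVELS 8  /* hypothetical bound, not the kernel's value */

/* Stand-in for a buffer; `data` plays the role of the btree block. */
struct mock_buf {
    void *data;
};

/* Stand-in for the cursor fields used in the quoted snippet. */
struct mock_cursor {
    int nlevels;                        /* mirrors cur->bc_nlevels */
    struct mock_buf *bufs[MAX_LEVELS];  /* mirrors cur->bc_bufs[]  */
};

/* Returns the block at `level`, or NULL if the cursor holds no buffer
 * there. The unchecked version would dereference *bpp unconditionally. */
static void *mock_get_block(struct mock_cursor *cur, int level,
                            struct mock_buf **bpp)
{
    assert(cur != NULL && bpp != NULL);

    *bpp = NULL;
    /* Only the "level < nlevels - 1" branch from the post is modeled. */
    if (level < 0 || level >= MAX_LEVELS || level >= cur->nlevels - 1)
        return NULL;

    *bpp = cur->bufs[level];
    if (*bpp == NULL)
        return NULL;  /* the state seen at the time of the crash */

    return (*bpp)->data;
}
```

A check like this would only turn the crash into an error the caller has to handle. It would not explain how the cursor ended up with a NULL buffer at a level below the root in the first place.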

Here is the call trace...

 [<ffffffff8034fec0>] xfs_bmbt_increment+0xb0/0x2c0
 [<ffffffff80346c4b>] xfs_bmap_add_extent_unwritten_real+0x5eb/0xd50
 [<ffffffff80349c72>] xfs_bmap_add_extent+0x152/0x480
 [<ffffffff8038f8d2>] kmem_zone_zalloc+0x32/0x50
 [<ffffffff8034cd40>] xfs_bmapi+0xbe0/0x11f0
 [<ffffffff806ea404>] _spin_unlock+0x14/0x40
 [<ffffffff806ea4cd>] _spin_lock+0x1d/0x90
 [<ffffffff80375e63>] xfs_log_reserve+0xa3/0x100
 [<ffffffff806ea975>] _spin_unlock_irq+0x15/0x40
 [<ffffffff806e9fa6>] __down_write_nested+0x96/0xa0
 [<ffffffff80381ec9>] xfs_trans_reserve+0xa9/0x1f0
 [<ffffffff8037274a>] xfs_iomap_write_unwritten+0x14a/0x230
 [<ffffffff8037166e>] xfs_iomap+0x2fe/0x390
 [<ffffffff806ea3c6>] __lock_text_start+0x16/0x40
 [<ffffffff8038fb20>] xfs_end_bio_unwritten+0x0/0x50
 [<ffffffff8038fb51>] xfs_end_bio_unwritten+0x31/0x50
 [<ffffffff802429a3>] run_workqueue+0x73/0x130
 [<ffffffff80242afc>] worker_thread+0x9c/0xf0
 [<ffffffff80246cd0>] autoremove_wake_function+0x0/0x30
 [<ffffffff80246cd0>] autoremove_wake_function+0x0/0x30
 [<ffffffff80242a60>] worker_thread+0x0/0xf0
 [<ffffffff8024660c>] kthread+0x6c/0xa0
 [<ffffffff8020c9a8>] child_rip+0xa/0x12
 [<ffffffff802465a0>] kthread+0x0/0xa0
 [<ffffffff8020c99e>] child_rip+0x0/0x12

Given the trace, I assume that I can avoid the crash by avoiding all B+tree-managed unwritten extents. However, avoiding such files completely seems a bit unrealistic, as I need to store files with a reasonable number of holes...

Thanks,
Richard
