I'm looking at a machine where xfs_freeze -f is stuck in D state. The
freeze was run on a filesystem that already has three LVM snapshots and has
writes streamed to it over both NFS and Samba. I have some backtraces from
kdb (attached), but I'm not quite sure what to make of them. After
xfs_freeze -f got stuck, xfs_freeze -u was tried from the command line, and
it also got stuck.
The kernel is from the XFS CVS tree of Mar 19th, using LVM's VFS lock patch.
It has been patched to exempt certain processes from being stopped by
xfs_check_frozen, specifically kupdated and xfs_freeze. knfsd and many smbd
processes are stopped by xfs_check_frozen, which is fine.

kupdated seems to be stuck on a semaphore I don't recognize, in a call chain
descending from linvfs_pb_map():
kdb> btp 6
EBP EIP Function(args)
0xefef5bbc 0xc0110d5f schedule+0x33f (0xcf5f21a0, 0xcfdebc00, 0xc4ea6920,
0xcf5f221c, 0xefef5bd8)
kernel .text 0xc0100000 0xc0110a20 0xc0110d88
0xefef5be8 0xc0105cbd __down+0x61 (0xcf5f2214, 0xefef5e1c, 0x0)
kernel .text 0xc0100000 0xc0105c5c 0xc0105d08
0xefef5bfc 0xc0105e2b __down_failed+0xb (0xcf5f21a0, 0xefef5c18, 0xc01c44b6,
0xcf5f21a0, 0xcf5f21a0)
kernel .text 0xc0100000 0xc0105e20 0xc0105e34
0xefef5c08 0xc01d875b _text_lock_page_buf_locking+0x2d (0xc03d9420, 0x1,
0xd1c3f980, 0x8, 0x1)
kernel .text 0xc0100000 0xc01d872e 0xc01d8770
0xc01f9b5b generic_make_request+0x97 (0x1, 0xd1c3f980,
0xd1c3f980, 0x0, 0x0)
kernel .text 0xc0100000 0xc01f9ac4 0xc01f9be0
[rest of trace is in attachment]
xfs_freeze -f is in pagebuf_iorequest(). How can I tell what it is waiting
for?
kdb> btp 2533
EBP EIP Function(args)
0xc4d87b60 0xc0110d5f schedule+0x33f (0xcf5f21a0, 0xcfdebc00, 0x0,
0xcf5f2248, 0x0)
kernel .text 0xc0100000 0xc0110a20 0xc0110d88
0xc4d87b98 0xc01d5dfe pagebuf_iorequest+0xae (0xcf5f21a0)
kernel .text 0xc0100000 0xc01d5d50 0xc01d5e88
0xc4d87ba4 0xc01defaa xfsbdstrat+0x2a (0xcfdebc00, 0xcf5f21a0, 0xcfc2c820,
0xcfdebc00, 0xc4d86000)
kernel .text 0xc0100000 0xc01def80 0xc01defbc
0xc4d87bc4 0xc01c3fb6 xfs_unmountfs_writesb+0xc2 (0xcfdebc00, 0xcfdebc00,
0xc4d86000, 0xcf5a6c80, 0xcfdebc00)
kernel .text 0xc0100000 0xc01c3ef4 0xc01c401c
0xc4d87be4 0xc01b0c44 xfs_fs_freeze+0xbc (0xcfdebc00, 0xbffffcac,
0xcf5a6c80, 0xcf5a6da4, 0x400151ac)
kernel .text 0xc0100000 0xc01b0b88 0xc01b0c60
[rest of trace is in attachment]
The attempt to jog things loose with xfs_freeze -u failed. It seems to be
stuck on the same semaphore as kupdated.
kdb> btp 2814
EBP EIP Function(args)
0xe1641b50 0xc0110d5f schedule+0x33f (0xcf5f21a0, 0xcfdebc00, 0x0,
0xcf5f221c, 0xe1641b6c)
kernel .text 0xc0100000 0xc0110a20 0xc0110d88
0xe1641b7c 0xc0105cbd __down+0x61 (0xcf5f2214, 0x1000100, 0x0)
kernel .text 0xc0100000 0xc0105c5c 0xc0105d08
0xe1641b90 0xc0105e2b __down_failed+0xb (0xcf5f21a0, 0xe1641bac, 0xc01c44b6,
0xcf5f21a0, 0x100)
kernel .text 0xc0100000 0xc0105e20 0xc0105e34
0xe1641b9c 0xc01d875b _text_lock_page_buf_locking+0x2d (0xc476c560, 0x400f1000,
0x0, 0x400f1480, 0xed180be0)
kernel .text 0xc0100000 0xc01d872e 0xc01d8770
[rest of trace in attachment]
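
As a rough cross-check of the three traces above, a small script can pull out the first argument printed for each __down frame and group the PIDs by it. This is only a sketch. It assumes the first value kdb prints for __down is the semaphore address, which is not guaranteed because kdb prints raw stack words. The name waiters_by_semaphore and the SAMPLE excerpt are made up for illustration.

```python
import re
from collections import defaultdict

# Matches a kdb prompt line such as "kdb> btp 2814" and captures the PID.
BTP_RE = re.compile(r"^kdb>\s*btp\s+(\d+)")
# Matches a "__down+0x61 (0xcf5f2214, ..." frame and captures the first
# printed argument. "__down_failed+..." does not match this pattern.
DOWN_RE = re.compile(r"\b__down\+0x[0-9a-f]+\s*\((0x[0-9a-f]+)")

# Excerpt of the traces from this post, one __down frame per stuck process.
SAMPLE = """\
kdb> btp 6
0xefef5be8 0xc0105cbd __down+0x61 (0xcf5f2214, 0xefef5e1c, 0x0)
0xefef5bfc 0xc0105e2b __down_failed+0xb (0xcf5f21a0, 0xefef5c18, 0xc01c44b6,
kdb> btp 2533
0xc4d87b98 0xc01d5dfe pagebuf_iorequest+0xae (0xcf5f21a0)
kdb> btp 2814
0xe1641b7c 0xc0105cbd __down+0x61 (0xcf5f2214, 0x1000100, 0x0)
"""


def waiters_by_semaphore(text):
    """Group PIDs by the first argument printed for their __down frame.

    Assumption: that argument is the semaphore address.
    """
    groups = defaultdict(list)
    pid = None
    for line in text.splitlines():
        m = BTP_RE.match(line.strip())
        if m:
            pid = int(m.group(1))  # start of a new backtrace
            continue
        m = DOWN_RE.search(line)
        if m and pid is not None:
            groups[m.group(1)].append(pid)
    return dict(groups)


if __name__ == "__main__":
    for sema, pids in waiters_by_semaphore(SAMPLE).items():
        print(sema, pids)
```

On the excerpt above this groups PIDs 6 and 2814 under 0xcf5f2214, which fits the impression that kupdated and xfs_freeze -u are waiting on the same semaphore. PID 2533 (xfs_freeze -f) has no __down frame and so does not appear.
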
Any hints, suggestions or explanations would be very much appreciated. I
still have access to the machine at the moment.
Thank you,
Dale Stephenson
steph@xxxxxxxxxxxxxx
Attachment: deadlock.txt (text document)