I'm running Linux 2.4.17 with a version of XFS downloaded via CVS on Jan
30th. When I run the SPEC SFS NFS test against this kernel, nfsd stops
responding after a while. I captured the state of all of the system
processes via magic sysrq, and found 24 nfsd processes locked up in various
stages of the nfsd_lookup code:
- 20 of them were locked up in the fh_lock call before lookup_one_len in
nfsd_lookup().
- 2 processes were locked up in the _pagebuf_grab_lock call inside
_pagebuf_find_lockable_buffer().
- 2 processes were locked up in the pagebuf_iowait() call in
pagebuf_iostart().
Any ideas on what may be wrong, and how I can help debug and solve this
problem? I've attached the call traces for the locked up nfsd processes. I
can provide vmlinux and System.map for this kernel to help debugging.
Thanks,
Erik Habbinga
Hewlett Packard
nfsd processes locked up in nfsd_lookup()->fh_lock()
task: nfsd
c0105ab4: c0105a48 T __down
c0105c50: c0105c48 T __down_failed
c02c9bab: c02c6214 T stext_lock
c0168048: c0167f74 t nfsd3_proc_lookup
c015fa13: c015f940 t nfsd_dispatch
c02bd775: c02bd4e8 T svc_process
c015f80f: c015f618 t nfsd
c0105594: c010556c T kernel_thread
nfsd processes locked up in
_pagebuf_find_lockable_buffer()->_pagebuf_grab_lock()
task: nfsd
c0105ab4: c0105a48 T __down
c0105c50: c0105c48 T __down_failed
c02ca95c: c02c6214 T stext_lock
c01db8e6: c01db7e4 T _pagebuf_find_lockable_buffer
c01e5d92: c01e5d58 T kmem_zone_alloc
c01a7545: c01a7510 t xfs_dir2_block_lookup_int
c01a7545: c01a7510 t xfs_dir2_block_lookup_int
c01dba19: c01db9e4 T _pagebuf_get_lockable_buffer
c01d829f: c01d8268 T pagebuf_get
c01cc374: c01cc338 T xfs_trans_read_buf
c01b9078: c01b8f78 T xfs_itobp
c01ba0cc: c01ba05c T xfs_iread
c01b80c2: c01b7ebc T xfs_iget_core
c0147ebe: c0147df0 T icreate
c01b8488: c01b8404 T xfs_iget
c01cd61c: c01cd4f8 T xfs_dir_lookup_int
c01d1c07: c01d1b78 t xfs_lookup
c01df729: c01df6c4 T linvfs_lookup
c013df41: c013dea8 T lookup_hash
c013dfd7: c013df80 T lookup_one_len
c016263d: c0162370 T nfsd_lookup
c0168048: c0167f74 t nfsd3_proc_lookup
c015fa13: c015f940 t nfsd_dispatch
c02bd775: c02bd4e8 T svc_process
c015f80f: c015f618 t nfsd
c0105594: c010556c T kernel_thread
0xc02ca95c <stext_lock+18248>: jmp 0xc01db7e1 <_pagebuf_grab_lock+17>
nfsd processes locked up in pagebuf_iostart()->pagebuf_iowait()
task: nfsd
c0105ab4: c0105a48 T __down
c0105c50: c0105c48 T __down_failed
c02ca8b3: c02c6214 T stext_lock
c01d88cd: c01d8844 T pagebuf_iostart
c01d830d: c01d8268 T pagebuf_get
c01cc374: c01cc338 T xfs_trans_read_buf
c01b9078: c01b8f78 T xfs_itobp
c01ba0cc: c01ba05c T xfs_iread
c01b80c2: c01b7ebc T xfs_iget_core
c0147ebe: c0147df0 T icreate
c01b8488: c01b8404 T xfs_iget
c01cd61c: c01cd4f8 T xfs_dir_lookup_int
c01d1c07: c01d1b78 t xfs_lookup
c01df729: c01df6c4 T linvfs_lookup
c013df41: c013dea8 T lookup_hash
c013dfd7: c013df80 T lookup_one_len
c016263d: c0162370 T nfsd_lookup
c0168048: c0167f74 t nfsd3_proc_lookup
c015fa13: c015f940 t nfsd_dispatch
c02bd775: c02bd4e8 T svc_process
c015f80f: c015f618 t nfsd
c0105594: c010556c T kernel_thread
0xc02ca8b3 <stext_lock+18079>: jmp 0xc01d9109 <pagebuf_iowait+49>
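
For anyone reading the traces above: each line pairs an address found on the stack with the closest preceding symbol from System.map (symbol address, type, name). The following is a minimal Python sketch of that lookup, not the tool that produced the traces. The function names and the four-line sample map are assumptions made for illustration; a real System.map has many more entries, so offsets computed against this excerpt are only meaningful for the symbols shown.

```python
import bisect


def parse_system_map(text):
    """Parse System.map text ('address type name' per line) into a sorted list."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _sym_type, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return (name, offset) for the nearest symbol at or below address."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, address) - 1
    if i < 0:
        return None  # address lies below the first symbol
    base, name = symbols[i]
    return name, address - base


# Hypothetical excerpt built only from symbols that appear in the traces.
SAMPLE_MAP = """\
c0105a48 T __down
c0105c48 T __down_failed
c015f618 t nfsd
c02c6214 T stext_lock
"""

symbols = parse_system_map(SAMPLE_MAP)
print(resolve(symbols, 0xC02CA95C))  # ('stext_lock', 18248)
```

The printed offset agrees with the stext_lock+18248 shown in the disassembly line of the second trace. Because stext_lock only holds the out-of-line lock slow paths, the jmp target in that disassembly is what identifies the real caller.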