View Incident:
http://co-op.engr.sgi.com/BugWorks/code/bwxquery.cgi?search=Search&wlong=1&view_type=Bug&wi=800480
Status : open Priority : 2
Assigned Engineer : lord Submitter : ananth
*Modified User : lord *Modified User Domain : sgi.com
*Description :
We have a semi-production build machine that is
running XFS bits as of 8/23/00. I have seen
commands like "rm" hang with the following backtrace:
---------
schedule+0x415
_sv_wait+0xcd
xlog_grant_log_space+0x139
xfs_log_reserve+0x7b
xfs_trans_reserve+0x76
.....
==========================
ADDITIONAL INFORMATION (ADD)
From: lord@xxxxxxx (BugWorks)
Date: Sep 12 2000 04:49:33AM
==========================
So it appears timing is everything, because this morning there
is nothing hung on the machine. Looks like the hang is not
indefinite.
Here is what needs dumping from a hang should it happen again.
Feed the first parameter of xlog_grant_log_space into the xlog
kdb command. The third line of its output has an ICLOG parameter;
pass this to the xicall command. This should produce
output like this:
[0]kdb> xicall 0xf44d0000
xlog_in_core/header at 0xf44d0000
magicno: feedbabe cycle: 1519 version: 1 lsn: 0x0
tail_lsn: 0x5ef00001d4c len: 1060 prev_block: 8376 num_ops: 0
cycle_data: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
--------------------------------------------------
data: 0xf44d0400 &forcesema: 0xf44d0000 next: 0xf4f80000 bp: 0xf70e6da0
log: 0xf7cee880 callb: 0x00000000 callb_tail: 0xf44d0020 roundoff: 476
size: 32256 (OFFSET: 0) trace: 0x00000000 refcnt: 0 bwritecnt: 0 state:
state 0x1 <ACTIVE >
=================================================
xlog_in_core/header at 0xf4f80000
magicno: feedbabe cycle: 1519 version: 1 lsn: 0x0
tail_lsn: 0x5ef000020f8 len: 224 prev_block: 8440 num_ops: 0
cycle_data: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
--------------------------------------------------
data: 0xf4f80400 &forcesema: 0xf4f80000 next: 0xf44d0000 bp: 0xf70e6d00
log: 0xf7cee880 callb: 0x00000000 callb_tail: 0xf4f80020 roundoff: 288
size: 32256 (OFFSET: 0) trace: 0x00000000 refcnt: 0 bwritecnt: 0 state:
state 0x1 <ACTIVE >
=================================================
There are a couple of bp fields in this output. These need to be
passed to the pb command. So from this example:
[0]kdb> pb 0xf70e6da0
page_buf_t at 0xf70e6da0
pb_flags MAPPED ASYNC SYNC LOCKABLE FORECIO
pb_target 0xf735a200 pb_hold 1 pb_next 0xf70e6da0 pb_prev 0xf70e6da0
pb_file_offset 0x0 pb_buffer_length 0x800 pb_addr 0xf44d0200
pb_bn 0x79a118 pb_count_desired 0x800
pb_io_remaining 0 pb_error 0 pba_kiovec[0] 0xf3012d40 pba_kiocnt 1
pb_iodonesema (0,0) pb_sema (0,0) pincount (0) last holder 0xf7538000
pb_fspriv 0xf44d0000 pb_fspriv2 0x00000001
[0]kdb> pb 0xf70e6d00
page_buf_t at 0xf70e6d00
pb_flags MAPPED ASYNC SYNC LOCKABLE FORECIO
pb_target 0xf735a200 pb_hold 1 pb_next 0xf70e6d00 pb_prev 0xf70e6d00
pb_file_offset 0x0 pb_buffer_length 0x400 pb_addr 0xf4f80200
pb_bn 0x79a11c pb_count_desired 0x400
pb_io_remaining 0 pb_error 0 pba_kiovec[0] 0xf3012dc0 pba_kiocnt 1
pb_iodonesema (0,0) pb_sema (0,0) pincount (0) last holder 0xf7538000
pb_fspriv 0xf4f80000 pb_fspriv2 0x00000001
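If the kdb session is being captured to a file, the bp addresses can be
pulled out of the xicall output mechanically rather than by eye. Below is a
minimal Python sketch, assuming the xicall output format shown above. The
names pb_commands and SAMPLE are illustrative only and are not part of kdb
or XFS.

```python
import re

# Matches the "bp: 0x..." field that xicall prints on each iclog's "data:" line.
# The word boundary keeps it from matching inside other field names.
BP_FIELD = re.compile(r"\bbp:\s*(0x[0-9a-fA-F]+)")


def pb_commands(xicall_output):
    """Return one 'pb <addr>' command per distinct bp address, in order seen."""
    seen = []
    for addr in BP_FIELD.findall(xicall_output):
        if addr not in seen:
            seen.append(addr)
    return ["pb " + addr for addr in seen]


# Lines copied from the xicall example in this report.
SAMPLE = """\
data: 0xf44d0400 &forcesema: 0xf44d0000 next: 0xf4f80000 bp: 0xf70e6da0
log: 0xf7cee880 callb: 0x00000000 callb_tail: 0xf44d0020 roundoff: 476
data: 0xf4f80400 &forcesema: 0xf4f80000 next: 0xf44d0000 bp: 0xf70e6d00
"""

if __name__ == "__main__":
    for cmd in pb_commands(SAMPLE):
        print(cmd)
```

Run against the example above, this prints the same two pb commands that
were entered by hand.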
Also, the xlog output has an mp: field on the fifth line; this needs
feeding into the xmount command and the xail command.
If the hang happens again, please put all this info into the PV.
It really looks like this is not a permanent hang, but a missed
wakeup which is getting compensated for by something else in
the system.