
ADD 800480 - xlog_grant_log_space can wait indefinitely

To: lord@xxxxxxx
Subject: ADD 800480 - xlog_grant_log_space can wait indefinitely
From: pv@xxxxxxxxxxxxx (lord@xxxxxxx)
Date: Tue, 12 Sep 2000 04:49:36 -0700 (PDT)
Cc: tduffy@xxxxxxxxxxxxxxxxxxxx, linux-xfs@xxxxxxxxxxx
Reply-to: sgi.bugs.xfs@xxxxxxxxxxxxxxxxx
Sender: owner-linux-xfs@xxxxxxxxxxx
Webexec: webpvupdate,pvincident
Webpv: 192.82.201.223
View Incident: 
http://co-op.engr.sgi.com/BugWorks/code/bwxquery.cgi?search=Search&wlong=1&view_type=Bug&wi=800480

 Status : open                         Priority : 2                         
 Assigned Engineer : lord              Submitter : ananth                   
*Modified User : lord                 *Modified User Domain : sgi.com       
*Description :
We have a semi-production build machine that is
running XFS bits as of 8/23/00. I have seen
things like "rm" getting into the following backtrace:

---------
schedule+0x415
_sv_wait+0xcd
xlog_grant_log_space+0x139
xfs_log_reserve+0x7b
xfs_trans_reserve+0x76

.....


==========================
ADDITIONAL INFORMATION (ADD)
From: lord@xxxxxxx (BugWorks)
Date: Sep 12 2000 04:49:33AM
==========================

So it appears timing is everything, because this morning there
is nothing hung on the machine. Looks like the hang is not
indefinite.

Here is what needs dumping from a hang should it happen again.

Feed the first parameter of xlog_grant_log_space into the xlog
kdb command. The third line of its output has an ICLOG
parameter; pass this to the xicall command. This should produce
output like this:
[0]kdb> xicall 0xf44d0000
xlog_in_core/header at 0xf44d0000
magicno: feedbabe  cycle: 1519  version: 1  lsn: 0x0
tail_lsn: 0x5ef00001d4c  len: 1060  prev_block: 8376  num_ops: 0
cycle_data: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  
0  0  0  0  0  0  0  0  0  0  0  0  0  0  
--------------------------------------------------
data: 0xf44d0400  &forcesema: 0xf44d0000  next: 0xf4f80000 bp: 0xf70e6da0
log: 0xf7cee880  callb: 0x00000000  callb_tail: 0xf44d0020  roundoff: 476
size: 32256  (OFFSET: 0) trace: 0x00000000  refcnt: 0  bwritecnt: 0  state: 
state 0x1 <ACTIVE > 
=================================================
xlog_in_core/header at 0xf4f80000
magicno: feedbabe  cycle: 1519  version: 1  lsn: 0x0
tail_lsn: 0x5ef000020f8  len: 224  prev_block: 8440  num_ops: 0
cycle_data: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 
 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  
0  0  0  0  0  0  0  0  0  0  0  0  0  0  
--------------------------------------------------
data: 0xf4f80400  &forcesema: 0xf4f80000  next: 0xf44d0000 bp: 0xf70e6d00
log: 0xf7cee880  callb: 0x00000000  callb_tail: 0xf4f80020  roundoff: 288
size: 32256  (OFFSET: 0) trace: 0x00000000  refcnt: 0  bwritecnt: 0  state: 
state 0x1 <ACTIVE > 
=================================================

There are a couple of bp fields in this output. These need
passing into the pb command, one at a time. So from this example:
[0]kdb> pb 0xf70e6da0
page_buf_t at 0xf70e6da0
  pb_flags MAPPED ASYNC SYNC LOCKABLE FORECIO
  pb_target 0xf735a200 pb_hold 1 pb_next 0xf70e6da0 pb_prev 0xf70e6da0
  pb_file_offset 0x0 pb_buffer_length 0x800 pb_addr 0xf44d0200
  pb_bn 0x79a118 pb_count_desired 0x800
  pb_io_remaining 0   pb_error 0 pba_kiovec[0] 0xf3012d40 pba_kiocnt 1
  pb_iodonesema (0,0) pb_sema (0,0) pincount (0) last holder 0xf7538000
pb_fspriv 0xf44d0000 pb_fspriv2 0x00000001
[0]kdb> pb 0xf70e6d00
page_buf_t at 0xf70e6d00
  pb_flags MAPPED ASYNC SYNC LOCKABLE FORECIO
  pb_target 0xf735a200 pb_hold 1 pb_next 0xf70e6d00 pb_prev 0xf70e6d00
  pb_file_offset 0x0 pb_buffer_length 0x400 pb_addr 0xf4f80200
  pb_bn 0x79a11c pb_count_desired 0x400
  pb_io_remaining 0   pb_error 0 pba_kiovec[0] 0xf3012dc0 pba_kiocnt 1
  pb_iodonesema (0,0) pb_sema (0,0) pincount (0) last holder 0xf7538000
pb_fspriv 0xf4f80000 pb_fspriv2 0x00000001
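
If the console output is being captured to a file, the bp addresses can
be pulled out of the xicall dump mechanically rather than by eye. What
follows is only a rough sketch, not part of any XFS or kdb tooling: the
function name pb_commands and the SAMPLE text are made up for
illustration, and it assumes the "bp: 0x..." layout shown in the xicall
output above. Since the iclogs form a ring (each next: points at the
other), repeated addresses are dropped.

```python
import re

# Matches a "bp: 0x..." field as it appears in the xicall output above.
# Assumption: the field is always printed as "bp:" followed by a hex address.
BP_FIELD = re.compile(r"\bbp:\s*(0x[0-9a-fA-F]+)")


def pb_commands(xicall_output):
    """Return one 'pb <addr>' line per distinct bp address, in order seen."""
    seen = []
    for addr in BP_FIELD.findall(xicall_output):
        if addr not in seen:  # the iclog ring can repeat entries
            seen.append(addr)
    return ["pb " + addr for addr in seen]


# Lines copied from the xicall example above (hypothetical capture).
SAMPLE = """\
data: 0xf44d0400  &forcesema: 0xf44d0000  next: 0xf4f80000 bp: 0xf70e6da0
log: 0xf7cee880  callb: 0x00000000  callb_tail: 0xf44d0020  roundoff: 476
data: 0xf4f80400  &forcesema: 0xf4f80000  next: 0xf44d0000 bp: 0xf70e6d00
"""

if __name__ == "__main__":
    for cmd in pb_commands(SAMPLE):
        print(cmd)
```

The printed lines are meant to be typed at the kdb prompt by hand; the
sketch does not talk to kdb itself.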

Also, the xlog output has an mp: field on the fifth line. This
needs feeding into the xmount command and the xail command.

If the hang happens again, please put all this info into the PV.
It really looks like this is not a permanent hang, but a missed
wakeup which is getting compensated for by something else in
the system.
