Re: xfs_lock_dir_and_entry problem

To: Rajagopal Ananthanarayanan <ananth@xxxxxxx>
Subject: Re: xfs_lock_dir_and_entry problem
From: Steve Lord <lord@xxxxxxx>
Date: Fri, 19 May 2000 09:00:37 -0500
Cc: slinx-xfs@xxxxxxxxxxxxxxxxxxxx
In-reply-to: Message from Rajagopal Ananthanarayanan <ananth@sgi.com> of "Thu, 18 May 2000 18:39:07 PDT." <39249B3B.A60E7494@sgi.com>
Sender: owner-linux-xfs@xxxxxxxxxxx

I think we need a better implementation of delay than this:

void
delay(long ticks)
{
  unsigned long timeout = jiffies + ((ticks * HZ) / 1000);
  while (jiffies < timeout);
}

Say this:

void
delay(long ticks)
{
        current->state = TASK_INTERRUPTIBLE; /* Probably should be
                                                uninterruptible */
        schedule_timeout(ticks);
}


We need to get off the CPU. This might fix things, since sleeping in
schedule_timeout() would drop the kernel lock in the process.

And now back to cxfs.......

Steve


> 
> while running dbench on a tot kernel with delalloc ON,
> I'm seeing that one of the processes gets stuck as follows:
> 
> ------------
> [1]kdb> bt
>     EBP       EIP         Function(args)
> 0xc1a9bd94 0xc48515c6  xfs_iunlock+0x56( 0xc36bf2e8, 0x4, 0x0, 0xc3e05000, 0xc36bf2e8 )
> 0xc1a9bdf0 0xc486e7d5  xfs_lock_dir_and_entry+0x101( 0xc36bf2e8, 0xc3882f20, 0xc06387d8, 0x23, 0xc1a9be5c )
> 0xc1a9be74 0xc486ecb9  xfs_remove+0x261( 0xc36bf300, 0xc3882f20, 0xc1a9be98, 0xffffffff, 0xc3179e40 )
> 0xc1a9bf54 0xc4876bf5  linvfs_unlink+0x41( 0xc3179e40, 0xc3882ec0, 0xc33dd700, 0xfffffffe, 0x0 )
> 0xc1a9bf78 0xc0155aeb  vfs_unlink+0x12b( 0xc3179e40, 0xc3882ec0, 0xc1a9a000, 0xc1997000, 0x0 )
> 0xc1a9bf98 0xc0155bcf  do_unlink+0x97( 0xc1997000, 0x0, 0xc1a9a000, 0xbffff85e, 0x804a0b1 )
> 0xc1a9bfbc 0xc0155cc7  sys_unlink+0x8b( 0xbffffc54, 0xbffffc7c, 0xbffffc54, 0xbffff85e, 0x804a0b1 )
> 0xc1a9bfd8 0xc010a538  system_call+0x34
> -------------------
> 
> Apparently, the xfs_lock_dir_and_entry() keeps doing the "again:"
> loop ... down in the stack, sys_unlink took the kernel lock
> and the process is still hanging on to that. On another cpu,
> a thread active in XFS code is spinning while trying to get the
> kernel lock. Its trace is as follows:
> 
> ----------------
> [0]kdb> bt
>     EBP       EIP         Function(args)
> 0xc1a57988 0xc0209265  kdb_bt: confused
> stext_lock+0x711( 0xc0369778, 0xc0368e90, 0xc3df1aa0, 0xc0368e90, 0xc1a579d8 )
> 0xc1a579e0 0xc01b42db  __get_request_wait+0x297( 0xaa, 0x811, 0xc3df1aa0, 0x5, 0xc3df1aa0 )
> 0xc1a57a38 0xc01b4fbc  generic_make_request+0x728( 0xc11ff3a4, 0x5, 0xc3df1aa0, 0x811, 0xc3df1ae8 )
> 0xc1a57a58 0xc01b4889  make_request+0x1d( 0x8, 0x5, 0xc3df1aa0, 0x401f6e, 0xc10adc40 )
> 0xc1a57ab0 0xc0140ef8  _pagebuf_page_io+0x328( 0xc10adc40, 0xc22f51e0, 0x401f6f, 0x811, 0xe00 )
> 0xc1a57af8 0xc0141062  _page_buf_page_apply+0x112( 0x0, 0xc22f51e0, 0x0, 0x0, 0xc10adc40 )
> 0xc1a57b48 0xc0141c2c  pagebuf_segment_apply+0xfc( 0xc0140f50, 0x0, 0xc22f51e0, 0x0, 0x0 )
> 0xc1a57bac 0xc01414ac  pagebuf_iorequest+0x43c( 0xc22f51e0, 0xc11b14e4, 0xc11b14a0, 0xc22c0000 )
> 0xc1a57bd4 0xc4858a2c  xlog_sync+0x120( 0xc11b14a0, 0xc22c0000, 0x0, 0xc22c0000, 0xc11b14e4 )
> 0xc1a57c00 0xc485a969  xlog_state_release_iclog+0x155( 0xc11b14a0, 0xc22c0000, 0xc11b14a0, 0x3, 0xc1481b20 )
> 0xc1a57c24 0xc485aeb2  xlog_state_sync+0x1ea( 0xc11b14a0, 0x34, 0x248e, 0x3, 0xc3e05000 )
> 0xc1a57c44 0xc4857739  xfs_log_force+0x49( 0xc3e05000, 0x34, 0x248e, 0x3, 0x2 )
> 0xc1a57d04 0xc4866d05  xfs_trans_commit+0x315( 0xc1481328, 0x0, 0x0, 0xc1481328, 0xffffffff )
> 0xc1a57d38 0xc4834534  xfs_bmap_finish+0x8c( 0xc1a57e08, 0xc1a57da4, 0xffffffff, 0xffffffff, 0xc1a57d94 )
> 0xc1a57db0 0xc4852f19  xfs_itruncate_finish+0x211( 0xc1a57e08, 0xc215a7d8, 0x0, 0x0, 0x0 )
> 0xc1a57e0c 0xc486d87c  xfs_inactive+0x210( 0xc215a7f0, 0x0, 0xc2a332fc, 0xc488c8e0, 0xc292ce40 )
> 0xc1a57e3c 0xc48801a1  vn_rele+0xed( 0xc2a332fc, 0xc30ded00, 0xc1a57e68, 0xc015e5e8 )
> 0xc1a57e4c 0xc487d117  linvfs_put_inode+0x17( 0xc30ded00, 0xc3882540, 0xc30ded00, 0xc292ce40 )
> 0xc1a57e68 0xc015e5e8  iput+0x38( 0xc30ded00, 0x0, 0xc30ded00, 0xc1a57f54 )
> 0xc1a57e7c 0xc015c68d  d_delete+0x49( 0xc3882540, 0xffffffff, 0xc292ce40, 0xc292ced8 )
> 0xc1a57f54 0xc4876c2f  linvfs_unlink+0x7b( 0xc292ce40, 0xc3882540, 0xc33dde80, 0xfffffffe, 0x0 )
> 0xc1a57f78 0xc0155aeb  vfs_unlink+0x12b( 0xc292ce40, 0xc3882540, 0xc1a56000, 0xc1907000, 0x0 )
> [0]more> 
> 0xc1a57f98 0xc0155bcf  do_unlink+0x97( 0xc1907000, 0x0, 0xc1a56000, 0xbffff85e, 0x804a0b1 )
> 0xc1a57fbc 0xc0155cc7  sys_unlink+0x8b( 0xbffffc54, 0xbffffc7d, 0xbffffc54, 0xbffff85e, 0x804a0b1 )
> 0xc1a57fd8 0xc010a538  system_call+0x34
> -----------------
> 
> Basically, while doing a make_request as part of pagebuf_page_io,
> it ran out of request structures, went to sleep, now has the
> request structure ... and is trying to reacquire the kernel lock
> which it originally acquired (also) in sys_unlink.
> 
> In the meantime, the page-cleaner daemon is trying to convert
> a delalloc extent. Its backtrace is as follows:
> 
> ------------
> [0]kdb> btp 560
>     EBP       EIP         Function(args)
> 0xc22e76f4 0xc0118304  schedule+0x4b4( 0xc22f5860, 0xc22f5880, 0xc22f5880, 0xc22f5878, 0xc22e7744 )
> 0xc22e774c 0xc0107def  __down+0x257( 0xc22f5860, 0xc1160000, 0xc22f5800 )
> 0xc22e7760 0xc0108487  __down_failed+0xb( 0x1ff, 0xbff04200, 0xc2656740, 0xc4825eb9, 0xc148147c )
> 0xc22e780c 0xc020a7db  _pagebuf_get_lockable_buffer+0x38e( 0xc2656740, 0xc2676000, 0xbff04200, 0x0, 0x200 )
> 0xc22e7844 0xc013f734  pagebuf_get+0xc0( 0xc2656740, 0xbff04200, 0x0, 0x200, 0x2005 )
> 0xc22e7870 0xc48681e6  xfs_trans_read_buf+0x1be( 0xc3e05000, 0xc14811d4, 0xc3e05188, 0x5ff821, 0x0 )
> 0xc22e78bc 0xc4825ff8  xfs_alloc_read_agf+0x54( 0xc3e05000, 0xc14811d4, 0x6, 0x1, 0xc22e7900 )
> 0xc22e7954 0xc4825ba3  xfs_alloc_fix_freelist+0x133( 0xc22e7b44, 0x1, 0xc3e051fc, 0x1, 0xa )
> 0xc22e79a8 0xc48263c1  xfs_alloc_vextent+0x305( 0xc22e7b44, 0x1, 0x16, 0xc3e05000 )
> 0xc22e7b94 0xc4832616  xfs_bmap_alloc+0x1c42( 0xc22e7c9c, 0x9, 0xc0638950, 0xc3e0500c )
> 0xc22e7ce8 0xc4835430  xfs_bmapi+0x690( 0xc14811d4, 0xc06387d8, 0x9, 0x0, 0x1 )
> 0xc22e7e04 0xc487b0a2  xfs_iomap_write_convert+0x33e( 0xc0638950, 0x9000, 0x0, 0x1000, 0xc22e7fac )
> 0xc22e7ed0 0xc487a0fe  xfs_iomap_write+0x10e( 0xc0638950, 0x9000, 0x0, 0x1000, 0xc22e7fac )
> 0xc22e7f10 0xc4879b66  xfs_bmap+0xfe( 0xc06387f0, 0x9000, 0x0, 0x1000, 0x10010002 )
> 0xc22e7f58 0xc487741d  linvfs_pb_bmap+0x79( 0xc0c7bd80, 0x9000, 0x0, 0x1000, 0xc22e7fac )
> 0xc22e7fc0 0xc0145b27  pb_delalloc_convert+0xa3( 0xc1034610, 0xc22e7fea, 0x10000000, 0x100, 0xc2691c50 )
> 0xc22e7fec 0xc0145f6f  page_cleaner_daemon+0x29b
> 0xc2691c58 0xc0107603  kernel_thread+0x23
> ------------------
> 
> 
> Ted, do you think the stuff you're working on has anything to do with this?
> 
> 
> -- 
> --------------------------------------------------------------------------
> Rajagopal Ananthanarayanan ("ananth")
> Member Technical Staff, SGI.
> --------------------------------------------------------------------------
