Christoph Hellwig wrote:
On Thu, Jun 19, 2008 at 04:40:00PM +1000, Brian May wrote:
Does the following help? I still have the logs of the other processes, if
required (just in case it is some weird interaction between multiple
processes?)
It seems to be pretty consistent with lock_timer_base, every time I look
(assuming I haven't read the stack trace upside down...).
Jun 19 16:33:30 hq kernel: grep S 00000000 0 12793 12567 (NOTLB)
Jun 19 16:33:30 hq kernel: f0c23e7c 00200082 000a1089 00000000 00000010 00000008 cd0db550 dfa97550
Jun 19 16:33:30 hq kernel: 34f84262 00273db2 0008a1dc 00000001 cd0db660 c20140a0 dfe1cbe8 00200286
Jun 19 16:33:30 hq kernel: c0125380 a4dbf26b dfa6a000 00200286 000000ff 00000000 00000000 a4dbf26b
Jun 19 16:33:30 hq kernel: Call Trace:
Jun 19 16:33:30 hq kernel: [<c0125380>] lock_timer_base+0x15/0x2f
Jun 19 16:33:30 hq kernel: [<c027f960>] schedule_timeout+0x71/0x8c
Jun 19 16:33:30 hq kernel: [<c0124a81>] process_timeout+0x0/0x5
Jun 19 16:33:30 hq kernel: [<c016c801>] __break_lease+0x2a8/0x2b9
That's the lease breaking code in the VFS, long before we call
into XFS. Looks like someone (samba?) has a lease on this file and
we're having trouble having it broken. Try sending a report about
this to linux-fsdevel@xxxxxxxxxxxxxxx
I feel I am going around in circles.
Anyway, I started the discussion from
<http://www.archivum.info/linux-fsdevel@xxxxxxxxxxxxxxx/2008-06/msg00337.html>.
In the last message (which isn't archived yet), I looked at the Samba
process that is holding the lease. The following is the stack trace of
this process. I don't understand why the XFS code is calling e1000 code;
the filesystem isn't attached via the network. Could this mean the
problem is with the network code?
Jun 20 10:54:37 hq kernel: smbd S 00000000 0 13516 11112 13459 (NOTLB)
Jun 20 10:54:37 hq kernel: ddd19b70 00000082 034cdfca 00000000 00000001 00000007 f7c2c550 dfa9caa0
Jun 20 10:54:37 hq kernel: ae402975 002779a9 0000830f 00000003 f7c2c660 c20240a0 00000001 00000286
Jun 20 10:54:37 hq kernel: c0125380 a5d7f11b c2116000 00000286 000000ff 00000000 00000000 a5d7f11b
Jun 20 10:54:37 hq kernel: Call Trace:
Jun 20 10:54:37 hq kernel: [<c0125380>] lock_timer_base+0x15/0x2f
Jun 20 10:54:37 hq kernel: [<c027f960>] schedule_timeout+0x71/0x8c
Jun 20 10:54:37 hq kernel: [<c0124a81>] process_timeout+0x0/0x5
Jun 20 10:54:37 hq kernel: [<c016a115>] do_select+0x37a/0x3d4
Jun 20 10:54:37 hq kernel: [<c016a677>] __pollwait+0x0/0xb2
Jun 20 10:54:37 hq kernel: [<c0117778>] default_wake_function+0x0/0xc
Jun 20 10:54:37 hq kernel: [<c0117778>] default_wake_function+0x0/0xc
Jun 20 10:54:37 hq kernel: [<f88e998f>] e1000_xmit_frame+0x928/0x958 [e1000]
Jun 20 10:54:37 hq kernel: [<c0121c24>] tasklet_action+0x55/0xaf
Jun 20 10:54:37 hq kernel: [<c022950a>] dev_hard_start_xmit+0x19a/0x1f0
Jun 20 10:54:37 hq kernel: [<f8ae3e6d>] xfs_iext_bno_to_ext+0xd8/0x191 [xfs]
Jun 20 10:54:37 hq kernel: [<f8ac7aec>] xfs_bmap_search_multi_extents+0xa8/0xc5 [xfs]
Jun 20 10:54:37 hq kernel: [<f8ac7b52>] xfs_bmap_search_extents+0x49/0xbe [xfs]
Jun 20 10:54:37 hq kernel: [<f8ac7e35>] xfs_bmapi+0x26e/0x20ce [xfs]
Jun 20 10:54:37 hq kernel: [<f8ac7e35>] xfs_bmapi+0x26e/0x20ce [xfs]
Jun 20 10:54:37 hq kernel: [<c02547e4>] tcp_transmit_skb+0x604/0x632
Jun 20 10:54:37 hq kernel: [<c02560d3>] __tcp_push_pending_frames+0x6a2/0x758
Jun 20 10:54:37 hq kernel: [<c016d84e>] __d_lookup+0x98/0xdb
Jun 20 10:54:37 hq kernel: [<c016d84e>] __d_lookup+0x98/0xdb
Jun 20 10:54:37 hq kernel: [<c0165370>] do_lookup+0x4f/0x135
Jun 20 10:54:37 hq kernel: [<c016dbc4>] dput+0x1a/0x11b
Jun 20 10:54:37 hq kernel: [<c0167312>] __link_path_walk+0xbe4/0xd1d
Jun 20 10:54:37 hq kernel: [<c016a3fb>] core_sys_select+0x28c/0x2a9
Jun 20 10:54:37 hq kernel: [<c01674fe>] link_path_walk+0xb3/0xbd
Jun 20 10:54:37 hq kernel: [<f8afbea1>] xfs_inactive_free_eofblocks+0xdf/0x23f [xfs]
Jun 20 10:54:37 hq kernel: [<c016785d>] do_path_lookup+0x20a/0x225
Jun 20 10:54:37 hq kernel: [<f8b07de5>] xfs_vn_getattr+0x27/0x2f [xfs]
Jun 20 10:54:37 hq kernel: [<c0161b28>] cp_new_stat64+0xfd/0x10f
Jun 20 10:54:37 hq kernel: [<c016a9c1>] sys_select+0x9f/0x182
Jun 20 10:54:37 hq kernel: [<c0102c11>] sysenter_past_esp+0x56/0x79
I guess I also need to make sure I get this same stack trace each time.
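Something along these lines might help with comparing dumps. It is only a rough sketch: the helper names frames() and same_top_frames() are made up for illustration, and it assumes each dump has been saved as text in the same syslog format as the traces quoted in this message.

```python
import re

# Matches call-trace entries such as:
#   "[<c0125380>] lock_timer_base+0x15/0x2f"
# \s+ also spans a newline, so entries that were wrapped after the
# address by the mail client are still picked up.
FRAME_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frames(dump_text):
    """Return the function names of one dump, in the order they appear."""
    return [m.group(2) for m in FRAME_RE.finditer(dump_text)]


def same_top_frames(dump_a, dump_b, depth=4):
    """True if the first `depth` function names of both dumps are identical."""
    return frames(dump_a)[:depth] == frames(dump_b)[:depth]
```

Only the first few entries are compared by default, since the deeper entries in these dumps may not be consistent from one capture to the next.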
Thanks.
Brian May