On 04 July james-p@xxxxxxxxxxxxxxxxxx wrote:
>We have noticed a problem with a couple of our NFS servers (running
>RedHat 7.2 with a stock 2.4.18 kernel with XFS v1.1) whereby NFS access
>slows to a crawl or stalls.
>
>The exported filesystem(s) are XFS with 8 nfsd's running - when we have
>the problem the load average is about 8 - but CPU usage, disk access and
>network traffic are minimal.
>
>I found, by accident, that running the command 'sync' appears to 'fix'
>the situation...
We have just had the same problem again, although this time there was
an associated kernel oops. The syslog entries and the ksymoops decode
are as follows:
Jul 9 12:45:20 santos kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000001
Jul 9 12:45:20 santos kernel: c01c5384
Jul 9 12:45:20 santos kernel: *pde = 00000000
Jul 9 12:45:20 santos kernel: Oops: 0000
Jul 9 12:45:20 santos kernel: CPU: 1
Jul 9 12:45:20 santos kernel: EIP: 0010:[xfs_syncsub+588/3128] Not tainted
Jul 9 12:45:20 santos kernel: EIP: 0010:[<c01c5384>] Not tainted
Jul 9 12:45:20 santos kernel: EFLAGS: 00010286
Jul 9 12:45:20 santos kernel: eax: 00000000 ebx: e226f730 ecx: 00000008 edx: dc356904
Jul 9 12:45:20 santos kernel: esi: 00000001 edi: f2dffc40 ebp: ffffffff esp: f7ebff30
Jul 9 12:45:20 santos kernel: ds: 0018 es: 0018 ss: 0018
Jul 9 12:45:20 santos kernel: Process kupdated (pid: 7, stackpage=f7ebf000)
Jul 9 12:45:20 santos kernel: Stack: f7a41400 f7a41448 f7ebe660 0008e000 dc356904 f7a41d18 00000010 00000001
Jul 9 12:45:20 santos kernel: f7a41d18 00000000 00000010 00000001 00000001 00000000 00000008 00000008
Jul 9 12:45:20 santos kernel: 00000040 00000000 00000000 00000000 d4fad000 c65e12e0 00000004 f7ebe000
Jul 9 12:45:21 santos kernel: Call Trace: [xfs_sync+21/28] [linvfs_statfs+59/128] [sync_supers+224/276] [sync_old_buffers+47/140] [kupdate+313/320]
Jul 9 12:45:21 santos kernel: Call Trace: [<c01c5131>] [<c01db2b7>] [<c013ed28>] [<c013de67>] [<c013e185>]
Jul 9 12:45:21 santos kernel: [<c0105000>] [<c01057d2>] [<c01057db>]
Jul 9 12:45:21 santos kernel: Code: f6 45 02 40 75 3b 8b 54 24 70 f6 82 30 02 00 00 10 74 0b f6
>>EIP; c01c5384 <xfs_syncsub+24c/c38> <=====
Trace; c01c5131 <xfs_sync+15/1c>
Trace; c01db2b7 <linvfs_statfs+3b/80>
Trace; c013ed28 <sync_supers+e0/114>
Trace; c013de67 <sync_old_buffers+2f/8c>
Trace; c013e185 <kupdate+139/140>
Trace; c0105000 <_stext+0/0>
Trace; c01057d2 <kernel_thread+1a/30>
Trace; c01057db <kernel_thread+23/30>
Code; c01c5384 <xfs_syncsub+24c/c38>
00000000 <_EIP>:
Code; c01c5384 <xfs_syncsub+24c/c38> <=====
0: f6 45 02 40 testb $0x40,0x2(%ebp) <=====
Code; c01c5388 <xfs_syncsub+250/c38>
4: 75 3b jne 41 <_EIP+0x41> c01c53c5 <xfs_syncsub+28d/c38>
Code; c01c538a <xfs_syncsub+252/c38>
6: 8b 54 24 70 mov 0x70(%esp,1),%edx
Code; c01c538e <xfs_syncsub+256/c38>
a: f6 82 30 02 00 00 10 testb $0x10,0x230(%edx)
Code; c01c5395 <xfs_syncsub+25d/c38>
11: 74 0b je 1e <_EIP+0x1e> c01c53a2 <xfs_syncsub+26a/c38>
Code; c01c5397 <xfs_syncsub+25f/c38>
13: f6 00 00 testb $0x0,(%eax)
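
For what it is worth, the fault address looks consistent with the
register dump, assuming I am reading the disassembly correctly: the
faulting instruction is testb $0x40,0x2(%ebp), and with ebp = ffffffff
the 32-bit effective address wraps around to 00000001, which matches
the address reported in the oops. A quick Python sketch of that
arithmetic follows (the helper name is my own, purely for
illustration). It also checks that the decimal offset 588/3128 in the
syslog line is the same as the hex 24c/c38 printed by ksymoops.

```python
# Sanity-check the oops arithmetic for a 32-bit x86 kernel.
MASK32 = 0xFFFFFFFF  # effective addresses wrap modulo 2**32 on i386


def effective_address(base: int, displacement: int) -> int:
    """Hypothetical helper: base register plus displacement, truncated to 32 bits."""
    return (base + displacement) & MASK32


ebp = 0xFFFFFFFF  # value of ebp in the register dump
fault = effective_address(ebp, 0x2)  # operand of testb $0x40,0x2(%ebp)
print(f"fault address: {fault:08x}")  # compare with "virtual address 00000001"

# The syslog line gives symbol offsets in decimal, ksymoops gives them in hex.
print(hex(588), hex(3128))  # compare with xfs_syncsub+24c/c38
```

So ebp appears to hold -1 rather than a valid pointer at that point,
though I cannot say why.
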
Is there anything helpful I should post along with this?
Thanks,
Huw
--
| Huw Lynes | The Moving Picture Company |
| System Administrator | 127 Wardour Street |
|.........................| London, W1F 0NL |