
LOCKUP with XFS and NFS

To: linux-xfs@xxxxxxxxxxx
Subject: LOCKUP with XFS and NFS
From: Ajay Shekhawat <ajay@xxxxxxxxxxxxxxxxx>
Date: Sun, 25 Mar 2001 14:52:48 -0500
Organization: Center for Document Analysis and Recognition
Sender: owner-linux-xfs@xxxxxxxxxxx
User-agent: Mutt/1.2.5i

Background: I'm evaluating XFS versus ReiserFS for an NFS server. I'm leaning
    towards XFS because we have a few IRIX machines, and the admins are
    familiar with XFS (backup/restore, etc.).
    The "test" server is a dual P-II 450 machine w/ 256MB RAM and a couple
    of SCSI disks on an Adaptec 2940. One of these disks was formatted as XFS
    and is NFS-exported to 4-5 client machines. The base installation is
    Red Hat Wolverine; we upgraded the kernel to the latest XFS-patched kernel,
    0.10-test2, using the kernel source RPM on the OSS ftp site.

Testing:  We are beating on this exported filesystem from 4-5 clients. Each
    client reads randomly chosen 3MB files in a tight loop. One of the
    clients also writes files of random size (0-1MB), again in a tight loop.
    This homespun testing methodology tries to simulate the kind of use the
    real server would expect to get.
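
    For readers who want to reproduce a similar load, here is a minimal
    sketch of such a client loop. It is not the script used in the test
    above; the function names, the file naming scheme and the bounded
    iteration count are all assumptions made for illustration. It would be
    pointed at a directory on the NFS-mounted filesystem.

```python
import os
import random


def write_random_file(directory, max_bytes=1024 * 1024, rng=random):
    """Write one file of random size in [0, max_bytes]; return its path."""
    size = rng.randint(0, max_bytes)
    # Hypothetical naming scheme: a random 32-bit hex suffix.
    path = os.path.join(directory, "w%08x.dat" % rng.getrandbits(32))
    with open(path, "wb") as f:
        f.write(os.urandom(size))
    return path


def read_random_file(directory, rng=random, chunk=64 * 1024):
    """Read one randomly chosen file in full; return the bytes read."""
    names = sorted(
        n for n in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, n))
    )
    if not names:
        return 0
    total = 0
    with open(os.path.join(directory, rng.choice(names)), "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    return total


def run_client(directory, iterations, writer=False,
               max_bytes=1024 * 1024, rng=random):
    """Run a bounded loop of writes (writer=True) or reads; return bytes moved."""
    moved = 0
    for _ in range(iterations):
        if writer:
            moved += os.path.getsize(write_random_file(directory, max_bytes, rng))
        else:
            moved += read_random_file(directory, rng)
    return moved
```

    Note that a reader on the same machine that wrote the files will mostly
    hit the local page cache, so readers and the writer should run on
    different NFS clients for the server to see the traffic.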

Problem:  After about 3-4 hours of continuous beating, the system locked up.
    The message on the console is reproduced below. Also given below is a
    "function call trace", which I worked out by hand from the disassembled
    output of the kernel, my kernel hacking skills being
    severely lacking  :-)

Question: Is this a knfsd problem, or an XFS problem? It looks like the
    problem is with XFS, but I've been easily mistaken before...

Thanks for any help in this regard,


Ajay


Listing #1: Console message -----------------------------------------------

NMI watchdog detected LOCKUP on CPU0, registers:
CPU:    0
EIP:    0010:[<c02b12a8>]
EFLAGS: 00000086
eax: 00000000   ebx: c1453c00   ecx: cf025cd0 edx: c1453c00
esi: 0000007c   edi: c1447d10   ebp: cf024000 esp: cf025be8
ds: 0018  es: 0018   ss: 0018
Process: nfsd (pid: 645, stackpage=cf025000)
Stack: c1447d10 00000286 00000003 c012a2a3 c1447d10 00000003 00000000 00000000
       00000001 00001000 c1447d78 c0134e34 c1447d10 00000003 00000000 00000001
       c0134f11 00000001 00000000 c12f5e1c 00000811 cf9d69e0 00000000 00000001
Call Trace: [<c012a2a3>] [<c0134e34>] [<c0134f11>] [<c01351e4>] [<c01895cb>] [<c018926d>] [<c012d834>]
       [<c0124a8d>] [<c01f1cd8>] [<c01f1b2c>] [<c012514b>] [<c01253e1>] [<c0125807>] [<c0125750>] [<c01f1fcf>]
       [<c01eeee7>] [<c01eee40>] [<c017316d>] [<c01eee40>] [<c0170335>] [<c016fa33>] [<c02a9a78>] [<c016f7da>]
       [<c0107503>]

Code: 7e f8 e9 6c 8e e7 ff 90 7e 18 00 f3 90 7e f8 e9 d3 8e e7 ff
Console shuts up...




Listing #2: Function call trace (done by hand) --------------------------------

start address   function containing the trace entry
-------------   -----------------------------------
c012a258        <kmem_cache_alloc>
c0134df4        <get_unused_buffer_head>
c0134eec        <create_buffers>
c01351cc        <create_empty_buffers>
c0189568        <hook_buffers_to_page>
c0189178        <pagebuf_read_full_page>
c012d6f8        <__alloc_pages>
c01249d4        <add_to_page_cache_unique>
c01f1cc4        <linvfs_read_full_page>
c01f1b2c        <linvfs_pb_bmap>
c0124f28        <generic_file_readahead>
c01251c8        <do_generic_file_read>
c01257a4        <generic_file_read>
c0125750        <file_read_actor>
c01f1d90        <xfs_read>
c01eee40        <linvfs_read>
c01eee40        <linvfs_read>
c0172f40        <nfsd_read>
c01eee40        <linvfs_read>
c0170214        <nfsd_proc_read>
c016f968        <nfsd_dispatch>
c02a97cc        <svc_process>
c016f610        <nfsd>
c01074e0        <kernel_thread>
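
The hand lookup above can also be scripted. The following is a minimal sketch,
assuming a symbol table in the usual System.map format ("address type name",
one symbol per line); the helper names are made up for illustration. For each
trace address it picks the symbol with the largest start address that is not
above it, which is the same rule used when reading the disassembly by hand.
Tools such as ksymoops do this job properly.

```python
import bisect
import re


def parse_symbols(text):
    """Parse System.map-style lines into a sorted list of (address, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def parse_trace(text):
    """Extract the [<xxxxxxxx>] addresses from oops text, in order."""
    return [int(m, 16) for m in re.findall(r"\[<([0-9a-fA-F]{8})>\]", text)]


def resolve(addr, symbols):
    """Return (name, offset) of the nearest symbol at or below addr, or None."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    start, name = symbols[i]
    return name, addr - start


if __name__ == "__main__":
    # Two symbols and two trace entries taken from the listings above.
    system_map = """\
c012a258 T kmem_cache_alloc
c0134df4 T get_unused_buffer_head
"""
    trace = "Call Trace: [<c012a2a3>] [<c0134e34>]"
    table = parse_symbols(system_map)
    for addr in parse_trace(trace):
        name, offset = resolve(addr, table)
        print("%08x  %s+0x%x" % (addr, name, offset))
```

With a partial symbol table the nearest-below rule can attribute an address to
the wrong function, so the full System.map for the running kernel is needed.
Some entries in an oops call trace may also be stale values left on the stack
rather than live return addresses.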

