xfs_alloc_lookup kernel oopses with snapshot-xfs-2.4.17-2002-01-23_04:32

To: linux-xfs@xxxxxxxxxxx
Subject: xfs_alloc_lookup kernel oopses with snapshot-xfs-2.4.17-2002-01-23_04:32_UTC?
From: Stuart Levy <slevy@xxxxxxxxxxxxx>
Date: Fri, 1 Mar 2002 17:07:42 -0600 (CST)
Sender: owner-linux-xfs@xxxxxxxxxxx
I've got a fairly large XFS filesystem with an external log:
    meta-data=/bd          isize=256    agcount=188, agsize=1048576 blks
    data     =             bsize=4096   blocks=196796242, imaxpct=25
             =             sunit=0      swidth=0 blks, unwritten=0
             =             imaxbits=32    
    naming   =version 2    bsize=4096  
    log      =external     bsize=4096   blocks=18065
    realtime =none         extsz=65536  blocks=0, rtextents=0

    Filesystem            Size  Used Avail Use% Mounted on
    /dev/sda1             751G  631G  120G  84% /bd

running under a stock 2.4.17 kernel with the
    snapshot-xfs-2.4.17-2002-01-23_04:32_UTC
xfs-2.4.17-all-i386 patch applied.
XFS is built as a module; _RT, _QUOTA, _DMAPI all turned off.
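
As a sanity check on the geometry above, here is a minimal Python sketch that recomputes the filesystem size and allocation-group layout from the xfs_info numbers. The constant and function names are made up for illustration; only the values come from the output above.

```python
# Cross-check the xfs_info geometry against the df output.
# All values are copied from the report; the names are illustrative only.
BLOCK_SIZE = 4096          # data bsize
DATA_BLOCKS = 196796242    # data blocks
AG_BLOCKS = 1048576        # agsize, in blocks
AG_COUNT = 188             # agcount


def geometry_summary():
    """Return derived sizes for the data section."""
    total_bytes = DATA_BLOCKS * BLOCK_SIZE
    # The last allocation group is usually shorter than the others.
    full_ags, tail_blocks = divmod(DATA_BLOCKS, AG_BLOCKS)
    expected_ags = full_ags + (1 if tail_blocks else 0)
    return {
        "total_gib": total_bytes / 2**30,
        "ag_gib": AG_BLOCKS * BLOCK_SIZE / 2**30,
        "full_ags": full_ags,
        "tail_blocks": tail_blocks,
        "ag_count_matches": expected_ags == AG_COUNT,
    }


if __name__ == "__main__":
    for key, value in geometry_summary().items():
        print(key, value)
```

The numbers are self-consistent: 187 full 4 GiB allocation groups plus one partial group gives agcount=188, and the total rounds to the 751G that df reports.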

I'd also applied all of Trond Myklebust's NFS-related
2.4.17 patches from http://www.fys.uio.no/~trondmy/src/2.4.17/.
Compiled under gcc 2.95.3 20010315.
Filesystem built using xfsprogs-1.3.13.

Hardware is a dual-processor Athlon MP, Tyan 2460,
3ware 7850 8-port IDE RAID (780GB) plus IDE disk on
the motherboard (system disk, plus XFS external log partition).

The whole thing worked well for a few days of fairly heavy use.
But for the last couple of days, since the file system became
more than about half full, I've been getting kernel oopses
like those below, once every 2-4 hours of use.

After each crash I try mounting and unmounting the filesystem;
xfs_check then says all's well, and so does
    xfs_repair -l /dev/hda5  /dev/sda1
(I had to tweak xfs_repair.c to accept a value for -l;
I see the same change was made in CVS recently too.)

Files don't seem to be corrupted, apart from occasional
zero-filled ones that were being written at crash time,
so otherwise all seems well.

Does anything seem especially promising to try?
Should I build from the live CVS copy (if so, which tag?)?
Go back to 2.4.14 and use the 1.0.2 XFS release?
Try another compiler?

Mar  1 12:59:32 wallbits kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000030
Mar  1 12:59:32 wallbits kernel: f8c35ed2
Mar  1 12:59:32 wallbits kernel: *pde = 00000000
Mar  1 12:59:32 wallbits kernel: Oops: 0000
Mar  1 12:59:32 wallbits kernel: CPU:    0
Mar  1 12:59:32 wallbits kernel: EIP:    0010:[rtc:__insmod_rtc_S.bss_L24+4408630/24018534]    Not tainted
Mar  1 12:59:32 wallbits kernel: EIP:    0010:[<f8c35ed2>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Mar  1 12:59:32 wallbits kernel: EFLAGS: 00010246
Mar  1 12:59:32 wallbits kernel: eax: 00000000   ebx: efbd5400   ecx: 00000000   edx: f431af3c
Mar  1 12:59:32 wallbits kernel: esi: 00000000   edi: 00000000   ebp: f431aed8   esp: ef953700
Mar  1 12:59:32 wallbits kernel: ds: 0018   es: 0018   ss: 0018
Mar  1 12:59:32 wallbits kernel: Process mergemovie (pid: 2728, stackpage=ef953000)
Mar  1 12:59:32 wallbits kernel: Stack: f431aed8 00000000 00000000 ef9537cc f1229a60 00000004 ef9537d0 00000000
Mar  1 12:59:32 wallbits kernel:        51000003 00000000 51000018 00000000 00000000 00000000 00000006 00000000
Mar  1 12:59:32 wallbits kernel:        000000a2 efbd5400 00000001 00000000 00000003 00000000 00000000 f8c37548
Mar  1 12:59:32 wallbits kernel: Call Trace: [rtc:__insmod_rtc_S.bss_L24+4414380/24012784] [rtc:__insmod_rtc_S.bss_L24+4398238/24028926] [rtc:__insmod_rtc_S.bss_L24+4395233/24031931] [rtc:__insmod_rtc_S.bss_L24+4401636/24025528] [rtc:__insmod_rtc_S.bss_L24+4734884/23692280]
Mar  1 12:59:32 wallbits kernel: Call Trace: [<f8c37548>] [<f8c3363a>] [<f8c32a7d>] [<f8c34380>] [<f8c85940>]
Mar  1 12:59:32 wallbits kernel:    [<f8c34b57>] [<f8917bcb>] [<f8c42241>] [<c01e83cf>] [<c01e830c>] [<f8c4bf77>]
Mar  1 12:59:32 wallbits kernel:    [<f8c4bf77>] [<f8c45b76>] [<f8c860ba>] [<f8c6f376>] [<f8c8fdd5>] [<c012ea0b>]
Mar  1 12:59:32 wallbits kernel:    [<c012ea0b>] [<f8c8e281>] [<f8c9cca0>] [<f8c89c36>] [<f8c88cc9>] [<f8c8e1fc>]
Mar  1 12:59:32 wallbits kernel:    [<f8c8e3fb>] [<f8c8e1fc>] [<c012da0c>] [<c012dd20>] [<c012dd7c>] [<c012e78e>]
Mar  1 12:59:32 wallbits kernel:    [<c012ea6d>] [<c012e71e>] [<c0126f6b>] [<c0127003>] [<f8c89379>] [<f8c89627>]
Mar  1 12:59:32 wallbits kernel:    [<f8c89859>] [<f8c8e1fc>] [<f8c8f394>] [<f8c8e1fc>] [<f8c8ac90>] [<c013500b>]
Mar  1 12:59:32 wallbits kernel:    [<c0107043>]
Mar  1 12:59:32 wallbits kernel: Code: 8b 50 30 89 54 24 4c 50 56 52 55 e8 5a 74 01 00 83 c4 1c 85

>>EIP; f8c35ed2 <[xfs]xfs_alloc_lookup+11a/340>   <=====
Trace; f8c37548 <[xfs]xfs_alloc_lookup_ge+20/28>
Trace; f8c3363a <[xfs]xfs_alloc_ag_vextent_size+4a/394>
Trace; f8c32a7d <[xfs]xfs_alloc_ag_vextent+31/c8>
Trace; f8c34380 <[xfs]xfs_alloc_fix_freelist+390/408>
Trace; f8c85940 <[xfs]avl_remove+b8/c8>
Trace; f8c34b57 <[xfs]xfs_alloc_vextent+333/3d4>
Trace; f8917bcb <[xfs_support]mrlock+13/24>
Trace; f8c42241 <[xfs]xfs_bmap_alloc+17c1/1afc>
Trace; c01e83cf <ip_finish_output2+c3/118>
Trace; c01e830c <ip_finish_output2+0/118>
Trace; f8c4bf77 <[xfs]xfs_bmbt_get_state+33/3c>
Trace; f8c4bf77 <[xfs]xfs_bmbt_get_state+33/3c>
Trace; f8c45b76 <[xfs]xfs_bmapi+6e2/1048>
Trace; f8c860ba <[xfs]_pagebuf_free_object+ca/f4>
Trace; f8c6f376 <[xfs]xlog_grant_log_space+be/274>
Trace; f8c8fdd5 <[xfs]xfs_strategy+605/854>
Trace; c012ea0b <__alloc_pages+a3/164>
Trace; c012ea0b <__alloc_pages+a3/164>
Trace; f8c8e281 <[xfs]linvfs_pb_bmap+85/c4>
Trace; f8c9cca0 <[xfs]xfs_vnodeops+0/a0>
Trace; f8c89c36 <[xfs]pagebuf_delalloc_convert+42/ec>
Trace; f8c88cc9 <[xfs]pagebuf_write_full_page+59/190>
Trace; f8c8e1fc <[xfs]linvfs_pb_bmap+0/c4>
Trace; f8c8e3fb <[xfs]linvfs_write_full_page+f/1c>
Trace; f8c8e1fc <[xfs]linvfs_pb_bmap+0/c4>
Trace; c012da0c <shrink_cache+220/3d4>
Trace; c012dd20 <shrink_caches+5c/84>
Trace; c012dd7c <try_to_free_pages+34/58>
Trace; c012e78e <balance_classzone+6e/248>
Trace; c012ea6d <__alloc_pages+105/164>
Trace; c012e71e <_alloc_pages+16/18>
Trace; c0126f6b <find_or_create_page+73/f8>
Trace; c0127003 <grab_cache_page+13/18>
Trace; f8c89379 <[xfs]__pagebuf_do_delwri+c9/238>
Trace; f8c89627 <[xfs]_pagebuf_file_write+13f/1f4>
Trace; f8c89859 <[xfs]pagebuf_generic_file_write+17d/304>
Trace; f8c8e1fc <[xfs]linvfs_pb_bmap+0/c4>
Trace; f8c8f394 <[xfs]xfs_write+284/4b4>
Trace; f8c8e1fc <[xfs]linvfs_pb_bmap+0/c4>
Trace; f8c8ac90 <[xfs]linvfs_write+2bc/304>
Trace; c013500b <sys_write+8f/c4>
Trace; c0107043 <system_call+33/40>
Code;  f8c35ed2 <[xfs]xfs_alloc_lookup+11a/340>
00000000 <_EIP>:
Code;  f8c35ed2 <[xfs]xfs_alloc_lookup+11a/340>   <=====
   0:   8b 50 30                  mov    0x30(%eax),%edx   <=====
Code;  f8c35ed5 <[xfs]xfs_alloc_lookup+11d/340>
   3:   89 54 24 4c               mov    %edx,0x4c(%esp,1)
Code;  f8c35ed9 <[xfs]xfs_alloc_lookup+121/340>
   7:   50                        push   %eax
Code;  f8c35eda <[xfs]xfs_alloc_lookup+122/340>
   8:   56                        push   %esi
Code;  f8c35edb <[xfs]xfs_alloc_lookup+123/340>
   9:   52                        push   %edx
Code;  f8c35edc <[xfs]xfs_alloc_lookup+124/340>
   a:   55                        push   %ebp
Code;  f8c35edd <[xfs]xfs_alloc_lookup+125/340>
   b:   e8 5a 74 01 00            call   1746a <_EIP+0x1746a> f8c4d33c <[xfs]xfs_btree_check_sblock+0/dc>
Code;  f8c35ee2 <[xfs]xfs_alloc_lookup+12a/340>
  10:   83 c4 1c                  add    $0x1c,%esp
Code;  f8c35ee5 <[xfs]xfs_alloc_lookup+12d/340>
  13:   85 00                     test   %eax,(%eax)
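
The disassembly can be checked by hand. A rough Python sketch follows, with illustrative names; the register values and code bytes are taken from the oops above. The faulting instruction loads from offset 0x30 of %eax, and %eax is 0, which matches the reported fault address 00000030. %eax is then pushed just before the call to xfs_btree_check_sblock, so it looks like a NULL pointer to some structure is involved, though which structure cannot be told from the oops alone.

```python
# Values copied from the oops report; names are illustrative only.
EIP = 0xF8C35ED2             # faulting instruction address
EAX = 0x00000000             # eax at the time of the fault
DISPLACEMENT = 0x30          # from "mov 0x30(%eax),%edx"
SYMBOL_OFFSET = 0x11A        # from "xfs_alloc_lookup+11a/340"
CALL_INSN_ADDR = 0xF8C35EDD  # address of the "e8 ..." call instruction
CALL_BYTES = bytes([0xE8, 0x5A, 0x74, 0x01, 0x00])


def fault_address(base, disp):
    """Effective address of a base+displacement operand (32-bit)."""
    return (base + disp) & 0xFFFFFFFF


def function_start(eip, offset):
    """Start of the function, given EIP and the symbol+offset form."""
    return eip - offset


def call_target(insn_addr, insn_bytes):
    """Target of a 5-byte x86 near call (opcode e8, rel32 operand)."""
    assert insn_bytes[0] == 0xE8
    rel32 = int.from_bytes(insn_bytes[1:5], "little", signed=True)
    # rel32 is relative to the address of the next instruction.
    return (insn_addr + len(insn_bytes) + rel32) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(fault_address(EAX, DISPLACEMENT)))         # fault address
    print(hex(function_start(EIP, SYMBOL_OFFSET)))       # xfs_alloc_lookup
    print(hex(call_target(CALL_INSN_ADDR, CALL_BYTES)))  # call destination
```

The computed call target agrees with the f8c4d33c that ksymoops resolves to xfs_btree_check_sblock, so the symbol table used for decoding appears to match the running module.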


Another Oops, also crashing within xfs_alloc_lookup:

>>EIP; f8c35ed2 <[xfs]xfs_alloc_lookup+11a/340>   <=====
Trace; f8c37520 <[xfs]xfs_alloc_lookup_eq+20/28>
Trace; f8c32845 <[xfs]xfs_alloc_fixup_trees+6d/20c>
Trace; f8c338f1 <[xfs]xfs_alloc_ag_vextent_size+301/394>
Trace; f8c32a7d <[xfs]xfs_alloc_ag_vextent+31/c8>
Trace; f8c34b74 <[xfs]xfs_alloc_vextent+350/3d4>
Trace; f8920bcb <[3w-xxxx].rodata.start+17ab/2b9f>
Trace; f8c42241 <[xfs]xfs_bmap_alloc+17c1/1afc>
Trace; f8c4bf77 <[xfs]xfs_bmbt_get_state+33/3c>
Trace; f8c4bf77 <[xfs]xfs_bmbt_get_state+33/3c>
Trace; f8c45b76 <[xfs]xfs_bmapi+6e2/1048>
Trace; f8c6f376 <[xfs]xlog_grant_log_space+be/274>
Trace; f8c8fdd5 <[xfs]xfs_strategy+605/854>
 ...
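
To compare the two oopses, here is a small sketch that pulls the [xfs] symbols out of ksymoops output and lists the ones both traces share. The regular expression and helper names are my own and only handle the line shapes seen above; the sample lines are the top of each trace. Keep in mind that a 2.4 call trace lists every stack word that looks like a kernel text address, so some entries (ip_finish_output2, for instance) are probably stale stack contents and not real callers.

```python
import re

# Matches ">>EIP; addr <[module]symbol+off/len>" and "Trace; ..." lines.
LINE_RE = re.compile(
    r"^(?:>>EIP|Trace);\s+([0-9a-f]{8})\s+"
    r"<(?:\[(\S+?)\])?([^+>]+)\+([0-9a-f]+)/([0-9a-f]+)>"
)

FIRST_OOPS = """\
>>EIP; f8c35ed2 <[xfs]xfs_alloc_lookup+11a/340>   <=====
Trace; f8c37548 <[xfs]xfs_alloc_lookup_ge+20/28>
Trace; f8c3363a <[xfs]xfs_alloc_ag_vextent_size+4a/394>
Trace; f8c32a7d <[xfs]xfs_alloc_ag_vextent+31/c8>
Trace; f8c34380 <[xfs]xfs_alloc_fix_freelist+390/408>
Trace; f8c85940 <[xfs]avl_remove+b8/c8>
Trace; f8c34b57 <[xfs]xfs_alloc_vextent+333/3d4>
Trace; f8917bcb <[xfs_support]mrlock+13/24>
Trace; f8c42241 <[xfs]xfs_bmap_alloc+17c1/1afc>
Trace; c01e83cf <ip_finish_output2+c3/118>
"""

SECOND_OOPS = """\
>>EIP; f8c35ed2 <[xfs]xfs_alloc_lookup+11a/340>   <=====
Trace; f8c37520 <[xfs]xfs_alloc_lookup_eq+20/28>
Trace; f8c32845 <[xfs]xfs_alloc_fixup_trees+6d/20c>
Trace; f8c338f1 <[xfs]xfs_alloc_ag_vextent_size+301/394>
Trace; f8c32a7d <[xfs]xfs_alloc_ag_vextent+31/c8>
Trace; f8c34b74 <[xfs]xfs_alloc_vextent+350/3d4>
Trace; f8920bcb <[3w-xxxx].rodata.start+17ab/2b9f>
Trace; f8c42241 <[xfs]xfs_bmap_alloc+17c1/1afc>
"""


def module_symbols(text, module="xfs"):
    """Return symbol names from the given module, in trace order."""
    names = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match and match.group(2) == module:
            names.append(match.group(3))
    return names


def shared_symbols(first, second):
    """Symbols present in both traces, in the first trace's order."""
    in_second = set(second)
    return [name for name in first if name in in_second]


if __name__ == "__main__":
    a = module_symbols(FIRST_OOPS)
    b = module_symbols(SECOND_OOPS)
    for name in shared_symbols(a, b):
        print(name)
```

Both oopses fault at the same instruction in xfs_alloc_lookup and share the xfs_bmap_alloc, xfs_alloc_vextent, xfs_alloc_ag_vextent, xfs_alloc_ag_vextent_size path; they differ only in whether the lookup came through xfs_alloc_lookup_ge or through xfs_alloc_fixup_trees and xfs_alloc_lookup_eq.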