I've got a fairly large XFS filesystem with an external log:
meta-data=/bd                    isize=256    agcount=188, agsize=1048576 blks
data     =                       bsize=4096   blocks=196796242, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=0
         =                       imaxbits=32
naming   =version 2              bsize=4096
log      =external               bsize=4096   blocks=18065
realtime =none                   extsz=65536  blocks=0, rtextents=0

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             751G  631G  120G  84% /bd
running under a stock 2.4.17 kernel with the
snapshot-xfs-2.4.17-2002-01-23_04:32_UTC
xfs-2.4.17-all-i386 patch applied.
XFS is built as a module; _RT, _QUOTA, _DMAPI all turned off.
I'd also applied all of Trond Myklebust's NFS-related
2.4.17 patches from http://www.fys.uio.no/~trondmy/src/2.4.17/.
Compiled under gcc 2.95.3 20010315.
Filesystem built using xfsprogs-1.3.13.
Hardware is a dual-processor Athlon MP, Tyan 2460,
3ware 7850 8-port IDE RAID (780GB) plus IDE disk on
the motherboard (system disk, plus XFS external log partition).
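As a sanity check on the geometry reported by xfs_info above, the numbers look self-consistent: 187 full 4 GiB allocation groups plus one partial group account for all 196796242 data blocks, which comes to roughly 750.7 GiB and agrees with the 751G shown by df. The external log works out to about 70.6 MiB. A small Python sketch of that arithmetic follows; the variable and function names are mine, and only the values are taken from the output above.

```python
# Values copied from the xfs_info output; names are illustrative only.
BLOCK_SIZE = 4096          # data bsize, in bytes
DATA_BLOCKS = 196796242    # data blocks
AG_COUNT = 188             # agcount
AG_SIZE = 1048576          # agsize, in blocks
LOG_BLOCKS = 18065         # external log blocks

GIB = 1024 ** 3
MIB = 1024 ** 2


def geometry_summary():
    """Derive sizes from the reported geometry."""
    full_ags, last_ag_blocks = divmod(DATA_BLOCKS, AG_SIZE)
    return {
        "data_gib": DATA_BLOCKS * BLOCK_SIZE / GIB,
        "ag_gib": AG_SIZE * BLOCK_SIZE / GIB,
        "full_ags": full_ags,                # completely filled AGs
        "last_ag_blocks": last_ag_blocks,    # blocks in the final, partial AG
        "log_mib": LOG_BLOCKS * BLOCK_SIZE / MIB,
    }


if __name__ == "__main__":
    summary = geometry_summary()
    # agcount should equal the full AGs plus one partial AG (if any).
    expected_ags = summary["full_ags"] + (1 if summary["last_ag_blocks"] else 0)
    print("AG count consistent:", expected_ags == AG_COUNT)
    for key, value in summary.items():
        print(key, value)
```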
The whole thing worked well for a few days of fairly heavy use.
But for the last couple of days -- now that the filesystem is
more than about half full -- I've been getting kernel oopses
like those below, after every 2-4 hours of use.
After each crash I mount and then unmount the filesystem;
xfs_check says all's well, and so does
xfs_repair -l /dev/hda5 /dev/sda1
(I had to tweak xfs_repair.c to accept a value for -l --
I see the same change was made in CVS recently too.)
Files don't seem to be corrupted, apart from the occasional
zero-filled file that was being written at crash time,
so otherwise all seems well.
Does anything seem especially promising to try?
Should I build from the live CVS copy (if so, which tag?)?
Go back to 2.4.14 and use the 1.0.2 XFS release?
Try another compiler?
Mar 1 12:59:32 wallbits kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000030
Mar 1 12:59:32 wallbits kernel: f8c35ed2
Mar 1 12:59:32 wallbits kernel: *pde = 00000000
Mar 1 12:59:32 wallbits kernel: Oops: 0000
Mar 1 12:59:32 wallbits kernel: CPU: 0
Mar 1 12:59:32 wallbits kernel: EIP: 0010:[rtc:__insmod_rtc_S.bss_L24+4408630/24018534] Not tainted
Mar 1 12:59:32 wallbits kernel: EIP: 0010:[<f8c35ed2>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Mar 1 12:59:32 wallbits kernel: EFLAGS: 00010246
Mar 1 12:59:32 wallbits kernel: eax: 00000000 ebx: efbd5400 ecx: 00000000 edx: f431af3c
Mar 1 12:59:32 wallbits kernel: esi: 00000000 edi: 00000000 ebp: f431aed8 esp: ef953700
Mar 1 12:59:32 wallbits kernel: ds: 0018 es: 0018 ss: 0018
Mar 1 12:59:32 wallbits kernel: Process mergemovie (pid: 2728, stackpage=ef953000)
Mar 1 12:59:32 wallbits kernel: Stack: f431aed8 00000000 00000000 ef9537cc f1229a60 00000004 ef9537d0 00000000
Mar 1 12:59:32 wallbits kernel: 51000003 00000000 51000018 00000000 00000000 00000000 00000006 00000000
Mar 1 12:59:32 wallbits kernel: 000000a2 efbd5400 00000001 00000000 00000003 00000000 00000000 f8c37548
Mar 1 12:59:32 wallbits kernel: Call Trace: [rtc:__insmod_rtc_S.bss_L24+4414380/24012784] [rtc:__insmod_rtc_S.bss_L24+4398238/24028926] [rtc:__insmod_rtc_S.bss_L24+4395233/24031931] [rtc:__insmod_rtc_S.bss_L24+4401636/24025528] [rtc:__insmod_rtc_S.bss_L24+4734884/23692280]
Mar 1 12:59:32 wallbits kernel: Call Trace: [<f8c37548>] [<f8c3363a>] [<f8c32a7d>] [<f8c34380>] [<f8c85940>]
Mar 1 12:59:32 wallbits kernel: [<f8c34b57>] [<f8917bcb>] [<f8c42241>] [<c01e83cf>] [<c01e830c>] [<f8c4bf77>]
Mar 1 12:59:32 wallbits kernel: [<f8c4bf77>] [<f8c45b76>] [<f8c860ba>] [<f8c6f376>] [<f8c8fdd5>] [<c012ea0b>]
Mar 1 12:59:32 wallbits kernel: [<c012ea0b>] [<f8c8e281>] [<f8c9cca0>] [<f8c89c36>] [<f8c88cc9>] [<f8c8e1fc>]
Mar 1 12:59:32 wallbits kernel: [<f8c8e3fb>] [<f8c8e1fc>] [<c012da0c>] [<c012dd20>] [<c012dd7c>] [<c012e78e>]
Mar 1 12:59:32 wallbits kernel: [<c012ea6d>] [<c012e71e>] [<c0126f6b>] [<c0127003>] [<f8c89379>] [<f8c89627>]
Mar 1 12:59:32 wallbits kernel: [<f8c89859>] [<f8c8e1fc>] [<f8c8f394>] [<f8c8e1fc>] [<f8c8ac90>] [<c013500b>]
Mar 1 12:59:32 wallbits kernel: [<c0107043>]
Mar 1 12:59:32 wallbits kernel: Code: 8b 50 30 89 54 24 4c 50 56 52 55 e8 5a 74 01 00 83 c4 1c 85
>>EIP; f8c35ed2 <[xfs]xfs_alloc_lookup+11a/340> <=====
Trace; f8c37548 <[xfs]xfs_alloc_lookup_ge+20/28>
Trace; f8c3363a <[xfs]xfs_alloc_ag_vextent_size+4a/394>
Trace; f8c32a7d <[xfs]xfs_alloc_ag_vextent+31/c8>
Trace; f8c34380 <[xfs]xfs_alloc_fix_freelist+390/408>
Trace; f8c85940 <[xfs]avl_remove+b8/c8>
Trace; f8c34b57 <[xfs]xfs_alloc_vextent+333/3d4>
Trace; f8917bcb <[xfs_support]mrlock+13/24>
Trace; f8c42241 <[xfs]xfs_bmap_alloc+17c1/1afc>
Trace; c01e83cf <ip_finish_output2+c3/118>
Trace; c01e830c <ip_finish_output2+0/118>
Trace; f8c4bf77 <[xfs]xfs_bmbt_get_state+33/3c>
Trace; f8c4bf77 <[xfs]xfs_bmbt_get_state+33/3c>
Trace; f8c45b76 <[xfs]xfs_bmapi+6e2/1048>
Trace; f8c860ba <[xfs]_pagebuf_free_object+ca/f4>
Trace; f8c6f376 <[xfs]xlog_grant_log_space+be/274>
Trace; f8c8fdd5 <[xfs]xfs_strategy+605/854>
Trace; c012ea0b <__alloc_pages+a3/164>
Trace; c012ea0b <__alloc_pages+a3/164>
Trace; f8c8e281 <[xfs]linvfs_pb_bmap+85/c4>
Trace; f8c9cca0 <[xfs]xfs_vnodeops+0/a0>
Trace; f8c89c36 <[xfs]pagebuf_delalloc_convert+42/ec>
Trace; f8c88cc9 <[xfs]pagebuf_write_full_page+59/190>
Trace; f8c8e1fc <[xfs]linvfs_pb_bmap+0/c4>
Trace; f8c8e3fb <[xfs]linvfs_write_full_page+f/1c>
Trace; f8c8e1fc <[xfs]linvfs_pb_bmap+0/c4>
Trace; c012da0c <shrink_cache+220/3d4>
Trace; c012dd20 <shrink_caches+5c/84>
Trace; c012dd7c <try_to_free_pages+34/58>
Trace; c012e78e <balance_classzone+6e/248>
Trace; c012ea6d <__alloc_pages+105/164>
Trace; c012e71e <_alloc_pages+16/18>
Trace; c0126f6b <find_or_create_page+73/f8>
Trace; c0127003 <grab_cache_page+13/18>
Trace; f8c89379 <[xfs]__pagebuf_do_delwri+c9/238>
Trace; f8c89627 <[xfs]_pagebuf_file_write+13f/1f4>
Trace; f8c89859 <[xfs]pagebuf_generic_file_write+17d/304>
Trace; f8c8e1fc <[xfs]linvfs_pb_bmap+0/c4>
Trace; f8c8f394 <[xfs]xfs_write+284/4b4>
Trace; f8c8e1fc <[xfs]linvfs_pb_bmap+0/c4>
Trace; f8c8ac90 <[xfs]linvfs_write+2bc/304>
Trace; c013500b <sys_write+8f/c4>
Trace; c0107043 <system_call+33/40>
Code; f8c35ed2 <[xfs]xfs_alloc_lookup+11a/340>
00000000 <_EIP>:
Code; f8c35ed2 <[xfs]xfs_alloc_lookup+11a/340> <=====
0: 8b 50 30 mov 0x30(%eax),%edx <=====
Code; f8c35ed5 <[xfs]xfs_alloc_lookup+11d/340>
3: 89 54 24 4c mov %edx,0x4c(%esp,1)
Code; f8c35ed9 <[xfs]xfs_alloc_lookup+121/340>
7: 50 push %eax
Code; f8c35eda <[xfs]xfs_alloc_lookup+122/340>
8: 56 push %esi
Code; f8c35edb <[xfs]xfs_alloc_lookup+123/340>
9: 52 push %edx
Code; f8c35edc <[xfs]xfs_alloc_lookup+124/340>
a: 55 push %ebp
Code; f8c35edd <[xfs]xfs_alloc_lookup+125/340>
b: e8 5a 74 01 00 call 1746a <_EIP+0x1746a> f8c4d33c <[xfs]xfs_btree_check_sblock+0/dc>
Code; f8c35ee2 <[xfs]xfs_alloc_lookup+12a/340>
10: 83 c4 1c add $0x1c,%esp
Code; f8c35ee5 <[xfs]xfs_alloc_lookup+12d/340>
13: 85 00 test %eax,(%eax)
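Reading the disassembly above: the faulting instruction is a load from offset 0x30 off %eax, and the register dump shows eax as 00000000, which matches the reported fault address 00000030. In other words, some pointer was NULL when xfs_alloc_lookup tried to read a field 0x30 bytes into whatever it points at. I have not checked which structure that is. The Python sketch below just redoes the address arithmetic from the oops; the function and constant names are mine, not anything from the kernel or ksymoops.

```python
# Values copied from the oops and ksymoops output; names are illustrative.
EAX = 0x00000000             # eax from the register dump
DISPLACEMENT = 0x30          # from "mov 0x30(%eax),%edx" (bytes 8b 50 30)
REPORTED_FAULT = 0x00000030  # "virtual address 00000030"

EIP = 0xF8C35ED2             # faulting instruction
EIP_OFFSET = 0x11A           # xfs_alloc_lookup+11a/340
FUNC_SIZE = 0x340


def effective_address(base, displacement):
    """Address accessed by a base+displacement operand on 32-bit x86."""
    return (base + displacement) & 0xFFFFFFFF


def function_bounds(eip, offset, size):
    """Start and end address of the function containing eip."""
    start = eip - offset
    return start, start + size


def call_target(insn_addr, insn_bytes):
    """Target of a 5-byte 'call rel32' (opcode e8) located at insn_addr."""
    if len(insn_bytes) != 5 or insn_bytes[0] != 0xE8:
        raise ValueError("not a call rel32 instruction")
    rel = int.from_bytes(insn_bytes[1:], "little", signed=True)
    return (insn_addr + 5 + rel) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(effective_address(EAX, DISPLACEMENT)))
    print([hex(a) for a in function_bounds(EIP, EIP_OFFSET, FUNC_SIZE)])
    # The call sits 0xb bytes after the faulting instruction.
    print(hex(call_target(EIP + 0xB, bytes.fromhex("e85a740100"))))
```

The last check reproduces the f8c4d33c call target that ksymoops resolved to xfs_btree_check_sblock, which suggests the decoded code bytes and the symbol table agree with each other.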
Another Oops, also crashing within xfs_alloc_lookup:
>>EIP; f8c35ed2 <[xfs]xfs_alloc_lookup+11a/340> <=====
Trace; f8c37520 <[xfs]xfs_alloc_lookup_eq+20/28>
Trace; f8c32845 <[xfs]xfs_alloc_fixup_trees+6d/20c>
Trace; f8c338f1 <[xfs]xfs_alloc_ag_vextent_size+301/394>
Trace; f8c32a7d <[xfs]xfs_alloc_ag_vextent+31/c8>
Trace; f8c34b74 <[xfs]xfs_alloc_vextent+350/3d4>
Trace; f8920bcb <[3w-xxxx].rodata.start+17ab/2b9f>
Trace; f8c42241 <[xfs]xfs_bmap_alloc+17c1/1afc>
Trace; f8c4bf77 <[xfs]xfs_bmbt_get_state+33/3c>
Trace; f8c4bf77 <[xfs]xfs_bmbt_get_state+33/3c>
Trace; f8c45b76 <[xfs]xfs_bmapi+6e2/1048>
Trace; f8c6f376 <[xfs]xlog_grant_log_space+be/274>
Trace; f8c8fdd5 <[xfs]xfs_strategy+605/854>
...
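To compare the two oopses more systematically, the ksymoops "Trace;" lines can be parsed and intersected. The Python sketch below does that for the first few frames of each trace quoted above. The regular expression and helper names are my own, and it assumes the line layout shown here rather than every format ksymoops can produce. Keep in mind that 2.4 call traces are built by scanning the stack for values that look like kernel text addresses, so some entries may be stale leftovers rather than real callers.

```python
import re

# Matches ">>EIP; addr <[module]func+off/size>" and "Trace; addr <func+off/size>".
FRAME_RE = re.compile(
    r"^(?:>>EIP|Trace);\s+([0-9a-f]{8})\s+"
    r"<(?:\[([^\]]+)\])?([^+>]+)\+([0-9a-f]+)/([0-9a-f]+)>"
)

OOPS_1 = """\
>>EIP; f8c35ed2 <[xfs]xfs_alloc_lookup+11a/340> <=====
Trace; f8c37548 <[xfs]xfs_alloc_lookup_ge+20/28>
Trace; f8c3363a <[xfs]xfs_alloc_ag_vextent_size+4a/394>
Trace; f8c32a7d <[xfs]xfs_alloc_ag_vextent+31/c8>
Trace; f8c34380 <[xfs]xfs_alloc_fix_freelist+390/408>
Trace; f8c85940 <[xfs]avl_remove+b8/c8>
Trace; f8c34b57 <[xfs]xfs_alloc_vextent+333/3d4>
Trace; f8917bcb <[xfs_support]mrlock+13/24>
Trace; f8c42241 <[xfs]xfs_bmap_alloc+17c1/1afc>
"""

OOPS_2 = """\
>>EIP; f8c35ed2 <[xfs]xfs_alloc_lookup+11a/340> <=====
Trace; f8c37520 <[xfs]xfs_alloc_lookup_eq+20/28>
Trace; f8c32845 <[xfs]xfs_alloc_fixup_trees+6d/20c>
Trace; f8c338f1 <[xfs]xfs_alloc_ag_vextent_size+301/394>
Trace; f8c32a7d <[xfs]xfs_alloc_ag_vextent+31/c8>
Trace; f8c34b74 <[xfs]xfs_alloc_vextent+350/3d4>
Trace; f8920bcb <[3w-xxxx].rodata.start+17ab/2b9f>
Trace; f8c42241 <[xfs]xfs_bmap_alloc+17c1/1afc>
"""


def parse_frames(text):
    """Return (module, function, offset) tuples for each recognised line."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            _addr, module, func, offset, _size = match.groups()
            frames.append((module or "kernel", func, int(offset, 16)))
    return frames


def common_functions(frames_a, frames_b):
    """Functions present in both traces, in the order they appear in frames_a."""
    in_b = {(module, func) for module, func, _ in frames_b}
    common = []
    for module, func, _ in frames_a:
        if (module, func) in in_b and (module, func) not in common:
            common.append((module, func))
    return common


if __name__ == "__main__":
    for module, func in common_functions(parse_frames(OOPS_1), parse_frames(OOPS_2)):
        print(f"[{module}] {func}")
```

For the frames quoted here, both oopses share the xfs_bmap_alloc, xfs_alloc_vextent, xfs_alloc_ag_vextent, xfs_alloc_ag_vextent_size and xfs_alloc_lookup path, and differ only in whether the lookup was entered via xfs_alloc_lookup_ge or xfs_alloc_lookup_eq.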