Hi
I am using XFS with the Linux 2.6.7 kernel on Red Hat 8.0.
The xfsprogs version is 2.6.13.
I was running a crash test on an XFS
filesystem to check recovery and look for corruption after
unclean shutdowns.
The XFS filesystem sits on an 18G partition. The test
worked in the following manner:
Twenty directories were created under the filesystem root. A
sample program that repeatedly performs a mix of
operations (create, delete, link, read, write, rename, etc.)
was used to generate filesystem load. Twenty threads of this
program were spawned, each working in one of the 20
directories, to generate heavy load.
About 5 minutes after the threads were spawned, once the
load had built up a bit, the machine was
crashed with a direct power-off.
This cycle was repeated about 200 times.
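For anyone who wants to reproduce something similar, a load generator along these lines could be sketched in Python. This is not the actual test program. The function names (`one_op`, `worker`, `run_load`), the `dirNN` directory names, the file sizes, and the fixed operation count per thread (instead of running until power-off) are all assumptions made for illustration.

```python
import itertools
import os
import random
import threading

OPS = ("create", "write", "read", "link", "rename", "delete")

def one_op(dirpath, rng, counter):
    """Run one randomly chosen operation inside dirpath and return its name."""
    files = sorted(os.listdir(dirpath))
    # With no files present, the only operation that can succeed is create.
    op = rng.choice(OPS) if files else "create"
    fresh = os.path.join(dirpath, "n%d" % next(counter))
    if op == "create":
        with open(fresh, "wb") as f:
            f.write(os.urandom(rng.randint(1, 4096)))
        return op
    target = os.path.join(dirpath, rng.choice(files))
    if op == "write":
        with open(target, "ab") as f:
            f.write(os.urandom(rng.randint(1, 4096)))
    elif op == "read":
        with open(target, "rb") as f:
            f.read()
    elif op == "link":
        os.link(target, fresh)
    elif op == "rename":
        os.rename(target, fresh)
    else:  # delete
        os.unlink(target)
    return op

def worker(dirpath, seed, max_ops, failures):
    """Hammer a single directory; record operations that raised OSError."""
    rng = random.Random(seed)
    counter = itertools.count()
    os.makedirs(dirpath, exist_ok=True)
    for _ in range(max_ops):
        try:
            one_op(dirpath, rng, counter)
        except OSError:
            # For example ENOSPC once the filesystem is full.
            failures.append(dirpath)

def run_load(root, n_dirs=20, max_ops=1000):
    """Start one thread per directory under root; return the failure count."""
    failures = []
    threads = [
        threading.Thread(
            target=worker,
            args=(os.path.join(root, "dir%02d" % i), i, max_ops, failures),
        )
        for i in range(n_dirs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(failures)
```

Calling `run_load("/xfs")` would start 20 threads, one per directory, on the mounted filesystem. For a crash test, `max_ops` would be set high enough that the threads are still running when the power is cut.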
After about 164 cycles, the filesystem usage reached
100% and further writes failed as expected. I had
logged the dmesg output for each reboot cycle,
and all of the logs showed that XFS recovery did not run
into any problems. The messages seen in each dmesg log were:
<snip>
Starting XFS recovery on filesystem: cciss/c0d0p8 (dev: cciss/c0d0p8)
Ending XFS recovery on filesystem: cciss/c0d0p8 (dev: cciss/c0d0p8)
</snip>
Up to this point, everything was fine, with XFS
recovering properly after each crash, even after the
filesystem was 100% full.
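Checking a couple of hundred saved dmesg logs by hand is tedious, so a check like the following could be scripted. This is only a sketch: the function name `scan_log`, the returned dictionary layout, and the set of error markers are assumptions. The message strings are the ones quoted in this report.

```python
import re

START = re.compile(r"Starting XFS recovery on filesystem: (\S+)")
END = re.compile(r"Ending XFS recovery on filesystem: (\S+)")
# Markers treated as trouble; taken from the messages quoted in this report.
ERROR_MARKERS = ("XFS internal error", "Oops:", "Unable to handle kernel")

def scan_log(log_text, fs):
    """Report whether recovery for fs started and ended, plus any error lines."""
    started = False
    recovered = False
    errors = []
    for line in log_text.splitlines():
        if any(marker in line for marker in ERROR_MARKERS):
            errors.append(line.strip())
            continue
        m = START.search(line)
        if m and m.group(1) == fs:
            started = True
            continue
        m = END.search(line)
        if m and m.group(1) == fs and started:
            recovered = True
    return {"recovered": recovered, "errors": errors}
```

A log from a good cycle should give `recovered` set to True and an empty `errors` list for `cciss/c0d0p8`. A log where recovery starts but never ends, or where an internal error appears, is the one to look at more closely.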
Next, I deleted 10 of the 20 top-level directories to
free up some space.
The "rm -rf" command for one of the
directories hung.
After some time with no activity, I rebooted the system (a
clean reboot) and noticed
that XFS recovery failed. The relevant sections of the
boot messages are attached in xfs_bootup_failure.txt.
Next, I tried xfs_check. It printed a lot of
"block 12/232064 type unknown not expected" messages
and then stopped responding as well. At this point I
noticed a defunct xfs_db process on the system.
<snip>
[root@mirahp1 root]# ps -aef | grep 1433
root 1433 1377 0 11:39 pts/1 00:00:00 /bin/sh -f /usr/sbin/xfs_check /dev/cciss/c0d0p8
root 1434 1433 0 11:39 pts/1 00:00:01 [xfs_db] <defunct>
</snip>
After this, I tried xfs_repair. The session trace
follows:
<snip>
[root@mirahp1 root]# xfs_repair /dev/cciss/c0d0p8
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair. If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
[root@mirahp1 root]# mount -t xfs /dev/cciss/c0d0p8 /xfs
Segmentation fault
</snip>
Running xfs_repair with -L after this point also results
in a hang.
Any ideas what is going wrong?
It looks like my filesystem is
inaccessible now:
I am unable to mount it or run any repair on it.
Any help will be appreciated.
Thanks,
Ash
XFS mounting filesystem cciss/c0d0p8
Starting XFS recovery on filesystem: cciss/c0d0p8 (dev: cciss/c0d0p8)
XFS internal error XFS_WANT_CORRUPTED_GOTO at line 4355 of file fs/xfs/xfs_bmap.c. Caller 0xc024aeaf
[<c02210dd>] xfs_bmap_read_extents+0x37d/0x520
[<c024aeaf>] xfs_iread_extents+0x9f/0x1a0
[<c024aeaf>] xfs_iread_extents+0x9f/0x1a0
[<c011445a>] __wake_up_locked+0x2a/0x30
[<c0223865>] xfs_bunmapi+0xf05/0xfa0
[<c0114370>] default_wake_function+0x0/0x20
[<c0385118>] __down_failed+0x8/0xc
[<c02746fe>] .text.lock.xfs_buf+0x37/0x49
[<c02737df>] pagebuf_iostart+0x6f/0xb0
[<c0255a69>] xlog_grant_log_space+0x189/0x350
[<c012ddbc>] find_lock_page+0x2c/0xb0
[<c0264cfd>] xfs_trans_log_inode+0x2d/0x60
[<c024b855>] xfs_itruncate_finish+0x1e5/0x440
[<c026b02a>] xfs_inactive+0x50a/0x570
[<c024928c>] xfs_itobp+0xfc/0x280
[<c027b6c5>] vn_rele+0x95/0xa0
[<c0279f98>] linvfs_clear_inode+0x18/0x30
[<c0161086>] clear_inode+0xc6/0xe0
[<c0161d48>] generic_delete_inode+0xe8/0x120
[<c027270a>] pagebuf_free+0x8a/0x110
[<c0161f15>] iput+0x55/0x80
[<c025b7b0>] xlog_recover_process_iunlinks+0x320/0x3c0
[<c025cb49>] xlog_recover_finish+0xa9/0xe0
[<c02537ac>] xfs_log_mount_finish+0x2c/0x30
[<c025e5c3>] xfs_mountfs+0xa63/0xff0
[<c027407c>] xfs_setsize_buftarg+0x3c/0x80
[<c024f1f2>] xfs_ioinit+0x22/0x40
[<c02667bd>] xfs_mount+0x2dd/0x400
[<c027a9c3>] vfs_mount+0x43/0x50
[<c027a797>] linvfs_fill_super+0x97/0x240
[<c0285877>] snprintf+0x27/0x30
[<c017d0d6>] disk_name+0x66/0xc0
[<c0151015>] sb_set_blocksize+0x25/0x60
[<c0150a36>] get_sb_bdev+0x126/0x160
[<c0163825>] alloc_vfsmnt+0x85/0xc0
[<c027a96f>] linvfs_get_sb+0x2f/0x40
[<c027a700>] linvfs_fill_super+0x0/0x240
[<c0150c8f>] do_kern_mount+0x5f/0xe0
[<c0164835>] do_add_mount+0x95/0x1a0
[<c0164b70>] do_mount+0x170/0x1c0
[<c01649b8>] copy_mount_options+0x78/0xc0
[<c0164f41>] sys_mount+0xb1/0xe0
[<c0105c8f>] syscall_call+0x7/0xb
Unable to handle kernel NULL pointer dereference at virtual address 000002f2
printing eip:
c026447f
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: usbcore
CPU: 0
EIP: 0060:[<c026447f>] Not tainted
EFLAGS: 00010282 (2.6.7-mirahp1compiled30jul)
EIP is at xfs_trans_brelse+0x1f/0x100
eax: 00000246 ebx: f4345504 ecx: c03e2af0 edx: 000023b7
esi: 00000000 edi: 00000246 ebp: f7fc7b38 esp: f36819c8
ds: 007b es: 007b ss: 0068
Process mount (pid: 517, threadinfo=f3680000 task=f77e9930)
Stack: 0000000d c01061ec 00000000 00000000 f4345504 c02210f1 f4345504 00000246
00000000 c039c53c 00001103 c024aeaf f7fec340 eb89b36c f7fec338 f7d1102c
00000000 00000000 eb89b36c f77e9930 00000286 f3681a30 0000000d f7792c00
Call Trace:
[<c01061ec>] dump_stack+0x1c/0x20
[<c02210f1>] xfs_bmap_read_extents+0x391/0x520
[<c024aeaf>] xfs_iread_extents+0x9f/0x1a0
[<c024aeaf>] xfs_iread_extents+0x9f/0x1a0
[<c011445a>] __wake_up_locked+0x2a/0x30
[<c0223865>] xfs_bunmapi+0xf05/0xfa0
[<c0114370>] default_wake_function+0x0/0x20
[<c0385118>] __down_failed+0x8/0xc
[<c02746fe>] .text.lock.xfs_buf+0x37/0x49
[<c02737df>] pagebuf_iostart+0x6f/0xb0
[<c0255a69>] xlog_grant_log_space+0x189/0x350
[<c012ddbc>] find_lock_page+0x2c/0xb0
[<c0264cfd>] xfs_trans_log_inode+0x2d/0x60
[<c024b855>] xfs_itruncate_finish+0x1e5/0x440
[<c026b02a>] xfs_inactive+0x50a/0x570
[<c024928c>] xfs_itobp+0xfc/0x280
[<c027b6c5>] vn_rele+0x95/0xa0
[<c0279f98>] linvfs_clear_inode+0x18/0x30
[<c0161086>] clear_inode+0xc6/0xe0
[<c0161d48>] generic_delete_inode+0xe8/0x120
[<c027270a>] pagebuf_free+0x8a/0x110
[<c0161f15>] iput+0x55/0x80
[<c025b7b0>] xlog_recover_process_iunlinks+0x320/0x3c0
[<c025cb49>] xlog_recover_finish+0xa9/0xe0
[<c02537ac>] xfs_log_mount_finish+0x2c/0x30
[<c025e5c3>] xfs_mountfs+0xa63/0xff0
[<c027407c>] xfs_setsize_buftarg+0x3c/0x80
[<c024f1f2>] xfs_ioinit+0x22/0x40
[<c02667bd>] xfs_mount+0x2dd/0x400
[<c027a9c3>] vfs_mount+0x43/0x50
[<c027a797>] linvfs_fill_super+0x97/0x240
[<c0285877>] snprintf+0x27/0x30
[<c017d0d6>] disk_name+0x66/0xc0
[<c017d0d6>] disk_name+0x66/0xc0
[<c0151015>] sb_set_blocksize+0x25/0x60
[<c0150a36>] get_sb_bdev+0x126/0x160
[<c0163825>] alloc_vfsmnt+0x85/0xc0
[<c027a96f>] linvfs_get_sb+0x2f/0x40
[<c027a700>] linvfs_fill_super+0x0/0x240
[<c0150c8f>] do_kern_mount+0x5f/0xe0
[<c0164835>] do_add_mount+0x95/0x1a0
[<c0164b70>] do_mount+0x170/0x1c0
[<c01649b8>] copy_mount_options+0x78/0xc0
[<c0164f41>] sys_mount+0xb1/0xe0
[<c0105c8f>] syscall_call+0x7/0xb
Code: 8b b7 ac 00 00 00 89 1c 24 89 74 24 04 e8 cf 0a 00 00 89 c2