can xfs_repair guarantee a complete clean filesystem?
hank peng
pengxihan at gmail.com
Tue Dec 1 20:39:50 CST 2009
Hi, Eric:
I think I have reproduced the problem.
# uname -a
Linux 1234dahua 2.6.23 #747 Mon Nov 16 10:52:58 CST 2009 ppc unknown
#mdadm -C /dev/md1 -l5 -n3 /dev/sd{h,c,b}
# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md1 : active raid5 sdb[3] sdc[1] sdh[0]
      976772992 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
      [==>..................]  recovery = 13.0% (63884032/488386496) finish=103.8min speed=68124K/sec
unused devices: <none>
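(The array was still resyncing at this point. If the rebuild needs to be ruled out as a factor, I suppose I could wait for it to complete before layering LVM and XFS on top, e.g. something like (assuming this mdadm supports --wait):

# mdadm --wait /dev/md1
# cat /proc/mdstat

but in this run I carried on while recovery was in progress.)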
#pvcreate /dev/md1
#vgcreate Pool_md1 /dev/md1
#lvcreate -L 931G -n testlv Pool_md1
# lvdisplay
  --- Logical volume ---
  LV Name                /dev/Pool_md1/testlv
  VG Name                Pool_md1
  LV UUID                jWTgk5-Q6tf-jSEU-m9VZ-K2Kb-1oRW-R7oP94
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                931.00 GB
  Current LE             238336
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0
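(For a quick sanity check of the whole stack, I believe the short-form LVM commands should show the same information in compact form, e.g.:

# pvs
# vgs Pool_md1
# lvs Pool_md1

I can post that output too if it helps.)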
#mkfs.xfs -f -ssize=4k /dev/Pool_md1/testlv
#mount /dev/Pool_md1/testlv /mnt/Pool_md1/testlv
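(To double-check the geometry mkfs.xfs actually used, I believe something like

# xfs_info /mnt/Pool_md1/testlv

should print the sector size, block size and log parameters of the mounted filesystem; I can post that as well if it would help.)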
Everything went fine; the filesystem mounted cleanly and we began writing files into it through our application software. After a short while, the problem occurred.
# cd /mnt/Pool_md1/testlv
cd: error retrieving current directory: getcwd: cannot access parent
directories: Input/output error
#dmesg | tail -n 30
--- rd:3 wd:2
disk 0, o:1, dev:sdh
disk 1, o:1, dev:sdc
RAID5 conf printout:
--- rd:3 wd:2
disk 0, o:1, dev:sdh
disk 1, o:1, dev:sdc
disk 2, o:1, dev:sdb
md: recovery of RAID array md1
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than
200000 KB/sec) for recovery.
md: using 128k window, over a total of 488386496 blocks.
Filesystem "dm-0": Disabling barriers, not supported by the underlying device
XFS mounting filesystem dm-0
Ending clean XFS mount for filesystem: dm-0
Filesystem "dm-0": XFS internal error xfs_trans_cancel at line 1169 of
file fs/xfs/xfs_trans.c. Caller 0xc019fbf0
Call Trace:
[e8e6dcb0] [c00091ec] show_stack+0x3c/0x1a0 (unreliable)
[e8e6dce0] [c017559c] xfs_error_report+0x50/0x60
[e8e6dcf0] [c0197058] xfs_trans_cancel+0x124/0x140
[e8e6dd10] [c019fbf0] xfs_create+0x1fc/0x63c
[e8e6dd90] [c01ad690] xfs_vn_mknod+0x1ac/0x20c
[e8e6de40] [c007ded4] vfs_create+0xa8/0xe4
[e8e6de60] [c0081370] open_namei+0x5f0/0x688
[e8e6deb0] [c00729b8] do_filp_open+0x2c/0x6c
[e8e6df20] [c0072a54] do_sys_open+0x5c/0xf8
[e8e6df40] [c0002320] ret_from_syscall+0x0/0x3c
xfs_force_shutdown(dm-0,0x8) called from line 1170 of file
fs/xfs/xfs_trans.c. Return address = 0xc01b0b74
Filesystem "dm-0": Corruption of in-memory data detected. Shutting
down filesystem: dm-0
Please umount the filesystem, and rectify the problem(s)
What should I do now? Use xfs_repair, or move to a newer kernel? Please let
me know if you need any other information.
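If xfs_repair is the right next step, I assume the usual sequence would be roughly the following (a no-modify pass first, then a mount/umount cycle to replay the log before the real repair; -L, which zeroes the log, only as a last resort):

# umount /mnt/Pool_md1/testlv
# xfs_repair -n /dev/Pool_md1/testlv    # dry run, reports problems but changes nothing
# mount /dev/Pool_md1/testlv /mnt/Pool_md1/testlv && umount /mnt/Pool_md1/testlv
# xfs_repair /dev/Pool_md1/testlv

Please correct me if that is not the recommended procedure here.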
2009/12/2 Eric Sandeen <sandeen at sandeen.net>:
> hank peng wrote:
>> 2009/12/1 Eric Sandeen <sandeen at sandeen.net>:
>
> ...
>
>>>> kernel version is 2.6.23, xfsprogs is 2.9.7, CPU is MPC8548, powerpc arch.
>>>> I am at home now, Maybe I can provide some detailed information tomorrow.
>>> If there's any possibility to test newer kernel & userspace, that'd
>>> be great. Many bugs have been fixed since those versions.
>>>
>> We did have plan to upgrade kernel to latest 2.6.31.
>
> Well, I'm just suggesting testing it for now, not necessarily
> upgrading your product. Would just be good to know if the bug you
> are seeing persists upstream on ppc.
>
>> BTW, Is there some place where I can check those fixed bug list across versions?
>
> You can look at the changelogs on kernel.org, for instance:
>
> http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.32
>
> Or with git: git log --pretty=oneline fs/xfs
>
> There isn't a great bug <-> commit <-> kernel version mapping, I guess.
>
> -Eric
>
>>> -Eric
>
>
--
The simplest is not all best but the best is surely the simplest!