Re: Data Corruption Problem

To: "Aman Shahi" <ashahi@xxxxxxxxxxx>, <linux-xfs@xxxxxxxxxxx>
Subject: Re: Data Corruption Problem
From: "Wendy Cheng" <s_wendy_cheng@xxxxxxxxxxx>
Date: Wed, 30 Jul 2003 13:22:25 -0400
References: <E923357F2279D411B9F500508BAEE83701CCD539@hqntex1.ciprico.com>
Sender: linux-xfs-bounce@xxxxxxxxxxx
Linux XFS is not a cluster file system, which implies that during a failover
the file system's buffer cache is lost (together with the faulty machine).
Even if the second machine can pick up the follow-on workload, data could
still get lost (or corrupted) unless you mount the file systems with the
"sync" option. All a journaling file system (such as XFS) can do is ensure
that the file system's metadata is kept in a consistent state, and sometimes
the journal data itself can get damaged too.
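
To illustrate the buffer cache point: a write that has only reached the cache
of the failed node is gone after the failover. Below is a minimal sketch in
Python, assuming an application that wants a given write on stable storage
before it carries on. The helper name `durable_write` and the demo path are
hypothetical, and this is only an application-level analogue of the "sync"
mount option, not something XFS does for you. Whether the data really
survives also depends on the storage stack honoring the flush.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data, then ask the kernel to flush it to stable storage.

    Hypothetical helper for illustration only. Without the fsync call the
    data may exist only in the buffer cache of the machine that wrote it.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # request that this file's data reach the device
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Demo on a throwaway temporary directory (an assumption for the example).
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.dat")
    durable_write(demo_path, b"payload")
    with open(demo_path, "rb") as f:
        print(f.read())
```

Flushing per write costs throughput, which is the same trade-off the "sync"
mount option makes for the whole file system.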

You are asking too much of free software :-)

Wendy
-------

----- Original Message ----- 
From: "Aman Shahi" <ashahi@xxxxxxxxxxx>
To: <linux-xfs@xxxxxxxxxxx>
Sent: Tuesday, July 29, 2003 5:45 PM
Subject: Data Corruption Problem


|
| Hi,
| I am using linux 2.4.20 + LVM 1.0.7 +
| XFS(snapshot-xfs-2.4.20-2003-04-07_05:19_UTC with ACLs, no debug enabled).
|
| I created a couple of Logical Volumes using LVM, and then created/mounted
| file systems over them. I am running some NFS clients doing I/O over
| different files in these file systems. I am doing Failover/Failback
| testing. That is, I have one file system attached to one node and the
| other to the second node. When I fail one of the nodes, the other node
| takes over the file system of the failed node. When trying to mount the
| file system of the failed node, I am getting file system corruption.
|
| Could anybody tell me what the problem is here? Attached here is the output
| from "dmesg".
|
| Thanks in advance,
|
| Aman.
|
|
| Jul 29 13:17:25 localhost kernel: Linux version 2.4.20
| (root@xxxxxxxxxxxxxxxxxxxxxxxx) (gcc version 3.2 20020903 (Red Hat Linux 8.0
| 3.2-7)) #1 SMP Tue Jul 29 11:31:57 EDT 2003
| Jul 29 13:17:26 localhost kernel: LVM version 1.0.7(28/03/2003)
| Jul 29 13:17:26 localhost kernel: NET4: Linux TCP/IP 1.0 for NET4.0
| Jul 29 13:17:26 localhost kernel: qla2x00: Found  VID=1077 DID=2312
| SSVID=1077 SSDID=100
| Jul 29 13:17:26 localhost kernel: scsi(0:0:1:1): Enabled tagged queuing,
| queue depth 16.
| Jul 29 13:17:26 localhost kernel: Attached scsi disk sda at scsi0, channel
| 0, id 0, lun 0
| Jul 29 13:17:26 localhost kernel: Attached scsi disk sdb at scsi0, channel
| 0, id 0, lun 1
| Jul 29 13:17:26 localhost kernel: Attached scsi disk sdc at scsi0, channel
| 0, id 1, lun 0
| Jul 29 13:17:26 localhost kernel: Attached scsi disk sdd at scsi0, channel
| 0, id 1, lun 1
| Jul 29 13:17:26 localhost kernel: SCSI device sda: 573498800 512-byte hdwr
| sectors (293631 MB)
| Jul 29 13:17:26 localhost kernel:  sda: sda1 sda2 sda3
| Jul 29 13:17:26 localhost kernel: SCSI device sdb: 573498800 512-byte hdwr
| sectors (293631 MB)
| Jul 29 13:17:26 localhost kernel:  sdb: sdb1 sdb2 sdb3
| Jul 29 13:17:26 localhost kernel: SCSI device sdc: 573498800 512-byte hdwr
| sectors (293631 MB)
| Jul 29 13:17:26 localhost kernel:  sdc: sdc1 sdc2 sdc3
| Jul 29 13:17:26 localhost kernel: SCSI device sdd: 573498800 512-byte hdwr
| sectors (293631 MB)
| Jul 29 13:17:26 localhost kernel:  sdd: sdd1 sdd2 sdd3
| Jul 29 13:17:26 localhost kernel: reiserfs: checking transaction log (device
| 03:03) ...
| Jul 29 13:17:26 localhost kernel: Warning, log replay starting on readonly
| filesystem
| Jul 29 13:17:26 localhost kernel: reiserfs: replayed 63 transactions in 1
| seconds
| Jul 29 13:17:26 localhost kernel: Using r5 hash to sort names
| Jul 29 13:17:26 localhost kernel: ReiserFS version 3.6.25
| Jul 29 13:17:32 localhost modprobe: modprobe: Can't locate module
| block-major-43
| Jul 29 13:17:35 localhost kernel: hydra uses obsolete (PF_INET,SOCK_PACKET)
| Jul 29 13:17:35 localhost modprobe: modprobe: Can't locate module
| block-major-43
| Jul 29 13:17:35 localhost last message repeated 31 times
| Jul 29 13:17:35 localhost nfs: Starting NFS services:  succeeded
| Jul 29 13:17:35 localhost nfs: rpc.nfsd startup succeeded
| Jul 29 13:17:36 localhost nfs: rpc.mountd startup succeeded
| Jul 29 13:17:52 localhost modprobe: modprobe: Can't locate module
| block-major-43
| Jul 29 13:17:52 localhost last message repeated 31 times
| Jul 29 13:17:56 localhost kernel: SGI XFS
| snapshot-xfs-2.4.20-2003-04-07_05:19_UTC with ACLs, no debug enabled
| Jul 29 13:17:56 localhost kernel: SGI XFS Quota Management subsystem
| Jul 29 13:17:56 localhost kernel: XFS mounting filesystem lvm(58,1)
| Jul 29 13:35:25 localhost modprobe: modprobe: Can't locate module
| block-major-43
| Jul 29 13:36:21 localhost last message repeated 192 times
| Jul 29 13:36:23 localhost last message repeated 127 times
| Jul 29 13:36:29 localhost kernel: XFS mounting filesystem lvm(58,1)
| Jul 29 13:36:30 localhost kernel: XFS quotacheck lvm(58,1): Please wait.
| Jul 29 13:36:32 localhost kernel: XFS quotacheck lvm(58,1): Done.
| Jul 29 13:43:47 localhost kernel: e1000: eth2 NIC Link is Down
| Jul 29 13:43:49 localhost kernel: e1000: eth2 NIC Link is Up 100 Mbps Full
| Duplex
| Jul 29 13:44:20 localhost kernel: XFS mounting filesystem lvm(58,0)
| Jul 29 13:44:20 localhost kernel: Filesystem "lvm(58,0)": XFS internal error
| xlog_clear_stale_blocks(2) at line 1135 of file xfs_log_recover.c.  Caller
| 0xf8b27f8a
| Jul 29 13:44:20 localhost kernel: eb7e3bf0 f8b13775 f8b137ef 00000008
| 00000000 00000001 f219b000 f8b5b6e0
| Jul 29 13:44:20 localhost kernel:        f8b5a95e 0000046f f8b5a8b2 f8b27f8a
| 00000007 00002400 00001200 f8b286ea
| Jul 29 13:44:20 localhost kernel:        f8b5a95e 00000001 f219b000 f8b5a8b2
| 0000046f f8b27f8a 00000008 00000007
| Jul 29 13:44:20 localhost kernel: Call Trace:
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+2805589/92572491]
| xfs_stack_trace+0x5/0x10 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b13775>] xfs_stack_trace+0x5/0x10
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+2805711/92572369]
| xfs_error_report+0x6f/0xb0 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b137ef>] xfs_error_report+0x6f/0xb0
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+3100352/92277728]
| .rodata.str1.32+0x240/0x2c00 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b5b6e0>] .rodata.str1.32+0x240/0x2c00
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+3096894/92281186]
| .rodata.str1.1+0x83a/0x137c [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b5a95e>] .rodata.str1.1+0x83a/0x137c
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+3096722/92281358]
| .rodata.str1.1+0x78e/0x137c [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b5a8b2>] .rodata.str1.1+0x78e/0x137c
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+2889578/92488502]
| xlog_find_tail+0x27a/0x440 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b27f8a>] xlog_find_tail+0x27a/0x440
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+2891466/92486614]
| xlog_clear_stale_blocks+0x14a/0x1a0 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b286ea>]
| xlog_clear_stale_blocks+0x14a/0x1a0 [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+3096894/92281186]
| .rodata.str1.1+0x83a/0x137c [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b5a95e>] .rodata.str1.1+0x83a/0x137c
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+3096722/92281358]
| .rodata.str1.1+0x78e/0x137c [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b5a8b2>] .rodata.str1.1+0x78e/0x137c
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+2889578/92488502]
| xlog_find_tail+0x27a/0x440 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b27f8a>] xlog_find_tail+0x27a/0x440
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+2889578/92488502]
| xlog_find_tail+0x27a/0x440 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b27f8a>] xlog_find_tail+0x27a/0x440
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+2906167/92471913]
| xlog_recover+0x37/0x100 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b2c057>] xlog_recover+0x37/0x100
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+2869341/92508739]
| xfs_log_mount+0x8d/0xf0 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b2307d>] xfs_log_mount+0x8d/0xf0
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+2912227/92465853]
| xfs_mountfs+0x503/0xf20 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b2d803>] xfs_mountfs+0x503/0xf20
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+2909972/92468108]
| xfs_readsb+0x134/0x1f0 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b2cf34>] xfs_readsb+0x134/0x1f0
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+2858466/92519614]
| xfs_ioinit+0x42/0x50 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b20602>] xfs_ioinit+0x42/0x50 [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+2947982/92430098]
| xfs_mount+0x2ce/0x400 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b363ae>] xfs_mount+0x2ce/0x400
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+3043747/92334333]
| vfs_mount+0x43/0x50 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b4d9c3>] vfs_mount+0x43/0x50 [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+3075484/92302596]
| xfs_qm_mount+0x4c/0x70 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b555bc>] xfs_qm_mount+0x4c/0x70
| [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+3043747/92334333]
| vfs_mount+0x43/0x50 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b4d9c3>] vfs_mount+0x43/0x50 [xfs]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+3042987/92335093]
| linvfs_read_super+0x9b/0x1c0 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b4d6cb>] linvfs_read_super+0x9b/0x1c0
| [xfs]
| Jul 29 13:44:20 localhost kernel:  [kmalloc+75/96] kmalloc+0x4b/0x60
| [kernel]
| Jul 29 13:44:20 localhost kernel:  [<c013ae5b>] kmalloc+0x4b/0x60 [kernel]
| Jul 29 13:44:20 localhost kernel:  [alloc_super+58/432]
| alloc_super+0x3a/0x1b0 [kernel]
| Jul 29 13:44:20 localhost kernel:  [<c014bbfa>] alloc_super+0x3a/0x1b0
| [kernel]
| Jul 29 13:44:20 localhost kernel:  [insert_super+100/128]
| insert_super+0x64/0x80 [kernel]
| Jul 29 13:44:20 localhost kernel:  [<c014bee4>] insert_super+0x64/0x80
| [kernel]
| Jul 29 13:44:20 localhost kernel:  [get_sb_bdev+446/752]
| get_sb_bdev+0x1be/0x2f0 [kernel]
| Jul 29 13:44:20 localhost kernel:  [<c014c8ee>] get_sb_bdev+0x1be/0x2f0
| [kernel]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+3160236/92217844]
| xfs_fs_type+0x0/0x34 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b6a0cc>] xfs_fs_type+0x0/0x34 [xfs]
| Jul 29 13:44:20 localhost kernel:  [do_kern_mount+289/320]
| do_kern_mount+0x121/0x140 [kernel]
| Jul 29 13:44:20 localhost kernel:  [<c014ccf1>] do_kern_mount+0x121/0x140
| [kernel]
| Jul 29 13:44:20 localhost kernel:
| [qla2300:__insmod_qla2300_S.bss_L22432+3160236/92217844]
| xfs_fs_type+0x0/0x34 [xfs]
| Jul 29 13:44:20 localhost kernel:  [<f8b6a0cc>] xfs_fs_type+0x0/0x34 [xfs]
| Jul 29 13:44:20 localhost kernel:  [do_add_mount+147/400]
| do_add_mount+0x93/0x190 [kernel]
| Jul 29 13:44:20 localhost kernel:  [<c0163993>] do_add_mount+0x93/0x190
| [kernel]
| Jul 29 13:44:20 localhost kernel:  [do_mount+352/432] do_mount+0x160/0x1b0
| [kernel]
| Jul 29 13:44:20 localhost kernel:  [<c0163cc0>] do_mount+0x160/0x1b0
| [kernel]
| Jul 29 13:44:20 localhost kernel:  [copy_mount_options+121/208]
| copy_mount_options+0x79/0xd0 [kernel]
| Jul 29 13:44:20 localhost kernel:  [<c0163b09>] copy_mount_options+0x79/0xd0
| [kernel]
| Jul 29 13:44:20 localhost kernel:  [sys_mount+215/352] sys_mount+0xd7/0x160
| [kernel]
| Jul 29 13:44:20 localhost kernel:  [<c01640f7>] sys_mount+0xd7/0x160
| [kernel]
| Jul 29 13:44:20 localhost kernel:  [system_call+51/56] system_call+0x33/0x38
| [kernel]
| Jul 29 13:44:20 localhost kernel:  [<c01094ef>] system_call+0x33/0x38
| [kernel]
| Jul 29 13:44:20 localhost kernel:
| Jul 29 13:44:20 localhost kernel: XFS: failed to locate log tail
| Jul 29 13:44:20 localhost kernel: XFS: log mount/recovery failed
| Jul 29 13:44:20 localhost kernel: XFS: log mount failed
| Jul 29 13:44:31 localhost kernel: XFS mounting filesystem lvm(58,0)
|
|
|
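
When a dmesg capture runs to hundreds of lines, it can help to pull out just
the XFS internal error report. Below is a minimal sketch in Python, assuming
syslog-style lines that have been rejoined to one entry per line. The
function name `xfs_errors` and the regular expression are illustrative and
based only on the single error message quoted in this thread; they are not
part of any XFS tool.

```python
import re

# Pattern for "XFS internal error <func> at line <n> of file <file>.c",
# derived from the one sample message in the quoted log (an assumption).
INTERNAL_ERROR = re.compile(
    r"XFS internal error (?P<func>\S+) at line (?P<line>\d+) "
    r"of file (?P<file>\S+\.c)"
)


def xfs_errors(lines):
    """Yield (function, line_number, source_file) for each internal error."""
    for text in lines:
        match = INTERNAL_ERROR.search(text)
        if match:
            yield match.group("func"), int(match.group("line")), match.group("file")


# Sample entries taken from the quoted log, rejoined onto single lines.
sample = [
    'Jul 29 13:44:20 localhost kernel: Filesystem "lvm(58,0)": XFS internal '
    "error xlog_clear_stale_blocks(2) at line 1135 of file "
    "xfs_log_recover.c.  Caller 0xf8b27f8a",
    "Jul 29 13:44:20 localhost kernel: XFS: failed to locate log tail",
]

if __name__ == "__main__":
    for func, line_no, source in xfs_errors(sample):
        print(func, line_no, source)
```

For the log in this thread, that points at the log recovery code
(xfs_log_recover.c), which fits the "failed to locate log tail" message that
follows the trace.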

