
Re: Errors using amanda/xfsdump

To: Federico Sevilla III <jijo@xxxxxxxxxxxxxxx>
Subject: Re: Errors using amanda/xfsdump
From: Justin Tripp <justin@xxxxxxxxx>
Date: Tue, 8 May 2001 07:35:03 -0600 (MDT)
Cc: Linux XFS Mailing List <linux-xfs@xxxxxxxxxxx>
In-reply-to: <Pine.LNX.4.21.0105081734340.2456-100000@kalapati.jijo.local>
Sender: owner-linux-xfs@xxxxxxxxxxx
I do not believe this is the problem.  For the RAID 5 failure to occur,
one must run the RAID in degraded mode (i.e., with a failed disk).  My
RAID has never run in degraded mode, so according to them, it should not
be affected.  My own observations seem to indicate that there may be a
problem with XFS.

I had already been able to successfully back up the drives before I
exported them via NFS and started running NNTP spools across the NFS
exports.  It seems to be the way I am using the drives that causes the
failure.  If I start the NNTP daemons running across the file system
(examining articles to see which ones exist and how old they are) while
a backup is in progress, it seems to consistently fail.  The oops says
that it occurs in xfsdump, and the software trace seems to indicate it
was the XFS code that attempted to dereference a NULL pointer.  So I am
led to believe that the problem may lie in XFS.  I could be convinced to
re-arrange the array (since it currently has no real data anyway), but I
do not believe it will make a difference.
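
For the curious, the failing combination boils down to roughly the
following (just a sketch; the export, client, and tape device names
below are placeholders, not my actual setup):

    # export the XFS filesystem over NFS to the news machines
    exportfs -o rw newsclient:/export/news

    # with the NNTP daemons on the clients busily scanning the spool
    # (stat'ing and reading articles over NFS), kick off a level 0
    # xfsdump of the same filesystem, which is roughly what amanda
    # drives underneath:
    xfsdump -l 0 -L nightly -M tape0 -f /dev/nst0 /export/news

Run by itself, the dump completes fine; it is only with the NNTP load
going at the same time that the oops shows up.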


                                .justin.

------------------------------------------------------------------------
Justin Leonard Tripp                                   justin@xxxxxxxxxx
Configurable Computing Laboratory Research Assistant      CB 461 x8-7206
Electrical and Computer Engineering Department  Brigham Young University

On Tue, 8 May 2001, Federico Sevilla III wrote:

> Hi, this is a little late and not particularly XFS-specific but ...
>
> > The machine is a Dual 500 MHz PIII, and the filesystems run on top of
> > the 3ware IDE raid card with 4 46G disks running in RAID level 5.
> > (138G filesystem available...)  The XFS is the 2.4.3 version from
> > April 5th.
>
> 3Ware sent me an e-mail about a problem they found with their Escalade
> 6400 controllers, RAID 5, and ext2. Their e-mail did not have conclusive
> information on any other filesystem, but I'm thinking that XFS could be
> affected, too. A "dirty hack" seems to be to mount the ext2 partition with
> the sync option. Maybe XFS treats the disk in a similar manner, making XFS
> "immune"?
>
> Anyway, until a patch for their software comes out (expected somewhere
> around May 15), RAID 5 looks iffy. Or does it?
>
> Here is a copy of the e-mail I got for whatever it's worth. :)
>
>  --> Jijo
>
> ---------- Forwarded message ----------
> Date: Wed, 2 May 2001 09:24:15 -0700
> From: Mike Wentz <mike.wentz@xxxxxxxxx>
> Subject: 3ware Technical Bulletin concerning RAID 5
>
>              Important Technical Bulletin
>
>
> Dear customer,
>
> You are receiving this email because you are a registered
> owner of a 3ware Escalade 6400 Series Storage
> Switch, or you have opened a Tech Support case with 3ware
> on a 6400, 6410 or 6800 Series Escalade Storage Switch.
>
> 3ware has found a bug in our RAID 5 code that can cause file
> system errors resulting in possible loss of data. This
> problem has only been experienced on Linux operating systems
> running the default ext2 file system, however, other file
> systems may be affected.
>
> Products affected:           Escalade 6400, 6410 and 6800
> Software Versions affected:  6.5 and 6.6
> RAID level affected:         RAID 5
> Operating systems affected:    
>                   Known:     Linux with default ext2 file
>                              system
>                   Possible:  Win98/ME, WinNT, Win2000
>
> ************************************************************
>
> Symptoms:
> 1. 3ware BIOS VERIFY command reports VERIFY Failed.
> 2. Linux fsck -vf command returns inode and/or superblock
>    errors when run while the array is in degraded mode.
> 3. Windows chkdsk command returns errors on the MFT
>    (Master File Table).
>
> ************************************************************
>
> Problem description:
> 1. In Linux, the RAID 5 parity data can get corrupted during
>    writes.
> 2. On Microsoft operating systems, parity and/or user data
>    can get corrupted during writes.
>
> ************************************************************
>
> Is my data corrupted?
> There are two factors that determine whether or not your
> data is affected:
> 1. Whether or not the array has ever degraded
> 2. Which operating system you are using
>
> If the array has never degraded, then the data should be
> fine.
>
> If the array has degraded, then there is a potential that
> your data is affected. In most cases, the operating system
> will detect the errors and correct them. It is possible,
> however, that the operating system cannot detect and correct
> all the errors, in which case you will need to restore from
> backup.
>
> To date, this problem has only been seen on Linux operating
> systems running the ext2 default file system.
>
> 3ware has not yet found an instance where user data on a
> Microsoft operating system has been affected, however, it is
> theoretically possible.
>
> ************************************************************
> I'm using RAID 5, what should I do?
> Depending on your operating system and the state of the
> array, the following actions are recommended:
>
> Linux:
>
> If the RAID 5 has never degraded there are two choices:
> 1. Reconfigure the array to RAID 1 or RAID 10.
> 2. Mount the file system in synchronous mode. Synchronous
>    mode will prevent the parity data from being affected,
>    thereby eliminating the problem. Please note that there
>    is a significant performance penalty.
>  
>                   Command syntax:
> mount -t ext2 /dev/device_node /mountpoint -o sync
>
> If the array has degraded, run fsck to repair damaged data.
> 1. If the repair completes successfully, you may either
>    convert the array to RAID 1 or RAID 10, or run the file
>    system in synchronous mode.
> 2. If the repair is unsuccessful, you will need to restore
>    from backup.
>
> Microsoft NT, 98, ME or 2000:
>
> If the RAID 5 array has never degraded you should
> reconfigure the array to RAID 1 or RAID 10.
>
> If the array has degraded, run chkdsk /R to repair damaged
> data.
> 1. If the repair completes successfully, reconfigure the
>    array to RAID 1 or RAID 10.
> 2. If the repair is unsuccessful, you will need to restore
>    from backup.
>
> ************************************************************
>
> What happens if I stay on RAID 5?
> 1. 3ware does not recommend using RAID 5 on any Microsoft
>    operating system at this time.
> 2. If you are running Linux and wish to remain on RAID 5,
>    3ware recommends you mount the file system in synchronous
>    mode.
>
> ************************************************************
>
> When will a RAID 5 fix be available?
> 1. 3ware has identified the cause of the problem and
>    developed a fix. We are in the process of testing to
>    ensure that it completely corrects the problem. We
>    anticipate having the fix available on or before
>    May 15, 2001.
> 2. If you would like to register to be notified when the
>    fix is available please click here:   
>    http://www.3ware.com/support/contact3wareraid5.asp
>
> 3ware regrets any inconvenience this problem may cause you.
>
> Sincerely
> Michael L. Wentz
> Director, Customer Service (650.269.2977)
>
>
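
One more thought on the sync-mount workaround mentioned above: if the
3ware problem did turn out to matter here, the analogous thing for XFS
would presumably just be the generic sync mount option, something like
the following (the device and mount point are placeholders, and I have
not verified that this actually sidesteps the parity issue on XFS):

    mount -t xfs /dev/sda1 /export/news -o sync

Since my array has never degraded, though, I still doubt this is the
real culprit.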

