Re: xfs_force_shutdown called on hardware RAID5+0 XFS filesystem

To: linux-xfs@xxxxxxxxxxx
Subject: Re: xfs_force_shutdown called on hardware RAID5+0 XFS filesystem
From: slaton <slaton@xxxxxxxxxxxxxxxx>
Date: Thu, 16 Sep 2004 11:24:32 -0700 (PDT)
Cc: Seth Mos <seth.mos@xxxxxxxxx>
In-reply-to: <41496A92.50304@xs4all.nl>
References: <Pine.SOL.4.61.0409151613490.12964@conquest.OCF.Berkeley.EDU> <41496A92.50304@xs4all.nl>
Sender: linux-xfs-bounce@xxxxxxxxxxx
Thanks for the reply.

The hardware RAID is a two-month-old RAIDking 825R, which I believe is a 
rebranded Maxtronic Sivy unit. It has 16 SATA disks and a SCSI interface. 
The drive status LEDs are all green, indicating no detected failure (hmm), 
although I did have a drive fail about two weeks ago and did a rebuild. 
The SCSI host adapter is an Adaptec 29160.

Last night I tried to dump the 950GB of data from the raid1 LUN (2TB) to 
the raid2 LUN (1.7TB), which is empty. At about the 10% point, the copy 
triggered the same error and crash. But on reboot, xfs_check and 
xfs_repair still don't find anything wrong with the two volumes themselves.

The recurrence of the issue supports your view that this is a hardware 
problem.

Is it possible the Adaptec card is to blame here?

I also must admit to some paranoia about my 2TB filesystem size, although 
I did do the research and it seemed that this should be fine on 32-bit 
x86 hardware.

I have a second, identical hardware RAID box that has been unused up to 
now. I suppose I'll get it online and see if I can dump the data from the 
first to the second, although that will probably trigger the same thing 
again...

thanks
slaton

On Thu, 16 Sep 2004, Seth Mos wrote:

> slaton wrote:

> > We noticed that NFS mounts from the fileserver had gone stale this 
> > morning. These correspond to two hardware RAID LUNs (info below). I 
> > logged into the fileserver and found that the mountpoints were dead as 
> > well, even
> 
> Your hardware RAID threw an I/O error. This should _not_ happen.
> 
> You probably have an almost-broken disk. That is a hardware error, which 
> results in XFS shutting the filesystem down.
> 
> > Should I upgrade to a new kernel and XFS release before investigating 
> > this further? System info and some kernel log excerpts are below; the 
> > full kernel log (events related to this) can be downloaded from 
> > http://cryoem.berkeley.edu/~slaton/kernel.040915.scsicrash.gz
> 
> XFS is not at fault here, although a newer kernel might alleviate or at 
> least provide more info about the hardware problem.
> 
> I am curious as to what RAID controller you use.
> 
> Some RAID controllers from Adaptec have a tendency to get their panties 
> in a knot and die under heavy I/O (updatedb).
> 
> Cheers
> Seth

