At 11:53 29-7-2001 -0500, Chris Bednar wrote:
>Hi. I still haven't determined whether this is an XFS problem or a
>raid5 problem, but here's a situation I see if raid5 is trying to
>resync an XFS volume while people are thrashing it:
>
>Jul 26 14:55:38 gigem101 kernel: raid5: in raid5_sync_request,
>bufsize=512 redone=6 rtrn=-5
>Jul 26 14:55:38 gigem101 kernel: md: sync_request returned sectors=-5
>(j=296588438) ... exiting
>Jul 26 14:55:38 gigem101 kernel: raid5: resync aborted (err=-5)!
>
>I'm running a 2.4.5-SGI_XFS_1.0.1 kernel, with md patched up to print
>errors (patch at end). The default behaviour was just to print
>`raid5: resync aborted' with no other indication (which definitely
>needs to be fixed in the md driver anyway). As you can see, the
>chunk-size on the volume in question is 512k. We have another one
>(identical hw, 128k chunk) that synced without trouble, since I kept
>users off it. This has only happened during periods of heavy
>read/write activity.

There are some slight problems with XFS over software raid5 that have
just been fixed in the CVS tree. There were also I/O stall problems
with this setup when you have an internal log. Basically you also want
to move the log to an external log device, which gives a large
performance boost. Search the archive for the discussion about this;
it came up about a week ago, with benchmarks to back it up.
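
As a rough sketch of what that looks like (the device names here are
just examples, not your actual layout): make the filesystem with an
external log and mount it with a matching logdev option, along the
lines of

    # /dev/md0 is the raid5 array; /dev/sde1 is a small partition on
    # a separate spindle to hold the log (example devices only)
    mkfs.xfs -l logdev=/dev/sde1 /dev/md0
    mount -t xfs -o logdev=/dev/sde1 /dev/md0 /mnt/data

The idea is that the synchronous log writes stop competing with the
data and parity traffic on the array itself.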

>I'm starting this on the XFS list only, since I know there have been
>issues with chunk sizes >256k elsewhere. It doesn't look to me like
>this kernel has the I/O optimization problem that's been discussed of
>late, since that seems to be turned off for ALL md devices here.
>
>The basic setup is 8 u160 SCSI 180GB Seagate disks on an Adaptec
>29160 controller, 512k chunk, left-symmetric parity, 1.2 TB XFS
>filesystem.

Can't comment on that.
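For reference, though, a raidtab for an array laid out like yours
would look something like this (the disk names are guesses on my
part):

    raiddev /dev/md0
            raid-level              5
            nr-raid-disks           8
            nr-spare-disks          0
            persistent-superblock   1
            chunk-size              512
            parity-algorithm        left-symmetric
            device                  /dev/sda1
            raid-disk               0
            device                  /dev/sdb1
            raid-disk               1
            # ...and so on for the remaining six disks

chunk-size there is in kilobytes, so 512 gives the 512k chunks you
describe.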

--
Seth

Every program has two purposes: one for which
it was written and another for which it wasn't.
I use the last kind.