Just to clarify:
The resync _never_ finishes after we kill it by doing a mkfs.xfs on a
logical volume on a RAID 5. If we let the sync finish before calling
mkfs.xfs, there is no problem.
With the resync daemon hung we can continue to use the RAID normally,
although the sync never progresses. A reboot never completes because it
can't get rid of raid5syncd, which is in uninterruptible sleep.
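
For anyone who wants to script the "let the sync finish first" workaround, here is a minimal sketch in Python. It is not something we actually ran. It assumes the progress line in /proc/mdstat looks roughly like "[=>....] resync = 5.2% (468416/8960192) finish=12.3min" and that the array shows up under a name such as md0; the function names resync_progress and wait_for_resync are made up for illustration.

```python
import re
import time

# Assumed shape of a progress line in /proc/mdstat, for example:
#   [=>...................]  resync =  5.2% (468416/8960192) finish=12.3min
PROGRESS_RE = re.compile(r"\b(resync|recovery)\s*=\s*([0-9.]+)%")


def resync_progress(mdstat_text, device):
    """Return the resync percentage reported for `device`, or None if
    no resync/recovery line is present in that device's section."""
    in_device = False
    for line in mdstat_text.splitlines():
        if line and not line[0].isspace():
            # A non-indented line starts a new section ("md0 : active raid5 ...").
            in_device = line.split()[0] == device
            continue
        if in_device:
            match = PROGRESS_RE.search(line)
            if match:
                return float(match.group(2))
    return None


def wait_for_resync(device, path="/proc/mdstat", poll_seconds=30):
    """Block until /proc/mdstat no longer reports a resync for `device`."""
    while True:
        with open(path) as f:
            progress = resync_progress(f.read(), device)
        if progress is None:
            return
        time.sleep(poll_seconds)
```

Calling wait_for_resync("md0") before mkfs.xfs would be the intended use. Note that with the hung raid5syncd described above the percentage never changes, so this loop would wait forever; it only helps when the resync is left alone from the start.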
FYI, here's my processor info:
----------
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Celeron (Coppermine)
stepping : 3
cpu MHz : 564.849
cache size : 128 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 1127.22
--------------------
> -----Original Message-----
> From: Scott Smyth [mailto:SSmyth@xxxxxxxxxx]
> Sent: Friday, January 05, 2001 10:13 AM
> To: 'Rajagopal Ananthanarayanan '; Scott Smyth
> Cc: 'linux-xfs@xxxxxxxxxxx'; Dale Stephenson; 'neilb@xxxxxxxxxxxxxxx'
> Subject: RE: LVM w/ XFS: 1) raid 5 issues (sync daemon) & 2) XFS in kernel w/ LVM
>
>
> Hi;
>
> The problem is that the resync _never_ finishes. I mean
> never. Or are you saying you took the sse code for the
> MMX out of something besides the xor.h?
>
> With our one patch, we can get it started but the
> resync never goes forward.
>
> thanks, Scott
>
> -----Original Message-----
> From: Rajagopal Ananthanarayanan
> To: Scott Smyth
> Cc: 'linux-xfs@xxxxxxxxxxx '; Dale Stephenson; neilb@xxxxxxxxxxxxxxx
> Sent: 1/5/01 10:06 AM
> Subject: Re: LVM w/ XFS: 1) raid 5 issues (sync daemon) & 2) XFS in kernel w/ LVM
>
> Scott Smyth wrote:
> >
> > Hi;
> >
> > Now that LVM is in the SGI tree, we would like to
> > find out if anybody is seeing similar issues
> > around RAID 5 software support and the complaints
> > of mount with XFS in the kernel and trying to mount
> > LVM volumes:
> >
> > LVM 0.9 (obviously)
> > 2.4.0-test13-pre3 and 2.4.0-prerelease
> >
> > First, there are two problems we seem to have with
> > RAID 5 with LVM on top and XFS as the file system.
> > The first is that the MMX optimization in xor.h
> > fails and keeps the xor module from even loading
> > properly (see attached patch to avoid this). Having
> > gotten around that, we then have a problem with
> > the resync daemon. We get a kernel oops (we have not
> > had time to use kdb on it yet but will) on
> > starting the raid, but you can still proceed and
> > use it. However, it never resyncs nor makes any
> > progress to resync according to /proc/mdstat. Then,
> > at shutdown of the raidset (or attempted), the
> > raid stop hangs. We will give more info as we debug,
> > but it is 100% reproducible.
> >
> [ ... ]
>
> Hi,
>
> We have seen similar issues both w.r.t. the sse optimizations
> and w.r.t. re-syncing. The sse issue surfaced when the raid
> code got re-worked to factor out some of the architecture/processor
> specific optimizations. I don't know if accessing the device
> while resyncing is intended to work. One workaround for the
> resync problem is to wait for the raid to finish resyncing;
> the status, as you noted, is reported in /proc/mdstat.
>
> In our test box, the processor is:
>
> ----------
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 7
> model name : Pentium III (Katmai)
> stepping : 2
> cpu MHz : 500.141
> cache size : 512 KB
> fdiv_bug : no
> hlt_bug : no
> f00f_bug : no
> coma_bug : no
> fpu : yes
> fpu_exception : yes
> cpuid level : 2
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
> bogomips : 996.15
> --------------------
>
> I'll cc: Neil Brown who seems to be actively maintaining
> the RAID subsystem.
>
> thanks,
>
> --------------------------------------------------------------------------
> Rajagopal Ananthanarayanan ("ananth")
> Member Technical Staff, SGI.
> --------------------------------------------------------------------------
>