xfs
[Top] [All Lists]

Re: More raid5 issues: Kernel panic running xfs_db (fwd)

To: "Lars Kellogg-Stedman" <lars@xxxxxxxxxxxxx>
Subject: Re: More raid5 issues: Kernel panic running xfs_db (fwd)
From: "Peter T. Breuer" <ptb@xxxxxxxxxx>
Date: Wed, 7 Mar 2001 11:28:56 +0100 (MET)
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: <Pine.LNX.4.30.0103062154450.1276-100000@xxxxxxxxxxxxxxxxxxxxxxxxxxx> from "Lars Kellogg-Stedman" at "Mar 6, 2001 09:55:10 pm"
Reply-to: ptb@xxxxxxxxxx
Sender: owner-linux-xfs@xxxxxxxxxxx
"A month of sundays ago Lars Kellogg-Stedman wrote:"
> > Are you using Russell/Martin's md patch?
> 
> Possibly.  I'm using a patch Scott (Smyth) sent me, which I presume
> incorporates this work -- the patch fixed the problem with the raid5 resync
> process wedging when the volume was mounted.

I think scott also gave me this patch.  I applied it over the Jan xfs
patch for 2.4.0.  It's a nice patch but the introduction of the
BLKSETSIZE ioctl in md.c causes instant lockups for me when mounting or
making an xfs over raid1.  Commenting those three lines fixes things for
me.  I surmise that parts of the code somewhere are not ready to deal
with a nonstandard blksize, so its better if nobody succeeds in setting
one.

The rest of the patch deals with fixing a purported bug in remove_from_lru
in buffer.c (there wasn't a bug, and the patch doesn't fix it, but it
shouldn't do any real harm either - the correct approach is to remove
the guards looking for null pointers since the list is circular and
doesn't ever have null pointers), and in raid5 it converts 
raid5_sync_request to using sector counts instead of block counts.
That's fine, but in one place it makes an additional change to sector
counts in a call to md_done_sync that looks unwarranted to me (in my xfs
snapshot) since md_done_sync uses blocks, not sectors:

-               md_done_sync(conf->mddev, (sh->size>>10) - sh->sync_redone,1);
+               md_done_sync(conf->mddev, (sh->size>>9) - sh->sync_redone, 1);

I'm not using raid5 so I can't comment, but that bit seemed offbase
to me when I read the patch.

  void md_done_sync(mddev_t *mddev, int blocks, int ok)
         // stuff that subtracts the blocks count from the 
         // to-be-recovered list

The patch does some other nice things. I didn't know its intention
was to unblock background resync ... I'll have a look to see where that
effect comes from.


Peter

<Prev in Thread] Current Thread [Next in Thread>