bsuparna@xxxxxxxxxx wrote:
>
> Matt,
>
> Responding to two of your notes (about the last lkcd code drop) in one
> shot.
>
> >I'm still planning to roll a 4.0 release as soon as I talk to
> >the IBM folks about the last code drop I gave them.
>
> We've tried this code out only on UP, and have just started trying it on an
> SMP system. We had a few initial hiccups with dump configuration with the
> original scripts, but using test.c and modifying the device number as you
> suggested did the trick. As you've mentioned below, new scripts and a new
> dumpconfig utility would be required for the 4.0 release.
The dump configuration utility is checked in. It's in lkcdutils/lkcd_config.
All the appropriate scripts/spec files have changed to use it. There's
even a manual page, if you can believe that.
> Before moving to SMP, we decided to first merge in our changes to enable
> system continuation after a dump, by making the other CPUs spin for the
> duration of the dump and then release them, rather than making them stop.
> (We are now using dprobes to trigger the dump from a probe point to test
> our changes.)
How's this working? I'd like to get this into the tree if at all possible
so we can get rid of the current "stop" method and get rid of the SMP bugs.
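To make the spin-then-release idea concrete, here is a rough user-space
sketch. It is not the LKCD code: threads stand in for the other CPUs, and
every name in it (dump_in_progress, parked, resumed, simulate_dump) is made
up for illustration. The "dumping CPU" raises a flag, waits until all the
others are spinning on it, does its work, then clears the flag so they
carry on instead of staying stopped.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NWORKERS 3  /* stand-ins for the non-dumping CPUs (illustrative) */

static atomic_int dump_in_progress;  /* 1 while the simulated dump runs */
static atomic_int parked;            /* workers that have reached the spin loop */
static atomic_int resumed;           /* workers that continued after release */

/* Each worker spins while the dump is in progress, then carries on. */
static void *worker(void *arg)
{
    (void)arg;
    atomic_fetch_add(&parked, 1);
    while (atomic_load(&dump_in_progress))
        sched_yield();  /* user-space courtesy; a kernel would busy-wait */
    atomic_fetch_add(&resumed, 1);
    return NULL;
}

/* Run one simulated dump; returns how many workers resumed afterwards. */
static int simulate_dump(void)
{
    pthread_t tid[NWORKERS];
    int i, rc;

    atomic_store(&parked, 0);
    atomic_store(&resumed, 0);
    atomic_store(&dump_in_progress, 1);

    for (i = 0; i < NWORKERS; i++) {
        rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }

    /* Wait until every worker is parked before touching "memory". */
    while (atomic_load(&parked) < NWORKERS)
        sched_yield();

    /* ... the dump itself would be written out here ... */

    atomic_store(&dump_in_progress, 0);  /* release the spinners */

    for (i = 0; i < NWORKERS; i++)
        pthread_join(tid[i], NULL);

    return atomic_load(&resumed);
}
```

Calling simulate_dump() should return NWORKERS, i.e. every "CPU" comes
back after the dump. The real thing has to worry about interrupts and the
races Suparna mentions, which this sketch ignores entirely.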
> I'm hoping that we can include this in the 4.0 release together with the SMP
> problem fixes that you are working on. When are you planning the release?
I'm ready to release it now, believe it or not. I don't have to release
the dump_gzip.c code just yet, as I'm still improving it, but at least
everything will work with the new methodology.
> The next thing that we are trying to implement is to get non-disruptive
> dumps to work from any context, including interrupt context, based on some
> of the ideas we'd discussed earlier. We are attempting to get this to work
> with the current basic dump i/o model for the non-disruptive dumps case.
> (We may need to relook at it later once the dump driver interface is in
> place, though only for devices that implement/register such an interface)
> Will discuss this in more detail after we've tried out a few things ...
Okay.
Hey, I was thinking. Right now, we open up /dev/dump (227,0) to do our
ioctl()s against. If we make that our major number by default, we could
have multiple dump instantiations in the kernel by working against the
minor number. Would that work for you, David, or how were you planning
to do this?
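As a sketch of what I mean by working against the minor number (an
assumption about how it could look, not code that exists in the tree):
keep 227 as the major and let the minor pick the dump instance. The
helper names below are hypothetical; makedev(), major() and minor() are
the usual glibc macros from <sys/sysmacros.h>.

```c
#include <sys/types.h>
#include <sys/sysmacros.h>  /* makedev(), major(), minor() on Linux/glibc */

#define DUMP_MAJOR 227  /* major number of /dev/dump, as mentioned above */

/* Device number for the Nth dump instance (hypothetical numbering scheme). */
static dev_t dump_dev_for_instance(unsigned int instance)
{
    return makedev(DUMP_MAJOR, instance);
}

/* Instance index encoded in a device number, or -1 if not a dump device. */
static int dump_instance_of(dev_t dev)
{
    if (major(dev) != DUMP_MAJOR)
        return -1;
    return (int)minor(dev);
}
```

With this scheme the existing /dev/dump (227,0) is simply instance 0, and
a second dump configuration would sit behind (227,1).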
> >For those who are working directly in the tree, you'll note we're
> >now moving from 'vmdump' to 'dump' conventions, and hopefully all
> >the future scripts will use this as well.
>
> BTW, I did try directly accessing the CVS tree, which works.
Great.
> >Also, I spoke to someone at MCL, and we'll see how we can roll in
> >mcore into the LKCD project in some capacity.
>
> That's good news! We wanted to check with you on this. Do we now have a
> contact at MCL whom we can work with to do this, so that we have a fallback
> standalone dump feature?
I believe so. I've just started communicating with Mike Keefe. He's
sent me a patch (among other things), and I'm in the process of reviewing
it and seeing how we can integrate it, and then mcore.
> >The latest code is in the SourceForge tree ... look in
> >2.4/drivers/block/dump.c,
> >and you'll see the restructuring changes. 'lcrash' has also
> >changed a bit. I copied the LKCD group on my last check-in.
> >If you didn't get a copy of it, let me know. It touched a bunch
> >of files.
> >I have to check in new scripts and a new dumpconfig utility next
> >(and fix this bloody SMP problem now that I actually have an SMP
> >system again to test against).
>
> Do let us know how this goes. We had to give some thought to a few of the
> SMP issues for the non-disruptive case (not that we're sure if we've got it
> right or thought of all subtle race possibilities!), so it would be
> interesting to discuss this more (I remember you mentioned fixing the CPU 0
> special cases when we talked last).
I've checked in almost everything you can imagine now:
- lkcd_config
- new /sbin/lkcd (instead of /sbin/vmdump)
- modifications to rc.sysinit scripts
- manual page modifications for lcrash/lkcd_config
- updated spec file to build new lkcdutils-4.0
- all 2.4 code is checked in, all header mods done
The _only_ things left to fix on my plate are:
- SMP issue (not always dumping)
- gzip dump compression changes (kernel/lcrash)
After those two are done, then we talk about multiple dump
devices, multiple dump methods, integrating all your non-disruptive
dumping code, new kdb/kgdb/dprobes hooks, and adding in the
dump() functionality to the block_device_operations structure, and
then finishing up an IDE dump function. Should be fun!
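For the dump() idea, a simplified stand-alone sketch of the shape I have
in mind. This is not the kernel's block_device_operations and nothing here
is final: struct blk_ops_sketch, counting_dump() and dump_write() are all
hypothetical names. The point is only that a driver which registers a
dump() callback gets used directly, and one that doesn't falls back to
the existing dump I/O path.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a block driver's operations table. */
struct blk_ops_sketch {
    const char *name;
    /* Optional: write len bytes of dump data; returns 0 on success. */
    int (*dump)(void *priv, const void *buf, size_t len);
};

/* Hypothetical driver callback that just counts the bytes it was given. */
static int counting_dump(void *priv, const void *buf, size_t len)
{
    (void)buf;
    *(size_t *)priv += len;
    return 0;
}

/* Use the driver's dump() if it registered one, else report unsupported. */
static int dump_write(const struct blk_ops_sketch *ops, void *priv,
                      const void *buf, size_t len)
{
    if (ops == NULL || ops->dump == NULL)
        return -1;  /* caller would fall back to the generic dump I/O path */
    return ops->dump(priv, buf, len);
}
```

An IDE dump function would then just be one more callback plugged into
that slot.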
> Regards
> Suparna
Thanks, Suparna. :) BTW, I'll be out on #lkcd late tonight
to discuss some of this. For those that are curious, we're
currently connecting to irc.kernel.org/#lkcd with IRC to talk
about this stuff pretty late in the evening.
--Matt