On Tue, Oct 08, 2013 at 09:21:13AM -0500, Rich Johnston wrote:
> On 10/07/2013 07:57 PM, Eric Sandeen wrote:
> >On 10/7/13 7:53 PM, Dave Chinner wrote:
> >>Two tests, please. move all the common parts into common/dump, and
> >>write them as two separate tests. That way we can easily track what
> >>test is failing just by looking at what harness test is failing...
> >I'm not quite convinced that it's 2 separate tests, TBH.
> >It's the same root cause; I guess there is a slightly different
> >outcome because if you hit the same root cause enough times,
> >you'll segfault.
> Multiple DMF offline files are successfully restored but the attrs
> are lost. I wanted to show/test that case.
> I agree with Eric that it is the same root cause but because it can
> occur with successful dumps and does not segfault, That's why the 2
Ok, the problem might be triggering the same root cause, but in the
case of unit tests that is usually irrelevant. That is, each
individual test should be independently tracked by the test harness
regardless of the bug it triggers.
And reading on #xfs, the problem isn't clearly understood yet, as
neither you nor Eric is sure exactly why there are differences in
behaviour between the different tests. e.g.:
[09/10/13 02:13] <rjohnston1> Ahh OK but my DMF test case had several
wholly-sparse (offline files) and the dump succeeded.
[09/10/13 02:18] <sandeen> tbh there is one thing I'm not clear on here, why a
1t sparse file behaves differently from a 1k sparse file
[09/10/13 02:18] <sandeen> that seems . . wrong
[09/10/13 02:19] <sandeen> but I guess it must just key on i_size, not blocks
[09/10/13 02:19] <sandeen> so anyway, maybe your dmf testcase had smaller file
[09/10/13 02:19] <sandeen> sorry, I have to run & get missed homework to my kid
@ school, bbiab. Grr.
[09/10/13 02:20] <rjohnston1> NP, yes they were smaller.
[09/10/13 02:22] <rjohnston1> 100 10MB files no segfault, just trashed attrs.
So, there's different behaviour dependent on file size, and that's
not understood yet. IOW, there's yet another test case that needs
to be exercised here to demonstrate the different failure cases
that are being seen.
And that brings it further into the realm of multiple tests, in
which case we might have:
- 320 - multistream with wholly sparse files
- 321 - multistream with small sparse files
- 322 - multistream with large sparse files
This is the point I'm trying to make - from a test harness
perspective, we don't really care what the bugs are that are being
triggered by the tests - what we are trying to do is get coverage of
different behaviours and test cases and track which ones fail. What
I see from the above would be:
- 320 = pass
- 321 = fail, corrupt attrs
- 322 = fail, SEGV
Different tests, different failure modes, easy to tell them apart.
> >That's the only difference in test #2.
> >(and the segfault isn't fixed AFAIK).
Exactly my point. With the fix you have made, we'll get:
- 320 = pass
- 321 = pass
- 322 = fail, SEGV
We can clearly state at this point that your patch has fixed an
attribute corruption problem because it makes a specific unit test
go from fail to pass. If we see that unit test fail on other distros,
we know exactly what patch is needed to fix it. And the same can be
said for when we find the root cause of the SEGV failure....
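
To make the tracking argument concrete, here is a rough sketch in
Python. It is not xfstests code: the dict layout and the
diff_results() name are made up for illustration, and the results are
just the pass/fail lists from this mail. It shows how per-test
results let a harness say what a change fixed:

```python
# Hypothetical per-test results, copied from the pass/fail lists in this mail.
# The dict layout and function name are illustrative only, not part of xfstests.
before = {"320": "pass", "321": "fail, corrupt attrs", "322": "fail, SEGV"}
after = {"320": "pass", "321": "pass", "322": "fail, SEGV"}


def diff_results(before, after):
    """Return (tests fixed by a change, tests still failing after it)."""
    # A test counts as fixed if it failed before and passes afterwards.
    fixed = sorted(t for t in before
                   if before[t] != "pass" and after.get(t) == "pass")
    # Anything not passing in the second run is still an open failure.
    still_failing = sorted(t for t, r in after.items() if r != "pass")
    return fixed, still_failing


fixed, still_failing = diff_results(before, after)
print("fixed by patch:", fixed)
print("still failing:", still_failing)
```

If 321 and 322 were folded into a single test, that test would read
"fail" both before and after the patch, and the attr fix would be
invisible to the harness.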