On 3/3/13 4:19 PM, Richard Weinberger wrote:
> On Sun, 03 Mar 2013 16:04:48 -0600,
> Eric Sandeen <sandeen@xxxxxxxxxxx> wrote:
>>> Using xfstests I was able to trigger dlm issues in ocfs2.
>>> I ran xfstests on one node and other nodes had it mounted too.
>> Just for my own education, how does that happen?
>> Were you testing on filesystems already configured into a cluster,
>> or did the cluster somehow pick up your newly-defined test
>> filesystems and mount them?
> The cluster is already configured. But a single node can mount/unmount
> the fs as it wants.
I understand that
a) your cluster is already configured, and
b) other nodes can mount cluster filesystems.
Sure, but - how *did* other nodes mount your xfstest filesystems?
Or did you configure xfstests to use something already configured
to be mounted on multiple nodes?
Perhaps another related question - did the fs *need* to be mounted
on other nodes to expose the problems you found?
I'm just trying to understand if this is a common case, or unique to
how you have configured things.
> I know, xfstests is not a perfect test case for ocfs2 but it allowed me
> to trigger issues...
> These issues can also be triggered without xfstests. But in my case I
> found them using xfstests.
I understand, I'm not suggesting that you not run xfstests; I'm sure
it is useful. It's supposed to be. :) We just need to keep it
useful & not disable the consistency checks unless it's necessary.
>> How does fsck.ocfs2 behave when you run it on one node, when the
>> fs is mounted on others? Will it proceed w/ no knowledge of the
>> fact that it's mounted elsewhere?
> It refuses to check the fs and exits with an error code != 0.
Ok, then that confuses me a little, because earlier you said:
> To ensure that fsck.ocfs2 will not corrupt the filesystem
but just now you said it won't run at all? Anyway...
> From the manpage:
> -F By default fsck.ocfs2 will check with the cluster
> services to ensure that the volume is not in-use (mounted) on any node
> in the cluster before proceeding. -F skips this check and should only
> be used when it can be guaranteed that the volume is not mounted on any
> node in the cluster. WARNING: If the cluster check is disabled and the
> volume is mounted on one or more nodes, file system corruption is very
> likely. If unsure, do not use this option.
Ok, but xfstests wasn't *using* -F was it?
Anyway, what if you did something more along the lines of [pseudocode]
if mounted.ocfs2 -f $TEST_DEV | frob_as_necessary
so that *if* it's mounted on some other node, the fsck won't run.
That has downsides as Dave mentioned, but for the case where the
xfstests node is the only one with it in use, it'll still do the
beneficial consistency check.
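
To make the pseudocode above a little more concrete, here is a rough
sketch of the decision step only. Everything named here is an assumption
for illustration: only_local_mount is a made-up helper, and it expects
one node name per line on stdin. How to get that list out of
"mounted.ocfs2 -f" (the frob_as_necessary part) is deliberately left
open, since the output format should be checked against the installed
ocfs2-tools.

```shell
#!/bin/sh
# only_local_mount LOCAL_NODE   (hypothetical helper, not part of xfstests)
#
# Reads node names on stdin, one per line, and succeeds (returns 0) only
# if no node other than LOCAL_NODE is listed. An empty list counts as
# "not mounted anywhere", so the check may run in that case as well.
only_local_mount() {
    local_node=$1
    # Drop blank lines, then look for any name that is not exactly ours.
    if sed '/^[[:space:]]*$/d' | grep -v -x -F -q -e "$local_node"; then
        return 1    # some other node has the filesystem mounted
    fi
    return 0        # unmounted everywhere, or mounted only here
}

# Usage sketch. list_mounting_nodes is a hypothetical stand-in for
# "mounted.ocfs2 -f $TEST_DEV | frob_as_necessary", and using hostname
# as the cluster node name is an assumption as well:
#
#   if list_mounting_nodes "$TEST_DEV" | only_local_mount "$(hostname)"; then
#       : run the usual consistency check here
#   fi
```

The point of splitting it this way is that the skip/run decision stays
trivial and testable, and only the parsing step is ocfs2-specific.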
Just tweaking the fsck action based on *if* it's mounted (or,
maybe, if the node is in a cluster?) might be a more generic solution
that is widely applicable to all ocfs2 test environments.
I know next to nothing about ocfs2, but presumably one can detect
if the device in question is configured into, or mounted in, a cluster?