
Re: XFS filesystem claims to be mounted after a disconnect

To: Eric Sandeen <sandeen@xxxxxxxxxxx>, xfs@xxxxxxxxxxx
Subject: Re: XFS filesystem claims to be mounted after a disconnect
From: Martin Papik <mp6058@xxxxxxxxx>
Date: Fri, 02 May 2014 22:07:20 +0300
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <5363E65C.6010006@xxxxxxxxxxx>
References: <5363A1D8.2020402@xxxxxxxxx> <5363B4C9.4000900@xxxxxxxxxxx> <5363CB5E.3090008@xxxxxxxxx> <5363CD70.3000006@xxxxxxxxxxx> <5363DBD7.4060002@xxxxxxxxx> <5363E65C.6010006@xxxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0

> to be honest, I'm not certain; if it came back under the same
> device name, things may have continued.  I'm not sure.

Personally, I haven't seen it reconnect even once. I've seen disks
fail to appear until the old references are removed, or even
partitions not being detected until everything is cleaned up. The only
reconnects I've seen were on SW RAID, and only when everything was
just right.

> In general, filesystems are not very happy with storage being
> yanked out from under them.

Yup, I know that, although with RAID 1, 5 or 6 some yanking is
possible. Still, I wish it were possible in general, even if only
manually and at my own risk.

> Well, I did say that it was the simplest thing.  Not the best or 
> most informative thing.  :)

I know. I'm just philosophically opposed to rebooting: every time I'm
forced to reboot a system I have a nagging feeling that I don't really
know what the problem is or how to fix it. Having to reboot makes me
feel stupid, so I prefer fixing things.

> Somewhere in the vfs, the filesystem was still present in a way
> that the ustat syscall reported that it was mounted. xfs_repair
> uses this syscall to determine mounted state.  It called sys_ustat,
> got an answer of "it's mounted" and refused to continue.
> It refused to continue because running xfs_repair on a mounted
> filesystem would lead to severe damage.

I understand that, and I'm okay with whatever I need to do in order to
restore the FS after the failure, but it would be good to have xfs
report the status correctly, i.e. show up in /proc/mounts UNTIL all
resources are released. What do you think?
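
To make the /proc/mounts side of this concrete, here is a minimal
sketch of a check that only looks at /proc/mounts. The function names
are my own, purely for illustration, and this is not what xfs_repair
does (it asks ustat, as described above). The point is that a check
like this can say "not mounted" while ustat still says "mounted",
which is the mismatch I'd like to see go away.

```python
def device_in_mounts(device, mounts_text):
    """Return True if `device` is the source (first) field of any line
    in /proc/mounts-style text. Hypothetical helper for illustration."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # First field is the mount source, e.g. /dev/sdb1.
        if fields and fields[0] == device:
            return True
    return False


def read_proc_mounts(path="/proc/mounts"):
    """Read the kernel's view of current mounts (Linux only)."""
    with open(path) as f:
        return f.read()
```

Usage would be something like
device_in_mounts("/dev/sdb1", read_proc_mounts()). Note that it
compares the device name literally, so a filesystem mounted via a
different path to the same device (e.g. a /dev/disk/by-uuid symlink)
would be missed.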

> If xfs encounters an insurmountable error, it will shut down, and
> all operations will return EIO or EUCLEAN.  You are right that
> there is no errors=* mount option; the behavior is not configurable
> on xfs.

IMHO it should be, but since the last email I've glanced at some
mailing lists and understand that there's some reluctance, in the name
of not polluting the FS after an error. But at least a R/O remount
should be possible, to prevent yanking libraries from under
applications (root FS).
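
For what it's worth, from the application side the shutdown behaviour
described above would look roughly like the sketch below. It assumes
only what was said in the quote (operations fail with EIO or EUCLEAN);
the helper names are made up, and since EIO can also come from other
I/O failures, a match is a hint and not proof of a shutdown.

```python
import errno

# EUCLEAN is Linux-specific; fall back to its usual Linux value (117)
# on platforms where Python's errno module does not define it.
EUCLEAN = getattr(errno, "EUCLEAN", 117)


def looks_like_fs_shutdown(exc):
    """Return True if `exc` is an OSError carrying EIO or EUCLEAN."""
    return isinstance(exc, OSError) and exc.errno in (errno.EIO, EUCLEAN)


def try_read(path):
    """Read one byte from `path`; report a suspected shutdown
    instead of raising when the error matches."""
    try:
        with open(path, "rb") as f:
            f.read(1)
        return "ok"
    except OSError as exc:
        if looks_like_fs_shutdown(exc):
            return "shutdown-suspected"
        raise
```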

> You're right that this doesn't seem to be well described in
> documentation, that's probably something we should address.

Yup, any idea when? Also, I think it would be good to have a section
on what to do when things go south and what to expect. E.g. I found
out the hard way that xfs_check on a 2TB disk allocates 16G of memory,
so now I'm running it with cgroup-based limits; otherwise I couldn't
even open my emails right now. I'm still not sure when to run
xfs_check and when to run xfs_repair, etc. At least I haven't seen
such docs. Maybe I missed them.
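
In case it helps anyone else, here is a rough sketch of capping a
tool's memory from a wrapper. To be clear, this is NOT the cgroup
setup I'm using; it swaps in a plain address-space rlimit (RLIMIT_AS),
which is simpler to show and behaves differently: the process gets
allocation failures when it hits the cap, and nothing is accounted
per-group. The function names and the example command are
hypothetical.

```python
import resource
import subprocess


def clamp_soft_limit(requested, hard):
    """Return a soft limit that does not exceed the hard limit."""
    if hard == resource.RLIM_INFINITY:
        return requested
    return min(requested, hard)


def run_with_memory_cap(argv, max_bytes):
    """Run `argv` with its address space capped at `max_bytes`
    (rlimit-based, Unix only; not a cgroup limit)."""
    def apply_limit():
        # Runs in the child between fork and exec.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = clamp_soft_limit(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    return subprocess.run(argv, preexec_fn=apply_limit)
```

Something like run_with_memory_cap(["xfs_check", "/dev/sdX"],
4 * 1024**3) would then cap the check at 4 GiB of address space, with
/dev/sdX standing in for the real device.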


