XFS filesystem claims to be mounted after a disconnect
Dave Chinner
david at fromorbit.com
Fri May 2 22:02:22 CDT 2014
On Sat, May 03, 2014 at 03:04:48AM +0300, Martin Papik wrote:
> > It's called a lazy unmount: "umount -l". It disconnects the
> > filesystem from the namespace, but it still lives on in the kernel
> > until all references to the filesystem go away. Given that the
> > hot-unplug procedure can call back into the filesystem to sync it
> > (once it's been disconnected!) the hot unplug can deadlock on
> > filesystem locks that can't be released until the hot-unplug errors
> > everything out.
> >
> > So you can end up with the system in an unrecoverable state when
> > USB unplugs.
>
> And the disconnect from the namespace is what removes it from
> /proc/mounts?
I believe so.
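
For anyone who wants to check this from a script, here is a minimal sketch that tests whether a mount point is still listed in /proc/mounts-style text. The helper names (`mount_points`, `is_listed`) and the example path /mnt/usb are made up for illustration. A lazily unmounted filesystem would be expected to be absent from the list even while the kernel still holds references to it.

```python
import re


def _unescape(field):
    # /proc/mounts escapes space, tab, newline and backslash as
    # three-digit octal sequences such as \040; decode them back.
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def mount_points(mounts_text):
    """Return the set of mount points found in /proc/mounts-style text."""
    points = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            # The second whitespace-separated field is the mount point.
            points.add(_unescape(fields[1]))
    return points


def is_listed(mount_point, mounts_text):
    """True if mount_point (hypothetical example: "/mnt/usb") is listed."""
    return mount_point in mount_points(mounts_text)
```

On a Linux system, passing the contents of /proc/mounts as `mounts_text` checks the mount namespace of the calling process only.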
> By hot unplug, do you mean a user initiated "remove device" or a pull
> out of the USB cable? I'm sorry, I don't understand your example.
> Would you be kind enough to elaborate?
Anything that causes a hot-unplug to occur. There's no real
difference between echoing a value to the relevant sysfs file to
trigger the hot-unplug and simply pulling the plug on the active
device. It could even occur because something went wrong in the USB
subsystem (e.g. a hub stopped communicating) and so the end devices
disappeared, even though nothing is wrong with them.
> >>> If xfs encounters an insurmountable error, it will shut down,
> >>> and all operations will return EIO or EUCLEAN. You are right
> >>> that there is no errors=* mount option; the behavior is not
> >>> configurable on xfs.
> >>
> >> IMHO it should be, but since the last email I've glanced at some
> >> mailing lists and understand that there's some reluctance, in the
> >> name of not polluting the FS after an error. But at least a R/O
> >> remount should be possible, to prevent yanking libraries from
> >> under applications (root FS).
> >
> > What you see here has nothing to do with XFS's shutdown behaviour.
> > The filesystem is already unmounted, it just can't be destroyed
> > because there are still kernel internal references to it.
>
> How can I detect this situation? I mean I didn't see anything in
> /proc/mounts or references to the mount point from /proc/<pid>/*, so I
> only managed to correct it (chdir elsewhere) by chance on a hunch.
> Would it not be desirable to know that there's a phantom FS referenced
> by a number of processes?
lsof.
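
If lsof is not at hand, a rough heuristic can be sketched along the same lines. This is not how lsof works internally; it is only an assumption-laden approximation. It compares the device number behind each process's working directory with the device numbers listed in /proc/self/mountinfo, and reports processes whose cwd sits on a device that is no longer mounted anywhere in the current namespace. The function names are made up for illustration. Expect false positives for processes in other mount namespaces and on filesystems that report per-subvolume device numbers (btrfs, for example), and expect to need root to inspect other users' processes.

```python
import os


def mounted_devices(mountinfo_text):
    """Return the (major, minor) pairs listed in mountinfo-style text."""
    devs = set()
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # The third field of each mountinfo line is "major:minor".
        if len(fields) >= 3 and ":" in fields[2]:
            major, minor = fields[2].split(":", 1)
            devs.add((int(major), int(minor)))
    return devs


def detached_cwds(mounted, proc_root="/proc"):
    """Yield (pid, (major, minor)) for each process whose cwd is on a
    device that does not appear in `mounted`."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            st = os.stat(os.path.join(proc_root, entry, "cwd"))
        except OSError:
            continue  # process exited, access denied, or I/O error
        dev = (os.major(st.st_dev), os.minor(st.st_dev))
        if dev not in mounted:
            yield int(entry), dev
```

To use it, read /proc/self/mountinfo, pass the text to `mounted_devices()`, and iterate over `detached_cwds()` with the result. Only working directories are checked here; open files under /proc/<pid>/fd would need the same treatment.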
> Also, do you know if this affects other filesystems? I never saw this
> with ext3/4 or reiser, I don't have much practical experience with
> other filesystems. I ask because your explanation sounds like it's vfs
> rather than xfs, but as I said, I never saw this before.
Yes, it affects all filesystems - the same behaviour occurs
regardless of the filesystem that is active on the block device.
Cheers,
Dave.
--
Dave Chinner
david at fromorbit.com