Re: Questions about XFS discard and xfs_free_extent() code (newbie)

To: "Dave Chinner" <david@xxxxxxxxxxxxx>
Subject: Re: Questions about XFS discard and xfs_free_extent() code (newbie)
From: "Alex Lyakas" <alex@xxxxxxxxxxxxxxxxx>
Date: Thu, 19 Dec 2013 21:24:40 +0200
Cc: <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
Importance: Normal
In-reply-to: <20131219105513.GZ31386@dastard>
References: <AD612A564BB84E75B010AE37687DFC8E@alyakaslap> <20131218230615.GQ31386@dastard> <78FC295EC7FF48C987266DC48B183930@alyakaslap> <20131219105513.GZ31386@dastard>
Hi Dave,
It makes sense. I agree it might break some guarantees. Although if the user deleted some blocks in the file, or the whole file, maybe it's OK not to have a clear promise about what he sees after the crash. But I agree, the semantics would not be clear.

Thanks for the comments,

-----Original Message----- From: Dave Chinner
Sent: 19 December, 2013 12:55 PM
To: Alex Lyakas
Cc: xfs@xxxxxxxxxxx
Subject: Re: Questions about XFS discard and xfs_free_extent() code (newbie)

On Thu, Dec 19, 2013 at 11:24:15AM +0200, Alex Lyakas wrote:
> Hi Dave,
> Thank you for your comments.
> I realize now that what I proposed cannot be done; I need to
> understand more deeply how XFS transactions work (unfortunately, the
> awesome "XFS Filesystem Structure" doc has a TODO in the "Journaling
> Log" section).
>
> Can you please comment on one more question:
> Let's say we had such a fully asynchronous "fire-and-forget" discard
> operation (I can implement one myself for my block device via a
> custom IOCTL). What is wrong if we trigger such an operation in
> xfs_free_ag_extent(), right after we have merged the freed extent
> into a bigger one? I understand that the extent-free intent is not
> yet committed to the log at this point. But from the user's point of
> view, the extent has been deleted, no? So if the underlying block
> device discards the merged extent right away, before committing to
> the log, what issues can this cause?

Think of what happens when a crash occurs immediately after the
discard completes. The freeing of the extent never made it to the
log, so after recovery, the file still exists and the user can
access it. Except that its contents are now all different to
before the crash occurred.

IOWs, issuing the discard before the transaction that frees the
extent is on stable storage means we are discarding user data or
metadata before we've guaranteed that the extent free transaction
is permanent, and that means we violate certain guarantees with
respect to crash recovery...
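
The ordering problem described above can be sketched with a toy Python model. This is only an illustration, not XFS code: the ToyFS class, its methods, and the two scenario functions are all hypothetical names, and the "log" is reduced to a single durable copy of the metadata. It assumes a discarded extent simply reads back as None.

```python
# Toy model only: every name here is hypothetical; none of this is XFS code.

class ToyFS:
    def __init__(self):
        self.blocks = {}     # block device: extent id -> data
        self.on_disk = {}    # durable metadata: file name -> extent id
        self.in_memory = {}  # metadata including uncommitted changes

    def create(self, name, extent, data):
        """Create a file whose single extent is already durable."""
        self.blocks[extent] = data
        self.on_disk[name] = extent
        self.in_memory[name] = extent

    def free_in_memory(self, name):
        """Free the file's extent in memory only; nothing is durable yet."""
        return self.in_memory.pop(name)

    def commit(self):
        """The extent-free transaction reaches stable storage."""
        self.on_disk = dict(self.in_memory)

    def discard(self, extent):
        """Fire-and-forget discard: the device drops the data at once."""
        self.blocks[extent] = None

    def crash_and_recover(self):
        """Uncommitted state is lost; only durable metadata survives."""
        self.in_memory = dict(self.on_disk)

    def read(self, name):
        return self.blocks[self.in_memory[name]]


def discard_before_commit():
    fs = ToyFS()
    fs.create("f", 7, "user data")
    extent = fs.free_in_memory("f")
    fs.discard(extent)       # issued before the free is in the log
    fs.crash_and_recover()   # crash: the free never became durable
    # The file is back after recovery, but its contents are gone.
    return "f" in fs.in_memory, fs.read("f")


def discard_after_commit():
    fs = ToyFS()
    fs.create("f", 7, "user data")
    extent = fs.free_in_memory("f")
    fs.commit()              # the free is on stable storage first
    fs.discard(extent)
    fs.crash_and_recover()
    # The file stays deleted, so nobody can observe the discarded extent.
    return "f" in fs.in_memory
```

In the first scenario the file survives recovery but no longer holds the data it had before the crash. In the second, the discard only touches an extent that recovery also considers free.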


Dave Chinner