Re: Subtle races between DAX mmap fault and write path

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: Subtle races between DAX mmap fault and write path
From: Dan Williams <dan.j.williams@xxxxxxxxx>
Date: Fri, 29 Jul 2016 07:44:25 -0700
Cc: Jan Kara <jack@xxxxxxx>, Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>, linux-fsdevel <linux-fsdevel@xxxxxxxxxxxxxxx>, "linux-nvdimm@xxxxxxxxxxxx" <linux-nvdimm@xxxxxxxxxxxx>, XFS Developers <xfs@xxxxxxxxxxx>, linux-ext4 <linux-ext4@xxxxxxxxxxxxxxx>
In-reply-to: <20160729022152.GZ16044@dastard>
References: <20160727120745.GI6860@xxxxxxxxxxxxxx> <20160727211039.GA20278@xxxxxxxxxxxxxxx> <20160727221949.GU16044@dastard> <20160728081033.GC4094@xxxxxxxxxxxxxx> <20160729022152.GZ16044@dastard>
On Thu, Jul 28, 2016 at 7:21 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Thu, Jul 28, 2016 at 10:10:33AM +0200, Jan Kara wrote:
>> On Thu 28-07-16 08:19:49, Dave Chinner wrote:
[..]
>> So DAX doesn't need flushing to maintain consistent view of the data but it
>> does need flushing to make sure fsync(2) results in data written via mmap
>> to reach persistent storage.
>
> I thought this all changed with the removal of the pcommit
> instruction and wmb_pmem() going away.  Isn't it now a platform
> requirement now that dirty cache lines over persistent memory ranges
> are either guaranteed to be flushed to persistent storage on power
> fail or when required by REQ_FLUSH?

No, nothing automates cache flushing.  The path of a write is:

cpu-cache -> cpu-write-buffer -> bus -> imc -> imc-write-buffer -> media

The ADR mechanism and the wpq-flush facility flush data through the
imc (integrated memory controller) to media.  dax_do_io() gets writes
to the imc, but we still need a posted-write-buffer flush mechanism to
guarantee data makes it out to media.
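
To make the first hops of that path concrete, here is a rough user-space
sketch of pushing dirty lines out of the cpu-cache by hand.  It is an
illustration only, not what dax_do_io() does.  The function name
flush_cpu_cache_range() and the 64-byte line size are assumptions, and
it only does real work on x86-64.  It moves data toward the imc; it does
not cover the imc-write-buffer -> media step, which still depends on ADR
or a wpq flush as described above.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <emmintrin.h>   /* _mm_clflush, _mm_mfence */
#define HAVE_CLFLUSH 1
#endif

/* Assumed cache line size; real code should query the CPU. */
#define CACHELINE 64u

/*
 * Hypothetical helper: write back every cache line that overlaps
 * [addr, addr + len).  Returns 0 if the lines were flushed, -1 if this
 * build has no flush instruction available (non-x86-64).
 */
int flush_cpu_cache_range(const void *addr, size_t len)
{
#ifdef HAVE_CLFLUSH
    /* Round down to the start of the first cache line. */
    uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHELINE - 1);
    uintptr_t end = (uintptr_t)addr + len;

    for (; p < end; p += CACHELINE)
        _mm_clflush((const void *)p);

    /* Fence so the flushes are ordered before anything that follows. */
    _mm_mfence();
    return 0;
#else
    (void)addr;
    (void)len;
    return -1;
#endif
}
```

A caller would store into a mapped range and then call
flush_cpu_cache_range(ptr, len).  Nothing does this step for you, which
is the point of the "nothing automates cache flushing" remark.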


> https://lkml.org/lkml/2016/7/9/131
>
> And part of that is the wmb_pmem() calls are going away?
>
> https://lkml.org/lkml/2016/7/9/136
> https://lkml.org/lkml/2016/7/9/140
>
> i.e. fsync on pmem only needs to take care of writing filesystem
> metadata now, and the pmem driver handles the rest when it gets a
> REQ_FLUSH bio from fsync?
>
> https://lkml.org/lkml/2016/7/9/134
>
> Or have we somehow ended up with the fucked up situation where
> dax_do_io() writes are (effectively) immediately persistent and
> untracked by internal infrastructure, whilst mmap() writes
> require internal dirty tracking and fsync() to flush caches via
> writeback?

dax_do_io() writes are not immediately persistent.  They bypass the
cpu-cache and cpu-write-buffer and are ready to be flushed to media
by REQ_FLUSH or power-fail on an ADR system.
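
From user space, the practical consequence for both paths is that an
application has to ask for durability explicitly.  A minimal sketch,
assuming plain POSIX calls; the helper names durable_write() and
durable_mmap_store() are made up for illustration.  Whether the fsync()
ends up as a REQ_FLUSH to the pmem driver is up to the filesystem and
driver, as discussed in this thread.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* write(2) path: the data is not assumed durable until fsync() returns. */
int durable_write(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            close(fd);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }

    /* Explicit durability request. */
    int rc = fsync(fd);
    if (close(fd) < 0)
        rc = -1;
    return rc;
}

/* mmap path: stores land in the mapping; msync() requests writeback. */
int durable_mmap_store(const char *path, const void *buf, size_t len)
{
    if (len == 0)
        return -1;          /* mmap() rejects zero-length mappings */

    int fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(map, buf, len);              /* plain CPU stores */
    int rc = msync(map, len, MS_SYNC);  /* synchronous flush request */

    if (munmap(map, len) < 0)
        rc = -1;
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

Both helpers return 0 only after the sync call has succeeded, so the
caller treats data as durable at that point and not before.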
