Re: Device loses barrier support (was: Fixed patch for simple barriers.)

To: Andi Kleen <andi@xxxxxxxxxxxxxx>
Subject: Re: Device loses barrier support (was: Fixed patch for simple barriers.)
From: Mikulas Patocka <mpatocka@xxxxxxxxxx>
Date: Thu, 4 Dec 2008 11:45:44 -0500 (EST)
Cc: linux-kernel@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx, Alasdair G Kergon <agk@xxxxxxxxxx>, Andi Kleen <andi-suse@xxxxxxxxxxxxxx>, Milan Broz <mbroz@xxxxxxxxxx>
In-reply-to: <20081204145810.GR6703@xxxxxxxxxxxxxxxxxx>
References: <Pine.LNX.4.64.0812040009340.15169@xxxxxxxxxxxxxxxxxxxxxxxxxxx> <20081204100050.GN6703@xxxxxxxxxxxxxxxxxx> <Pine.LNX.4.64.0812040836480.6118@xxxxxxxxxxxxxxxxxxxxxxxxxxx> <20081204142015.GQ6703@xxxxxxxxxxxxxxxxxx> <Pine.LNX.4.64.0812040913510.6118@xxxxxxxxxxxxxxxxxxxxxxxxxxx> <20081204145810.GR6703@xxxxxxxxxxxxxxxxxx>
> > > > finished after the 2nd write) and you are in an interrupt context, where
> > > > you can't reissue the -EOPNOTSUPP request. So what do you want to do?
> > > 
> > > The barrier aware file systems I know of just resubmit synchronously when 
> > > a barrier fails.
> > 
> > ... and produce structure corruption for certain period in time, because 
> > the writes meant to be ordered are submitted unordered.
> 
> No there is nothing unordered. The file system path typically looks like
> 
> commit of a transaction
>       if (I have never seen a barrier failing) {
>               write block with barrier
>               if (EOPNOTSUPP) {
>                       record failure
>                       submit synchronously
>               }
>       } else
>               submit synchronously
> 

If you view this as the "right" way of using barriers, then you can drop 
barrier support altogether and replace this code sequence with:

 flush disk cache
 submit write synchronously
 flush disk cache

--- because synchronous barriers bring you no performance advantage over 
the above sequence.
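
As a rough userspace analogy only (this is not kernel code), the three-step
sequence could be sketched as below. The function name write_with_flushes is
made up for illustration, and os.fsync() stands in for "flush disk cache";
whether fsync() really flushes the drive's write cache depends on the
filesystem and its mount options, so treat that as an assumption.

```python
import os

def write_with_flushes(fd, data, offset):
    """Flush, write, flush again: a userspace stand-in for the sequence above.

    Hypothetical helper; os.fsync() is assumed to approximate a cache flush.
    """
    os.fsync(fd)                           # earlier writes are stable first
    written = os.pwrite(fd, data, offset)  # the write that would carry the barrier
    os.fsync(fd)                           # this write is stable before later ones
    return written
```

Nothing here overlaps with other I/O: the caller waits for each step, which
is why the sequence costs about the same as a synchronously submitted barrier.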

> So if a pvmove barrier fails it will just submit synchronously.
> 
> The "write block with barrier" bit varies: jbd/gfs2 do it synchronously
> too and xfs does it asynchronously (with io done callbacks), but

And how does xfs preserve write ordering, if the barrier asynchronously 
fails with -EOPNOTSUPP and there are other writes submitted after the 
barrier?
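
The concern behind this question can be shown with a toy model. It is a
sketch under stated assumptions, not a description of what xfs does: all
requests are assumed to be submitted asynchronously before any completion is
processed, the device is assumed to process accepted requests in submission
order, and the failed barrier write is assumed to be resubmitted from the
completion handler. The function name simulate and the block names are
hypothetical.

```python
def simulate(requests, barriers_supported):
    """Return the order in which blocks reach a toy device.

    requests: list of (name, is_barrier) tuples in submission order.
    Assumption: everything is submitted before any completion runs.
    """
    on_disk = []
    resubmit = []
    for name, is_barrier in requests:
        if is_barrier and not barriers_supported:
            # The barrier is rejected; the failure is seen only at completion.
            resubmit.append(name)
            continue
        on_disk.append(name)
    # The completion handler runs later and resubmits without a barrier.
    on_disk.extend(resubmit)
    return on_disk

submitted = [("log-data", False), ("commit", True), ("metadata", False)]
print(simulate(submitted, barriers_supported=True))
print(simulate(submitted, barriers_supported=False))
```

In this model, with barriers supported the order is log-data, commit,
metadata; when the barrier fails, "metadata" lands before the resubmitted
"commit", which is exactly the reordering I am asking about.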

> in both cases they handle an EOPNOTSUPP coming out in the final
> io done.

Mikulas
