
Re: SEEK_DATA/SEEK_HOLE support

To: xfs@xxxxxxxxxxx
Subject: Re: SEEK_DATA/SEEK_HOLE support
From: Michael Monnerie <michael.monnerie@xxxxxxxxxxxxxxxxxxx>
Date: Wed, 5 Oct 2011 09:34:26 +0200
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, Christoph Hellwig <hch@xxxxxxxxxxxxx>, Jeff Liu <jeff.liu@xxxxxxxxxx>
In-reply-to: <20111005043659.GO3159@dastard>
Organization: it-management http://it-management.at
References: <4E887D7F.2010306@xxxxxxxxxx> <20111004130208.GA19263@xxxxxxxxxxxxx> <20111005043659.GO3159@dastard>
User-agent: KMail/1.13.6 (Linux/3.0.3-zmi; KDE/4.6.0; x86_64; ; )
On Wednesday, 5 October 2011 Dave Chinner wrote:
> That will only work if you can prevent concurrent unwritten extent
> conversion from happening while we do the separate tag lookups on
> the range because it requires two radix tree tag lookups rather than
> just one index lookup. i.e. miss the dirty page because it went
> dirty->writeback during the dirty tag search, and miss the same page
> when doing the writeback lookup because it went writeback->clean
> very quickly due to IO completion.
> 
> So to stop that from happening, it requires that filesystems can
> exclude unwritten extent conversion from happening while a
> SEEK_HOLE/SEEK_DATA operation is in progress, and that the
> filesystem can safely do mapping tree lookups while providing that
> extent tree exclusion.  I know that XFS has no problems here, but
> filesystems that use i_mutex for everything might be in trouble.
> 
> Besides, if we just look for pages in the cache over unwritten
> extents (i.e. someone has treated it as data already), then it can
> be done locklessly without having to worry about page state changes
> occurring concurrently...

I'd like to understand why it's important to care about locking here. As 
I understand it, SEEK_* is used for example to copy a file efficiently. 
If that is performed on a file that is currently being written to, the 
resulting copy will probably be bogus anyway, even without SEEK_* usage.
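
To make the use case concrete, here is a minimal sketch of the kind of 
sparse-aware copy I have in mind. It assumes Linux and Python's 
os.SEEK_DATA / os.SEEK_HOLE; the names data_extents and sparse_copy are 
made up for illustration and are not from any existing tool. It takes no 
locks at all, so it is exactly as racy against a concurrent writer as 
discussed in this thread.

```python
import errno
import os


def data_extents(fd):
    """Return (start, end) byte ranges that the filesystem reports as data.

    Hypothetical helper: walks the file by alternating SEEK_DATA and
    SEEK_HOLE. Nothing stops a writer from changing the file in between
    the two calls.
    """
    size = os.fstat(fd).st_size
    extents = []
    offset = 0
    while offset < size:
        try:
            start = os.lseek(fd, offset, os.SEEK_DATA)
            end = os.lseek(fd, start, os.SEEK_HOLE)
        except OSError as e:
            if e.errno == errno.ENXIO:
                break  # no data at or after this offset
            raise
        extents.append((start, end))
        offset = end
    return extents


def sparse_copy(src_path, dst_path, chunk_size=1 << 20):
    """Copy only the data ranges of src_path, leaving holes in dst_path."""
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            for start, end in data_extents(src):
                pos = start
                while pos < end:
                    chunk = os.pread(src, min(chunk_size, end - pos), pos)
                    if not chunk:
                        break  # source shrank while we were copying
                    pos += os.pwrite(dst, chunk, pos)
            # Extend the destination so a trailing hole is preserved.
            os.ftruncate(dst, os.fstat(src).st_size)
        finally:
            os.close(dst)
    finally:
        os.close(src)
```

On a filesystem without real hole support the whole file should simply 
be reported as one data range, so the copy degrades to a plain copy.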

There might be a case where it is important, but I can't see it at the moment.

If I understand it correctly, then if we do not lock during SEEK_* 
operations, part of the file might be missed in the copy, but only 
when the source file is being written to. If that file is 100GB in 
size (to be extreme) and we copy it while it is being modified, we will 
almost certainly end up with a copy that is partly modified and partly 
not, depending on which areas were modified before they were read. So 
what's the point?

-- 
Kind regards,
Michael Monnerie, Ing. BSc

it-management Internet Services: Protéger
http://proteger.at [pronounced: Prot-e-schee]
Tel: +43 660 / 415 6531

// House for sale: http://zmi.at/langegg/

