sleeps and waits during io_submit

Avi Kivity avi at scylladb.com
Wed Dec 2 02:38:16 CST 2015



On 12/02/2015 02:57 AM, Dave Chinner wrote:
> On Tue, Dec 01, 2015 at 07:13:29PM -0500, Brian Foster wrote:
>> On Tue, Dec 01, 2015 at 09:26:42PM +0200, Avi Kivity wrote:
>>> On 12/01/2015 08:51 PM, Brian Foster wrote:
>>>> On Tue, Dec 01, 2015 at 07:09:29PM +0200, Avi Kivity wrote:
>>>> Nope, it's synchronous from a code perspective. The
>>>> xfs_bmapi_read()->xfs_iread_extents() path could have to read in the
>>>> inode bmap metadata if it hasn't been done already. Note that this
>>>> should only happen once as everything is stored in-core, so in most
>>>> cases this is skipped. It's also possible extents are read in via some
>>>> other path/operation on the inode before an async I/O happens to be
>>>> submitted (e.g., see some of the other xfs_bmapi_read() callers).
>>> Is there (could we add) some ioctl to prime this cache?  We could call it
>>> from a worker thread where we don't mind blocking during open.
>>>
>> I suppose that's possible, or the worker thread could perform some
>> existing operation known to prime the cache. I don't think it's worth
>> getting into without a concrete example, however.
> You mean like EXT4_IOC_PRECACHE_EXTENTS?
>
> You know, that ioctl that the ext4 googlers needed to add because
> they already had AIO applications that depend on it and they hadn't
> realised that they could do exactly the same thing with a FIEMAP
> call? i.e. this call to count the number of extents in the file:
>
> 	struct fiemap fm = {
> 		.fm_start = 0,
> 		.fm_length = FIEMAP_MAX_OFFSET,
> 	};
>
> 	res = ioctl(fd, FS_IOC_FIEMAP, &fm);
>
> will cause XFS to read in the extent map and cache it.
>

Cool, it even appears to be callable with CAP_WHATEVER.  So we would use 
this to prime the metadata caches before startup, if they turn out to be 
a problem in practice.


