On Mon, Jul 21, 2008 at 02:52:39PM +1000, Dave Chinner wrote:
> If we allow incore extent tree allocations to recurse into the
> filesystem under memory pressure, new delayed allocations through
> xfs_iomap_write_delay() can deadlock on themselves if memory reclaim
> tries to write back dirty pages from that inode.
>
> It will deadlock in xfs_iomap_write_allocate() trying to take the
> ilock we already hold. This can also show up as complex ABBA
> deadlocks when multiple threads are triggering memory reclaim when
> trying to allocate extents.
>
> The main cause of this is the fact that delayed allocation is
> not done in a transaction, so KM_NOFS is not automatically
> added to the allocations to prevent this recursion.
>
> Mark all allocations done for the incore inode extent tree as
> KM_NOFS to ensure they never recurse back into the filesystem.
Looks good. Note that KM_NOFS alone already means an allocation
that can't fail, so there is no need to OR it with KM_SLEEP.
Long term we should look into allowing these to fail: allocations
that aren't allowed to fail but can't recurse back into the fs still
have a chance to deadlock.
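
To make the flag semantics above concrete, here is a minimal user-space
sketch, not the actual XFS code. It assumes the behaviour described in
this thread: a sleeping allocation may normally recurse into the
filesystem during reclaim, being inside a transaction or passing KM_NOFS
removes that permission, and an allocation can only fail if the caller
asks for that explicitly. The flag values, the MODEL_GFP_* bits, and the
helpers model_flags_convert() and model_must_succeed() are hypothetical
names made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration only; not the kernel's. */
#define KM_SLEEP    0x1u
#define KM_NOSLEEP  0x2u
#define KM_NOFS     0x4u
#define KM_MAYFAIL  0x8u

/* Hypothetical stand-ins for the allocator's permission bits. */
#define MODEL_GFP_WAIT  0x10u   /* allocation may block */
#define MODEL_GFP_FS    0x20u   /* reclaim may call back into the fs */

/*
 * Model of translating KM_* flags into allocator permissions.
 * in_transaction stands for "the caller is inside a transaction",
 * where the no-fs-recursion restriction is applied automatically.
 */
static unsigned int model_flags_convert(unsigned int km_flags,
                                        bool in_transaction)
{
    unsigned int gfp;

    /* Non-sleeping allocations never block, so never enter reclaim. */
    if (km_flags & KM_NOSLEEP)
        return 0;

    gfp = MODEL_GFP_WAIT | MODEL_GFP_FS;

    /* Both a transaction context and KM_NOFS forbid fs recursion. */
    if (in_transaction || (km_flags & KM_NOFS))
        gfp &= ~MODEL_GFP_FS;

    return gfp;
}

/* In this model an allocation may fail only if the caller opts in. */
static bool model_must_succeed(unsigned int km_flags)
{
    return !(km_flags & (KM_NOSLEEP | KM_MAYFAIL));
}
```

In this model KM_NOFS and KM_SLEEP | KM_NOFS convert to the same
permissions and both must succeed, which is why the extra KM_SLEEP is
redundant. Delayed allocation corresponds to the in_transaction == false
case, so without an explicit KM_NOFS the fs-recursion bit stays set.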