
Re: [PATCH] xfs: call xfs_idestroy_fork() in xfs_ilock() critical section

To: Brian Foster <bfoster@xxxxxxxxxx>
Subject: Re: [PATCH] xfs: call xfs_idestroy_fork() in xfs_ilock() critical section
From: Waiman Long <waiman.long@xxxxxx>
Date: Wed, 22 Apr 2015 16:28:38 -0400
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, xfs@xxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20150422191137.GF6688@xxxxxxxxxxxxxxx>
References: <1429724021-7675-1-git-send-email-Waiman.Long@xxxxxx> <20150422191137.GF6688@xxxxxxxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.12) Gecko/20130109 Thunderbird/10.0.12
On 04/22/2015 03:11 PM, Brian Foster wrote:
On Wed, Apr 22, 2015 at 01:33:41PM -0400, Waiman Long wrote:
The commit f7be2d7f594cbc ("xfs: push down inactive transaction
mgmt for truncate") refactored the xfs_inactive() function
in fs/xfs/xfs_inode.c.  However, it also moved the call to
xfs_idestroy_fork() from inside the xfs_ilock() critical section to
outside. That was causing memory corruption and strange failures like
dereferencing NULL pointers in some circumstances.

This patch moves the xfs_idestroy_fork() call back into an xfs_ilock()
critical section to avoid the memory corruption problem.

Signed-off-by: Waiman Long <Waiman.Long@xxxxxx>
Interesting... so from your previous mail we have an inactive/reclaim
racing with an xfs_iflush_fork() of the attr fork, or something of that
nature? Is there a specific reproducer or is it some kind of stress

Good catch in any case, it looks like a deviation from the previous

I am not sure what kind of race is going on. I was running the AIM7 workload for performance comparison purposes, and I hit the error when running the disk workload on an XFS filesystem. The smaller the ramdisk I used, the easier it was to reproduce the error. I think I had not run this workload for quite a while, so either I did not notice the problem before or I just ignored it in some previous runs.

I did check some other call sites of xfs_idestroy_fork() and they are under xfs_ilock(). So I suppose it is not safe to call it outside of the critical section. This patch did indeed fix the problem that I saw when running the disk workload.
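
To illustrate the locking pattern I mean, here is a minimal user-space sketch. It is only an analogy under stated assumptions, not the actual XFS code or the patch: a pthread mutex stands in for the inode lock taken by xfs_ilock(), and every demo_* name is hypothetical. The point is that the teardown and any reader of the fork data take the same lock, so a reader sees either valid data or a NULL pointer, never freed memory.

```c
/* User-space analogy only: none of the demo_* names are real XFS symbols. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct demo_inode {
    pthread_mutex_t lock;  /* stands in for the inode lock */
    char *attr_fork;       /* stands in for in-core fork data */
    size_t attr_len;
};

/* Set up the lock and give the inode a private copy of 'data'. */
static int demo_inode_init(struct demo_inode *ip, const char *data)
{
    assert(ip != NULL && data != NULL);
    if (pthread_mutex_init(&ip->lock, NULL) != 0)
        return -1;
    ip->attr_len = strlen(data);
    ip->attr_fork = malloc(ip->attr_len + 1);
    if (ip->attr_fork == NULL) {
        pthread_mutex_destroy(&ip->lock);
        return -1;
    }
    memcpy(ip->attr_fork, data, ip->attr_len + 1);
    return 0;
}

/* Free the fork inside the critical section and clear the pointer. */
static void demo_destroy_fork(struct demo_inode *ip)
{
    pthread_mutex_lock(&ip->lock);
    free(ip->attr_fork);
    ip->attr_fork = NULL;
    ip->attr_len = 0;
    pthread_mutex_unlock(&ip->lock);
}

/*
 * Reader that copies the fork under the same lock.
 * Returns the number of bytes copied, or 0 if the fork is already gone.
 */
static size_t demo_flush_fork(struct demo_inode *ip, char *out, size_t outlen)
{
    size_t n = 0;

    pthread_mutex_lock(&ip->lock);
    if (ip->attr_fork != NULL && outlen > 0) {
        n = ip->attr_len < outlen - 1 ? ip->attr_len : outlen - 1;
        memcpy(out, ip->attr_fork, n);
        out[n] = '\0';
    }
    pthread_mutex_unlock(&ip->lock);
    return n;
}
```

If demo_destroy_fork() freed the buffer without taking the lock, a concurrent demo_flush_fork() could copy from freed memory, which is the kind of corruption I would expect from tearing the fork down outside the critical section.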

