[PATCH] xfs_file_last_byte() needs to acquire ilock

Felix Blyakher felixb at sgi.com
Tue Apr 28 00:03:19 CDT 2009


On Apr 27, 2009, at 11:11 PM, Lachlan McIlroy wrote:

> ----- "Felix Blyakher" <felixb at sgi.com> wrote:
>
>> On Apr 23, 2009, at 10:46 PM, Lachlan McIlroy wrote:
>>
>>>
>>> ----- "Eric Sandeen" <sandeen at sandeen.net> wrote:
>>>
>>>> Lachlan McIlroy wrote:
>>>>> We had some systems crash with this stack:
>>>>>
>>>>> [<a00000010000cb20>] ia64_leave_kernel+0x0/0x280
>>>>> [<a00000021291ca00>] xfs_bmbt_get_startoff+0x0/0x20 [xfs]
>>>>> [<a0000002129080b0>] xfs_bmap_last_offset+0x210/0x280 [xfs]
>>>>> [<a00000021295b010>] xfs_file_last_byte+0x70/0x1a0 [xfs]
>>>>> [<a00000021295b200>] xfs_itruncate_start+0xc0/0x1a0 [xfs]
>>>>> [<a0000002129935f0>] xfs_inactive_free_eofblocks+0x290/0x460 [xfs]
>>>>> [<a000000212998fb0>] xfs_release+0x1b0/0x240 [xfs]
>>>>> [<a0000002129ad930>] xfs_file_release+0x70/0xa0 [xfs]
>>>>> [<a000000100162ea0>] __fput+0x1a0/0x420
>>>>> [<a000000100163160>] fput+0x40/0x60
>>>>>
>>>>> The problem here is that xfs_file_last_byte() does not acquire the
>>>>> inode lock and can therefore race with another thread that is
>>>>> modifying the extent list.  While xfs_bmap_last_offset() is trying
>>>>> to look up the last extent, some extents were merged and the extent
>>>>> list shrank, so the index we look up is now beyond the end of the
>>>>> extent list and potentially in a freed buffer.
>>>>>
>>>>> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
>>>>> index e7ae08d..cf62d9d 100644
>>>>> --- a/fs/xfs/xfs_inode.c
>>>>> +++ b/fs/xfs/xfs_inode.c
>>>>> @@ -1258,8 +1258,10 @@ xfs_file_last_byte(
>>>>
>>>>       /*
>>>>        * Only check for blocks beyond the EOF if the extents have
>>>>        * been read in.  This eliminates the need for the inode lock,
>>>>        * and it also saves us from looking when it really isn't
>>>>> 	 * necessary.
>>>>> 	 */
>>>>
>>>> I suppose that comment should be modified too, and maybe the commit
>>>> log should say why, exactly, it was wrong? :)
>>> Ha, I didn't even read the comment!  It's still kind of correct in
>>> that we won't have to get the inode lock if the extents have not been
>>> read in.
>>
>> I'd still think the comments could be made less confusing
>> if we're adding the inode lock here.
> The more I read the comment the more it makes sense and it seems to
> make more sense now with the change because it is clear how we can
> avoid the inode lock if the extents are not read in.

OK, after your explanation and reading the comment for the Nth time,
I think I know what you mean.

I think the original comment's intention was the following:

        if (ip->i_df.if_flags & XFS_IFEXTENTS) {
                // extents have been read in. This (the fact that the
                // extents have been read in) eliminates the need for the
                // inode lock, as we are not going to read them in through
                // xfs_iread_extents().
                error = xfs_bmap_last_offset(NULL, ip, &last_block,
                                XFS_DATA_FORK);
                if (error) {
                        last_block = 0;
                }
        } else {
                last_block = 0;
        }

while in the patched version it'll become:

        if (ip->i_df.if_flags & XFS_IFEXTENTS) {
                // extents have been read in ...
                xfs_ilock(ip, XFS_ILOCK_SHARED);
                error = xfs_bmap_last_offset(NULL, ip, &last_block,
                                XFS_DATA_FORK);
                xfs_iunlock(ip, XFS_ILOCK_SHARED);
                if (error) {
                        last_block = 0;
                }
        } else {
                // this (the fact that the extents have _NOT_ been read in)
                // eliminates the need for the inode lock.
                // Doh, obvious.
                last_block = 0;
        }

Is that how you see the comment now?

Was the assumption in the original comment about not needing the ilock
simply incorrect?

> How would you prefer the comment reads?

I'd just leave the first sentence from the original comment.

          * Only check for blocks beyond the EOF if the extents have
          * been read in.

The mention of the ilock is too confusing now, imho.

Felix

>
>
>>
>> Felix
>>
>>>
>>>
>>>>
>>>> -Eric
>>>>
>>>>> 	if (ip->i_df.if_flags & XFS_IFEXTENTS) {
>>>>> +		xfs_ilock(ip, XFS_ILOCK_SHARED);
>>>>> 		error = xfs_bmap_last_offset(NULL, ip, &last_block,
>>>>> 			XFS_DATA_FORK);
>>>>> +		xfs_iunlock(ip, XFS_ILOCK_SHARED);
>>>>> 		if (error) {
>>>>> 			last_block = 0;
>>>>> 		}
>>>>>
>>>>> _______________________________________________
>>>>> xfs mailing list
>>>>> xfs at oss.sgi.com
>>>>> http://oss.sgi.com/mailman/listinfo/xfs
>>>>>



