
Re: [PATCH] xfs: fix buffer use after free on IO error

To: Christoph Hellwig <hch@xxxxxxxxxxxxx>, Eric Sandeen <sandeen@xxxxxxxxxxx>
Subject: Re: [PATCH] xfs: fix buffer use after free on IO error
From: Eric Sandeen <sandeen@xxxxxxxxxx>
Date: Tue, 25 Mar 2014 10:39:31 -0700
Cc: Brian Foster <bfoster@xxxxxxxxxx>, xfs-oss <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140325172508.GA4446@xxxxxxxxxxxxx>
References: <532CFA12.4040104@xxxxxxxxxx> <20140325125754.GA18691@xxxxxxxxxxxxxxx> <20140325131705.GB25392@xxxxxxxxxxxxx> <5331A930.9030402@xxxxxxxxxxx> <20140325172508.GA4446@xxxxxxxxxxxxx>
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.4.0
On 3/25/14, 10:25 AM, Christoph Hellwig wrote:
> On Tue, Mar 25, 2014 at 09:05:04AM -0700, Eric Sandeen wrote:
>>>> Out of curiosity, is there any major reason we don't use 0 here
>>>> unconditionally? Are we worried about I/O completing before we have a
>>>> chance to decrement the reference?
>>>
>>> I think this should unconditionally avoid the schedule, and while we're
>>> at it we should kill _xfs_buf_ioend and opencode it here and at the
>>> other callsite.
>>
>> And then remove the flag from xfs_buf_ioend which is always 0 at that
>> point ...
> 
> Is it?  xfs_buf_bio_end_io should still be passing 1, the bio end_io
> handler is the place we really need the workqueue for anyway.

These are the callers of xfs_buf_ioend:

  File              Function                 Line
0 xfs_buf.c         xfs_bioerror             1085 xfs_buf_ioend(bp, 0);
1 xfs_buf.c         _xfs_buf_ioend           1177 xfs_buf_ioend(bp, schedule);
2 xfs_buf_item.c    xfs_buf_item_unpin        494 xfs_buf_ioend(bp, 0);
3 xfs_buf_item.c    xfs_buf_iodone_callbacks 1138 xfs_buf_ioend(bp, 0);
4 xfs_inode.c       xfs_iflush_cluster       3015 xfs_buf_ioend(bp, 0);
5 xfs_log.c         xlog_bdstrat             1644 xfs_buf_ioend(bp, 0);
6 xfs_log_recover.c xlog_recover_iodone       386 xfs_buf_ioend(bp, 0);

so only _xfs_buf_ioend *might* pass something other than 0, and:

  File      Function           Line
0 xfs_buf.c xfs_buf_bio_end_io 1197 _xfs_buf_ioend(bp, 1);
1 xfs_buf.c xfs_buf_iorequest  1377 _xfs_buf_ioend(bp, bp->b_error ? 0 : 1);

At least up until now, _xfs_buf_ioend was always called with "1".

>> Yeah I have a patch to do that as well; I wanted to separate the
>> bugfix from the more invasive cleanup, though - and I wanted to
>> get the fix out for review sooner.
> 
> Sure, feel free to leave all the cleanups to another patch.
> 
>> But yeah, I was unsure about whether or not to schedule at all here.
>> We come here from a lot of callsites and I'm honestly not sure what
>> the implications are yet.
> 
> I think the delayed completion is always wrong from the submission
> path.  The error path is just a special case of a completion happening
> before _xfs_buf_ioapply returns.  The combination of incredibly fast
> hardware and bad preemption could cause the same bug you observed.

I wondered about that.

I'm not sure; I don't think it was the buf_rele inside xfs_buf_iorequest
that freed it, I think it was specifically the error path afterwards -
in my case, in xfs_trans_read_buf_map():

                                xfs_buf_iorequest(bp);
                                        // xfs_buf_iorequest code below
                                        xfs_buf_hold(bp);
                                        atomic_set(&bp->b_io_remaining, 1);
                                        _xfs_buf_ioapply(bp); <-- gets error
                                        if (atomic_dec_and_test(&bp->b_io_remaining))
                                                xfs_buf_ioend(bp, bp->b_error ? 0 : 1);
                                        xfs_buf_rele(bp); <- releases our hold
                        }

                        error = xfs_buf_iowait(bp); <-- sees error; would have waited otherwise
                        if (error) {
                                xfs_buf_ioerror_alert(bp, __func__);
                                xfs_buf_relse(bp); <--- freed here ?

but my bp refcounting & lifetime knowledge is lacking :(
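
To make the suspected sequence concrete, here is a minimal user-space sketch of it. This is not kernel code: struct buf and every function below are hypothetical stand-ins that only mirror the names above, the "workqueue" is a single flag, the buffer lives on the stack so nothing is really freed, and the -5 (EIO) error value is an assumption. It only models the ordering: if the completion is deferred on the error path, the caller's release can drop the last reference before the deferred work runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct xfs_buf; only what the sketch needs. */
struct buf {
	int  hold;               /* reference count, like b_hold */
	int  io_remaining;       /* like b_io_remaining */
	int  error;              /* like b_error */
	bool freed;              /* set instead of actually freeing memory */
	bool work_queued;        /* a deferred completion is pending */
	bool touched_after_free; /* completion ran on a "freed" buffer */
};

static void buf_hold(struct buf *bp) { bp->hold++; }

static void buf_rele(struct buf *bp)
{
	assert(bp->hold > 0);
	if (--bp->hold == 0)
		bp->freed = true;    /* real code would free the buffer here */
}

/* Completion work: it uses the buffer, so note if it was already freed. */
static void buf_iodone_work(struct buf *bp)
{
	if (bp->freed)
		bp->touched_after_free = true;
}

/* Models xfs_buf_ioend(bp, schedule): run now, or defer to the "workqueue". */
static void buf_ioend(struct buf *bp, int schedule)
{
	if (schedule)
		bp->work_queued = true;
	else
		buf_iodone_work(bp);
}

/* Models the workqueue getting around to the deferred completion. */
static void run_workqueue(struct buf *bp)
{
	if (bp->work_queued) {
		bp->work_queued = false;
		buf_iodone_work(bp);
	}
}

/* Returns true if the completion ran after the last reference was dropped. */
bool error_path_uses_freed_buf(int schedule)
{
	struct buf b = { .hold = 1 };  /* the caller's own reference */
	struct buf *bp = &b;

	/* body of the modelled xfs_buf_iorequest() */
	buf_hold(bp);
	bp->io_remaining = 1;
	bp->error = -5;                /* modelled _xfs_buf_ioapply() fails (assumed -EIO) */
	if (--bp->io_remaining == 0)
		buf_ioend(bp, schedule);
	buf_rele(bp);                  /* drops the iorequest hold */

	/* caller: the modelled xfs_buf_iowait() sees the error and does not wait */
	if (bp->error)
		buf_rele(bp);              /* modelled xfs_buf_relse(): last reference */

	run_workqueue(bp);             /* deferred completion finally runs */
	return bp->touched_after_free;
}
```

In this model, error_path_uses_freed_buf(1) reports the late completion touching a released buffer, while error_path_uses_freed_buf(0) does not, because the completion runs inline before either release. Whether the real refcounting behaves the same way is exactly the open question.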

-Eric
