xfs
[Top] [All Lists]

Re: use-after-free on log replay failure

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: use-after-free on log replay failure
From: Alex Lyakas <alex@xxxxxxxxxxxxxxxxx>
Date: Sun, 10 Aug 2014 19:26:24 +0300
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140806123215.GJ20518@dastard>
References: <CAOcd+r0B-4SPjzim=68w3L8Y9FxwBD-C5HkkeO58C6t9nfgbhw@xxxxxxxxxxxxxx> <20140113204314.GJ3469@dastard> <CAOcd+r22FirPMHjdxQyTmXOAM72ND-t0=njK9nEmesSV5=Ec=Q@xxxxxxxxxxxxxx> <20140115014503.GQ3469@dastard> <CAOcd+r0R6KxmgEJNPUZ0Q5cQhsStGb=cehYE0+wKgDNU1negsA@xxxxxxxxxxxxxx> <20140119231745.GF18112@dastard> <4B2A412C75324EE9880358513C069476@alyakaslap> <9D3CBECB663B4A77B7EF74B67973310A@alyakaslap> <20140804230721.GA20518@dastard> <CAOcd+r0_KjaZam8ocA1YtwF7jmCyh0+bXOY8nqiE3fsVn1gyiA@xxxxxxxxxxxxxx> <20140806123215.GJ20518@dastard>
Hello Dave,

On Wed, Aug 6, 2014 at 3:32 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Wed, Aug 06, 2014 at 01:05:34PM +0300, Alex Lyakas wrote:
>> Hi Dave,
>>
>> On Tue, Aug 5, 2014 at 2:07 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> > On Mon, Aug 04, 2014 at 02:00:05PM +0300, Alex Lyakas wrote:
>> >> Greetings,
>> >>
>> >> we had a log replay failure due to some errors that the underlying
>> >> block device returned:
>> >> [49133.801406] XFS (dm-95): metadata I/O error: block 0x270e8c180
>> >> ("xlog_recover_iodone") error 28 numblks 16
>> >> [49133.802495] XFS (dm-95): log mount/recovery failed: error 28
>> >> [49133.802644] XFS (dm-95): log mount failed
>> >
>> > #define ENOSPC          28      /* No space left on device */
>> >
>> > You're getting an ENOSPC as a metadata IO error during log recovery?
>> > Thin provisioning problem, perhaps,
>> Yes, it is a thin provisioning problem (which I already know the cause for).
>>
>> > and the error is occurring on
>> > submission rather than completion? If so:
>> >
>> > 8d6c121 xfs: fix buffer use after free on IO error
>> I am not sure what do you mean by "submission rather than completion".
>> Do you mean that xfs_buf_ioapply_map() returns without submitting any
>> bios?
>
> No, that the bio submission results in immediate failure (e.g. the
> device goes away, so submission results in ENODEV). Hence when
> _xfs_buf_ioapply() releases it's IO reference itis the only
> remaining reference to the buffer and so completion processing is
> run immediately. i.e. inline from the submission path.
>
> Normally IO errors are reported through the bio in IO completion
> interrupt context. i.e the IO is completed by the hardware and the
> error status is attached to bio, which is then completed and we get
> into XFS that way. The IO submision context is long gone at this
> point....
>
>> In that case, no, bios are submitted to the block device, and it
>> fails them through a different context with ENOSPC error. I will still
>> try the patch you mentioned, because it also looks relevant to another
>> question I addressed to you earlier in:
>> http://oss.sgi.com/archives/xfs/2013-11/msg00648.html
>
> No, that's a different problem.
>
> 9c23ecc xfs: unmount does not wait for shutdown during unmount
Yes, this patch appears to fix the problem that I reported in the
past. XFS survives the unmount and kmemleak is also happy. Thanks! Is
this patch safe to apply to 3.8.13?

Thanks,
Alex.



>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>