use-after-free on log replay failure
Alex Lyakas
alex at zadarastorage.com
Sun Aug 10 11:26:24 CDT 2014
Hello Dave,
On Wed, Aug 6, 2014 at 3:32 PM, Dave Chinner <david at fromorbit.com> wrote:
> On Wed, Aug 06, 2014 at 01:05:34PM +0300, Alex Lyakas wrote:
>> Hi Dave,
>>
>> On Tue, Aug 5, 2014 at 2:07 AM, Dave Chinner <david at fromorbit.com> wrote:
>> > On Mon, Aug 04, 2014 at 02:00:05PM +0300, Alex Lyakas wrote:
>> >> Greetings,
>> >>
>> >> we had a log replay failure due to some errors that the underlying
>> >> block device returned:
>> >> [49133.801406] XFS (dm-95): metadata I/O error: block 0x270e8c180
>> >> ("xlog_recover_iodone") error 28 numblks 16
>> >> [49133.802495] XFS (dm-95): log mount/recovery failed: error 28
>> >> [49133.802644] XFS (dm-95): log mount failed
>> >
>> > #define ENOSPC 28 /* No space left on device */
>> >
>> > You're getting an ENOSPC as a metadata IO error during log recovery?
>> > Thin provisioning problem, perhaps,
>> Yes, it is a thin provisioning problem (which I already know the cause for).
>>
>> > and the error is occurring on
>> > submission rather than completion? If so:
>> >
>> > 8d6c121 xfs: fix buffer use after free on IO error
>> I am not sure what you mean by "submission rather than completion".
>> Do you mean that xfs_buf_ioapply_map() returns without submitting any
>> bios?
>
> No, that the bio submission results in immediate failure (e.g. the
> device goes away, so submission results in ENODEV). Hence when
> _xfs_buf_ioapply() releases its IO reference it is the only
> remaining reference to the buffer and so completion processing is
> run immediately. i.e. inline from the submission path.
>
> Normally IO errors are reported through the bio in IO completion
> interrupt context. i.e. the IO is completed by the hardware and the
> error status is attached to bio, which is then completed and we get
> into XFS that way. The IO submission context is long gone at this
> point....
>
>> In that case, no, bios are submitted to the block device, and it
>> fails them through a different context with ENOSPC error. I will still
>> try the patch you mentioned, because it also looks relevant to another
>> question I addressed to you earlier in:
>> http://oss.sgi.com/archives/xfs/2013-11/msg00648.html
>
> No, that's a different problem.
>
> 9c23ecc xfs: unmount does not wait for shutdown during unmount
Yes, this patch appears to fix the problem that I reported in the
past. XFS survives the unmount and kmemleak is also happy. Thanks! Is
this patch safe to apply to 3.8.13?
Thanks,
Alex.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david at fromorbit.com
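
The difference discussed above between an error at submission and an error at
completion can be sketched in plain userspace C. This is only a minimal model
under stated assumptions: demo_buf, demo_buf_ioend(), demo_buf_io_put() and
demo_buf_submit() are hypothetical names invented for illustration, not the
XFS structures or functions, and the buffer here is owned by the caller, so
nothing is actually freed. In the kernel case described in the thread,
completion processing may release the buffer, which is why touching it after
an inline completion would be a use-after-free.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer with an I/O reference count. */
struct demo_buf {
    int  io_remaining;  /* outstanding I/O references */
    int  error;         /* first error recorded against the buffer */
    bool completed;     /* set once completion processing has run */
};

/* Completion processing. In a real kernel this may release the buffer. */
static void demo_buf_ioend(struct demo_buf *bp)
{
    bp->completed = true;
}

/* Drop one I/O reference; whoever drops the last one runs completion. */
static void demo_buf_io_put(struct demo_buf *bp, int error)
{
    assert(bp->io_remaining > 0);
    if (error && !bp->error)
        bp->error = error;
    if (--bp->io_remaining == 0)
        demo_buf_ioend(bp);
}

/*
 * Model of a submission path issuing a single bio.
 * submit_error != 0: the bio fails immediately and its reference is
 *                    dropped inline, before the submitter drops its own.
 * submit_error == 0: the bio is in flight; its reference is dropped later
 *                    by a separate call to demo_buf_io_put().
 * Returns true when completion ran inline from the submission path.
 */
static bool demo_buf_submit(struct demo_buf *bp, int submit_error)
{
    bp->io_remaining = 1;           /* submitter's own I/O reference */
    bp->io_remaining++;             /* reference held for the bio */

    if (submit_error)
        demo_buf_io_put(bp, submit_error);

    demo_buf_io_put(bp, 0);         /* submitter releases its reference */

    /*
     * Safe only because the caller owns *bp in this sketch. If completion
     * had freed the buffer, this read would be the use-after-free.
     */
    return bp->completed;
}
```

Calling demo_buf_submit() with ENODEV models the submission-time failure:
the submitter's put is the last one, so completion runs inline. Calling it
with 0 and later calling demo_buf_io_put() with ENOSPC models the case
reported at the top of the thread, where the error arrives from a different
context after submission has returned.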