
Re: kernel bug in xfs_lrw.c (centos v5.5, directio, aio)

To: Eric Sandeen <sandeen@xxxxxxxxxxx>
Subject: Re: kernel bug in xfs_lrw.c (centos v5.5, directio, aio)
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 19 Aug 2010 11:50:29 +1000
Cc: Nohez <nohez@xxxxxxxx>, "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>
In-reply-to: <040DE437-1D0C-45F1-8CC7-CD11D49B6E53@xxxxxxxxxxx>
References: <alpine.LNX.2.00.1008171906220.21398@xxxxxxxxxxxxxxxxxx> <20100818114305.GR10429@dastard> <341C002D-C57C-4F73-8B36-5D12B0B91CD5@xxxxxxxxxxx> <20100819013433.GP7362@dastard> <040DE437-1D0C-45F1-8CC7-CD11D49B6E53@xxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Wed, Aug 18, 2010 at 08:38:33PM -0500, Eric Sandeen wrote:
> On Aug 18, 2010, at 8:34 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> 
> > On Wed, Aug 18, 2010 at 07:47:09PM -0500, Eric Sandeen wrote:
> >> On Aug 18, 2010, at 6:43 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> >> 
> >>> On Tue, Aug 17, 2010 at 07:12:12PM +0530, Nohez wrote:
> >>>> 
> >>>> Hi,
> >>>> 
> >>>> I had a kernel bug today when running xfs on CentOS v5.5. I moved to
> >>>> xfs from ext3 today.
> >>>> 
> >>>> The only application accessing the xfs filesystem is Sybase ASE v15.x.
> >>>> Database has been configured to use directio with native kernel
> >>>> asynchronous disk i/o enabled.
> >>> 
> >>> The warning is being issued because the application is mixing
> >>> buffered IO with direct IO on the same file. i.e. data corruption
> >>> waiting to happen. This is an application bug - the responsibility
> >>> for ensuring data coherency and integrity is assumed by the
> >>> application issuing the direct IO.
> >>> 
> >> You know... A clearer kernel message might help a lot here...
> > 
> > Yeah, probably would given we've had more reports of this in the
> > last month or two than we've had in the last five years. What sort
> > of text do you think we should add? I'd argue on the scary side,
> > say:
> > 
> > "XFS: filesystem <blah>: detected potential data corruption issue
> > caused by application(s) mixing concurrent buffered and direct IO to
> > the same inode. Inode #12345, pid 6789. Please report this issue
> > to your application vendor."
> > 
> > What do you think?
> > 
> Plenty verbose, might want to limit/throttle it, but sure.

Rate limiting it is a good idea, anyway. How about this:

"XFS: <dev>: inode <#>: pid <#> <name>: detected potential data
corruption issue due to concurrent buffered and direct IO to the
same inode. Please report this issue to your application vendor."

> Maybe include current->comm?

Yes, I thought about that but hadn't gone looking to find out how
easy it was to get the process name.
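
For illustration only, here is a userspace sketch of the two ideas above:
building the proposed message text and throttling how often it is emitted.
It is not kernel code and not the eventual XFS patch. The names
ratelimit_ok() and format_mixed_io_warning() are hypothetical, the limiter
is a simple "at most burst messages per interval" model, and the comm
argument stands in for what current->comm would supply in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical limiter: allow at most `burst` messages per `interval` seconds. */
struct ratelimit {
    time_t interval;  /* window length in seconds */
    int    burst;     /* messages allowed per window */
    time_t begin;     /* start of the current window */
    int    printed;   /* messages emitted in the current window */
    bool   started;   /* false until the first call */
};

/* Returns true if a message may be emitted at time `now`. */
static bool ratelimit_ok(struct ratelimit *rl, time_t now)
{
    /* Open a new window on first use or once the old one has expired. */
    if (!rl->started || now - rl->begin >= rl->interval) {
        rl->started = true;
        rl->begin = now;
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    return false;  /* suppressed: window budget already used */
}

/*
 * Formats the proposed warning into buf. `comm` models the process name.
 * Returns the snprintf() result: the length the full message needs,
 * excluding the terminating NUL.
 */
static int format_mixed_io_warning(char *buf, size_t len, const char *dev,
                                   unsigned long long ino, int pid,
                                   const char *comm)
{
    return snprintf(buf, len,
        "XFS: %s: inode %llu: pid %d %s: detected potential data "
        "corruption issue due to concurrent buffered and direct IO to the "
        "same inode. Please report this issue to your application vendor.",
        dev, ino, pid, comm);
}
```

A caller would check ratelimit_ok() first and only format and print the
message when it returns true, so a busy application triggering the
condition repeatedly produces a bounded amount of log output.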

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
