Re: XFS realtime O_DIRECT failures

To: Alan Cook <acook@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: XFS realtime O_DIRECT failures
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 10 Nov 2011 09:33:14 +1100
Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx>, linux-xfs@xxxxxxxxxxx
In-reply-to: <CAGedfzmcmfLXhBEzm9yhpRQTf-7dnMenXqe0FABAzJgP0rxSUA@xxxxxxxxxxxxxx>
References: <loom.20111108T180925-222@xxxxxxxxxxxxxx> <20111109080133.GB20604@xxxxxxxxxxxxx> <CAGedfzmcmfLXhBEzm9yhpRQTf-7dnMenXqe0FABAzJgP0rxSUA@xxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)

On Wed, Nov 09, 2011 at 08:25:02AM -0600, Alan Cook wrote:
> On Wed, Nov 9, 2011 at 2:01 AM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> > It might sound a bit harsh, but the problem is that you use the realtime
> > subvolume.  It hasn't really been maintained or been part of the tested
> > setup, and most distros in the know ship with it disabled.  I went
> > through and fixed it when we got bug reports once in a while, and I'm
> > happy to look into your issues once I get a bit spare time, but in
> > general use is highly discouraged.   Is there any specific reason why
> > you want to use the RT subvolume?
> I am streaming high-resolution images to be compressed through
> hardware and need to separate the data from the meta data for
> compression purposes.  XFS gave that for free if I used a realtime
> subvolume.

I'm not sure from that description just why the realtime volume adds
any benefit to your workflow. Separation of data and metadata
does not provide you with data compression, so you must be doing
something different with the realtime device to achieve
compression. Any details on that aspect of your setup?

I'm really only trying to understand why you need such a setup - it
helps to understand the full use case you have before trying to
determine if there is a better way of achieving your end goal....

> I originally started on kernel 2.6.27 (CentOS 5.5) which had no issues
> with the RT subvolume and direct writes.  I was then moved to kernel
> 2.6.32 (SUSE 11) which does have the issue.
> I appreciate your willingness to help.  Are there any alternatives or
> suggestions for splitting the data and meta data?  Any direction you
> can give on where to start looking or what I could do to track down
> the exact cause of the bug?

I have long term plans for metadata/data separation involving a
separate metadata device (i.e. so that metadata space can be placed on
different media and grown separately from data space, and so that we
don't give up any of the inherent parallelism in allocation like we do
for the RT device, etc), but that's a long way off yet so it's not a
solution to your problem.

As to your current problem, it's got a NULL pointer dereference
trying to lock the per-ag structure. That means the per-ag lookup
failed, which implies that the RT freespace bitmap may be corrupt
and it's tried to read a bitmap block that is apparently beyond the
end of the filesystem.  What does xfs_check/xfs_repair -n tell you
about the filesystem state?
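
As a minimal sketch of how that read-only check could be scripted: the
helper names and device paths below are hypothetical placeholders, not
taken from this thread, and the filesystem is assumed to be unmounted.
The only xfs_repair options used are -n (no-modify mode), -r (realtime
device) and -l (external log device).

```python
import subprocess


def build_repair_check_cmd(data_dev, rt_dev=None, log_dev=None):
    """Build an xfs_repair command line in no-modify (-n) mode.

    data_dev, rt_dev and log_dev are device paths supplied by the caller;
    nothing here is specific to the setup discussed in the thread.
    """
    cmd = ["xfs_repair", "-n"]      # -n: report problems, change nothing
    if log_dev:
        cmd += ["-l", log_dev]      # -l: external log device, if any
    if rt_dev:
        cmd += ["-r", rt_dev]       # -r: device holding the realtime section
    cmd.append(data_dev)            # the data device goes last
    return cmd


def run_check(data_dev, rt_dev=None, log_dev=None):
    """Run the check and return (exit status, combined output)."""
    cmd = build_repair_check_cmd(data_dev, rt_dev, log_dev)
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr


if __name__ == "__main__":
    # Hypothetical device paths, for illustration only.
    print(" ".join(build_repair_check_cmd("/dev/sdb1", rt_dev="/dev/sdc1")))
```

Running the script only prints the command line; calling run_check()
would execute it and needs root access to the unmounted devices.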


Dave Chinner
