Re: XFS realtime O_DIRECT failures

To: Alan Cook <acook@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: XFS realtime O_DIRECT failures
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Fri, 11 Nov 2011 09:24:24 +1100
Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx>, linux-xfs@xxxxxxxxxxx
In-reply-to: <CAGedfznrW4c9Bf3D-7+CEUUa-_u5iVzh+KskwFS9bom0s=C=gQ@xxxxxxxxxxxxxx>
References: <loom.20111108T180925-222@xxxxxxxxxxxxxx> <20111109080133.GB20604@xxxxxxxxxxxxx> <CAGedfzmcmfLXhBEzm9yhpRQTf-7dnMenXqe0FABAzJgP0rxSUA@xxxxxxxxxxxxxx> <20111109223314.GQ5534@dastard> <CAGedfznrW4c9Bf3D-7+CEUUa-_u5iVzh+KskwFS9bom0s=C=gQ@xxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, Nov 09, 2011 at 05:52:15PM -0500, Alan Cook wrote:
> On Wed, Nov 9, 2011 at 5:33 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > I'm not sure from that description just why the realtime volume adds
> > any benefit to your workflow. Separation of data and metadata
> > does not provide you with data compression, so you must be doing
> > something different with the real time device to achieve
> > compression. Any details on that aspect of your setup?
> The compression is done via hardware that sits between the block layer
> and the actual storage device (in this case it is a solid state
> drive).  Having both the data and meta data reside on the same device
> creates a problem, as the block layer has no idea whether it has data
> or meta data, and so will compress the meta data along with the
> regular data, which is very bad.  Splitting the meta data to a
> separate device eliminated that problem.

XFS metadata IO is tagged with REQ_META, so the block layer can
easily distinguish it from data IO.  Even if the version of XFS you
are using is not doing this, it's rather simple to add it to

With that, your problem at the block layer goes away, and hence the
need to separate data from metadata at the filesystem layer goes
away, too.
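
As a rough illustration of the REQ_META point above, a layer sitting
below the filesystem could branch on the metadata tag before deciding
whether to compress. The sketch below is a userspace model, not kernel
code: struct demo_request, DEMO_REQ_META and demo_should_compress() are
hypothetical names invented for this example, and the flag's bit value
is illustrative rather than the kernel's actual definition.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's REQ_META request flag.
 * The bit value is illustrative only, not the kernel's. */
#define DEMO_REQ_META (1u << 0)

/* Hypothetical, much-simplified model of a block request. */
struct demo_request {
    unsigned int flags;  /* flags set by whoever submitted the IO */
    size_t       len;    /* payload length in bytes */
};

/* Compress only requests that are NOT tagged as metadata. */
static bool demo_should_compress(const struct demo_request *rq)
{
    return (rq->flags & DEMO_REQ_META) == 0;
}
```

In this model a request submitted with DEMO_REQ_META set bypasses
compression, while untagged data requests are compressed as before, so
data and metadata could share one device.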

> > As to your current problem, it's got a NULL pointer dereference
> > trying to lock the per-ag structure. That means the per-ag lookup
> > failed, which implies that the RT freespace bitmap may be corrupt
> > and it's tried to read a bitmap block that is apparently beyond the
> > end of the filesystem.  What does xfs_check/xfs_repair -n tell you
> > about the filesystem state?
> Unfortunately they do not tell a lot.  Running xfs_check/xfs_repair -n
> prior to running the test reports no errors.  However, attempting to
> run it after the test fails results in an indefinite I/O block (state
> of D+ for the process).  In fact, if I run the test utility twice, it
> results in a hung system.

That sounds like you have a problem with your block device or
underlying storage, not a filesystem problem....


Dave Chinner
