
Re: your mail

To: Tom Christensen <tom.christensen@xxxxxxxxxxxxxxxx>
Subject: Re: your mail
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Mon, 15 Jun 2015 09:27:38 +1000
Cc: "swakefie@xxxxxxxxxx" <swakefie@xxxxxxxxxx>, "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20150613223921.GC20262@dastard>
References: <71962FC942A61D42A3DECDFB7CC94F61193F6F07@xxxxxxxxxxxxxxxxxx> <20150613223921.GC20262@dastard>
User-agent: Mutt/1.5.21 (2010-09-15)
On Sun, Jun 14, 2015 at 08:39:21AM +1000, Dave Chinner wrote:
> On Sat, Jun 13, 2015 at 01:02:51AM +0000, Tom Christensen wrote:
> > We've run into a bit of an issue with xfs running Ceph.  The following bug 
> > details what we are seeing:
> > https://bugs.launchpad.net/ubuntu/+source/xfs/+bug/1464308
> > 
> > Basically the Ceph OSD process gets hung in D state due to the traceback 
> > in the bug.
> > 
> > Here is additional info gathered:
> > 
> > xfs bmap output for a random directory
> > https://gist.github.com/dmmatson/e864252c7ff346df954a
> > 
> > attr -l of the file dchinner indicated from the xfs_bmap output:
> > 
> > Attribute "cephos.spill_out" has a 2 byte value for 
> > rbd\udata.66039648e29a80.0000000000000d35__head_23EA10B8__5
> > Attribute "ceph.snapset@3" has a 263 byte value for 
> > rbd\udata.66039648e29a80.0000000000000d35__head_23EA10B8__5
> > Attribute "ceph.snapset@2" has a 1131 byte value for 
> > rbd\udata.66039648e29a80.0000000000000d35__head_23EA10B8__5
> > Attribute "ceph.snapset@1" has a 2048 byte value for 
> > rbd\udata.66039648e29a80.0000000000000d35__head_23EA10B8__5
> > Attribute "ceph._" has a 259 byte value for 
> > rbd\udata.66039648e29a80.0000000000000d35__head_23EA10B8__5
> > Attribute "ceph.snapset" has a 2048 byte value for 
> > rbd\udata.66039648e29a80.0000000000000d35__head_23EA10B8__5
> > 
> > xfs_bmap -vp of same file
> > 
> > rbd\udata.66039648e29a80.0000000000000d35__head_23EA10B8__5:
> > EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL FLAGS
> > 0: [0..8191]: 2944471376..2944479567 16 (24445776..24453967) 8192 00000
> 
> And the attribute fork was:
> 
> rbd\udata.66039648e29a80.0000000000000d35__head_23EA10B8__5:
> EXT: FILE-OFFSET      BLOCK-RANGE             AG AG-OFFSET          TOTAL 
> FLAGS
>   0: [0..7]:          1461176488..1461176495  8  (1163688..1163695)     8 
> 00000
>   1: [8..31]:         1461176504..1461176527  8  (1163704..1163727)    24 
> 00000
> 
> I just created a filesystem and attribute list identical to the
> above, and came up with an attribute fork that looks like:
> 
> /mnt/scratch/udata.66039648e29a80.0000000000000d35__head_23EA10B8__5:
>  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
>    0: [0..7]:          120..127          0 (120..127)           8 00000
>    1: [8..15]:         112..119          0 (112..119)           8 00000
>    2: [16..23]:        104..111          0 (104..111)           8 00000
> 
> IOWs, there's an extra block in the attribute fork than there needs
> to be, and that may be what is causing problems.  That tends to
> imply attribute overwrites might be contributing here (the 3-phase
> overwrite algorithm temporarily increases space usage), so I'm going
> to need to try a few different things to see if I can get an
> attribute fork into the same shape....
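
[For reference, an attribute list like the one quoted above can be
recreated by hand for testing. The sketch below is not from the thread:
the mount point, the user. attribute namespace, and the 'x'-filled values
are illustrative assumptions (Ceph writes these attributes itself, with
real contents). It prints the commands for review rather than running
them, since they need a real XFS filesystem.]

```shell
# Sketch: recreate an attribute list with the same names and value sizes
# as the attr -l output above, then inspect the attribute fork.
# MNT is an assumed XFS mount point.
MNT=${MNT:-/mnt/scratch}
f=$MNT/udata.66039648e29a80.0000000000000d35__head_23EA10B8__5
echo "touch $f"
for spec in cephos.spill_out:2 ceph.snapset@3:263 ceph.snapset@2:1131 \
            ceph.snapset@1:2048 ceph._:259 ceph.snapset:2048
do
    name=${spec%:*}         # attribute name
    size=${spec#*:}         # value size in bytes
    echo "setfattr -n user.$name -v \"\$(head -c $size /dev/zero | tr '\\0' x)\" $f"
done
echo "xfs_bmap -av $f"      # -a selects the attribute fork mapping
```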

I've had a test running overnight that generates attribute forks of
this shape, but I haven't seen any problems. Sequential growth of
attributes, semi-random growth, semi-random truncation, etc. don't
seem to trip over this problem on a 4.1-rc6 kernel. Hence I'm going
to need a metadump image of a filesystem with a broken file in it.
An obfuscated dump is fine; I only need to look at the structure of
the bad attribute fork. If you want to send the link privately to me
that is fine, too.
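
[Capturing and shipping such a metadump could look roughly like the
sketch below. The device name, mount point, and file names are
placeholders; xfs_metadump obfuscates file and attribute names by
default, which is why the structure of the bad attribute fork survives
while the contents stay private. Printed for review rather than
executed, since it needs a real device.]

```shell
# Sketch: capture an obfuscated metadump and prepare it for sending.
dev=/dev/sdb1                # placeholder device holding the bad fs
dump=/tmp/osd0.metadump      # placeholder output path
cat <<EOF
umount /mnt/osd0                 # the filesystem must be unmounted first
xfs_metadump $dev $dump          # obfuscates names by default
xz $dump                         # metadump images compress well
# recipient side: rebuild a sparse image and examine it with xfs_db:
unxz $dump.xz
xfs_mdrestore $dump osd0.img
EOF
```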

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
