
Re: bad performance on touch/cp file on XFS system

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: bad performance on touch/cp file on XFS system
From: Zhang Qiang <zhangqiang.buaa@xxxxxxxxx>
Date: Mon, 25 Aug 2014 17:05:33 +0800
Cc: xfs@xxxxxxxxxxx
In-reply-to: <20140825085616.GD20518@dastard>
References: <CAKEtwsWxZseS8M+O7vSR2FRXr4gjVQ0RDO8ok+jMPWq-8jPEeA@xxxxxxxxxxxxxx> <20140825051801.GY26465@dastard> <CAKEtwsWWoJ4QK1Od4WUT+hx7iFeX2bKfXk3QF9f2yG6vJ+TgxQ@xxxxxxxxxxxxxx> <20140825085616.GD20518@dastard>
Great, thank you.

From my xfs_db debugging, I see the following icount and ifree values:

icount = 220619904
ifree = 26202919

So free inodes make up about 10% of the allocated inodes, which does not seem that few.
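For reference, that ratio can be checked with plain shell arithmetic on the two counters quoted above (integer division rounds 11.87% down to 11%):

```shell
# Free-inode ratio computed from the sb 0 counters above
icount=220619904
ifree=26202919
pct=$(( ifree * 100 / icount ))
echo "free inodes: ${pct}% of allocated"   # prints "free inodes: 11% of allocated"
```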

Given that, are you still sure those patches will fix this issue?

Here is the detailed xfs_db output:

# mount /dev/sda4 /data1/
# xfs_info /data1/
meta-data=/dev/sda4              isize=256    agcount=4, agsize=142272384 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=569089536, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=277875, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
# umount /dev/sda4
# xfs_db /dev/sda4
xfs_db> sb 0
xfs_db> p
magicnum = 0x58465342
blocksize = 4096
dblocks = 569089536
rblocks = 0
rextents = 0
uuid = 13ecf47b-52cf-4944-9a71-885bddc5e008
logstart = 536870916
rootino = 128
rbmino = 129
rsumino = 130
rextsize = 1
agblocks = 142272384
agcount = 4
rbmblocks = 0
logblocks = 277875
versionnum = 0xb4a4
sectsize = 512
inodesize = 256
inopblock = 16
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 28
rextslog = 0
inprogress = 0
imax_pct = 5
icount = 220619904
ifree = 26202919
fdblocks = 147805479
frextents = 0
uquotino = 0
gquotino = 0
qflags = 0
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 1
features2 = 0xa
bad_features2 = 0xa
xfs_db> sb 1
xfs_db> p
magicnum = 0x58465342
blocksize = 4096
dblocks = 569089536
rblocks = 0
rextents = 0
uuid = 13ecf47b-52cf-4944-9a71-885bddc5e008
logstart = 536870916
rootino = 128
rbmino = null
rsumino = null
rextsize = 1
agblocks = 142272384
agcount = 4
rbmblocks = 0
logblocks = 277875
versionnum = 0xb4a4
sectsize = 512
inodesize = 256
inopblock = 16
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 28
rextslog = 0
inprogress = 1
imax_pct = 5
icount = 0
ifree = 0
fdblocks = 568811645
frextents = 0
uquotino = 0
gquotino = 0
qflags = 0
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 1
features2 = 0xa
bad_features2 = 0xa
xfs_db> sb 2
xfs_db> p
magicnum = 0x58465342
blocksize = 4096
dblocks = 569089536
rblocks = 0
rextents = 0
uuid = 13ecf47b-52cf-4944-9a71-885bddc5e008
logstart = 536870916
rootino = null
rbmino = null
rsumino = null
rextsize = 1
agblocks = 142272384
agcount = 4
rbmblocks = 0
logblocks = 277875
versionnum = 0xb4a4
sectsize = 512
inodesize = 256
inopblock = 16
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 28
rextslog = 0
inprogress = 1
imax_pct = 5
icount = 0
ifree = 0
fdblocks = 568811645
frextents = 0
uquotino = 0
gquotino = 0
qflags = 0
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 1
features2 = 0xa
bad_features2 = 0xa
xfs_db> sb 3
xfs_db> p
magicnum = 0x58465342
blocksize = 4096
dblocks = 569089536
rblocks = 0
rextents = 0
uuid = 13ecf47b-52cf-4944-9a71-885bddc5e008
logstart = 536870916
rootino = 128
rbmino = null
rsumino = null
rextsize = 1
agblocks = 142272384
agcount = 4
rbmblocks = 0
logblocks = 277875
versionnum = 0xb4a4
sectsize = 512
inodesize = 256
inopblock = 16
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 28
rextslog = 0
inprogress = 1
imax_pct = 5
icount = 0
ifree = 0
fdblocks = 568811645
frextents = 0
uquotino = 0
gquotino = 0
qflags = 0
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 1
features2 = 0xa
bad_features2 = 0xa
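(In case it helps anyone comparing these dumps: the interactive sessions above can be collapsed into one scripted pass. This is a sketch assuming the device is still unmounted as above; secondary superblocks are only written at mkfs/growfs time, so the zeroed icount/ifree and inprogress=1 in sb 1-3 are expected, and only sb 0 holds the live counters.)

```shell
# Print only the inode counters from each superblock copy, read-only
for ag in 0 1 2 3; do
    echo "--- superblock $ag ---"
    xfs_db -r -c "sb $ag" -c "p icount ifree" /dev/sda4
done
```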


Thanks
Qiang



2014-08-25 16:56 GMT+08:00 Dave Chinner <david@xxxxxxxxxxxxx>:
On Mon, Aug 25, 2014 at 04:09:05PM +0800, Zhang Qiang wrote:
> Thanks for your quick and clear response. Some comments below:
>
>
> 2014-08-25 13:18 GMT+08:00 Dave Chinner <david@xxxxxxxxxxxxx>:
>
> > On Mon, Aug 25, 2014 at 11:34:34AM +0800, Zhang Qiang wrote:
> > > Dear XFS community & developers,
> > >
> > > I am using CentOS 6.3 and xfs as base file system and use RAID5 as
> > hardware
> > > storage.
> > >
> > > Detail environment as follow:
> > >    OS: CentOS 6.3
> > >    Kernel: kernel-2.6.32-279.el6.x86_64
> > >    XFS option info (df output): /dev/sdb1 on /data type xfs
> > > (rw,noatime,nodiratime,nobarrier)
....

> > > It would be greatly appreciated if you could give a constructive
> > > suggestion about this issue, as it's really hard to reproduce on
> > > another system and it's not possible to upgrade that online machine.
> >
> > You've got very few free inodes, widely distributed in the allocated
> > inode btree. The CPU time above is the btree search for the next
> > free inode.
> >
> > This is the issue solved by this series of recent commits to add a
> > new on-disk free inode btree index:
> >
> [Qiang] This means that if I want to fix this issue, I have to apply the
> following patches and build my own kernel.

Yes. Good luck, even I wouldn't attempt to do that.

And then use xfsprogs 3.2.1, and make a new filesystem that enables
metadata CRCs and the free inode btree feature.
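(For what it's worth, with xfsprogs 3.2.1 both features can be enabled at mkfs time roughly as below. The device name is taken from earlier in the thread; double-check the options against your mkfs.xfs man page, and note this destroys all existing data on the device.)

```shell
# Recreate the filesystem with metadata CRCs (v5 format) and the
# free inode btree enabled. WARNING: wipes /dev/sda4.
mkfs.xfs -f -m crc=1,finobt=1 /dev/sda4
```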

> As the on-disk structure has changed, should I also re-create the XFS
> filesystem?

Yes, you need to download the latest xfsprogs (3.2.1) to be able to
make it with the necessary feature bits set.

> Is there any userspace tool to convert the old on-disk filesystem format
> to the new one, without needing to back up and restore the current data?

No, we don't write utilities to mangle on disk formats. dump, mkfs
and restore is far more reliable than any "in-place conversion" code
we could write. It will probably be faster, too.
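(The dump/mkfs/restore cycle described here would look roughly like the following with the xfsdump tools. Paths are illustrative; a real run needs scratch space at least as large as the used data, and the mkfs step destroys the old filesystem.)

```shell
# 1. Level-0 dump of the mounted filesystem to a scratch file
xfsdump -l 0 -f /backup/data1.dump /data1
# 2. Remake the filesystem with the new feature bits (destroys old data)
umount /data1
mkfs.xfs -f -m crc=1,finobt=1 /dev/sda4
mount /dev/sda4 /data1
# 3. Restore into the fresh filesystem
xfsrestore -f /backup/data1.dump /data1
```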

> > Which is of no help to you, however, because it's not available in
> > any CentOS kernel.
> >
> [Qiang] Do you think it's possible to just backport these patches to
> kernel 2.6.32 (CentOS 6.3) to fix this issue?
>
> Or would it be better to backport them to the 3.10 kernel used in CentOS 7.0?

You can try, but if you break it you get to keep all the pieces
yourself. Eventually someone who maintains the RHEL code will do a
backport that will trickle down to CentOS. If you need it any
sooner, then you'll need to do it yourself, or upgrade to RHEL
and ask your support contact for it to be included in RHEL 7.1....

> > There's really not much you can do to avoid the problem once you've
> > punched random freespace holes in the allocated inode btree. It
> > generally doesn't affect many people; those that it does affect are
> > normally using XFS as an object store indexed by a hard link farm
> > (e.g. various backup programs do this).
> >
> OK, I see.
>
> Could you please guide me on how to reproduce this issue easily? I have
> tried a 500G XFS partition and filled about 98% of the space, but still
> can't reproduce it. Is there an easy way you can think of?

Search the archives for the test cases that were used for the patch
set. There's a performance test case documented in the review
discussions.
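(For anyone else hunting this: the pathological layout is sparse free inodes scattered through an otherwise full inode btree. Below is a scaled-down sketch of that allocation pattern — a hypothetical workload, not the archived test case; a real reproduction needs tens of millions of inodes on an actual XFS mount, then touch/cp timings taken against the survivors' directory.)

```shell
# Fill a directory with many small files, then delete most of them at
# random-ish intervals so the remaining free inodes are scattered through
# the allocated inode btree.
dir=$(mktemp -d)
for i in $(seq 1 1000); do
    touch "$dir/f$i"
done
# Remove ~90% of the files, leaving sparse survivors (every 10th file)
for i in $(seq 1 1000); do
    if [ $(( i % 10 )) -ne 0 ]; then rm "$dir/f$i"; fi
done
ls "$dir" | wc -l   # 100 files remain
```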

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
