
Re: Is XFS suitable for 350 million files on 20TB storage?

To: Brian Foster <bfoster@xxxxxxxxxx>
Subject: Re: Is XFS suitable for 350 million files on 20TB storage?
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Sun, 7 Sep 2014 08:56:54 +1000
Cc: Stefan Priebe <s.priebe@xxxxxxxxxxxx>, "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140906150412.GB23506@xxxxxxxxxxxxxxx>
References: <540986B1.4080306@xxxxxxxxxxxx> <20140905123058.GA29710@xxxxxxxxxxxxxxx> <5409AF40.10801@xxxxxxxxxxxx> <20140905230528.GO20473@dastard> <540AB933.4030707@xxxxxxxxxxxx> <20140906150412.GB23506@xxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Sat, Sep 06, 2014 at 11:04:13AM -0400, Brian Foster wrote:
> On Sat, Sep 06, 2014 at 09:35:15AM +0200, Stefan Priebe wrote:
> > Hi Dave,
> > 
> > On 06.09.2014 01:05, Dave Chinner wrote:
> > >On Fri, Sep 05, 2014 at 02:40:32PM +0200, Stefan Priebe - Profihost AG 
> > >wrote:
> > >>
> > >>On 05.09.2014 at 14:30, Brian Foster wrote:
> > >>>On Fri, Sep 05, 2014 at 11:47:29AM +0200, Stefan Priebe - Profihost AG 
> > >>>wrote:
> > >>>>Hi,
> > >>>>
> > >>>>I have a backup system running 20TB of storage holding 350 million files.
> > >>>>This was working fine for months.
> > >>>>
> > >>>>But now the free space is so heavily fragmented that I only see the
> > >>>>kworker with 4x 100% CPU and write speed being very slow. 15TB of the
> > >>>>20TB are in use.
> > >
> > >What does perf tell you about the CPU being burnt? (i.e. run perf top
> > >for 10-20s while that CPU burn is happening and paste the top 10
> > >CPU-consuming functions).
> > 
> > here we go:
> >  15,79%  [kernel]            [k] xfs_inobt_get_rec
> >  14,57%  [kernel]            [k] xfs_btree_get_rec
> >  10,37%  [kernel]            [k] xfs_btree_increment
> >   7,20%  [kernel]            [k] xfs_btree_get_block
> >   6,13%  [kernel]            [k] xfs_btree_rec_offset
> >   4,90%  [kernel]            [k] xfs_dialloc_ag
> >   3,53%  [kernel]            [k] xfs_btree_readahead
> >   2,87%  [kernel]            [k] xfs_btree_rec_addr
> >   2,80%  [kernel]            [k] _xfs_buf_find
> >   1,94%  [kernel]            [k] intel_idle
> >   1,49%  [kernel]            [k] _raw_spin_lock
> >   1,13%  [kernel]            [k] copy_pte_range
> >   1,10%  [kernel]            [k] unmap_single_vma
> > 
> 
> The top 6 or so items look related to inode allocation, so that probably
> confirms the primary bottleneck as searching around for free inodes out
> of the existing inode chunks, precisely what the finobt is intended to
> resolve. That was introduced in 3.16 kernels, so unfortunately it is not
> available in 3.10.

*nod*

Again, the only workaround for this on a non-finobt fs is to greatly
increase the number of AGs so there are fewer records in each inode
btree to search.
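
To put rough numbers on that, here is a minimal back-of-the-envelope
sketch. It assumes inodes are allocated in chunks of 64 with one inobt
record per chunk, that every chunk is fully used, and that inodes are
spread evenly across AGs. Real filesystems will be less tidy, so treat
the results as a lower bound on records per AG. The function name and
the AG counts are made up for illustration; only the 350 million figure
comes from this thread.

```python
import math

# Assumption: one inode btree record describes one chunk of 64 inodes.
INODES_PER_CHUNK = 64


def inobt_records_per_ag(total_inodes: int, ag_count: int) -> int:
    """Estimate inobt records per AG, assuming full chunks and an
    even spread of inodes across all allocation groups."""
    total_records = math.ceil(total_inodes / INODES_PER_CHUNK)
    return math.ceil(total_records / ag_count)


if __name__ == "__main__":
    total_inodes = 350_000_000  # file count reported in the thread
    # Hypothetical AG counts, chosen only to show the trend.
    for ag_count in (20, 100, 500, 1000):
        records = inobt_records_per_ag(total_inodes, ag_count)
        print(f"{ag_count:5d} AGs -> ~{records} inobt records per AG")
```

The AG count is chosen at mkfs time (mkfs.xfs -d agcount=N), so in
practice this means recreating the filesystem rather than tuning the
existing one.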

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
