
To: Christoph Litauer <litauer@xxxxxxxxxxxxxx>
Subject: Re: Performance problems with millions of inodes
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 26 Jun 2008 09:12:10 +1000
Cc: xfs@xxxxxxxxxxx
In-reply-to: <4862598B.80905@xxxxxxxxxxxxxx>
Mail-followup-to: Christoph Litauer <litauer@xxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
References: <4862598B.80905@xxxxxxxxxxxxxx>
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.5.17+20080114 (2008-01-14)
On Wed, Jun 25, 2008 at 04:43:23PM +0200, Christoph Litauer wrote:
> Hi,
>
> sorry if this has been asked before, I am new to this mailing list. I
> didn't find any hints in the FAQ or by googling ...
>
> I have a backup server driving two kinds of backup software: bacula and
> backuppc. bacula saves its backups on raid1, backuppc on raid2
> (different hardware, but both fast hardware raids).
> I have massive performance problems with backuppc, which I have
> tracked down (I think) to the filesystem on raid2. The main
> difference between the two backup systems is that backuppc uses
> millions of inodes for its backup (in fact, it duplicates the
> directory structure of the backup client).
>
> raid1 consists of 91675 inodes, raid2 of 143646439. The filesystems were
> created without any mkfs options. raid1 is about 7 TB, raid2 about 10 TB.
> Both filesystems are mounted with the options
> 'rw,noatime,nodiratime,ihashsize=65536'.
>
> I used bonnie++ to benchmark both filesystems. Here are the results of
> 'bonnie++ -u root -f -n 10:0:0:1000':
>
> raid1:
> -------------------
> Sequential Output: 82505 K/sec
> Sequential Input : 102192 K/sec
> Sequential file creation: 7184/sec
> Random file creation    : 17277/sec
>
> raid2:
> -------------------
> Sequential Output: 124802 K/sec
> Sequential Input : 109158 K/sec
> Sequential file creation: 123/sec
> Random file creation    : 138/sec
>
> As you can see, raid2's throughput is higher than raid1's. But the file
> creation rates are dramatically lower ...
>
> Maybe the 143 million inodes cause this effect?

Certainly will be. You've got about 3 AGs holding inodes, so that's
probably 35M+ inodes per AG. With the way allocation works, it's
probably doing a dual traversal of the AGI btree to find a free
inode "near" the parent, and that is consuming lots and lots of
CPU time.

> Any idea how to avoid it?

I had a prototype patch back when I was at SGI that stopped this
search once it reached a radius that was no longer "near". This
greatly reduced CPU time for allocation in AGs with large inode
counts, and hence create rates increased significantly.

[Mark - IIRC that patch was in the miscellaneous patch tarball I
left behind...]
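
To illustrate the effect (this is just a toy model -- the record
layout, numbers and names below are all made up, it's nothing like
the actual kernel code):

/*
 * Toy model of the free inode search: the AGI btree is modelled as
 * a sorted array of inode chunk records, and we walk outward from
 * the parent's chunk in both directions looking for a free inode.
 * The prototype patch's idea is the radius cutoff: stop pretending
 * "near" matters once the walk has gone too far.
 */
#include <stdio.h>
#include <stdlib.h>

struct chunk_rec {
	long start_ino;		/* first inode in this chunk */
	int  free_count;	/* free inodes in the chunk */
};

/* Walk left and right from the parent's chunk simultaneously. */
static long find_near_free(struct chunk_rec *recs, long nrecs,
			   long parent_idx, long radius_cutoff)
{
	for (long r = 0; parent_idx - r >= 0 || parent_idx + r < nrecs; r++) {
		if (r > radius_cutoff)
			return -1;	/* fall back to "any free inode" */
		if (parent_idx - r >= 0 && recs[parent_idx - r].free_count > 0)
			return recs[parent_idx - r].start_ino;
		if (parent_idx + r < nrecs && recs[parent_idx + r].free_count > 0)
			return recs[parent_idx + r].start_ino;
	}
	return -1;
}

int main(void)
{
	/* 35M inodes per AG is roughly 550k records of 64 inodes each. */
	long nrecs = 550000;
	struct chunk_rec *recs = calloc(nrecs, sizeof(*recs));

	if (!recs)
		return 1;
	for (long i = 0; i < nrecs; i++)
		recs[i].start_ino = i * 64;
	recs[nrecs - 1].free_count = 64;	/* free inodes only at the far end */

	/* Unbounded: the dual walk touches ~every record before it hits one. */
	printf("unbounded: found inode %ld\n",
	       find_near_free(recs, nrecs, 0, nrecs));

	/* Bounded: bail out early and take any free inode instead. */
	printf("bounded:   returned %ld\n",
	       find_near_free(recs, nrecs, 0, 1000));

	free(recs);
	return 0;
}

With tens of millions of inodes per AG, the unbounded walk touches
hundreds of thousands of records for every create once an AG fills
up, which would line up with create rates dropping into the low
hundreds per second.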

The only other way of dealing with this is to mount with inode64 so
that inodes get spread across the entire filesystem instead of being
packed into a few AGs at its start. It's too late to move the
existing inodes, but new inodes would get spread around....
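
Something like this (the device name and mount point are made up
here, and IIRC inode64 can't be switched on by a remount, so you'd
need to unmount first):

# umount /raid2
# mount -o rw,noatime,nodiratime,inode64 /dev/sdX /raid2

The usual caveat applies: 64 bit inode numbers can confuse old
32 bit applications and NFS clients that don't expect them.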

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

