
Re: XFS slows down on used partitions with bonnie++

To: Paul Schutte <paul@xxxxxxxx>
Subject: Re: XFS slows down on used partitions with bonnie++
From: Steve Lord <lord@xxxxxxx>
Date: 24 Apr 2002 09:44:11 -0500
Cc: Paul Schutte <paul@xxxxxxxxxxx>, XFS mailing list <linux-xfs@xxxxxxxxxxx>
In-reply-to: <3CC67369.935DF3ED@xxxxxxxx>
References: <3C94F14E.7DE5A62D@xxxxxxxxxxx> <3CC67369.935DF3ED@xxxxxxxx>
Sender: owner-linux-xfs@xxxxxxxxxxx
Very interesting, I will take a look at this some more. One initial
comment is that optimizing for bonnie is not necessarily the correct
thing to do - not many real-world loads create thousands of files and
then immediately delete them. Also, once you are on a RAID device,
logical and physical closeness on the volume no longer mean very much.

Having said that, we need to think some more about the underlying
allocation policy of inodes vs. file data here.

I will run a broader set of tests on this and see what the impact is.

Steve

p.s. congrats on getting further into XFS than most people have!

On Wed, 2002-04-24 at 03:57, Paul Schutte wrote:
> The patch that I include here fixes the problem that I mentioned in my
> original posting. It should work on all 2.4.x kernels.
> 
> I don't know how it will behave on IRIX, but Linux gets a bit of a
> performance boost. It seemed faster or the same on all tests that I
> have run so far.
> 
> The backup speed of my XFS servers got slower over time, and that is
> why I chased down the problem.
> 
> The results here were done on a different server than the one I used
> for my original posting, so you cannot compare numeric values between
> the two postings.
> 
> Here are the before and after values, done now on the same server.
> 
> BEFORE:
> 1st run:
> Version 1.02b       ------Sequential Create------ --------Random Create--------
> kendy2.up.ac.za     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
> files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
> 10:100000:1000/1000   222   8   170   4  1441  21   199   8    55   1   985  20
> kendy2.up.ac.za,,,,,,,,,,,,,,10:100000:1000/1000,222,8,170,4,1441,21,199,8,55,!
> 0.320u 16.660s 6:00.93 4.7%     0+0k 0+0io 202pf+0w
> 
> 2nd run:
> Version 1.02b       ------Sequential Create------ --------Random Create--------
> kendy2.up.ac.za     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
> files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
> 10:100000:1000/1000   201   7    99   2  1168  16   204   7    54   1   989  22
> kendy2.up.ac.za,,,,,,,,,,,,,,10:100000:1000/1000,201,7,99,2,1168,16,204,7,54,1!
> 0.400u 16.360s 6:49.92 4.0%     0+0k 0+0io 202pf+0w
> 
> AFTER:
> 1st run:
> Version 1.02b       ------Sequential Create------ --------Random Create--------
> kendy2.up.ac.za     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
> files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
> 10:100000:1000/1000   224   8   176   4  1444  21   224   8    57   1  1084  19
> kendy2.up.ac.za,,,,,,,,,,,,,,10:100000:1000/1000,224,8,176,4,1444,21,224,8,57,!
> 0.380u 15.700s 5:44.37 4.6%     0+0k 0+0io 202pf+0w
> 
> 2nd run:
> Version 1.02b       ------Sequential Create------ --------Random Create--------
> kendy2.up.ac.za     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
> files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
> 10:100000:1000/1000   214   8   173   4  1367  20   222   8    57   1  1081  23
> kendy2.up.ac.za,,,,,,,,,,,,,,10:100000:1000/1000,214,8,173,4,1367,20,222,8,57,!
> 0.350u 16.110s 5:49.39 4.7%     0+0k 0+0io 202pf+0w
> 
> From this we can see that the sequential read speed is what it should be.
> As a bonus we get a healthy increase in random create.
> 
> Paul
> 
> Paul Schutte wrote:
> 
> > Hi,
> >
> > I have been playing around with bonnie++ from
> > http://www.coker.com.au/bonnie++/
> >
> > I found an interesting thing.
> >
> > When I run bonnie++ on a newly created XFS filesystem I get the
> > following results:
> >
> > mkfs.xfs -f -l size=8192b /dev/sda7
> >
> > meta-data=/dev/sda7              isize=256    agcount=19, agsize=262144 blks
> > data     =                       bsize=4096   blocks=4843589, imaxpct=25
> >          =                       sunit=0      swidth=0 blks, unwritten=0
> > naming   =version 2              bsize=4096
> > log      =internal log           bsize=4096   blocks=8192
> > realtime =none                   extsz=65536  blocks=0, rtextents=0
> >
> > mount -o logbufs=8 /dev/sda7 /mnt
> >
> > cd /mnt
> >
> > /mnt#time bonnie++ -u root -s0 -b -n 10:100000:1000:1000
> >
> > Version 1.02b       ------Sequential Create------ --------Random Create--------
> > kendy2.up.ac.za     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
> > files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
> > 10:100000:1000/1000   206   8   154   3  1463  20   192   7    49   1  1081  18
> > kendy2.up.ac.za,,,,,,,,,,,,,,10:100000:1000/1000,206,8,154,3,1463,20,192,7,49,1,1081,18
> > 0.300u 15.540s 6:35.00 4.0%     0+0k 0+0io 215pf+0w
> >
> > /mnt#time bonnie++ -u root -s0 -b -n 10:100000:1000:1000
> >
> > Version 1.02b       ------Sequential Create------ --------Random Create--------
> > kendy2.up.ac.za     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
> > files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
> > 10:100000:1000/1000   196   7    83   1  1215  23   191   8    49   1  1023  20
> > kendy2.up.ac.za,,,,,,,,,,,,,,10:100000:1000/1000,196,7,83,1,1215,23,191,8,49,1,1023,20
> > 0.370u 16.520s 7:31.92 3.7%     0+0k 0+0io 219pf+0w
> >
> > I created the file system and ran bonnie++ with the parameters above,
> > then immediately ran bonnie++ a second time.
> > If you look at the sequential read field, you will see that it is nearly
> > half of what it was on the first run.
> > According to this test, XFS seems to lose sequential read speed as the
> > filesystem gets used.
> > You can umount, and even reboot the machine, run bonnie++ again and
> > still get the slowdown phenomenon, provided you mount the same
> > filesystem again without mkfs'ing it.
> >
> > I repeated this test on several other machines with the same result.
> > I also did it with other filesystems.
> >
> > Here is the result with ext2:
> >
> > Version 1.02b       ------Sequential Create------ --------Random Create--------
> > kendy2.up.ac.za     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
> > files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
> > 10:100000:1000/1000   142   2   142   2   585   3   150   3    46   0   430   2
> > kendy2.up.ac.za,,,,,,,,,,,,,,10:100000:1000/1000,142,2,142,2,585,3,150,3,46,0,430,2
> > 0.240u 8.950s 7:56.63 1.9%      0+0k 0+0io 218pf+0w
> > /mnt#time bonnie++ -u root -s0 -b -n 10:100000:1000:1000
> > Version 1.02b       ------Sequential Create------ --------Random Create--------
> > kendy2.up.ac.za     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
> > files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
> > 10:100000:1000/1000   154   3   143   3   540   2   152   2    47   0   449   2
> > kendy2.up.ac.za,,,,,,,,,,,,,,10:100000:1000/1000,154,3,143,3,540,2,152,2,47,0,449,2
> > 0.300u 9.080s 7:42.20 2.0%      0+0k 0+0io 220pf+0w
> >
> > No slowdown.
> > I can include results for reiserfs and JFS, but that would add size to
> > this message without adding additional information regarding this issue.
> >
> > The machine is a Dell PE2550:
> > 1G RAM (I used the mem=256M kernel parameter, otherwise everything runs
> > from cache)
> > 4x18G 15k RPM Seagate Cheetahs in RAID 10
> > 2x1.133GHz P4 CPUs
> >
> > Regards
> >
> > Paul Schutte
> ----
> 

> diff -ur linux/fs/xfs/xfs_ialloc.c /data/linux-2.4-xfs-ifix/linux/fs/xfs/xfs_ialloc.c
> --- linux/fs/xfs/xfs_ialloc.c Sat Jan 12 01:43:26 2002
> +++ /data/linux-2.4-xfs-ifix/linux/fs/xfs/xfs_ialloc.c        Wed Apr 24 00:36:54 2002
> @@ -815,20 +815,35 @@
>               if ((error = xfs_inobt_lookup_eq(cur,
>                               INT_GET(agi->agi_newino, ARCH_CONVERT), 0, 0, &i)))
>                       goto error0;
> +        
> +       /*
> +        * It seems not to be a good idea to use the most recently
> +        * allocated block. If we do so, we end up using the inodes
> +        * at the back of the ag first and work our way to the front.
> +        * The data blocks, on the other hand, tend to be allocated from
> +        * the beginning to the end of the ag. The average distance between
> +        * an inode and its data in terms of daddr is much longer if we do
> +        * it this way. The average distance between the inode and its data
> +        * tends to be more constant, and in general shorter, if we allocate
> +        * inodes from the front of the ag to the back.
> +        *
> +        * Paul Schutte
> +        *
>               if (i == 1 &&
>                   (error = xfs_inobt_get_rec(cur, &rec.ir_startino,
>                           &rec.ir_freecount, &rec.ir_free, &j, ARCH_NOCONVERT)) == 0 &&
>                   j == 1 &&
>                   rec.ir_freecount > 0) {
> -                     /*
> +                     *
>                        * The last chunk allocated in the group still has
>                        * a free inode.
> -                      */
> +                      *
>               }
>               /*
>                * None left in the last group, search the whole a.g.
> -              */
> +              
>               else {
> +              */
>                       if (error)
>                               goto error0;
>                       if ((error = xfs_inobt_lookup_ge(cur, 0, 0, 0, &i)))
> @@ -847,7 +862,7 @@
>                                       goto error0;
>                               XFS_WANT_CORRUPTED_GOTO(i == 1, error0);
>                       }
> -             }
> +     /*      } Matches the commented-out most-recently-allocated code above */
>       }
>       offset = XFS_IALLOC_FIND_FREE(&rec.ir_free);
>       ASSERT(offset >= 0);
-- 

Steve Lord                                      voice: +1-651-683-3511
Principal Engineer, Filesystem Software         email: lord@xxxxxxx

