
Re: Slow performance after ~4.5TB

To: Linas Jankauskas <linas.j@xxxxx>
Subject: Re: Slow performance after ~4.5TB
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Mon, 12 Nov 2012 23:32:22 +1100
Cc: xfs@xxxxxxxxxxx
In-reply-to: <50A0C590.6020602@xxxxx>
References: <50A0AFD5.2020607@xxxxx> <20121112090448.GS24575@dastard> <50A0C590.6020602@xxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, Nov 12, 2012 at 11:46:56AM +0200, Linas Jankauskas wrote:
> 
> Servers are HP dl180 g6
> OS centos 6.3 x86_64
>
> CPU
> 2x Intel(R) Xeon(R) CPU           L5630  @ 2.13GHz
> 
> uname -r
> 2.6.32-279.5.2.el6.x86_64
> 
> xfs_repair -V
> xfs_repair version 3.1.1
> 
> 
> cat /proc/meminfo
> MemTotal:       12187500 kB
> MemFree:          153080 kB
> Buffers:         6400308 kB

That looks strange - 6GB of buffers? That's block device cached
pages, and XFS doesn't use the block device for caching. You don't
have much in the way of ext4 filesystems, either, so I don't think
that is responsible.
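If you want to confirm those are just clean block device pages, one
rough check - assuming you can tolerate a cold cache for a minute -
is to drop the page cache and watch the Buffers count:

  sync
  echo 1 > /proc/sys/vm/drop_caches   # frees clean pagecache only
  grep -E 'MemFree|Buffers|Cached' /proc/meminfo

If Buffers falls to near zero, it was just reclaimable cache and not
part of the problem.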

> cat /proc/mounts
> rootfs / rootfs rw 0 0
> proc /proc proc rw,relatime 0 0
> sysfs /sys sysfs rw,relatime 0 0
> devtmpfs /dev devtmpfs rw,relatime,size=6084860k,nr_inodes=1521215,mode=755 0 0

A 6GB devtmpfs? That seems unusual. What is the purpose of having a
6GB ramdisk mounted on /dev?  I wonder if that is consuming all
that buffer space....
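FWIW, devtmpfs sizes itself like tmpfs - the size= in /proc/mounts
is just the limit (half of RAM by default, which matches 6GB on your
12GB box), not what is actually allocated. A quick check:

  df -h /dev    # 'Used' shows real consumption, not the limit

If Used is a few hundred kilobytes, /dev isn't your memory consumer.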

>       logicaldrive 1 (20.0 TB, RAID 5, OK)
> 
>       physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SATA, 2 TB, OK)
>       physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SATA, 2 TB, OK)
>       physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SATA, 2 TB, OK)
>       physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SATA, 2 TB, OK)
>       physicaldrive 1I:1:5 (port 1I:box 1:bay 5, SATA, 2 TB, OK)
>       physicaldrive 1I:1:6 (port 1I:box 1:bay 6, SATA, 2 TB, OK)
>       physicaldrive 1I:1:7 (port 1I:box 1:bay 7, SATA, 2 TB, OK)
>       physicaldrive 1I:1:8 (port 1I:box 1:bay 8, SATA, 2 TB, OK)
>       physicaldrive 1I:1:9 (port 1I:box 1:bay 9, SATA, 2 TB, OK)
>       physicaldrive 1I:1:10 (port 1I:box 1:bay 10, SATA, 2 TB, OK)
>       physicaldrive 1I:1:11 (port 1I:box 1:bay 11, SATA, 2 TB, OK)
>       physicaldrive 1I:1:12 (port 1I:box 1:bay 12, SATA, 2 TB, OK)

OK, so RAID5, but it doesn't tell me the geometry of it.
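Can you pull it from the controller? Something like this should show
the strip size (hpacucli syntax from memory - adjust to match your
controller and tool version):

  hpacucli ctrl all show config detail | grep -i strip

With the strip size and 11 data disks (12-disk RAID5), the stripe
width falls out directly.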

> xfs_info /var
> meta-data=/dev/sda5              isize=256    agcount=20, agsize=268435455 blks
>          =                       sectsz=512   attr=2
> data     =                       bsize=4096   blocks=5368633873, imaxpct=5
>          =                       sunit=0      swidth=0 blks

And no geometry here, either - sunit and swidth are both zero (more
on that after the rest of the output).

> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=521728, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
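For reference, a 12-disk RAID5 has 11 data disks, so if the
controller is using, say, a 256k strip (hypothetical - use whatever
it actually reports), the filesystem should have been created with
the geometry specified:

  mkfs.xfs -d su=256k,sw=11 /dev/sda5

That won't explain the CPU burn, but sunit=0/swidth=0 on RAID5 means
XFS can't align allocation to the stripe, and large writes will pay
for it in read-modify-write cycles.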
> 
> No dmesg errors.
> 
> vmstat 5
> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
>  1  0   2788 150808 6318232 2475332    0    0   836   185    2    4  1 11 87  1  0
>  1  0   2788 150608 6318232 2475484    0    0     0    89 1094  126  0 12 88  0  0
>  1  0   2788 150500 6318232 2475604    0    0     0    60 1109   99  0 12 88  0  0
>  1  0   2788 150252 6318232 2475720    0    0     0    49 1046   79  0 12 88  0  0
>  1  0   2788 150344 6318232 2475844    0    0     1   157 1046   82  0 12 88  0  0
>  1  0   2788 149972 6318232 2475960    0    0     0   197 1086  144  0 12 88  0  0
>  1  0   2788 150020 6318232 2476088    0    0     0    76 1115   99  0 12 88  0  0
>  1  0   2788 150012 6318232 2476204    0    0     0    81 1131  132  0 12 88  0  0
>  1  0   2788 149624 6318232 2476340    0    0     0    53 1074   95  0 12 88  0  0

Basically idle, but burning a CPU in system time - 12% sy on an
8-CPU box is one core pegged at close to 100%.
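If you want a quick look at who is accumulating that system time
before profiling, pidstat from the sysstat package (if you have it
installed) breaks it out per process:

  pidstat -u 5 1    # one 5-second sample, %system per task

A kernel thread near the top would point the same place a profile
will.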

> iostat -x -d -m 5
> Linux 2.6.32-279.5.2.el6.x86_64 (storage)     11/12/2012     _x86_64_  (8 CPU)
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
> sda             103.27     1.51   92.43   37.65     6.52     1.44   125.36     0.73    5.60   1.13  14.74
> sda               0.00     0.20    2.40   19.80     0.01     0.09     9.08     0.13    5.79   2.25   5.00
> sda               0.00     3.60    0.60   36.80     0.00     4.15   227.45     0.12    3.21   0.64   2.38
> sda               0.00     0.40    1.20   36.80     0.00     8.01   431.83     0.11    3.00   1.05   4.00
> sda               0.00     0.60    0.00   20.60     0.00     0.08     8.39     0.01    0.69   0.69   1.42
> sda               0.00    38.40    4.20   27.40     0.02     0.27    18.34     0.25    8.06   2.63   8.32

Again, pretty much idle.

So, it's not doing IO and it's not thrashing caches, so what is
burning CPU? Can you take a profile? Maybe just run 'perf top' for
15s and then paste the top 10 samples?
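If perf top is awkward to capture, a record/report pair works just
as well - a system-wide sample over 15 seconds, with call chains:

  perf record -a -g sleep 15
  perf report --stdio | head -30

The -a samples all CPUs and -g grabs call graphs, which should show
exactly which kernel path is eating that CPU.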

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
