
Re: slow ls -l

To: Qing Chang <qchang@xxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
Subject: Re: slow ls -l
From: Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>
Date: Mon, 04 Nov 2013 00:42:07 -0600
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <5273FCCB.60601@xxxxxxxxxxxxxxx>
References: <526EAFD1.1030500@xxxxxxxxxxxxxxx> <526F7600.8060102@xxxxxxxxxxxxxxxxx> <526FC027.3040906@xxxxxxxxxxxxxxx> <526FC84D.7070905@xxxxxxxxxxxxxxxxx> <5273FCCB.60601@xxxxxxxxxxxxxxx>
Reply-to: stan@xxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (Windows NT 5.1; rv:24.0) Gecko/20100101 Thunderbird/24.1.0
On 11/1/2013 2:11 PM, Qing Chang wrote:
> I have collected some info following the FAQ guide. It may still
> not be enough. Please let me know.

You say /var/spool/mail contains 1360 "items".  That's ambiguous.  You
mean 1360 user mbox files, yes?

> # time to run ls -l (1360 items):
> time ls -l /var/spool/mail
> real    0m10.271s
> user    0m0.097s
> sys     0m0.162s

The combined CPU time is only 0.259s, so something else is adding 10+
seconds of delay.  mbox is notorious for locking problems.  I wonder if
stat() is waiting on the release of exclusive write locks held by your
MDA during delivery.  That would also explain the variability you
demonstrated, one run taking 10s and another 7s--fewer files were
locked during the second run.

Run 'ls' under "strace -T" to find out which system call is adding the
delay.  Run 'perf lock' concurrently to see the lock contention.
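Something along these lines, as a sketch--the syscall filter is my
guess at what 'ls -l' calls per directory entry on your kernel, so
adjust as needed:

```shell
SPOOL=/var/spool/mail   # the directory with the slow 'ls -l'

# -T appends the wall-clock time spent in each syscall as <seconds>;
# -e limits the trace to the calls made per directory entry.
strace -T -e trace=lstat,getdents,open \
    ls -l "$SPOOL" > /dev/null 2> /tmp/ls-trace.txt

# Sort on the <time> field so the slowest calls surface first.
sort -t'<' -k2 -rn /tmp/ls-trace.txt | head -20
```

If one lstat() dominates with a multi-second time, that points at a
lock held on that mbox file rather than at the filesystem itself.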

More comments below.


> # Oct 29, 2013
> #
> # XFS slowness with ls du etc.
> 
> kernel version (uname -a)
> Linux dovecot2.sri.utoronto.ca 2.6.32-358.18.1.el6.x86_64 #1 SMP Fri Aug 2 
> 17:04:38 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
> 
> xfsprogs version (xfs_repair -V)
> xfs_repair version 3.1.1
> 
> number of CPUs: 16 virtual CPUs
> 
> contents of /proc/meminfo
> MemTotal:       32878632 kB
> MemFree:          310260 kB
> Buffers:          220496 kB
> Cached:         29971428 kB
> SwapCached:          212 kB
> Active:          9726912 kB
> Inactive:       21180124 kB
> Active(anon):     385040 kB
> Inactive(anon):   330260 kB
> Active(file):    9341872 kB
> Inactive(file): 20849864 kB
> Unevictable:           0 kB
> Mlocked:               0 kB
> SwapTotal:       6160376 kB
> SwapFree:        6159896 kB
> Dirty:               772 kB
> Writeback:             0 kB
> AnonPages:        714912 kB
> Mapped:            52696 kB
> Shmem:               184 kB
> Slab:            1045988 kB
> SReclaimable:     776440 kB
> SUnreclaim:       269548 kB
> KernelStack:        8736 kB
> PageTables:        57172 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:    22599692 kB
> Committed_AS:    1148180 kB
> VmallocTotal:   34359738367 kB
> VmallocUsed:      335224 kB
> VmallocChunk:   34333710552 kB
> HardwareCorrupted:     0 kB
> AnonHugePages:    180224 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> DirectMap4k:       10240 kB
> DirectMap2M:    33544192 kB
> 
> contents of /proc/mounts
> [root@dovecot2 ~]# cat /proc/mounts
> rootfs / rootfs rw 0 0
> proc /proc proc rw,relatime 0 0
> sysfs /sys sysfs rw,seclabel,relatime 0 0
> devtmpfs /dev devtmpfs 
> rw,seclabel,relatime,size=16429456k,nr_inodes=4107364,mode=755 0 0
> devpts /dev/pts devpts rw,seclabel,relatime,gid=5,mode=620,ptmxmode=000 0 0
> tmpfs /dev/shm tmpfs rw,seclabel,relatime 0 0
> /dev/mapper/vg_ipa1-lv_root / ext4 
> rw,seclabel,relatime,barrier=1,data=ordered 0 0
> none /selinux selinuxfs rw,relatime 0 0
> devtmpfs /dev devtmpfs 
> rw,seclabel,relatime,size=16429456k,nr_inodes=4107364,mode=755 0 0
> /proc/bus/usb /proc/bus/usb usbfs rw,relatime 0 0
> /dev/sda1 /boot ext4 rw,seclabel,relatime,barrier=1,data=ordered 0 0
> /dev/mapper/vg_ipa1-lv_home /var ext4 
> rw,seclabel,relatime,barrier=1,data=ordered 0 0
> /dev/sdb1 /mailhomes xfs 
> rw,seclabel,relatime,attr2,delaylog,nobarrier,inode64,usrquota 0 0
> /dev/sdc1 /var/spool/mail xfs 
> rw,seclabel,relatime,attr2,delaylog,nobarrier,inode64,usrquota 0 0
> none /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
> sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
> 
> contents of /proc/partitions
> major minor  #blocks  name
> 
>    8        0   83886080 sda
>    8        1     512000 sda1
>    8        2   83373056 sda2
>    8       16 5368709120 sdb
>    8       17 5368707072 sdb1
>    8       32 2147483648 sdc
>    8       33 2147481600 sdc1
>  253        0   56729600 dm-0
>  253        1    6160384 dm-1
>  253        2   20480000 dm-2
> 
> RAID layout (hardware and/or software): Hardware RAID6 IBM Storwize V7000

What we're looking for here is the RAID geometry of the local disk
device, logical volume, SAN LUN, etc. underlying the XFS filesystem:
the number of disks in the array, the RAID level, and the stripe unit
and stripe width.

> LVM configuration: XFS not configured
> 
> type of disks you are using: SAS drives
> 
> write cache status of drives: V7000 has 16 GB of cache
> 
> size of BBWC and mode it is running in: N/A

Hopefully the 16GB of cache in the V7000 defaults to write-back mode,
or write-back has been manually enabled.  If not you're playing Russian
roulette, because you're mounting with nobarrier.  You should only
disable barriers when you have persistent write cache, either battery
or flash backed.
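If that cache ever turns out not to be battery/flash backed, the safe
fallback is to drop nobarrier.  An fstab line like this (options copied
from your /proc/mounts, minus nobarrier and the SELinux-supplied
seclabel; barriers are on by default):

```
/dev/sdc1  /var/spool/mail  xfs  relatime,attr2,delaylog,inode64,usrquota  0 0
```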

> xfs_info output on the filesystem in question
> meta-data=/dev/sdc1              isize=256    agcount=4, agsize=134217600 blks
>          =                       sectsz=512   attr=2
> data     =                       bsize=4096   blocks=536870400, imaxpct=5
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=262143, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0

2TB seems rather large for a 1360 user mail spool directory, given new
mail is apparently moved from INBOX to other folders in /mailhomes.

Additionally, given the innate parallelism of a mail workload, 4 AGs is
far from optimal.  XFS throughput under a highly parallel allocation
workload is limited by the number of allocation groups.  This
filesystem was apparently created with mkfs.xfs defaults.  It should
have been created with many more AGs, 16 at minimum.  The same goes for
the /mailhomes filesystem.

> dmesg output showing all error messages and stack traces: N/A
> 
> # time to run ls -l (1360 items):
> time ls -l /var/spool/mail
> real    0m10.271s
> user    0m0.097s
> sys     0m0.162s
> 
> # iostat when ls -l is running:
> [root@dovecot2 ~]# date;iostat -xdmz 5
> Tue Oct 29 14:44:28 EDT 2013
> Linux 2.6.32-358.18.1.el6.x86_64 (dovecot2.sri.utoronto.ca)     10/29/2013    
>   _x86_64_        (16 CPU)
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz 
> avgqu-sz   await  svctm  %util
> sda               0.02    17.87    0.17   24.49     0.01     0.17    14.16    
>  0.01    0.56   0.28   0.68
> sdb               0.19     0.08    1.62    1.94     0.04     0.02    33.47    
>  0.01    2.76   1.23   0.44
> sdc               0.00     0.03   11.57    0.69     1.10     0.05   192.23    
>  0.02    1.65   0.78   0.96
> dm-0              0.00     0.00    0.11    2.24     0.00     0.01     8.70    
>  0.00    1.14   0.51   0.12
> dm-1              0.00     0.00    0.00    0.00     0.00     0.00     8.00    
>  0.00    1.26   0.24   0.00
> dm-2              0.00     0.00    0.08   40.11     0.00     0.16     8.18    
>  0.03    0.86   0.14   0.57
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz 
> avgqu-sz   await  svctm  %util
> sda               0.00   155.00    0.00  127.20     0.00     1.10    17.75    
>  0.06    0.45   0.25   3.22
> sdb               0.00     0.20    0.00   11.60     0.00     0.07    11.81    
>  0.05    4.07   0.17   0.20
> sdc               0.00     0.00    0.00    0.40     0.00     0.00    12.00    
>  0.00    0.50   0.50   0.02
> dm-0              0.00     0.00    0.00    1.20     0.00     0.00     8.00    
>  0.01    4.50   4.50   0.54
> dm-2              0.00     0.00    0.00  281.40     0.00     1.10     8.00    
>  0.19    0.68   0.10   2.74
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz 
> avgqu-sz   await  svctm  %util
> sda               0.00  1480.80    0.00  627.20     0.00     8.23    26.89    
>  0.28    0.45   0.20  12.70
> sdb               0.00     0.00    0.40    4.00     0.00     0.03    18.27    
>  0.00    1.05   1.05   0.46
> sdc               0.00     0.00    0.00    1.40     0.00     0.00     6.71    
>  0.00    0.14   0.14   0.02
> dm-0              0.00     0.00    0.00   17.20     0.00     0.07     8.00    
>  0.02    1.09   0.09   0.16
> dm-2              0.00     0.00    0.00 2090.20     0.00     8.16     8.00    
>  1.65    0.79   0.06  12.52
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz 
> avgqu-sz   await  svctm  %util
> sda               0.00  1331.00    0.00  629.80     0.00     7.66    24.91    
>  0.28    0.44   0.22  13.64
> sdb               0.00     0.00    0.00    2.80     0.00     0.02    14.07    
>  0.00    0.79   0.57   0.16
> sdc               0.00     0.00    0.00    1.40     0.00     0.02    30.14    
>  0.00    0.57   0.57   0.08
> dm-0              0.00     0.00    0.00    1.20     0.00     0.00     8.00    
>  0.00    3.83   3.83   0.46
> dm-2              0.00     0.00    0.00 1959.60     0.00     7.65     8.00    
>  1.43    0.73   0.07  13.20
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz 
> avgqu-sz   await  svctm  %util
> sda               0.00     6.60    0.00    0.60     0.00     0.03    96.00    
>  0.00    1.33   1.33   0.08
> sdb               0.00     0.20    0.00    0.80     0.00     0.00     8.25    
>  0.00    0.75   0.75   0.06
> sdc               0.00     0.20    0.00    0.20     0.00     0.00    16.00    
>  0.00    0.00   0.00   0.00
> dm-0              0.00     0.00    0.00    6.20     0.00     0.02     8.00    
>  0.01    0.97   0.03   0.02
> dm-2              0.00     0.00    0.00    1.00     0.00     0.00     8.00    
>  0.00    3.00   0.60   0.06
> 
> # a second run of ls -l:
> [root@dovecot2 ~]# date;time ls -l /var/spool/mail
> Fri Nov  1 15:04:43 EDT 2013
> real    0m7.814s
> user    0m0.096s
> sys     0m0.139s
> 
> [root@dovecot2 ~]# date;iostat -xmdz 5
> Fri Nov  1 15:04:41 EDT 2013
> Linux 2.6.32-358.18.1.el6.x86_64 (dovecot2.sri.utoronto.ca)     11/01/2013    
>   _x86_64_        (16 CPU)
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz 
> avgqu-sz   await  svctm  %util
> sda               0.01    21.20    0.16   28.28     0.00     0.19    14.27    
>  0.02    0.57   0.27   0.78
> sdb               0.22     0.09    1.70    2.19     0.04     0.02    31.03    
>  0.01    2.83   1.20   0.47
> sdc               0.00     0.04   12.63    0.75     1.21     0.05   192.59    
>  0.02    1.60   0.77   1.04
> dm-0              0.00     0.00    0.11    2.46     0.00     0.01     8.57    
>  0.00    1.17   0.51   0.13
> dm-1              0.00     0.00    0.00    0.00     0.00     0.00     8.00    
>  0.00    2.03   0.21   0.00
> dm-2              0.00     0.00    0.07   47.02     0.00     0.18     8.15    
>  0.04    0.93   0.14   0.65
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz 
> avgqu-sz   await  svctm  %util
> sda               0.00   730.20    0.00 1251.60     0.00     7.76    12.69    
>  0.52    0.42   0.21  25.86
> sdb               0.00     1.00    0.80   25.40     0.01     0.16    12.62    
>  0.01    0.37   0.18   0.46
> sdc               0.00     0.00    0.00    2.60     0.00     0.03    20.62    
>  0.00    0.31   0.23   0.06
> dm-0              0.00     0.00    0.00    0.20     0.00     0.00     8.00    
>  0.00    3.00   3.00   0.06
> dm-2              0.00     0.00    0.00 1981.00     0.00     7.74     8.00    
>  1.04    0.52   0.13  25.96
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz 
> avgqu-sz   await  svctm  %util
> sda               0.00  1544.20    0.00 2844.60     0.00    17.14    12.34    
>  1.10    0.39   0.21  58.34
> sdb               0.00     0.00    0.00    1.20     0.00     0.01     9.33    
>  0.00    0.00   0.00   0.00
> sdc               0.00     0.00    0.00    0.60     0.00     0.00     8.00    
>  0.00    3.00   1.00   0.06
> dm-0              0.00     0.00    0.00    3.00     0.00     0.01     8.00    
>  0.00    0.07   0.07   0.02
> dm-2              0.00     0.00    0.00 4386.00     0.00    17.13     8.00    
>  2.19    0.50   0.13  58.46
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz 
> avgqu-sz   await  svctm  %util
> sda               0.00   104.80    0.00  212.80     0.00     1.24    11.94    
>  0.09    0.41   0.21   4.44
> sdb               0.00     0.00    0.20    3.00     0.00     0.01     9.81    
>  0.00    0.50   0.50   0.16
> sdc               0.00     0.20    0.00    2.80     0.00     0.02    16.86    
>  0.01    4.43   0.79   0.22
> dm-0              0.00     0.00    0.00    0.40     0.00     0.00     8.00    
>  0.00    5.50   5.50   0.22
> dm-2              0.00     0.00    0.00  317.00     0.00     1.24     8.00    
>  0.15    0.47   0.13   4.24
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz 
> avgqu-sz   await  svctm  %util
> sda               0.00     6.60    0.00    1.20     0.00     0.03    52.00    
>  0.00    0.33   0.33   0.04
> sdb               0.00     0.20    0.00    1.00     0.00     0.00     9.60    
>  0.00    0.00   0.00   0.00
> sdc               0.00     0.00    0.00    0.20     0.00     0.00     2.00    
>  0.00    1.00   1.00   0.02
> dm-0              0.00     0.00    0.00    4.40     0.00     0.02     8.00    
>  0.00    0.95   0.05   0.02
> dm-2              0.00     0.00    0.00    3.40     0.00     0.01     8.00    
>  0.00    0.06   0.06   0.02
> 
> [root@dovecot2 ~]# date;vmstat 5
> Fri Nov  1 15:04:39 EDT 2013
> procs -----------memory---------- ---swap-- -----io---- --system-- 
> -----cpu-----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa 
> st
>  0  0   1304 542644 229484 29885592    0    0    80    17    1    1  1  0 99  
> 0  0
>  0  1   1304 542752 229484 29885696    0    0     0  2211 1518 1281  1  0 98  
> 0  0
>  0  1   1304 544736 229484 29885696    0    0     6 17216 7535 7631  3  2 94  
> 2  0
>  0  0   1304 543988 229484 29885696    0    0     0  7403 3782 3889  1  1 97  
> 1  0
>  0  0   1304 543604 229484 29885712    0    0     1   215  399  277  0  0 100 
>  0  0


-- 
Stan
