slow ls -l

Nathaniel W. Turner nate at houseofnate.net
Tue Nov 12 15:55:22 CST 2013


Hi Qing,

I'm not sure, but it looks like those poll calls are related to "ls -l" 
communicating with the System Security Services Daemon (SSSD) to (I 
imagine) look up user and group names. Just speculation, but if true, 
this suggests a performance problem with SSSD or some related part of 
the system, not XFS. Someone with a clue about sssd might be able to 
confirm or deny this.
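
A quick way to test that theory: "ls -n" prints numeric UIDs/GIDs
instead of names, so it skips the name lookups entirely. Something
like this (a sketch, run on a cold cache) should show where the time
is going:

  time ls -l /var/spool/mail > /dev/null   # does uid/gid -> name lookups
  time ls -n /var/spool/mail > /dev/null   # numeric IDs, no NSS traffic
  time getent passwd qchang                # one lookup through NSS/sssd

If -l is slow and -n is fast, the time is going into the name lookups
rather than into the filesystem.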

Your "df" performance problem may be caused by something completely 
different.
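
(One more speculative check: the trace below mmaps /var/lib/sss/mc/group,
which is sssd's fast memory-mapped group cache, yet every UID lookup
still goes over the /var/lib/sss/pipes/nss socket. It may be worth
looking at whether the corresponding passwd cache file is in place:

  ls -l /var/lib/sss/mc/   # a missing or empty passwd file here would
                           # force every uid lookup through the daemon

Again, someone who actually knows sssd should sanity-check this.)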

n

On 11/12/2013 01:15 PM, Qing Chang wrote:
> Hi Stan,
>
> Thank you very much for your comments; I need some time to
> follow through on all of your suggestions.
>
> Yes, this is a dovecot IMAP and sendmail SMTP server, and there
> are 1360 mailboxes.
>
> With this email, I'd like to give you a bit more information from
> "strace -T" and other commands.
>
> Because "# CONFIG_LOCK_STAT is not set", I can not run perf
> lock record. Since this is a production server, I am reluctant to
> rebuild the kernel on it. But I hope the info below may give
> you an idea that it is not "lock" related. As slow "ls -l" and "du"
> also happen on the two file servers (64TB and 32TB), it does not
> feel like it is lock related.
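> (For reference, that kernel option can be checked without a rebuild:
> grep CONFIG_LOCK_STAT /boot/config-$(uname -r)
> That is where the "# CONFIG_LOCK_STAT is not set" line above comes from.)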
>
> I ran "(time strace -T ls -l /var/spool/mail > ls-out) >& ls-trace"
> and got quite a bit of data.
>
> First "time" output:
> =====
> real    0m17.720s
> user    0m0.574s
> sys     0m1.271s
> =====
>
> Some 14s was spent in the 164 "poll" calls on fd=5 that took more than
> 10ms each; the total number of "poll" calls was 6618!
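> (Those counts can be tallied from the trace file with something like
> the following awk sketch; it relies on the <seconds> suffix that
> "strace -T" appends to each line:
>
> awk '/^poll\(.*fd=5/ { gsub(/[<>]/, "", $NF); n++;
>     if ($NF + 0 > 0.01) { slow++; sum += $NF } }
>     END { printf "%d polls, %d over 10ms, %.2fs total\n", n, slow, sum }' ls-trace
> )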
>
> Below is the trace output. Listed at the top are the calls that took
> more than 100ms; the three sections after that show the context around
> the three longest calls. Note that the three mailboxes involved are
> not special compared to other mailboxes; a run at another time
> produces slow calls on different mailboxes. Caching seems to play a
> role too: after a fresh run that takes a long time, subsequent runs
> return instant results:
> =====
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.144634>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.104580>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.104812>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.181959>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.102520>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.107539>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.142462>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.215745>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.309854>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.166665>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.101035>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.113119>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.120745>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.110027>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.137906>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.249039>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.104079>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.250003>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.103791>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.102018>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.103551>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.103696>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.102164>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.105843>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.104413>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.101304>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.101869>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.101366>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.132958>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.309411>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.100342>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.101279>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.144725>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.109867>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.109951>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.290189>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.102449>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.101182>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.111699>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.107351>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.104558>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.115447>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.163904>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.266427>
>
> ##### fd=5 opened:
> socket(PF_FILE, SOCK_STREAM, 0)         = 5 <0.000016>
> fcntl(5, F_GETFL)                       = 0x2 (flags O_RDWR) <0.000010>
> fcntl(5, F_SETFL, O_RDWR|O_NONBLOCK)    = 0 <0.000010>
> fcntl(5, F_GETFD)                       = 0 <0.000010>
> fcntl(5, F_SETFD, FD_CLOEXEC)           = 0 <0.000010>
> connect(5, {sa_family=AF_FILE, path="/var/lib/sss/pipes/nss"}, 110) = 0 <0.001886>
> fstat(5, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0 <0.000010>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000012>
> sendto(5, "\24\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL, NULL, 0) = 16 <0.000029>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000012>
> sendto(5, "\1\0\0\0", 4, MSG_NOSIGNAL, NULL, 0) = 4 <0.000027>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.000011>
> read(5, "\24\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0", 16) = 16 <0.000013>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.000011>
> read(5, "\1\0\0\0", 4)                  = 4 <0.000025>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000011>
> sendto(5, "\24\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL, NULL, 0) = 16 <0.000026>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000011>
> sendto(5, "\225N\0\0", 4, MSG_NOSIGNAL, NULL, 0) = 4 <0.000025>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.000515>
> read(5, "L\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16) = 16 <0.000012>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.000011>
> read(5, "\1\0\0\0\0\0\0\0\225N\0\0\311\0\0\0qchang\0*\0Qing Ch"..., 60) = 60 <0.000026>
> open("/etc/group", O_RDONLY|O_CLOEXEC)  = 6 <0.000017>
> fstat(6, {st_mode=S_IFREG|0644, st_size=790, ...}) = 0 <0.000010>
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd5d53f4000 <0.004539>
> read(6, "root:x:0:root\nbin:x:1:root,bin,d"..., 4096) = 790 <0.000018>
> read(6, "", 4096)                       = 0 <0.000011>
> close(6)                                = 0 <0.000011>
> munmap(0x7fd5d53f4000, 4096)            = 0 <0.000016>
> open("/var/lib/sss/mc/group", O_RDONLY|O_CLOEXEC) = 6 <0.000021>
> fstat(6, {st_mode=S_IFREG|0644, st_size=5206312, ...}) = 0 <0.000010>
> mmap(NULL, 5206312, PROT_READ, MAP_SHARED, 6, 0) = 0x7fd5ce3c1000 <0.000015>
> poll([{fd=5, events=POLLIN|POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000012>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000011>
> sendto(5, "\24\0\0\0\"\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL, NULL, 0) = 16 <0.000031>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000011>
> sendto(5, "\311\0\0\0", 4, MSG_NOSIGNAL, NULL, 0) = 4 <0.000025>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.000427>
> read(5, "?\0\0\0\"\0\0\0\0\0\0\0\0\0\0\0", 16) = 16 <0.000013>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.000011>
> read(5, "\1\0\0\0\0\0\0\0\311\0\0\0\3\0\0\0\0*\0qiu"..., 47) = 47 <0.000017>
> lstat("/var/spool/mail/root", {st_mode=S_IFREG|0600, st_size=187649403, ...}) = 0 <0.000019>
> lgetxattr("/var/spool/mail/root", "security.selinux", "unconfined_u:object_r:file_t:s0", 255) = 32 <0.000020>
> lstat("/var/spool/mail/root", {st_mode=S_IFREG|0600, st_size=187649403, ...}) = 0 <0.000014>
> lgetxattr("/var/spool/mail/root", "system.posix_acl_access", 0x0, 0) = -1 ENODATA (No data available) <0.000018>
> lgetxattr("/var/spool/mail/root", "system.posix_acl_default", 0x0, 0) = -1 ENODATA (No data available) <0.000016>
> lstat("/var/spool/mail/agutstei", {st_mode=S_IFREG|0600, st_size=3864251, ...}) = 0 <0.000016>
> lgetxattr("/var/spool/mail/agutstei", "security.selinux", "unconfined_u:object_r:file_t:s0", 255) = 32 <0.000018>
> lstat("/var/spool/mail/agutstei", {st_mode=S_IFREG|0600, st_size=3864251, ...}) = 0 <0.000015>
> lgetxattr("/var/spool/mail/agutstei", "system.posix_acl_access", 0x0, 0) = -1 ENODATA (No data available) <0.000016>
> lgetxattr("/var/spool/mail/agutstei", "system.posix_acl_default", 0x0, 0) = -1 ENODATA (No data available) <0.000015>
> open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 7 <0.000017>
> fstat(7, {st_mode=S_IFREG|0644, st_size=1835, ...}) = 0 <0.000011>
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd5ce3c0000 <0.000014>
> read(7, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 1835 <0.000020>
> read(7, "", 4096)                       = 0 <0.000011>
> close(7)                                = 0 <0.000012>
>
> ###### took 0.309854s:
> munmap(0x7fd5ce3c0000, 4096)            = 0 <0.000015>
> poll([{fd=5, events=POLLIN|POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000011>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000010>
> sendto(5, "\24\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL, NULL, 0) = 16 <0.000034>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000010>
> sendto(5, "\253\260\0\0", 4, MSG_NOSIGNAL, NULL, 0) = 4 <0.000335>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.309854>
> read(5, "N\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16) = 16 <0.000012>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.000010>
> read(5, "\1\0\0\0\0\0\0\0\253\260\0\0\253\1\0\0acqu\0*\0Chen Ge A"..., 62) = 62 <0.000016>
> lstat("/var/spool/mail/araouf", {st_mode=S_IFREG|0600, st_size=3943191, ...}) = 0 <0.000052>
> lgetxattr("/var/spool/mail/araouf", "security.selinux", "unconfined_u:object_r:file_t:s0", 255) = 32 <0.000021>
> lstat("/var/spool/mail/araouf", {st_mode=S_IFREG|0600, st_size=3943191, ...}) = 0 <0.000014>
> lgetxattr("/var/spool/mail/araouf", "system.posix_acl_access", 0x0, 0) = -1 ENODATA (No data available) <0.000017>
> lgetxattr("/var/spool/mail/araouf", "system.posix_acl_default", 0x0, 0) = -1 ENODATA (No data available) <0.000015>
> open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 7 <0.000016>
> fstat(7, {st_mode=S_IFREG|0644, st_size=1835, ...}) = 0 <0.000010>
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd5ce3c0000 <0.000012>
> read(7, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 1835 <0.000018>
> read(7, "", 4096)                       = 0 <0.000010>
> close(7)                                = 0 <0.000012>
>
> ###### took 0.309411s:
> munmap(0x7fd5ce3c0000, 4096)            = 0 <0.000015>
> poll([{fd=5, events=POLLIN|POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000010>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000009>
> sendto(5, "\24\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL, NULL, 0) = 16 <0.000035>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000010>
> sendto(5, "\357\246\0\0", 4, MSG_NOSIGNAL, NULL, 0) = 4 <0.000342>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.309411>
> read(5, "V\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16) = 16 <0.000012>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.000010>
> read(5, "\1\0\0\0\0\0\0\0\357\246\0\0\253\1\0\0pthaya\0*\0Parthip"..., 70) = 70 <0.000018>
> lstat("/var/spool/mail/ptsao", {st_mode=S_IFREG|0600, st_size=515358, ...}) = 0 <0.000025>
> lgetxattr("/var/spool/mail/ptsao", "security.selinux", "unconfined_u:object_r:file_t:s0", 255) = 32 <0.000021>
> lstat("/var/spool/mail/ptsao", {st_mode=S_IFREG|0600, st_size=515358, ...}) = 0 <0.000014>
> lgetxattr("/var/spool/mail/ptsao", "system.posix_acl_access", 0x0, 0) = -1 ENODATA (No data available) <0.000019>
> lgetxattr("/var/spool/mail/ptsao", "system.posix_acl_default", 0x0, 0) = -1 ENODATA (No data available) <0.000014>
> open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 7 <0.000017>
> fstat(7, {st_mode=S_IFREG|0644, st_size=1835, ...}) = 0 <0.000010>
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd5ce3c0000 <0.000012>
> read(7, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 1835 <0.000019>
> read(7, "", 4096)                       = 0 <0.000009>
> close(7)                                = 0 <0.000011>
>
> ###### took 0.290189s:
> munmap(0x7fd5ce3c0000, 4096)            = 0 <0.000015>
> poll([{fd=5, events=POLLIN|POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000011>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000010>
> sendto(5, "\24\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL, NULL, 0) = 16 <0.000034>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5, revents=POLLOUT}]) <0.000010>
> sendto(5, "\10\270\0\0", 4, MSG_NOSIGNAL, NULL, 0) = 4 <0.000391>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.290189>
> read(5, "N\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16) = 16 <0.000012>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5, revents=POLLIN}]) <0.000010>
> read(5, "\1\0\0\0\0\0\0\0\10\270\0\0\327\1\0\0taberk\0*\0Thomas "..., 62) = 62 <0.000017>
> lstat("/var/spool/mail/taha", {st_mode=S_IFREG|0600, st_size=293648143, ...}) = 0 <0.000018>
> lgetxattr("/var/spool/mail/taha", "security.selinux", "unconfined_u:object_r:file_t:s0", 255) = 32 <0.000022>
> lstat("/var/spool/mail/taha", {st_mode=S_IFREG|0600, st_size=293648143, ...}) = 0 <0.000013>
> lgetxattr("/var/spool/mail/taha", "system.posix_acl_access", 0x0, 0) = -1 ENODATA (No data available) <0.000021>
> lgetxattr("/var/spool/mail/taha", "system.posix_acl_default", 0x0, 0) = -1 ENODATA (No data available) <0.000014>
> open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 7 <0.000021>
> fstat(7, {st_mode=S_IFREG|0644, st_size=1835, ...}) = 0 <0.000010>
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd5ce3c0000 <0.000015>
> read(7, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 1835 <0.000033>
> read(7, "", 4096)                       = 0 <0.000011>
> close(7)                                = 0 <0.000013>
> =====
>
> ====== iostat around the time ls -l was run =====
> [root at dovecot2 ~]# iostat -xmz 5
> Linux 2.6.32-358.18.1.el6.x86_64 (dovecot2.sri.utoronto.ca) 11/12/2013      _x86_64_        (16 CPU)
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.56    0.00    0.17    0.06    0.00   99.21
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
> sda               0.01    19.88    0.14   26.86     0.00     0.18    14.18     0.02    0.57   0.28   0.75
> sdb               0.23     0.08    1.75    1.98     0.04     0.02    30.38     0.01    2.81   1.26   0.47
> sdc               0.00     0.03   13.78    0.71     1.34     0.05   196.35     0.02    1.60   0.77   1.11
> dm-0              0.00     0.00    0.09    2.42     0.00     0.01     8.49     0.00    1.18   0.52   0.13
> dm-1              0.00     0.00    0.00    0.00     0.00     0.00     8.00     0.00    3.09   0.27   0.00
> dm-2              0.00     0.00    0.06   44.33     0.00     0.17     8.14     0.04    0.90   0.14   0.63
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            1.31    0.00    0.46    0.10    0.00   98.12
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
> sda               0.00    27.40    1.00   65.80     0.02     0.36    11.78     0.04    0.61   0.43   2.86
> sdb               0.00     0.00    1.80    0.00     0.03     0.00    31.11     0.01    6.44   2.00   0.36
> sdc               0.00     1.00    0.00    4.80     0.00     0.03    11.79     0.03    6.88   0.46   0.22
> dm-0              0.00     0.00    0.80    1.20     0.02     0.00    24.00     0.01    6.10   6.10   1.22
> dm-2              0.00     0.00    0.00   92.00     0.00     0.36     8.00     0.06    0.62   0.19   1.76
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            5.76    0.00    1.62    0.59    0.00   92.03
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
> sda               0.00  1045.60    0.80  959.40     0.05     7.83    16.82     0.44    0.46   0.21  19.98
> sdb               0.00     0.00    1.40    1.20     0.01     0.01     9.23     0.00    0.62   0.54   0.14
> sdc               0.00     0.00    0.00    0.60     0.00     0.01    18.67     0.00    0.33   0.33   0.02
> dm-0              0.00     0.00    0.80    8.00     0.05     0.03    20.00     0.02    2.05   1.41   1.24
> dm-2              0.00     0.00    0.00  1997.00    0.00     7.80     8.00     1.24    0.62   0.09  18.86
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            4.38    0.00    1.29    0.67    0.00   93.66
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
> sda               0.00  1204.40    0.00  1074.60    0.00     8.90    16.97     0.41    0.38   0.20  21.04
> sdb               0.20     0.00    2.00    0.00     0.01     0.00    13.60     0.01    6.80   4.00   0.80
> dm-0              0.00     0.00    0.00    9.00     0.00     0.04     8.00     0.01    1.27   0.42   0.38
> dm-2              0.00     0.00    0.00  2270.00    0.00     8.87     8.00     1.33    0.59   0.09  20.74
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            4.24    0.00    1.16    0.58    0.00   94.02
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
> sda               0.00  1159.20    0.00   985.80    0.00     8.38    17.41     0.34    0.35   0.18  17.84
> sdb               0.00     0.00    0.20    1.40     0.00     0.01     8.62     0.00    0.75   0.75   0.12
> sdc               0.00     0.00    0.00    0.20     0.00     0.00    16.00     0.00    1.00   1.00   0.02
> dm-0              0.00     0.00    0.00    5.00     0.00     0.02     8.00     0.00    0.24   0.24   0.12
> dm-2              0.00     0.00    0.00  2140.00    0.00     8.36     8.00     1.23    0.58   0.08  17.90
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            2.77    0.00    1.14    0.54    0.00   95.55
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
> sda               0.00   768.20    0.00   847.40    0.00     6.31    15.25     0.36    0.43   0.19  16.40
> sdb               0.00     0.00    0.00    2.00     0.00     0.01     7.00     0.00    0.10   0.10   0.02
> sdc               0.00     0.00    0.00    2.00     0.00     0.04    37.60     0.00    0.70   0.40   0.08
> dm-0              0.00     0.00    0.00   18.80     0.00     0.07     8.00     0.09    4.84   0.27   0.50
> dm-2              0.00     0.00    0.00  1596.80    0.00     6.24     8.00     0.84    0.52   0.10  16.22
> =====
>
> ===== vmstat =====
> [root at dovecot2 ~]# vmstat 5
> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
>  2  0   3476 785664 223836 29329692    0    0    88    16    1    1  1  0 99  0  0
>  1  0   3476 783512 223844 29330448    0    0    84  3383 2210 3985  3  1 96  0  0
>  0  1   3476 779328 223848 29331208    0    0    20  9254 4530 9894  5  2 92  1  0
>  2  0   3476 778616 223856 29331956    0    0     1  9104 4160 10909  4  1 94  1  0
>  1  0   3476 777584 223864 29332932    0    0     0 10056 4428 10263  4  1 94  1  0
>  0  0   3476 778312 223872 29333044    0    0     2   660  921 1871  1  0 99  0  0
> =====
>
> Really appreciate your willingness to help and your time spent on this!
>
> Qing
>
> On 04/11/2013 1:42 AM, Stan Hoeppner wrote:
>> On 11/1/2013 2:11 PM, Qing Chang wrote:
>>> I have collected some info following the FAQ guide. It may still
>>> not be enough. Please let me know.
>> You say /var/spool/mail contains 1360 "items".  That's ambiguous.  You
>> mean 1360 user mbox files, yes?
>>
>>> # time to run ls -l (1360 items):
>>> time ls -l /var/spool/mail
>>> real    0m10.271s
>>> user    0m0.097s
>>> sys     0m0.162s
>> The combined CPU execution time is apparently 0.259s.  Something else is
>> adding 10s+ of delay.  mbox is notorious for locking problems.  I wonder
>> if stat() is waiting on release of exclusive write locks being held by
>> your MDA during delivery.  This would also tend to explain the
>> variability you demonstrated, with one run being 10s and another being
>> 7s--fewer files locked on the second run.
>>
>> Run 'ls' with "strace -T" to find out which system call is adding the
>> delay.  Run perf-lock concurrently to see the locks.
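>> For example, something along these lines:
>>
>> strace -T -o /tmp/ls.trace ls -l /var/spool/mail > /dev/null
>> perf lock record sleep 30    # requires CONFIG_LOCK_STAT in the kernel
>> perf lock report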
>>
>> More comments below.
>>
>>
>>> # Oct 29, 2013
>>> #
>>> # XFS slowness with ls du etc.
>>>
>>> kernel version (uname -a)
>>> Linux dovecot2.sri.utoronto.ca 2.6.32-358.18.1.el6.x86_64 #1 SMP Fri Aug 2 17:04:38 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> xfsprogs version (xfs_repair -V)
>>> xfs_repair version 3.1.1
>>>
>>> number of CPUs: 16 virtual CPUs
>>>
>>> contents of /proc/meminfo
>>> MemTotal:       32878632 kB
>>> MemFree:          310260 kB
>>> Buffers:          220496 kB
>>> Cached:         29971428 kB
>>> SwapCached:          212 kB
>>> Active:          9726912 kB
>>> Inactive:       21180124 kB
>>> Active(anon):     385040 kB
>>> Inactive(anon):   330260 kB
>>> Active(file):    9341872 kB
>>> Inactive(file): 20849864 kB
>>> Unevictable:           0 kB
>>> Mlocked:               0 kB
>>> SwapTotal:       6160376 kB
>>> SwapFree:        6159896 kB
>>> Dirty:               772 kB
>>> Writeback:             0 kB
>>> AnonPages:        714912 kB
>>> Mapped:            52696 kB
>>> Shmem:               184 kB
>>> Slab:            1045988 kB
>>> SReclaimable:     776440 kB
>>> SUnreclaim:       269548 kB
>>> KernelStack:        8736 kB
>>> PageTables:        57172 kB
>>> NFS_Unstable:          0 kB
>>> Bounce:                0 kB
>>> WritebackTmp:          0 kB
>>> CommitLimit:    22599692 kB
>>> Committed_AS:    1148180 kB
>>> VmallocTotal:   34359738367 kB
>>> VmallocUsed:      335224 kB
>>> VmallocChunk:   34333710552 kB
>>> HardwareCorrupted:     0 kB
>>> AnonHugePages:    180224 kB
>>> HugePages_Total:       0
>>> HugePages_Free:        0
>>> HugePages_Rsvd:        0
>>> HugePages_Surp:        0
>>> Hugepagesize:       2048 kB
>>> DirectMap4k:       10240 kB
>>> DirectMap2M:    33544192 kB
>>>
>>> contents of /proc/mounts
>>> [root at dovecot2 ~]# cat /proc/mounts
>>> rootfs / rootfs rw 0 0
>>> proc /proc proc rw,relatime 0 0
>>> sysfs /sys sysfs rw,seclabel,relatime 0 0
>>> devtmpfs /dev devtmpfs rw,seclabel,relatime,size=16429456k,nr_inodes=4107364,mode=755 0 0
>>> devpts /dev/pts devpts rw,seclabel,relatime,gid=5,mode=620,ptmxmode=000 0 0
>>> tmpfs /dev/shm tmpfs rw,seclabel,relatime 0 0
>>> /dev/mapper/vg_ipa1-lv_root / ext4 rw,seclabel,relatime,barrier=1,data=ordered 0 0
>>> none /selinux selinuxfs rw,relatime 0 0
>>> devtmpfs /dev devtmpfs rw,seclabel,relatime,size=16429456k,nr_inodes=4107364,mode=755 0 0
>>> /proc/bus/usb /proc/bus/usb usbfs rw,relatime 0 0
>>> /dev/sda1 /boot ext4 rw,seclabel,relatime,barrier=1,data=ordered 0 0
>>> /dev/mapper/vg_ipa1-lv_home /var ext4 rw,seclabel,relatime,barrier=1,data=ordered 0 0
>>> /dev/sdb1 /mailhomes xfs rw,seclabel,relatime,attr2,delaylog,nobarrier,inode64,usrquota 0 0
>>> /dev/sdc1 /var/spool/mail xfs rw,seclabel,relatime,attr2,delaylog,nobarrier,inode64,usrquota 0 0
>>> none /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
>>> sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
>>>
>>> contents of /proc/partitions
>>> major minor  #blocks  name
>>>
>>>     8        0   83886080 sda
>>>     8        1     512000 sda1
>>>     8        2   83373056 sda2
>>>     8       16 5368709120 sdb
>>>     8       17 5368707072 sdb1
>>>     8       32 2147483648 sdc
>>>     8       33 2147481600 sdc1
>>>   253        0   56729600 dm-0
>>>   253        1    6160384 dm-1
>>>   253        2   20480000 dm-2
>>>
>>> RAID layout (hardware and/or software): Hardware RAID6 IBM Storwize V7000
>> What we're looking for here is the RAID geometry of the local disk
>> device, logical volume, SAN LUN, etc, underlying the XFS filesystem,
>> i.e. number of disks in the array, RAID level, stripe unit and stripe width.
>>
>>> LVM configuration: none (LVM is not used for the XFS filesystems)
>>>
>>> type of disks you are using: SAS drives
>>>
>>> write cache status of drives: V7000 has 16 GB of cache
>>>
>>> size of BBWC and mode it is running in: N/A
>> Hopefully the 16GB of cache in the V7000 either defaults to
>> write-back mode or has been manually set to it.  If not, you're
>> playing Russian roulette, because you're mounting with nobarrier.
>> You should only disable barriers if you have a persistent write
>> cache, either battery or flash based.
>>
>>> xfs_info output on the filesystem in question
>>> meta-data=/dev/sdc1              isize=256    agcount=4, agsize=134217600 blks
>>>           =                       sectsz=512   attr=2
>>> data     =                       bsize=4096   blocks=536870400, imaxpct=5
>>>           =                       sunit=0      swidth=0 blks
>>> naming   =version 2              bsize=4096   ascii-ci=0
>>> log      =internal               bsize=4096   blocks=262143, version=2
>>>           =                       sectsz=512   sunit=0 blks, lazy-count=1
>>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>> 2TB seems rather large for a 1360 user mail spool directory, given new
>> mail is apparently moved from INBOX to other folders in /mailhomes.
>>
>> Additionally, given the innate parallelism in a mail workload, 4 AGs is
>> far less than optimal.  XFS throughput with a highly parallel workload
>> is limited by the number of allocation groups.  This filesystem was
>> apparently created with mkfs.xfs defaults.  It should have been created
>> with many more AGs, 16 minimum.  Same goes for the /mailhomes filesystem.
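>> The AG count is fixed at mkfs time, e.g. "mkfs.xfs -d agcount=16
>> /dev/sdc1", so fixing it means recreating the filesystem; not
>> something to do casually on a live spool.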
>>
>>> dmesg output showing all error messages and stack traces: N/A
>>>
>>> # time to run ls -l (1360 items):
>>> time ls -l /var/spool/mail
>>> real    0m10.271s
>>> user    0m0.097s
>>> sys     0m0.162s
>>>
>>> # iostat when ls -l is running:
>>> [root at dovecot2 ~]# date;iostat -xdmz 5
>>> Tue Oct 29 14:44:28 EDT 2013
>>> Linux 2.6.32-358.18.1.el6.x86_64 (dovecot2.sri.utoronto.ca)     10/29/2013      _x86_64_        (16 CPU)
>>>
>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sda               0.02    17.87    0.17   24.49     0.01     0.17    14.16     0.01    0.56   0.28   0.68
>>> sdb               0.19     0.08    1.62    1.94     0.04     0.02    33.47     0.01    2.76   1.23   0.44
>>> sdc               0.00     0.03   11.57    0.69     1.10     0.05   192.23     0.02    1.65   0.78   0.96
>>> dm-0              0.00     0.00    0.11    2.24     0.00     0.01     8.70     0.00    1.14   0.51   0.12
>>> dm-1              0.00     0.00    0.00    0.00     0.00     0.00     8.00     0.00    1.26   0.24   0.00
>>> dm-2              0.00     0.00    0.08   40.11     0.00     0.16     8.18     0.03    0.86   0.14   0.57
>>>
>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sda               0.00   155.00    0.00  127.20     0.00     1.10    17.75     0.06    0.45   0.25   3.22
>>> sdb               0.00     0.20    0.00   11.60     0.00     0.07    11.81     0.05    4.07   0.17   0.20
>>> sdc               0.00     0.00    0.00    0.40     0.00     0.00    12.00     0.00    0.50   0.50   0.02
>>> dm-0              0.00     0.00    0.00    1.20     0.00     0.00     8.00     0.01    4.50   4.50   0.54
>>> dm-2              0.00     0.00    0.00  281.40     0.00     1.10     8.00     0.19    0.68   0.10   2.74
>>>
>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sda               0.00  1480.80    0.00  627.20     0.00     8.23    26.89     0.28    0.45   0.20  12.70
>>> sdb               0.00     0.00    0.40    4.00     0.00     0.03    18.27     0.00    1.05   1.05   0.46
>>> sdc               0.00     0.00    0.00    1.40     0.00     0.00     6.71     0.00    0.14   0.14   0.02
>>> dm-0              0.00     0.00    0.00   17.20     0.00     0.07     8.00     0.02    1.09   0.09   0.16
>>> dm-2              0.00     0.00    0.00 2090.20     0.00     8.16     8.00     1.65    0.79   0.06  12.52
>>>
>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sda               0.00  1331.00    0.00  629.80     0.00     7.66    24.91     0.28    0.44   0.22  13.64
>>> sdb               0.00     0.00    0.00    2.80     0.00     0.02    14.07     0.00    0.79   0.57   0.16
>>> sdc               0.00     0.00    0.00    1.40     0.00     0.02    30.14     0.00    0.57   0.57   0.08
>>> dm-0              0.00     0.00    0.00    1.20     0.00     0.00     8.00     0.00    3.83   3.83   0.46
>>> dm-2              0.00     0.00    0.00 1959.60     0.00     7.65     8.00     1.43    0.73   0.07  13.20
>>>
>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sda               0.00     6.60    0.00    0.60     0.00     0.03    96.00     0.00    1.33   1.33   0.08
>>> sdb               0.00     0.20    0.00    0.80     0.00     0.00     8.25     0.00    0.75   0.75   0.06
>>> sdc               0.00     0.20    0.00    0.20     0.00     0.00    16.00     0.00    0.00   0.00   0.00
>>> dm-0              0.00     0.00    0.00    6.20     0.00     0.02     8.00     0.01    0.97   0.03   0.02
>>> dm-2              0.00     0.00    0.00    1.00     0.00     0.00     8.00     0.00    3.00   0.60   0.06
>>>
>>> # a second run of ls -l:
>>> [root at dovecot2 ~]# date;time ls -l /var/spool/mail
>>> Fri Nov  1 15:04:43 EDT 2013
>>> real    0m7.814s
>>> user    0m0.096s
>>> sys     0m0.139s
>>>
>>> [root at dovecot2 ~]# date;iostat -xmdz 5
>>> Fri Nov  1 15:04:41 EDT 2013
>>> Linux 2.6.32-358.18.1.el6.x86_64 (dovecot2.sri.utoronto.ca)     11/01/2013      _x86_64_        (16 CPU)
>>>
>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sda               0.01    21.20    0.16   28.28     0.00     0.19    14.27     0.02    0.57   0.27   0.78
>>> sdb               0.22     0.09    1.70    2.19     0.04     0.02    31.03     0.01    2.83   1.20   0.47
>>> sdc               0.00     0.04   12.63    0.75     1.21     0.05   192.59     0.02    1.60   0.77   1.04
>>> dm-0              0.00     0.00    0.11    2.46     0.00     0.01     8.57     0.00    1.17   0.51   0.13
>>> dm-1              0.00     0.00    0.00    0.00     0.00     0.00     8.00     0.00    2.03   0.21   0.00
>>> dm-2              0.00     0.00    0.07   47.02     0.00     0.18     8.15     0.04    0.93   0.14   0.65
>>>
>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sda               0.00   730.20    0.00 1251.60     0.00     7.76    12.69     0.52    0.42   0.21  25.86
>>> sdb               0.00     1.00    0.80   25.40     0.01     0.16    12.62     0.01    0.37   0.18   0.46
>>> sdc               0.00     0.00    0.00    2.60     0.00     0.03    20.62     0.00    0.31   0.23   0.06
>>> dm-0              0.00     0.00    0.00    0.20     0.00     0.00     8.00     0.00    3.00   3.00   0.06
>>> dm-2              0.00     0.00    0.00 1981.00     0.00     7.74     8.00     1.04    0.52   0.13  25.96
>>>
>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sda               0.00  1544.20    0.00 2844.60     0.00    17.14    12.34     1.10    0.39   0.21  58.34
>>> sdb               0.00     0.00    0.00    1.20     0.00     0.01     9.33     0.00    0.00   0.00   0.00
>>> sdc               0.00     0.00    0.00    0.60     0.00     0.00     8.00     0.00    3.00   1.00   0.06
>>> dm-0              0.00     0.00    0.00    3.00     0.00     0.01     8.00     0.00    0.07   0.07   0.02
>>> dm-2              0.00     0.00    0.00 4386.00     0.00    17.13     8.00     2.19    0.50   0.13  58.46
>>>
>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sda               0.00   104.80    0.00  212.80     0.00     1.24    11.94     0.09    0.41   0.21   4.44
>>> sdb               0.00     0.00    0.20    3.00     0.00     0.01     9.81     0.00    0.50   0.50   0.16
>>> sdc               0.00     0.20    0.00    2.80     0.00     0.02    16.86     0.01    4.43   0.79   0.22
>>> dm-0              0.00     0.00    0.00    0.40     0.00     0.00     8.00     0.00    5.50   5.50   0.22
>>> dm-2              0.00     0.00    0.00  317.00     0.00     1.24     8.00     0.15    0.47   0.13   4.24
>>>
>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sda               0.00     6.60    0.00    1.20     0.00     0.03    52.00     0.00    0.33   0.33   0.04
>>> sdb               0.00     0.20    0.00    1.00     0.00     0.00     9.60     0.00    0.00   0.00   0.00
>>> sdc               0.00     0.00    0.00    0.20     0.00     0.00     2.00     0.00    1.00   1.00   0.02
>>> dm-0              0.00     0.00    0.00    4.40     0.00     0.02     8.00     0.00    0.95   0.05   0.02
>>> dm-2              0.00     0.00    0.00    3.40     0.00     0.01     8.00     0.00    0.06   0.06   0.02
>>>
>>> [root at dovecot2 ~]# date;vmstat 5
>>> Fri Nov  1 15:04:39 EDT 2013
>>> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
>>>   r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
>>>   0  0   1304 542644 229484 29885592    0    0    80    17    1    1  1  0 99  0  0
>>>   0  1   1304 542752 229484 29885696    0    0     0  2211 1518 1281  1  0 98  0  0
>>>   0  1   1304 544736 229484 29885696    0    0     6 17216 7535 7631  3  2 94  2  0
>>>   0  0   1304 543988 229484 29885696    0    0     0  7403 3782 3889  1  1 97  1  0
>>>   0  0   1304 543604 229484 29885712    0    0     1   215  399  277  0  0 100  0  0
>
