slow ls -l
Nathaniel W. Turner
nate at houseofnate.net
Tue Nov 12 15:55:22 CST 2013
Hi Qing,
I'm not sure, but it looks like those poll calls are related to "ls -l"
communicating with the System Security Services Daemon (SSSD) to (I
imagine) look up user and group names. Just speculation, but if true,
this suggests a performance problem with SSSD or some related part of
the system, not XFS. Someone with a clue about sssd might be able to
confirm or deny this.
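One cheap way to test this theory without knowing anything about sssd: "ls -n" prints numeric UIDs/GIDs and skips the name lookups entirely, so comparing it against "ls -l" separates the NSS/SSSD cost from the filesystem cost. A sketch (the /tmp fallback is only there so the commands run anywhere):

```shell
# If "ls -ln" is fast while "ls -l" is slow, the time is going to
# NSS/SSSD name lookups, not to XFS stat() calls.
DIR=/var/spool/mail
[ -d "$DIR" ] || DIR=/tmp   # fallback so the sketch runs anywhere
time ls -l  "$DIR" > /dev/null
time ls -ln "$DIR" > /dev/null
```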
Your "df" performance problem may be caused by something completely
different.
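A related check: time a single lookup through NSS directly, since getent goes through the same sssd pipe that appears in the strace below. A sketch ("root" is just a user guaranteed to exist):

```shell
# One user lookup through NSS (the sssd nss socket, when sssd is active).
time getent passwd root > /dev/null
# For a cold-cache timing, flush SSSD's caches first (needs root):
#   sss_cache -E
```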
n
On 11/12/2013 01:15 PM, Qing Chang wrote:
> Hi Stan,
>
> thank you very much for your comments; I need some time to
> work through all of your suggestions.
>
> Yes, this is a dovecot IMAP and sendmail SMTP server, and there
> are 1360 mailboxes.
>
> With this email, I'd like to give you a bit more information from
> strace -T and other commands.
>
> Because "# CONFIG_LOCK_STAT is not set", I cannot run "perf
> lock record", and since this is a production server, I am reluctant to
> rebuild the kernel on it. But I hope the info below shows that this is
> not lock related: slow "ls -l" and "du" also happen on the two file
> servers (64TB and 32TB), so it does not feel like a locking issue.
>
> I ran "(time strace -T ls -l /var/spool/mail > ls-out) >& ls-trace"
> and got quite a bit of data.
>
> First "time" output:
> =====
> real 0m17.720s
> user 0m0.574s
> sys 0m1.271s
> =====
>
> Some 14s was spent on 164 "poll" calls on fd=5 that each took more
> than 10ms; the total number of "poll" calls is 6618!
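That tally can be reproduced mechanically from the trace file with a short awk filter over the &lt;seconds&gt; suffix that strace -T appends to each line; a sketch (it assumes the unwrapped ls-trace file from the command above, read on stdin):

```shell
# Tally poll() calls slower than 10 ms in an "strace -T" log read on
# stdin; strace appends the elapsed time as <seconds> to each line.
slow_polls() {
  awk -F'[<>]' '/^poll/ && $2 + 0 > 0.01 { n++; t += $2 }
                END { printf "%d slow polls, %.2fs total\n", n + 0, t + 0 }'
}
# usage: slow_polls < ls-trace
```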
>
> Below is the trace output. The red numbers are the three longest times
> taken; listed at the top are the calls that took more than 100ms, and
> the three sections that follow cover the top three times. Note that the
> three mailboxes involved are not special compared to other mailboxes; a
> run at another time would produce slow calls on different mailboxes.
> Also, caching seems to play a role: after a fresh run that takes a long
> time, subsequent runs give instant results:
> =====
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.144634>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.104580>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.104812>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.181959>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.102520>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.107539>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.142462>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.215745>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.309854>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.166665>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.101035>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.113119>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.120745>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.110027>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.137906>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.249039>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.104079>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.250003>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.103791>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.102018>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.103551>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.103696>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.102164>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.105843>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.104413>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.101304>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.101869>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.101366>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.132958>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.309411>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.100342>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.101279>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.144725>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.109867>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.109951>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.290189>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.102449>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.101182>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.111699>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.107351>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.104558>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.115447>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.163904>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.266427>
>
> ##### fd=5 opened:
> socket(PF_FILE, SOCK_STREAM, 0) = 5 <0.000016>
> fcntl(5, F_GETFL) = 0x2 (flags O_RDWR) <0.000010>
> fcntl(5, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000010>
> fcntl(5, F_GETFD) = 0 <0.000010>
> fcntl(5, F_SETFD, FD_CLOEXEC) = 0 <0.000010>
> connect(5, {sa_family=AF_FILE, path="/var/lib/sss/pipes/nss"}, 110) =
> 0 <0.001886>
> fstat(5, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0 <0.000010>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000012>
> sendto(5, "\24\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL, NULL,
> 0) = 16 <0.000029>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000012>
> sendto(5, "\1\0\0\0", 4, MSG_NOSIGNAL, NULL, 0) = 4 <0.000027>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.000011>
> read(5, "\24\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0", 16) = 16 <0.000013>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.000011>
> read(5, "\1\0\0\0", 4) = 4 <0.000025>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000011>
> sendto(5, "\24\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL,
> NULL, 0) = 16 <0.000026>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000011>
> sendto(5, "\225N\0\0", 4, MSG_NOSIGNAL, NULL, 0) = 4 <0.000025>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.000515>
> read(5, "L\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16) = 16 <0.000012>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.000011>
> read(5, "\1\0\0\0\0\0\0\0\225N\0\0\311\0\0\0qchang\0*\0Qing Ch"...,
> 60) = 60 <0.000026>
> open("/etc/group", O_RDONLY|O_CLOEXEC) = 6 <0.000017>
> fstat(6, {st_mode=S_IFREG|0644, st_size=790, ...}) = 0 <0.000010>
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
> 0) = 0x7fd5d53f4000 <0.004539>
> read(6, "root:x:0:root\nbin:x:1:root,bin,d"..., 4096) = 790 <0.000018>
> read(6, "", 4096) = 0 <0.000011>
> close(6) = 0 <0.000011>
> munmap(0x7fd5d53f4000, 4096) = 0 <0.000016>
> open("/var/lib/sss/mc/group", O_RDONLY|O_CLOEXEC) = 6 <0.000021>
> fstat(6, {st_mode=S_IFREG|0644, st_size=5206312, ...}) = 0 <0.000010>
> mmap(NULL, 5206312, PROT_READ, MAP_SHARED, 6, 0) = 0x7fd5ce3c1000
> <0.000015>
> poll([{fd=5, events=POLLIN|POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000012>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000011>
> sendto(5, "\24\0\0\0\"\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL, NULL,
> 0) = 16 <0.000031>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000011>
> sendto(5, "\311\0\0\0", 4, MSG_NOSIGNAL, NULL, 0) = 4 <0.000025>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.000427>
> read(5, "?\0\0\0\"\0\0\0\0\0\0\0\0\0\0\0", 16) = 16 <0.000013>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.000011>
> read(5, "\1\0\0\0\0\0\0\0\311\0\0\0\3\0\0\0\0*\0qiu"..., 47) = 47
> <0.000017>
> lstat("/var/spool/mail/root", {st_mode=S_IFREG|0600,
> st_size=187649403, ...}) = 0 <0.000019>
> lgetxattr("/var/spool/mail/root", "security.selinux",
> "unconfined_u:object_r:file_t:s0", 255) = 32 <0.000020>
> lstat("/var/spool/mail/root", {st_mode=S_IFREG|0600,
> st_size=187649403, ...}) = 0 <0.000014>
> lgetxattr("/var/spool/mail/root", "system.posix_acl_access", 0x0, 0) =
> -1 ENODATA (No data available) <0.000018>
> lgetxattr("/var/spool/mail/root", "system.posix_acl_default", 0x0, 0)
> = -1 ENODATA (No data available) <0.000016>
> lstat("/var/spool/mail/agutstei", {st_mode=S_IFREG|0600,
> st_size=3864251, ...}) = 0 <0.000016>
> lgetxattr("/var/spool/mail/agutstei", "security.selinux",
> "unconfined_u:object_r:file_t:s0", 255) = 32 <0.000018>
> lstat("/var/spool/mail/agutstei", {st_mode=S_IFREG|0600,
> st_size=3864251, ...}) = 0 <0.000015>
> lgetxattr("/var/spool/mail/agutstei", "system.posix_acl_access", 0x0,
> 0) = -1 ENODATA (No data available) <0.000016>
> lgetxattr("/var/spool/mail/agutstei", "system.posix_acl_default", 0x0,
> 0) = -1 ENODATA (No data available) <0.000015>
> open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 7 <0.000017>
> fstat(7, {st_mode=S_IFREG|0644, st_size=1835, ...}) = 0 <0.000011>
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
> 0) = 0x7fd5ce3c0000 <0.000014>
> read(7, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 1835 <0.000020>
> read(7, "", 4096) = 0 <0.000011>
> close(7) = 0 <0.000012>
>
> ###### took 0.309854s:
> munmap(0x7fd5ce3c0000, 4096) = 0 <0.000015>
> poll([{fd=5, events=POLLIN|POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000011>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000010>
> sendto(5, "\24\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL,
> NULL, 0) = 16 <0.000034>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000010>
> sendto(5, "\253\260\0\0", 4, MSG_NOSIGNAL, NULL, 0) = 4 <0.000335>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.309854>
> read(5, "N\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16) = 16 <0.000012>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.000010>
> read(5, "\1\0\0\0\0\0\0\0\253\260\0\0\253\1\0\0acqu\0*\0Chen Ge A"...,
> 62) = 62 <0.000016>
> lstat("/var/spool/mail/araouf", {st_mode=S_IFREG|0600,
> st_size=3943191, ...}) = 0 <0.000052>
> lgetxattr("/var/spool/mail/araouf", "security.selinux",
> "unconfined_u:object_r:file_t:s0", 255) = 32 <0.000021>
> lstat("/var/spool/mail/araouf", {st_mode=S_IFREG|0600,
> st_size=3943191, ...}) = 0 <0.000014>
> lgetxattr("/var/spool/mail/araouf", "system.posix_acl_access", 0x0, 0)
> = -1 ENODATA (No data available) <0.000017>
> lgetxattr("/var/spool/mail/araouf", "system.posix_acl_default", 0x0,
> 0) = -1 ENODATA (No data available) <0.000015>
> open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 7 <0.000016>
> fstat(7, {st_mode=S_IFREG|0644, st_size=1835, ...}) = 0 <0.000010>
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
> 0) = 0x7fd5ce3c0000 <0.000012>
> read(7, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 1835 <0.000018>
> read(7, "", 4096) = 0 <0.000010>
> close(7) = 0 <0.000012>
>
> ###### took 0.309411s:
> munmap(0x7fd5ce3c0000, 4096) = 0 <0.000015>
> poll([{fd=5, events=POLLIN|POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000010>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000009>
> sendto(5, "\24\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL,
> NULL, 0) = 16 <0.000035>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000010>
> sendto(5, "\357\246\0\0", 4, MSG_NOSIGNAL, NULL, 0) = 4 <0.000342>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.309411>
> read(5, "V\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16) = 16 <0.000012>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.000010>
> read(5, "\1\0\0\0\0\0\0\0\357\246\0\0\253\1\0\0pthaya\0*\0Parthip"...,
> 70) = 70 <0.000018>
> lstat("/var/spool/mail/ptsao", {st_mode=S_IFREG|0600, st_size=515358,
> ...}) = 0 <0.000025>
> lgetxattr("/var/spool/mail/ptsao", "security.selinux",
> "unconfined_u:object_r:file_t:s0", 255) = 32 <0.000021>
> lstat("/var/spool/mail/ptsao", {st_mode=S_IFREG|0600, st_size=515358,
> ...}) = 0 <0.000014>
> lgetxattr("/var/spool/mail/ptsao", "system.posix_acl_access", 0x0, 0)
> = -1 ENODATA (No data available) <0.000019>
> lgetxattr("/var/spool/mail/ptsao", "system.posix_acl_default", 0x0, 0)
> = -1 ENODATA (No data available) <0.000014>
> open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 7 <0.000017>
> fstat(7, {st_mode=S_IFREG|0644, st_size=1835, ...}) = 0 <0.000010>
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
> 0) = 0x7fd5ce3c0000 <0.000012>
> read(7, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 1835 <0.000019>
> read(7, "", 4096) = 0 <0.000009>
> close(7) = 0 <0.000011>
>
> ###### took 0.290189s:
> munmap(0x7fd5ce3c0000, 4096) = 0 <0.000015>
> poll([{fd=5, events=POLLIN|POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000011>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000010>
> sendto(5, "\24\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL,
> NULL, 0) = 16 <0.000034>
> poll([{fd=5, events=POLLOUT}], 1, 300000) = 1 ([{fd=5,
> revents=POLLOUT}]) <0.000010>
> sendto(5, "\10\270\0\0", 4, MSG_NOSIGNAL, NULL, 0) = 4 <0.000391>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.290189>
> read(5, "N\0\0\0\22\0\0\0\0\0\0\0\0\0\0\0", 16) = 16 <0.000012>
> poll([{fd=5, events=POLLIN}], 1, 300000) = 1 ([{fd=5,
> revents=POLLIN}]) <0.000010>
> read(5, "\1\0\0\0\0\0\0\0\10\270\0\0\327\1\0\0taberk\0*\0Thomas "...,
> 62) = 62 <0.000017>
> lstat("/var/spool/mail/taha", {st_mode=S_IFREG|0600,
> st_size=293648143, ...}) = 0 <0.000018>
> lgetxattr("/var/spool/mail/taha", "security.selinux",
> "unconfined_u:object_r:file_t:s0", 255) = 32 <0.000022>
> lstat("/var/spool/mail/taha", {st_mode=S_IFREG|0600,
> st_size=293648143, ...}) = 0 <0.000013>
> lgetxattr("/var/spool/mail/taha", "system.posix_acl_access", 0x0, 0) =
> -1 ENODATA (No data available) <0.000021>
> lgetxattr("/var/spool/mail/taha", "system.posix_acl_default", 0x0, 0)
> = -1 ENODATA (No data available) <0.000014>
> open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 7 <0.000021>
> fstat(7, {st_mode=S_IFREG|0644, st_size=1835, ...}) = 0 <0.000010>
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
> 0) = 0x7fd5ce3c0000 <0.000015>
> read(7, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 1835 <0.000033>
> read(7, "", 4096) = 0 <0.000011>
> close(7) = 0 <0.000013>
> =====
>
> ====== iostat around the time ls -l was run =====
> [root@dovecot2 ~]# iostat -xmz 5
> Linux 2.6.32-358.18.1.el6.x86_64 (dovecot2.sri.utoronto.ca)
> 11/12/2013 _x86_64_ (16 CPU)
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 0.56 0.00 0.17 0.06 0.00 99.21
>
> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
> avgrq-sz avgqu-sz await svctm %util
> sda 0.01 19.88 0.14 26.86 0.00 0.18
> 14.18 0.02 0.57 0.28 0.75
> sdb 0.23 0.08 1.75 1.98 0.04 0.02
> 30.38 0.01 2.81 1.26 0.47
> sdc 0.00 0.03 13.78 0.71 1.34 0.05
> 196.35 0.02 1.60 0.77 1.11
> dm-0 0.00 0.00 0.09 2.42 0.00 0.01
> 8.49 0.00 1.18 0.52 0.13
> dm-1 0.00 0.00 0.00 0.00 0.00 0.00
> 8.00 0.00 3.09 0.27 0.00
> dm-2 0.00 0.00 0.06 44.33 0.00 0.17
> 8.14 0.04 0.90 0.14 0.63
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 1.31 0.00 0.46 0.10 0.00 98.12
>
> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
> avgrq-sz avgqu-sz await svctm %util
> sda 0.00 27.40 1.00 65.80 0.02 0.36
> 11.78 0.04 0.61 0.43 2.86
> sdb 0.00 0.00 1.80 0.00 0.03 0.00
> 31.11 0.01 6.44 2.00 0.36
> sdc 0.00 1.00 0.00 4.80 0.00 0.03
> 11.79 0.03 6.88 0.46 0.22
> dm-0 0.00 0.00 0.80 1.20 0.02 0.00
> 24.00 0.01 6.10 6.10 1.22
> dm-2 0.00 0.00 0.00 92.00 0.00 0.36
> 8.00 0.06 0.62 0.19 1.76
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 5.76 0.00 1.62 0.59 0.00 92.03
>
> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
> avgrq-sz avgqu-sz await svctm %util
> sda 0.00 1045.60 0.80 959.40 0.05 7.83
> 16.82 0.44 0.46 0.21 19.98
> sdb 0.00 0.00 1.40 1.20 0.01 0.01
> 9.23 0.00 0.62 0.54 0.14
> sdc 0.00 0.00 0.00 0.60 0.00 0.01
> 18.67 0.00 0.33 0.33 0.02
> dm-0 0.00 0.00 0.80 8.00 0.05 0.03
> 20.00 0.02 2.05 1.41 1.24
> dm-2 0.00 0.00 0.00 1997.00 0.00 7.80
> 8.00 1.24 0.62 0.09 18.86
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 4.38 0.00 1.29 0.67 0.00 93.66
>
> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
> avgrq-sz avgqu-sz await svctm %util
> sda 0.00 1204.40 0.00 1074.60 0.00 8.90
> 16.97 0.41 0.38 0.20 21.04
> sdb 0.20 0.00 2.00 0.00 0.01 0.00
> 13.60 0.01 6.80 4.00 0.80
> dm-0 0.00 0.00 0.00 9.00 0.00 0.04
> 8.00 0.01 1.27 0.42 0.38
> dm-2 0.00 0.00 0.00 2270.00 0.00 8.87
> 8.00 1.33 0.59 0.09 20.74
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 4.24 0.00 1.16 0.58 0.00 94.02
>
> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
> avgrq-sz avgqu-sz await svctm %util
> sda 0.00 1159.20 0.00 985.80 0.00 8.38
> 17.41 0.34 0.35 0.18 17.84
> sdb 0.00 0.00 0.20 1.40 0.00 0.01
> 8.62 0.00 0.75 0.75 0.12
> sdc 0.00 0.00 0.00 0.20 0.00 0.00
> 16.00 0.00 1.00 1.00 0.02
> dm-0 0.00 0.00 0.00 5.00 0.00 0.02
> 8.00 0.00 0.24 0.24 0.12
> dm-2 0.00 0.00 0.00 2140.00 0.00 8.36
> 8.00 1.23 0.58 0.08 17.90
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 2.77 0.00 1.14 0.54 0.00 95.55
>
> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
> avgrq-sz avgqu-sz await svctm %util
> sda 0.00 768.20 0.00 847.40 0.00 6.31
> 15.25 0.36 0.43 0.19 16.40
> sdb 0.00 0.00 0.00 2.00 0.00 0.01
> 7.00 0.00 0.10 0.10 0.02
> sdc 0.00 0.00 0.00 2.00 0.00 0.04
> 37.60 0.00 0.70 0.40 0.08
> dm-0 0.00 0.00 0.00 18.80 0.00 0.07
> 8.00 0.09 4.84 0.27 0.50
> dm-2 0.00 0.00 0.00 1596.80 0.00 6.24
> 8.00 0.84 0.52 0.10 16.22
> =====
>
> ===== vmstat =====
> [root@dovecot2 ~]# vmstat 5
> procs -----------memory---------- ---swap-- -----io---- --system--
> -----cpu-----
> r b swpd free buff cache si so bi bo in cs us
> sy id wa st
> 2 0 3476 785664 223836 29329692 0 0 88 16 1 1 1 0
> 99 0 0
> 1 0 3476 783512 223844 29330448 0 0 84 3383 2210 3985
> 3 1 96 0 0
> 0 1 3476 779328 223848 29331208 0 0 20 9254 4530 9894
> 5 2 92 1 0
> 2 0 3476 778616 223856 29331956 0 0 1 9104 4160 10909
> 4 1 94 1 0
> 1 0 3476 777584 223864 29332932 0 0 0 10056 4428 10263
> 4 1 94 1 0
> 0 0 3476 778312 223872 29333044 0 0 2 660 921 1871
> 1 0 99 0 0
> =====
>
> I really appreciate your willingness to help and the time you have
> spent on this!
>
> Qing
>
> On 04/11/2013 1:42 AM, Stan Hoeppner wrote:
>> On 11/1/2013 2:11 PM, Qing Chang wrote:
>>> I have collected some info following the FAQ guide. It may still
>>> not be enough. Please let me know.
>> You say /var/spool/mail contains 1360 "items". That's ambiguous. You
>> mean 1360 user mbox files, yes?
>>
>>> # time to run ls -l (1360 items):
>>> time ls -l /var/spool/mail
>>> real 0m10.271s
>>> user 0m0.097s
>>> sys 0m0.162s
>> The combined CPU execution time is apparently 0.259s. Something else is
>> adding 10s+ of delay. mbox is notorious for locking problems. I wonder
>> if stat() is waiting on release of exclusive write locks being held by
>> your MDA during delivery. This would also tend to explain the
>> variability you demonstrated, with one run being 10s and another being
>> 7s--fewer files locked on the second run.
>>
>> Run 'ls' with "strace -T" to find out which system call is adding the
>> delay. Run perf-lock concurrently to see the locks.
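Possible invocations for the two suggestions, wrapped as small functions (a sketch with hypothetical names; "perf lock" needs a kernel built with CONFIG_LOCK_STAT=y, which Qing's reply notes is not set on this server):

```shell
# Per-syscall timings for one directory listing (writes ls.trace).
trace_ls() {
  strace -T -o ls.trace ls -l "$1" > /dev/null
}
# Lock-contention profile for the same listing.
lock_profile_ls() {
  perf lock record -- ls -l "$1" > /dev/null && perf lock report
}
# usage: trace_ls /var/spool/mail; lock_profile_ls /var/spool/mail
```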
>>
>> More comments below.
>>
>>
>>> # Oct 29, 2013
>>> #
>>> # XFS slowness with ls du etc.
>>>
>>> kernel version (uname -a)
>>> Linux dovecot2.sri.utoronto.ca 2.6.32-358.18.1.el6.x86_64 #1 SMP Fri Aug 2 17:04:38 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> xfsprogs version (xfs_repair -V)
>>> xfs_repair version 3.1.1
>>>
>>> number of CPUs: 16 virtual CPUs
>>>
>>> contents of /proc/meminfo
>>> MemTotal: 32878632 kB
>>> MemFree: 310260 kB
>>> Buffers: 220496 kB
>>> Cached: 29971428 kB
>>> SwapCached: 212 kB
>>> Active: 9726912 kB
>>> Inactive: 21180124 kB
>>> Active(anon): 385040 kB
>>> Inactive(anon): 330260 kB
>>> Active(file): 9341872 kB
>>> Inactive(file): 20849864 kB
>>> Unevictable: 0 kB
>>> Mlocked: 0 kB
>>> SwapTotal: 6160376 kB
>>> SwapFree: 6159896 kB
>>> Dirty: 772 kB
>>> Writeback: 0 kB
>>> AnonPages: 714912 kB
>>> Mapped: 52696 kB
>>> Shmem: 184 kB
>>> Slab: 1045988 kB
>>> SReclaimable: 776440 kB
>>> SUnreclaim: 269548 kB
>>> KernelStack: 8736 kB
>>> PageTables: 57172 kB
>>> NFS_Unstable: 0 kB
>>> Bounce: 0 kB
>>> WritebackTmp: 0 kB
>>> CommitLimit: 22599692 kB
>>> Committed_AS: 1148180 kB
>>> VmallocTotal: 34359738367 kB
>>> VmallocUsed: 335224 kB
>>> VmallocChunk: 34333710552 kB
>>> HardwareCorrupted: 0 kB
>>> AnonHugePages: 180224 kB
>>> HugePages_Total: 0
>>> HugePages_Free: 0
>>> HugePages_Rsvd: 0
>>> HugePages_Surp: 0
>>> Hugepagesize: 2048 kB
>>> DirectMap4k: 10240 kB
>>> DirectMap2M: 33544192 kB
>>>
>>> contents of /proc/mounts
>>> [root@dovecot2 ~]# cat /proc/mounts
>>> rootfs / rootfs rw 0 0
>>> proc /proc proc rw,relatime 0 0
>>> sysfs /sys sysfs rw,seclabel,relatime 0 0
>>> devtmpfs /dev devtmpfs rw,seclabel,relatime,size=16429456k,nr_inodes=4107364,mode=755 0 0
>>> devpts /dev/pts devpts rw,seclabel,relatime,gid=5,mode=620,ptmxmode=000 0 0
>>> tmpfs /dev/shm tmpfs rw,seclabel,relatime 0 0
>>> /dev/mapper/vg_ipa1-lv_root / ext4 rw,seclabel,relatime,barrier=1,data=ordered 0 0
>>> none /selinux selinuxfs rw,relatime 0 0
>>> devtmpfs /dev devtmpfs rw,seclabel,relatime,size=16429456k,nr_inodes=4107364,mode=755 0 0
>>> /proc/bus/usb /proc/bus/usb usbfs rw,relatime 0 0
>>> /dev/sda1 /boot ext4 rw,seclabel,relatime,barrier=1,data=ordered 0 0
>>> /dev/mapper/vg_ipa1-lv_home /var ext4 rw,seclabel,relatime,barrier=1,data=ordered 0 0
>>> /dev/sdb1 /mailhomes xfs rw,seclabel,relatime,attr2,delaylog,nobarrier,inode64,usrquota 0 0
>>> /dev/sdc1 /var/spool/mail xfs rw,seclabel,relatime,attr2,delaylog,nobarrier,inode64,usrquota 0 0
>>> none /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
>>> sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
>>>
>>> contents of /proc/partitions
>>> major minor #blocks name
>>>
>>> 8 0 83886080 sda
>>> 8 1 512000 sda1
>>> 8 2 83373056 sda2
>>> 8 16 5368709120 sdb
>>> 8 17 5368707072 sdb1
>>> 8 32 2147483648 sdc
>>> 8 33 2147481600 sdc1
>>> 253 0 56729600 dm-0
>>> 253 1 6160384 dm-1
>>> 253 2 20480000 dm-2
>>>
>>> RAID layout (hardware and/or software): Hardware RAID6 IBM Storwize V7000
>> What we're looking for here is the RAID geometry of the local disk
>> device, logical volume, SAN LUN, etc, underlying the XFS filesystem,
>> i.e. number of disks in the array, RAID level, stripe unit and stripe width.
>>
>>> LVM configuration: not used for the XFS filesystems (raw partitions)
>>>
>>> type of disks you are using: SAS drives
>>>
>>> write cache status of drives: V7000 has 16 GB of cache
>>>
>>> size of BBWC and mode it is running in: N/A
>> Hopefully the 16GB of BBWC in the V7000 defaults to write-back mode,
>> or has been manually enabled. If not, you're playing Russian roulette,
>> because you're mounting with nobarrier. You should only disable
>> barriers if you have persistent write cache, either battery or flash
>> based.
>>
>>> xfs_info output on the filesystem in question
>>> meta-data=/dev/sdc1 isize=256 agcount=4, agsize=134217600 blks
>>> = sectsz=512 attr=2
>>> data = bsize=4096 blocks=536870400, imaxpct=5
>>> = sunit=0 swidth=0 blks
>>> naming =version 2 bsize=4096 ascii-ci=0
>>> log =internal bsize=4096 blocks=262143, version=2
>>> = sectsz=512 sunit=0 blks, lazy-count=1
>>> realtime =none extsz=4096 blocks=0, rtextents=0
>> 2TB seems rather large for a 1360 user mail spool directory, given new
>> mail is apparently moved from INBOX to other folders in /mailhomes.
>>
>> Additionally, given the innate parallelism in a mail workload, 4 AGs is
>> far less than optimal. XFS throughput with a highly parallel workload
>> is limited by the number of allocation groups. This filesystem was
>> apparently created with mkfs.xfs defaults. It should have been created
>> with many more AGs, 16 minimum. Same goes for the /mailhomes filesystem.
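For reference, the AG count is fixed at mkfs time, so this can only be changed by recreating the filesystems. A destructive sketch for a future rebuild (device names taken from the report; 16 per the minimum suggested above):

```shell
# DESTRUCTIVE: wipes the filesystem. For a planned rebuild only.
mkfs.xfs -d agcount=16 /dev/sdc1    # /var/spool/mail
mkfs.xfs -d agcount=16 /dev/sdb1    # /mailhomes
```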
>>
>>> dmesg output showing all error messages and stack traces: N/A
>>>
>>> # time to run ls -l (1360 items):
>>> time ls -l /var/spool/mail
>>> real 0m10.271s
>>> user 0m0.097s
>>> sys 0m0.162s
>>>
>>> # iostat when ls -l is running:
>>> [root@dovecot2 ~]# date;iostat -xdmz 5
>>> Tue Oct 29 14:44:28 EDT 2013
>>> Linux 2.6.32-358.18.1.el6.x86_64 (dovecot2.sri.utoronto.ca) 10/29/2013 _x86_64_ (16 CPU)
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
>>> sda 0.02 17.87 0.17 24.49 0.01 0.17 14.16 0.01 0.56 0.28 0.68
>>> sdb 0.19 0.08 1.62 1.94 0.04 0.02 33.47 0.01 2.76 1.23 0.44
>>> sdc 0.00 0.03 11.57 0.69 1.10 0.05 192.23 0.02 1.65 0.78 0.96
>>> dm-0 0.00 0.00 0.11 2.24 0.00 0.01 8.70 0.00 1.14 0.51 0.12
>>> dm-1 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 1.26 0.24 0.00
>>> dm-2 0.00 0.00 0.08 40.11 0.00 0.16 8.18 0.03 0.86 0.14 0.57
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
>>> sda 0.00 155.00 0.00 127.20 0.00 1.10 17.75 0.06 0.45 0.25 3.22
>>> sdb 0.00 0.20 0.00 11.60 0.00 0.07 11.81 0.05 4.07 0.17 0.20
>>> sdc 0.00 0.00 0.00 0.40 0.00 0.00 12.00 0.00 0.50 0.50 0.02
>>> dm-0 0.00 0.00 0.00 1.20 0.00 0.00 8.00 0.01 4.50 4.50 0.54
>>> dm-2 0.00 0.00 0.00 281.40 0.00 1.10 8.00 0.19 0.68 0.10 2.74
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
>>> sda 0.00 1480.80 0.00 627.20 0.00 8.23 26.89 0.28 0.45 0.20 12.70
>>> sdb 0.00 0.00 0.40 4.00 0.00 0.03 18.27 0.00 1.05 1.05 0.46
>>> sdc 0.00 0.00 0.00 1.40 0.00 0.00 6.71 0.00 0.14 0.14 0.02
>>> dm-0 0.00 0.00 0.00 17.20 0.00 0.07 8.00 0.02 1.09 0.09 0.16
>>> dm-2 0.00 0.00 0.00 2090.20 0.00 8.16 8.00 1.65 0.79 0.06 12.52
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
>>> sda 0.00 1331.00 0.00 629.80 0.00 7.66 24.91 0.28 0.44 0.22 13.64
>>> sdb 0.00 0.00 0.00 2.80 0.00 0.02 14.07 0.00 0.79 0.57 0.16
>>> sdc 0.00 0.00 0.00 1.40 0.00 0.02 30.14 0.00 0.57 0.57 0.08
>>> dm-0 0.00 0.00 0.00 1.20 0.00 0.00 8.00 0.00 3.83 3.83 0.46
>>> dm-2 0.00 0.00 0.00 1959.60 0.00 7.65 8.00 1.43 0.73 0.07 13.20
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
>>> sda 0.00 6.60 0.00 0.60 0.00 0.03 96.00 0.00 1.33 1.33 0.08
>>> sdb 0.00 0.20 0.00 0.80 0.00 0.00 8.25 0.00 0.75 0.75 0.06
>>> sdc 0.00 0.20 0.00 0.20 0.00 0.00 16.00 0.00 0.00 0.00 0.00
>>> dm-0 0.00 0.00 0.00 6.20 0.00 0.02 8.00 0.01 0.97 0.03 0.02
>>> dm-2 0.00 0.00 0.00 1.00 0.00 0.00 8.00 0.00 3.00 0.60 0.06
>>>
>>> # a second run of ls -l:
>>> [root@dovecot2 ~]# date;time ls -l /var/spool/mail
>>> Fri Nov 1 15:04:43 EDT 2013
>>> real 0m7.814s
>>> user 0m0.096s
>>> sys 0m0.139s
>>>
>>> [root@dovecot2 ~]# date;iostat -xmdz 5
>>> Fri Nov 1 15:04:41 EDT 2013
>>> Linux 2.6.32-358.18.1.el6.x86_64 (dovecot2.sri.utoronto.ca) 11/01/2013 _x86_64_ (16 CPU)
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
>>> sda 0.01 21.20 0.16 28.28 0.00 0.19 14.27 0.02 0.57 0.27 0.78
>>> sdb 0.22 0.09 1.70 2.19 0.04 0.02 31.03 0.01 2.83 1.20 0.47
>>> sdc 0.00 0.04 12.63 0.75 1.21 0.05 192.59 0.02 1.60 0.77 1.04
>>> dm-0 0.00 0.00 0.11 2.46 0.00 0.01 8.57 0.00 1.17 0.51 0.13
>>> dm-1 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 2.03 0.21 0.00
>>> dm-2 0.00 0.00 0.07 47.02 0.00 0.18 8.15 0.04 0.93 0.14 0.65
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
>>> sda 0.00 730.20 0.00 1251.60 0.00 7.76 12.69 0.52 0.42 0.21 25.86
>>> sdb 0.00 1.00 0.80 25.40 0.01 0.16 12.62 0.01 0.37 0.18 0.46
>>> sdc 0.00 0.00 0.00 2.60 0.00 0.03 20.62 0.00 0.31 0.23 0.06
>>> dm-0 0.00 0.00 0.00 0.20 0.00 0.00 8.00 0.00 3.00 3.00 0.06
>>> dm-2 0.00 0.00 0.00 1981.00 0.00 7.74 8.00 1.04 0.52 0.13 25.96
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
>>> sda 0.00 1544.20 0.00 2844.60 0.00 17.14 12.34 1.10 0.39 0.21 58.34
>>> sdb 0.00 0.00 0.00 1.20 0.00 0.01 9.33 0.00 0.00 0.00 0.00
>>> sdc 0.00 0.00 0.00 0.60 0.00 0.00 8.00 0.00 3.00 1.00 0.06
>>> dm-0 0.00 0.00 0.00 3.00 0.00 0.01 8.00 0.00 0.07 0.07 0.02
>>> dm-2 0.00 0.00 0.00 4386.00 0.00 17.13 8.00 2.19 0.50 0.13 58.46
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
>>> sda 0.00 104.80 0.00 212.80 0.00 1.24 11.94 0.09 0.41 0.21 4.44
>>> sdb 0.00 0.00 0.20 3.00 0.00 0.01 9.81 0.00 0.50 0.50 0.16
>>> sdc 0.00 0.20 0.00 2.80 0.00 0.02 16.86 0.01 4.43 0.79 0.22
>>> dm-0 0.00 0.00 0.00 0.40 0.00 0.00 8.00 0.00 5.50 5.50 0.22
>>> dm-2 0.00 0.00 0.00 317.00 0.00 1.24 8.00 0.15 0.47 0.13 4.24
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
>>> sda 0.00 6.60 0.00 1.20 0.00 0.03 52.00 0.00 0.33 0.33 0.04
>>> sdb 0.00 0.20 0.00 1.00 0.00 0.00 9.60 0.00 0.00 0.00 0.00
>>> sdc 0.00 0.00 0.00 0.20 0.00 0.00 2.00 0.00 1.00 1.00 0.02
>>> dm-0 0.00 0.00 0.00 4.40 0.00 0.02 8.00 0.00 0.95 0.05 0.02
>>> dm-2 0.00 0.00 0.00 3.40 0.00 0.01 8.00 0.00 0.06 0.06 0.02
>>>
>>> [root@dovecot2 ~]# date;vmstat 5
>>> Fri Nov 1 15:04:39 EDT 2013
>>> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
>>> r b swpd free buff cache si so bi bo in cs us sy id wa st
>>> 0 0 1304 542644 229484 29885592 0 0 80 17 1 1 1 0 99 0 0
>>> 0 1 1304 542752 229484 29885696 0 0 0 2211 1518 1281 1 0 98 0 0
>>> 0 1 1304 544736 229484 29885696 0 0 6 17216 7535 7631 3 2 94 2 0
>>> 0 0 1304 543988 229484 29885696 0 0 0 7403 3782 3889 1 1 97 1 0
>>> 0 0 1304 543604 229484 29885712 0 0 1 215 399 277 0 0 100 0 0
>
>
>
> _______________________________________________
> xfs mailing list
> xfs at oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs