To: performancecopilot/pcp <pcp@xxxxxxxxxxxxxxxxxx>
Subject: [performancecopilot/pcp] pmwebd impossibly slow when using grafana with 300 archives (#117)
From: Marko Kevac <notifications@xxxxxxxxxx>
Date: Thu, 29 Sep 2016 07:29:27 -0700

Hello.

We are collecting metrics from ~300 servers at 1-second granularity using pmlogger. That part works fine: the data is written to the appropriate PCP archives on disk.

However, browsing these archives visually with pmwebd and grafana is impossible.

After clicking on the host list in grafana (screenshot on Imgur), nothing happens.

I can see in the pmwebd log that it received a request for /graphite/render:

[Thu Sep 29 14:06:55] pmwebd(22926): [192.168.3.129:33220] HTTP/1.1 GET /grafana/config.js
[Thu Sep 29 14:06:55] pmwebd(22926): [192.168.3.129:33220] HTTP/1.1 GET /grafana/app/dashboards/hostselect.js
[Thu Sep 29 14:06:55] pmwebd(22926): [192.168.3.129:33220] HTTP/1.1 GET /graphite/render

The pmwebd process is stuck in D state (uninterruptible sleep), reading files. I waited for 10 minutes, but nothing changed.
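For anyone who wants to check the same thing on their own machine, here is a minimal sketch of reading a process's state letter from /proc on Linux. The helper names `parse_state` and `process_state` are made up for this example and are not part of PCP; the only assumption about the system is the documented `/proc/<pid>/stat` layout, where the state letter follows the parenthesised command name.

```python
from pathlib import Path


def parse_state(stat_line: str) -> str:
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # Field 2 (the command name) is wrapped in parentheses and may itself
    # contain spaces or ')' characters, so split only after the LAST ')'.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def process_state(pid: int) -> str:
    """Read the current state of a live process (Linux only)."""
    return parse_state(Path(f"/proc/{pid}/stat").read_text())
```

Calling `process_state(22926)` on the affected host would be expected to return `D` while pmwebd is blocked on disk I/O, and something like `S` or `R` otherwise.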

strace shows something like this:

[pid 22926] lseek(60, 666161152, SEEK_SET) = 666161152
[pid 22926] read(60, "\3\0\0\f\0\0\0\0\310\326\375\262\3\0\0\f\0\0\0\0\316\7\242Z\3\0\0\f\0\0\0\0"..., 4096) = 4096
[pid 22926] lseek(60, 666165248, SEEK_SET) = 666165248
[pid 22926] lseek(60, 666165248, SEEK_SET) = 666165248
[pid 22926] lseek(60, 666136576, SEEK_SET) = 666136576
[pid 22926] read(60, "\3\0\0\f\0\0\0\0\0\0\0\0\3\0\0\f\0\0\0\0\0\0\0\0\3\0\0\f\0\0\0\0"..., 4096) = 4096
[pid 22926] read(60, "\0\0\0\20\0\0\v\265\0\0\0\21\0\0\v\270\0\0\0\22\0\0\v\273\0\0\0\23\0\0\v\276"..., 20480) = 20480
[pid 22926] read(60, "\3\0\0\f\0\0\0\0\310\326\375\262\3\0\0\f\0\0\0\0\316\7\242Z\3\0\0\f\0\0\0\0"..., 4096) = 4096
[pid 22926] lseek(60, 666136576, SEEK_SET) = 666136576
[pid 22926] read(60, "\3\0\0\f\0\0\0\0\0\0\0\0\3\0\0\f\0\0\0\0\0\0\0\0\3\0\0\f\0\0\0\0"..., 4096) = 4096
[pid 22926] lseek(60, 666140672, SEEK_SET) = 666140672
[pid 22926] lseek(60, 666140672, SEEK_SET) = 666140672
[pid 22926] lseek(60, 666116096, SEEK_SET) = 666116096
[pid 22926] read(60, "\0\0\0\0\22\240\357\267\3\0\0\f\0\0\0\0\0\23\222G\3\0\0\f\0\0\0\33\204\245\3714"..., 4096) = 4096
[pid 22926] read(60, "\0\0\0\1\0\0\0\1\377\377\377\377\0\0\r\324\17\0pD\0\0\0\1\0\0\0\1\377\377\377\377"..., 16384) = 16384
[pid 22926] read(60, "\3\0\0\f\0\0\0\0\0\0\0\0\3\0\0\f\0\0\0\0\0\0\0\0\3\0\0\f\0\0\0\0"..., 4096) = 4096
[pid 22926] lseek(60, 666116096, SEEK_SET) = 666116096
[pid 22926] read(60, "\0\0\0\0\22\240\357\267\3\0\0\f\0\0\0\0\0\23\222G\3\0\0\f\0\0\0\33\204\245\3714"..., 4096) = 4096
[pid 22926] lseek(60, 666120192, SEEK_SET) = 666120192
[pid 22926] lseek(60, 666120192, SEEK_SET) = 666120192
[pid 22926] lseek(60, 666091520, SEEK_SET) = 666091520
[pid 22926] read(60, "\0\0\0\0\0\0\0\0\3\0\0\f\0\0\0\0\0\0\0\0\3\0\0\f\0\0\0\0\0\0\0\0"..., 4096) = 4096
[pid 22926] read(60, "\0\0\v[\0\0\0\23\0\0\v^\0\0\0\24\0\0\va\0\0\0\25\0\0\vd\0\0\0\26"..., 20480) = 20480
[pid 22926] read(60, "\0\0\0\0\22\240\357\267\3\0\0\f\0\0\0\0\0\23\222G\3\0\0\f\0\0\0\33\204\245\3714"..., 4096) = 4096
[pid 22926] lseek(60, 666091520, SEEK_SET) = 666091520
[pid 22926] read(60, "\0\0\0\0\0\0\0\0\3\0\0\f\0\0\0\0\0\0\0\0\3\0\0\f\0\0\0\0\0\0\0\0"..., 4096) = 4096
[pid 22926] lseek(60, 666095616, SEEK_SET) = 666095616
[pid 22926] lseek(60, 666095616, SEEK_SET) = 666095616
[pid 22926] lseek(60, 666071040, SEEK_SET) = 666071040
[pid 22926] read(60, "\0\21\302T\3\0\0\f\0\0\0\0\0\7\233D\3\0\0\f\0\0\0\0\0\31O\210\3\0\0\f"..., 4096) = 4096
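A long trace like this is easier to reason about as totals. The sketch below is only an illustration, assuming strace lines shaped like the excerpt above (`lseek(fd, offset, SEEK_SET) = ...` and `read(fd, "...", size) = n`). It counts seeks and reads, sums the bytes returned, and flags seeks that repeat the previous seek target with no read in between. `TraceSummary` and `summarize` are hypothetical names, not anything shipped with PCP or strace.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional

# Patterns match the two syscall shapes seen in the excerpt.
LSEEK = re.compile(r"lseek\(\d+, (\d+), SEEK_SET\)\s*=\s*\d+")
READ = re.compile(r"read\(\d+, .*, \d+\)\s*=\s*(\d+)\s*$")


@dataclass
class TraceSummary:
    seeks: int = 0
    repeated_seeks: int = 0  # same target as the immediately preceding lseek
    reads: int = 0
    bytes_read: int = 0
    lowest_offset: Optional[int] = None
    highest_offset: Optional[int] = None


def summarize(lines: Iterable[str]) -> TraceSummary:
    s = TraceSummary()
    last_seek = None  # target of the previous lseek; cleared by any read
    for line in lines:
        m = LSEEK.search(line)
        if m:
            offset = int(m.group(1))
            s.seeks += 1
            if offset == last_seek:
                s.repeated_seeks += 1
            last_seek = offset
            if s.lowest_offset is None or offset < s.lowest_offset:
                s.lowest_offset = offset
            if s.highest_offset is None or offset > s.highest_offset:
                s.highest_offset = offset
            continue
        m = READ.search(line)
        if m:
            # Failed reads ("= -1 EAGAIN ...") do not match and are skipped.
            s.reads += 1
            s.bytes_read += int(m.group(1))
            last_seek = None
    return s
```

Feeding it the output of `strace -p <pid>` saved to a file (for example `summarize(open("pmwebd.strace"))`, where the file name is just a placeholder) gives a quick view of how much data is read per seek and how far the file offsets range.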

I will try to provide more info, but maybe you, the PCP developers, already know what is going on...

This is the I/O wait time on the server after requesting /graphite/render (graph on Imgur).

The server is barely working :-)
