= Milestones and upcoming dates
Chatted about the 20th anniversary of PCP, which has either arrived
or is just about to (Ken may follow up with specific dates once
he digs up the time capsule). Congratulations to Ken and Mark
(sorry Mark, I overlooked you before) for the early vision and
super-impressive endurance to still be here!
- releases [nathans]
Tracking well with the regular release schedule, business as
usual, no surprises to report.
- conferences / presentations [nathans]
Mark and Nathan attending PyCon 2014 in Brisbane .au (August)
- we'll attempt to run a PCP BOF if there's interest. Nathan
has a conference slot to talk about PCP Python APIs.
= Administration
- closed/open mailing list [nathans, fche, kenj, mgoodwin]
Lots of discussion and reminiscing about state of the mailing
list over time, where it was originally open and later made
subscriber-only as a result of spam. Everyone is generally
happy about the zero spam on-list, Ken mentioned low effort
required (but some) in keeping it that way. Frank suggested
an open-subscription-policy mail forwarder on sourceware.org,
which claims superior anti-spam measures, and he will investigate
feasibility. Ken points out that there is no way of knowing
from the outside if the list is open/subscriber-only. Nathan
recalls just the one or two posts-by-non-members in the last
several months, if that helps.
- github request [nathans]
Some general discussion about whether a mirror of the master
PCP git tree on github would be useful to anyone, after a
recent request. No one on the call would find it directly
useful to their workflows.
Frank noted systemtap has a mirror, but very few/no requests
for pulls have come through. Nathan to ponder further, most
likely we'll simply await more demand at this stage.
- code reviews [nathans]
General thanks to folks doing reviews after requests in the
previous conf call - esp. Frank who has really stepped up his
efforts and many issues are being found in review. Nathan is
very encouraging of more folks doing this - please! anyone! -
it's a great way to increase overall familiarity with the code
base and has a clear impact.
- release builds [nathans]
Discussion about getting more folks involved with the binary
builds we do for each release. Ken and Nathan primarily do
this, others would be welcome - particularly around platforms
of importance to you (like Solaris and Mac, perhaps?). Ken
and Nathan to huddle and attempt to improve regularity of the
IA64 builds used by SGI sysadmin folks.
= Quality Assurance
- archives [nathans]
Since the pcp-gui/pcp merge we've now got two locations for QA
archives (qa/src and qa/archives). Planning to move all logs
in src/ over to archives/, but this will touch a lot of tests.
Some discussion about timing - will wait a little while as we
have a future merge from Frank that's affected, and (later on)
Nathan notes another set of QA test churn is needed for systemd
unit script work (planning to coincide the two at this stage).
- testing kernel PMDAs [nathans]
Discussion around a freshly introduced (yesterday) mechanism to
aid testing the Linux kernel PMDAs (pmdalinux, pmdaproc, pmdaxfs,
pmdajbd2, possibly others). Involves the use of an environment
variable at PMDA startup that specifies an alternate filesystem
root location for any proc/sysfs/cgroupfs files that PMDAs need
to parse. So far only pmdaproc has been converted (tests qa/730
and qa/731 use this), but the main kernel PMDA will follow soon.
Ken notes this doesn't cover all metrics, so some care is needed
(e.g. ipc metrics, which use syscalls). Mark and Nathan chatted
a bit about updating the root on-the-fly while the test runs, to
exercise more dynamic aspects of PMDA behaviour (definitely this
is feasible, not yet attempted).
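
As a rough illustration of the alternate-root idea above (not the
actual PMDA code, which is C), here is a minimal Python sketch in
which every proc path goes through a helper that prepends a root
taken from the environment. The variable name EXAMPLE_STATS_ROOT
and the helper names are hypothetical, chosen for this example only.

```python
import os
import tempfile

# Hypothetical variable name, used here for illustration only.
ROOT_ENV = "EXAMPLE_STATS_ROOT"


def stats_path(path, environ=os.environ):
    """Return path relocated under the alternate root, if one is set."""
    root = environ.get(ROOT_ENV, "")
    return os.path.join(root, path.lstrip("/")) if root else path


def read_loadavg(environ=os.environ):
    """Parse the first three fields of proc/loadavg under the chosen root."""
    with open(stats_path("/proc/loadavg", environ)) as f:
        return [float(x) for x in f.read().split()[:3]]


def demo():
    """Build a canned proc tree in a temp dir and read from it."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "proc"))
        with open(os.path.join(root, "proc", "loadavg"), "w") as f:
            f.write("0.50 0.25 0.10 1/100 4242\n")
        return read_loadavg({ROOT_ENV: root})
```

A test can then point the variable at a directory of canned proc
files, as demo() does, and compare against known values. Metrics
gathered via syscalls rather than files are not redirected by this.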
= Work-In-Progress
- web interface(s) & threading status [fche, kenj]
Progressing well. Some threading issues remain affecting pmwebd
(via libpcp), but this is only optionally enabled at this stage.
Frank expecting to have last few release-related details sorted
soon (QA, docs, and such), planning to send code to the pcp list
shortly. Nathan asked about options for demoing at PyCon since
it's sure to be asked about - Frank outlined a couple of paths to doing
so. Discussion about Amer's request for deb packages to enable
wider/easier testing at his end. Frank has rpm packaging done,
not deb yet (it's non-obvious, we may need to enumerate all files
which is problematic in this case - further investigation needed).
Nathan seeks helpers on this, but will take a look if no one else
gets there before him.
- python interfaces [nathans, mgoodwin]
Progressing well. Much work done, more work to follow - aiming
in general to be able to have scripts that only have to focus on
reporting/printing details. Mark reported on his progress with
iostat, which may even make it for this release. Frank asked why
do a pcp-iostat - Mark gave a detailed explanation about Red Hat
customer support needs, sosreport, and how these folks interact
with Red Hat customers, outlining how it's useful with archive data
from customers in particular.
Some discussion around pmlogconf - collectl and iostat metrics
need to be in the default set (or at least reevaluated again at
some point soon to ensure the coverage vs cost tradeoff is at an
appropriate point still) for Red Hat customer support needs.
Side discussion in there too about whether scripts should be done
as standalone tools (/usr/bin) or as pcp(1) scriptlets. General
feeling is it's the author's choice; pros and cons of each approach
discussed: namespace collisions, cmdline option collisions versus
ease of invocation, and quirky option processing via pcp(1). A
side side discussion about QA of these scripts too (archives have
worked well for Nathan).
- memory blowout [brolley, nathans, kenj]
Problem is well understood at this stage. Although remedial effort
is not yet underway, we know how to get there and Dave is confident
it's all readily achievable.
- discovery [brolley, kenj, fche]
Dave gave an update on recent advances in discovery in the last
little while (both his current dev work and last few releases).
Looks like 3.9.7 will have the last set of updates for the active
probing, once Dave has sorted out the remaining issues from Frank's
most recent review.
Ken brought up a cloud usage requirement which he's observed at a
couple of deployments now. Frank gave an update on pmmgr's use of
discovery, and how planned not-too-distant-future work is aiming
to help with the problem of Avahi/mDNS solutions not being viable
in this kind of environment. Follow-up discussion between Ken and
Frank planned for the list.
- pmdapapi [lukas, wcohen, nathans, fche, kenj]
Lukas gave an overview of his work producing a PAPI PMDA, which
is on-list and seeking feedback. Discussion primarily focussed
on the enabling mechanism, and whether auto-enabling on fetch &
auto-disabling when last client exits for this class of data is
appropriate. Will provided additional insights into how PAPI
works, Ken reminisced about MIPS hardware event counters back in
the day, on IRIX :) -- sounds like two perfect code reviewers. ;)
The potential for accidental conflicts with other users of the
hardware remains a concern. Nathan pointed out unexpected places
where pminfo will be run - such as at PMDA install time - where
people would not expect to have to guard against conflicting
hardware counter usage. Will believes the policy to be "first
user wins" in terms of claiming access to the counters, we don't
think the second user will be parked and have to wait (bad for a
PMDA if so) or worse, interrupt the first. Investigation into
the actual behaviour when these things happen will be helpful to
assist setting implementation directions (none of these issues
affect the pmStore-based approach, but it's less user-friendly --
more configurable and controlled though - trade-offs as always).
Along the same lines, behaviour of a full PMDA fetch was queried;
some investigation is needed into what happens as a result of
running full fetches like "pmdumptext papi" (fetching all the
metrics at once, which the hardware is unlikely to support
sensibly). Frank points out PAPI might auto-multiplex counters
here, which would be neat if it does.
[All up, a difficult first PMDA project - good work, Lukas!]
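
To make the auto-enable/auto-disable policy under discussion a
little more concrete, a minimal Python model follows. It does not
use any PAPI or PMDA API; CounterGate and its method names are
hypothetical, and the "enable"/"disable" log entries merely stand
in for whatever would start and stop the hardware counters.

```python
class CounterGate:
    """Toy model: counters turn on at first fetch, off at last exit."""

    def __init__(self):
        self.clients = set()   # clients that have fetched and not exited
        self.enabled = False
        self.log = []          # record of enable/disable transitions

    def fetch(self, client):
        # Any fetch enables the counters if they are currently off.
        if not self.enabled:
            self.enabled = True
            self.log.append("enable")
        self.clients.add(client)

    def client_exit(self, client):
        # Disable only when the last interested client has gone.
        self.clients.discard(client)
        if self.enabled and not self.clients:
            self.enabled = False
            self.log.append("disable")


def demo():
    gate = CounterGate()
    gate.fetch("a")
    gate.fetch("b")
    gate.client_exit("a")   # "b" still connected, counters stay on
    gate.client_exit("b")   # last client gone, counters go off
    return gate.log
```

Note that in this model any fetch enables the counters, including
one from a tool that was only run incidentally, which is the
concern raised above about pminfo at PMDA install time.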
- gfs2 updates [pevans]
Paul gave an update on some planned GFS2 PMDA updates around hot
glock tracing. A GFS2-specific distributed monitoring tool using
PCP metrics and APIs is in the future-feature pipeline as well.
= Requests-For-Comments
- new net latency metrics, pmdapipe [wcohen, nathans, fche, kenj]
Lots of discussions ... scribe running out of steam ... mostly we
went through the list discussions here, I think? Frank points out
no short-term plan to work on pmdapipe, although in general there's
agreement the approach is sound and needed. Nathan points out it's
a difficult PMDA to implement, too (extensive event metric use), &
also thinks it's a very high value target. So if anyone out there
wants to take a crack, please do. (see RFC on-list)
Ken pointed out MMV is a good, efficient model for planned systemtap
based exporting and that it fits the PCP model well. Nathan agrees
and cautions about production lessons learned from MMV-instrumented
Java applications. Some metrics rely on changes to the exported
value at sample time, and this facility is effectively lost with
the current MMV PMDA (unless the instrumented application becomes
proactively involved in continually refreshing, even then it's still
not as neatly/efficiently done as a coordinated sampling model).
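
The sample-time point can be shown with a small Python model. This
is plain Python, not the MMV format or any PCP API; both class
names are hypothetical. One metric computes its value when fetched,
the other only returns whatever the application last wrote.

```python
class SampledMetric:
    """Coordinated sampling: value is computed when a fetch arrives."""

    def __init__(self, compute):
        self.compute = compute

    def fetch(self, now):
        return self.compute(now)


class ExportedMetric:
    """Export style: the collector only sees the last value written."""

    def __init__(self):
        self.value = None

    def refresh(self, value):
        self.value = value

    def fetch(self, now):
        # 'now' is ignored: nothing is recomputed at sample time.
        return self.value


def demo():
    start = 100.0

    def elapsed(now):
        return now - start          # e.g. seconds since some event

    sampled = SampledMetric(elapsed)
    exported = ExportedMetric()
    exported.refresh(elapsed(105.0))  # application last refreshed at t=105
    # Both metrics are fetched at t=130.
    return sampled.fetch(130.0), exported.fetch(130.0)
```

In this model the exported value stays at its last refresh until
the application writes again, so keeping it current means the
application has to refresh continually on its own schedule.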
- pmcd co-process [nathans, fche]
Nathan gave a high level overview of the proposed approach, and the
three problems it's aiming to tackle. Not a whole lot of time for
everyone to digest the RFC yet though, I think. Frank has though,
responding favourably.
Final quick poll of appropriate timing for these conf calls -
three-monthly appears to be working fine, we'll continue on this track.
Phew, done! Thanks for reading. :) The above is all from memory,
so apologies if I missed or misrepresented anything. Please post
corrections &| omissions here as follow-up mail - thanks.
cheers.
--
Nathan