Can I get some more info with regards to the issues with Parfait? What are the scalability/critical issues that were seen?
On Fri, Jun 24, 2011 at 2:00 PM, Nathan Scott <nathans@xxxxxxxxxx> wrote:
Hi all,
On Wednesday evening (Melbourne time) we had a meeting of minds
of some of the recent PCP core committers and people involved
with active, related projects. Below is a collation of Ken's
notes and what I can remember of the round-table discussions.
If I've overlooked anything, feel free to reply and mention it
- thanks!
Attendees: Mark Goodwin, Max Matveev, David Chatterton, Gokul
Krishnan, Ken McDonell, Paul Smith, Nathan Scott.
Apologies:
Jeff Sipek ("I'm a normal person, 3:30am is too early")
Mustafa Sezgin ("I'm a consultant, 5:30pm is too early")
Discussion kicked off with Paul wondering whether further steps
could be taken to make PCP easier to setup and use for the large
cluster community, e.g. people using deployments of Hadoop/HBase
with many hundreds of machines. Some discussion ensued around
detecting insidious failures in this realm, where a machine or
service continues to work but in a degraded mode.
In considering whether there is any scope to change the
"archives are associated with a single host" restriction, Paul
mentioned OpenTSDB http://opentsdb.net/ as something to consider
and compare with PCP for very large cluster monitoring.
It's unclear whether PCP could be doing more to help with this class
of problems, as the higher-level aggregation and business intelligence
services are not something PCP is going to tackle, although PCP could
be a producer of raw data feeding into such an infrastructure.
Nathan talked about how Aconex has used SQLServer BI/warehousing to
build performance data cubes for comparison across data centers and
had some success with this kind of layered-on-top-of-PCP approach.
The general point was that we should keep in mind what belongs in core
PCP and what should be layered above - making sure it's easy to extract
data from PCP for anyone building on top seems a noble aim, and likely
the best we can do (many people have different needs at this high
reporting level, and it's not clear a general PCP solution is feasible
in this space).
Also from Paul Smith, a pointer to NewRelic http://newrelic.com/ as an
emerging and interesting package (SaaS) for monitoring web applications
in the Java, Ruby, PHP, .NET, Python, Ruby on Rails, ... spaces.
Mark and Max then moved discussion to Authentication and Access Controls
extensions to PCP protocols. There seems to be general agreement about
using a private/public key scheme to enable anonymous authentication ...
some of the details remain sketchy, however. Mark gave the example
of Red Hat customer support, where the ideal for them would be to
give a public key to a customer, who could then open the pmcd port
to allow live monitoring access, and later revoke the key at their
own discretion.
There is much less agreement on access controls, and we need to invest
some more effort here in drafting a strawman proposal with sufficient
detail to draw out the likely implementation issues.
As a side discussion, there appeared to be some support for pulling the
proc metrics out of the linux PMDA and putting them in their own PMDA
(like IRIX), and then _not_ shipping with the proc PMDA installed by
default. This plugs one of the larger holes, but still leaves the
problem of generalized access control suitable for large-scale
production environments unanswered.
Nathan pointed out one downside is that the proc metric IDs will change
(the domain number having to change to that of the proc PMDA). As an
alternative, Nathan suggested requiring a pmstore before allowing
any fetch, which gives control via the pmcd store access control
mechanism. But there is no general client tool support for this
(yet?), and it is not clear whether this is something we should be
pushing. It would give us a backward-compatible solution in the
3.5.* timeframe, however.
[ Forgot at the time, but further complicating the proc.* extraction
is the close relationship between the cgroup and proc metrics ].
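For reference, the pmcd store access control mechanism mentioned above
is driven by the [access] section of pmcd.conf; a sketch of the kind of
configuration involved might look like the following (the hosts and
patterns here are illustrative only):

```
[access]
# local tools get full access
allow localhost : all;
# monitoring hosts may fetch, and store (needed for the
# require-a-pmstore-before-fetch idea above)
allow 192.168.1.* : fetch, store;
# everyone else is denied store operations
disallow * : store;
```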
Mark talked a bit about Koji, which he uses to generate the binary RPMs
we sometimes (heheheh) have available for releases. Mostly automated
and would be possible for others to run this, but some steps after the
generation (moving bits to oss.sgi.com) are not fully scripted.
Chatz and Ken talked independently about pmview - David mentioned it
in the context of visualising the large data sets that Paul brought up
early on, and Ken talked of reinvigorating the qavis tool.
Ken has built a VM QA Farm that includes:
64-bit Ubuntu 11.04
32-bit Ubuntu 11.04
32-bit openSUSE 11.4
64-bit Fedora 15
32-bit Centos 5.6
32-bit Gentoo 11.0
64-bit FreeBSD 8.2 (no PCP port as yet)
64-bit Debian 6.0.1
32-bit OpenSolaris 2009.06
32-bit NetBSD 5.1 (no PCP port as yet)
32-bit FreeBSD 8.2 (no PCP port as yet)
32-bit Debian 6.0.1 or linux 3.0.0
plus real machines for 64-bit Ubuntu 11.04, 32-bit Ubuntu 11.04 and Mac
OS X. Mark suggested adding Fedora 14 to the mix. Someone suggested
a possible cloud-like resource for QA machines, though details were
light.
Ken took us all through PCP 4.0 status:
Done items
- event records
- retire old archive, PMAPI, PMDA_INTERFACE, PDU, PMNS versions
- remove __pmPool* allocation routines
- async variants of libpcp PMAPI routines are all conditionally
compiled in and will be removed once the library is thread-safe
In progress
- thread-safe libpcp - some regions of the code will enforce
single-threading, usually in routines not used (or rarely used)
by clients; the core routines will be protected by either a big
library lock, or per-channel mutexes for the client-pmcd
channels ... there was some discussion of mutex-per-context
versus mutex-per-channel, and the related issue of fd duping
that takes place under the covers in pmNewContext(),
pmDupContext() and pmReconnectContext() [Ken has more
investigation to do here] ... also good progress on a
build-time tool to check for regressions when new
global/static data symbols are introduced into the library
- eliminate compilation warnings on all platforms
- rework the init/rc scripts (a) to separate pmcd and pmlogger,
and (b) to validate integration into the native infrastructure
across all the supported platforms (it is an area of great
divergence)
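To make the big-lock versus per-channel-mutex trade-off above a bit
more concrete, here is a minimal pthreads sketch (the names, types
and structure are illustrative, not the actual libpcp internals):
routines touching library-global state serialize on one lock, while
per-context operations take only that context's mutex, so work on
different client-pmcd channels can proceed concurrently.

```c
#include <pthread.h>
#include <assert.h>

/* One "big library lock" serializes routines that touch
 * library-global state; a per-context mutex covers channel state.
 * Hypothetical names throughout - not the real libpcp symbols. */
static pthread_mutex_t lib_lock = PTHREAD_MUTEX_INITIALIZER;
static int num_contexts;            /* library-global state */

typedef struct {
    pthread_mutex_t lock;           /* per-context (per-channel) mutex */
    int fd;                         /* socket to pmcd for this context */
} context_t;

/* Creating a context updates global state: take the big lock. */
static void ctx_init(context_t *ctx, int fd)
{
    pthread_mutex_lock(&lib_lock);
    num_contexts++;
    pthread_mutex_unlock(&lib_lock);
    pthread_mutex_init(&ctx->lock, NULL);
    ctx->fd = fd;
}

/* A fetch only touches one channel: the per-context mutex suffices,
 * so fetches on different contexts need not serialize on lib_lock. */
static int ctx_fetch(context_t *ctx)
{
    pthread_mutex_lock(&ctx->lock);
    int fd = ctx->fd;               /* ... exchange PDUs on ctx->fd ... */
    pthread_mutex_unlock(&ctx->lock);
    return fd;
}
```

The fd duping done by pmNewContext()/pmDupContext() is exactly what
complicates this picture: two contexts sharing a duped fd would need
to share (or coordinate) the channel mutex as well.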
It was suggested that the binary namespace (PMNS) be dropped, and
that the requirement to have a working cpp installed to parse the
PMNS be removed ... this will be added to the PCP 4.0 shopping list.
[ In hindsight, Ken, there'd be significant value in backporting the
removal of cpp to the pcp-3.5.x series too - please keep it in mind ]
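For context on the cpp dependency: the ASCII PMNS files are passed
through cpp before parsing, so they may contain C-style comments,
#define and #include. A hypothetical fragment (the "example" PMDA,
its domain number and PMIDs below are made up) shows the kind of
usage involved:

```
/* hypothetical PMNS fragment - cpp must expand the macro and strip
 * these C-style comments before the PMNS parser sees the file */
#define EXAMPLE 123

root {
    example
}

example {
    counter    EXAMPLE:0:0
    elapsed    EXAMPLE:0:1
}
```

Removing the cpp requirement means either parsing this dialect
natively or restricting PMNS files to the plain name/PMID form.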
There was also some discussion about the application instrumentation
APIs. Paul walked through some of the remaining scalability issues in
Parfait, which we hope to have addressed and in production soon. Gokul
mentioned the Groovy API is about 30% complete, and Nathan went through
a couple of the more critical issues that have affected Parfait, in the
hope that they're not repeated.
In followup email, Mustafa sent a pointer to the "Bubbles" project he
has been quietly working away at (https://gitorious.org/bubbles). And
Jeff sent some links to images generated by his latest tool, pmgraph,
which generates graph images from a descriptive configuration file and
a PCP archive.
All in all, a productive and interesting evening. Thanks to Aconex for
providing the venue, pizza and drinks! Let's do it again, soon, when
sufficient time has passed that we all have interesting new topics to
discuss (perhaps a few months?). If you're interested in attending, or
calling in, please let one of us know and we'll make arrangements.
cheers!
--
Nathan
--
Mustafa Sezgin, Fulltime Consultant, Part time Wizard
Mobile: +61409060571