Nathan has asked me (several times...) about making it easy for existing Java applications to be instrumented into PCP. The way Aconex originally did this was quite deeply embedded in the life-cycle of our main application, and requires more configuration and effort than ideal. Lots of things in the Java world have come about since we started, and we've certainly learned more since then. What follows is an attempt to outline what I would say to someone trying to integrate a Java app into PCP.
I thought of 2 phases a user might go through in this journey of discovery of instrumenting their Java stuff with PCP:
1) I have a Java app not under my source control I'd really like to instrument into PCP
2) I have a Java app I do have source control of, and would like to start exposing more stuff to PCP
For the first example, I see a Cuckoo's Egg (see [4]) mechanism where we need a way to 'hijack' the JVM to allow an instrumentation path with the least amount of modification possible. Originally I hacked up a mechanism that injected a custom JAR into the application by placing it in a known location from which libraries were loaded, and then leveraged Log4j's configuration mechanism to create a custom "appender" that bootstrapped the instrumentation process. Crafty, but seriously hacky.
For this simple/general case I now think the best & easiest approach is to use Jolokia (see [1]) via its agent mechanism (see [2]). Jolokia is an awesome library that exposes the JMX objects within a running JVM process via a really nice RESTful API. Really top stuff. Since the JVM itself exposes JMX objects, it's really easy to get the Java memory info & others sucked out via REST by hijacking the JVM through this method.
The 'documentation', if you will, for telling PCP users how to do it would be something like:
1) Download the Jolokia JAR and place it somewhere on the host of the Java process
2) Force the Java process to launch with a subtle change to the command line. Many Java apps support the JAVA_OPTS environment variable as a way of passing extra options (particularly to give a Java app more heap memory in certain circumstances). This lets you inject the Jolokia Java agent, which starts up and exposes the REST interface on a port (see [2])
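Once the agent is up, checking that it answers takes very little code. What follows is only a rough sketch, assuming the Jolokia JVM agent defaults of port 8778 and the /jolokia context; the class name JolokiaHeapProbe, the JAR path in the comment, and the choice of the heap "used" value are mine, purely for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Assumed launch of the target JVM (the JAR path is a placeholder):
//   JAVA_OPTS="$JAVA_OPTS -javaagent:/path/to/jolokia-jvm-agent.jar=port=8778,host=localhost"
final class JolokiaHeapProbe {

    // Builds a Jolokia "read" URL: <base>/read/<mbean>/<attribute>[/<inner path>].
    // This sketch does no escaping, so it only suits MBean names without
    // spaces, slashes or other characters that are special in a URL.
    static String readUrl(String base, String mbean, String attribute, String path) {
        String url = base + "/read/" + mbean + "/" + attribute;
        return (path == null || path.isEmpty()) ? url : url + "/" + path;
    }

    public static void main(String[] args) throws Exception {
        // Base URL of the agent; the default here is an assumption about your setup.
        String base = args.length > 0 ? args[0] : "http://localhost:8778/jolokia";
        String url = readUrl(base, "java.lang:type=Memory", "HeapMemoryUsage", "used");

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        // The body is a JSON document; its "value" field carries the reading.
        System.out.println(response.body());
    }
}
```

It needs Java 11 or later for java.net.http. A PMDA would do the same kind of fetch on each sample and pull the number out of the JSON.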
A fairly standard Java PCP PMDA could be written for this approach. Indeed, our recent ActiveMQ PMDA follows this approach; it isn't yet generic enough to use here, but the calls through to Jolokia are there (see [7]). Such a PMDA could query the REST interface for known JMX namespace entries to soak out the metrics you want. JVM versions do tweak the JMX namespaces from time to time, particularly if the garbage collector method changes, but it's usually from a well-formed base, so a simple iteration from the root of a namespace to create matching PCP metrics might be fine.
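To give a feel for that "iterate from a root" idea, here is a small sketch that walks the java.lang JMX domain and derives a dotted, PCP-looking name for each attribute. To keep it runnable on its own it walks the in-process platform MBean server rather than going through Jolokia; a PMDA would get the same names over REST. The naming rule (domain, then the type and name keys, then the attribute) and the class name JmxNamespaceWalk are my assumptions, not an existing convention.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

final class JmxNamespaceWalk {

    // Replaces anything other than letters, digits and '_' with '_'.
    // A component starting with a digit is not handled by this sketch.
    static String sanitize(String part) {
        return part.replaceAll("[^A-Za-z0-9_]", "_");
    }

    // Hypothetical mapping: <domain>.<type>[.<name>].<attribute>
    static String metricName(ObjectName on, String attribute) {
        List<String> parts = new ArrayList<>();
        parts.add(sanitize(on.getDomain()));
        for (String key : new String[] {"type", "name"}) {
            String value = on.getKeyProperty(key);
            if (value != null) {
                parts.add(sanitize(value));
            }
        }
        parts.add(sanitize(attribute));
        return String.join(".", parts);
    }

    // Convenience overload taking the ObjectName in its string form.
    static String metricName(String objectName, String attribute) {
        try {
            return metricName(new ObjectName(objectName), attribute);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(objectName, e);
        }
    }

    // Collects a candidate metric name for every attribute of every MBean
    // matching the pattern, sorted for stable output.
    static Set<String> walk(MBeanServer server, String pattern) throws Exception {
        Set<String> names = new TreeSet<>();
        for (ObjectName on : server.queryNames(new ObjectName(pattern), null)) {
            for (MBeanAttributeInfo attr : server.getMBeanInfo(on).getAttributes()) {
                names.add(metricName(on, attr.getName()));
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        walk(ManagementFactory.getPlatformMBeanServer(), "java.lang:*")
                .forEach(System.out::println);
    }
}
```

The garbage collector beans show why the namespace shifts between JVMs: the collector's name ends up as a component of the metric name here, whereas in PCP it would arguably sit better as an instance in an instance domain.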
So now the user thinks, "OK, this is awesome, I'd like to create my own objects to expose to PCP", or in another scenario, "The built-in JVM instrumentation here is great, but there are a few other JMX beans I've just discovered in this darn app that I think would be good to expose too". Perhaps the base/simple Jolokia agent setup could be configured to include extra JMX namespace mappings as well?
Now we move on to the next scenario, where the user has control over the source code. I was thinking the best recommendation is to suggest the user adopts the DropWizard (née Codahale) Metrics library (see [3]). This comprehensive package allows excellent metric definitions within the code in a nice way, although it doesn't quite have the Instance Domain concept that PCP has, which is powerful, but probably an Advanced Rocketry mode for many. By going down the DropWizard Metrics route, the Java developer can easily expose the metrics via JMX, which then leads down the above Jolokia route, or PCP could have a companion module for DropWizard Metrics which allows exporting these metrics to PCP. DropWizard Metrics already supports several Reporter interfaces to things like Ganglia (see [8]), so it shouldn't be that hard to create one for PCP. The BIM team within Aconex has created a Parfait fork that allows exporting DropWizard Metrics into PCP via Parfait.
Parfait, however, is not easy to configure and set up. As a library it really needs a 2.0: a start from scratch with new information & ideas. An alternative might be to use the built-in (*gasp*) CSV Reporter of DropWizard Metrics, which might allow a simple soaking up of values. Hacky! :) Honestly, a reworking of Parfait to make it simpler, cleaner, tighter code would be good. I'd also like that module migrated to GitHub (off Mercurial), within the 'performancecopilot' Organization namespace.
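For what it's worth, the CSV idea really would be only a few lines. The sketch below assumes the reporter appends rows to one CSV file per metric, with a header row naming the columns; check that against the version of the library you use. The class name CsvMetricTail and the file name in the comment are made up for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CsvMetricTail {

    // Pairs the header row with the last data row: column name -> latest value.
    // Returns an empty map when there is no data row yet.
    static Map<String, String> latest(List<String> lines) {
        Map<String, String> row = new LinkedHashMap<>();
        if (lines.size() < 2) {
            return row;
        }
        String[] header = lines.get(0).split(",");
        String[] last = lines.get(lines.size() - 1).split(",");
        for (int i = 0; i < header.length && i < last.length; i++) {
            row.put(header[i].trim(), last[i].trim());
        }
        return row;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            // Hypothetical usage: java CsvMetricTail /var/tmp/metrics/requests.csv
            System.err.println("usage: CsvMetricTail <metric.csv>");
            return;
        }
        System.out.println(latest(Files.readAllLines(Paths.get(args[0]))));
    }
}
```

Re-reading the whole file on every sample is wasteful for long-running apps, which is part of why I call this hacky.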
So there you go, the ideas I've had cooking around in my mind for a while, and waiting for a time to chat to Nathan face to face (I was too lazy to type this up before and also never here at Aconex when he comes over to the city).
Happy New Year to all.
Paul Smith