Hi Stan,
----- Original Message -----
> I just wanted to give a heads up for this reboot, which caches the rpm
> metrics in advance as opposed to version one's accessing them just in
> time. This is an effort to improve the fetch performance. The rpm
> metrics are cached into the data structure 'cache'. A hash table is
> built; the key is the rpm name and the value is the pointer to the cache
> entry. The hash table, via hsearch_r, is used to find the cache entry
> for an rpm. The fetch callback accesses the cache and returns the
> proper info. If the rpm database changes then it is reloaded.
> -Currently inotify checks the entire dir to make testing easier
> -The hashing sometimes fails; I am diagnosing that problem
There's a number (3 iirc) of places where it takes the lock but returns
without unlocking, possibly part of the problem, possibly not. Memory
and I/O management remains an issue in general though - see notes below.
> -Threading needs to be revisited since the reboot
> -A subset of available rpm metrics is being cached
All need to be cached (scanned by background thread, reported to pmcd
via main thread). There still seems to be rpm.file.* metrics - are
these leftovers, still to be removed? Those are going to consume too
much memory, I think? I had expected them to exit stage left, never
to be seen again. :) Either way, lets get the smaller stuff working
first and revisit those.
> committed to pcpfans scox/dev branch
>
Misc other schtuff:
- some build warnings on RHEL6 gcc-4.4.6...
rpm.c: In function ârpm_update_cacheâ:
rpm.c:482: warning: assignment discards qualifiers from pointer target type
rpm.c:490: warning: enumeration value âRPM_NULL_TYPEâ not handled in switch
rpm.c:490: warning: enumeration value âRPM_CHAR_TYPEâ not handled in switch
rpm.c:490: warning: enumeration value âRPM_BIN_TYPEâ not handled in switch
rpm.c:490: warning: enumeration value âRPM_I18NSTRING_TYPEâ not handled in
switch
rpm.c:589: warning: unused variable âcodeâ
rpm.c: In function ârpm_fetchCallBackâ:
rpm.c:354: warning: âret_typeâ may be used uninitialized in this function
rpm.c: In function ârpm_update_cacheâ:
rpm.c:510: warning: âccecpâ may be used uninitialized in this function
rpm.c: In function ârpm_update_indomâ:
rpm.c:650: warning: âe.dataâ may be used uninitialized in this function
- build fails... simpler to just install the help files (to be compiled
at ./Install time), and not build 'em during the build - see pmdasimple
and attached makefile patch:
pmcpp: root[5]: #include <stdpmid>
pmcpp: Error: Cannot open file for #include
Error Parsing ASCII PMNS: pmcpp returned non-zero exit status
newhelp: pmLoadNameSpace: Problems parsing PMNS definitions
- makefile needs to build conditionally based on presence of rpm lib
and headers (see note in patch attached).
- need configure.in support for detecting whether/not to build this code,
will cause build failure on non-Linux and non-rpm platforms currently.
- add PMDA_NOTREADY into rpm_fetch and rpm_instance to tackle latency on a
request during or shortly after startup (see patch, untested) - based on
pmdasample logic
- RPM_BUFSIZE macro is unused
- "ret_type" is usually named "sts" in the rest of the pcp code.
Used in a sentence: "aliens froze my spaceship and i was stuck in sts"
[sts? status? a count of "st"'s? its a stasis symbol.]
- add in a metric to cheaply track memleaks via __pmProcessDataSize
like the pmcd.datasize metric in pmcd
- the units for those refresh metrics are missing, and should be free
running counters. (see attached patch)
- refresh.time - hmm, its not really cpu time we want though, its the
elapsed time thats important here (with I/O) - lets make this into
a tree with refresh.time.{user,kernel,elapsed}. We need some new
gettimeofday based calipers for elapsed time - see timer.c in patch.
- there's no initialisation of indom_lock (pthread_rwlock_init)
- is rwlock helpful? theres only one reader & one writer - mutex?
- rpm_update_cache - holds locks while doing rpmdb I/O, which will
block fetches, no? locking needs to become more fine-grained.
- const char *rpmname = rpmtdGetString(td);
...
e.key = rpmname;
(what's the lifecycle of the memory rpmname points at here? it
looks like we're pointing at memory that might go away later?)
- stepping back, thinking about data structures in general:
- so, we have a linked list (cache) and a hash table (htab)
- we need the linked list because theres no hsearch(3) API to walk
the hash table
- can we use a second pmdaCacheOp indom hash here instead of the
above? I think this will be simpler, and may help us with the
locking too - will catch you on IRC to discuss when you're back.
- needs another test (750 is a good start though) - the tricky memory
management here definitely warrants a valgrind test (via DSO use in
dbpmda perhaps?) - not sure how to automate the rpmdb changes to
trigger a refresh though? needs deeper thought, but thats the hard
part of this PMDA and 750 doesn't cover that aspect.
cheers.
--
Nathan
scox-pmdarpm.patch
Description: Text Data
|