Hi Marko,
----- Original Message -----
>
> I now had a chance to retest. There was not much other DB load during
> my tests.
>
> The patch down below cures the DBI->connect() issues.
>
Good stuff. Strange that it helps though, as I would've expected that
host/port number combination to be the default.
From the DBD::Oracle docs on CPAN:
"If port name is not specified, 1521 is the default. If service name is
not specified, the hostname will be used as a service name."
So maybe there's some problem/slowness resolving the hostname? (which
using "localhost" circumvents - *shrug*)
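
If it helps to test the name-resolution theory, here is a rough Python sketch that times how long resolving a host takes. It is not part of the PMDA, and the function name time_resolution is made up for illustration. The port 1521 is just the default mentioned in the docs quoted above.

```python
import socket
import time


def time_resolution(host, port=1521):
    """Return the seconds spent resolving host, or None if it does not resolve.

    Hypothetical helper for illustration only; 1521 is the DBD::Oracle
    default port mentioned in the quoted docs.
    """
    start = time.perf_counter()
    try:
        # Only name resolution is timed; no connection is attempted.
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return None
    return time.perf_counter() - start
```

Comparing time_resolution("localhost") against time_resolution(socket.gethostname()) on the DB host should show whether the real hostname is the slow one.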
> But even after that ./Install always fails:
>
> pmcd(89137) Warning: pduread: timeout (after 5.000 sec) while attempting to
> read 12 bytes out of 12 in HDR on fd=18
> pmcd(89137) Info: CleanupAgent ...
> Cleanup "oracle" agent (dom 32): protocol failure for fd=18, exit(1)
>
> After pmcd restart we see numbers like these:
>
> # time pminfo -f oracle > /dev/null
>
> real 0m6.583s
> user 0m0.026s
> sys 0m0.010s
>
Yeah, OK, hmm. At roughly 6.6 seconds that fetch exceeds the 5 second
pduread timeout in the pmcd log above, so those times will certainly be
the cause of the ./Install failure.
> Then the most relevant part: for most clusters response times are
> somewhere between 0.03 and 0.3 sec but these two stand out:
Those seem like good-to-middling times, but this...
> - oracle.file takes ~1.3s with ~1k rows
> - oracle.object_cache takes ~3.2s with ~225k rows
is horrendous. oracle.file is the same cluster we had trouble with earlier
when testing with the Intel folk FWIW.
I wonder if the best we can do here is something like:
- disable these two clusters by default
- add oracle.control metrics for each
- add pmstore support to allow people to opt-in to these clusters.
It's not ideal but I don't think there's much else we're going to be able to
do to improve things on our end of the connection, and this would stabilize
things for you at least. Thoughts?
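
To make the idea concrete, here is a minimal Python sketch of the gating logic, assuming one on/off flag per cluster. It is not the PMDA's code: the class, the method names and "example_cluster" are all hypothetical, and store() only stands in for whatever a pmstore on a control metric would end up calling.

```python
# Clusters that would start out disabled, per the proposal above.
EXPENSIVE_CLUSTERS = {"file", "object_cache"}


class ClusterControl:
    """Hypothetical per-cluster on/off switch (illustration only)."""

    def __init__(self, clusters, disabled_by_default):
        # Every cluster is enabled unless it is listed as disabled by default.
        self.enabled = {c: c not in disabled_by_default for c in clusters}

    def store(self, cluster, value):
        """Stand-in for a pmstore on a control metric: nonzero enables."""
        if cluster not in self.enabled:
            raise KeyError(cluster)
        self.enabled[cluster] = bool(value)

    def refresh(self, cluster, query):
        """Run the (possibly slow) query only if the cluster is enabled."""
        if not self.enabled[cluster]:
            return []
        return query()
```

With this shape a disabled cluster never issues its query, so the slow ones cost nothing until someone explicitly turns them on.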
cheers.
--
Nathan