Hey Frank,
fche@xxxxxxxxxx (Frank Ch. Eigler) writes:
[...]
> The two commits on pcpfans.git fche/papi improve on this situation by
> moving code from the papi-pmda "inner loop" (the papi_fetchCallBack
> function) into the higher level papi_fetch one. With default pminfo
> and papi-pmda batching values, both those pminfo calls should finish
> in much less than a second. The knobs exposed by the code let you
> benchmark the improvements on your own hardware too.
Thanks for taking a look at this. The changes look good to me. I've
also verified that the updated QA passes all tests on my machine.
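For anyone following along, the general shape of that kind of change
can be sketched as below. This is only a generic illustration of
hoisting work out of a per-metric callback into the per-fetch function;
it is not the actual papi-pmda code, and every name in it
(read_all_counters, slow_fetch_callback, fast_fetch, and so on) is
hypothetical rather than a real PAPI or PCP function.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_COUNTERS 4

/* Counts how often the (hypothetical) expensive read is performed. */
static int expensive_reads = 0;

/* Hypothetical stand-in for an expensive hardware-counter read.
 * Not a PAPI or PCP function; it just fills in dummy values. */
static void read_all_counters(long long values[NUM_COUNTERS])
{
    expensive_reads++;
    for (size_t i = 0; i < NUM_COUNTERS; i++)
        values[i] = 100 * (long long)(i + 1);
}

/* "Before" shape: each per-metric callback does its own full read. */
static long long slow_fetch_callback(size_t item)
{
    long long values[NUM_COUNTERS];
    read_all_counters(values);
    return values[item];
}

/* "After" shape: the per-fetch function reads once and caches... */
static long long cached[NUM_COUNTERS];

static void fast_fetch(void)
{
    read_all_counters(cached);
}

/* ...and the per-metric callback only looks up the cached value. */
static long long fast_fetch_callback(size_t item)
{
    return cached[item];
}

/* Number of expensive reads needed to fetch every metric once
 * when the read lives in the callback. */
static int reads_with_callback_loop(void)
{
    expensive_reads = 0;
    for (size_t i = 0; i < NUM_COUNTERS; i++)
        assert(slow_fetch_callback(i) == 100 * (long long)(i + 1));
    return expensive_reads;
}

/* Same fetch, but with the read hoisted into the per-fetch function. */
static int reads_with_hoisted_fetch(void)
{
    expensive_reads = 0;
    fast_fetch();
    for (size_t i = 0; i < NUM_COUNTERS; i++)
        assert(fast_fetch_callback(i) == 100 * (long long)(i + 1));
    return expensive_reads;
}
```

Calling reads_with_callback_loop() and reads_with_hoisted_fetch() from
a small main() shows the difference in this toy setup: one expensive
read per metric versus one per fetch.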
> Before:
> # pmstore papi.control.batch 9999
> # pmstore papi.control.reset ""
> # /bin/time pminfo -f papi >/dev/null
> 0.00user 0.00system 0:02.68elapsed
> # /bin/time pminfo -b1 -f papi >/dev/null
> 0.00user 0.01system 0:03.30elapsed
>
> After:
> # pmstore papi.control.batch 10
> # pmstore papi.control.reset ""
> # /bin/time pminfo -f papi >/dev/null
> 0.00user 0.00system 0:00.19elapsed
> # /bin/time pminfo -f papi >/dev/null
> 0.00user 0.00system 0:00.11elapsed
I'm also seeing a healthy performance improvement on my machine:
roughly 12x on the first run and a bit over 4x on subsequent runs.
# time pminfo -v papi > /dev/null
Before:
real 0m3.070s, user 0m0.009s, sys 0m0.011s
real 0m0.726s, user 0m0.006s, sys 0m0.014s
real 0m0.713s, user 0m0.006s, sys 0m0.010s
After:
real 0m0.252s, user 0m0.009s, sys 0m0.011s
real 0m0.165s, user 0m0.005s, sys 0m0.010s
real 0m0.168s, user 0m0.008s, sys 0m0.006s
If the commits on fche/papi could get picked up for dev, that'd be
greatly appreciated.
Cheers,
Lukas