| To: | Mike Kravetz <kravetz@xxxxxxxxxx> |
|---|---|
| Subject: | Re: kernprof performance |
| From: | Ethan Solomita <ethan@xxxxxxxxxxxxxxx> |
| Date: | Mon, 17 Jun 2002 23:25:31 -0700 |
| Cc: | kernprof@xxxxxxxxxxx |
| References: | <20020613114140.A2384@w-mikek2.des.beaverton.ibm.com> |
| Sender: | owner-kernprof@xxxxxxxxxxx |
| User-agent: | Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.0.0) Gecko/20020530 |
[resend, take 2]

Hi Mike -- there may be some delayed response, because the Usenix Technical Conference was this past week, and John Hawkes was there and busy with a paper to be presented.

Mike Kravetz wrote:

> Recently, I was trying to understand kernprof output while
> doing a scalability study. My intention was to run a benchmark
> on a UP (actually 1P SMP kernel) and capture kernprof output.
> I then ran the same benchmark after enabling 7 additional
> processors. I 'thought' that I could do an apples to apples
> comparison of the output to look for routines whose execution
> time increased. However, this is not what I got. It 'looks'
> like the 8P numbers are 8X the UP numbers???

I'm not sure what you're referring to, but I have a guess. If you have 1 CPU, and HZ is 100 (which it is), then you will get 100 samples per second from kernprof. If you have 2 CPUs, then you will get 200 samples per second. I believe that the output from gprof will show a total of 200% (my memory may be bad).

If so, that's not a problem. Each CPU represents 100%. This seems much more useful than the reverse. For example, if 1 cpu is 100% busy, and the other is 100% idle, the gprof output will show 100% (well, 99% 8) in cpu_idle(), and another 100% spread around among various functions. This seems much more readable than 50% in cpu_idle and 50% spread around. Now imagine a 32 CPU system and how readable it would be...
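
The arithmetic above can be sketched in a few lines. This is only an illustration of the sampling math, assuming one sample per timer tick per CPU with HZ = 100 as stated; the function names are hypothetical and are not part of kernprof or gprof, and whether gprof actually reports totals this way is something I have not confirmed.

```python
HZ = 100  # timer ticks per second per CPU, as stated in the message


def expected_samples(num_cpus, seconds, hz=HZ):
    """Total samples collected if every CPU is sampled once per tick."""
    return num_cpus * hz * seconds


def percent_of_one_cpu(samples, seconds, hz=HZ):
    """Share of a single CPU's time; 100.0 means one CPU fully occupied."""
    return 100.0 * samples / (hz * seconds)


def percent_of_machine(samples, num_cpus, seconds, hz=HZ):
    """The same samples expressed as a share of the whole machine."""
    return percent_of_one_cpu(samples, seconds, hz) / num_cpus


if __name__ == "__main__":
    # Two CPUs for 10 seconds: one fully idle, one fully busy.
    idle_samples = 1000
    print(expected_samples(2, 10))                   # total samples across both CPUs
    print(percent_of_one_cpu(idle_samples, 10))      # per-CPU view of cpu_idle()
    print(percent_of_machine(idle_samples, 2, 10))   # whole-machine view of cpu_idle()
```

Under these assumptions, an 8P run collects 8 times as many samples as a 1P run over the same wall-clock interval, so raw counts need to be normalized one way or the other before comparing the two profiles.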
Personally, I don't use acg mode. For most purposes, I prefer call backtrace mode. You don't need to build the kernel with CONFIG_MCOUNT at all. So you can enable CONFIG_KERNPROF (very minimal overhead) all the time, and still get profiling info whenever you need it without rebooting.

The output in gprof that says how many times a function was called, and how many times that function called others, is different. But no less useful. And the times it attributes to callers and callees of a function are actually *accurate* with call backtrace, whereas acg can be completely inaccurate.

-- Ethan