Re: Stats Question correction
Angus Dorbie (dorbie++at++sgi.com)
Mon, 06 Apr 1998 12:08:33 -0700
Jay Gischer wrote:
>
> Sam Chu writes:
> >
> >
> > I am still somewhat confused. I tried 'perfly' in a single-channel, single-pipe
> > configuration and got the following result.
> > (I ran "perfly -m 4 -p 1 esprit.flt", using APPCULL_DRAW mode.)
> >
> > 15.0/30.0 HZ LOCK APPCULL_DRAW
> > frame = 96.3 app=0.4 cull=1.0 draw=31.2 isect=0.0
> >
> I believe that you are running on an mp system using at least two
> processors (since you are in APPCULL_DRAW mode, more won't matter).
>
> > I think the reason that app+cull+draw+isect (32.6) is much smaller than
> > frame (96.3) is that the process is waiting for the Transformation and Fill
> > stages to finish. So may I say that app+cull+draw covers just the CPU stage?
> >
>
> The label "frame" appears to be causing a great deal of confusion
> here. Recall the picture of Performer's pipe
>
> APP   | Frame 1 | Frame 2 | Frame 3 | Frame 4 | etc
>
> CULL  |         | Frame 1 | Frame 2 | Frame 3 | etc
>
> DRAW  |         |         | Frame 1 | Frame 2 | etc
>
> If 30.0 Hz is the target frame rate then the target time from one pipe
> stage boundary to the next is 33.3 ms.
>
> However, the time reported with the label "frame" is a latency time,
> and measures the time from when the frame starts the APP stage to
> when it completes the DRAW stage. Ignoring special cases, this time
> is 3 times as long as the time for a single stage. Three times 33.3 is
> about 100, so 96.3 is about right.
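A minimal sketch of the arithmetic in the paragraph above, assuming a three-stage APP/CULL/DRAW pipeline in which each stage is handed one full frame period. The function names are made up for illustration and are not Performer calls.

```python
def stage_period_ms(target_hz):
    """Time budget for one pipeline stage at the target frame rate."""
    return 1000.0 / target_hz


def pipeline_latency_ms(target_hz, stages=3):
    """Latency from the start of APP to the end of DRAW when each
    stage occupies one full frame period (the simple case only)."""
    return stages * stage_period_ms(target_hz)


period = stage_period_ms(30.0)
latency = pipeline_latency_ms(30.0)
print(f"stage budget: {period:.1f} ms, APP-to-DRAW latency: {latency:.1f} ms")
```

The reported "frame" value of 96.3 ms sits a little under this figure because the last stage is timed to when DRAW finishes its work rather than to the next frame boundary.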
>
> The app time of 0.4 means that the APP stage finished very quickly,
> and then hung around for a long time waiting for the next pfSync(),
> so it could start the CULL stage. Likewise for the cull time.
> For each of the stages, Performer measures the time from the beginning
> of that stage for the frame to when it completes its work, not to the
> sync boundary, since that wouldn't be a very interesting number.
>
> Here's the picture again with annotations for what time each of
> the stats measures.
>
> APP   | Frame 1 | Frame 2 | Frame 3 | Frame 4 | etc
> app   |<----->A1
> CULL  |         | Frame 1 | Frame 2 | Frame 3 | etc
> cull            |<------>C1
> DRAW  |         |         | Frame 1 | Frame 2 | etc
> draw                      |<---->D1
> frame |<-------------------------->D1
>
> A1 = APP stage completes on frame 1
> C1 = CULL stage completes on frame 1
> D1 = Draw stage completes on frame 1
>
> But this begs the question, why, if my draw time is 31.2, and my
> latency is 96.3, am I not getting a frame rate of 30hz?
>
> As it turns out, in perfly there are two contributions to the DRAW
> stage which are not measured by the perfly stats, pre-draw and
> post-draw.
>
> In perfly, the GUI is drawn in the pre-draw phase, and statistics are
> drawn in the post-draw phase. Together, they could easily take up
> enough time (about 2 msecs) to push your application over the 33.3 ms
> mark. The printed displays do not measure these times, but you can
> glean this information from the graphical display by noticing that the
> line representing the DRAW stage of a frame doesn't start quite at the
> boundary (because of the pre-draw time before it), and is extended by
> a dotted line after it (representing post-draw time).
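A rough sketch of why a small unmeasured overhead can halve the frame rate. It assumes the displayed rate can only be the target rate divided by a whole number of frame periods, which is consistent with the "15.0/30.0 HZ" readout but is a simplification. The 2.5 ms overhead is a hypothetical value chosen for illustration, not a measurement from this thread.

```python
import math


def achieved_rate_hz(target_hz, draw_ms, overhead_ms=0.0):
    """Frame rate when total DRAW work must fit in whole frame periods."""
    period_ms = 1000.0 / target_hz
    periods_needed = max(1, math.ceil((draw_ms + overhead_ms) / period_ms))
    return target_hz / periods_needed


def headroom_ms(target_hz, draw_ms):
    """Time left in one frame period after the measured draw time."""
    return 1000.0 / target_hz - draw_ms


print(f"headroom: {headroom_ms(30.0, 31.2):.2f} ms")
print(f"measured draw only: {achieved_rate_hz(30.0, 31.2):.1f} Hz")
print(f"with 2.5 ms pre/post-draw: {achieved_rate_hz(30.0, 31.2, 2.5):.1f} Hz")
```

With draw=31.2 the headroom is only about 2.1 ms, so pre-draw plus post-draw work slightly above that is enough to push DRAW into a second frame period.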
>
> The draw time *does* include transformation and fill.
>
> Okay, so if that explains why we are running at 15hz instead of 30,
> then why isn't the latency equal to about 66.6+66.6+31.2 = 164.4?
>
> This is because you are running in LOCK mode. In LOCK mode, the
> picture I gave above gets more complicated (assuming you are missing
> the target rate in the draw stage):
>
> APP   | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | etc
> app   |<-->A1
> CULL  |    | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | etc
> cull       |<-->C1
> DRAW  |    |    | F1 | F1 | F3 | F3 | F5 | F5 | F7 | F7 | etc
> draw            |<------->D1
> frame |<------------------->D1
>
> In LOCK mode the APP and CULL stages start each frame at the target
> frame rate, and the DRAW stage merely ignores frames that it wasn't
> ready to draw at the proper time, hence, in your case, it displays
> only every other frame. But the latency from beginning of pipe to end
> is smaller, on the order of 33.3+33.3+31.2 = 97.8 which is very close
> to the reported latency of 96.3.
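A small sketch comparing the two latency estimates discussed above, using the rounded per-stage figure of 33.3 ms and the reported draw=31.2. It also lists which frames DRAW would show if each draw spans two frame periods. The helper names are illustrative only, and the "every stage slows to 15 Hz" case is the hypothetical alternative that the message rules out, not a documented Performer mode.

```python
STAGE_MS = 33.3  # rounded frame period at 30 Hz, as used in the message
DRAW_MS = 31.2   # draw time reported by perfly


def latency_ms(app_slot_ms, cull_slot_ms, draw_ms):
    """APP start to DRAW completion: two stage slots plus the draw time."""
    return app_slot_ms + cull_slot_ms + draw_ms


def drawn_frames(num_frames, periods_per_draw):
    """Frames DRAW picks up when it skips the ones it was not ready for."""
    return list(range(1, num_frames + 1, periods_per_draw))


lock = latency_ms(STAGE_MS, STAGE_MS, DRAW_MS)
all_slowed = latency_ms(2 * STAGE_MS, 2 * STAGE_MS, DRAW_MS)
print(f"LOCK mode estimate: {lock:.1f} ms")
print(f"if every stage ran at 15 Hz: {all_slowed:.1f} ms")
print("frames displayed:", drawn_frames(8, 2))
```

The first estimate is the one that lines up with the reported 96.3 ms, and the displayed frames match the F1, F3, F5, F7 pattern in the DRAW row of the diagram.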
>
> > If the above is true, how can I measure the time spent in the
> > Transformation stage and Fill stage?
> >
>
> So, to summarize, the missing time (about 2ms) is probably due to the
> GUI and stats display rendering. It is not due to fill time, which is
> measured and included in the "draw" time.
Very erudite.
I'll only add that if you need less latency you can read eye positions from
shared memory directly in the draw process and update the channel position
and FOV prior to calling pfDraw.
Also if you call pfSync it is possible to make latency critical updates
after the pfSync call but before the pfFrame call which will avoid app
processing latency. The only price you pay is that you lose microseconds
of CULL time.
Basically you only really have to worry about draw time latency if you
do the right thing.
Cheers, Angus.
--
"Only the mediocre are always at their best." -- Jean Giraudoux
For advanced 3D graphics Performer + OpenGL based examples and tutors:
http://www.dorbie.com/
=======================================================================
List Archives, FAQ, FTP: http://www.sgi.com/Technology/Performer/
Submissions: info-performer++at++sgi.com
Admin. requests: info-performer-request++at++sgi.com