Re: Latency in statics graph


From: agenty++at++paris.sgi.com
Date: 11/13/2001 02:26:14


Hi,

Maybe this old article will be of interest here.
Below you will find a Performer latency article (1985, I think).
Does it still seem accurate?

Ariane

On Nov 12, 12:02pm, Marcin Romaszewicz wrote:
> Subject: Re: Latency in statics graph
>
> It's not just the sum of APP+CULL+DRAW time, but rather the sum of three
> frame boundaries. If you're swapping at 60Hz and running APP_CULL_DRAW,
> the whole process pipeline takes 3 * 16.6 ms, since data does not flow in
> the pipeline until the frame boundary.
>
> -- Marcin
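
A rough sketch of the arithmetic above, assuming each stage hands its result on only at the next frame boundary. The function name is made up for illustration and the stage times are roughly the ones Keith reported; none of this is Performer API.

```python
import math

def boundary_latency_ms(stage_ms, refresh_hz):
    """Latency when each stage's output moves on only at a frame boundary."""
    frame_ms = 1000.0 / refresh_hz
    # Each stage occupies a whole number of frame periods, at least one.
    periods = sum(max(1, math.ceil(t / frame_ms)) for t in stage_ms)
    return periods * frame_ms

stages = [2.0, 2.0, 8.0]   # app, cull, draw times in ms (illustrative)
naive = sum(stages)        # simple sum of the stage times
quantized = boundary_latency_ms(stages, 60.0)
print(f"naive sum: {naive:.1f} ms, frame-boundary latency: {quantized:.1f} ms")
```

Under that assumption the three short stages still cost three full 60 Hz frame periods, which is why the simple sum understates the pipeline latency.
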
>
> On Mon, 12 Nov 2001, Keith Parkins wrote:
>
> > Marcin,
> >
> > This is what I thought, but the numbers don't add up. When I run the
> > stats, I'll get approximately 2ms for app, 2ms for cull, and 7-9ms for
> > draw. However the latency reported is 29ms. Do you know why this would
> > be? The app line in the graph is solid for the 2ms length, but then a
> > dotted line extends beyond the next frame (app 17ms from the looks). My
> > question is whether or not my latency is around 13ms which I get from
> > adding or 29ms, and where are the extra 12ms coming from? I am running
> > Performer 2.2 if that makes a difference.
> >
> > Keith
> >
> > On Wed, 31 Oct 2001, Marcin Romaszewicz wrote:
> >
> > >
> > > The latency is how long it will take for a change you make in Performer
> > > to appear on the screen. Say you are running APP_CULL_DRAW multiprocess and
> > > each process takes 10 ms. You make a change in the beginning of the app,
> > > 10 ms later it's in the cull, 10 ms after that it's in the draw, and after
> > > another 10 ms it's on your screen after a buffer swap.
> > >
> > > -- Marcin
> > >
> > > On Wed, 31 Oct 2001, Keith Parkins wrote:
> > >
> > > > Hi,
> > > >
> > > > When I toggle the statistics graph, I notice that the latency seems
> > > > unusually high. My app, cull, and draw are all under 10ms but the latency
> > > > can be up above 30ms. Is this latency due to drawing the statistics
> > > > graph or due to some other area? Where should I be looking to trim down
> > > > the time? I couldn't find documentation on this field in the stats graph,
> > > > but if there is some, could you point it out to me?
> > > >
> > > > Thanks for the help!
> > > > Keith
> > > >
> > > > -----------------------------------------------------------------------
> > > > List Archives, Info, FAQ: http://www.sgi.com/software/performer/
> > > > Open Development Project: http://oss.sgi.com/projects/performer/
> > > > Submissions: info-performer++at++sgi.com
> > > > Admin. requests: info-performer-request++at++sgi.com
> > > > -----------------------------------------------------------------------
> >
>
>-- End of excerpt from Marcin Romaszewicz

On Nov 12, 7:08pm, Ariane Genty wrote:
> Subject: latency performer article
>
> > Latency and Transport Delay
> > ---------------------------
> >
> > In visual simulation the terms "latency" and "transport delay" refer
> > to the same thing -- the time elapsed between stimulus and response.
> > Confusion can enter the picture because there are several important
> > latencies in visual simulation and often it's not clear which one is
> > being discussed, that is, which stimulus and which response.
> >
> > The most general measure is the TOTAL LATENCY, which measures the time
> > from user input (such as a pilot moving a control) to display of a
> > new image computed using that input. An example would be a sudden roll
> > after smooth level flight. How long does it take for a tilted horizon
> > to appear?
> >
> > The answer to the total time required is the sum of each latency in
> > the simulation system. The component latencies are usually:
> >
> > 1. input device measurement and reporting latency
> > 2. vehicle dynamics computation latency
> > 3. image generation computation latency
> > 4. display system (video) scan-out latency
> >
> > This is the latency that matters to the user of the system, since the
> > overall latency is what controls the sense of realness the system can
> > provide.
> >
> > Despite the utility of a total system measure, vendors of subsystems
> > can only provide latency measures for their component. The exception
> > to this is the image generation system, since the video latency is
> > implied by the video output format of the image generation hardware.
> > The combined image generation computation latency and display system
> > latency measure is known as the VISUAL LATENCY.
> >
> > Questions of latency in IRIS Performer applications may refer to
> > either total latency or visual latency. The application developer will
> > select the scope (in the sense of the four tasks outlined above) of
> > the application, and then the latency will be decided by the choice of
> > multiprocessing mode, frame rate, and video output format, as shown
> > below.
> >
> > Multiprocessing and Latency
> > ---------------------------
> >
> > Multiprocessing is a major performance feature for applications based
> > on IRIS Performer. Multiprocessing has an effect on latency that must
> > be understood, however, so that increasing throughput using multiple
> > processes does not cause objectionable latency. Here are the major
> > multiprocessing opportunities:
> >
> > 1. Application processing, such as input device measurement and
> > vehicle dynamics computation for future frames can proceed in
> > parallel with image generation for current frames. These are
> > the non-visual aspects of simulation systems. These tasks were
> > once commonly performed in separate computers from different
> > vendors. The current trend is to combine this processing into
> > a single system for additional price and programming benefits.
> > The latencies of these tasks are accounted for independently
> > of the visual system latency.
> >
> > 2. Intersection detection, often called "mission functions" in
> > the image generation world, can be performed on separate
> > processors in parallel with visual simulation.
> >
> > 3. Systems with multiple graphics pipelines can operate those
> > graphics systems in parallel. This can double or triple the
> > total graphics throughput of a system. In essence, this means
> > operating multiple visual systems in a single box, which has
> > price and programming benefits, but does not increase latency
> > when sufficient processing power is configured.
> >
> > 4. The image generation task can be subdivided into a culling
> > phase and a drawing phase, and these can operate in parallel
> > on separate processors to (in some cases) double throughput.
> > Interestingly this is the only multiprocessing issue that has
> > an impact on the latency of the visual system.
> >
> > Each of the multiprocessing concepts outlined above is supported by
> > IRIS Performer: application processing, intersection detection, and
> > multiple graphics pipelines with parallelized cull and draw for each
> > can all be realized on separate processors for increased performance.
> >
> > Task Partitioning
> > -----------------
> >
> > There are four basic ways the application, intersection, cull, and draw
> > tasks can be partitioned in IRIS Performer applications. They are listed
> > in the following table, with the symbol 'A' representing application
> > processing, 'I' for intersections, 'C' for culling, and 'D' for drawing:
> >
> >   CPU 1     CPU 2     CPU 3     pfMultiprocess() Mode
> >   -------   -------   -------   ---------------------
> >   A+I+C+D                       PFMP_APPCULLDRAW
> >   A+I+C     D                   PFMP_APPCULL_DRAW
> >   A+I       C+D                 PFMP_APP_CULLDRAW
> >   A+I       C         D         PFMP_APP_CULL_DRAW
> >
> > As you can see, the names are built by inserting an underscore into the
> > string "APPCULLDRAW" wherever this pipeline is split across multiple
> > processors.
> >
> > Each of the four task partitionings listed above has a twin in which the
> > intersection processing is performed in one or more separate processes.
> > These modes each require one extra process, as shown in the following
> > table:
> >
> >   CPU 1   CPU 2   CPU 3   CPU 4   pfMultiprocess() Mode
> >   -----   -----   -----   -----   ---------------------
> >   A+C+D   I                       PFMP_APPCULLDRAW   | PFMP_FORK_ISECT
> >   A+C     D       I               PFMP_APPCULL_DRAW  | PFMP_FORK_ISECT
> >   A       C+D     I               PFMP_APP_CULLDRAW  | PFMP_FORK_ISECT
> >   A       C       D       I       PFMP_APP_CULL_DRAW | PFMP_FORK_ISECT
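
The naming rule described above (an underscore wherever the pipeline is split across processes) can be sketched as a small parser. This is only an illustration: `processes_for_mode` and its `fork_isect` argument are hypothetical helpers that work on the mode names as strings, not Performer calls or constants.

```python
def processes_for_mode(mode, fork_isect=False):
    """List the task groups that run as separate processes for a mode name."""
    name = mode[len("PFMP_"):] if mode.startswith("PFMP_") else mode
    processes = []
    # Every underscore in the name marks a split onto another process.
    for group in name.split("_"):
        letters = []
        for task, letter in (("APP", "A"), ("CULL", "C"), ("DRAW", "D")):
            if task in group:
                letters.append(letter)
                # Intersections stay with the application unless forked.
                if task == "APP" and not fork_isect:
                    letters.append("I")
        processes.append("+".join(letters))
    if fork_isect:
        processes.append("I")  # intersection gets a process of its own
    return processes

print(processes_for_mode("PFMP_APP_CULLDRAW"))
print(processes_for_mode("PFMP_APP_CULL_DRAW", fork_isect=True))
```

The returned lists correspond to the rows of the two tables, one entry per process.
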
> >
> > Temporal Structure
> > ------------------
> >
> > Once the division of tasks among processors is defined, the next issue
> > is the temporal structure--the layout of these processes in time.
> > Rather than measure time in seconds, the following discussion refers to
> > it in terms of video frames, since these form the basic unit of time
> > for image generation tasks. Here is a chart showing advancing time from
> > left to right as measured in video frames, and within this context, the
> > arrangement of application, culling, and drawing in time:
> >
> >     <-frame time->|
> >     +-------------+-------------+-------------+-------------+
> > A+I |A0->         |A1->         |A2->         |A3->         |
> > C   |    C0->     |    C1->     |    C2->     |    C3->     |
> > D   |        D0-> |        D1-> |        D2-> |        D3-> |
> >     +-------------+-------------+-------------+-------------+-->
> >     0             1             2             3             4
> >
> >     <-- frame 0 -->
> >                   <-- frame 1 -->
> >
> > This shows that during the time for frame 0, the first activity was
> > application processing (A0), followed by culling (C0), and then drawing
> > (D0). Note that the cull started as soon as the application finished,
> > and that the draw started as soon as the cull finished. This would be
> > accomplished by the pfMultiprocess(PFMP_APPCULLDRAW) call, where the
> > tasks are serial steps taken by one process. This mode has an elapsed
> > time of one frame time (the reciprocal of the frame rate). This is not the
> > highest throughput choice, however. A different arrangement is shown below:
> >
> >     <-frame time->|
> >     +-------------+-------------+-------------+-------------+
> > A+I |A0---------->|A1---------->|A2---------->|A3---------->|
> > C   |             |C0---------->|C1---------->|C2---------->|
> > D   |             |             |D0---------->|D1---------->|
> >     +-------------+-------------+-------------+-------------+-->
> >     0             1             2             3             4
> >
> >     <---------------- frame 0 --------------->
> >                   <---------------- frame 1 --------------->
> >
> > Here, an entire frame time has been allocated for each of application,
> > cull, and draw processing. This means that the complete processing for a
> > single frame extends across several frame times, increasing the elapsed
> > time from stimulus to response but also increasing the throughput of
> > the visual system by allowing more time for each stage of processing.
> >
> > Since there are multiple processes operating in parallel (application,
> > culling, and drawing) the latency can be longer than one frame time, even
> > though frames are completed at the frame rate. This is an important point
> > to understand about pipelining. This mode of operation can be selected
> > with pfMultiprocess(PFMP_APP_CULL_DRAW).
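
The throughput-versus-latency trade described above can be put into numbers with a small sketch. It assumes the elapsed time runs from the start of application processing to the end of draw, and it ignores video scan-out and rounding up to the video rate; the function name and stage times are made up for illustration.

```python
def period_and_latency_ms(stage_ms, pipelined):
    """Return (minimum frame period, stimulus-to-response time) in ms."""
    if pipelined:
        # One stage per process: the slowest stage sets the frame period,
        # and a frame passes through every stage in turn.
        period = max(stage_ms)
        return period, period * len(stage_ms)
    # One process runs all stages back to back within a single frame.
    period = sum(stage_ms)
    return period, period

stages = [4.0, 6.0, 14.0]  # app, cull, draw times in ms (made-up values)
serial = period_and_latency_ms(stages, pipelined=False)
piped = period_and_latency_ms(stages, pipelined=True)
print("serial:    period %.1f ms, latency %.1f ms" % serial)
print("pipelined: period %.1f ms, latency %.1f ms" % piped)
```

With these values the pipelined arrangement finishes a frame every 14 ms instead of every 24 ms, but each frame takes 42 ms rather than 24 ms to pass through all three stages.
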
> >
> > Calculating Latency
> > -------------------
> >
> > To calculate the visual latency of IRIS Performer applications we must
> > define the stimulus and response to be used.
> >
> > The proper stimulus for the visual system measurement is the end of
> > application processing, since it is the point in time at which control
> > is handed to IRIS Performer for the computation of the next image. The
> > application processing which precedes this point is application time:
> > time spent in input device measurement, in intersection processing, and
> > in vehicle dynamics computation. This time is important and contributes
> > to total latency, but must be counted separately.
> >
> > The response typically considered to terminate the latency calculation
> > in visual simulation is the point in time at which the last pixel of the
> > first field of the new image is displayed. Understanding this requires
> > a brief review of video output terminology.
> >
> > A complete video image is known as a FRAME, and a frame is composed of
> > one or more FIELDS. The two most common video formats used by IRIS
> > Performer applications are the 60 Hertz non-interlaced display typical
> > of computer workstations and the 30 Hertz frame, 60 Hertz field,
> > interlaced NTSC video format used by broadcast television.
> >
> > In the non-interlaced case, there is one field per frame, so the lines
> > on the display are scanned sequentially from top to bottom. This means
> > that at a 60 Hz frame rate, each frame (and thus field) requires 1/60th
> > second to display. In this situation the "last pixel of the first field"
> > metric outlined above will be reached 1/60th second after the start of
> > the display of a frame.
> >
> > Interlaced displays are different, in that they make multiple passes
> > through the image, displaying the even-numbered scan-lines on one trip
> > and the odd-numbered lines on the other. Each of these passes is known
> > as a field, and since there are two fields per frame, the field rate is
> > twice the frame rate. In the NTSC example, the frame rate is 30 Hz, but
> > the field rate is 60 Hz, so the "last pixel of the first field" event
> > will occur 1/60th second after the start of the display of a frame.
> >
> > With these definitions of terms and events, the following timing chart
> > can be understood:
> >
> >     <field>|
> >     <---frame---->|
> >     +------+------+------+------+------+------+------+------+
> > A+I |A0->         |A1->         |A2->         |A3->         |
> > C   |    C0->     |    C1->     |    C2->     |    C3->     |
> > D   |        D0-> |        D1-> |        D2-> |        D3-> |
> > V   |             |V0F1->|V0F2->|V1F1->|V1F2->|V2F1->|V2F2->|
> >     +------+------+------+------+------+------+------+------+-->
> >     0      1      2      3      4      5      6      7      8
> >
> >          <- F0 Latency ->
> >                        <- F1 Latency ->
> >
> > Note that each frame of the previous charts is now broken into two
> > fields and that displaying the two fields is shown as an event on the
> > bottom line, where 'V' represents video scan-out. Notice also that the
> > latencies are shown now, at the bottom of the chart. They measure the
> > time elapsed between the end of application processing (which is the
> > same as the start of the cull -- it's the point where pfFrame() is
> > called in the application process) up to the time when the last pixel
> > of the first field of the new image is displayed. This is the true
> > visual latency.
> >
> > The following charts illustrate the latency calculations for several
> > commonly used IRIS Performer multiprocessing modes. The last mode
> > listed shows the advantage of overlapped cull and draw processing. In
> > this mode, IRIS Performer achieves the lowest visual latency reported
> > for an image generation system.
> >
> > Latency in PFMP_APP_CULLDRAW mode
> > ---------------------------------
> >
> >     <field>|
> >     <---frame---->|
> >     +------+------+------+------+------+------+
> > A+I |        A0-->|        A1-->|        A2-->|
> > C   |             |C0---->      |C1---->      |
> > D   |             |      D0---->|      D1---->|
> > V   |             |             |V0F1->|V0F2->|
> >     +------+------+------+------+------+------+-->
> >     0      1      2      3      4      5      6
> >
> >              <-a->|<-b-->|<-c-->|<-d-->|
> >
> > The latency for 30 Hz frame and 60 Hz display refresh rate is:
> > a: application time (not our issue, can be near zero)
> > b&c: culling and drawing: one frame (1/30 sec) [C0+D0 in chart]
> > d: display: one field (1/60 sec)
> > --------------------------------
> > Total 3/60 sec = 50.000 msec + user app time
> >
> >   Image   Field   Latency
> >   -----   -----   -------
> >   60 Hz   60 Hz   33.333 msec + user app time
> >   30 Hz   60 Hz   50.000 msec + user app time
> >   20 Hz   60 Hz   66.667 msec + user app time
> >
> > Latency in PFMP_APP_CULL_DRAW mode
> > ----------------------------------
> >
> >     <field>|
> >     <---frame---->|
> >     +------+------+------+------+------+------+------+------+
> > A+I |        A0-->|        A1-->|        A2-->|        A3-->|
> > C   |             |C0---------->|C1---------->|C2---------->|
> > D   |             |             |D0---------->|D1---------->|
> > V   |             |             |             |V0F1->|V0F2->|
> >     +------+------+------+------+------+------+------+------+-->
> >     0      1      2      3      4      5      6      7      8
> >
> >              <-a->|<-----b----->|<-----c----->|<-d-->|
> >
> > The latency for 30 Hz frame and 60 Hz display refresh rate is:
> > a: application time (not our issue, can be near zero)
> > b: culling: one frame (1/30 sec) [C0 in above chart]
> > c: drawing: one frame (1/30 sec) [D0 in above chart]
> > d: display: one field (1/60 sec)
> > --------------------------------
> > Total 5/60 sec = 83.333 msec + user app time
> >
> >   Image   Field   Latency
> >   -----   -----   -------
> >   60 Hz   60 Hz   50.000 msec + user app time
> >   30 Hz   60 Hz   83.333 msec + user app time
> >   20 Hz   60 Hz   116.67 msec + user app time
> >
> > Latency in PFMP_APP_CULLoDRAW mode
> > ----------------------------------
> >
> >     <field>|
> >     <---frame---->|
> >     +------+------+------+------+------+------+
> > A+I |        A0-->|        A1-->|        A2-->|
> > C   |             |C0--------->.|C1--------->.|
> > D   |             |.D0--------->|.D1--------->|
> > V   |             |             |V0F1->|V0F2->|
> >     +------+------+------+------+------+------+-->
> >     0      1      2      3      4      5      6
> >
> >              <-a->|<-----b----->|
> >                   |<-----c----->|<-d-->|
> >
> > The latency for 30 Hz frame and 60 Hz display refresh rate is:
> > a: application time (not our issue, can be near zero)
> > b&c: culling and drawing: one frame (1/30 sec) [C0+D0 in chart]
> > d: display: one field (1/60 sec)
> > --------------------------------
> > Total 3/60 sec = 50.000 msec + user app time
> >
> >   Image   Field   Latency
> >   -----   -----   -------
> >   60 Hz   60 Hz   33.333 msec + user app time
> >   30 Hz   60 Hz   50.000 msec + user app time
> >   20 Hz   60 Hz   66.667 msec + user app time
> >
> > In this last mode the fields may be interlaced or non-interlaced. If
> > they are interlaced, then only half the frame need be computed in each
> > field, i.e. 1024x1024 rendered as 1024x512 odd lines and then as
> > 1024x512 even lines. This mode of operation halves the pixels drawn in
> > each field, thus preserving the depth complexity of the image by
> > comparison to a non-interlaced display.
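
The three latency tables above all follow the same arithmetic: a whole number of frame periods between the end of application processing and the start of scan-out, plus one field of scan-out. A minimal sketch that reproduces them follows; the function and the `MODES` dictionary are illustrative only, with the mode names used as plain labels taken from the section headings rather than as Performer constants.

```python
def visual_latency_ms(image_hz, field_hz, pipeline_frames):
    """Frame periods of cull/draw plus one field of scan-out, in ms."""
    return 1000.0 * (pipeline_frames / image_hz + 1.0 / field_hz)

# Frame periods between the end of APP and the start of scan-out,
# as read off the timing charts in the article.
MODES = {
    "PFMP_APP_CULLDRAW": 1,
    "PFMP_APP_CULL_DRAW": 2,
    "PFMP_APP_CULLoDRAW": 1,
}

for mode, frames in MODES.items():
    for image_hz in (60, 30, 20):
        ms = visual_latency_ms(image_hz, 60, frames)
        print(f"{mode:20s} {image_hz} Hz image, 60 Hz field: "
              f"{ms:.3f} msec + user app time")
```

As in the article, application time is left out and has to be added separately to get the total latency.
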
>
> http://www-oasis.corp.sgi.com/cgi-bin/retrieve.cgi?file=/oasis/TechInfo/all/Articles_Examples/Performer/articles/latency.doc&type=2&hiliteState=1
>-- End of excerpt from Ariane Genty

-- 
 Ariane GENTY		Email: agenty++at++paris.sgi.com   
 Graphic/Multimedia	Phone: 33 (1) 34 88 80 88        
 Software Support	VMail: 521-8017	   
 SGI - France		Fax  : 33 (1) 34 88 82 82           	
                                           
								



This archive was generated by hypermail 2b29 : Tue Nov 13 2001 - 02:31:29 PST

This message has been cleansed for anti-spam protection. Replace '++at++' in any mail addresses with the '@' symbol.