Re: [info-performer] depth buffer write is slow.


From: W.Price++at++electronic-alchemy.co.uk
Date: 08/14/2002 09:08:42


bjk++at++munich.sgi.com wrote:
> Hi!
>
> w.price++at++electronic-alchemy.co.uk wrote:
> >
> > csklu_pf++at++yahoo.com wrote:
> > > Performers,
> > >
> > > Again maybe another OpenGL question: I'm writing
> > > values to the depth buffer directly using:
> > >
> > > glDrawPixels( width,height, GL_DEPTH_COMPONENT,
> > > GL_FLOAT, depths).
> > >
> > > This is much slower than the corresponding write using
> > > GL_RGBA and GL_UNSIGNED_INT (with depths converted to
> > > rgba values). Performer reports that the depth buffer
> > > size is 24 (on a GeForce3, RedHat 7.3). Could this
> > > slowness be because sizeof(GL_FLOAT) !=
> > > depthbuffersize/8? Would it make any difference if I
> > > use a depth buffer depth of 16 and GL_UNSIGNED_SHORTs?
> > > Any other suggestions/explanations?
> >
> > GL_UNSIGNED_SHORT data will be faster than GL_UNSIGNED_INT since you have half as much data being sent (sizeof(int) is usually 4, sizeof(unsigned short) usually 2).
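(For clarity, the sort of thing I meant is sketched below -- untested, and the function/variable names are just placeholders. GL_UNSIGNED_SHORT depth components are normalised by the GL, so the float depths only need scaling up to the 0..65535 range:)

  #include <stdlib.h>
  #include <GL/gl.h>

  /* Untested sketch: upload the depth image as 16-bit values instead of
     floats -- half the data per pixel.  The GL maps 0..65535 back to
     0.0..1.0 when it writes the depth buffer. */
  static void draw_depths_ushort(GLsizei width, GLsizei height,
                                 const GLfloat *depths)
  {
      GLushort *sdepths = (GLushort *) malloc(width * height * sizeof(GLushort));
      int i;

      for (i = 0; i < width * height; i++)
          sdepths[i] = (GLushort) (depths[i] * 65535.0f);

      glDrawPixels(width, height, GL_DEPTH_COMPONENT, GL_UNSIGNED_SHORT, sdepths);
      free(sdepths);
  }

(The conversion itself costs CPU time, of course, so it only pays off if it's done once up front or the transfer saving outweighs it.)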
> >
> > Another possible speedup which I have found when using glDrawPixels() on Linux is to NOT send the entire chunk of data in one go, i.e. instead of using
> >
> > glDrawPixels(1024, 768, ....)
> >
> > use
> >
> > glRasterPos2i(0, 0);
> > glDrawPixels(128, 128, ...., buffer);
> > glRasterPos2i(128, 0);
> > glDrawPixels(128, 128, ...., buffer + offset);
> > .....
> > .....
> > (most likely this will be in a loop)
> >
> > I don't know the details of why, but for me this was twice as fast at drawing pixels to the screen. The best tile sizes were 128x128 and 64x64, I think.
> >
> > You would have thought that the one call would have been faster since there's less overhead than multiple calls, but experiments have proved otherwise.
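(Again for clarity, a rough, untested sketch of the tiled version is below. It assumes RGBA data, a projection in which raster coordinates are window pixels, and width/height that divide evenly by the tile size -- the pixel-store parameters let the GL address each tile inside the full-size buffer, so nothing has to be copied:)

  #include <GL/gl.h>

  /* Untested sketch: draw the image as 128x128 tiles rather than in one
     big glDrawPixels() call.  GL_UNPACK_ROW_LENGTH / SKIP_PIXELS /
     SKIP_ROWS pick each tile straight out of the full-width buffer. */
  static void draw_pixels_tiled(GLsizei width, GLsizei height,
                                const GLubyte *buffer)
  {
      const GLsizei tile = 128;
      GLsizei x, y;

      glPixelStorei(GL_UNPACK_ROW_LENGTH, width);
      for (y = 0; y < height; y += tile) {
          for (x = 0; x < width; x += tile) {
              glPixelStorei(GL_UNPACK_SKIP_PIXELS, x);
              glPixelStorei(GL_UNPACK_SKIP_ROWS, y);
              glRasterPos2i(x, y);
              glDrawPixels(tile, tile, GL_RGBA, GL_UNSIGNED_BYTE, buffer);
          }
      }
      glPixelStorei(GL_UNPACK_ROW_LENGTH, 0);
      glPixelStorei(GL_UNPACK_SKIP_PIXELS, 0);
      glPixelStorei(GL_UNPACK_SKIP_ROWS, 0);
  }

(The pixel-store state is reset at the end so later uploads aren't affected.)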
>
> Probably you were using an InfiniteReality* system.
>
> On these systems you might do some experiments with the two environment
> variables GLKONA_DMATHRESHOLD_IMAGE and GLKONA_DMATHRESHOLD_SCANLINE.
> These threshold values (in bytes) control whether a PIO (which eats up CPU
> time) or a DMA (the CPU might be available for other things) will happen.
>
> You might try to:
>
> setenv GLKONA_DMATHRESHOLD_IMAGE 301500
>
> Restart your application and check if the performance changes.

Nope... unfortunately not. The machine is *definitely* a PC, running Linux with a GeForce graphics card (the same happens with a GeForce4).

The code to draw is a simple loop such as...

  for (i = 0; i < num_frames; i++)
  {
    glDrawPixels(...);                 /* the call as in the original post */
    GLwDrawingAreaSwapBuffers(...);
  }

... so there's nothing else for the CPU to do besides wait, in both cases.

My guess is that there's some kind of limitation in the XFree86 DRI stuff, but I don't know.
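(If anyone wants to reproduce the comparison, something along these lines should isolate the transfer time -- an untested sketch, not the code I actually ran; the glFinish() is there so the clock measures the transfer itself rather than just the command queueing:)

  #include <stdio.h>
  #include <sys/time.h>
  #include <GL/gl.h>

  static double now_seconds(void)
  {
      struct timeval tv;
      gettimeofday(&tv, NULL);
      return tv.tv_sec + tv.tv_usec * 1e-6;
  }

  /* e.g. time_draw(1024, 768, GL_DEPTH_COMPONENT, GL_FLOAT, depths, 100)
     versus time_draw(1024, 768, GL_RGBA, GL_UNSIGNED_INT, rgba, 100) */
  static void time_draw(GLsizei width, GLsizei height, GLenum format,
                        GLenum type, const void *pixels, int frames)
  {
      int i;
      double t0 = now_seconds();

      for (i = 0; i < frames; i++)
          glDrawPixels(width, height, format, type, pixels);
      glFinish();

      printf("%d frames in %.3f seconds\n", frames, now_seconds() - t0);
  }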

Wayne

--
___________________________________________________________________________
Wayne Price     W.Price++at++electronic-alchemy.co.uk     Electronic Alchemy Ltd
Mobile: +44 (0) 7770 376383                       Home: +44 (0) 1483 531235




This message has been cleansed for anti-spam protection. Replace '++at++' in any mail addresses with the '@' symbol.