Re: OpenGVS Benchmark Applic

Steve Baker (steve++at++mred.bgm.link.com)
Mon, 29 Apr 96 22:29:01 -0500


Well, I have always believed in the maxim:

  There are lies, Damned lies, and Benchmarks.

The trouble with benchmarks is that they are always optimal
for one machine and pessimal for another. Take, for example, a
flight simulation benchmark.

Here is a simple database design question:

  "Are you going to layer forest canopy areas on top of the
   terrain, or are you going to cut holes in the terrain?"

  If you do the former, you will be doing a great service to machines
like the Compuscene VI and PT 2000 because they have pixel fill rate
optimisation hardware that lets you skip over second and subsequent
layers of polygons. The SGI hardware will suffer because it doesn't
have that feature.

  If you decide to take pity on the SGI machines and cut holes in the
terrain where the forest canopy overlaps it - then the polygon count will
increase fairly dramatically. The SGI boxes won't care too much about
that because they can push a zillion polygons (well, maybe a half zillion)
without even breaking into a sweat. Of course the Compuscene box will
be maxed out on polygons and its performance will be much poorer.
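
  To put toy numbers on the canopy question, here is a minimal sketch.
Everything in it is an assumption made up for illustration: the two
machine profiles, the polygon counts, the rates, and the idea that a frame
costs whichever of polygon work or pixel fill is slower. None of it is
measured from real hardware.

```python
# Toy cost model. Every number below is invented for illustration.

SCREEN_PIXELS = 1_000_000

# Two hypothetical databases that produce the same picture.
layered = {"polygons": 2_000, "depth_complexity": 2.0}  # canopy drawn over terrain
cut_out = {"polygons": 6_000, "depth_complexity": 1.0}  # holes cut in the terrain

# Two hypothetical machines. "skips_hidden" means the hardware pays no
# fill cost for the second and subsequent layers.
machines = {
    "fill_optimised": {"polys_per_ms": 200, "pixels_per_ms": 80_000,
                       "skips_hidden": True},
    "poly_pusher": {"polys_per_ms": 2_000, "pixels_per_ms": 80_000,
                    "skips_hidden": False},
}


def time_on(machine, database):
    """Frame time in ms, assuming the slower of the two stages sets the pace."""
    depth = 1.0 if machine["skips_hidden"] else database["depth_complexity"]
    polygon_ms = database["polygons"] / machine["polys_per_ms"]
    fill_ms = SCREEN_PIXELS * depth / machine["pixels_per_ms"]
    return max(polygon_ms, fill_ms)


def winner(database):
    """Name of the machine with the shortest frame time on this database."""
    return min(machines, key=lambda name: time_on(machines[name], database))


for label, database in (("layered", layered), ("cut_out", cut_out)):
    print(label, "->", winner(database))
```

  With these made-up figures the layered database favours the machine that
skips hidden layers and the cut-out database favours the polygon pusher,
so the "winner" is decided by the database design and not by the hardware.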

  Some E&S machines offer a trade-off between pixel fill rates and hidden
surface options. Can we run the database with the Z-buffer (sorry 'R-buffer')
disabled for the test? What if a polygon is incorrectly occulted as a
result? Does that disqualify the IG from testing?

  How about texture? Some IG's can do multiple textures per polygon at
no extra penalty. The SGI box can do that too - but only if one is a
detail texture. If you force the SGI box to do two textures per polygon
for real then it'll take twice as long to render the database. Is this
a fair test or not?

  What about DCS's? An SGI box can draw an almost unlimited number of them.
Some other boxes impose severe limits - if we make the benchmark have 1000
moving objects then it'll run OK on an SGI box - just about every other
IG will choke and die.

  You see there is no such thing as a fair benchmark. To get the benchmark
to be portable at all, it will end up using features that are the lowest
common denominator on all the systems you want to test.
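
  The lowest-common-denominator effect is just a set intersection. A
small sketch, with hypothetical machine names and feature lists that are
placeholders and not anybody's real spec sheet:

```python
# Hypothetical feature lists; the names are placeholders, not vendor specs.
features = {
    "machine_a": {"z_buffer", "texture", "multi_texture", "many_moving_objects"},
    "machine_b": {"z_buffer", "texture", "fill_rate_skip"},
    "machine_c": {"texture", "multi_texture", "fill_rate_skip"},
}


def portable_features(feature_sets):
    """Features a portable benchmark can rely on: those present everywhere."""
    return set.intersection(*feature_sets.values())


def unused_features(feature_sets):
    """Per machine, the features a portable benchmark would never exercise."""
    common = portable_features(feature_sets)
    return {name: sorted(have - common) for name, have in feature_sets.items()}


print(sorted(portable_features(features)))
print(unused_features(features))
```

  In this example only plain texture survives, and each machine's best
trick ends up in the "unused" list.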

  I have a database (a rollercoaster) that was built in GVS many, many
moons ago. It ran quite nicely under GVS.

  When we switched to Performer (a decision I have never regretted), we
just grabbed the flight files and tried to run them. It ran about the same
speed as the GVS software did.

  Later, we restructured the polygons into more, smaller objects.
The ability of Performer to push culling off onto another CPU made the
database run substantially faster because GL had far fewer polygons to
process.
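
  A rough way to see why the restructuring helps on two CPUs: on one CPU
cull and draw costs add, but when cull runs on its own CPU the slower of
the two stages sets the frame rate. The timings below are invented, and
the two-stage model is a simplification, not a description of how any
particular toolkit schedules its work.

```python
def single_cpu_frame_ms(cull_ms, draw_ms):
    """One CPU does cull then draw, so the costs add."""
    return cull_ms + draw_ms


def pipelined_frame_ms(cull_ms, draw_ms):
    """Cull for the next frame overlaps draw for this one on separate CPUs,
    so the slower stage sets the frame time (at the cost of extra latency)."""
    return max(cull_ms, draw_ms)


# Invented timings for the same scene structured two ways.
coarse = {"cull_ms": 2.0, "draw_ms": 20.0}   # few big objects: cheap cull, little rejected
fine = {"cull_ms": 12.0, "draw_ms": 11.0}    # many small objects: costly cull, less drawn

for label, scene in (("coarse", coarse), ("fine", fine)):
    print(label,
          "single:", single_cpu_frame_ms(**scene),
          "pipelined:", pipelined_frame_ms(**scene))
```

  With these numbers the finely divided scene is slightly slower on one
CPU (23 ms against 22 ms) and much faster with cull on a second CPU
(12 ms against 20 ms).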

  Later still, we rewrote our database loader so that instead of using
LoadFlt(), we had our own in-house code. The new loader does a much
better job of TMesh construction than the pfUtils functions that LoadFlt()
uses - so the database ran noticeably faster.
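
  Why meshing matters is simple arithmetic: independent triangles send
three vertices each, while a triangle strip of n triangles sends n + 2.
The sketch below only counts vertices for two hypothetical meshings of
the same 1000 triangles; it says nothing about how either loader actually
builds its strips.

```python
def independent_vertices(triangle_count):
    """Vertices sent when every triangle is drawn on its own."""
    return 3 * triangle_count


def strip_vertices(strip_lengths):
    """Vertices sent for a set of triangle strips.

    strip_lengths holds the number of triangles in each strip;
    a strip of n triangles needs n + 2 vertices.
    """
    return sum(n + 2 for n in strip_lengths)


# The same 1000 triangles, meshed two hypothetical ways.
short_strips = [2] * 500   # weak mesher: 500 strips of 2 triangles
long_strips = [20] * 50    # better mesher: 50 strips of 20 triangles

print(independent_vertices(1000))    # 3000
print(strip_vertices(short_strips))  # 2000
print(strip_vertices(long_strips))   # 1100
```

  Same polygons, same picture, but nearly half the vertex traffic between
the two meshings.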

  We now have four different timings for the RE2 on the "same" database.

  If we take that database back to GVS, what will happen? Well, the
original version was optimised to death for GVS, so the restructured
version will probably suffer greatly on the single-CPU rendering that
GVS uses: the increased cull times leave little time for everything
else. Please note, this is NOT a slur on GVS - the Performer version
uses an additional CPU, so it's not a fair test.

  Exactly the same polygons, exactly the same pictures, exactly the same
hardware. Wildly differing rendering times. Now what?

  OK, you may say - we'll simply define how the database should *look* and
let the proponents of a given hardware/software combination optimise the
database to death - and so long as they don't change its appearance, the
benchmark will still be valid.

NO !!!

  The whole design of the database is rooted in the hardware architecture
of the machine. In a low flying aircraft simulation, it is important for
the pilot to get a certain degree of speed and altitude cueing. The way
that you provide those cues depends on the kind of IG you have.
Some IG's can draw lots of polygons - so you get the speed cues by drawing
lots and lots of little polygonal trees and bushes out there.
On the other hand, if your IG has fewer polygons - but great looking texture,
then you can reduce the number of trees and use lots of nice detailed, high
contrast texture instead. The ability of the two approaches to train a pilot
to fly low and fast is about the same.

  Benchmarking is just about possible for something as relatively
simple as a CPU. Basically, the CPU runs the program and you time
how long it took - if it goes fast, that's better. (Even that isn't
quite true - but that's another argument.)

  With graphics there is also a strong QUALITY consideration.
You can usually trade between quality and speed. That makes benchmarking
impossible. How nice does the database have to look in order to count
as a 'pass' so that we can measure its speed?
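
  Put another way, once there are two axes there is no single ranking
unless one result beats the other on both. A small sketch; the quality
score is a made-up number (no agreed scale for it exists, which is rather
the point) and the two results are hypothetical.

```python
from typing import NamedTuple


class Result(NamedTuple):
    name: str
    frame_ms: float  # lower is better
    quality: float   # higher is better; an invented score for illustration


def dominates(a, b):
    """True if a is at least as good as b on both axes and better on one."""
    no_worse = a.frame_ms <= b.frame_ms and a.quality >= b.quality
    better = a.frame_ms < b.frame_ms or a.quality > b.quality
    return no_worse and better


fast_rough = Result("fast_rough", frame_ms=12.0, quality=0.6)
slow_pretty = Result("slow_pretty", frame_ms=25.0, quality=0.9)

print(dominates(fast_rough, slow_pretty))
print(dominates(slow_pretty, fast_rough))
```

  Neither result dominates the other here, so picking a winner means
picking a quality threshold first.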

CONCLUSION:

  I think you are setting an impossible goal - the results will come out
  heavily biased towards the machine and the software that the database
  was first designed for. I suggest you give up on the idea.

    Steve

  Steve Baker 817-323-1361 (Vox-Lab)
  Hughes Training Inc. 817-695-8776 (Vox-Office/vMail)
  2200 Arlington Downs Road 817-695-4028 (Fax)
  Arlington, Texas. TX 76005-6171 steve++at++mred.bgm.link.com (eMail)


This archive was generated by hypermail 2.0b2 on Mon Aug 10 1998 - 17:52:49 PDT

This message has been cleansed for anti-spam protection. Replace '++at++' in any mail addresses with the '@' symbol.