pro64-support

Re: fortran performance and array indexing

To: Stephen Pickles <zzcgusp@xxxxxxxxxxxxxxxx>
Subject: Re: fortran performance and array indexing
From: "Nelson H. F. Beebe" <beebe@xxxxxxxxxxxxx>
Date: Tue, 8 May 2001 12:20:01 -0600 (MDT)
Cc: beebe@xxxxxxxxxxxxx, Pro64 Support <pro64-support@xxxxxxxxxxx>
Sender: owner-pro64-support@xxxxxxxxxxx

Stephen Pickles <zzcgusp@xxxxxxxxxxxxxxxx> writes on Tue, 8 May 2001
18:37:02 +0100 (BST) about a performance difference in Fortran 90
matrix multiplication seen with the SGI Pro64 compilers on IA-64.

I took the sample code and ran it on a number of other architectures,
with the results shown below.  Notice that method 2 (0-based indexing)
is notably slower on the Intel Pentium III system, but slightly faster
on six other systems. I repeated several of these runs, without seeing
any significant difference in the reported timings.

Stephen's message did not make clear whether his results were for
native IA-64 hardware, or for the NUE IA-64 emulator environment.
Mine are for the latter, and they demonstrate a 2.7x slowdown for
0-based indexing.  Given that the same number of loads and stores, and
the same number of floating-point operations, are required in each
case, these differences are puzzling, and further examination of the
generated assembly code will likely be necessary to resolve the
question.  The .s file is over 10K lines long, and I wasn't able to
quickly isolate the relevant code bodies in a text editor for
comparison.
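
The test source itself is not reproduced in this message.  As a rough
sketch only, a comparison of this kind might look like the program
below.  Every name in it (index_base_sketch, n, nruns, a1, a0, and so
on), the matrix size, and the loop order are assumptions made for
illustration and are not taken from matmul_test.f.  It reports elapsed
CPU time per method rather than the "speed" figure printed by the
original, since the message does not say how that figure is computed.

```fortran
      program index_base_sketch
      implicit none
      ! Hypothetical sketch: sizes and names are assumptions,
      ! not taken from matmul_test.f.
      integer, parameter :: n = 4, nruns = 100000
      ! Method 1: arrays declared with 1-based bounds.
      real :: a1(1:n,1:n), b1(1:n,1:n), c1(1:n,1:n)
      ! Method 2: arrays declared with 0-based bounds.
      real :: a0(0:n-1,0:n-1), b0(0:n-1,0:n-1), c0(0:n-1,0:n-1)
      real :: t0, t1, t2
      integer :: i, j, k, run

      ! Same input data for both methods; whole-array assignment
      ! matches elements by position, not by bounds.
      call random_number(a1)
      call random_number(b1)
      a0 = a1
      b0 = b1

      call cpu_time(t0)
      do run = 1, nruns
         c1 = 0.0
         do j = 1, n
            do k = 1, n
               do i = 1, n
                  c1(i,j) = c1(i,j) + a1(i,k)*b1(k,j)
               end do
            end do
         end do
      end do
      call cpu_time(t1)
      do run = 1, nruns
         c0 = 0.0
         do j = 0, n-1
            do k = 0, n-1
               do i = 0, n-1
                  c0(i,j) = c0(i,j) + a0(i,k)*b0(k,j)
               end do
            end do
         end do
      end do
      call cpu_time(t2)

      print *, 'time(1) =', t1 - t0
      print *, 'time(2) =', t2 - t1
      ! Largest elementwise difference between the two results.
      print *, 'discrepancy =', maxval(abs(c1 - c0))
      end program index_base_sketch
```

The statements start in column 7 and use no continuation lines, so the
file should be acceptable as either fixed-form or free-form source.
Both loop nests perform the same loads, stores, and floating-point
operations in the same order, which is why a timing difference between
them would point at the generated code rather than at the algorithm.
An aggressive optimizer may notice that each run repeats identical
work, so timings from a sketch this simple should be read with care.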

        Compaq/DEC Alpha 4100-5/466     OSF/1 4.0F
                f95 -O3 matmul_test.f && ./a.out
                 speed(1) =   128.8102
                 speed(2) =   126.6855
                 discrepancy =  0.000000000000000E+000

        Compaq AlphaServer ES40 Sierra/667      OSF/1 5.0
                                (32 EV6.7 21264A CPUs, 667 MHz, 8GB RAM)
                f95 -O5 matmul_test.f && ./a.out
                 speed(1) =   84.92637
                 speed(2) =   86.62321
                 discrepancy =  0.000000000000000E+000

        Compaq AlphaServer ES40 DEC6600/500     OSF/1 4.0F
                                (8 EV6 21264 CPUs, 500 MHz, 8GB RAM)
                f95 -O5 matmul_test.f && ./a.out
                 speed(1) =   155.6647
                 speed(2) =   150.3770
                 discrepancy =  0.000000000000000E+000

        IBM SP/2                AIX 4.3
                xlf95 -O1 matmul_test.f && ./a.out
                 speed(1) = 7.607727051
                 speed(2) = 7.227553844
                 discrepancy = 0.000000000000000000E+00

        Intel Pentium III       GNU/Linux 2.2.17-14smp (Red Hat 6.2)
                lf95 -O3 matmul_test.f -o foo.exe && ./foo.exe
                 speed(1) = 87.0748291
                 speed(2) = 97.8032532
                 discrepancy = 0.000000000000000E+00

        Intel Pentium III       GNU/Linux 2.2.17-14smp (Red Hat 6.2)
                                HP NUE IA-64 emulator
                                (reduced nruns from 100000 to 1000)
                sgif90 -O3 matmul_test.f && ./a.out
                 speed(1) = 0.207665801
                 speed(2) = 0.565433443
                 discrepancy = 0.E+0

        SGI R5000-PC            IRIX 6.5
                f90 -O3  matmul_test.f && ./a.out
                 speed(1) = 36.2537994
                 speed(2) = 34.8822517
                 discrepancy = 0.E+0

        SGI Origin 200          IRIX 6.5
                f90 -O3  matmul_test.f && ./a.out
                 speed(1) = 75.4860535
                 speed(2) = 68.8059082
                 discrepancy = 0.E+0

        Sun SPARC               Solaris 2.7
                f95 -O3 matmul_test.f && ./a.out
                 speed(1) = 53.20448
                 speed(2) = 51.201023
                 discrepancy = 0.0E+0

As an aside, see

        http://www.math.utah.edu/pub/benchmarks/usirep.pdf
        http://www.math.utah.edu/pub/benchmarks/usirep.ps

for ways to sometimes dramatically speed up matrix multiplication on
modern RISC systems.

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- Center for Scientific Computing       FAX: +1 801 585 1640, +1 801 581 4148 -
- University of Utah                    Internet e-mail: beebe@xxxxxxxxxxxxx  -
- Department of Mathematics, 322 INSCC      beebe@xxxxxxx  beebe@xxxxxxxxxxxx -
- 155 S 1400 E RM 233                       beebe@xxxxxxxx                    -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe  -
-------------------------------------------------------------------------------
