From owner-pro64-contrib@oss.sgi.com Thu Jun 1 10:20:53 2000 Received: by oss.sgi.com id ; Thu, 1 Jun 2000 10:20:34 -0700 Received: from [204.215.167.132] ([204.215.167.132]:24331 "EHLO gate.init.com") by oss.sgi.com with ESMTP id ; Thu, 1 Jun 2000 10:20:28 -0700 Received: from denali.init.com (denali.init.com [204.215.167.91]) by gate.init.com (8.9.3/8.9.3/ab-gate-1.1) with ESMTP id OAA13955; Thu, 1 Jun 2000 14:20:20 -0400 Received: from neo.init.com (neo.init.com [204.215.167.18]) by denali.init.com (8.9.3/8.9.3/ab-hub-1.2) with ESMTP id OAA17372; Thu, 1 Jun 2000 14:20:10 -0400 (EDT) Received: (from rshapiro@localhost) by neo.init.com (8.9.3/8.8.8/ab-cli-1.1) id OAA05573; Thu, 1 Jun 2000 14:20:19 -0400 Date: Thu, 1 Jun 2000 14:20:19 -0400 Message-Id: <200006011820.OAA05573@neo.init.com> X-Authentication-Warning: neo.init.com: rshapiro set sender to rshapiro@neo.init.com using -f From: Richard Shapiro To: duraid@fl.net.au CC: pro64-contrib@oss.sgi.com In-reply-to: Subject: Re: The fastest way to multiply large integers References: Sender: owner-pro64-contrib@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pro64-contrib-outgoing From: "Duraid Madina" Date: Thu, 1 Jun 2000 11:49:54 +1000 X-Priority: 3 (Normal) X-MSMail-Priority: Normal Importance: Normal X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700 X-Orcpt: rfc822;pro64-contrib@oss.sgi.com Sender: owner-pro64-contrib@oss.sgi.com Precedence: bulk Content-Type: text/plain; charset="iso-8859-1" Content-Length: 862 What's the best way (for Itanium, at least) to multiply large integers? For example, 128 bit integers, 1024 bit integers or indeed arbitrarily large integers? I had a look through the Pro64 source and note that there is decreed a threshold of '14' below which successive shifts+adds are used, and above which you outright multiply. But when is it best to use the xma instruction (and the associated cost of converting to/from an FP representation)? When is it better to use the integer packed multiply instructions? For 'streaming multiplication', as it were, what do you think is the best way to proceed? Using the integer/MM units? The FP units? Both at once? I am concerned with 'normal' multiplication of ~kilobit numbers, not massive numbers where it's better to use transform-based multiplication. It's a pretty complex question. If you can pipeline things, you want to use xma. For example, if you are doing extended-precision multiplication, you can keep things in the FPU, and you will win. For a single isolated integer multiply, I think that on Itanium you actually can do a multiply by a 64 bit constant in comparable time using shift-adds. If you are multiplying two numbers, nether of which is a compile-time constant, you always win with xma. -- Richard Shapiro Ab Initio Software Corporation rshapiro@init.com (781) 301-2311