netdev

Re: conflicting alignment requirements

To: Ben Greear <greearb@xxxxxxxxxxxxxxx>
Subject: Re: conflicting alignment requirements
From: Ralf Baechle <ralf@xxxxxxxxxxx>
Date: Thu, 2 Aug 2001 02:10:54 +0200
Cc: kuznet@xxxxxxxxxxxxx, Jacob Avraham <jacoba@xxxxxxxxx>, netdev@xxxxxxxxxxx
In-reply-to: <3B6823AB.4D931195@xxxxxxxxxxxxxxx>; from greearb@xxxxxxxxxxxxxxx on Wed, Aug 01, 2001 at 08:43:39AM -0700
References: <EJEHILNJPONOHGEOJKICAEDDCAAA.jacoba@xxxxxxxxx> <200107311712.VAA04463@xxxxxxxxxxxxx> <20010801043638.A17397@xxxxxxxxxxxxxxxx> <3B6823AB.4D931195@xxxxxxxxxxxxxxx>
Sender: owner-netdev@xxxxxxxxxxx
User-agent: Mutt/1.2.5i
On Wed, Aug 01, 2001 at 08:43:39AM -0700, Ben Greear wrote:

> > > > copy the packet to a fresh skb (rx_copybreak = 0), the packet will
> > > > traverse the net layer with unaligned IP header.
> > >
> > > Doing this for an arch which traps wrong alignment, you can expect
> > > everything (except for crash, which could be bug).
> > 
> > Afaik all such architectures have exception handlers to complete the access
> > transparently in software.  Such an access is very slow so where more
> > frequent unaligned accesses are expected there are get_unaligned() and
> > put_unaligned().
> 
> I was recently asked to remove the get/put_unaligned code from my
> VLAN patch, which I did.  However, I don't want to now pay a
> performance penalty on Sparc, or whatever...
> 
> So, what are the drawbacks of using get/put_unaligned?  If it's a
> Macro, it could be defined to do very little extra work on architectures
> that can handle un-aligned access, which might fix the common case, and
> yet still be faster than catching the trap on other hardware architectures??

On machines that handle unaligned access in hardware, {get,put}_unaligned
are exactly identical to a normal C reference to the same memory
location.  On those machines there is never any drawback.

For architectures which need software assistance for unaligned accesses,
these macros expand into a different instruction sequence.  On MIPS, for
example, it will always be a two-instruction sequence.  On Alpha it's
more complex still: two loads followed by a mask, shift and OR sequence
will be generated, which has even more overhead.
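
To make the idea concrete, here is a minimal user-space sketch of what
such accessors do.  It is an illustration only, not the kernel's actual
macros: the sketch_* names are hypothetical, and memcpy() is an assumed
portable stand-in for the per-architecture instruction sequences
described above.  A compiler is free to turn the fixed-size memcpy()
into a single load or store where the hardware tolerates unaligned
access, and into a longer sequence where it does not.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-ins for get_unaligned()/put_unaligned() on a
 * 32-bit value.  These are NOT the kernel macros; memcpy() is used so
 * the compiler chooses an access sequence that is safe on the target.
 */
static inline uint32_t sketch_get_unaligned_u32(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));   /* never dereferences p as uint32_t */
    return v;
}

static inline void sketch_put_unaligned_u32(uint32_t v, void *p)
{
    memcpy(p, &v, sizeof(v));
}

/*
 * Round trip through an odd (misaligned) offset inside a byte buffer.
 * Returns 1 if the value read back matches the value stored.
 */
static int sketch_unaligned_roundtrip(void)
{
    unsigned char buf[8] = { 0 };

    sketch_put_unaligned_u32(0xC0A80001u, buf + 1);

    /* Bytes outside the 4-byte window must be untouched. */
    assert(buf[0] == 0 && buf[5] == 0);

    return sketch_get_unaligned_u32(buf + 1) == 0xC0A80001u;
}
```

Code that reads a header field at an address it cannot guarantee to be
aligned would call the accessor instead of casting the pointer and
dereferencing it directly.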

The alternative to this *_unaligned overhead is taking one exception per
unaligned access, which can be rather painful.  So the choice between
the two mechanisms is a performance tradeoff, and as always the choice
was to optimize the common case at the cost of the rare case.

  Ralf
