
FreeS/WAN redesign thoughts (KLIPS, IPSEC)

To: Linux Ipsec mailing list <linux-ipsec@xxxxxxxxx>, NetFilter mailing list <netfilter@xxxxxxxxx>, Linux Network Development mailing list <netdev@xxxxxxxxxxx>
Subject: FreeS/WAN redesign thoughts (KLIPS, IPSEC)
From: Richard Guy Briggs <rgb@xxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 15 Aug 2000 14:35:39 -0400
Cc: John Gilmore <gnu@xxxxxxxx>, Hugh Daniel <hugh@xxxxxxxx>, Henry Spencer <henry@xxxxxxxxxxxxx>, Hugh Redelmeier <hugh@xxxxxxxxxx>, Richard Guy Briggs <rgb@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: owner-netdev@xxxxxxxxxxx
User-agent: Mutt/1.2i

FreeS/WAN IPSEC -- KLIPS2 DESIGN THOUGHTS
=========================================

This document was written shortly after OLS2000, inspired by a
meeting with Rusty and Marc in Montreal in November 1999 and two
intense FreeS/WAN BoFs at OLS2000.

Current kernel version reference is 2.4.0-test4.

The idea is to redesign KLIPS (kernel parts of FreeS/WAN) to avoid all
the 'stoopid routing tricks' (TM) to which we have had to resort over
the last 2+ years and to add a proper SPDB to do proper incoming IPSEC
policy checks.  We are hoping to use existing pattern-matching tools
rather than invent our own.  NetFilter appears to have all the pattern
matching capabilities, but is limited in other ways.

This is an exploratory document.  Please send comments to the
linux-ipsec, netfilter or netdev lists, particularly if I have missed
or misunderstood something.

The basic architecture of NetFilter is:

       --->[1]--->(ROUTE)--->[3]--->[4]--->     where:
                     |            ^             [1] NF_IP_PRE_ROUTING
                     |            |             [2] NF_IP_LOCAL_IN
                     |         (ROUTE)          [3] NF_IP_FORWARD
                     v            |             [4] NF_IP_POST_ROUTING
                    [2]          [5]            [5] NF_IP_LOCAL_OUT
                     |            ^             
                     |            |             
                     v            |             

The basic path through the kernel as it concerns IPSEC for the three
types of packets is as follows:

IN:
        nic
        sanity check
        NF_IP_PRE_ROUTING
        route-in
        ip-options
        defragment
        NF_IP_LOCAL_IN
        layer3demux
        application

FORWARD:
        nic
        sanity check
        NF_IP_PRE_ROUTING
        routing-in
        ip-options digesting
        ttl decrement and check
        NF_IP_FORWARD
        fragment
        NF_IP_POST_ROUTING
        output()
        nic

OUT:
        application
        layer3mux
        NF_IP_LOCAL_OUT
        route-out
        NF_IP_POST_ROUTING
        output()
        nic

Keep in mind that Destination NAT (port forwarding) gets applied in
NF_IP_PRE_ROUTING and Source NAT (masquerading) gets applied in
NF_IP_POST_ROUTING.
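
As a rough illustration of the three paths listed above, the following
Python sketch records each path as a list of steps and picks out the
NetFilter hooks and NAT points on it.  The names PATHS, NAT_AT,
hooks_traversed() and nat_points() are made up for this example and are
not kernel symbols.

```python
# Step lists copied from the IN / FORWARD / OUT listings above.
PATHS = {
    "IN": ["nic", "sanity check", "NF_IP_PRE_ROUTING", "route-in",
           "ip-options", "defragment", "NF_IP_LOCAL_IN", "layer3demux",
           "application"],
    "FORWARD": ["nic", "sanity check", "NF_IP_PRE_ROUTING", "routing-in",
                "ip-options digesting", "ttl decrement and check",
                "NF_IP_FORWARD", "fragment", "NF_IP_POST_ROUTING",
                "output()", "nic"],
    "OUT": ["application", "layer3mux", "NF_IP_LOCAL_OUT", "route-out",
            "NF_IP_POST_ROUTING", "output()", "nic"],
}

# Where NAT is applied, per the note above.
NAT_AT = {"NF_IP_PRE_ROUTING": "DNAT", "NF_IP_POST_ROUTING": "SNAT"}


def hooks_traversed(path):
    """Return only the NetFilter hooks a packet meets on the given path."""
    return [step for step in PATHS[path] if step.startswith("NF_IP_")]


def nat_points(path):
    """Return (hook, NAT type) pairs for the hooks on this path that do NAT."""
    return [(hook, NAT_AT[hook]) for hook in hooks_traversed(path)
            if hook in NAT_AT]
```

A forwarded packet is the only one that sees both NAT points, which is
why the ordering of IPSEC processing relative to DNAT and SNAT matters
most on that path.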

-----------

There is more than one possible approach.  The following is not
exhaustive.

    --- 1 ---

Treat incoming IPSEC encapsulation as a layer 3 protocol and decapsulate
it at the Layer 3 demultiplexer.

An incoming packet starts off with a sanity check.  It then goes through
all the NF_IP_PRE_ROUTING hooks, starting with the SPDB check.  Since
it is a fresh ESP or AH packet, it will not carry an nfmark.  Unless
the outer IP header should have been processed by another SG in
between, no policy is required of it, so the packet is let through.

The rest of the NF_IP_PRE_ROUTING hooks may cause it to be DNATed and
defragmented.  It then goes through routing, which thinks it is a local
packet and deals with any outer header IP options.  Next come
defragmentation and the NF_IP_LOCAL_IN filter (allow ESP,AH) before the
packet gets to ipsec_rcv(), where the outer bundle is authenticated,
decrypted and nfmarked before being passed back to netif_rx().  The next
IP header is now visible.

The SADB would be managed via the PF_KEYv2 socket I/F.

Packets for the local host follow the same path again, this time getting
checked at NF_IP_PRE_ROUTING for policy using the previously set nfmark.
If this passes, routing looks at the now-visible next IP header and
routes the packet locally, where inner IP options and defragmentation
are processed.  NF_IP_LOCAL_IN then gets to check filtering policy for
other L3 protocols.  If the host is the endpoint for multiple bundles,
the process iterates each time the next IP header is exposed.
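
The iteration for a local endpoint can be sketched as a loop that sends
the packet around the IN path once per bundle.  This is only a toy model:
the Packet class, the use of tuples to stand for bundles, and the choice
of the pass count as the nfmark value are all assumptions made for the
example, and ipsec_rcv() here merely stands in for the real function.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    # Outermost first.  Tuples are IPSEC bundles still to be removed;
    # the final string is the inner L3 payload protocol (hypothetical model).
    layers: list
    nfmark: int = 0
    trace: list = field(default_factory=list)


def ipsec_rcv(pkt, mark):
    """Strip the outer bundle and nfmark the packet (stand-in only)."""
    bundle = pkt.layers.pop(0)
    pkt.trace.append("ipsec_rcv:" + "+".join(bundle))
    pkt.nfmark = mark


def receive_local(pkt):
    """Run the IN path repeatedly until no IPSEC bundle is left."""
    passes = 0
    while True:
        # Policy check; sees the nfmark set on the previous pass, if any.
        pkt.trace.append("NF_IP_PRE_ROUTING(nfmark=%d)" % pkt.nfmark)
        pkt.trace.append("NF_IP_LOCAL_IN")
        if not isinstance(pkt.layers[0], tuple):
            pkt.trace.append("deliver:" + pkt.layers[0])
            return passes
        passes += 1
        # Placeholder mark; afterwards the packet is re-injected,
        # as it would be via netif_rx().
        ipsec_rcv(pkt, mark=passes)
```

For example, Packet(layers=[("AH", "ESP"), ("ESP",), "TCP"]) takes two
passes through ipsec_rcv() and three through NF_IP_PRE_ROUTING before
the TCP payload is delivered.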

A non-local packet goes through the incoming sanity check again, then
through NF_IP_PRE_ROUTING, where it could get DNATed and defragmented.
It is then routed, potentially through an existing virtual IPSEC device
(one per connection, not per physical I/F).  IP options and TTL are
processed before the packet is filtered at NF_IP_FORWARD and fragmented.
It is then intercepted at NF_IP_POST_ROUTING, after SNAT, for encryption
and authentication.

Again, at NF_IP_POST_ROUTING, an IPSEC matching module would make a
decision about the fate of the packet.  It would have several possible
targets:  ACCEPT would allow the packet through with no processing.
ENCRYPT would send it off to the equivalent of
ipsec_tunnel_start_transmit() after setting nfmark if it knows that the
SA exists.  QUEUE would allow the packet to be sent to userspace to set
up keying for a connection.
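
The choice between the three targets might look roughly like the sketch
below.  It assumes that a packet with a policy but no established SA is
the one that goes to QUEUE, which the text above implies but does not
state outright.  The dictionary shapes for the packet, SPDB and SADB,
and the exact-match lookup on destination, are simplifications invented
for the example.

```python
def ipsec_post_routing(pkt, spdb, sadb):
    """Pick a target for an outgoing packet at NF_IP_POST_ROUTING.

    pkt  -- dict with at least a "dst" key (hypothetical packet model)
    spdb -- dict: destination -> (SAID, nfmark to set)
    sadb -- set of SAIDs for which an SA currently exists
    """
    policy = spdb.get(pkt["dst"])
    if policy is None:
        return "ACCEPT"            # no IPSEC policy: pass through untouched
    said, mark = policy
    if said in sadb:
        pkt["nfmark"] = mark       # set before handing off for encryption
        return "ENCRYPT"
    return "QUEUE"                 # let userspace set up keying
```

A real match module would compare full selectors (addresses, protocol,
ports) rather than a single destination string.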

The way that nfmark is used is rather vague.  It is presently only 32
bits.  Ideally, I would like to be able to indicate exactly which SAs
were processed on the way in.  This would most easily be represented by
as many as 4 SAs (AH, ESP, IPCOMP, IPIP), each having an 8-bit protocol
field (absolute minimum of 2 bits), a 32-bit destination address field
(for IPv4; IPv6 would be 128) and a 32-bit SPI.  That is 4 x (8 + 32 +
32) = 288 bits for IPv4 and a potential maximum of 4 x (8 + 128 + 32) =
672 bits for IPv6.  A way of mapping those 672 bits on to the 32 bits
available would be required to use this.  A lookup table could be used
to map nfmarks to SAIDs rather than to the SAs themselves, since the SAs
could disappear at any time the tdb table is not locked.  It should be
able to represent a bundle of SAs, where one SA could be used in more
than one bundle.  There could also be more than one right answer for the
incoming SPDB.
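
To make the arithmetic and the lookup-table idea concrete, here is a
small sketch.  bundle_bits() just reproduces the bit count from the
field widths given above.  NfmarkTable is a hypothetical name for one
possible way to hand out 32-bit marks for bundles of SAIDs; reserving
mark 0 for "no IPSEC processing" is an assumption of the example.

```python
PROTO_BITS = 8
SPI_BITS = 32
ADDR_BITS = {"IPv4": 32, "IPv6": 128}
MAX_SAS = 4  # AH, ESP, IPCOMP, IPIP


def bundle_bits(family):
    """Bits needed to name a full bundle of SAIDs for an address family."""
    return MAX_SAS * (PROTO_BITS + ADDR_BITS[family] + SPI_BITS)


class NfmarkTable:
    """Map bundles of SAIDs to small integers that fit in a 32-bit nfmark.

    A SAID is a (protocol, destination, spi) tuple.  Only SAIDs are
    stored, never SA objects, so an SA disappearing leaves no dangling
    reference here.
    """

    def __init__(self):
        self._by_bundle = {}
        self._by_mark = {}

    def mark_for(self, bundle):
        key = tuple(bundle)
        if key not in self._by_bundle:
            mark = len(self._by_bundle) + 1   # 0 means "no IPSEC processing"
            if mark > 0xFFFFFFFF:
                raise OverflowError("out of nfmark values")
            self._by_bundle[key] = mark
            self._by_mark[mark] = key
        return self._by_bundle[key]

    def bundle_for(self, mark):
        """Return the bundle of SAIDs for a mark, or None if unknown."""
        return self._by_mark.get(mark)
```

Because the table is keyed on whole bundles, the same SAID can appear in
any number of bundles, each with its own mark.  Expiry of table entries
is not handled in this sketch.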

The SPDB would be managed via a combination of PF_KEYv2 socket I/F
extensions and iptables.  A separate NetFilter table called 'ipsec'
(as opposed to 'filter' or 'nat') would have the first hook at
NF_IP_PRE_ROUTING and the last hook at NF_IP_POST_ROUTING.  iptables
itself is configured through setsockopt()/getsockopt() calls; it is the
QUEUE target that passes packets to userspace over the AF_NETLINK
socket family.
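
The "first hook" and "last hook" placement comes down to hook
priorities: within one hook point, functions run in ascending priority
order.  The sketch below models that ordering.  The Hooks class is made
up for the example, and the numeric priorities are my recollection of
the NF_IP_PRI_* values in linux/netfilter_ipv4.h, so treat them as
assumptions rather than as a reference.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

# Assumed NF_IP_PRI_* values; lower numbers run earlier.
PRI = {"FIRST": INT_MIN, "NAT_DST": -100, "FILTER": 0,
       "NAT_SRC": 100, "LAST": INT_MAX}


class Hooks:
    """Toy registry that keeps each hook point sorted by priority."""

    def __init__(self):
        self._hooks = {}

    def register(self, hook, priority, name):
        entries = self._hooks.setdefault(hook, [])
        entries.append((priority, name))
        entries.sort()

    def order(self, hook):
        """Names of the functions on a hook point, in call order."""
        return [name for _, name in self._hooks.get(hook, [])]


hooks = Hooks()
hooks.register("NF_IP_PRE_ROUTING", PRI["NAT_DST"], "nat")
hooks.register("NF_IP_PRE_ROUTING", PRI["FIRST"], "ipsec")
hooks.register("NF_IP_POST_ROUTING", PRI["LAST"], "ipsec")
hooks.register("NF_IP_POST_ROUTING", PRI["NAT_SRC"], "nat")
```

With these registrations the 'ipsec' table sees packets before DNAT on
the way in and after SNAT on the way out, regardless of the order in
which the modules were loaded.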

I'm not certain exactly where a packet routed through an optional IPSEC
virtual I/F gets injected into the system.

-----------


    --- 2 ---

Treat incoming IPSEC encapsulation as an enhancement of the layer 2
protocol and decapsulate it at the NF_IP_PRE_ROUTING hook.  This option
is less favourable as it stands since it involves creating our own SPDB
engine.

An incoming packet starts off with a sanity check.  It then goes through
the NF_IP_PRE_ROUTING match hook for IPSEC, which would be the first in
priority, matching every single packet to force it through a policy
check.  If it was an ESP or AH packet with a local destination address,
it would then be sent to ipsec_rcv() and the first bundle
would be processed, keeping state until that bundle is completely
processed.  At this point the incoming SPDB would be checked to ensure
that the proper policy had been applied to it.  If there is another
bundle inside with an ESP or AH header, that bundle is processed,
storing the new and old state.  This SPDB check would not be
iptables-based since we have already gone through the match and target
hooks and would have too much state to store in nfmark.  The result of
the SPDB check would be ACCEPT or DROP (It could also be STOLEN or
QUEUEd at this point for opportunistic encryption).
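
A non-iptables SPDB check of this kind could be sketched as follows.  It
is a simplification: state is accumulated across all bundles and
checked once at the end, whereas the text above allows a check after
each bundle.  The function name, the selector tuple, the default of DROP
when no policy matches, and the SPDB shape (selector -> list of
acceptable SAID sequences, allowing more than one right answer) are all
assumptions for the example.

```python
def process_incoming(bundles, selector, spdb):
    """Decapsulate every bundle, then check the result against the SPDB.

    bundles  -- list of bundles, outermost first; each a list of SAIDs
    selector -- whatever identifies the inner traffic, e.g. (src, dst)
    spdb     -- dict: selector -> list of acceptable tuples of SAIDs;
                an empty tuple in that list means cleartext is allowed
    """
    state = []
    for bundle in bundles:
        # Stand-in for ipsec_rcv(): remember which SAs were applied,
        # keeping the old state alongside the new.
        state.extend(bundle)
    applied = tuple(state)
    acceptable = spdb.get(selector, [])
    return "ACCEPT" if applied in acceptable else "DROP"
```

The STOLEN and QUEUE outcomes for opportunistic encryption are left out
of the sketch.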

The SADB and SPDB entries would be managed via the extended PF_KEYv2
socket I/F.

The rest of the NF_IP_PRE_ROUTING hooks may then cause the packet to be
DNATed and defragmented.

For local packets, routing looks at the now-visible next IP header and
routes it locally where inner IP options and defragmentation are
processed.  NF_IP_LOCAL_IN then gets to check filtering policy for 
layer 3 protocols.

Non-local packets are routed, potentially through an existing virtual
IPSEC device (one per connection, not per physical I/F).  IP options and
TTL are processed before the packet is filtered at NF_IP_FORWARD, then
fragmented.  Packets are then sent through all the hooks at
NF_IP_POST_ROUTING, potentially for SNAT, after which the last hook would
force all packets to go through the IPSEC outgoing processing module.
Here outgoing policy would be checked, again not necessarily by
iptables, and encryption and authentication would be applied as
available.  The result would be ACCEPT or DROP (it could again be STOLEN
or QUEUEd at this point for opportunistic encryption).

------------------

If there are any other directions we should be considering, please
suggest...




        slainte mhath, RGB
-- 
Richard Guy Briggs -- PGP key available            Auto-Free Ottawa! Canada
<www.conscoop.ottawa.on.ca/rgb/>                       <www.flora.org/afo/>
Prevent Internet Wiretapping!        --        FreeS/WAN:<www.freeswan.org>
Thanks for voting Green! -- <green.ca>      Marillion:<www.marillion.co.uk>

