History log of /src/sys/netipsec/ipsec.c
Revision  Date  Author  Comments
 1.179  13-May-2024  msaitoh s/priviliged/privileged/
 1.178  27-Jan-2023  ozaki-r ipsec: remove unnecessary splsoftnet

Because the code of IPsec itself is already MP-safe.
 1.177  08-Dec-2022  knakahara branches: 1.177.2;
Fix: sp->lastused should be updated by time_uptime, and refactor a little.
 1.176  09-Nov-2022  knakahara Fix: an IPv4 security policy with a port number does not work for forwarded packets.
 1.175  04-Nov-2022  ozaki-r inpcb: rename functions to inpcb_*

Inspired by rmind-smpnet patches.
 1.174  28-Oct-2022  ozaki-r inpcb: integrate data structures of PCB into one

Data structures of network protocol control blocks (PCBs), i.e.,
struct inpcb, in6pcb and inpcb_hdr, are not organized well. Users of
the data structures have to handle them separately and thus the code
is cluttered and duplicated.

The commit integrates the data structures into one, struct inpcb. As a
result, users of PCBs only have to handle just one data structure, so
the code becomes simple.

One drawback is that the data size of PCB for IPv4 increases by 40 bytes
(from 248 bytes to 288 bytes).
 1.173  08-Dec-2021  andvar s/speficication/specification/
 1.172  28-Aug-2020  ozaki-r ipsec: rename ipsec_ip_input to ipsec_ip_input_checkpolicy

Because it just checks if a packet passes security policies.
 1.171  28-Aug-2020  ozaki-r inet, inet6: count packets dropped by IPsec

The counters count packets dropped due to security policy checks.
 1.170  07-Aug-2019  knakahara ipsec_getpolicybysock() should also call key_havesp() like ipsec_getpolicybyaddr().

That can reduce KEYDEBUG messages.
 1.169  09-Jul-2019  maxv Fix uninitialized variable: in ipsec_checkpcbcache(), spidx.dir is not
initialized, and the padding of the spidx structure is not initialized
either. This causes the memcmp() to wrongfully fail.

Change ipsec_setspidx() to always initialize spidx.dir and zero out the
padding.

ok ozaki-r@
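
The failure mode described for rev 1.169 can be sketched outside the kernel. This is a minimal illustration, not the NetBSD code: `struct demo_spidx` and the `demo_*` helpers are hypothetical stand-ins, and the presence of padding after `dir` is an assumption about typical ABIs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for a policy index (not the real
 * struct secpolicyindex). On typical ABIs the compiler inserts
 * padding bytes between dir and port.
 */
struct demo_spidx {
	uint8_t  dir;
	uint32_t port;
};

/*
 * Zero the whole object first so that padding bytes hold a known value,
 * then always set every member, including dir.
 */
static void
demo_setspidx(struct demo_spidx *s, uint8_t dir, uint32_t port)
{
	assert(s != NULL);
	memset(s, 0, sizeof(*s));
	s->dir = dir;
	s->port = port;
}

/* Byte-wise comparison, as a cache lookup based on memcmp() would do. */
static int
demo_spidx_equal(const struct demo_spidx *a, const struct demo_spidx *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```

Without the memset(), two objects with identical members can still differ in their padding bytes, so a memcmp()-based comparison may report a mismatch even though every field agrees.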
 1.168  27-Jan-2019  pgoyette Merge the [pgoyette-compat] branch
 1.167  22-Nov-2018  knakahara Support IPv6 NAT-T. Implemented by hsuenaga@IIJ and ohishi@IIJ.

Add ATF later.
 1.166  27-Oct-2018  maxv Localify one function, and switch to C99 types while here.
 1.165  11-Jul-2018  maxv Rename

ip_undefer_csum -> in_undefer_cksum
in_delayed_cksum -> in_undefer_cksum_tcpudp

The two previous names were inconsistent and misleading.

Put the two functions into in_offload.c. Add comments to explain what
we're doing.

The same could be done for IPv6.
 1.164  14-May-2018  maxv branches: 1.164.2;
Merge ipsec4_input and ipsec6_input into ipsec_ip_input. Make the argument
a bool for clarity. Optimize the function: if M_CANFASTFWD is not there
(because already removed by the firewall) leave now.

Makes it easier to see that M_CANFASTFWD is not removed on IPv6.
 1.163  10-May-2018  maxv Replace dumb code by M_VERIFY_PACKET. In fact, perhaps we should not even
call M_VERIFY_PACKET here, there is no particular reason for this place to
be more wrong than the rest.
 1.162  10-May-2018  maxv Rename ipsec4_forward -> ipsec_mtu, and switch to void.
 1.161  29-Apr-2018  maxv Remove unused and misleading argument from ipsec_set_policy.
 1.160  28-Apr-2018  maxv Remove IPSEC_SPLASSERT_SOFTNET, it has always been a no-op.
 1.159  28-Apr-2018  maxv Stop using a macro, rename the function to ipsec_init_pcbpolicy directly.
 1.158  28-Apr-2018  maxv Style and remove unused stuff.
 1.157  19-Apr-2018  maxv Remove extra long file paths from the headers.
 1.156  18-Apr-2018  maxv Remove dead code.

ok ozaki-r@
 1.155  17-Apr-2018  maxv Add XXX. If this code really does something, it should use MCHTYPE.
 1.154  17-Apr-2018  maxv Style, add XXX (about the mtu that goes negative), and remove #ifdef inet.
 1.153  03-Apr-2018  maxv Remove ipsec_copy_policy and ipsec_copy_pcbpolicy. No functional change,
since we used only ipsec_copy_pcbpolicy, and it was a no-op.

Originally we were using ipsec_copy_policy to optimize the IPsec-PCB
cache: when an ACK was received in response to a SYN, we used to copy the
SP cached in the SYN's PCB into the ACK's PCB, so that
ipsec_getpolicybysock could use the cached SP instead of requerying it.

Then we switched to ipsec_copy_pcbpolicy which has always been a no-op. As
a result the SP cached in the SYN was/is not copied in the ACK, and the
first call to ipsec_getpolicybysock had to query the SP and cache it
itself. It's not totally clear to me why this change was made.

But it has been this way for years, and after a conversation with Ryota
Ozaki it turns out the optimization is not valid anymore due to
MP-ification, so it won't be re-enabled.

ok ozaki-r@
 1.152  31-Mar-2018  maxv typo in comments
 1.151  03-Mar-2018  maxv branches: 1.151.2;
Reduce the diff between ipsec4_output and ipsec6_check_policy. While here
style.
 1.150  03-Mar-2018  maxv Dedup.
 1.149  28-Feb-2018  maxv add missing static
 1.148  28-Feb-2018  maxv Dedup: merge ipsec4_setspidx_inpcb and ipsec6_setspidx_in6pcb.
 1.147  28-Feb-2018  maxv ipsec6_setspidx_in6pcb: call ipsec_setspidx() only once, just like the
IPv4 code. While here put the correct variable in sizeof.

ok ozaki-r@
 1.146  27-Feb-2018  maxv Dedup: merge ipsec4_set_policy and ipsec6_set_policy. The content of the
original ipsec_set_policy function is inlined into the new one.
 1.145  27-Feb-2018  maxv Remove duplicate checks, and no need to initialize 'newsp' in
ipsec_set_policy.
 1.144  27-Feb-2018  maxv Dedup: merge

ipsec4_get_policy and ipsec6_get_policy
ipsec4_delete_pcbpolicy and ipsec6_delete_pcbpolicy

The already-existing ipsec_get_policy() function is inlined in the new
one.
 1.143  27-Feb-2018  maxv Use inpcb_hdr to reduce the diff between

ipsec4_set_policy and ipsec6_set_policy
ipsec4_get_policy and ipsec6_get_policy
ipsec4_delete_pcbpolicy and ipsec6_delete_pcbpolicy

No real functional change.
 1.142  27-Feb-2018  maxv Optimize: use ipsec_sp_hdrsiz instead of ipsec_hdrsiz, not to re-query
the SP.

ok ozaki-r@
 1.141  26-Feb-2018  maxv Dedup: call ipsec_in_reject directly. IPSEC_STAT_IN_POLVIO also gets
increased now.
 1.140  26-Feb-2018  maxv Reduce the diff between ipsec6_input and ipsec4_input.
 1.139  26-Feb-2018  maxv Dedup: merge ipsec4_in_reject and ipsec6_in_reject into ipsec_in_reject.
While here fix misleading comment.

ok ozaki-r@
 1.138  26-Feb-2018  maxv Dedup: merge ipsec4_hdrsiz and ipsec6_hdrsiz into ipsec_hdrsiz.

ok ozaki-r@
 1.137  26-Feb-2018  maxv Dedup: merge ipsec4_checkpolicy and ipsec6_checkpolicy into
ipsec_checkpolicy.

ok ozaki-r@
 1.136  26-Feb-2018  maxv Fix nonsensical checks, neither in6p nor request is allowed to be NULL,
and the former is already dereferenced in a kassert. This code should be
the same as ipsec4_set_policy.
 1.135  26-Feb-2018  maxv Merge some minor (mostly stylistic) changes from last week.
 1.134  21-Feb-2018  maxv Fix ipsec4_get_ulp(). We should do "goto done" instead of "return",
otherwise the port fields of spidx are uninitialized.

ok mlelstv@
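
The "goto done" versus "return" distinction in rev 1.134 can be illustrated with a small sketch. Everything here is hypothetical (`struct demo_ports`, `demo_get_ulp`, `DEMO_PORT_ANY`) and much simpler than ipsec4_get_ulp(); it only shows why an early exit should pass through the common label that fills in defaults.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PORT_ANY 0	/* hypothetical wildcard value */

/* Hypothetical, simplified stand-in for the port part of a policy index. */
struct demo_ports {
	uint16_t src;
	uint16_t dst;
};

/*
 * Fill *out from a transport header of hdrlen bytes. If the header is
 * too short to contain both ports, jump to the common exit that stores
 * wildcards instead of returning with *out untouched.
 */
static void
demo_get_ulp(struct demo_ports *out, const uint8_t *hdr, size_t hdrlen)
{
	assert(out != NULL);

	if (hdr == NULL || hdrlen < 4)
		goto done;	/* a bare "return" would leave *out uninitialized */

	out->src = (uint16_t)((hdr[0] << 8) | hdr[1]);
	out->dst = (uint16_t)((hdr[2] << 8) | hdr[3]);
	return;

done:
	out->src = DEMO_PORT_ANY;
	out->dst = DEMO_PORT_ANY;
}
```

With a plain "return" in the short-header branch, the caller would go on to compare whatever happened to be in the port fields.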
 1.133  21-Feb-2018  maxv Use inpcb_hdr to reduce the diff between:

ipsec4_hdrsiz and ipsec6_hdrsiz
ipsec4_in_reject and ipsec6_in_reject
ipsec4_checkpolicy and ipsec6_checkpolicy

The members of these couples are now identical, and could be merged,
giving only three functions instead of six...
 1.132  21-Feb-2018  maxv Rename:

ipsec_in_reject -> ipsec_sp_reject
ipsec_hdrsiz -> ipsec_sp_hdrsiz

localify the former, and do some cleanup while here.
 1.131  16-Feb-2018  maxv Style, remove unused and misleading macros and comments, localify, and
reduce the diff between similar functions. No functional change.
 1.130  16-Feb-2018  maxv Fix inverted logic, otherwise the kernel crashes when receiving a 1-byte
AH packet. Triggerable before authentication when IPsec and forwarding
are both enabled.
 1.129  16-Feb-2018  maxv Style a bit, no functional change.
 1.128  16-Feb-2018  maxv Remove some more FreeBSD sysctl declarations that already have NetBSD
counterparts. Discussed with ozaki-r@.
 1.127  16-Feb-2018  maxv Remove ipsec_replay and ipsec_integrity from this place, they are already
declared as sysctls. Discussed with ozaki-r@.
 1.126  16-Feb-2018  maxv Remove ip4_esp_randpad and ip6_esp_randpad, unused. Discussed with
ozaki-r@.
 1.125  08-Feb-2018  maxv Remove unused net_osdep.h include.
 1.124  23-Jan-2018  ozaki-r Fix late NULL-checking (CID 1427782: Null pointer dereferences (REVERSE_INULL))
 1.123  21-Nov-2017  ozaki-r Use M_WAITOK to allocate mbufs wherever sleepable

Further changes will get rid of unnecessary NULL checks then.
 1.122  17-Oct-2017  ozaki-r Fix buffer length for ipsec_logsastr
 1.121  03-Oct-2017  ozaki-r Don't abuse key_checkrequest just for looking up sav

It does more than expected, for example key_acquire.
 1.120  28-Sep-2017  christos - sanitize key debugging so that we don't print extra newlines or unassociated
debugging messages.
- remove unused functions and make internal ones static
- print information in one line per message
 1.119  19-Sep-2017  ozaki-r Share a global dummy SP between PCBs

It is never changed, so it can be pre-allocated and shared safely between PCBs.
 1.118  10-Aug-2017  ozaki-r Add per-CPU rtcache to ipsec_reinject_ipstack

It reduces route lookups and also reduces rtcache lock contentions
when NET_MPSAFE is enabled.
 1.117  07-Aug-2017  ozaki-r Remove out-of-date log output

Pointed out by riastradh@
 1.116  03-Aug-2017  ozaki-r Introduce KEY_SA_UNREF and replace KEY_FREESAV with it where sav will never be actually freed in the future

KEY_SA_UNREF is still key_freesav so no functional change for now.

This change reduces diff of further changes.
 1.115  02-Aug-2017  ozaki-r Comment out unused functions
 1.114  02-Aug-2017  ozaki-r Don't use KEY_NEWSP for dummy SP entries

By the change KEY_NEWSP is now not called from softint anymore
and we can use kmem_zalloc with KM_SLEEP for KEY_NEWSP.
 1.113  02-Aug-2017  ozaki-r Make IPsec SPD MP-safe

We use localcount(9), not psref(9), to make the sptree and secpolicy (SP)
entries MP-safe because SPs need to be referenced over opencrypto
processing that executes a callback in a different context.

SPs on sockets aren't managed by the sptree and can be destroyed in softint.
localcount_drain cannot be used in softint so we delay the destruction of
such SPs to a thread context. To do so, a list to manage such SPs is added
(key_socksplist) and key_timehandler_spd deletes dead SPs in the list.

For more details please read the locking notes in key.c.

Proposed on tech-kern@ and tech-net@
 1.112  26-Jul-2017  ozaki-r Fix indentation

Pointed out by knakahara@
 1.111  26-Jul-2017  ozaki-r Provide and apply key_sp_refcnt (NFC)

It simplifies further changes.
 1.110  21-Jul-2017  ozaki-r Remove ipsecrequest#sav
 1.109  21-Jul-2017  ozaki-r Don't use key_lookup_sp that depends on unstable sp->req->sav

It provided a fast look-up of SP. We will provide an alternative
method in the future (after basic MP-ification finishes).
 1.108  21-Jul-2017  ozaki-r Don't use sp->req->sav when handling NAT-T ESP fragmentation

In order to do this we need to look up a sav; however, an additional
look-up degrades performance. A sav is later looked up in
ipsec4_process_packet, so delay the fragmentation check until then
to avoid an extra look-up.
 1.107  21-Jul-2017  ozaki-r Don't use unstable isr->sav for header size calculations

We may need to optimize to not look up sav here for users that
don't need to know an exact size of headers (e.g., TCP segment size
calculation).
 1.106  19-Jul-2017  ozaki-r Look up sav instead of relying on unstable sp->req->sav

This code is executed only in an error path so an additional lookup
doesn't matter.
 1.105  19-Jul-2017  ozaki-r Remove invalid M_AUTHIPDGM check on ESP isr->sav

The M_AUTHIPDGM flag is set on an mbuf in ah_input_cb. An sav of ESP can
have AH authentication as sav->tdb_authalgxform. However, in that
case esp_input and esp_input_cb are used to do ESP decryption and
AH authentication, and M_AUTHIPDGM is never set on the mbuf. So
checking M_AUTHIPDGM of an mbuf on isr->sav of ESP is meaningless.
 1.104  18-Jul-2017  ozaki-r Restore a comment removed in previous

The comment is valid for the below code.
 1.103  18-Jul-2017  ozaki-r Remove m_tag_find(PACKET_TAG_IPSEC_PENDING_TDB) because nobody sets the tag
 1.102  12-Jul-2017  ozaki-r Omit unnecessary NULL checks for sav->sah
 1.101  07-Jul-2017  ozaki-r Rename key_alloc* functions (NFC)

We shouldn't use the term "alloc" for functions that just look up
data and actually don't allocate memory.
 1.100  14-Jun-2017  ozaki-r KNF
 1.99  02-Jun-2017  ozaki-r branches: 1.99.2;
Assert inph_locked on ipsec_pcb_skip_ipsec (was IPSEC_PCB_SKIP_IPSEC)

The assertion confirms SP caches are accessed under inph lock (solock).
 1.98  02-Jun-2017  ozaki-r Rename IPSEC_PCBHINT_MAYBE to IPSEC_PCBHINT_UNKNOWN

MAYBE is maybe unclear.
 1.97  02-Jun-2017  ozaki-r Get rid of redundant NULL check (NFC)
 1.96  01-Jun-2017  chs remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.
 1.95  30-May-2017  ozaki-r Make refcnt operations of SA and SP atomic

Using atomic operations isn't optimal and should be optimized somehow
in the future; still, the change allows a kernel with NET_MPSAFE to
run a benchmark, which is useful to know performance improvement
and degradation by code changes.
 1.94  23-May-2017  ozaki-r Use __arraycount (NFC)
 1.93  23-May-2017  ozaki-r Disable secspacq stuffs that are now unused

The stuffs are used only if sp->policy == IPSEC_POLICY_IPSEC
&& sp->req == NULL (see ipsec{4,6}_checkpolicy). However, in the
current implementation, sp->req is never NULL (except for the
moments of SP allocation and deallocation) if sp->policy is
IPSEC_POLICY_IPSEC.

It seems that the facility was partially implemented in the KAME
era and wasn't completed. Make it clear that the facility is
unused for now by #ifdef notyet. Eventually we should complete
the implementation or remove it entirely.
 1.92  19-May-2017  ozaki-r Introduce IPSECLOG and replace ipseclog and DPRINTF with it
 1.91  16-May-2017  ozaki-r Fix diagnostic assertion failure in ipsec_init_policy

panic: kernel diagnostic assertion "!cpu_softintr_p()" failed: file "../../../../netipsec/ipsec.c", line 1277
cpu7: Begin traceback...
vpanic() at netbsd:vpanic+0x140
ch_voltag_convert_in() at netbsd:ch_voltag_convert_in
ipsec_init_policy() at netbsd:ipsec_init_policy+0x149
in_pcballoc() at netbsd:in_pcballoc+0x1c5
tcp_attach_wrapper() at netbsd:tcp_attach_wrapper+0x1e1
sonewconn() at netbsd:sonewconn+0x1ea
syn_cache_get() at netbsd:syn_cache_get+0x15f
tcp_input() at netbsd:tcp_input+0x1689
ipintr() at netbsd:ipintr+0xa88
softint_dispatch() at netbsd:softint_dispatch+0xd3
DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xfffffe811d337ff0
Xsoftintr() at netbsd:Xsoftintr+0x4f

Reported by msaitoh@
 1.90  16-May-2017  ozaki-r Use kmem(9) instead of malloc/free

Some of the non-sleepable allocations can be replaced with sleepable ones.
To make it clear that the replacements are possible, some assertions
are added.
 1.89  15-May-2017  ozaki-r Show __func__ instead of __FILE__ in debug log messages

__func__ is shorter and more useful than __FILE__.
 1.88  11-May-2017  ryo Make ipsec_address() and ipsec_logsastr() mpsafe.
 1.87  10-May-2017  ozaki-r Stop ipsec4_output returning SP to the caller

SP isn't used by the caller (ip_output) and also holding its
reference looks unnecessary.
 1.86  08-May-2017  ozaki-r Omit two arguments of ipsec4_process_packet

flags is unused and tunalready is always 0. So NFC.
 1.85  28-Apr-2017  ozaki-r Fix function name in log message
 1.84  25-Apr-2017  ozaki-r branches: 1.84.2;
Check if solock of PCB is held when SP caches in the PCB are accessed

To this end, a back pointer from inpcbpolicy to inpcb_hdr is added.
 1.83  21-Apr-2017  ozaki-r Use inph for variable name of struct inpcb_hdr (NFC)
 1.82  20-Apr-2017  ozaki-r Remove unnecessary NULL checks for inp_socket and in6p_socket

They cannot be NULL except for programming errors.
 1.81  20-Apr-2017  ozaki-r Provide IPSEC_DIR_* validation macros
 1.80  19-Apr-2017  ozaki-r Use KASSERT for sanity checks of function arguments
 1.79  19-Apr-2017  ozaki-r Change ifdef DIAGNOSTIC + panic to KASSERT
 1.78  19-Apr-2017  ozaki-r Fix indentations (NFC)
 1.77  19-Apr-2017  ozaki-r Tweak KEYDEBUG macros

Let's avoid passing statements to a macro.
 1.76  19-Apr-2017  ozaki-r Change panic if DIAGNOSTIC to KASSERT

One can be changed to CTASSERT.
 1.75  19-Apr-2017  ozaki-r Retire ipsec_osdep.h

We don't need to care about other OSes (FreeBSD) anymore.

Some macros are alive in ipsec_private.h.
 1.74  19-Apr-2017  ozaki-r Improve message on assertion failure
 1.73  18-Apr-2017  ozaki-r Convert IPSEC_ASSERT to KASSERT or KASSERTMSG

IPSEC_ASSERT just discarded specified message...
 1.72  18-Apr-2017  ozaki-r Remove __FreeBSD__ and __NetBSD__ switches

No functional changes (except for a debug printf).

Note that there remain some __FreeBSD__ for sysctl knobs whose NetBSD
counterparts don't exist. And ipsec_osdep.h isn't touched yet; tidying it up
requires actual code changes.
 1.71  06-Apr-2017  ozaki-r Prepare netipsec for rump-ification

- Include "opt_*.h" only if _KERNEL_OPT is defined
- Allow encapinit to be called twice (by ifinit and ipe4_attach)
- ifinit didn't call encapinit if IPSEC is enabled (ipe4_attach called
it instead), however, on a rump kernel ipe4_attach may not be called
even if IPSEC is enabled. So we need to allow ifinit to call it anyway
- Setup sysctls in ipsec_attach explicitly instead of using SYSCTL_SETUP
- Call ip6flow_invalidate_all in key_spdadd only if in6_present
- It's possible that a rump kernel loads the ipsec library but not
the inet6 library
 1.70  03-Mar-2017  ozaki-r Pass inpcb/in6pcb instead of socket to ip_output/ip6_output

- Passing a socket to Layer 3 is layer violation and even unnecessary
- The change makes the code of callers and IPsec a bit simpler
 1.69  16-Jan-2017  christos ip6_sprintf -> IN6_PRINT so that we pass the size.
 1.68  16-Jan-2017  ryo Make ip6_sprintf(), in_fmtaddr(), lla_snprintf() and icmp6_redirect_diag() mpsafe.

Reviewed by ozaki-r@
 1.67  08-Dec-2016  ozaki-r branches: 1.67.2;
Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
point. So we need to protect rtentries somehow, say by reference counting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that is difficult to debug by bisecting.
 1.66  01-Apr-2015  ozaki-r branches: 1.66.2;
Pull out ipsec routines from ip6_input

This change reduces symbol references from netinet6 to netipsec
and improves modularity of netipsec.

No functional change is intended.
 1.65  01-Apr-2015  ozaki-r Fix wrong comments
 1.64  13-Aug-2014  plunky branches: 1.64.2;
C99 6.5.15 Conditional operator note 3 states that the second and
third operands of a ?: operation should (amongst other conditions)
either both be integer type, or both void type. cast the second
to (void) then, as log() is already a void and no result is desired.
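
The constraint from rev 1.64 can be shown with a small sketch. The names below (`demo_log`, `demo_count`, `DEMO_LOG_UNLESS`) are hypothetical and are not the macros touched by that commit; the block only illustrates casting the second operand of ?: to (void) when the third operand is a call to a void function.

```c
#include <assert.h>
#include <stddef.h>

static int demo_log_calls;	/* counts calls, for demonstration only */

/* Hypothetical void logging function standing in for log(). */
static void
demo_log(const char *msg)
{
	assert(msg != NULL);
	demo_log_calls++;
}

/* Hypothetical function returning int, whose result is not wanted. */
static int
demo_count(void)
{
	return demo_log_calls;
}

/*
 * Without the cast, the second operand would have type int and the
 * third type void, which violates the constraints on ?:. Casting the
 * second operand to (void) gives both operands void type.
 */
#define DEMO_LOG_UNLESS(quiet, msg) \
	((quiet) ? (void)demo_count() : demo_log(msg))
```

A call such as `DEMO_LOG_UNLESS(0, "policy lookup failed");` then evaluates only the logging branch, and the whole expression has type void.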
 1.63  30-May-2014  christos branches: 1.63.2; 1.63.4; 1.63.8;
Introduce 2 new variables: ipsec_enabled and ipsec_used.
ipsec_enabled is controlled by sysctl and determines if IPsec is allowed.
ipsec_used is set automatically based on ipsec being enabled, and
rules existing.
 1.62  24-Dec-2013  christos branches: 1.62.2;
fix debugging output printfs to use __func__ so they print the correct names.
 1.61  24-Dec-2013  degroote fix a typo in the log output of ipsec4_get_policy
 1.60  08-Jun-2013  rmind branches: 1.60.2;
Split IPsec code in ip_input() and ip_forward() into the separate routines
ipsec4_input() and ipsec4_forward(). Tested by christos@.
 1.59  08-Jun-2013  rmind Split IPSec logic from ip_output() into a separate routine - ipsec4_output().
No change to the mechanism intended. Tested by christos@.
 1.58  04-Jun-2013  christos PR/47886: Dr. Wolfgang Stukenbrock: IPSEC_NAT_T enabled kernels may access
outdated pointers and pass ESP data to UDP sockets.
While here, simplify the code and remove the IPSEC_NAT_T option; always
compile nat-traversal in so that it does not bitrot.
 1.57  07-Dec-2012  christos rename pcb_sp to policy to avoid:
$SRC/arch/arm/include/pcb.h:#define pcb_sp pcb_un.un_32.pcb32_sp
$SRC/arch/arm/include/pcb.h:#define pcb_sp pcb_sf.sf_r13
 1.56  13-Mar-2012  elad branches: 1.56.2;
Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.
 1.55  09-Jun-2011  drochner branches: 1.55.2; 1.55.6; 1.55.8; 1.55.12; 1.55.14;
more "const"
 1.54  08-Jun-2011  dyoung Fiddle a bit with const's to make FAST_IPSEC compile.
 1.53  05-Jun-2011  christos more malloc style.
 1.52  05-Jun-2011  christos - sprinkle const
- malloc style
 1.51  16-May-2011  drochner branches: 1.51.2;
cosmetical whitespace changes
 1.50  18-Feb-2011  drochner sprinkle some "const", documenting that the SA is not supposed to
change during an xform operation
 1.49  11-Feb-2011  drochner invalidate the secpolicy cache in the PCB before destroying, so that
the refcount in the (global) policies gets decremented
(This apparently was missed when the policy cache code was copied
over from KAME IPSEC.)
From Wolfgang Stukenbrock per PR kern/44410, just fixed differently
to avoid unnecessary differences to KAME.
 1.48  21-Jul-2010  jakllsch branches: 1.48.2; 1.48.4;
Further silence ipsec_attach().
"initializing IPsec..."" done" is of somewhat limited value.
(I normally wouldn't care; but on my box the (root) uhub(4)s attach
between the first and last portion of the line.)
 1.47  31-Jan-2010  hubertf branches: 1.47.2; 1.47.4;
Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.
 1.46  30-Jul-2009  jakllsch As explained in kern/41701 there's a missing splx() here.
 1.45  25-Jun-2009  christos Only print debugging messages about policy on error.
 1.44  10-May-2009  elad Adapt FAST_IPSEC to recent KPI changes.

Pointed out by dyoung@ on tech-kern@, thanks!
 1.43  18-Apr-2009  tsutsui Remove extra whitespace added by a stupid tool.
XXX: more in src/sys/arch
 1.42  18-Mar-2009  cegger bcopy -> memcpy
 1.41  18-Mar-2009  cegger bzero -> memset
 1.40  18-Mar-2009  cegger bcmp -> memcmp
 1.39  27-Jun-2008  degroote branches: 1.39.4; 1.39.6; 1.39.10; 1.39.12; 1.39.14;
Kill caddr_t introduced in the previous revision
Fix build with FAST_IPSEC
 1.38  27-Jun-2008  mlelstv Verify icmp type and code in IPSEC rules.
Fixes PR kern/39018
 1.37  23-Apr-2008  thorpej branches: 1.37.2; 1.37.4; 1.37.6;
Make IPSEC and FAST_IPSEC stats per-cpu. Use <net/net_stats.h> and
netstat_sysctl().
 1.36  29-Dec-2007  degroote branches: 1.36.6; 1.36.8;
Simplify the FAST_IPSEC output path
Only record an IPSEC_OUT_DONE tag when we have finished the processing
In ip{,6}_output, check this tag to know if we have already processed this
packet.
Remove some dead code (IPSEC_PENDING_TDB is not used in NetBSD)

Fix pr/36870
 1.35  09-Dec-2007  degroote branches: 1.35.2;
Kill _IP_VHL ifdef (from netinet/ip.h history, it has never been used in NetBSD so ...)
 1.34  28-Oct-2007  adrianp branches: 1.34.2; 1.34.4; 1.34.6;
The function ipsec4_get_ulp assumes that ip_off is in host order. As a
result, IPsec processing that depends on protocol and/or port can be bypassed.

Bug report, analysis and initial fix from Karl Knutsson.
Final patch and ok from degroote@
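
The byte-order issue behind rev 1.34 can be sketched as follows. This is an illustration under assumptions, not the committed fix: `demo_is_fragment` and the `DEMO_*` masks are hypothetical names, and the raw header is read straight from a byte buffer. The point is that the flags/offset field arrives in network byte order and must be converted before masking.

```c
#include <arpa/inet.h>	/* ntohs() */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the usual IPv4 flag and offset masks (host order). */
#define DEMO_IP_MF	0x2000	/* more-fragments flag */
#define DEMO_IP_OFFMASK	0x1fff	/* fragment offset bits */

/*
 * Return nonzero if the IPv4 header at iphdr describes a fragment.
 * Bytes 6..7 hold the flags and fragment offset in network byte order.
 */
static int
demo_is_fragment(const uint8_t *iphdr)
{
	uint16_t ip_off;

	assert(iphdr != NULL);
	memcpy(&ip_off, iphdr + 6, sizeof(ip_off));
	ip_off = ntohs(ip_off);		/* convert before applying the masks */
	return (ip_off & (DEMO_IP_MF | DEMO_IP_OFFMASK)) != 0;
}
```

If the conversion is skipped on a little-endian host, an unfragmented packet with only the don't-fragment bit set (0x4000 on the wire) reads as 0x0040 and matches the offset mask, so it is misclassified as a fragment and its ports are never examined.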
 1.33  07-Jul-2007  degroote branches: 1.33.6; 1.33.8; 1.33.12;
Ansify
Remove useless extern
bzero -> memset, bcopy -> memcpy

No functional changes
 1.32  08-May-2007  degroote Always compute the sp index even if we don't have any sp in spd. It will
let us choose the right default policy (based on the address family
requested).

While here, fix an error message
 1.31  15-Apr-2007  degroote Choose the right default policy, depending on the address family of the
desired policy
 1.30  25-Mar-2007  degroote Use ip4_ah_cleartos instead of ah_cleartos for consistency
 1.29  25-Mar-2007  degroote Make an exact match when we are looking for a cached sp for an unconnected
socket. If we don't make an exact match, we may use a cached rule which
has lower priority than a rule that would otherwise have matched the
packet.

Code submitted by Karl Knutsson in PR/36051
 1.28  04-Mar-2007  degroote branches: 1.28.2; 1.28.4; 1.28.6;
Remove useless cast
Use NULL instead of (void*) 0
 1.27  04-Mar-2007  christos Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.26  10-Feb-2007  degroote branches: 1.26.2;
Commit my SoC work
Add ipv6 support for fast_ipsec
Note that currently, packets with extension headers are not correctly
supported
Change the ipcomp logic
 1.25  16-Nov-2006  christos branches: 1.25.2;
__unused removal on arguments; approved by core.
 1.24  13-Oct-2006  christos more __unused
 1.23  10-Jun-2006  kardel branches: 1.23.6; 1.23.8;
reference time.tv_sec in non timecounter case
missing conversion spotted by Geoff Wing
XXX This code needs to be checked whether UTC time
is really the right abstraction. I suspect uptime
would be the correct time scale for measuring life times.
 1.22  10-Jun-2006  kardel fix a missing conversion for a mono_time reference.
detected by Geoff Wing.
 1.21  11-Apr-2006  rpaulo branches: 1.21.2;
Add two new sysctls protected under IPSEC_DEBUG:

net.inet.ipsec.test_replay - When set to 1, IPsec will send packets with
the same sequence number. This allows to verify if the other side
has proper replay attacks detection.

net.inet.ipsec.test_integrity - When set to 1, IPsec will send packets with
corrupted HMAC. This allows to verify if the other side properly
detects modified packets.

(a message will be printed indicating when these sysctls changed)

By Pawel Jakub Dawidek <pjd@FreeBSD.org>.
Discussed with Christos Zoulas and Jonathan Stone.
 1.20  25-Feb-2006  wiz branches: 1.20.2; 1.20.4; 1.20.6;
Fix some typos.
 1.19  11-Dec-2005  christos branches: 1.19.2; 1.19.4; 1.19.6;
merge ktrace-lwp.
 1.18  05-Oct-2005  christos PR/31478: YOMURA Masanori: Inconsistent default value of net.inet.ipsec.dfbit
Changed to match netinet6 (0->2)
 1.17  10-Jun-2005  christos branches: 1.17.2;
constify and unshadow.
 1.16  08-May-2005  christos Panic strings should not end with \n.
 1.15  26-Feb-2005  perry branches: 1.15.2; 1.15.4; 1.15.6;
nuke trailing whitespace
 1.14  27-Oct-2004  jonathan branches: 1.14.4; 1.14.6;
Fix missing break; Emmanuel Dreyfus.

C.f. sys/netinet6/ipsec.c rev 1.97 -> 1.98, but does not include the
gratuitous change for a case which (the comment says) should not occur.
 1.13  07-May-2004  jonathan branches: 1.13.2;
Redo net.inet.* sysctl subtree for fast-ipsec from scratch.
Attach FAST-IPSEC statistics with 64-bit counters to new sysctl MIB.
Rework netstat to show FAST_IPSEC statistics, via sysctl, for
netstat -p ipsec.

New kernel files:
sys/netipsec/Makefile (new file; install *_var.h includes)
sys/netipsec/ipsec_var.h (new 64-bit mib counter struct)

Changed kernel files:
sys/Makefile (recurse into sys/netipsec/)
sys/netinet/in.h (fake IP_PROTO name for fast_ipsec
sysctl subtree.)
sys/netipsec/ipsec.h (minimal userspace inclusion)
sys/netipsec/ipsec_osdep.h (minimal userspace inclusion)
sys/netipsec/ipsec_netbsd.c (redo sysctl subtree from scratch)
sys/netipsec/key*.c (fix broken net.key subtree)

sys/netipsec/ah_var.h (increase all counters to 64 bits)
sys/netipsec/esp_var.h (increase all counters to 64 bits)
sys/netipsec/ipip_var.h (increase all counters to 64 bits)
sys/netipsec/ipcomp_var.h (increase all counters to 64 bits)

sys/netipsec/ipsec.c (add #include netipsec/ipsec_var.h)
sys/netipsec/ipsec_mbuf.c (add #include netipsec/ipsec_var.h)
sys/netipsec/ipsec_output.c (add #include netipsec/ipsec_var.h)

sys/netinet/raw_ip.c (add #include netipsec/ipsec_var.h)
sys/netinet/tcp_input.c (add #include netipsec/ipsec_var.h)
sys/netinet/udp_usrreq.c (add #include netipsec/ipsec_var.h)

Changes to usr.bin/netstat to print the new fast-ipsec sysctl tree
for "netstat -s -p ipsec":

New file:
usr.bin/netstat/fast_ipsec.c (print fast-ipsec counters)

Changed files:
usr.bin/netstat/Makefile (add fast_ipsec.c)
usr.bin/netstat/netstat.h (declarations for fast_ipsec.c)
usr.bin/netstat/main.c (call KAME-vs-fast-ipsec dispatcher)
 1.12  25-Apr-2004  jonathan Initial commit of a port of the FreeBSD implementation of RFC 2385
(MD5 signatures for TCP, as used with BGP). Credit for original
FreeBSD code goes to Bruce M. Simpson, with FreeBSD sponsorship
credited to sentex.net. Shortening of the setsockopt() name
attributed to Vincent Jardin.

This commit is a minimal, working version of the FreeBSD code, as
MFC'ed to FreeBSD-4. It has received minimal testing with a ttcp
modified to set the TCP-MD5 option; BMS's additions to tcpdump-current
(tcpdump -M) confirm that the MD5 signatures are correct. Committed
as-is for further testing between a NetBSD BGP speaker (e.g., quagga)
and industry-standard BGP speakers (e.g., Cisco, Juniper).


NOTE: This version has two potential flaws. First, I do not see any code
that verifies received TCP-MD5 signatures. Second, the TCP-MD5
options are internally padded and assumed to be 32-bit aligned. A more
space-efficient scheme is to pack all TCP options densely (and
possibly unaligned) into the TCP header ; then do one final padding to
a 4-byte boundary. Pre-existing comments note that accounting for
TCP-option space when we add SACK is yet to be done. For now, I'm
punting on that; we can solve it properly, in a way that will handle
SACK blocks, as a separate exercise.

In case a pullup to NetBSD-2 is requested, this adds sys/netipsec/xform_tcp.c,
and modifies:

sys/net/pfkeyv2.h,v 1.15
sys/netinet/files.netinet,v 1.5
sys/netinet/ip.h,v 1.25
sys/netinet/tcp.h,v 1.15
sys/netinet/tcp_input.c,v 1.200
sys/netinet/tcp_output.c,v 1.109
sys/netinet/tcp_subr.c,v 1.165
sys/netinet/tcp_usrreq.c,v 1.89
sys/netinet/tcp_var.h,v 1.109
sys/netipsec/files.netipsec,v 1.3
sys/netipsec/ipsec.c,v 1.11
sys/netipsec/ipsec.h,v 1.7
sys/netipsec/key.c,v 1.11
share/man/man4/tcp.4,v 1.16
lib/libipsec/pfkey.c,v 1.20
lib/libipsec/pfkey_dump.c,v 1.17
lib/libipsec/policy_token.l,v 1.8
sbin/setkey/parse.y,v 1.14
sbin/setkey/setkey.8,v 1.27
sbin/setkey/token.l,v 1.15

Note that the preceding two revisions to tcp.4 will be
required to cleanly apply this diff.
 1.11  21-Apr-2004  itojun kill sprintf, use snprintf
 1.10  02-Mar-2004  thorpej branches: 1.10.2;
Remove some left-over debugging code.
 1.9  02-Mar-2004  thorpej Bring the PCB policy cache over from KAME IPsec, including the "hint"
used to short-circuit IPsec processing in other places.

This is enabled only for NetBSD at the moment; in order for it to function
correctly, ipsec_pcbconn() must be called as appropriate.
 1.8  02-Mar-2004  thorpej ipsec4_get_ulp(): Fix a reversed test that would have caused us to access
bogus IP header data if presented with a short mbuf.
 1.7  24-Feb-2004  wiz occured -> occurred. From Peter Postma.
 1.6  28-Jan-2004  jonathan Change #endif __FreeBSD__ to #endif /* __FreeBSD__ */
 1.5  20-Jan-2004  jonathan IPv6 mapped addresses require us to cope with limited polymorphism
(struct in6pcb* versus struct inpcb*) in ipsec_getpolicybysock().

Add new macros (in lieu of an abstract data type) for a ``generic''
PCB_T (points to a struct inpcb* or struct in6pcb*) to ipsec_osdep.h.
Use those new macros in ipsec_getpolicybysock() and elsewhere.

As posted to tech-net for comment/feedback, late 2003.
 1.4  06-Oct-2003  tls Reversion of "netkey merge", part 2 (replacement of removed files in the
repository by christos was part 1). netipsec should now be back as it
was on 2003-09-11, with some very minor changes:

1) Some residual platform-dependent code was moved from ipsec.h to
ipsec_osdep.h; without this, IPSEC_ASSERT() was multiply defined. ipsec.h
now includes ipsec_osdep.h

2) itojun's renaming of netipsec/files.ipsec to netipsec/files.netipsec has
been left in place (it's arguable which name is less confusing but the
rename is pretty harmless).

3) Some #endif TOKEN has been replaced by #endif /* TOKEN */; #endif TOKEN
is invalid and GCC 3 won't compile it.

An i386 kernel with "options FAST_IPSEC" and "options OPENCRYPTO" now
gets through "make depend" but fails to build with errors in ip_input.c.
But it's better than it was (thank heaven for small favors).
 1.3  12-Sep-2003  itojun merge netipsec/key* into netkey/key*. no need for both.
change confusing filename
 1.2  20-Aug-2003  jonathan opt_inet6.h is FreeBSD-specific, so wrap it with #ifdef __FreeBSD__/#endif.
 1.1  13-Aug-2003  jonathan Initial import of Sam Leffler's `Fast-IPsec' from FreeBSD 4.
Fast-IPsec is a rework of the OpenBSD and KAME IPsec code, using the
OpenCryptoFramework (and thus hardware crypto accelerators) and
numerous detailed performance improvements.

This import is (aside from SPL-level names) the FreeBSD source,
imported ``as-is'' as a historical snapshot, for future maintenance
and comparison against the FreeBSD source. For now, several minor
kernel-API differences are hidden by macros in a shim file, ipsec_osdep.h,
which (aside from SPL names) can be targeted at either NetBSD or FreeBSD.
 1.10.2.2  01-Dec-2007  bouyer Pull up following revision(s) (requested by adrianp in ticket #11395):
sys/netipsec/xform_ah.c: revision 1.19 via patch
sys/netipsec/ipsec.c: revision 1.34 via patch
sys/netipsec/xform_ipip.c: revision 1.18 via patch
sys/netipsec/ipsec_output.c: revision 1.23 via patch
sys/netipsec/ipsec_osdep.h: revision 1.21 via patch
The function ipsec4_get_ulp assumes that ip_off is in host order. As a result,
IPsec processing that depends on protocol and/or port can be bypassed.
Bug report, analysis and initial fix from Karl Knutsson.
Final patch and ok from degroote@
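
A minimal sketch of why the byte order of ip_off matters for the fragment test. This is an illustration, not the actual patch; the SKETCH_* constants and the function name are hypothetical stand-ins that mirror the usual IP_MF and IP_OFFMASK values.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

#define SKETCH_IP_MF		0x2000	/* "more fragments" flag */
#define SKETCH_IP_OFFMASK	0x1fff	/* fragment offset bits */

/*
 * Return 1 if the packet is a fragment, 0 otherwise.
 * ip_off_net is the field exactly as it appears on the wire
 * (network byte order), so it is converted before masking.
 */
static int
sketch_is_fragment(uint16_t ip_off_net)
{
	uint16_t off = ntohs(ip_off_net);

	return (off & (SKETCH_IP_MF | SKETCH_IP_OFFMASK)) != 0;
}
```

Without the conversion, a little-endian host would read a packet carrying only the DF flag (0x4000 on the wire) as 0x0040, which matches the offset mask. The packet would then be treated as a fragment, and the protocol and port lookup would be skipped.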
 1.10.2.1  10-May-2004  tron branches: 1.10.2.1.2; 1.10.2.1.4;
Pull up revision 1.13 (requested by jonathan in ticket #280):
Redo net.inet.* sysctl subtree for fast-ipsec from scratch.
Attach FAST-IPSEC statistics with 64-bit counters to new sysctl MIB.
Rework netstat to show FAST_IPSEC statistics, via sysctl, for
netstat -p ipsec.
New kernel files:
sys/netipsec/Makefile (new file; install *_var.h includes)
sys/netipsec/ipsec_var.h (new 64-bit mib counter struct)
Changed kernel files:
sys/Makefile (recurse into sys/netipsec/)
sys/netinet/in.h (fake IP_PROTO name for fast_ipsec
sysctl subtree.)
sys/netipsec/ipsec.h (minimal userspace inclusion)
sys/netipsec/ipsec_osdep.h (minimal userspace inclusion)
sys/netipsec/ipsec_netbsd.c (redo sysctl subtree from scratch)
sys/netipsec/key*.c (fix broken net.key subtree)
sys/netipsec/ah_var.h (increase all counters to 64 bits)
sys/netipsec/esp_var.h (increase all counters to 64 bits)
sys/netipsec/ipip_var.h (increase all counters to 64 bits)
sys/netipsec/ipcomp_var.h (increase all counters to 64 bits)
sys/netipsec/ipsec.c (add #include netipsec/ipsec_var.h)
sys/netipsec/ipsec_mbuf.c (add #include netipsec/ipsec_var.h)
sys/netipsec/ipsec_output.c (add #include netipsec/ipsec_var.h)
sys/netinet/raw_ip.c (add #include netipsec/ipsec_var.h)
sys/netinet/tcp_input.c (add #include netipsec/ipsec_var.h)
sys/netinet/udp_usrreq.c (add #include netipsec/ipsec_var.h)
Changes to usr.bin/netstat to print the new fast-ipsec sysctl tree
for "netstat -s -p ipsec":
New file:
usr.bin/netstat/fast_ipsec.c (print fast-ipsec counters)
Changed files:
usr.bin/netstat/Makefile (add fast_ipsec.c)
usr.bin/netstat/netstat.h (declarations for fast_ipsec.c)
usr.bin/netstat/main.c (call KAME-vs-fast-ipsec dispatcher)
 1.10.2.1.4.1  01-Dec-2007  bouyer Pull up following revision(s) (requested by adrianp in ticket #11395):
sys/netipsec/xform_ah.c: revision 1.19 via patch
sys/netipsec/ipsec.c: revision 1.34 via patch
sys/netipsec/xform_ipip.c: revision 1.18 via patch
sys/netipsec/ipsec_output.c: revision 1.23 via patch
sys/netipsec/ipsec_osdep.h: revision 1.21 via patch
The function ipsec4_get_ulp assumes that ip_off is in host order. As a result,
IPsec processing that depends on protocol and/or port can be bypassed.
Bug report, analysis and initial fix from Karl Knutsson.
Final patch and ok from degroote@
 1.10.2.1.2.1  01-Dec-2007  bouyer Pull up following revision(s) (requested by adrianp in ticket #11395):
sys/netipsec/xform_ah.c: revision 1.19 via patch
sys/netipsec/ipsec.c: revision 1.34 via patch
sys/netipsec/xform_ipip.c: revision 1.18 via patch
sys/netipsec/ipsec_output.c: revision 1.23 via patch
sys/netipsec/ipsec_osdep.h: revision 1.21 via patch
The function ipsec4_get_ulp assumes that ip_off is in host order. As a result,
IPsec processing that depends on protocol and/or port can be bypassed.
Bug report, analysis and initial fix from Karl Knutsson.
Final patch and ok from degroote@
 1.13.2.7  10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.13.2.6  04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.13.2.5  02-Nov-2004  skrll Sync with HEAD.
 1.13.2.4  21-Sep-2004  skrll Fix the sync with head I botched.
 1.13.2.3  18-Sep-2004  skrll Sync with HEAD.
 1.13.2.2  03-Aug-2004  skrll Sync with HEAD
 1.13.2.1  07-May-2004  skrll file ipsec.c was added on branch ktrace-lwp on 2004-08-03 10:55:29 +0000
 1.14.6.1  19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.14.4.1  29-Apr-2005  kent sync with -current
 1.15.6.1  22-Nov-2007  bouyer Pull up following revision(s) (requested by adrianp in ticket #1878):
sys/netipsec/xform_ah.c: revision 1.19 via patch
sys/netipsec/ipsec.c: revision 1.34 via patch
sys/netipsec/xform_ipip.c: revision 1.18 via patch
sys/netipsec/ipsec_output.c: revision 1.23 via patch
sys/netipsec/ipsec_osdep.h: revision 1.21 via patch
The function ipsec4_get_ulp assumes that ip_off is in host order. As a result,
IPsec processing that depends on protocol and/or port can be bypassed.
Bug report, analysis and initial fix from Karl Knutsson.
Final patch and ok from degroote@
 1.15.4.1  22-Nov-2007  bouyer Pull up following revision(s) (requested by adrianp in ticket #1878):
sys/netipsec/xform_ah.c: revision 1.19 via patch
sys/netipsec/ipsec.c: revision 1.34 via patch
sys/netipsec/xform_ipip.c: revision 1.18 via patch
sys/netipsec/ipsec_output.c: revision 1.23 via patch
sys/netipsec/ipsec_osdep.h: revision 1.21 via patch
The function ipsec4_get_ulp assumes that ip_off is in host order. As a result,
IPsec processing that depends on protocol and/or port can be bypassed.
Bug report, analysis and initial fix from Karl Knutsson.
Final patch and ok from degroote@
 1.15.2.1  22-Nov-2007  bouyer Pull up following revision(s) (requested by adrianp in ticket #1878):
sys/netipsec/xform_ah.c: revision 1.19 via patch
sys/netipsec/ipsec.c: revision 1.34 via patch
sys/netipsec/xform_ipip.c: revision 1.18 via patch
sys/netipsec/ipsec_output.c: revision 1.23 via patch
sys/netipsec/ipsec_osdep.h: revision 1.21 via patch
The function ipsec4_get_ulp assumes that ip_off is in host order. As a result,
IPsec processing that depends on protocol and/or port can be bypassed.
Bug report, analysis and initial fix from Karl Knutsson.
Final patch and ok from degroote@
 1.17.2.6  21-Jan-2008  yamt sync with head
 1.17.2.5  15-Nov-2007  yamt sync with head.
 1.17.2.4  03-Sep-2007  yamt sync with head.
 1.17.2.3  26-Feb-2007  yamt sync with head.
 1.17.2.2  30-Dec-2006  yamt sync with head.
 1.17.2.1  21-Jun-2006  yamt sync with head.
 1.19.6.1  22-Apr-2006  simonb Sync with head.
 1.19.4.1  09-Sep-2006  rpaulo sync with head
 1.19.2.1  01-Mar-2006  yamt sync with head.
 1.20.6.1  24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.20.4.1  19-Apr-2006  elad sync with head.
 1.20.2.2  26-Jun-2006  yamt sync with head.
 1.20.2.1  24-May-2006  yamt sync with head.
 1.21.2.1  19-Jun-2006  chap Sync with head.
 1.23.8.2  10-Dec-2006  yamt sync with head.
 1.23.8.1  22-Oct-2006  yamt sync with head
 1.23.6.1  18-Nov-2006  ad Sync with head.
 1.25.2.3  31-Oct-2007  liamjfoy Pull up following revision(s) (requested by adrianp in ticket #964):
sys/netipsec/xform_ah.c: revision 1.19
sys/netipsec/ipsec.c: revision 1.34
sys/netipsec/xform_ipip.c: revision 1.18
sys/netipsec/ipsec_output.c: revision 1.23
sys/netipsec/ipsec_osdep.h: revision 1.21
The function ipsec4_get_ulp assumes that ip_off is in host order. As a result,
IPsec processing that depends on protocol and/or port can be bypassed.
Bug report, analysis and initial fix from Karl Knutsson.
Final patch and ok from degroote@
 1.25.2.2  24-May-2007  pavel Pull up following revision(s) (requested by degroote in ticket #667):
sys/netinet/tcp_input.c: revision 1.260
sys/netinet/tcp_output.c: revision 1.154
sys/netinet/tcp_subr.c: revision 1.210
sys/netinet6/icmp6.c: revision 1.129
sys/netinet6/in6_proto.c: revision 1.70
sys/netinet6/ip6_forward.c: revision 1.54
sys/netinet6/ip6_input.c: revision 1.94
sys/netinet6/ip6_output.c: revision 1.114
sys/netinet6/raw_ip6.c: revision 1.81
sys/netipsec/ipcomp_var.h: revision 1.4
sys/netipsec/ipsec.c: revision 1.26 via patch,1.31-1.32
sys/netipsec/ipsec6.h: revision 1.5
sys/netipsec/ipsec_input.c: revision 1.14
sys/netipsec/ipsec_netbsd.c: revision 1.18,1.26
sys/netipsec/ipsec_output.c: revision 1.21 via patch
sys/netipsec/key.c: revision 1.33,1.44
sys/netipsec/xform_ipcomp.c: revision 1.9
sys/netipsec/xform_ipip.c: revision 1.15
sys/opencrypto/deflate.c: revision 1.8
Commit my SoC work
Add ipv6 support for fast_ipsec
Note that currently, packets with extension headers are not correctly
supported
Change the ipcomp logic

Add sysctl tree to modify the fast_ipsec options related to ipv6. Similar
to the sysctl kame interface.

Choose the correct default policy, depending on the address family of the
desired policy

Increase the refcount for the default ipv6 policy so nobody can reclaim it

Always compute the sp index even if we don't have any sp in spd. It will
let us choose the right default policy (based on the address family
requested).
While here, fix an error message

Use a dynamic array instead of a static array to decompress. It lets us
decompress any data, whatever the ratio of decompressed data to compressed
data is.
It fixes the last issues with fast_ipsec and ipcomp.
While here, bzero -> memset, bcopy -> memcpy, FREE -> free
Reviewed a long time ago by sam@
 1.25.2.1  12-May-2007  pavel branches: 1.25.2.1.2;
Pull up following revision(s) (requested by degroote in ticket #630):
sys/netipsec/key.c: revision 1.43-1.46
sys/netinet6/ipsec.c: revision 1.116
sys/netipsec/ipsec.c: revision 1.29 via patch
sys/netkey/key.c: revision 1.154-1.155
Call key_checkspidup with the SPI in network byte order so that it can be
compared with the SPI stored in the sadb.
Reported by Karl Knutsson in kern/36038 .

Make an exact match when we are looking for a cached sp for an unconnected
socket. If we don't make an exact match, we may use a cached rule which
has lower priority than a rule that would otherwise have matched the
packet.
Code submitted by Karl Knutsson in PR/36051

Fix a memleak in key_spdget.
Problem was reported by Karl Knutsson by pr/36119.

In spddelete2, if we can't find the sp by this id, return after sending an
error message, don't process the following code with the NULL sp.
Spotted by Matthew Grooms on freebsd-net ML

When we construct an answer for SADB_X_SPDGET, don't use a hardcoded 0 for seq but
the seq used by the request. It will improve consistency with the answer to an SADB_GET
request and helps some applications which rely on both seq and pid.
Reported by Karl Knutsson by pr/36119.
 1.25.2.1.2.2  06-Jan-2008  wrstuden Catch up to netbsd-4.0 release.
 1.25.2.1.2.1  04-Jun-2007  wrstuden Update to today's netbsd-4.
 1.26.2.4  17-May-2007  yamt sync with head.
 1.26.2.3  07-May-2007  yamt sync with head.
 1.26.2.2  15-Apr-2007  yamt sync with head.
 1.26.2.1  12-Mar-2007  rmind Sync with HEAD.
 1.28.6.1  29-Mar-2007  reinoud Pullup to -current
 1.28.4.1  11-Jul-2007  mjf Sync with head.
 1.28.2.3  15-Jul-2007  ad Sync with head.
 1.28.2.2  08-Jun-2007  ad Sync with head.
 1.28.2.1  10-Apr-2007  ad Sync with head.
 1.33.12.1  13-Nov-2007  bouyer Sync with HEAD
 1.33.8.2  09-Jan-2008  matt sync with HEAD
 1.33.8.1  06-Nov-2007  matt sync with HEAD
 1.33.6.1  28-Oct-2007  joerg Sync with HEAD.
 1.34.6.1  11-Dec-2007  yamt sync with head.
 1.34.4.1  26-Dec-2007  ad Sync with head.
 1.34.2.1  18-Feb-2008  mjf Sync with HEAD.
 1.35.2.1  02-Jan-2008  bouyer Sync with HEAD
 1.36.8.1  18-May-2008  yamt sync with head.
 1.36.6.2  29-Jun-2008  mjf Sync with HEAD.
 1.36.6.1  02-Jun-2008  mjf Sync with HEAD.
 1.37.6.2  03-Jul-2008  simonb Sync with head.
 1.37.6.1  27-Jun-2008  simonb Sync with head.
 1.37.4.1  18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.37.2.6  11-Aug-2010  yamt sync with head.
 1.37.2.5  11-Mar-2010  yamt sync with head
 1.37.2.4  19-Aug-2009  yamt sync with head.
 1.37.2.3  18-Jul-2009  yamt sync with head.
 1.37.2.2  16-May-2009  yamt sync with head
 1.37.2.1  04-May-2009  yamt sync with head.
 1.39.14.1  21-Apr-2010  matt sync to netbsd-5
 1.39.12.1  07-Aug-2009  snj Pull up following revision(s) (requested by jakllsch in ticket #884):
sys/netipsec/ipsec.c: revision 1.46
As explained in kern/41701 there's a missing splx() here.
 1.39.10.2  23-Jul-2009  jym Sync with HEAD.
 1.39.10.1  13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.39.6.2  14-Feb-2010  bouyer Pull up following revision(s) (requested by hubertf in ticket #1290):
sys/kern/kern_ksyms.c: revision 1.53
sys/dev/pci/agp_via.c: revision 1.18
sys/netipsec/key.c: revision 1.63
sys/arch/x86/x86/x86_autoconf.c: revision 1.49
sys/kern/init_main.c: revision 1.415
sys/kern/cnmagic.c: revision 1.11
sys/netipsec/ipsec.c: revision 1.47
sys/arch/x86/x86/pmap.c: revision 1.100
sys/netkey/key.c: revision 1.176
Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.
 1.39.6.1  07-Aug-2009  snj Pull up following revision(s) (requested by jakllsch in ticket #884):
sys/netipsec/ipsec.c: revision 1.46
As explained in kern/41701 there's a missing splx() here.
 1.39.4.1  28-Apr-2009  skrll Sync with HEAD.
 1.47.4.3  12-Jun-2011  rmind sync with head
 1.47.4.2  31-May-2011  rmind sync with head
 1.47.4.1  05-Mar-2011  rmind sync with head
 1.47.2.1  17-Aug-2010  uebayasi Sync with HEAD.
 1.48.4.2  05-Mar-2011  bouyer Sync with HEAD
 1.48.4.1  17-Feb-2011  bouyer Sync with HEAD
 1.48.2.1  06-Jun-2011  jruoho Sync with HEAD.
 1.51.2.1  23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.55.14.1  16-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1531):

sys/netipsec/ipsec.c: revision 1.130

Fix inverted logic, otherwise the kernel crashes when receiving a 1-byte
AH packet. Triggerable before authentication when IPsec and forwarding
are both enabled.
 1.55.12.1  16-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1531):

sys/netipsec/ipsec.c: revision 1.130

Fix inverted logic, otherwise the kernel crashes when receiving a 1-byte
AH packet. Triggerable before authentication when IPsec and forwarding
are both enabled.
 1.55.8.1  16-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1531):

sys/netipsec/ipsec.c: revision 1.130

Fix inverted logic, otherwise the kernel crashes when receiving a 1-byte
AH packet. Triggerable before authentication when IPsec and forwarding
are both enabled.
 1.55.6.1  05-Apr-2012  mrg sync to latest -current.
 1.55.2.3  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.55.2.2  16-Jan-2013  yamt sync with (a bit old) head
 1.55.2.1  17-Apr-2012  yamt sync with head
 1.56.2.4  03-Dec-2017  jdolecek update from HEAD
 1.56.2.3  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.56.2.2  23-Jun-2013  tls resync from head
 1.56.2.1  25-Feb-2013  tls resync with head
 1.60.2.2  18-May-2014  rmind sync with head
 1.60.2.1  17-Jul-2013  rmind Checkpoint work in progress:
- Move PCB structures under __INPCB_PRIVATE, adjust most of the callers
and thus make IPv4 PCB structures mostly opaque. Any volunteers for
merging in6pcb with inpcb (see rpaulo-netinet-merge-pcb branch)?
- Move various global vars to the modules where they belong, make them static.
- Some preliminary work for IPv4 PCB locking scheme.
- Make raw IP code mostly MP-safe. Simplify some of it.
- Rework "fast" IP forwarding (ipflow) code to be mostly MP-safe. It should
run from a software interrupt, rather than hard.
- Rework tun(4) pseudo interface to be MP-safe.
- Work towards making some other interfaces more strict.
 1.62.2.1  10-Aug-2014  tls Rebase.
 1.63.8.1  16-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1570):

sys/netipsec/ipsec.c: revision 1.130

Fix inverted logic, otherwise the kernel crashes when receiving a 1-byte
AH packet. Triggerable before authentication when IPsec and forwarding
are both enabled.
 1.63.4.1  16-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1570):

sys/netipsec/ipsec.c: revision 1.130

Fix inverted logic, otherwise the kernel crashes when receiving a 1-byte
AH packet. Triggerable before authentication when IPsec and forwarding
are both enabled.
 1.63.2.1  16-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1570):

sys/netipsec/ipsec.c: revision 1.130

Fix inverted logic, otherwise the kernel crashes when receiving a 1-byte
AH packet. Triggerable before authentication when IPsec and forwarding
are both enabled.
 1.64.2.3  28-Aug-2017  skrll Sync with HEAD
 1.64.2.2  05-Feb-2017  skrll Sync with HEAD
 1.64.2.1  06-Apr-2015  skrll Sync with HEAD
 1.66.2.3  26-Apr-2017  pgoyette Sync with HEAD
 1.66.2.2  20-Mar-2017  pgoyette Sync with HEAD
 1.66.2.1  07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.67.2.1  21-Apr-2017  bouyer Sync with HEAD
 1.84.2.3  19-May-2017  pgoyette Resolve conflicts from previous merge (all resulting from $NetBSD
keyword expansion)
 1.84.2.2  11-May-2017  pgoyette Sync with HEAD
 1.84.2.1  02-May-2017  pgoyette Sync with HEAD - tag prg-localcount2-base1
 1.99.2.5  30-Mar-2018  martin Pull up following revision(s) (requested by maxv in ticket #669):

sys/netipsec/ipsec.c: revision 1.134

Fix ipsec4_get_ulp(). We should do "goto done" instead of "return",
otherwise the port fields of spidx are uninitialized.

ok mlelstv@
 1.99.2.4  16-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #559):

sys/netipsec/ipsec.c: revision 1.130

Fix inverted logic, otherwise the kernel crashes when receiving a 1-byte
AH packet. Triggerable before authentication when IPsec and forwarding
are both enabled.
 1.99.2.3  05-Feb-2018  martin Pull up following revision(s) (requested by ozaki-r in ticket #528):
sys/net/agr/if_agr.c: revision 1.42
sys/netinet6/nd6_rtr.c: revision 1.137
sys/netinet6/nd6_rtr.c: revision 1.138
sys/net/agr/if_agr.c: revision 1.46
sys/net/route.c: revision 1.206
sys/net/if.c: revision 1.419
sys/net/agr/if_agrether.c: revision 1.10
sys/netinet6/nd6.c: revision 1.241
sys/netinet6/nd6.c: revision 1.242
sys/netinet6/nd6.c: revision 1.243
sys/netinet6/nd6.c: revision 1.244
sys/netinet6/nd6.c: revision 1.245
sys/netipsec/ipsec_input.c: revision 1.52
sys/netipsec/ipsec_input.c: revision 1.53
sys/net/agr/if_agrsubr.h: revision 1.5
sys/kern/subr_workqueue.c: revision 1.35
sys/netipsec/ipsec.c: revision 1.124
sys/net/agr/if_agrsubr.c: revision 1.11
sys/net/agr/if_agrsubr.c: revision 1.12
Simplify; share agr_vlan_add and agr_vlan_del (NFCI)
Fix late NULL-checking (CID 1427782: Null pointer dereferences (REVERSE_INULL))
KNF: replace soft tabs with hard tabs
Add missing NULL-checking for m_pullup (CID 1427770: Null pointer dereferences (NULL_RETURNS))
Add locking.
Revert "Get rid of unnecessary splsoftnet" (v1.133)
It's not always true that softnet_lock is held these places.
See PR kern/52947.
Get rid of unnecessary splsoftnet (redo)
Unless NET_MPSAFE, splsoftnet is still needed for rt_* functions.
Use existing fill_[pd]rlist() functions to calculate size of buffer to
allocate, rather than relying on an arbitrary length passed in from
userland.
Allow copyout() of partial results if the user buffer is too small, to
be consistent with the way sysctl(3) is documented.
Garbage-collect now-unused third parameter in the fill_[pd]rlist()
functions.
As discussed on IRC.
OK kamil@ and christos@
XXX Needs pull-up to netbsd-8 branch.
Simplify, from christos@
More simplification, this time from ozaki-r@
No need to break after return.
One more from christos@
No need to initialize fill_func
more cleanup (don't allow oldlenp == NULL)
Destroy ifq_lock at the end of if_detach
It still can be used in if_detach.
Prevent rt_free_global.wk from being enqueued to workqueue doubly
Check if a queued work is tried to be enqueued again, which is not allowed
 1.99.2.2  30-Nov-2017  martin Pull up following revision(s) (requested by ozaki-r in ticket #406):
sys/netipsec/key.c: revision 1.239
sys/netipsec/key.c: revision 1.240
sys/netipsec/key.c: revision 1.241
sys/netipsec/key.c: revision 1.242
sys/netipsec/key.h: revision 1.33
sys/netipsec/ipsec.c: revision 1.123
sys/netipsec/key.c: revision 1.236
sys/netipsec/key.c: revision 1.237
sys/netipsec/key.c: revision 1.238
Provide a function to call MGETHDR and MCLGET
The change fixes two usages of MGETHDR that don't check whether a mbuf is really
allocated before passing it to MCLGET.
Fix error handling of MCLGET in key_alloc_mbuf
Add missing splx to key_spdexpire
Use M_WAITOK to allocate mbufs wherever sleepable
Further changes will get rid of unnecessary NULL checks then.
Get rid of unnecessary NULL checks that are obsoleted by M_WAITOK
Simplify the code by avoiding unnecessary error checks
- Remove unnecessary m_pullup for self-allocated mbufs
- Replace some if-fails-return sanity checks with KASSERT
Call key_sendup_mbuf immediately unless key_acquire is called in softint
We need to defer it only if it's called in softint to avoid deadlock.
 1.99.2.1  21-Oct-2017  snj Pull up following revision(s) (requested by ozaki-r in ticket #300):
crypto/dist/ipsec-tools/src/setkey/parse.y: 1.19
crypto/dist/ipsec-tools/src/setkey/token.l: 1.20
distrib/sets/lists/tests/mi: 1.754, 1.757, 1.759
doc/TODO.smpnet: 1.12-1.13
sys/net/pfkeyv2.h: 1.32
sys/net/raw_cb.c: 1.23-1.24, 1.28
sys/net/raw_cb.h: 1.28
sys/net/raw_usrreq.c: 1.57-1.58
sys/net/rtsock.c: 1.228-1.229
sys/netinet/in_proto.c: 1.125
sys/netinet/ip_input.c: 1.359-1.361
sys/netinet/tcp_input.c: 1.359-1.360
sys/netinet/tcp_output.c: 1.197
sys/netinet/tcp_var.h: 1.178
sys/netinet6/icmp6.c: 1.213
sys/netinet6/in6_proto.c: 1.119
sys/netinet6/ip6_forward.c: 1.88
sys/netinet6/ip6_input.c: 1.181-1.182
sys/netinet6/ip6_output.c: 1.193
sys/netinet6/ip6protosw.h: 1.26
sys/netipsec/ipsec.c: 1.100-1.122
sys/netipsec/ipsec.h: 1.51-1.61
sys/netipsec/ipsec6.h: 1.18-1.20
sys/netipsec/ipsec_input.c: 1.44-1.51
sys/netipsec/ipsec_netbsd.c: 1.41-1.45
sys/netipsec/ipsec_output.c: 1.49-1.64
sys/netipsec/ipsec_private.h: 1.5
sys/netipsec/key.c: 1.164-1.234
sys/netipsec/key.h: 1.20-1.32
sys/netipsec/key_debug.c: 1.18-1.21
sys/netipsec/key_debug.h: 1.9
sys/netipsec/keydb.h: 1.16-1.20
sys/netipsec/keysock.c: 1.59-1.62
sys/netipsec/keysock.h: 1.10
sys/netipsec/xform.h: 1.9-1.12
sys/netipsec/xform_ah.c: 1.55-1.74
sys/netipsec/xform_esp.c: 1.56-1.72
sys/netipsec/xform_ipcomp.c: 1.39-1.53
sys/netipsec/xform_ipip.c: 1.50-1.54
sys/netipsec/xform_tcp.c: 1.12-1.16
sys/rump/librump/rumpkern/Makefile.rumpkern: 1.170
sys/rump/librump/rumpnet/net_stub.c: 1.27
sys/sys/protosw.h: 1.67-1.68
tests/net/carp/t_basic.sh: 1.7
tests/net/if_gif/t_gif.sh: 1.11
tests/net/if_l2tp/t_l2tp.sh: 1.3
tests/net/ipsec/Makefile: 1.7-1.9
tests/net/ipsec/algorithms.sh: 1.5
tests/net/ipsec/common.sh: 1.4-1.6
tests/net/ipsec/t_ipsec_ah_keys.sh: 1.2
tests/net/ipsec/t_ipsec_esp_keys.sh: 1.2
tests/net/ipsec/t_ipsec_gif.sh: 1.6-1.7
tests/net/ipsec/t_ipsec_l2tp.sh: 1.6-1.7
tests/net/ipsec/t_ipsec_misc.sh: 1.8-1.18
tests/net/ipsec/t_ipsec_sockopt.sh: 1.1-1.2
tests/net/ipsec/t_ipsec_tcp.sh: 1.1-1.2
tests/net/ipsec/t_ipsec_transport.sh: 1.5-1.6
tests/net/ipsec/t_ipsec_tunnel.sh: 1.9
tests/net/ipsec/t_ipsec_tunnel_ipcomp.sh: 1.1-1.2
tests/net/ipsec/t_ipsec_tunnel_odd.sh: 1.3
tests/net/mcast/t_mcast.sh: 1.6
tests/net/net/t_ipaddress.sh: 1.11
tests/net/net_common.sh: 1.20
tests/net/npf/t_npf.sh: 1.3
tests/net/route/t_flags.sh: 1.20
tests/net/route/t_flags6.sh: 1.16
usr.bin/netstat/fast_ipsec.c: 1.22
Do m_pullup before mtod

It may fix panics of some tests on anita/sparc and anita/GuruPlug.
---
KNF
---
Enable DEBUG for babylon5
---
Apply C99-style struct initialization to xformsw
---
Tweak outputs of netstat -s for IPsec

- Get rid of "Fast"
- Use ipsec and ipsec6 for titles to clarify protocol
- Indent outputs of sub protocols

Original outputs were organized like this:

(Fast) IPsec:
IPsec ah:
IPsec esp:
IPsec ipip:
IPsec ipcomp:
(Fast) IPsec:
IPsec ah:
IPsec esp:
IPsec ipip:
IPsec ipcomp:

New outputs are organized like this:

ipsec:
ah:
esp:
ipip:
ipcomp:
ipsec6:
ah:
esp:
ipip:
ipcomp:
---
Add test cases for IPComp
---
Simplify IPSEC_OSTAT macro (NFC)
---
KNF; replace leading whitespaces with hard tabs
---
Introduce and use SADB_SASTATE_USABLE_P
---
KNF
---
Add update command for testing

Updating an SA (SADB_UPDATE) requires that the process issuing
SADB_UPDATE is the same as the process that issued SADB_ADD (or SADB_GETSPI).
This means that update command must be used with add command in a
configuration of setkey. This usage is normally meaningless but
useful for testing (and debugging) purposes.
---
Add test cases for updating SA/SP

The tests require the newly added update command of setkey.
---
PR/52346: Frank Kardel: Fix checksumming for NAT-T
See XXX for improvements.
---
Remove codes for PACKET_TAG_IPSEC_IN_CRYPTO_DONE

It seems that PACKET_TAG_IPSEC_IN_CRYPTO_DONE is for network adapters
that have IPsec accelerators; a driver sets the mtag to a packet
when its device has already encrypted the packet.

Unfortunately no driver has implemented such offload features for many
years, and none seems likely to implement them soon. (Note that neither
FreeBSD nor Linux has such drivers.) Let's remove the related
(unused) code and simplify the IPsec code.
---
Fix usages of sadb_msg_errno
---
Avoid updating sav directly

On SADB_UPDATE a target sav was updated directly, which was unsafe.
Instead allocate another sav, copy variables of the old sav to
the new one and replace the old one with the new one.
---
Simplify; we can assume sav->tdb_xform cannot be NULL while it's valid
---
Rename key_alloc* functions (NFC)

We shouldn't use the term "alloc" for functions that just look up
data and actually don't allocate memory.
---
Use explicit_memset to surely zero-clear key_auth and key_enc
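
For readers unfamiliar with why an explicit clear is needed: a plain memset of a buffer that is never read again may be removed by the optimizer. The following is a portable stand-in for illustration only, assuming a volatile pointer is enough to keep the stores; it is not NetBSD's explicit_memset implementation, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Zero len bytes at buf through a volatile pointer so the compiler
 * cannot drop the stores as dead writes.
 */
static void
sketch_explicit_zero(void *buf, size_t len)
{
	volatile unsigned char *p = buf;

	assert(buf != NULL || len == 0);
	while (len-- > 0)
		*p++ = 0;
}
```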
---
Make sure to clear keys on error paths of key_setsaval
---
Add missing KEY_FREESAV
---
Make sure a sav is inserted to a sah list after its initialization completes
---
Remove unnecessary zero-clearing codes from key_setsaval

key_setsaval is now used only for a newly-allocated sav. (It was
used to reset variables of an existing sav.)
---
Correct wrong assumption of sav->refcnt in key_delsah

A sav in a list should basically never have sav->refcnt == 0. Also,
KEY_FREESAV assumes sav->refcnt > 0.
---
Let key_getsavbyspi take a reference of a returning sav
---
Use time_mono_to_wall (NFC)
---
Separate sending message routine (NFC)
---
Simplify; remove unnecessary zero-clears

key_freesaval is used only when a target sav is being destroyed.
---
Omit NULL checks for sav->lft_c

sav->lft_c can be NULL only when initializing or destroying sav.
---
Omit unnecessary NULL checks for sav->sah
---
Omit unnecessary check of sav->state

key_allocsa_policy picks a sav of either MATURE or DYING so we
don't need to check its state again.
---
Simplify; omit unnecessary saidx passing

- ipsec_nextisr returns a saidx but no caller uses it
- key_checkrequest is passed a saidx but it can be obtained from
another argument (isr)
---
Fix splx not being called on some error paths
---
Fix header size calculation of esp where sav is NULL
---
Fix header size calculation of ah in the case sav is NULL

This fix was also needed for esp.
---
Pass sav directly to opencrypto callback

In a callback, use a passed sav as-is by default and look up a sav
only if the passed sav is dead.
---
Avoid examining freshness of sav on packet processing

If a sav list is sorted (by lft_c->sadb_lifetime_addtime) in advance,
we don't need to examine each sav and also don't need to delete one
on the fly and send up a message. Fortunately every sav list is sorted
as we need.

The newly added key_validate_savlist validates that each sav list is indeed sorted
(run only if DEBUG because it's not cheap).
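
A minimal sketch of such a sortedness check, under stated assumptions: struct sketch_sav and both function names are hypothetical, the addtime field stands in for lft_c->sadb_lifetime_addtime, and the newest-first ordering is an assumption of this sketch rather than something the log states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a sav; only the field the check needs. */
struct sketch_sav {
	uint64_t addtime;		/* time the SA was added */
	struct sketch_sav *next;	/* next entry, or NULL */
};

/* Walk the list once and compare each entry with its successor. */
static bool
sketch_savlist_is_sorted(const struct sketch_sav *head)
{
	for (const struct sketch_sav *sav = head;
	    sav != NULL && sav->next != NULL; sav = sav->next) {
		if (sav->addtime < sav->next->addtime)
			return false;	/* an older entry precedes a newer one */
	}
	return true;
}

/* The O(n) walk is only compiled in for DEBUG builds. */
static void
sketch_validate_savlist(const struct sketch_sav *head)
{
#ifdef DEBUG
	assert(sketch_savlist_is_sorted(head));
#else
	(void)head;
#endif
}
```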
---
Add test cases for SAs with different SPIs
---
Prepare to stop using isr->sav

isr is a shared resource and using isr->sav as a temporary storage
for each packet processing is racy. And also having a reference from
isr to sav makes the lifetime of sav non-deterministic; such a reference
is removed when a packet is processed and isr->sav is overwritten by
new one. Let's have a sav locally for each packet processing instead of
using shared isr->sav.

However this change doesn't stop using isr->sav yet because there are
some users of isr->sav. isr->sav will be removed after the users find
a way to not use isr->sav.
---
Fix wrong argument handling
---
fix printf format.
---
Don't validate sav lists of LARVAL or DEAD states

We don't sort the lists so the validation will always fail.

Fix PR kern/52405
---
Make sure to sort the list when changing the state by key_sa_chgstate
---
Rename key_allocsa_policy to key_lookup_sa_bysaidx
---
Separate test files
---
Calculate ah_max_authsize on initialization as well as esp_max_ivlen
---
Remove m_tag_find(PACKET_TAG_IPSEC_PENDING_TDB) because nobody sets the tag
---
Restore a comment removed in previous

The comment is valid for the below code.
---
Make tests more stable

sleep command seems to wait longer than expected on anita so
use polling to wait for a state change.
---
Add tests that explicitly delete SAs instead of waiting for expirations
---
Remove invalid M_AUTHIPDGM check on ESP isr->sav

M_AUTHIPDGM flag is set to a mbuf in ah_input_cb. An sav of ESP can
have AH authentication as sav->tdb_authalgxform. However, in that
case esp_input and esp_input_cb are used to do ESP decryption and
AH authentication and M_AUTHIPDGM is never set on an mbuf. So
checking M_AUTHIPDGM of a mbuf on isr->sav of ESP is meaningless.
---
Look up sav instead of relying on unstable sp->req->sav

This code is executed only in an error path so an additional lookup
doesn't matter.
---
Correct a comment
---
Don't release sav if calling crypto_dispatch again
---
Remove extra KEY_FREESAV from ipsec_process_done

It should be done by the caller.
---
Don't bother the case of crp->crp_buf == NULL in callbacks
---
Hold a reference to an SP during opencrypto processing

An SP has a list of isr (ipsecrequest) that represents a sequence
of IPsec encryption/authentication processing. One isr corresponds
to one opencrypto processing. The lifetime of an isr follows its SP.

We pass an isr to a callback function of opencrypto to continue
to a next encryption/authentication processing. However nobody
guaranteed that the isr wasn't freed, i.e., its SP wasn't destroyed.

In order to avoid such unexpected destruction of isr, hold a reference
to its SP during opencrypto processing.
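
The idea can be sketched in plain C as follows. This is a minimal,
single-threaded illustration, not the kernel code: struct policy,
struct crypto_job and every function name below are hypothetical stand-ins,
and the counter is a plain int rather than an MP-safe reference count.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an SP; refcnt is NOT atomic in this sketch. */
struct policy {
	int refcnt;
};

int policies_freed;	/* counts destroyed policies, for illustration only */

struct policy *
policy_new(void)
{
	struct policy *sp = calloc(1, sizeof(*sp));

	if (sp == NULL)
		abort();
	sp->refcnt = 1;		/* reference owned by the creator */
	return sp;
}

void
policy_ref(struct policy *sp)
{
	sp->refcnt++;
}

void
policy_unref(struct policy *sp)
{
	assert(sp->refcnt > 0);
	if (--sp->refcnt == 0) {
		free(sp);
		policies_freed++;
	}
}

/* Hypothetical in-flight crypto request whose callback runs later. */
struct crypto_job {
	struct policy *sp;
};

/* Take a reference before handing the policy to asynchronous processing. */
void
job_dispatch(struct crypto_job *job, struct policy *sp)
{
	policy_ref(sp);
	job->sp = sp;
}

/* Callback side: the policy is still valid here; drop the reference last. */
void
job_done(struct crypto_job *job)
{
	policy_unref(job->sp);
	job->sp = NULL;
}
```

If the creator drops its reference while a job is in flight, the policy
survives until job_done releases the last reference.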
---
Don't make SAs expired on tests that delete SAs explicitly
---
Fix a debug message
---
Dedup error paths (NFC)
---
Use pool to allocate tdb_crypto

For ESP and AH, we need to allocate an extra variable space in addition
to struct tdb_crypto. The fixed size of pool items may be larger than
an actual requisite size of a buffer, but still the performance
improvement by replacing malloc with pool wins.
---
Don't use unstable isr->sav for header size calculations

We may need to optimize to not look up sav here for users that
don't need to know an exact size of headers (e.g., TCP segment size
calculation).
---
Don't use sp->req->sav when handling NAT-T ESP fragmentation

In order to do this we need to look up a sav; however, an additional
look-up degrades performance. A sav is later looked up in
ipsec4_process_packet, so delay the fragmentation check until then
to avoid an extra look-up.
---
Don't use key_lookup_sp that depends on unstable sp->req->sav

It provided a fast look-up of SP. We will provide an alternative
method in the future (after basic MP-ification finishes).
---
Stop setting isr->sav on looking up sav in key_checkrequest
---
Remove ipsecrequest#sav
---
Stop setting mtag of PACKET_TAG_IPSEC_IN_DONE because there are no users anymore
---
Skip ipsec_spi_*_*_preferred_new_timeout when running on qemu

Probably due to PR 43997
---
Add localcount to rump kernels
---
Remove unused macro
---
Fix key_getcomb_setlifetime

The fix adjusts a soft limit to be 80% of a corresponding hard limit.

I'm not sure the fix is really correct though, at least the original
code is wrong. A passed comb is zero-cleared before calling
key_getcomb_setlifetime, so
comb->sadb_comb_soft_addtime = comb->sadb_comb_soft_addtime * 80 / 100;
is meaningless.
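
A small sketch of why the old expression has no effect on a zero-cleared
structure, and what deriving the soft limit from the hard limit looks like.
struct comb_lifetime and both function names are hypothetical; only the
80% rule and the quoted expression come from the message above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the lifetime fields of a comb. */
struct comb_lifetime {
	uint64_t hard_addtime;	/* hard limit in seconds */
	uint64_t soft_addtime;	/* soft limit in seconds */
};

/* Old pattern: scales the soft limit by itself, so 0 stays 0. */
void
setlifetime_old(struct comb_lifetime *c, uint64_t hard)
{
	c->hard_addtime = hard;
	c->soft_addtime = c->soft_addtime * 80 / 100;
}

/* Fixed pattern: the soft limit is 80% of the corresponding hard limit. */
void
setlifetime_new(struct comb_lifetime *c, uint64_t hard)
{
	c->hard_addtime = hard;
	c->soft_addtime = c->hard_addtime * 80 / 100;
}
```

With a zero-cleared structure and a hard limit of 86400 seconds, the old
pattern leaves the soft limit at 0 while the fixed one yields 69120.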
---
Provide and apply key_sp_refcnt (NFC)

It simplifies further changes.
---
Fix indentation

Pointed out by knakahara@
---
Use pslist(9) for sptree
---
Don't acquire global locks for IPsec if NET_MPSAFE

Note that the change is just to make testing easy and IPsec isn't MP-safe yet.
---
Let PF_KEY socks hold their own lock instead of softnet_lock

Operations on SAD and SPD are executed via PF_KEY socks. The operations
include deletions of SAs and SPs that will use synchronization mechanisms
such as pserialize_perform to wait for references to SAs and SPs to be
released. It is known that using such mechanisms while holding softnet_lock
causes a deadlock. We should avoid the situation.
---
Make IPsec SPD MP-safe

We use localcount(9), not psref(9), to make the sptree and secpolicy (SP)
entries MP-safe because SPs need to be referenced over opencrypto
processing that executes a callback in a different context.

SPs on sockets aren't managed by the sptree and can be destroyed in softint.
localcount_drain cannot be used in softint so we delay the destruction of
such SPs to a thread context. To do so, a list to manage such SPs is added
(key_socksplist) and key_timehandler_spd deletes dead SPs in the list.

For more details please read the locking notes in key.c.

Proposed on tech-kern@ and tech-net@
---
Fix updating ipsec_used

- key_update_used wasn't called in key_api_spddelete2 and key_api_spdflush
- key_update_used wasn't called if an SP had been added/deleted but
a reply to userland failed
---
Fix updating ipsec_used; turn on when SPs on sockets are added
---
Add missing IPsec policy checks to icmp6_rip6_input

icmp6_rip6_input is quite similar to rip6_input and the same checks exist
in rip6_input.
---
Add test cases for setsockopt(IP_IPSEC_POLICY)
---
Don't use KEY_NEWSP for dummy SP entries

With this change, KEY_NEWSP is no longer called from softint,
so we can use kmem_zalloc with KM_SLEEP for KEY_NEWSP.
---
Comment out unused functions
---
Add test cases that there are SPs but no relevant SAs
---
Don't allow sav->lft_c to be NULL

lft_c of an sav that was created by SADB_GETSPI could be NULL.
---
Clean up clunky eval strings

- Remove unnecessary \ at EOL
- This allows omitting ; too
- Remove unnecessary quotes for arguments of atf_set
- Don't expand $DEBUG in eval
- We expect it's expanded on execution

Suggested by kre@
---
Remove unnecessary KEY_FREESAV in an error path

sav should be freed (unreferenced) by the caller.
---
Use pslist(9) for sahtree
---
Use pslist(9) for sah->savtree
---
Rename local variable newsah to sah

It may not be new.
---
MP-ify SAD slightly

- Introduce key_sa_mtx and use it for some list operations
- Use pserialize for some list iterations
---
Introduce KEY_SA_UNREF and replace KEY_FREESAV with it where sav will never be actually freed in the future

KEY_SA_UNREF is still key_freesav so no functional change for now.

This change reduces diff of further changes.
---
Remove out-of-date log output

Pointed out by riastradh@
---
Use KDASSERT instead of KASSERT for mutex_ownable

Because mutex_ownable is too heavy to run in a fast path
even for DIAGNOSTIC + LOCKDEBUG.

Suggested by riastradh@
---
Assemble global lists and related locks into cache lines (NFCI)

Also rename variable names from *tree to *list because they are
just lists, not trees.

Suggested by riastradh@
---
Move locking notes
---
Update the locking notes

- Add locking order
- Add locking notes for misc lists such as reglist
- Mention pserialize, key_sp_ref and key_sp_unref on SP operations

Requested by riastradh@
---
Describe constraints of key_sp_ref and key_sp_unref

Requested by riastradh@
---
Hold key_sad.lock on SAVLIST_WRITER_INSERT_TAIL
---
Add __read_mostly to key_psz

Suggested by riastradh@
---
Tweak wording (pserialize critical section => pserialize read section)

Suggested by riastradh@
---
Add missing mutex_exit
---
Fix setkey -D -P outputs

The outputs were tweaked (by me), but I forgot updating libipsec
in my local ATF environment...
---
MP-ify SAD (key_sad.sahlist and sah entries)

localcount(9) is used to protect key_sad.sahlist and sah entries
as well as SPD (and will be used for SAD sav).

Please read the locking notes of SAD for more details.
---
Introduce key_sa_refcnt and replace sav->refcnt with it (NFC)
---
Destroy sav only in the loop for DEAD sav
---
Fix KASSERT(solocked(sb->sb_so)) failure in sbappendaddr that is called eventually from key_sendup_mbuf

If key_sendup_mbuf isn't passed a socket, the assertion fails.
Originally in this case sb->sb_so was softnet_lock and callers
held softnet_lock so the assertion was magically satisfied.
Now sb->sb_so is key_so_mtx and also softnet_lock isn't always
held by callers so the assertion can fail.

Fix it by holding key_so_mtx if key_sendup_mbuf isn't passed a socket.

Reported by knakahara@
Tested by knakahara@ and ozaki-r@
---
Fix locking notes of SAD
---
Fix deadlock between key_sendup_mbuf called from key_acquire and localcount_drain

If we call key_sendup_mbuf from key_acquire that is called on packet
processing, a deadlock can happen like this:
- At key_acquire, a reference to an SP (and an SA) is held
- key_sendup_mbuf will try to take key_so_mtx
- Some other thread may try to localcount_drain the SP while
holding key_so_mtx in, say, key_api_spdflush
- In this case localcount_drain never returns because key_sendup_mbuf,
which is stuck on key_so_mtx, never releases its reference to the SP

Fix the deadlock by deferring key_sendup_mbuf to the timer
(key_timehandler).
---
Fix that prev isn't cleared on retry
---
Limit the number of mbufs queued for deferred key_sendup_mbuf

Hundreds of mbufs can easily be queued on the list under heavy
network load.
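
A bounded deferral queue of this kind might look like the sketch below.
Everything here is an assumption for illustration: the names, the limit
of 4, and the policy of rejecting (and counting) new entries once the
limit is reached are not taken from the actual change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEFERRED_MAX	4	/* hypothetical limit, chosen for the example */

/* Hypothetical stand-in for a queued mbuf. */
struct msg {
	struct msg *next;
	int id;
};

struct deferred_queue {
	struct msg *head;
	struct msg **tailp;	/* points at the last next pointer */
	size_t len;
	size_t dropped;		/* entries rejected because the queue was full */
};

void
dq_init(struct deferred_queue *q)
{
	q->head = NULL;
	q->tailp = &q->head;
	q->len = 0;
	q->dropped = 0;
}

/* Queue m for later delivery; refuse it if the limit is reached. */
bool
dq_enqueue(struct deferred_queue *q, struct msg *m)
{
	if (q->len >= DEFERRED_MAX) {
		q->dropped++;
		return false;
	}
	m->next = NULL;
	*q->tailp = m;
	q->tailp = &m->next;
	q->len++;
	return true;
}

/* Called from the (hypothetical) timer: take the oldest entry, or NULL. */
struct msg *
dq_dequeue(struct deferred_queue *q)
{
	struct msg *m = q->head;

	if (m == NULL)
		return NULL;
	q->head = m->next;
	if (q->head == NULL)
		q->tailp = &q->head;
	q->len--;
	return m;
}
```

The producer stays cheap on the packet path, and the timer drains the
queue in FIFO order without the list growing unbounded under load.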
---
MP-ify SAD (savlist)

localcount(9) is used to protect savlist of sah. The basic design is
similar to MP-ifications of SPD and SAD sahlist. Please read the
locking notes of SAD for more details.
---
Simplify ipsec_reinject_ipstack (NFC)
---
Add per-CPU rtcache to ipsec_reinject_ipstack

It reduces route lookups and also reduces rtcache lock contentions
when NET_MPSAFE is enabled.
---
Use pool_cache(9) instead of pool(9) for tdb_crypto objects

The change improves network throughput especially on multi-core systems.
---
Update

ipsec(4), opencrypto(9) and vlan(4) are now MP-safe.
---
Write known issues on scalability
---
Share a global dummy SP between PCBs

It is never changed, so it can be pre-allocated and shared safely between PCBs.
---
Fix race condition on the rawcb list shared by rtsock and keysock

keysock now protects itself by its own mutex, which means that
the rawcb list is protected by two different mutexes (keysock's one
and softnet_lock for rtsock), which of course provides no protection.

Fix the situation by having a discrete rawcb list for each.
---
Use a dedicated mutex for rt_rawcb instead of softnet_lock if NET_MPSAFE
---
fix localcount leak in sav. fixed by ozaki-r@n.o.

I commit on behalf of him.
---
remove unnecessary comment.
---
Fix deadlock between pserialize_perform and localcount_drain

A typical usage of localcount_drain looks like this:

mutex_enter(&mtx);
item = remove_from_list();
pserialize_perform(psz);
localcount_drain(&item->localcount, &cv, &mtx);
mutex_exit(&mtx);

This sequence can cause a deadlock, for example in the following
situation:

- Thread A calls localcount_drain, which calls xc_broadcast after releasing
the specified mutex
- Thread B enters the sequence and calls pserialize_perform while holding
the mutex; pserialize_perform also calls xc_broadcast
- Thread C (xc_thread), which runs the xcall callback of localcount_drain,
tries to hold the mutex

xc_broadcast of thread B doesn't start until xc_broadcast of thread A
finishes, which is a feature of xcall(9). This means that pserialize_perform
never completes until xc_broadcast of thread A finishes. On the other hand,
thread C, which is a callee of xc_broadcast of thread A, is stuck on the mutex.
Finally the threads block each other (A blocks B, B blocks C and C blocks A).

A possible fix is to serialize executions of the above sequence by another
mutex, but adding another mutex makes the code complex, so fix the deadlock
another way: release the mutex before pserialize_perform
and instead use a condvar to prevent pserialize_perform from being called
simultaneously.

Note that the deadlock happens only if NET_MPSAFE is enabled.
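
The shape of that fix can be sketched in userland with POSIX threads.
This is only an analogy under stated assumptions: fake_perform stands in
for pserialize_perform, the comment marks where a localcount_drain-like
wait would go, and struct remover with all its fields and functions is
hypothetical. The kernel code differs in detail.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stdlib.h>

struct remover {
	pthread_mutex_t mtx;		/* protects the list and perform_busy */
	pthread_cond_t cv;		/* signalled when perform_busy clears */
	bool perform_busy;		/* someone is inside the perform step */
	pthread_mutex_t stat_mtx;	/* protects the counters below */
	int in_perform, max_in_perform, removed;
};

void
remover_init(struct remover *r)
{
	pthread_mutex_init(&r->mtx, NULL);
	pthread_cond_init(&r->cv, NULL);
	pthread_mutex_init(&r->stat_mtx, NULL);
	r->perform_busy = false;
	r->in_perform = r->max_in_perform = r->removed = 0;
}

/* Stand-in for pserialize_perform(); records how many callers overlap. */
static void
fake_perform(struct remover *r)
{
	pthread_mutex_lock(&r->stat_mtx);
	if (++r->in_perform > r->max_in_perform)
		r->max_in_perform = r->in_perform;
	pthread_mutex_unlock(&r->stat_mtx);
	sched_yield();
	pthread_mutex_lock(&r->stat_mtx);
	r->in_perform--;
	pthread_mutex_unlock(&r->stat_mtx);
}

void
remove_item(struct remover *r)
{
	pthread_mutex_lock(&r->mtx);
	/* (unlink the item from its list here, under the mutex) */
	while (r->perform_busy)
		pthread_cond_wait(&r->cv, &r->mtx);
	r->perform_busy = true;
	pthread_mutex_unlock(&r->mtx);	/* drop the mutex before performing */

	fake_perform(r);

	pthread_mutex_lock(&r->mtx);
	r->perform_busy = false;
	pthread_cond_broadcast(&r->cv);
	/* (a localcount_drain-like wait would run here, mutex held) */
	r->removed++;
	pthread_mutex_unlock(&r->mtx);
}

static void *
remover_thread(void *arg)
{
	remove_item(arg);
	return NULL;
}

/* Run n concurrent removals (n <= 8); return the peak overlap observed. */
int
run_removers(struct remover *r, int n)
{
	pthread_t t[8];

	assert(n >= 0 && n <= 8);
	for (int i = 0; i < n; i++)
		if (pthread_create(&t[i], NULL, remover_thread, r) != 0)
			abort();
	for (int i = 0; i < n; i++)
		pthread_join(t[i], NULL);
	return r->max_in_perform;
}
```

The perform_busy flag plus the condvar serializes the perform step without
holding the mutex across it, so a callback that needs the mutex is never
blocked by a thread waiting inside the perform step.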
---
Add missing ifdef NET_MPSAFE
---
Take softnet_lock on pr_input properly if NET_MPSAFE

Currently softnet_lock is taken unnecessarily in some cases, e.g.,
icmp_input and encap4_input from ip_input, or not taken even if needed,
e.g., udp_input and tcp_input from ipsec4_common_input_cb. Fix them.

NFC if NET_MPSAFE is disabled (default).
---
- sanitize key debugging so that we don't print extra newlines or unassociated
debugging messages.
- remove unused functions and make internal ones static
- print information in one line per message
---
humanize printing of ip addresses
---
cast reduction, NFC.
---
Fix typo in comment
---
Pull out ipsec_fill_saidx_bymbuf (NFC)
---
Don't abuse key_checkrequest just for looking up sav

It does more than expected, for example key_acquire.
---
Fix SP is broken on transport mode

isr->saidx was modified accidentally in ipsec_nextisr.

Reported by christos@
Helped investigations by christos@ and knakahara@
---
Constify isr at many places (NFC)
---
Include socketvar.h for softnet_lock
---
Fix buffer length for ipsec_logsastr
 1.151.2.6  26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.151.2.5  28-Jul-2018  pgoyette Sync with HEAD
 1.151.2.4  21-May-2018  pgoyette Sync with HEAD
 1.151.2.3  02-May-2018  pgoyette Synch with HEAD
 1.151.2.2  22-Apr-2018  pgoyette Sync with HEAD
 1.151.2.1  07-Apr-2018  pgoyette Sync with HEAD. 77 conflicts resolved - all of them $NetBSD$
 1.164.2.2  13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.164.2.1  10-Jun-2019  christos Sync with HEAD
 1.177.2.1  20-Jul-2024  martin Pull up following revision(s) (requested by rin in ticket #740):

sys/netipsec/ipsec_input.c: revision 1.79
sys/netipsec/ipsec_output.c: revision 1.86
sys/netipsec/ipsec.c: revision 1.178
sys/netinet6/ip6_output.c: revision 1.232

ipsec: remove unnecessary splsoftnet

Because the code of IPsec itself is already MP-safe.
