History log of /src/sys/netinet6/icmp6.c
Revision  Date  Author  Comments
 1.258  05-Jul-2024  rin sys: Drop redundant NULL check before m_freem(9)

m_freem(9) has safely accepted a NULL argument at least since 4.2BSD:
https://www.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/src/sys/sys/uipc_mbuf.c

Compile-tested on amd64/ALL.

Suggested by knakahara@
 1.257  29-Jun-2024  riastradh netinet6: Use _NET_STAT* API instead of direct array access.

XXX Exception: ip6flow_addstats_rt _assigns_ one of the `statistics'
to the current count of ip6 flows in use, and we don't have anything
in the _NET_STAT* API for that. So for now I abuse the abstraction,
until we sort out this one exceptional case properly.

PR kern/58380
 1.256  24-Feb-2024  mlelstv Deliver timestamps also to raw sockets.
Fixes PR 57955
 1.255  09-Dec-2023  pgoyette Modularize the COMPAT_90 code that resulted from the removal of
netinet6/nd6 from the kernel. Now, the minimal compat code can
be successfully loaded and unloaded along with the rest of the
COMPAT_90 code.

XXX pullup-10 - hopefully before RC2
 1.254  28-Oct-2022  ozaki-r branches: 1.254.2;
inpcb: separate inpcb again to reduce the size of PCB for IPv4

The data size of PCB for IPv4 increased because of the merge of
struct in6pcb. The change decreases the size to the original size by
separating struct inpcb (again). struct in4pcb and in6pcb that embed
struct inpcb are introduced.

Even after the separation, users don't need to realize the separation
and only have to use some macros to access dedicated data. For example,
inp->inp_laddr is now accessed through in4p_laddr(inp).
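The accessor-macro approach described above can be sketched in userland C. This is only an illustration under stated assumptions: the demo_* structures and macros are hypothetical, heavily simplified stand-ins, not the kernel definitions. The point is that family-specific data lives in a wrapper struct that embeds the common part as its first member, so callers keep passing the common pointer around.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the common PCB part. */
struct demo_inpcb {
	int      inp_af;     /* address family tag: 4 or 6 */
	uint16_t inp_lport;  /* field shared by both families */
};

/* IPv4 wrapper: the common part must be the first member. */
struct demo_in4pcb {
	struct demo_inpcb in4p_pcb;
	uint32_t          in4p_laddr;      /* IPv4-only data */
};

/* IPv6 wrapper: larger, but IPv4 PCBs no longer pay for it. */
struct demo_in6pcb {
	struct demo_inpcb in6p_pcb;
	uint8_t           in6p_laddr[16];  /* IPv6-only data */
};

/* Accessor macros: callers hold a struct demo_inpcb * and never cast. */
#define demo_in4p_laddr(inp) (((struct demo_in4pcb *)(inp))->in4p_laddr)
#define demo_in6p_laddr(inp) (((struct demo_in6pcb *)(inp))->in6p_laddr)

/* Example caller that only ever sees the common pointer type. */
static uint32_t
demo_get_v4_laddr(struct demo_inpcb *inp)
{
	assert(inp->inp_af == 4);  /* the macro is only valid for IPv4 PCBs */
	return demo_in4p_laddr(inp);
}
```

A caller would allocate a struct demo_in4pcb, hand out &p4.in4p_pcb, and read or write the address only through demo_in4p_laddr().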
 1.253  28-Oct-2022  ozaki-r inpcb: integrate data structures of PCB into one

Data structures of network protocol control blocks (PCBs), i.e.,
struct inpcb, in6pcb and inpcb_hdr, are not organized well. Users of
the data structures have to handle them separately and thus the code
is cluttered and duplicated.

The commit integrates the data structures into one, struct inpcb. As a
result, users of PCBs only have to handle just one data structure, so
the code becomes simple.

One drawback is that the data size of PCB for IPv4 increases by 40 bytes
(from 248 bytes to 288 bytes).
 1.252  29-Aug-2022  knakahara Add a sysctl entry to control whether a routing message is sent for RTM_DYNAMIC.

Some routing daemons require such routing messages to stay coherent.

To have the kernel send such messages, set net.inet.icmp.dynamic_rt_msg=1
for IPv4 and net.inet6.icmp6.dynamic_rt_msg=1 for IPv6.
The default (0) preserves the previous behavior: no such routing message is sent.
 1.251  22-Aug-2022  knakahara Add a sysctl entry to enable or disable path MTU discovery for ICMPv6 reflection.

To use path MTU discovery when reflecting ICMPv6 messages, set
net.inet6.icmp6.reflect_pmtu=1. The default (0) preserves the previous
behavior, which is to use IPV6_MINMTU.
 1.250  19-Feb-2021  christos - Make ALIGNED_POINTER use __alignof(t) instead of sizeof(t). This is more
correct because it works with non-primitive types and provides the ABI
alignment for the type the compiler will use.
- Remove all the *_HDR_ALIGNMENT macros and asserts
- Replace POINTER_ALIGNED_P with ACCESSIBLE_POINTER which is identical to
ALIGNED_POINTER, but returns that the pointer is always aligned if the
CPU supports unaligned accesses.
[ as proposed in tech-kern ]
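Why __alignof(t) is more correct than sizeof(t) shows up with a non-primitive type whose size is not a power of two. The sketch below is a userland illustration only: the DEMO_* macros and struct demo_hdr are hypothetical and are not the kernel's ALIGNED_POINTER. It also assumes an ABI where uint32_t is 4-byte aligned, which the static_assert makes explicit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical header type: 12 bytes long, but only 4-byte aligned. */
struct demo_hdr {
	uint32_t a, b, c;
};

/* Assumption made explicit: holds on common ABIs, not guaranteed everywhere. */
static_assert(sizeof(struct demo_hdr) == 12 && _Alignof(struct demo_hdr) == 4,
    "demo assumes a 12-byte struct with 4-byte alignment");

/* Size-based check: the mask is sizeof(t) - 1, which is 11 here. */
#define DEMO_ALIGNED_BY_SIZE(p, t) \
	((((uintptr_t)(p)) & (sizeof(t) - 1)) == 0)

/* Alignment-based check: the mask is _Alignof(t) - 1, which is 3 here. */
#define DEMO_ALIGNED_BY_ALIGNOF(p, t) \
	((((uintptr_t)(p)) & (_Alignof(t) - 1)) == 0)

/* Convenience wrapper around the alignment-based check. */
static bool
demo_hdr_is_aligned(const void *p)
{
	return DEMO_ALIGNED_BY_ALIGNOF(p, struct demo_hdr);
}
```

With these definitions the size-based check rejects address 8 even though a 4-byte aligned struct can live there, while the alignment-based check accepts it.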
 1.249  15-Feb-2021  martin Fix the build.
Maybe there should be a ICMP6_HDR_ALIGNMENT, but for now there is
only IP6_HDR_ALIGNMENT.
 1.248  14-Feb-2021  christos - centralize header align and pullup into a single inline function
- use a single macro to align pointers and expose the alignment, instead
of hard-coding 3 in 1/2 the macros.
- fix an issue in the ipv6 l2tp where it was aligning for ipv4 and pulling
for ipv6.
 1.247  11-Sep-2020  roy branches: 1.247.2;
inet6: Use generic Neighbor Detection rather than IPv6 specific

No functional change intended.
 1.246  27-Jul-2020  roy icmp6: Remove __packed attribute from icmp6 structures

They should naturally align.
Add compile time assertions to icmp6.c to prove this.
 1.245  12-Jun-2020  roy Remove in-kernel handling of Router Advertisements

This is much better handled by a user-land tool.
Proposed on tech-net here:
https://mail-index.netbsd.org/tech-net/2020/04/22/msg007766.html

Note that the ioctl SIOCGIFINFO_IN6 no longer sets flags. That now
needs to be done using the pre-existing SIOCSIFINFO_FLAGS ioctl.

Compat is fully provided where it makes sense, but trying to turn on
RA handling will obviously throw an error as it no longer exists.

Note that if you use IPv6 temporary addresses, this now needs to be
turned on in dhcpcd.conf(5) rather than in sysctl.conf(5).
 1.244  09-Mar-2020  roy route: RTM_MISS now puts the message source address in RTA_AUTHOR

route(8) also reports this.
A userland app could use this to blacklist nodes that probe for machines
that don't exist on a subnet / prefix.
 1.243  06-Oct-2019  uwe icmp6_notify_error - fix ctlfunc typedef to match pr_ctlinput,
drop the cast that is no longer necessary.
 1.242  22-Dec-2018  maxv branches: 1.242.4;
Replace: M_COPY_PKTHDR -> m_copy_pkthdr. No functional change, since the
former is a macro to the latter.
 1.241  22-Dec-2018  maxv Replace: M_MOVE_PKTHDR -> m_move_pkthdr. No functional change, since the
former is a macro to the latter.
 1.240  25-Oct-2018  ozaki-r Remove a leftover debug printf

Pointed out by hannken@
 1.239  03-Sep-2018  riastradh Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
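The truncation that motivated the rename can be reproduced in a few lines of userland C. This is a sketch under assumptions: demo_uimin and DEMO_MIN are hypothetical stand-ins that only mirror the shape of the unsigned-int helper and of the type-generic macro, and the example assumes a 32-bit unsigned int.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(unsigned int) == 4, "demo assumes 32-bit unsigned int");

/* Same shape as an unsigned-int-only helper: wider arguments are narrowed. */
static unsigned int
demo_uimin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Type-generic macro: may evaluate arguments twice, but never narrows. */
#define DEMO_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Clamp a 64-bit length through the unsigned-int helper (silently narrows). */
static uint64_t
demo_clamp_truncating(uint64_t len, uint64_t limit)
{
	return demo_uimin(len, limit);  /* implicit conversion to unsigned int */
}

/* Clamp a 64-bit length with the macro (keeps all 64 bits). */
static uint64_t
demo_clamp_exact(uint64_t len, uint64_t limit)
{
	return DEMO_MIN(len, limit);
}
```

For len = 2^32 + 5 and limit = 100, the first function compares 5 against 100 and returns 5, while the second returns 100. The name uimin at least makes the narrowing visible at the call site.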
 1.238  01-Jun-2018  ozaki-r branches: 1.238.2;
Fix _rt_free via rtrequest(RTM_DELETE) hangs in rt_timer handlers

A rt_timer handler is passed a rtentry with an extra reference that prevents the
rtentry from being accidentally released. So rt_timer handlers must release the reference
to the passed rtentry themselves (but they didn't).
 1.237  07-May-2018  maxv Remove misleading comments.
 1.236  01-May-2018  maxv Remove now unused net_osdep.h includes, the other BSDs did the same.
 1.235  29-Apr-2018  maxv Replace
m_copym(m, 0, M_COPYALL, M_DONTWAIT)
by
m_copypacket(m, M_DONTWAIT)
when it is obvious that 'm' has M_PKTHDR set.
 1.234  28-Apr-2018  maxv Remove unused ipsec_var.h includes.
 1.233  27-Apr-2018  maxv Fix a bug introduced in rev1.154 (2009). mcl_cache still has a size of
MCLBYTES, so the area allocated is still too small.

I think it should have been MEXTMALLOC, and of course I can't test my
change.
 1.232  26-Apr-2018  maxv Stop using m_copy(), use m_copym() directly. m_copy is useless,
undocumented and confusing.
 1.231  26-Apr-2018  maxv Use M_UNWRITABLE, no functional change.
 1.230  14-Apr-2018  maxv Fix 'icmp6len', it shouldn't be ip6_plen, because we may not be at the
beginning of the packet (off+ip6_plen is beyond the end of the mbuf). By
luck, the IP6_EXTHDR_GET that follows will fail and prevent buffer
overflows in non-jumbogram packets.

For jumbograms we will probably be in trouble here; but it doesn't seem
possible to reliably craft a jumbogram for a non-jumbogram-enabled device.

So I don't think it's a huge problem.
 1.229  14-Apr-2018  maxv Cosmetic, and remove one XXX (no problem).
 1.228  14-Apr-2018  maxv Remove the RH0 code from ICMPv6. RH0 is deprecated by RFC5095 (2007) for
security reasons. We already removed it in Route6.

In addition there was an mbuf bug here: calling IP6_EXTHDR_GET twice with
the same offset, but still using the pointer from the first call, which
could have been made invalid. By luck, m_pulldown leaves zero-sized mbufs
in place, instead of freeing them.

And in general, using a 'finaldst' pointer on the mbuf, and then modifying
that mbuf with IP6_EXTHDR_GET with a smaller offset, was really error-
prone.
 1.227  14-Apr-2018  maxv Remove dead code. It is the same as the non-obsolete one, since
ICMP6_DST_UNREACH_NOTNEIGHBOR == ICMP6_DST_UNREACH_BEYONDSCOPE,
and the code leads to the same errno value (EHOSTUNREACH).
 1.226  12-Apr-2018  maxv Synchronize the code between raw_ip6.c<->icmp6.c<->raw_ip.c, so that it is
the same everywhere.
 1.225  12-Apr-2018  maxv Remove misleading comment; we're just checking the SP, not verifying the
AH/ESP payload. While here style a bit.
 1.224  21-Mar-2018  roy Sprinkle more soroverflow().
 1.223  28-Feb-2018  maxv branches: 1.223.2;
Remove unused ipsec_private.h includes.
 1.222  26-Feb-2018  maxv Remove redundant condition (harmless). PR/53030.
 1.221  26-Feb-2018  maxv Dedup: merge ipsec4_in_reject and ipsec6_in_reject into ipsec_in_reject.
While here fix misleading comment.

ok ozaki-r@
 1.220  12-Feb-2018  maxv Replace bcopy -> memcpy when it is obvious that the areas don't overlap.
Rearrange ip6_splithdr() for clarity.
 1.219  23-Jan-2018  maxv Style, localify, remove XXX when there's no issue, and switch 'extra'
to int.
 1.218  23-Jan-2018  maxv Fix the check on 'maxlen', we are not creating struct icmp6_hdr but
struct nd_redirect (which is bigger). Also, make sure we can add a
struct nd_opt_rd_hdr.

Normally this doesn't change anything, since the mbuf has IPV6_MMTU
bytes, and it's always way bigger than what we need.
 1.217  23-Jan-2018  maxv Fix info leak. We are allocating a slot of size:

roundup(sizeof(*nd_opt) + ifp->if_addrlen, 8)

But we are not filling in the padding caused by the roundup, and therefore
several bytes are leaked, in the mbuf we're about to send to the network.
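The class of bug described above, and one way to close it, can be sketched in userland C. Everything here is hypothetical (the demo_* names, the two-byte option header, the buffer handling); it is not the code from icmp6.c. The idea is simply that the whole rounded-up slot is zeroed before the header and address are written, so the padding never carries stale memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ROUNDUP(x, y) ((((x) + ((y) - 1)) / (y)) * (y))

/* Hypothetical option header: a type byte and a length in 8-byte units. */
struct demo_opt_hdr {
	uint8_t opt_type;
	uint8_t opt_len;
};

/*
 * Write an option carrying a link-layer address into buf.
 * Returns the number of bytes used, or 0 if the option does not fit.
 */
static size_t
demo_put_lladdr_opt(uint8_t *buf, size_t buflen, uint8_t type,
    const uint8_t *lladdr, size_t addrlen)
{
	size_t used = sizeof(struct demo_opt_hdr) + addrlen;
	size_t len = DEMO_ROUNDUP(used, 8);

	assert(buf != NULL && lladdr != NULL);
	if (len > buflen || len / 8 > UINT8_MAX)
		return 0;

	memset(buf, 0, len);  /* clears the roundup padding as well */
	buf[0] = type;
	buf[1] = (uint8_t)(len / 8);
	memcpy(buf + sizeof(struct demo_opt_hdr), lladdr, addrlen);
	return len;
}
```

With an 8-byte address, used is 10 and len is 16, so bytes 10 through 15 are padding. Without the memset they would keep whatever the buffer held before.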
 1.216  23-Jan-2018  maxv Fix the same mistake twice: 'last' can't be null, so there's no point in
having this misleading branch.
 1.215  23-Jan-2018  maxv Style, and four fixes:

* Remove the (disabled) IPPROTO_ESP check. If the packet was decrypted it
will have M_DECRYPTED, and this is already checked.

* Memory leaks in icmp6_error2. They seem hardly triggerable.

* Fix miscomputation in _icmp6_input, the ICMP6 header is not guaranteed
to be located right after the IP6 header. ok mlelstv@

* Memory leak in _icmp6_input. This one seems to be impossible to trigger.
 1.214  05-Nov-2017  ozaki-r Fix usages of ipsec_used

If IPsec isn't used, we must go back to the normal path.

PR kern/52659
 1.213  02-Aug-2017  ozaki-r Add missing IPsec policy checks to icmp6_rip6_input

icmp6_rip6_input is quite similar to rip6_input and the same checks exist
in rip6_input.
 1.212  07-Jul-2017  knakahara fix PR kern/52353. implemented by ozaki-r@n.o. I just commit by proxy.

XXX need to pullup to -8.
 1.211  14-Mar-2017  ozaki-r branches: 1.211.6;
Replace DIAGNOSTIC + panic with CTASSERT
 1.210  17-Feb-2017  ozaki-r Rename if_acquire_NOMPSAFE to if_acquire

It can be used in MP-safe ways. So let's remove the confusing postfix.
If it's used in an unsafe way, warn NOMPSAFE in a comment.
 1.209  13-Feb-2017  ozaki-r Protect mtudisc and redirect stuffs of icmp/icmp6 with mutex

We have to run pr_init of icmp and icmp6 prior to tcp and tcp6 ones
for mutex initialization.
 1.208  07-Feb-2017  ozaki-r Add missing NULL checks for m_get_rcvif
 1.207  02-Feb-2017  ozaki-r Defer some pr_input to workqueue

pr_input is currently called in softint. Some pr_input such as ICMP, ICMPv6
and CARP can add/delete/update IP addresses and routing table entries. For
example, icmp6_redirect_input updates a routing table entry and
nd6_ra_input may delete an IP address.

Basically such operations shouldn't be done in softint. That aside, we have
a reason to avoid the situation; psz/psref waits cannot be used in softint,
however they are required to work in such pr_input in the MP-safe world.

The change implements the workqueue pr_input framework called wqinput which
provides a means to defer pr_input of a protocol to workqueue easily.
Currently icmp_input, icmp6_input, carp_proto_input and carp6_proto_input
are deferred to workqueue by the framework.

Proposed and discussed on tech-kern and tech-net
 1.206  16-Jan-2017  christos ip6_sprintf -> IN6_PRINT so that we pass the size.
 1.205  16-Jan-2017  ryo Make ip6_sprintf(), in_fmtaddr(), lla_snprintf() and icmp6_redirect_diag() mpsafe.

Reviewed by ozaki-r@
 1.204  13-Jan-2017  ozaki-r branches: 1.204.2;
Tweak icmp6_input; always use off, not *offp
 1.203  12-Dec-2016  ozaki-r Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview
 1.202  11-Dec-2016  ozaki-r Correct sanity checks of icmp6_redirect_output

- rt->rt_ifp is always non-NULL
- Checking RTF_UP here is just racy and meaningless
- The arguments should be non-NULL (at least for now)
 1.201  15-Nov-2016  mlelstv Enforce alignment requirements that are violated in some cases.
For machines that don't need strict alignment (i386,amd64,vax,m68k) this
is a no-op.

Fixes PR kern/50766 but should be improved.
 1.200  31-Oct-2016  ozaki-r Fix race condition of in6_selectsrc

in6_selectsrc returned a pointer to in6_addr that wasn't guaranteed to be
safe by pserialize (or psref), which was racy. Let callers pass a pointer
to in6_addr and in6_selectsrc copy a result to it inside pserialize
critical sections.
 1.199  25-Oct-2016  ozaki-r Remove unnecessary argument

No functional change.
 1.198  18-Oct-2016  ozaki-r Remove unnecessary pserialize_read_enter
 1.197  26-Aug-2016  dholland PR 51434 David Binderman: remove redundant test.
 1.196  19-Aug-2016  roy Revert r1.148
IP6_EXTHDR_GET ensures that a icmp6 header can be fetched from the mbuf
so m_pullup does not need to be called.

While here, we can safely increment interface error stats even with an
invalidated mbuf because we have a saved reference to the interface.
 1.195  01-Aug-2016  ozaki-r Apply pserialize and psref to struct ifaddr and its variants

This change makes struct ifaddr and its variants (in_ifaddr and in6_ifaddr)
MP-safe by using pserialize and psref. At this moment, pserialize_perform
and psref_target_destroy are disabled because (1) we don't need them
because of softnet_lock (2) they cause a deadlock because of softnet_lock.
So we'll enable them when we remove softnet_lock in the future.
 1.194  15-Jul-2016  ozaki-r Use sin6tosa and sin6tocsa macros

No functional change.
 1.193  15-Jul-2016  ozaki-r Use ifatoia6 macro

No functional change.
 1.192  07-Jul-2016  ozaki-r branches: 1.192.2;
Switch the address list of interfaces to pslist(9)

As usual, we leave the old list to avoid breaking kvm(3) users.
 1.191  05-Jul-2016  ozaki-r Use ia6 or ia instead of ifa as a variable name of struct in6_ifaddr

We conventionally use ifa for struct ifaddr and use ia6 or ia for
struct in6_ifaddr.

No functional change.
 1.190  28-Jun-2016  ozaki-r Add missing NULL checks for m_get_rcvif_psref
 1.189  21-Jun-2016  ozaki-r Make sure the ifp returned from in6_select* functions is psref-ed

To this end, callers need to pass struct psref to the functions
and the functions acquire a reference of ifp with it. In some cases,
we can simply use if_get_byindex, however, in other cases
(say rt->rt_ifp and ia->ifa_ifp), we have no MP-safe way for now.
In order to take a reference anyway we use non MP-safe function
if_acquire_NOMPSAFE for the latter cases. They should be fixed in
the future somehow.
 1.188  10-Jun-2016  ozaki-r Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored in a mbuf instead of a
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, which is NOT MP-safe and exists for the
transition period, i.e., it is intended to be used in places that are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.
 1.187  10-Jun-2016  ozaki-r Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receiving interface of a mbuf.
These functions are the counterparts of m_get_rcvif, which will come in
another commit; they hide the internals of rcvif operations and reduce the
diff of the upcoming change.

No functional change.
 1.186  18-May-2016  ozaki-r Don't try to get outif unnecessarily from in6_selectsrc

The obtained outif is unused.
 1.185  17-May-2016  ozaki-r Get rcvif once and reuse it

No functional change.
 1.184  17-May-2016  ozaki-r Make sure icmp6_redirect_input frees mbuf before return
 1.183  12-May-2016  ozaki-r Protect ifnet list with psz and psref

The change ensures that ifnet objects in the ifnet list aren't freed during
list iterations by using pserialize(9) and psref(9).

Note that the change adds a pslist(9) for ifnet but doesn't remove the
original ifnet list (ifnet_list) to avoid breaking kvm(3) users. We
shouldn't use the original list in the kernel anymore.
 1.182  04-Apr-2016  ozaki-r Separate nexthop caches from the routing table

By this change, nexthop caches (IP-MAC address pair) are not stored
in the routing table anymore. Instead nexthop caches are stored in
each network interface; we already have lltable/llentry data structure
for this purpose. This change also obsoletes the concept of cloning/cloned
routes. Cloned routes no longer exist, while cloning routes still exist
but are renamed to connected routes.

Noticeable changes are:
- Nexthop caches aren't listed in route show/netstat -r
  - sysctl(NET_RT_DUMP) doesn't return them
  - If RTF_LLDATA is specified, it returns nexthop caches
- Several definitions of routing flags and messages are removed
  - RTF_CLONING, RTF_XRESOLVE, RTF_LLINFO, RTF_CLONED and RTM_RESOLVE
  - RTF_CONNECTED is added
    - It has the same value of RTF_CLONING for backward compatibility
- route's -xresolve, -[no]cloned and -llinfo options are removed
  - -[no]cloning remains because it seems there are users
  - -[no]connected is introduced and recommended
    to be used instead of -[no]cloning
- route show/netstat -r drops some flags
  - 'L' and 'c' are not seen anymore
  - 'C' now indicates a connected route
- Gateway value of a route of an interface address is now not
  a L2 address but "link#N" like a connected (cloning) route
- Proxy ARP: "arp -s ... pub" doesn't create a route

See the diffs under tests/ for details of the behavior changes.

Proposed on tech-net and tech-kern:
http://mail-index.netbsd.org/tech-net/2016/03/11/msg005701.html
 1.181  01-Apr-2016  ozaki-r Remove unnecessary casts and do s/0/NULL/ for rtrequest
 1.180  01-Apr-2016  ozaki-r Refine nd6log

Add __func__ to nd6log itself instead of adding it to callers.
 1.179  21-Jan-2016  riastradh Revert previous: ran cvs commit when I meant cvs diff. Sorry!

Hit up-arrow one too few times.
 1.178  21-Jan-2016  riastradh Give proper prototype to ip_output.
 1.177  14-Sep-2015  ozaki-r Update icmp6_redirect_timeout_q when changing net.inet6.icmp6.redirtimeout

We have to update icmp6_redirect_timeout_q as well as icmp6_redirtimeout
when changing net.inet6.icmp6.redirtimeout via sysctl. The updating logic
is copied from sysctl_net_inet_icmp_redirtimeout.

This change is from s-yamaguchi@IIJ (with KNF by ozaki-r) and fixes
PR kern/50240.
 1.176  31-Aug-2015  ozaki-r Make rt_refcnt take into account rt_timer
 1.175  24-Aug-2015  pooka sprinkle _KERNEL_OPT
 1.174  24-Aug-2015  ozaki-r Change 0 to NULL for rtrequest's last argument (struct rtentry **ret_nrt)
 1.173  07-Aug-2015  ozaki-r Use time_uptime instead of time_second to avoid time leaps

Some code in sys/net* uses time_second to manage time periods such as
cache expirations. However, time_second doesn't increase monotonically
and can leap, say via settimeofday(2), according to time_second(9). We
should use time_uptime instead to avoid such time leaps.

This change replaces time_second with time_uptime. Additionally it
converts a time based on time_uptime to a time based on time_second
when the kernel passes the time to userland programs that expect
the latter, and vice versa.

Note that we shouldn't leak time_uptime to other hosts over the
network. My investigation shows there is no such leak:
http://mail-index.netbsd.org/tech-net/2015/08/06/msg005332.html

Discussed on tech-kern and tech-net.
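The conversion between the two time bases mentioned above can be illustrated with a small userland sketch. The names (struct demo_clock, demo_uptime_to_wall, demo_wall_to_uptime) are hypothetical and only model the arithmetic; they are not the kernel's helpers. Expirations are stored on the monotonic base and translated to wall-clock time only at the userland boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of both clocks, in seconds. */
struct demo_clock {
	int64_t now_wall;    /* wall-clock time; may jump when the clock is set */
	int64_t now_uptime;  /* seconds since boot; increases monotonically */
};

/* Convert an uptime-based timestamp to wall-clock time for userland. */
static int64_t
demo_uptime_to_wall(const struct demo_clock *c, int64_t t_uptime)
{
	assert(c != NULL);
	return t_uptime + (c->now_wall - c->now_uptime);
}

/* Convert a wall-clock timestamp from userland to the uptime base. */
static int64_t
demo_wall_to_uptime(const struct demo_clock *c, int64_t t_wall)
{
	assert(c != NULL);
	return t_wall - (c->now_wall - c->now_uptime);
}
```

An expiry stored as now_uptime + 60 stays 60 seconds away no matter how the wall clock is adjusted in the meantime; only the value reported to userland shifts with the adjustment.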
 1.172  24-Jul-2015  ozaki-r Fix rtfree-ing wrong rtentry
 1.171  17-Jul-2015  ozaki-r Reform use of rt_refcnt

rt_refcnt of rtentry was misused in several ways, for example, direct rt_refcnt++
and rt_refcnt-- outside route.c, the "rt->rt_refcnt++; rtfree(rt);" idiom, and
touching rt after rt->rt_refcnt--.

These abuses seem to be needed because rt_refcnt manages only references
between rtentry and doesn't take care of references during packet processing
(IOW references from local variables). In order to reduce the above abuses,
the latter cases should be counted by rt_refcnt as well as the former cases.

This change improves consistency of use of rt_refcnt:
- rtentry is always accessed with rt_refcnt incremented
- rtentry's rt_refcnt is decremented after use (rtfree is always used instead
of rt_refcnt--)
- functions returning rtentry increment its rt_refcnt (and caller rtfree it)

Note that rt_refcnt prevents rtentry from being freed but doesn't prevent
rtentry from being updated. To become MP-safe, we need to provide another
protection for rtentry, e.g., locks. (Or introduce a better data structure
allowing concurrent readers during updates.)
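The three rules listed above can be modeled with a toy reference count. This is a single-threaded userland sketch with hypothetical demo_* names, not the routing code: a lookup returns the entry with its count already incremented, the caller only touches the entry while holding that reference, and the reference is dropped through one release function rather than a bare decrement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified route entry. */
struct demo_rtentry {
	int  rt_refcnt;  /* number of outstanding references */
	bool rt_freed;   /* stands in for actually freeing the memory */
};

/* Functions returning an entry increment its count on the caller's behalf. */
static struct demo_rtentry *
demo_rt_lookup(struct demo_rtentry *entry)
{
	entry->rt_refcnt++;
	return entry;
}

/* The only way to drop a reference; the entry dies with the last one. */
static void
demo_rtfree(struct demo_rtentry *rt)
{
	assert(rt->rt_refcnt > 0);
	if (--rt->rt_refcnt == 0)
		rt->rt_freed = true;
}
```

As the note above says, a count like this only keeps the entry alive. It does nothing to stop concurrent updates, which need separate protection.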
 1.170  25-Nov-2014  christos branches: 1.170.2;
CID 977389: Out of bounds access.
 1.169  06-Jun-2014  rmind branches: 1.169.2;
- Eliminate RTFREE() macro in favour of rtfree() function.
- Make rtcache() function static.
 1.168  30-May-2014  christos Introduce 2 new variables: ipsec_enabled and ipsec_used.
ipsec_enabled is controlled by sysctl and determines if IPsec is allowed.
ipsec_used is set automatically based on ipsec being enabled, and
rules existing.
 1.167  19-May-2014  rmind - Split off PRU_ATTACH and PRU_DETACH logic into separate functions.
- Replace malloc with kmem and eliminate M_PCB while here.
- Sprinkle more asserts.
 1.166  18-May-2014  rmind Use IFNET_FIRST() rather than open coding ifnet access.
 1.165  25-Feb-2014  pooka branches: 1.165.2;
Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
 1.164  20-Feb-2014  joerg Bail out in case m_pulldown failed.
 1.163  23-Nov-2013  christos convert from CIRCLEQ to TAILQ.
 1.162  05-Jun-2013  christos branches: 1.162.2;
IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.
 1.161  23-Jun-2012  christos branches: 1.161.2;
4 new sysctls to avoid ipv6 DoS attacks from OpenBSD
 1.160  22-Mar-2012  drochner remove KAME IPSEC, replaced by FAST_IPSEC
 1.159  31-Dec-2011  christos branches: 1.159.2; 1.159.6; 1.159.8;
- fix offsetof usage, and redundant defines
- kill pointer casts to 0
 1.158  19-Dec-2011  drochner rename the IPSEC in-kernel CPP variable and config(8) option to
KAME_IPSEC, and make IPSEC define it so that existing kernel
config files work as before
Now the default can easily be changed to FAST_IPSEC just by
setting the IPSEC alias to FAST_IPSEC.
 1.157  31-Aug-2011  plunky branches: 1.157.2; 1.157.6;
NULL does not need a cast
 1.156  12-Sep-2010  drochner avoid NULL dereference in error case
 1.155  18-Oct-2009  christos branches: 1.155.2; 1.155.4;
fix the sun2 case for real.
 1.154  12-Oct-2009  christos unbreak sun2.
 1.153  16-Sep-2009  pooka Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL
 1.152  18-Mar-2009  cegger bzero -> memset
 1.151  18-Mar-2009  cegger bcmp -> memcmp
 1.150  03-Oct-2008  adrianp branches: 1.150.2; 1.150.8;
Fix for CVE-2008-3530 from matt@
Implement improved checking for MTU values on ICMP 'Packet Too Big Messages'
 1.149  06-Aug-2008  plunky Convert socket options code to use a sockopt structure
instead of laying everything into an mbuf.

approved by core
 1.148  07-May-2008  bouyer branches: 1.148.2; 1.148.6;
Sync with ipv4 icmp_input(): make sure the mbuf is writable and
contains the entire icmp message before calling icmp6_input().
Should fix "panic: mbuf too short for IPv6 header" seen by several people.
 1.147  04-May-2008  thorpej Simplify the interface to netstat_sysctl() and allocate space for
the collated counters using kmem_alloc().

PR kern/38577
 1.146  23-Apr-2008  thorpej branches: 1.146.2;
Use <net/net_stats.h> / netstat_sysctl().
 1.145  15-Apr-2008  thorpej branches: 1.145.2;
Make ip6 and icmp6 stats per-cpu.
 1.144  08-Apr-2008  thorpej Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.
 1.143  08-Apr-2008  thorpej Change ICMP6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old icmp6stat structure; old netstat
binaries will continue to work properly.
 1.142  27-Feb-2008  matt Convert to ansi definitions from old-style definitions.
Remember that func() is not ansi, func(void) is.
 1.141  04-Dec-2007  dyoung branches: 1.141.8; 1.141.12;
Use IFNET_FOREACH() and IFADDR_FOREACH().
 1.140  01-Nov-2007  dyoung branches: 1.140.2; 1.140.4;
De-__P().
 1.139  29-Oct-2007  dyoung The IPv6 stack labels incoming packets with an m_tag whose payload
is a struct ip6aux. A struct ip6aux used to contain a pointer to
an in6_ifaddr, but that pointer could become a dangling reference
in the lifetime of the m_tag, because ip6_setdstifaddr() did not
increase the in6_ifaddr's reference count. I have removed the
pointer from ip6aux. I load it with the interesting fields from
the in6_ifaddr (an IPv6 address, a scope ID, and some flags),
instead.
 1.138  24-Oct-2007  dyoung Replace rote sockaddr_in6 initializations (memset(), set sa6_family,
sa6_len, and sa6_add) with sockaddr_in6_init() calls.

De-__P(). Constify. KNF. Shorten a staircase. Change bcmp() to
memcmp().

Extract subroutine in6_setzoneid() from in6_setscope(), for re-use
soon.
 1.137  19-Sep-2007  dyoung branches: 1.137.4;
1) Introduce a new socket option, (SOL_SOCKET, SO_NOHEADER), that
tells a socket that it should both add a protocol header to tx'd
datagrams and remove the header from rx'd datagrams:

int onoff = 1, s = socket(...);
setsockopt(s, SOL_SOCKET, SO_NOHEADER, &onoff, sizeof(onoff));

2) Add an implementation of (SOL_SOCKET, SO_NOHEADER) for raw IPv4
sockets.

3) Reorganize the protocols' pr_ctloutput implementations a bit.
Consistently return ENOPROTOOPT when an option is unsupported,
and EINVAL if a supported option's arguments are incorrect.
Reorganize the flow of code so that it's more clear how/when
options are passed down the stack until they are handled.

Shorten some pr_ctloutput staircases for readability.

4) Extract common mbuf code into subroutines, add new sockaddr
methods, and introduce a new subroutine, fsocreate(), for reuse
later; use it first in sys_socket():

struct mbuf *m_getsombuf(struct socket *so)

Create an mbuf and make its owner the socket `so'.

struct mbuf *m_intopt(struct socket *so, int val)

Create an mbuf, make its owner the socket `so', put the
int `val' into it, and set its length to sizeof(int).


int fsocreate(..., int *fd)

Create a socket, a la socreate(9), put the socket into the
given LWP's descriptor table, return the descriptor at `fd'
on success.

void *sockaddr_addr(struct sockaddr *sa, socklen_t *slenp)
const void *sockaddr_const_addr(const struct sockaddr *sa, socklen_t *slenp)

Extract a pointer to the address part of a sockaddr. Write
the length of the address part at `slenp', if `slenp' is
not NULL.

socklen_t sockaddr_getlen(const struct sockaddr *sa)

Return the length of a sockaddr. This just evaluates to
sa->sa_len. I only add this for consistency with code that
appears in a portable userland library that I am going to
import.

const struct sockaddr *sockaddr_any(const struct sockaddr *sa)

Return the "don't care" sockaddr in the same family as
`sa'. This is the address a client should sobind(9) if it
does not care about the source address and, if applicable, the
port et cetera that it uses.

const void *sockaddr_anyaddr(const struct sockaddr *sa, socklen_t *slenp)

Return the "don't care" sockaddr in the same family as
`sa'. This is the address a client should sobind(9) if it
does not care about the source address and, if applicable, the
port et cetera that it uses.
 1.136  10-Aug-2007  dyoung branches: 1.136.2;
Constify. bcopy -> memcpy.
 1.135  19-Jul-2007  dyoung branches: 1.135.4; 1.135.6;
Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

Introduce rt_walktree() for walking the routing table and
applying a function to each rtentry. Replace most
rn_walktree() calls with it.

Use rt_getkey()/rt_setkey() to get/set a route's destination.
Keep a pointer to the sockaddr key in the rtentry, so that
rtentry users do not have to grovel in the radix_node for
the key.

Add a RTM_GET method to rtrequest. Use that instead of
radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

Constify. KNF. Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
et cetera. Use NULL instead of 0 for null pointers. Use
__arraycount(). Reduce gratuitous parenthesization.

Stop using variadic arguments for rip6_output(), it is
unnecessary.

Remove the unnecessary rtentry member rt_genmask and the
code to maintain it, since nothing actually used it.

Make rt_maskedcopy() easier to read by using meaningful variable
names.

Extract a subroutine intern_netmask() for looking up a netmask in
the masks table.

Start converting backslash-ridden IPv6 macros in
sys/netinet6/in6_var.h into inline subroutines that one
can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
 1.134  13-Jun-2007  dyoung branches: 1.134.2;
Persuasive programming: check M_UNWRITABLE(m, len) instead of
m->m_len<len before pulling up, because that helps make it clear
that we m_pullup() in order to guarantee that the contiguous region
is *writable*.
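
The difference between the two checks can be sketched in user space as
follows. This is only an illustration: struct toy_mbuf and both helper
functions are hypothetical, and the "too short OR read-only" semantics
for M_UNWRITABLE are an assumption drawn from the commit message, not
the kernel's actual macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified mbuf: only the fields this sketch needs. */
struct toy_mbuf {
	size_t m_len;      /* bytes of contiguous data in this mbuf */
	bool   m_readonly; /* storage is shared or otherwise not writable */
};

/* Old-style test: pull up only when the contiguous region is too short. */
static bool
toy_too_short(const struct toy_mbuf *m, size_t len)
{
	return m->m_len < len;
}

/* New-style test (assumed semantics): too short OR not writable. */
static bool
toy_unwritable(const struct toy_mbuf *m, size_t len)
{
	return m->m_len < len || m->m_readonly;
}
```

With a long but read-only mbuf, the old test would skip the pullup while
the new test would request one, which is the case the commit message is
concerned about.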
 1.133  23-May-2007  christos Ansify + add a few comments, from Karl Sjödahl
 1.132  02-May-2007  dyoung Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principal benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argument for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.
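
The sockaddr routines and rtcache_setdst() described in items 1 and 3
can be modelled in user space roughly as follows. This is a sketch under
stated assumptions, not the kernel code: every toy_* name is
hypothetical, malloc(3) stands in for the per-family pool, the 'flags'
argument is omitted, and the family numbers and lengths are
illustrative constants.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a BSD-style sockaddr (length, family, payload). */
struct toy_sockaddr {
	unsigned char sa_len;
	unsigned char sa_family;
	char sa_data[30];
};

enum { TOY_AF_INET = 2, TOY_AF_INET6 = 24 };	/* illustrative values */

/* Per-family sockaddr length; 0 means "unknown family". */
static size_t
toy_family_len(unsigned char af)
{
	switch (af) {
	case TOY_AF_INET:	return 16;
	case TOY_AF_INET6:	return 28;
	default:		return 0;
	}
}

/* Allocate a sockaddr with sa_family and sa_len already initialized. */
static struct toy_sockaddr *
toy_sockaddr_alloc(unsigned char af)
{
	size_t len = toy_family_len(af);
	struct toy_sockaddr *sa;

	if (len == 0 || (sa = calloc(1, sizeof(*sa))) == NULL)
		return NULL;
	sa->sa_len = (unsigned char)len;
	sa->sa_family = af;
	return sa;
}

static void
toy_sockaddr_free(struct toy_sockaddr *sa)
{
	free(sa);
}

/* Like strcpy(): copy src over dst; the families must already match. */
static struct toy_sockaddr *
toy_sockaddr_copy(struct toy_sockaddr *dst, const struct toy_sockaddr *src)
{
	assert(dst->sa_family == src->sa_family);
	memcpy(dst, src, src->sa_len);
	return dst;
}

/* Like strdup(): allocate a new sockaddr and copy src into it. */
static struct toy_sockaddr *
toy_sockaddr_dup(const struct toy_sockaddr *src)
{
	struct toy_sockaddr *dst = toy_sockaddr_alloc(src->sa_family);

	return dst == NULL ? NULL : toy_sockaddr_copy(dst, src);
}

/* Route cache that points at its destination instead of embedding it. */
struct toy_route {
	struct toy_sockaddr *ro_sa;
};

/* Returns 0 on success, or ENOMEM if the destination cannot be stored. */
static int
toy_rtcache_setdst(struct toy_route *ro, const struct toy_sockaddr *dst)
{
	struct toy_sockaddr *copy = toy_sockaddr_dup(dst);

	if (copy == NULL)
		return ENOMEM;
	toy_sockaddr_free(ro->ro_sa);
	ro->ro_sa = copy;
	return 0;
}

/* May return NULL, so callers have to check. */
static const struct toy_sockaddr *
toy_rtcache_getdst(const struct toy_route *ro)
{
	return ro->ro_sa;
}
```

The point of the sketch is the ownership model: the cache holds a
pointer to separately allocated storage, so a getter can legitimately
return NULL and every caller needs the NULL check the message mentions.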
 1.131  04-Mar-2007  christos branches: 1.131.2; 1.131.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.130  17-Feb-2007  dyoung KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.
 1.129  10-Feb-2007  degroote branches: 1.129.2;
Commit my SoC work
Add ipv6 support for fast_ipsec
Note that currently, packets with extension headers are not correctly
supported
Change the ipcomp logic
 1.128  29-Jan-2007  dyoung bzero -> memset
 1.127  15-Jan-2007  dyoung Cosmetic: indent using ASCII horizontal tab, insert space following
comma, wrap line.
 1.126  15-Jan-2007  degroote Fix an infinite loop ( and local dos ) in the case where the ip6_hdr and
the icmp6_hdr are not in the same mbuf.
Fix pr/34994 and probably pr/35333
Ok @rpaulo
 1.125  15-Dec-2006  joerg Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cached and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either valid again
or NULL.
rtcache_copy is used internally.

Adjust the callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is semantically equivalent though.
etherip doesn't need sc_route_expire similar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.
 1.124  09-Dec-2006  dyoung Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.
 1.123  16-Nov-2006  christos branches: 1.123.2;
__unused removal on arguments; approved by core.
 1.122  12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.121  05-Sep-2006  dyoung branches: 1.121.2; 1.121.4;
Simplify and repair icmp6_input() to stop the kernel from panicking
in m_copydata() when an ICMP6_ECHO_REQUEST is received, as reported
by Tatoku Ogaito on current-users@.
 1.120  01-Sep-2006  dyoung Vastly simplify the code that copies an ICMP6 packet to two data
paths: ICMP6 reply path, and socket path.
 1.119  30-Aug-2006  christos declare the type of code.
 1.118  11-Jul-2006  tron Clear mbuf checksum flags before passing it to ip6_output(). We might
recycle a mbuf which contained a hardware provided checksum. This
fixes "traceroute6" to a machine which is using a wm(4) interface
that has UDP or TCP checksum offload enabled.
 1.117  07-Jun-2006  kardel branches: 1.117.2;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.116  15-Apr-2006  christos branches: 1.116.2;
Coverity CID 740: Change constant comparisons to MCLBYTES to KASSERT and remove
extraneous tests.
 1.115  05-Mar-2006  rpaulo branches: 1.115.2; 1.115.4;
NDP-related improvements:
RFC4191
- supports host-side router-preference

RFC3542
- if DAD fails on an interface, disables IPv6 operation on the
interface
- don't advertise MLD report before DAD finishes

Others
- fixes integer overflow for valid and preferred lifetimes
- improves timer granularity for MLD, using callout-timer.
- reflects rtadvd's IPv6 host variable information into kernel
(router only)
- adds a sysctl option to enable/disable pMTUd for multicast
packets
- performs NUD on PPP/GRE interface by default
- Redirect works regardless of ip6_accept_rtadv
- removes RFC1885-related code

From the KAME project via SUZUKI Shinsuke.
Reviewed by core.
 1.114  03-Mar-2006  rpaulo branches: 1.114.2;
Fix typos in comments.

From: the KAME project via SUZUKI Shinsuke.
 1.113  21-Jan-2006  rpaulo branches: 1.113.2; 1.113.4;
Better support of IPv6 scoped addresses.

- most of the kernel code will not care about the actual encoding of
scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
scoped addresses as a special case.
- scope boundary check will be stricter. For example, the current
*BSD code allows a packet with src=::1 and dst=(some global IPv6
address) to be sent outside of the node, if the application does:
s = socket(AF_INET6);
bind(s, "::1");
sendto(s, some_global_IPv6_addr);
This is clearly wrong, since ::1 is only meaningful within a single
node, but the current implementation of the *BSD kernel cannot
reject this attempt.
- and, while there, don't try to remove the ff02::/32 interface route
entry in in6_ifdetach() as it's already gone.
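
The ::1 case from the list above can be illustrated with a small
user-space check. This is a sketch only: toy_scope_ok() and
toy_scope_ok_str() are hypothetical helpers, and they cover just the
loopback-source case quoted in the message, not the zone-ID handling
the kernel change actually introduces.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical check: a loopback source address is only meaningful
 * inside the node, so it may only be paired with a loopback destination.
 */
static bool
toy_scope_ok(const struct in6_addr *src, const struct in6_addr *dst)
{
	if (IN6_IS_ADDR_LOOPBACK(src) && !IN6_IS_ADDR_LOOPBACK(dst))
		return false;
	return true;
}

/* Convenience wrapper that parses textual addresses first. */
static bool
toy_scope_ok_str(const char *src, const char *dst)
{
	struct in6_addr s, d;

	if (inet_pton(AF_INET6, src, &s) != 1 ||
	    inet_pton(AF_INET6, dst, &d) != 1)
		return false;	/* unparsable input is rejected */
	return toy_scope_ok(&s, &d);
}
```

Under this check, src=::1 with a global destination such as
2001:db8::1 is refused, while ::1 to ::1 and global to global pass.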

This also includes some level of support for the standard source
address selection algorithm defined in RFC3484, which will be
completed in the future.

From the KAME project via JINMEI Tatuya.
Approved by core@.
 1.112  11-Dec-2005  christos branches: 1.112.2;
merge ktrace-lwp.
 1.111  19-Oct-2005  bouyer In icmp6_redirect_output(), sip6 is initialised to point to the data area of
m0. But m0 may be freed later, so trying to use sip6 at the end of this
function is wrong. My guess is that we want to reference the data area
of m (the mbuf about to be sent) instead at this point.
Fix a panic on Xen (where a data area of a mbuf may be unmapped when the
mbuf is freed), and probably potential data/pool corruption in other cases.
 1.110  18-Aug-2005  yamt branches: 1.110.2;
- introduce M_MOVE_PKTHDR and use it where appropriate.
intended to be mostly API compatible with openbsd/freebsd.
- remove a glue #define in netipsec/ipsec_osdep.h.
 1.109  29-May-2005  christos branches: 1.109.2;
- avoid shadowed variables
- sprinkle const.
 1.108  17-Jan-2005  itojun branches: 1.108.6; 1.108.8; 1.108.10;
shouldn't check code field on "packet too big" icmp6 message.
 1.107  25-May-2004  atatat branches: 1.107.4;
Sysctl descriptions under net subtree (net.key not done)
 1.106  26-Mar-2004  itojun branches: 1.106.2;
do not touch m->m_pkthdr.rcvif after m becomes invalid. Patrick Latifi
 1.105  24-Mar-2004  atatat Tango on sysctl_createv() and flags. The flags have all been renamed,
and sysctl_createv() now uses more arguments.
 1.104  17-Dec-2003  lha Fix ICMPV6CTL_ND6_[DP]RLIST, they broke with new sysctl.
Makes ndp -r/ndp -p work again, patch from atatat
 1.103  04-Dec-2003  atatat Dynamic sysctl.

Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.

Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded. Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.

All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.

PS - I'm sorry, but there's a distinct lack of documentation at the
moment. I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
 1.102  30-Oct-2003  simonb Remove some assigned-to but otherwise unused variables.
 1.101  04-Sep-2003  itojun revamp inpcb/in6pcb so that they are more aligned with each other.
in6pcb lookup now uses hash(9).
 1.100  25-Aug-2003  itojun deref member in in6p directly, don't rely on existence of macro
 1.99  22-Aug-2003  itojun remove ipsec_set/getsocket. now we explicitly pass socket * to ip{,6}_output.
 1.98  22-Aug-2003  itojun change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.
 1.97  22-Aug-2003  jonathan Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec. Reviewed by itojun.
 1.96  07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.95  06-Aug-2003  itojun m_cat may free mbuf on 2nd arg, so m_pkthdr manipulation has to happen
before m_cat call. from Julian Coleman via kame.
 1.94  24-Jun-2003  itojun branches: 1.94.2;
remove unneeded checks of accept_rtadv. from kame
 1.93  24-Jun-2003  itojun use time.tv_sec directly
 1.92  06-Jun-2003  itojun - sync up MLD declaration with RFC3542 (s/MLD6/MLD/)
- routing header declaration with RFC3542
(note: sizeof(ip6_rthdr0) has changed!)
also, sync up with RFC2460 routing header definition (no "strict" source
routing mode any more)

part of advanced API update (RFC2292 -> 3542).
 1.91  03-Jun-2003  itojun remove assumption on redirect header option processing. from kame
 1.90  14-May-2003  itojun always use PULLDOWN_TEST codepath.
 1.89  31-Mar-2003  itojun avoid mbuf leak in redirect header option attachment. more complete
fix to come. from kame
 1.88  27-Sep-2002  provos remove trailing \n in panic(). approved perry.
 1.87  23-Sep-2002  simonb Remove breaks after returns, unreachable returns and returns after
returns(!).
 1.86  11-Sep-2002  itojun KNF - return is not a function. sync w/kame.
 1.85  30-Jul-2002  itojun no need to check NULL mbuf, as we touch it already.
From: tedu <grendel@zeitbombe.org>
 1.84  10-Jul-2002  itojun correct ping6 -w result with hostname with [A-Z]. PR 17540. sync w/kame
 1.83  30-Jun-2002  thorpej Changes to allow the IPv4 and IPv6 layers to align headers themselves,
as necessary:
* Implement a new mbuf utility routine, m_copyup(), which is like
m_pullup(), except that it always prepends and copies, rather
than only doing so if the desired length is larger than m->m_len.
m_copyup() also allows an offset into the destination mbuf, which
allows space for packet headers, in the forwarding case.
* Add *_HDR_ALIGNED_P() macros for IP, IPv6, ICMP, and IGMP. These
macros expand to 1 if __NO_STRICT_ALIGNMENT is defined, so that
architectures which do not have strict alignment constraints don't
pay for the test or visit the new align-if-needed path.
* Use the new macros to check if a header needs to be aligned, or to
assert that it already is, as appropriate.

Note: This code is still somewhat experimental. However, the new
code path won't be visited if individual device drivers continue
to guarantee that packets are delivered to layer 3 already properly
aligned (which are rules that are already in use).
 1.82  09-Jun-2002  itojun whitespace cleanup
 1.81  08-Jun-2002  itojun whitespace cleanup
 1.80  31-May-2002  itojun do not mistakenly lock PMTUD route entry with RTV_MTU.
 1.79  29-May-2002  christos make this compile again.
 1.78  29-May-2002  itojun correct rmx_mtu value after PMTUD entry timeout (should be set to 0)
 1.77  24-May-2002  itojun extra blank line
 1.76  24-May-2002  itojun make a strict check before sending FQDN node information reply. sync w/kame
 1.75  05-Mar-2002  itojun branches: 1.75.6; 1.75.8;
on redirect output, always try to attach target link layer address option.
 1.74  21-Dec-2001  itojun whitespace/cosmetic sync w/kame
 1.73  20-Dec-2001  itojun centralize multicast group management (in6_join/leavegroup).
have a flag for ip6_output() to fragment to minimum MTU.
sync with kame
 1.72  07-Dec-2001  itojun correct timing to increment icmp6 MIB variables. sync with kame
 1.71  13-Nov-2001  lukem add RCSIDs
 1.70  29-Oct-2001  simonb Don't need to include <uvm/uvm_extern.h> just to include <sys/sysctl.h>
anymore.
 1.69  24-Oct-2001  itojun more whitespace sync with kame
 1.68  18-Oct-2001  itojun branches: 1.68.2;
simplify per-if stats.
 1.67  15-Oct-2001  itojun sync with kame.
net.inet6.icmp6.nodeinfo is now a bitmap (2^0 = ping6 -w, 2^1 = ping6 -a).
give up local if there's mbuf alloc failures.
cope with ".." in hostname.
sync comments/whitespaces.
 1.66  22-Jun-2001  itojun branches: 1.66.2;
remove RFC1885 compatibility code in #ifdef COMPAT_RFC1885, for icmp6
reply packet size consideration (obsolete, not used for a long time).
sync with kame
 1.65  01-Jun-2001  itojun use default hoplimit when incoming interface is not given to icmp6_reflect.
sync with kame
 1.64  08-May-2001  itojun correct faith prefix determination. use sys/netinet/if_faith.c:faithprefix()
to determine. sync with kame.
(without this change, non-faith socket may mistakenly accept for-faith traffic)
 1.63  04-Apr-2001  itojun make sure rcvif is sane on call to icmp6_reflect
 1.62  30-Mar-2001  itojun enable FAKE_LOOPBACK_IF case by default.
now traffic on loopback interface will be presented to bpf as normal wire
format packet (without KAME scopeid in s6_addr16[1]).

fix KAME PR 250 (host mistakenly accepts packets to fe80::x%lo0).

sync with kame.
 1.61  21-Mar-2001  itojun set rmx_mtu to L2 interface mtu, instead of 0, on mtudisc timeout.
ip6_output() change is for safety. sync with kame
 1.60  08-Mar-2001  itojun remove bogus rtfree. sync with kame. inspired by openbsd PR 1706.
 1.59  01-Mar-2001  itojun branches: 1.59.2;
make sure to enforce inbound ipsec policy checking, for any protocols on top
of ip (check it when final header is visited). sync with kame.
XXX kame team will need to re-check policy engine code
 1.58  11-Feb-2001  itojun pull latest kame pcbnotify code. synchronizes ICMPv6 path mtu discovery
behavior with other protocols (i.e. validation, use of hiwat/lowat).
 1.57  11-Feb-2001  itojun recover $NetBSD$ (removed by mistake)
 1.56  10-Feb-2001  itojun to sync with kame better, (1) remove register declaration for variables,
(2) sync whitespaces, (3) update comments. (4) bring in some of portability
and logging enhancements. no functional changes here.
 1.55  08-Feb-2001  itojun implement upper limit to icmp6 redirects (experimental, turned off)
negative value to {mtudisc,redirect}_{hi,lo}wat will turn off the limitation.
sync with kame.
 1.54  07-Feb-2001  itojun remove bogus DIAGNOSTIC. sync with kame
 1.53  07-Feb-2001  itojun during ip6/icmp6 inbound packet processing, do not call log() nor printf() in
normal operation (/var can get filled up by flooding bogus packets).
sysctl net.inet6.icmp6.nd6_debug will turn on diagnostic messages.
(#define ND6_DEBUG will turn it on by default)

improve stats in ND6 code.

lots of synchronization with kame (including comments and cosmetic ones).
 1.52  24-Jan-2001  itojun - record IPsec packet history into m_aux structure.
- let ipfilter look at wire-format packet only (not the decapsulated ones),
so that VPN setting can work with NAT/ipfilter settings.
sync with kame.

TODO: use header history for stricter inbound validation
 1.51  16-Jan-2001  itojun s/ND6DEBUG/ND6_DEBUG/ to meet other places
 1.50  08-Jan-2001  itojun wrap icmp6 checksum error printf() into #ifdef ND6DEBUG.
sync with kame, NetBSD PR 11911.
 1.49  11-Dec-2000  itojun no need to rtalloc1() twice in pmtud. from kame
 1.48  09-Dec-2000  itojun update icmp6 too big validation. the change is necessary since pmtud is
mandatory for IPv6 (so we can't just validate by using connected pcb - we need
to allow traffic from unconnected pcb to do pmtud).
- if the traffic is validated by xx_ctlinput, allow up to "hiwat" pmtud
route entries.
- if the traffic was not validated by xx_ctlinput, allow up to "lowat" pmtud
route entries (there's upper limit, so bad guys cannot blow up our routing
table).
sync with kame

XXX need to think again about default hiwat/lowat value.
XXX victim selection to help starvation case
 1.47  11-Nov-2000  itojun improve spec conformance of node information query (07).
sync with kame.
 1.46  18-Oct-2000  itojun verify ICMPv6 too big messages based on TCP pcbs, and/or IPsec SA.
TODO: udp6, and sendto consideration. as pmtud is mandatory for IPv6,
it is rather important for us to support those cases.
TODO: more testing
TODO: kame sync
 1.45  10-Oct-2000  itojun sync with kame ($KAME$)
 1.44  02-Oct-2000  itojun fix compilation without INET. fix confusion between ipsecstat and ipsec6stat.
sync with kame.
 1.43  16-Sep-2000  itojun kame sys/netinet6/icmp6.c 1.140 -> 1.144
> in the check for the incoming redirect message, examine the gateway
> (from the routing table) only when the address family of the gateway is
> AF_INET6.
 1.42  19-Aug-2000  itojun - icmp6 nodeinfo: remove possibility of unaligned pointer access.
- jumbo payload output: fix incorrect mbuf manipulation
- pedant: align issues, mbuf assumption
(sync with kame)
 1.41  03-Aug-2000  itojun clarifications in icmp6 node query support.
XXX previous commit included "supported qtypes" icmp6 node query support.
sorry commit message was mistaken.
 1.40  03-Aug-2000  itojun correct typo in #define. ICMP6_NI_SUCESS -> SUCCESS (notice missing C).
sync with kame.
 1.39  30-Jul-2000  itojun sync comment with reality
 1.38  28-Jul-2000  itojun nuke the following sysctl variables. "ppsratelimit" should work better.
need to recompile sbin/sysctl after updating /usr/include.
net.inet.tcp.rstratelimit
net.inet.icmp.errratelimit
net.inet6.icmp6.errratelimit
 1.37  09-Jul-2000  itojun add ppsratelimit(9), which does event-per-sec rate limitation.
use it from icmp6 error rate limitation code.
XXX better name for the function?
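
An event-per-second limiter of the kind described here can be sketched
as follows. This is not the kernel's ppsratelimit/ppsratecheck code:
struct toy_pps and toy_pps_allow() are hypothetical, the caller passes
the current time in whole seconds instead of reading a clock, and the
window simply resets whenever the second changes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical limiter state: current one-second window and its count. */
struct toy_pps {
	bool started;		/* false until the first event is seen */
	long window_sec;	/* second that the current window covers */
	int  count;		/* events admitted in the current window */
};

/*
 * Return true if the event may proceed, false if it exceeds maxpps
 * events within the current second.
 */
static bool
toy_pps_allow(struct toy_pps *st, long now_sec, int maxpps)
{
	if (!st->started || now_sec != st->window_sec) {
		/* A new second has begun: start a fresh window. */
		st->started = true;
		st->window_sec = now_sec;
		st->count = 0;
	}
	if (st->count >= maxpps)
		return false;
	st->count++;
	return true;
}
```

A caller sending ICMPv6 errors would consult the limiter before each
send and drop the error when it returns false.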
 1.36  07-Jul-2000  itojun sync with kame.
introduce in6_{recover,embed}scope, for in-kernel scoped-address manipulation.
improve in6_pcbnotify.
 1.35  06-Jul-2000  itojun - do not use bitfield for router renumbering header.
- add protection mechanism against ND cache corruption due to bad NUD hints.
- more stats
- icmp6 pps limitation. TODO: should implement ppsratecheck(9).
 1.34  28-Jun-2000  mrg <vm/vm.h> -> <uvm/uvm_extern.h>
 1.33  13-Jun-2000  itojun branches: 1.33.2;
signedness issue with char, take 2. confirmed with i386 cc -funsigned-char.
 1.32  13-Jun-2000  itojun workaround to suppress warning on char == unsigned char arch.
 1.31  12-Jun-2000  itojun better conformance to draft-ietf-ipngwg-icmp-name-lookups-05.
the old code was a chimera of the 03 and 05 drafts.

-n by default, since IPv6 reverse lookup takes too much time.
use -H to enable reverse name lookup.
 1.30  22-May-2000  itojun branches: 1.30.2;
disallow negative numbers for ratelimit interval (tcp, icmp, icmp6).
 1.29  09-May-2000  itojun do not try NUD unless the gateway is a real neighbor.
real fix to KAME PR 245 (workaround has been implemented).
 1.28  13-Apr-2000  itojun do not return icmp6 error against icmp6 error.
(this is due to a bug in header chain chasing)
 1.27  22-Mar-2000  itojun use ip6_{last,next}hdr in icmp6 inbound packet parsing.
 1.26  01-Mar-2000  itojun introduce m->m_pkthdr.aux to hold random data which needs to be passed
between protocol handlers.

ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are what I have in mind - there can be several other uses.
at this moment, we use this for ipsec socket pointer passing. this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.

due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.

take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).

this will bump kernel version.

(as discussed in tech-net, tested in kame tree)
 1.25  28-Feb-2000  itojun fix ICMPv6 redirect input. the bug can result in invalid ND entry.
 1.24  28-Feb-2000  itojun support draft-ietf-ipngwg-icmp-name-lookups-05.txt, drop support for
draft-ietf-ipngwg-icmp-name-lookups-04.txt.

There are certain bitfield changes from the 04 draft to the 05 draft, which make
04 "ping6 -a" and 05 "ping6 -a" not interoperable. sigh.
 1.23  26-Feb-2000  itojun bring in recent KAME changes (only important and stable ones, as usual).
- remove net.inet6.ip6.nd6_proxyall. introduce proxy NDP code works
just like "arp -s".
- revise source address selection.
be more careful about use of yet-to-be-valid addresses as source.
- as router, transmit ICMP6_DST_UNREACH_BEYONDSCOPE against out-of-scope
packet forwarding attempt.
- path MTU discovery takes care of routing header properly.
- be more strict about mbuf chain parsing.
 1.22  17-Feb-2000  darrenr Change the use of pfil hooks. There is no longer a single list of all
pfil information, instead, struct protosw now contains a structure
which contains list heads, etc. The per-protosw pfil struct is passed
to pfil_hook_get(), along with an in/out flag to get the head of the
relevant filter list. This has been done for only IPv4 and IPv6, at
present, with these patches only enabling filtering for IPPROTO_IP and
IPPROTO_IPV6, although it is possible to have tcp/udp, etc, dedicated
filters now also. The ipfilter code has been updated to only filter
IPv4 packets - next major release of ipfilter is required for ipv6.
 1.21  15-Feb-2000  thorpej Fix a couple of brainos in the last.
 1.20  14-Feb-2000  thorpej Use ratecheck() for ICMP6 rate limiting.
 1.19  06-Feb-2000  itojun fix include pathname for better rfc2292 compliance.
 1.18  16-Jan-2000  itojun add missing ipcomp cases.
 1.17  07-Jan-2000  itohy Rename variable "prep" for PReP port.
 1.16  06-Jan-2000  itojun remove extra portability #ifdef (like #ifdef __FreeBSD__) in KAME IPv6/IPsec
code, from netbsd-current repository.
#ifdef'ed version is always available from ftp.kame.net.

XXX please do not make too many diff-unfriendly changes, we'll need to take
bunch of diffs on upgrade...
 1.15  05-Jan-2000  itojun avoid panic on getsockopt(ICMPV6_FILTER).
 1.14  02-Jan-2000  itojun add net.inet6.icmp6.nodeinfo sysctl.
this allows you to disable/enable ICMPv6 node information query/reply
processing (which tells remote end the gethostname(3) setting, interface
addresses on the node, and some other things - documented in
draft-ietf-ipngwg-icmp-name-lookup* or something alike).

to test it, try ping6 -w ::1 with nodeinfo=0 and nodeinfo=1.
(sync with kame change)
 1.13  15-Dec-1999  itojun do not overwrite traffic class field when we write IPv6 version field.
 1.12  13-Dec-1999  itojun sync IPv6 part with latest KAME tree. IPsec part is left unmodified
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.

TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach

(sorry for jumbo commit, I can't separate this any more...)
 1.11  01-Oct-1999  itojun branches: 1.11.2; 1.11.8;
consistent logging for icmp6 redirects
XXX should make logs 1-liner so that duplicated logs can be compressed
by syslog(8)?
 1.10  31-Jul-1999  itojun sync with recent KAME.
- loosen ipsec restriction on packet direction.
- revise icmp6 redirect handling on IsRouter bit.
- tcp/udp notification processing (link-local address case)
- cosmetic fixes (better code share across *BSD).
 1.9  30-Jul-1999  itojun remove reference to in6_systm.h (file itself will be removed afterwards)
 1.8  22-Jul-1999  itojun - implement IPv6 pmtud, which is necessary for TCP6.
- fix memory leak on SO_DEBUG over TCP.
 1.7  22-Jul-1999  itojun change unnecessary u_long/long into u_int32_t or something relevant.
more fixes should follow.
 1.6  09-Jul-1999  thorpej defopt IPSEC and IPSEC_ESP (both into opt_ipsec.h).
 1.5  06-Jul-1999  itojun sync with KAME/NetBSD 1.4, SNAP kit 19990705.
key changes are:
- icmp6 redirect fix (dst check)
- revised ip6 multicast check for loopback i/f
- several RCS ID cleanups
 1.4  06-Jul-1999  itojun checked build on alpha and i386, with GENERIC.v6.
fixed several sizeof(void *) and sizeof(size_t) issues on alpha.

Thanks to: Dave Huang and Tim Rightnour
 1.3  03-Jul-1999  thorpej RCS ID police.
 1.2  01-Jul-1999  itojun branches: 1.2.2;
IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.
 1.1  28-Jun-1999  itojun branches: 1.1.2;
file icmp6.c was initially added on branch kame.
 1.1.2.3  30-Nov-1999  itojun bring in latest KAME (as of 19991130, KAME/NetBSD141) into kame branch
just for reference purposes.
This commit includes 1.4 -> 1.4.1 sync for kame branch.

The branch does not compile at all (due to the lack of ALTQ and some other
source code). Please do not try to modify the branch, this is just for
reference purposes.

synchronization to latest KAME will take place on HEAD branch soon.
 1.1.2.2  06-Jul-1999  itojun KAME/NetBSD 1.4, SNAP kit 1999/07/05.
NOTE: this branch is just for reference purposes (i.e. for taking cvs diff).
do not touch anything on the branch. actual work must be done on HEAD branch.
 1.1.2.1  28-Jun-1999  itojun KAME/NetBSD 1.4 SNAP kit, dated 19990628.

NOTE: this branch (kame) is used just for reference. this may not compile
due to multiple reasons.
 1.2.2.3  02-Aug-1999  thorpej Update from trunk.
 1.2.2.2  01-Jul-1999  thorpej Sync w/ -current.
 1.2.2.1  01-Jul-1999  thorpej file icmp6.c was added on branch chs-ubc2 on 1999-07-01 23:48:26 +0000
 1.11.8.1  27-Dec-1999  wrstuden Pull up to last week's -current.
 1.11.2.8  21-Apr-2001  bouyer Sync with HEAD
 1.11.2.7  27-Mar-2001  bouyer Sync with HEAD.
 1.11.2.6  12-Mar-2001  bouyer Sync with HEAD.
 1.11.2.5  11-Feb-2001  bouyer Sync with HEAD.
 1.11.2.4  18-Jan-2001  bouyer Sync with head (for UBC+NFS fixes, mostly).
 1.11.2.3  13-Dec-2000  bouyer Sync with HEAD (for UBC fixes).
 1.11.2.2  22-Nov-2000  bouyer Sync with HEAD.
 1.11.2.1  20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.30.2.1  22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.33.2.15  07-Jun-2001  he Pull up revision 1.65 (requested by itojun):
Correct icmp6 hoplimit value.
 1.33.2.14  09-May-2001  he Pull up revision 1.53 (via patch, requested by itojun):
Suppress ND6 logs that are too noisy for normal use. Can be
re-enabled by net.inet6.icmp6.nd6_debug.
 1.33.2.13  09-May-2001  he Pull up revision 1.64 (via patch, requested by itojun):
Correct faith prefix determination.
 1.33.2.12  28-Apr-2001  he Pull up revision 1.55 (partial, via patch, requested by itojun):
Correct source address selection in icmp6_reflect().
Fixes two problems: kernel may fail to send icmp6 messages, and
there would be a way for user programs to cause a panic.
 1.33.2.11  22-Apr-2001  he Apply patch (requested by itojun):
Avoid passing NULL pointer to in6_ifawithscope.
 1.33.2.10  06-Apr-2001  he Pull up revision 1.52 (requested by itojun):
Record IPsec packet history in m_aux structure. Let ipfilter
look at wire-format packet only (not the decapsulated ones), so
that VPN setting can work with NAT/ipfilter settings.
 1.33.2.9  04-Apr-2001  he Pull up revision 1.63 (requested by itojun):
Make sure rcvif is sane on call to icmp6_reflect().
Fixes panic in certain configurations / instances.
 1.33.2.8  11-Mar-2001  he Pull up revision 1.59 (requested by itojun):
Ensure that we enforce inbound IPsec policy on all IP protocols,
not just TCP, UDP and ICMP.
 1.33.2.7  03-Feb-2001  he Pull up revision 1.51 (requested by itojun):
Correct ND6DEBUG -> ND6_DEBUG.
 1.33.2.6  26-Jan-2001  jhawk Pull up revision 1.50 (requested by itojun):
Only printf() IPv6 ICMP checksum errors under ND6DEBUG.
 1.33.2.5  02-Oct-2000  itojun pullup (approved by releng-1-5)
correct ipsecstat/ipsec6stat mixup.

netinet6/ah_input.c 1.18 -> 1.19
netinet6/ah_output.c 1.11 -> 1.12 (part of)
netinet6/esp_input.c 1.8 -> 1.9 (part of)
netinet6/esp_output.c 1.8 -> 1.9
netinet6/icmp6.c 1.43 -> 1.44
netinet6/ipcomp_input.c 1.13 -> 1.14
netinet6/ipcomp_output.c 1.13 -> 1.14
 1.33.2.4  19-Sep-2000  itojun pullup 1.42 -> 1.43 (approved by releng-1-5)

> kame sys/netinet6/icmp6.c 1.140 -> 1.144
> > in the check for the incoming redirect message, examine the gateway
> > (from the routing table) only when the address family of the gateway is
> > AF_INET6.
 1.33.2.3  16-Aug-2000  itojun pullup (approved by releng-1-5)

switch from net.inet*.*.*ratelimit to net.inet*.*.ppslimit.

(tags are a rough estimate - we had some trial and error on the main trunk)
sys/netinet/icmp6.h 1.9 -> 1.11
sys/netinet/icmp_var.h 1.15 -> 1.17
sys/netinet/in_proto.c 1.39 -> 1.42
sys/netinet/ip_icmp.c 1.50 -> 1.51, 1.52 -> 1.54
sys/netinet/tcp_input.c 1.111 -> 1.112, 1.115 -> 1.117
sys/netinet/tcp_usrreq.c 1.52 -> 1.53
sys/netinet/tcp_var.h 1.72 -> 1.75
sys/netinet6/icmp6.c 1.34 -> 1.35, 1.36 -> 1.38
sys/netinet6/in6_proto.c 1.17 -> 1.19
 1.33.2.2  04-Aug-2000  itojun pullup (approved by releng-1-5)
sys/netinet6/icmp6.h 1.11 -> 1.13
sys/netinet6/icmp6.c 1.39 -> 1.41

cvs rdiff -r1.11 -r1.12 syssrc/sys/netinet/icmp6.h
cvs rdiff -r1.39 -r1.40 syssrc/sys/netinet6/icmp6.c

correct typo in #define. ICMP6_NI_SUCESS -> SUCCESS (notice missing C).
sync with kame.

cvs rdiff -r1.12 -r1.13 syssrc/sys/netinet/icmp6.h
cvs rdiff -r1.40 -r1.41 syssrc/sys/netinet6/icmp6.c

clarifications in icmp6 node query support.
XXX previous commit included "supported qtypes" icmp6 node query support.
sorry commit message was mistaken.
 1.33.2.1  20-Jul-2000  itojun pullup from main trunk (approved by releng-1-5)
- add protection mechanism against ND cache corruption due to bad NUD hints.

this is part of:
sys/netinet/icmp6.h 1.9 -> 1.10
sys/netinet/tcp_input.c 1.111 -> 1.112
sys/netinet6/icmp6.c 1.34 -> 1.35
sys/netinet6/nd6.c 1.30 -> 1.31
sys/netinet6/nd6.h 1.14 -> 1.15
 1.59.2.11  18-Oct-2002  nathanw Catch up to -current.
 1.59.2.10  17-Sep-2002  nathanw Catch up to -current.
 1.59.2.9  01-Aug-2002  nathanw Catch up to -current.
 1.59.2.8  20-Jun-2002  nathanw Catch up to -current.
 1.59.2.7  01-Apr-2002  nathanw Catch up to -current.
(CVS: It's not just a program. It's an adventure!)
 1.59.2.6  08-Jan-2002  nathanw Catch up to -current.
 1.59.2.5  14-Nov-2001  nathanw Catch up to -current.
 1.59.2.4  22-Oct-2001  nathanw Catch up to -current.
 1.59.2.3  24-Aug-2001  nathanw Catch up with -current.
 1.59.2.2  21-Jun-2001  nathanw Catch up to -current.
 1.59.2.1  09-Apr-2001  nathanw Catch up with -current.
 1.66.2.5  10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.66.2.4  06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.66.2.3  23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.66.2.2  16-Mar-2002  jdolecek Catch up with -current.
 1.66.2.1  10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.68.2.1  12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.75.8.3  11-Jul-2002  thorpej pullup-1-6 ticket #457 (itojun).

Original log message:
correct ping6 -w result with hostname with [A-Z]. PR 17540. sync w/kame
 1.75.8.2  05-Jun-2002  lukem Pull up revisions 1.78 & 1.80 (via patch) (requested by itojun in #123 & #124):
- correct rmx_mtu value after PMTUD entry timeout (should be set to 0)
- do not mistakenly lock PMTUD route entry with RTV_MTU.
 1.75.8.1  28-May-2002  tv Pull up revision 1.76 (requested by itojun):
make a strict check before sending FQDN node information reply. sync w/kame
 1.75.6.4  29-Aug-2002  gehenna catch up with -current.
 1.75.6.3  15-Jul-2002  gehenna catch up with -current.
 1.75.6.2  20-Jun-2002  gehenna catch up with -current.
 1.75.6.1  30-May-2002  gehenna Catch up with -current.
 1.94.2.5  10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.94.2.4  17-Jan-2005  skrll Sync with HEAD.
 1.94.2.3  21-Sep-2004  skrll Fix the sync with head I botched.
 1.94.2.2  18-Sep-2004  skrll Sync with HEAD.
 1.94.2.1  03-Aug-2004  skrll Sync with HEAD
 1.106.2.2  28-Oct-2005  riz Pull up following revision(s) (requested by bouyer in ticket #5938):
sys/netinet6/icmp6.c: revision 1.111
In icmp6_redirect_output(), sip6 is initialised to point to the data area of
m0. But m0 may be freed later, so trying to use sip6 at the end of this
function is wrong. My guess is that we want to reference the data area
of m (the mbuf about to be sent) instead at this point.
Fix a panic on Xen (where a data area of a mbuf may be unmapped when the
mbuf is freed), and probably potential data/pool corruption in other cases.
 1.106.2.1  28-May-2004  tron branches: 1.106.2.1.2; 1.106.2.1.4;
Pull up revision 1.107 (requested by atatat in ticket #391):
Sysctl descriptions under net subtree (net.key not done)
 1.106.2.1.4.1  28-Oct-2005  riz Pull up following revision(s) (requested by bouyer in ticket #5938):
sys/netinet6/icmp6.c: revision 1.111
In icmp6_redirect_output(), sip6 is initialised to point to the data area of
m0. But m0 may be freed later, so trying to use sip6 at the end of this
function is wrong. My guess is that we want to reference the data area
of m (the mbuf about to be sent) instead at this point.
Fix a panic on Xen (where a data area of a mbuf may be unmapped when the
mbuf is freed), and probably potential data/pool corruption in other cases.
 1.106.2.1.2.1  28-Oct-2005  riz Pull up following revision(s) (requested by bouyer in ticket #5938):
sys/netinet6/icmp6.c: revision 1.111
In icmp6_redirect_output(), sip6 is initialised to point to the data area of
m0. But m0 may be freed later, so trying to use sip6 at the end of this
function is wrong. My guess is that we want to reference the data area
of m (the mbuf about to be sent) instead at this point.
Fix a panic on Xen (where a data area of a mbuf may be unmapped when the
mbuf is freed), and probably potential data/pool corruption in other cases.
 1.107.4.1  29-Apr-2005  kent sync with -current
 1.108.10.1  03-Oct-2008  jdc Pull up revision 1.150 (requested by adrianp in ticket #1966).

Fix for CVE-2008-3530 from matt@
Implement improved checking for MTU values on ICMP 'Packet Too Big Messages'
 1.108.8.1  03-Oct-2008  jdc Pull up revision 1.150 (requested by adrianp in ticket #1966).

Fix for CVE-2008-3530 from matt@
Implement improved checking for MTU values on ICMP 'Packet Too Big Messages'
 1.108.6.1  03-Oct-2008  jdc Pull up revision 1.150 (requested by adrianp in ticket #1966).

Fix for CVE-2008-3530 from matt@
Implement improved checking for MTU values on ICMP 'Packet Too Big Messages'
 1.109.2.8  17-Mar-2008  yamt sync with head.
 1.109.2.7  07-Dec-2007  yamt sync with head
 1.109.2.6  15-Nov-2007  yamt sync with head.
 1.109.2.5  27-Oct-2007  yamt sync with head.
 1.109.2.4  03-Sep-2007  yamt sync with head.
 1.109.2.3  26-Feb-2007  yamt sync with head.
 1.109.2.2  30-Dec-2006  yamt sync with head.
 1.109.2.1  21-Jun-2006  yamt sync with head.
 1.110.2.1  26-Oct-2005  yamt sync with head
 1.112.2.1  01-Feb-2006  yamt sync with head.
 1.113.4.2  22-Apr-2006  simonb Sync with head.
 1.113.4.1  04-Feb-2006  simonb Adapt for timecounters: mostly use get*time(), use bintime's for timeout
calculations and use "time_second" instead of "time.tv_sec".
 1.113.2.3  09-Sep-2006  rpaulo sync with head
 1.113.2.2  23-Feb-2006  rpaulo Another round of s/in6pcb/inpcb/.
 1.113.2.1  02-Feb-2006  rpaulo Adapt to in6pcb -> inpcb changes.
 1.114.2.6  14-Sep-2006  yamt sync with head.
 1.114.2.5  03-Sep-2006  yamt sync with head.
 1.114.2.4  11-Aug-2006  yamt sync with head
 1.114.2.3  26-Jun-2006  yamt sync with head.
 1.114.2.2  24-May-2006  yamt sync with head.
 1.114.2.1  13-Mar-2006  yamt sync with head.
 1.115.4.1  24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.115.2.1  19-Apr-2006  elad sync with head.
 1.116.2.1  19-Jun-2006  chap Sync with head.
 1.117.2.1  13-Jul-2006  gdamore Merge from HEAD.
 1.121.4.3  18-Dec-2006  yamt sync with head.
 1.121.4.2  10-Dec-2006  yamt sync with head.
 1.121.4.1  22-Oct-2006  yamt sync with head
 1.121.2.3  01-Feb-2007  ad Sync with head.
 1.121.2.2  12-Jan-2007  ad Sync with head.
 1.121.2.1  18-Nov-2006  ad Sync with head.
 1.123.2.3  03-Oct-2008  jdc Pull up revision 1.150 (requested by adrianp in ticket #1209).

Fix for CVE-2008-3530 from matt@
Implement improved checking for MTU values on ICMP 'Packet Too Big Messages'
 1.123.2.2  24-May-2007  pavel branches: 1.123.2.2.4;
Pull up following revision(s) (requested by degroote in ticket #667):
sys/netinet/tcp_input.c: revision 1.260
sys/netinet/tcp_output.c: revision 1.154
sys/netinet/tcp_subr.c: revision 1.210
sys/netinet6/icmp6.c: revision 1.129
sys/netinet6/in6_proto.c: revision 1.70
sys/netinet6/ip6_forward.c: revision 1.54
sys/netinet6/ip6_input.c: revision 1.94
sys/netinet6/ip6_output.c: revision 1.114
sys/netinet6/raw_ip6.c: revision 1.81
sys/netipsec/ipcomp_var.h: revision 1.4
sys/netipsec/ipsec.c: revision 1.26 via patch,1.31-1.32
sys/netipsec/ipsec6.h: revision 1.5
sys/netipsec/ipsec_input.c: revision 1.14
sys/netipsec/ipsec_netbsd.c: revision 1.18,1.26
sys/netipsec/ipsec_output.c: revision 1.21 via patch
sys/netipsec/key.c: revision 1.33,1.44
sys/netipsec/xform_ipcomp.c: revision 1.9
sys/netipsec/xform_ipip.c: revision 1.15
sys/opencrypto/deflate.c: revision 1.8
Commit my SoC work
Add ipv6 support for fast_ipsec
Note that currently, packets with extension headers are not correctly
supported
Change the ipcomp logic

Add sysctl tree to modify the fast_ipsec options related to ipv6. Similar
to the sysctl kame interface.

Choose the right default policy, depending on the address family of the
desired policy

Increase the refcount for the default ipv6 policy so nobody can reclaim it

Always compute the sp index even if we don't have any sp in spd. It
lets us choose the right default policy (based on the address family
requested).
While here, fix an error message

Use a dynamic array instead of a static array to decompress. It lets us
decompress any data, whatever the ratio of decompressed data to compressed
data.
It fixes the last issues with fast_ipsec and ipcomp.
While here, bzero -> memset, bcopy -> memcpy, FREE -> free
Reviewed a long time ago by sam@
 1.123.2.1  12-May-2007  pavel branches: 1.123.2.1.2;
Pull up following revision(s) (requested by degroote in ticket #631):
sys/netinet6/icmp6.c: revision 1.126
Fix an infinite loop (and local DoS) in the case where the ip6_hdr and
the icmp6_hdr are not in the same mbuf.
Fix pr/34994 and probably pr/35333
Ok @rpaulo
 1.123.2.2.4.1  03-Oct-2008  jdc Pull up revision 1.150 (requested by adrianp in ticket #1209).

Fix for CVE-2008-3530 from matt@
Implement improved checking for MTU values on ICMP 'Packet Too Big Messages'
 1.123.2.1.2.1  04-Jun-2007  wrstuden Update to today's netbsd-4.
 1.129.2.3  07-May-2007  yamt sync with head.
 1.129.2.2  12-Mar-2007  rmind Sync with HEAD.
 1.129.2.1  27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.131.4.1  11-Jul-2007  mjf Sync with head.
 1.131.2.4  09-Oct-2007  ad Sync with head.
 1.131.2.3  20-Aug-2007  ad Sync with HEAD.
 1.131.2.2  15-Jul-2007  ad Sync with head.
 1.131.2.1  08-Jun-2007  ad Sync with head.
 1.134.2.1  15-Aug-2007  skrll Sync with HEAD.
 1.135.6.2  19-Jul-2007  dyoung Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

Introduce rt_walktree() for walking the routing table and
applying a function to each rtentry. Replace most
rn_walktree() calls with it.

Use rt_getkey()/rt_setkey() to get/set a route's destination.
Keep a pointer to the sockaddr key in the rtentry, so that
rtentry users do not have to grovel in the radix_node for
the key.

Add a RTM_GET method to rtrequest. Use that instead of
radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

Constify. KNF. Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
et cetera. Use NULL instead of 0 for null pointers. Use
__arraycount(). Reduce gratuitous parenthesization.

Stop using variadic arguments for rip6_output(), it is
unnecessary.

Remove the unnecessary rtentry member rt_genmask and the
code to maintain it, since nothing actually used it.

Make rt_maskedcopy() easier to read by using meaningful variable
names.

Extract a subroutine intern_netmask() for looking up a netmask in
the masks table.

Start converting backslash-ridden IPv6 macros in
sys/netinet6/in6_var.h into inline subroutines that one
can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
 1.135.6.1  19-Jul-2007  dyoung file icmp6.c was added on branch matt-mips64 on 2007-07-19 20:48:56 +0000
 1.135.4.6  09-Dec-2007  jmcneill Sync with HEAD.
 1.135.4.5  04-Nov-2007  jmcneill Sync with HEAD.
 1.135.4.4  31-Oct-2007  joerg Sync with HEAD.
 1.135.4.3  26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.135.4.2  02-Oct-2007  joerg Sync with HEAD.
 1.135.4.1  16-Aug-2007  jmcneill Sync with HEAD.
 1.136.2.3  23-Mar-2008  matt sync with HEAD
 1.136.2.2  09-Jan-2008  matt sync with HEAD
 1.136.2.1  06-Nov-2007  matt sync with HEAD
 1.137.4.1  13-Nov-2007  bouyer Sync with HEAD
 1.140.4.1  08-Dec-2007  ad Sync with head.
 1.140.2.1  08-Dec-2007  mjf Sync with HEAD.
 1.141.12.4  05-Oct-2008  mjf Sync with HEAD.
 1.141.12.3  28-Sep-2008  mjf Sync with HEAD.
 1.141.12.2  02-Jun-2008  mjf Sync with HEAD.
 1.141.12.1  03-Apr-2008  mjf Sync with HEAD.
 1.141.8.2  24-Mar-2008  keiichi sync with head.
 1.141.8.1  22-Feb-2008  keiichi imported Mobile IPv6 code developed by the SHISA project
(http://www.mobileip.jp/).
 1.145.2.1  18-May-2008  yamt sync with head.
 1.146.2.4  09-Oct-2010  yamt sync with head
 1.146.2.3  11-Mar-2010  yamt sync with head
 1.146.2.2  04-May-2009  yamt sync with head.
 1.146.2.1  16-May-2008  yamt sync with head.
 1.148.6.1  19-Oct-2008  haad Sync with HEAD.
 1.148.2.2  10-Oct-2008  skrll Sync with HEAD.
 1.148.2.1  18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.150.8.1  13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.150.2.1  28-Apr-2009  skrll Sync with HEAD.
 1.155.4.1  05-Mar-2011  rmind sync with head
 1.155.2.1  22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.157.6.2  05-Apr-2012  mrg sync to latest -current.
 1.157.6.1  18-Feb-2012  mrg merge to -current.
 1.157.2.3  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.157.2.2  30-Oct-2012  yamt sync with head
 1.157.2.1  17-Apr-2012  yamt sync with head
 1.159.8.1  08-Jul-2013  jdc Pull up revisions:
src/share/man/man7/sysctl.7 revision 1.73 via patch
src/sys/netinet6/icmp6.c revision 1.161 via patch
src/sys/netinet6/in6.c revision 1.161 via patch
src/sys/netinet6/in6_proto.c revision 1.97 via patch
src/sys/netinet6/in6_var.h revision 1.65 via patch
src/sys/netinet6/ip6_input.c revision 1.139 via patch
src/sys/netinet6/ip6_var.h revision 1.59 via patch
src/sys/netinet6/nd6.c revision 1.143 via patch
src/sys/netinet6/nd6.h revision 1.57 via patch
src/sys/netinet6/nd6_rtr.c revision 1.83 via patch
(requested by christos in ticket #905).
Patch by Loganaden Velvindron.

4 new sysctls to avoid ipv6 DoS attacks from OpenBSD
 1.159.6.1  08-Jul-2013  jdc Pull up revisions:
src/share/man/man7/sysctl.7 revision 1.73 via patch
src/sys/netinet6/icmp6.c revision 1.161 via patch
src/sys/netinet6/in6.c revision 1.161 via patch
src/sys/netinet6/in6_proto.c revision 1.97 via patch
src/sys/netinet6/in6_var.h revision 1.65 via patch
src/sys/netinet6/ip6_input.c revision 1.139 via patch
src/sys/netinet6/ip6_var.h revision 1.59 via patch
src/sys/netinet6/nd6.c revision 1.143 via patch
src/sys/netinet6/nd6.h revision 1.57 via patch
src/sys/netinet6/nd6_rtr.c revision 1.83 via patch
(requested by christos in ticket #905).
Patch by Loganaden Velvindron.

4 new sysctls to avoid ipv6 DoS attacks from OpenBSD
 1.159.2.2  15-Nov-2015  bouyer Pull up following revision(s) (requested by ozaki-r in ticket #1327):
sys/netinet6/icmp6.c: revision 1.177
Update icmp6_redirect_timeout_q when changing net.inet6.icmp6.redirtimeout
We have to update icmp6_redirect_timeout_q as well as icmp6_redirtimeout
when changing net.inet6.icmp6.redirtimeout via sysctl. The updating logic
is copied from sysctl_net_inet_icmp_redirtimeout.
This change is from s-yamaguchi@IIJ (with KNF by ozaki-r) and fixes
PR kern/50240.
 1.159.2.1  08-Jul-2013  jdc Pull up revisions:
src/share/man/man7/sysctl.7 revision 1.73 via patch
src/sys/netinet6/icmp6.c revision 1.161 via patch
src/sys/netinet6/in6.c revision 1.161 via patch
src/sys/netinet6/in6_proto.c revision 1.97 via patch
src/sys/netinet6/in6_var.h revision 1.65 via patch
src/sys/netinet6/ip6_input.c revision 1.139 via patch
src/sys/netinet6/ip6_var.h revision 1.59 via patch
src/sys/netinet6/nd6.c revision 1.143 via patch
src/sys/netinet6/nd6.h revision 1.57 via patch
src/sys/netinet6/nd6_rtr.c revision 1.83 via patch
(requested by christos in ticket #905).
Patch by Loganaden Velvindron.

4 new sysctls to avoid ipv6 DoS attacks from OpenBSD
 1.161.2.3  03-Dec-2017  jdolecek update from HEAD
 1.161.2.2  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.161.2.1  23-Jun-2013  tls resync from head
 1.162.2.3  18-May-2014  rmind sync with head
 1.162.2.2  28-Aug-2013  rmind Checkpoint work in progress:
- Initial split of the protocol user-request method into the following
methods: pr_attach, pr_detach and pr_generic for the old pr_usrreq.
- Adjust socreate(9) and sonewconn(9) to call pr_attach without the
socket lock held (as a preparation for the locking scheme adjustment).
- Adjust all pr_attach routines to assert that PCB is not set.
- Sprinkle various comments, document some routines and their locking.
- Remove M_PCB, replace with kmem(9).
- Fix a few bugs spotted on the way.
 1.162.2.1  17-Jul-2013  rmind Checkpoint work in progress:
- Move PCB structures under __INPCB_PRIVATE, adjust most of the callers
and thus make IPv4 PCB structures mostly opaque. Any volunteers for
merging in6pcb with inpcb (see rpaulo-netinet-merge-pcb branch)?
- Move various global vars to the modules where they belong, make them static.
- Some preliminary work for IPv4 PCB locking scheme.
- Make raw IP code mostly MP-safe. Simplify some of it.
- Rework "fast" IP forwarding (ipflow) code to be mostly MP-safe. It should
run from a software interrupt, rather than hard.
- Rework tun(4) pseudo interface to be MP-safe.
- Work towards making some other interfaces more strict.
 1.165.2.1  10-Aug-2014  tls Rebase.
 1.169.2.1  05-Nov-2015  riz Pull up following revision(s) (requested by ozaki-r in ticket #982):
sys/netinet6/icmp6.c: revision 1.177
Update icmp6_redirect_timeout_q when changing net.inet6.icmp6.redirtimeout
We have to update icmp6_redirect_timeout_q as well as icmp6_redirtimeout
when changing net.inet6.icmp6.redirtimeout via sysctl. The updating logic
is copied from sysctl_net_inet_icmp_redirtimeout.
This change is from s-yamaguchi@IIJ (with KNF by ozaki-r) and fixes
PR kern/50240.
 1.170.2.8  28-Aug-2017  skrll Sync with HEAD
 1.170.2.7  05-Feb-2017  skrll Sync with HEAD
 1.170.2.6  05-Dec-2016  skrll Sync with HEAD
 1.170.2.5  05-Oct-2016  skrll Sync with HEAD
 1.170.2.4  09-Jul-2016  skrll Sync with HEAD
 1.170.2.3  29-May-2016  skrll Sync with HEAD
 1.170.2.2  22-Apr-2016  skrll Sync with HEAD
 1.170.2.1  22-Sep-2015  skrll Sync with HEAD
 1.192.2.5  20-Mar-2017  pgoyette Sync with HEAD
 1.192.2.4  07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.192.2.3  04-Nov-2016  pgoyette Sync with HEAD
 1.192.2.2  06-Aug-2016  pgoyette Sync with HEAD
 1.192.2.1  26-Jul-2016  pgoyette Sync with HEAD
 1.204.2.1  21-Apr-2017  bouyer Sync with HEAD
 1.211.6.8  25-Oct-2018  martin Pull up following revision(s) (requested by ozaki-r in ticket #1071):

sys/netinet6/icmp6.c: revision 1.240

Remove a leftover debug printf

Pointed out by hannken@
 1.211.6.7  23-Jun-2018  martin Pull up following revision(s) (requested by maxv in ticket #893):

sys/netinet6/icmp6.c: revision 1.228,1.230

Remove the RH0 code from ICMPv6. RH0 is deprecated by RFC5095 (2007) for
security reasons. We already removed it in Route6.

In addition there was an mbuf bug here: calling IP6_EXTHDR_GET twice with
the same offset, but still using the pointer from the first call, which
could have been made invalid. By luck, m_pulldown leaves zero-sized mbufs
in place, instead of freeing them.

And in general, using a 'finaldst' pointer on the mbuf, and then modifying
that mbuf with IP6_EXTHDR_GET with a smaller offset, was really error-
prone.

Fix 'icmp6len', it shouldn't be ip6_plen, because we may not be at the
beginning of the packet (off+ip6_plen is beyond the end of the mbuf). By
luck, the IP6_EXTHDR_GET that follows will fail and prevent buffer
overflows in non-jumbogram packets.

For jumbograms we will probably be in trouble here; but it doesn't seem
possible to reliably craft a jumbogram for a non-jumbogram-enabled device.

So I don't think it's a huge problem.
 1.211.6.6  08-Jun-2018  martin Pull up following revision(s) (requested by ozaki-r in ticket #852):

sys/netinet6/icmp6.c: revision 1.238
sys/netinet/ip_icmp.c: revision 1.171
sys/net/route.c: revision 1.210

Fix _rt_free via rtrequest(RTM_DELETE) hangs in rt_timer handlers

A rt_timer handler is passed a rtentry with an extra reference that keeps the
rtentry from being accidentally released. So rt_timer handlers must release
the reference of a passed rtentry by themselves (but they didn't).
 1.211.6.5  09-Apr-2018  bouyer Pull up following revision(s) (requested by roy in ticket #724):
tests/net/icmp/t_ping.c: revision 1.19
sys/netinet6/raw_ip6.c: revision 1.166
sys/netinet6/ip6_input.c: revision 1.195
sys/net/raw_usrreq.c: revision 1.59
sys/sys/socketvar.h: revision 1.151
sys/kern/uipc_socket2.c: revision 1.128
tests/lib/libc/sys/t_recvmmsg.c: revision 1.2
lib/libc/sys/recv.2: revision 1.38
sys/net/rtsock.c: revision 1.239
sys/netinet/udp_usrreq.c: revision 1.246
sys/netinet6/icmp6.c: revision 1.224
tests/net/icmp/t_ping.c: revision 1.20
sys/netipsec/keysock.c: revision 1.63
sys/netinet/raw_ip.c: revision 1.172
sys/kern/uipc_socket.c: revision 1.260
tests/net/icmp/t_ping.c: revision 1.22
sys/kern/uipc_socket.c: revision 1.261
tests/net/icmp/t_ping.c: revision 1.23
sys/netinet/ip_mroute.c: revision 1.155
sbin/route/route.c: revision 1.159
sys/netinet6/ip6_mroute.c: revision 1.123
sys/netatalk/ddp_input.c: revision 1.31
sys/netcan/can.c: revision 1.3
sys/kern/uipc_usrreq.c: revision 1.184
sys/netinet6/udp6_usrreq.c: revision 1.138
tests/net/icmp/t_ping.c: revision 1.18
socket: report receive buffer overflows
Add soroverflow() which increments the overflow counter, sets so_error
to ENOBUFS and wakes the receive socket up.
Replace all code that manually increments this counter with soroverflow().
Add soroverflow() to raw_input().
This allows userland to detect route(4) overflows so it can re-sync
with the current state.
socket: clear error even when peeking
The error has already been reported and it's pointless requiring another
recv(2) call just to clear it.
socket: remove now incorrect comment that so_error is only udp
As it can be affected by route(4) sockets which are raw.
rtsock: log dropped messages that we cannot report to userland
Handle ENOBUFS when receiving messages.
Don't send messages if the receiver has died.
Sprinkle more soroverflow().
Handle ENOBUFS in recv
Handle ENOBUFS in sendto
Note value received. Harden another sendto for ENOBUFS.
Handle the routing socket overflowing gracefully.
Allow a valid sendto .... duh
Handle errors better.
Fix test for checking we sent all the data we asked to.
 1.211.6.4  31-Mar-2018  martin Pull up following revision(s) (requested by maxv in ticket #665):

sys/netinet6/icmp6.c: revision 1.215

Style, and four fixes:

* Remove the (disabled) IPPROTO_ESP check. If the packet was decrypted it
will have M_DECRYPTED, and this is already checked.
* Memory leaks in icmp6_error2. They seem hardly triggerable.
* Fix miscomputation in _icmp6_input, the ICMP6 header is not guaranteed
to be located right after the IP6 header. ok mlelstv@
* Memory leak in _icmp6_input. This one seems to be impossible to trigger.
 1.211.6.3  08-Nov-2017  snj Pull up following revision(s) (requested by ozaki-r in ticket #350):
sys/netinet6/icmp6.c: revision 1.214
sys/netinet6/raw_ip6.c: revision 1.158
Fix usages of ipsec_used
If IPsec isn't used, we must go back to the normal path.
PR kern/52659
 1.211.6.2  21-Oct-2017  snj Pull up following revision(s) (requested by ozaki-r in ticket #300):
crypto/dist/ipsec-tools/src/setkey/parse.y: 1.19
crypto/dist/ipsec-tools/src/setkey/token.l: 1.20
distrib/sets/lists/tests/mi: 1.754, 1.757, 1.759
doc/TODO.smpnet: 1.12-1.13
sys/net/pfkeyv2.h: 1.32
sys/net/raw_cb.c: 1.23-1.24, 1.28
sys/net/raw_cb.h: 1.28
sys/net/raw_usrreq.c: 1.57-1.58
sys/net/rtsock.c: 1.228-1.229
sys/netinet/in_proto.c: 1.125
sys/netinet/ip_input.c: 1.359-1.361
sys/netinet/tcp_input.c: 1.359-1.360
sys/netinet/tcp_output.c: 1.197
sys/netinet/tcp_var.h: 1.178
sys/netinet6/icmp6.c: 1.213
sys/netinet6/in6_proto.c: 1.119
sys/netinet6/ip6_forward.c: 1.88
sys/netinet6/ip6_input.c: 1.181-1.182
sys/netinet6/ip6_output.c: 1.193
sys/netinet6/ip6protosw.h: 1.26
sys/netipsec/ipsec.c: 1.100-1.122
sys/netipsec/ipsec.h: 1.51-1.61
sys/netipsec/ipsec6.h: 1.18-1.20
sys/netipsec/ipsec_input.c: 1.44-1.51
sys/netipsec/ipsec_netbsd.c: 1.41-1.45
sys/netipsec/ipsec_output.c: 1.49-1.64
sys/netipsec/ipsec_private.h: 1.5
sys/netipsec/key.c: 1.164-1.234
sys/netipsec/key.h: 1.20-1.32
sys/netipsec/key_debug.c: 1.18-1.21
sys/netipsec/key_debug.h: 1.9
sys/netipsec/keydb.h: 1.16-1.20
sys/netipsec/keysock.c: 1.59-1.62
sys/netipsec/keysock.h: 1.10
sys/netipsec/xform.h: 1.9-1.12
sys/netipsec/xform_ah.c: 1.55-1.74
sys/netipsec/xform_esp.c: 1.56-1.72
sys/netipsec/xform_ipcomp.c: 1.39-1.53
sys/netipsec/xform_ipip.c: 1.50-1.54
sys/netipsec/xform_tcp.c: 1.12-1.16
sys/rump/librump/rumpkern/Makefile.rumpkern: 1.170
sys/rump/librump/rumpnet/net_stub.c: 1.27
sys/sys/protosw.h: 1.67-1.68
tests/net/carp/t_basic.sh: 1.7
tests/net/if_gif/t_gif.sh: 1.11
tests/net/if_l2tp/t_l2tp.sh: 1.3
tests/net/ipsec/Makefile: 1.7-1.9
tests/net/ipsec/algorithms.sh: 1.5
tests/net/ipsec/common.sh: 1.4-1.6
tests/net/ipsec/t_ipsec_ah_keys.sh: 1.2
tests/net/ipsec/t_ipsec_esp_keys.sh: 1.2
tests/net/ipsec/t_ipsec_gif.sh: 1.6-1.7
tests/net/ipsec/t_ipsec_l2tp.sh: 1.6-1.7
tests/net/ipsec/t_ipsec_misc.sh: 1.8-1.18
tests/net/ipsec/t_ipsec_sockopt.sh: 1.1-1.2
tests/net/ipsec/t_ipsec_tcp.sh: 1.1-1.2
tests/net/ipsec/t_ipsec_transport.sh: 1.5-1.6
tests/net/ipsec/t_ipsec_tunnel.sh: 1.9
tests/net/ipsec/t_ipsec_tunnel_ipcomp.sh: 1.1-1.2
tests/net/ipsec/t_ipsec_tunnel_odd.sh: 1.3
tests/net/mcast/t_mcast.sh: 1.6
tests/net/net/t_ipaddress.sh: 1.11
tests/net/net_common.sh: 1.20
tests/net/npf/t_npf.sh: 1.3
tests/net/route/t_flags.sh: 1.20
tests/net/route/t_flags6.sh: 1.16
usr.bin/netstat/fast_ipsec.c: 1.22
Do m_pullup before mtod

It may fix panics in some tests on anita/sparc and anita/GuruPlug.
---
KNF
---
Enable DEBUG for babylon5
---
Apply C99-style struct initialization to xformsw
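
For illustration only, a minimal sketch of the two initialization styles. The struct demo_xform and its members are hypothetical and are not the real struct xformsw definition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical transform switch entry; NOT the real struct xformsw. */
struct demo_xform {
	unsigned short	 xd_type;		/* transform identifier */
	const char	*xd_name;		/* human-readable name */
	int		(*xd_init)(void);	/* setup hook */
};

static int
demo_init(void)
{
	return 0;
}

/* Positional style: the meaning of each value depends on member order. */
static const struct demo_xform demo_old = { 1, "demo", demo_init };

/* C99 style: each member is named; omitted members are zero-initialized. */
static const struct demo_xform demo_new = {
	.xd_type = 1,
	.xd_name = "demo",
	.xd_init = demo_init,
};

/* Returns 1 if both initializers produced the same entry, else 0. */
static int
demo_same(void)
{
	assert(demo_old.xd_name != NULL && demo_new.xd_name != NULL);
	return demo_old.xd_type == demo_new.xd_type &&
	    strcmp(demo_old.xd_name, demo_new.xd_name) == 0 &&
	    demo_old.xd_init == demo_new.xd_init;
}
```

The named form keeps working if members are later reordered or added, which is the usual motivation for this kind of cleanup.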
---
Tweak outputs of netstat -s for IPsec

- Get rid of "Fast"
- Use ipsec and ipsec6 for titles to clarify protocol
- Indent outputs of sub protocols

Original outputs were organized like this:

(Fast) IPsec:
IPsec ah:
IPsec esp:
IPsec ipip:
IPsec ipcomp:
(Fast) IPsec:
IPsec ah:
IPsec esp:
IPsec ipip:
IPsec ipcomp:

New outputs are organized like this:

ipsec:
	ah:
	esp:
	ipip:
	ipcomp:
ipsec6:
	ah:
	esp:
	ipip:
	ipcomp:
---
Add test cases for IPComp
---
Simplify IPSEC_OSTAT macro (NFC)
---
KNF; replace leading whitespaces with hard tabs
---
Introduce and use SADB_SASTATE_USABLE_P
---
KNF
---
Add update command for testing

Updating an SA (SADB_UPDATE) requires that the process issuing
SADB_UPDATE is the same as the process that issued SADB_ADD (or SADB_GETSPI).
This means that the update command must be used together with the add
command in a setkey configuration. This usage is normally meaningless but
useful for testing (and debugging) purposes.
---
Add test cases for updating SA/SP

The tests require the newly added update command of setkey.
---
PR/52346: Frank Kardel: Fix checksumming for NAT-T
See XXX for improvements.
---
Remove codes for PACKET_TAG_IPSEC_IN_CRYPTO_DONE

It seems that PACKET_TAG_IPSEC_IN_CRYPTO_DONE is for network adapters
that have IPsec accelerators; a driver sets the mtag to a packet
when its device has already encrypted the packet.

Unfortunately no driver has implemented such offload features for many
years and none seems likely to implement them soon. (Note that neither
FreeBSD nor Linux has such drivers.) Let's remove the related
(unused) code and simplify the IPsec code.
---
Fix usages of sadb_msg_errno
---
Avoid updating sav directly

On SADB_UPDATE a target sav was updated directly, which was unsafe.
Instead allocate another sav, copy variables of the old sav to
the new one and replace the old one with the new one.
---
Simplify; we can assume sav->tdb_xform cannot be NULL while it's valid
---
Rename key_alloc* functions (NFC)

We shouldn't use the term "alloc" for functions that just look up
data and actually don't allocate memory.
---
Use explicit_memset to surely zero-clear key_auth and key_enc
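
For illustration only, a portable sketch of why a plain memset may not be enough for key material. The demo_* names are hypothetical. The volatile function pointer is a common stand-in technique and is an assumption here; it is not how explicit_memset(3) is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Assumption for illustration: calling memset through a volatile
 * function pointer is intended to keep the compiler from treating the
 * clear as a dead store and removing it.  On NetBSD, explicit_memset(3)
 * is the interface meant for this job.
 */
static void *(*const volatile demo_memset)(void *, int, size_t) = memset;

/* Zero-clear key material that is about to go out of use. */
static void
demo_clear_key(unsigned char *key, size_t len)
{
	assert(key != NULL);
	(void)demo_memset(key, 0, len);
}

/* Returns 1 if every byte of the buffer is zero, else 0. */
static int
demo_key_is_zero(const unsigned char *key, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (key[i] != 0)
			return 0;
	}
	return 1;
}
```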
---
Make sure to clear keys on error paths of key_setsaval
---
Add missing KEY_FREESAV
---
Make sure a sav is inserted to a sah list after its initialization completes
---
Remove unnecessary zero-clearing codes from key_setsaval

key_setsaval is now used only for a newly-allocated sav. (It was
used to reset variables of an existing sav.)
---
Correct wrong assumption of sav->refcnt in key_delsah

A sav in a list should never have sav->refcnt == 0. Also,
KEY_FREESAV assumes sav->refcnt > 0.
---
Let key_getsavbyspi take a reference of a returning sav
---
Use time_mono_to_wall (NFC)
---
Separate sending message routine (NFC)
---
Simplify; remove unnecessary zero-clears

key_freesaval is used only when a target sav is being destroyed.
---
Omit NULL checks for sav->lft_c

sav->lft_c can be NULL only when initializing or destroying sav.
---
Omit unnecessary NULL checks for sav->sah
---
Omit unnecessary check of sav->state

key_allocsa_policy picks a sav of either MATURE or DYING so we
don't need to check its state again.
---
Simplify; omit unnecessary saidx passing

- ipsec_nextisr returns a saidx but no caller uses it
- key_checkrequest is passed a saidx but it can be obtained from
another argument (isr)
---
Fix splx not being called on some error paths
---
Fix header size calculation of esp where sav is NULL
---
Fix header size calculation of ah in the case sav is NULL

This fix was also needed for esp.
---
Pass sav directly to opencrypto callback

In a callback, use a passed sav as-is by default and look up a sav
only if the passed sav is dead.
---
Avoid examining freshness of sav on packet processing

If a sav list is sorted (by lft_c->sadb_lifetime_addtime) in advance,
we don't need to examine each sav and also don't need to delete one
on the fly and send up a message. Fortunately every sav list is sorted
as we need.

The newly added key_validate_savlist validates that each sav list is
indeed sorted (run only if DEBUG because it's not cheap).
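
For illustration only, a minimal sketch of such a validation pass. The struct demo_sav, the function names, and the sort direction (newest first) are assumptions; the actual key_validate_savlist in key.c may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified SA entry; NOT the real struct secasvar. */
struct demo_sav {
	uint64_t	 addtime;	/* stands in for lft_c->sadb_lifetime_addtime */
	struct demo_sav	*next;
};

/*
 * Returns 1 if the list is sorted by addtime, newest first
 * (non-increasing), else 0.  The direction is an assumption.
 * Cost is linear in the list length, hence DEBUG-only in spirit.
 */
static int
demo_validate_savlist(const struct demo_sav *head)
{
	const struct demo_sav *sav;

	for (sav = head; sav != NULL && sav->next != NULL; sav = sav->next) {
		if (sav->addtime < sav->next->addtime)
			return 0;
	}
	return 1;
}

/*
 * With a list known to be sorted, picking an entry needs no scan.
 * This sketch simply takes the head.
 */
static const struct demo_sav *
demo_pick(const struct demo_sav *head)
{
	assert(demo_validate_savlist(head));
	return head;
}
```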
---
Add test cases for SAs with different SPIs
---
Prepare to stop using isr->sav

isr is a shared resource and using isr->sav as temporary storage
for each packet processing is racy. Also, having a reference from
isr to sav makes the lifetime of sav non-deterministic; such a reference
is removed when a packet is processed and isr->sav is overwritten by
a new one. Let's have a sav locally for each packet processing instead of
using the shared isr->sav.

However this change doesn't stop using isr->sav yet because there are
some users of isr->sav. isr->sav will be removed after the users find
a way to not use isr->sav.
---
Fix wrong argument handling
---
fix printf format.
---
Don't validate sav lists of LARVAL or DEAD states

We don't sort the lists so the validation will always fail.

Fix PR kern/52405
---
Make sure to sort the list when changing the state by key_sa_chgstate
---
Rename key_allocsa_policy to key_lookup_sa_bysaidx
---
Separate test files
---
Calculate ah_max_authsize on initialization as well as esp_max_ivlen
---
Remove m_tag_find(PACKET_TAG_IPSEC_PENDING_TDB) because nobody sets the tag
---
Restore a comment removed in previous

The comment is valid for the below code.
---
Make tests more stable

sleep command seems to wait longer than expected on anita so
use polling to wait for a state change.
---
Add tests that explicitly delete SAs instead of waiting for expirations
---
Remove invalid M_AUTHIPDGM check on ESP isr->sav

The M_AUTHIPDGM flag is set on an mbuf in ah_input_cb. An sav of ESP can
have AH authentication as sav->tdb_authalgxform. However, in that
case esp_input and esp_input_cb are used to do ESP decryption and
AH authentication, and M_AUTHIPDGM is never set on the mbuf. So
checking M_AUTHIPDGM of an mbuf on isr->sav of ESP is meaningless.
---
Look up sav instead of relying on unstable sp->req->sav

This code is executed only in an error path so an additional lookup
doesn't matter.
---
Correct a comment
---
Don't release sav if calling crypto_dispatch again
---
Remove extra KEY_FREESAV from ipsec_process_done

It should be done by the caller.
---
Don't bother the case of crp->crp_buf == NULL in callbacks
---
Hold a reference to an SP during opencrypto processing

An SP has a list of isr (ipsecrequest) that represents a sequence
of IPsec encryption/authentication processing. One isr corresponds
to one opencrypto processing. The lifetime of an isr follows its SP.

We pass an isr to a callback function of opencrypto to continue
to a next encryption/authentication processing. However nobody
guaranteed that the isr wasn't freed, i.e., its SP wasn't destroyed.

In order to avoid such unexpected destruction of isr, hold a reference
to its SP during opencrypto processing.
---
Don't make SAs expired on tests that delete SAs explicitly
---
Fix a debug message
---
Dedup error paths (NFC)
---
Use pool to allocate tdb_crypto

For ESP and AH, we need to allocate an extra variable space in addition
to struct tdb_crypto. The fixed size of pool items may be larger than
an actual requisite size of a buffer, but still the performance
improvement by replacing malloc with pool wins.
---
Don't use unstable isr->sav for header size calculations

We may need to optimize to not look up sav here for users that
don't need to know an exact size of headers (e.g., TCP segment size
calculation).
---
Don't use sp->req->sav when handling NAT-T ESP fragmentation

In order to do this we need to look up a sav; however, an additional
look-up degrades performance. A sav is later looked up in
ipsec4_process_packet, so delay the fragmentation check until then
to avoid an extra look-up.
---
Don't use key_lookup_sp that depends on unstable sp->req->sav

It provided a fast look-up of SP. We will provide an alternative
method in the future (after basic MP-ification finishes).
---
Stop setting isr->sav on looking up sav in key_checkrequest
---
Remove ipsecrequest#sav
---
Stop setting mtag of PACKET_TAG_IPSEC_IN_DONE because there are no users anymore
---
Skip ipsec_spi_*_*_preferred_new_timeout when running on qemu

Probably due to PR 43997
---
Add localcount to rump kernels
---
Remove unused macro
---
Fix key_getcomb_setlifetime

The fix adjusts a soft limit to be 80% of a corresponding hard limit.

I'm not sure the fix is really correct, but at least the original
code is wrong. A passed comb is zero-cleared before calling
key_getcomb_setlifetime, so
comb->sadb_comb_soft_addtime = comb->sadb_comb_soft_addtime * 80 / 100;
is meaningless.
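
For illustration only, a small sketch of the arithmetic. The struct demo_comb and both functions are hypothetical stand-ins, not the real struct sadb_comb or key_getcomb_setlifetime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified proposal entry; NOT the real struct sadb_comb. */
struct demo_comb {
	uint64_t hard_addtime;
	uint64_t soft_addtime;
};

/* Old pattern: scales the soft limit by itself, so zero stays zero. */
static void
demo_setlifetime_old(struct demo_comb *comb, uint64_t hard)
{
	assert(comb != NULL);
	comb->hard_addtime = hard;
	comb->soft_addtime = comb->soft_addtime * 80 / 100;
}

/* Fixed pattern: derives the soft limit as 80% of the hard limit. */
static void
demo_setlifetime_new(struct demo_comb *comb, uint64_t hard)
{
	assert(comb != NULL);
	comb->hard_addtime = hard;
	comb->soft_addtime = comb->hard_addtime * 80 / 100;
}
```

With a zero-cleared comb and a hard limit of 3600, the old pattern leaves the soft limit at 0 while the fixed pattern yields 2880.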
---
Provide and apply key_sp_refcnt (NFC)

It simplifies further changes.
---
Fix indentation

Pointed out by knakahara@
---
Use pslist(9) for sptree
---
Don't acquire global locks for IPsec if NET_MPSAFE

Note that the change is just to make testing easy and IPsec isn't MP-safe yet.
---
Let PF_KEY socks hold their own lock instead of softnet_lock

Operations on SAD and SPD are executed via PF_KEY socks. The operations
include deletions of SAs and SPs that will use synchronization mechanisms
such as pserialize_perform to wait for references to SAs and SPs to be
released. It is known that using such mechanisms while holding softnet_lock
causes a deadlock. We should avoid the situation.
---
Make IPsec SPD MP-safe

We use localcount(9), not psref(9), to make the sptree and secpolicy (SP)
entries MP-safe because SPs need to be referenced over opencrypto
processing that executes a callback in a different context.

SPs on sockets aren't managed by the sptree and can be destroyed in softint.
localcount_drain cannot be used in softint so we delay the destruction of
such SPs to a thread context. To do so, a list to manage such SPs is added
(key_socksplist) and key_timehandler_spd deletes dead SPs in the list.

For more details please read the locking notes in key.c.

Proposed on tech-kern@ and tech-net@
---
Fix updating ipsec_used

- key_update_used wasn't called in key_api_spddelete2 and key_api_spdflush
- key_update_used wasn't called if an SP had been added/deleted but
a reply to userland failed
---
Fix updating ipsec_used; turn on when SPs on sockets are added
---
Add missing IPsec policy checks to icmp6_rip6_input

icmp6_rip6_input is quite similar to rip6_input and the same checks exist
in rip6_input.
---
Add test cases for setsockopt(IP_IPSEC_POLICY)
---
Don't use KEY_NEWSP for dummy SP entries

With this change KEY_NEWSP is no longer called from softint,
so we can use kmem_zalloc with KM_SLEEP for KEY_NEWSP.
---
Comment out unused functions
---
Add test cases that there are SPs but no relevant SAs
---
Don't allow sav->lft_c to be NULL

lft_c of an sav that was created by SADB_GETSPI could be NULL.
---
Clean up clunky eval strings

- Remove unnecessary \ at EOL
- This allows omitting ; too
- Remove unnecessary quotes for arguments of atf_set
- Don't expand $DEBUG in eval
- We expect it's expanded on execution

Suggested by kre@
---
Remove unnecessary KEY_FREESAV in an error path

sav should be freed (unreferenced) by the caller.
---
Use pslist(9) for sahtree
---
Use pslist(9) for sah->savtree
---
Rename local variable newsah to sah

It may not be new.
---
MP-ify SAD slightly

- Introduce key_sa_mtx and use it for some list operations
- Use pserialize for some list iterations
---
Introduce KEY_SA_UNREF and replace KEY_FREESAV with it where sav will never be actually freed in the future

KEY_SA_UNREF is still key_freesav so no functional change for now.

This change reduces diff of further changes.
---
Remove out-of-date log output

Pointed out by riastradh@
---
Use KDASSERT instead of KASSERT for mutex_ownable

Because mutex_ownable is too heavy to run in a fast path
even for DIAGNOSTIC + LOCKDEBUG.

Suggested by riastradh@
---
Assemble global lists and related locks into cache lines (NFCI)

Also rename variable names from *tree to *list because they are
just lists, not trees.

Suggested by riastradh@
---
Move locking notes
---
Update the locking notes

- Add locking order
- Add locking notes for misc lists such as reglist
- Mention pserialize, key_sp_ref and key_sp_unref on SP operations

Requested by riastradh@
---
Describe constraints of key_sp_ref and key_sp_unref

Requested by riastradh@
---
Hold key_sad.lock on SAVLIST_WRITER_INSERT_TAIL
---
Add __read_mostly to key_psz

Suggested by riastradh@
---
Tweak wording (pserialize critical section => pserialize read section)

Suggested by riastradh@
---
Add missing mutex_exit
---
Fix setkey -D -P outputs

The outputs were tweaked (by me), but I forgot to update libipsec
in my local ATF environment...
---
MP-ify SAD (key_sad.sahlist and sah entries)

localcount(9) is used to protect key_sad.sahlist and sah entries
as well as SPD (and will be used for SAD sav).

Please read the locking notes of SAD for more details.
---
Introduce key_sa_refcnt and replace sav->refcnt with it (NFC)
---
Destroy sav only in the loop for DEAD sav
---
Fix KASSERT(solocked(sb->sb_so)) failure in sbappendaddr that is called eventually from key_sendup_mbuf

If key_sendup_mbuf isn't passed a socket, the assertion fails.
Originally in this case sb->sb_so was softnet_lock and callers
held softnet_lock so the assertion was magically satisfied.
Now sb->sb_so is key_so_mtx and also softnet_lock isn't always
held by callers so the assertion can fail.

Fix it by holding key_so_mtx if key_sendup_mbuf isn't passed a socket.

Reported by knakahara@
Tested by knakahara@ and ozaki-r@
---
Fix locking notes of SAD
---
Fix deadlock between key_sendup_mbuf called from key_acquire and localcount_drain

If we call key_sendup_mbuf from key_acquire, which is called during packet
processing, a deadlock can happen like this:
- At key_acquire, a reference to an SP (and an SA) is held
- key_sendup_mbuf will try to take key_so_mtx
- Some other thread may call localcount_drain on the SP while
  holding key_so_mtx, say in key_api_spdflush
- In this case localcount_drain never returns because key_sendup_mbuf,
  which is stuck on key_so_mtx, never releases its reference to the SP

Fix the deadlock by deferring key_sendup_mbuf to the timer
(key_timehandler).
---
Fix that prev isn't cleared on retry
---
Limit the number of mbufs queued for deferred key_sendup_mbuf

Hundreds of mbufs can easily be queued on the list under heavy
network load.
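
For illustration only, a minimal sketch of a bounded deferral queue that a timer later drains. All names, the limit value, and the choice to reject new entries when full are assumptions; the actual policy in key.c may differ.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_QUEUE_LIMIT 4	/* hypothetical cap on deferred messages */

/* Hypothetical stand-in for a queued mbuf. */
struct demo_msg {
	int		 id;
	struct demo_msg	*next;
};

struct demo_queue {
	struct demo_msg	*head;
	struct demo_msg	*tail;
	size_t		 len;
};

/* Returns 0 if the message was queued, -1 if the queue is full. */
static int
demo_defer(struct demo_queue *q, struct demo_msg *m)
{
	assert(q != NULL && m != NULL);
	if (q->len >= DEMO_QUEUE_LIMIT)
		return -1;	/* caller is expected to drop the message */
	m->next = NULL;
	if (q->tail != NULL)
		q->tail->next = m;
	else
		q->head = m;
	q->tail = m;
	q->len++;
	return 0;
}

/*
 * Meant to run from the timer: hand every queued message to send
 * (if non-NULL) and empty the queue.  Returns the number drained.
 */
static size_t
demo_drain(struct demo_queue *q, void (*send)(struct demo_msg *))
{
	struct demo_msg *m;
	size_t n = 0;

	assert(q != NULL);
	while ((m = q->head) != NULL) {
		q->head = m->next;
		m->next = NULL;
		if (send != NULL)
			send(m);
		n++;
	}
	q->tail = NULL;
	q->len = 0;
	return n;
}
```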
---
MP-ify SAD (savlist)

localcount(9) is used to protect savlist of sah. The basic design is
similar to MP-ifications of SPD and SAD sahlist. Please read the
locking notes of SAD for more details.
---
Simplify ipsec_reinject_ipstack (NFC)
---
Add per-CPU rtcache to ipsec_reinject_ipstack

It reduces route lookups and also reduces rtcache lock contentions
when NET_MPSAFE is enabled.
---
Use pool_cache(9) instead of pool(9) for tdb_crypto objects

The change improves network throughput especially on multi-core systems.
---
Update

ipsec(4), opencrypto(9) and vlan(4) are now MP-safe.
---
Write known issues on scalability
---
Share a global dummy SP between PCBs

It never changes, so it can be pre-allocated and shared safely between PCBs.
---
Fix race condition on the rawcb list shared by rtsock and keysock

keysock now protects itself with its own mutex, which means that
the rawcb list is protected by two different mutexes (keysock's one
and softnet_lock for rtsock), which of course is useless.

Fix the situation by having a discrete rawcb list for each.
---
Use a dedicated mutex for rt_rawcb instead of softnet_lock if NET_MPSAFE
---
fix localcount leak in sav. fixed by ozaki-r@n.o.

I commit on behalf of him.
---
remove unnecessary comment.
---
Fix deadlock between pserialize_perform and localcount_drain

A typical usage of localcount_drain looks like this:

	mutex_enter(&mtx);
	item = remove_from_list();
	pserialize_perform(psz);
	localcount_drain(&item->localcount, &cv, &mtx);
	mutex_exit(&mtx);

This sequence can cause a deadlock, which happens for example in the following
situation:

- Thread A calls localcount_drain, which calls xc_broadcast after releasing
  the specified mutex
- Thread B enters the sequence and calls pserialize_perform while holding
  the mutex; pserialize_perform also calls xc_broadcast
- Thread C (xc_thread), which calls an xcall callback of localcount_drain,
  tries to hold the mutex

xc_broadcast of thread B doesn't start until xc_broadcast of thread A
finishes, which is a feature of xcall(9). This means that pserialize_perform
never completes until xc_broadcast of thread A finishes. On the other hand,
thread C, which is a callee of xc_broadcast of thread A, blocks on the mutex.
Finally the threads block each other (A blocks B, B blocks C and C blocks A).

A possible fix is to serialize executions of the above sequence with another
mutex, but adding another mutex makes the code complex, so fix the deadlock
another way: release the mutex before pserialize_perform
and instead use a condvar to prevent pserialize_perform from being called
simultaneously.

Note that the deadlock happens only if NET_MPSAFE is enabled.
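
For illustration only, a userland sketch of the pattern described above, written with POSIX threads rather than the kernel primitives. All demo_* names are hypothetical, and demo_slow_sync merely stands in for pserialize_perform.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t demo_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t demo_cv = PTHREAD_COND_INITIALIZER;
static bool demo_busy;	/* true while some thread runs the slow step */
static int demo_runs;	/* number of completed slow steps */

/* Stand-in for pserialize_perform(); runs with demo_mtx released. */
static void
demo_slow_sync(void)
{
	/* Only the thread that set demo_busy may get here. */
	assert(demo_busy);
	demo_runs++;
}

static void
demo_remove_item(void)
{
	pthread_mutex_lock(&demo_mtx);
	/* Wait until no other thread is inside the slow step. */
	while (demo_busy)
		pthread_cond_wait(&demo_cv, &demo_mtx);
	demo_busy = true;
	/* ... unlink the item from its list here, under the mutex ... */
	pthread_mutex_unlock(&demo_mtx);

	/* The mutex is not held across the slow synchronization step. */
	demo_slow_sync();

	pthread_mutex_lock(&demo_mtx);
	demo_busy = false;
	pthread_cond_broadcast(&demo_cv);
	/* ... the drain step would follow here, with the mutex held ... */
	pthread_mutex_unlock(&demo_mtx);
}
```

The flag and condvar serialize the slow step without holding the mutex across it, so a thread that needs the mutex in the meantime is not blocked.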
---
Add missing ifdef NET_MPSAFE
---
Take softnet_lock on pr_input properly if NET_MPSAFE

Currently softnet_lock is taken unnecessarily in some cases, e.g.,
icmp_input and encap4_input from ip_input, or not taken even if needed,
e.g., udp_input and tcp_input from ipsec4_common_input_cb. Fix them.

NFC if NET_MPSAFE is disabled (default).
---
- sanitize key debugging so that we don't print extra newlines or unassociated
debugging messages.
- remove unused functions and make internal ones static
- print information in one line per message
---
humanize printing of ip addresses
---
cast reduction, NFC.
---
Fix typo in comment
---
Pull out ipsec_fill_saidx_bymbuf (NFC)
---
Don't abuse key_checkrequest just for looking up sav

It does more than expected, for example calling key_acquire.
---
Fix SP being broken in transport mode

isr->saidx was modified accidentally in ipsec_nextisr.

Reported by christos@
Helped investigations by christos@ and knakahara@
---
Constify isr at many places (NFC)
---
Include socketvar.h for softnet_lock
---
Fix buffer length for ipsec_logsastr
 1.211.6.1  07-Jul-2017  martin Pull up following revision(s) (requested by knakahara in ticket #106):
sys/netinet6/icmp6.c: revision 1.212
fix PR kern/52353. implemented by ozaki-r@n.o. I just commit by proxy.
XXX need to pullup to -8.
 1.223.2.8  26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.223.2.7  26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.223.2.6  06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.223.2.5  25-Jun-2018  pgoyette Sync with HEAD
 1.223.2.4  21-May-2018  pgoyette Sync with HEAD
 1.223.2.3  02-May-2018  pgoyette Synch with HEAD
 1.223.2.2  16-Apr-2018  pgoyette Sync with HEAD, resolve some conflicts
 1.223.2.1  22-Mar-2018  pgoyette Synch with HEAD, resolve conflicts
 1.238.2.3  13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.238.2.2  08-Apr-2020  martin Merge changes from current as of 20200406
 1.238.2.1  10-Jun-2019  christos Sync with HEAD
 1.242.4.1  10-Mar-2024  martin Pull up following revision(s) (requested by riastradh in ticket #1809):

sys/netinet6/raw_ip6.c: revision 1.184 (patch)
sys/netinet6/icmp6.c: revision 1.256 (patch)

Deliver timestamps also to raw sockets.
Fixes PR 57955
 1.247.2.1  03-Apr-2021  thorpej Sync with HEAD.
 1.254.2.2  10-Mar-2024  martin Pull up following revision(s) (requested by riastradh in ticket #615):

sys/netinet6/raw_ip6.c: revision 1.184
sys/netinet6/icmp6.c: revision 1.256

Deliver timestamps also to raw sockets.
Fixes PR 57955
 1.254.2.1  10-Dec-2023  martin Pull up following revision(s) (requested by pgoyette in ticket #487):

sys/compat/common/compat_90_mod.c: revision 1.5
sys/compat/common/compat_90_mod.c: revision 1.6
sys/netinet6/in6.c: revision 1.290
sys/netinet6/in6.c: revision 1.291
sys/compat/common/files.common: revision 1.11
sys/netinet6/icmp6.c: revision 1.255
sys/compat/common/net_inet6_nd_90.c: revision 1.1
sys/compat/common/net_inet6_nd_90.c: revision 1.2
sys/modules/compat_90/Makefile: revision 1.2
sys/modules/compat_90/Makefile: revision 1.3
sys/netinet6/nd6.c: revision 1.281
sys/compat/common/compat_mod.h: revision 1.10
sys/kern/compat_stub.c: revision 1.23
sys/sys/compat_stub.h: revision 1.27

Identify the need to rework the COMPAT_* code to be more
module-aware.
This is an XXX comment block only, NFCI.

Modularize the COMPAT_90 code that resulted from the removal of
netinet6/nd6 from the kernel. Now, the minimal compat code can
be successfully loaded and unloaded along with the rest of the
COMPAT_90 code.

Allow kernel builds which don't define INET6 to compile compat bits
too.

Default the build of compat_90 module to include IPv6, as is done
for other INET6-sensitive modules (see if_lagg).
