History log of /src/sys/netinet6/nd6_nbr.c
Revision  Date  Author  Comments
 1.186  28-Apr-2025  joe clear whitespace in IPV6 neighbor solicitation
 1.185  13-Nov-2024  roy ARP/ND6: Revert prior

Turns out some people actually use this behaviour and strictly speaking
it is allowed by RFC5227 2.4 where it says:

At any time, if a host receives
an ARP packet (Request *or* Reply) where the 'sender IP address' is
(one of) the host's own IP address(es) configured on that interface,
but the 'sender hardware address' does not match any of the host's
own interface addresses, then this is a conflicting ARP packet

The key part is "any of the host's own interface addresses".
 1.184  05-Oct-2024  roy ND6: only ignore messages from the receiving interface

Sync with ARP behaviour, reverts r1.163 slightly.
 1.183  29-Mar-2023  kardel branches: 1.183.6;
use carp mac address when replying to neighbor solicitations referring
to carp interface addresses.
unconfuses commercial routers
 1.182  02-Aug-2021  andvar fix various typos in comments and log messages.
 1.181  11-Sep-2020  roy inet6: Use generic Neighbor Detection rather than IPv6 specific

No functional change intended.
 1.180  20-Aug-2020  roy Sprinkle some const
 1.179  12-Jun-2020  roy Remove in-kernel handling of Router Advertisements

This is much better handled by a user-land tool.
Proposed on tech-net here:
https://mail-index.netbsd.org/tech-net/2020/04/22/msg007766.html

Note that the ioctl SIOCGIFINFO_IN6 no longer sets flags. That now
needs to be done using the pre-existing SIOCSIFINFO_FLAGS ioctl.

Compat is fully provided where it makes sense, but trying to turn on
RA handling will obviously throw an error as it no longer exists.

Note that if you use IPv6 temporary addresses, this now needs to be
turned on in dhcpcd.conf(5) rather than in sysctl.conf(5).
 1.178  22-Apr-2020  roy inet6: nd6_na_input() now considers ln_state <= ND6_LLINFO_INCOMPLETE

Otherwise if ln_state != ND6_LLINFO_INCOMPLETE and there is no lladdr
and this message was solicited then ln_state is set to ND6_LLINFO_REACHABLE
which could then cause a panic in nd6_resolve().
If ln_state > ND6_LLINFO_INCOMPLETE then it's assumed we have a lladdr.

Potentially this could have been triggered by the introduction of
ND6_LLINFO_PURGE in nd6.c r1.143 but also by the re-introduction of
ND6_LLINFO_INCOMPLETE in nd6.c r1.263.
Depending on the timing, it's technically possible to receive such
a message after the llentry is created with ND6_LLINFO_NOSTATE.
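
The state check described in r1.178 can be modelled roughly as follows. This is a userland sketch, not the kernel code: the `MODEL_` constants and `na_may_mark_reachable()` are hypothetical names, and the numeric values are assumptions chosen only so that the unresolved states sort at or below INCOMPLETE.

```c
#include <stdbool.h>

/* Hypothetical state values; only their ordering matters for this sketch. */
enum model_llstate {
	MODEL_LLINFO_PURGE = -3,
	MODEL_LLINFO_NOSTATE = -2,
	MODEL_LLINFO_INCOMPLETE = 0,
	MODEL_LLINFO_REACHABLE = 1,
	MODEL_LLINFO_STALE = 2
};

/*
 * Return true if a solicited NA may move the entry to REACHABLE.
 * An entry at or below INCOMPLETE has no lladdr recorded yet, so the
 * NA must carry one; above INCOMPLETE an lladdr is assumed to exist.
 */
static bool
na_may_mark_reachable(enum model_llstate state, bool na_has_lladdr)
{
	if (state <= MODEL_LLINFO_INCOMPLETE && !na_has_lladdr)
		return false;	/* nothing to resolve with: ignore the NA */
	return true;
}
```

Comparing with `<=` rather than `==` is what covers an entry still in a state such as NOSTATE when the message arrives.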
 1.177  09-Mar-2020  roy branches: 1.177.2;
route: RTM_MISS now puts the message source address in RTA_AUTHOR

route(8) also reports this.
A userland app could use this to blacklist nodes that probe for machines
that don't exist on a subnet / prefix.
 1.176  20-Jan-2020  thorpej Remove FDDI support.
 1.175  13-Nov-2019  ozaki-r branches: 1.175.2;
Get rid of unnecessary NULL checks for rt_ifa and ifa_ifp

They are always non-NULL nowadays.
 1.174  25-Sep-2019  ozaki-r Initialize DAD components properly

The original code initialized each component in non-init functions such as
arp_dad_start and nd6_dad_find, conditionally based on a global flag for each.
However, it was racy because the flag and the code around it were not
protected by a lock and could cause a kernel panic at worst.

Fix the issue by initializing the components in bootup as usual.
 1.173  18-Sep-2019  ozaki-r nd6: remove extra pserialize_read_exit
 1.172  01-Sep-2019  roy inet6: Send RTM_MISS when we fail to resolve an address.

Takes the same approach as when adding a new address - we no longer
announce the new lladdr right away but we announce the result.
This will either be RTM_ADD or RTM_MISS.
RTM_DELETE is only sent if we have a lladdr assigned OR gc'ed.

This results in fewer messages via route(4) and tells us when a new
lladdr has been added (RTM_ADD), changed (RTM_CHANGE), deleted (RTM_DELETE)
or has failed to be resolved (RTM_MISS). The latter case can be
interpreted as unreachable.
 1.171  30-Aug-2019  roy inet6: Revert prior

It's not needed, listening to RA is enough as discussed on tech-net.
 1.170  29-Aug-2019  roy Userland really has no business with NA messages.
However, RFC 4861 6.2.5 only says departing routers
*SHOULD* send RA with lifetime of zero and *MUST*
send all subsequent NA messages with the router flag
unset.

To help userland avoid the expensive process of
parsing NA messages, send RTM_CHANGE without a
lladdr in the gateway.
This is different from the initial RTM_ADD also
without a lladdr in the gateway and RTM_DELETE.
 1.169  29-Aug-2019  roy more bool
 1.168  29-Aug-2019  roy inet6: change rt_announce and llchange to bool in nd6_na_input()
 1.167  22-Aug-2019  roy nd6: notify userland of neighbour lla updates once more

XXX pullup -8 -9
 1.166  29-Apr-2019  roy branches: 1.166.2;
Introduce rt_addrmsg_src which adds RTA_AUTHOR to the message.
Use this when we notify userland of a duplicate address
and set RTA_AUTHOR to the hardware address of the sender.

While here, match the logging diagnostic of INET6 to the simpler one
of INET so it's consistent.
 1.165  29-Apr-2019  roy rtsock: Route address message simplification

Rename rt_newaddrmsg to rt_addrmsg_rt.
Add rt_addrmsg which drops the error and route arguments which are only
needed by one caller.
 1.164  22-Dec-2018  maxv Replace M_ALIGN and MH_ALIGN by m_align.
 1.163  13-Dec-2018  roy inet6: discard any received NA with a LL address we own

This matches ARP behaviour.
 1.162  07-Dec-2018  roy inet6: match NS nonce to any interface

This allows the same address to exist on many interfaces on the same
prefix, matching the inet behaviour.
 1.161  04-Dec-2018  roy inet6: remove needless ifa_release.
 1.160  04-Dec-2018  roy inet6: use one function for nd6_dad_input

Having different ones for NA and NS is a bit wasteful.
 1.159  04-Dec-2018  roy inet6: simplify NA DaD checking
 1.158  04-Dec-2018  roy inet6: remove unused dad ns/na counters

The current DaD code triggers when either an NS or NA is directly
received, so the counters themselves do nothing of use.
 1.157  29-Nov-2018  ozaki-r Introduce and use ip_dad_enabled() and ip6_dad_enabled() functions
 1.156  19-May-2018  maxv branches: 1.156.2;
Style.
 1.155  17-May-2018  maxv Fix the KASSERTs. It doesn't matter at all since the packet can't be this
big anyway, and there are many other places that have this kind of typo;
but still fix it, for the sake of closing PR/49834.
 1.154  01-May-2018  maxv Remove now unused net_osdep.h includes, the other BSDs did the same.
 1.153  19-Mar-2018  ozaki-r Pull out a sleepable function (in6_selectsrc) from a pserialize read section
 1.152  08-Mar-2018  ozaki-r Fix a race condition on DAD destructions (again)

The previous fix to DAD timers was wrong; it avoided a use-after-free but
instead introduced a memory leak. The destruction method had delegated
a destruction of a DAD timer to the timer itself and told that by setting NULL
to dp->dad_ifa. However, the previous fix made DAD timers do nothing on
the sign.

Fixing the issue with using callout_stop isn't easy. One approach is to have
a refcount on dp but it introduces extra complexity that we want to avoid.

The new fix falls back to using callout_halt, which was abandoned because of
softnet_lock. Fortunately now the network stack is protected by KERNEL_LOCK
so we can remove softnet_lock from DAD timers (callout) and use callout_halt
safely.
 1.151  07-Mar-2018  ozaki-r Avoid passing NULL to nd6_dad_duplicated

Fix PR kern/53075
 1.150  06-Mar-2018  martin Remove unused variables
 1.149  06-Mar-2018  roy nd6: add a nonce to DaD probes in-case they are looped back to us

This implements RFC 7527, based on a similar change in FreeBSD.
 1.148  24-Feb-2018  ozaki-r branches: 1.148.2;
Avoid a race condition of DAD timer destructions

When we see dp->dad_ifa == NULL, it means that the ifa is being deleted and also
the callout is scheduled again by someone. We shouldn't rely on a result of
callout_pending to know if the callout is scheduled because it returns false if
the subsequent callout handler is already on the fly.

We have to always delegate the destruction of dp to the subsequent handler
unconditionally if dp->dad_ifa == NULL. Otherwise, the first handler destroys
the dp and the second handler tries to handle destroyed dp.
 1.147  24-Feb-2018  ozaki-r Simplify; pass dp to nd6_dad_duplicated instead of looking it up again in it
 1.146  24-Feb-2018  ozaki-r Use KASSERT for checking a programming error
 1.145  02-Feb-2018  maxv Fix memory leak. Contrary to what the XXX indicates, this place is 100%
reachable remotely.
 1.144  16-Jan-2018  ozaki-r Make DAD destructions (MP-)safe with callout_stop

arp_dad_stoptimer and nd6_dad_stoptimer can be called with or without
softnet_lock held and unfortunately we have no easy way to statically know which.
So it is hard to use callout_halt there.

To address the situation, we use callout_stop to make the code safe. The new
approach copes with the issue by delegating the destruction of a callout to
callout itself, which allows us to not wait for the callout to finish. This can be
done because DAD objects are separated from other data such as ifa.

The approach is suggested by riastradh@
Proposed on tech-kern@ and tech-net@
 1.143  16-Jan-2018  ozaki-r Revert "Work around softnet_lock handling" as per pgoyette@'s request

We should avoid if (mutex_owned(softnet_lock)).
 1.142  10-Jan-2018  ozaki-r Get rid of unnecessary ifdef for IFT_IEEE80211
 1.141  10-Jan-2018  ozaki-r Fix a deadlock on callout_halt of nd6_dad_timer

We must not call callout_halt of nd6_dad_timer with holding nd6_dad_lock because
the lock is taken in nd6_dad_timer. Once softnet_lock goes away, we can pass the
lock to callout_halt, but for now we cannot.
 1.140  26-Dec-2017  ozaki-r Work around softnet_lock handling

nd6_dad_stoptimer can be called with or without softnet_lock held.
callout_halt has to take softnet_lock depending on the situation.
 1.139  17-Nov-2017  ozaki-r Provide macros for softnet_lock and KERNEL_LOCK hiding NET_MPSAFE switch

It reduces C&P codes such as "#ifndef NET_MPSAFE KERNEL_LOCK(1, NULL); ..."
scattered all over the source code and makes it easy to identify remaining
KERNEL_LOCK and/or softnet_lock that are held even if NET_MPSAFE.

No functional change
 1.138  14-Mar-2017  ozaki-r branches: 1.138.6;
Replace DIAGNOSTIC + panic with KASSERT
 1.137  21-Feb-2017  ozaki-r Replace malloc for DAD with kmem and move them out of the lock for DAD
 1.136  16-Jan-2017  christos ip6_sprintf -> IN6_PRINT so that we pass the size.
 1.135  16-Jan-2017  ryo Make ip6_sprintf(), in_fmtaddr(), lla_snprintf() and icmp6_redirect_diag() mpsafe.

Reviewed by ozaki-r@
 1.134  19-Dec-2016  ozaki-r branches: 1.134.2;
Protect IPv6 default router and prefix lists with coarse-grained rwlock

in6_purgeaddr (in6_unlink_ifa) itself unreferences a prefix entry and calls
nd6_prelist_remove if the counter becomes 0, so callers don't need to
handle the reference counting.

Performance-sensitive paths (sending/forwarding packets) call just one
reader lock. This is a trade-off between performance impact vs. the amount
of efforts; if we want to remove the reader lock, we need huge amount of
works including destroying objects with psz/psref in softint, for example.
 1.133  14-Dec-2016  ozaki-r Make functions static
 1.132  12-Dec-2016  ozaki-r Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview
 1.131  11-Dec-2016  ozaki-r Add nd6_ prefix to exported functions
 1.130  15-Nov-2016  mlelstv nd6_dad_duplicated takes the lock itself. Move it out of the critical
section.
 1.129  31-Oct-2016  ozaki-r Fix race condition of in6_selectsrc

in6_selectsrc returned a pointer to in6_addr that wasn't guaranteed to be
safe by pserialize (or psref), which was racy. Let callers pass a pointer
to in6_addr and in6_selectsrc copy a result to it inside pserialize
critical sections.
 1.128  18-Oct-2016  ozaki-r Don't hold global locks if NET_MPSAFE is enabled

If NET_MPSAFE is enabled, don't hold KERNEL_LOCK and softnet_lock in
part of the network stack such as IP forwarding paths. The aim of the
change is to make it easy to test the network stack without the locks
and reduce our local diffs.

By default (i.e., if NET_MPSAFE isn't enabled), the locks are held
as they used to be.

Reviewed by knakahara@
 1.127  01-Aug-2016  ozaki-r Apply pserialize and psref to struct ifaddr and its variants

This change makes struct ifaddr and its variants (in_ifaddr and in6_ifaddr)
MP-safe by using pserialize and psref. At this moment, pserialize_perform
and psref_target_destroy are disabled because (1) we don't need them
because of softnet_lock (2) they cause a deadlock because of softnet_lock.
So we'll enable them when we remove softnet_lock in the future.
 1.126  28-Jul-2016  ozaki-r Fix panic on adding/deleting IP addresses under network load

Adding and deleting IP addresses aren't serialized with other network
operations, e.g., forwarding packets. So if we add or delete an IP
address under network load, a kernel panic may happen on manipulating
network-related shared objects such as rtentry and rtcache.

To avoid such panics, we still need to hold softnet_lock in in_control
and in6_control that are called via ioctl and do network-related operations
including IP address additions/deletions.

Fix PR kern/51356
 1.125  25-Jul-2016  ozaki-r Make DAD of ARP/NDP MP-safe with coarse-grained locks

The change also prevents arp_dad_timer/nd6_dad_timer from running if
arp_dad_stop/nd6_dad_stop is called, which makes sure that callout_reset
won't be called during callout_halt.
 1.124  25-Jul-2016  ozaki-r Use KASSERT for checking non-NULL of ifa->ifa_ifp

ifa->ifa_ifp should be always non-NULL, so doing the check only if
DIAGNOSTIC is ok.
 1.123  15-Jul-2016  ozaki-r Use sin6tosa and sin6tocsa macros

No functional change.
 1.122  01-Jul-2016  ozaki-r branches: 1.122.2;
Make sure to free all interface addresses in if_detach

Addresses of an interface (struct ifaddr) have a (reverse) pointer of an
interface object (ifa->ifa_ifp). If the addresses are surely freed when
their interface is destroyed, the pointer is always valid and we don't
need a tweak of replacing the pointer to if_index like mbuf.

In order to make sure the assumption, the following changes are required:
- Deactivate the interface at the firstish of if_detach. This prevents
in6_unlink_ifa from saving multicast addresses (wrongly)
- Invalidate rtcache(s) and clear a rtentry referencing an address on
RTM_DELETE. rtcache(s) may delay freeing an address
- Replace callout_stop with callout_halt of DAD timers to ensure stopping
such timers in if_detach
 1.121  21-Jun-2016  ozaki-r Make sure returning ifp from in6_select* functions psref-ed

To this end, callers need to pass struct psref to the functions
and the functions acquire a reference of ifp with it. In some cases,
we can simply use if_get_byindex, however, in other cases
(say rt->rt_ifp and ia->ifa_ifp), we have no MP-safe way for now.
In order to take a reference anyway we use non MP-safe function
if_acquire_NOMPSAFE for the latter cases. They should be fixed in
the future somehow.
 1.120  21-Jun-2016  ozaki-r Replace ifp of ip_moptions and ip6_moptions with if_index

The motivation is the same as the mbuf's rcvif case; avoid having a pointer
of an ifnet object in ip_moptions and ip6_moptions, which is not MP-safe.

ip_moptions and ip6_moptions can be stored in a PCB for inet or inet6
whose lifetime differs from that of the ifnet, so an ifnet object can
disappear at any time while we reach it via them. Thus we need to look up
an ifnet object by if_index every time for safety.
 1.119  10-Jun-2016  ozaki-r Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of a
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.
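
The idea of storing an index and resolving it at use time can be illustrated with a small userland model. Everything here is hypothetical (`model_ifnet`, `model_mbuf`, `model_ifindex2ifnet`, `model_get_rcvif`), and the sketch deliberately omits the psref(9)/pserialize(9) protection that the real m_get_rcvif variants provide.

```c
#include <stddef.h>

#define MODEL_MAX_IF 8	/* hypothetical table size */

struct model_ifnet {
	unsigned int if_index;
};

/* Stand-in for the interface collection, indexed by if_index. */
static struct model_ifnet *model_ifindex2ifnet[MODEL_MAX_IF];

/* A packet records only the index of its receiving interface. */
struct model_mbuf {
	unsigned int rcvif_index;
};

/*
 * Resolve the receiving interface at use time.  Returns NULL if the
 * index is out of range or the interface is no longer in the table.
 */
static struct model_ifnet *
model_get_rcvif(const struct model_mbuf *m)
{
	if (m->rcvif_index >= MODEL_MAX_IF)
		return NULL;
	return model_ifindex2ifnet[m->rcvif_index];
}
```

A caller in this model must handle a NULL result, which is the case a stored pointer could not detect once the interface had been destroyed.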
 1.118  10-Jun-2016  ozaki-r Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.
 1.117  29-Apr-2016  is Let non-neighbor NS/NA debug error message include useful information.
 1.116  11-Apr-2016  ozaki-r Don't call pfxlist_onlink_check with holding llentry lock

From FreeBSD (as of 2016-04-11).

Should fix PR kern/51060.
 1.115  04-Apr-2016  ozaki-r Separate nexthop caches from the routing table

By this change, nexthop caches (IP-MAC address pair) are not stored
in the routing table anymore. Instead nexthop caches are stored in
each network interface; we already have lltable/llentry data structure
for this purpose. This change also obsoletes the concept of cloning/cloned
routes. Cloned routes no longer exist while cloning routes still exist
with renamed to connected routes.

Noticeable changes are:
- Nexthop caches aren't listed in route show/netstat -r
- sysctl(NET_RT_DUMP) doesn't return them
- If RTF_LLDATA is specified, it returns nexthop caches
- Several definitions of routing flags and messages are removed
- RTF_CLONING, RTF_XRESOLVE, RTF_LLINFO, RTF_CLONED and RTM_RESOLVE
- RTF_CONNECTED is added
- It has the same value of RTF_CLONING for backward compatibility
- route's -xresolve, -[no]cloned and -llinfo options are removed
- -[no]cloning remains because it seems there are users
- -[no]connected is introduced and recommended
to be used instead of -[no]cloning
- route show/netstat -r drops some flags
- 'L' and 'c' are not seen anymore
- 'C' now indicates a connected route
- Gateway value of a route of an interface address is now not
a L2 address but "link#N" like a connected (cloning) route
- Proxy ARP: "arp -s ... pub" doesn't create a route

You can know details of behavior changes by seeing diffs under tests/.

Proposed on tech-net and tech-kern:
http://mail-index.netbsd.org/tech-net/2016/03/11/msg005701.html
 1.114  01-Apr-2016  ozaki-r Refine nd6log

Add __func__ to nd6log itself instead of adding it to callers.
 1.113  07-Dec-2015  ozaki-r CID 1341546: Fix integer handling issue (CONSTANT_EXPRESSION_RESULT)

n > INT_MAX where n is a long integer variable can never be true on 32bit
architectures. Use time_t(int64_t) instead of long for the variable.
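
A minimal illustration of the issue, assuming a hypothetical helper named `model_exceeds_int_max()` (the actual variable and call site in nd6_nbr.c are not shown in this log):

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * With a 32-bit long, "n > INT_MAX" is a constant false expression,
 * because long and int have the same range.  A 64-bit type keeps the
 * comparison meaningful on every architecture.
 */
static bool
model_exceeds_int_max(int64_t n)
{
	return n > INT_MAX;
}
```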
 1.112  25-Nov-2015  ozaki-r Use lltable/llentry for NDP

lltable and llentry were introduced to replace ARP cache data structure
for further restructuring of the routing table: L2 nexthop cache
separation. This change replaces the NDP cache data structure
(llinfo_nd6) with them as well as ARP.

One noticeable change is for neighbor cache GC mechanism that was
introduced to prevent IPv6 DoS attacks. net.inet6.ip6.neighborgcthresh
was the max number of caches that we store in the system. After
introducing lltable/llentry, the value is changed to be per-interface
basis because lltable/llentry stores neighbor caches in each interface
separately. And the change brings one degradation; the old GC mechanism
dropped exceeded packets based on LRU while the new implementation drops
packets in order from the beginning of lltable (a hash table + linked
lists). It would be improved in the future.

Added functions in in6.c come from FreeBSD (as of r286629) and are
tweaked for NetBSD.

Proposed on tech-kern and tech-net.
 1.111  18-Nov-2015  ozaki-r Stop passing llinfo_nd6 to nd6_ns_output

This is a restructuring for coming changes to nd6 (replacing
llinfo_nd6 with llentry). Once we have a lock of llinfo_nd6,
we need to pass it to nd6_ns_output with holding the lock.
However, in a function subsequent to nd6_ns_output, the llinfo_nd6
may be looked up, i.e., its lock would be acquired again.
To avoid such a situation, pass only required data (in6_addr) to
nd6_ns_output instead of passing whole llinfo_nd6.

Inspired by FreeBSD
 1.110  24-Aug-2015  pooka sprinkle _KERNEL_OPT
 1.109  17-Jul-2015  ozaki-r Reform use of rt_refcnt

rt_refcnt of rtentry was used in bad manners, for example, direct rt_refcnt++
and rt_refcnt-- outside route.c, "rt->rt_refcnt++; rtfree(rt);" idiom, and
touching rt after rt->rt_refcnt--.

These abuses seem to be needed because rt_refcnt manages only references
between rtentry and doesn't take care of references during packet processing
(IOW references from local variables). In order to reduce the above abuses,
the latter cases should be counted by rt_refcnt as well as the former cases.

This change improves consistency of use of rt_refcnt:
- rtentry is always accessed with rt_refcnt incremented
- rtentry's rt_refcnt is decremented after use (rtfree is always used instead
of rt_refcnt--)
- functions returning rtentry increment its rt_refcnt (and caller rtfree it)

Note that rt_refcnt prevents rtentry from being freed but doesn't prevent
rtentry from being updated. Toward MP-safe, we need to provide another
protection for rtentry, e.g., locks. (Or introduce a better data structure
allowing concurrent readers during updates.)
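
The reference-counting discipline listed above can be sketched in userland as follows. The names (`model_rtentry`, `model_rt_lookup`, `model_rt_free`) are hypothetical, and the routing table's own reference is assumed to be represented by an initial count of 1; the real rtfree has additional conditions that are not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_rtentry {
	int  refcnt;
	bool freed;
};

/* Lookup-style functions hand back the entry with a reference held. */
static struct model_rtentry *
model_rt_lookup(struct model_rtentry *rt)
{
	rt->refcnt++;
	return rt;
}

/* Callers drop their reference only through this function. */
static void
model_rt_free(struct model_rtentry *rt)
{
	if (--rt->refcnt == 0)
		rt->freed = true;	/* real code would release memory here */
}
```

As the commit message notes, a held reference in this scheme only keeps the entry alive; it does nothing to stop concurrent updates.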
 1.108  27-Apr-2015  ozaki-r Add missing error checks on rtcache_setdst

It can fail with ENOMEM.
 1.107  30-Mar-2015  ozaki-r Tidy up opt_ipsec.h inclusions
 1.106  25-Feb-2015  roy Rename nd6_rtmsg() to rt_newmsg() and move into the generic routing code
as it's not IPv6 specific and will be used elsewhere.
 1.105  25-Feb-2015  roy Retire nd6_newaddrmsg and use rt_newaddrmsg directly instead so that
we don't spam route changes when the route hasn't changed.
 1.104  23-Feb-2015  martin Rearrange interface detachment slightly: before we free the INET6 specific
per-interface data, make sure to call nd6_purge() with it to remove
routing entries pointing to the going interface.
When we should happen to call this function again later, with the data
already gone, just return.
Fixes PR kern/49682, ok: christos.
 1.103  16-Dec-2014  roy Report route additions/changes/deletions for cached neighbours to userland.
 1.102  12-Oct-2014  roy branches: 1.102.2;
Remove redundant logging.
 1.101  09-Sep-2014  rmind Eliminate IFAREF() and IFAFREE() macros in favour of functions.
 1.100  01-Jul-2014  ozaki-r branches: 1.100.2;
Stop using callout randomly

nd6_dad_start uses callout when xtick > 0 but doesn't when
xtick == 0. So if we pass a random value ranging from 0 to N,
nd6_dad_start uses callout randomly. This behavior makes
debugging difficult.

Discussed in http://mail-index.netbsd.org/tech-kern/2014/06/25/msg017278.html
 1.99  13-Jan-2014  roy branches: 1.99.2;
When handling NS/NA we need to check our prefix list instead of our
address list to work out if it came from a valid neighbor.
 1.98  21-May-2013  roy branches: 1.98.2;
Disable nd6_newaddrmsg debug
 1.97  21-May-2013  roy For IPv6, emit RTM_NEWADDR once DAD completes and also when address flag
changes. Tentative addresses are not emitted.

Version bumped so userland can detect this behaviour change.
 1.96  22-Mar-2012  drochner branches: 1.96.2;
remove KAME IPSEC, replaced by FAST_IPSEC
 1.95  19-Dec-2011  drochner branches: 1.95.2; 1.95.6; 1.95.8;
rename the IPSEC in-kernel CPP variable and config(8) option to
KAME_IPSEC, and make IPSEC define it so that existing kernel
config files work as before
Now the default can easily be changed to FAST_IPSEC just by
setting the IPSEC alias to FAST_IPSEC.
 1.94  18-Apr-2009  tsutsui branches: 1.94.12; 1.94.16;
Remove extra whitespace added by a stupid tool.
XXX: more in src/sys/arch
 1.93  18-Mar-2009  cegger bcopy -> memcpy
 1.92  18-Mar-2009  cegger bzero -> memset
 1.91  18-Mar-2009  cegger bcmp -> memcmp
 1.90  31-Jul-2008  matt branches: 1.90.2; 1.90.8;
Generalize previous fix so that both NS and NA packets are checked.
 1.89  31-Jul-2008  matt If a neighbor solicitation isn't from the unspecified address, make sure
that the source address matches one of the interface's address prefixes.
 1.88  22-May-2008  dyoung branches: 1.88.4;
Cosmetic: join lines.
 1.87  22-May-2008  dyoung Cosmetic: don't cast NULL unnecessarily.
 1.86  24-Apr-2008  ad branches: 1.86.2; 1.86.4;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
 1.85  15-Apr-2008  thorpej branches: 1.85.2;
Make ip6 and icmp6 stats per-cpu.
 1.84  08-Apr-2008  thorpej Change ICMP6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old icmp6stat structure; old netstat
binaries will continue to work properly.
 1.83  27-Feb-2008  matt Convert to ansi definitions from old-style definitions.
Remember that func() is not ansi, func(void) is.
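
For readers unfamiliar with the distinction, a small sketch with hypothetical function names follows. Before C23, an empty parameter list in a declaration leaves the parameters unspecified, whereas `(void)` states that there are none.

```c
/*
 * Old-style (K&R) definition, shown only as a comment because current
 * compilers may reject it:
 *
 *	int
 *	model_sum(a, b)
 *		int a, b;
 *	{
 *		return a + b;
 *	}
 */

/* ANSI definition: parameter types appear in the parameter list. */
static int
model_sum(int a, int b)
{
	return a + b;
}

/* "(void)" declares that the function takes no arguments. */
static int
model_answer(void)
{
	return 42;
}
```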
 1.82  16-Nov-2007  dyoung branches: 1.82.10; 1.82.14;
We might leave nd6_ns_output() really early. Postpone memset()
until after we decide to stay.
 1.81  10-Nov-2007  dyoung Use sockaddr_in6_init().
 1.80  30-Aug-2007  dyoung branches: 1.80.4; 1.80.6;
Use malloc(9) for sockaddrs instead of pool(9), and remove dom_sa_pool
and dom_sa_len members from struct domain. Pools of fixed-size
objects are too rigid for sockaddr_dls, whose size can vary over
a wide range.

Return sockaddr_dl to its "historical" size. Now that I'm using
malloc(9) instead of pool(9) to allocate sockaddr_dl, I can create
a sockaddr_dl of any size in the kernel, so expanding sockaddr_dl
is useless.

Avoid using sizeof(struct sockaddr_dl) in the kernel.

Introduce sockaddr_dl_alloc() for allocating & initializing an
arbitrary sockaddr_dl on the heap.

Add an argument, the sockaddr length, to sockaddr_alloc(),
sockaddr_copy(), and sockaddr_dl_setaddr().

Constify: LLADDR() -> CLLADDR().

Where the kernel overwrites LLADDR(), use sockaddr_dl_setaddr(),
instead. Used properly, sockaddr_dl_setaddr() will not overrun
the end of the sockaddr.
 1.79  26-Aug-2007  dyoung branches: 1.79.2;
Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.
 1.78  07-Aug-2007  dyoung branches: 1.78.2;
Avoid writing past the end of the buffer [lldst, lldst + dstsize)
in nd6_storelladdr().

Use sockaddr_dl_setaddr(). Constify some sockaddr_dl's. Constify
a sockaddr argument to nd6_na_output(). Change SDL() to "standard"
satocsdl() or satosdl(). Change SIN6() to satocsin6() or satosin6().

bcmp -> memcmp, bcopy -> memcpy.
 1.77  19-Jul-2007  dyoung branches: 1.77.4;
Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

Introduce rt_walktree() for walking the routing table and
applying a function to each rtentry. Replace most
rn_walktree() calls with it.

Use rt_getkey()/rt_setkey() to get/set a route's destination.
Keep a pointer to the sockaddr key in the rtentry, so that
rtentry users do not have to grovel in the radix_node for
the key.

Add a RTM_GET method to rtrequest. Use that instead of
radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

Constify. KNF. Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
et cetera. Use NULL instead of 0 for null pointers. Use
__arraycount(). Reduce gratuitous parenthesization.

Stop using variadic arguments for rip6_output(), it is
unnecessary.

Remove the unnecessary rtentry member rt_genmask and the
code to maintain it, since nothing actually used it.

Make rt_maskedcopy() easier to read by using meaningful variable
names.

Extract a subroutine intern_netmask() for looking up a netmask in
the masks table.

Start converting backslash-ridden IPv6 macros in
sys/netinet6/in6_var.h into inline subroutines that one
can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
 1.76  09-Jul-2007  ad branches: 1.76.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.75  23-May-2007  christos Ansify + add a few comments, from Karl Sjödahl
 1.74  17-May-2007  dyoung Fix the memory leak reported in kern/36337. Thanks Matthias Scheler
for the heads-up. My fix is based on the following patches from
FreeBSD, however, I extracted the code into a subroutine,
nd6_llinfo_release_pkts():

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet6/nd6.c.diff?r1=1.48.2.18;r2=1.48.2.19
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet6/nd6_nbr.c.diff?r1=1.29.2.8;r2=1.29.2.9
 1.73  02-May-2007  dyoung Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principal benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, duplicating,
and freeing sockaddrs:

	struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
	struct sockaddr *sockaddr_copy(struct sockaddr *dst,
	    const struct sockaddr *src);
	struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
	void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argument for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.
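
The sockaddr routines in item 1 can be pictured with a small userland
sketch. It is only an illustration under stated assumptions: malloc(3)
stands in for the per-family pool, the 'flags' argument is dropped, and
struct xsockaddr, the XAF_* constants and every xsockaddr_*() name are
hypothetical stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a BSD-style sockaddr with a length field. */
struct xsockaddr {
	unsigned char	sa_len;
	unsigned char	sa_family;
	char		sa_data[30];
};

/* Hypothetical family numbers, used only by this sketch. */
enum { XAF_INET = 2, XAF_INET6 = 24 };

/* Per-family length; the kernel would take this from the family's pool. */
static size_t
xsockaddr_size(unsigned char af)
{
	switch (af) {
	case XAF_INET:
		return 16;
	case XAF_INET6:
		return 28;
	default:
		return 0;
	}
}

/* Return a zeroed sockaddr with sa_len and sa_family set, or NULL. */
struct xsockaddr *
xsockaddr_alloc(unsigned char af)
{
	size_t len = xsockaddr_size(af);
	struct xsockaddr *sa;

	if (len == 0 || (sa = calloc(1, sizeof(*sa))) == NULL)
		return NULL;
	sa->sa_len = (unsigned char)len;
	sa->sa_family = af;
	return sa;
}

/* Like strcpy(): copy src over dst; the families must already agree. */
struct xsockaddr *
xsockaddr_copy(struct xsockaddr *dst, const struct xsockaddr *src)
{
	assert(dst->sa_family == src->sa_family);
	memcpy(dst, src, src->sa_len);
	return dst;
}

/* Like strdup(): allocate a new sockaddr and copy src into it. */
struct xsockaddr *
xsockaddr_dup(const struct xsockaddr *src)
{
	struct xsockaddr *dst = xsockaddr_alloc(src->sa_family);

	if (dst == NULL)
		return NULL;
	return xsockaddr_copy(dst, src);
}

/* Give the sockaddr back (here: to the heap rather than to a pool). */
void
xsockaddr_free(struct xsockaddr *sa)
{
	free(sa);
}
```

As with strdup(), the caller of xsockaddr_dup() owns the result and is
expected to release it with xsockaddr_free().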
 1.72  15-Mar-2007  dyoung Don't open-code TAILQ_FOREACH(). KNF: Fix K&R prototypes and
parameter-type declarations.
 1.71  04-Mar-2007  christos branches: 1.71.2; 1.71.4; 1.71.6;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.70  17-Feb-2007  dyoung KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.
 1.69  29-Jan-2007  dyoung branches: 1.69.2;
Cosmetic: bzero -> memset. Change a bcopy() to a struct assignment.
 1.68  15-Dec-2006  joerg Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cached and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either valid again
or NULL.
rtcache_copy is used internally.

Adjust the callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is semantically equivalent though.
etherip doesn't need sc_route_expire, similar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.
 1.67  09-Dec-2006  dyoung Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.
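
The chaining described above can be sketched in userland C. This is a
simplified illustration, not the committed code: struct xroute,
struct xrtentry and the xin_*() names are hypothetical, and a plain
reference counter stands in for RTFREE().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct rtentry and struct route. */
struct xrtentry {
	int		  rt_refcnt;
};

struct xroute {
	struct xrtentry	 *ro_rt;	/* cached route, NULL when invalid */
	struct xroute	 *ro_next;
	struct xroute	**ro_prevp;	/* pointer that points at this cache */
};

/* Chain of all route caches that currently hold a route. */
static struct xroute *xrtcache_head;

/* Record rt in the cache and put the cache on the chain. */
void
xin_rtcache(struct xroute *ro, struct xrtentry *rt)
{
	assert(ro->ro_rt == NULL);	/* must not be chained already */
	rt->rt_refcnt++;
	ro->ro_rt = rt;
	ro->ro_next = xrtcache_head;
	ro->ro_prevp = &xrtcache_head;
	if (xrtcache_head != NULL)
		xrtcache_head->ro_prevp = &ro->ro_next;
	xrtcache_head = ro;
}

/* Invalidate one cache: drop the route and unlink from the chain. */
void
xin_rtflush(struct xroute *ro)
{
	if (ro->ro_rt == NULL)
		return;
	ro->ro_rt->rt_refcnt--;
	ro->ro_rt = NULL;
	*ro->ro_prevp = ro->ro_next;
	if (ro->ro_next != NULL)
		ro->ro_next->ro_prevp = ro->ro_prevp;
	ro->ro_next = NULL;
	ro->ro_prevp = NULL;
}

/* Invalidate every cache on the chain. */
void
xin_rtflushall(void)
{
	while (xrtcache_head != NULL)
		xin_rtflush(xrtcache_head);
}
```

Because a flushed cache has ro_rt == NULL, every user of a cache has to
be prepared to find NULL there and to look the route up again.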

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.
 1.66  02-Dec-2006  dyoung Use the queue(3) macros instead of open-coding them. Shorten
staircases. Remove unnecessary casts. Where appropriate, s/8/NBBY/.
De-__P(). KNF.

No functional changes intended.
 1.65  28-Jun-2006  drochner branches: 1.65.4; 1.65.6; 1.65.8; 1.65.14;
fix the dad_count logic: if we send a packet successfully, reset the counter
for sent tries -- otherwise it gets confused if dad_count is set to >15
by the sysctl, and addresses get stuck in "tentative" state forever
 1.64  18-May-2006  liamjfoy branches: 1.64.4;
Integrate Common Address Redundancy Protocol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@
 1.63  06-Mar-2006  rpaulo branches: 1.63.4;
Rename local variables called delay that shadow the delay() decl.
Pointed out by Robert Swindells.
 1.62  05-Mar-2006  rpaulo NDP-related improvements:
RFC4191
- supports host-side router-preference

RFC3542
- if DAD fails on an interface, disables IPv6 operation on the
interface
- don't advertise MLD report before DAD finishes

Others
- fixes integer overflow for valid and preferred lifetimes
- improves timer granularity for MLD, using callout-timer.
- reflects rtadvd's IPv6 host variable information into kernel
(router only)
- adds a sysctl option to enable/disable pMTUd for multicast
packets
- performs NUD on PPP/GRE interface by default
- Redirect works regardless of ip6_accept_rtadv
- removes RFC1885-related code

From the KAME project via SUZUKI Shinsuke.
Reviewed by core.
 1.61  03-Mar-2006  rpaulo branches: 1.61.2;
Fix typos in comments.

From: the KAME project via SUZUKI Shinsuke.
 1.60  25-Feb-2006  wiz Fix typos, reported by Alexey Dobriyan ("Gathered from Linux"),
forwarded by jmc@openbsd.
 1.59  21-Jan-2006  rpaulo branches: 1.59.2; 1.59.4;
Better support of IPv6 scoped addresses.

- most of the kernel code will not care about the actual encoding of
scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
scoped addresses as a special case.
- scope boundary check will be stricter. For example, the current
*BSD code allows a packet with src=::1 and dst=(some global IPv6
address) to be sent outside of the node, if the application does:
s = socket(AF_INET6);
bind(s, "::1");
sendto(s, some_global_IPv6_addr);
This is clearly wrong, since ::1 is only meaningful within a single
node, but the current implementation of the *BSD kernel cannot
reject this attempt.
- and, while there, don't try to remove the ff02::/32 interface route
entry in in6_ifdetach() as it's already gone.
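
The stricter scope boundary check from the third item can be
illustrated with a minimal sketch. It covers only the loopback case
used as the example above; xscope_src_ok() is a hypothetical name and
not a function from the KAME code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <netinet/in.h>

/*
 * Hypothetical check: may a packet with this source address be sent
 * to this destination?  ::1 is only meaningful within a single node,
 * so it is rejected as a source unless the destination is ::1 too.
 */
bool
xscope_src_ok(const struct in6_addr *src, const struct in6_addr *dst)
{
	assert(src != NULL && dst != NULL);
	if (IN6_IS_ADDR_LOOPBACK(src) && !IN6_IS_ADDR_LOOPBACK(dst))
		return false;
	return true;
}
```

With src=::1 and a global destination the check fails, which is the
case the older code could not reject.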

This also includes some level of support for the standard source
address selection algorithm defined in RFC3484, which will be
completed in the future.

From the KAME project via JINMEI Tatuya.
Approved by core@.
 1.58  11-Dec-2005  christos branches: 1.58.2;
merge ktrace-lwp.
 1.57  29-May-2005  christos branches: 1.57.2;
- avoid shadowed variables
- sprinkle const.
 1.56  26-Feb-2005  perry branches: 1.56.2; 1.56.4; 1.56.6;
nuke trailing whitespace
 1.55  10-Feb-2005  itojun backout 1.54. heuristic code should never be used. if you experience DAD
failure, suspect your driver, not ND code.
 1.54  02-Feb-2005  drochner Give DAD a chance to succeed even if the network is "slightly broken"
(in my case it was a switch set to "monitor" mode):
If we see an NS request for the address we are just probing for, for
three times the number of DAD packets we are supposed to send (the
"ip6.dad_count" sysctl variable), assume that these are our own packets
and let DAD succeed.
The code for this was mostly there, commented out. Just needed some fixes.
The "three times" is heuristic of course.
Being here, reset the "dad_ns_tcount" variable on a successful send;
otherwise we get strange interdependencies with user-settable variables
(ever tried to set ip6.dad_count to something >15?).
 1.53  10-Feb-2004  itojun branches: 1.53.8; 1.53.10;
reduce useless variables
 1.52  30-Oct-2003  simonb Remove some assigned-to but otherwise unused variables.
 1.51  05-Sep-2003  itojun u_short -> u_int16_t. sync w/ kame.
don't set ip6_plen where unneeded (i.e. before calling ip6_output)
 1.50  22-Aug-2003  itojun remove ipsec_set/getsocket. now we explicitly pass socket * to ip{,6}_output.
 1.49  22-Aug-2003  itojun change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.
 1.48  22-Aug-2003  jonathan Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec. Reviewed by itojun.
 1.47  27-Jun-2003  itojun branches: 1.47.2;
split ND6 cache timer management to per-entry. increased accuracy,
no O(N) loop. sync w/ kame
 1.46  24-Jun-2003  itojun remove unneeded checks of accept_rtadv. from kame
 1.45  24-Jun-2003  itojun use time.tv_sec directly
 1.44  14-May-2003  itojun always use PULLDOWN_TEST codepath.
 1.43  23-Sep-2002  simonb Remove breaks after returns, unreachable returns and returns after
returns(!).
 1.42  09-Jun-2002  itojun whitespace cleanup
 1.41  08-Jun-2002  itojun KNF
 1.40  08-Jun-2002  itojun gc
 1.39  08-Jun-2002  itojun sync with latest KAME in6_ifaddr/prefix/default router manipulation.
behavior changes:
- two ioctls used by ndp(8) are now obsolete (backward compat provided).
use sysctl path instead.
- lo0 does not get ::1 automatically. it will get ::1 when lo0 comes up.
 1.38  07-Jun-2002  itojun whitespace
 1.37  07-Jun-2002  itojun whitespace
 1.36  29-May-2002  itojun attach nd_ifinfo structure into if_afdata.
split IPv6 link MTU (advertised by RA) from real link MTU.
sync with kame
 1.35  28-May-2002  itojun use arc4random() where possible.
XXX is it necessary to do microtime() on tcp syn cache?
 1.34  15-Mar-2002  itojun branches: 1.34.4; 1.34.6;
s/0/NULL/ as ln_hold is a pointer. sync w/ kame
 1.33  13-Nov-2001  lukem add RCSIDs
 1.32  18-Oct-2001  itojun reduce diffs with kame (mostly cosmetic).
move IPV6_CHECKSUM processing to sys/netinet6/raw_ip6.c.
constify a couple of places.
 1.31  17-Oct-2001  itojun do not change neighbor cache state on entry timeout,
if the cache entry is for outgoing router.

perform on-linkness check before default router (re-)selection.

do not play with interface direct route on nd6_rtrequest.

sync a lot of cosmetic changes. sync with kame
 1.30  17-Oct-2001  itojun unifdef OLDIP6OUTPUT
 1.29  16-Oct-2001  itojun more whitespace/comment sync with kame
 1.28  23-Feb-2001  itojun branches: 1.28.2; 1.28.4;
garbage-collect stale ND entries (default: 1 day).
RFC 2461 5.3. sync with kame.
 1.27  11-Feb-2001  itojun make sure to clean ln_byhint on reachability confirmation.
 1.26  07-Feb-2001  itojun during ip6/icmp6 inbound packet processing, do not call log() nor printf() in
normal operation (/var can get filled up by flooding bogus packets).
sysctl net.inet6.icmp6.nd6_debug will turn on diagnostic messages.
(#define ND6_DEBUG will turn it on by default)

improve stats in ND6 code.

lots of synchronization with kame (including comments and cosmetic ones).
 1.25  24-Jan-2001  itojun - record IPsec packet history into m_aux structure.
- let ipfilter look at wire-format packet only (not the decapsulated ones),
so that VPN setting can work with NAT/ipfilter settings.
sync with kame.

TODO: use header history for stricter inbound validation
 1.24  17-Jan-2001  itojun wrap noisy ND6 debugging messages with ND6_DEBUG. sync with kame
 1.23  05-Nov-2000  onoe First Prototype implementation of network interface part for IEEE1394 (if_fw).

Current status:
Only OHCI chip is supported (fwohci).
ping (IPv4) works with Sony's implementation (SmartConnect) on Win98.
sometimes works but not stable.
Not implemented yet:
IRM (Isochronous Resource Manager) functionality.
Link layer fragmentation.
Topology map.
More to do:
clean ups
MCAP
character device part
dhcp

There is no entry in GENERIC config file yet.
Follow sys/dev/ieee1394/IMPLEMENTATION to enable if_fw.
 1.22  19-May-2000  itojun branches: 1.22.4;
do not mistakenly forward link-local scoped packet (the bug was added
with "beyondscope" icmp6 support).
"options FAKE_LOOPBACK_IF" will honor scope on loopback outputs. rcvif will
be real interface, not the loopback, just like when multicast loopback.

(sync with kame)
 1.21  24-Mar-2000  itojun move ia6->ia6_dad_ch to dp->dad_timer_ch, to ease KAME code sharing.
now in6_var.h does not need to pull sys/callout.h in.
 1.20  23-Mar-2000  thorpej New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
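
The two properties listed above (client-supplied handle storage and
constant-time insertion and removal) can be shown with a toy sketch.
It assumes a single pending list and has no notion of time; struct
xcallout and the xcallout_*() names are hypothetical and do not
reproduce the kernel's callout implementation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical handle.  The client embeds it in its own structure and
 * zeroes it before first use, so scheduling never allocates memory.
 */
struct xcallout {
	struct xcallout	 *c_next;
	struct xcallout	**c_prevp;	/* pointer that points at this handle */
	void		(*c_func)(void *);
	void		 *c_arg;
	int		  c_pending;
};

static struct xcallout *xcallout_head;

/* Constant-time removal: no list walk is needed to find the handle. */
void
xcallout_stop(struct xcallout *c)
{
	if (!c->c_pending)
		return;
	*c->c_prevp = c->c_next;
	if (c->c_next != NULL)
		c->c_next->c_prevp = c->c_prevp;
	c->c_pending = 0;
}

/* Constant-time insertion at the head of the pending list. */
void
xcallout_reset(struct xcallout *c, void (*func)(void *), void *arg)
{
	assert(func != NULL);
	xcallout_stop(c);		/* rescheduling replaces the old entry */
	c->c_func = func;
	c->c_arg = arg;
	c->c_next = xcallout_head;
	c->c_prevp = &xcallout_head;
	if (xcallout_head != NULL)
		xcallout_head->c_prevp = &c->c_next;
	xcallout_head = c;
	c->c_pending = 1;
}

/* Fire every pending callout once, unlinking each before calling it. */
void
xcallout_run_all(void)
{
	struct xcallout *c;

	while ((c = xcallout_head) != NULL) {
		xcallout_stop(c);
		c->c_func(c->c_arg);
	}
}

/* Example callback: increment the int that arg points at. */
void
xcallout_count(void *arg)
{
	++*(int *)arg;
}
```

Since the handle lives inside the client's own data, scheduling cannot
fail for lack of memory, which is the resource-allocation problem the
first item refers to.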
 1.19  16-Mar-2000  thorpej Quiet down the DAD messages a little more.
 1.18  01-Mar-2000  itojun introduce m->m_pkthdr.aux to hold random data which needs to be passed
between protocol handlers.

ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are in my mind - there can be several other usage.
at this moment, we use this for ipsec socket pointer passing. this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.

due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.

take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).

this will bump kernel version.

(as discussed in tech-net, tested in kame tree)
 1.17  28-Feb-2000  itojun remove some of cross-BSD portability #ifdef.
remove xxCTL_VARS, which is BSDI specific.
 1.16  26-Feb-2000  itojun bring in recent KAME changes (only important and stable ones, as usual).
- remove net.inet6.ip6.nd6_proxyall. introduce proxy NDP code works
just like "arp -s".
- revise source address selection.
be more careful about use of yet-to-be-valid addresses as source.
- as router, transmit ICMP6_DST_UNREACH_BEYONDSCOPE against out-of-scope
packet forwarding attempt.
- path MTU discovery takes care of routing header properly.
- be more strict about mbuf chain parsing.
 1.15  07-Feb-2000  itojun add more sanity check against mbuf length.
use log() for DAD related kernel message.
 1.14  06-Feb-2000  itojun fix include pathname for better rfc2292 compliance.
 1.13  01-Feb-2000  thorpej First-draft if_detach() implementation, originally from Bill Studenmund,
although this version has been changed somewhat:
- reference counting on ifaddrs isn't as complete as Bill's original
work was. This is hard to get right, and we should attack one
protocol at a time.
- This doesn't do reference counting or dynamic allocation of ifnets yet.
- This version introduces a new PRU -- PRU_PURGEADDR, which is used to
purge an ifaddr from a protocol. The old method Bill used didn't work
on all protocols, and it only worked on some because it was Very Lucky.

This mostly works ... i.e. works for my USB Ethernet, except for a dangling
ifaddr reference left by the IPv6 code; have not yet tracked this down.
 1.12  28-Jan-2000  itojun wrap "DAD start" message into #ifdef DIAGNOSTIC.
From: thorpej, "Soren S. Jorvang" <soren@wheel.dk>
 1.11  06-Jan-2000  itojun remove extra portability #ifdef (like #ifdef __FreeBSD__) in KAME IPv6/IPsec
code, from netbsd-current repository.
#ifdef'ed version is always available from ftp.kame.net.

XXX please do not make too many diff-unfriendly changes, we'll need to take
bunch of diffs on upgrade...
 1.10  15-Dec-1999  itojun do not overwrite traffic class field when we write IPv6 version field.
 1.9  13-Dec-1999  itojun sync IPv6 part with latest KAME tree. IPsec part is left unmodified
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.

TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach

(sorry for jumbo commit, I can't separate this any more...)
 1.8  19-Sep-1999  is branches: 1.8.2; 1.8.8;
Zeroth version of IPv6 support for ARCnet. Correct MTU handling still needs
to be done.
 1.7  31-Jul-1999  itojun sync with recent KAME.
- loosen ipsec restriction on packet direction.
- revise icmp6 redirect handling on IsRouter bit.
- tcp/udp notification processing (link-local address case)
- cosmetic fixes (better code share across *BSD).
 1.6  10-Jul-1999  thorpej Clean up some printfs(), and mark a few for possible later nuking,
since they appear to be for debugging purposes only.
 1.5  09-Jul-1999  thorpej defopt IPSEC and IPSEC_ESP (both into opt_ipsec.h).
 1.4  04-Jul-1999  itojun s/splnet/splsoftnet/ in IPv6/IPsec part.
hope I made no mistake (the kernel works fine but I need a regress test)

Suggested by: thorpej
 1.3  03-Jul-1999  thorpej RCS ID police.
 1.2  01-Jul-1999  itojun branches: 1.2.2;
IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.
 1.1  28-Jun-1999  itojun branches: 1.1.2;
file nd6_nbr.c was initially added on branch kame.
 1.1.2.4  30-Nov-1999  itojun avoid panic due to uninitialized pointer (on ipsec policy check in ip6_output).
(critical fix sync from KAME)
 1.1.2.3  30-Nov-1999  itojun bring in latest KAME (as of 19991130, KAME/NetBSD141) into kame branch
just for reference purposes.
This commit includes 1.4 -> 1.4.1 sync for kame branch.

The branch does not compile at all (due to the lack of ALTQ and some other
source code). Please do not try to modify the branch, this is just for
reference purposes.

synchronization to latest KAME will take place on HEAD branch soon.
 1.1.2.2  06-Jul-1999  itojun KAME/NetBSD 1.4, SNAP kit 1999/07/05.
NOTE: this branch is just for reference purposes (i.e. for taking cvs diff).
do not touch anything on the branch. actual work must be done on HEAD branch.
 1.1.2.1  28-Jun-1999  itojun KAME/NetBSD 1.4 SNAP kit, dated 19990628.

NOTE: this branch (kame) is used just for reference. this may not compile
due to multiple reasons.
 1.2.2.3  02-Aug-1999  thorpej Update from trunk.
 1.2.2.2  01-Jul-1999  thorpej Sync w/ -current.
 1.2.2.1  01-Jul-1999  thorpej file nd6_nbr.c was added on branch chs-ubc2 on 1999-07-01 23:48:30 +0000
 1.8.8.1  27-Dec-1999  wrstuden Pull up to last week's -current.
 1.8.2.4  12-Mar-2001  bouyer Sync with HEAD.
 1.8.2.3  11-Feb-2001  bouyer Sync with HEAD.
 1.8.2.2  22-Nov-2000  bouyer Sync with HEAD.
 1.8.2.1  20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.22.4.2  09-May-2001  he Pull up revision 1.26 (requested by itojun):
Suppress ND6 logs that are too noisy for normal use. Can be
re-enabled by net.inet6.icmp6.nd6_debug.
 1.22.4.1  06-Apr-2001  he Pull up revision 1.25 (requested by itojun):
Record IPsec packet history in m_aux structure. Let ipfilter
look at wire-format packet only (not the decapsulated ones), so
that VPN setting can work with NAT/ipfilter settings.
 1.28.4.3  10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.28.4.2  23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.28.4.1  10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.28.2.5  18-Oct-2002  nathanw Catch up to -current.
 1.28.2.4  20-Jun-2002  nathanw Catch up to -current.
 1.28.2.3  01-Apr-2002  nathanw Catch up to -current.
(CVS: It's not just a program. It's an adventure!)
 1.28.2.2  14-Nov-2001  nathanw Catch up to -current.
 1.28.2.1  22-Oct-2001  nathanw Catch up to -current.
 1.34.6.1  02-Oct-2003  tron Pull up revision 1.39 via patch (requested by itojun in ticket #1491):
sync with latest KAME in6_ifaddr/prefix/default router manipulation.
behavior changes:
- two ioctls used by ndp(8) are now obsolete (backward compat provided).
use sysctl path instead.
- lo0 does not get ::1 automatically. it will get ::1 when lo0 comes up.
 1.34.4.2  20-Jun-2002  gehenna catch up with -current.
 1.34.4.1  30-May-2002  gehenna Catch up with -current.
 1.47.2.7  10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.47.2.6  04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.47.2.5  15-Feb-2005  skrll Sync with HEAD.
 1.47.2.4  04-Feb-2005  skrll Sync with HEAD.
 1.47.2.3  21-Sep-2004  skrll Fix the sync with head I botched.
 1.47.2.2  18-Sep-2004  skrll Sync with HEAD.
 1.47.2.1  03-Aug-2004  skrll Sync with HEAD
 1.53.10.2  19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.53.10.1  12-Feb-2005  yamt sync with head.
 1.53.8.1  29-Apr-2005  kent sync with -current
 1.56.6.1  03-Oct-2008  jdc Pull up revisions:
src/sys/netinet6/in6.c 1.141 via patch
src/sys/netinet6/in6_var.h 1.59 via patch
src/sys/netinet6/nd6_nbr.c 1.89-1.90 via patch
(requested by adrianp in ticket #1967).

If a neighbor solicitation isn't from the unspecified address, make sure
that the source address matches one of the interface's address prefixes.

Generalize previous fix so that both NS and NA packets are checked.
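
The source address check described here can be sketched as a prefix
comparison. This is an illustration only: struct xprefix,
xin6_prefix_match() and xnd6_src_ok() are hypothetical names, and the
prefixes are passed in as a plain array rather than taken from the
interface's address list as the kernel would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical on-link prefix: address plus prefix length in bits. */
struct xprefix {
	struct in6_addr	addr;
	unsigned	plen;
};

/* True if the first p->plen bits of a equal those of the prefix. */
bool
xin6_prefix_match(const struct in6_addr *a, const struct xprefix *p)
{
	unsigned full = p->plen / 8;	/* whole bytes to compare */
	unsigned rem = p->plen % 8;	/* leftover bits in the next byte */
	unsigned char mask;

	assert(p->plen <= 128);
	if (memcmp(a->s6_addr, p->addr.s6_addr, full) != 0)
		return false;
	if (rem == 0)
		return true;
	mask = (unsigned char)(0xff << (8 - rem));
	return ((a->s6_addr[full] ^ p->addr.s6_addr[full]) & mask) == 0;
}

/*
 * Accept a solicitation or advertisement source if it is the
 * unspecified address (as used by DAD probes) or if it falls inside
 * one of the given prefixes.
 */
bool
xnd6_src_ok(const struct in6_addr *src, const struct xprefix *pfx, size_t n)
{
	size_t i;

	if (IN6_IS_ADDR_UNSPECIFIED(src))
		return true;
	for (i = 0; i < n; i++)
		if (xin6_prefix_match(src, &pfx[i]))
			return true;
	return false;
}
```

A packet whose source is neither unspecified nor covered by any of the
prefixes would be dropped before it can affect the neighbor cache.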
 1.56.4.1  03-Oct-2008  jdc Pull up revisions:
src/sys/netinet6/in6.c 1.141 via patch
src/sys/netinet6/in6_var.h 1.59 via patch
src/sys/netinet6/nd6_nbr.c 1.89-1.90 via patch
(requested by adrianp in ticket #1967).

If a neighbor solicitation isn't from the unspecified address, make sure
that the source address matches one of the interface's address prefixes.

Generalize previous fix so that both NS and NA packets are checked.
 1.56.2.1  03-Oct-2008  jdc Pull up revisions:
src/sys/netinet6/in6.c 1.141 via patch
src/sys/netinet6/in6_var.h 1.59 via patch
src/sys/netinet6/nd6_nbr.c 1.89-1.90 via patch
(requested by adrianp in ticket #1967).

If a neighbor solicitation isn't from the unspecified address, make sure
that the source address matches one of the interface's address prefixes.

Generalize previous fix so that both NS and NA packets are checked.
 1.57.2.7  17-Mar-2008  yamt sync with head.
 1.57.2.6  07-Dec-2007  yamt sync with head
 1.57.2.5  15-Nov-2007  yamt sync with head.
 1.57.2.4  03-Sep-2007  yamt sync with head.
 1.57.2.3  26-Feb-2007  yamt sync with head.
 1.57.2.2  30-Dec-2006  yamt sync with head.
 1.57.2.1  21-Jun-2006  yamt sync with head.
 1.58.2.2  01-Mar-2006  yamt sync with head.
 1.58.2.1  01-Feb-2006  yamt sync with head.
 1.59.4.2  01-Jun-2006  kardel Sync with head.
 1.59.4.1  22-Apr-2006  simonb Sync with head.
 1.59.2.1  09-Sep-2006  rpaulo sync with head
 1.61.2.3  11-Aug-2006  yamt sync with head
 1.61.2.2  24-May-2006  yamt sync with head.
 1.61.2.1  13-Mar-2006  yamt sync with head.
 1.63.4.1  24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.64.4.1  13-Jul-2006  gdamore Merge from HEAD.
 1.65.14.1  03-Oct-2008  jdc Pull up revisions:
src/sys/netinet6/in6.c 1.141 via patch
src/sys/netinet6/in6_var.h 1.59 via patch
src/sys/netinet6/nd6_nbr.c 1.89-1.90 via patch
(requested by adrianp in ticket #1210).

If a neighbor solicitation isn't from the unspecified address, make sure
that the source address matches one of the interface's address prefixes.

Generalize previous fix so that both NS and NA packets are checked.
 1.65.8.1  03-Oct-2008  jdc Pull up revisions:
src/sys/netinet6/in6.c 1.141 via patch
src/sys/netinet6/in6_var.h 1.59 via patch
src/sys/netinet6/nd6_nbr.c 1.89-1.90 via patch
(requested by adrianp in ticket #1210).

If a neighbor solicitation isn't from the unspecified address, make sure
that the source address matches one of the interface's address prefixes.

Generalize previous fix so that both NS and NA packets are checked.
 1.65.6.2  18-Dec-2006  yamt sync with head.
 1.65.6.1  10-Dec-2006  yamt sync with head.
 1.65.4.2  01-Feb-2007  ad Sync with head.
 1.65.4.1  12-Jan-2007  ad Sync with head.
 1.69.2.5  17-May-2007  yamt sync with head.
 1.69.2.4  07-May-2007  yamt sync with head.
 1.69.2.3  24-Mar-2007  yamt sync with head.
 1.69.2.2  12-Mar-2007  rmind Sync with HEAD.
 1.69.2.1  27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.71.6.1  18-Mar-2007  reinoud First attempt to bring branch in sync with HEAD
 1.71.4.1  11-Jul-2007  mjf Sync with head.
 1.71.2.5  09-Oct-2007  ad Sync with head.
 1.71.2.4  20-Aug-2007  ad Sync with HEAD.
 1.71.2.3  01-Jul-2007  ad Adapt to callout API change.
 1.71.2.2  08-Jun-2007  ad Sync with head.
 1.71.2.1  10-Apr-2007  ad Sync with head.
 1.76.2.2  03-Sep-2007  skrll Sync with HEAD.
 1.76.2.1  15-Aug-2007  skrll Sync with HEAD.
 1.77.4.4  21-Nov-2007  joerg Sync with HEAD.
 1.77.4.3  11-Nov-2007  joerg Sync with HEAD.
 1.77.4.2  03-Sep-2007  jmcneill Sync with HEAD.
 1.77.4.1  09-Aug-2007  jmcneill Sync with HEAD.
 1.78.2.2  07-Aug-2007  dyoung Avoid writing past the end of the buffer [lldst, lldst + dstsize)
in nd6_storelladdr().

Use sockaddr_dl_setaddr(). Constify some sockaddr_dl's. Constify
a sockaddr argument to nd6_na_output(). Change SDL() to "standard"
satocsdl() or satosdl(). Change SIN6() to satocsin6() or satosin6().

bcmp -> memcmp, bcopy -> memcpy.
 1.78.2.1  07-Aug-2007  dyoung file nd6_nbr.c was added on branch matt-mips64 on 2007-08-07 04:35:44 +0000
 1.79.2.3  23-Mar-2008  matt sync with HEAD
 1.79.2.2  09-Jan-2008  matt sync with HEAD
 1.79.2.1  06-Nov-2007  matt sync with HEAD
 1.80.6.1  19-Nov-2007  mjf Sync with HEAD.
 1.80.4.2  18-Nov-2007  bouyer Sync with HEAD
 1.80.4.1  13-Nov-2007  bouyer Sync with HEAD
 1.82.14.3  28-Sep-2008  mjf Sync with HEAD.
 1.82.14.2  02-Jun-2008  mjf Sync with HEAD.
 1.82.14.1  03-Apr-2008  mjf Sync with HEAD.
 1.82.10.2  24-Mar-2008  keiichi sync with head.
 1.82.10.1  22-Feb-2008  keiichi imported Mobile IPv6 code developed by the SHISA project
(http://www.mobileip.jp/).
 1.85.2.2  04-Jun-2008  yamt sync with head
 1.85.2.1  18-May-2008  yamt sync with head.
 1.86.4.2  18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.86.4.1  23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.86.2.1  04-May-2009  yamt sync with head.
 1.88.4.1  19-Oct-2008  haad Sync with HEAD.
 1.90.8.1  13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.90.2.1  28-Apr-2009  skrll Sync with HEAD.
 1.94.16.2  05-Apr-2012  mrg sync to latest -current.
 1.94.16.1  18-Feb-2012  mrg merge to -current.
 1.94.12.2  22-May-2014  yamt sync with head.

for reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.94.12.1  17-Apr-2012  yamt sync with head
 1.95.8.1  02-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1525):
sys/netinet6/nd6_nbr.c: revision 1.145 (patch)

Fix memory leak. Contrary to what the XXX indicates, this place is 100%
reachable remotely.
 1.95.6.1  02-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1525):
sys/netinet6/nd6_nbr.c: revision 1.145 (patch)

Fix memory leak. Contrary to what the XXX indicates, this place is 100%
reachable remotely.
 1.95.2.1  02-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1525):
sys/netinet6/nd6_nbr.c: revision 1.145 (patch)

Fix memory leak. Contrary to what the XXX indicates, this place is 100%
reachable remotely.
 1.96.2.3  03-Dec-2017  jdolecek update from HEAD
 1.96.2.2  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.96.2.1  23-Jun-2013  tls resync from head
 1.98.2.1  18-May-2014  rmind sync with head
 1.99.2.1  10-Aug-2014  tls Rebase.
 1.100.2.3  02-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1562):
sys/netinet6/nd6_nbr.c: revision 1.145
Fix memory leak. Contrary to what the XXX indicates, this place is 100%
reachable remotely.
 1.100.2.2  06-Apr-2015  snj branches: 1.100.2.2.2; 1.100.2.2.6;
Pull up following revision(s) (requested by martin in ticket #655):
sys/netinet6/in6.c: revision 1.182 via patch
sys/netinet6/in6_ifattach.c: revision 1.95 via patch
sys/netinet6/nd6.c: revision 1.158 via patch
sys/netinet6/nd6.h: revision 1.62 via patch
sys/netinet6/nd6_nbr.c: revision 1.104 via patch
sys/netinet6/nd6_rtr.c: revision 1.96 via patch
Rearrange interface detachment slightly: before we free the INET6 specific
per-interface data, make sure to call nd6_purge() with it to remove
routing entries pointing to the going interface.
If we happen to call this function again later, with the data
already gone, just return.
Fixes PR kern/49682, ok: christos.
 1.100.2.1  17-Dec-2014  martin Pull up following revision(s) (requested by roy in ticket #332):
sys/netinet6/nd6_nbr.c: revision 1.103
sys/netinet6/nd6_rtr.c: revision 1.95
sys/netinet6/nd6.h: revision 1.61
sys/netinet6/nd6.c: revision 1.156
Report route additions/changes/deletions for cached neighbours to userland.
 1.100.2.2.6.1  02-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1562):
sys/netinet6/nd6_nbr.c: revision 1.145 (patch)

Fix memory leak. Contrary to what the XXX indicates, this place is 100%
reachable remotely.
 1.100.2.2.2.1  02-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1562):
sys/netinet6/nd6_nbr.c: revision 1.145 (patch)

Fix memory leak. Contrary to what the XXX indicates, this place is 100%
reachable remotely.
 1.102.2.11  28-Aug-2017  skrll Sync with HEAD
 1.102.2.10  05-Feb-2017  skrll Sync with HEAD
 1.102.2.9  05-Dec-2016  skrll Sync with HEAD
 1.102.2.8  05-Oct-2016  skrll Sync with HEAD
 1.102.2.7  09-Jul-2016  skrll Sync with HEAD
 1.102.2.6  29-May-2016  skrll Sync with HEAD
 1.102.2.5  22-Apr-2016  skrll Sync with HEAD
 1.102.2.4  27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.102.2.3  22-Sep-2015  skrll Sync with HEAD
 1.102.2.2  06-Jun-2015  skrll Sync with HEAD
 1.102.2.1  06-Apr-2015  skrll Sync with HEAD
 1.122.2.5  20-Mar-2017  pgoyette Sync with HEAD
 1.122.2.4  07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.122.2.3  04-Nov-2016  pgoyette Sync with HEAD
 1.122.2.2  06-Aug-2016  pgoyette Sync with HEAD
 1.122.2.1  26-Jul-2016  pgoyette Sync with HEAD
 1.134.2.1  21-Apr-2017  bouyer Sync with HEAD
 1.138.6.9  30-Sep-2019  martin Pull up following revision(s) (requested by ozaki-r in ticket #1396):

sys/netinet6/nd6.h: revision 1.88
sys/netinet6/nd6_nbr.c: revision 1.174
sys/netinet6/nd6.c: revision 1.264
sys/netinet/if_arp.c: revision 1.288 (patch)

Initialize DAD components properly

The original code initialized each component in non-init functions such as
arp_dad_start and nd6_dad_find, conditionally based on a global flag for each.
However, it was racy because the flag and the code around it were not
protected by a lock and could cause a kernel panic at worst.

Fix the issue by initializing the components in bootup as usual.
 1.138.6.8  23-Sep-2019  martin Pull up following revision(s) (requested by ozaki-r in ticket #1383):

sys/netinet6/nd6_nbr.c: revision 1.173

nd6: remove extra pserialize_read_exit
 1.138.6.7  13-May-2019  martin Pull up following revision(s) (requested by roy in ticket #1262):

sys/netinet6/nd6_nbr.c: revision 1.163

inet6: discard any received NA with a LL address we own

This matches ARP behaviour.
 1.138.6.6  02-Apr-2018  martin Pull up following revision(s) (requested by ozaki-r in ticket #686):

sys/netinet/if_arp.c: revision 1.271
sys/netinet6/nd6_nbr.c: revision 1.151,1.152

Avoid passing NULL to nd6_dad_duplicated
Fix PR kern/53075

Fix a race condition on DAD destructions (again)

The previous fix to DAD timers was wrong; it avoided a use-after-free but
instead introduced a memory leak. The destruction method had delegated
a destruction of a DAD timer to the timer itself and told that by setting NULL
to dp->dad_ifa. However, the previous fix made DAD timers do nothing on
the sign.

Fixing the issue with using callout_stop isn't easy. One approach is to have
a refcount on dp but it introduces extra complexity that we want to avoid.
The new fix falls back to using callout_halt, which was abandoned because of
softnet_lock. Fortunately now the network stack is protected by KERNEL_LOCK
so we can remove softnet_lock from DAD timers (callout) and use callout_halt
safely.
 1.138.6.5  20-Mar-2018  bouyer Pull up following revision(s) (requested by ozaki-r in ticket #645):
sys/netinet6/nd6_nbr.c: revision 1.153
Pull out a sleepable function (in6_selectsrc) from a pserialize read section
 1.138.6.4  26-Feb-2018  martin Pull up following revision(s) (requested by ozaki-r in ticket #589):
sys/netinet/if_arp.c: revision 1.267
sys/netinet6/nd6_nbr.c: revision 1.146-1.148

Use KASSERT for checking a programming error

Simplify; pass dp to nd6_dad_duplicated instead of looking it up again in it

Avoid a race condition of DAD timer destructions

When we see dp->dad_ifa == NULL, it means that the ifa is being deleted and also
the callout is scheduled again by someone. We shouldn't rely on a result of
callout_pending to know if the callout is scheduled because it returns false if
the subsequent callout handler is already on the fly.
We have to always delegate the destruction of dp to the subsequent handler
unconditionally if dp->dad_ifa == NULL. Otherwise, the first handler destroys
the dp and the second handler tries to handle destroyed dp.
 1.138.6.3  02-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #531):
sys/netinet6/nd6_nbr.c: revision 1.145
Fix memory leak. Contrary to what the XXX indicates, this place is 100%
reachable remotely.
 1.138.6.2  26-Jan-2018  martin Pull up following revision(s) (requested by ozaki-r in ticket #511):
sys/kern/kern_timeout.c: revision 1.54
sys/netinet6/nd6_nbr.c: revision 1.141
sys/netinet6/nd6_nbr.c: revision 1.144
sys/netinet/if_arp.c: revision 1.256
Fix a deadlock on callout_halt of nd6_dad_timer
We must not call callout_halt of nd6_dad_timer with holding nd6_dad_lock because
the lock is taken in nd6_dad_timer. Once softnet_lock goes away, we can pass the
lock to callout_halt, but for now we cannot.
Make DAD destructions (MP-)safe with callout_stop
arp_dad_stoptimer and nd6_dad_stoptimer can be called with or without
softnet_lock held and unfortunately we have no easy way to statically know which.
So it is hard to use callout_halt there.
To address the situation, we use callout_stop to make the code safe. The new
approach copes with the issue by delegating the destruction of a callout to
callout itself, which allows us to not wait for the callout to finish. This can be
done because DAD objects are separated from other data such as ifa.
The approach is suggested by riastradh@
Proposed on tech-kern@ and tech-net@
Sanity-check if interlock is held when it's passed
 1.138.6.1  02-Jan-2018  snj Pull up following revision(s) (requested by ozaki-r in ticket #456):
sys/arch/arm/sunxi/sunxi_emac.c: 1.9
sys/dev/ic/dwc_gmac.c: 1.43-1.44
sys/dev/pci/if_iwm.c: 1.75
sys/dev/pci/if_wm.c: 1.543
sys/dev/pci/ixgbe/ixgbe.c: 1.112
sys/dev/pci/ixgbe/ixv.c: 1.74
sys/kern/sys_socket.c: 1.75
sys/net/agr/if_agr.c: 1.43
sys/net/bpf.c: 1.219
sys/net/if.c: 1.397, 1.399, 1.401-1.403, 1.406-1.410, 1.412-1.416
sys/net/if.h: 1.242-1.247, 1.250, 1.252-1.257
sys/net/if_bridge.c: 1.140 via patch, 1.142-1.146
sys/net/if_etherip.c: 1.40
sys/net/if_ethersubr.c: 1.243, 1.246
sys/net/if_faith.c: 1.57
sys/net/if_gif.c: 1.132
sys/net/if_l2tp.c: 1.15, 1.17
sys/net/if_loop.c: 1.98-1.101
sys/net/if_media.c: 1.35
sys/net/if_pppoe.c: 1.131-1.132
sys/net/if_spppsubr.c: 1.176-1.177
sys/net/if_tun.c: 1.142
sys/net/if_vlan.c: 1.107, 1.109, 1.114-1.121
sys/net/npf/npf_ifaddr.c: 1.3
sys/net/npf/npf_os.c: 1.8-1.9
sys/net/rtsock.c: 1.230
sys/netcan/if_canloop.c: 1.3-1.5
sys/netinet/if_arp.c: 1.255
sys/netinet/igmp.c: 1.65
sys/netinet/in.c: 1.210-1.211
sys/netinet/in_pcb.c: 1.180
sys/netinet/ip_carp.c: 1.92, 1.94
sys/netinet/ip_flow.c: 1.81
sys/netinet/ip_input.c: 1.362
sys/netinet/ip_mroute.c: 1.147
sys/netinet/ip_output.c: 1.283, 1.285, 1.287
sys/netinet6/frag6.c: 1.61
sys/netinet6/in6.c: 1.251, 1.255
sys/netinet6/in6_pcb.c: 1.162
sys/netinet6/ip6_flow.c: 1.35
sys/netinet6/ip6_input.c: 1.183
sys/netinet6/ip6_output.c: 1.196
sys/netinet6/mld6.c: 1.90
sys/netinet6/nd6.c: 1.239-1.240
sys/netinet6/nd6_nbr.c: 1.139
sys/netinet6/nd6_rtr.c: 1.136
sys/netipsec/ipsec_output.c: 1.65
sys/rump/net/lib/libnetinet/netinet_component.c: 1.9-1.10
kmem_intr_free kmem_intr_[z]alloced memory
the underlying pools are the same but api-wise those should match
Unify IFEF_*_MPSAFE into IFEF_MPSAFE
There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.
Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).
Note that if an interface has both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
operations take the kernel lock.
Proposed on tech-kern@ and tech-net@
Provide macros for softnet_lock and KERNEL_LOCK hiding NET_MPSAFE switch
It reduces C&P codes such as "#ifndef NET_MPSAFE KERNEL_LOCK(1, NULL); ..."
scattered all over the source code and makes it easy to identify remaining
KERNEL_LOCK and/or softnet_lock that are held even if NET_MPSAFE.
No functional change
Hold KERNEL_LOCK on if_ioctl selectively based on IFEF_MPSAFE
If IFEF_MPSAFE is set, hold the lock and otherwise don't hold.
This change requires additions of KERNEL_LOCK to subsequent functions from
if_ioctl such as ifmedia_ioctl and ifioctl_common to protect non-MP-safe
components.
Proposed on tech-kern@ and tech-net@
Ensure to hold if_ioctl_lock when calling if_flags_set
Fix locking against myself on ifpromisc
vlan_unconfig_locked could be called with holding if_ioctl_lock.
Ensure to not turn on IFF_RUNNING of an interface until its initialization completes
And ensure to turn it off before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)
Ensure to hold if_ioctl_lock on if_up and if_down
One exception for if_down is if_detach; in the case the lock isn't needed
because it's guaranteed that no other one can access ifp at that point.
Make if_link_queue MP-safe if IFEF_MPSAFE
if_link_queue is a queue to store events of link state changes, which is
used to pass events from (typically) an interrupt handler to
if_link_state_change softint. The queue was protected by KERNEL_LOCK so far,
but if IFEF_MPSAFE is enabled, it becomes unsafe because (perhaps) an interrupt
handler of an interface with IFEF_MPSAFE doesn't take KERNEL_LOCK. Protect it
by a spin mutex.
Additionally with this change KERNEL_LOCK of if_link_state_change softint is
omitted if NET_MPSAFE is enabled.
Note that the spin mutex is now ifp->if_snd.ifq_lock as well as the case of
if_timer (see the comment).
Use IFADDR_WRITER_FOREACH instead of IFADDR_READER_FOREACH
At that point no other one modifies the list so IFADDR_READER_FOREACH
is unnecessary. Use of IFADDR_READER_FOREACH is harmless in general though,
if we try to detect contract violations of pserialize, using it violates
the contract. So avoiding it makes life easier.
Ensure to call if_addr_init with holding if_ioctl_lock
Get rid of outdated comments
Fix build of kernels without ether
By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created an unnecessary dependency from if.c to if_ethersubr.c.
PR kern/52790
Rename IFNET_LOCK to IFNET_GLOBAL_LOCK
IFNET_LOCK will be used in another lock, if_ioctl_lock (might be renamed then).
Wrap if_ioctl_lock with IFNET_* macros (NFC)
Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...
Reorder some destruction routines in if_detach
- Destroy if_ioctl_lock at the end of the if_detach because it's used in various
destruction routines
- Move psref_target_destroy after pr_purgeif because we want to use psref in
pr_purgeif (otherwise destruction procedures can be tricky)
Ensure to call if_mcast_op with holding IFNET_LOCK
Note that CARP doesn't deal with IFNET_LOCK yet.
Remove IFNET_GLOBAL_LOCK where it's unnecessary because IFNET_LOCK is held
Describe which lock is used to protect each member variable of struct ifnet
Requested by skrll@
Write a guideline for converting an interface to IFEF_MPSAFE
Requested by skrll@
Note that IFNET_LOCK must not be held in softint
Don't set IFEF_MPSAFE unless NET_MPSAFE at this point
Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.
Note that enabling IFEF_MPSAFE alone gains little performance benefit because
the network stack is still serialized by the big kernel locks by default.
 1.148.2.5  26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.148.2.4  21-May-2018  pgoyette Sync with HEAD
 1.148.2.3  02-May-2018  pgoyette Synch with HEAD
 1.148.2.2  22-Mar-2018  pgoyette Synch with HEAD, resolve conflicts
 1.148.2.1  15-Mar-2018  pgoyette Synch with HEAD
 1.156.2.2  13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.156.2.1  10-Jun-2019  christos Sync with HEAD
 1.166.2.5  23-Apr-2020  martin Pull up following revision(s) (requested by roy in ticket #845):

sys/netinet6/nd6_nbr.c: revision 1.178

inet6: nd6_na_input() now considers ln_state <= ND6_LLINFO_INCOMPLETE

Otherwise if ln_state != ND6_LLINFO_INCOMPLETE and there is no lladdr
and this message was solicited then ln_state is set to ND6_LLINFO_REACHABLE
which could then cause a panic in nd6_resolve().

If ln_state > ND6_LLINFO_INCOMPLETE then it's assumed we have a lladdr.
Potentially this could have been triggered by the introduction of
ND6_LLINFO_PURGE in nd6.c r1.143 but also by the re-introduction of
ND6_LLINFO_INCOMPLETE in nd6.c r1.263.

Depending on the timing, it's technically possible to receive such
a message after the llentry is created with ND6_LLINFO_NOSTATE.
 1.166.2.4  30-Sep-2019  martin Pull up following revision(s) (requested by ozaki-r in ticket #269):

sys/netinet6/nd6.h: revision 1.88
sys/net/rtsock_shared.c: revision 1.10
sys/netinet6/nd6_nbr.c: revision 1.174
sys/netinet6/nd6.c: revision 1.264
sys/netinet/if_arp.c: revision 1.283
sys/netinet/if_arp.c: revision 1.288

Initialize DAD components properly

The original code initialized each component in non-init functions such as
arp_dad_start and nd6_dad_find, conditionally based on a global flag for each.
However, it was racy because the flag and the code around it were not
protected by a lock and could cause a kernel panic at worst.

Fix the issue by initializing the components in bootup as usual.

-

Initialize dom_mowner for MBUFTRACE
 1.166.2.3  22-Sep-2019  martin Pull up following revision(s) (requested by ozaki-r in ticket #212):

sys/netinet6/nd6_nbr.c: revision 1.173

nd6: remove extra pserialize_read_exit
 1.166.2.2  05-Sep-2019  martin Pull up following revision(s) (requested by roy in ticket #168):

sys/net/rtsock.c: revision 1.252
sys/netinet6/nd6_nbr.c: revision 1.168 - 1.172
sys/netinet6/nd6.c: revision 1.262

inet6: Send RTM_MISS when we fail to resolve an address.

Takes the same approach as when adding a new address - we no longer
announce the new lladdr right away but we announce the result.

This will either be RTM_ADD or RTM_MISS.
RTM_DELETE is only sent if we have a lladdr assigned OR gc'ed.

This results in fewer messages via route(4) and tells us when a new
lladdr has been added (RTM_ADD), changed (RTM_CHANGE), deleted
(RTM_DELETE) or has failed to be resolved (RTM_MISS).

The latter case can be interpreted as unreachable.

inet6: change rt_announce and llchange to bool in nd6_na_input()
more bool
 1.166.2.1  26-Aug-2019  martin Pull up following revision(s) (requested by roy in ticket #109):

sys/net/route.h: revision 1.124
sys/netinet6/nd6.c: revision 1.258
sys/netinet6/nd6.c: revision 1.259
sys/net/rtsock.c: revision 1.251
sys/netinet/if_arp.c: revision 1.284
sys/netinet6/nd6_nbr.c: revision 1.167

rtsock: rework rt_clonedmsg to take a message type and lladdr

We will use this in a future patch to notify userland of lladdr
changes.

XXX pullup -8 -9

-

nd6: notify userland of neighbour lla updates once more

XXX pullup -8 -9
 1.175.2.1  25-Jan-2020  ad Sync with head.
 1.177.2.1  25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.183.6.1  02-Aug-2025  perseant Sync with HEAD
