History log of /src/sys/netinet/ip_input.c
Revision  Date  Author  Comments
 1.406  17-Jul-2025  ozaki-r in: avoid racy ia4_acquire(ifatoia(rt->rt_ifa)) in ip_rtaddr()

Same as the case of ip_output(), it's racy and should be avoided.

PR kern/59527
 1.405  17-Jun-2025  ozaki-r in: avoid packet looping on incoming packets destined for an initializing address

The initialization of an IPv4 address is done by adding a connected route and
a local route (if necessary), and then publishing itself by adding it to the
global list (and the global hashtable). Thus, there can exist a route with an
address that is not published. This inconsistent state allows an incoming
packet destined for a host address that is not yet published but already has a
local route to be forwarded and routed to a loopback interface. This results
in forwarding the packet back to ip_input, that is, packet looping.

To avoid the situation, prohibit packets being forwarded via a local route.

This is a workaround for "IPv4 address initialization atomicity" in doc/TODO.smpnet.
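
The workaround above can be pictured with a minimal sketch. The structure and
flag names here (toy_route, TOY_RTF_LOCAL, toy_can_forward) are hypothetical
stand-ins, not the code in ip_input.c; the sketch only shows the shape of the
check, which refuses to forward when the route that was found is a local route.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a "this is a local route" flag. */
#define TOY_RTF_LOCAL 0x1u

/* Hypothetical, heavily reduced route entry. */
struct toy_route {
	unsigned int flags;
};

/*
 * Return true if a packet may be forwarded via this route.
 * A local route means the destination is one of our own addresses,
 * so forwarding would only loop the packet back to the input path.
 */
static bool
toy_can_forward(const struct toy_route *rt)
{
	if (rt == NULL)
		return false;	/* no route, nothing to forward to */
	return (rt->flags & TOY_RTF_LOCAL) == 0;
}
```

In this sketch a caller would drop the packet, rather than forward it, whenever
toy_can_forward() returns false.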
 1.404  05-Jul-2024  rin sys: Drop redundant NULL check before m_freem(9)

m_freem(9) has safely accepted a NULL argument since at least 4.2BSD:
https://www.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/src/sys/sys/uipc_mbuf.c

Compile-tested on amd64/ALL.

Suggested by knakahara@
 1.403  29-Jun-2024  riastradh branches: 1.403.2;
netinet: Use _NET_STAT* API instead of direct array access.

PR kern/58380
 1.402  02-Sep-2022  thorpej branches: 1.402.4;
pktqueue: Re-factor sysctl handling.

Provide a new pktq_sysctl_setup() function that attaches standard
pktq sysctl nodes below a specified parent node, with either a
fixed node ID or CTL_CREATE to dynamically assign node IDs. Make
all of the sysctl handlers private to pktqueue.c, and remove the
INET- and INET6-specific pktqueue sysctl code from net/if.c.
 1.401  08-Mar-2021  christos remove now unused pseudo-random ip id code.
 1.400  07-Mar-2021  christos netinet: Enable random IP fragment ids by default (from riastradh)
 1.399  19-Feb-2021  christos - Make ALIGNED_POINTER use __alignof(t) instead of sizeof(t). This is more
correct because it works with non-primitive types and provides the ABI
alignment for the type the compiler will use.
- Remove all the *_HDR_ALIGNMENT macros and asserts
- Replace POINTER_ALIGNED_P with ACCESSIBLE_POINTER which is identical to
ALIGNED_POINTER, but returns that the pointer is always aligned if the
CPU supports unaligned accesses.
[ as proposed in tech-kern ]
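
A minimal sketch of why alignof is the better test, assuming hypothetical
macro names (TOY_ALIGNED_BY_SIZEOF, TOY_ALIGNED_BY_ALIGNOF, TOY_ACCESSIBLE) and
a hypothetical TOY_CPU_HANDLES_UNALIGNED switch; these are not the NetBSD
definitions. For a structure whose size is not a power of two, masking with
sizeof(t) - 1 gives a meaningless answer, while alignof(t) gives the alignment
the ABI requires.

```c
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

/* Old style: uses the size of the type as if it were its alignment. */
#define TOY_ALIGNED_BY_SIZEOF(p, t) \
	((((uintptr_t)(p)) & (sizeof(t) - 1)) == 0)

/* New style: uses the alignment the compiler requires for the type. */
#define TOY_ALIGNED_BY_ALIGNOF(p, t) \
	((((uintptr_t)(p)) & (alignof(t) - 1)) == 0)

/* If the CPU tolerates unaligned accesses, every pointer is accessible. */
#ifdef TOY_CPU_HANDLES_UNALIGNED
#define TOY_ACCESSIBLE(p, t)	true
#else
#define TOY_ACCESSIBLE(p, t)	TOY_ALIGNED_BY_ALIGNOF(p, t)
#endif

/* Non-primitive example type: 6 bytes long, 2-byte alignment on common ABIs. */
struct toy_hdr {
	uint16_t a;
	uint16_t b;
	uint16_t c;
};
```

With an 8-byte-aligned buffer, the address at offset 4 is correctly aligned for
struct toy_hdr, yet the sizeof-based macro rejects it because 4 & 5 is nonzero.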
 1.398  14-Feb-2021  christos - centralize header align and pullup into a single inline function
- use a single macro to align pointers and expose the alignment, instead
of hard-coding 3 in 1/2 the macros.
- fix an issue in the ipv6 l2tp where it was aligning for ipv4 and pulling
for ipv6.
 1.397  28-Aug-2020  ozaki-r branches: 1.397.2;
inet: reduce silent packet discards
 1.396  28-Aug-2020  ozaki-r inet: pull m_get_rcvif_psref out of ip_input for simplicity

Same as ip6_input.
 1.395  28-Aug-2020  ozaki-r ipsec: rename ipsec_ip_input to ipsec_ip_input_checkpolicy

Because it just checks if a packet passes security policies.
 1.394  28-Aug-2020  ozaki-r inet, inet6: count packets dropped by IPsec

The counters count packets dropped due to security policy checks.
 1.393  13-Nov-2019  ozaki-r Get rid of unnecessary NULL checks for rt_ifa and ifa_ifp

They are always non-NULL nowadays.
 1.392  19-Sep-2019  ozaki-r Apply some missing changes lost on the previous commit
 1.391  19-Sep-2019  ozaki-r Avoid having a rtcache directly in a percpu storage

percpu(9) keeps a memory area for each CPU and hands it out piecemeal to
users. If the areas run short, percpu(9) enlarges them by allocating new,
larger memory areas, replacing the old ones with them, and destroying the old ones.
A percpu storage referenced by a pointer obtained via percpu_getref can be
destroyed by this mechanism after a running thread sleeps, even if percpu_putref
has not been called.

Using rtcache, i.e., packet processing, typically involves sleepable operations
such as rwlock so we must avoid dereferencing a rtcache that is directly stored
in a percpu storage during packet processing. Address this situation by having
just a pointer to a rtcache in a percpu storage instead.

Reviewed by knakahara@ and yamaguchi@
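
The idea can be sketched in userland C. This is a toy model, not percpu(9):
toy_percpu, toy_rtcache and toy_percpu_grow are hypothetical names. The growable
area holds only pointers, so when the area is replaced by a larger one the
objects the pointers refer to stay where they are.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a route cache. */
struct toy_rtcache {
	int generation;
};

/* Toy "percpu" area: it stores pointers, not the rtcache objects. */
struct toy_percpu {
	struct toy_rtcache **slots;
	size_t nslots;
};

/*
 * Grow the area by allocating a new, larger one, copying the old
 * contents, and destroying the old area, as the log describes.
 * Only the pointer array moves; existing rtcache objects do not.
 * Returns 0 on success, -1 on allocation failure.
 */
static int
toy_percpu_grow(struct toy_percpu *pc, size_t newn)
{
	struct toy_rtcache **area;
	size_t i;

	if (newn <= pc->nslots)
		return 0;
	area = calloc(newn, sizeof(*area));
	if (area == NULL)
		return -1;
	for (i = 0; i < pc->nslots; i++)
		area[i] = pc->slots[i];
	for (i = pc->nslots; i < newn; i++) {
		area[i] = calloc(1, sizeof(**area));
		if (area[i] == NULL) {
			/* Undo the objects allocated by this call. */
			while (i > pc->nslots)
				free(area[--i]);
			free(area);
			return -1;
		}
	}
	free(pc->slots);	/* the old area goes away */
	pc->slots = area;
	pc->nslots = newn;
	return 0;
}
```

A thread that fetched pc->slots[0] before the growth still holds a valid
rtcache pointer afterwards, which is the property the commit relies on.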
 1.390  15-Sep-2019  bouyer Packet filters can return an mbuf chain with fragmented headers, so
m_pullup() it if needed and remove the KASSERT()s.
 1.389  13-May-2019  ozaki-r branches: 1.389.2;
Count packets dropped by pfil
 1.388  17-Jan-2019  knakahara Fix ipsecif(4) being unable to apply input-direction packet filters. Reviewed by ozaki-r@n.o and ryo@n.o.

Add ATF later.
 1.387  15-Nov-2018  maxv Remove the 't' argument from m_tag_find().
 1.386  02-Sep-2018  maxv remove reference to ipnat, and duplicate comments
 1.385  10-Jul-2018  maxv Remove the second argument from ip_reass_packet(). We want the IP header
on the mbuf, not elsewhere. Simplifies the NPF reassembly code a little.
No real functional change.
 1.384  17-May-2018  maxv branches: 1.384.2;
Add KASSERTs, related to PR/39794.
 1.383  14-May-2018  maxv Merge ipsec4_input and ipsec6_input into ipsec_ip_input. Make the argument
a bool for clarity. Optimize the function: if M_CANFASTFWD is not there
(because already removed by the firewall) leave now.

Makes it easier to see that M_CANFASTFWD is not removed on IPv6.
 1.382  10-May-2018  maxv Rename ipsec4_forward -> ipsec_mtu, and switch to void.
 1.381  26-Apr-2018  maxv Remove unused mbuf argument from sbsavetimestamp.
 1.380  15-Apr-2018  maxv Introduce a m_verify_packet function, that verifies the mbuf chain of a
packet to ensure it is not malformed. Call this function in "points of
interest", that are the IPv4/IPv6/IPsec entry points. There could be more.

We use M_VERIFY_PACKET(m), declared under DIAGNOSTIC only.

This function should not be called everywhere, especially not in places
that temporarily manipulate (and clobber) the mbuf structure; once they're
done they put the mbuf back in a correct format.
 1.379  11-Apr-2018  maxv Don't pass IP_ALLOWBROADCAST in ipsec4_input. The flag lands in
ipsec_getpolicybyaddr, and only IP_FORWARDING is taken.

In fact it would be good to change the 'flags' argument of ipsec4_input
to be a boolean, same for ipsec_getpolicybyaddr. It would be less
misleading.
 1.378  11-Apr-2018  maxv Add comment about IPsec.
 1.377  11-Apr-2018  maxv Small changes in ip_dooptions: replace bcopy by memcpy, the areas can't
overlap.
 1.376  24-Feb-2018  ozaki-r branches: 1.376.2;
Avoid a deadlock between softnet_lock and IFNET_LOCK

A deadlock occurs because there is a violation of the rule of lock ordering;
softnet_lock is taken while holding IFNET_LOCK, which violates the rule.
To avoid the deadlock, replace softnet_lock in in_control and in6_control
with KERNEL_LOCK.

We also need to add some KERNEL_LOCKs to reliably protect the network stack.
This is required, for example, for PR kern/51356.

Fix PR kern/53043
 1.375  09-Feb-2018  maxv Remove dead code.
 1.374  07-Feb-2018  maxv Remove null check on ip, it can't be null. (Confuses code scanners.)
 1.373  06-Feb-2018  maxv Typos and style a bit, no functional change.
 1.372  05-Feb-2018  maxv Exterminate IPSENDREDIRECTS and IPMTUDISCTIMEOUT, neither is documented.
 1.371  05-Feb-2018  maxv Nuke DIRECTED_BROADCAST, it is not documented and not enabled anywhere. It
probably wouldn't have built correctly anyway, since there is no associated
defflag.

These ten lines of code in ip_input.c already look a lot better.
 1.370  05-Feb-2018  maxv Clean up this mess. This is typically the kind of places where we need to
seriously cut the bullshit. These things are unreadable, undocumented, and
all they bought us was not figuring out we had IPv4 forwarding enabled by
default for 20+ years.
 1.369  05-Feb-2018  maxv Be tougher, and don't allow LSRR+SSRR (RFC7126).
 1.368  05-Feb-2018  maxv Kick duplicate options, they are not allowed (RFC791).
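
The two option checks above (no duplicate options, and no LSRR together with
SSRR) can be sketched as a single pass over the IPv4 options area. This is an
illustrative sketch, not the code in ip_dooptions(); the function name and the
TOY_ constants are hypothetical, although the option numbers follow RFC 791.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
	TOY_IPOPT_EOL  = 0,	/* end of option list */
	TOY_IPOPT_NOP  = 1,	/* no operation (padding) */
	TOY_IPOPT_LSRR = 131,	/* loose source and record route */
	TOY_IPOPT_SSRR = 137	/* strict source and record route */
};

/*
 * Walk the options area and return false if it is malformed,
 * contains the same option twice, or carries both LSRR and SSRR.
 */
static bool
toy_options_ok(const uint8_t *opts, size_t len)
{
	bool seen[256] = { false };
	size_t i = 0;

	while (i < len) {
		uint8_t type = opts[i];
		uint8_t optlen;

		if (type == TOY_IPOPT_EOL)
			break;
		if (type == TOY_IPOPT_NOP) {
			i++;		/* single-byte option */
			continue;
		}
		if (i + 1 >= len)
			return false;	/* no room for the length byte */
		optlen = opts[i + 1];
		if (optlen < 2 || optlen > len - i)
			return false;	/* length out of bounds */
		if (seen[type])
			return false;	/* duplicate option */
		seen[type] = true;
		i += optlen;
	}
	return !(seen[TOY_IPOPT_LSRR] && seen[TOY_IPOPT_SSRR]);
}
```

NOP is allowed to repeat in this sketch because it is padding; whether the
kernel treats it the same way is not stated in the log.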
 1.367  05-Feb-2018  maxv Remove unused variable.
 1.366  05-Feb-2018  maxv Disable ip_allowsrcrt and ip_forwsrcrt. Enabling them by default was a
completely dumb idea, because they have security implications.

By sending an IPv4 packet containing an LSRR option, an attacker will
cause the system to forward the packet to another IPv4 address - and
this way he white-washes the source of the packet.

It is also possible for an attacker to reach hidden networks: if a server
has a public address, and a private one on an internal network (network
which has several internal machines connected), the attacker can send a
packet with:

source = 0.0.0.0
destination = public address of the server
LSRR first address = address of a machine on the internal network

And the packet will be forwarded, by the server, to the internal machine,
in some cases even with the internal IP address of the server as a source.
 1.365  05-Feb-2018  maxv Style, no functional change.
 1.364  01-Jan-2018  christos 1) "#define ipi_spec_dst ipi_addr" in <netinet/in.h>
2) Change the IP_RECVPKTINFO option to control the generation of
IP_PKTINFO control messages, the way it's done in Solaris.
3) Remove the superfluous IP_RECVPKTINFO control message.
4) Change the IP_PKTINFO option to do different things depending on
the parameter it's supplied with:
- If it's sizeof(int), assume it's being used as in Linux:
- If it's non-zero, turn on the IP_RECVPKTINFO option.
- If it's zero, turn off the IP_RECVPKTINFO option.
- If it's sizeof(struct in_pktinfo), assume it's being used as in
Solaris, to set a default for the source interface and/or
source address for outgoing packets on the socket.
5) Return what Linux or Solaris compatible code expects, depending
on data size, and just added a fallback to a Linux (and current NetBSD)
compatible value if the size is unknown (as it is now), or,
in the future, if the calling application specifies a receiving
buffer that doesn't match either data item.

From: Tom Ivar Helbekkmo
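
The size-based dispatch in item 4 can be sketched as follows. The names
(toy_in_pktinfo, toy_pktinfo_setopt, the TOY_ actions) are hypothetical and the
structure is a simplified stand-in for struct in_pktinfo. Rejecting any other
size is an assumption made for the sketch; the log does not say how the kernel
handles that case on the set path.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct in_pktinfo. */
struct toy_in_pktinfo {
	uint32_t     ipi_addr;
	unsigned int ipi_ifindex;
};

enum toy_action {
	TOY_RECVPKTINFO_ON,	/* Linux-style: enable IP_RECVPKTINFO */
	TOY_RECVPKTINFO_OFF,	/* Linux-style: disable IP_RECVPKTINFO */
	TOY_SET_DEFAULT_SRC,	/* Solaris-style: default source for output */
	TOY_REJECT		/* assumption: unknown sizes are rejected */
};

/* Decide what an IP_PKTINFO set request means from its data size. */
static enum toy_action
toy_pktinfo_setopt(const void *val, size_t len)
{
	if (len == sizeof(int)) {
		int onoff;

		memcpy(&onoff, val, sizeof(onoff));
		return onoff != 0 ? TOY_RECVPKTINFO_ON : TOY_RECVPKTINFO_OFF;
	}
	if (len == sizeof(struct toy_in_pktinfo))
		return TOY_SET_DEFAULT_SRC;
	return TOY_REJECT;
}
```

The dispatch works only because the two accepted payloads have different
sizes, which is why the commit keys on the size rather than on a flag.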
 1.363  24-Nov-2017  roy Allow local communication over DETACHED addresses.
Allow binding to DETACHED or TENTATIVE addresses as we deny
sending upstream from them anyway.
Prefer non DETACHED or TENTATIVE addresses.
 1.362  17-Nov-2017  ozaki-r Provide macros for softnet_lock and KERNEL_LOCK hiding NET_MPSAFE switch

It reduces copy-and-paste code such as "#ifndef NET_MPSAFE KERNEL_LOCK(1, NULL); ..."
scattered all over the source code and makes it easy to identify remaining
KERNEL_LOCK and/or softnet_lock that are held even if NET_MPSAFE.

No functional change
 1.361  27-Sep-2017  ozaki-r Take softnet_lock on pr_input properly if NET_MPSAFE

Currently softnet_lock is taken unnecessarily in some cases, e.g.,
icmp_input and encap4_input from ip_input, or not taken even if needed,
e.g., udp_input and tcp_input from ipsec4_common_input_cb. Fix them.

NFC if NET_MPSAFE is disabled (default).
 1.360  27-Jul-2017  ozaki-r Don't acquire global locks for IPsec if NET_MPSAFE

Note that the change is just to make testing easy and IPsec isn't MP-safe yet.
 1.359  19-Jul-2017  ozaki-r Correct a comment
 1.358  08-Jul-2017  christos Reorder the controls to the ones that need an interface and the ones that
don't; process the ones that don't first. Add a DIAGNOSTIC if there is no
interface; really this should be a KASSERT/panic because it is a bug if the
interface is not set at this point.
 1.357  06-Jul-2017  christos remove unnecessary casts (no functional change)
 1.356  06-Jul-2017  christos Merge the two copies SO_TIMESTAMP/SO_OTIMESTAMP processing to a single
function, and add a SOOPT_TIMESTAMP define reducing compat pollution from
5 places to 1.
 1.355  01-Jun-2017  chs branches: 1.355.2;
remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.
 1.354  31-Mar-2017  ozaki-r Don't use a single global variable to store source route information for multiple incoming packets

It's not MP-safe. So use a m_tag to store the information instead.

Pointed out by knakahara@
The fix is from OpenBSD (originally fixed in FreeBSD)
 1.353  31-Mar-2017  ozaki-r Don't use a single global variable as a temporary storage for multiple packets

It's not MP-safe. So use local variables instead.
 1.352  06-Mar-2017  ozaki-r Make sure icmp_redirect_timeout_q and ip_mtudisc_timeout_q are initialized on bootup

Fix PR kern/52029
 1.351  17-Feb-2017  ozaki-r Fix return value
 1.350  17-Feb-2017  ozaki-r Protect sysctl_net_inet_ip_pmtudto with icmp_mtx instead of softnet_lock
 1.349  07-Feb-2017  ozaki-r Add missing NULL checks for m_get_rcvif
 1.348  24-Jan-2017  ozaki-r Tweak softnet_lock and NET_MPSAFE

- Don't hold softnet_lock in some functions if NET_MPSAFE
- Add softnet_lock to sysctl_net_inet_icmp_redirtimeout
- Add softnet_lock to expire_upcalls of ip_mroute.c
- Restore softnet_lock for in{,6}_pcbpurgeif{,0} if NET_MPSAFE
- Mark some softnet_lock for future work
 1.347  12-Dec-2016  ozaki-r branches: 1.347.2;
Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview
 1.346  08-Dec-2016  ozaki-r Use psref for ip_rtaddr

ip_rtaddr will be sleepable soon. So use psref instead of pserialize.
 1.345  08-Dec-2016  ozaki-r Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
point. So we need to protect rtentries somehow, say by reference counting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that is difficult to debug by bisecting.
 1.344  18-Oct-2016  ozaki-r Don't hold global locks if NET_MPSAFE is enabled

If NET_MPSAFE is enabled, don't hold KERNEL_LOCK and softnet_lock in
part of the network stack such as IP forwarding paths. The aim of the
change is to make it easy to test the network stack without the locks
and reduce our local diffs.

By default (i.e., if NET_MPSAFE isn't enabled), the locks are held
as they used to be.

Reviewed by knakahara@
 1.343  18-Oct-2016  ozaki-r Avoid double frees of mbuf

May fix one of the panics reported by Tom Ivar Helbekkmo in PR kern/51522
 1.342  11-Oct-2016  ozaki-r Fix kernel builds with IFA_STATS
 1.341  07-Sep-2016  roy Disallow input to detached addresses because they are not yet valid.
 1.340  31-Aug-2016  ozaki-r Make ipforward_rt and ip6_forward_rt percpu

Sharing one rtcache between CPUs is just a bad idea.

Reviewed by knakahara@
 1.339  01-Aug-2016  ozaki-r Apply pserialize and psref to struct ifaddr and its variants

This change makes struct ifaddr and its variants (in_ifaddr and in6_ifaddr)
MP-safe by using pserialize and psref. At this moment, pserialize_perform
and psref_target_destroy are disabled because (1) we don't need them
because of softnet_lock (2) they cause a deadlock because of softnet_lock.
So we'll enable them when we remove softnet_lock in the future.
 1.338  26-Jul-2016  ozaki-r Fix downmatch increment
 1.337  08-Jul-2016  ozaki-r branches: 1.337.2;
CID 1363344: remove dead code

We may need to reconsider a case when m_get_rcvif_psref returns NULL.
 1.336  07-Jul-2016  ozaki-r Switch the address list of interfaces to pslist(9)

As usual, we leave the old list to avoid breaking kvm(3) users.
 1.335  06-Jul-2016  ozaki-r Switch the IPv4 address list to pslist(9)

Note that we leave the old list just in case; it seems there are some
kvm(3) users accessing the list. We can remove it later if we confirmed
nobody does actually.
 1.334  06-Jul-2016  ozaki-r Add and use pslist(9)-based hashtable for IPv4 addresses

Note that we leave the old hashtable to keep vmstat -H working.
 1.333  04-Jul-2016  ozaki-r Separate IP address matching functions

No functional change intended.
 1.332  30-Jun-2016  ozaki-r Tidy up goto labels

No functional change.
 1.331  30-Jun-2016  ozaki-r Fix error paths

Some error paths did m_put_rcvif_psref twice.
 1.330  28-Jun-2016  ozaki-r Add missing NULL checks for m_get_rcvif_psref
 1.329  10-Jun-2016  ozaki-r Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored in the mbuf instead of a
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, which is NOT MP-safe and exists only for the
transition period, i.e., it is intended for places that are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.
 1.328  21-Jan-2016  riastradh Revert previous: ran cvs commit when I meant cvs diff. Sorry!

Hit up-arrow one too few times.
 1.327  21-Jan-2016  riastradh Give proper prototype to ip_output.
 1.326  08-Jan-2016  knakahara eliminate ip_input.c and ip6_input.c dependency on gif(4)
 1.325  13-Oct-2015  roy Include arp.h to restore the sysctl net.inet.ip.dad_count.
Fixes PR kern/49883 thanks to HITOSHI Osada.
 1.324  24-Aug-2015  pooka sprinkle _KERNEL_OPT
 1.323  07-Aug-2015  ozaki-r Use time_uptime instead of time_second to avoid time leaps

Some code in sys/net* uses time_second to manage time periods such as
cache expirations. However, time_second doesn't increase monotonically
and can leap, for example via settimeofday(2), according to time_second(9). We
should use time_uptime instead of it to avoid such time leaps.

This change replaces time_second with time_uptime. Additionally it
converts a time based on time_uptime to a time based on time_second
when the kernel passes the time to userland programs that expect
the latter, and vice versa.

Note that we shouldn't leak time_uptime to other hosts over the
network. My investigation shows there is no such leak:
http://mail-index.netbsd.org/tech-net/2015/08/06/msg005332.html

Discussed on tech-kern and tech-net.
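
The conversion described above amounts to shifting a timestamp by the current
difference between the two clocks. A minimal sketch, assuming hypothetical
helper names and passing both clock readings in explicitly so the arithmetic
is visible; it is not the kernel's code.

```c
#include <stdint.h>

typedef int64_t toy_time_t;	/* seconds */

/*
 * Convert a timestamp kept on the monotonic uptime clock into
 * wall-clock time, for handing to userland.
 */
static toy_time_t
toy_uptime_to_wall(toy_time_t t_uptime, toy_time_t now_uptime,
    toy_time_t now_wall)
{
	return t_uptime - now_uptime + now_wall;
}

/*
 * Convert a wall-clock timestamp received from userland back to
 * the uptime clock used internally.
 */
static toy_time_t
toy_wall_to_uptime(toy_time_t t_wall, toy_time_t now_uptime,
    toy_time_t now_wall)
{
	return t_wall - now_wall + now_uptime;
}
```

An expiration stored as uptime 150 when the current uptime is 100 lies 50
seconds in the future, so it maps to the current wall-clock time plus 50.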
 1.322  02-May-2015  joerg Fix !ARP build.
 1.321  02-May-2015  roy Add IPv4 address flags IN_IFF_TENTATIVE, IN_IFF_DUPLICATED and
IN_IFF_DETACHED to mimic the IPv6 address behaviour.
Add SIOCGIFAFLAG_IN ioctl to retrieve the address flag via the
ifreq structure.
Add IPv4 DAD detection via the ARP methods described in RFC 5227.
Add sysctls net.inet.ip.dad_count and net.inet.arp.debug.

Discussed on tech-net@
 1.320  26-Mar-2015  ozaki-r Tidy up the regular path of ip_forward

No functional change is intended.
 1.319  16-Jun-2014  ozaki-r branches: 1.319.2; 1.319.4; 1.319.6; 1.319.10;
Add 3rd argument to pktq_create to pass sc

It will be used to pass bridge sc for bridge_forward softint.

ok rmind@
 1.318  05-Jun-2014  rmind - Implement pktqueue interface for lockless IP input queue.
- Replace ipintrq and ip6intrq with the pktqueue mechanism.
- Eliminate kernel-lock from ipintr() and ip6intr().
- Some preparation work to push softnet_lock out of ipintr().

Discussed on tech-net.
 1.317  30-May-2014  christos Introduce 2 new variables: ipsec_enabled and ipsec_used.
ipsec_enabled is controlled by sysctl and determines whether IPsec is allowed.
ipsec_used is set automatically based on ipsec being enabled, and
rules existing.
 1.316  29-May-2014  rmind Make IGMP and multicast group management code MP-safe. Use a read-write
lock to protect the hash table of multicast address records; also, make it
private and eliminate some macros. In the long term, the lookup path ought
to be optimised.
 1.315  28-May-2014  christos CID 12164{49,51}: Remove bogus ifp == NULL checks; if ifp was really NULL,
we would have been dead a few lines before the tests.
 1.314  23-May-2014  rmind ip_input(), ip_savecontrol(): cache m->m_pkthdr.rcvif in a variable.
 1.313  23-May-2014  rmind Make ip_forward() static, there is no need to expose it.
 1.312  23-May-2014  rmind Make ip_input() static, there is no need to expose it.
 1.311  22-May-2014  rmind - Add in_init() and move some functions, variables and sysctls into in.c
where they belong to. Make some functions and variables static.
- ip_input.c: reduce some #ifdefs, cleanup a little.
- Move some sysctls into ip_flow.c as they belong there.

No functional change.
 1.310  19-Mar-2014  liamjfoy branches: 1.310.2;
Remove ipflow_prune and replace with ipflow_reap. ok rmind@
 1.309  25-Feb-2014  pooka Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
 1.308  29-Jun-2013  rmind - Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.
 1.307  27-Jun-2013  christos branches: 1.307.2;
flip src/dst
 1.306  27-Jun-2013  christos implement IP_PKTINFO and IP_RECVPKTINFO.
 1.305  08-Jun-2013  rmind Split IPsec code in ip_input() and ip_forward() into the separate routines
ipsec4_input() and ipsec4_forward(). Tested by christos@.
 1.304  05-Jun-2013  christos IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.
 1.303  29-Nov-2012  christos Add a new sysctl to mark ports as reserved, so that they are not used in
the anonymous or reserved port allocation.
 1.302  25-Jun-2012  christos branches: 1.302.2;
rename rfc6056 -> portalgo, requested by yamt
 1.301  22-Jun-2012  christos PR/46602: Move the rfc6056 port randomization to the IP layer.
 1.300  02-Jun-2012  dsl Add some pre-processor magic to verify that the type of the data item
passed to sysctl_createv() actually matches the declared type for
the item itself.
In the places where the caller specifies a function and a structure
address (typically the 'softc') an explicit (void *) cast is now needed.
Fixes bugs in sys/dev/acpi/asus_acpi.c sys/dev/bluetooth/bcsp.c
sys/kern/vfs_bio.c sys/miscfs/syncfs/sync_subr.c and setting
AcpiGbl_EnableAmlDebugObject.
(mostly passing the address of a uint64_t when typed as CTLTYPE_INT).
I've test built quite a few kernels, but there may be some unfixed MD
fallout. Most likely passing &char[] to char *.
Also add CTLFLAG_UNSIGNED for unsigned decimals - not set yet.
 1.299  22-Mar-2012  drochner remove KAME IPSEC, replaced by FAST_IPSEC
 1.298  09-Jan-2012  liamjfoy branches: 1.298.2; 1.298.6; 1.298.8;
check against NULL
 1.297  19-Dec-2011  drochner rename the IPSEC in-kernel CPP variable and config(8) option to
KAME_IPSEC, and make IPSEC define it so that existing kernel
config files work as before
Now the default can easily be changed to FAST_IPSEC just by
setting the IPSEC alias to FAST_IPSEC.
 1.296  31-Aug-2011  plunky branches: 1.296.2; 1.296.6;
NULL does not need a cast
 1.295  03-May-2011  dyoung *_drain() routines may be called with locks held, so instead of doing
any work in *_drain(), set a drain-needed flag. Do the work in the
fasttimo handler.

Contributed by Coyote Point Systems, Inc.
 1.294  14-Apr-2011  dyoung In ipintr(), don't overwrite ipintrq.ifq_maxlen with IFQ_MAXLEN.

Initialize ipintrq.ifq_maxlen using IFQ_MAXLEN directly instead of using
the global ipqmaxlen. Get rid of the global ipqmaxlen.

Now it works again to override the maximum IP queue length with, for
example, sysctl -w net.inet.ip.ifq.maxlen=5.
 1.293  13-Dec-2010  matt branches: 1.293.2;
Back out rev that shouldn't have been committed.
 1.292  11-Dec-2010  matt Add routines to calculate a checksum if the driver concludes that the
h/w can't do it.
 1.291  05-Nov-2010  rmind ip_randomid: make mechanism MP-safe and more modular.

OK matt@
 1.290  05-Nov-2010  rmind ip_reass_packet: finish abstraction; some clean-up.
Discussed some time ago with matt@.
 1.289  19-Jul-2010  rmind Abstract IP reassembly into single generic routine - ip_reass_packet().
Make struct ipq private and struct ipqent not visible to userland.
Push ip_len adjustment into reassembly layer.

OK matt@
 1.288  13-Jul-2010  rmind Split-off IPv4 re-assembly mechanism into a separate module. Abstract
into ip_reass_init(), ip_reass_lookup(), etc (note: abstraction is not
yet complete). No functional changes to the actual mechanism.

OK matt@
 1.287  09-Jul-2010  rmind ip_input: move lookup for fragment queue a little bit further. OK matt@.
 1.286  01-Apr-2010  tls As suggested by at least 3 different people (the guilty parties know who
they are) avoid repeated kernel_lock/unlock by using an intrq on the stack.

About 5%-10% better from run to run, on my *very* simpleminded test. Can't
possibly be worse.
 1.285  31-Mar-2010  tls Don't hold kernel lock across call to ip_input() -- it blocked *all*
hardware interrupts for the length of time it took for all dequeued
packets to flow up the stack (on multiprocessors only). Initial testing
shows performance impact is minimal -- since this temporary fix actually
means taking/releasing the kernel lock per-packet, that seems
acceptable.

Holding the kernel lock across the ip_input() call duplicated the
exclusion intended to be provided by the socket locks/softnet lock
(same lock, for INET/INET6 sockets) and could mask serious bugs. Several
hours' testing didn't turn any up but I'd be surprised if some don't now
appear.

Damon Permezel noticed the problem. Temporary fix suggested by matt@.
 1.284  16-Sep-2009  pooka branches: 1.284.2; 1.284.4;
Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL
 1.283  17-Jul-2009  minskim Delete trailing whitespace.
 1.282  16-Jul-2009  minskim Add the IP_RECVTTL option support.

If the IP_RECVTTL option is enabled on a SOCK_DGRAM socket, the
recvmsg(2) call will return the TTL of the received datagram. The
msg_control field in the msghdr structure points to a buffer that
contains a cmsghdr structure followed by the TTL value.

Modeled after FreeBSD implementation.
 1.281  18-Apr-2009  tsutsui Remove extra whitespace added by a stupid tool.
XXX: more in src/sys/arch
 1.280  15-Apr-2009  elad Remove a few KAUTH_GENERIC_ISSUSER in favor of more descriptive
alternatives.

Discussed on tech-kern:

http://mail-index.netbsd.org/tech-kern/2009/04/11/msg004798.html

Input from ad@, christos@, dyoung@, tsutsui@.

Okay ad@.
 1.279  18-Mar-2009  cegger bcopy -> memcpy
 1.278  19-Jan-2009  christos branches: 1.278.2;
Provide compatibility to the old timeval SCM_TIMESTAMP messages.
 1.277  17-Dec-2008  cegger kill MALLOC and FREE macros.
 1.276  23-Nov-2008  rmind ip_input: fix an IPQ "lock" leak. (hi <matt>!)
 1.275  04-Oct-2008  pooka branches: 1.275.2; 1.275.4;
POOL_INIT -> pool_init
 1.274  05-Sep-2008  seanb Wrong route being consulted in one place
in ip_forward() after change to rtcache_*().
Restore previous behaviour.
 1.273  20-Aug-2008  matt Make the sysctl routines take out softnet_lock before dealing with
any data structures.

Change inet6ctlerrmap and zeroin6_addr to const.
 1.272  05-May-2008  ad branches: 1.272.2; 1.272.6;
- Convert hashinit() to use kmem_alloc(). The hash tables can be large
and it's better to not have them in kmem_map.
- Convert a couple of minor items along the way to kmem_alloc().
- Fix some memory leaks.
 1.271  04-May-2008  thorpej Simplify the interface to netstat_sysctl() and allocate space for
the collated counters using kmem_alloc().

PR kern/38577
 1.270  02-May-2008  ad PR kern/38497 Out of memory allocating ksiginfo

Work around: don't acquire softnet_lock in protocol drain routines.
 1.269  28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.268  24-Apr-2008  ad branches: 1.268.2;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
 1.267  23-Apr-2008  thorpej Make IPSEC and FAST_IPSEC stats per-cpu. Use <net/net_stats.h> and
netstat_sysctl().
 1.266  12-Apr-2008  thorpej branches: 1.266.2;
Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.
 1.265  09-Apr-2008  thorpej - ipflow is not used outside ip_flow.c; move its definition there.
- Make ipflow_reap() private to ip_flow.c, and introduce ipflow_prune()
for external callers to use (avoids returning an ipflow * that is never
actually used anyway).
 1.264  07-Apr-2008  thorpej Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.
 1.263  27-Mar-2008  cube - Make sure we send a reasonable fragment size when IPSEC is configured.
Otherwise we end up sending a dubious "0" whenever we cannot find a
proper association for the packet.
- Reset sack_newdata along with snd_nxt to avoid improper integer
arithmetics that lead to sending data from an incorrect place in the
stream, making it appear as corrupted.

Patch by Michael Van Elst, based on an analysis by Michael for the IPSEC
stuff and I for the SACK issue.
 1.262  06-Feb-2008  matt branches: 1.262.6;
Add a new ip_id generation scheme based on a Fisher-Yates shuffle over a
sliding window. XXX replace use of arc4random RSN.
 1.261  14-Jan-2008  dyoung Use rtcache_validate() instead of rtcache_getrt(). Shorten staircase
in in_losing().
 1.260  22-Dec-2007  matt Fix offset calculation.
Make sure that all frags use the same TOS.
 1.259  21-Dec-2007  matt Also make sure the first fragment is at least 68 bytes long.
 1.258  21-Dec-2007  matt Prevent TCP blind data attacks by not allowing non-initial fragments to
start at less than 68 bytes (minimal fragment size).
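
A minimal sketch of such a check, assuming a hypothetical helper that receives
the fragment-offset field already converted to host byte order. The constants
are named for the sketch only; the real test in the reassembly code may be
expressed differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_IP_OFFMASK	0x1fffu	/* low 13 bits: offset in 8-byte units */
#define TOY_MIN_FRAG	68u	/* minimal fragment size from the log */

/*
 * Return true if the fragment offset is acceptable: either this is
 * the initial fragment (offset 0) or it starts at 68 bytes or later.
 */
static bool
toy_frag_offset_ok(uint16_t ip_off_host)
{
	unsigned int off_bytes = (ip_off_host & TOY_IP_OFFMASK) << 3;

	return off_bytes == 0 || off_bytes >= TOY_MIN_FRAG;
}
```

Because offsets come in 8-byte units, the smallest non-initial offset this
sketch accepts is 72 bytes; offsets of 8 through 64 bytes are rejected.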
 1.257  20-Dec-2007  dyoung Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.
 1.256  26-Nov-2007  yamt branches: 1.256.2; 1.256.6;
inetctlerrmap: use designated initializer.
 1.255  09-Nov-2007  kefren Don't MCLAIM in ipintr() because we do it anyway in ip_input()
 1.254  02-Oct-2007  dyoung branches: 1.254.2; 1.254.4;
Delete the unused second argument to ip_stripoptions(), move it
closer to its single caller in if_eon.c, try to move fewer bytes
by moving the IP header forward instead of moving the tail of the
mbuf backward, and use m_adj(9) instead of fiddling directly with
mbuf data members.
 1.253  11-Sep-2007  degroote branches: 1.253.2;
In some FAST_IPSEC, spl level is not restored correctly. Fix that.

Spotted by Wolfgang Stukenbrock in pr/36800
 1.252  30-Aug-2007  dyoung Use malloc(9) for sockaddrs instead of pool(9), and remove dom_sa_pool
and dom_sa_len members from struct domain. Pools of fixed-size
objects are too rigid for sockaddr_dls, whose size can vary over
a wide range.

Return sockaddr_dl to its "historical" size. Now that I'm using
malloc(9) instead of pool(9) to allocate sockaddr_dl, I can create
a sockaddr_dl of any size in the kernel, so expanding sockaddr_dl
is useless.

Avoid using sizeof(struct sockaddr_dl) in the kernel.

Introduce sockaddr_dl_alloc() for allocating & initializing an
arbitrary sockaddr_dl on the heap.

Add an argument, the sockaddr length, to sockaddr_alloc(),
sockaddr_copy(), and sockaddr_dl_setaddr().

Constify: LLADDR() -> CLLADDR().

Where the kernel overwrites LLADDR(), use sockaddr_dl_setaddr(),
instead. Used properly, sockaddr_dl_setaddr() will not overrun
the end of the sockaddr.
 1.251  10-Aug-2007  dyoung branches: 1.251.2;
Use sockaddr_dl_init().
 1.250  19-Jul-2007  dyoung branches: 1.250.4; 1.250.6;
Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

Introduce rt_walktree() for walking the routing table and
applying a function to each rtentry. Replace most
rn_walktree() calls with it.

Use rt_getkey()/rt_setkey() to get/set a route's destination.
Keep a pointer to the sockaddr key in the rtentry, so that
rtentry users do not have to grovel in the radix_node for
the key.

Add a RTM_GET method to rtrequest. Use that instead of
radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

Constify. KNF. Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
et cetera. Use NULL instead of 0 for null pointers. Use
__arraycount(). Reduce gratuitous parenthesization.

Stop using variadic arguments for rip6_output(), it is
unnecessary.

Remove the unnecessary rtentry member rt_genmask and the
code to maintain it, since nothing actually used it.

Make rt_maskedcopy() easier to read by using meaningful variable
names.

Extract a subroutine intern_netmask() for looking up a netmask in
the masks table.

Start converting backslash-ridden IPv6 macros in
sys/netinet6/in6_var.h into inline subroutines that one
can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
 1.249  02-May-2007  dyoung branches: 1.249.2;
Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principal benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argument for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.
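
The per-domain cache list in item 4 can be pictured with the userland sketch below. Every name ending in _sketch, and rtcache_attach(), is a hypothetical stand-in and not the kernel's type or function; the reference counting is an assumption about what invalidation has to do, not a copy of the real code.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not the kernel's types. */
struct rtentry_sketch {
	int refcnt;
};

struct route_sketch {
	struct rtentry_sketch *ro_rt;   /* cached route, or NULL */
	struct route_sketch *next;      /* link on the domain's list */
};

struct domain_sketch {
	int family;                     /* address family of this domain */
	struct route_sketch *rtcache;   /* list of live route caches */
};

/* Put a route cache on its domain's list. */
void
rtcache_attach(struct domain_sketch *dom, struct route_sketch *ro)
{
	assert(dom != NULL && ro != NULL);
	ro->next = dom->rtcache;
	dom->rtcache = ro;
}

/* Invalidate every cache of one family; returns how many were cleared. */
int
rtflushall_sketch(struct domain_sketch *doms, size_t ndoms, int family)
{
	int cleared = 0;

	for (size_t d = 0; d < ndoms; d++) {
		struct route_sketch *ro;

		if (doms[d].family != family)
			continue;
		for (ro = doms[d].rtcache; ro != NULL; ro = ro->next) {
			if (ro->ro_rt == NULL)
				continue;
			ro->ro_rt->refcnt--;    /* drop the cache's reference */
			ro->ro_rt = NULL;       /* next use must look up again */
			cleared++;
		}
	}
	return cleared;
}
```

Because every user of a cache must already cope with ro_rt being NULL, invalidation only has to clear the pointer; the next lookup repopulates it.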
 1.248  25-Mar-2007  liamjfoy Add net.inet.ip.hashsize to control the IPv4 fast forward hash table size.
 1.247  24-Mar-2007  liamjfoy Don't call ip*flow_reap if we're just looking up maxflows
 1.246  12-Mar-2007  ad branches: 1.246.2; 1.246.4;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.
 1.245  05-Mar-2007  liamjfoy branches: 1.245.2;
Move ipflow_slowtimo from ip_slowtimo and into in_proto.c

ok matt@
 1.244  04-Mar-2007  christos Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.243  17-Feb-2007  dyoung KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.
 1.242  29-Jan-2007  dyoung branches: 1.242.2;
Cosmetic: remove extraneous, non-KNF parentheses. Change a
sizeof(type) to a sizeof(*ptr) so the correctness of the statement
is evident "at a glance" (or so I hope).
 1.241  22-Dec-2006  ad ipintr(): check if the queue is empty before looping. Hardly a giant
win, but removed 30% of splnet() calls in one local test.
 1.240  15-Dec-2006  joerg Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cached and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either valid again
or NULL.
rtcache_copy is used internally.

Adjust the callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is semantically equivalent though.
etherip doesn't need sc_route_expire, similar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.
 1.239  09-Dec-2006  dyoung Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.
 1.238  06-Dec-2006  dyoung KNF.
 1.237  06-Dec-2006  dyoung KNF.
 1.236  16-Nov-2006  christos branches: 1.236.2; 1.236.4;
__unused removal on arguments; approved by core.
 1.235  12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.234  10-Oct-2006  dogcow change the MOWNER_INIT define to take two args; fix extant struct mowner
decls to use it. Makes options MBUFTRACE compile again and not whinge about
missing structure declarations. (Also makes initialization consistent.)
 1.233  05-Oct-2006  tls Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html
 1.232  19-Sep-2006  elad Remove ugly (void *) casts from network scope authorization wrapper and
calls to it.

While here, adapt code for system scope listeners to avoid some more
casts (forgotten in previous run).

Update documentation.
 1.231  13-Sep-2006  elad branches: 1.231.2;
Don't use KAUTH_RESULT_* where it's not applicable.
Prompted by yamt@.
 1.230  08-Sep-2006  elad First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)
 1.229  30-Aug-2006  christos branches: 1.229.2;
fix initializer
 1.228  30-Jul-2006  elad ugh.. more stuff that's overdue and should not be in 4.0: remove the
sysctl(9) flags CTLFLAG_READONLY[12]. luckily they're not documented
so it's only half regression.

only two knobs used them; proc.curproc.corename (check added in the
existing handler; it's CTLFLAG_ANYWRITE, yay) and net.inet.ip.forwsrcrt,
that got its own handler now too.
 1.227  07-Jun-2006  kardel merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.226  08-May-2006  liamjfoy branches: 1.226.2;
#if -> #ifdef

ok christos
 1.225  15-Apr-2006  christos Coverity CID 1134: Protect against NULL deref.
 1.224  18-Feb-2006  joerg branches: 1.224.2; 1.224.4; 1.224.6;
Print the source and destination IP in ip_forward's DIAGNOSTIC code
with inet_ntoa, making it more human friendly.

From Liam J. Foy in private mail.
 1.223  24-Dec-2005  perry branches: 1.223.2; 1.223.4; 1.223.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.222  11-Dec-2005  christos merge ktrace-lwp.
 1.221  01-Nov-2005  christos Don't decrement the ttl, until we are sure that we can forward this packet.
Before if there was no route, we would call icmp_error with a datagram
packet that has an incorrect checksum. (From Liam Foy)
 1.220  23-Oct-2005  christos No need to pass an interface when only the mtu is needed. From OpenBSD via
Liam Foy.
 1.219  05-Aug-2005  elad branches: 1.219.2;
Add sysctls for IP, ICMP, TCP, and UDP statistics.
 1.218  28-Jun-2005  seanb branches: 1.218.2;
- Return ICMP_UNREACH_NET when no route found as per
section 4.3.3.1 of rfc1812.
 1.217  09-Jun-2005  atatat Properly fix the constipated lossage wrt -Wcast-qual and the sysctl
code. I know it's not the prettiest code, but it seems to work rather
well in spite of itself.
 1.216  01-Jun-2005  blymn Unconstify rnode to prevent compile error when GATEWAY option set.
 1.215  29-Apr-2005  yamt move decl of inetsw to its own header to avoid array of incomplete type.
found by gcc4. reported by Adam Ciarcinski.
 1.214  18-Apr-2005  yamt fix problems related to loopback interface checksum omission. PR/29971.

- for ipv4, defer decision to ip layer as h/w checksum offloading does
so that it can check the actual interface the packet is going to.
- for ipv6, disable it.
(maybe will be revisited when it implements h/w checksum offloading.)

ok'ed by Jason Thorpe.
 1.213  29-Mar-2005  yamt ip_reass: clear stale csum_flags.
 1.212  26-Feb-2005  perry branches: 1.212.2;
nuke trailing whitespace
 1.211  03-Feb-2005  perry ANSIfy function declarations
 1.210  02-Feb-2005  perry de-__P -- will ANSIfy .c files later.
 1.209  24-Jan-2005  matt branches: 1.209.2;
Add IFNET_FOREACH and IFADDR_FOREACH macros and start using them.
 1.208  19-Dec-2004  christos branches: 1.208.2;
yamt's changes seem to fix all the checksumming issues. Turn the loopback
checksums back off so we can make sure that everything works.
 1.207  17-Dec-2004  christos Turn checksumming on loopback back on until we fix the bugs in it.
Connect over tcp on the loopback is broken:

4729 amq 0.000007 CALL connect(4,0x804f2a0,0x1c)
4729 amq 75.007420 RET connect -1 errno 60 Connection timed out
 1.206  15-Dec-2004  thorpej Don't perform checksums on loopback interfaces. They can be reenabled with
the net.inet.*.do_loopback_cksum sysctl.

Approved by: groo
 1.205  06-Oct-2004  darrenr Add a comment to document what setting "srcrt" is really on about in ipintr()
 1.204  29-Sep-2004  christos PR/27081: Sean Boudreau: ip_input() bad csum count not incremented on sw csum
 1.203  25-May-2004  atatat Sysctl descriptions under net subtree (net.key not done)
 1.202  02-May-2004  darrenr at line 543, we do a pullup here of hlen bytes into the mbuf,
so these later ones are superfluous.
 1.201  01-May-2004  matt Use EVCNT_ATTACH_STATIC{,2}
 1.200  25-Apr-2004  simonb Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.
 1.199  22-Apr-2004  matt Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols are loaded).
 1.198  01-Apr-2004  matt In ip_reass_ttl_descr, make i signed since it's compared to >= 0
 1.197  24-Mar-2004  atatat branches: 1.197.2;
Tango on sysctl_createv() and flags. The flags have all been renamed,
and sysctl_createv() now uses more arguments.
 1.196  15-Jan-2004  itojun correct typo in 1.94 -> 1.95. pointed out by Shiva Shenoy
 1.195  14-Dec-2003  thorpej Fix syntax errors in CHECK_NMBCLUSTER_PARAMS().
 1.194  14-Dec-2003  jonathan Second part of hashed IP_reassembly changes:

When under pressure for mbufs or we have too many fragments in the IP
reassembly queue, drop half of all fragments. This multiplicative-drop
strategy ensures we return to a healthy state, even under borderline
denial-of-service from extremely lossy NFS-over-UDP peers.
The multiplicative-drop phase currently drops 50% of fragments, but
has pre-placed support for implementing drop-fractions other than 50%

The threshold for the `drop-half' phase is the new variable,
ip_maxfrags which is calculated as nmbclusters/4.

ip_input.c now keeps ip_nmbclusters, a cached copy of nmbclusters.
Before using limits derived from nmbclusters, we check if nmbclusters
and ip_nmbclusters are equal. If not, we recompute IP parameters
derived from nmbclusters. Based on a suggestion by Jason Thorpe.
ip_maxfrags is currently auto-recalculated.

The counters ip_nfrags and ip_nfragpacketsr are now declared static
and uninitialized (bss), to discourage tampering with them.
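
A sketch of the cached-limit check and the drop-half decision described above, under stated assumptions: struct reass_limits and both function names are invented for this example, and only the "too many fragments" trigger is modelled (the mbuf-pressure trigger is left out).

```c
#include <assert.h>

/* Hypothetical container for the values the log calls ip_nmbclusters
 * and ip_maxfrags. */
struct reass_limits {
	int cached_nmbclusters; /* value the limits were derived from */
	int maxfrags;           /* threshold for the drop-half phase */
};

/* Recompute the derived limit only when nmbclusters has changed. */
void
reass_check_params(struct reass_limits *rl, int nmbclusters)
{
	assert(nmbclusters >= 0);
	if (rl->cached_nmbclusters != nmbclusters) {
		rl->cached_nmbclusters = nmbclusters;
		rl->maxfrags = nmbclusters / 4;
	}
}

/* Number of queued fragments to drop: half of them once over the limit. */
int
reass_frags_to_drop(const struct reass_limits *rl, int nfrags)
{
	if (nfrags <= rl->maxfrags)
		return 0;
	return nfrags / 2;
}
```

Dropping a fixed fraction rather than a fixed count means that repeated passes shrink the queue geometrically, which is why the log calls the strategy multiplicative.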
 1.193  12-Dec-2003  scw Make fast-ipsec and ipflow (Fast Forwarding) interoperate.

The idea is that we only clear M_CANFASTFWD if an SPD exists
for the packet. Otherwise, it's safe to add a fast-forward
cache entry for the route.

To make this work properly, we invalidate the entire ipflow
cache if a fast-ipsec key is added or changed.
 1.192  08-Dec-2003  jonathan Add new field ipq_nfrags to struct ipq. Maintain count of fragments
(fragments, not fragmented packets) in each queue entry.
Use ipq_nfrags to maintain a count of total fragments in reassembly queue.
 1.191  07-Dec-2003  jonathan KNF: s/unsigned/u_int/, in a couple of places I missed.
 1.190  06-Dec-2003  jonathan Replace the single global IP reassembly list/listhead, with a
hashtable of list-heads. Independently re-invented, then reworked to
match similar code in FreeBSD.
 1.189  04-Dec-2003  atatat Dynamic sysctl.

Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.

Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded. Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.

All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.

PS - I'm sorry, but there's a distinct lack of documentation at the
moment. I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
 1.188  04-Dec-2003  scw ipflow (IP fast forwarding) is not compatible with FAST_IPSEC either.

XXX: The decision whether or not to fast forward should be made
XXX: dynamically. Using the current approach seriously reduces
XXX: routing performance on gateways with IPsec enabled.
 1.187  26-Nov-2003  itojun define RANDOM_IP_ID by default (unifdef -DRANDOM_IP_ID).
one use remains in sys/netipsec, which is kept for freebsd source code compat.
 1.186  24-Nov-2003  scw For FAST_IPSEC, ipfilter gets to see wire-format IPsec-encapsulated packets
only. Decapsulated packets bypass ipfilter. This mimics current behaviour
for Kame IPsec.
 1.185  19-Nov-2003  fvdl Correct number of arguments to sysctl_rdint.
 1.184  19-Nov-2003  jonathan Patch back support for (badly) randomized IP ids, by request:

* Include "opt_inet.h" everywhere IP-ids are generated with ip_newid(),
so the RANDOM_IP_ID option is visible. Also in ip_id(), to ensure
the prototype for ip_randomid() is made visible.

* Add new sysctl to enable randomized IP-ids, provided the kernel was
configured with RANDOM_IP_ID. (The sysctl defaults to zero, and is
a read-only zero if RANDOM_IP_ID is not configured).

Note that the implementation of randomized IP ids is still defective,
and should not be enabled at all (even if configured) without
very careful deliberation. Caveat emptor.
 1.183  17-Nov-2003  jonathan Diff to netinet/ip_input.c (restore ip_id, initialize) for ip_id fix:

Revert the (default) ip_id algorithm to the pre-randomid algorithm,
due to demonstrated low-period repeated IDs from the randomized IP_id
code. Consensus is that the low-period repetition (much less than
2^15) is not suitable for general-purpose use.

Allocators of new IPv4 IDs should now call the function ip_newid().
Randomized IP_ids is now a config-time option, "options RANDOM_IP_ID".
ip_newid() can use ip_random-id()_IP_ID if and only if configured
with RANDOM_IP_ID. A sysctl knob should be provided.

This API may be reworked in the near future to support linear ip_id
counters per (src,dst) IP-address pair.
 1.182  12-Nov-2003  itojun KNF
 1.181  11-Nov-2003  jonathan Change global head-of-local-IP-address list from in_ifaddr to
in_ifaddrhead. Recent changes in struct names caused a namespace
collision in fast-ipsec, which are most cleanly fixed by using
"in_ifaddrhead" as the listhead name.
 1.180  10-Nov-2003  jonathan Make per-protocol network input queue stats visible to userland via
sysctl. Add a protocol-independent sysctl handler to show the per-protocol
"struct ifq" statistics. Add IP(v4) specific call to the handler.
Other protocols can show their per-protocol input statistics by
allocating a sysctl node and calling sysctl_ifq() with their own struct ifq *.

As posted to tech-kern plus improvements/cleanup suggested by Andrew Brown.
 1.179  28-Sep-2003  mycroft Remove some code that breaks AH tunnels completely. The comment describing
the purpose of this code appears to be on crack -- it's talking about
end-to-end authentication, but the purpose of an AH tunnel is NOT end-to-end
authentication; it's authentication of the tunnel endpoints.

NB: This does not fix the fact that IPsec leaks "packet tags."
 1.178  06-Sep-2003  itojun randomize IPv4/v6 fragment ID and IPv6 flowlabel. avoids predictability
of these fields. ip_id.c is from openbsd. ip6_id.c is adapted by kame.
 1.177  06-Sep-2003  itojun backout previous, we don't know if arc4random() collides on reboot.
 1.176  05-Sep-2003  itojun initialize fragment ID with arc4random, not by time.tv_sec
 1.175  22-Aug-2003  itojun remove ipsec_set/getsocket. now we explicitly pass socket * to ip{,6}_output.
 1.174  22-Aug-2003  itojun change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.
 1.173  15-Aug-2003  jonathan (fast-ipsec): Add hooks to pass IPv4 IPsec traffic into fast-ipsec, if
configured with ``options FAST_IPSEC''. Kernels with KAME IPsec or
with no IPsec should work as before.

All calls to ip_output() now always pass an additional compulsory
argument: the inpcb associated with the packet being sent,
or 0 if no inpcb is available.

Fast-ipsec tested with ICMP or UDP over ESP. TCP doesn't work, yet.
 1.172  07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.171  14-Jul-2003  itojun correct igmp. from love
 1.170  03-Jul-2003  itojun minor KNF
 1.169  30-Jun-2003  itojun branches: 1.169.2;
do not generate ICMP redirect when packet filter alters ip_dst to an
address that resides on the same link. Cedric Berger convinced me that
it is necessary.
 1.168  30-Jun-2003  itojun fix indent
 1.167  23-Jun-2003  martin Make sure to include opt_foo.h if a defflag option FOO is used.
 1.166  15-Jun-2003  matt Change the way multicasts are kept. They now use a hash table in the same
manner as the ifaddr hash table. By doing this, the mkludge code can go
away. At the same time, keep track of what pcbs are using what ifaddr and
when an address is deleted from an interface, notify/abort all sockets
that have that address as a source. Switch IGMP and multicasts to use pools
for allocation. Fix a number of potential problems in the igmp code where
allocation failures could cause a trap/panic.
 1.165  11-Apr-2003  christos PR/991: Darren Reed: Add a sysctl (checkinterface) to implement this. This
implementation is taken from FreeBSD, but we default to off.
XXX: We should really do this on a per ifaddr basis as jason suggested.
 1.164  26-Feb-2003  matt Add MBUFTRACE kernel option.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
 1.163  12-Nov-2002  itojun remove all entries in rt timer queue on ip_mtudisc change, instead of
destroying the queue.
 1.162  12-Nov-2002  itojun backout previous - doesn't compile
 1.161  12-Nov-2002  itojun update ip_mtudisc sysctl change handling.
 1.160  10-Nov-2002  itojun always create pmtud timeout queue, as ip_mtudisc can be tweaked via
sysctl at runtime. From lha@stacken.kth.se
 1.159  02-Nov-2002  perry /*CONTCOND*/ while (0)'ed macros
 1.158  23-Sep-2002  itojun revert mtudisc_timeout value to the old one if update fails
 1.157  11-Sep-2002  itojun KNF - return is not a function. sync w/kame.
 1.156  11-Sep-2002  itojun correct signedness mixup in pointer passing. sync w/kame
 1.155  14-Aug-2002  itojun avoid swapping endian of ip_len and ip_off on mbuf, to meet with M_LEADINGSPACE
optimization made last year. should solve PR 17867 and 10195.

IP_HDRINCL behavior of raw ip socket is kept unchanged. we may want to
provide IP_HDRINCL variant that does not swap endian.
 1.154  30-Jun-2002  thorpej Changes to allow the IPv4 and IPv6 layers to align headers themselves,
as necessary:
* Implement a new mbuf utility routine, m_copyup(), which is like
m_pullup(), except that it always prepends and copies, rather
than only doing so if the desired length is larger than m->m_len.
m_copyup() also allows an offset into the destination mbuf, which
allows space for packet headers, in the forwarding case.
* Add *_HDR_ALIGNED_P() macros for IP, IPv6, ICMP, and IGMP. These
macros expand to 1 if __NO_STRICT_ALIGNMENT is defined, so that
architectures which do not have strict alignment constraints don't
pay for the test or visit the new align-if-needed path.
* Use the new macros to check if a header needs to be aligned, or to
assert that it already is, as appropriate.

Note: This code is still somewhat experimental. However, the new
code path won't be visited if individual device drivers continue
to guarantee that packets are delivered to layer 3 already properly
aligned (which are rules that are already in use).
 1.153  13-Jun-2002  itojun set IPv4 parameter to modern value.
- turn on path MTU discovery (previous: turned off)
- ICMPv4 redirect entry timeout = 600 sec (previous: never timeout)
 1.152  09-Jun-2002  itojun whitespace
 1.151  07-Jun-2002  itojun look at rmx_mtu on IPsec tunnel MTU computation.
From: David Waitzman <djw@bbn.com>
 1.150  12-May-2002  matt branches: 1.150.2; 1.150.4;
Eliminate commons.
 1.149  12-May-2002  wiz Spelling fixes, from Sergey Svishchev in kern/16650.
 1.148  07-May-2002  matt Change struct ipqe to use TAILQ's instead of LIST's (primarily for TCP's
benefit currently). Rework tcp_reass code to optimize the 4 most likely causes
of out-of-order packets: first OoO pkt, next OoO pkt in seq, OoO pkt is part
of new chuck of OoO packets, and the OoO pkt fills the first hole. Add evcnts
to instrument tcp_reass (enabled by the options TCP_REASS_COUNTERS). This is
part 1/2 of tcp_reass changes.
 1.147  18-Apr-2002  matt Change test for M_EXT to M_READONLY for MROUTING. We only need to do
a pullup if we aren't allowed to modify the packet.
 1.146  08-Mar-2002  thorpej Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.
 1.145  25-Feb-2002  itojun correctly enforce ipsec policy check on forwarding case.
From: Greg Troxel <gdt@ir.bbn.com>, Bill Chiarchiaro <wjc@work.cleartech.com>
 1.144  24-Feb-2002  martin Clear M_BCAST and M_MCAST on outgoing mbufs.
Don't copy ttl from the inner packet to the encapsulating packet. Make
the outer ttl sysctl'able. This should close PR 14269 from Jasper Wallace
(change partly from there) and it makes traceroute work over gre tunnels.
 1.143  21-Feb-2002  itojun suppress source quench message, based on router-req RFC (also could be abused
as DoS traffic generator). from kjc/kame
 1.142  28-Nov-2001  darrenr recompute hlen after calling pfil_run_hooks() in case ip_hl was changed.
 1.141  13-Nov-2001  lukem add RCSIDs
 1.140  04-Nov-2001  matt Convert netinet to not use the internal <sys/queue.h> field names
but instead the access macros. Use the FOREACH macros where appropriate.
 1.139  04-Nov-2001  matt Change a few variable/tables to const since they are read-only.
 1.138  29-Oct-2001  simonb Don't need to include <uvm/uvm_extern.h> just to include <sys/sysctl.h>
anymore.
 1.137  17-Sep-2001  thorpej branches: 1.137.2;
Split the pre-computed ifnet checksum flags into Tx and Rx directions.
Add capabilities bits that indicate an interface can only perform
in-bound TCPv4 or UDPv4 checksums. There is at least one Gig-E chip
for which this is true (Level One LXT-1001), and this is also the
case for the Intel i82559 10/100 Ethernet chips.
 1.136  06-Aug-2001  itojun branches: 1.136.2;
cache IPsec policy on in6?pcb. most of the lookup operations can be bypassed,
especially when it is a connected SOCK_STREAM in6?pcb. sync with kame.
 1.135  02-Jun-2001  thorpej branches: 1.135.2;
Implement support for IP/TCP/UDP checksum offloading provided by
network interfaces. This works by pre-computing the pseudo-header
checksum and caching it, delaying the actual checksum to ip_output()
if the hardware cannot perform the sum for us. In-bound checksums
can either be fully-checked by hardware, or summed up for final
verification by software. This method was modeled after how this
is done in FreeBSD, although the code is significantly different in
most places.

We don't delay checksums for IPv6/TCP, but we do take advantage of the
cached pseudo-header checksum.

Note: hardware-assisted checksumming defaults to "off". It is
enabled with ifconfig(8). See the manual page for details.

Implement hardware-assisted checksumming on the DP83820 Gigabit Ethernet,
3c90xB/3c90xC 10/100 Ethernet, and Alteon Tigon/Tigon2 Gigabit Ethernet.
 1.134  21-May-2001  lukem fix spelo in comment
 1.133  16-Apr-2001  itojun give a default value to net.inet.ip.maxfragpackets, to protect us from
"lots of fragmented packets" DoS attack.

the current default value is derived from ipv6 counterpart, which is
a magical value "200". it should be enough for normal systems, not sure
if it is enough when you take hundreds of thousands of tcp connections on
your system. if you have proposal for a better value with concrete reasons,
let me know.
 1.132  13-Apr-2001  thorpej Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.
 1.131  27-Mar-2001  itojun net.inet.ip.maxfragpackets defines the maximum size of ip reass queue
(prevents fragment flood from chewing up mbuf memory space).
derived from KAME net.inet6.ip6.maxfragpackets.
 1.130  02-Mar-2001  itojun branches: 1.130.2;
increase ipstat.ips_badaddr if the packet fails to pass address checks.
 1.129  02-Mar-2001  itojun reject packets with 127/8 on IPv4 src/dst, they must not appear on wire
(RFC1122). torture-tests will be welcomed.
XXX do we want to check source routing headers as well?
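The 127/8 test described in 1.129 can be illustrated with a minimal sketch. It is not the code from this revision: the helper name `is_loopback_net` is hypothetical, and the address is assumed to be already converted to host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the NetBSD code): true when an IPv4 address,
 * given in host byte order, falls inside 127.0.0.0/8.
 */
static bool
is_loopback_net(uint32_t addr_host_order)
{
	/* The top octet alone decides membership in a /8. */
	return (addr_host_order >> 24) == 127;
}
```

A caller following the commit message would apply such a check to both the source and the destination address of a packet received from the wire and drop the packet if either one matches.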
 1.128  01-Mar-2001  itojun make sure to enforce inbound ipsec policy checking, for any protocols on top
of ip (check it when final header is visited). sync with kame.
XXX kame team will need to re-check policy engine code
 1.127  24-Jan-2001  itojun - record IPsec packet history into m_aux structure.
- let ipfilter look at wire-format packet only (not the decapsulated ones),
so that VPN setting can work with NAT/ipfilter settings.
sync with kame.

TODO: use header history for stricter inbound validation
 1.126  28-Dec-2000  thorpej Back out the sledgehammer damage applied by wiz while I was out for
the holiday.
 1.125  25-Dec-2000  wiz Back out previous change. It causes NAT to fail, and was CLEARLY
NOT TESTED before it was committed.
 1.124  22-Dec-2000  thorpej Slight adjustment to how pfil_head's are registered. Instead of a
"key" and a "dlt", use a "type" (PFIL_TYPE_{AF,IFNET} for now) and
a val/ptr appropriate for that type. This allows for more future
flexibility with the pfil_hook mechanism.
 1.123  14-Dec-2000  thorpej Add ALTQ glue. XXX Temporary until ALTQ is changed to use a pfil hook.
 1.122  24-Nov-2000  itojun IFA_STATS stability (not complete); don't touch ip if it is NULL.
 1.121  11-Nov-2000  thorpej Restructure the PFIL_HOOKS mechanism a bit:
- All packets are passed to PFIL_HOOKS as they come off the wire, i.e.
fields in protocol headers in network order, etc.
- Allow for multiple hooks to be registered, using a "key" and a "dlt".
The "dlt" is a BPF data link type, indicating what type of header is
present.
- INET and INET6 register with key == AF_INET or AF_INET6, and
dlt == DLT_RAW.
- PFIL_HOOKS now take an argument for the filter hook, and mbuf **,
an ifnet *, and a direction (PFIL_IN or PFIL_OUT), thus making them
less IP (really, IP Filter) centric.

Maintain compatibility with IP Filter by adding wrapper functions for
IP Filter.
 1.120  08-Nov-2000  ad Update for hashinit() change.
 1.119  13-Oct-2000  itojun make sure we don't share external mbuf between m and mcopy, in ip_forward().
should solve PR 11201.
 1.118  26-Aug-2000  itojun make sure anonport{min,max} is not negative number
 1.117  25-Aug-2000  tron Add new sysctl variables "net.inet.ip.lowportmin" and
"net.inet.ip.lowportmax" which can be used to set the minimum
and maximum port number assigned to sockets using
IP_PORTRANGE_LOW.
 1.116  06-Jul-2000  itojun remove unnecessary #include <netkey/key_debug.h>. from kame.
 1.115  28-Jun-2000  mrg <vm/vm.h> -> <uvm/uvm_extern.h>
 1.114  10-May-2000  itojun branches: 1.114.4;
add missing boundary checks to ip options processing.
correct timestamp option validation (len and ptr upper/lower bound
based on RFC791).
fill "pointer" field for parameter problem in timestamp option processing.
 1.113  10-May-2000  itojun correct more out-of-bounds memory access, if cnt == 1 and optlen > 1.
 1.112  06-May-2000  sommerfeld Handle large offsets with very small options correctly.
 1.111  31-Mar-2000  jdolecek Slightly improve previous - only include <netinet/ip_mroute.h> if MROUTING
is defined.
 1.110  31-Mar-2000  jdolecek include <netinet/ip_mroute.h> for ip_mforward() - needed after
last duplicate prototype sweep (prototype for ip_mforward() used to be in <netinet/ip_var.h>)
 1.109  30-Mar-2000  augustss Remove register declarations.
 1.108  30-Mar-2000  simonb Delete uninitialised declaration of ip_defttl - there's an initialised
decl earlier in this file.
 1.107  10-Mar-2000  thorpej Back out previous, and adjust a comment.
 1.106  07-Mar-2000  thorpej Back out part of 1.104 which isn't actually needed.
 1.105  03-Mar-2000  itojun remove unnecessary ttl initialization which I mistakenly brought in
during KAME merge (this is part of WIDE's experimental reass code...)
NetBSD PR: 9412
From: Wolfgang Rupprecht <wolfgang@wsrcc.com>
Fix from: ho@crt.se
itojun was notified from: theo
 1.104  02-Mar-2000  thorpej Avoid a bug in GCC which manifests itself when processing unaligned
IP options. Problem pointed out by Matt Hargett and Erik Fair, analyzed
by me.
 1.103  01-Mar-2000  itojun introduce m->m_pkthdr.aux to hold random data which needs to be passed
between protocol handlers.

ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are in my mind - there can be several other usage.
at this moment, we use this for ipsec socket pointer passing. this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.

due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.

take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).

this will bump kernel version.

(as discussed in tech-net, tested in kame tree)
 1.102  20-Feb-2000  darrenr pass "struct pfil_head *" to pfil_add_hook and pfil_remove hook rather
than "struct protosw *".
 1.101  17-Feb-2000  darrenr Change the use of pfil hooks. There is no longer a single list of all
pfil information, instead, struct protosw now contains a structure
which contains list heads, etc. The per-protosw pfil struct is passed
to pfil_hook_get(), along with an in/out flag to get the head of the
relevant filter list. This has been done for only IPv4 and IPv6, at
present, with these patches only enabling filtering for IPPROTO_IP and
IPPROTO_IPV6, although it is possible to have tcp/udp, etc, dedicated
filters now also. The ipfilter code has been updated to only filter
IPv4 packets - next major release of ipfilter is required for ipv6.
 1.100  16-Feb-2000  itojun - if ip_dst matches address on !IFF_UP interface, and
- there's no match against addresses on IFF_UP interface,
send icmp unreach if I'm router. drop it if I'm host.

Revised version of PR: 9387 from nrt@iij.ad.jp. Discussed with thorpej+nrt.
 1.99  12-Feb-2000  thorpej Typo (Thanks, Havard :-)
 1.98  12-Feb-2000  thorpej Small cosmetic change, and note a place where a statistic should be
gathered.
 1.97  11-Feb-2000  itojun fix in-kernel packet forwarding loop (till TTL becomes 0) when:
- a packet is delivered to an address X,
- and the address X is configured on my !IFF_UP interface
- and ipforwarding=1

NetBSD PR: 9387
From: nrt@iij.ad.jp
 1.96  01-Feb-2000  thorpej Use ifatoia() and sintosa() consistently, rather than using home-grown
casting macros intermixed.
 1.95  31-Jan-2000  itojun bring in latest KAME ipsec tree.
- interop issues in ipcomp is fixed
- padding type (after ESP) is configurable
- key database memory management (need more fixes)
- policy specification is revisited

XXX m->m_pkthdr.rcvif is still overloaded - hope to fix it soon
 1.94  26-Oct-1999  itojun disable ipflow (IPv4 fast forwarding) when IPsec is configured into the kernel.
 1.93  17-Oct-1999  sommerfeld branches: 1.93.2; 1.93.4;
In ip_forward():

Avoid forwarding ip unicast packets which were contained inside
link-level multicast packets; having M_MCAST still set in the packet
header flags will mean that the packet will get multicast to a bogus
group instead of unicast to the next hop.

Malformed packets like this have occasionally been spotted "in the
wild" on a mediaone cable modem segment which also had multiple netbsd
machines running as router/NAT boxes.

Without this, any subnet with multiple netbsd routers receiving all
multicasts will generate a packet storm on receipt of such a
multicast. Note that we already do the same check here for link-level
broadcasts; ip6_forward already does this as well.

Note that multicast forwarding does not go through ip_forward().

Adding some code to if_ethersubr to sanity check link-level
vs. ip-level multicast addresses might also be worthwhile.
 1.92  23-Jul-1999  itojun branches: 1.92.2;
do not include unnecessary include files.
 1.91  09-Jul-1999  thorpej defopt IPSEC and IPSEC_ESP (both into opt_ipsec.h).
 1.90  06-Jul-1999  itojun sync with KAME/NetBSD 1.4, SNAP kit 19990705.
key changes are:
- icmp6 redirect fix (dst check)
- revised ip6 multicast check for loopback i/f
- several RCS ID cleanups
 1.89  01-Jul-1999  itojun IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.
 1.88  26-Jun-1999  sommerfeld If the new global variable hostzerobroadcast is zero, no longer assume
address zero of each net/subnet is a broadcast address.
(The default value is nonzero, which preserves the current behavior).

This can be set using sysctl; the boot-time default can also be
configured using the HOSTZEROBROADCAST kernel config option.

While we're here, defopt HOSTZEROBROADCAST and SUBNETSARELOCAL
 1.87  04-May-1999  hwr It does not make much sense to increase a "output" counter on input.
 1.86  03-May-1999  thorpej In INADDR_TO_IA(), skip interfaces which are not up. Revert previous change
to ip_input.c to check the interface status after INADDR_TO_IA().

Fix cooked up by Heiko Rupp and myself.

Fixes PR 7480.
 1.85  03-May-1999  hwr Drop packets, that have a Class-D address as source address.
Implements the first half of PR 7003.
 1.84  07-Apr-1999  proff tiny KNF change
 1.83  07-Apr-1999  proff Prevent reception of packets on downed interfaces (via an up interface).
fixes kern/7327
 1.82  27-Mar-1999  aidan branches: 1.82.2;
Added per-addr input/output statistics. Currently just support netatalk
and netinet, currently only tested under netinet.

Disabled by default, enabled by compiling the kernel with option
IFA_STATS. Enabling this feature seems to make the ip_output function
take 13% longer than before, which should be OK for people that need
this feature.
 1.81  26-Mar-1999  proff security: test for ip_len < ip_hl <<2 and drop packet accordingly
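The check named in 1.81 compares the total length claimed by the header against the header length itself. The following is a minimal sketch under stated assumptions, not the code from this revision: the helper name `ip_len_too_short` is hypothetical, and `ip_len` is assumed to be in host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the NetBSD code): true when the total length
 * claimed by an IPv4 header cannot even cover the header itself.
 */
static bool
ip_len_too_short(uint8_t ip_hl, uint16_t ip_len)
{
	/* ip_hl counts 32-bit words, so the header occupies ip_hl << 2 bytes. */
	unsigned int hlen = (unsigned int)ip_hl << 2;

	return ip_len < hlen;
}
```

A packet for which this returns true would be dropped, since later code that subtracts the header length from the total length would otherwise work with a bogus payload size.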
 1.80  19-Jan-1999  mycroft There's just no plausible reason to byte-swap ip_id internally. It's opaque.
 1.79  19-Jan-1999  mycroft Don't screw with ip_len; just subtract from it where we actually use the
value.
 1.78  19-Jan-1999  mycroft Don't overwrite the checksum fields when checking them. There's no reason to
do this, and it screws up ICMP replies.
XXX The returned IP checksum and length are still wrong.
 1.77  11-Jan-1999  thorpej Fix byte order and ip_len inconsistencies in ICMP reply code. Also, fix
some formatting and HTONS(foo) vs. foo = htons(foo) inconsistencies.

PR #6602, Darren Reed.
 1.76  19-Dec-1998  thorpej Reverse the copyright-notice-swap. It went against existing practice.
 1.75  18-Dec-1998  thorpej Add a lock around the IP fragment reassembly queue, to prevent ip_drain()
from corrupting the queue if called from a device's interrupt context.

Should fix PR #5684.
 1.74  13-Nov-1998  thorpej branches: 1.74.2;
Once a fragmented IP packet has been reassembled, recompute the packet
length before passing it up the stack. From FreeBSD.
 1.73  08-Oct-1998  thorpej Use the pool allocator for ipflow entries.
 1.72  08-Oct-1998  thorpej Use the pool allocator for ipqent structures.
 1.71  30-Sep-1998  tls Switch order of TNF and UCB copyrights so UCB copyright is first; this seems more appropriate since UCB wrote the original code, after all.
 1.70  09-Sep-1998  thorpej Make a diagnostic printf more sensible, PR #5951, Heiko W. Rupp.
 1.69  09-Aug-1998  mrg defopt PFIL_HOOKS.
 1.68  17-Jul-1998  sommerfe Fix PR5508: ipfil cut-through forwarding causes panic
 1.67  01-Jun-1998  thorpej Protect the ipflow_reap() call with splsoftnet.
 1.66  24-May-1998  thorpej Fix OBOB in IP timestamp option processing, as noted in FreeBSD PR 6738,
from Jennifer Dawn Meyers <jdm@enteract.com>.
 1.65  04-May-1998  matt Default IP flow to being enabled. Add a sysctl to control the maximum
number of flows (net.inet.ip.maxflows). If set to 0, will disable fast
path forwarding.
 1.64  01-May-1998  thorpej Allow packet filters to prevent a packet from creating a fast-forwarding
flow, by setting the "can fast forward" flag in the packet header, and
giving a chance for filters to clear the flag. If the flag is still
set after the filters have given it a chance, the packet will be used
to create a fast-forward flow entry.
 1.63  29-Apr-1998  matt Add support for "fast" forwarding. Add hooks in if_ethersubr.c and
if_fddisubr.c to fastpath IP forwarding. If ip_forward successfully
forwards a packet, it will create a cache (ipflow) entry. ether_input
and fddi_input will first call ipflow_fastforward with the received
packet and if the packet passes enough tests, it will be forwarded (the
ttl is decremented and the cksum is adjusted incrementally).
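The incremental checksum adjustment mentioned in 1.63 can be sketched as follows. This is an illustration of the general technique (the update formula HC' = ~(~HC + ~m + m') from RFC 1624), not the ipflow code itself; the function names `cksum_full` and `cksum_update` are hypothetical, and the header is assumed to be presented as 16-bit words in a consistent byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Full one's-complement checksum over n 16-bit words (reference version). */
static uint16_t
cksum_full(const uint16_t *w, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += w[i];
	while (sum >> 16)		/* fold carries back into the low 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Incremental update after one 16-bit word changes from old_word to
 * new_word: HC' = ~(~HC + ~m + m').
 */
static uint16_t
cksum_update(uint16_t hc, uint16_t old_word, uint16_t new_word)
{
	uint32_t sum = (uint16_t)~hc;

	sum += (uint16_t)~old_word;
	sum += new_word;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

When a forwarder decrements the TTL, only the 16-bit word holding the TTL and protocol fields changes, so `cksum_update()` touches three values instead of re-summing the whole header.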
 1.62  29-Apr-1998  matt defopt GATEWAY
 1.61  29-Apr-1998  kml change path MTU timeout value to match RFC 1191
 1.60  29-Apr-1998  kml Add support for deletion of routes added by path MTU discovery;
uses new generic route timeout code. Add sysctl for timeout period.
 1.59  19-Mar-1998  mrg convert pfil(9) in and out lists from <sys/queue.h> LISTs to TAILQs, and
change pfil_add_hook to put output filters at the tail of the queue,
while continuing to place input filters at the head of the queue. update
the two users of these functions, and document these changes.

fixes PR#4593.
 1.58  15-Feb-1998  tls Add correct copyright notice for IP address hash change. This code is donated to TNF by the original copyright holder, Panix.
 1.57  13-Feb-1998  tls Change list of interface IP addresses to a hash. Improves performance on hosts with a large number of IP addresses significantly.
 1.56  28-Jan-1998  thorpej Use offsetof() from libkern.h
 1.55  12-Jan-1998  scottr Use option header file for MROUTING
 1.54  05-Jan-1998  lukem enhance ephemeral port allocation code:
* support sysctl net.inet.ip.anonportmin (lowest ephemeral port)
and net.inet.ip.anonportmax (highest ephemeral port).
these can't be set to >65535, < IPPORT_RESERVED (unless IPNOPRIVPORTS
is defined), and anonportmin has to be < anonportmax.
* use a cleaner way of only cycling through the available set once;
this will be useful for when a random allocation scheme is used
* define IPPORT_ANON{MIN,MAX} instead of IPPORT_USER{LOW,HIGH}
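The constraints listed in 1.54 for the anonymous port range can be written down as a small validity check. This is a sketch only, assuming the default build without IPNOPRIVPORTS; the function name `anonport_range_ok` and the macro `SKETCH_IPPORT_RESERVED` are hypothetical stand-ins, the latter for IPPORT_RESERVED (1024).

```c
#include <stdbool.h>

/* Stand-in for IPPORT_RESERVED; hypothetical name for this sketch. */
#define SKETCH_IPPORT_RESERVED	1024

/*
 * Hypothetical helper (not the NetBSD sysctl handler): true when
 * [lo, hi] is an acceptable anonymous port range under the rules in
 * the commit message, without IPNOPRIVPORTS.
 */
static bool
anonport_range_ok(int lo, int hi)
{
	/* Neither bound may fall into the reserved range (or be negative). */
	if (lo < SKETCH_IPPORT_RESERVED || hi < SKETCH_IPPORT_RESERVED)
		return false;
	/* Neither bound may exceed the largest 16-bit port number. */
	if (lo > 65535 || hi > 65535)
		return false;
	/* The minimum has to be strictly below the maximum. */
	return lo < hi;
}
```

Because the reserved-port bound is positive, the same test also rejects negative values, which is the case later addressed in 1.118.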
 1.53  18-Oct-1997  kml branches: 1.53.2;
change sysctl net.inet.icmp.mtudisc to net.inet.ip.mtudisc
 1.52  17-Oct-1997  thorpej Allow `subnetsarelocal' to be changed via sysctl.
 1.51  29-Aug-1997  gwr Tweaks to allow operation with an interface address of 0.0.0.0
(needed for NFS mountroot using BOOTP to get boot parameters)
 1.50  24-Jun-1997  thorpej branches: 1.50.4;
Eliminate use of dtom() from the network code, allowing more flexible
use of mbuf external storage and increasing performance (by eliminating
an m_pullup() for clusters in the IP reassembly code).

Changes from Koji Imada <koji@math.human.nagoya-u.ac.jp>, in PR #3628
and #3480, with ever-so-slight integration changes by me.
 1.49  15-Apr-1997  christos Move the mtod calls *after* we've made sure that the packet has passed the
filter successfully. Otherwise it can be NULL if the filter blocked it,
and we die. How did this ever work?
 1.48  26-Feb-1997  mrg allow src-routed packets by default, per host requirements
 1.47  25-Feb-1997  cjs Add net.inet.ip.allowsrcrt option which allows/drops all source
routed packets. This currently defaults to `drop,' but once we
verify that all applications that rely on determining remote IP
addresses for authentication are dropping the connection when they
see a source route option (not just disabling the source route
option), we can turn this back on and conform with the host
requirements.
 1.46  19-Feb-1997  cjs Fix bug in sysctl net.inet.ip.forwsrcrt handling: now you can read it
if securelevel > 0. (Thanks, cgd.)
 1.45  18-Feb-1997  mrg pseudo-device ipfilter brings in PFIL_HOOKS.
 1.44  11-Jan-1997  thorpej branches: 1.44.4;
Implement the IP_RECVIF socket option: supply a datagram packet's incoming
interface using a sockaddr_dl in a control mbuf.

Implement SO_TIMESTAMP for IP datagrams.

Move packet information option processing into a generic function
so that they work with multicast UDP and raw IP as well as unicast UDP.

Contributed by Bill Fenner <fenner@parc.xerox.com>.
 1.43  20-Dec-1996  mrg in pfil_hooks: always reassign ip after calling hook.
 1.42  20-Dec-1996  mrg remove pfil_bad.
 1.41  25-Oct-1996  thorpej Before concatenating frags, sanity check the length of the packet. If it's
larger than IP_MAXPACKET, discard it.
Based on a patch from Bill Fenner <fenner@parc.xerox.com>
 1.40  22-Oct-1996  veego Fix a panic from the pfil_hooks.
 1.39  13-Oct-1996  christos backout previous kprintf changes
 1.38  10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.37  21-Sep-1996  perry commit fix in pr 2772 -- the IP input code was assuming that the
reserved (must be zero) flag must necessarily be zero. We now define
an IP_RF (by analogy to IP_DF and IP_MF) and mask it out when necessary.
 1.36  14-Sep-1996  mrg move the packet filter hooks in to a saner location. while i'm here, rename
PACKET_FILTER to PFIL_HOOKS.
 1.35  09-Sep-1996  mycroft Add in_nullhost() and in_hosteq() macros, to hide some protocol
details. Also, fix a bug in TCP wrt SYN+URG packets.
 1.34  08-Sep-1996  mycroft Save 68 bytes of the packet for ICMP, not 64. From Laine Stump, PR 2296.
 1.33  06-Sep-1996  mrg add packet filter interface code. see pfil(9) for more details. you
need the PACKET_FILTER option to enable this code. currently, ipfilter
version 3.1.1-beta has been converted to use this new interface.
 1.32  14-Aug-1996  thorpej Fix some DIAGNOSTIC printf() formats; ntohl() provides a 32-bit quantity,
and should be printed with %x, not %lx.
 1.31  10-Jul-1996  cgd print result of ntohl/htonl as a long. (makes -Wformat work on the
Alpha.)
 1.30  16-Mar-1996  christos branches: 1.30.4;
Fix printf format args.
 1.29  26-Feb-1996  mrg two more local addr changes, all done differently now (idea from charles)
 1.28  13-Feb-1996  christos netinet prototypes
 1.27  16-Jan-1996  thorpej Add a net.inet.ip.directed-broadcast sysctl as suggested by
Darren Reed <darrenr@vitruvius.arbld.unimelb.edu.au> in PR #1227.
This change is slightly different than the one submitted by Darren in
that the DIRECTED_BROADCAST compile-time option will behave like it used
to so that existing configurations utilizing it won't have to change.
 1.26  15-Jan-1996  thorpej Add net.inet.ip.forwsrcrt: if zero, the system will not forward
source-routed packets. Note this value is protected by kernel security
level; it can only be changed if securelevel < 1.
 1.25  21-Nov-1995  cgd make netinet work on systems where pointers and longs are 64 bits
(like the alpha). Biggest problem: IP headers were overlayed with
structure which included pointers, and which therefore didn't overlay
properly on 64-bit machines. Solution: instead of threading pointers
through IP header overlays, add a "queue element" structure to do
the threading, and point it at the ip headers.
 1.24  12-Aug-1995  mycroft splnet --> splsoftnet
 1.23  12-Jun-1995  mycroft Change in_pcbnotify*() to take an errno value. Make inetctlerrmap[] an
array of ints, not u_chars.
 1.22  12-Jun-1995  mycroft Various cleanup, including:
* Convert several data structures to use queue.h.
* Split in_pcbnotify() into two parts; one for notifying a specific PCB, and
one for notifying all PCBs for a particular foreign address.
 1.21  07-Jun-1995  mycroft Remove ip_ifmatrix completely.
 1.20  04-Jun-1995  mycroft Don't cast things unnecessarily.
 1.19  04-Jun-1995  mycroft Clean up many more casts.
 1.18  01-Jun-1995  mycroft Avoid byte-swapping IP addresses at run time.
 1.17  15-May-1995  cgd oops; forgot a '{'
 1.16  14-May-1995  cgd drop (and record) malformed IP fragments. Fixes pr 1030 (differently).
 1.15  13-Apr-1995  cgd be a bit more careful and explicit with types. (basically a large no-op.)
 1.14  29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.13  13-May-1994  mycroft Update to 4.4-Lite networking code, with a few local changes.
 1.12  14-Feb-1994  mycroft PARANOID --> DIAGNOSTIC for inexpensive tests.
 1.11  02-Feb-1994  hpeyerl Multicast is no longer optional.
 1.10  29-Jan-1994  brezak Fix some cases of NOT dealing with m_pkthdr's. This code is still suspect though, at least this fixes some panics.
 1.9  10-Jan-1994  mycroft Should compile now with or without `options MULTICAST'.
 1.8  09-Jan-1994  mycroft Prototype the rest.
 1.7  08-Jan-1994  mycroft More prototypes.
 1.6  08-Jan-1994  mycroft Fix some inconsistent spacing; spaces at the end of lines, etc.
 1.5  18-Dec-1993  mycroft Canonicalize all #includes.
 1.4  06-Dec-1993  hpeyerl multicast support.
From Chris Maeda, cmaeda@cs.washington.edu
These patches are derived from the IP Multicast patches for BSDI.
 1.3  20-May-1993  cgd branches: 1.3.4;
more rcsid additions and file header cleanups
 1.2  04-May-1993  cgd make ip_input recursion checking be for -DPARANOID, and make it panic
 1.1  21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.2  05-Jan-1998  thorpej Import sys/netinet from 4.4BSD-Lite for reference purposes.
 1.1.1.1  21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.3.4.2  14-Nov-1993  mycroft PARANOID --> DIAGNOSTIC. These are not expensive tests.
 1.3.4.1  24-Sep-1993  mycroft Make all files using spl*() #include cpu.h. Changes from trunk.
 1.30.4.3  11-Dec-1996  mycroft From trunk:
Save 68 bytes of the packet for ICMP, not 64.
 1.30.4.2  11-Dec-1996  mycroft From trunk:
Ignore the reserved fragment flag when checking ip_off.
 1.30.4.1  10-Nov-1996  thorpej Update from trunk:
- Make ip_len and ip_off unsigned.
- Make sure we don't accept or transmit packets larger than the
maximum IP packet size.
This fixes the so-called `death ping' bug.

Sum of work from Bill Fenner <fenner@parc.xerox.com>,
Kevin Lahey <kml@nas.nasa.gov>, and myself.

Thanks to Curt Sampson, Jukka Marin, and Kevin Lahey for testing
this under NetBSD 1.2
 1.44.4.1  12-Mar-1997  is Merge in changes from Trunk
 1.50.4.1  01-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.53.2.4  15-Nov-1998  cgd pull up rev 1.74 from trunk (thorpej)
 1.53.2.3  01-Oct-1998  cgd pull up revisions 1.57-1.58 (via patch), 1.71 (via patch) from trunk. (tls)
 1.53.2.2  22-Jul-1998  mellon Pull up 1.59 and 1.68 (veego)
 1.53.2.1  09-May-1998  mycroft Pull up patch from kml.
 1.74.2.1  11-Dec-1998  kenh The beginnings of interface detach support. Still some bugs, but mostly
works for me.

This work was originally by Bill Studenmund, and cleaned up by me.
 1.82.2.7  30-May-2001  he Pull up revisions 1.131,1.133 (via patch, requested by he):
Introduce net.inet.ip.maxfragpackets, which controls the maximum
number of IPv4 fragment reassembly queue entries. Defends against
certain DoS attacks. Fixes SA#2001-006.
 1.82.2.6  06-May-2000  he Pull up revision 1.112 (requested by sommerfeld):
Handle large offsets inside very small options correctly.
 1.82.2.5  02-Mar-2000  he Pull up revision 1.104 (requested by thorpej):
Work around a compiler bug that causes a security vulnerability
in our IP stack on some platforms.
 1.82.2.4  12-Feb-2000  he Apply patch (requested by thorpej):
Adhere to RFC 1112 and RFC 1122 by dropping incoming packets with
a multicast source address. Fixes part of PR#7003.
 1.82.2.3  17-Oct-1999  cgd pull up rev 1.93 from trunk (requested by sommerfeld):
Multicast storm prevention: don't attempt to forward link-level
multicast packets which contain ip unicast packets; these packets
would only be generated from misconfigured/buggy systems.
 1.82.2.2  03-May-1999  perry branches: 1.82.2.2.2; 1.82.2.2.4;
pullup 1.85->1.86 (thorpej)
 1.82.2.1  07-Apr-1999  proff pullup 1.82 - 1.83; don't receive packets on downed interface addresses
 1.82.2.2.4.3  30-Nov-1999  itojun bring in latest KAME (as of 19991130, KAME/NetBSD141) into kame branch
just for reference purposes.
This commit includes 1.4 -> 1.4.1 sync for kame branch.

The branch does not compile at all (due to the lack of ALTQ and some other
source code). Please do not try to modify the branch, this is just for
reference purposes.

synchronization to latest KAME will take place on HEAD branch soon.
 1.82.2.2.4.2  06-Jul-1999  itojun KAME/NetBSD 1.4, SNAP kit 1999/07/05.
NOTE: this branch is just for reference purposes (i.e. for taking cvs diff).
do not touch anything on the branch. actual work must be done on HEAD branch.
 1.82.2.2.4.1  28-Jun-1999  itojun KAME/NetBSD 1.4 SNAP kit, dated 19990628.

NOTE: this branch (kame) is used just for reference. this may not compile
due to multiple reasons.
 1.82.2.2.2.3  02-Aug-1999  thorpej Update from trunk.
 1.82.2.2.2.2  01-Jul-1999  thorpej Sync w/ -current.
 1.82.2.2.2.1  21-Jun-1999  thorpej Sync w/ -current.
 1.92.2.1  27-Dec-1999  wrstuden Pull up to last week's -current.
 1.93.4.1  15-Nov-1999  fvdl Sync with -current
 1.93.2.8  21-Apr-2001  bouyer Sync with HEAD
 1.93.2.7  27-Mar-2001  bouyer Sync with HEAD.
 1.93.2.6  12-Mar-2001  bouyer Sync with HEAD.
 1.93.2.5  11-Feb-2001  bouyer Sync with HEAD.
 1.93.2.4  05-Jan-2001  bouyer Sync with HEAD
 1.93.2.3  08-Dec-2000  bouyer Sync with HEAD.
 1.93.2.2  22-Nov-2000  bouyer Sync with HEAD.
 1.93.2.1  20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.114.4.10  13-Nov-2002  itojun sys/net/route.c 1.55 via patch
sys/net/route.h 1.32
sys/netinet/ip_input.c 1.163

Remove all entries on rt timer queue on ip_mtudisc change, instead
of destroying the queue.

(itojun, redo)
 1.114.4.9  10-Nov-2002  itojun sys/netinet/ip_input.c 1.160 via patch

Always create PMTUD timeout queue, as PMTUD can be turned on via
sysctl at runtime. From lha@stacken.kth.se.

(itojun)
 1.114.4.8  26-Feb-2002  he Pull up revision 1.145 (requested by itojun):
Correctly enforce ipsec policy check in IPv4 forwarding case.
 1.114.4.7  26-Feb-2002  he Pull up revision 1.144 (requested by martin):
Clear M_BCAST and M_MCAST on encapsulated packets on outgoing
mbufs. Also do not copy TTL from the inner packet, and make the
outer TTL sysctl'able. Fixes PR#14269, and makes traceroute work
over GRE tunnels.
 1.114.4.6  24-Apr-2001  he Pull up revisions 1.131,1.133 (requested by itojun):
Introduce net.inet.ip.maxfragpackets, which controls the maximum
number of IPv4 fragment reassembly queue entries. Defends against
certain DoS attacks.
 1.114.4.5  06-Apr-2001  he Pull up revision 1.127 (via patch, requested by itojun):
Record IPsec packet history in m_aux structure. Let ipfilter
look at wire-format packet only (not the decapsulated ones), so
that VPN setting can work with NAT/ipfilter settings.
 1.114.4.4  11-Mar-2001  he Pull up revision 1.128 (requested by itojun):
Ensure that we enforce inbound IPsec policy on all IP protocols,
not just TCP, UDP and ICMP.
 1.114.4.3  17-Oct-2000  tv Pullup 1.119 [itojun]:
make sure we don't share external mbuf between m and mcopy, in ip_forward().
should solve PR 11201.
 1.114.4.2  27-Aug-2000  itojun pullup 1.117 -> 1.118 (approved by releng-1-5)

> make sure anonport{min,max} is not negative number
 1.114.4.1  26-Aug-2000  tron Pull up from current (approved by thorpej):

Add new sysctl variables "net.inet.ip.lowportmin" and
"net.inet.ip.lowportmax" which can be used to set the minimum
and maximum port number assigned to sockets using
IP_PORTRANGE_LOW.

syssrc/sys/netinet/in.h 1.49 -> 1.50
syssrc/sys/netinet/in_pcb.c 1.66 -> 1.67
syssrc/sys/netinet/ip_input.c 1.116 -> 1.117
syssrc/sys/netinet/ip_var.h 1.41 -> 1.42
 1.130.2.16  11-Dec-2002  thorpej Sync with HEAD.
 1.130.2.15  11-Nov-2002  nathanw Catch up to -current
 1.130.2.14  18-Oct-2002  nathanw Catch up to -current.
 1.130.2.13  17-Sep-2002  nathanw Catch up to -current.
 1.130.2.12  27-Aug-2002  nathanw Catch up to -current.
 1.130.2.11  01-Aug-2002  nathanw Catch up to -current.
 1.130.2.10  20-Jun-2002  nathanw Catch up to -current.
 1.130.2.9  04-May-2002  thorpej Update from trunk.
 1.130.2.8  01-Apr-2002  nathanw Catch up to -current.
(CVS: It's not just a program. It's an adventure!)
 1.130.2.7  28-Feb-2002  nathanw Catch up to -current.
 1.130.2.6  08-Jan-2002  nathanw Catch up to -current.
 1.130.2.5  14-Nov-2001  nathanw Catch up to -current.
 1.130.2.4  21-Sep-2001  nathanw Catch up to -current.
 1.130.2.3  24-Aug-2001  nathanw Catch up with -current.
 1.130.2.2  21-Jun-2001  nathanw Catch up to -current.
 1.130.2.1  09-Apr-2001  nathanw Catch up with -current.
 1.135.2.6  10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.135.2.5  06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.135.2.4  23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.135.2.3  16-Mar-2002  jdolecek Catch up with -current.
 1.135.2.2  10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.135.2.1  25-Aug-2001  thorpej Merge Aug 24 -current into the kqueue branch.
 1.136.2.1  01-Oct-2001  fvdl Catch up with -current.
 1.137.2.1  12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.150.4.3  17-Jun-2003  msaitoh Pullup rev. 1.163 via patch (requested by itojun in ticket #984):
remove all entries in rt timer queue on ip_mtudisc change, instead of
destroying the queue.
 1.150.4.2  12-Nov-2002  tron Pull up revision 1.160 (requested by itojun in ticket #977):
always create pmtud timeout queue, as ip_mtudisc can be tweaked via
sysctl at runtime. From lha@stacken.kth.se
 1.150.4.1  07-Jun-2002  thorpej pullup-1-6 ticket #202:

syssrc/sys/netinet/ip_input.c 1.151

Original log message:

look at rmx_mtu on IPsec tunnel MTU computation.
From: David Waitzman <djw@bbn.com>
 1.150.2.3  29-Aug-2002  gehenna catch up with -current.
 1.150.2.2  15-Jul-2002  gehenna catch up with -current.
 1.150.2.1  20-Jun-2002  gehenna catch up with -current.
 1.169.2.8  10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.169.2.7  01-Apr-2005  skrll Sync with HEAD.
 1.169.2.6  04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.169.2.5  04-Feb-2005  skrll Sync with HEAD.
 1.169.2.4  17-Jan-2005  skrll Sync with HEAD.
 1.169.2.3  18-Dec-2004  skrll Sync with HEAD.
 1.169.2.2  19-Oct-2004  skrll Sync with HEAD
 1.169.2.1  03-Aug-2004  skrll Sync with HEAD
 1.197.2.1  28-May-2004  tron Pull up revision 1.203 (requested by atatat in ticket #391):
Sysctl descriptions under net subtree (net.key not done)
 1.208.2.1  29-Apr-2005  kent sync with -current
 1.209.2.2  19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.209.2.1  12-Feb-2005  yamt sync with head.
 1.212.2.3  17-Sep-2007  bouyer Pull up following revision(s) (requested by degroote in ticket #1840):
sys/netinet/ip_input.c: revision 1.253
In some FAST_IPSEC, spl level is not restored correctly. Fix that.
Spotted by Wolfgang Stukenbrock in pr/36800
 1.212.2.2  06-May-2005  tron branches: 1.212.2.2.2; 1.212.2.2.4;
Pull up revision 1.214 (requested by yamt in ticket #251):
fix problems related to loopback interface checksum omission. PR/29971.
- for ipv4, defer decision to ip layer as h/w checksum offloading does
so that it can check the actual interface the packet is going to.
- for ipv6, disable it.
(maybe will be revisited when it implements h/w checksum offloading.)
ok'ed by Jason Thorpe.
 1.212.2.1  04-Apr-2005  tron Pull up revision 1.213 (requested by yamt in ticket #88):
ip_reass: clear stale csum_flags.
 1.212.2.2.4.1  17-Sep-2007  bouyer Pull up following revision(s) (requested by degroote in ticket #1840):
sys/netinet/ip_input.c: revision 1.253
In some FAST_IPSEC, spl level is not restored correctly. Fix that.
Spotted by Wolfgang Stukenbrock in pr/36800
 1.212.2.2.2.1  17-Sep-2007  bouyer Pull up following revision(s) (requested by degroote in ticket #1840):
sys/netinet/ip_input.c: revision 1.253
In some FAST_IPSEC, spl level is not restored correctly. Fix that.
Spotted by Wolfgang Stukenbrock in pr/36800
 1.218.2.9  11-Feb-2008  yamt sync with head.
 1.218.2.8  21-Jan-2008  yamt sync with head
 1.218.2.7  07-Dec-2007  yamt sync with head
 1.218.2.6  15-Nov-2007  yamt sync with head.
 1.218.2.5  27-Oct-2007  yamt sync with head.
 1.218.2.4  03-Sep-2007  yamt sync with head.
 1.218.2.3  26-Feb-2007  yamt sync with head.
 1.218.2.2  30-Dec-2006  yamt sync with head.
 1.218.2.1  21-Jun-2006  yamt sync with head.
 1.219.2.2  02-Nov-2005  yamt sync with head.
 1.219.2.1  26-Oct-2005  yamt sync with head
 1.223.6.3  01-Jun-2006  kardel Sync with head.
 1.223.6.2  22-Apr-2006  simonb Sync with head.
 1.223.6.1  04-Feb-2006  simonb Adapt for timecounters: mostly use get*time(), use bintime's for timeout
calculations and use "time_second" instead of "time.tv_sec".
 1.223.4.1  09-Sep-2006  rpaulo sync with head
 1.223.2.1  01-Mar-2006  yamt sync with head.
 1.224.6.1  24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.224.4.2  11-May-2006  elad sync with head
 1.224.4.1  19-Apr-2006  elad sync with head.
 1.224.2.5  14-Sep-2006  yamt sync with head.
 1.224.2.4  03-Sep-2006  yamt sync with head.
 1.224.2.3  11-Aug-2006  yamt sync with head
 1.224.2.2  26-Jun-2006  yamt sync with head.
 1.224.2.1  24-May-2006  yamt sync with head.
 1.226.2.1  19-Jun-2006  chap Sync with head.
 1.229.2.3  01-Feb-2007  ad Sync with head.
 1.229.2.2  12-Jan-2007  ad Sync with head.
 1.229.2.1  18-Nov-2006  ad Sync with head.
 1.231.2.3  18-Dec-2006  yamt sync with head.
 1.231.2.2  10-Dec-2006  yamt sync with head.
 1.231.2.1  22-Oct-2006  yamt sync with head
 1.236.4.2  03-Jun-2008  skrll Sync with netbsd-4.
 1.236.4.1  23-Sep-2007  wrstuden Sync with somewhat-recent netbsd-4.
 1.236.2.2  30-Mar-2008  jdc Pull up revisions:
src/sys/netinet/ip_input.c 1.263
src/sys/netinet/tcp_subr.c 1.225
(requested by cube in ticket #1109).

- Make sure we send a reasonable fragment size when IPSEC is configured.
Otherwise we end up sending a dubious "0" whenever we cannot find a
proper association for the packet.
- Reset sack_newdata along with snd_nxt to avoid improper integer
arithmetics that lead to sending data from an incorrect place in the
stream, making it appear as corrupted.

Patch by Michael Van Elst, based on an analysis by Michael for the IPSEC
stuff and I for the SACK issue.
 1.236.2.1  16-Sep-2007  xtraeme branches: 1.236.2.1.4;
Pull up following revision(s) (requested by degroote in ticket #881):
sys/netinet/ip_input.c: revision 1.253
sys/netinet6/ip6_input.c: revision 1.110

In some FAST_IPSEC, spl level is not restored correctly. Fix that.
Spotted by Wolfgang Stukenbrock in pr/36800
 1.236.2.1.4.1  30-Mar-2008  jdc Pull up revisions:
src/sys/netinet/ip_input.c 1.263
src/sys/netinet/tcp_subr.c 1.225
(requested by cube in ticket #1109).

- Make sure we send a reasonable fragment size when IPSEC is configured.
Otherwise we end up sending a dubious "0" whenever we cannot find a
proper association for the packet.
- Reset sack_newdata along with snd_nxt to avoid improper integer
arithmetics that lead to sending data from an incorrect place in the
stream, making it appear as corrupted.

Patch by Michael Van Elst, based on an analysis by Michael for the IPSEC
stuff and I for the SACK issue.
 1.242.2.5  07-May-2007  yamt sync with head.
 1.242.2.4  15-Apr-2007  yamt sync with head.
 1.242.2.3  24-Mar-2007  yamt sync with head.
 1.242.2.2  12-Mar-2007  rmind Sync with HEAD.
 1.242.2.1  27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.245.2.5  09-Oct-2007  ad Sync with head.
 1.245.2.4  20-Aug-2007  ad Sync with HEAD.
 1.245.2.3  08-Jun-2007  ad Sync with head.
 1.245.2.2  10-Apr-2007  ad Sync with head.
 1.245.2.1  13-Mar-2007  ad Sync with head.
 1.246.4.1  29-Mar-2007  reinoud Pullup to -current
 1.246.2.1  11-Jul-2007  mjf Sync with head.
 1.249.2.2  03-Sep-2007  skrll Sync with HEAD.
 1.249.2.1  15-Aug-2007  skrll Sync with HEAD.
 1.250.6.2  19-Jul-2007  dyoung Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

Introduce rt_walktree() for walking the routing table and
applying a function to each rtentry. Replace most
rn_walktree() calls with it.

Use rt_getkey()/rt_setkey() to get/set a route's destination.
Keep a pointer to the sockaddr key in the rtentry, so that
rtentry users do not have to grovel in the radix_node for
the key.

Add a RTM_GET method to rtrequest. Use that instead of
radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

Constify. KNF. Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
et cetera. Use NULL instead of 0 for null pointers. Use
__arraycount(). Reduce gratuitous parenthesization.

Stop using variadic arguments for rip6_output(), it is
unnecessary.

Remove the unnecessary rtentry member rt_genmask and the
code to maintain it, since nothing actually used it.

Make rt_maskedcopy() easier to read by using meaningful variable
names.

Extract a subroutine intern_netmask() for looking up a netmask in
the masks table.

Start converting backslash-ridden IPv6 macros in
sys/netinet6/in6_var.h into inline subroutines that one
can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
 1.250.6.1  19-Jul-2007  dyoung file ip_input.c was added on branch matt-mips64 on 2007-07-19 20:48:56 +0000
 1.250.4.6  27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.250.4.5  11-Nov-2007  joerg Sync with HEAD.
 1.250.4.4  04-Oct-2007  joerg Sync with HEAD.
 1.250.4.3  02-Oct-2007  joerg Sync with HEAD.
 1.250.4.2  03-Sep-2007  jmcneill Sync with HEAD.
 1.250.4.1  16-Aug-2007  jmcneill Sync with HEAD.
 1.251.2.3  23-Mar-2008  matt sync with HEAD
 1.251.2.2  09-Jan-2008  matt sync with HEAD
 1.251.2.1  06-Nov-2007  matt sync with HEAD
 1.253.2.1  06-Oct-2007  yamt sync with head.
 1.254.4.4  18-Feb-2008  mjf Sync with HEAD.
 1.254.4.3  27-Dec-2007  mjf Sync with HEAD.
 1.254.4.2  08-Dec-2007  mjf Sync with HEAD.
 1.254.4.1  19-Nov-2007  mjf Sync with HEAD.
 1.254.2.1  13-Nov-2007  bouyer Sync with HEAD
 1.256.6.2  19-Jan-2008  bouyer Sync with HEAD
 1.256.6.1  02-Jan-2008  bouyer Sync with HEAD
 1.256.2.1  26-Dec-2007  ad Sync with head.
 1.262.6.5  17-Jan-2009  mjf Sync with HEAD.
 1.262.6.4  05-Oct-2008  mjf Sync with HEAD.
 1.262.6.3  28-Sep-2008  mjf Sync with HEAD.
 1.262.6.2  02-Jun-2008  mjf Sync with HEAD.
 1.262.6.1  03-Apr-2008  mjf Sync with HEAD.
 1.266.2.1  18-May-2008  yamt sync with head.
 1.268.2.6  11-Aug-2010  yamt sync with head.
 1.268.2.5  11-Mar-2010  yamt sync with head
 1.268.2.4  19-Aug-2009  yamt sync with head.
 1.268.2.3  18-Jul-2009  yamt sync with head.
 1.268.2.2  04-May-2009  yamt sync with head.
 1.268.2.1  16-May-2008  yamt sync with head.
 1.272.6.2  13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.272.6.1  19-Oct-2008  haad Sync with HEAD.
 1.272.2.2  10-Oct-2008  skrll Sync with HEAD.
 1.272.2.1  18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.275.4.1  25-Nov-2008  snj branches: 1.275.4.1.8;
Pull up following revision(s) (requested by rmind in ticket #119):
sys/netinet/ip_input.c: revision 1.276
ip_input: fix an IPQ "lock" leak. (hi <matt>!)
 1.275.4.1.8.2  07-Jan-2011  matt Back out an inadvertent change.
 1.275.4.1.8.1  07-Jan-2011  matt If using hardware checksum offload and the packet can't be h/w checksummed
(for whatever reason, some hardware is stupid) allow the driver to calculate
the checksum instead.
 1.275.2.2  28-Apr-2009  skrll Sync with HEAD.
 1.275.2.1  19-Jan-2009  skrll Sync with HEAD.
 1.278.2.2  23-Jul-2009  jym Sync with HEAD.
 1.278.2.1  13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.284.4.4  31-May-2011  rmind sync with head
 1.284.4.3  21-Apr-2011  rmind sync with head
 1.284.4.2  05-Mar-2011  rmind sync with head
 1.284.4.1  30-May-2010  rmind sync with head
 1.284.2.3  06-Nov-2010  uebayasi Sync with HEAD.
 1.284.2.2  17-Aug-2010  uebayasi Sync with HEAD.
 1.284.2.1  30-Apr-2010  uebayasi Sync with HEAD.
 1.293.2.1  06-Jun-2011  jruoho Sync with HEAD.
 1.296.6.2  05-Apr-2012  mrg sync to latest -current.
 1.296.6.1  18-Feb-2012  mrg merge to -current.
 1.296.2.4  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.296.2.3  16-Jan-2013  yamt sync with (a bit old) head
 1.296.2.2  30-Oct-2012  yamt sync with head
 1.296.2.1  17-Apr-2012  yamt sync with head
 1.298.8.1  09-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1526):
sys/netinet/ip_input.c: revision 1.366

Disable ip_allowsrcrt and ip_forwsrcrt. Enabling them by default was a
completely dumb idea, because they have security implications.

By sending an IPv4 packet containing an LSRR option, an attacker will
cause the system to forward the packet to another IPv4 address - and
this way he white-washes the source of the packet.

It is also possible for an attacker to reach hidden networks: if a server
has a public address, and a private one on an internal network (network
which has several internal machines connected), the attacker can send a
packet with:
source = 0.0.0.0
destination = public address of the server
LSRR first address = address of a machine on the internal network
And the packet will be forwarded, by the server, to the internal machine,
in some cases even with the internal IP address of the server as a source.
 1.298.6.1  09-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1526):
sys/netinet/ip_input.c: revision 1.366

Disable ip_allowsrcrt and ip_forwsrcrt. Enabling them by default was a
completely dumb idea, because they have security implications.

By sending an IPv4 packet containing an LSRR option, an attacker will
cause the system to forward the packet to another IPv4 address - and
this way he white-washes the source of the packet.

It is also possible for an attacker to reach hidden networks: if a server
has a public address, and a private one on an internal network (network
which has several internal machines connected), the attacker can send a
packet with:
source = 0.0.0.0
destination = public address of the server
LSRR first address = address of a machine on the internal network
And the packet will be forwarded, by the server, to the internal machine,
in some cases even with the internal IP address of the server as a source.
 1.298.2.1  09-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1526):
sys/netinet/ip_input.c: revision 1.366

Disable ip_allowsrcrt and ip_forwsrcrt. Enabling them by default was a
completely dumb idea, because they have security implications.

By sending an IPv4 packet containing an LSRR option, an attacker will
cause the system to forward the packet to another IPv4 address - and
this way he white-washes the source of the packet.

It is also possible for an attacker to reach hidden networks: if a server
has a public address, and a private one on an internal network (network
which has several internal machines connected), the attacker can send a
packet with:
source = 0.0.0.0
destination = public address of the server
LSRR first address = address of a machine on the internal network
And the packet will be forwarded, by the server, to the internal machine,
in some cases even with the internal IP address of the server as a source.
 1.302.2.4  03-Dec-2017  jdolecek update from HEAD
 1.302.2.3  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.302.2.2  23-Jun-2013  tls resync from head
 1.302.2.1  25-Feb-2013  tls resync with head
 1.307.2.3  18-May-2014  rmind sync with head
 1.307.2.2  28-Aug-2013  rmind sync with head
 1.307.2.1  17-Jul-2013  rmind Checkpoint work in progress:
- Move PCB structures under __INPCB_PRIVATE, adjust most of the callers
and thus make IPv4 PCB structures mostly opaque. Any volunteers for
merging in6pcb with inpcb (see rpaulo-netinet-merge-pcb branch)?
- Move various global vars to the modules where they belong, make them static.
- Some preliminary work for IPv4 PCB locking scheme.
- Make raw IP code mostly MP-safe. Simplify some of it.
- Rework "fast" IP forwarding (ipflow) code to be mostly MP-safe. It should
run from a software interrupt, rather than hard.
- Rework tun(4) pseudo interface to be MP-safe.
- Work towards making some other interfaces more strict.
 1.310.2.1  10-Aug-2014  tls Rebase.
 1.319.10.2  17-Sep-2019  martin Pull up following revision(s) (requested by bouyer in ticket #1708):

sys/netinet6/ip6_input.c: revision 1.209 via patch
sys/netinet/ip_input.c: revision 1.390 via patch

Packet filters can return an mbuf chain with fragmented headers, so
m_pullup() it if needed and remove the KASSERT()s.
 1.319.10.1  09-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1563):
sys/netinet/ip_input.c: revision 1.366 (via patch)

Disable ip_allowsrcrt and ip_forwsrcrt. Enabling them by default was a
completely dumb idea, because they have security implications.

By sending an IPv4 packet containing an LSRR option, an attacker will
cause the system to forward the packet to another IPv4 address - and
this way he white-washes the source of the packet.

It is also possible for an attacker to reach hidden networks: if a server
has a public address, and a private one on an internal network (network
which has several internal machines connected), the attacker can send a
packet with:
source = 0.0.0.0
destination = public address of the server
LSRR first address = address of a machine on the internal network
And the packet will be forwarded, by the server, to the internal machine,
in some cases even with the internal IP address of the server as a source.
 1.319.6.2  17-Sep-2019  martin Pull up following revision(s) (requested by bouyer in ticket #1708):

sys/netinet6/ip6_input.c: revision 1.209 via patch
sys/netinet/ip_input.c: revision 1.390 via patch

Packet filters can return an mbuf chain with fragmented headers, so
m_pullup() it if needed and remove the KASSERT()s.
 1.319.6.1  09-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1563):
sys/netinet/ip_input.c: revision 1.366 (via patch)

Disable ip_allowsrcrt and ip_forwsrcrt. Enabling them by default was a
completely dumb idea, because they have security implications.

By sending an IPv4 packet containing an LSRR option, an attacker will
cause the system to forward the packet to another IPv4 address - and
this way he white-washes the source of the packet.

It is also possible for an attacker to reach hidden networks: if a server
has a public address, and a private one on an internal network (network
which has several internal machines connected), the attacker can send a
packet with:
source = 0.0.0.0
destination = public address of the server
LSRR first address = address of a machine on the internal network
And the packet will be forwarded, by the server, to the internal machine,
in some cases even with the internal IP address of the server as a source.
 1.319.4.10  28-Aug-2017  skrll Sync with HEAD
 1.319.4.9  05-Feb-2017  skrll Sync with HEAD
 1.319.4.8  05-Dec-2016  skrll Sync with HEAD
 1.319.4.7  05-Oct-2016  skrll Sync with HEAD
 1.319.4.6  09-Jul-2016  skrll Sync with HEAD
 1.319.4.5  19-Mar-2016  skrll Sync with HEAD
 1.319.4.4  27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.319.4.3  22-Sep-2015  skrll Sync with HEAD
 1.319.4.2  06-Jun-2015  skrll Sync with HEAD
 1.319.4.1  06-Apr-2015  skrll Sync with HEAD
 1.319.2.2  17-Sep-2019  martin Pull up following revision(s) (requested by bouyer in ticket #1708):

sys/netinet6/ip6_input.c: revision 1.209 via patch
sys/netinet/ip_input.c: revision 1.390 via patch

Packet filters can return an mbuf chain with fragmented headers, so
m_pullup() it if needed and remove the KASSERT()s.
 1.319.2.1  09-Feb-2018  martin Pull up following revision(s) (requested by maxv in ticket #1563):
sys/netinet/ip_input.c: revision 1.366 (via patch)

Disable ip_allowsrcrt and ip_forwsrcrt. Enabling them by default was a
completely dumb idea, because they have security implications.

By sending an IPv4 packet containing an LSRR option, an attacker will
cause the system to forward the packet to another IPv4 address - and
this way he white-washes the source of the packet.

It is also possible for an attacker to reach hidden networks: if a server
has a public address, and a private one on an internal network (network
which has several internal machines connected), the attacker can send a
packet with:
source = 0.0.0.0
destination = public address of the server
LSRR first address = address of a machine on the internal network
And the packet will be forwarded, by the server, to the internal machine,
in some cases even with the internal IP address of the server as a source.
 1.337.2.5  26-Apr-2017  pgoyette Sync with HEAD
 1.337.2.4  20-Mar-2017  pgoyette Sync with HEAD
 1.337.2.3  07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.337.2.2  04-Nov-2016  pgoyette Sync with HEAD
 1.337.2.1  06-Aug-2016  pgoyette Sync with HEAD
 1.347.2.1  21-Apr-2017  bouyer Sync with HEAD
 1.355.2.9  07-Mar-2021  martin Pull up following revision(s) (requested by christos in ticket #1661):

sys/netinet6/ip6_id.c: revision 1.19-1.21
sys/netinet6/ip6_var.h: revision 1.88
sys/netinet/ip_input.c: revision 1.400
sys/netinet/tcp_subr.c: revision 1.285
sys/netinet/ip6.h: revision 1.30

netinet: Enable random IP fragment ids by default (from riastradh)

netinet: Enable RFC 1948 pseudorandom TCP ISS selection by default.
(from riastradh)

netinet6: Mark randomid unused.

Will make merging and bisection easier if anything goes wrong with
flow label or fragment id randomization changes.
(from riastradh)

netinet/netinet6: Add necessary includes to make these standalone.
(from riastradh)

Replace randomid() by cprng_fast32()
 1.355.2.8  24-Sep-2019  martin Pull up following revision(s) (requested by knakahara in ticket #1385):

sys/net/if.c 1.461
sys/net/if.h 1.277
sys/net/if_gif.c 1.149
sys/net/if_gif.h 1.33
sys/net/if_ipsec.c 1.19,1.20,1.24
sys/net/if_ipsec.h 1.5
sys/net/if_l2tp.c 1.33,1.36-1.39
sys/net/if_l2tp.h 1.7,1.8
sys/net/route.c 1.220,1.221
sys/net/route.h 1.125
sys/netinet/in_gif.c 1.95
sys/netinet/in_l2tp.c 1.17
sys/netinet/ip_input.c 1.391,1.392
sys/netinet/wqinput.c 1.6
sys/netinet6/in6_gif.c 1.94
sys/netinet6/in6_l2tp.c 1.18
sys/netinet6/ip6_forward.c 1.97
sys/netinet6/ip6_input.c 1.210,1.211
sys/netipsec/ipsec_output.c 1.82,1.83 (patched)
sys/netipsec/ipsecif.c 1.12,1.13,1.15,1.17 (patched)
sys/netipsec/key.c 1.259,1.260

ipsecif(4) support input drop packet counter.

ipsecif(4) should not increment drop counter by errors not related to if_snd. Pointed out by ozaki-r@n.o, thanks.
Remove unnecessary addresses in PF_KEY message.

MOBIKE Extensions for PF_KEY draft-schilcher-mobike-pfkey-extension-01.txt says
 1.355.2.7  17-Sep-2019  martin Pull up following revision(s) (requested by bouyer in ticket #1378):

sys/netinet6/ip6_input.c: revision 1.209 (patch)
sys/netinet/ip_input.c: revision 1.390 (patch)

Packet filters can return an mbuf chain with fragmented headers, so
m_pullup() it if needed and remove the KASSERT()s.
 1.355.2.6  18-Mar-2018  martin Pull up following revision(s) (requested by tih in ticket #639):
sys/kern/uipc_socket.c: revision 1.258
sys/kern/uipc_socket.c: revision 1.259
sys/netinet/ip_input.c: revision 1.364 (via patch)
sys/netinet/ip_output.c: revision 1.289
sys/netinet/in.h: revision 1.102
sys/netinet/in_pcb.c: revision 1.181
share/man/man9/sockopt.9: revision 1.11
sys/netinet/in_pcb.h: revision 1.65
sys/sys/socketvar.h: revision 1.146
sys/kern/uipc_syscalls.c: revision 1.189
sys/netinet/ip_output.c: revision 1.290
share/man/man4/ip.4: revision 1.41
share/man/man4/ip.4: revision 1.42
sys/kern/uipc_syscalls.c: revision 1.190

pass valsize for getsockopt like we do for setsockopt
make sure that we have enough space, don't require the exact size
(Tom Ivar Helbekkmo)

1) "#define ipi_spec_dst ipi_addr" in <netinet/in.h>
2) Change the IP_RECVPKTINFO option to control the generation of
IP_PKTINFO control messages, the way it's done in Solaris.
3) Remove the superfluous IP_RECVPKTINFO control message.
4) Change the IP_PKTINFO option to do different things depending on
the parameter it's supplied with:
- If it's sizeof(int), assume it's being used as in Linux:
- If it's non-zero, turn on the IP_RECVPKTINFO option.
- If it's zero, turn off the IP_RECVPKTINFO option.
- If it's sizeof(struct in_pktinfo), assume it's being used as in
Solaris, to set a default for the source interface and/or
source address for outgoing packets on the socket.
5) Return what Linux or Solaris compatible code expects, depending
on data size, and just added a fallback to a Linux (and current NetBSD)
compatible value if the size is unknown (as it is now), or,
in the future, if the calling application specifies a receiving
buffer that doesn't match either data item.

From: Tom Ivar Helbekkmo

new sentence-new line

Remove comment now that the getsockopt code passes the size.

Add a new sockopt member to keep track of the actual size of the option
that should be returned to the caller in getsockopt(2).
(Tom Ivar Helbekkmo)
 1.355.2.5  26-Feb-2018  martin Pull up following revision(s) (requested by ozaki-r in ticket #588):
sys/netinet6/in6.c: revision 1.260
sys/netinet/in.c: revision 1.219
sys/netinet/wqinput.c: revision 1.4
sys/rump/net/lib/libnetinet/netinet_component.c: revision 1.11
sys/netinet/ip_input.c: revision 1.376
sys/netinet6/ip6_input.c: revision 1.193
Avoid a deadlock between softnet_lock and IFNET_LOCK

A deadlock occurs because there is a violation of the rule of lock ordering;
softnet_lock is held while holding IFNET_LOCK, which violates the rule.
To avoid the deadlock, replace softnet_lock in in_control and in6_control
with KERNEL_LOCK.

We also need to add some KERNEL_LOCKs to protect the network stack surely.
This is required, for example, for PR kern/51356.

Fix PR kern/53043
 1.355.2.4  12-Feb-2018  snj Pull up following revision(s) (requested by maxv in ticket #547):
sys/netinet/ip_input.c: 1.366
Disable ip_allowsrcrt and ip_forwsrcrt. Enabling them by default was a
completely dumb idea, because they have security implications.
By sending an IPv4 packet containing an LSRR option, an attacker will
cause the system to forward the packet to another IPv4 address - and
this way he white-washes the source of the packet.
It is also possible for an attacker to reach hidden networks: if a server
has a public address, and a private one on an internal network (network
which has several internal machines connected), the attacker can send a
packet with:
source = 0.0.0.0
destination = public address of the server
LSRR first address = address of a machine on the internal network
And the packet will be forwarded, by the server, to the internal machine,
in some cases even with the internal IP address of the server as a source.
 1.355.2.3  02-Jan-2018  snj Pull up following revision(s) (requested by ozaki-r in ticket #456):
sys/arch/arm/sunxi/sunxi_emac.c: 1.9
sys/dev/ic/dwc_gmac.c: 1.43-1.44
sys/dev/pci/if_iwm.c: 1.75
sys/dev/pci/if_wm.c: 1.543
sys/dev/pci/ixgbe/ixgbe.c: 1.112
sys/dev/pci/ixgbe/ixv.c: 1.74
sys/kern/sys_socket.c: 1.75
sys/net/agr/if_agr.c: 1.43
sys/net/bpf.c: 1.219
sys/net/if.c: 1.397, 1.399, 1.401-1.403, 1.406-1.410, 1.412-1.416
sys/net/if.h: 1.242-1.247, 1.250, 1.252-1.257
sys/net/if_bridge.c: 1.140 via patch, 1.142-1.146
sys/net/if_etherip.c: 1.40
sys/net/if_ethersubr.c: 1.243, 1.246
sys/net/if_faith.c: 1.57
sys/net/if_gif.c: 1.132
sys/net/if_l2tp.c: 1.15, 1.17
sys/net/if_loop.c: 1.98-1.101
sys/net/if_media.c: 1.35
sys/net/if_pppoe.c: 1.131-1.132
sys/net/if_spppsubr.c: 1.176-1.177
sys/net/if_tun.c: 1.142
sys/net/if_vlan.c: 1.107, 1.109, 1.114-1.121
sys/net/npf/npf_ifaddr.c: 1.3
sys/net/npf/npf_os.c: 1.8-1.9
sys/net/rtsock.c: 1.230
sys/netcan/if_canloop.c: 1.3-1.5
sys/netinet/if_arp.c: 1.255
sys/netinet/igmp.c: 1.65
sys/netinet/in.c: 1.210-1.211
sys/netinet/in_pcb.c: 1.180
sys/netinet/ip_carp.c: 1.92, 1.94
sys/netinet/ip_flow.c: 1.81
sys/netinet/ip_input.c: 1.362
sys/netinet/ip_mroute.c: 1.147
sys/netinet/ip_output.c: 1.283, 1.285, 1.287
sys/netinet6/frag6.c: 1.61
sys/netinet6/in6.c: 1.251, 1.255
sys/netinet6/in6_pcb.c: 1.162
sys/netinet6/ip6_flow.c: 1.35
sys/netinet6/ip6_input.c: 1.183
sys/netinet6/ip6_output.c: 1.196
sys/netinet6/mld6.c: 1.90
sys/netinet6/nd6.c: 1.239-1.240
sys/netinet6/nd6_nbr.c: 1.139
sys/netinet6/nd6_rtr.c: 1.136
sys/netipsec/ipsec_output.c: 1.65
sys/rump/net/lib/libnetinet/netinet_component.c: 1.9-1.10
kmem_intr_free kmem_intr_[z]alloced memory
the underlying pools are the same but api-wise those should match
Unify IFEF_*_MPSAFE into IFEF_MPSAFE
There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.
Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).
Note that if an interface has both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
operations take the kernel lock.
Proposed on tech-kern@ and tech-net@
Provide macros for softnet_lock and KERNEL_LOCK hiding NET_MPSAFE switch
It reduces C&P codes such as "#ifndef NET_MPSAFE KERNEL_LOCK(1, NULL); ..."
scattered all over the source code and makes it easy to identify remaining
KERNEL_LOCK and/or softnet_lock that are held even if NET_MPSAFE.
No functional change
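
The idea of hiding the NET_MPSAFE switch behind macros can be sketched roughly as follows. This is only an illustration: big_lock(), big_unlock() and the BIG_*_UNLESS_NET_MPSAFE macro names are hypothetical stand-ins, not the kernel's actual definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins for the real kernel lock, for illustration only. */
static int big_lock_depth;

static void big_lock(void)   { big_lock_depth++; }
static void big_unlock(void) { assert(big_lock_depth > 0); big_lock_depth--; }

/* The NET_MPSAFE switch lives in one place instead of at every call site. */
#ifndef NET_MPSAFE
#define BIG_LOCK_UNLESS_NET_MPSAFE()	big_lock()
#define BIG_UNLOCK_UNLESS_NET_MPSAFE()	big_unlock()
#else
#define BIG_LOCK_UNLESS_NET_MPSAFE()	do { } while (0)
#define BIG_UNLOCK_UNLESS_NET_MPSAFE()	do { } while (0)
#endif

/* A caller no longer needs its own "#ifndef NET_MPSAFE" block. */
static int
locked_depth_during_call(void)
{
	int depth;

	BIG_LOCK_UNLESS_NET_MPSAFE();
	depth = big_lock_depth;	/* 1 without NET_MPSAFE, 0 with it */
	BIG_UNLOCK_UNLESS_NET_MPSAFE();
	return depth;
}
```

Searching for the macro names then lists every place where the lock is still taken conditionally.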
Hold KERNEL_LOCK on if_ioctl selectively based on IFEF_MPSAFE
If IFEF_MPSAFE is set, don't hold the lock; otherwise hold it.
This change requires adding KERNEL_LOCK to functions subsequently called from
if_ioctl, such as ifmedia_ioctl and ifioctl_common, to protect non-MP-safe
components.
Proposed on tech-kern@ and tech-net@
Ensure to hold if_ioctl_lock when calling if_flags_set
Fix locking against myself on ifpromisc
vlan_unconfig_locked could be called with holding if_ioctl_lock.
Ensure to not turn on IFF_RUNNING of an interface until its initialization completes
And ensure to turn it off before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)
Ensure to hold if_ioctl_lock on if_up and if_down
One exception for if_down is if_detach; in the case the lock isn't needed
because it's guaranteed that no other one can access ifp at that point.
Make if_link_queue MP-safe if IFEF_MPSAFE
if_link_queue is a queue to store events of link state changes, which is
used to pass events from (typically) an interrupt handler to
if_link_state_change softint. The queue was protected by KERNEL_LOCK so far,
but if IFEF_MPSAFE is enabled, it becomes unsafe because (perhaps) an interrupt
handler of an interface with IFEF_MPSAFE doesn't take KERNEL_LOCK. Protect it
by a spin mutex.
Additionally with this change KERNEL_LOCK of if_link_state_change softint is
omitted if NET_MPSAFE is enabled.
Note that the spin mutex is now ifp->if_snd.ifq_lock as well as the case of
if_timer (see the comment).
Use IFADDR_WRITER_FOREACH instead of IFADDR_READER_FOREACH
At that point no other one modifies the list so IFADDR_READER_FOREACH
is unnecessary. Using IFADDR_READER_FOREACH is harmless in general; however,
if we try to detect contract violations of pserialize, using it violates
the contract. So avoiding it makes life easier.
Ensure to call if_addr_init with holding if_ioctl_lock
Get rid of outdated comments
Fix build of kernels without ether
By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created an unnecessary dependency from if.c to if_ethersubr.c.
PR kern/52790
Rename IFNET_LOCK to IFNET_GLOBAL_LOCK
IFNET_LOCK will be used in another lock, if_ioctl_lock (might be renamed then).
Wrap if_ioctl_lock with IFNET_* macros (NFC)
Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...
Reorder some destruction routines in if_detach
- Destroy if_ioctl_lock at the end of the if_detach because it's used in various
destruction routines
- Move psref_target_destroy after pr_purgeif because we want to use psref in
pr_purgeif (otherwise destruction procedures can be tricky)
Ensure to call if_mcast_op with holding IFNET_LOCK
Note that CARP doesn't deal with IFNET_LOCK yet.
Remove IFNET_GLOBAL_LOCK where it's unnecessary because IFNET_LOCK is held
Describe which lock is used to protect each member variable of struct ifnet
Requested by skrll@
Write a guideline for converting an interface to IFEF_MPSAFE
Requested by skrll@
Note that IFNET_LOCK must not be held in softint
Don't set IFEF_MPSAFE unless NET_MPSAFE at this point
Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.
Note that enabling IFEF_MPSAFE alone yields little performance benefit because
the network stack is still serialized by the big kernel locks by default.
 1.355.2.2  10-Dec-2017  snj Pull up following revision(s) (requested by roy in ticket #390):
sys/netinet/ip_input.c: 1.363
sys/netinet6/ip6_input.c: 1.184-1.185
sys/netinet6/ip6_output.c: 1.194-1.195
sys/netinet6/in6_src.c: 1.83-1.84
Allow local communication over DETACHED addresses.
Allow binding to DETACHED or TENTATIVE addresses as we deny
sending upstream from them anyway.
Prefer non DETACHED or TENTATIVE addresses.
--
Attempt to restore v6 networking. Not 100% certain that these
changes are all that is needed, but they're certainly a big part of it
(especially the ip6_input.c change.)
--
Treat unvalidated addresses as deprecated in rule 3.
 1.355.2.1  21-Oct-2017  snj Pull up following revision(s) (requested by ozaki-r in ticket #300):
crypto/dist/ipsec-tools/src/setkey/parse.y: 1.19
crypto/dist/ipsec-tools/src/setkey/token.l: 1.20
distrib/sets/lists/tests/mi: 1.754, 1.757, 1.759
doc/TODO.smpnet: 1.12-1.13
sys/net/pfkeyv2.h: 1.32
sys/net/raw_cb.c: 1.23-1.24, 1.28
sys/net/raw_cb.h: 1.28
sys/net/raw_usrreq.c: 1.57-1.58
sys/net/rtsock.c: 1.228-1.229
sys/netinet/in_proto.c: 1.125
sys/netinet/ip_input.c: 1.359-1.361
sys/netinet/tcp_input.c: 1.359-1.360
sys/netinet/tcp_output.c: 1.197
sys/netinet/tcp_var.h: 1.178
sys/netinet6/icmp6.c: 1.213
sys/netinet6/in6_proto.c: 1.119
sys/netinet6/ip6_forward.c: 1.88
sys/netinet6/ip6_input.c: 1.181-1.182
sys/netinet6/ip6_output.c: 1.193
sys/netinet6/ip6protosw.h: 1.26
sys/netipsec/ipsec.c: 1.100-1.122
sys/netipsec/ipsec.h: 1.51-1.61
sys/netipsec/ipsec6.h: 1.18-1.20
sys/netipsec/ipsec_input.c: 1.44-1.51
sys/netipsec/ipsec_netbsd.c: 1.41-1.45
sys/netipsec/ipsec_output.c: 1.49-1.64
sys/netipsec/ipsec_private.h: 1.5
sys/netipsec/key.c: 1.164-1.234
sys/netipsec/key.h: 1.20-1.32
sys/netipsec/key_debug.c: 1.18-1.21
sys/netipsec/key_debug.h: 1.9
sys/netipsec/keydb.h: 1.16-1.20
sys/netipsec/keysock.c: 1.59-1.62
sys/netipsec/keysock.h: 1.10
sys/netipsec/xform.h: 1.9-1.12
sys/netipsec/xform_ah.c: 1.55-1.74
sys/netipsec/xform_esp.c: 1.56-1.72
sys/netipsec/xform_ipcomp.c: 1.39-1.53
sys/netipsec/xform_ipip.c: 1.50-1.54
sys/netipsec/xform_tcp.c: 1.12-1.16
sys/rump/librump/rumpkern/Makefile.rumpkern: 1.170
sys/rump/librump/rumpnet/net_stub.c: 1.27
sys/sys/protosw.h: 1.67-1.68
tests/net/carp/t_basic.sh: 1.7
tests/net/if_gif/t_gif.sh: 1.11
tests/net/if_l2tp/t_l2tp.sh: 1.3
tests/net/ipsec/Makefile: 1.7-1.9
tests/net/ipsec/algorithms.sh: 1.5
tests/net/ipsec/common.sh: 1.4-1.6
tests/net/ipsec/t_ipsec_ah_keys.sh: 1.2
tests/net/ipsec/t_ipsec_esp_keys.sh: 1.2
tests/net/ipsec/t_ipsec_gif.sh: 1.6-1.7
tests/net/ipsec/t_ipsec_l2tp.sh: 1.6-1.7
tests/net/ipsec/t_ipsec_misc.sh: 1.8-1.18
tests/net/ipsec/t_ipsec_sockopt.sh: 1.1-1.2
tests/net/ipsec/t_ipsec_tcp.sh: 1.1-1.2
tests/net/ipsec/t_ipsec_transport.sh: 1.5-1.6
tests/net/ipsec/t_ipsec_tunnel.sh: 1.9
tests/net/ipsec/t_ipsec_tunnel_ipcomp.sh: 1.1-1.2
tests/net/ipsec/t_ipsec_tunnel_odd.sh: 1.3
tests/net/mcast/t_mcast.sh: 1.6
tests/net/net/t_ipaddress.sh: 1.11
tests/net/net_common.sh: 1.20
tests/net/npf/t_npf.sh: 1.3
tests/net/route/t_flags.sh: 1.20
tests/net/route/t_flags6.sh: 1.16
usr.bin/netstat/fast_ipsec.c: 1.22
Do m_pullup before mtod

It may fix panics of some tests on anita/sparc and anita/GuruPlug.
---
KNF
---
Enable DEBUG for babylon5
---
Apply C99-style struct initialization to xformsw
---
Tweak outputs of netstat -s for IPsec

- Get rid of "Fast"
- Use ipsec and ipsec6 for titles to clarify protocol
- Indent outputs of sub protocols

Original outputs were organized like this:

(Fast) IPsec:
IPsec ah:
IPsec esp:
IPsec ipip:
IPsec ipcomp:
(Fast) IPsec:
IPsec ah:
IPsec esp:
IPsec ipip:
IPsec ipcomp:

New outputs are organized like this:

ipsec:
ah:
esp:
ipip:
ipcomp:
ipsec6:
ah:
esp:
ipip:
ipcomp:
---
Add test cases for IPComp
---
Simplify IPSEC_OSTAT macro (NFC)
---
KNF; replace leading whitespaces with hard tabs
---
Introduce and use SADB_SASTATE_USABLE_P
---
KNF
---
Add update command for testing

Updating an SA (SADB_UPDATE) requires that the process issuing
SADB_UPDATE is the same as the process that issued SADB_ADD (or SADB_GETSPI).
This means that the update command must be used together with the add command
in a configuration of setkey. This usage is normally meaningless but
useful for testing (and debugging) purposes.
---
Add test cases for updating SA/SP

The tests require the newly-added update command of setkey.
---
PR/52346: Frank Kardel: Fix checksumming for NAT-T
See XXX for improvements.
---
Remove codes for PACKET_TAG_IPSEC_IN_CRYPTO_DONE

It seems that PACKET_TAG_IPSEC_IN_CRYPTO_DONE is for network adapters
that have IPsec accelerators; a driver sets the mtag to a packet
when its device has already encrypted the packet.

Unfortunately no driver has implemented such offload features for many
years, and none seems likely to implement them soon. (Note that neither
FreeBSD nor Linux has such drivers.) Let's remove the related
(unused) code and simplify the IPsec code.
---
Fix usages of sadb_msg_errno
---
Avoid updating sav directly

On SADB_UPDATE a target sav was updated directly, which was unsafe.
Instead allocate another sav, copy variables of the old sav to
the new one and replace the old one with the new one.
---
Simplify; we can assume sav->tdb_xform cannot be NULL while it's valid
---
Rename key_alloc* functions (NFC)

We shouldn't use the term "alloc" for functions that just look up
data and actually don't allocate memory.
---
Use explicit_memset to surely zero-clear key_auth and key_enc
---
Make sure to clear keys on error paths of key_setsaval
---
Add missing KEY_FREESAV
---
Make sure a sav is inserted to a sah list after its initialization completes
---
Remove unnecessary zero-clearing codes from key_setsaval

key_setsaval is now used only for a newly-allocated sav. (It was
used to reset variables of an existing sav.)
---
Correct wrong assumption of sav->refcnt in key_delsah

A sav in a list should basically never have sav->refcnt == 0. Also,
KEY_FREESAV assumes sav->refcnt > 0.
---
Let key_getsavbyspi take a reference of a returning sav
---
Use time_mono_to_wall (NFC)
---
Separate sending message routine (NFC)
---
Simplify; remove unnecessary zero-clears

key_freesaval is used only when a target sav is being destroyed.
---
Omit NULL checks for sav->lft_c

sav->lft_c can be NULL only when initializing or destroying sav.
---
Omit unnecessary NULL checks for sav->sah
---
Omit unnecessary check of sav->state

key_allocsa_policy picks a sav of either MATURE or DYING so we
don't need to check its state again.
---
Simplify; omit unnecessary saidx passing

- ipsec_nextisr returns a saidx but no caller uses it
- key_checkrequest is passed a saidx but it can be obtained from
another argument (isr)
---
Fix splx isn't called on some error paths
---
Fix header size calculation of esp where sav is NULL
---
Fix header size calculation of ah in the case sav is NULL

This fix was also needed for esp.
---
Pass sav directly to opencrypto callback

In a callback, use a passed sav as-is by default and look up a sav
only if the passed sav is dead.
---
Avoid examining freshness of sav on packet processing

If a sav list is sorted (by lft_c->sadb_lifetime_addtime) in advance,
we don't need to examine each sav and also don't need to delete one
on the fly and send up a message. Fortunately every sav list is sorted
as we need.

The newly added key_validate_savlist validates that each sav list is indeed
sorted (run only if DEBUG because it's not cheap).
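
A debug-only sortedness check of this kind might look like the sketch below. The struct sa_entry type and both function names are hypothetical simplifications, and the sort direction (newest entry first) is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an SA entry. */
struct sa_entry {
	uint64_t addtime;		/* creation time used as the sort key */
	struct sa_entry *next;
};

/*
 * Return true if the list is sorted by addtime.
 * Assumption: newest entries first (non-increasing addtime).
 */
static bool
sa_list_is_sorted(const struct sa_entry *head)
{
	const struct sa_entry *sa;

	for (sa = head; sa != NULL && sa->next != NULL; sa = sa->next) {
		if (sa->addtime < sa->next->addtime)
			return false;
	}
	return true;
}

/* With a sorted list, the lookup takes the head without scanning. */
static const struct sa_entry *
sa_list_pick(const struct sa_entry *head)
{
	/* O(n) check; a real build would compile this only with DEBUG. */
	assert(sa_list_is_sorted(head));
	return head;
}
```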
---
Add test cases for SAs with different SPIs
---
Prepare to stop using isr->sav

isr is a shared resource and using isr->sav as a temporal storage
for each packet processing is racy. And also having a reference from
isr to sav makes the lifetime of sav non-deterministic; such a reference
is removed when a packet is processed and isr->sav is overwritten by
new one. Let's have a sav locally for each packet processing instead of
using shared isr->sav.

However this change doesn't stop using isr->sav yet because there are
some users of isr->sav. isr->sav will be removed after the users find
a way to not use isr->sav.
---
Fix wrong argument handling
---
fix printf format.
---
Don't validate sav lists of LARVAL or DEAD states

We don't sort the lists so the validation will always fail.

Fix PR kern/52405
---
Make sure to sort the list when changing the state by key_sa_chgstate
---
Rename key_allocsa_policy to key_lookup_sa_bysaidx
---
Separate test files
---
Calculate ah_max_authsize on initialization as well as esp_max_ivlen
---
Remove m_tag_find(PACKET_TAG_IPSEC_PENDING_TDB) because nobody sets the tag
---
Restore a comment removed in previous

The comment is valid for the below code.
---
Make tests more stable

sleep command seems to wait longer than expected on anita so
use polling to wait for a state change.
---
Add tests that explicitly delete SAs instead of waiting for expirations
---
Remove invalid M_AUTHIPDGM check on ESP isr->sav

M_AUTHIPDGM flag is set to a mbuf in ah_input_cb. An sav of ESP can
have AH authentication as sav->tdb_authalgxform. However, in that
case esp_input and esp_input_cb are used to do ESP decryption and
AH authentication and M_AUTHIPDGM is never set on an mbuf. So
checking M_AUTHIPDGM of an mbuf on isr->sav of ESP is meaningless.
---
Look up sav instead of relying on unstable sp->req->sav

This code is executed only in an error path so an additional lookup
doesn't matter.
---
Correct a comment
---
Don't release sav if calling crypto_dispatch again
---
Remove extra KEY_FREESAV from ipsec_process_done

It should be done by the caller.
---
Don't bother the case of crp->crp_buf == NULL in callbacks
---
Hold a reference to an SP during opencrypto processing

An SP has a list of isr (ipsecrequest) that represents a sequence
of IPsec encryption/authentication processing. One isr corresponds
to one opencrypto processing. The lifetime of an isr follows its SP.

We pass an isr to a callback function of opencrypto to continue
to a next encryption/authentication processing. However nobody
guaranteed that the isr wasn't freed, i.e., its SP wasn't destroyed.

In order to avoid such unexpected destruction of isr, hold a reference
to its SP during opencrypto processing.
---
Don't make SAs expired on tests that delete SAs explicitly
---
Fix a debug message
---
Dedup error paths (NFC)
---
Use pool to allocate tdb_crypto

For ESP and AH, we need to allocate an extra variable space in addition
to struct tdb_crypto. The fixed size of pool items may be larger than
an actual requisite size of a buffer, but still the performance
improvement by replacing malloc with pool wins.
---
Don't use unstable isr->sav for header size calculations

We may need to optimize to not look up sav here for users that
don't need to know an exact size of headers (e.g., TCP segment size
calculation).
---
Don't use sp->req->sav when handling NAT-T ESP fragmentation

In order to do this we need to look up a sav however an additional
look-up degrades performance. A sav is later looked up in
ipsec4_process_packet so delay the fragmentation check until then
to avoid an extra look-up.
---
Don't use key_lookup_sp that depends on unstable sp->req->sav

It provided a fast look-up of SP. We will provide an alternative
method in the future (after basic MP-ification finishes).
---
Stop setting isr->sav on looking up sav in key_checkrequest
---
Remove ipsecrequest#sav
---
Stop setting mtag of PACKET_TAG_IPSEC_IN_DONE because there are no users anymore
---
Skip ipsec_spi_*_*_preferred_new_timeout when running on qemu

Probably due to PR 43997
---
Add localcount to rump kernels
---
Remove unused macro
---
Fix key_getcomb_setlifetime

The fix adjusts a soft limit to be 80% of a corresponding hard limit.

I'm not sure the fix is really correct though, at least the original
code is wrong. A passed comb is zero-cleared before calling
key_getcomb_setlifetime, so
comb->sadb_comb_soft_addtime = comb->sadb_comb_soft_addtime * 80 / 100;
is meaningless.
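
The problem and the fix can be illustrated with a reduced sketch. The struct lifetime_comb type and both function names are hypothetical; here the zero-clearing is done inside the function only to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced version of a proposal's lifetime fields. */
struct lifetime_comb {
	uint64_t soft_addtime;
	uint64_t hard_addtime;
};

/* Pattern described as wrong: scales a field that was just zero-cleared. */
static void
set_lifetime_broken(struct lifetime_comb *comb, uint64_t hard)
{
	memset(comb, 0, sizeof(*comb));
	comb->hard_addtime = hard;
	comb->soft_addtime = comb->soft_addtime * 80 / 100;	/* always 0 */
}

/* Pattern described as the fix: derive the soft limit from the hard limit. */
static void
set_lifetime_fixed(struct lifetime_comb *comb, uint64_t hard)
{
	memset(comb, 0, sizeof(*comb));
	comb->hard_addtime = hard;
	comb->soft_addtime = comb->hard_addtime * 80 / 100;
}
```

With a hard limit of 86400 seconds, the first variant leaves the soft limit at 0 while the second sets it to 69120.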
---
Provide and apply key_sp_refcnt (NFC)

It simplifies further changes.
---
Fix indentation

Pointed out by knakahara@
---
Use pslist(9) for sptree
---
Don't acquire global locks for IPsec if NET_MPSAFE

Note that the change is just to make testing easy and IPsec isn't MP-safe yet.
---
Let PF_KEY socks hold their own lock instead of softnet_lock

Operations on SAD and SPD are executed via PF_KEY socks. The operations
include deletions of SAs and SPs that will use synchronization mechanisms
such as pserialize_perform to wait for references to SAs and SPs to be
released. It is known that using such mechanisms with holding softnet_lock
causes a deadlock. We should avoid the situation.
---
Make IPsec SPD MP-safe

We use localcount(9), not psref(9), to make the sptree and secpolicy (SP)
entries MP-safe because SPs need to be referenced over opencrypto
processing that executes a callback in a different context.

SPs on sockets aren't managed by the sptree and can be destroyed in softint.
localcount_drain cannot be used in softint so we delay the destruction of
such SPs to a thread context. To do so, a list to manage such SPs is added
(key_socksplist) and key_timehandler_spd deletes dead SPs in the list.

For more details please read the locking notes in key.c.

Proposed on tech-kern@ and tech-net@
---
Fix updating ipsec_used

- key_update_used wasn't called in key_api_spddelete2 and key_api_spdflush
- key_update_used wasn't called if an SP had been added/deleted but
a reply to userland failed
---
Fix updating ipsec_used; turn on when SPs on sockets are added
---
Add missing IPsec policy checks to icmp6_rip6_input

icmp6_rip6_input is quite similar to rip6_input and the same checks exist
in rip6_input.
---
Add test cases for setsockopt(IP_IPSEC_POLICY)
---
Don't use KEY_NEWSP for dummy SP entries

By the change KEY_NEWSP is now not called from softint anymore
and we can use kmem_zalloc with KM_SLEEP for KEY_NEWSP.
---
Comment out unused functions
---
Add test cases that there are SPs but no relevant SAs
---
Don't allow sav->lft_c to be NULL

lft_c of an sav that was created by SADB_GETSPI could be NULL.
---
Clean up clunky eval strings

- Remove unnecessary \ at EOL
- This allows to omit ; too
- Remove unnecessary quotes for arguments of atf_set
- Don't expand $DEBUG in eval
- We expect it's expanded on execution

Suggested by kre@
---
Remove unnecessary KEY_FREESAV in an error path

sav should be freed (unreferenced) by the caller.
---
Use pslist(9) for sahtree
---
Use pslist(9) for sah->savtree
---
Rename local variable newsah to sah

It may not be new.
---
MP-ify SAD slightly

- Introduce key_sa_mtx and use it for some list operations
- Use pserialize for some list iterations
---
Introduce KEY_SA_UNREF and replace KEY_FREESAV with it where sav will never be actually freed in the future

KEY_SA_UNREF is still key_freesav so no functional change for now.

This change reduces diff of further changes.
---
Remove out-of-date log output

Pointed out by riastradh@
---
Use KDASSERT instead of KASSERT for mutex_ownable

Because mutex_ownable is too heavy to run in a fast path
even for DIAGNOSTIC + LOCKDEBUG.

Suggested by riastradh@
---
Assemble global lists and related locks into cache lines (NFCI)

Also rename variable names from *tree to *list because they are
just lists, not trees.

Suggested by riastradh@
---
Move locking notes
---
Update the locking notes

- Add locking order
- Add locking notes for misc lists such as reglist
- Mention pserialize, key_sp_ref and key_sp_unref on SP operations

Requested by riastradh@
---
Describe constraints of key_sp_ref and key_sp_unref

Requested by riastradh@
---
Hold key_sad.lock on SAVLIST_WRITER_INSERT_TAIL
---
Add __read_mostly to key_psz

Suggested by riastradh@
---
Tweak wording (pserialize critical section => pserialize read section)

Suggested by riastradh@
---
Add missing mutex_exit
---
Fix setkey -D -P outputs

The outputs were tweaked (by me), but I forgot updating libipsec
in my local ATF environment...
---
MP-ify SAD (key_sad.sahlist and sah entries)

localcount(9) is used to protect key_sad.sahlist and sah entries
as well as SPD (and will be used for SAD sav).

Please read the locking notes of SAD for more details.
---
Introduce key_sa_refcnt and replace sav->refcnt with it (NFC)
---
Destroy sav only in the loop for DEAD sav
---
Fix KASSERT(solocked(sb->sb_so)) failure in sbappendaddr that is called eventually from key_sendup_mbuf

If key_sendup_mbuf isn't passed a socket, the assertion fails.
Originally in this case sb->sb_so was softnet_lock and callers
held softnet_lock so the assertion was magically satisfied.
Now sb->sb_so is key_so_mtx and also softnet_lock isn't always
held by callers so the assertion can fail.

Fix it by holding key_so_mtx if key_sendup_mbuf isn't passed a socket.

Reported by knakahara@
Tested by knakahara@ and ozaki-r@
---
Fix locking notes of SAD
---
Fix deadlock between key_sendup_mbuf called from key_acquire and localcount_drain

If we call key_sendup_mbuf from key_acquire that is called on packet
processing, a deadlock can happen like this:
- At key_acquire, a reference to an SP (and an SA) is held
- key_sendup_mbuf will try to take key_so_mtx
- Some other thread may try to localcount_drain to the SP with
holding key_so_mtx in say key_api_spdflush
- In this case localcount_drain never returns because key_sendup_mbuf,
which is stuck on key_so_mtx, never releases the reference to the SP

Fix the deadlock by deferring key_sendup_mbuf to the timer
(key_timehandler).
---
Fix that prev isn't cleared on retry
---
Limit the number of mbufs queued for deferred key_sendup_mbuf

Hundreds of mbufs can easily be queued on the list under heavy
network load.
---
MP-ify SAD (savlist)

localcount(9) is used to protect savlist of sah. The basic design is
similar to MP-ifications of SPD and SAD sahlist. Please read the
locking notes of SAD for more details.
---
Simplify ipsec_reinject_ipstack (NFC)
---
Add per-CPU rtcache to ipsec_reinject_ipstack

It reduces route lookups and also reduces rtcache lock contentions
when NET_MPSAFE is enabled.
---
Use pool_cache(9) instead of pool(9) for tdb_crypto objects

The change improves network throughput especially on multi-core systems.
---
Update

ipsec(4), opencrypto(9) and vlan(4) are now MP-safe.
---
Write known issues on scalability
---
Share a global dummy SP between PCBs

It is never changed so it can be pre-allocated and shared safely between PCBs.
---
Fix race condition on the rawcb list shared by rtsock and keysock

keysock now protects itself by its own mutex, which means that
the rawcb list is protected by two different mutexes (keysock's one
and softnet_lock for rtsock), which of course is useless.

Fix the situation by having a discrete rawcb list for each.
---
Use a dedicated mutex for rt_rawcb instead of softnet_lock if NET_MPSAFE
---
fix localcount leak in sav. fixed by ozaki-r@n.o.

I commit on behalf of him.
---
remove unnecessary comment.
---
Fix deadlock between pserialize_perform and localcount_drain

A typical usage of localcount_drain looks like this:

mutex_enter(&mtx);
item = remove_from_list();
pserialize_perform(psz);
localcount_drain(&item->localcount, &cv, &mtx);
mutex_exit(&mtx);

This sequence can cause a deadlock which happens for example on the following
situation:

- Thread A calls localcount_drain which calls xc_broadcast after releasing
a specified mutex
- Thread B enters the sequence and calls pserialize_perform with holding
the mutex while pserialize_perform also calls xc_broadcast
- Thread C (xc_thread) that calls an xcall callback of localcount_drain tries
to hold the mutex

xc_broadcast of thread B doesn't start until xc_broadcast of thread A
finishes, which is a feature of xcall(9). This means that pserialize_perform
never completes until xc_broadcast of thread A finishes. On the other hand,
thread C, which is a callee of xc_broadcast of thread A, is stuck on the mutex.
Finally the threads block each other (A blocks B, B blocks C and C blocks A).

A possible fix is to serialize executions of the above sequence by another
mutex, but adding another mutex makes the code complex, so fix the deadlock
by another way; the fix is to release the mutex before pserialize_perform
and instead use a condvar to prevent pserialize_perform from being called
simultaneously.

Note that the deadlock happens only if NET_MPSAFE is enabled.
---
Add missing ifdef NET_MPSAFE
---
Take softnet_lock on pr_input properly if NET_MPSAFE

Currently softnet_lock is taken unnecessarily in some cases, e.g.,
icmp_input and encap4_input from ip_input, or not taken even if needed,
e.g., udp_input and tcp_input from ipsec4_common_input_cb. Fix them.

NFC if NET_MPSAFE is disabled (default).
---
- sanitize key debugging so that we don't print extra newlines or unassociated
debugging messages.
- remove unused functions and make internal ones static
- print information in one line per message
---
humanize printing of ip addresses
---
cast reduction, NFC.
---
Fix typo in comment
---
Pull out ipsec_fill_saidx_bymbuf (NFC)
---
Don't abuse key_checkrequest just for looking up sav

It does more than expected for example key_acquire.
---
Fix SP is broken on transport mode

isr->saidx was modified accidentally in ipsec_nextisr.

Reported by christos@
Helped investigations by christos@ and knakahara@
---
Constify isr at many places (NFC)
---
Include socketvar.h for softnet_lock
---
Fix buffer length for ipsec_logsastr
 1.376.2.7  18-Jan-2019  pgoyette Synch with HEAD
 1.376.2.6  26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.376.2.5  06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.376.2.4  28-Jul-2018  pgoyette Sync with HEAD
 1.376.2.3  21-May-2018  pgoyette Sync with HEAD
 1.376.2.2  02-May-2018  pgoyette Synch with HEAD
 1.376.2.1  16-Apr-2018  pgoyette Sync with HEAD, resolve some conflicts
 1.384.2.2  13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.384.2.1  10-Jun-2019  christos Sync with HEAD
 1.389.2.3  07-Mar-2021  martin Pull up following revision(s) (requested by christos in ticket #1226):

sys/netinet6/ip6_id.c: revision 1.19-1.21
sys/netinet6/ip6_var.h: revision 1.88
sys/netinet/ip_input.c: revision 1.400
sys/netinet/tcp_subr.c: revision 1.285
sys/netinet/ip6.h: revision 1.30

netinet: Enable random IP fragment ids by default (from riastradh)

netinet: Enable RFC 1948 pseudorandom TCP ISS selection by default.
(from riastradh)

netinet6: Mark randomid unused.

Will make merging and bisection easier if anything goes wrong with
flow label or fragment id randomization changes.
(from riastradh)

netinet/netinet6: Add necessary includes to make these standalone.
(from riastradh)

Replace randomid() by cprng_fast32()
 1.389.2.2  24-Sep-2019  martin Pull up following revision(s) (requested by ozaki-r in ticket #238):

sys/netipsec/ipsec_output.c: revision 1.83
sys/net/route.h: revision 1.125
sys/netinet6/ip6_input.c: revision 1.210
sys/netinet6/ip6_input.c: revision 1.211
sys/net/if.c: revision 1.461
sys/net/if_gif.h: revision 1.33
sys/net/route.c: revision 1.220
sys/net/route.c: revision 1.221
sys/net/if.h: revision 1.277
sys/netinet6/ip6_forward.c: revision 1.97
sys/netinet/wqinput.c: revision 1.6
sys/net/if_ipsec.h: revision 1.5
sys/netinet6/in6_l2tp.c: revision 1.18
sys/netinet6/in6_gif.c: revision 1.94
sys/net/if_l2tp.h: revision 1.7
sys/net/if_gif.c: revision 1.149
sys/net/if_l2tp.h: revision 1.8
sys/netinet/in_gif.c: revision 1.95
sys/netinet/in_l2tp.c: revision 1.17
sys/netipsec/ipsecif.c: revision 1.17
sys/net/if_ipsec.c: revision 1.24
sys/net/if_l2tp.c: revision 1.37
sys/netinet/ip_input.c: revision 1.391
sys/net/if_l2tp.c: revision 1.38
sys/netinet/ip_input.c: revision 1.392
sys/net/if_l2tp.c: revision 1.39

Avoid having a rtcache directly in a percpu storage

percpu(9) has a certain memory storage for each CPU and provides it by the piece
to users. If the storages went short, percpu(9) enlarges them by allocating new
larger memory areas, replacing old ones with them and destroying the old ones.

A percpu storage referenced by a pointer gotten via percpu_getref can be
destroyed by the mechanism after a running thread sleeps even if percpu_putref
has not been called.

Using rtcache, i.e., packet processing, typically involves sleepable operations
such as rwlock so we must avoid dereferencing a rtcache that is directly stored
in a percpu storage during packet processing. Address this situation by having
just a pointer to a rtcache in a percpu storage instead.
Reviewed by knakahara@ and yamaguchi@
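
The idea of keeping only a pointer in storage that may be replaced can be sketched in userland C as follows. This is an analogy, not percpu(9) itself: struct slots, slots_init() and slots_grow() are hypothetical names, and growing is modelled as allocating a new array and freeing the old one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a route cache. */
struct cache {
	int hits;
};

/* Hypothetical stand-in for storage that is replaced when it grows. */
struct slots {
	struct cache **slot;	/* each slot holds only a pointer */
	size_t nslots;
};

static int
slots_init(struct slots *s, size_t n)
{
	s->slot = calloc(n, sizeof(*s->slot));
	if (s->slot == NULL)
		return -1;
	s->nslots = n;
	return 0;
}

/* Grow by allocating a larger area, copying, and destroying the old one. */
static int
slots_grow(struct slots *s, size_t n)
{
	struct cache **newslot;

	assert(n >= s->nslots);
	newslot = calloc(n, sizeof(*newslot));
	if (newslot == NULL)
		return -1;
	memcpy(newslot, s->slot, s->nslots * sizeof(*newslot));
	free(s->slot);		/* pointers into the old area are now stale */
	s->slot = newslot;	/* slots beyond the old size start as NULL */
	s->nslots = n;
	return 0;
}
```

A caller that copied the struct cache pointer out of its slot before the storage grew can keep using it, because only the array of pointers moved; a struct cache embedded directly in the array would have been freed with it.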

-

wqinput: avoid having struct wqinput_worklist directly in a percpu storage

percpu(9) has a certain memory storage for each CPU and provides it by the piece
to users. If the storages went short, percpu(9) enlarges them by allocating new
larger memory areas, replacing old ones with them and destroying the old ones.

A percpu storage referenced by a pointer gotten via percpu_getref can be
destroyed by the mechanism after a running thread sleeps even if percpu_putref
has not been called.

Input handlers of wqinput normally involve sleepable operations so we must
avoid dereferencing a percpu data (struct wqinput_worklist) after executing
an input handler. Address this situation by having just a pointer to the data
in a percpu storage instead.
Reviewed by knakahara@ and yamaguchi@

-

Add missing #include <sys/kmem.h>

-

Divide Tx context of l2tp(4) to improve performance.

It seems l2tp(4) call path is too long for instruction cache. So, dividing
l2tp(4) Tx context improves CPU use efficiency.

After this commit, l2tp(4) throughput gains 10% on my machine(Atom C3000).

-

Apply some missing changes lost on the previous commit

-

Avoid having a rtcache directly in a percpu storage for tunnel protocols.
percpu(9) has a certain memory storage for each CPU and provides it by the piece
to users. If the storages went short, percpu(9) enlarges them by allocating new
larger memory areas, replacing old ones with them and destroying the old ones.

A percpu storage referenced by a pointer gotten via percpu_getref can be
destroyed by the mechanism after a running thread sleeps even if percpu_putref
has not been called.

Using rtcache, i.e., packet processing, typically involves sleepable operations
such as rwlock so we must avoid dereferencing a rtcache that is directly stored
in a percpu storage during packet processing. Address this situation by having
just a pointer to a rtcache in a percpu storage instead.

Reviewed by ozaki-r@ and yamaguchi@

-

l2tp(4): avoid having struct ifqueue directly in a percpu storage.
percpu(9) has a certain memory storage for each CPU and provides it by the piece
to users. If the storages went short, percpu(9) enlarges them by allocating new
larger memory areas, replacing old ones with them and destroying the old ones.

A percpu storage referenced by a pointer gotten via percpu_getref can be
destroyed by the mechanism after a running thread sleeps even if percpu_putref
has not been called.

Tx processing of l2tp(4) normally involves sleepable operations so we
must avoid dereferencing a percpu data (struct ifqueue) after executing Tx
processing. Address this situation by having just a pointer to the data in
a percpu storage instead.

Reviewed by ozaki-r@ and yamaguchi@
 1.389.2.1  17-Sep-2019  martin Pull up following revision(s) (requested by bouyer in ticket #208):

sys/netinet6/ip6_input.c: revision 1.209
sys/netinet/ip_input.c: revision 1.390

Packet filters can return an mbuf chain with fragmented headers, so
m_pullup() it if needed and remove the KASSERT()s.
 1.397.2.1  03-Apr-2021  thorpej Sync with HEAD.
 1.402.4.2  29-Jul-2025  martin Pull up following revision(s) (requested by ozaki-r in ticket #1140):

sys/netinet/ip_output.c: revision 1.330
sys/netinet/sctp_output.c: revision 1.39
sys/netinet/ip_mroute.c: revision 1.166
sys/netipsec/ipsecif.c: revision 1.24
sys/netipsec/xform_ipip.c: revision 1.80
sys/netinet/ip_output.c: revision 1.327
sys/netinet/ip_output.c: revision 1.328
sys/netinet/ip_input.c: revision 1.406
sys/netinet/ip_output.c: revision 1.329
sys/netinet/in_var.h: revision 1.105

in: get rid of unused argument from ip_newid() and ip_newid_range()

in: take a reference of ifp on IP_ROUTETOIF
The ifp could be released after ia4_release(ia).

in: narrow the scope of ifa in ip_output (NFC)

sctp: follow the recent change of ip_newid()

in: avoid racy ifa_acquire(rt->rt_ifa) in ip_output()
If a rtentry is being destroyed asynchronously, the ifa referenced by rt_ifa
can be destructed and taking ifa_acquire(rt->rt_ifa) aborts with a
KASSERT failure. Fortunately, the ifa is not actually freed because of
the reference by rt_ifa, so it remains available (except for some functions
like psref) as long as the rtentry is held.
PR kern/59527

in: avoid racy ia4_acquire(ifatoia(rt->rt_ifa) in ip_rtaddr()
Same as the case of ip_output(), it's racy and should be avoided.
PR kern/59527
 1.402.4.1  14-Jul-2025  martin Pull up following revision(s) (requested by ozaki-r in ticket #1137):

sys/netinet/ip_input.c: revision 1.405

in: avoid packet looping on incoming packets destining to an initializing
address

The initialization of an IPv4 address is done by adding a connected route and
a local route (if necessary), and then publishing itself by adding it to the
global list (and the global hashtable). Thus, there can exist a route with an
address that is not published. This inconsistent state allows an incoming
packet destining to one of a host address which is not published but has a
local route to be forwarded and routed to a loopback interface. This results
in forwarding the packet back to ip_input, that is, packet looping.

To avoid the situation, prohibit packets being forwarded via a local route.

This is a workaround for "IPv4 address initialization atomicity" in
doc/TODO.smpnet.
 1.403.2.1  02-Aug-2025  perseant Sync with HEAD