History log of /src/sys/netinet/ip_icmp.c
Revision  Date  Author  Comments
 1.180  05-Jun-2025  ozaki-r Apply if_first_addr() and if_first_addr_psref()
 1.179  22-Feb-2025  mlelstv Use canonical M_GETHDR macro. NFCI.
 1.178  29-Aug-2022  knakahara Add a sysctl entry to control whether a routing message is sent for RTM_DYNAMIC.

Some routing daemons require such routing messages to stay coherent.

To let the kernel send such messages, set net.inet.icmp.dynamic_rt_msg=1
for IPv4 and net.inet6.icmp6.dynamic_rt_msg=1 for IPv6.
The default (0) keeps the previous behavior: no such routing message is sent.
 1.177  22-Dec-2018  maxv Replace M_ALIGN and MH_ALIGN by m_align.
 1.176  15-Nov-2018  maxv Remove the 't' argument from m_tag_find().
 1.175  15-Nov-2018  maxv Simplify the mtag API:

- Remove m_tag_init(), m_tag_first(), m_tag_next() and
m_tag_delete_nonpersistent().

- Remove the 't' argument from m_tag_delete_chain().
 1.174  14-Sep-2018  maxv Use non-variadic function pointer in protosw::pr_input.
 1.173  03-Sep-2018  riastradh Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
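
As an illustration of the silent truncation this rename guards against, here is a minimal sketch. It assumes a platform where unsigned int is 32 bits; uimin_sketch, min64_sketch and truncation_changes_result are hypothetical names made up for this example and are not the libkern definitions.

```c
#include <stdint.h>

/* Same shape as the helper described above: defined on unsigned int only. */
static unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Hypothetical width-preserving variant, used only for comparison. */
static uint64_t
min64_sketch(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Returns 1 if passing 64-bit values through the unsigned-int helper
 * changes the result.  The casts make explicit the narrowing that the
 * compiler would otherwise perform implicitly at the call site.
 */
static int
truncation_changes_result(uint64_t a, uint64_t b)
{
	uint64_t narrow = uimin_sketch((unsigned int)a, (unsigned int)b);

	return narrow != min64_sketch(a, b);
}
```

With 32-bit unsigned int, the pair (0x100000001, 2) is affected: the first argument narrows to 1, so the narrow helper picks 1 while the true minimum is 2.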
 1.172  21-Jun-2018  knakahara branches: 1.172.2;
sbappendaddr() requires a lock to be held. Currently, softnet_lock is appropriate.

When rip_input() is called as inetsw[].pr_input, rip_input() is always called
with softnet_lock held: if NET_MPSAFE is not defined, it is
acquired in ipintr(); otherwise (NET_MPSAFE defined) it is acquired in the
PR_WRAP_INPUT macro.
However, some functions call rip_input() directly without holding softnet_lock.
That causes an assertion failure in sbappendaddr().
rip6_input() and icmp6_rip6_input() also require softnet_lock for the same
reason.
 1.171  01-Jun-2018  ozaki-r Fix _rt_free via rtrequest(RTM_DELETE) hangs in rt_timer handlers

A rt_timer handler is passed a rtentry with an extra reference that prevents the
rtentry from being accidentally released. So rt_timer handlers must release the
reference of a passed rtentry by themselves (but they didn't).
 1.170  11-May-2018  maxv Retire ICMPPRINTFS, it's annoying and it doesn't build.
 1.169  26-Apr-2018  maxv Use M_UNWRITABLE, no functional change.
 1.168  08-Feb-2018  maxv branches: 1.168.2;
Fix a possible buffer overflow in the IPv4 _ctlinput functions.

In _icmp_input we are guaranteeing that the ICMP_ADVLENMIN-byte area
starting from 'icp' is contiguous.

ICMP_ADVLENMIN = 8 + sizeof(struct ip) + 8 = 36

But the _ctlinput functions (eg udp_ctlinput) expect the area to be
larger. These functions read at:

(uint8_t *)icp + 8 + (icp->icmp_ip.ip_hl << 2)

which can be crafted to be:

(uint8_t *)icp + 68

So we end up reading 'icp+68' while the valid area ended at 'icp+36'.

Having said that, it seems pretty complicated to trigger this bug; it
would have to be a fragmented packet with half of the ICMP header in the
first fragment, and we would need to have a driver that did not allocate
a cluster for the first mbuf of the chain.

The check of icmplen against ICMP_ADVLEN(icp) was not sufficient: while it
did guarantee that the ICMP header fit the chain, it did not guarantee
that it fit 'm'.

Fix this bug by pulling up to hlen+ICMP_ADVLEN(icp). No need to log an
error. Rebase the pointers afterwards.
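
The offsets quoted above can be checked with a small sketch. It only reproduces the arithmetic from the commit message; all sk_* names are hypothetical, and the 8-byte read size at the computed offset is an assumption made for illustration, not taken from the kernel sources.

```c
#include <stddef.h>

/* Sizes taken from the commit message above. */
#define SK_ICMP_FIXED_LEN   8	/* fixed part of the ICMP header */
#define SK_IP_MIN_HDR_LEN  20	/* sizeof(struct ip), no options */
#define SK_QUOTED_DATA_LEN  8	/* assumed size of the read at the offset */

/* 8 + sizeof(struct ip) + 8 = 36: the area guaranteed to be contiguous. */
static size_t
sk_advlenmin(void)
{
	return SK_ICMP_FIXED_LEN + SK_IP_MIN_HDR_LEN + SK_QUOTED_DATA_LEN;
}

/* Offset from 'icp' at which a _ctlinput function starts reading. */
static size_t
sk_ctlinput_offset(unsigned int ip_hl)
{
	/* ip_hl is a 4-bit field counting 32-bit words. */
	return SK_ICMP_FIXED_LEN + ((size_t)(ip_hl & 0xf) << 2);
}

/* Length that must be contiguous for that read to stay in bounds. */
static size_t
sk_advlen(unsigned int ip_hl)
{
	return sk_ctlinput_offset(ip_hl) + SK_QUOTED_DATA_LEN;
}

/* Returns 1 if the read goes past the 36-byte guarantee. */
static int
sk_read_past_guarantee(unsigned int ip_hl)
{
	return sk_advlen(ip_hl) > sk_advlenmin();
}
```

For an option-free header (ip_hl = 5) the read ends exactly at byte 36 and stays in bounds; with the maximum ip_hl = 15 it starts at byte 68, which is why the pull-up length has to depend on the quoted header length.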
 1.167  05-Feb-2018  maxv Declare icmperrppslim in ip_icmp.c, it shouldn't be used elsewhere.
 1.166  23-Jan-2018  maxv Don't use global variables, that's obviously incorrect on MP systems.
One remains, because it is imported in tcp_timer.c, and I'm not totally
sure of how it interacts with icmp_mtudisc().
 1.165  23-Jan-2018  maxv Style, localify icmp_send, and add a clear KASSERT (that replaces a vague
comment).
 1.164  22-Jan-2018  maxv Adapt previous, reintroduce MH_ALIGN. It's used as an optimization - we
can later prepend something to the current mbuf without having to allocate
a new mbuf.
 1.163  19-Jan-2018  maxv Fix a buffer overflow in icmp_error. We create in 'm' a packet that must
contain:

IPv4 header | Fixed part of ICMP header | Variable part of ICMP header

But we perform length checks on 'totlen', which does not count the IPv4
header.

So now, add sizeof(struct ip) in totlen, and stop doing this m_data
nonsense, just get the pointers as usual.
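
A minimal sketch of why the missing IPv4 header matters for the length check. The names and the capacity parameter are hypothetical; the real code checks against mbuf sizes that are not reproduced here.

```c
#include <stddef.h>

#define SK_IPV4_HDR_LEN   20	/* sizeof(struct ip), no options */
#define SK_ICMP_FIXED_LEN  8	/* fixed part of the ICMP header */

/* Length as checked before the fix: the IPv4 header is not counted. */
static size_t
sk_totlen_without_ip(size_t datalen)
{
	return SK_ICMP_FIXED_LEN + datalen;
}

/* Length as checked after the fix: the whole packet built in 'm'. */
static size_t
sk_totlen_with_ip(size_t datalen)
{
	return SK_IPV4_HDR_LEN + SK_ICMP_FIXED_LEN + datalen;
}

/* Returns 1 if a packet of 'totlen' bytes fits in 'capacity' bytes. */
static int
sk_fits(size_t totlen, size_t capacity)
{
	return totlen <= capacity;
}
```

With a hypothetical 100-byte buffer and 80 bytes of variable data, the old computation (88) passes the check while the real packet (108) does not fit.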
 1.162  19-Jan-2018  maxv Clarify icmp_error:

* Rename (and constify) oiplen -> oiphlen.

* Rename icmplen -> datalen, it's the size of the variable part of
the ICMP header, not the total size of the ICMP header itself.

* Introduce totlen, this is the total size of the ICMP header (icmp_ip
included).

No real functional change.
 1.161  31-Mar-2017  ozaki-r branches: 1.161.6;
Don't use a single global variable to store source route information for multiple incoming packets

It's not MP-safe. So use a m_tag to store the information instead.

Pointed out by knakahara@
The fix is from OpenBSD (originally fixed in FreeBSD)
 1.160  06-Mar-2017  ozaki-r Make sure icmp_redirect_timeout_q and ip_mtudisc_timeout_q are initialized on bootup

Fix PR kern/52029
 1.159  17-Feb-2017  ozaki-r Protect sysctl_net_inet_ip_pmtudto with icmp_mtx instead of softnet_lock
 1.158  13-Feb-2017  ozaki-r Protect mtudisc and redirect stuffs of icmp/icmp6 with mutex

We have to run pr_init of icmp and icmp6 prior to tcp and tcp6 ones
for mutex initialization.
 1.157  07-Feb-2017  ozaki-r Add missing NULL checks for m_get_rcvif
 1.156  02-Feb-2017  ozaki-r Defer some pr_input to workqueue

pr_input is currently called in softint. Some pr_input such as ICMP, ICMPv6
and CARP can add/delete/update IP addresses and routing table entries. For
example, icmp6_redirect_input updates a routing table entry and
nd6_ra_input may delete an IP address.

Basically such operations shouldn't be done in softint. That aside, we have
a reason to avoid the situation; psz/psref waits cannot be used in softint,
however they are required to work in such pr_input in the MP-safe world.

The change implements the workqueue pr_input framework called wqinput which
provides a means to defer pr_input of a protocol to workqueue easily.
Currently icmp_input, icmp6_input, carp_proto_input and carp6_proto_input
are deferred to workqueue by the framework.

Proposed and discussed on tech-kern and tech-net
 1.155  24-Jan-2017  ozaki-r Tweak softnet_lock and NET_MPSAFE

- Don't hold softnet_lock in some functions if NET_MPSAFE
- Add softnet_lock to sysctl_net_inet_icmp_redirtimeout
- Add softnet_lock to expire_upcalls of ip_mroute.c
- Restore softnet_lock for in{,6}_pcbpurgeif{,0} if NET_MPSAFE
- Mark some softnet_lock for future work
 1.154  12-Dec-2016  ozaki-r branches: 1.154.2;
Make the routing table and rtcaches MP-safe

See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview
 1.153  25-Oct-2016  ozaki-r Remove unnecessary argument

No functional change.
 1.152  19-Oct-2016  ozaki-r Set ia to ensure to call ia4_release
 1.151  01-Aug-2016  ozaki-r Apply pserialize and psref to struct ifaddr and its variants

This change makes struct ifaddr and its variants (in_ifaddr and in6_ifaddr)
MP-safe by using pserialize and psref. At this moment, pserialize_perform
and psref_target_destroy are disabled because (1) we don't need them
because of softnet_lock (2) they cause a deadlock because of softnet_lock.
So we'll enable them when we remove softnet_lock in the future.
 1.150  08-Jul-2016  ozaki-r branches: 1.150.2;
Replace macros to get an IP address with proper inline functions

The inline functions are friendlier to psz/psref;
they consist only of simple iterations.
 1.149  07-Jul-2016  ozaki-r Switch the address list of interfaces to pslist(9)

As usual, we leave the old list to avoid breaking kvm(3) users.
 1.148  06-Jul-2016  ozaki-r Switch the IPv4 address list to pslist(9)

Note that we leave the old list just in case; it seems there are some
kvm(3) users accessing the list. We can remove it later if we confirmed
nobody does actually.
 1.147  10-Jun-2016  ozaki-r Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored in the mbuf instead of a
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, which is NOT MP-safe and exists for the
transition period, i.e., it is intended for places that are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.
 1.146  10-Jun-2016  ozaki-r Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receiving interface of a mbuf.
These functions are the counterparts of m_get_rcvif, which will come in
another commit; they hide the internals of rcvif handling and reduce the
diff of the upcoming change.

No functional change.
 1.145  01-Apr-2016  ozaki-r Remove unnecessary casts and do s/0/NULL/ for rtrequest
 1.144  21-Jan-2016  riastradh Revert previous: ran cvs commit when I meant cvs diff. Sorry!

Hit up-arrow one too few times.
 1.143  21-Jan-2016  riastradh Give proper prototype to ip_output.
 1.142  31-Aug-2015  ozaki-r Make rt_refcnt take into account rt_timer
 1.141  24-Aug-2015  pooka sprinkle _KERNEL_OPT
 1.140  09-May-2015  christos if no address was found, don't check if it is tentative (hi Roy)
 1.139  09-May-2015  christos assign sin only when it is needed
 1.138  02-May-2015  roy Add IPv4 address flags IN_IFF_TENTATIVE, IN_IFF_DUPLICATED and
IN_IFF_DETACHED to mimic the IPv6 address behaviour.
Add SIOCGIFAFLAG_IN ioctl to retrieve the address flag via the
ifreq structure.
Add IPv4 DAD detection via the ARP methods described in RFC 5227.
Add sysctls net.inet.ip.dad_count and net.inet.arp.debug.

Discussed on tech-net@
 1.137  24-Apr-2015  ozaki-r Use KASSERT instead of if & panic

rt can be NULL only on a programming error (and we are sure it cannot be for now),
so we can use KASSERT here (i.e., check only if DIAGNOSTIC).
 1.136  24-Apr-2015  ozaki-r Replace 0 with NULL for pointer variables
 1.135  02-Dec-2014  christos use the new printing code.
 1.134  30-May-2014  christos branches: 1.134.4;
Introduce 2 new variables: ipsec_enabled and ipsec_used.
ipsec_enabled is controlled by sysctl and determines whether IPsec is allowed.
ipsec_used is set automatically based on IPsec being enabled and
rules existing.
 1.133  19-May-2014  rmind - Split off PRU_ATTACH and PRU_DETACH logic into separate functions.
- Replace malloc with kmem and eliminate M_PCB while here.
- Sprinkle more asserts.
 1.132  25-Feb-2014  pooka branches: 1.132.2;
Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
 1.131  05-Jun-2013  christos branches: 1.131.2;
IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.
 1.130  25-Mar-2013  christos PR/47693: Erik E. Fair: Add missing code to icmp handling.
- While there, add the rest of the missing codes
- Merge groups
- Fix indentation
 1.129  22-Mar-2012  drochner branches: 1.129.2;
remove KAME IPSEC, replaced by FAST_IPSEC
 1.128  09-Jan-2012  liamjfoy branches: 1.128.2;
check against NULL
 1.127  31-Dec-2011  christos - fix offsetof usage, and redundant defines
- kill pointer casts to 0
 1.126  19-Dec-2011  drochner rename the IPSEC in-kernel CPP variable and config(8) option to
KAME_IPSEC, and make IPSEC define it so that existing kernel
config files work as before
Now the default can easily be changed to FAST_IPSEC just by
setting the IPSEC alias to FAST_IPSEC.
 1.125  17-Jul-2011  joerg branches: 1.125.2; 1.125.6;
Retire varargs.h support. Move machine/stdarg.h logic into MI
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.
 1.124  02-Jul-2010  kefren manually adjust m_data and m_len so it can later be prepended with a
struct ip in case that a cluster is used. icmp len panic is not valid for
cluster case.

Fixes PR/43548
 1.123  26-Jun-2010  kefren Add MPLS support, proposed on tech-net@ a couple of days ago

Welcome to 5.99.33
 1.122  07-Dec-2009  christos branches: 1.122.2; 1.122.4;
PR/42243: Yasuoka Masahiko: Add "net.inet.icmp.bmcastecho" sysctl support,
to disable icmp replies to the broadcast address.
 1.121  16-Sep-2009  pooka Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL
 1.120  18-Jun-2008  yamt branches: 1.120.6;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@
 1.119  04-May-2008  thorpej branches: 1.119.2; 1.119.4;
Simplify the interface to netstat_sysctl() and allocate space for
the collated counters using kmem_alloc().

PR kern/38577
 1.118  28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.117  23-Apr-2008  thorpej branches: 1.117.2;
Use <net/net_stats.h> / netstat_sysctl().
 1.116  12-Apr-2008  thorpej branches: 1.116.2;
Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.
 1.115  06-Apr-2008  thorpej Change ICMP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old icmpstat structure; old netstat
binaries will continue to work properly.
 1.114  09-Nov-2007  dyoung branches: 1.114.14;
Use sockaddr_in_init(). KNF. No functional change intended.
 1.113  27-Aug-2007  dyoung branches: 1.113.2; 1.113.6; 1.113.8;
Cosmetic: 0 -> NULL. Remove unnecessary cast.
 1.112  19-Jul-2007  dyoung branches: 1.112.4; 1.112.6;
Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

Introduce rt_walktree() for walking the routing table and
applying a function to each rtentry. Replace most
rn_walktree() calls with it.

Use rt_getkey()/rt_setkey() to get/set a route's destination.
Keep a pointer to the sockaddr key in the rtentry, so that
rtentry users do not have to grovel in the radix_node for
the key.

Add a RTM_GET method to rtrequest. Use that instead of
radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

Constify. KNF. Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
et cetera. Use NULL instead of 0 for null pointers. Use
__arraycount(). Reduce gratuitous parenthesization.

Stop using variadic arguments for rip6_output(), it is
unnecessary.

Remove the unnecessary rtentry member rt_genmask and the
code to maintain it, since nothing actually used it.

Make rt_maskedcopy() easier to read by using meaningful variable
names.

Extract a subroutine intern_netmask() for looking up a netmask in
the masks table.

Start converting backslash-ridden IPv6 macros in
sys/netinet6/in6_var.h into inline subroutines that one
can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
 1.111  04-Mar-2007  christos branches: 1.111.2; 1.111.10;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.110  22-Feb-2007  thorpej TRUE -> true, FALSE -> false
 1.109  17-Feb-2007  dyoung KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.
 1.108  29-Jan-2007  dyoung branches: 1.108.2;
bzero -> memset
 1.107  15-Dec-2006  joerg Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cached and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either valid again
or NULL.
rtcache_copy is used internally.

Adjust the callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is semantically equivalent though.
etherip doesn't need sc_route_expire, similar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.
 1.106  09-Dec-2006  dyoung Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route). Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL. Provide
in_rtcache() for adding a route to the chain. Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches. In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain. In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.
 1.105  16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.104  12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.103  30-Aug-2006  christos branches: 1.103.2; 1.103.4;
fix initializers
 1.102  28-Aug-2006  yamt icmp_input: don't assume relations between PRC_ and ICMP_ values.
 1.101  10-Jul-2006  peter Wrap long lines, unwrap a short line.
 1.100  10-Jul-2006  peter Move the PF_GENERATED m_tag to the new packet in icmp_error.
This is needed because the pf code can call icmp_error with
this tag set, but the new packet should not be filtered when it comes back
to pf(4).

ok christos@
 1.99  29-Mar-2006  dyoung branches: 1.99.4;
When reflecting an ICMP Echo, do not scribble over read-only/shared
mbuf storage.
 1.98  22-Mar-2006  matt An MTU can't be negative so store them in unsigned variables.
 1.97  10-Nov-2005  christos branches: 1.97.6; 1.97.8; 1.97.10; 1.97.12; 1.97.14;
Remove redundant assignment (from Liam Foy)
 1.96  23-Oct-2005  christos No need to pass an interface when only the mtu is needed. From OpenBSD via
Liam Foy.
 1.95  19-Aug-2005  christos branches: 1.95.2;
make ICMPPRINTFS work; from Liam Foy.
 1.94  05-Aug-2005  elad Add sysctls for IP, ICMP, TCP, and UDP statistics.
 1.93  19-Jul-2005  christos Implement PMTU checks from:

http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html

1. Don't act on ICMP-need-frag immediately if adhoc checks on the
advertised MTU fail. The MTU update is delayed until a TCP retransmit
happens.
2. Ignore ICMP Source Quench messages meant for TCP connections.

From OpenBSD.
 1.92  29-Apr-2005  yamt branches: 1.92.2;
move decl of inetsw to its own header to avoid array of incomplete type.
found by gcc4. reported by Adam Ciarcinski.
 1.91  26-Feb-2005  perry nuke trailing whitespace
 1.90  03-Feb-2005  perry ANSIfy function declarations
 1.89  02-Feb-2005  perry de-__P -- will ANSIfy .c files later.
 1.88  24-Jan-2005  matt branches: 1.88.2;
Add IFNET_FOREACH and IFADDR_FOREACH macros and start using them.
 1.87  03-Aug-2004  cube branches: 1.87.4;
Remove a common (icmpstat).
 1.86  25-Jun-2004  itojun icmp_reflect: check if m_pkthdr.rcvif is non-NULL before touching it.
icmp_reflect could be called from the output path, so m_pkthdr.rcvif may not
be set. (found by panic when PF is configured "block return all")
 1.85  25-Jun-2004  itojun be careful touching m_pkthdr.rcvif, it could be NULL if the packet was
generated from local node and icmp_error calls icmp_reflect.
 1.84  25-May-2004  atatat Sysctl descriptions under net subtree (net.key not done)
 1.83  26-Apr-2004  matt Remove #else clause of __STDC__
 1.82  24-Mar-2004  atatat branches: 1.82.2;
Tango on sysctl_createv() and flags. The flags have all been renamed,
and sysctl_createv() now uses more arguments.
 1.81  04-Dec-2003  atatat Dynamic sysctl.

Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.

Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded. Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.

All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.

PS - I'm sorry, but there's a distinct lack of documentation at the
moment. I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
 1.80  13-Nov-2003  jonathan Add m_tag_delete_nonpersistent(), for deleting all packet tags on
mbuf chains which are recycled (e.g., ICMP reflection, loopback
interface). A consensus was reached that such recycled packets should
behave (more-or-less) the same way if a new chain had been allocated
and the contents copied to that chain.

Some packet tags may in future be marked as "persistent" (e.g., for
mandatory access controls) and should persist across such deletion.
NetBSD as yet has no persistent tags, so m_tag_delete_nonpersistent()
just deletes all tags. This should not be relied upon.
 1.79  11-Nov-2003  jonathan Change global head-of-local-IP-address list from in_ifaddr to
in_ifaddrhead. Recent changes in struct names caused a namespace
collision in fast-ipsec, which are most cleanly fixed by using
"in_ifaddrhead" as the listhead name.
 1.78  22-Aug-2003  itojun remove ipsec_set/getsocket. now we explicitly pass socket * to ip{,6}_output.
 1.77  22-Aug-2003  itojun change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.
 1.76  15-Aug-2003  jonathan (fast-ipsec): Add hooks to pass IPv4 IPsec traffic into fast-ipsec, if
configured with ``options FAST_IPSEC''. Kernels with KAME IPsec or
with no IPsec should work as before.

All calls to ip_output() now always pass an additional compulsory
argument: the inpcb associated with the packet being sent,
or 0 if no inpcb is available.

Fast-ipsec tested with ICMP or UDP over ESP. TCP doesn't work, yet.
 1.75  07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.74  26-Jun-2003  itojun branches: 1.74.2;
fix comment
 1.73  17-Apr-2003  tron Clear hardware checksum flags before reusing a mbuf for an ICMP reply as
suggested by Enami Tsugutomo. This fixes PR kern/21203 by myself.
 1.72  26-Feb-2003  matt Add MBUFTRACE kernel option.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
 1.71  23-Sep-2002  simonb Remove breaks after returns, unreachable returns and returns after
returns(!).
 1.70  14-Aug-2002  itojun avoid swapping endian of ip_len and ip_off on mbuf, to meet with M_LEADINGSPACE
optimization made last year. should solve PR 17867 and 10195.

IP_HDRINCL behavior of raw ip socket is kept unchanged. we may want to
provide IP_HDRINCL variant that does not swap endian.
 1.69  30-Jun-2002  thorpej Changes to allow the IPv4 and IPv6 layers to align headers themselves,
as necessary:
* Implement a new mbuf utility routine, m_copyup(), which is like
m_pullup(), except that it always prepends and copies, rather
than only doing so if the desired length is larger than m->m_len.
m_copyup() also allows an offset into the destination mbuf, which
allows space for packet headers, in the forwarding case.
* Add *_HDR_ALIGNED_P() macros for IP, IPv6, ICMP, and IGMP. These
macros expand to 1 if __NO_STRICT_ALIGNMENT is defined, so that
architectures which do not have strict alignment constraints don't
pay for the test or visit the new align-if-needed path.
* Use the new macros to check if a header needs to be aligned, or to
assert that it already is, as appropriate.

Note: This code is still somewhat experimental. However, the new
code path won't be visited if individual device drivers continue
to guarantee that packets are delivered to layer 3 already properly
aligned (which are rules that are already in use).
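
A rough sketch of what an alignment predicate of this kind can look like. The function name, the SK_NO_STRICT_ALIGNMENT switch and the 4-byte alignment requirement are assumptions made for this example; the actual *_HDR_ALIGNED_P() macros are not reproduced here.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a *_HDR_ALIGNED_P()-style check.
 * If SK_NO_STRICT_ALIGNMENT is defined (standing in for the
 * __NO_STRICT_ALIGNMENT case described above), the check is
 * constant and costs nothing at run time.
 */
static int
sk_hdr_aligned_p(const void *hdr)
{
#ifdef SK_NO_STRICT_ALIGNMENT
	(void)hdr;
	return 1;
#else
	/* Assumption: headers need 32-bit (4-byte) alignment. */
	return ((uintptr_t)hdr & 3) == 0;
#endif
}
```

A caller would test the header pointer with this predicate and take the copy-and-align path only when it returns 0, so platforms without strict alignment never visit that path.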
 1.68  13-Jun-2002  itojun set IPv4 parameter to modern value.
- turn on path MTU discovery (previous: turned off)
- ICMPv4 redirect entry timeout = 600 sec (previous: never timeout)
 1.67  09-Jun-2002  itojun whitespace
 1.66  13-Nov-2001  lukem branches: 1.66.8; 1.66.10;
add RCSIDs
 1.65  04-Nov-2001  matt Convert netinet to not use the internal <sys/queue.h> field names
but instead the access macros. Use the FOREACH macros where appropriate.
 1.64  04-Nov-2001  matt Keep only one mtu_table (the two were identical except for
one value - 65280).
 1.63  30-Oct-2001  kml Add in support for timing out IPv4 routes added due to redirects,
as discussed in tech-net several weeks ago. It turned out that
KAME had already added this functionality to the IPv6 stack, so
I followed their example in adding the sysctl variables
net.inet.icmp.rediraccept and net.inet.icmp.redirtimeout.
 1.62  29-Oct-2001  simonb Don't need to include <uvm/uvm_extern.h> just to include <sys/sysctl.h>
anymore.
 1.61  20-Oct-2001  matt branches: 1.61.2;
Make the two MTU tables const and change their type to u_int (one was int
and one was u_long!).
 1.60  08-Mar-2001  itojun branches: 1.60.2;
Remove a bogus rtfree(); OpenBSD PR 1706.
 1.59  01-Mar-2001  itojun branches: 1.59.2;
make sure to enforce inbound ipsec policy checking, for any protocols on top
of ip (check it when final header is visited). sync with kame.
XXX kame team will need to re-check policy engine code
 1.58  24-Jan-2001  itojun - record IPsec packet history into m_aux structure.
- let ipfilter look at wire-format packet only (not the decapsulated ones),
so that VPN setting can work with NAT/ipfilter settings.
sync with kame.

TODO: use header history for stricter inbound validation
 1.57  18-Oct-2000  itojun s/mtudisc_callback/icmp_&/ so that we don't feel conflict between IPv4 and
IPv6 counterpart. (or icmp4_&?)
 1.56  18-Oct-2000  itojun count successful path MTU changes. good for debugging.
(there could be some discussion on when to increase the counter...)
 1.55  18-Oct-2000  thorpej Restructure the Path MTU Discovery code somewhat to avoid
entering rtentry's for hosts we're not actually communicating
with.

Do this by invoking the ctlinput for the protocol, which is
responsible for validating the ICMP message:
* TCP -- Lookup the connection based on the address/port
pairs in the ICMP message.
* AH/ESP -- Lookup the SA based on the SPI in the ICMP message.

If validation succeeds, ctlinput is responsible for calling
icmp_mtudisc(). icmp_mtudisc() then invokes callbacks registered
by protocols (such as TCP) which want to take some sort of special
action when a path's MTU changes. For TCP, this is where we now
refresh cached routes and re-enter slow-start.

As a side-effect, this fixes the problem where TCP would not be
notified when a path's MTU changed if AH/ESP were being used.

XXX Note, this is only a fix for the IPv4 case. For the IPv6
XXX case, we need to wait for the KAME folks.

Reviewed by sommerfeld@netbsd.org and itojun@netbsd.org.
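
The callback step described in revision 1.55 can be sketched roughly as follows. This is a minimal user-space illustration, not the kernel code: the names mtudisc_callback_register(), mtudisc_notify(), example_tcp_mtudisc() and the fixed-size table are hypothetical and chosen only for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_MTUDISC_CALLBACKS 8	/* arbitrary bound for this sketch */

/* Hypothetical callback type: called with the destination whose path MTU changed. */
typedef void (*mtudisc_cb_t)(uint32_t dst_addr);

static mtudisc_cb_t callbacks[MAX_MTUDISC_CALLBACKS];
static size_t ncallbacks;

/* Register a protocol callback; returns 0 on success, -1 on failure. */
int
mtudisc_callback_register(mtudisc_cb_t cb)
{
	size_t i;

	if (cb == NULL)
		return -1;
	for (i = 0; i < ncallbacks; i++)
		if (callbacks[i] == cb)
			return 0;	/* already registered */
	if (ncallbacks == MAX_MTUDISC_CALLBACKS)
		return -1;
	callbacks[ncallbacks++] = cb;
	return 0;
}

/*
 * Called only after the protocol's ctlinput has validated the ICMP
 * message. Invokes every registered callback and returns how many ran.
 */
size_t
mtudisc_notify(uint32_t dst_addr)
{
	size_t i;

	for (i = 0; i < ncallbacks; i++)
		callbacks[i](dst_addr);
	return ncallbacks;
}

/* Example callback standing in for TCP's "refresh routes, re-enter slow-start". */
unsigned int example_notifications;

void
example_tcp_mtudisc(uint32_t dst_addr)
{
	(void)dst_addr;
	example_notifications++;
}
```

The point of the split is that validation (looking up the TCP connection or the IPsec SA) happens before any route state is touched; the callbacks only run for paths the host is actually using.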
 1.54  28-Jul-2000  itojun nuke the following sysctl variables. "ppsratelimit" should work better.
need to recompile sbin/sysctl after updating /usr/include.
net.inet.tcp.rstratelimit
net.inet.icmp.errratelimit
net.inet6.icmp6.errratelimit
 1.53  27-Jul-2000  itojun do not disable icmp error rate limitation for local address.
local address can be abused too. pps rate limitation should work fine for
moderate amount of icmp errors.
 1.52  24-Jul-2000  sommerfeld Improve robustness of icmp_error():
- allow it to work when icmpreturndatabytes is sufficiently large that the
icmp error message doesn't fit in a header mbuf.
- defend against mbuf chains shorter than their contained ip->ip_len.
 1.51  10-Jul-2000  itojun implement net.inet.icmp.errppslimit.
make the default value of net.inet.icmp.errratelimit 0, as a value < 10ms
does not do the right thing.
 1.50  06-Jul-2000  itojun remove unnecessary #include <netkey/key_debug.h>. from kame.
 1.49  01-Jul-2000  sommerfeld Don't rate-limit ICMP errors from packets we send to ourselves.
The dns resolver depends on reliably receiving errors to allow it to
quickly detect a dead local nameserver.
 1.48  28-Jun-2000  mrg <vm/vm.h> -> <uvm/uvm_extern.h>
 1.47  10-Jun-2000  darrenr branches: 1.47.2;
add icmpreturndatabytes kernel variable (default 8) which specifies the
number of extra data bytes to return in ICMP error messages. This is
also available via sysctl as net.icmp.returndatabytes and is limited to
[8,512].
 1.46  22-May-2000  itojun branches: 1.46.2;
disallow negative numbers for ratelimit interval (tcp, icmp, icmp6).
 1.45  10-May-2000  itojun add missing boundary checks to ip options processing.
correct timestamp option validation (len and ptr upper/lower bound
based on RFC791).
fill "pointer" field for parameter problem in timestamp option processing.
 1.44  30-Mar-2000  augustss Remove register declarations.
 1.43  01-Mar-2000  itojun introduce m->m_pkthdr.aux to hold random data which needs to be passed
between protocol handlers.

ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are in my mind - there can be several other uses.
at this moment, we use this for ipsec socket pointer passing. this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.

due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.

take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).

this will bump kernel version.

(as discussed in tech-net, tested in kame tree)
 1.42  24-Feb-2000  itojun don't transmit ICMPv4 packet back, if the original packet was encrypted.
 1.41  17-Feb-2000  darrenr Change the use of pfil hooks. There is no longer a single list of all
pfil information, instead, struct protosw now contains a structure
which contains list heads, etc. The per-protosw pfil struct is passed
to pfil_hook_get(), along with an in/out flag to get the head of the
relevant filter list. This has been done for only IPv4 and IPv6, at
present, with these patches only enabling filtering for IPPROTO_IP and
IPPROTO_IPV6, although it is possible to have tcp/udp, etc, dedicated
filters now also. The ipfilter code has been updated to only filter
IPv4 packets - next major release of ipfilter is required for ipv6.
 1.40  15-Feb-2000  thorpej Add ICMP error rate limiting, based on the same for ICMP6.

Note, we're reusing the previously unused slot for "MTU discovery" (which
was moved to the "net.inet.ip" branch of the sysctl tree quite some time
ago).
 1.39  25-Jan-2000  sommerfeld Pick source address for ICMP errors a bit more intelligently when
there are multiple addresses on the interface.

From Marc Horowitz <marc@netbsd.org>, who left this sitting for too long.
 1.38  09-Jul-1999  thorpej branches: 1.38.2;
defopt IPSEC and IPSEC_ESP (both into opt_ipsec.h).
 1.37  01-Jul-1999  itojun IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.
 1.36  30-Mar-1999  mycroft branches: 1.36.4; 1.36.6;
Fix a null pointer dereference in the case where forwarding is turned on and
there are interfaces up but with no addresses.
 1.35  19-Jan-1999  mycroft There's just no plausible reason to byte-swap ip_id internally. It's opaque.
 1.34  19-Jan-1999  mycroft Don't screw with ip_len; just subtract from it where we actually use the
value.
 1.33  19-Jan-1999  mycroft Fix byte-swapping of ip_len in returned IP header.
 1.32  11-Jan-1999  thorpej Fix byte order and ip_len inconsistencies in ICMP reply code. Also, fix
some formatting and HTONS(foo) vs. foo = htons(foo) inconsistencies.

PR #6602, Darren Reed.
 1.31  19-Dec-1998  thorpej Reverse the copyright-notice-swap. It went against existing practice.
 1.30  30-Sep-1998  tls branches: 1.30.4;
Switch order of TNF and UCB copyrights so UCB copyright is first; this seems more appropriate since UCB wrote the original code, after all.
 1.29  29-Apr-1998  kml Add support for deletion of routes added by path MTU discovery;
uses new generic route timeout code. Add sysctl for timeout period.
 1.28  15-Feb-1998  tls Add correct copyright notice for IP address hash change. This code is donated to TNF by the original copyright holder, Panix.
 1.27  13-Feb-1998  tls Change list of interface IP addresses to a hash. Improves performance on hosts with a large number of IP addresses significantly.
 1.26  29-Oct-1997  kml Changes to path MTU discovery to correctly handle "needs
fragmentation" ICMP messages that specify a new MTU size of zero
(from, say, old buggy Linux kernels).
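
The log for revision 1.26 does not say how a zero next-hop MTU is handled. One plausible approach, assuming the fallback is the plateau table from RFC 1191, is sketched below. The helper names next_lower_plateau() and choose_path_mtu() are hypothetical, and the table holds the RFC 1191 values rather than the kernel's own mtu_table.

```c
#include <stddef.h>

/* Plateau values from RFC 1191, Table 7-1, largest first. */
static const unsigned int plateaus[] = {
	65535, 32000, 17914, 8166, 4352, 2002, 1492, 1006, 508, 296, 68
};

/* Hypothetical helper: largest plateau strictly below len, or 0 if none. */
unsigned int
next_lower_plateau(unsigned int len)
{
	size_t i;

	for (i = 0; i < sizeof(plateaus) / sizeof(plateaus[0]); i++)
		if (plateaus[i] < len)
			return plateaus[i];
	return 0;
}

/*
 * Hypothetical helper: pick a new path MTU. If the router reported a
 * next-hop MTU, use it; if it reported zero, fall back to the next
 * plateau below the length of the datagram that was too big.
 */
unsigned int
choose_path_mtu(unsigned int advertised, unsigned int orig_len)
{
	if (advertised != 0)
		return advertised;
	return next_lower_plateau(orig_len);
}
```

For instance, a 1500-byte datagram that draws a "needs fragmentation" message with an MTU field of zero would be retried at 1492 under this scheme.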
 1.25  18-Oct-1997  kml branches: 1.25.2;
change sysctl net.inet.icmp.mtudisc to net.inet.ip.mtudisc
 1.24  17-Oct-1997  kml Path MTU Discovery support. This is turned off by default.
Use sysctl -w net.inet.icmp.mtudisc=1 to turn on.
Still to come: path removal after some period, black hole detection
 1.23  24-Jun-1997  thorpej Increment icmpstat.icps_badlen for bad length of ICMP_MASKREQ, per
Stevens in TCP/IP Illustrated vol. 2, p.319. Submitted by
Koji Imada <koji@math.human.nagoya-u.ac.jp> in PR #3712.
 1.22  13-Oct-1996  christos backout previous kprintf changes
 1.21  10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.20  09-Sep-1996  mycroft Add in_nullhost() and in_hosteq() macros, to hide some protocol
details. Also, fix a bug in TCP wrt SYN+URG packets.
 1.19  13-Feb-1996  christos netinet prototypes
 1.18  12-Jun-1995  mycroft Various cleanup, including:
* Convert several data structures to use queue.h.
* Split in_pcbnotify() into two parts; one for notifying a specific PCB, and
one for notifying all PCBs for a particular foreign address.
 1.17  04-Jun-1995  mycroft Don't cast things unnecessarily.
 1.16  04-Jun-1995  mycroft Clean up many more casts.
 1.15  01-Jun-1995  mycroft Don't use INADDR_* constants in case labels.
 1.14  01-Jun-1995  mycroft Avoid byte-swapping IP addresses at run time.
 1.13  31-May-1995  mycroft Integrate multicast 3.5 distribution, with several bugs fixed and general
cleanup. This is a (working) snapshot of work in progress.
 1.12  15-May-1995  cgd KNF
 1.11  13-Apr-1995  cgd be a bit more careful and explicit with types. (basically a large no-op.)
 1.10  29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.9  13-May-1994  mycroft Update to 4.4-Lite networking code, with a few local changes.
 1.8  02-Feb-1994  hpeyerl Multicast is no longer optional.
 1.7  10-Jan-1994  mycroft Should compile now with or without `options MULTICAST'.
 1.6  08-Jan-1994  mycroft More prototypes.
 1.5  08-Jan-1994  mycroft Fix some inconsistent spacing; spaces at the end of lines, etc.
 1.4  18-Dec-1993  mycroft Canonicalize all #includes.
 1.3  06-Dec-1993  hpeyerl multicast support.
From Chris Maeda, cmaeda@cs.washington.edu
These patches are derived from the IP Multicast patches for BSDI.
 1.2  20-May-1993  cgd more rcsid additions and file header cleanups
 1.1  21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.2  05-Jan-1998  thorpej Import sys/netinet from 4.4BSD-Lite for reference purposes.
 1.1.1.1  21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.25.2.3  01-Oct-1998  cgd pull up revisions 1.27-1.28, 1.30 from trunk. (tls)
 1.25.2.2  09-May-1998  mycroft Pull up patch from kml.
 1.25.2.1  30-Oct-1997  mellon Pull rev 1.26 up from trunk
 1.30.4.1  11-Dec-1998  kenh The beginnings of interface detach support. Still some bugs, but mostly
works for me.

This work was originally by Bill Studenmund, and cleaned up by me.
 1.36.6.3  30-Nov-1999  itojun bring in latest KAME (as of 19991130, KAME/NetBSD141) into kame branch
just for reference purposes.
This commit includes 1.4 -> 1.4.1 sync for kame branch.

The branch does not compile at all (due to the lack of ALTQ and some other
source code). Please do not try to modify the branch, this is just for
reference purposes.

synchronization to latest KAME will take place on HEAD branch soon.
 1.36.6.2  06-Jul-1999  itojun KAME/NetBSD 1.4, SNAP kit 1999/07/05.
NOTE: this branch is just for reference purposes (i.e. for taking cvs diff).
do not touch anything on the branch. actual work must be done on HEAD branch.
 1.36.6.1  28-Jun-1999  itojun KAME/NetBSD 1.4 SNAP kit, dated 19990628.

NOTE: this branch (kame) is used just for reference. this may not compile
due to multiple reasons.
 1.36.4.2  02-Aug-1999  thorpej Update from trunk.
 1.36.4.1  01-Jul-1999  thorpej Sync w/ -current.
 1.38.2.3  12-Mar-2001  bouyer Sync with HEAD.
 1.38.2.2  11-Feb-2001  bouyer Sync with HEAD.
 1.38.2.1  20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.46.2.1  22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.47.2.5  06-Apr-2001  he Pull up revision 1.58 (requested by itojun):
Record IPsec packet history in m_aux structure. Let ipfilter
look at wire-format packet only (not the decapsulated ones), so
that VPN setting can work with NAT/ipfilter settings.
 1.47.2.4  11-Mar-2001  he Pull up revision 1.59 (requested by itojun):
Ensure that we enforce inbound IPsec policy on all IP protocols,
not just TCP, UDP and ICMP.
 1.47.2.3  16-Aug-2000  itojun pullup (approved by releng-1-5)

switch from net.inet*.*.*ratelimit to net.inet*.*.ppslimit.

(tags are a rough estimate - we had some trial and error on the main trunk)
sys/netinet/icmp6.h 1.9 -> 1.11
sys/netinet/icmp_var.h 1.15 -> 1.17
sys/netinet/in_proto.c 1.39 -> 1.42
sys/netinet/ip_icmp.c 1.50 -> 1.51, 1.52 -> 1.54
sys/netinet/tcp_input.c 1.111 -> 1.112, 1.115 -> 1.117
sys/netinet/tcp_usrreq.c 1.52 -> 1.53
sys/netinet/tcp_var.h 1.72 -> 1.75
sys/netinet6/icmp6.c 1.34 -> 1.35, 1.36 -> 1.38
sys/netinet6/in6_proto.c 1.17 -> 1.19
 1.47.2.2  28-Jul-2000  sommerfeld Pull up UDP, ICMP fixes:

- Drop packet, increment udps_badlen if the udp header length field
reports a size smaller than the udp header; defends against bogus
packets seen by Assar Westerlund.

- allow icmp_error() to work when icmpreturndatabytes is sufficiently
large that the icmp error message doesn't fit in a header mbuf.

- defend against mbuf chains shorter than their contained ip->ip_len.

Joint work of myself, itojun, and assar
Approved by thorpej

revisions pulled up:
sys/netinet/ip_icmp.c 1.52
sys/netinet/udp_usrreq.c 1.70
 1.47.2.1  02-Jul-2000  sommerfeld Pull up 1.49: don't rate-limit ICMP we send to ourselves.
 1.59.2.7  18-Oct-2002  nathanw Catch up to -current.
 1.59.2.6  27-Aug-2002  nathanw Catch up to -current.
 1.59.2.5  01-Aug-2002  nathanw Catch up to -current.
 1.59.2.4  20-Jun-2002  nathanw Catch up to -current.
 1.59.2.3  14-Nov-2001  nathanw Catch up to -current.
 1.59.2.2  22-Oct-2001  nathanw Catch up to -current.
 1.59.2.1  09-Apr-2001  nathanw Catch up with -current.
 1.60.2.4  10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.60.2.3  06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.60.2.2  23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.60.2.1  10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.61.2.1  12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.66.10.2  16-Jun-2003  grant Pull up revision 1.73 (requested by tron in ticket #1260):

Clear hardware checksum flags before reusing a mbuf for an ICMP reply as
suggested by Enami Tsugutomo. This fixes PR kern/21203 by myself.
 1.66.10.1  15-Jun-2002  lukem Pull up revision 1.68 (requested by itojun in ticket #266):
set IPv4 parameter to modern value.
- ICMPv4 redirect entry timeout = 600 sec (previous: never timeout)
 1.66.8.3  29-Aug-2002  gehenna catch up with -current.
 1.66.8.2  15-Jul-2002  gehenna catch up with -current.
 1.66.8.1  20-Jun-2002  gehenna catch up with -current.
 1.74.2.8  11-Dec-2005  christos Sync with head.
 1.74.2.7  10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.74.2.6  04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.74.2.5  04-Feb-2005  skrll Sync with HEAD.
 1.74.2.4  21-Sep-2004  skrll Fix the sync with head I botched.
 1.74.2.3  18-Sep-2004  skrll Sync with HEAD.
 1.74.2.2  12-Aug-2004  skrll Sync with HEAD.
 1.74.2.1  03-Aug-2004  skrll Sync with HEAD
 1.82.2.2  03-Aug-2004  jmc Pullup rev 1.85-1.87 (requested by christos in ticket #732)

icmp_reflect: check if m_pkthdr.rcvif is non-NULL before touching it.
icmp_reflect could be called from the output path, so m_pkthdr.rcvif may not
be set. (found by panic when PF is configured "block return all")
 1.82.2.1  28-May-2004  tron Pull up revision 1.84 (requested by atatat in ticket #391):
Sysctl descriptions under net subtree (net.key not done)
 1.87.4.1  29-Apr-2005  kent sync with -current
 1.88.2.2  19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.88.2.1  12-Feb-2005  yamt sync with head.
 1.92.2.5  15-Nov-2007  yamt sync with head.
 1.92.2.4  03-Sep-2007  yamt sync with head.
 1.92.2.3  26-Feb-2007  yamt sync with head.
 1.92.2.2  30-Dec-2006  yamt sync with head.
 1.92.2.1  21-Jun-2006  yamt sync with head.
 1.95.2.1  26-Oct-2005  yamt sync with head
 1.97.14.2  31-Mar-2006  tron Merge 2006-03-31 NetBSD-current into the "peter-altq" branch.
 1.97.14.1  28-Mar-2006  tron Merge 2006-03-28 NetBSD-current into the "peter-altq" branch.
 1.97.12.1  19-Apr-2006  elad sync with head.
 1.97.10.3  03-Sep-2006  yamt sync with head.
 1.97.10.2  11-Aug-2006  yamt sync with head
 1.97.10.1  01-Apr-2006  yamt sync with head.
 1.97.8.1  22-Apr-2006  simonb Sync with head.
 1.97.6.1  09-Sep-2006  rpaulo sync with head
 1.99.4.1  13-Jul-2006  gdamore Merge from HEAD.
 1.103.4.3  18-Dec-2006  yamt sync with head.
 1.103.4.2  10-Dec-2006  yamt sync with head.
 1.103.4.1  22-Oct-2006  yamt sync with head
 1.103.2.3  01-Feb-2007  ad Sync with head.
 1.103.2.2  12-Jan-2007  ad Sync with head.
 1.103.2.1  18-Nov-2006  ad Sync with head.
 1.108.2.2  12-Mar-2007  rmind Sync with HEAD.
 1.108.2.1  27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.111.10.2  03-Sep-2007  skrll Sync with HEAD.
 1.111.10.1  15-Aug-2007  skrll Sync with HEAD.
 1.111.2.2  09-Oct-2007  ad Sync with head.
 1.111.2.1  20-Aug-2007  ad Sync with HEAD.
 1.112.6.2  19-Jul-2007  dyoung Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

Introduce rt_walktree() for walking the routing table and
applying a function to each rtentry. Replace most
rn_walktree() calls with it.

Use rt_getkey()/rt_setkey() to get/set a route's destination.
Keep a pointer to the sockaddr key in the rtentry, so that
rtentry users do not have to grovel in the radix_node for
the key.

Add a RTM_GET method to rtrequest. Use that instead of
radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

Constify. KNF. Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
et cetera. Use NULL instead of 0 for null pointers. Use
__arraycount(). Reduce gratuitous parenthesization.

Stop using variadic arguments for rip6_output(), it is
unnecessary.

Remove the unnecessary rtentry member rt_genmask and the
code to maintain it, since nothing actually used it.

Make rt_maskedcopy() easier to read by using meaningful variable
names.

Extract a subroutine intern_netmask() for looking up a netmask in
the masks table.

Start converting backslash-ridden IPv6 macros in
sys/netinet6/in6_var.h into inline subroutines that one
can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
 1.112.6.1  19-Jul-2007  dyoung file ip_icmp.c was added on branch matt-mips64 on 2007-07-19 20:48:55 +0000
 1.112.4.2  11-Nov-2007  joerg Sync with HEAD.
 1.112.4.1  03-Sep-2007  jmcneill Sync with HEAD.
 1.113.8.1  19-Nov-2007  mjf Sync with HEAD.
 1.113.6.1  13-Nov-2007  bouyer Sync with HEAD
 1.113.2.1  09-Jan-2008  matt sync with HEAD
 1.114.14.2  29-Jun-2008  mjf Sync with HEAD.
 1.114.14.1  02-Jun-2008  mjf Sync with HEAD.
 1.116.2.2  18-May-2008  yamt sync with head.
 1.116.2.1  19-Apr-2008  yamt Peter Postma's work-in-progress pf import from OpenBSD 4.2.
updated to -current by me.
 1.117.2.4  11-Aug-2010  yamt sync with head.
 1.117.2.3  11-Mar-2010  yamt sync with head
 1.117.2.2  04-May-2009  yamt sync with head.
 1.117.2.1  16-May-2008  yamt sync with head.
 1.119.4.1  18-Jun-2008  simonb Sync with head.
 1.119.2.1  23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.120.6.1  09-Jun-2013  msaitoh Pull up following revision(s) (requested by fair in ticket #1855):
sys/netinet/ip_icmp.c: revision 1.130
PR/47693: Erik E. Fair: Add missing code to icmp handling.
- While there, add the rest of the missing codes
- Merge groups
- Fix indentation
 1.122.4.1  03-Jul-2010  rmind sync with head
 1.122.2.1  17-Aug-2010  uebayasi Sync with HEAD.
 1.125.6.2  05-Apr-2012  mrg sync to latest -current.
 1.125.6.1  18-Feb-2012  mrg merge to -current.
 1.125.2.2  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.125.2.1  17-Apr-2012  yamt sync with head
 1.128.2.1  31-Mar-2013  riz Pull up following revision(s) (requested by fair in ticket #860):
sys/netinet/ip_icmp.c: revision 1.130
PR/47693: Erik E. Fair: Add missing code to icmp handling.
- While there, add the rest of the missing codes
- Merge groups
- Fix indentation
 1.129.2.3  03-Dec-2017  jdolecek update from HEAD
 1.129.2.2  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.129.2.1  23-Jun-2013  tls resync from head
 1.131.2.2  18-May-2014  rmind sync with head
 1.131.2.1  28-Aug-2013  rmind Checkpoint work in progress:
- Initial split of the protocol user-request method into the following
methods: pr_attach, pr_detach and pr_generic for the old pr_usrreq.
- Adjust socreate(9) and sonewconn(9) to call pr_attach without the
socket lock held (as a preparation for the locking scheme adjustment).
- Adjust all pr_attach routines to assert that PCB is not set.
- Sprinkle various comments, document some routines and their locking.
- Remove M_PCB, replace with kmem(9).
- Fix a few bugs spotted on the way.
 1.132.2.1  10-Aug-2014  tls Rebase.
 1.134.4.9  28-Aug-2017  skrll Sync with HEAD
 1.134.4.8  05-Feb-2017  skrll Sync with HEAD
 1.134.4.7  05-Dec-2016  skrll Sync with HEAD
 1.134.4.6  05-Oct-2016  skrll Sync with HEAD
 1.134.4.5  09-Jul-2016  skrll Sync with HEAD
 1.134.4.4  22-Apr-2016  skrll Sync with HEAD
 1.134.4.3  22-Sep-2015  skrll Sync with HEAD
 1.134.4.2  06-Jun-2015  skrll Sync with HEAD
 1.134.4.1  06-Apr-2015  skrll Sync with HEAD
 1.150.2.5  26-Apr-2017  pgoyette Sync with HEAD
 1.150.2.4  20-Mar-2017  pgoyette Sync with HEAD
 1.150.2.3  07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.150.2.2  04-Nov-2016  pgoyette Sync with HEAD
 1.150.2.1  06-Aug-2016  pgoyette Sync with HEAD
 1.154.2.1  21-Apr-2017  bouyer Sync with HEAD
 1.161.6.3  13-Jul-2018  martin Pull up following revision(s) via patch (requested by knakahara in ticket #905):

sys/netinet/ip_mroute.c: revision 1.160
sys/netinet6/in6_l2tp.c: revision 1.16
sys/net/if.h: revision 1.263
sys/netinet/in_l2tp.c: revision 1.15
sys/netinet/ip_icmp.c: revision 1.172
sys/netinet/igmp.c: revision 1.68
sys/netinet/ip_encap.c: revision 1.69
sys/netinet6/ip6_mroute.c: revision 1.129

sbappendaddr() requires a lock to be held. Currently, softnet_lock is appropriate.

When rip_input() is called as inetsw[].pr_input, rip_input() is always called
with softnet_lock held, that is, in case of !defined(NET_MPSAFE) it is
acquired in ipintr(), otherwise (defined(NET_MPSAFE)) it is acquired in the
PR_WRAP_INPUT macro.

However, some functions call rip_input() directly without holding softnet_lock.
That causes an assertion failure in sbappendaddr().
rip6_input() and icmp6_rip6_input() also require softnet_lock for the same
reason.
 1.161.6.2  08-Jun-2018  martin Pull up following revision(s) (requested by ozaki-r in ticket #852):

sys/netinet6/icmp6.c: revision 1.238
sys/netinet/ip_icmp.c: revision 1.171
sys/net/route.c: revision 1.210

Fix _rt_free via rtrequest(RTM_DELETE) hangs in rt_timer handlers

A rt_timer handler is passed a rtentry with an extra reference that prevents
the rtentry from being released accidentally. So rt_timer handlers must release
the reference on the passed rtentry themselves (but they didn't).
 1.161.6.1  31-Mar-2018  martin Pull up following revision(s) (requested by maxv in ticket #675):

sys/netinet/ip_icmp.c: revision 1.168

Fix a possible buffer overflow in the IPv4 _ctlinput functions.

In _icmp_input we are guaranteeing that the ICMP_ADVLENMIN-byte area
starting from 'icp' is contiguous.

ICMP_ADVLENMIN = 8 + sizeof(struct ip) + 8 = 36

But the _ctlinput functions (eg udp_ctlinput) expect the area to be
larger. These functions read at:

(uint8_t *)icp + 8 + (icp->icmp_ip.ip_hl << 2)

which can be crafted to be:

(uint8_t *)icp + 68

So we end up reading 'icp+68' while the valid area ended at 'icp+36'.

Having said that, it seems pretty complicated to trigger this bug; it
would have to be a fragmented packet with half of the ICMP header in the
first fragment, and we would need to have a driver that did not allocate
a cluster for the first mbuf of the chain.

The check of icmplen against ICMP_ADVLEN(icp) was not sufficient: while it
did guarantee that the ICMP header fit the chain, it did not guarantee
that it fit 'm'.

Fix this bug by pulling up to hlen+ICMP_ADVLEN(icp). No need to log an
error. Rebase the pointers afterwards.
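
The offsets quoted in the revision 1.168 pull-up can be checked with a few lines of arithmetic. The sketch below is a user-space illustration only: the constants and function names are hypothetical stand-ins, with IP_HDR_MIN assumed to equal sizeof(struct ip) (20 bytes) and ip_hl treated as the 4-bit header length field counted in 32-bit words.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constants for this sketch, not the kernel's definitions. */
#define ICMP_HDR_LEN	8u	/* fixed part of the ICMP header */
#define IP_HDR_MIN	20u	/* IPv4 header without options */
#define QUOTED_DATA	8u	/* bytes of the original payload quoted */

/* Area guaranteed contiguous from 'icp': 8 + 20 + 8 = 36 bytes. */
size_t
advlen_min(void)
{
	return ICMP_HDR_LEN + IP_HDR_MIN + QUOTED_DATA;
}

/* Offset from 'icp' at which a ctlinput handler starts reading. */
size_t
ctlinput_read_offset(unsigned int ip_hl)
{
	assert(ip_hl <= 15);	/* ip_hl is a 4-bit field */
	return ICMP_HDR_LEN + ((size_t)ip_hl << 2);
}

/* Length that must be contiguous so the read above stays in bounds. */
size_t
advlen(unsigned int ip_hl)
{
	return ctlinput_read_offset(ip_hl) + QUOTED_DATA;
}
```

With ip_hl at its maximum of 15, the read starts at offset 68, well past the 36 bytes that were guaranteed contiguous, which is why the fix pulls up the larger, header-dependent length instead.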
 1.168.2.7  26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.168.2.6  26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.168.2.5  30-Sep-2018  pgoyette Sync with HEAD
 1.168.2.4  06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.168.2.3  25-Jun-2018  pgoyette Sync with HEAD
 1.168.2.2  21-May-2018  pgoyette Sync with HEAD
 1.168.2.1  02-May-2018  pgoyette Synch with HEAD
 1.172.2.1  10-Jun-2019  christos Sync with HEAD
