History log of /src/sys/netinet6/ip6_var.h
Revision  Date  Author  Comments
 1.94  09-Feb-2024  andvar fix spelling mistakes, mainly in comments and log messages.
 1.93  28-Oct-2022  ozaki-r inpcb: integrate data structures of PCB into one

Data structures of network protocol control blocks (PCBs), i.e.,
struct inpcb, in6pcb and inpcb_hdr, are not organized well. Users of
the data structures have to handle them separately and thus the code
is cluttered and duplicated.

The commit integrates the data structures into one, struct inpcb. As a
result, users of PCBs only have to handle just one data structure, so
the code becomes simple.

One drawback is that the data size of PCB for IPv4 increases by 40 bytes
(from 248 bytes to 288 bytes).
 1.92  24-Oct-2022  knakahara Fix PR kern/57037

Make it possible to control whether routing messages are sent when
parameters change.
When net.inet6.ip6.param_rt_msg=0 is set, parameter-change routing
messages are not sent.
When net.inet6.ip6.param_rt_msg=1 (the default) is set, parameter-change
routing messages are sent by RTM_NEWADDR.
 1.91  17-Aug-2021  andvar fix multiple repetitive typos in comments, messages and documentation. Mainly because of copy-pasted code, a large number of files are affected.
 1.90  11-Mar-2021  ryo flowlabel will never return anything other than 1 or 0.
s/&&/&/
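The log does not show the expression that was changed. A minimal sketch, assuming a random 32-bit value is being masked down to the 20-bit IPv6 flow label field, illustrates why `&&` can only yield 0 or 1 while `&` keeps the low bits. The names `DEMO_FLOWLABEL_MASK`, `mask_logical` and `mask_bitwise` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The IPv6 flow label field is 20 bits wide. */
#define DEMO_FLOWLABEL_MASK 0xfffffU

/* Logical AND: any nonzero input collapses to 1. */
static uint32_t
mask_logical(uint32_t r)
{
	return r && DEMO_FLOWLABEL_MASK;
}

/* Bitwise AND: keeps the low 20 bits of the input. */
static uint32_t
mask_bitwise(uint32_t r)
{
	return r & DEMO_FLOWLABEL_MASK;
}
```

With an input of 0x12345678, the logical form returns 1 and the bitwise form returns 0x45678.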
 1.89  08-Mar-2021  christos no need for ip6_id.c...
 1.88  07-Mar-2021  christos netinet/netinet6: Add necessary includes to make these standalone.
(from riastradh)
 1.87  28-Aug-2020  ozaki-r branches: 1.87.2;
inet6: reduce silent packet discards
 1.86  28-Aug-2020  ozaki-r inet6: pass rcvif to ip6_forward to avoid extra psref_acquire
 1.85  28-Aug-2020  ozaki-r inet, inet6: count packets dropped by IPsec

The counters count packets dropped due to security policy checks.
 1.84  19-Jun-2020  maxv localify
 1.83  12-Jun-2020  roy Remove in-kernel handling of Router Advertisements

This is much better handled by a user-land tool.
Proposed on tech-net here:
https://mail-index.netbsd.org/tech-net/2020/04/22/msg007766.html

Note that the ioctl SIOCGIFINFO_IN6 no longer sets flags. That now
needs to be done using the pre-existing SIOCSIFINFO_FLAGS ioctl.

Compat is fully provided where it makes sense, but trying to turn on
RA handling will obviously throw an error as it no longer exists.

Note that if you use IPv6 temporary addresses, this now needs to be
turned on in dhcpcd.conf(5) rather than in sysctl.conf(5).
 1.82  13-May-2019  ozaki-r branches: 1.82.2;
Count packets dropped by pfil
 1.81  29-Nov-2018  ozaki-r Introduce and use ip_dad_enabled() and ip6_dad_enabled() functions
 1.80  14-Feb-2018  maxv branches: 1.80.2; 1.80.4;
Re-make ip6_nexthdr global, it will be used in soon-to-be-added code...
 1.79  30-Jan-2018  maxv Style, localify, remove dead code, and fix typos. No functional change.
 1.78  30-Jan-2018  maxv Fix a buffer overflow in ip6_get_prevhdr. Doing

mtod(m, char *) + len

is wrong, an option is allowed to be located in another mbuf of the chain.
If the offset of an option within the chain is bigger than the length of
the first mbuf in that chain, we are reading/writing one byte of packet-
controlled data beyond the end of the first mbuf.

The length of this first mbuf depends on the layout the network driver
chose. In the most difficult case, it will allocate a 2KB cluster, which
is bigger than the Ethernet MTU.

But there is at least one way of exploiting this case: by sending a
special combination of nested IPv6 fragments, the packet can control a
good bunch of 'len'. By luck, the memory pool containing clusters does not
embed the pool header in front of the items, so it is not straightforward
to predict what is located at 'mtod(m, char *) + len'.

However, by sending offending fragments in a loop, it is possible to
crash the kernel - at some point we will hit important data structures.

As far as I can tell, PF protects against this difficult case, because
it kicks nested fragments. NPF does not protect against this. IPF I don't
know.

Then there are the easier cases, if the MTU is bigger than a cluster,
or if the network driver did not allocate a cluster, or perhaps if the
fragments are received via a tunnel; I haven't investigated these cases.

Change ip6_get_prevhdr so that it returns an offset in the chain, and
always use IP6_EXTHDR_GET to get a writable pointer. IP6_EXTHDR_GET
leaves M_PKTHDR untouched.

This place is still fragile.
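The core of the bug described above is that an offset into a packet is not an offset into its first buffer. A minimal userland sketch, using a toy `struct seg` in place of a kernel mbuf (both `struct seg` and `chain_byte_at` are hypothetical and are not the kernel's IP6_EXTHDR_GET), shows an offset being resolved by walking the chain:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a kernel mbuf: one segment of a packet chain. */
struct seg {
	struct seg *next;
	unsigned char *data;
	size_t len;
};

/*
 * Return a pointer to the byte at chain offset `off`, or NULL if the
 * offset lies beyond the end of the chain.  Computing first->data + off
 * directly would read past the first segment whenever off >= first->len.
 */
static unsigned char *
chain_byte_at(struct seg *s, size_t off)
{
	for (; s != NULL; s = s->next) {
		if (off < s->len)
			return s->data + off;
		off -= s->len;	/* skip this segment entirely */
	}
	return NULL;
}
```

For a chain of two 4-byte segments, offset 5 resolves to the second byte of the second segment, and offset 8 yields NULL instead of an out-of-bounds pointer.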
 1.77  29-Jan-2018  maxv Start cleaning up ip6_input.c. Several pieces of code have evolved but
their neighboring comments were not updated. So update them, and remove
code that has been disabled for years (it has no use anyway).
 1.76  25-Jan-2018  maxv Several changes:

* Move the structure definitions into frag6.c, they should not be used
elsewhere.

* Rename ip6af_mff -> ip6af_more, and switch it to bool, easier to
understand.

* Remove IP6_REASS_MBUF, no point in keeping this.

* Remove ip6q_arrive and ip6q_nxtp, unused.

* Style.
 1.75  10-Jan-2018  knakahara add ipsec(4) interface, which is used for route-based VPN.

man and ATF are added later, please see man for details.

reviewed by christos@n.o, joerg@n.o and ozaki-r@n.o, thanks.
https://mail-index.netbsd.org/tech-net/2017/12/18/msg006557.html
 1.74  03-Mar-2017  ozaki-r branches: 1.74.6;
Pass inpcb/in6pcb instead of socket to ip_output/ip6_output

- Passing a socket to Layer 3 is layer violation and even unnecessary
- The change makes codes of callers and IPsec a bit simple
 1.73  02-Mar-2017  ozaki-r Make usages of ifp MP-safe in some functions of IP multicast
 1.72  14-Feb-2017  ozaki-r Do ND in L2_output in the same manner as arpresolve

The benefits of this change are:
- The flow is consistent with IPv4 (and FreeBSD and OpenBSD)
- old: ip6_output => nd6_output (do ND if needed) => L2_output (lookup a stored cache)
- new: ip6_output => L2_output (lookup a cache. Do ND if cache not found)
- We can remove some workarounds in nd6_output
- We can move L2 specific operations to their own place
- The performance slightly improves because one cache lookup is reduced
 1.71  08-Dec-2016  ozaki-r branches: 1.71.2;
Add rtcache_unref to release points of rtentry stemming from rtcache

In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
point. So we need to protect rtentries somehow, say by reference counting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes it difficult to debug by bisecting.
 1.70  10-Nov-2016  ozaki-r Tidy up in6_select*

This change tidies up in6_select* functions, especially
selectroute.

selectroute is annoying because:
- It returns both/either of a rtentry and/or an ifp
- Yes, it may return only an ifp!
- It is valid but selectroute shouldn't handle the case
- Such conditional behavior makes it difficult
to apply locking/psref thingy
- It may return a rtentry even if error
- It may use opt->ip6po_nextroute rtcache implicitly
- The caller can know if it is used
by rtcache_validate(&opt->ip6po_nextroute)
but it's racy in MP-safe world
- Even if it uses opt->ip6po_nextroute, it may
return a rtentry that isn't derived from the rtcache

The change includes:
- Rename selectroute to in6_selectroute
- Let a remaining caller of selectroute, in6_selectif,
use in6_selectroute instead
- Let in6_selectroute return only an rtentry
- If error, it doesn't return an rtentry
- A caller gets an ifp from a returned rtentry
- Allow in6_selectroute to modify a passed rtcache
and a caller can know if opt->ip6po_nextroute is
used via the rtcache
- Let callers (ip6_output and in6_selectif) handle
the case that only an ifp is required

Inspired by OpenBSD
Proposed on tech-kern and tech-net
LGTM by roy@
 1.69  31-Oct-2016  ozaki-r Fix race condition of in6_selectsrc

in6_selectsrc returned a pointer to in6_addr that wasn't guaranteed to be
safe by pserialize (or psref), which was racy. Let callers pass a pointer
to in6_addr and in6_selectsrc copy a result to it inside pserialize
critical sections.
 1.68  23-Aug-2016  knakahara improve fast-forward performance when the number of flows exceeds ip6_maxflows.

This is porting of ip_flow.c:r1.76

In ip6flow case, the before degradation is about 45%, the after degradation is
about 55%.
 1.67  21-Jun-2016  ozaki-r branches: 1.67.2;
Make sure returning ifp from in6_select* functions psref-ed

To this end, callers need to pass struct psref to the functions
and the functions acquire a reference of ifp with it. In some cases,
we can simply use if_get_byindex, however, in other cases
(say rt->rt_ifp and ia->ifa_ifp), we have no MP-safe way for now.
In order to take a reference anyway we use non MP-safe function
if_acquire_NOMPSAFE for the latter cases. They should be fixed in
the future somehow.
 1.66  21-Jun-2016  ozaki-r Replace ifp of ip_moptions and ip6_moptions with if_index

The motivation is the same as the mbuf's rcvif case; avoid having a pointer
of an ifnet object in ip_moptions and ip6_moptions, which is not MP-safe.

ip_moptions and ip6_moptions can be stored in a PCB for inet or inet6
whose lifetime is different from that of the ifnet, so an ifnet object can
disappear at any time while we access it via them. Thus we need to look up
an ifnet object by if_index every time for safety.
 1.65  10-Jun-2016  ozaki-r Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored in a mbuf instead of a
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places that are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.
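The index-instead-of-pointer idea described above can be sketched in userland as follows. This is only an illustration under stated assumptions: `struct demo_ifnet`, `struct demo_pkt`, `demo_if_table` and the `demo_if_*` functions are hypothetical, and the sketch has no locking at all, whereas the commit relies on psref(9) and pserialize(9) to keep the looked-up object alive.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_IF 8

struct demo_ifnet {
	int if_index;
	const char *if_name;
};

/* A packet records the receiving interface by index, not by pointer. */
struct demo_pkt {
	int rcvif_index;
};

/* Stand-in for the interface collection (ifindex2ifnet). */
static struct demo_ifnet *demo_if_table[DEMO_MAX_IF];

static void
demo_if_attach(struct demo_ifnet *ifp)
{
	assert(ifp->if_index >= 0 && ifp->if_index < DEMO_MAX_IF);
	demo_if_table[ifp->if_index] = ifp;
}

static void
demo_if_detach(int idx)
{
	if (idx >= 0 && idx < DEMO_MAX_IF)
		demo_if_table[idx] = NULL;
}

/* Resolve an index; returns NULL if the interface has gone away. */
static struct demo_ifnet *
demo_if_lookup(int idx)
{
	if (idx < 0 || idx >= DEMO_MAX_IF)
		return NULL;
	return demo_if_table[idx];
}
```

After the interface is detached, a lookup through the packet's stored index returns NULL, where a stored pointer would have been left dangling.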
 1.64  20-Jan-2015  roy Add net.inet6.ip6.prefer_tempaddr sysctl knob so that we can prefer
IPv6 temporary addresses as the source address.

Fixes PR kern/47100 based on a patch by Dieter Roelants.
 1.63  12-Oct-2014  christos branches: 1.63.2;
Refactor the multicast membership code so that we can handle v4 mapped
addresses using the v6 membership ioctls.
 1.62  05-Jun-2014  rmind branches: 1.62.2;
- Implement pktqueue interface for lockless IP input queue.
- Replace ipintrq and ip6intrq with the pktqueue mechanism.
- Eliminate kernel-lock from ipintr() and ip6intr().
- Some preparation work to push softnet_lock out of ipintr().

Discussed on tech-net.
 1.61  19-May-2014  rmind - Split off PRU_ATTACH and PRU_DETACH logic into separate functions.
- Replace malloc with kmem and eliminate M_PCB while here.
- Sprinkle more asserts.
 1.60  18-May-2014  rmind Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!
 1.59  23-Jun-2012  christos branches: 1.59.2; 1.59.4; 1.59.12;
4 new sysctls to avoid ipv6 DoS attacks from OpenBSD
 1.58  19-Jan-2012  liamjfoy branches: 1.58.2; 1.58.6; 1.58.8;
Remove ip6f_start from ip6f struct
 1.57  10-Jan-2012  drochner add patch from Arnaud Degroote to handle IPv6 extended options with
(FAST_)IPSEC, tested lightly with a DSTOPTS header consisting
of PAD1
 1.56  04-Nov-2011  zoltan branches: 1.56.4;
Change the IPv6 reassembly mechanism to use mutex(9).
Also add ip6_reass_packet() to be used by NPF.
 1.55  24-May-2011  spz branches: 1.55.4;
RA flood mitigation via a limit on accepted routes:
- introduce a limit for the routes accepted via IPv6 Router Advertisement:
a common 2 interface client will have 6, the default limit is 100 and
can be adjusted via sysctl
- report the current number of routes installed via RA via sysctl
- count discarded route additions. Note that one RA message is two routes.
This is at present only across all interfaces even though per-interface
would be more useful, since the per-interface structure complies to RFC2466
- bump kernel version due to the previous change
- adjust netstat to use the new value (with netstat -p icmp6)
 1.54  03-May-2011  dyoung *_drain() routines may be called with locks held, so instead of doing
any work in *_drain(), set a drain-needed flag. Do the work in the
fasttimo handler.

Contributed by Coyote Point Systems, Inc.
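The deferral pattern described in this commit can be sketched as below. This is a single-threaded illustration only; `demo_drain`, `demo_fasttimo`, `drain_needed` and `cached_items` are hypothetical names, and a kernel implementation would need appropriate synchronization around the flag.

```c
#include <assert.h>
#include <stdbool.h>

static bool drain_needed;	/* set by the drain routine, cleared by the timer */
static int cached_items = 3;	/* stand-in for reclaimable state */

/* May be called with locks held, so it only records the request. */
static void
demo_drain(void)
{
	drain_needed = true;
}

/* Timer handler: performs the deferred work in a safe context. */
static void
demo_fasttimo(void)
{
	if (drain_needed) {
		drain_needed = false;
		cached_items = 0;
	}
}
```

Calling `demo_drain()` leaves `cached_items` untouched. The state is released only on the next `demo_fasttimo()` call.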
 1.53  06-May-2009  elad branches: 1.53.4; 1.53.6;
Remove some usage of "priv" and "privileged" variables and instead pass
around credentials. Also push down kauth(9) calls closer to where the
operation is done.

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/04/30/msg001270.html
 1.52  23-Mar-2009  liamjfoy Init ip6flow pool dynamically instead of using a linkset.
 1.51  06-Aug-2008  plunky branches: 1.51.2; 1.51.8;
Convert socket options code to use a sockopt structure
instead of laying everything into an mbuf.

approved by core
 1.50  24-Apr-2008  ad branches: 1.50.2; 1.50.4; 1.50.8;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
 1.49  15-Apr-2008  thorpej branches: 1.49.2;
Make ip6 and icmp6 stats per-cpu.
 1.48  08-Apr-2008  thorpej Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.
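A sketch of why an array of uint64_t can stay layout-compatible with a structure made only of 64-bit counters, as the note above states. The counter names, `struct demo_old_stat` and the `demo_stat_*` helpers are hypothetical; the log does not list the real fields.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical counter indices; the real names are not given in the log. */
enum { DEMO_STAT_TOTAL, DEMO_STAT_TOOSHORT, DEMO_STAT_FORWARD, DEMO_NSTATS };

/* Old-style layout: one named 64-bit field per counter. */
struct demo_old_stat {
	uint64_t total;
	uint64_t tooshort;
	uint64_t forward;
};

/* New-style update: bump a counter by index. */
static void
demo_stat_inc(uint64_t *stats, int idx)
{
	assert(idx >= 0 && idx < DEMO_NSTATS);
	stats[idx]++;
}

/* Export the array to a consumer that still expects the old structure. */
static void
demo_stat_export(const uint64_t *stats, struct demo_old_stat *out)
{
	assert(sizeof(*out) == sizeof(uint64_t) * DEMO_NSTATS);
	memcpy(out, stats, sizeof(*out));
}
```

Because every field has the same size and order as the corresponding array slot, a byte-for-byte copy gives an old consumer the values it expects.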
 1.47  19-Mar-2008  dyoung No code ever sets struct ip6_pktopts member ip6po_m, so get rid of
it.
 1.46  29-Oct-2007  dyoung branches: 1.46.12; 1.46.16;
The IPv6 stack labels incoming packets with an m_tag whose payload
is a struct ip6aux. A struct ip6aux used to contain a pointer to
an in6_ifaddr, but that pointer could become a dangling reference
in the lifetime of the m_tag, because ip6_setdstifaddr() did not
increase the in6_ifaddr's reference count. I have removed the
pointer from ip6aux. I load it with the interesting fields from
the in6_ifaddr (an IPv6 address, a scope ID, and some flags),
instead.
 1.45  19-Jul-2007  dyoung branches: 1.45.4; 1.45.6; 1.45.10; 1.45.12;
Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

Introduce rt_walktree() for walking the routing table and
applying a function to each rtentry. Replace most
rn_walktree() calls with it.

Use rt_getkey()/rt_setkey() to get/set a route's destination.
Keep a pointer to the sockaddr key in the rtentry, so that
rtentry users do not have to grovel in the radix_node for
the key.

Add a RTM_GET method to rtrequest. Use that instead of
radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

Constify. KNF. Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
et cetera. Use NULL instead of 0 for null pointers. Use
__arraycount(). Reduce gratuitous parenthesization.

Stop using variadic arguments for rip6_output(), it is
unnecessary.

Remove the unnecessary rtentry member rt_genmask and the
code to maintain it, since nothing actually used it.

Make rt_maskedcopy() easier to read by using meaningful variable
names.

Extract a subroutine intern_netmask() for looking up a netmask in
the masks table.

Start converting backslash-ridden IPv6 macros in
sys/netinet6/in6_var.h into inline subroutines that one
can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
 1.44  17-May-2007  yamt branches: 1.44.2;
remove net.inet6.ip6.rht0 sysctl.
it's too dangerous compared to its benefit.

strongly requested by itojun@. ok'ed by core@.
 1.43  02-May-2007  dyoung Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principal benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:

struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);

sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.

sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.

The 'flags' argument for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:

int rtcache_setdst(struct route *, const struct sockaddr *);

rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.

It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.
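The allocate/copy/duplicate/free semantics described in item 1 can be sketched in userland as below. This is not the kernel code: `struct demo_sockaddr` and the `demo_sockaddr_*` functions are hypothetical, calloc() stands in for the per-family pool, and the length is passed explicitly here, whereas the real sockaddr_alloc() derives it from the family and takes pool flags.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy sockaddr with BSD-style length and family fields. */
struct demo_sockaddr {
	unsigned char sa_len;
	unsigned char sa_family;
	char sa_data[26];
};

/* Allocate a zeroed sockaddr with sa_len and sa_family initialized. */
static struct demo_sockaddr *
demo_sockaddr_alloc(unsigned char family, unsigned char len)
{
	struct demo_sockaddr *sa;

	assert(len >= 2 && len <= sizeof(*sa));
	sa = calloc(1, sizeof(*sa));
	if (sa == NULL)
		return NULL;
	sa->sa_len = len;
	sa->sa_family = family;
	return sa;
}

/* Like strcpy(): copy src over dst; the families must match. */
static struct demo_sockaddr *
demo_sockaddr_copy(struct demo_sockaddr *dst, const struct demo_sockaddr *src)
{
	assert(dst->sa_family == src->sa_family);
	memcpy(dst, src, src->sa_len);
	return dst;
}

/* Like strdup(): allocate a new sockaddr and copy src into it. */
static struct demo_sockaddr *
demo_sockaddr_dup(const struct demo_sockaddr *src)
{
	struct demo_sockaddr *dst;

	dst = demo_sockaddr_alloc(src->sa_family, src->sa_len);
	if (dst != NULL)
		demo_sockaddr_copy(dst, src);
	return dst;
}

static void
demo_sockaddr_free(struct demo_sockaddr *sa)
{
	free(sa);
}
```

As with strdup(), the caller owns the duplicate and must release it with the matching free routine.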
 1.42  22-Apr-2007  christos fix typo.
 1.41  22-Apr-2007  christos Disable processing of routing header type 0 packets since they can be used
for DoS attacks. Provide a sysctl to re-enable them (net.inet6.ip6.rht0).

Information from:
http://www.secdev.org/conf/IPv6_RH_security-csw07.pdf
 1.40  23-Mar-2007  liamjfoy Add a new sysctl net.inet6.ip6.hashsize to control the hash table size.

The sysctl handler will ensure this value is a power of 2

ok dyoung@
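The log does not show how the handler validates the value. One common way to test for a power of 2, given here only as an assumed sketch with a hypothetical function name, is the single-bit check below.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * A power of 2 has exactly one bit set, so clearing the lowest set bit
 * with n & (n - 1) must leave zero.  Zero itself is rejected.
 */
static bool
demo_is_power_of_2(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}
```

Under this check a requested size of 256 would be accepted and 300 would be rejected.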
 1.39  07-Mar-2007  liamjfoy branches: 1.39.2; 1.39.4; 1.39.6;
Add IPv6 Fast Forward - the IPv4 counterpart:

If ip6_forward successfully forwards a packet, a cache, in this case an
ip6flow struct entry, will be created. ether_input and friends will
then be able to call ip6flow_fastforward with the packet which will then
be passed to if_output (unless an issue is found - in that case the packet
is passed back to ip6_input).

ok matt@ christos@ dyoung@ and joerg@
 1.38  17-Feb-2007  dyoung KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.
 1.37  05-May-2006  rpaulo branches: 1.37.12; 1.37.14;
Add support for RFC 3542 Adv. Socket API for IPv6 (which obsoletes 2292).
* RFC 3542 isn't binary compatible with RFC 2292.
* RFC 2292 support is on by default but can be disabled.
* update ping6, telnet and traceroute6 to the new API.

From the KAME project (www.kame.net).
Reviewed by core.
 1.36  05-Mar-2006  rpaulo branches: 1.36.2; 1.36.4;
NDP-related improvements:
RFC4191
- supports host-side router-preference

RFC3542
- if DAD fails on an interface, disables IPv6 operation on the
interface
- don't advertise MLD report before DAD finishes

Others
- fixes integer overflow for valid and preferred lifetimes
- improves timer granularity for MLD, using callout-timer.
- reflects rtadvd's IPv6 host variable information into kernel
(router only)
- adds a sysctl option to enable/disable pMTUd for multicast
packets
- performs NUD on PPP/GRE interface by default
- Redirect works regardless of ip6_accept_rtadv
- removes RFC1885-related code

From the KAME project via SUZUKI Shinsuke.
Reviewed by core.
 1.35  21-Jan-2006  rpaulo branches: 1.35.2; 1.35.4; 1.35.6;
Better support of IPv6 scoped addresses.

- most of the kernel code will not care about the actual encoding of
scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
scoped addresses as a special case.
- scope boundary check will be stricter. For example, the current
*BSD code allows a packet with src=::1 and dst=(some global IPv6
address) to be sent outside of the node, if the application does:
s = socket(AF_INET6);
bind(s, "::1");
sendto(s, some_global_IPv6_addr);
This is clearly wrong, since ::1 is only meaningful within a single
node, but the current implementation of the *BSD kernel cannot
reject this attempt.
- and, while there, don't try to remove the ff02::/32 interface route
entry in in6_ifdetach() as it's already gone.

This also includes some level of support for the standard source
address selection algorithm defined in RFC3484, which will be
completed in the future.

From the KAME project via JINMEI Tatuya.
Approved by core@.
 1.34  11-Dec-2005  christos branches: 1.34.2;
merge ktrace-lwp.
 1.33  18-Oct-2004  itojun branches: 1.33.10; 1.33.12; 1.33.20; 1.33.22;
ip6_flow_seq is no longer available.
 1.32  06-Sep-2003  itojun branches: 1.32.2; 1.32.4; 1.32.6;
randomize IPv4/v6 fragment ID and IPv6 flowlabel. avoids predictability
of these fields. ip_id.c is from openbsd. ip6_id.c is adapted by kame.
 1.31  22-Aug-2003  itojun change the additional arg to be passed to ip{,6}_output to struct socket *.

this fixes KAME policy lookup which was broken by the previous commit.
 1.30  22-Aug-2003  jonathan (Accidentally-omitted change): update for ip6_output() to match commit below.

replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec. Reviewed by itojun.
 1.29  07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.28  07-Aug-2003  itojun make net.inet6.ip6.redirect actually work. from Tomoyuki Sahara via kame
 1.27  08-Jul-2003  itojun prototype must not have variable name
 1.26  29-Jun-2003  fvdl branches: 1.26.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.25  28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.24  28-Jan-2003  wiz success, not sucess. Noted by mjl.
 1.23  11-Sep-2002  itojun correct signedness mixup in pointer passing. sync w/kame
 1.22  30-Jun-2002  thorpej Changes to allow the IPv4 and IPv6 layers to align headers themselves,
as necessary:
* Implement a new mbuf utility routine, m_copyup(), which is like
m_pullup(), except that it always prepends and copies, rather
than only doing so if the desired length is larger than m->m_len.
m_copyup() also allows an offset into the destination mbuf, which
allows space for packet headers, in the forwarding case.
* Add *_HDR_ALIGNED_P() macros for IP, IPv6, ICMP, and IGMP. These
macros expand to 1 if __NO_STRICT_ALIGNMENT is defined, so that
architectures which do not have strict alignment constraints don't
pay for the test or visit the new align-if-needed path.
* Use the new macros to check if a header needs to be aligned, or to
assert that it already is, as appropriate.

Note: This code is still somewhat experimental. However, the new
code path won't be visited if individual device drivers continue
to guarantee that packets are delivered to layer 3 already properly
aligned (which are rules that are already in use).
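The shape of such an alignment predicate can be sketched as follows. Only the "expands to 1 when strict alignment is not required" behaviour comes from the text above. The 4-byte test in the other branch is an assumption, and `DEMO_HDR_ALIGNED_P` and `DEMO_NO_STRICT_ALIGNMENT` are hypothetical stand-ins for the kernel's macros.

```c
#include <assert.h>
#include <stdint.h>

#ifdef DEMO_NO_STRICT_ALIGNMENT
/* No alignment constraints: the test costs nothing. */
#define DEMO_HDR_ALIGNED_P(p)	1
#else
/* Assumed check: the header must start on a 4-byte boundary. */
#define DEMO_HDR_ALIGNED_P(p)	((((uintptr_t)(p)) & 3) == 0)
#endif
```

A caller would test the header pointer with the macro and take the copy-and-align path only when it evaluates to 0.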
 1.21  08-Jun-2002  itojun sync with latest KAME in6_ifaddr/prefix/default router manipulation.
behavior changes:
- two ioctls used by ndp(8) are now obsolete (backward compat provided).
use sysctl path instead.
- lo0 does not get ::1 automatically. it will get ::1 when lo0 comes up.
 1.20  07-Jun-2002  itojun sync IPV6_CHECKSUM handling with kame.
 1.19  28-May-2002  itojun limit number of IPv6 fragments (not the fragment queue size) to
fight against lots-of-frags DoS attacks. sync w/kame
 1.18  21-Dec-2001  itojun branches: 1.18.8; 1.18.10;
move in6_gif_hlim decl to in6_gif.c. sync with kame
 1.17  20-Dec-2001  itojun centralize multicast group management (in6_join/leavegroup).
have a flag for ip6_output() to fragment to minimum MTU.
sync with kame
 1.16  15-Oct-2001  itojun implement IPV6_V6ONLY socket option from draft-ietf-ipngwg-rfc2553bis-03.txt.
IPV6_BINDV6ONLY (netbsd only) is deprecated, but still works just like before.
 1.15  26-Aug-2000  itojun branches: 1.15.2; 1.15.4;
implement net.inet6.ip6.{anon,low}port{min,max} sysctl variable.
 1.14  13-Jul-2000  itojun remove m_pulldown statistics code. it is highly experimental and belongs
to kame tree only (not for *bsd).
 1.13  06-Jul-2000  itojun - do not use bitfield for router renumbering header.
- add protection mechanism against ND cache corruption due to bad NUD hints.
- more stats
- icmp6 pps limitation. TODO: should implement ppsratecheck(9).
 1.12  21-Mar-2000  itojun branches: 1.12.4;
cleanup AH/policy processing.
- parse IPv6 header by using common function, ip6_{last,next}hdr.
- fix behavior in multiple AH cases.
make strict boundary checks on mbuf chasing.
(sync with latest kame)
 1.11  26-Feb-2000  itojun implement rip6_ctlinput, to cope with routing changes correctly.
(IMHO we need rip_ctlinput as well)
 1.10  26-Feb-2000  itojun bring in recent KAME changes (only important and stable ones, as usual).
- remove net.inet6.ip6.nd6_proxyall. introduce proxy NDP code works
just like "arp -s".
- revise source address selection.
be more careful about use of yet-to-be-valid addresses as source.
- as router, transmit ICMP6_DST_UNREACH_BEYONDSCOPE against out-of-scope
packet forwarding attempt.
- path MTU discovery takes care of routing header properly.
- be more strict about mbuf chain parsing.
 1.9  03-Feb-2000  itojun - Don't reuse ip6 header portion as reassembly pointer, to be friendly
with LP64 arch. (not tested on LP64, sorry)
- add comment on reass rule
- some other cleanups

NetBSD PR: 9340
From: iwamoto@sat.t.u-tokyo.ac.jp
(in sync with kame)
 1.8  06-Jan-2000  itojun remove extra portability #ifdef (like #ifdef __FreeBSD__) in KAME IPv6/IPsec
code, from netbsd-current repository.
#ifdef'ed version is always available from ftp.kame.net.

XXX please do not make too many diff-unfriendly changes, we'll need to take
bunch of diffs on upgrade...
 1.7  06-Jan-2000  itojun make IPV6_BINDV6ONLY setsockopt available. it controls behavior of
AF_INET6 wildcard listening socket. heavily documented in ip6(4).
net.inet6.ip6.bindv6only defines default value. default is 1.

"options INET6_BINDV6ONLY" removes any code fragment that supports
IPV6_BINDV6ONLY == 0 case (not defopt'ed as use of this is rare).
 1.6  13-Dec-1999  itojun sync IPv6 part with latest KAME tree. IPsec part is left unmodified
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.

TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach

(sorry for jumbo commit, I can't separate this any more...)
 1.5  19-Nov-1999  bouyer Update protocol and interface stats counters to 64bit.
RTM_IFINFO is now 0xf, 0xe is RTM_OIFINFO which returns the old (if_msghdr14)
struct with 32bit counters (binary compat, conditioned on COMPAT_14).
Same for sysctl: node 3 is renamed NET_RT_OIFLIST, NET_RT_IFLIST is now node 4.
Change rt_msg1() to add an mbuf to the mbuf chain instead of just panic()
when the message is larger than MHLEN.
 1.4  22-Jul-1999  itojun branches: 1.4.2; 1.4.8;
change unnecessary u_long/long into u_int32_t or something relevant.
more fixes should follow.
 1.3  03-Jul-1999  thorpej RCS ID police.
 1.2  01-Jul-1999  itojun branches: 1.2.2;
IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
files to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.
 1.1  28-Jun-1999  itojun branches: 1.1.2;
file ip6_var.h was initially added on branch kame.
 1.1.2.3  30-Nov-1999  itojun bring in latest KAME (as of 19991130, KAME/NetBSD141) into kame branch
just for reference purposes.
This commit includes 1.4 -> 1.4.1 sync for kame branch.

The branch does not compile at all (due to the lack of ALTQ and some other
source code). Please do not try to modify the branch, this is just for
reference purposes.

synchronization to latest KAME will take place on HEAD branch soon.
 1.1.2.2  06-Jul-1999  itojun KAME/NetBSD 1.4, SNAP kit 1999/07/05.
NOTE: this branch is just for reference purposes (i.e. for taking cvs diff).
do not touch anything on the branch. actual work must be done on HEAD branch.
 1.1.2.1  28-Jun-1999  itojun KAME/NetBSD 1.4 SNAP kit, dated 19990628.

NOTE: this branch (kame) is used just for reference. this may not compile
due to multiple reasons.
 1.2.2.3  02-Aug-1999  thorpej Update from trunk.
 1.2.2.2  01-Jul-1999  thorpej Sync w/ -current.
 1.2.2.1  01-Jul-1999  thorpej file ip6_var.h was added on branch chs-ubc2 on 1999-07-01 23:48:28 +0000
 1.4.8.1  27-Dec-1999  wrstuden Pull up to last week's -current.
 1.4.2.1  20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.12.4.2  27-Aug-2000  itojun pullup (approved by releng-1-5)

> implement net.inet6.ip6.{anon,low}port{min,max} sysctl variable.

> cvs rdiff -r1.67 -r1.68 basesrc/lib/libc/gen/sysctl.3
> cvs rdiff -r1.53 -r1.54 basesrc/sbin/sysctl/sysctl.8
> cvs rdiff -r1.18 -r1.19 syssrc/sys/netinet6/in6.h
> cvs rdiff -r1.29 -r1.30 syssrc/sys/netinet6/in6_pcb.c
> cvs rdiff -r1.3 -r1.4 syssrc/sys/netinet6/in6_src.c
> cvs rdiff -r1.25 -r1.26 syssrc/sys/netinet6/ip6_input.c
> cvs rdiff -r1.14 -r1.15 syssrc/sys/netinet6/ip6_var.h
 1.12.4.1  14-Jul-2000  itojun pullup (approved by releng-1-5)

remove m_pulldown statistics code. it is highly experimental and belong
to kame tree only (not for *bsd).

1.4 -> 1.5 syssrc/sys/kern/uipc_mbuf2.c
1.8 -> 1.9 syssrc/sys/netinet/ip6.h
1.13 -> 1.14 syssrc/sys/netinet6/ip6_var.h
 1.15.4.4  10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.15.4.3  06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.15.4.2  23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.15.4.1  10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.15.2.5  17-Sep-2002  nathanw Catch up to -current.
 1.15.2.4  01-Aug-2002  nathanw Catch up to -current.
 1.15.2.3  20-Jun-2002  nathanw Catch up to -current.
 1.15.2.2  08-Jan-2002  nathanw Catch up to -current.
 1.15.2.1  22-Oct-2001  nathanw Catch up to -current.
 1.18.10.1  02-Oct-2003  tron Pull up revision 1.21 via patch (requested by itojun in ticket #1491):
sync with latest KAME in6_ifaddr/prefix/default router manipulation.
behavior changes:
- two ioctls used by ndp(8) are now obsolete (backward compat provided).
use sysctl path instead.
- lo0 does not get ::1 automatically. it will get ::1 when lo0 comes up.
 1.18.8.3  15-Jul-2002  gehenna catch up with -current.
 1.18.8.2  20-Jun-2002  gehenna catch up with -current.
 1.18.8.1  30-May-2002  gehenna Catch up with -current.
 1.26.2.5  19-Oct-2004  skrll Sync with HEAD
 1.26.2.4  21-Sep-2004  skrll Fix the sync with head I botched.
 1.26.2.3  18-Sep-2004  skrll Sync with HEAD.
 1.26.2.2  03-Aug-2004  skrll Sync with HEAD
 1.26.2.1  02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.32.6.1  04-Jun-2007  bouyer Pull up following revision(s) (requested by adrianp in ticket #11330):
sys/netinet6/ip6_input.c: revision 1.102 via patch
sys/netinet6/route6.c: revision 1.18 via patch
sys/netinet6/ip6_var.h: revisions 1.41-1.42 via patch
sbin/sysctl/sysctl.8: patch
Disable processing of routing header type 0 packets since they can be used
for DoS attacks. Provide a sysctl to re-enable them (net.inet6.ip6.rht0).
Information from:
http://www.secdev.org/conf/IPv6_RH_security-csw07.pdf
 1.32.4.1  04-Jun-2007  bouyer Pull up following revision(s) (requested by adrianp in ticket #11330):
sys/netinet6/ip6_input.c: revision 1.102 via patch
sys/netinet6/route6.c: revision 1.18 via patch
sys/netinet6/ip6_var.h: revisions 1.41-1.42 via patch
sbin/sysctl/sysctl.8: patch
Disable processing of routing header type 0 packets since they can be used
for DoS attacks. Provide a sysctl to re-enable them (net.inet6.ip6.rht0).
Information from:
http://www.secdev.org/conf/IPv6_RH_security-csw07.pdf
 1.32.2.1  04-Jun-2007  bouyer Pull up following revision(s) (requested by adrianp in ticket #11330):
sys/netinet6/ip6_input.c: revision 1.102 via patch
sys/netinet6/route6.c: revision 1.18 via patch
sys/netinet6/ip6_var.h: revisions 1.41-1.42 via patch
sbin/sysctl/sysctl.8: patch
Disable processing of routing header type 0 packets since they can be used
for DoS attacks. Provide a sysctl to re-enable them (net.inet6.ip6.rht0).
Information from:
http://www.secdev.org/conf/IPv6_RH_security-csw07.pdf
 1.33.22.1  26-Apr-2007  ghen Pull up following revision(s) (requested by christos in ticket #1766):
sys/netinet6/ip6_input.c: revision 1.102 via patch
sys/netinet6/route6.c: revision 1.18 via patch
sys/netinet6/ip6_var.h: revision 1.41 via patch
sys/netinet6/ip6_var.h: revision 1.42 via patch
sbin/sysctl/sysctl.8: patch
Disable processing of routing header type 0 packets since they can be used
for DoS attacks. Provide a sysctl to re-enable them (net.inet6.ip6.rht0).
Information from:
http://www.secdev.org/conf/IPv6_RH_security-csw07.pdf
fix typo.
 1.33.20.1  26-Apr-2007  ghen Pull up following revision(s) (requested by christos in ticket #1766):
sys/netinet6/ip6_input.c: revision 1.102 via patch
sys/netinet6/route6.c: revision 1.18 via patch
sys/netinet6/ip6_var.h: revision 1.41 via patch
sys/netinet6/ip6_var.h: revision 1.42 via patch
sbin/sysctl/sysctl.8: patch
Disable processing of routing header type 0 packets since they can be used
for DoS attacks. Provide a sysctl to re-enable them (net.inet6.ip6.rht0).
Information from:
http://www.secdev.org/conf/IPv6_RH_security-csw07.pdf
fix typo.
 1.33.12.5  24-Mar-2008  yamt sync with head.
 1.33.12.4  15-Nov-2007  yamt sync with head.
 1.33.12.3  03-Sep-2007  yamt sync with head.
 1.33.12.2  26-Feb-2007  yamt sync with head.
 1.33.12.1  21-Jun-2006  yamt sync with head.
 1.33.10.1  26-Apr-2007  ghen Pull up following revision(s) (requested by christos in ticket #1766):
sys/netinet6/ip6_input.c: revision 1.102 via patch
sys/netinet6/route6.c: revision 1.18 via patch
sys/netinet6/ip6_var.h: revision 1.41 via patch
sys/netinet6/ip6_var.h: revision 1.42 via patch
sbin/sysctl/sysctl.8: patch
Disable processing of routing header type 0 packets since they can be used
for DoS attacks. Provide a sysctl to re-enable them (net.inet6.ip6.rht0).
Information from:
http://www.secdev.org/conf/IPv6_RH_security-csw07.pdf
fix typo.
 1.34.2.1  01-Feb-2006  yamt sync with head.
 1.35.6.2  24-May-2006  yamt sync with head.
 1.35.6.1  13-Mar-2006  yamt sync with head.
 1.35.4.2  01-Jun-2006  kardel Sync with head.
 1.35.4.1  22-Apr-2006  simonb Sync with head.
 1.35.2.2  09-Sep-2006  rpaulo sync with head
 1.35.2.1  07-Feb-2006  rpaulo in6pcb -> inpcb.
 1.36.4.1  24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.36.2.1  11-May-2006  elad sync with head
 1.37.14.5  17-May-2007  yamt sync with head.
 1.37.14.4  07-May-2007  yamt sync with head.
 1.37.14.3  24-Mar-2007  yamt sync with head.
 1.37.14.2  12-Mar-2007  rmind Sync with HEAD.
 1.37.14.1  27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.37.12.1  28-Apr-2007  bouyer Pull up following revision(s) (requested by christos in ticket #587):
sys/netinet6/ip6_input.c: revision 1.102
sys/netinet6/route6.c: revision 1.18
sys/netinet6/ip6_var.h: revision 1.41
sys/netinet6/ip6_var.h: revision 1.42
sbin/sysctl/sysctl.8: patch
Disable processing of routing header type 0 packets since they can be used
for DoS attacks. Provide a sysctl to re-enable them (net.inet6.ip6.rht0).
Information from:
http://www.secdev.org/conf/IPv6_RH_security-csw07.pdf
fix typo.
 1.39.6.1  29-Mar-2007  reinoud Pullup to -current
 1.39.4.1  11-Jul-2007  mjf Sync with head.
 1.39.2.3  20-Aug-2007  ad Sync with HEAD.
 1.39.2.2  08-Jun-2007  ad Sync with head.
 1.39.2.1  10-Apr-2007  ad Sync with head.
 1.44.2.1  15-Aug-2007  skrll Sync with HEAD.
 1.45.12.2  19-Jul-2007  dyoung Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

Introduce rt_walktree() for walking the routing table and
applying a function to each rtentry. Replace most
rn_walktree() calls with it.

Use rt_getkey()/rt_setkey() to get/set a route's destination.
Keep a pointer to the sockaddr key in the rtentry, so that
rtentry users do not have to grovel in the radix_node for
the key.

Add a RTM_GET method to rtrequest. Use that instead of
radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

Constify. KNF. Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
et cetera. Use NULL instead of 0 for null pointers. Use
__arraycount(). Reduce gratuitous parenthesization.

Stop using variadic arguments for rip6_output(), it is
unnecessary.

Remove the unnecessary rtentry member rt_genmask and the
code to maintain it, since nothing actually used it.

Make rt_maskedcopy() easier to read by using meaningful variable
names.

Extract a subroutine intern_netmask() for looking up a netmask in
the masks table.

Start converting backslash-ridden IPv6 macros in
sys/netinet6/in6_var.h into inline subroutines that one
can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
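
The functional change noted in this entry (applying a supplied netmask to the destination before the table lookup) can be sketched as a bytewise AND. This is only an illustration on raw byte arrays; the helper name toy_apply_netmask is hypothetical and does not reflect the kernel's sockaddr handling or rt_maskedcopy().

```c
#include <stddef.h>

/*
 * Hypothetical helper: clear the host bits of a destination address in
 * place by ANDing each byte with the matching netmask byte. Both buffers
 * are assumed to hold len bytes.
 */
static void
toy_apply_netmask(unsigned char *dst, const unsigned char *mask, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] &= mask[i];
}
```

After masking, a lookup keyed on the result matches the network route even when the caller passed a host address inside that network.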
 1.45.12.1  19-Jul-2007  dyoung file ip6_var.h was added on branch matt-mips64 on 2007-07-19 20:48:58 +0000
 1.45.10.1  13-Nov-2007  bouyer Sync with HEAD
 1.45.6.2  23-Mar-2008  matt sync with HEAD
 1.45.6.1  06-Nov-2007  matt sync with HEAD
 1.45.4.1  31-Oct-2007  joerg Sync with HEAD.
 1.46.16.3  28-Sep-2008  mjf Sync with HEAD.
 1.46.16.2  02-Jun-2008  mjf Sync with HEAD.
 1.46.16.1  03-Apr-2008  mjf Sync with HEAD.
 1.46.12.2  24-Mar-2008  keiichi sync with head.
 1.46.12.1  22-Feb-2008  keiichi imported Mobile IPv6 code developed by the SHISA project
(http://www.mobileip.jp/).
 1.49.2.1  18-May-2008  yamt sync with head.
 1.50.8.1  19-Oct-2008  haad Sync with HEAD.
 1.50.4.1  18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.50.2.2  16-May-2009  yamt sync with head
 1.50.2.1  04-May-2009  yamt sync with head.
 1.51.8.1  13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.51.2.1  28-Apr-2009  skrll Sync with HEAD.
 1.53.6.1  06-Jun-2011  jruoho Sync with HEAD.
 1.53.4.1  31-May-2011  rmind sync with head
 1.55.4.3  30-Oct-2012  yamt sync with head
 1.55.4.2  17-Apr-2012  yamt sync with head
 1.55.4.1  10-Nov-2011  yamt sync with head
 1.56.4.1  18-Feb-2012  mrg merge to -current.
 1.58.8.2  30-Jan-2018  martin Pull up following revision(s) (requested by maxv in ticket #1523):
sys/netinet6/frag6.c: revision 1.65
sys/netinet6/ip6_input.c: revision 1.187
sys/netinet6/ip6_var.h: revision 1.78
sys/netinet6/raw_ip6.c: revision 1.160 (patch)
sys/netinet6/ah_input.c: adjust other callers (patch)
sys/netinet6/esp_input.c: adjust other callers (patch)
sys/netinet6/ipcomp_input.c: adjust other callers (patch)
Fix a buffer overflow in ip6_get_prevhdr. Doing
mtod(m, char *) + len
is wrong, an option is allowed to be located in another mbuf of the chain.
If the offset of an option within the chain is bigger than the length of
the first mbuf in that chain, we are reading/writing one byte of packet-
controlled data beyond the end of the first mbuf.
The length of this first mbuf depends on the layout the network driver
chose. In the most difficult case, it will allocate a 2KB cluster, which
is bigger than the Ethernet MTU.
But there is at least one way of exploiting this case: by sending a
special combination of nested IPv6 fragments, the packet can control a
good bunch of 'len'. By luck, the memory pool containing clusters does not
embed the pool header in front of the items, so it is not straightforward
to predict what is located at 'mtod(m, char *) + len'.
However, by sending offending fragments in a loop, it is possible to
crash the kernel - at some point we will hit important data structures.
As far as I can tell, PF protects against this difficult case, because
it kicks nested fragments. NPF does not protect against this. IPF I don't
know.
Then there are the more easy cases, if the MTU is bigger than a cluster,
or if the network driver did not allocate a cluster, or perhaps if the
fragments are received via a tunnel; I haven't investigated these cases.
Change ip6_get_prevhdr so that it returns an offset in the chain, and
always use IP6_EXTHDR_GET to get a writable pointer. IP6_EXTHDR_GET
leaves M_PKTHDR untouched.
This place is still fragile.
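
The problem described in this entry (adding an offset to the first buffer's data pointer when the byte may live in a later buffer of the chain) can be illustrated with a toy chain walk. This is a sketch under stated assumptions: struct toy_mbuf and toy_chain_byte are hypothetical stand-ins, not the kernel's struct mbuf, mtod() or IP6_EXTHDR_GET.

```c
#include <stddef.h>

/* Toy stand-in for a kernel mbuf; hypothetical, for illustration only. */
struct toy_mbuf {
    struct toy_mbuf *next;  /* next buffer in the chain, or NULL */
    unsigned char   *data;  /* start of this buffer's bytes */
    size_t           len;   /* number of valid bytes in this buffer */
};

/*
 * Return a pointer to the byte at chain offset off, or NULL when the
 * offset lies beyond the end of the chain. Unlike "first->data + off",
 * this never steps past the buffer that actually holds the byte.
 */
static unsigned char *
toy_chain_byte(struct toy_mbuf *m, size_t off)
{
    for (; m != NULL; m = m->next) {
        if (off < m->len)
            return m->data + off;
        off -= m->len;  /* skip this buffer and keep walking */
    }
    return NULL;
}
```

With two 4-byte buffers chained together, offset 5 resolves to the second byte of the second buffer, whereas pointer arithmetic on the first buffer alone would read past its end.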
 1.58.8.1  08-Jul-2013  jdc Pull up revisions:
src/share/man/man7/sysctl.7 revision 1.73 via patch
src/sys/netinet6/icmp6.c revision 1.161 via patch
src/sys/netinet6/in6.c revision 1.161 via patch
src/sys/netinet6/in6_proto.c revision 1.97 via patch
src/sys/netinet6/in6_var.h revision 1.65 via patch
src/sys/netinet6/ip6_input.c revision 1.139 via patch
src/sys/netinet6/ip6_var.h revision 1.59 via patch
src/sys/netinet6/nd6.c revision 1.143 via patch
src/sys/netinet6/nd6.h revision 1.57 via patch
src/sys/netinet6/nd6_rtr.c revision 1.83 via patch
(requested by christos in ticket #905).
Patch by Loganaden Velvindron.

4 new sysctls to avoid ipv6 DoS attacks from OpenBSD
 1.58.6.2  30-Jan-2018  martin Pull up following revision(s) (requested by maxv in ticket #1523):
sys/netinet6/frag6.c: revision 1.65
sys/netinet6/ip6_input.c: revision 1.187
sys/netinet6/ip6_var.h: revision 1.78
sys/netinet6/raw_ip6.c: revision 1.160 (patch)
sys/netinet6/ah_input.c: adjust other callers (patch)
sys/netinet6/esp_input.c: adjust other callers (patch)
sys/netinet6/ipcomp_input.c: adjust other callers (patch)
Fix a buffer overflow in ip6_get_prevhdr. Doing
mtod(m, char *) + len
is wrong, an option is allowed to be located in another mbuf of the chain.
If the offset of an option within the chain is bigger than the length of
the first mbuf in that chain, we are reading/writing one byte of packet-
controlled data beyond the end of the first mbuf.
The length of this first mbuf depends on the layout the network driver
chose. In the most difficult case, it will allocate a 2KB cluster, which
is bigger than the Ethernet MTU.
But there is at least one way of exploiting this case: by sending a
special combination of nested IPv6 fragments, the packet can control a
good bunch of 'len'. By luck, the memory pool containing clusters does not
embed the pool header in front of the items, so it is not straightforward
to predict what is located at 'mtod(m, char *) + len'.
However, by sending offending fragments in a loop, it is possible to
crash the kernel - at some point we will hit important data structures.
As far as I can tell, PF protects against this difficult case, because
it kicks nested fragments. NPF does not protect against this. IPF I don't
know.
Then there are the more easy cases, if the MTU is bigger than a cluster,
or if the network driver did not allocate a cluster, or perhaps if the
fragments are received via a tunnel; I haven't investigated these cases.
Change ip6_get_prevhdr so that it returns an offset in the chain, and
always use IP6_EXTHDR_GET to get a writable pointer. IP6_EXTHDR_GET
leaves M_PKTHDR untouched.
This place is still fragile.
 1.58.6.1  08-Jul-2013  jdc Pull up revisions:
src/share/man/man7/sysctl.7 revision 1.73 via patch
src/sys/netinet6/icmp6.c revision 1.161 via patch
src/sys/netinet6/in6.c revision 1.161 via patch
src/sys/netinet6/in6_proto.c revision 1.97 via patch
src/sys/netinet6/in6_var.h revision 1.65 via patch
src/sys/netinet6/ip6_input.c revision 1.139 via patch
src/sys/netinet6/ip6_var.h revision 1.59 via patch
src/sys/netinet6/nd6.c revision 1.143 via patch
src/sys/netinet6/nd6.h revision 1.57 via patch
src/sys/netinet6/nd6_rtr.c revision 1.83 via patch
(requested by christos in ticket #905).
Patch by Loganaden Velvindron.

4 new sysctls to avoid ipv6 DoS attacks from OpenBSD
 1.58.2.2  30-Jan-2018  martin Pull up following revision(s) (requested by maxv in ticket #1523):
sys/netinet6/frag6.c: revision 1.65
sys/netinet6/ip6_input.c: revision 1.187
sys/netinet6/ip6_var.h: revision 1.78
sys/netinet6/raw_ip6.c: revision 1.160 (patch)
sys/netinet6/ah_input.c: adjust other callers (patch)
sys/netinet6/esp_input.c: adjust other callers (patch)
sys/netinet6/ipcomp_input.c: adjust other callers (patch)
Fix a buffer overflow in ip6_get_prevhdr. Doing
mtod(m, char *) + len
is wrong, an option is allowed to be located in another mbuf of the chain.
If the offset of an option within the chain is bigger than the length of
the first mbuf in that chain, we are reading/writing one byte of packet-
controlled data beyond the end of the first mbuf.
The length of this first mbuf depends on the layout the network driver
chose. In the most difficult case, it will allocate a 2KB cluster, which
is bigger than the Ethernet MTU.
But there is at least one way of exploiting this case: by sending a
special combination of nested IPv6 fragments, the packet can control a
good bunch of 'len'. By luck, the memory pool containing clusters does not
embed the pool header in front of the items, so it is not straightforward
to predict what is located at 'mtod(m, char *) + len'.
However, by sending offending fragments in a loop, it is possible to
crash the kernel - at some point we will hit important data structures.
As far as I can tell, PF protects against this difficult case, because
it kicks nested fragments. NPF does not protect against this. IPF I don't
know.
Then there are the more easy cases, if the MTU is bigger than a cluster,
or if the network driver did not allocate a cluster, or perhaps if the
fragments are received via a tunnel; I haven't investigated these cases.
Change ip6_get_prevhdr so that it returns an offset in the chain, and
always use IP6_EXTHDR_GET to get a writable pointer. IP6_EXTHDR_GET
leaves M_PKTHDR untouched.
This place is still fragile.
 1.58.2.1  08-Jul-2013  jdc Pull up revisions:
src/share/man/man7/sysctl.7 revision 1.73 via patch
src/sys/netinet6/icmp6.c revision 1.161 via patch
src/sys/netinet6/in6.c revision 1.161 via patch
src/sys/netinet6/in6_proto.c revision 1.97 via patch
src/sys/netinet6/in6_var.h revision 1.65 via patch
src/sys/netinet6/ip6_input.c revision 1.139 via patch
src/sys/netinet6/ip6_var.h revision 1.59 via patch
src/sys/netinet6/nd6.c revision 1.143 via patch
src/sys/netinet6/nd6.h revision 1.57 via patch
src/sys/netinet6/nd6_rtr.c revision 1.83 via patch
(requested by christos in ticket #905).
Patch by Loganaden Velvindron.

4 new sysctls to avoid ipv6 DoS attacks from OpenBSD
 1.59.12.1  10-Aug-2014  tls Rebase.
 1.59.4.1  28-Aug-2013  rmind Checkpoint work in progress:
- Initial split of the protocol user-request method into the following
methods: pr_attach, pr_detach and pr_generic for the old pr_usrreq.
- Adjust socreate(9) and sonewconn(9) to call pr_attach without the
socket lock held (as a preparation for the locking scheme adjustment).
- Adjust all pr_attach routines to assert that PCB is not set.
- Sprinkle various comments, document some routines and their locking.
- Remove M_PCB, replace with kmem(9).
- Fix few bugs spotted on the way.
 1.59.2.2  03-Dec-2017  jdolecek update from HEAD
 1.59.2.1  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.62.2.2  30-Jan-2018  martin Pull up following revision(s) (requested by maxv in ticket #1560):
sys/netinet6/frag6.c: revision 1.65
sys/netinet6/ip6_input.c: revision 1.187
sys/netinet6/ip6_var.h: revision 1.78
sys/netinet6/raw_ip6.c: revision 1.160 (patch)
Fix a buffer overflow in ip6_get_prevhdr. Doing
mtod(m, char *) + len
is wrong, an option is allowed to be located in another mbuf of the chain.
If the offset of an option within the chain is bigger than the length of
the first mbuf in that chain, we are reading/writing one byte of packet-
controlled data beyond the end of the first mbuf.
The length of this first mbuf depends on the layout the network driver
chose. In the most difficult case, it will allocate a 2KB cluster, which
is bigger than the Ethernet MTU.
But there is at least one way of exploiting this case: by sending a
special combination of nested IPv6 fragments, the packet can control a
good bunch of 'len'. By luck, the memory pool containing clusters does not
embed the pool header in front of the items, so it is not straightforward
to predict what is located at 'mtod(m, char *) + len'.
However, by sending offending fragments in a loop, it is possible to
crash the kernel - at some point we will hit important data structures.
As far as I can tell, PF protects against this difficult case, because
it kicks nested fragments. NPF does not protect against this. IPF I don't
know.
Then there are the more easy cases, if the MTU is bigger than a cluster,
or if the network driver did not allocate a cluster, or perhaps if the
fragments are received via a tunnel; I haven't investigated these cases.
Change ip6_get_prevhdr so that it returns an offset in the chain, and
always use IP6_EXTHDR_GET to get a writable pointer. IP6_EXTHDR_GET
leaves M_PKTHDR untouched.
This place is still fragile.
 1.62.2.1  23-Jan-2015  martin branches: 1.62.2.1.2; 1.62.2.1.6;
Pull up following revision(s) (requested by pettai in ticket #441):
sys/netinet6/ip6_var.h: revision 1.64
sys/netinet6/in6.h: revision 1.82
sys/netinet6/in6_src.c: revision 1.56
sys/netinet6/mld6.c: revision 1.62
sys/netinet6/ip6_input.c: revision 1.150
sys/netinet6/ip6_output.c: revision 1.161
Add net.inet6.ip6.prefer_tempaddr sysctl knob so that we can prefer
IPv6 temporary addresses as the source address.
Fixes PR kern/47100 based on a patch by Dieter Roelants.
 1.62.2.1.6.1  30-Jan-2018  martin Pull up following revision(s) (requested by maxv in ticket #1560):
sys/netinet6/frag6.c: revision 1.65
sys/netinet6/ip6_input.c: revision 1.187
sys/netinet6/ip6_var.h: revision 1.78
sys/netinet6/raw_ip6.c: revision 1.160 (patch)
Fix a buffer overflow in ip6_get_prevhdr. Doing
mtod(m, char *) + len
is wrong, an option is allowed to be located in another mbuf of the chain.
If the offset of an option within the chain is bigger than the length of
the first mbuf in that chain, we are reading/writing one byte of packet-
controlled data beyond the end of the first mbuf.
The length of this first mbuf depends on the layout the network driver
chose. In the most difficult case, it will allocate a 2KB cluster, which
is bigger than the Ethernet MTU.
But there is at least one way of exploiting this case: by sending a
special combination of nested IPv6 fragments, the packet can control a
good bunch of 'len'. By luck, the memory pool containing clusters does not
embed the pool header in front of the items, so it is not straightforward
to predict what is located at 'mtod(m, char *) + len'.
However, by sending offending fragments in a loop, it is possible to
crash the kernel - at some point we will hit important data structures.
As far as I can tell, PF protects against this difficult case, because
it kicks nested fragments. NPF does not protect against this. IPF I don't
know.
Then there are the more easy cases, if the MTU is bigger than a cluster,
or if the network driver did not allocate a cluster, or perhaps if the
fragments are received via a tunnel; I haven't investigated these cases.
Change ip6_get_prevhdr so that it returns an offset in the chain, and
always use IP6_EXTHDR_GET to get a writable pointer. IP6_EXTHDR_GET
leaves M_PKTHDR untouched.
This place is still fragile.
 1.62.2.1.2.1  30-Jan-2018  martin Pull up following revision(s) (requested by maxv in ticket #1560):
sys/netinet6/frag6.c: revision 1.65
sys/netinet6/ip6_input.c: revision 1.187
sys/netinet6/ip6_var.h: revision 1.78
sys/netinet6/raw_ip6.c: revision 1.160 (patch)
Fix a buffer overflow in ip6_get_prevhdr. Doing
mtod(m, char *) + len
is wrong, an option is allowed to be located in another mbuf of the chain.
If the offset of an option within the chain is bigger than the length of
the first mbuf in that chain, we are reading/writing one byte of packet-
controlled data beyond the end of the first mbuf.
The length of this first mbuf depends on the layout the network driver
chose. In the most difficult case, it will allocate a 2KB cluster, which
is bigger than the Ethernet MTU.
But there is at least one way of exploiting this case: by sending a
special combination of nested IPv6 fragments, the packet can control a
good bunch of 'len'. By luck, the memory pool containing clusters does not
embed the pool header in front of the items, so it is not straightforward
to predict what is located at 'mtod(m, char *) + len'.
However, by sending offending fragments in a loop, it is possible to
crash the kernel - at some point we will hit important data structures.
As far as I can tell, PF protects against this difficult case, because
it kicks nested fragments. NPF does not protect against this. IPF I don't
know.
Then there are the more easy cases, if the MTU is bigger than a cluster,
or if the network driver did not allocate a cluster, or perhaps if the
fragments are received via a tunnel; I haven't investigated these cases.
Change ip6_get_prevhdr so that it returns an offset in the chain, and
always use IP6_EXTHDR_GET to get a writable pointer. IP6_EXTHDR_GET
leaves M_PKTHDR untouched.
This place is still fragile.
 1.63.2.6  28-Aug-2017  skrll Sync with HEAD
 1.63.2.5  05-Feb-2017  skrll Sync with HEAD
 1.63.2.4  05-Dec-2016  skrll Sync with HEAD
 1.63.2.3  05-Oct-2016  skrll Sync with HEAD
 1.63.2.2  09-Jul-2016  skrll Sync with HEAD
 1.63.2.1  06-Apr-2015  skrll Sync with HEAD
 1.67.2.3  20-Mar-2017  pgoyette Sync with HEAD
 1.67.2.2  07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.67.2.1  04-Nov-2016  pgoyette Sync with HEAD
 1.71.2.1  21-Apr-2017  bouyer Sync with HEAD
 1.74.6.4  07-Mar-2021  martin Pull up following revision(s) (requested by christos in ticket #1661):

sys/netinet6/ip6_id.c: revision 1.19-1.21
sys/netinet6/ip6_var.h: revision 1.88
sys/netinet/ip_input.c: revision 1.400
sys/netinet/tcp_subr.c: revision 1.285
sys/netinet/ip6.h: revision 1.30

netinet: Enable random IP fragment ids by default (from riastradh)

netinet: Enable RFC 1948 pseudorandom TCP ISS selection by default.
(from riastradh)

netinet6: Mark randomid unused.

Will make merging and bisection easier if anything goes wrong with
flow label or fragment id randomization changes.
(from riastradh)

netinet/netinet6: Add necessary includes to make these standalone.
(from riastradh)

Replace randomid() by cprng_fast32()
 1.74.6.3  27-Sep-2018  martin Additional change needed for ticket #1041:

sys/netinet6/ip6_var.h (apply patch)

When reassembling IPv4/IPv6 packets, ensure each fragment has been subject
to the same IPsec processing. That is to say, that all fragments are ESP,
or AH, or AH+ESP, or none.

Add ipsec flags to struct ip6q.
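
The reassembly rule in this entry can be sketched as a comparison of per-fragment flag sets. The flag names and the helper below are hypothetical and are not taken from struct ip6q; the sketch only shows the "all fragments must match" check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bits recording which IPsec processing a fragment received. */
#define TOY_IPSEC_ESP 0x1u
#define TOY_IPSEC_AH  0x2u

/*
 * Return true when every fragment carries the same flag set as the first
 * one (all ESP, all AH, all AH+ESP, or all none). An empty or
 * single-element list is trivially consistent.
 */
static bool
toy_frags_ipsec_consistent(const unsigned *flags, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (flags[i] != flags[0])
            return false;
    }
    return true;
}
```

A reassembly queue could store the first fragment's flags once and reject any later fragment whose flags differ, which is equivalent to the check above.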
 1.74.6.2  11-Feb-2018  snj Pull up following revision(s) (requested by ozaki-r in ticket #536):
distrib/sets/lists/base/shl.mi: 1.825
distrib/sets/lists/comp/mi: 1.2168-1.2169
distrib/sets/lists/comp/shl.mi: 1.310
distrib/sets/lists/debug/mi: 1.234
distrib/sets/lists/debug/shl.mi: 1.188
distrib/sets/lists/man/mi: 1.1570
distrib/sets/lists/tests/mi: 1.772
etc/mtree/NetBSD.dist.tests: 1.150
share/man/man4/Makefile: 1.650
share/man/man4/ipsec.4: 1.42-1.43
share/man/man4/ipsecif.4: 1.1-1.5
sys/arch/amd64/conf/ALL: 1.77
sys/arch/amd64/conf/GENERIC: 1.480
sys/conf/files: 1.1191
sys/net/Makefile: 1.34
sys/net/files.net: 1.14
sys/net/if.c: 1.404
sys/net/if.h: 1.248
sys/net/if_gif.c: 1.135
sys/net/if_ipsec.c: 1.1-1.3
sys/net/if_ipsec.h: 1.1
sys/net/if_l2tp.c: 1.16
sys/net/if_types.h: 1.28
sys/netinet/in.c: 1.214
sys/netinet/in.h: 1.103
sys/netinet/in_gif.c: 1.92
sys/netinet/ip_var.h: 1.122
sys/netinet6/in6.c: 1.257
sys/netinet6/in6.h: 1.88
sys/netinet6/in6_gif.c: 1.90
sys/netinet6/ip6_var.h: 1.75
sys/netipsec/Makefile: 1.6
sys/netipsec/files.netipsec: 1.13
sys/netipsec/ipsec.h: 1.62
sys/netipsec/ipsecif.c: 1.1
sys/netipsec/ipsecif.h: 1.1
sys/netipsec/key.c: 1.246-1.247
sys/netipsec/key.h: 1.34
sys/rump/net/Makefile.rumpnetcomp: 1.20
sys/rump/net/lib/libipsec/IPSEC.ioconf: 1.1
sys/rump/net/lib/libipsec/Makefile: 1.1
sys/rump/net/lib/libipsec/ipsec_component.c: 1.1
tests/net/Makefile: 1.34
tests/net/if_ipsec/Makefile: 1.1
tests/net/if_ipsec/t_ipsec.sh: 1.1-1.2
Don't touch an SP without a reference to it
unify processing to check nesting count for some tunnel protocols.
add ipsec(4) interface, which is used for route-based VPN.
man and ATF are added later, please see man for details.
reviewed by christos@n.o, joerg@n.o and ozaki-r@n.o, thanks.
https://mail-index.netbsd.org/tech-net/2017/12/18/msg006557.html
ipsec(4) interface supports rump now.
add ipsec(4) interface ATF.
add ipsec(4) interface man as ipsecif.4.
add ipsec(4) interface to amd64/GENERIC and amd64/ALL configs.
apply in{,6}_tunnel_validate() to gif(4).
Spell IPsec that way. Simplify macro usage. Sort SEE ALSO. Bump
date for previous.
Improve wording and macro use.
Some parts are not clear to me, so someone with knowledge of ipsecif(4)
should improve this some more.
Improve ipsecif.4. Default port ipsec(4) NAT-T is tested now.
pointed out by wiz@n.o and suggested by ozaki-r@n.o, thanks.
Change the prefix of test names to ipsecif_ to distinguish from tests for ipsec(4)
New sentence, new line. Remove empty macro.
Fix PR kern/52920. Pointed out by David Binderman, thanks.
Improve wording, and put a new drawing, from me and Kengo Nakahara.
apply a little more #ifdef INET/INET6. fixes !INET6 builds.
 1.74.6.1  30-Jan-2018  martin Pull up following revision(s) (requested by maxv in ticket #527):
sys/netinet6/frag6.c: revision 1.65
sys/netinet6/ip6_input.c: revision 1.187
sys/netinet6/ip6_var.h: revision 1.78
sys/netinet6/raw_ip6.c: revision 1.160
Fix a buffer overflow in ip6_get_prevhdr. Doing
mtod(m, char *) + len
is wrong: an option is allowed to be located in another mbuf of the chain.
If the offset of an option within the chain is bigger than the length of
the first mbuf in that chain, we are reading/writing one byte of packet-
controlled data beyond the end of the first mbuf.
The length of this first mbuf depends on the layout the network driver
chose. In the most difficult case, it will allocate a 2KB cluster, which
is bigger than the Ethernet MTU.
But there is at least one way of exploiting this case: by sending a
special combination of nested IPv6 fragments, the sender can control a
good part of 'len'. By luck, the memory pool containing clusters does not
embed the pool header in front of the items, so it is not straightforward
to predict what is located at 'mtod(m, char *) + len'.
However, by sending offending fragments in a loop, it is possible to
crash the kernel - at some point we will hit important data structures.
As far as I can tell, PF protects against this difficult case, because
it kicks nested fragments. NPF does not protect against this. IPF I don't
know.
Then there are the easier cases, if the MTU is bigger than a cluster,
or if the network driver did not allocate a cluster, or perhaps if the
fragments are received via a tunnel; I haven't investigated these cases.
Change ip6_get_prevhdr so that it returns an offset in the chain, and
always use IP6_EXTHDR_GET to get a writable pointer. IP6_EXTHDR_GET
leaves M_PKTHDR untouched.
This place is still fragile.
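
The difference between indexing into the first mbuf and indexing into the
whole chain can be illustrated with a small userland sketch. Everything in
it is an assumption made for illustration: struct toy_mbuf and
toy_chain_byte() are hypothetical stand-ins, not the kernel's struct mbuf,
ip6_get_prevhdr or IP6_EXTHDR_GET, and the sketch only shows the general
idea of resolving an offset against the chain rather than the actual fix.

```c
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for a kernel mbuf: a chain of
 * buffers, each holding `len` valid bytes at `data`. This is not the
 * real struct mbuf.
 */
struct toy_mbuf {
	struct toy_mbuf *next;	/* next buffer in the chain, or NULL */
	unsigned char *data;	/* start of this buffer's bytes */
	size_t len;		/* number of valid bytes in this buffer */
};

/*
 * The unsafe pattern is the equivalent of `m->data + off`: it assumes
 * that `off` falls inside the first buffer. When off >= m->len the
 * resulting pointer lies beyond that buffer.
 *
 * This helper instead treats `off` as an offset into the whole chain
 * and walks the buffers until it reaches the one that contains it.
 * It returns NULL if the chain holds fewer than off + 1 bytes.
 */
unsigned char *
toy_chain_byte(struct toy_mbuf *m, size_t off)
{
	for (; m != NULL; m = m->next) {
		if (off < m->len)
			return m->data + off;
		off -= m->len;	/* skip this buffer, keep walking */
	}
	return NULL;
}
```

With a chain of two 4-byte buffers, offset 5 resolves to the second byte
of the second buffer, whereas adding 5 to the first buffer's data pointer
would point outside that buffer.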
 1.80.4.1  10-Jun-2019  christos Sync with HEAD
 1.80.2.1  26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.82.2.1  07-Mar-2021  martin Pull up following revision(s) (requested by christos in ticket #1226):

sys/netinet6/ip6_id.c: revision 1.19-1.21
sys/netinet6/ip6_var.h: revision 1.88
sys/netinet/ip_input.c: revision 1.400
sys/netinet/tcp_subr.c: revision 1.285
sys/netinet/ip6.h: revision 1.30

netinet: Enable random IP fragment ids by default (from riastradh)

netinet: Enable RFC 1948 pseudorandom TCP ISS selection by default.
(from riastradh)

netinet6: Mark randomid unused.

Will make merging and bisection easier if anything goes wrong with
flow label or fragment id randomization changes.
(from riastradh)

netinet/netinet6: Add necessary includes to make these standalone.
(from riastradh)

Replace randomid() by cprng_fast32()
 1.87.2.1  03-Apr-2021  thorpej Sync with HEAD.
