History log of /src/sys/net/if_tun.c
Revision  Date  Author  Comments
 1.177  18-Sep-2024  rin tun(4): Mark tunread_filtops `FILTEROP_MPSAFE`

Filter handlers have already been MP-safe since 2018:
https://mail-index.netbsd.org/source-changes/2018/08/06/msg097317.html

Note that we do not expect deadlocks similar to bpf(4) (PR kern/58531),
between KERNEL_LOCK and the spin mutex for the TX queue.

For tun(4), filt_tunread() acquires an adaptive mutex. This is forbidden
when a spin mutex is already held.

Such a path would already have been detected if present.

Thanks ozaki-r@ for discussion.
 1.176  05-Jul-2024  rin sys: Drop redundant NULL check before m_freem(9)

m_freem(9) has safely accepted a NULL argument at least since 4.2BSD:
https://www.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/src/sys/sys/uipc_mbuf.c

Compile-tested on amd64/ALL.

Suggested by knakahara@
 1.175  09-Mar-2024  riastradh tun(4): Allow IPv6 packets with TUNSLMODE configured.

PR kern/58013
 1.174  29-Dec-2023  chs tun: add missing kpreempt_enable() if pktq_enqueue() fails
 1.173  28-Mar-2022  riastradh branches: 1.173.4; 1.173.8;
driver(9): devsw_detach never fails. Make it return void.

Prune a whole lotta dead branches as a result of this. (Some logic
calling this is also wrong for other reasons; devsw_detach is final
-- you should never have any reason to decide to roll it back. To be
cleaned up in subsequent commits...)

XXX kernel ABI change to devsw_detach signature requires bump
 1.172  15-Mar-2022  riastradh tun(4): Fix bug introduced in previous locking change.

Now that tun_lock runs at IPL_NONE, taking it does not have the side
effect of disabling preemption, but pktq_enqueue assumes the caller
has disabled preemption so it can safely schedule a softint.

This isn't a problem in most physical network drivers because the
pktq_enqueue call happens from within the driver's softint context
anyway. But tun(4) is special -- here, the pktq_enqueue is triggered
by a userland write to the device, which is in thread context. So
let's just disable preemption in tunwrite.

Reported-by: syzbot+21c2cb300f1ec2162b35@syzkaller.appspotmail.com
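The pairing described in revisions 1.172 and 1.174 can be sketched in userland C. This is only an illustration under stated assumptions: the counter and the `*_stub` functions below are hypothetical stand-ins, not the kernel's kpreempt_disable(), kpreempt_enable(), or pktq_enqueue(). The point is that the enqueue runs with "preemption disabled" and that the matching enable is reached on the failure path too.

```c
#include <stdbool.h>

/* Hypothetical model: a depth greater than 0 means "preemption disabled". */
static int preempt_depth;

static void preempt_disable_stub(void) { preempt_depth++; }
static void preempt_enable_stub(void) { preempt_depth--; }

/*
 * Stub enqueue. It records whether the caller met the precondition
 * (preemption disabled) and fails when the queue is full.
 */
static bool
enqueue_stub(bool queue_full, bool *precondition_ok)
{
	*precondition_ok = (preempt_depth > 0);
	return !queue_full;
}

/* Returns 0 on success, -1 if the packet was dropped. */
static int
write_path_sketch(bool queue_full, bool *precondition_ok)
{
	int error = 0;

	preempt_disable_stub();
	if (!enqueue_stub(queue_full, precondition_ok))
		error = -1;	/* no early return: fall through to the enable */
	preempt_enable_stub();
	return error;
}
```

Calling write_path_sketch() with queue_full set to true models the case fixed in 1.174: the function reports the drop, and the depth counter is back at zero afterwards.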
 1.171  13-Mar-2022  riastradh tun(4): Fix some error branches in tunwrite.
 1.170  13-Mar-2022  riastradh tun(4): Omit TUN_RWAIT micro-optimization.

cv_broadcast already has a fast path for no-waiters.
 1.169  13-Mar-2022  riastradh tun(4): Deliver SIGIO for hangup under tun_lock.

Otherwise, tp->tun_pgid is not stable.
 1.168  13-Mar-2022  riastradh tun(4): Reduce lock from IPL_NET to IPL_SOFTNET.

This is never taken from hardware interrupt handlers any more, as far
as I can tell -- only SOFTINT_NET soft interrupt handlers.

This avoids trying to take an adaptive lock, proc_lock, in fownsignal
while holding a spin lock. Unfortunately, it doesn't entirely fix the
problem -- proc_lock is at IPL_NONE, and is held across some not
entirely trivial computations like allocating a new pid table. So it
would really be better if we had some way to deliver SIGIO without
taking proc_lock.

Reported-by: syzbot+3dd54993d3e92e697e72@syzkaller.appspotmail.com
Reported-by: syzbot+aca29415f2f0bf23f082@syzkaller.appspotmail.com
 1.167  13-Mar-2022  riastradh tun(4): Reduce tun_softc_lock from IPL_NET to IPL_NONE.

This is always taken in process/thread context, never in interrupt
context, hard or soft.
 1.166  13-Mar-2022  riastradh tun(4): Factor out setup/teardown into separate routines.

- Reduce duplication.
- Plug softint leak on recycling tun.

(This recycling business seems kinda sketchy...)
 1.165  13-Mar-2022  riastradh tun(4): Add missing cv_destroy in tunclose.
 1.164  26-Sep-2021  thorpej Use seltrue_filtops rather than rolling our own with filt_seltrue.
 1.163  26-Sep-2021  thorpej Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89
 1.162  18-Dec-2020  thorpej Use sel{record,remove}_knote().
 1.161  27-Sep-2020  roy branches: 1.161.2;
tun: Report link state based on if the interface has been opened or not

This mirrors tap(4).
 1.160  29-Aug-2020  maxv Correct my rev1.159, it was incomplete, the check must be done later
because the value can change in the meantime (and get set to zero).
 1.159  23-Jun-2020  maxv Hum. Fix NULL deref triggerable with just write(0).

Reported-by: syzbot+45b31355bf880e175b73@syzkaller.appspotmail.com
 1.158  29-Jan-2020  thorpej Adopt <net/if_stats.h>.
 1.157  13-Dec-2019  maxv branches: 1.157.2;
Read the len before pushing the packet, otherwise possible use-after-free.
Found by a custom query on LGTM.
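The ordering issue in revision 1.157 can be sketched as follows. This is a userland illustration, not the driver code: `struct pkt_sketch`, `enqueue_sketch()`, and `send_and_count()` are hypothetical names, and the consumer is assumed to free the packet as soon as it takes ownership.

```c
#include <stddef.h>
#include <stdlib.h>

struct pkt_sketch {
	size_t len;
};

/* Hypothetical consumer: takes ownership and may free the packet at once. */
static void
enqueue_sketch(struct pkt_sketch *p)
{
	free(p);
}

/* Hands a packet off and returns the number of bytes to account for. */
static size_t
send_and_count(size_t len)
{
	struct pkt_sketch *p = malloc(sizeof(*p));
	size_t saved_len;

	if (p == NULL)
		return 0;
	p->len = len;

	saved_len = p->len;	/* read the length before the hand-off */
	enqueue_sketch(p);	/* p must not be dereferenced after this */
	return saved_len;
}
```

Reading `p->len` after enqueue_sketch() would be a use-after-free in this model, which is the class of bug the commit describes.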
 1.156  26-Apr-2019  pgoyette branches: 1.156.2;
Set the "required modules" to NULL, not to an empty string.

It really doesn't make that much difference to the code, but the output
from modstat(8) is different! (With an empty string in the MODULE() macro
modstat reports an empty string, but with a NULL in the macro, modstat
prints a '-' just like it does for other "empty" fields.)
 1.155  25-Mar-2019  pgoyette in tundetach(), error is only used #ifdef _MODULE so wrap its declaration.
 1.154  25-Mar-2019  pgoyette Resequence the stuff in tundetach() to ensure that no new device units
can be created by either 'ifconfig create' or 'open("/dev/tun0")' paths.

Note: previous efforts at fixing 'modunload if_tun' are abandoned, since
there is no bug. Just need to ensure that the cloned interface is both
close(1)d _and_ 'ifconfig tunx destroy' before trying to unload.
 1.153  25-Mar-2019  msaitoh Revert rev. 1.151 and 1.152 to avoid compile error. Requested by pgoyette.
 1.152  25-Mar-2019  pgoyette Use correct list name
 1.151  25-Mar-2019  pgoyette This should do it!

Remove the zombie unit from the zombie list, not the regular list!
 1.150  25-Mar-2019  pgoyette And revert both of the previous. It seems that the structure has
already been removed from the list in the find_zunit() code.

So now, off to really find out why the module won't unload.
 1.149  25-Mar-2019  pgoyette Fix previous - remove it from the list before freeing the memory.
 1.148  25-Mar-2019  pgoyette If the unit being closed was a "zombie" (ie, the interface was destroyed
previously), remove it from the zombie list after freeing all of its
resources.

This should allow the module to be unloaded even if there was a zombie
at some point. Without this change, the zombie list never gets emptied.
 1.147  03-Sep-2018  riastradh Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
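The truncation hazard that motivates the rename in revision 1.147 can be sketched in portable C. This is a minimal illustration, assuming a 32-bit unsigned int; `uimin_sketch`, `MIN_SKETCH`, and the two wrappers are hypothetical names, not the libkern definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for a min() that is defined on unsigned int. */
static unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Macro form: no truncation, but each argument may be evaluated twice. */
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))

static uint64_t
min_via_function(uint64_t a, uint64_t b)
{
	/* Both arguments are narrowed to unsigned int before comparing. */
	return uimin_sketch((unsigned int)a, (unsigned int)b);
}

static uint64_t
min_via_macro(uint64_t a, uint64_t b)
{
	/* The comparison happens at the full width of the operands. */
	return MIN_SKETCH(a, b);
}
```

With a = 0x100000001 and b = 2, the function form narrows a to 1 and returns 1, while the macro form returns 2. Without the explicit casts the same narrowing happens silently, which is why a width-neutral name like min() is misleading for such a function.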
 1.146  06-Aug-2018  ozaki-r Fix tun(4) kevent locking

filt_tunread gets called in two contexts:

- by calls to selnotify in if_tun.c (or knote, as the case may be,
but not here), in which case tp->tun_lock is held; and

- by internal logic in kevent, in which tp->tun_lock is not held.

The standard convention to discriminate between these two cases is by
setting the kernel-only NOTE_SUBMIT bit in the hint to selnotify or
knote; then in filt_*:

	if (hint & NOTE_SUBMIT)
		KASSERT(mutex_owned(&tp->tun_lock));
	else
		mutex_enter(&tp->tun_lock);
	...
	if (hint & NOTE_SUBMIT)
		KASSERT(mutex_owned(&tp->tun_lock));
	else
		mutex_exit(&tp->tun_lock);

Pointed out by and patch from riastradh@
Tested by ozaki-r@ (only the former path)
 1.145  03-Aug-2018  ozaki-r tun: fix locking against myself

filt_tunread is called with tun_lock held from tun_output (via tun_output =>
selnotify => knote), so we must not take tun_lock in filt_tunread. The bug
is triggered only if a tun is used through kqueue.

Found by k-goda@IIJ
 1.144  26-Jun-2018  msaitoh branches: 1.144.2;
Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug in which the direction was misinterpreted in
some environments, by passing the direction to bpf_mtap*() instead of
checking m->m_pkthdr.rcvif.
 1.143  16-Mar-2018  tih Add packet filtering to tun(4) interfaces.

Calls to pfil_run_hooks() were missing in if_tun.c. This meant that
filtering configuration could be added to e.g. /etc/npf.conf, but
would be ignored, because the filter never saw the packets. This
change adds the required calls.

While here, correct the return value from tun_output(): it's been
returning 0 regardless of any error condition present, but will now
correctly propagate such information upward.

Thanks to maxv for guidance!

OK: christos, martin
 1.142  06-Dec-2017  ozaki-r branches: 1.142.2;
Do not turn on IFF_RUNNING on an interface until its initialization completes

Also turn it off before destruction, as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful, but I believe the
change is still proper.)
 1.141  30-Oct-2017  ozaki-r Set IFEF_NO_LINK_STATE_CHANGE flag to pseudo devices that don't use if_link_state_change
 1.140  25-Oct-2017  maya Use C99 initializer for filterops

Mostly done with spatch with touchups for indentation

@@
expression a;
identifier b,c,d;
identifier p;
@@
const struct filterops p =
- { a, b, c, d
+ {
+ .f_isfd = a,
+ .f_attach = b,
+ .f_detach = c,
+ .f_event = d,
};
 1.139  24-May-2017  pgoyette branches: 1.139.2;
Call cv_destroy() to deactivate the tun_cv before calling kmem_intr_free()
to deallocate the containing memory chunk (the tunnel's softc). Otherwise
a LOCKDEBUG kernel will panic in tun_clone_destroy().

Fixes PR kern/52255
 1.138  29-Jan-2017  maya branches: 1.138.4;
Most error paths that "goto out" don't hold tun_lock,
so don't mutex_exit(tun_lock) in them, but only in
the one that needs it.

ok skrll
 1.137  26-Jan-2017  skrll Fix logic inversion spotted by paulg
 1.136  26-Jan-2017  skrll Make MP-safe and use kmem(9)

Mostly from rmind-smpnet
 1.135  23-Jan-2017  skrll KNF. Same code before and after.
 1.134  11-Jan-2017  ozaki-r branches: 1.134.2;
Get rid of unnecessary header inclusions
 1.133  02-Oct-2016  christos MFREE -> m_free
 1.132  07-Sep-2016  ozaki-r Fix tun_enable

Before the rearrangement of ifaddr initializations (in.c,v 1.169),
when we called tun_enable via ioctl(SIOCINITIFADDR), the ifaddr
in question had already been inserted in the interface address list.
After the change, the ifaddr isn't in the list at that point, so
we shouldn't rely on being able to find it with
IFADDR_READER_FOREACH. Instead, simply use the ifaddr passed by
ioctl(SIOCINITIFADDR).
 1.131  07-Sep-2016  ozaki-r Rename tuncreate to tun_enable

The new name is more appropriate.
 1.130  05-Sep-2016  ozaki-r Support tun devices on rump kernels
 1.129  05-Sep-2016  ozaki-r Fix typo in a comment
 1.128  07-Aug-2016  christos modularize some more drivers and merge the module glue
 1.127  07-Jul-2016  ozaki-r branches: 1.127.2;
Switch the address list of interfaces to pslist(9)

As usual, we leave the old list to avoid breaking kvm(3) users.
 1.126  10-Jun-2016  ozaki-r Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receive interface of an mbuf.
These functions are the counterparts of m_get_rcvif, which will come
in another commit; they hide the internals of rcvif operations and
reduce the diff of the upcoming change.

No functional change.
 1.125  28-Apr-2016  ozaki-r Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps the forthcoming locking (or psref-ing) of rtentries.
 1.124  20-Apr-2016  knakahara IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller
 1.123  24-Aug-2015  pooka sprinkle _KERNEL_OPT
 1.122  20-Aug-2015  christos include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.
 1.121  20-Apr-2015  roy Introduce p2p_rtrequest() so that IFF_POINTOPOINT interfaces can work
with RTF_LOCAL.
Fixes PR kern/49829.
 1.120  25-Jul-2014  dholland branches: 1.120.4;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.
 1.119  19-Jun-2014  ws Enqueue the mbuf with the start of the packet,
not some intermediate one (hi, rmind!).
 1.118  05-Jun-2014  rmind - Implement pktqueue interface for lockless IP input queue.
- Replace ipintrq and ip6intrq with the pktqueue mechanism.
- Eliminate kernel-lock from ipintr() and ip6intr().
- Some preparation work to push softnet_lock out of ipintr().

Discussed on tech-net.
 1.117  20-Mar-2014  skrll branches: 1.117.2;
Mechanically replace simplelock with kmutex_t.
 1.116  16-Mar-2014  dholland Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.
 1.115  28-Jan-2012  rmind branches: 1.115.6; 1.115.10;
Replace tun_lock with mutex(9). XXX: too far from being MP-safe yet.
 1.114  28-Oct-2011  dyoung branches: 1.114.2; 1.114.6;
For these interfaces, the implementation of SIOCSIFDSTADDR is identical
to SIOCINITIFADDR, and SIOCSIFDSTADDR callers always fall back to
SIOCINITIFADDR, so just get rid of the SIOCSIFDSTADDR case.
 1.113  05-Apr-2010  joerg Push the bpf_ops usage back into bpf.h. Push the common ifp->if_bpf
check into the inline functions as well the fourth argument for
bpf_attach.
 1.112  19-Jan-2010  pooka branches: 1.112.2; 1.112.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.
 1.111  08-May-2009  elad Add and use a network scope action/request for tun(4), similar to ppp(4),
sl(4), and strip(4).
 1.110  20-Nov-2008  dyoung branches: 1.110.4;
Update comment for last.
 1.109  20-Nov-2008  dyoung In the new ifioctl order, tun_ioctl() can call itself through
ifioctl_common(). Since the first tun_ioctl() call already holds
the simplelock, the second tun_ioctl() call will wait forever to
acquire it: deadlock.

To fix this, wait to acquire the lock until tuninit().
 1.108  07-Nov-2008  dyoung *** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

	switch (...->sa_family) {
	case ...:
		..._init();
		...
		break;
	...
	default:
		..._init();
		...
		break;
	}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

	switch (x & (IFF_UP|IFF_RUNNING)) {
	case 0:
		...
		break;
	case IFF_RUNNING:
		...
		break;
	case IFF_UP:
		...
		break;
	case IFF_UP|IFF_RUNNING:
		...
		break;
	}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.
 1.107  15-Jun-2008  christos branches: 1.107.2; 1.107.4;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.
 1.106  24-Apr-2008  ad branches: 1.106.2; 1.106.4; 1.106.6;
Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
 1.105  21-Mar-2008  ad branches: 1.105.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
 1.104  01-Mar-2008  rmind Welcome to 4.99.55:

- Add a lot of missing selinit() and seldestroy() calls.

- Merge selwakeup() and selnotify() calls into a single selnotify().

- Add an additional 'events' argument to the selnotify() call. It
indicates which event (POLL_IN, POLL_OUT, etc.) happened. If unknown,
zero may be used.

Note: please pass appropriate value of 'events' where possible.
Proposed on: <tech-kern>
 1.103  20-Feb-2008  matt branches: 1.103.2; 1.103.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)
 1.102  07-Feb-2008  dyoung Start patching up the kernel so that a network driver always has
the opportunity to handle an ioctl before generic ifioctl handling
occurs. This will ease extending the kernel and sharing of code
between drivers.

First steps: Make the signature of ifioctl_common() match struct
ifnet->if_ioctl. Convert SIOCSIFCAP and SIOCSIFMTU to the new
ifioctl() regime, throughout the kernel.
 1.101  04-Jan-2008  ad Start detangling lock.h from intr.h. This is likely to cause short term
breakage, but the mess of dependencies has been regularly breaking the
build recently anyhow.
 1.100  05-Dec-2007  pooka branches: 1.100.4;
Do not "return 1" from kqfilter for errors. That value is passed
directly to the userland caller and results in a mysterious EPERM.
Instead, return EINVAL or something else sensible depending on the
case.
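The "mysterious EPERM" in revision 1.100 follows from errno numbering: on NetBSD and most other Unix-like systems, EPERM has the value 1. This is a common convention rather than a POSIX guarantee, so the sketch below treats it as an assumption. The two functions are hypothetical illustrations, not the driver's kqfilter routine.

```c
#include <errno.h>

/*
 * Hypothetical "before" behaviour: a bare 1 is returned for an
 * unsupported filter. The caller interprets it as an errno value.
 */
static int
kqfilter_bad_sketch(int supported)
{
	return supported ? 0 : 1;
}

/* Hypothetical "after" behaviour: return a meaningful errno value. */
static int
kqfilter_good_sketch(int supported)
{
	return supported ? 0 : EINVAL;
}
```

On a system where EPERM is 1, kqfilter_bad_sketch(0) is indistinguishable from a permission error, while kqfilter_good_sketch(0) reports an invalid argument.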
 1.99  19-Oct-2007  ad branches: 1.99.2; 1.99.4;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h
 1.98  01-Sep-2007  dyoung branches: 1.98.4;
Use ifreq_setaddr(), ifreq_getaddr(), sockaddr_in_init(), and
sockaddr_copy(). Constify. Compare pointers with NULL, not 0.
Don't "test truth" of pointers, but compare with NULL.
 1.97  04-Mar-2007  christos branches: 1.97.2; 1.97.10; 1.97.14; 1.97.16;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.96  17-Feb-2007  dyoung KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.
 1.95  04-Jan-2007  elad branches: 1.95.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.
 1.94  16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.93  12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.92  07-Sep-2006  dogcow branches: 1.92.2; 1.92.4;
remove more vestiges of CCITT, LLC, HDLC, NS, and NSIP.
 1.91  30-Aug-2006  christos fix initializer
 1.90  23-Jul-2006  ad Use the LWP cached credentials where sane.
 1.89  14-May-2006  elad integrate kauth.
 1.88  18-Apr-2006  rpaulo Fix another typo... I must be on drugs...
 1.87  08-Apr-2006  rpaulo IFHEAD and PREPADDR are mutually exclusive. From FreeBSD.
 1.86  04-Apr-2006  rpaulo Add another bit from FreeBSD that I forgot: in tun_output, don't try to send
an AF_INET packet if TUN_IFHEAD is not set.
From FreeBSD and spotted (again) by DEGROOTE Arnaud.
 1.85  04-Apr-2006  rpaulo Fix an if-clause botched in a previous revision now that we have TUN_IFHEAD.
Spotted by DEGROOTE Arnaud <degroote@enseirb.fr>.
 1.84  03-Apr-2006  rpaulo Implement TUN_IFHEAD, the missing piece that was breaking old applications.
 1.83  29-Mar-2006  rpaulo Add missing break in tunwrite(), whose absence caused EAFNOSUPPORT
to be returned, thus breaking IPv6 support.
!@#$%^...
 1.82  03-Mar-2006  rpaulo branches: 1.82.2; 1.82.4; 1.82.6;
Some minor KNF.
 1.81  03-Mar-2006  rpaulo Fix typo in comment.
 1.80  28-Feb-2006  rpaulo Add full support for IPv6 tunnels. From DEGROOTE Arnaud in PR 32944.
The PR submitter and the PR handler were unable to test this code
using Teredo userland clients such as Miredo. However, the PR handler
dumped and analyzed some of the packets produced by Miredo and they
seemed fine.
(On a side note: I was unable to set up Teredo in Windows XP and the
problem seemed similar to what I currently see in NetBSD: lack of
replies from the Teredo relay).
 1.79  05-Feb-2006  rpaulo Add preliminary/not tested support for IPv6.
 1.78  11-Dec-2005  thorpej branches: 1.78.2; 1.78.4; 1.78.6;
ANSI function decls and application of static.
 1.77  11-Dec-2005  christos merge ktrace-lwp.
 1.76  24-Jan-2005  matt branches: 1.76.8;
Add IFNET_FOREACH and IFADDR_FOREACH macros and start using them.
 1.75  06-Dec-2004  christos branches: 1.75.4;
Sprinkle #ifdef INET to make a GENERIC kernel compile with INET undefined.
 1.74  04-Dec-2004  peter Remove redundant conditional; NTUN is always 1 when this file is compiled.
Also remove tun.h include, since it's no longer needed.
 1.73  04-Dec-2004  peter Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.
 1.72  19-Aug-2004  christos Factor out the hand-crafting of mbufs from the interface files. Reviewed by
gimpy. XXX: I could have used bpf_mtap2 on some of the new functions, but I
chose not to, because I just wanted to do what amounts to a code move.
 1.71  06-Jun-2004  dyoung Use bpf_mtap2 in tun(4).
 1.70  14-May-2004  pk Fix locking issues noticed by Tom Ivar Helbekkmo on tech-net:
* always acquire the device instance lock at splnet()
* missing unlocks in various places

Also, since this driver allows its device instances to be manipulated by two
independent subsystems (character device & interface clone create/destroy),
be careful not to rip away instance data in a clone destroy request if the
instance is still opened as a character device.
 1.69  13-May-2004  tron Initialize interface type to IFT_TUNNEL as suggested by Erik Ånggård
in PR kern/25555.
 1.68  01-Mar-2004  tron branches: 1.68.2;
Don't leak memory if a copyin fails.
 1.67  22-Sep-2003  cl pass signo to fownsignal #ifdef ALTQ
 1.66  22-Sep-2003  christos - pass signo to fownsignal [ok by jd]
- make urg signal handling use fownsignal
- remove out of band detection in sowakeup
 1.65  22-Sep-2003  jdolecek kill unused variable in #ifdef ALTQ part, to make this compile
with ALTQ configured in
 1.64  21-Sep-2003  jdolecek cleanup & uniform descriptor owner handling:
* introduce fsetown(), fgetown(), fownsignal() - this sets/retrieves/signals
the owner of descriptor, according to appropriate semantics
of TIOCSPGRP/FIOSETOWN/SIOCSPGRP/TIOCGPGRP/FIOGETOWN/SIOCGPGRP ioctl; use
these routines instead of custom code where appropriate
* make every place handling TIOCSPGRP/TIOCGPGRP handle also FIOSETOWN/FIOGETOWN
properly, and remove the translation of FIO[SG]OWN to TIOC[SG]PGRP
in sys_ioctl() & sys_fcntl()
* also remove the socket-specific hack in sys_ioctl()/sys_fcntl() and
pass the ioctls down to soo_ioctl() as any other ioctl

change discussed on tech-kern@
 1.63  29-Jun-2003  fvdl branches: 1.63.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.62  28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.61  02-May-2003  itojun KNF
 1.60  01-May-2003  itojun bpf_mtap() does not care about M_PKTHDR at the top. M_COPY_PKTHDR has some
consequences, so avoid it. if we need to attach dummy headers, we should
use M_PREPEND instead.
 1.59  13-Mar-2003  dsl Validate pgid arg to TIOCSPGRP
 1.58  25-Dec-2002  jdolecek count input/output bytes for tun device
Problem reported and patch provided in PR kern/19554 by Michael van Elst
 1.57  26-Nov-2002  christos si_ -> sel_
 1.56  23-Oct-2002  jdolecek merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
 1.55  23-Sep-2002  simonb Remove breaks after returns, unreachable returns and returns after
returns(!).
 1.54  23-Sep-2002  simonb uio_resid is a size_t (ie, unsigned), so don't check if it's less than 0.
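The reasoning in revision 1.54 can be shown with a small C sketch: an unsigned value never compares below zero, so a "less than 0" test on a size_t is dead code, and an underflow shows up as a very large value instead. The helper names below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: what a size_t holds after subtracting past zero.
 * Unsigned arithmetic wraps, so the result is SIZE_MAX, not -1.
 */
static size_t
resid_after_underflow(void)
{
	size_t resid = 0;

	resid -= 1;	/* well-defined wraparound for unsigned types */
	return resid;
}

/* Hypothetical check that does work: compare before subtracting. */
static int
would_underflow(size_t resid, size_t n)
{
	return n > resid;
}
```

A caller that needs to detect "not enough bytes left" would use a comparison like would_underflow() before the subtraction, since the result of the subtraction can no longer reveal the error by its sign.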
 1.53  06-Sep-2002  gehenna Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.
 1.52  29-Jul-2002  atatat Make tun interfaces perform auto-creation. This means that if a
program opens /dev/tun# and tun# has not been SIOCIFCREATE'd already,
it will be SIOCIFCREATE'd automatically. FreeBSD's tun interfaces
behave in a somewhat similar fashion.
 1.51  13-Mar-2002  itojun branches: 1.51.4; 1.51.6;
suppress -Wunused if !INET6
 1.50  05-Mar-2002  itojun bring in latest ALTQ from kjc. ALTQify some of the drivers.
 1.49  13-Nov-2001  lukem remove unnecessary #if NFOO > 0 .... #endif wrappers
 1.48  12-Nov-2001  lukem add RCSIDs
 1.47  05-Nov-2001  matt Switch to using queue access macros instead of referring to the member
fields explicitly.
 1.46  31-Oct-2001  atatat Turn the tun device/network interface into a cloning device.
 1.45  03-Aug-2001  itojun branches: 1.45.2; 1.45.4;
simplify previous fix (0-length mbuf in mbuf chain). from freebsd
 1.44  02-Aug-2001  itojun do not break from loop even if m_len == 0. it's valid to have
mbuf with m_len == 0 in mbuf chain.
 1.43  13-Apr-2001  thorpej branches: 1.43.2;
Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.
 1.42  17-Jan-2001  thorpej branches: 1.42.2;
Fix a rather annoying problem where the sockaddr_dl which holds
the link level name for the interface (ifp->if_sadl) is allocated
before ifp->if_addrlen is initialized, which could lead to allocating
too little space for the link level address.

Do this by splitting allocation of the link level name out of
if_attach() and into if_alloc_sadl(), which is normally called
by functions like ether_ifattach(). Network interfaces which
don't have a link-specific attach routine must call if_alloc_sadl()
themselves (example: gif).

Link level names are freed by if_free_sadl(), which can be called
from e.g. ether_ifdetach(). Drivers never need call if_free_sadl()
themselves as if_detach() will do it if it is not already done.

While here, add the ability to pass an AF_LINK address to
SIOCSIFADDR in ether_ioctl() (this is what caused me to notice
the problem that the above fixes).
 1.41  18-Dec-2000  thorpej Fill in if_dlt.
 1.40  12-Dec-2000  thorpej Adapt to bpfattach() changes, and further centralize the bpfattach()
and bpfdetach() calls into link-type subroutines where possible.
 1.39  30-Mar-2000  augustss Kill some more register declarations.
 1.38  01-Jul-1999  itojun branches: 1.38.2;
IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.
 1.37  04-Mar-1999  mjacob branches: 1.37.4; 1.37.6;
adjust format args for compiler changes
 1.36  30-Nov-1998  sommerfe branches: 1.36.2;
Fix PR6473: allow sends to tun* devices using bpf.
 1.35  20-Aug-1998  veego Add some braces to stop the new egcs warnings.
 1.34  05-Jul-1998  jonathan defopt NS, NSIP.
 1.33  05-Jul-1998  jonathan defopt INET, NETATALK.
 1.32  25-Sep-1997  matt Add SIOC{ADD|DEL}MULTI ioctl to support (for IFF_MULTICAST).
 1.31  24-Sep-1997  matt Add support of SIOCIFMTU to vary mtu of interface. Also allow IFF_MULTICAST
on TUNSIFMODE (sometimes you'd like to do IP multicast on tunnel devices).
 1.30  15-Mar-1997  is branches: 1.30.4;
New ARP system, supports IPv4 over any hardware link.

Some of the stuff (e.g., rarpd, bootpd, dhcpd etc., libsa) still will
only support Ethernet. Tcpdump itself should be ok, but libpcap needs
a lot of work.

For the detailed change history, look at the commit log entries for
the is-newarp branch.
 1.29  13-Oct-1996  christos branches: 1.29.4;
backout previous kprintf change
 1.28  10-Oct-1996  christos - printf -> kprintf, sprintf -> ksprintf
 1.27  07-Sep-1996  mycroft Implement poll(2).
 1.26  25-Jun-1996  pk A couple of emulation enhancements from der mouse's PR#2411:
- ability to be either a BROADCAST or POINTTOPOINT interface.
- a humble beginning of link-layer addressing (differs from PR
by using a `struct sockaddr' instead of single byte).
 1.25  22-May-1996  mycroft Removing a completely unneeded reference to curproc.
 1.24  07-May-1996  thorpej Changed struct ifnet to have a pointer to the softc of the underlying
device and a printable "external name" (name + unit number), thus eliminating
if_name and if_unit. Updated interface to (*if_watchdog)() and (*if_reset)()
to take a struct ifnet *, rather than a unit number.
 1.23  30-Mar-1996  christos Eliminate need for and remove net_conf.h
 1.22  13-Feb-1996  christos Net prototypes
 1.21  05-Feb-1996  scottr Grammar police; noted by Peter Seebach <seebs@solon.com>. Closes PR #1982.
 1.20  01-Feb-1996  mycroft Rename tunioctl() and tuncioctl() so that cdevsw points to the right one.
From der Mouse, PR 2005.
 1.19  13-Dec-1995  pk Return actual packet length in FIONREAD (noted by Bob Smart).
 1.18  13-Jun-1995  mycroft Update to match data structure changes.
 1.17  12-Jun-1995  mycroft Various cleanup, including:
* Convert several data structures to use queue.h.
* Split in_pcbnotify() into two parts; one for notifying a specific PCB, and
one for notifying all PCBs for a particular foreign address.
 1.16  08-Mar-1995  cgd fixed sized types, where appropriate. when casting pointers to
integers to do math on them, cast to long. ioctl commands are
u_longs.
 1.15  30-Oct-1994  cgd be more careful with types, also pull in headers where necessary.
 1.14  29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.13  26-May-1994  deraadt MIN -> min
 1.12  15-May-1994  deraadt repair protos and functions
 1.11  03-May-1994  deraadt fixes from <brad@fcr.com> who claims it now works correctly
 1.10  28-Feb-1994  andrew Fixed a bug with TUN_OPEN flag handling during tunclose(), as noted by
Mark Delany <markd@bushwire.apana.org.au>.
 1.9  24-Dec-1993  deraadt must pull in machine-cpu.h
 1.8  13-Dec-1993  deraadt tunnel driver cleanup done by Brad Parker <brad@fcr.com> and myself
 1.7  14-Nov-1993  deraadt use one stop shopping selwakeup/selrecord
 1.6  14-Nov-1993  deraadt cleaned up version of the tunnel driver
 1.5  09-Aug-1993  deraadt branches: 1.5.2;
suser() was being called in the old 4.3 way
 1.4  07-Aug-1993  cgd merge in changes from netbsd-0-9-ALPHA2
 1.3  22-May-1993  cgd branches: 1.3.2;
add include of select.h if necessary for protos, or delete if extraneous
 1.2  18-May-1993  cgd make kernel select interface be one-stop shopping & clean it all up.
 1.1  21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.1  21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.3.2.1  31-Jul-1993  cgd give names, err, wmesg's, to my "pain" -- i.e. convert sleep() to tsleep()
 1.5.2.1  03-Nov-1993  mycroft Delete useless assignments to if_init.
 1.29.4.2  09-Mar-1997  is netinet/if_ether.h -> netinet/if_inarp.h
 1.29.4.1  07-Feb-1997  is Snapshot of new ARP code.

Our old ARP code was hardwired for 6-byte length medium
addresses, while the protocol is designed for any size.

This snapshot contains a first hack at getting rid of
Ethernet specific data structures. The ep driver is updated
(and tested on the PCI bus), the iy and fpa drivers have been
updated, but not real life tested yet.

If you want to test this with other drivers, you have to update
them first yourself, and probably tag the relevant directories.
Better contact me if you want to do this.
 1.30.4.1  29-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.36.2.1  11-Dec-1998  kenh The beginnings of interface detach support. Still some bugs, but mostly
works for me.

This work was originally by Bill Studenmund, and cleaned up by me.
 1.37.6.2  30-Nov-1999  itojun bring in latest KAME (as of 19991130, KAME/NetBSD141) into kame branch
just for reference purposes.
This commit includes 1.4 -> 1.4.1 sync for kame branch.

The branch does not compile at all (due to the lack of ALTQ and some other
source code). Please do not try to modify the branch, this is just for
reference purposes.

synchronization to latest KAME will take place on HEAD branch soon.
 1.37.6.1  28-Jun-1999  itojun KAME/NetBSD 1.4 SNAP kit, dated 19990628.

NOTE: this branch (kame) is used just for reference. this may not compile
due to multiple reasons.
 1.37.4.1  01-Jul-1999  thorpej Sync w/ -current.
 1.38.2.5  21-Apr-2001  bouyer Sync with HEAD
 1.38.2.4  18-Jan-2001  bouyer Sync with head (for UBC+NFS fixes, mostly).
 1.38.2.3  05-Jan-2001  bouyer Sync with HEAD
 1.38.2.2  13-Dec-2000  bouyer Sync with HEAD (for UBC fixes).
 1.38.2.1  20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.42.2.10  29-Dec-2002  thorpej Sync with HEAD.
 1.42.2.9  11-Dec-2002  thorpej Sync with HEAD.
 1.42.2.8  11-Nov-2002  nathanw Catch up to -current
 1.42.2.7  18-Oct-2002  nathanw Catch up to -current.
 1.42.2.6  17-Sep-2002  nathanw Catch up to -current.
 1.42.2.5  01-Aug-2002  nathanw Catch up to -current.
 1.42.2.4  01-Apr-2002  nathanw Catch up to -current.
(CVS: It's not just a program. It's an adventure!)
 1.42.2.3  14-Nov-2001  nathanw Catch up to -current.
 1.42.2.2  24-Aug-2001  nathanw Catch up with -current.
 1.42.2.1  21-Jun-2001  nathanw Catch up to -current.
 1.43.2.10  10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.43.2.9  02-Oct-2002  jdolecek do not need the (void *) cast for kn_hook anymore
 1.43.2.8  06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.43.2.7  16-Mar-2002  jdolecek Catch up with -current.
 1.43.2.6  10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.43.2.5  08-Sep-2001  thorpej Use the seltrue filter as appropriate (or, rather, as the "poll"
entry points of these drivers indicate).
 1.43.2.4  08-Sep-2001  thorpej Oops, selwakeup() -> selnotify() for last.
 1.43.2.3  08-Sep-2001  thorpej Add kqueue support.
 1.43.2.2  25-Aug-2001  thorpej Merge Aug 24 -current into the kqueue branch.
 1.43.2.1  03-Aug-2001  lukem update to -current
 1.45.4.1  12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.45.2.2  26-Sep-2001  fvdl * add a VCLONED vnode flag that indicates a vnode representing a cloned
device.
* rename REVOKEALL to REVOKEALIAS, and add a REVOKECLONE flag, to pass
to VOP_REVOKE
* the revoke system call will revoke all aliases, as before, but not the
clones
* vdevgone is called when detaching a device, so make it use REVOKECLONE
to get rid of all clones as well
* clean up all uses of VOP_OPEN wrt. locking.
* add a few VOPS to spec_vnops that need to do something when it's a
clone vnode (access and getattr)
* add a copy of the vnode vattr structure of the original 'master' vnode
to the specinfo of a cloned vnode. could possibly redirect getattr to
the 'master' vnode, but this has issues with revoke
* add a vdev_reassignvp function that disassociates a vnode from its
original device, and reassociates it with the specified dev_t. to be
used by cloning devices only, in case a new minor is allocated.
* change all direct references in drivers to v_devcookie and v_rdev
to vdev_privdata(vp) and vdev_rdev(vp). for diagnostic purposes
when debugging race conditions that still exist wrt. locking and
revoking vnodes.
* make the locking state of a vnode consistent when passed to
d_open and d_close (unlocked). locked would be better, but has
some deadlock issues
 1.45.2.1  07-Sep-2001  thorpej Commit my "devvp" changes to the thorpej-devvp branch. This
replaces the use of dev_t in most places with a struct vnode *.

This will form the basic infrastructure for real cloning device
support (besides being architecturally cleaner -- it'll be good
to get away from using numbers to represent objects).
 1.51.6.1  30-Jul-2002  lukem Pull up revision 1.52 (requested by atatat in ticket #572):
Make tun interfaces perform auto-creation. This means that if a
program opens /dev/tun# and tun# has not been SIOCIFCREATE'd already,
it will be SIOCIFCREATE'd automatically. FreeBSD's tun interfaces
behave in a somewhat similar fashion.
 1.51.4.2  29-Aug-2002  gehenna catch up with -current.
 1.51.4.1  16-May-2002  gehenna Add the character device switch.
 1.63.2.7  04-Feb-2005  skrll Sync with HEAD.
 1.63.2.6  18-Dec-2004  skrll Sync with HEAD.
 1.63.2.5  21-Sep-2004  skrll Fix the sync with head I botched.
 1.63.2.4  18-Sep-2004  skrll Sync with HEAD.
 1.63.2.3  25-Aug-2004  skrll Sync with HEAD.
 1.63.2.2  03-Aug-2004  skrll Sync with HEAD
 1.63.2.1  02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.68.2.2  20-May-2004  grant Pull up revision 1.69 (requested by tron in ticket #325):

Initialize interface type to IFT_TUNNEL as suggested by Erik Ånggård
in PR kern/25555.
 1.68.2.1  15-May-2004  tron Pull up revision 1.70 (requested by pk in ticket #335):
Fix locking issues noticed by Tom Ivar Helbekkmo on tech-net:
* always acquire the device instance lock at splnet()
* missing unlocks in various places
Also, since this driver allows its device instances manipulated by two
independent subsystems (character device & interface clone create/destroy),
be careful not to rip away instance data in a clone destroy request if the
instance is still opened as a character device.
 1.75.4.1  29-Apr-2005  kent sync with -current
 1.76.8.11  24-Mar-2008  yamt sync with head.
 1.76.8.10  17-Mar-2008  yamt sync with head.
 1.76.8.9  27-Feb-2008  yamt sync with head.
 1.76.8.8  11-Feb-2008  yamt sync with head.
 1.76.8.7  21-Jan-2008  yamt sync with head
 1.76.8.6  07-Dec-2007  yamt sync with head
 1.76.8.5  27-Oct-2007  yamt sync with head.
 1.76.8.4  03-Sep-2007  yamt sync with head.
 1.76.8.3  26-Feb-2007  yamt sync with head.
 1.76.8.2  30-Dec-2006  yamt sync with head.
 1.76.8.1  21-Jun-2006  yamt sync with head.
 1.78.6.2  01-Jun-2006  kardel Sync with head.
 1.78.6.1  22-Apr-2006  simonb Sync with head.
 1.78.4.1  09-Sep-2006  rpaulo sync with head
 1.78.2.2  01-Mar-2006  yamt sync with head.
 1.78.2.1  18-Feb-2006  yamt sync with head.
 1.82.6.2  24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.82.6.1  31-Mar-2006  tron Merge 2006-03-31 NetBSD-current into the "peter-altq" branch.
 1.82.4.5  11-May-2006  elad sync with head
 1.82.4.4  06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.82.4.3  19-Apr-2006  elad sync with head.
 1.82.4.2  10-Mar-2006  elad generic_authorize() -> kauth_authorize_generic().
 1.82.4.1  08-Mar-2006  elad Adapt to kernel authorization KPI.
 1.82.2.6  14-Sep-2006  yamt sync with head.
 1.82.2.5  03-Sep-2006  yamt sync with head.
 1.82.2.4  11-Aug-2006  yamt sync with head
 1.82.2.3  24-May-2006  yamt sync with head.
 1.82.2.2  11-Apr-2006  yamt sync with head
 1.82.2.1  01-Apr-2006  yamt sync with head.
 1.92.4.2  10-Dec-2006  yamt sync with head.
 1.92.4.1  22-Oct-2006  yamt sync with head
 1.92.2.2  12-Jan-2007  ad Sync with head.
 1.92.2.1  18-Nov-2006  ad Sync with head.
 1.95.2.2  12-Mar-2007  rmind Sync with HEAD.
 1.95.2.1  27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.97.16.3  23-Mar-2008  matt sync with HEAD
 1.97.16.2  09-Jan-2008  matt sync with HEAD
 1.97.16.1  06-Nov-2007  matt sync with HEAD
 1.97.14.3  09-Dec-2007  jmcneill Sync with HEAD.
 1.97.14.2  26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.97.14.1  03-Sep-2007  jmcneill Sync with HEAD.
 1.97.10.1  03-Sep-2007  skrll Sync with HEAD.
 1.97.2.2  23-Oct-2007  ad Sync with head.
 1.97.2.1  09-Oct-2007  ad Sync with head.
 1.98.4.1  25-Oct-2007  bouyer Sync with HEAD.
 1.99.4.1  08-Dec-2007  ad Sync with head.
 1.99.2.2  18-Feb-2008  mjf Sync with HEAD.
 1.99.2.1  08-Dec-2007  mjf Sync with HEAD.
 1.100.4.1  08-Jan-2008  bouyer Sync with HEAD
 1.103.6.4  17-Jan-2009  mjf Sync with HEAD.
 1.103.6.3  29-Jun-2008  mjf Sync with HEAD.
 1.103.6.2  02-Jun-2008  mjf Sync with HEAD.
 1.103.6.1  03-Apr-2008  mjf Sync with HEAD.
 1.103.2.1  24-Mar-2008  keiichi sync with head.
 1.105.2.2  17-Jun-2008  yamt sync with head.
 1.105.2.1  18-May-2008  yamt sync with head.
 1.106.6.1  18-Jun-2008  simonb Sync with head.
 1.106.4.1  23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.106.2.4  11-Aug-2010  yamt sync with head.
 1.106.2.3  11-Mar-2010  yamt sync with head
 1.106.2.2  16-May-2009  yamt sync with head
 1.106.2.1  04-May-2009  yamt sync with head.
 1.107.4.1  19-Jan-2009  skrll Sync with HEAD.
 1.107.2.1  13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.110.4.1  13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.112.4.1  30-May-2010  rmind sync with head
 1.112.2.1  30-Apr-2010  uebayasi Sync with HEAD.
 1.114.6.1  18-Feb-2012  mrg merge to -current.
 1.114.2.2  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.114.2.1  17-Apr-2012  yamt sync with head
 1.115.10.2  18-May-2014  rmind sync with head
 1.115.10.1  17-Jul-2013  rmind Checkpoint work in progress:
- Move PCB structures under __INPCB_PRIVATE, adjust most of the callers
and thus make IPv4 PCB structures mostly opaque. Any volunteers for
merging in6pcb with inpcb (see rpaulo-netinet-merge-pcb branch)?
- Move various global vars to the modules where they belong, make them static.
- Some preliminary work for IPv4 PCB locking scheme.
- Make raw IP code mostly MP-safe. Simplify some of it.
- Rework "fast" IP forwarding (ipflow) code to be mostly MP-safe. It should
run from a software interrupt, rather than hard.
- Rework tun(4) pseudo interface to be MP-safe.
- Work towards making some other interfaces more strict.
 1.115.6.2  03-Dec-2017  jdolecek update from HEAD
 1.115.6.1  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.117.2.1  10-Aug-2014  tls Rebase.
 1.120.4.8  28-Aug-2017  skrll Sync with HEAD
 1.120.4.7  05-Feb-2017  skrll Sync with HEAD
 1.120.4.6  05-Oct-2016  skrll Sync with HEAD
 1.120.4.5  09-Jul-2016  skrll Sync with HEAD
 1.120.4.4  29-May-2016  skrll Sync with HEAD
 1.120.4.3  22-Apr-2016  skrll Sync with HEAD
 1.120.4.2  22-Sep-2015  skrll Sync with HEAD
 1.120.4.1  06-Jun-2015  skrll Sync with HEAD
 1.127.2.2  20-Mar-2017  pgoyette Sync with HEAD
 1.127.2.1  04-Nov-2016  pgoyette Sync with HEAD
 1.134.2.1  21-Apr-2017  bouyer Sync with HEAD
 1.138.4.2  29-Apr-2017  pgoyette Remove explicit inclusion of <sys/localcount.h> since there is no
explicit usage of localcounts here. <sys/conf.h> will take care of
including as needed.
 1.138.4.1  28-Apr-2017  pgoyette Add a localcount to the devsw so it can be loaded as a rump module
 1.139.2.5  11-Mar-2024  martin Pull up following revision(s) (requested by riastradh in ticket #1946):

sys/net/if_tun.c: revision 1.175

tun(4): Allow IPv6 packets with TUNSLMODE configured.
PR kern/58013
 1.139.2.4  15-Aug-2018  martin Pull up following revision(s) (requested by ozaki-r in ticket #974):

sys/net/if_tun.c: revision 1.145
sys/net/if_tun.c: revision 1.146

tun: fix locking against myself

filt_tunread is called with tun_lock held from tun_output (via tun_output =>
selnotify => knote), so we must not take tun_lock in filt_tunread. The bug
is triggered only if a tun is used through kqueue.

Found by k-goda@IIJ

Fix tun(4) kevent locking

filt_tunread gets called in two contexts:
- by calls to selnotify in if_tun.c (or knote, as the case may be,
but not here), in which case tp->tun_lock is held; and
- by internal logic in kevent, in which tp->tun_lock is not held.

The standard convention to discriminate between these two cases is by
setting the kernel-only NOTE_SUBMIT bit in the hint to selnotify or
knote; then in filt_*:

    if (hint & NOTE_SUBMIT)
        KASSERT(mutex_owned(&tp->tun_lock));
    else
        mutex_enter(&tp->tun_lock);
    ...
    if (hint & NOTE_SUBMIT)
        KASSERT(mutex_owned(&tp->tun_lock));
    else
        mutex_exit(&tp->tun_lock);

Pointed out by and patch from riastradh@
Tested by ozaki-r@ (only the former path)
 1.139.2.3  17-Mar-2018  martin Pull up following revision(s) (requested by tih in ticket #638):
sys/net/if_tun.c: revision 1.143

Add packet filtering to tun(4) interfaces.

Calls to pfil_run_hooks() were missing in if_tun.c. This meant that
filtering configuration could be added to e.g. /etc/npf.conf, but
would be ignored, because the filter never saw the packets. This
change adds the required calls.

While here, correct the return value from tun_output(): it's been
returning 0 regardless of any error condition present, but will now
correctly propagate such information upward.

Thanks to maxv for guidance!
OK: christos, martin
 1.139.2.2  02-Jan-2018  snj Pull up following revision(s) (requested by ozaki-r in ticket #456):
sys/arch/arm/sunxi/sunxi_emac.c: 1.9
sys/dev/ic/dwc_gmac.c: 1.43-1.44
sys/dev/pci/if_iwm.c: 1.75
sys/dev/pci/if_wm.c: 1.543
sys/dev/pci/ixgbe/ixgbe.c: 1.112
sys/dev/pci/ixgbe/ixv.c: 1.74
sys/kern/sys_socket.c: 1.75
sys/net/agr/if_agr.c: 1.43
sys/net/bpf.c: 1.219
sys/net/if.c: 1.397, 1.399, 1.401-1.403, 1.406-1.410, 1.412-1.416
sys/net/if.h: 1.242-1.247, 1.250, 1.252-1.257
sys/net/if_bridge.c: 1.140 via patch, 1.142-1.146
sys/net/if_etherip.c: 1.40
sys/net/if_ethersubr.c: 1.243, 1.246
sys/net/if_faith.c: 1.57
sys/net/if_gif.c: 1.132
sys/net/if_l2tp.c: 1.15, 1.17
sys/net/if_loop.c: 1.98-1.101
sys/net/if_media.c: 1.35
sys/net/if_pppoe.c: 1.131-1.132
sys/net/if_spppsubr.c: 1.176-1.177
sys/net/if_tun.c: 1.142
sys/net/if_vlan.c: 1.107, 1.109, 1.114-1.121
sys/net/npf/npf_ifaddr.c: 1.3
sys/net/npf/npf_os.c: 1.8-1.9
sys/net/rtsock.c: 1.230
sys/netcan/if_canloop.c: 1.3-1.5
sys/netinet/if_arp.c: 1.255
sys/netinet/igmp.c: 1.65
sys/netinet/in.c: 1.210-1.211
sys/netinet/in_pcb.c: 1.180
sys/netinet/ip_carp.c: 1.92, 1.94
sys/netinet/ip_flow.c: 1.81
sys/netinet/ip_input.c: 1.362
sys/netinet/ip_mroute.c: 1.147
sys/netinet/ip_output.c: 1.283, 1.285, 1.287
sys/netinet6/frag6.c: 1.61
sys/netinet6/in6.c: 1.251, 1.255
sys/netinet6/in6_pcb.c: 1.162
sys/netinet6/ip6_flow.c: 1.35
sys/netinet6/ip6_input.c: 1.183
sys/netinet6/ip6_output.c: 1.196
sys/netinet6/mld6.c: 1.90
sys/netinet6/nd6.c: 1.239-1.240
sys/netinet6/nd6_nbr.c: 1.139
sys/netinet6/nd6_rtr.c: 1.136
sys/netipsec/ipsec_output.c: 1.65
sys/rump/net/lib/libnetinet/netinet_component.c: 1.9-1.10
kmem_intr_free kmem_intr_[z]alloced memory
the underlying pools are the same but api-wise those should match
Unify IFEF_*_MPSAFE into IFEF_MPSAFE
There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.
Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).
Note that if an interface has both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
operations take the kernel lock.
Proposed on tech-kern@ and tech-net@
Provide macros for softnet_lock and KERNEL_LOCK hiding NET_MPSAFE switch
It reduces C&P codes such as "#ifndef NET_MPSAFE KERNEL_LOCK(1, NULL); ..."
scattered all over the source code and makes it easy to identify remaining
KERNEL_LOCK and/or softnet_lock that are held even if NET_MPSAFE.
No functional change
Hold KERNEL_LOCK on if_ioctl selectively based on IFEF_MPSAFE
If IFEF_MPSAFE is set, hold the lock and otherwise don't hold.
This change requires additions of KERNEL_LOCK to subsequent functions from
if_ioctl such as ifmedia_ioctl and ifioctl_common to protect non-MP-safe
components.
Proposed on tech-kern@ and tech-net@
Ensure to hold if_ioctl_lock when calling if_flags_set
Fix locking against myself on ifpromisc
vlan_unconfig_locked could be called with holding if_ioctl_lock.
Ensure to not turn on IFF_RUNNING of an interface until its initialization completes
And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)
Ensure to hold if_ioctl_lock on if_up and if_down
One exception for if_down is if_detach; in the case the lock isn't needed
because it's guaranteed that no other one can access ifp at that point.
Make if_link_queue MP-safe if IFEF_MPSAFE
if_link_queue is a queue to store events of link state changes, which is
used to pass events from (typically) an interrupt handler to
if_link_state_change softint. The queue was protected by KERNEL_LOCK so far,
but if IFEF_MPSAFE is enabled, it becomes unsafe because (perhaps) an interrupt
handler of an interface with IFEF_MPSAFE doesn't take KERNEL_LOCK. Protect it
by a spin mutex.
Additionally with this change KERNEL_LOCK of if_link_state_change softint is
omitted if NET_MPSAFE is enabled.
Note that the spin mutex is now ifp->if_snd.ifq_lock as well as the case of
if_timer (see the comment).
Use IFADDR_WRITER_FOREACH instead of IFADDR_READER_FOREACH
At that point no other one modifies the list so IFADDR_READER_FOREACH
is unnecessary. Use of IFADDR_READER_FOREACH is harmless in general though,
if we try to detect contract violations of pserialize, using it violates
the contract. So avoiding it makes life easy.
Ensure to call if_addr_init with holding if_ioctl_lock
Get rid of outdated comments
Fix build of kernels without ether
By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created an unnecessary dependency from if.c to if_ethersubr.c.
PR kern/52790
Rename IFNET_LOCK to IFNET_GLOBAL_LOCK
IFNET_LOCK will be used in another lock, if_ioctl_lock (might be renamed then).
Wrap if_ioctl_lock with IFNET_* macros (NFC)
Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...
Reorder some destruction routines in if_detach
- Destroy if_ioctl_lock at the end of the if_detach because it's used in various
destruction routines
- Move psref_target_destroy after pr_purgeif because we want to use psref in
pr_purgeif (otherwise destruction procedures can be tricky)
Ensure to call if_mcast_op with holding IFNET_LOCK
Note that CARP doesn't deal with IFNET_LOCK yet.
Remove IFNET_GLOBAL_LOCK where it's unnecessary because IFNET_LOCK is held
Describe which lock is used to protect each member variable of struct ifnet
Requested by skrll@
Write a guideline for converting an interface to IFEF_MPSAFE
Requested by skrll@
Note that IFNET_LOCK must not be held in softint
Don't set IFEF_MPSAFE unless NET_MPSAFE at this point
Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.
Note that enabling IFEF_MPSAFE alone yields little performance benefit because
the network stack is still serialized by the big kernel locks by default.
 1.139.2.1  08-Nov-2017  snj Pull up following revision(s) (requested by ozaki-r in ticket #349):
sys/net/if_l2tp.c: revision 1.14
sys/net/if_tap.c: revision 1.101
sys/net/if_tun.c: revision 1.141
sys/net/if_vlan.c: revision 1.106
Set IFEF_NO_LINK_STATE_CHANGE flag to pseudo devices that don't use
if_link_state_change
 1.142.2.3  06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.142.2.2  28-Jul-2018  pgoyette Sync with HEAD
 1.142.2.1  22-Mar-2018  pgoyette Synch with HEAD, resolve conflicts
 1.144.2.2  08-Apr-2020  martin Merge changes from current as of 20200406
 1.144.2.1  10-Jun-2019  christos Sync with HEAD
 1.156.2.1  11-Mar-2024  martin Pull up following revision(s) (requested by riastradh in ticket #1815):

sys/net/if_tun.c: revision 1.175

tun(4): Allow IPv6 packets with TUNSLMODE configured.
PR kern/58013
 1.157.2.1  29-Feb-2020  ad Sync with head.
 1.161.2.1  03-Jan-2021  thorpej Sync w/ HEAD.
 1.173.8.1  16-Nov-2023  thorpej IFQ_CLASSIFY() -> ifq_classify_packet().
 1.173.4.3  21-Sep-2024  martin Pull up following revision(s) (requested by rin in ticket #899):

sys/net/if_tun.c: revision 1.177

tun(4): Mark tunread_filtops `FILTEROP_MPSAFE`

Filter handlers have already been MP-safe since 2018:
https://mail-index.netbsd.org/source-changes/2018/08/06/msg097317.html

Note that we do not expect deadlocks similar to bpf(4) (PR kern/58531),
b/w KERNEL_LOCK and spin mutex for TX queue.

For tun(4), filt_tunread() acquires adaptive mutex. This is forbidden
when spin mutex is already held.

Such a path must have already been detected if present.

Thanks ozaki-r@ for discussion.
 1.173.4.2  11-Mar-2024  martin Pull up following revision(s) (requested by riastradh in ticket #627):

sys/net/if_tun.c: revision 1.175

tun(4): Allow IPv6 packets with TUNSLMODE configured.
PR kern/58013
 1.173.4.1  14-Jan-2024  martin Pull up following revision(s) (requested by chs in ticket #540):

sys/net/if_tun.c: revision 1.174

tun: add missing kpreempt_enable() if pktq_enqueue() fails
