
History log of /src/sys/netinet6/ip6_flow.c
Revision	Date	Author	Comments
1.43	29-Jun-2024	riastradh	netinet6: Use _NET_STAT* API instead of direct array access. XXX Exception: ip6flow_addstats_rt _assigns_ one of the `statistics' to the current count of ip6 flows in use, and we don't have anything in the _NET_STAT* API for that. So for now I abuse the abstraction, until we sort out this one exceptional case properly. PR kern/58380
1.42	19-Feb-2021	christos	- Make ALIGNED_POINTER use __alignof(t) instead of sizeof(t). This is more correct because it works with non-primitive types and provides the ABI alignment for the type the compiler will use. - Remove all the *_HDR_ALIGNMENT macros and asserts - Replace POINTER_ALIGNED_P with ACCESSIBLE_POINTER which is identical to ALIGNED_POINTER, but returns that the pointer is always aligned if the CPU supports unaligned accesses. [ as proposed in tech-kern ]
1.41	14-Feb-2021	christos	- centralize header align and pullup into a single inline function - use a single macro to align pointers and expose the alignment, instead of hard-coding 3 in half the macros. - fix an issue in the ipv6 l2tp where it was aligning for ipv4 and pulling for ipv6.
1.40	06-Feb-2018	ozaki-r	branches: 1.40.16; Shorten the name of a workqueue instance to fit to the limit (15)
1.39	29-Jan-2018	maxv	Style, and use __cacheline_aligned. By the way, it would be nice to revisit the use of 'ip6flow_lock' in ip6flow_fastforward(): it is taken right away because of 'ip6flow_inuse', but then we perform several checks that do not require it.
1.38	08-Jan-2018	knakahara	Committed debugging logs by mistake, sorry. Revert crypto.c:r.1.103 and ip6_flow.c:r.1.37.
1.37	08-Jan-2018	knakahara	Fix PR kern/52910. Reported and implemented a patch by Sevan Janiyan, thanks.
1.36	10-Dec-2017	maxv	Fix use-after-free: if m_pullup fails the (freed) mbuf is pushed on the ip6_pktq queue and re-processed later. Return 1 to say "processed and freed".
1.35	17-Nov-2017	ozaki-r	Provide macros for softnet_lock and KERNEL_LOCK hiding NET_MPSAFE switch It reduces C&P codes such as "#ifndef NET_MPSAFE KERNEL_LOCK(1, NULL); ..." scattered all over the source code and makes it easy to identify remaining KERNEL_LOCK and/or softnet_lock that are held even if NET_MPSAFE. No functional change
1.34	11-Jan-2017	ozaki-r	branches: 1.34.8; Get rid of unnecessary header inclusions
1.33	08-Dec-2016	ozaki-r	Add rtcache_unref to release points of rtentry stemming from rtcache In the MP-safe world, a rtentry stemming from a rtcache can be freed at any point. So we need to protect rtentries somehow, say by reference counting or passive references. Regardless of the method, we need to call some release function of a rtentry after using it. The change adds a new function rtcache_unref to release a rtentry. At this point, this function does nothing because for now we don't add a reference to a rtentry when we get one from a rtcache. We will add something useful in a further commit. This change is a part of changes for MP-safe routing table. It is separated to avoid one big change that makes it difficult to debug by bisecting.
1.32	18-Oct-2016	ozaki-r	Don't hold global locks if NET_MPSAFE is enabled If NET_MPSAFE is enabled, don't hold KERNEL_LOCK and softnet_lock in part of the network stack such as IP forwarding paths. The aim of the change is to make it easy to test the network stack without the locks and reduce our local diffs. By default (i.e., if NET_MPSAFE isn't enabled), the locks are held as they used to be. Reviewed by knakahara@
1.31	23-Aug-2016	knakahara	improve fast-forward performance when the number of flows exceeds ip6_maxflows. This is porting of ip_flow.c:r1.76 In ip6flow case, the before degradation is about 45%, the after degradation is about 55%.
1.30	02-Aug-2016	knakahara	ip6flow refactor like ipflow. - move ip6flow sysctls into ip6_flow.c like ip_flow.c:r1.64 - build ip6_flow.c only if GATEWAY kernel option is enabled
1.29	26-Jul-2016	ozaki-r	Simplify by using atomic_swap instead of mutex Suggested by kefren@
1.28	11-Jul-2016	ozaki-r	branches: 1.28.2; Run timers in workqueue Timers (such as nd6_timer) typically free/destroy some data in callout (softint). If we apply psz/psref for such data, we cannot do free/destroy process in there because synchronization of psz/psref cannot be used in softint. So run timer callbacks in workqueue works (normal LWP context). Doing workqueue_enqueue a work twice (i.e., call workqueue_enqueue before a previous task is scheduled) isn't allowed. For nd6_timer and rt_timer_timer, this doesn't happen because callout_reset is called only from workqueue's work. OTOH, ip{,6}flow_slowtimo's callout can be called before its work starts and completes because the callout is periodically called regardless of completion of the work. To avoid such a situation, add a flag for each protocol; the flag is set true when a work is enqueued and set false after the work finished. workqueue_enqueue is called only if the flag is false. Proposed on tech-net and tech-kern.
1.27	20-Jun-2016	knakahara	apply if_output_lock() to L3 callers which call ifp->if_output() of L2(or L3 tunneling).
1.26	13-Jun-2016	knakahara	eliminate unnecessary splnet
1.25	13-Jun-2016	knakahara	MP-ify fastforward to support GATEWAY kernel option. I add "ipflow_lock" mutex in ip_flow.c and "ip6flow_lock" mutex in ip6_flow.c to protect all data in each file. Of course, this is not MP-scalable. However, it is sufficient as tentative workaround. We should make it scalable somehow in the future. ok by ozaki-r@n.o.
1.24	23-Mar-2015	roy	Add RTF_BROADCAST to mark routes used for the broadcast address when they are created on the fly. This makes it clear what the route is for and allows an optimisation in ip_output() by avoiding a call to in_broadcast() because most of the time we do talk to a host. It also avoids a needless allocation for the storage of llinfo_arp and thus vanishes from arp(8) - it showed as incomplete anyway so this is a nice side effect. Guard against this and routes marked with RTF_BLACKHOLE in ip_fastforward(). While here, guard against routes marked with RTF_BLACKHOLE in ip6_fastforward(). RTF_BROADCAST is IPv4 only, so don't bother checking that here.
1.23	20-May-2014	bouyer	branches: 1.23.2; 1.23.4; Sync with the ipv4 code and call ifp->if_output() with KERNEL_LOCK held. Problem reported and fix tested by njoly@ on current-users@
1.22	01-Apr-2014	pooka	branches: 1.22.2; Wrap ipflow_create() & ip6flow_create() in kernel lock. Prevents the interrupt side on another core from seeing the situation while the ipflow is being modified.
1.21	23-May-2013	msaitoh	branches: 1.21.2; Clear mbuf's csum_flags in ip6flow_fastforward(). Fixes PR#47849.
1.20	11-Oct-2012	christos	PR/47058: Antti Kantee: If the ipv6 flow code modifies the mbuf, pass the change up to the caller.
1.19	19-Jan-2012	liamjfoy	branches: 1.19.2; 1.19.6; 1.19.8; Remove ip6f_start from ip6f struct
1.18	23-Mar-2009	liamjfoy	branches: 1.18.12; 1.18.16; Init ip6flow pool dynamically instead of using a linkset.
1.17	28-Apr-2008	martin	branches: 1.17.8; 1.17.10; 1.17.14; Remove clause 3 and 4 from TNF licenses
1.16	24-Apr-2008	ad	branches: 1.16.2; Merge the socket locking patch: - Socket layer becomes MP safe. - Unix protocols become MP safe. - Allows protocol processing interrupts to safely block on locks. - Fixes a number of race conditions. With much feedback from matt@ and plunky@.
1.15	15-Apr-2008	thorpej	branches: 1.15.2; Make ip6 and icmp6 stats per-cpu.
1.14	08-Apr-2008	thorpej	Change IPv6 stats from a structure to an array of uint64_t's. Note: This is ABI-compatible with the old ip6stat structure; old netstat binaries will continue to work properly.
1.13	04-Jan-2008	dyoung	branches: 1.13.6; Constify.
1.12	04-Jan-2008	dyoung	Replace rtcache_down() with rtcache_validate() and update rtcache_down() uses.
1.11	20-Dec-2007	dyoung	Poison struct route->ro_rt uses in the kernel by changing the name to _ro_rt. Use rtcache_getrt() to access a route cache's struct rtentry *. Introduce struct ifnet->if_dl that always points at the interface identifier/link-layer address. Make code that treated the first ifaddr on struct ifnet->if_addrlist as the interface address use if_dl, instead. Remove stale debugging code from net/route.c. Move the rtflush() code into rtcache_clear() and delete rtflush(). Delete rtalloc(), because nothing uses it any more. Make ND6_HINT an inline, lowercase subroutine, nd6_hint. I've done my best to convert IP Filter, the ISO stack, and the AppleTalk stack to rtcache_getrt(). They compile, but I have not tested them. I have given the changes to PF, GRE, IPv4 and IPv6 stacks a lot of exercise.
1.10	11-Dec-2007	lukem	use __KERNEL_RCSID()
1.9	20-Aug-2007	dyoung	branches: 1.9.2; 1.9.4; 1.9.10; 1.9.12; 1.9.14; 1.9.16; Don't call rtcache_check() from the fast-forward code, which runs at IPL_NET, because rtcache_check() may read the forwarding table. Elsewhere, the kernel only blocks interrupts at priority IPL_SOFTNET and below while it modifies the forwarding table, so rtcache_check() could be reading the table in an inconsistent state. Use rtcache_done(), instead. XXX netinet/ip_flow.c and netinet6/ip6_flow.c are virtually identical. XXX They should share code.
1.8	02-May-2007	dyoung	branches: 1.8.2; 1.8.6; Remove obsolete files netinet/in_route.[ch].
1.7	02-May-2007	dyoung	Eliminate address family-specific route caches (struct route, struct route_in6, struct route_iso), replacing all caches with a struct route. The principal benefit of this change is that all of the protocol families can benefit from route cache-invalidation, which is necessary for correct routing. Route-cache invalidation fixes an ancient PR, kern/3508, at long last; it fixes various other PRs, also. Discussions with and ideas from Joerg Sonnenberger influenced this work tremendously. Of course, all design oversights and bugs are mine. DETAILS 1 I added to each address family a pool of sockaddrs. I have introduced routines for allocating, copying, and duplicating, and freeing sockaddrs: struct sockaddr *sockaddr_alloc(sa_family_t af, int flags); struct sockaddr *sockaddr_copy(struct sockaddr *dst, const struct sockaddr *src); struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags); void sockaddr_free(struct sockaddr *sa); sockaddr_alloc() returns either a sockaddr from the pool belonging to the specified family, or NULL if the pool is exhausted. The returned sockaddr has the right size for that family; sa_family and sa_len fields are initialized to the family and sockaddr length---e.g., sa_family = AF_INET and sa_len = sizeof(struct sockaddr_in). sockaddr_free() puts the given sockaddr back into its family's pool. sockaddr_dup() and sockaddr_copy() work analogously to strdup() and strcpy(), respectively. sockaddr_copy() KASSERTs that the family of the destination and source sockaddrs are alike. The 'flags' argument for sockaddr_alloc() and sockaddr_dup() is passed directly to pool_get(9). 2 I added routines for initializing sockaddrs in each address family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(), etc. They are fairly self-explanatory. 3 structs route_in6 and route_iso are no more. All protocol families use struct route. 
I have changed the route cache, 'struct route', so that it does not contain storage space for a sockaddr. Instead, struct route points to a sockaddr coming from the pool the sockaddr belongs to. I added a new method to struct route, rtcache_setdst(), for setting the cache destination: int rtcache_setdst(struct route *, const struct sockaddr *); rtcache_setdst() returns 0 on success, or ENOMEM if no memory is available to create the sockaddr storage. It is now possible for rtcache_getdst() to return NULL if, say, rtcache_setdst() failed. I check the return value for NULL everywhere in the kernel. 4 Each routing domain (struct domain) has a list of live route caches, dom_rtcache. rtflushall(sa_family_t af) looks up the domain indicated by 'af', walks the domain's list of route caches and invalidates each one.
1.6	05-Apr-2007	liamjfoy	use size_t for indexes ok christos@
1.5	23-Mar-2007	macallan	caddr_t -> void *
1.4	23-Mar-2007	liamjfoy	Add a new sysctl net.inet6.ip6.hashsize to control the hash table size. The sysctl handler will ensure this value is a power of 2 ok dyoung@
1.3	12-Mar-2007	ad	branches: 1.3.2; 1.3.4; Pass an ipl argument to pool_init/POOL_INIT to be used when initializing the pool's lock.
1.2	08-Mar-2007	liamjfoy	branches: 1.2.2; 1.2.4; Use ip6flowtable when looking up
1.1	07-Mar-2007	liamjfoy	Add IPv6 Fast Forward - the IPv4 counterpart: If ip6_forward successfully forwards a packet, a cache, in this case a ip6flow struct entry, will be created. ether_input and friends will then be able to call ip6flow_fastforward with the packet which will then be passed to if_output (unless an issue is found - in that case the packet is passed back to ip6_input). ok matt@ christos@ dyoung@ and joerg@
1.2.4.5	07-May-2007	yamt	sync with head.
1.2.4.4	15-Apr-2007	yamt	sync with head.
1.2.4.3	24-Mar-2007	yamt	sync with head.
1.2.4.2	12-Mar-2007	rmind	Sync with HEAD (missed new files in previous).
1.2.4.1	08-Mar-2007	rmind	file ip6_flow.c was added on branch yamt-idlelwp on 2007-03-12 06:14:56 +0000
1.2.2.4	09-Oct-2007	ad	Sync with head.
1.2.2.3	08-Jun-2007	ad	Sync with head.
1.2.2.2	10-Apr-2007	ad	Sync with head.
1.2.2.1	13-Mar-2007	ad	Sync with head.
1.3.4.1	29-Mar-2007	reinoud	Pullup to -current
1.3.2.1	11-Jul-2007	mjf	Sync with head.
1.8.6.1	03-Sep-2007	jmcneill	Sync with HEAD.
1.8.2.1	03-Sep-2007	skrll	Sync with HEAD.
1.9.16.3	08-Jan-2008	bouyer	Sync with HEAD
1.9.16.2	02-Jan-2008	bouyer	Sync with HEAD
1.9.16.1	13-Dec-2007	bouyer	Sync with HEAD
1.9.14.1	11-Dec-2007	yamt	sync with head.
1.9.12.1	26-Dec-2007	ad	Sync with head.
1.9.10.1	18-Feb-2008	mjf	Sync with HEAD.
1.9.4.3	21-Jan-2008	yamt	sync with head
1.9.4.2	03-Sep-2007	yamt	sync with head.
1.9.4.1	20-Aug-2007	yamt	file ip6_flow.c was added on branch yamt-lazymbuf on 2007-09-03 14:43:32 +0000
1.9.2.1	09-Jan-2008	matt	sync with HEAD
1.13.6.1	02-Jun-2008	mjf	Sync with HEAD.
1.15.2.1	18-May-2008	yamt	sync with head.
1.16.2.2	04-May-2009	yamt	sync with head.
1.16.2.1	16-May-2008	yamt	sync with head.
1.17.14.1	13-May-2009	jym	Sync with HEAD. Commit is split, to avoid a "too many arguments" protocol error.
1.17.10.1	19-Jun-2013	bouyer	Pull up following revision(s) (requested by msaitoh in ticket #1864): sys/netinet6/ip6_flow.c: revision 1.21 Clear mbuf's csum_flags in ip6flow_fastforward(). Fixes PR#47849.
1.17.8.1	28-Apr-2009	skrll	Sync with HEAD.
1.18.16.1	18-Feb-2012	mrg	merge to -current.
1.18.12.3	22-May-2014	yamt	sync with head. for a reference, the tree before this commit was tagged as yamt-pagecache-tag8. this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
1.18.12.2	30-Oct-2012	yamt	sync with head
1.18.12.1	17-Apr-2012	yamt	sync with head
1.19.8.1	18-Jun-2014	msaitoh	Pull up following revision(s) (requested by bouyer in ticket #1067): sys/dist/ipf/netinet/ip_fil_netbsd.c 1.9 via patch sys/net/if_ethersubr.c 1.197 via patch sys/net/if_loop.c 1.77 via patch sys/net/if_vlan.c 1.70 via patch sys/netinet/if_arp.c 1.158 sys/netinet/ip_carp.c 1.54 via patch sys/netinet6/ip6_flow.c 1.23 via patch sys/netinet6/nd6.c 1.150 via patch sys/rump/librump/rumpkern/klock.c 1.4 Make sure *(if_output)() is called with KERNEL_LOCK held to avoid mbuf leak. See http://mail-index.netbsd.org/tech-net/2014/04/09/msg004511.html for details. For netinet6, the problem report, fix and test were done by njoly@ on current-users@
1.19.6.4	03-Dec-2017	jdolecek	update from HEAD
1.19.6.3	20-Aug-2014	tls	Rebase to HEAD as of a few days ago.
1.19.6.2	23-Jun-2013	tls	resync from head
1.19.6.1	20-Nov-2012	tls	Resync to 2012-11-19 00:00:00 UTC
1.19.2.3	03-Jun-2014	msaitoh	Pull up following revision(s) (requested by bouyer in ticket #1067): sys/dist/ipf/netinet/ip_fil_netbsd.c 1.9 via patch sys/net/if_ethersubr.c 1.197 via patch sys/net/if_loop.c 1.77 via patch sys/net/if_vlan.c 1.70 via patch sys/netinet/if_arp.c 1.158 sys/netinet/ip_carp.c 1.54 via patch sys/netinet6/ip6_flow.c 1.23 via patch sys/netinet6/nd6.c 1.150 via patch sys/rump/librump/rumpkern/klock.c 1.4 Make sure *(if_output)() is called with KERNEL_LOCK held to avoid mbuf leak. See http://mail-index.netbsd.org/tech-net/2014/04/09/msg004511.html for details. For netinet6, the problem report, fix and test were done by njoly@ on current-users@
1.19.2.2	19-Jun-2013	bouyer	Pull up following revision(s) (requested by msaitoh in ticket #895): sys/netinet6/ip6_flow.c: revision 1.21 Clear mbuf's csum_flags in ip6flow_fastforward(). Fixes PR#47849.
1.19.2.1	31-Oct-2012	riz	branches: 1.19.2.1.2; Pull up following revision(s) (requested by christos in ticket #638): sys/net/if_ppp.c: revision 1.137 sys/netinet6/ip6_flow.c: revision 1.20 sys/net/if_fddisubr.c: revision 1.82 sys/net/if_ethersubr.c: revision 1.192 sys/netinet6/in6_var.h: revision 1.66 sys/net/if_atmsubr.c: revision 1.50 PR/47058: Antti Kantee: If the ipv6 flow code modifies the mbuf, pass the change up to the caller.
1.19.2.1.2.1	18-Jun-2014	msaitoh	Pull up following revision(s) (requested by bouyer in ticket #1067): sys/dist/ipf/netinet/ip_fil_netbsd.c 1.9 via patch sys/net/if_ethersubr.c 1.197 via patch sys/net/if_loop.c 1.77 via patch sys/net/if_vlan.c 1.70 via patch sys/netinet/if_arp.c 1.158 sys/netinet/ip_carp.c 1.54 via patch sys/netinet6/ip6_flow.c 1.23 via patch sys/netinet6/nd6.c 1.150 via patch sys/rump/librump/rumpkern/klock.c 1.4 Make sure *(if_output)() is called with KERNEL_LOCK held to avoid mbuf leak. See http://mail-index.netbsd.org/tech-net/2014/04/09/msg004511.html for details. For netinet6, the problem report, fix and test were done by njoly@ on current-users@
1.21.2.1	18-May-2014	rmind	sync with head
1.22.2.1	10-Aug-2014	tls	Rebase.
1.23.4.5	05-Feb-2017	skrll	Sync with HEAD
1.23.4.4	05-Dec-2016	skrll	Sync with HEAD
1.23.4.3	05-Oct-2016	skrll	Sync with HEAD
1.23.4.2	09-Jul-2016	skrll	Sync with HEAD
1.23.4.1	06-Apr-2015	skrll	Sync with HEAD
1.23.2.1	12-May-2017	snj	Pull up following revision(s) (requested by skrll/ozaki-r in ticket #1402): sys/net/route.c: revision 1.170 via patch sys/netinet/ip_flow.c: revision 1.73 via patch sys/netinet6/ip6_flow.c: revision 1.28 via patch sys/netinet6/nd6.c: revision 1.203 via patch Run timers in workqueue Timers (such as nd6_timer) typically free/destroy some data in callout (softint). If we apply psz/psref for such data, we cannot do free/destroy process in there because synchronization of psz/psref cannot be used in softint. So run timer callbacks in workqueue works (normal LWP context). Doing workqueue_enqueue a work twice (i.e., call workqueue_enqueue before a previous task is scheduled) isn't allowed. For nd6_timer and rt_timer_timer, this doesn't happen because callout_reset is called only from workqueue's work. OTOH, ip{,6}flow_slowtimo's callout can be called before its work starts and completes because the callout is periodically called regardless of completion of the work. To avoid such a situation, add a flag for each protocol; the flag is set true when a work is enqueued and set false after the work finished. workqueue_enqueue is called only if the flag is false. Proposed on tech-net and tech-kern.
1.28.2.4	20-Mar-2017	pgoyette	Sync with HEAD
1.28.2.3	07-Jan-2017	pgoyette	Sync with HEAD. (Note that most of these changes are simply $NetBSD$ tag issues.)
1.28.2.2	04-Nov-2016	pgoyette	Sync with HEAD
1.28.2.1	06-Aug-2016	pgoyette	Sync with HEAD
1.34.8.2	09-Jan-2018	snj	Pull up following revision(s) (requested by maxv in ticket #481): sys/netinet6/ip6_flow.c: revision 1.36 Fix use-after-free: if m_pullup fails the (freed) mbuf is pushed on the ip6_pktq queue and re-processed later. Return 1 to say "processed and freed".
1.34.8.1	02-Jan-2018	snj	Pull up following revision(s) (requested by ozaki-r in ticket #456): sys/arch/arm/sunxi/sunxi_emac.c: 1.9 sys/dev/ic/dwc_gmac.c: 1.43-1.44 sys/dev/pci/if_iwm.c: 1.75 sys/dev/pci/if_wm.c: 1.543 sys/dev/pci/ixgbe/ixgbe.c: 1.112 sys/dev/pci/ixgbe/ixv.c: 1.74 sys/kern/sys_socket.c: 1.75 sys/net/agr/if_agr.c: 1.43 sys/net/bpf.c: 1.219 sys/net/if.c: 1.397, 1.399, 1.401-1.403, 1.406-1.410, 1.412-1.416 sys/net/if.h: 1.242-1.247, 1.250, 1.252-1.257 sys/net/if_bridge.c: 1.140 via patch, 1.142-1.146 sys/net/if_etherip.c: 1.40 sys/net/if_ethersubr.c: 1.243, 1.246 sys/net/if_faith.c: 1.57 sys/net/if_gif.c: 1.132 sys/net/if_l2tp.c: 1.15, 1.17 sys/net/if_loop.c: 1.98-1.101 sys/net/if_media.c: 1.35 sys/net/if_pppoe.c: 1.131-1.132 sys/net/if_spppsubr.c: 1.176-1.177 sys/net/if_tun.c: 1.142 sys/net/if_vlan.c: 1.107, 1.109, 1.114-1.121 sys/net/npf/npf_ifaddr.c: 1.3 sys/net/npf/npf_os.c: 1.8-1.9 sys/net/rtsock.c: 1.230 sys/netcan/if_canloop.c: 1.3-1.5 sys/netinet/if_arp.c: 1.255 sys/netinet/igmp.c: 1.65 sys/netinet/in.c: 1.210-1.211 sys/netinet/in_pcb.c: 1.180 sys/netinet/ip_carp.c: 1.92, 1.94 sys/netinet/ip_flow.c: 1.81 sys/netinet/ip_input.c: 1.362 sys/netinet/ip_mroute.c: 1.147 sys/netinet/ip_output.c: 1.283, 1.285, 1.287 sys/netinet6/frag6.c: 1.61 sys/netinet6/in6.c: 1.251, 1.255 sys/netinet6/in6_pcb.c: 1.162 sys/netinet6/ip6_flow.c: 1.35 sys/netinet6/ip6_input.c: 1.183 sys/netinet6/ip6_output.c: 1.196 sys/netinet6/mld6.c: 1.90 sys/netinet6/nd6.c: 1.239-1.240 sys/netinet6/nd6_nbr.c: 1.139 sys/netinet6/nd6_rtr.c: 1.136 sys/netipsec/ipsec_output.c: 1.65 sys/rump/net/lib/libnetinet/netinet_component.c: 1.9-1.10 kmem_intr_free kmem_intr_[z]alloced memory the underlying pools are the same but api-wise those should match Unify IFEF__MPSAFE into IFEF_MPSAFE There are already two flags for if_output and if_start, however, it seems such MPSAFE flags are eventually needed for all if_XXX operations. 
Having discrete flags for each operation is wasteful of if_extflags bits. So let's unify the flags into one: IFEF_MPSAFE. Fortunately IFEF__MPSAFE flags have never been included in any releases, so we can change them without breaking backward compatibility of the releases (though the kernel version of -current should be bumped). Note that if an interface has both MP-safe and non-MP-safe operations at a time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe operations take the kernel lock. Proposed on tech-kern@ and tech-net@ Provide macros for softnet_lock and KERNEL_LOCK hiding NET_MPSAFE switch It reduces C&P codes such as "#ifndef NET_MPSAFE KERNEL_LOCK(1, NULL); ..." scattered all over the source code and makes it easy to identify remaining KERNEL_LOCK and/or softnet_lock that are held even if NET_MPSAFE. No functional change Hold KERNEL_LOCK on if_ioctl selectively based on IFEF_MPSAFE If IFEF_MPSAFE is set, hold the lock and otherwise don't hold. This change requires additions of KERNEL_LOCK to subsequent functions from if_ioctl such as ifmedia_ioctl and ifioctl_common to protect non-MP-safe components. Proposed on tech-kern@ and tech-net@ Ensure to hold if_ioctl_lock when calling if_flags_set Fix locking against myself on ifpromisc vlan_unconfig_locked could be called with holding if_ioctl_lock. Ensure to not turn on IFF_RUNNING of an interface until its initialization completes And ensure to turn off it before destruction as per IFF_RUNNING's description "resource allocated". (The description is a bit doubtful though, I believe the change is still proper.) Ensure to hold if_ioctl_lock on if_up and if_down One exception for if_down is if_detach; in the case the lock isn't needed because it's guaranteed that no other one can access ifp at that point. 
Make if_link_queue MP-safe if IFEF_MPSAFE if_link_queue is a queue to store events of link state changes, which is used to pass events from (typically) an interrupt handler to if_link_state_change softint. The queue was protected by KERNEL_LOCK so far, but if IFEF_MPSAFE is enabled, it becomes unsafe because (perhaps) an interrupt handler of an interface with IFEF_MPSAFE doesn't take KERNEL_LOCK. Protect it by a spin mutex. Additionally with this change KERNEL_LOCK of if_link_state_change softint is omitted if NET_MPSAFE is enabled. Note that the spin mutex is now ifp->if_snd.ifq_lock as well as the case of if_timer (see the comment). Use IFADDR_WRITER_FOREACH instead of IFADDR_READER_FOREACH At that point no other one modifies the list so IFADDR_READER_FOREACH is unnecessary. Use of IFADDR_READER_FOREACH is harmless in general though, if we try to detect contract violations of pserialize, using it violates the contract. So avoiding it makes life easy. Ensure to call if_addr_init with holding if_ioctl_lock Get rid of outdated comments Fix build of kernels without ether By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that created an unnecessary dependency from if.c to if_ethersubr.c. PR kern/52790 Rename IFNET_LOCK to IFNET_GLOBAL_LOCK IFNET_LOCK will be used in another lock, if_ioctl_lock (might be renamed then). Wrap if_ioctl_lock with IFNET_* macros (NFC) Also if_ioctl_lock perhaps needs to be renamed to something because it's now not just for ioctl... Reorder some destruction routines in if_detach - Destroy if_ioctl_lock at the end of the if_detach because it's used in various destruction routines - Move psref_target_destroy after pr_purgeif because we want to use psref in pr_purgeif (otherwise destruction procedures can be tricky) Ensure to call if_mcast_op with holding IFNET_LOCK Note that CARP doesn't deal with IFNET_LOCK yet. 
Remove IFNET_GLOBAL_LOCK where it's unnecessary because IFNET_LOCK is held Describe which lock is used to protect each member variable of struct ifnet Requested by skrll@ Write a guideline for converting an interface to IFEF_MPSAFE Requested by skrll@ Note that IFNET_LOCK must not be held in softint Don't set IFEF_MPSAFE unless NET_MPSAFE at this point Because recent investigations show that interfaces with IFEF_MPSAFE need to follow additional restrictions to work with the flag safely. We should enable it on an interface by default only if the interface surely satisfies the restrictions, which are described in if.h. Note that enabling IFEF_MPSAFE alone gains little performance benefit because the network stack is still serialized by the big kernel locks by default.
1.40.16.1	03-Apr-2021	thorpej	Sync with HEAD.
