History log of /src/sys/kern/uipc_mbuf.c
Revision	Date	Author	Comments
1.255	15-Dec-2024	skrll	KNF
1.254	06-Dec-2024	riastradh	sys/kern/sys_socket.c, uipc_*.c: Sprinkle SET_ERROR dtrace probes. PR kern/58378: Kernel error code origination lacks dtrace probes
1.253	06-Dec-2024	riastradh	sys/kern/sys_socket.c, uipc_*.c: Sort includes. No functional change intended.
1.252	27-Nov-2023	ozaki-r	mbuf: avoid assertion failure when splitting mbuf cluster. From OpenBSD: commit 7b4d35e0a60ba1dd4daf4b1c2932020a22463a89 Author: bluhm <bluhm@openbsd.org> Date: Fri Oct 20 16:25:15 2023 +0000 Avoid assertion failure when splitting mbuf cluster. m_split() calls m_align() to initialize the data pointer of newly allocated mbuf. If the new mbuf will be converted to a cluster, this is not necessary. If additionally the new mbuf is larger than MLEN, this can lead to a panic. Only call m_align() when a valid m_data is needed. This is the case if we do not reference the existing cluster, but memcpy() the data into the new mbuf. Reported-by: syzbot+0e6817f5877926f0e96a@syzkaller.appspotmail.com OK claudio@ deraadt@ The issue is harmless if DIAGNOSTIC is not enabled. XXX pullup-10 XXX pullup-9
1.251	12-Apr-2023	riastradh	mbuf(9): New m_get_n, m_gethdr_n. m_get_n(how, type, alignbytes, nbytes) returns an mbuf with no packet header having space for nbytes, with an internal buffer pointer aligned by alignbytes (typically ETHER_ALIGN or similar, if not zero). m_gethdr_n(how, type, alignbytes, nbytes) does the same but for an mbuf with a packet header. These return NULL on failure, which can happen either: (a) because how is M_DONTWAIT and allocating memory would sleep, or (b) because alignbytes + nbytes > MCLBYTES. On exit, m_len is set to nbytes, as is m_pkthdr.len for m_gethdr_n. These should be used to systematically replace all calls to m_get, m_gethdr, MGET, MGETHDR, and m_getcl. Most calls to m_clget and MCLGET will probably evaporate as a consequence. Proposed on tech-net last year: https://mail-index.netbsd.org/tech-net/2022/07/16/msg008285.html
1.250	01-Apr-2023	skrll	0x%p -> %p in KASSERTMSGs
1.249	31-Mar-2023	riastradh	mbuf(9): Sprinkle KASSERTMSG. No functional change intended.
1.248	24-Feb-2023	riastradh	kern: Eliminate most __HAVE_ATOMIC_AS_MEMBAR conditionals. I'm leaving in the conditional around the legacy membar_enters (store-before-load, store-before-store) in kern_mutex.c and in kern_lock.c because they may still matter: store-before-load barriers tend to be the most expensive kind, so eliding them is probably worthwhile on x86. (It also may not matter; I just don't care to do measurements right now, and it's a single valid and potentially justifiable use case in the whole tree.) However, membar_release/acquire can be mere instruction barriers on all TSO platforms including x86, so there's no need to go out of our way with a bad API to conditionalize them. If the procedure call overhead is measurable we just could change them to be macros on x86 that expand into __insn_barrier. Discussed on tech-kern: https://mail-index.netbsd.org/tech-kern/2023/02/23/msg028729.html
1.247	16-Dec-2022	msaitoh	branches: 1.247.2; Add new "kern.mbuf.nmbclusters_limit" sysctl. - Used to know the upper limit of nmbclusters. - It's read only.
1.246	09-Apr-2022	riastradh	sys: Use membar_release/acquire around reference drop. This just goes through my recent reference count membar audit and changes membar_exit to membar_release and membar_enter to membar_acquire -- this should make everything cheaper on most CPUs without hurting correctness, because membar_acquire is generally cheaper than membar_enter.
1.245	12-Mar-2022	riastradh	sys: Membar audit around reference count releases. If two threads are using an object that is freed when the reference count goes to zero, we need to ensure that all memory operations related to the object happen before freeing the object. Using an atomic_dec_uint_nv(&refcnt) == 0 ensures that only one thread takes responsibility for freeing, but it's not enough to ensure that the other thread's memory operations happen before the freeing. Consider:

    Thread A                      Thread B
    obj->foo = 42;                obj->baz = 73;
    mumble(&obj->bar);            grumble(&obj->quux);
    /* membar_exit(); */          /* membar_exit(); */
    atomic_dec -- not last        atomic_dec -- last
                                  /* membar_enter(); */
                                  KASSERT(invariant(obj->foo, obj->bar));
                                  free_stuff(obj);

The memory barriers ensure that obj->foo = 42; mumble(&obj->bar); in thread A happens before KASSERT(invariant(obj->foo, obj->bar)); free_stuff(obj); in thread B. Without them, this ordering is not guaranteed. So in general it is necessary to do

    membar_exit();
    if (atomic_dec_uint_nv(&obj->refcnt) != 0)
        return;
    membar_enter();

to release a reference, for the `last one out hit the lights' style of reference counting. (This is in contrast to the style where one thread blocks new references and then waits under a lock for existing ones to drain with a condvar -- no membar needed thanks to mutex(9).) I searched for atomic_dec to find all these. Obviously we ought to have a better abstraction for this because there's so much copypasta. This is a stop-gap measure to fix actual bugs until we have that. It would be nice if an abstraction could gracefully handle the different styles of reference counting in use -- some years ago I drafted an API for this, but making it cover everything got a little out of hand (particularly with struct vnode::v_usecount) and I ended up setting it aside to work on psref/localcount instead for better scalability.
I got bored of adding #ifdef __HAVE_ATOMIC_AS_MEMBAR everywhere, so I only put it on things that look performance-critical on 5sec review. We should really adopt membar_enter_preatomic/membar_exit_postatomic or something (except they are applicable only to atomic r/m/w, not to atomic_load/store_*, making the naming annoying) and get rid of all the ifdefs.
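The release pattern described in revision 1.245 can be sketched in portable C11. This is only a userspace analogue under stated assumptions: `struct obj` and `obj_release()` are hypothetical names, and the C11 release decrement plus acquire fence stand in for the kernel's membar_exit()/membar_enter() pair (later renamed membar_release/membar_acquire in 1.246). It is not the code in uipc_mbuf.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object; names are illustrative only. */
struct obj {
	atomic_uint refcnt;
	int foo;
};

/*
 * Drop one reference.  Returns true if the caller dropped the last
 * reference and is therefore responsible for freeing the object.
 */
static bool
obj_release(struct obj *o)
{
	/* Release: our earlier stores to *o are ordered before the decrement. */
	unsigned int old = atomic_fetch_sub_explicit(&o->refcnt, 1,
	    memory_order_release);

	assert(old > 0);	/* dropping a reference we never held is a bug */
	if (old != 1)
		return false;	/* not the last one out */

	/* Acquire: other threads' stores are ordered before the free that follows. */
	atomic_thread_fence(memory_order_acquire);
	return true;
}
```

A caller would free the object only when `obj_release()` returns true; every other caller simply returns after the decrement.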
1.244	06-Oct-2021	msaitoh	Fix a bug that NMBCLUSTERS(kern.mbuf.nmbclusters) can't be changed by sysctl.
1.243	04-Mar-2021	msaitoh	Revert accidentally committed debug code. Sorry.
1.242	04-Mar-2021	msaitoh	Add missing opt_inet.h.
1.241	05-May-2020	jdolecek	branches: 1.241.2; fix KASSERT() for MHLEN case in m_defrag() - network stack usually does m_adj(ETHER_ALIGN) so check that the mbuf chain data fits M_LEADINGSPACE() + M_TRAILINGSPACE()
1.240	25-Apr-2020	jdolecek	in m_defrag() must copy data elsewhere before adding cluster, the data part of mbuf gets reused and hence overwritten by extbuf
1.239	24-Apr-2020	jdolecek	add KASSERT() that the whole data buffer in a mbuf or the mbuf cluster fits within the same page; pools actually never return items whose memory crosses a page boundary for item sizes smaller than PAGE_SIZE
1.238	24-Apr-2020	jdolecek	change m_defrag() to coalesce the chain to single mbuf if it's short enough and first mbuf doesn't use external storage most fragmented packets end up with first short mbuf containing frame + protocol header only, and second mbuf containing the data; m_defrag() previously always returned chain of at least two mbufs, now it should actually return all data in single mbuf for typical mbuf chain with length < MCLBYTES
1.237	15-Mar-2020	thorpej	branches: 1.237.2; Add and use a new function, mowner_init_owner(), that initializes an MBUFTRACE mowner structure (so that providers of it don't have to grovel the internals).
1.236	06-Dec-2019	maxv	Minor changes, reported by the LGTM bot.
1.235	19-Oct-2019	tnn	mcl_cache: align items to COHERENCY_UNIT. Because we do cache incoherent DMA to/from mbufs we cannot safely share cache lines with adjacent items that may be concurrently accessed.
1.234	28-Sep-2019	jmcneill	mbstat_conver_to_user_cb -> mbstat_convert_to_user_cb
1.233	18-Sep-2019	maxv	Handle M_EXT with M_BUFADDR, and introduce M_BUFSIZE. Use them to dedup code.
1.232	17-Jan-2019	knakahara	branches: 1.232.4; Fix ipsecif(4) cannot apply input direction packet filter. Reviewed by ozaki-r@n.o and ryo@n.o. Add ATF later.
1.231	16-Jan-2019	knakahara	Initialize m_pkthdr members explicitly.
1.230	27-Dec-2018	maxv	Remove M_COPY_PKTHDR, M_MOVE_PKTHDR, M_ALIGN and MH_ALIGN.
1.229	22-Dec-2018	maxv	Replace M_ALIGN and MH_ALIGN by m_align.
1.228	22-Dec-2018	maxv	Replace: M_COPY_PKTHDR -> m_copy_pkthdr. No functional change, since the former is a macro to the latter.
1.227	22-Dec-2018	maxv	Move m_align() back into the kernel, and switch M_ALIGN and MH_ALIGN to it. Forcing a distinction between M_ALIGN and MH_ALIGN is too bug-friendly and serves no particular purpose.
1.226	22-Dec-2018	maxv	Replace: M_MOVE_PKTHDR -> m_move_pkthdr. No functional change, since the former is a macro to the latter.
1.225	15-Nov-2018	maxv	Remove the 'copy' argument from m_devget(), unused. While here rename off0->off.
1.224	15-Nov-2018	maxv	Add KASSERTs.
1.223	15-Nov-2018	maxv	Remove the 't' argument from m_tag_find().
1.222	15-Nov-2018	maxv	Simplify the mtag API: - Remove m_tag_init(), m_tag_first(), m_tag_next() and m_tag_delete_nonpersistent(). - Remove the 't' argument from m_tag_delete_chain().
1.221	15-Nov-2018	maxv	Merge uipc_mbuf2.c into uipc_mbuf.c. Reorder the latter a little to gather similar functions. No functional change.
1.220	05-Oct-2018	msaitoh	s/conver_to/convert_to/. No functional change.
1.219	03-Sep-2018	riastradh	Rename min/max -> uimin/uimax for better honesty. These functions are defined on unsigned int. The generic name min/max should not silently truncate to 32 bits on 64-bit systems. This is purely a name change -- no functional change intended. HOWEVER! Some subsystems have #define min(a, b) ((a) < (b) ? (a) : (b)) #define max(a, b) ((a) > (b) ? (a) : (b)) even though our standard name for that is MIN/MAX. Although these may invite multiple evaluation bugs, these do _not_ cause integer truncation. To avoid `fixing' these cases, I first changed the name in libkern, and then compile-tested every file where min/max occurred in order to confirm that it failed -- and thus confirm that nothing shadowed min/max -- before changing it. I have left a handful of bootloaders that are too annoying to compile-test, and some dead code: cobalt ews4800mips hp300 hppa ia64 luna68k vax acorn32/if_ie.c (not included in any kernels) macppc/if_gm.c (superseded by gem(4)) It should be easy to fix the fallout once identified -- this way of doing things fails safe, and the goal here, after all, is to _avoid_ silent integer truncations, not introduce them. Maybe one day we can reintroduce min/max as type-generic things that never silently truncate. But we should avoid doing that for a while, so that existing code has a chance to be detected by the compiler for conversion to uimin/uimax without changing the semantics until we can properly audit it all. (Who knows, maybe in some cases integer truncation is actually intended!)
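The truncation hazard that motivated the rename in revision 1.219 can be shown with a small stand-alone sketch. `uimin_sketch` and `MIN_SKETCH` are hypothetical stand-ins written for this illustration, not the libkern definitions, and the example assumes a platform where unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-in for a min function declared on unsigned int.  A 64-bit
 * argument is converted to unsigned int at the call, silently
 * dropping its upper 32 bits.
 */
static unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Macro form: the comparison happens in the operands' own type, so
 * nothing is truncated, but an argument may be evaluated twice.
 */
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))
```

Under that assumption, passing (1 << 32) + 5 as a 64-bit value to `uimin_sketch` together with 100 yields 5 rather than 100, while `MIN_SKETCH` on the same 64-bit operands yields 100. The explicit `ui` prefix makes the operand type visible at the call site.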
1.218	09-Aug-2018	maxv	Localify mcl_cache.
1.217	18-Jul-2018	msaitoh	- Fix compile error for kernel configuration files that have no Ethernet device driver. - Add missing default label. - Fix NetBSD RCS Id.
1.216	17-Jul-2018	msaitoh	Add /d(dump) and /v(verbose) modifiers to DDB's "show mbuf" command. Mainly written by Hiroki SUENAGA. Currently, /v supports Ethernet, PPP, PPPoE, ARP, IPv4, ICMP, IPv6, ICMPv6, TCP and UDP.
1.215	07-May-2018	maxv	branches: 1.215.2; Copy some KASSERTs from m_move_pkthdr into m_copy_pkthdr, and reorder the latter to reduce the diff with the former.
1.214	03-May-2018	maxv	Revert my rev1.190, remove the M_READONLY check. The initial code was correct: what is read-only is the mbuf storage, not the mbuf itself. The storage contains the packet payload, and never has anything related to mbufs. So it is fine to remove M_PKTHDR on mbufs that have a read-only storage. In fact it was kind of obvious, since several places already manually remove M_PKTHDR without taking care of the external storage.
1.213	03-May-2018	maxv	Rename m_pkthdr_remove -> m_remove_pkthdr, to match the existing naming convention, eg m_copy_pkthdr and m_move_pkthdr.
1.212	28-Apr-2018	maxv	Rename the 'flags' and 'nowait' arguments to 'how'. The other BSDs did the same. Also, in m_defrag, rename 'mold' to 'm'.
1.211	28-Apr-2018	maxv	Modify m_defrag, so that it never frees the first mbuf of the chain. While here use the given 'flags' argument, and not M_DONTWAIT. We have a problem with several drivers: they poll an mbuf chain from their queues and call m_defrag on them, but m_defrag could update the mbuf pointer, so the mbuf in the queue is no longer valid. It is not easy to fix each driver, because doing pop+push will reorder the queue, and we don't really want that to happen. This problem was independently spotted by me, Kengo, Masanobu, and other people too it seems (perhaps PR/53218). Now m_defrag leaves the first mbuf in place, and compresses the chain only starting from the second mbuf in the chain. It is important not to compress the first mbuf with hacks, because the storage of this first mbuf may be shared with other mbufs.
1.210	27-Apr-2018	maxv	Remove unused debug code.
1.209	27-Apr-2018	maxv	Remove reference to m_ext.ext_type (doesn't exist).
1.208	27-Apr-2018	maxv	Remove unused ext_flags field in struct _m_ext_storage. Also, simplify MEXTMALLOC, mbtypes[] doesn't exist anymore, but the code still compiled correctly because "malloc" is a macro and the argument was dropped.
1.207	27-Apr-2018	maxv	Stop passing the pool as argument of the storage. M_EXT_CLUSTER mbufs are supposed to take their area from mcl_cache only.
1.206	27-Apr-2018	maxv	Remove _MCLGET, merge its content into m_clget(). The code is slightly modified to reduce the indentation level.
1.205	27-Apr-2018	maxv	Reorder, to group related functions.
1.204	27-Apr-2018	maxv	M_CLUSTER -> M_EXT_CLUSTER
1.203	27-Apr-2018	maxv	Rename m_reclaim -> mb_drain, and localify.
1.202	27-Apr-2018	maxv	Implement M_COPY_PKTHDR as a function, like m_move_pkthdr.
1.201	27-Apr-2018	maxv	Move m_align and m_append into ieee80211_netbsd.c. They are part of net80211, and shouldn't be used outside.
1.200	27-Apr-2018	maxv	Simplify m_copydata, use unsigned int, and change its last argument to match that of the man page.
1.199	27-Apr-2018	maxv	Style and simplify.
1.198	27-Apr-2018	maxv	Panic in m_copypacket if no header is present, that's a requirement.
1.197	26-Apr-2018	maxv	Change MCLGET, so that it calls m_clget instead of doing the work in a macro. Macros are inefficient when they contain too many instructions and are used too often, because of cache coherency (and also register use). This change saves 32KB of kernel .text.
1.196	26-Apr-2018	maxv	Rename m_copyback0 -> m_copyback_internal M_COPYBACK0_* -> CB_* That's a lot less misleading. While here, fix a bunch of panic messages.
1.195	26-Apr-2018	maxv	Stop adding '0's in parameter and function names, that's just misleading. Some remain, they need more investigation.
1.194	26-Apr-2018	maxv	Change comment, to clearly say that m_prepend should not be used directly.
1.193	20-Apr-2018	maxv	Cast to int, to properly handle dstoff > MHLEN (which never happens).
1.192	19-Apr-2018	maxv	The mbuf length is allowed to be zero.
1.191	17-Apr-2018	maxv	change the comment
1.190	17-Apr-2018	maxv	If the mbuf is shared leave M_PKTHDR in place. Given where this function is called from that's not supposed to happen, but I'm growing unconfident about our mbuf code.
1.189	16-Apr-2018	maxv	Disable the M_PKTHDR check for now. It causes PR/53189 (which is also reproducible on i386). It seems that someone is giving looutput a malformed chain.
1.188	15-Apr-2018	maxv	Introduce a m_verify_packet function, that verifies the mbuf chain of a packet to ensure it is not malformed. Call this function in "points of interest", that are the IPv4/IPv6/IPsec entry points. There could be more. We use M_VERIFY_PACKET(m), declared under DIAGNOSTIC only. This function should not be called everywhere, especially not in places that temporarily manipulate (and clobber) the mbuf structure; once they're done they put the mbuf back in a correct format.
1.187	10-Apr-2018	maxv	Remove m_getclr. It is unused, confusing (vs m_clget), and is a weak implementation (eg you can't request a zeroed pkthdr mbuf).
1.186	10-Apr-2018	maxv	Put the "free" functions close to one another. No functional change.
1.185	10-Apr-2018	maxv	Localify m_ext_free.
1.184	21-Mar-2018	maxv	Localify and remove unused prototypes.
1.183	21-Mar-2018	maxv	Remove these global variables. They are unused, racy, and the only thing they do is triggering cache synchronization latencies between CPUs.
1.182	09-Mar-2018	maxv	Remove M_PKTHDR from secondary mbufs when reassembling packets. This is a real problem, because I found at least one component that relies on the fact that only the first mbuf has M_PKTHDR: far from here, in m_splithdr, we don't update m->m_pkthdr.len if M_PKTHDR is found in a secondary mbuf. (The initial intention there was to avoid updating m_pkthdr.len twice, the assumption was that if M_PKTHDR is set then we're dealing with the first mbuf.) Therefore, when handling fragmented IPsec packets (in particular IPv6, IPv4 is a bit more complicated), we may end up with an incorrect m_pkthdr.len after authentication or decryption. In the case of ESP, this can lead to a remote crash on this instruction: m_copydata(m, m->m_pkthdr.len - 3, 3, lastthree); m_pkthdr.len is bigger than the actual mbuf chain. It seems possible to me to trigger this bug even if you don't have the ESP key, because the fragmentation part is outside of the encrypted ESP payload. So if you MITM the target, and intercept an incoming ESP packet (which you can't decrypt), you should be able to forge a new specially-crafted, fragmented packet and stuff the ESP payload (still encrypted, as you intercepted it) into it. The decryption succeeds and the target crashes.
1.181	22-Jan-2018	maxv	branches: 1.181.2; Style and clarify.
1.180	22-Jan-2018	maxv	m_prepend does not tolerate being given len > MHLEN, so add a panic, and document this behavior.
1.179	22-Jan-2018	maxv	Style, no functional change.
1.178	22-Jan-2018	maxv	Fix m_prepend(). If 'm' is not a pkthdr, it doesn't make sense to use MH_ALIGN, it should rather be M_ALIGN. I'm wondering whether there should not be a KASSERT to make sure 'm' is always a pkthdr.
1.177	14-Jan-2018	maxv	style
1.176	01-Jan-2018	maxv	Detect use-after-frees on mbufs with external storage, too. This is done even when the refcount is > 1. Again, this code is enabled by default, because it is fast and quite useful.
1.175	01-Jan-2018	maxv	Don't use macros, rather inline, much clearer. For the record, I was partly mistaken in my previous commit: even though the macros were local, the function names were still the ones of the real callers. However, setting the name in m_data was not a good thing; this was a valid pointer, and the kernel could execute a long time before figuring out the mbuf was already freed - therefore making debugging more difficult. And information on the caller can be obtained via ddb anyway.
1.174	31-Dec-2017	maxv	Check MT_FREE by default, and not just under DEBUG (or DIAGNOSTIC). This code is fast, with an nonexistent overhead - and we already take care of setting MT_FREE, so why not check it. In addition, stop registering the function name, that's not helpful since the MBUFFREE macro is local. Instead, set m_data to NULL, so that any access to a freed mbuf's data after mtod() or similar will page fault. The combination of these two changes provides a fast and efficient way of detecting use-after-frees in the network stack.
1.173	09-Nov-2017	christos	Don't use 0 for PR_NOWAIT
1.172	31-Mar-2017	msaitoh	branches: 1.172.6; Remove extra 0x in m_print().
1.171	14-Mar-2017	ozaki-r	Use if_acquire and if_release instead of using psref API directly - Provide if_release for consistency to if_acquire - Use if_acquire and if_release for ifp iterations - Make ifnet_psref_class static
1.170	09-Jan-2017	christos	branches: 1.170.2; If we had an error, don't do the debug checks because they will most certainly fail and we'll panic.
1.169	04-Oct-2016	christos	Hide MFREE now that it is not being used anymore and provide some debugging for the location of the last free for debugging kernels.
1.168	16-Jun-2016	ozaki-r	branches: 1.168.2; Use curlwp_bind and curlwp_bindx instead of open-coding LP_BOUND
1.167	10-Jun-2016	ozaki-r	Avoid storing a pointer of an interface in a mbuf. Having a pointer of an interface in a mbuf isn't safe if we remove big kernel locks; an interface object (ifnet) can be destroyed anytime in any packet processing and accessing such an object via a pointer is racy. Instead we have to get an object from the interface collection (ifindex2ifnet) via an interface index (if_index) that is stored to a mbuf instead of a pointer. The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9) for sleep-able critical sections and m_{get,put}_rcvif that use pserialize(9) for other critical sections. The change also adds another API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition moratorium, i.e., it is intended to be used for places that are not planned to be MP-ified soon. The change adds some overhead due to psref to performance sensitive paths, however the overhead is not serious, 2% down at worst. Proposed on tech-kern and tech-net.
1.166	10-Jun-2016	ozaki-r	Introduce m_set_rcvif and m_reset_rcvif The API is used to set (or reset) a received interface of a mbuf. They are counterpart of m_get_rcvif, which will come in another commit, hide internal of rcvif operation, and reduce the diff of the upcoming change. No functional change.
1.165	12-May-2016	ozaki-r	Protect ifnet list with psz and psref The change ensures that ifnet objects in the ifnet list aren't freed during list iterations by using pserialize(9) and psref(9). Note that the change adds a pslist(9) for ifnet but doesn't remove the original ifnet list (ifnet_list) to avoid breaking kvm(3) users. We shouldn't use the original list in the kernel anymore.
1.164	20-Apr-2016	knakahara	Add init function for mbuf. some functions use mbuf as stack variable instead of allocating by m_get(). They should use this function(s) to prevent access to uninitialized fields. Currently, the mbuf stack allocating functions are the following. + sys/dev/ic/bwi.c - bwi_rxeof() - bwi_encap() + sys/dev/ic/dp8390.c - dp8390_ipkdb_send() + sys/dev/pci/if_txp.c - txp_download_fw_section() + sys/dev/ppbus/if_plip.c - lptap() + sys/net/bpf.c - _pf_mtap2() - _pf_mtap_af() - _pf_mtap_sl_out() + sys/netisdn/i4b_ipr.c - ipr_rx_data_rdy() - ipr_tx_queue_empty() Reviewed by kre@n.o and christos@n.o, thanks.
1.163	24-Aug-2015	pooka	sprinkle _KERNEL_OPT
1.162	24-Jul-2015	maxv	typo (comment)
1.161	08-Feb-2015	mlelstv	Correct m_len calculation for m_dup() with mbuf clusters. Fixes kern/49650.
1.160	02-Dec-2014	ozaki-r	Revert "Pull if_drain routine out of m_reclaim" The commit broke dlopen()'d rumpnet on platforms where ld.so does not override weak aliases (e.g. musl, Solaris, potentially OS X, ...). Requested by pooka@.
1.159	27-Nov-2014	ozaki-r	branches: 1.159.2; Pull if_drain routine out of m_reclaim It's if-specific and should be in if.c. No functional change.
1.158	25-Feb-2014	pooka	branches: 1.158.4; Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before the sysctl link sets are processed, and remove redundancy. Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate lines of code.
1.157	15-Nov-2013	christos	remove trigger happy assertion. in m_adj negative lengths are valid.
1.156	14-Nov-2013	christos	- add KASSERTS on functions that don't accept M_COPYALL - compute length for m_copyback0, m_makewritable used from ipf, is using M_COPYALL.
1.155	14-Nov-2013	skrll	Deal with M_COPYALL becoming -ve properly in m_copym0. I can now mount via nfs again.
1.154	14-Nov-2013	christos	change M_COPYALL to be -1 instead of depending on it to be "too large", so that we check explicitly against it in all places. ok gimpy
1.153	09-Oct-2013	christos	- initialize m_len and m_pkthdr.len to 0 in constructors, as discussed in tech-net. - s/MGET/m_get - s/0/NULL
1.152	20-Sep-2013	christos	mark mbuf as free when we return it to the pool (Beverly Schwartz)
1.151	28-Jun-2013	matt	branches: 1.151.2; Make m_copydata panics more verbose
1.150	27-Jun-2013	christos	- add m_add() that puts an mbuf to end of a chain - m_append() and m_align() with their family - remove parameters from prototypes
1.149	08-May-2013	pooka	print more diagnostic info in panic message
1.148	19-Jan-2013	rmind	Add m_ensure_contig() routine, which is equivalent to m_pullup, but does not destroy the mbuf chain on failure (it is kept valid).
1.147	18-Oct-2012	para	bring comment up to reality kmem_map => kmem_arena
1.146	29-Apr-2012	dsl	branches: 1.146.2; Remove the unused 'struct malloc_type' args to kern_malloc/realloc/free The M_xxx arg is left on the calls to malloc() and free(), maybe they could be converted to an enumeration and just saved in the malloc header (for deep diag use). Remove the malloc_type from mbuf extension. Fixes rump build as well. Welcome to 6.99.6
1.145	10-Feb-2012	para	branches: 1.145.2; 1.145.6; proper sizing of kmem_arena on different ports PR port-i386/45946: Kernel locks up in VMEM system
1.144	27-Jan-2012	para	extending vmem(9) to be able to allocate resources for its own needs. simplifying uvm_map handling (no special kernel entries anymore, no relocking) make malloc(9) a thin wrapper around kmem(9) (with private interface for interrupt safety reasons) releng@ acknowledged
1.143	31-Aug-2011	plunky	branches: 1.143.2; 1.143.6; NULL does not need a cast
1.142	08-Aug-2011	dyoung	Miscellaneous mbuf changes: 1 Add some protection against double-freeing mbufs in DIAGNOSTIC kernels. 2 Add a m_defrag() that's derived from sys/dev/pci/if_vge.c:vge_m_defrag(). This one copies the packet header. 3 Constify m_tag_find().
1.141	27-Jul-2011	uebayasi	These don't need uvm/uvm_extern.h.
1.140	24-Apr-2011	rmind	- Replace few malloc(9) uses with kmem(9). - Rename buf_malloc() to buf_alloc(), fix comments. - Remove some unnecessary inclusions.
1.139	17-Jan-2011	uebayasi	Include internal definitions (uvm/uvm.h) only where necessary.
1.138	24-Nov-2010	cegger	branches: 1.138.2; No need to print '0x' twice in the printing of the mbuf flags via 'show mbuf'
1.137	28-Oct-2010	seanb	Always use m_split() in m_copyback() instead of its local, abridged, version. This closes a window where a new mbuf (n) can be inserted where n->m_next == n.
1.136	11-May-2010	pooka	remove unnecessary #ifdef
1.135	16-Apr-2010	rmind	Remove mclpool_allocator, which is unnecessary since mb_map removal.
1.134	08-Feb-2010	joerg	branches: 1.134.2; Handle rump like the direct mapping case.
1.133	08-Feb-2010	joerg	Remove separate mb_map. The nmbclusters is computed at boot time based on the amount of physical memory and limited by NMBCLUSTERS if present. Architectures without direct mapping also limit it based on the kmem_map size, which is used as backing store. On i386 and ARM, the maximum KVA used for mbuf clusters is limited to 64MB by default. The old default limits and limits based on GATEWAY have been removed. key_registered_sb_max is hard-wired to a value derived from 2048 clusters.
1.132	05-Apr-2009	bouyer	branches: 1.132.2; m_split0(): If the newly allocated mbuf holds only the header, don't forget to set m_len to 0. Otherwise whatever will compute the size of this chain (including m_split() itself if called again on this chain) will get it wrong, leading to various issues. Bug exposed by the NFS server code with linux clients using TCP mounts.
1.131	15-Mar-2009	cegger	ansify function definitions
1.130	16-Dec-2008	christos	branches: 1.130.2; replace bitmask_snprintf(9) with snprintb(3)
1.129	07-Dec-2008	pooka	Move some sysctl node creations away from linksets and into the constructors for subsystems. XXX: CTLFLAG_PERMANENT is non-sensible.
1.128	02-Jul-2008	matt	branches: 1.128.2; 1.128.4; 1.128.6; Switch from KASSERT to CTASSERT for those asserts testing sizes of types.
1.127	28-Apr-2008	martin	branches: 1.127.2; 1.127.4; Remove clause 3 and 4 from TNF licenses
1.126	09-Apr-2008	thorpej	branches: 1.126.2; 1.126.4; Make the percpu API a little more friendly: - percpu_getptr() is now called percpu_getref() and implicitly disables preemption (via crit_enter()) when it is called. - Added percpu_putref() which implicitly reenables preemption (via crit_exit()).
1.125	24-Mar-2008	yamt	merge yamt-lazymbuf branch.
1.124	17-Jan-2008	yamt	branches: 1.124.6; make some mbuf related statistics per-cpu.
1.123	14-Nov-2007	yamt	branches: 1.123.6; m_print: avoid sign extension of m_flags.
1.122	07-Nov-2007	ad	Merge from vmlocking: - pool_cache changes. - Debugger/procfs locking fixes. - Other minor changes.
1.121	12-Mar-2007	ad	branches: 1.121.12; 1.121.14; 1.121.18; 1.121.20; Pass an ipl argument to pool_init/POOL_INIT to be used when initializing the pool's lock.
1.120	04-Mar-2007	yamt	branches: 1.120.2; fix a fallout from caddr_t changes.
1.119	04-Mar-2007	christos	Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
1.118	22-Feb-2007	thorpej	TRUE -> true, FALSE -> false
1.117	21-Feb-2007	thorpej	Replace the Mach-derived boolean_t type with the C99 bool type. A future commit will replace use of TRUE and FALSE with true and false.
1.116	01-Nov-2006	yamt	branches: 1.116.4; remove some __unused from function parameters.
1.115	12-Oct-2006	christos	- sprinkle __unused on function decls. - fix a couple of unused bugs - no more -Wno-unused for i386
1.114	10-Oct-2006	dogcow	change the MOWNER_INIT define to take two args; fix extant struct mowner decls to use it. Makes options MBUFTRACE compile again and not whinge about missing structure declarations. (Also makes initialization consistent.)
1.113	03-Sep-2006	christos	branches: 1.113.2; 1.113.4; use c99 initializers
1.112	08-Aug-2006	pavel	MCLAIM the correct mbuf. PR kern/34162.
1.111	25-May-2006	yamt	branches: 1.111.4; move wait points for kva from upper layers to vm_map. PR/33185 #1. XXX there is a concern about interaction with kva fragmentation. see: http://mail-index.NetBSD.org/tech-kern/2006/05/11/0000.html
1.110	15-Apr-2006	christos	branches: 1.110.2; Coverity CID 848: Protect against NULL deref.
1.109	19-Mar-2006	yamt	m_copyback0: - unify two copies of code to extend a chain. - when extending a chain, - use trailing space of the last mbuf if any. - use mbuf cluster if appropriate.
1.108	18-Mar-2006	yamt	m_print: fix the previous correctly.
1.107	18-Mar-2006	chris	Fix Coverity CID 1473: Static buffer overrun. Add a counter for the number of pages, so that we print out the ext_pgs values.
1.106	15-Mar-2006	yamt	branches: 1.106.2; m_copyback0: add comments and assertions.
1.105	24-Jan-2006	yamt	branches: 1.105.2; 1.105.4; 1.105.6; 1.105.8; add ddb "sh mbuf" command.
1.104	26-Dec-2005	perry	branches: 1.104.2; u_intN_t -> uintN_t
1.103	08-Dec-2005	thorpej	Sprinkle static.
1.102	09-Nov-2005	skrll	Typo in comment.
1.101	18-Aug-2005	yamt	- introduce M_MOVE_PKTHDR and use it where appropriate. intended to be mostly API compatible with openbsd/freebsd. - remove a glue #define in netipsec/ipsec_osdep.h.
1.100	06-Jun-2005	martin	branches: 1.100.2; Since we decided "const struct mbuf *" would not do the right thing (tm), remove ~all const from mbuf pointers.
1.99	06-Jun-2005	martin	Constify the source arg of m_copydata
1.98	02-Jun-2005	explorer	restore NetBSD RCS tag in __KERNEL_RCSID() macro
1.97	02-Jun-2005	tron	Change first argument of m_copydata() back to "struct mbuf *" because m_copydata() might eventually modify the "mbuf" structure to support lazy mbuf mapping as pointed out by YAMAMOTO Takashi on "tech-net".
1.96	02-Jun-2005	tron	Add missing RCS id. Problem pointed out by Jukka Salmi.
1.95	02-Jun-2005	tron	Fix bad botch invented in last change.
1.94	02-Jun-2005	tron	Change the first argument of m_copydata() to "const struct mbuf *" (which doesn't require any implementation changes). This will allow us to get rid of a lot of nasty type casts.
1.93	01-Apr-2005	yamt	merge yamt-km branch. - don't use managed mappings/backing objects for wired memory allocations. save some resources like pv_entry. also fix (most of) PR/27030. - simplify kernel memory management API. - simplify pmap bootstrap of some ports. - some related cleanups.
1.92	24-Jan-2005	matt	branches: 1.92.2; 1.92.6; Add IFNET_FOREACH and IFADDR_FOREACH macros and start using them.
1.91	23-Jan-2005	matt	Change initialization of domains to use link sets. Switch to using STAILQ. Add a convenience macro DOMAIN_FOREACH to iterate through the domain.
1.90	20-Oct-2004	matt	branches: 1.90.4; Make panic messages print out what condition they thought was panic-worthy instead of a 1 word message.
1.89	05-Oct-2004	is	Some code likes to mix MT_HEADER and MT_DATA. Revert this assertion until the usage of MT_HEADER vs. MT_DATA is better defined and implemented.
1.88	17-Sep-2004	enami	Delete m_tag from a mbuf being non-pkthdr mbuf rather than newly becoming pkthdr mbuf.
1.87	11-Sep-2004	yamt	m_split: restore a behaviour on M_PKTHDR, which was unintentionally changed when i added m_copyback_cow.
1.86	08-Sep-2004	yamt	m_copyback, m_copyback_cow, m_copydata: - caddr_t -> void * - constify. partly from openbsd.
1.85	06-Sep-2004	yamt	add m_copyback_cow and m_makewritable.
1.84	21-Jul-2004	yamt	m_copyback: add an assertion to detect write attempts to a read-only mbuf.
1.83	24-Jun-2004	jonathan	Rename MBUFTRACE helper function m_claim() to m_claimm(), for consistency with M_FREE() and m_freem(). Affected files: sys/mbuf.h kern/uipc_socket2.c kern/uipc_mbuf.c net/if_ethersubr.c netatalk/ddp_input.c nfs/nfs_socket.c
1.82	25-May-2004	atatat	Remaining sysctl descriptions under kern subtree
1.81	22-Apr-2004	matt	Constify protosw arrays. This can reduce the kernel .data section by over 4K (if all the network protocols are loaded).
1.80	24-Mar-2004	atatat	branches: 1.80.2; Tango on sysctl_createv() and flags. The flags have all been renamed, and sysctl_createv() now uses more arguments.
1.79	23-Mar-2004	junyoung	Nuke __P().
1.78	09-Mar-2004	yamt	m_cat: assert mbuf types only when coalescing them by copying. mbuf n often have 0-sized "headers" and their types don't matter much. PR/24713 from Darrin B. Jewell.
1.77	26-Feb-2004	itojun	m_cat() - if it is safe, copy data portion into 1st mbuf even if 1st mbuf is M_EXT mbuf.
1.76	21-Jan-2004	atatat	Fix the kern.mbuf tunables.
1.75	04-Dec-2003	atatat	Dynamic sysctl. Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(), vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all nodes are registered with the tree, and nodes can be added (or removed) easily, and I/O to and from the tree is handled generically. Since the nodes are registered with the tree, the mapping from name to number (and back again) can now be discovered, instead of having to be hard coded. Adding new nodes to the tree is likewise much simpler -- the new infrastructure handles almost all the work for simple types, and just about anything else can be done with a small helper function. All existing nodes are where they were before (numerically speaking), so all existing consumers of sysctl information should notice no difference. PS - I'm sorry, but there's a distinct lack of documentation at the moment. I'm working on sysctl(3/8/9) right now, and I promise to watch out for buses.
1.74	03-Oct-2003	itojun	when dropping M_PKTHDR, need to free m_tag associated with it.
1.73	07-Sep-2003	yamt	assert mbuf chains m_cat'ed are of the same type.
1.72	04-Sep-2003	itojun	clarify comment on m_cat().
1.71	15-Aug-2003	simonb	Return NULL instead of 0 for functions that return pointers. Sprinkle some KNF whitespace.
1.70	07-Aug-2003	agc	Move UCB-licensed code from 4-clause to 3-clause licence. Patches provided by Joel Baker in PR 22364, verified by myself.
1.69	23-Jun-2003	martin	branches: 1.69.2; Make sure to include opt_foo.h if a defflag option FOO is used.
1.68	27-May-2003	simonb	Fix tyop in a comment.
1.67	18-Apr-2003	simonb	Add a KASSERT to make sure that "sizeof(struct mbuf)" is MSIZE. Extra insurance for Steve Woodford's recent <sys/mbuf.h> patch.
1.66	12-Apr-2003	thorpej	Add two new mbuf routines: * m_apply(), which applies a function to each mbuf in chain starting at a specified offset, for a specified length. * m_getptr(), which returns a pointer to the mbuf, as well as the offset into that mbuf, corresponding to an offset from the beginning of an mbuf chain. From OpenBSD, cleaned up slightly by me.
1.65	09-Apr-2003	thorpej	* Use a pool_cache constructor to record the physical address of mbufs in the mbuf header. * Use the new cached paddr feature of the pool_cache API to record the physical address of mbuf clusters. (We cannot use a ctor for clusters, since clusters have no constructed form; they are merely buffers). Bus_dma back-ends may use the cached physical addresses to save having to extract the physical address from virtual. * Provide space in m_ext recording the vm_page's for an SOSEND_LOAN_CHUNK-sized non-cluster external buffer. Use this in the sosend_loan code to save having to extract the physical address from virtual and then look up the vm_page's. * Provide an indication that an external buffer is mapped read-only at the MMU. Set this flag for the external buffer in the sosend_loan case, since loaned pages are always mapped read-only. Bus_dma back-ends may use this information to save cache flushing, since a cache flush of a read-only mapping is redundant on some architectures (the cache would have already been flushed when making the mapping read-only). Part 2 in a series of simple patches contributed by Wasabi Systems to improve network performance.
1.64	26-Feb-2003	matt	Add MBUFTRACE kernel option. Do a little mbuf rework while here. Change all uses of MGET(, M_WAIT, ) to m_get(M_WAIT, *). These are not performance critical and making them call m_get saves considerable space. Add m_clget analogue of MCLGET and make corresponding change for M_WAIT uses. Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE. Begin to change netstat to use sysctl.
1.63	01-Feb-2003	thorpej	Add extensible malloc types, adapted from FreeBSD. This turns malloc types into a structure, a pointer to which is passed around, instead of an int constant. Allow the limit to be adjusted when the malloc type is defined, or with a function call, as suggested by Jonathan Stone.
1.62	31-Jan-2003	thorpej	ANSI'ify.
1.61	25-Sep-2002	thorpej	Don't include <sys/map.h>.
1.60	30-Jun-2002	thorpej	Changes to allow the IPv4 and IPv6 layers to align headers themselves, as necessary: * Implement a new mbuf utility routine, m_copyup(), which is like m_pullup(), except that it always prepends and copies, rather than only doing so if the desired length is larger than m->m_len. m_copyup() also allows an offset into the destination mbuf, which allows space for packet headers, in the forwarding case. * Add _HDR_ALIGNED_P() macros for IP, IPv6, ICMP, and IGMP. These macros expand to 1 if __NO_STRICT_ALIGNMENT is defined, so that architectures which do not have strict alignment constraints don't pay for the test or visit the new align-if-needed path. Use the new macros to check if a header needs to be aligned, or to assert that it already is, as appropriate. Note: This code is still somewhat experimental. However, the new code path won't be visited if individual device drivers continue to guarantee that packets are delivered to layer 3 already properly aligned (which are rules that are already in use).
1.59	09-Mar-2002	thorpej	branches: 1.59.6; Make mbpool and mclpool use the new drain hook facility. Adjust m_reclaim() to match the drain hook signature. This allows us to delete m_retry() and m_retryhdr(), as the pool allocator will now perform the reclamation step for us. From art@openbsd.org.
1.58	08-Mar-2002	thorpej	Pool deals fairly well with physical memory shortage, but it doesn't deal with shortages of the VM maps where the backing pages are mapped (usually kmem_map). Try to deal with this: * Group all information about the backend allocator for a pool in a separate structure. The pool references this structure, rather than the individual fields. * Change the pool_init() API accordingly, and adjust all callers. * Link all pools using the same backend allocator on a list. * The backend allocator is responsible for waiting for physical memory to become available, but will still fail if it cannot allocate KVA space for the pages. If this happens, carefully drain all pools using the same backend allocator, so that some KVA space can be freed. * Change pool_reclaim() to indicate if it actually succeeded in freeing some pages, and use that information to make draining easier and more efficient. * Get rid of PR_URGENT. There was only one use of it, and it could be dealt with by the caller. From art@openbsd.org.
1.57	12-Feb-2002	thorpej	const char *mclpool_warnmsg -> const char mclpool_warnmsg[] Noted by Matt Thomas.
1.56	12-Nov-2001	lukem	add RCSIDs
1.55	29-Oct-2001	simonb	Don't need to include <uvm/uvm_extern.h> just to include <sys/sysctl.h> anymore.
1.54	15-Sep-2001	chs	branches: 1.54.2; a whole bunch of changes to improve performance and robustness under load: - remove special treatment of pager_map mappings in pmaps. this is required now, since I've removed the globals that expose the address range. pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's no longer any need to special-case it. - eliminate struct uvm_vnode by moving its fields into struct vnode. - rewrite the pageout path. the pager is now responsible for handling the high-level requests instead of only getting control after a bunch of work has already been done on its behalf. this will allow us to UBCify LFS, which needs tighter control over its pages than other filesystems do. writing a page to disk no longer requires making it read-only, which allows us to write wired pages without causing all kinds of havoc. - use a new PG_PAGEOUT flag to indicate that a page should be freed on behalf of the pagedaemon when it's unlocked. this flag is very similar to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the pageout fails due to eg. an indirect-block buffer being locked. this allows us to remove the "version" field from struct vm_page, and together with shrinking "loan_count" from 32 bits to 16, struct vm_page is now 4 bytes smaller. - no longer use PG_RELEASED for swap-backed pages. if the page is busy because it's being paged out, we can't release the swap slot to be reallocated until that write is complete, but unlike with vnodes we don't keep a count of in-progress writes so there's no good way to know when the write is done. instead, when we need to free a busy swap-backed page, just sleep until we can get it busy ourselves. - implement a fast-path for extending writes which allows us to avoid zeroing new pages. this substantially reduces cpu usage. 
- encapsulate the data used by the genfs code in a struct genfs_node, which must be the first element of the filesystem-specific vnode data for filesystems which use genfs_{get,put}pages(). - eliminate many of the UVM pagerops, since they aren't needed anymore now that the pager "put" operation is a higher-level operation. - enhance the genfs code to allow NFS to use the genfs_{get,put}pages instead of a modified copy. - clean up struct vnode by removing all the fields that used to be used by the vfs_cluster.c code (which we don't use anymore with UBC). - remove kmem_object and mb_object since they were useless. instead of allocating pages to these objects, we now just allocate pages with no object. such pages are mapped in the kernel until they are freed, so we can use the mapping to find the page to free it. this allows us to remove splvm() protection in several places. The sum of all these changes improves write throughput on my decstation 5000/200 to within 1% of the rate of NetBSD 1.5 and reduces the elapsed time for "make release" of a NetBSD 1.5 source tree on my 128MB pc to 10% less than a 1.5 kernel took.
1.53	26-Jul-2001	thorpej	branches: 1.53.2; Use pool_cache_*() for mbufs and clusters. While we don't use the ctor/dtor feature, it's still faster to allocate from the cache groups than it is from the pool (cache groups are analogous to "magazines" in the Solaris SLAB allocator).
1.52	14-Jan-2001	thorpej	branches: 1.52.2; 1.52.4; Change some low-hanging splimp() calls to splvm().
1.51	14-Nov-2000	itojun	make sure every m_aux will be freed. there are direct use of MFREE() from sys/kern. (we experienced no memory leak so far, but if we use m_aux for other purposes, we will need this change)
1.50	18-Aug-2000	itojun	repair m_dup(). specifically, now it is safe against non-MCLBYTES cluster mbuf. no one seems to be using this function at this moment.
1.49	18-Aug-2000	itojun	disable m_dup(), as it makes false assumption on cluster mbuf and unsafe (does not do the right thing).
1.48	18-Aug-2000	itojun	add a comment about false assumption made by m_dup()
1.47	27-Jun-2000	mrg	remove include of <vm/vm.h>
1.46	26-Jun-2000	mrg	remove/move more mach vm header files: <vm/pglist.h> -> <uvm/uvm_pglist.h> <vm/vm_inherit.h> -> <uvm/uvm_inherit.h> <vm/vm_kern.h> -> into <uvm/uvm_extern.h> <vm/vm_object.h> -> nothing <vm/vm_pager.h> -> into <uvm/uvm_pager.h> also includes a bunch of <vm/vm_page.h> include removals (due to redundancy with <vm/vm.h>), and a scattering of other similar headers.
1.45	01-Mar-2000	itojun	branches: 1.45.4; introduce m->m_pkthdr.aux to hold random data which needs to be passed between protocol handlers. ipsec socket pointers, ipsec decryption/auth information, tunnel decapsulation information are in my mind - there can be several other usage. at this moment, we use this for ipsec socket pointer passing. this will avoid reuse of m->m_pkthdr.rcvif in ipsec code. due to the change, MHLEN will be decreased by sizeof(void *) - for example, for i386, MHLEN was 100 bytes, but is now 96 bytes. we may want to increase MSIZE from 128 to 256 for some of our architectures. take caution if you use it for keeping some data item for long period of time - use extra caution on M_PREPEND() or m_adj(), as they may result in loss of m->m_pkthdr.aux pointer (and mbuf leak). this will bump kernel version. (as discussed in tech-net, tested in kame tree)
1.44	27-Oct-1999	itojun	add mbuf deep-copy function, m_dup(). NOTE: if you use m_dup(), your additional kernel code can become incompatible with 4.xBSD or other *BSD.
1.43	05-Aug-1999	thorpej	branches: 1.43.2; 1.43.4; 1.43.6; Add some more diagnostic information to the 3 different `panic("m_copym")' calls.
1.42	26-Apr-1999	thorpej	More improvements to mbuf and mbuf cluster allocation: - Initialize mbpool and mclpool with msize and mclbytes, respectively, so that those values may be patched and have an actual effect on the next system reboot. - Set low water marks on mbpool (default: 16) and mclpool (default: 8). This should be of great help for diskless systems, which need to allocate mbufs in order to clean dirty pages; the low water marks increase the chances of this being possible to do in memory starvation situations. - Add support for getting/setting some mbuf-related parameters via sysctl. * msize and mclsize (read-only) * nmbclusters (read-only unless the platform has direct-mapped pool pages, in which case the value can be increased). * mblowat and mcllowat (read/write)
1.41	25-Apr-1999	simonb	Use the nmbclusters variable and not the NMBCLUSTERS constant when setting the mclpool hardlimit.
1.40	01-Apr-1999	thorpej	branches: 1.40.4; mbinit() can now allocate memory. Update a comment accordingly.
1.39	31-Mar-1999	thorpej	Set a hard limit (rather than an advisory high water mark for pages) of NMBCLUSTERS for the mbuf cluster pool. On platforms which use direct-mapped segments for pool pages (MIPS and Alpha), this makes NMBCLUSTERS actually meaningful (such ports don't even allocate mb_map, as it is not used to map mbuf cluster pages). Improve the message logged at a maximum rate of once per second. The new message: "WARNING: mclpool limit reached; increase NMBCLUSTERS". In the back-end pool page allocator, remove the message about mb_map being full. The message was not necessarily correct as the allocator may have been starved for pages, rather than for space in the map. Also, the hard limit on the mbuf cluster pool will be reached before the map fills (the last cluster will always fit into the map), so the message is redundant. Add a comment in mbinit() about considering setting low water marks on the mbuf and mbuf cluster pools.
1.38	24-Mar-1999	mrg	completely remove Mach VM support. all that is left is the header files, as UVM still uses (most of) these.
1.37	23-Mar-1999	thorpej	Set the high water mark on the mbuf cluster pool to NMBCLUSTERS.
1.36	22-Mar-1999	thorpej	Put back the code to log `mb_map full' that was lost when mbuf clusters were converted to use the pool allocator.
1.35	09-Jan-1999	thorpej	Garbage-collect `mbutl'.
1.34	09-Jan-1999	thorpej	Garbage-collect `union mcluster' and `mclfree'.
1.33	18-Dec-1998	thorpej	Reverse the stopgap change made in revision 1.29: date: 1998/08/01 01:47:24; author: thorpej; state: Exp; lines: +18 -8 Don't call the protocol drain routines if how == M_NOWAIT, which typically means we're in interrupt context. Since we can be called from a network hardware interrupt, we could corrupt the protocol queues if we try to drain them at that time. The problem has been addressed by letting the drain'able protocols use a locking scheme to prevent queue corruption.
1.32	28-Aug-1998	thorpej	branches: 1.32.4; Add a waitok boolean argument to the VM system's pool page allocator backend.
1.31	13-Aug-1998	thorpej	Oops, this got missed in the vm_offset_t -> vaddr_t change.
1.30	04-Aug-1998	perry	Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one. bcopy(x, y, z) -> memcpy(y, x, z) ovbcopy(x, y, z) -> memmove(y, x, z) bcmp(x, y, z) -> memcmp(x, y, z) bzero(x, y) -> memset(x, 0, y)
1.29	01-Aug-1998	thorpej	Don't call the protocol drain routines if how == M_NOWAIT, which typically means we're in interrupt context. Since we can be called from a network hardware interrupt, we could corrupt the protocol queues if we try to drain them at that time.
1.28	01-Aug-1998	thorpej	Use the pool allocator for mbufs and mbufs clusters (two pools, one for each). Partially from pk@netbsd.org.
1.27	22-May-1998	matt	branches: 1.27.2; Add an if_drain to the ifnet structure (call when the system is low on mbufs). Add code to m_reclaim to call if_drain in each ifnet that has one set. Remove register from declarations.
1.26	01-Mar-1998	fvdl	Merge with Lite2 + local changes
1.25	12-Feb-1998	kleink	Fix variable declarations: register -> register int.
1.24	10-Feb-1998	mrg	- add defopt's for UVM, UVMHIST and PMAP_NEW. - remove unnecessary UVMHIST_DECL's.
1.23	05-Feb-1998	mrg	initial import of the new virtual memory system, UVM, into -current. UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some minor portions derived from the old Mach code. i provided some help getting swap and paging working, and other bug fixes/ideas. chuck silvers <chuq@chuq.com> also provided some other fixes. this is the rest of the MI portion changes. this will be KNF'd shortly. :-)
1.22	20-Nov-1997	thorpej	In m_split(), restore m_pkthdr.len if an error occurs. From Koji Imada, PR #3986.
1.21	06-Jun-1997	pk	branches: 1.21.8; Get `canwait' argument to kmem_malloc() right.
1.20	28-Apr-1997	mycroft	Oops; forgot to GC the last mbuf allocated when out of clusters.
1.19	24-Apr-1997	mycroft	If we fail to allocate a cluster to hold a large packet, simply drop it rather than using a chain of tiny mbufs.
1.18	27-Mar-1997	thorpej	Update and enhancement to the mbuf code, to support use of non-cluster external storage. Highlights: - additional "void *" argument to (*ext_free)(), an opaque cookie for use by the free function. - MCLALLOC() and MCLFREE() calls are gone. They are replaced by MEXTADD() (add external storage to mbuf), MEXTMALLOC() (malloc() external storage and attach to mbuf), and MEXTREMOVE() (remove external storage from mbuf). - completely new external storage reference counting mechanism; mclrefcnt[] is gone. These changes will eventually be used to pass driver DMA buffers up the network stack, and reduce/eliminate copies in certain code paths (e.g. NFS writes). From Matt Thomas <matt@3am-software.com> and myself <thorpej@nas.nasa.gov>, with some input from Chris Demetriou <cgd@cs.cmu.edu> and review by Charles Hannum <mycroft@mit.edu>.
1.17	18-Dec-1996	gwr	Move `static' to the beginning of the storage class specifiers.
1.16	13-Jun-1996	cgd	if kmem_malloc() fails while trying to allocate an mbuf cluster, try and free some space by calling m_reclaim(). Also, log the "mb_map full" error message (at most) every 60-seconds. The old code would log it once over the lifetime of the system, but that's not a useful diagnostic. (More useful is the new behaviour, which roughly indicates how often periods of heavy load occur, without spamming the console and system logs with messages.)
1.15	09-Feb-1996	christos	branches: 1.15.4; More proto fixes
1.14	04-Feb-1996	christos	First pass at prototyping
1.13	30-Oct-1994	cgd	be more careful with types, also pull in headers where necessary.
1.12	28-Sep-1994	deraadt	don't play with CLBYTES in cpp
1.11	19-Sep-1994	mycroft	m_adj() returns void.
1.10	29-Jun-1994	cgd	branches: 1.10.2; New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
1.9	13-May-1994	mycroft	Update to 4.4-Lite networking code, with a few local changes.
1.8	14-Apr-1994	deraadt	the packet header is at the start of the mbuf chain, not the end.
1.7	08-Jan-1994	mycroft	#include vm_kern.h.
1.6	18-Dec-1993	mycroft	Canonicalize all #includes.
1.5	22-Oct-1993	cgd	slightly clean up ws's original patch to this file for the sense of wait vs. nowait. this patch from torek.
1.4	04-Sep-1993	jtc	branches: 1.4.2; Include systm.h to get prototypes (and possibly inlines) of *max functions. Change mbinit() to match prototype.
1.3	20-May-1993	cgd	add $Id$ strings, and clean up file headers where necessary
1.2	21-Mar-1993	cgd	after 0.2.2 "stable" patches applied
1.1	21-Mar-1993	cgd	branches: 1.1.1; Initial revision
1.1.1.3	01-Mar-1998	fvdl	Import 4.4BSD-Lite2
1.1.1.2	01-Mar-1998	fvdl	Import 4.4BSD-Lite for reference
1.1.1.1	21-Mar-1993	cgd	initial import of 386bsd-0.1 sources
1.4.2.3	14-Nov-1993	mycroft	Canonicalize all #includes.
1.4.2.2	26-Oct-1993	mycroft	Merge changes from trunk.
1.4.2.1	24-Sep-1993	mycroft	Make all files using spl*() #include cpu.h. Changes from trunk. init_main.c: New method of pseudo-device of initialization. kern_clock.c: hardclock() and softclock() now take a pointer to a clockframe. softclock() only does callouts. kern_synch.c: Remove spurious declaration of endtsleep(). Adjust uses of averunnable for new struct loadav. subr_prf.c: Allow printf() formats in panic(). tty.c: averunnable changes. vfs_subr.c: va_size and va_bytes are now quads.
1.10.2.1	06-Oct-1994	mycroft	Update from trunk.
1.15.4.1	13-Jun-1996	cgd	pull up from trunk: >if kmem_malloc() fails while trying to allocate an mbuf cluster, try >and free some space by calling m_reclaim(). Also, log the "mb_map full" >error message (at most) every 60-seconds. The old code would log it >once over the lifetime of the system, but that's not a useful diagnostic. >(More useful is the new behaviour, which roughly indicates how often >periods of heavy load occur, without spamming the console and system >logs with messages.)
1.21.8.1	20-Nov-1997	thorpej	Pull up from trunk: restore m_pkthdr.len in m_split() on error.
1.27.2.1	08-Aug-1998	eeh	Revert cdevsw mmap routines to return int.
1.32.4.1	11-Dec-1998	kenh	The beginnings of interface detach support. Still some bugs, but mostly works for me. This work was originally by Bill Studenmund, and cleaned up by me.
1.40.4.1	21-Jun-1999	thorpej	Sync w/ -current.
1.43.6.1	27-Dec-1999	wrstuden	Pull up to last week's -current.
1.43.4.1	15-Nov-1999	fvdl	Sync with -current
1.43.2.3	18-Jan-2001	bouyer	Sync with head (for UBC+NFS fixes, mostly).
1.43.2.2	22-Nov-2000	bouyer	Sync with HEAD.
1.43.2.1	20-Nov-2000	bouyer	Update thorpej_scsipi to -current as of a month ago
1.45.4.2	04-Feb-2001	he	Pull up revision 1.51 (requested by itojun): Make sure every m_aux will be freed.
1.45.4.1	19-Aug-2000	itojun	pullup 1.48 -> 1.50 (approved by releng-1-5) repair m_dup(). specifically, now it is safe against non-MCLBYTES external mbuf. no one seems to be using this function at this moment.
1.52.4.5	10-Oct-2002	jdolecek	sync kqueue with -current; this includes merge of gehenna-devsw branch, merge of i386 MP branch, and part of autoconf rototil work
1.52.4.4	06-Sep-2002	jdolecek	sync kqueue branch with HEAD
1.52.4.3	16-Mar-2002	jdolecek	Catch up with -current.
1.52.4.2	10-Jan-2002	thorpej	Sync kqueue branch with -current.
1.52.4.1	03-Aug-2001	lukem	update to -current
1.52.2.7	18-Oct-2002	nathanw	Catch up to -current.
1.52.2.6	01-Aug-2002	nathanw	Catch up to -current.
1.52.2.5	01-Apr-2002	nathanw	Catch up to -current. (CVS: It's not just a program. It's an adventure!)
1.52.2.4	28-Feb-2002	nathanw	Catch up to -current.
1.52.2.3	14-Nov-2001	nathanw	Catch up to -current.
1.52.2.2	21-Sep-2001	nathanw	Catch up to -current.
1.52.2.1	24-Aug-2001	nathanw	Catch up with -current.
1.53.2.1	01-Oct-2001	fvdl	Catch up with -current.
1.54.2.1	12-Nov-2001	thorpej	Sync the thorpej-mips-cache branch with -current.
1.59.6.1	15-Jul-2002	gehenna	catch up with -current.
1.69.2.10	11-Dec-2005	christos	Sync with head.
1.69.2.9	10-Nov-2005	skrll	Sync with HEAD. Here we go again...
1.69.2.8	01-Apr-2005	skrll	Sync with HEAD.
1.69.2.7	04-Feb-2005	skrll	Sync with HEAD.
1.69.2.6	24-Jan-2005	skrll	Sync with HEAD.
1.69.2.5	02-Nov-2004	skrll	Sync with HEAD.
1.69.2.4	19-Oct-2004	skrll	Sync with HEAD
1.69.2.3	21-Sep-2004	skrll	Fix the sync with head I botched.
1.69.2.2	18-Sep-2004	skrll	Sync with HEAD.
1.69.2.1	03-Aug-2004	skrll	Sync with HEAD
1.80.2.5	08-Oct-2004	jmc	Pullup rev 1.89 (requested by is in ticket #895) Some code likes to mix MT_HEADER and MT_DATA. Revert this assertion until the usage of MT_HEADER vs. MT_DATA is better defined and implemented.
1.80.2.4	11-Sep-2004	he	Pull up revision 1.87 (requested by yamt in ticket #841): Restore behaviour of m_split() on M_PKTHDR which was unintentionally changed when m_copyback_cow() was added.
1.80.2.3	11-Sep-2004	he	Pull up revisions 1.84-1.85 (requested by yamt in ticket #831): Add an assertion to detect write to a read-only mbuf. Add m_copyback_cow and m_makewritable.
1.80.2.2	14-Jul-2004	tron	Pull up revision 1.83 (requested by jonathan in ticket #648): Rename MBUFTRACE helper function m_claim() to m_claimm(), for consistency with M_FREE() and m_freem(). Affected files: sys/mbuf.h kern/uipc_socket2.c kern/uipc_mbuf.c net/if_ethersubr.c netatalk/ddp_input.c nfs/nfs_socket.c
1.80.2.1	26-May-2004	he	Pull up revision 1.82 (requested by atatat in ticket #388): Add remaining sysctl descriptions under kern subtree.
1.90.4.1	29-Apr-2005	kent	sync with -current
1.92.6.3	08-Sep-2006	ghen	Pull up following revision(s) (requested by pavel in ticket #1503): sys/kern/uipc_mbuf.c: revision 1.112 MCLAIM the correct mbuf. PR kern/34162.
1.92.6.2	09-Jun-2005	snj	Pull up revision 1.98 (requested by tron in ticket #387): restore NetBSD RCS tag in __KERNEL_RCSID() macro
1.92.6.1	09-Jun-2005	snj	Pull up revision 1.96 (requested by tron in ticket #387): Add missing RCS id. Problem pointed out by Jukka Salmi.
1.92.2.1	25-Jan-2005	yamt	convert to new apis.
1.100.2.26	27-Feb-2008	yamt	remove mbuf ext_lock which is no longer used.
1.100.2.25	27-Feb-2008	yamt	drop lazy mapping of mbuf external storage for now, because: - it's currently broken wrt asm code. (cpu_in_cksum) - there are other approaches worth to consider. eg. sf_buf
1.100.2.24	14-Feb-2008	yamt	m_ext_free: optimize the common case.
1.100.2.23	11-Feb-2008	yamt	m_ext_free: don't use atomic op where unnecessary.
1.100.2.22	05-Feb-2008	yamt	use mutex_spin_enter.
1.100.2.21	21-Jan-2008	yamt	sync with head
1.100.2.20	07-Dec-2007	yamt	use atomic ops unconditionally.
1.100.2.19	15-Nov-2007	yamt	mcl_inc_reference, mcl_dec_and_test_reference: use atomic ops if x86.
1.100.2.18	15-Nov-2007	yamt	update a comment
1.100.2.17	15-Nov-2007	yamt	mbpool_cache -> mb_cache
1.100.2.16	15-Nov-2007	yamt	sync with head.
1.100.2.15	27-Oct-2007	yamt	make ext_lock kmutex_t.
1.100.2.14	03-Sep-2007	yamt	kill caddr_t.
1.100.2.13	03-Sep-2007	yamt	sync with head.
1.100.2.12	26-Feb-2007	yamt	sync with head.
1.100.2.11	30-Dec-2006	yamt	sync with head.
1.100.2.10	07-Jul-2006	yamt	- fix typos and compilation problems in uipc_mbuf.c rev.1.100.2.8. - m_ext_free: fix the recursive call case. - change return value of mcl_dec_and_test_reference. - tweak assertions.
1.100.2.9	07-Jul-2006	yamt	m_print: print raw ext_refcnt rather than MCLISREFERENCED.
1.100.2.8	06-Jul-2006	yamt	tweak code so that it can be switched to atomic operations later easily.
1.100.2.7	06-Jul-2006	yamt	- move some macros from mbuf.h to uipc_mbuf.c. - remove unused MCLBUFREF.
1.100.2.6	21-Jun-2006	yamt	sync with head.
1.100.2.5	15-Jul-2005	yamt	m_mapin: fix an spl botch.
1.100.2.4	07-Jul-2005	yamt	defer mapping only when defined(__HAVE_LAZY_MBUF).
1.100.2.3	07-Jul-2005	yamt	sosend_loan: defer mapping of mbuf external data pages. mtod: map mbuf external data pages if needed.
1.100.2.2	07-Jul-2005	yamt	de-inline m_ext_free.
1.100.2.1	07-Jul-2005	yamt	adapt to mbuf.h changes.
1.104.2.1	01-Feb-2006	yamt	sync with head.
1.105.8.1	19-Apr-2006	elad	sync with head.
1.105.6.5	14-Sep-2006	yamt	sync with head.
1.105.6.4	11-Aug-2006	yamt	sync with head
1.105.6.3	26-Jun-2006	yamt	sync with head.
1.105.6.2	24-May-2006	yamt	sync with head.
1.105.6.1	01-Apr-2006	yamt	sync with head.
1.105.4.2	01-Jun-2006	kardel	Sync with head.
1.105.4.1	22-Apr-2006	simonb	Sync with head.
1.105.2.1	09-Sep-2006	rpaulo	sync with head
1.106.2.2	24-May-2006	tron	Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
1.106.2.1	28-Mar-2006	tron	Merge 2006-03-28 NetBSD-current into the "peter-altq" branch.
1.110.2.1	19-Jun-2006	chap	Sync with head.
1.111.4.1	08-Sep-2006	rpaulo	Pull up following revision(s) (requested by pavel in ticket #135): sys/kern/uipc_mbuf.c: revision 1.112 MCLAIM the correct mbuf. PR kern/34162.
1.113.4.2	10-Dec-2006	yamt	sync with head.
1.113.4.1	22-Oct-2006	yamt	sync with head
1.113.2.1	18-Nov-2006	ad	Sync with head.
1.116.4.3	24-Mar-2007	yamt	sync with head.
1.116.4.2	12-Mar-2007	rmind	Sync with HEAD.
1.116.4.1	27-Feb-2007	yamt	- sync with head. - move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
1.120.2.3	01-Nov-2007	ad	m_reclaim: acquire kernel_lock as this can be called from the pagedaemon.
1.120.2.2	01-Sep-2007	ad	Update for pool_cache API changes.
1.120.2.1	13-Mar-2007	ad	Sync with head.
1.121.20.2	18-Feb-2008	mjf	Sync with HEAD.
1.121.20.1	19-Nov-2007	mjf	Sync with HEAD.
1.121.18.2	18-Nov-2007	bouyer	Sync with HEAD
1.121.18.1	13-Nov-2007	bouyer	Sync with HEAD
1.121.14.3	23-Mar-2008	matt	sync with HEAD
1.121.14.2	09-Jan-2008	matt	sync with HEAD
1.121.14.1	08-Nov-2007	matt	sync with -HEAD
1.121.12.2	14-Nov-2007	joerg	Sync with HEAD.
1.121.12.1	11-Nov-2007	joerg	Sync with HEAD.
1.123.6.1	19-Jan-2008	bouyer	Sync with HEAD
1.124.6.4	17-Jan-2009	mjf	Sync with HEAD.
1.124.6.3	02-Jul-2008	mjf	Sync with HEAD.
1.124.6.2	02-Jun-2008	mjf	Sync with HEAD.
1.124.6.1	03-Apr-2008	mjf	Sync with HEAD.
1.126.4.4	11-Aug-2010	yamt	sync with head.
1.126.4.3	11-Mar-2010	yamt	sync with head
1.126.4.2	04-May-2009	yamt	sync with head.
1.126.4.1	16-May-2008	yamt	sync with head.
1.126.2.1	18-May-2008	yamt	sync with head.
1.127.4.1	03-Jul-2008	simonb	Sync with head.
1.127.2.1	18-Sep-2008	wrstuden	Sync with wrstuden-revivesa-base-2.
1.128.6.1	07-Apr-2009	snj	Pull up following revision(s) (requested by bouyer in ticket #674): sys/kern/uipc_mbuf.c: revision 1.132 m_split0(): If the newly allocated mbuf holds only the header, don't forget to set m_len to 0. Otherwise whatever will compute the size of this chain (including m_split() itself if called again on this chain) will get it wrong, leading to various issues. Bug exposed by the NFS server code with linux clients using TCP mounts.
1.128.4.2	28-Apr-2009	skrll	Sync with HEAD.
1.128.4.1	19-Jan-2009	skrll	Sync with HEAD.
1.128.2.1	13-Dec-2008	haad	Update haad-dm branch to haad-dm-base2.
1.130.2.1	13-May-2009	jym	Sync with HEAD. Commit is split, to avoid a "too many arguments" protocol error.
1.132.2.3	06-Nov-2010	uebayasi	Sync with HEAD.
1.132.2.2	17-Aug-2010	uebayasi	Sync with HEAD.
1.132.2.1	30-Apr-2010	uebayasi	Sync with HEAD.
1.134.2.3	31-May-2011	rmind	sync with head
1.134.2.2	05-Mar-2011	rmind	sync with head
1.134.2.1	30-May-2010	rmind	sync with head
1.138.2.1	06-Jun-2011	jruoho	Sync with HEAD.
1.143.6.2	29-Apr-2012	mrg	sync to latest -current.
1.143.6.1	18-Feb-2012	mrg	merge to -current.
1.143.2.5	22-May-2014	yamt	sync with head. for reference, the tree before this commit was tagged as yamt-pagecache-tag8. this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
1.143.2.4	23-Jan-2013	yamt	sync with head
1.143.2.3	30-Oct-2012	yamt	sync with head
1.143.2.2	23-May-2012	yamt	sync with head.
1.143.2.1	17-Apr-2012	yamt	sync with head
1.145.6.1	03-May-2018	martin	Pull up following revision(s) (requested by maxv in ticket #1547): sys/kern/uipc_mbuf.c: revision 1.211 (via patch) Modify m_defrag, so that it never frees the first mbuf of the chain. While here use the given 'flags' argument, and not M_DONTWAIT. We have a problem with several drivers: they poll an mbuf chain from their queues and call m_defrag on them, but m_defrag could update the mbuf pointer, so the mbuf in the queue is no longer valid. It is not easy to fix each driver, because doing pop+push will reorder the queue, and we don't really want that to happen. This problem was independently spotted by me, Kengo, Masanobu, and other people too it seems (perhaps PR/53218). Now m_defrag leaves the first mbuf in place, and compresses the chain only starting from the second mbuf in the chain. It is important not to compress the first mbuf with hacks, because the storage of this first mbuf may be shared with other mbufs.
1.145.2.2	03-May-2018	martin	Pull up following revision(s) (requested by maxv in ticket #1547): sys/kern/uipc_mbuf.c: revision 1.211 (via patch) Modify m_defrag, so that it never frees the first mbuf of the chain. While here use the given 'flags' argument, and not M_DONTWAIT. We have a problem with several drivers: they poll an mbuf chain from their queues and call m_defrag on them, but m_defrag could update the mbuf pointer, so the mbuf in the queue is no longer valid. It is not easy to fix each driver, because doing pop+push will reorder the queue, and we don't really want that to happen. This problem was independently spotted by me, Kengo, Masanobu, and other people too it seems (perhaps PR/53218). Now m_defrag leaves the first mbuf in place, and compresses the chain only starting from the second mbuf in the chain. It is important not to compress the first mbuf with hacks, because the storage of this first mbuf may be shared with other mbufs.
1.145.2.1	08-Feb-2013	riz	branches: 1.145.2.1.2; Pull up following revision(s) (requested by rmind in ticket #777): usr.sbin/npf/npfctl/npfctl.c: revision 1.27 sys/net/npf/npf_session.c: revision 1.19 usr.sbin/npf/npftest/libnpftest/npf_mbuf_subr.c: revision 1.4 sys/net/npf/npf_rproc.c: revision 1.5 usr.sbin/npf/npftest/README: revision 1.3 sys/sys/mbuf.h: revision 1.151 sys/net/npf/npf_ruleset.c: revision 1.15 usr.sbin/npf/npftest/libnpftest/npf_nbuf_test.c: revision 1.3 sys/net/npf/npf_ruleset.c: revision 1.16 usr.sbin/npf/npftest/libnpftest/npf_state_test.c: revision 1.4 usr.sbin/npf/npftest/libnpftest/npf_nbuf_test.c: revision 1.4 sys/net/npf/npf_inet.c: revision 1.19 sys/net/npf/npf_instr.c: revision 1.15 sys/net/npf/npf_handler.c: revision 1.24 sys/net/npf/npf_handler.c: revision 1.25 sys/net/npf/npf_state_tcp.c: revision 1.12 sys/net/npf/npf_processor.c: revision 1.13 sys/net/npf/npf_impl.h: revision 1.25 sys/net/npf/npf_processor.c: revision 1.14 sys/net/npf/npf_mbuf.c: revision 1.10 sys/net/npf/npf_alg_icmp.c: revision 1.14 sys/net/npf/npf_mbuf.c: revision 1.9 usr.sbin/npf/npftest/libnpftest/npf_nat_test.c: revision 1.2 usr.sbin/npf/npftest/libnpftest/npf_rule_test.c: revision 1.3 sys/net/npf/npf_session.c: revision 1.20 sys/net/npf/npf_alg.c: revision 1.6 sys/kern/uipc_mbuf.c: revision 1.148 sys/net/npf/npf_inet.c: revision 1.20 sys/net/npf/npf.h: revision 1.25 sys/net/npf/npf_nat.c: revision 1.18 sys/net/npf/npf_state.c: revision 1.13 sys/net/npf/npf_sendpkt.c: revision 1.13 sys/net/npf/npf_ext_log.c: revision 1.2 usr.sbin/npf/npftest/libnpftest/npf_processor_test.c: revision 1.4 sys/net/npf/npf_ext_normalise.c: revision 1.2 - Rework NPF's nbuf interface: use advancing and ensuring as a main method. Eliminate unnecessary copy and simplify. Adapt regression tests. - Simplify ICMP ALG a little. While here, handle ICMP ECHO for traceroute. - Minor fixes, misc cleanup. Silence gcc in npf_recache(). 
Add m_ensure_contig() routine, which is equivalent to m_pullup, but does not destroy the mbuf chain on failure (it is kept valid). - nbuf_ensure_contig: rework to use m_ensure_contig(9), which will not free the mbuf chain on failure. Fixes some corner cases. Improve regression test and sprinkle some asserts. - npf_reassembly: clear nbuf on IPv6 reassembly failure path (partial fix). The problem was found and fix provided by Anthony Mallet.
1.145.2.1.2.1	03-May-2018	martin	Pull up following revision(s) (requested by maxv in ticket #1547): sys/kern/uipc_mbuf.c: revision 1.211 (via patch) Modify m_defrag, so that it never frees the first mbuf of the chain. While here use the given 'flags' argument, and not M_DONTWAIT. We have a problem with several drivers: they poll an mbuf chain from their queues and call m_defrag on them, but m_defrag could update the mbuf pointer, so the mbuf in the queue is no longer valid. It is not easy to fix each driver, because doing pop+push will reorder the queue, and we don't really want that to happen. This problem was independently spotted by me, Kengo, Masanobu, and other people too it seems (perhaps PR/53218). Now m_defrag leaves the first mbuf in place, and compresses the chain only starting from the second mbuf in the chain. It is important not to compress the first mbuf with hacks, because the storage of this first mbuf may be shared with other mbufs.
1.146.2.5	03-Dec-2017	jdolecek	update from HEAD
1.146.2.4	20-Aug-2014	tls	Rebase to HEAD as of a few days ago.
1.146.2.3	23-Jun-2013	tls	resync from head
1.146.2.2	25-Feb-2013	tls	resync with head
1.146.2.1	20-Nov-2012	tls	Resync to 2012-11-19 00:00:00 UTC
1.151.2.1	18-May-2014	rmind	sync with head
1.158.4.5	22-May-2018	martin	Pull up following revision(s) (requested by maxv in ticket #1606): sys/kern/uipc_mbuf.c: revision 1.214 Revert my rev1.190, remove the M_READONLY check. The initial code was correct: what is read-only is the mbuf storage, not the mbuf itself. The storage contains the packet payload, and never has anything related to mbufs. So it is fine to remove M_PKTHDR on mbufs that have a read-only storage. In fact it was kind of obvious, since several places already manually remove M_PKTHDR without taking care of the external storage.
1.158.4.4	03-May-2018	martin	Pull up following revision(s) (requested by maxv in ticket #1602): sys/kern/uipc_mbuf.c: revision 1.211 (via patch) Modify m_defrag, so that it never frees the first mbuf of the chain. While here use the given 'flags' argument, and not M_DONTWAIT. We have a problem with several drivers: they poll an mbuf chain from their queues and call m_defrag on them, but m_defrag could update the mbuf pointer, so the mbuf in the queue is no longer valid. It is not easy to fix each driver, because doing pop+push will reorder the queue, and we don't really want that to happen. This problem was independently spotted by me, Kengo, Masanobu, and other people too it seems (perhaps PR/53218). Now m_defrag leaves the first mbuf in place, and compresses the chain only starting from the second mbuf in the chain. It is important not to compress the first mbuf with hacks, because the storage of this first mbuf may be shared with other mbufs.
1.158.4.3	17-Apr-2018	martin	Pull up following revision(s) (requested by maxv in ticket #1598): sys/kern/uipc_mbuf.c: revision 1.190 If the mbuf is shared leave M_PKTHDR in place. Given where this function is called from that's not supposed to happen, but I'm growing unconfident about our mbuf code.
1.158.4.2	05-Apr-2018	martin	Pull up following revision(s) (requested by maxv in ticket #1594): sys/kern/uipc_mbuf.c: revision 1.182 sys/netinet6/frag6.c: revision 1.67 sys/netinet/ip_reass.c: revision 1.14 sys/sys/mbuf.h: revision 1.179 Remove M_PKTHDR from secondary mbufs when reassembling packets. This is a real problem, because I found at least one component that relies on the fact that only the first mbuf has M_PKTHDR: far from here, in m_splithdr, we don't update m->m_pkthdr.len if M_PKTHDR is found in a secondary mbuf. (The initial intention there was to avoid updating m_pkthdr.len twice, the assumption was that if M_PKTHDR is set then we're dealing with the first mbuf.) Therefore, when handling fragmented IPsec packets (in particular IPv6, IPv4 is a bit more complicated), we may end up with an incorrect m_pkthdr.len after authentication or decryption. In the case of ESP, this can lead to a remote crash on this instruction: m_copydata(m, m->m_pkthdr.len - 3, 3, lastthree); m_pkthdr.len is bigger than the actual mbuf chain. It seems possible to me to trigger this bug even if you don't have the ESP key, because the fragmentation part is outside of the encrypted ESP payload. So if you MITM the target, and intercept an incoming ESP packet (which you can't decrypt), you should be able to forge a new specially-crafted, fragmented packet and stuff the ESP payload (still encrypted, as you intercepted it) into it. The decryption succeeds and the target crashes.
1.158.4.1	09-Feb-2015	martin	branches: 1.158.4.1.2; 1.158.4.1.6; Pull up following revision(s) (requested by mlelstv in ticket #501): sys/kern/uipc_mbuf.c: revision 1.161 Correct m_len calculation for m_dup() with mbuf clusters. Fixes kern/49650.
1.158.4.1.6.4	22-May-2018	martin	Pull up following revision(s) (requested by maxv in ticket #1606): sys/kern/uipc_mbuf.c: revision 1.214 Revert my rev1.190, remove the M_READONLY check. The initial code was correct: what is read-only is the mbuf storage, not the mbuf itself. The storage contains the packet payload, and never has anything related to mbufs. So it is fine to remove M_PKTHDR on mbufs that have a read-only storage. In fact it was kind of obvious, since several places already manually remove M_PKTHDR without taking care of the external storage.
1.158.4.1.6.3	03-May-2018	martin	Pull up following revision(s) (requested by maxv in ticket #1602): sys/kern/uipc_mbuf.c: revision 1.211 (via patch) Modify m_defrag, so that it never frees the first mbuf of the chain. While here use the given 'flags' argument, and not M_DONTWAIT. We have a problem with several drivers: they poll an mbuf chain from their queues and call m_defrag on them, but m_defrag could update the mbuf pointer, so the mbuf in the queue is no longer valid. It is not easy to fix each driver, because doing pop+push will reorder the queue, and we don't really want that to happen. This problem was independently spotted by me, Kengo, Masanobu, and other people too it seems (perhaps PR/53218). Now m_defrag leaves the first mbuf in place, and compresses the chain only starting from the second mbuf in the chain. It is important not to compress the first mbuf with hacks, because the storage of this first mbuf may be shared with other mbufs.
1.158.4.1.6.2	17-Apr-2018	martin	Pull up following revision(s) (requested by maxv in ticket #1598): sys/kern/uipc_mbuf.c: revision 1.190 If the mbuf is shared leave M_PKTHDR in place. Given where this function is called from that's not supposed to happen, but I'm growing unconfident about our mbuf code.
1.158.4.1.6.1	05-Apr-2018	martin	Pull up following revision(s) (requested by maxv in ticket #1594): sys/kern/uipc_mbuf.c: revision 1.182 sys/netinet6/frag6.c: revision 1.67 sys/netinet/ip_reass.c: revision 1.14 sys/sys/mbuf.h: revision 1.179 Remove M_PKTHDR from secondary mbufs when reassembling packets. This is a real problem, because I found at least one component that relies on the fact that only the first mbuf has M_PKTHDR: far from here, in m_splithdr, we don't update m->m_pkthdr.len if M_PKTHDR is found in a secondary mbuf. (The initial intention there was to avoid updating m_pkthdr.len twice, the assumption was that if M_PKTHDR is set then we're dealing with the first mbuf.) Therefore, when handling fragmented IPsec packets (in particular IPv6, IPv4 is a bit more complicated), we may end up with an incorrect m_pkthdr.len after authentication or decryption. In the case of ESP, this can lead to a remote crash on this instruction: m_copydata(m, m->m_pkthdr.len - 3, 3, lastthree); m_pkthdr.len is bigger than the actual mbuf chain. It seems possible to me to trigger this bug even if you don't have the ESP key, because the fragmentation part is outside of the encrypted ESP payload. So if you MITM the target, and intercept an incoming ESP packet (which you can't decrypt), you should be able to forge a new specially-crafted, fragmented packet and stuff the ESP payload (still encrypted, as you intercepted it) into it. The decryption succeeds and the target crashes.
1.158.4.1.2.4	22-May-2018	martin	Pull up following revision(s) (requested by maxv in ticket #1606): sys/kern/uipc_mbuf.c: revision 1.214 Revert my rev1.190, remove the M_READONLY check. The initial code was correct: what is read-only is the mbuf storage, not the mbuf itself. The storage contains the packet payload, and never has anything related to mbufs. So it is fine to remove M_PKTHDR on mbufs that have a read-only storage. In fact it was kind of obvious, since several places already manually remove M_PKTHDR without taking care of the external storage.
1.158.4.1.2.3	15-May-2018	martin	Pull up following revision(s) (requested by maxv in ticket #1602): sys/kern/uipc_mbuf.c: revision 1.211 (via patch) Modify m_defrag, so that it never frees the first mbuf of the chain. While here use the given 'flags' argument, and not M_DONTWAIT. We have a problem with several drivers: they poll an mbuf chain from their queues and call m_defrag on them, but m_defrag could update the mbuf pointer, so the mbuf in the queue is no longer valid. It is not easy to fix each driver, because doing pop+push will reorder the queue, and we don't really want that to happen. This problem was independently spotted by me, Kengo, Masanobu, and other people too it seems (perhaps PR/53218). Now m_defrag leaves the first mbuf in place, and compresses the chain only starting from the second mbuf in the chain. It is important not to compress the first mbuf with hacks, because the storage of this first mbuf may be shared with other mbufs.
1.158.4.1.2.2	17-Apr-2018	martin	Pull up following revision(s) (requested by maxv in ticket #1598): sys/kern/uipc_mbuf.c: revision 1.190 If the mbuf is shared leave M_PKTHDR in place. Given where this function is called from that's not supposed to happen, but I'm growing unconfident about our mbuf code.
1.158.4.1.2.1	05-Apr-2018	martin	Pull up following revision(s) (requested by maxv in ticket #1594): sys/kern/uipc_mbuf.c: revision 1.182 sys/netinet6/frag6.c: revision 1.67 sys/netinet/ip_reass.c: revision 1.14 sys/sys/mbuf.h: revision 1.179 Remove M_PKTHDR from secondary mbufs when reassembling packets. This is a real problem, because I found at least one component that relies on the fact that only the first mbuf has M_PKTHDR: far from here, in m_splithdr, we don't update m->m_pkthdr.len if M_PKTHDR is found in a secondary mbuf. (The initial intention there was to avoid updating m_pkthdr.len twice, the assumption was that if M_PKTHDR is set then we're dealing with the first mbuf.) Therefore, when handling fragmented IPsec packets (in particular IPv6, IPv4 is a bit more complicated), we may end up with an incorrect m_pkthdr.len after authentication or decryption. In the case of ESP, this can lead to a remote crash on this instruction: m_copydata(m, m->m_pkthdr.len - 3, 3, lastthree); m_pkthdr.len is bigger than the actual mbuf chain. It seems possible to me to trigger this bug even if you don't have the ESP key, because the fragmentation part is outside of the encrypted ESP payload. So if you MITM the target, and intercept an incoming ESP packet (which you can't decrypt), you should be able to forge a new specially-crafted, fragmented packet and stuff the ESP payload (still encrypted, as you intercepted it) into it. The decryption succeeds and the target crashes.
1.159.2.8	28-Aug-2017	skrll	Sync with HEAD
1.159.2.7	05-Feb-2017	skrll	Sync with HEAD
1.159.2.6	05-Oct-2016	skrll	Sync with HEAD
1.159.2.5	09-Jul-2016	skrll	Sync with HEAD
1.159.2.4	29-May-2016	skrll	Sync with HEAD
1.159.2.3	22-Apr-2016	skrll	Sync with HEAD
1.159.2.2	22-Sep-2015	skrll	Sync with HEAD
1.159.2.1	06-Apr-2015	skrll	Sync with HEAD
1.168.2.3	26-Apr-2017	pgoyette	Sync with HEAD
1.168.2.2	20-Mar-2017	pgoyette	Sync with HEAD
1.168.2.1	04-Nov-2016	pgoyette	Sync with HEAD
1.170.2.1	21-Apr-2017	bouyer	Sync with HEAD
1.172.6.6	25-Oct-2021	martin	Pull up following revision(s) (requested by msaitoh in ticket #1703): sys/conf/files: revision 1.1288 sys/kern/uipc_mbuf.c: revision 1.244 share/man/man4/options.4: revision 1.520 Fix a bug that NMBCLUSTERS(kern.mbuf.nmbclusters) can't be changed by sysctl. Update the description of the NMBCLUSTERS. Add NMBCLUSTERS_MAX. defparam NMBCLUSTERS_MAX.
1.172.6.5	22-May-2018	martin	Pull up following revision(s) (requested by maxv in ticket #833): sys/kern/uipc_mbuf.c: revision 1.214 Revert my rev1.190, remove the M_READONLY check. The initial code was correct: what is read-only is the mbuf storage, not the mbuf itself. The storage contains the packet payload, and never has anything related to mbufs. So it is fine to remove M_PKTHDR on mbufs that have a read-only storage. In fact it was kind of obvious, since several places already manually remove M_PKTHDR without taking care of the external storage.
1.172.6.4	06-May-2018	martin	Pull up following revision(s) (requested by maxv in ticket #802): sys/kern/uipc_mbuf.c: revision 1.211 (via patch) Modify m_defrag, so that it never frees the first mbuf of the chain. While here use the given 'flags' argument, and not M_DONTWAIT. We have a problem with several drivers: they poll an mbuf chain from their queues and call m_defrag on them, but m_defrag could update the mbuf pointer, so the mbuf in the queue is no longer valid. It is not easy to fix each driver, because doing pop+push will reorder the queue, and we don't really want that to happen. This problem was independently spotted by me, Kengo, Masanobu, and other people too it seems (perhaps PR/53218). Now m_defrag leaves the first mbuf in place, and compresses the chain only starting from the second mbuf in the chain. It is important not to compress the first mbuf with hacks, because the storage of this first mbuf may be shared with other mbufs.
1.172.6.3	17-Apr-2018	martin	Pull up following revision(s) (requested by maxv in ticket #770): sys/kern/uipc_mbuf.c: revision 1.190 If the mbuf is shared leave M_PKTHDR in place. Given where this function is called from that's not supposed to happen, but I'm growing unconfident about our mbuf code.
1.172.6.2	05-Apr-2018	martin	Pull up following revision(s) (requested by maxv in ticket #695): sys/kern/uipc_mbuf.c: revision 1.182 sys/netinet6/frag6.c: revision 1.67 sys/netinet/ip_reass.c: revision 1.14 sys/sys/mbuf.h: revision 1.179 Remove M_PKTHDR from secondary mbufs when reassembling packets. This is a real problem, because I found at least one component that relies on the fact that only the first mbuf has M_PKTHDR: far from here, in m_splithdr, we don't update m->m_pkthdr.len if M_PKTHDR is found in a secondary mbuf. (The initial intention there was to avoid updating m_pkthdr.len twice, the assumption was that if M_PKTHDR is set then we're dealing with the first mbuf.) Therefore, when handling fragmented IPsec packets (in particular IPv6, IPv4 is a bit more complicated), we may end up with an incorrect m_pkthdr.len after authentication or decryption. In the case of ESP, this can lead to a remote crash on this instruction: m_copydata(m, m->m_pkthdr.len - 3, 3, lastthree); m_pkthdr.len is bigger than the actual mbuf chain. It seems possible to me to trigger this bug even if you don't have the ESP key, because the fragmentation part is outside of the encrypted ESP payload. So if you MITM the target, and intercept an incoming ESP packet (which you can't decrypt), you should be able to forge a new specially-crafted, fragmented packet and stuff the ESP payload (still encrypted, as you intercepted it) into it. The decryption succeeds and the target crashes.
1.172.6.1	27-Feb-2018	martin	Pull up following revision(s) (requested by mrg in ticket #593): sys/dev/marvell/mvxpsec.c: revision 1.2 sys/arch/m68k/m68k/pmap_motorola.c: revision 1.70 sys/opencrypto/crypto.c: revision 1.102 sys/arch/sparc64/sparc64/pmap.c: revision 1.308 sys/ufs/chfs/chfs_malloc.c: revision 1.5 sys/arch/powerpc/oea/pmap.c: revision 1.95 sys/sys/pool.h: revision 1.80,1.82 sys/kern/subr_pool.c: revision 1.209-1.216,1.219-1.220 sys/arch/alpha/alpha/pmap.c: revision 1.262 sys/kern/uipc_mbuf.c: revision 1.173 sys/uvm/uvm_fault.c: revision 1.202 sys/sys/mbuf.h: revision 1.172 sys/kern/subr_extent.c: revision 1.86 sys/arch/x86/x86/pmap.c: revision 1.266 (via patch) sys/dev/dtv/dtv_scatter.c: revision 1.4 Allow only one pending call to a pool's backing allocator at a time. Candidate fix for problems with hanging after kva fragmentation related to PR kern/45718. Proposed on tech-kern: https://mail-index.NetBSD.org/tech-kern/2017/10/23/msg022472.html Tested by bouyer@ on i386. This makes one small change to the semantics of pool_prime and pool_setlowat: they may fail with EWOULDBLOCK instead of ENOMEM, if there is a pending call to the backing allocator in another thread but we are not actually out of memory. That is unlikely because nearly always these are used during initialization, when the pool is not in use. Define the new flag too for previous commit. pool_grow can now fail even when sleeping is ok. Catch this case in pool_get and retry. Assert that pool_get failure happens only with PR_NOWAIT. This would have caught the mistake I made last week leading to null pointer dereferences all over the place, a mistake which I evidently poorly scheduled alongside maxv's change to the panic message on x86 for null pointer dereferences. Since pr_lock is now used to wait for two things now (PR_GROWING and PR_WANTED) we need to loop for the condition we wanted. 
make the KASSERTMSG/panic strings consistent as '%s: [%s], __func__, wchan' Handle the ERESTART case from pool_grow() don't pass 0 to the pool flags Guess pool_cache_get(pc, 0) means PR_WAITOK here. Earlier on in the same context we use kmem_alloc(sz, KM_SLEEP). use PR_WAITOK everywhere. use PR_NOWAIT. Don't use 0 for PR_NOWAIT use PR_NOWAIT instead of 0 panic ex nihilo -- PR_NOWAITing for zerot Add assertions that either PR_WAITOK or PR_NOWAIT are set. - fix an assert; we can reach there if we are nowait or limitfail. - when priming the pool and failing with ERESTART, don't decrement the number of pages; this avoids the issue of returning an ERESTART when we get to 0, and is more correct. - simplify the pool_grow code, and don't wakeup things if we ENOMEM. In pmap_enter_ma(), only try to allocate pves if we might need them, and even if that fails, only fail the operation if we later discover that we really do need them. This implements the requirement that pmap_enter(PMAP_CANFAIL) must not fail when replacing an existing mapping with the first mapping of a new page, which is an unintended consequence of the changes from the rmind-uvmplock branch in 2011. The problem arises when pmap_enter(PMAP_CANFAIL) is used to replace an existing pmap mapping with a mapping of a different page (eg. to resolve a copy-on-write). If that fails and leaves the old pmap entry in place, then UVM won't hold the right locks when it eventually retries. This entanglement of the UVM and pmap locking was done in rmind-uvmplock in order to improve performance, but it also means that the UVM state and pmap state need to be kept in sync more than they did before. It would be possible to handle this in the UVM code instead of in the pmap code, but these pmap changes improve the handling of low memory situations in general, and handling this in UVM would be clunky, so this seemed like the better way to go. 
This somewhat indirectly fixes PR 52706, as well as the failing assertion about "uvm_page_locked_p(old_pg)". (but only on x86, various other platforms will need their own changes to handle this issue.) In uvm_fault_upper_enter(), if pmap_enter(PMAP_CANFAIL) fails, assert that the pmap did not leave around a now-stale pmap mapping for an old page. If such a pmap mapping still existed after we unlocked the vm_map, the UVM code would not know later that it would need to lock the lower layer object while calling the pmap to remove or replace that stale pmap mapping. See PR 52706 for further details. hopefully workaround the irregularly "fork fails in init" problem. if a pool is growing, and the grower is PR_NOWAIT, mark this. if another caller wants to grow the pool and is also PR_NOWAIT, busy-wait for the original caller, which should either succeed or hard-fail fairly quickly. implement the busy-wait by unlocking and relocking this pools mutex and returning ERESTART. other methods (such as having the caller do this) were significantly more code and this hack is fairly localised. ok chs@ riastradh@ Don't release the lock in the PR_NOWAIT allocation. Move flags setting after the acquiring the mutex. (from Tobias Nygren) apply the change from arch/x86/x86/pmap.c rev. 1.266 commitid vZRjvmxG7YTHLOfA: In pmap_enter_ma(), only try to allocate pves if we might need them, and even if that fails, only fail the operation if we later discover that we really do need them. If we are replacing an existing mapping, reuse the pv structure where possible. This implements the requirement that pmap_enter(PMAP_CANFAIL) must not fail when replacing an existing mapping with the first mapping of a new page, which is an unintended consequence of the changes from the rmind-uvmplock branch in 2011. The problem arises when pmap_enter(PMAP_CANFAIL) is used to replace an existing pmap mapping with a mapping of a different page (eg. to resolve a copy-on-write). 
If that fails and leaves the old pmap entry in place, then UVM won't hold the right locks when it eventually retries. This entanglement of the UVM and pmap locking was done in rmind-uvmplock in order to improve performance, but it also means that the UVM state and pmap state need to be kept in sync more than they did before. It would be possible to handle this in the UVM code instead of in the pmap code, but these pmap changes improve the handling of low memory situations in general, and handling this in UVM would be clunky, so this seemed like the better way to go. This somewhat indirectly fixes PR 52706 on the remaining platforms where this problem existed.
1.181.2.12	18-Jan-2019	pgoyette	Synch with HEAD
1.181.2.11	26-Dec-2018	pgoyette	Sync with HEAD, resolve a few conflicts
1.181.2.10	26-Nov-2018	pgoyette	Sync with HEAD, resolve a couple of conflicts
1.181.2.9	20-Oct-2018	pgoyette	Sync with head
1.181.2.8	06-Sep-2018	pgoyette	Sync with HEAD Resolve a couple of conflicts (result of the uimin/uimax changes)
1.181.2.7	28-Jul-2018	pgoyette	Sync with HEAD
1.181.2.6	21-May-2018	pgoyette	Sync with HEAD
1.181.2.5	02-May-2018	pgoyette	Synch with HEAD
1.181.2.4	22-Apr-2018	pgoyette	Sync with HEAD
1.181.2.3	16-Apr-2018	pgoyette	Sync with HEAD, resolve some conflicts
1.181.2.2	22-Mar-2018	pgoyette	Synch with HEAD, resolve conflicts
1.181.2.1	15-Mar-2018	pgoyette	Synch with HEAD
1.215.2.2	13-Apr-2020	martin	Mostly merge changes from HEAD upto 20200411
1.215.2.1	10-Jun-2019	christos	Sync with HEAD
1.232.4.3	27-Nov-2023	martin	Pull up following revision(s) (requested by ozaki-r in ticket #1768): sys/kern/uipc_mbuf.c: revision 1.252 mbuf: avoid assertion failure when splitting mbuf cluster From OpenBSD: commit 7b4d35e0a60ba1dd4daf4b1c2932020a22463a89 Author: bluhm <bluhm@openbsd.org> Date: Fri Oct 20 16:25:15 2023 +0000 Avoid assertion failure when splitting mbuf cluster. m_split() calls m_align() to initialize the data pointer of newly allocated mbuf. If the new mbuf will be converted to a cluster, this is not necessary. If additionally the new mbuf is larger than MLEN, this can lead to a panic. Only call m_align() when a valid m_data is needed. This is the case if we do not reference the existing cluster, but memcpy() the data into the new mbuf. Reported-by: syzbot+0e6817f5877926f0e96a@syzkaller.appspotmail.com OK claudio@ deraadt@ The issue is harmless if DIAGNOSTIC is not enabled.
1.232.4.2	25-Oct-2021	martin	Pull up following revision(s) (requested by msaitoh in ticket #1368): sys/conf/files: revision 1.1288 sys/kern/uipc_mbuf.c: revision 1.244 share/man/man4/options.4: revision 1.520 Fix a bug that NMBCLUSTERS(kern.mbuf.nmbclusters) can't be changed by sysctl. Update the description of the NMBCLUSTERS. Add NMBCLUSTERS_MAX. defparam NMBCLUSTERS_MAX.
1.232.4.1	11-Aug-2020	martin	Pull up following revision(s) (requested by mrg in ticket #1045): sys/kern/uipc_mbuf.c: revision 1.235 sys/dev/ic/dwc_gmac.c: revision 1.70 sys/dev/ic/dwc_gmac_reg.h: revision 1.20 sys/dev/ic/dwc_gmac.c: revision 1.66 sys/dev/ic/dwc_gmac.c: revision 1.67 sys/dev/ic/dwc_gmac.c: revision 1.68 awge: fix issue that caused rx packets to be corrupt with DIAGNOSTIC kernel It seems the hardware can only reliably do rx DMA to addresses that are dcache size aligned. This is hinted at by some GMAC data sheets but hard to find an authoritative source. on non-DIAGNOSTIC kernels we always implicitly get MCLBYTES-aligned mbuf data pointers, but with the reintroduction of POOL_REDZONE for DIAGNOSTIC we can get 8-byte alignment due to redzone padding. So align rx pointers to 64 bytes which should be good for both arm32 and aarch64. While here change some bus_dmamap_load() to bus_dmamap_load_mbuf() and add one missing bus_dmamap_sync(). Also fixes the code to not assume that MCLBYTES == AWGE_MAX_PACKET. User may override MCLSHIFT in kernel config. correct pointer arithmetics mcl_cache: align items to COHERENCY_UNIT Because we do cache incoherent DMA to/from mbufs we cannot safely share cache lines with adjacent items that may be concurrently accessed. awge: drop redundant m_adj(). Handled via uipc_mbuf.c r1.235 instead. Mask all the MMC counter interrupts if the MMC module is present.
1.237.2.1	25-Apr-2020	bouyer	Sync with bouyer-xenpvh-base2 (HEAD)
1.241.2.1	03-Apr-2021	thorpej	Sync with HEAD.
1.247.2.2	20-Sep-2024	martin	Pull up following revision(s) (requested by rin in ticket #882): sys/kern/uipc_mbuf.c: revision 1.250 sys/kern/uipc_mbuf.c: revision 1.249 mbuf(9): Sprinkle KASSERTMSG. No functional change intended. 0x%p -> %p in KASSERTMSGs
1.247.2.1	27-Nov-2023	martin	Pull up following revision(s) (requested by ozaki-r in ticket #475): sys/kern/uipc_mbuf.c: revision 1.252 mbuf: avoid assertion failure when splitting mbuf cluster From OpenBSD: commit 7b4d35e0a60ba1dd4daf4b1c2932020a22463a89 Author: bluhm <bluhm@openbsd.org> Date: Fri Oct 20 16:25:15 2023 +0000 Avoid assertion failure when splitting mbuf cluster. m_split() calls m_align() to initialize the data pointer of newly allocated mbuf. If the new mbuf will be converted to a cluster, this is not necessary. If additionally the new mbuf is larger than MLEN, this can lead to a panic. Only call m_align() when a valid m_data is needed. This is the case if we do not reference the existing cluster, but memcpy() the data into the new mbuf. Reported-by: syzbot+0e6817f5877926f0e96a@syzkaller.appspotmail.com OK claudio@ deraadt@ The issue is harmless if DIAGNOSTIC is not enabled.
