TODO.smpnet revision 1.24
$NetBSD: TODO.smpnet,v 1.24 2018/09/06 06:46:25 maxv Exp $

MP-safe components
==================

These components work without the big kernel lock (KERNEL_LOCK), i.e., with
the NET_MPSAFE kernel option. Some components scale up and some don't.

 - Device drivers
   - vioif(4)
   - vmx(4)
   - wm(4)
   - ixg(4)
   - ixv(4)
 - Layer 2
   - Ethernet (if_ethersubr.c)
   - bridge(4)
     - STP
   - Fast forward (ipflow)
 - Layer 3
   - All except for items in the below section
 - Interfaces
   - gif(4)
   - ipsecif(4)
   - l2tp(4)
   - pppoe(4)
     - if_spppsubr.c
   - tun(4)
   - vlan(4)
 - Packet filters
   - npf(7)
 - Others
   - bpf(4)
   - ipsec(4)
   - opencrypto(9)
   - pfil(9)

Non MP-safe components and kernel options
=========================================

These components and options aren't MP-safe yet, i.e., they require the big
kernel lock. Some of them can be used safely even if NET_MPSAFE is enabled
because they're still protected by the big kernel lock. The others aren't
protected and so are unsafe, e.g., they may crash the kernel.

Protected ones
--------------

 - Device drivers
   - Most drivers other than ones listed in the above section
 - Layer 4
   - DCCP
   - SCTP
   - TCP
   - UDP

Unprotected ones
----------------

 - Layer 2
   - ARCNET (if_arcsubr.c)
   - BRIDGE_IPF
   - FDDI (if_fddisubr.c)
   - HIPPI (if_hippisubr.c)
   - IEEE 1394 (if_ieee1394subr.c)
   - IEEE 802.11 (ieee80211(4))
   - Token ring (if_tokensubr.c)
 - Layer 3
   - IPSELSRC
   - MROUTING
   - PIM
   - MPLS (mpls(4))
   - IPv6 address selection policy
 - Interfaces
   - agr(4)
   - carp(4)
   - faith(4)
   - gre(4)
   - ppp(4)
   - sl(4)
   - stf(4)
   - strip(4)
   - if_srt
   - tap(4)
 - Packet filters
   - ipf(4)
   - pf(4)
 - Others
   - AppleTalk (sys/netatalk/)
   - Bluetooth (sys/netbt/)
   - altq(4)
   - CIFS (sys/netsmb/)
   - ISDN (sys/netisdn/)
   - kttcp(4)
   - NFS

Known issues
============

NOMPSAFE
--------

We use "NOMPSAFE" as a marker indicating that the code around it isn't MP-safe
yet. We use it in comments and also as part of function names, for example
m_get_rcvif_NOMPSAFE. Let's use "NOMPSAFE" to make it easy to find non-MP-safe
code with grep.

bpf
---

MP-ification of bpf requires that all bpf_mtap* calls be made in normal LWP
context or softint context, i.e., not in hardware interrupt context. For Tx,
all bpf_mtap calls satisfy the requirement. For Rx, most bpf_mtap calls are
made in softint. Unfortunately some bpf_mtap calls on Rx are still made in
hardware interrupt context.

This is the list of the functions that have such bpf_mtap calls:

 - sca_frame_process() @ sys/dev/ic/hd64570.c
 - rxintr_cleanup() @ sys/dev/pci/if_lmc.c
 - ipr_rx_data_rdy() @ sys/netisdn/i4b_ipr.c

Ideally we should make these functions run in softint somehow, but we have
neither the actual devices nor the time (or interest/love) to work on the
task, so instead we provide a deferred bpf_mtap mechanism that forcibly runs
bpf_mtap in softint context. It's a workaround; once the functions run in
softint, we should use the original bpf_mtap again.

Lingering obsolete variables
----------------------------

Some obsolete global variables and member variables of structures remain to
avoid breaking old userland programs which directly access such variables via
kvm(3).

The following programs still use kvm(3) to get some information related to
the network stack.

 - netstat(1)
 - vmstat(1)
 - fstat(1)

netstat(1) accesses ifnet_list, the head of a list of interface objects
(struct ifnet), and traverses each object through the ifnet#if_list member
variable. ifnet_list and ifnet#if_list are obsoleted by ifnet_pslist and
ifnet#if_pslist_entry respectively. netstat also accesses the IP address list
of an interface through ifnet#if_addrlist. struct ifaddr, struct in_ifaddr
and struct in6_ifaddr are accessed, so the following obsolete member variables
must remain: ifaddr#ifa_list, in_ifaddr#ia_hash, in_ifaddr#ia_list,
in6_ifaddr#ia_next and in6_ifaddr#_ia6_multiaddrs. Note that netstat already
implements alternative methods to fetch the above information via sysctl(3).

vmstat(1) shows statistics of hash tables created by hashinit(9) in the kernel.
The statistics are retrieved via kvm(3).
The global variables in_ifaddrhash and in_ifaddrhashtbl, which are for a hash
table of IPv4 addresses and obsoleted by in_ifaddrhash_pslist and
in_ifaddrhashtbl_pslist, are kept for this purpose. We should provide a means
to fetch statistics of hash tables via sysctl(3).

fstat(1) shows information about bpf instances. Each bpf instance (struct bpf)
is obtained via kvm(3). The bpf_d#_bd_next, bpf_d#_bd_filter and
bpf_d#_bd_list member variables are obsolete but remain. ifnet#if_xname is
also accessed via struct bpf_if, and the obsolete ifnet#if_list must remain so
that the offset of ifnet#if_xname doesn't change. The statistic counters
(bpf#bd_rcount, bpf#bd_dcount and bpf#bd_ccount) are also victims of this
restriction; for scalability the counters should be per-CPU and we should stop
using atomic operations for them, however we have to keep the counters and the
atomic operations.

Scalability
-----------

 - Per-CPU rtcaches (used in say IP forwarding) aren't scalable on multiple
   flows per CPU
 - ipsec(4) isn't scalable on the number of SA/SP; the cost of a look-up
   is O(n)
 - opencrypto(9)'s crypto_newsession()/crypto_freesession() aren't scalable
   as they are serialized by one mutex

ec_multi* of ethercom
---------------------

ec_multiaddrs and ec_multicnt of struct ethercom and items listed in
ec_multiaddrs must be protected by ec_lock. The core of the Ethernet subsystem
is already MP-safe; however, device drivers that use the data should also be
fixed. A typical change is to protect manipulations of the data via ETHER_*
macros such as ETHER_FIRST_MULTI with ETHER_LOCK and ETHER_UNLOCK.

ALTQ
----

If ALTQ is enabled in the kernel, it forces the use of just one Tx queue
(if_snd) for packet transmissions, resulting in serializing all Tx packet
processing on the queue.
We should probably design and implement an alternative queuing mechanism that
deals with multi-core systems in the first place, rather than make the
existing ALTQ MP-safe, because that's just annoying.
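
Sketch: deferred bpf_mtap
-------------------------

To illustrate the deferred bpf_mtap workaround described in the bpf section,
here is a minimal userland sketch in C. It is not the kernel implementation.
Every name in it (struct pkt, struct defer_queue, defer_tap_from_hardintr,
defer_tap_softint, tap_packet, DEFER_QLEN, TAP_LOGLEN) is hypothetical, and
the hardware interrupt and softint contexts are modelled by plain function
calls. It also assumes a fixed-size queue that drops packets when full, which
may not match what the real mechanism does.

```c
#include <assert.h>
#include <stddef.h>

#define DEFER_QLEN	8	/* hypothetical queue depth */
#define TAP_LOGLEN	64	/* hypothetical size of the demo tap log */

struct pkt {			/* stand-in for a received packet (mbuf) */
	int id;
};

struct defer_queue {		/* packets waiting to be tapped */
	struct pkt *slots[DEFER_QLEN];
	size_t head;		/* next slot to drain */
	size_t tail;		/* next slot to fill */
	size_t count;		/* packets currently queued */
	unsigned long dropped;	/* packets rejected because the queue was full */
};

static int tap_log[TAP_LOGLEN];	/* ids seen by the stand-in tap */
static size_t tap_count;

/* Stand-in for the real tap; it only records the packet id. */
static void
tap_packet(const struct pkt *p)
{
	if (tap_count < TAP_LOGLEN)
		tap_log[tap_count++] = p->id;
}

/* "Hardware interrupt" side: queue the packet, never tap it here. */
static int
defer_tap_from_hardintr(struct defer_queue *q, struct pkt *p)
{
	assert(q != NULL && p != NULL);
	if (q->count == DEFER_QLEN) {
		q->dropped++;
		return -1;
	}
	q->slots[q->tail] = p;
	q->tail = (q->tail + 1) % DEFER_QLEN;
	q->count++;
	return 0;
}

/* "Softint" side: drain the queue in FIFO order and tap each packet. */
static size_t
defer_tap_softint(struct defer_queue *q)
{
	size_t n = 0;

	assert(q != NULL);
	while (q->count > 0) {
		struct pkt *p = q->slots[q->head];

		q->head = (q->head + 1) % DEFER_QLEN;
		q->count--;
		tap_packet(p);
		n++;
	}
	return n;
}
```

In this model a driver's Rx interrupt handler would call
defer_tap_from_hardintr() and the tap itself would only ever run from
defer_tap_softint(). The sketch omits any protection against concurrent
access to the queue, which kernel code would need.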