TODO.smpnet revision 1.18
$NetBSD: TODO.smpnet,v 1.18 2017/12/05 03:23:29 ozaki-r Exp $

MP-safe components
==================

 - Device drivers
   - vioif(4)
   - vmx(4)
   - wm(4)
   - ixg(4)
   - ixv(4)
 - Layer 2
   - Ethernet (if_ethersubr.c)
   - bridge(4)
     - STP
   - Fast forward (ipflow)
 - Layer 3
   - All except for items in the below section
 - Interfaces
   - gif(4)
   - l2tp(4)
   - pppoe(4)
     - if_spppsubr.c
   - tun(4)
   - vlan(4)
 - Packet filters
   - npf(7)
 - Others
   - bpf(4)
   - ipsec(4)
   - opencrypto(9)
   - pfil(9)

Non MP-safe components and kernel options
=========================================

 - Device drivers
   - Most drivers other than ones listed in the above section
 - Layer 2
   - ARCNET (if_arcsubr.c)
   - ATM (if_atmsubr.c)
   - BRIDGE_IPF
   - if_ecosubr.c
   - FDDI (if_fddisubr.c)
   - HIPPI (if_hippisubr.c)
   - IEEE 1394 (if_ieee1394subr.c)
   - IEEE 802.11 (ieee80211(4))
   - Token ring (if_tokensubr.c)
 - Layer 3
   - IPSELSRC
   - MROUTING
   - PIM
   - MPLS (mpls(4))
   - IPv6 address selection policy
 - Layer 4
   - DCCP
   - SCTP
   - TCP
   - UDP
 - Interfaces
   - agr(4)
   - carp(4)
   - etherip(4)
   - faith(4)
   - gre(4)
   - ppp(4)
   - sl(4)
   - stf(4)
   - strip(4)
   - if_srt
   - tap(4)
 - Packet filters
   - ipf(4)
   - pf(4)
 - Others
   - AppleTalk (sys/netatalk/)
   - ATM (sys/netnatm/)
   - Bluetooth (sys/netbt/)
   - altq(4)
   - CIFS (sys/netsmb/)
   - ISDN (sys/netisdn/)
   - kttcp(4)
   - NFS

Known issues
============

NOMPSAFE
--------

We use "NOMPSAFE" as a marker indicating that the code around it isn't MP-safe
yet.  It appears in comments and also as part of function names, for example
m_get_rcvif_NOMPSAFE.  Let's keep using "NOMPSAFE" so that non-MP-safe code is
easy to find with grep.

bpf
---

MP-ification of bpf requires that all bpf_mtap* calls be made in normal LWP
context or softint context, i.e., not in hardware interrupt context.  For Tx,
all bpf_mtap calls satisfy the requirement.  For Rx, most bpf_mtap calls are
made in softint context.  Unfortunately, some bpf_mtap calls on Rx are still
made in hardware interrupt context.

The following functions contain such bpf_mtap calls:

 - sca_frame_process() @ sys/dev/ic/hd64570.c
 - en_intr() @ sys/dev/ic/midway.c
 - rxintr_cleanup() and txintr_cleanup() @ sys/dev/pci/if_lmc.c
 - ipr_rx_data_rdy() @ sys/netisdn/i4b_ipr.c

Ideally we should make these functions run in softint somehow, but we have
neither the actual devices nor the time (or interest/love) to work on the task.
Instead we provide a deferred bpf_mtap mechanism that forcibly runs bpf_mtap in
softint context.  It's a workaround; once the functions run in softint, we
should use the original bpf_mtap again.
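
The deferral idea can be illustrated with a small userland model.  This is only
a sketch under stated assumptions: struct pkt, struct tap_queue, hardintr_rx(),
softint_drain() and tap() are hypothetical names invented for this example, not
kernel APIs, and a real implementation would also have to synchronize the queue
between hardware interrupt and softint context.

```c
#include <stddef.h>

#define TAPQ_LEN 8	/* hypothetical queue depth */

/* Hypothetical stand-in for a received packet. */
struct pkt {
	int id;
};

/* Hypothetical ring buffer shared by the two contexts. */
struct tap_queue {
	struct pkt *slots[TAPQ_LEN];
	size_t head;	/* next slot to drain */
	size_t count;	/* packets waiting to be tapped */
	int tapped;	/* packets seen by tap() */
	int dropped;	/* packets not tapped because the queue was full */
};

/* Stand-in for the tap itself; must only run in "softint" context. */
static void
tap(struct tap_queue *q, struct pkt *p)
{
	(void)p;
	q->tapped++;
}

/* "Hardware interrupt" side: never taps, only queues the packet. */
static int
hardintr_rx(struct tap_queue *q, struct pkt *p)
{
	if (q->count == TAPQ_LEN) {
		q->dropped++;
		return -1;
	}
	q->slots[(q->head + q->count) % TAPQ_LEN] = p;
	q->count++;
	return 0;
}

/* "Softint" side: drains the queue and taps each packet in order. */
static void
softint_drain(struct tap_queue *q)
{
	while (q->count > 0) {
		struct pkt *p = q->slots[q->head];

		q->head = (q->head + 1) % TAPQ_LEN;
		q->count--;
		tap(q, p);
	}
}
```

In this model nothing is tapped until softint_drain() runs, which mirrors the
requirement that the tap never executes in hardware interrupt context.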

Lingering obsolete variables
----------------------------

Some obsolete global variables and member variables of structures remain to
avoid breaking old userland programs which directly access such variables via
kvm(3).

The following programs still use kvm(3) to get some information related to
the network stack.

 - netstat(1)
 - vmstat(1)
 - fstat(1)

netstat(1) accesses ifnet_list, the head of a list of interface objects
(struct ifnet), and traverses each object through the ifnet#if_list member
variable. ifnet_list and ifnet#if_list are obsoleted by ifnet_pslist and
ifnet#if_pslist_entry respectively. netstat also accesses the IP address list
of an interface through ifnet#if_addrlist. struct ifaddr, struct in_ifaddr
and struct in6_ifaddr are accessed, so the following obsolete member variables
are stuck: ifaddr#ifa_list, in_ifaddr#ia_hash, in_ifaddr#ia_list,
in6_ifaddr#ia_next and in6_ifaddr#_ia6_multiaddrs. Note that netstat already
implements alternative methods to fetch the above information via sysctl(3).

vmstat(1) shows statistics of hash tables created by hashinit(9) in the kernel.
The statistic information is retrieved via kvm(3). The global variables
in_ifaddrhash and in_ifaddrhashtbl, which are for a hash table of IPv4
addresses and obsoleted by in_ifaddrhash_pslist and in_ifaddrhashtbl_pslist,
are kept for this purpose. We should provide a means to fetch statistics of
hash tables via sysctl(3).

fstat(1) shows information of bpf instances. Each bpf instance (struct bpf_d)
is obtained via kvm(3). The bpf_d#_bd_next, bpf_d#_bd_filter and bpf_d#_bd_list
member variables are obsolete but remain. ifnet#if_xname is also accessed
via struct bpf_if, and the obsolete ifnet#if_list has to remain so that the
offset of ifnet#if_xname does not change. The statistic counters
(bpf_d#bd_rcount, bpf_d#bd_dcount and bpf_d#bd_ccount) are also victims of this
restriction; for scalability the counters should be per-CPU and we should stop
using atomic operations for them, however we have to keep both the counters
and the atomic operations.
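
Why an unused member has to stay can be shown with offsetof().  The structures
below are hypothetical miniatures written for this note, not the real
struct ifnet; they only demonstrate that removing a member placed ahead of a
field changes that field's offset, which is what a kvm(3) reader compiled
against the old layout depends on.

```c
#include <stddef.h>

#define XNAMESIZ 16	/* hypothetical name length */

/* Hypothetical list linkage, standing in for the obsolete list entry. */
struct list_entry {
	void *next;
	void *prev;
};

/* Layout that old userland programs were compiled against. */
struct ifnet_old {
	void *softc;
	struct list_entry if_list;
	char if_xname[XNAMESIZ];
};

/* Obsolete member kept in place: if_xname stays where it was. */
struct ifnet_compat {
	void *softc;
	struct list_entry if_list;	/* obsolete, kept only for layout */
	char if_xname[XNAMESIZ];
};

/* Obsolete member removed: if_xname moves toward the start. */
struct ifnet_trimmed {
	void *softc;
	char if_xname[XNAMESIZ];
};

/* Returns nonzero if a reader using the old offset still finds if_xname. */
static int
xname_offset_preserved(size_t new_offset)
{
	return new_offset == offsetof(struct ifnet_old, if_xname);
}
```

Passing offsetof(struct ifnet_compat, if_xname) yields nonzero, while the
offset taken from struct ifnet_trimmed does not match the old layout.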

Scalability
-----------

 - Per-CPU rtcaches (used in say IP forwarding) aren't scalable on multiple
   flows per CPU
 - ipsec(4) isn't scalable on the number of SA/SP; the cost of a look-up
   is O(n)
 - opencrypto(9)'s crypto_newsession()/crypto_freesession() aren't scalable
   as they are serialized by one mutex

ec_multi* of ethercom
---------------------

ec_multiaddrs and ec_multicnt of struct ethercom and items listed in
ec_multiaddrs must be protected by ec_lock.  The core of the Ethernet subsystem
is already MP-safe; however, device drivers that use the data also need to be
fixed.  A typical change is to wrap manipulations of the data via ETHER_*
macros such as ETHER_FIRST_MULTI in ETHER_LOCK and ETHER_UNLOCK.
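
The locking pattern can be sketched in userland as follows.  The structure and
macro definitions here are simplified stand-ins written for this note so that
the example compiles on its own: they reuse the names from the text, but the
kernel definitions differ (enm_next is a hypothetical link field, and ec_lock
is modelled with a pthread mutex).  Only the shape of count_multicast() is the
point.

```c
#include <pthread.h>
#include <stddef.h>

/* Simplified userland stand-ins; the kernel definitions differ. */
struct ether_multi {
	unsigned char enm_addrlo[6];
	struct ether_multi *enm_next;	/* hypothetical link field */
};

struct ethercom {
	struct ether_multi *ec_multiaddrs;
	int ec_multicnt;
	pthread_mutex_t ec_lock;	/* modelled as a pthread mutex */
};

struct ether_multistep {
	struct ether_multi *e_enm;
};

#define ETHER_LOCK(ec)		pthread_mutex_lock(&(ec)->ec_lock)
#define ETHER_UNLOCK(ec)	pthread_mutex_unlock(&(ec)->ec_lock)
#define ETHER_NEXT_MULTI(step, enm)					\
	do {								\
		(enm) = (step).e_enm;					\
		if ((enm) != NULL)					\
			(step).e_enm = (enm)->enm_next;			\
	} while (0)
#define ETHER_FIRST_MULTI(step, ec, enm)				\
	do {								\
		(step).e_enm = (ec)->ec_multiaddrs;			\
		ETHER_NEXT_MULTI((step), (enm));			\
	} while (0)

/*
 * Driver-side pattern: hold ec_lock for the whole walk so the list
 * cannot change underneath the iteration.
 */
static int
count_multicast(struct ethercom *ec)
{
	struct ether_multistep step;
	struct ether_multi *enm;
	int n = 0;

	ETHER_LOCK(ec);
	ETHER_FIRST_MULTI(step, ec, enm);
	while (enm != NULL) {
		n++;	/* a real driver would program its filter here */
		ETHER_NEXT_MULTI(step, enm);
	}
	ETHER_UNLOCK(ec);

	return n;
}
```

The lock is taken before the first ETHER_* access and released only after the
last one, rather than being taken and dropped around each step.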

ALTQ
----

If ALTQ is enabled in the kernel, it forces the use of just one Tx queue
(if_snd) for packet transmissions, which serializes all Tx packet processing on
that queue.  We should probably design and implement an alternative queuing
mechanism that deals with multi-core systems in the first place, rather than
making the existing ALTQ MP-safe, because that's just annoying.
