TODO.modules revision 1.24 1 /* $NetBSD: TODO.modules,v 1.24 2021/08/09 20:49:08 andvar Exp $ */
2
3 Some notes on the limitations of our current (as of 7.99.35) module
4 subsystem. This list was triggered by an Email exchange between
5 christos and pgoyette.
6
7 1. Builtin drivers can't depend on modularized drivers (the kernel
8 tries to load the modularized drivers as builtins).
9
10 The assumption is that dependencies are loaded before those
11 modules which depend on them. At load time, a module's
12 undefined global symbols are resolved; if any symbols can't
13 be resolved, the load fails. Similarly, if a module is
14 included in (built-into) the kernel, all of its symbols must
15 be resolvable by the linker, otherwise the link fails.
16
17 There are ways around this (such as, having the parent
18 module's initialization command recursively call the module
19 load code), but they're often gross hacks.
20
21 Another alternative (which is used by ppp) is to provide a
22 "registration" mechanism for the "child" modules, and then when
23 the need for a specific child module is encountered, use
24 module_autoload() to load the child module. Of course, this
25 requires that the parent module know about all potentially
26 loadable children.
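
A minimal sketch of the "registration" idea described above, assuming a
static table in the parent. The struct, the table contents and
child_lookup() are hypothetical names for illustration, not the actual ppp
code. In the kernel, the name returned would then be handed to
module_autoload().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry: a key the parent may encounter at run
 * time, mapped to the child module that handles it. */
struct child_reg {
	const char *key;
	const char *module;
};

/* Illustrative table only; the entries are assumptions, not the real list. */
static const struct child_reg children[] = {
	{ "deflate", "ppp_deflate" },
	{ "bsdcomp", "ppp_bsdcomp" },
};

/* Return the module registered for key, or NULL if the parent does not
 * know about such a child (the limitation noted in the text). */
static const char *
child_lookup(const char *key)
{
	assert(key != NULL);
	for (size_t i = 0; i < sizeof(children) / sizeof(children[0]); i++) {
		if (strcmp(children[i].key, key) == 0)
			return children[i].module;
	}
	return NULL;
}
```

The NULL return makes the drawback visible: a child that the parent was
never told about cannot be found, no matter what is installed on disk.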
27
28 2. Currently, config(1) has no way to "no define" drivers.
29 XXX: I don't think this is true anymore. I think we can
30 undefine drivers now; see MODULAR in amd64, which does
31 no ath* and no select sppp*.
32
33 3. It is not always obvious from their names which drivers/options
34 correspond to which modules.
35
36 4. Right now critical drivers that would need to be pre-loaded (ffs,
37 exec_elf64) are still built-in so that we don't need to alter the boot
38 blocks to boot.
39
40 This was a conscious decision by core@ some years ago. It is
41 not a requirement that ffs or exec_* be built-in. The only
42 requirement is that the root file-system's module must be
43 available when the module subsystem is initialized, in order
44 to load other modules. This can be accomplished by having the
45 boot loader "push" the module at boot time. (It used to do
46 this in all cases; currently the "push" only occurs if the
47 booted filesystem is not ffs.)
48
49 5. Not all parent bus drivers are capable of rescan, so some drivers
50 just have to be built-in.
51
52 6. Many (most?) drivers are not yet modularized.
53
54 7. There are currently no provisions for autoconfig to figure out which
55 modules are needed, and thus to load the required modules.
56
57 In the "normal" built-in world, autoconfigure can only ask
58 existing drivers if they're willing to manage (ie, attach) a
59 device. Removing the built-in drivers tends to limit the
60 availability of possible managers. There's currently no
61 mechanism for identifying and loading drivers based on what
62 devices might be found.
63
64 8. Even for existing modules, there are "surprise" dependencies with
65 code that has not yet been modularized.
66
67 For example, even though the bpf code has been modularized,
68 there is some shared code in bpf_filter.c which is needed by
69 both ipfilter and ppp. ipf is already modularized, but ppp
70 is not. Thus, even though bpf_filter is modular, it MUST be
71 included as a built-in module if you also have ppp in your
72 configuration.
73
74 Another example is the sysmon_taskq module. It is required by
75 other parts of the sysmon subsystem, including the
76 "sysmon_power" module. Unfortunately, even though the
77 sysmon_power code is modularized, it is referenced by the
78 acpi code which has not been modularized. Therefore, if your
79 configuration has acpi, then you must include the "sysmon_power"
80 module built into the kernel. And therefore you also need to
81 have "sysmon_taskq" and "sysmon" built-in since "sysmon_power"
82 references them.
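
The knock-on effect in the sysmon example can be sketched as a transitive
closure over a "needs" table. The module names come from the text; the table
layout and the helper names (mod_index, mark_builtin) are hypothetical, and
the table is not derived from the real sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NMOD 4

static const char *const names[NMOD] = {
	"sysmon", "sysmon_taskq", "sysmon_power", "bpf_filter"
};

/* needs[i][j] is true when module i references symbols in module j. */
static const bool needs[NMOD][NMOD] = {
	/* sysmon       */ { false, false, false, false },
	/* sysmon_taskq */ { false, false, false, false },
	/* sysmon_power */ { true,  true,  false, false },
	/* bpf_filter   */ { false, false, false, false },
};

/* Return the table index for name, or -1 if it is not listed. */
static int
mod_index(const char *name)
{
	for (int i = 0; i < NMOD; i++) {
		if (strcmp(names[i], name) == 0)
			return i;
	}
	return -1;
}

/* Mark idx, and everything it transitively needs, as "must be built-in". */
static void
mark_builtin(int idx, bool builtin[NMOD])
{
	assert(idx >= 0 && idx < NMOD);
	if (builtin[idx])
		return;
	builtin[idx] = true;
	for (int j = 0; j < NMOD; j++) {
		if (needs[idx][j])
			mark_builtin(j, builtin);
	}
}
```

Starting the walk from "sysmon_power" (the module that non-modular acpi
references) marks "sysmon" and "sysmon_taskq" as well, and leaves
"bpf_filter" alone.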
83
84 9. As a corollary to #8 above, having dependencies on modules from code
85 which has not been modularized makes it extremely difficult to test
86 the module code adequately. Testing of module code should include
87 both testing-as-a-built-in module and testing-as-a-loaded-module, and
88 all dependencies need to be identified.
89
90 10. The current /stand/$ARCH/$VERSION/modules/ hierarchy won't scale as
91 we get more and more modules. There are hundreds of potential device
92 driver modules.
93
94 11. There currently isn't any good way to handle attachment-specific
95 modules. The build infrastructure (ie, sys/modules/Makefile) doesn't
96 readily lend itself to bus-specific modules irrespective of $ARCH,
97 and maintaining distrib/sets/lists/modules/* is awkward at best.
98
99 Furthermore, devices such as ld(4), which can attach to a large set
100 of parent devices, need to be modified. The parent devices need to
101 provide a common attribute (for example, ld_bus), and the ld driver
102 should attach to that attribute rather than to each parent. But
103 currently, config(1) doesn't handle this - it doesn't allow an
104 attribute to be used as the device tree's pseudo-root. The current
105 directory structure where driver foo is split between ic/foo.c
106 and bus1/foo_bus1.c ... busn/foo_busn.c is annoying. It would be
107 better to switch to the FreeBSD model which puts all the driver
108 files in one directory.
109
110 12. Item #11 gets even murkier when a particular parent can provide more
111 than one attribute.
112
113 13. It seems that we might want some additional sets-lists "attributes"
114 to control contents of distributions. As an example, many of our
115 architectures have PCI bus capabilities, but not all. It is rather
116 painful to need to maintain individual architectures' modules/md_*
117 sets lists, especially when we already have to conditionalize the
118 build of the modules based on architecture. If we had a single
119 "attribute" for PCI-bus-capable, the same attribute could be used to
120 select which modules to build and which modules from modules/mi to
121 include in the release. (This is not limited to PCI; recently we
122 encountered similar issues with the spkr (aka spkr_synth) module.)
123
124 14. As has been pointed out more than once, the current method of storing
125 modules in a version-specific subdirectory of /stand is sub-optimal
126 and leads to much difficulty and/or confusion. A better mechanism of
127 associating a kernel and its modules needs to be developed. Some
128 have suggested having a top-level directory (say, /netbsd) with a
129 kernel and its modules at /netbsd/kernel and /netbsd/modules/...
130 Whatever new mechanism we arrive at will probably require changes to
131 installation procedures and bootstrap code, and will need to handle
132 both the new and old mechanisms for compatibility.
133
134 One additional option mentioned is to be able to specify, at boot
135 loader time, an alternate value for the os-release portion of the
136 default module path, i.e. /stand/$MACHINE/$ALT-RELEASE/modules/
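
A rough sketch of the "alternate os-release" option, assuming the boot
loader passes an optional override string. The function name and its
parameters are hypothetical; only the /stand/$MACHINE/$RELEASE/modules
layout comes from the text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the default module directory into buf. If alt_release is
 * non-NULL and non-empty it replaces the kernel's own release string.
 * Returns 0 on success, -1 if buf is too small.
 */
static int
module_base_path(char *buf, size_t len, const char *machine,
    const char *release, const char *alt_release)
{
	const char *rel = release;
	int n;

	assert(buf != NULL && machine != NULL && release != NULL);
	if (alt_release != NULL && strlen(alt_release) > 0)
		rel = alt_release;
	n = snprintf(buf, len, "/stand/%s/%s/modules", machine, rel);
	return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

The version strings used when calling this are whatever the kernel and boot
loader supply; nothing here checks that the alternate directory exists or
that its modules match the running kernel.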
137
138 The following statement regarding this issue was previously issued
139 by the "core" group:
140
141 Date: Fri, 27 Jul 2012 08:02:56 +0200
142 From: <redacted>
143 To: <redacted>
144 Subject: Core statement on directory naming for kernel modules
145
146 The core group would also like to see the following changes in
147 the near future:
148
149 Implementation of the scheme described by Luke Mewburn in
150 <http://mail-index.NetBSD.org/current-users/2009/05/10/msg009372.html>
151 to allow a kernel and its modules to be kept together.
152 Changes to config(1) to extend the existing notion of whether or not
153 an option is built-in to the kernel, to three states: built-in, not
154 built-in but loadable as a module, entirely excluded and not even
155 loadable as a module.
156
157
158 15. The existing config(5) framework provides an excellent mechanism
159 for managing the content of kernels. Unfortunately, this mechanism
160 does not apply to modules, and instead we need to manually manage
161 a list of files to include in the module, the set of compiler
162 definitions with which to build those files, and also the set of
163 other modules on which a module depends. We really need a common
164 mechanism to define and build modules, whether they are included as
165 "built-in" modules or as separately-loadable modules.
166
167 (From John Nemeth) Some sort of mechanism for a (driver) module
168 to declare the list of vendor/product/other tuples that it can
169 handle would be nice. Perhaps this would go in the module's .plist
170 file? (See #17 below.) Then drivers that scan for children might
171 be able to search the modules directory for an "appropriate" module
172 for each child, and auto-load.
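
One way the vendor/product idea could look, as a sketch only. The structs,
the module names and the IDs below are all made up for illustration; no such
table exists today, and a real design would more likely read the tuples from
each module's .plist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical device identifier, as a bus scan might report it. */
struct devid {
	uint16_t vendor;
	uint16_t product;
};

/* Hypothetical declaration: "module X can handle device ID Y". */
struct modmatch {
	const char *module;
	struct devid id;
};

/* Made-up entries; in practice these would be collected from modules. */
static const struct modmatch match_table[] = {
	{ "exdrv_a", { 0x1234, 0x0001 } },
	{ "exdrv_a", { 0x1234, 0x0002 } },
	{ "exdrv_b", { 0x5678, 0x0010 } },
};

/* Return the module that declared support for id, or NULL if none did. */
static const char *
module_for_device(const struct devid *id)
{
	size_t n = sizeof(match_table) / sizeof(match_table[0]);

	assert(id != NULL);
	for (size_t i = 0; i < n; i++) {
		if (match_table[i].id.vendor == id->vendor &&
		    match_table[i].id.product == id->product)
			return match_table[i].module;
	}
	return NULL;
}
```

A driver that scans for children could call something like this for each
unclaimed child and auto-load the module named in the result.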
173
174 16. PR kern/52821 exposes another limitation of config(1) WRT modules.
175 Here, an explicit device attachment is required, because we cannot
176 rely on all kernel configs to contain the attribute at which the
177 modular driver wants to attach. Unfortunately, the explicit
178 attachment causes conflicts with built-in drivers. (See the PR for
179 more details.)
180
181 17. (From John Nemeth) It would be potentially useful if a "push" from
182 the bootloader could also load-and-push a module's .plist (if it
183 exists).
184
185 18. (From John Nemeth) Some sort of schema for a module to declare the
186 options (or other things?) that the module understands. This could
187 result in a module-options editor to manipulate the .plist
188
189 19. (From John Nemeth) Currently, the order of module initialization is
190 based on module classes and declared dependencies. It might be
191 useful to have additional classes (or sub-classes) with additional
192 invocations of module_class_init(), and it might be useful to have a
193 non-dependency mechanism to provide "IF module-A and module-B are
194 BOTH present, module-A needs to be initialized before module-B".
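
The "non-dependency" ordering could be sketched as ordering hints that only
apply when both modules are present. Everything below (struct order_hint,
init_order, the fixed NMOD) is hypothetical and is not how
module_class_init() works today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NMOD 3

/* Hint: if both are present, initialize 'first' before 'second'. */
struct order_hint {
	int first;
	int second;
};

/*
 * Fill order[] with the indices of the present modules such that every
 * hint whose two modules are both present is honoured. Hints that name
 * an absent module are ignored, so they never force a load.
 * Returns the number of entries written, or -1 if the hints form a cycle.
 */
static int
init_order(const bool present[NMOD], const struct order_hint *hints,
    size_t nhints, int order[NMOD])
{
	bool done[NMOD] = { false, false, false };
	int n = 0, want = 0;

	for (size_t h = 0; h < nhints; h++) {
		assert(hints[h].first >= 0 && hints[h].first < NMOD);
		assert(hints[h].second >= 0 && hints[h].second < NMOD);
	}
	for (int i = 0; i < NMOD; i++) {
		if (present[i])
			want++;
	}
	while (n < want) {
		bool progress = false;

		for (int i = 0; i < NMOD; i++) {
			bool blocked = false;

			if (!present[i] || done[i])
				continue;
			for (size_t h = 0; h < nhints; h++) {
				int f = hints[h].first;

				if (hints[h].second == i && present[f] &&
				    !done[f])
					blocked = true;
			}
			if (!blocked) {
				done[i] = true;
				order[n++] = i;
				progress = true;
			}
		}
		if (!progress)
			return -1;
	}
	return n;
}
```

The difference from a declared dependency is in the absent case: when the
"first" module is not present, the hint is dropped instead of pulling that
module in.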
195
196 20. (Long-ago memory rises to the surface) Note that currently there is
197 nothing that requires a module's name to correspond in any way with
198 the name of the file from which the module is loaded. Thus, it is
199 possible to attempt to access device /dev/x, discover that there is
200 no such device so we autoload /stand/.../x/x.kmod and initialize
201 the module loaded, even if the loaded module is for some other
202 device entirely!
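
A check of the kind hinted at above might look like the following sketch. It
assumes the x/x.kmod naming shown in the text; the function name is
hypothetical, and nothing like it is enforced by the current loader.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true when the file name in path, minus its ".kmod" suffix,
 * equals the name the loaded module declares for itself.
 */
static bool
module_name_matches_file(const char *path, const char *modname)
{
	const char *base;
	size_t blen;
	const size_t suffix = strlen(".kmod");

	assert(path != NULL && modname != NULL);
	base = strrchr(path, '/');
	base = (base != NULL) ? base + 1 : path;
	blen = strlen(base);
	if (blen <= suffix || strcmp(base + blen - suffix, ".kmod") != 0)
		return false;
	blen -= suffix;
	return strlen(modname) == blen && strncmp(base, modname, blen) == 0;
}
```

With such a check, an autoload triggered by /dev/x could refuse (or at least
warn about) a file x.kmod whose module declares a different name.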
203
204 21. We currently do not support "weak" symbols in the in-kernel linker.
205 It would take some serious thought to get such support right. For
206 example, consider module A with a weak reference to symbol S which
207 is defined in module B. If module B is loaded first, and then
208 module A, the symbol gets resolved. But if module A is loaded first,
209 the symbol won't be resolved. If we subsequently load module B, we
210 would have to "go back" and re-run the linker for module A.
211
212 Additional difficulties arise when the module which defines the
213 weak symbol gets unloaded. Then, you would need to re-run the
214 linker and _unresolve_ the weak symbol which is no longer defined.
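
The load-order problem can be shown with a toy model. This is a user-space
simulation with hypothetical names (struct toy_module, toy_load), not the
in-kernel linker: each "module" defines at most one symbol and has at most
one weak reference, and a reference is only resolved at the moment its own
module is loaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct toy_module {
	const char *name;
	const char *defines;   /* symbol this module provides, or NULL */
	const char *weak_ref;  /* symbol weakly referenced, or NULL */
	bool loaded;
	bool weak_resolved;
};

/*
 * "Load" mods[idx]: resolve its weak reference against the modules that
 * are already loaded. Modules loaded earlier are NOT revisited, which
 * is exactly the gap described in the text.
 */
static void
toy_load(struct toy_module *mods, size_t n, size_t idx)
{
	struct toy_module *m;

	assert(mods != NULL && idx < n);
	m = &mods[idx];
	m->loaded = true;
	m->weak_resolved = false;
	if (m->weak_ref == NULL)
		return;
	for (size_t i = 0; i < n; i++) {
		if (i != idx && mods[i].loaded && mods[i].defines != NULL &&
		    strcmp(mods[i].defines, m->weak_ref) == 0)
			m->weak_resolved = true;
	}
}
```

Loading B and then A leaves A's reference resolved; loading A and then B
leaves it unresolved unless the loader goes back and relinks A. Unloading
would need the mirror-image pass to unresolve the reference.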
215
216 22. A fairly large number of modules still require a maximum warning
217 level of WARNS=3 due to signed-vs-unsigned integer comparisons. We
218 really ought to clean these up. (I haven't looked at them in any
219 detail, but I have to wonder how code that compiles cleanly in a
220 normal kernel has these issues when compiled in a module, when both
221 are done with WARNS=5).
222
223 23. The current process of "load all the emulation/exec modules in case
224 one of them might handle the image currently being exec'd" isn't
225 really cool. (See sys/kern/kern_exec.c?) It ends up auto-loading
226 a whole bunch of modules, involving file-system access, just to have
227 most of the modules getting unloaded a few seconds later. We don't
228 have any way to identify which module is needed for which image (ie,
229 we can't determine that an image needs compat_linux vs some other
230 module).
231
232 24. Details are no longer remembered, but there are some issues with
233 building xen-variant modules (on amd64, and likely i386). In some
234 cases, wrong headers are included (because a XEN-related #define
235 is missing), but even if you add the definition some headers get
236 included in the wrong order. One particular fallout from this is
237 the inability to have a compat version of x86_64 cpu-microcode
238 module. PR port-xen/53130
239
240 This was likely fixed by Chuck Silvers' change of 2020-07-04, which
241 removed the differences between the xen and non-xen module ABIs.
242 As of 2021-05-28 the cpu-microcode functionality has once again
243 been enabled for i386 and amd64 compat_60 modules.
244