/* $NetBSD: TODO.modules,v 1.22.2.1 2021/05/31 22:06:55 cjep Exp $ */

Some notes on the limitations of our current (as of 7.99.35) module
subsystem.  This list was triggered by an Email exchange between
christos and pgoyette.

 1. Built-in drivers can't depend on modularized drivers (the kernel
    attempts to load the modularized drivers as built-ins).

	The assumption is that dependencies are loaded before those
	modules which depend on them.  At load time, a module's
	undefined global symbols are resolved;  if any symbols can't
	be resolved, the load fails.  Similarly, if a module is
	included in (built into) the kernel, all of its symbols must
	be resolvable by the linker, otherwise the link fails.

	There are ways around this (such as having the parent
	module's initialization command recursively call the module
	load code), but they're often gross hacks.

	Another alternative (which is used by ppp) is to provide a
	"registration" mechanism for the "child" modules, and then when
	the need for a specific child module is encountered, use
	module_autoload() to load the child module.  Of course, this
	requires that the parent module know about all potentially
	loadable children.
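
	A minimal user-space sketch of that registration-plus-autoload
	pattern follows.  Every name in it (child_register,
	fake_autoload, parent_call_child, child_a, child_b) is made up
	for illustration; fake_autoload() merely stands in for
	module_autoload(), and this is not a copy of what ppp does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model; none of these names are NetBSD APIs. */
#define MAX_CHILDREN 8

typedef int (*child_handler_t)(int arg);

struct child_slot {
	const char *modname;		/* module that provides this child */
	child_handler_t handler;	/* NULL until the child registers */
};

/* The parent has to list every potentially loadable child up front. */
static struct child_slot children[MAX_CHILDREN] = {
	{ "child_a", NULL },
	{ "child_b", NULL },
};

/* Called from a child module's init code to register itself. */
int
child_register(const char *modname, child_handler_t handler)
{
	for (size_t i = 0; i < MAX_CHILDREN; i++) {
		if (children[i].modname != NULL &&
		    strcmp(children[i].modname, modname) == 0) {
			children[i].handler = handler;
			return 0;
		}
	}
	return -1;	/* the parent does not know this child */
}

static int
child_a_handler(int arg)
{
	return arg + 1;
}

/* Stand-in for module_autoload(): only child_a exists "on disk". */
int
fake_autoload(const char *modname)
{
	if (strcmp(modname, "child_a") == 0)
		return child_register("child_a", child_a_handler);
	return -1;
}

/* Parent side: load the child on first use, then call into it. */
int
parent_call_child(const char *modname, int arg, int *result)
{
	assert(result != NULL);
	for (size_t i = 0; i < MAX_CHILDREN; i++) {
		struct child_slot *c = &children[i];

		if (c->modname == NULL || strcmp(c->modname, modname) != 0)
			continue;
		if (c->handler == NULL && fake_autoload(modname) != 0)
			return -1;	/* load failed */
		*result = c->handler(arg);
		return 0;
	}
	return -1;	/* unknown child */
}
```

	The static children[] table is the weak point the text
	describes: a child that is not listed there can never be
	loaded on demand, however it is packaged.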

 2. Currently, config(1) has no way to "no define" drivers
	XXX: I don't think this is true anymore. I think we can
	undefine drivers now, see MODULAR in amd64, which does
	no ath* and no select sppp*

 3. It is not always obvious by their names which drivers/options
    correspond to which modules.

 4. Right now critical drivers that would need to be pre-loaded (ffs,
    exec_elf64) are still built-in so that we don't need to alter the boot
    blocks to boot.

	This was a conscious decision by core@ some years ago.  It is
	not a requirement that ffs or exec_* be built-in.  The only
	requirement is that the root file-system's module must be
	available when the module subsystem is initialized, in order
	to load other modules.  This can be accomplished by having the
	boot loader "push" the module at boot time.  (It used to do
	this in all cases; currently the "push" only occurs if the
	booted filesystem is not ffs.)

 5. Not all parent bus drivers are capable of rescan, so some drivers
    just have to be built-in.

 6. Many (most?) drivers are not yet modularized

 7. There are currently no provisions for autoconfig to figure out which
    modules are needed, and thus to load the required modules.

	In the "normal" built-in world, autoconfigure can only ask
	existing drivers if they're willing to manage (ie, attach) a
	device.  Removing the built-in drivers tends to limit the
	availability of possible managers.  There's currently no
	mechanism for identifying and loading drivers based on what
	devices might be found.

 8. Even for existing modules, there are "surprise" dependencies with
    code that has not yet been modularized.

	For example, even though the bpf code has been modularized,
	there is some shared code in bpf_filter.c which is needed by
	both ipfilter and ppp.  ipf is already modularized, but ppp
	is not.  Thus, even though bpf_filter is modular, it MUST be
	included as a built-in module if you also have ppp in your
	configuration.

	Another example is the sysmon_taskq module.  It is required by
	other parts of the sysmon subsystem, including the
	"sysmon_power" module.  Unfortunately, even though the
	sysmon_power code is modularized, it is referenced by the
	acpi code which has not been modularized.  Therefore, if your
	configuration has acpi, then you must include the "sysmon_power"
	module built into the kernel.  And therefore you also need to
	have "sysmon_taskq" and "sysmon" built-in since "sysmon_power"
	references them.

 9. As a corollary to #8 above, having dependencies on modules from code
    which has not been modularized makes it extremely difficult to test
    the module code adequately.  Testing of module code should include
    both testing-as-a-built-in module and testing-as-a-loaded-module, and
    all dependencies need to be identified.

10. The current /stand/$ARCH/$VERSION/modules/ hierarchy won't scale as
    we get more and more modules.  There are hundreds of potential device
    driver modules.

11. There currently isn't any good way to handle attachment-specific
    modules.  The build infrastructure (ie, sys/modules/Makefile) doesn't
    readily lend itself to bus-specific modules irrespective of $ARCH,
    and maintaining distrib/sets/lists/modules/* is awkward at best.

    Furthermore, devices such as ld(4), which can attach to a large set
    of parent devices, need to be modified.  The parent devices need to
    provide a common attribute (for example, ld_bus), and the ld driver
    should attach to that attribute rather than to each parent.  But
    currently, config(1) doesn't handle this - it doesn't allow an
    attribute to be used as the device tree's pseudo-root. The current
    directory structure where driver foo is split between ic/foo.c
    and bus1/foo_bus1.c ... busn/foo_busn.c is annoying. It would be
    better to switch to the FreeBSD model which puts all the driver
    files in one directory.

12. Item #11 gets even murkier when a particular parent can provide more
    than one attribute.

13. It seems that we might want some additional sets-lists "attributes"
    to control contents of distributions.  As an example, many of our
    architectures have PCI bus capabilities, but not all.  It is rather
    painful to need to maintain individual architectures' modules/md_*
    sets lists, especially when we already have to conditionalize the
    build of the modules based on architecture.  If we had a single
    "attribute" for PCI-bus-capable, the same attribute could be used to
    select which modules to build and which modules from modules/mi to
    include in the release.  (This is not limited to PCI;  recently we
    encountered similar issues with the spkr aka spkr_synth module.)

14. As has been pointed out more than once, the current method of storing
    modules in a version-specific subdirectory of /stand is sub-optimal
    and leads to much difficulty and/or confusion.  A better mechanism of
    associating a kernel and its modules needs to be developed.  Some
    have suggested having a top-level directory (say, /netbsd) with a
    kernel and its modules at /netbsd/kernel and /netbsd/modules/...
    Whatever new mechanism we arrive at will probably require changes to
    installation procedures and bootstrap code, and will need to handle
    both the new and old mechanisms for compatibility.

    One additional option mentioned is to be able to specify, at boot
    loader time, an alternate value for the os-release portion of the
    default module path,  i.e. /stand/$MACHINE/$ALT-RELEASE/modules/

    The following statement regarding this issue was previously issued
    by the "core" group:

    Date: Fri, 27 Jul 2012 08:02:56 +0200
    From: <redacted>
    To: <redacted>
    Subject: Core statement on directory naming for kernel modules

    The core group would also like to see the following changes in
    the near future:

       Implementation of the scheme described by Luke Mewburn in
       <http://mail-index.NetBSD.org/current-users/2009/05/10/msg009372.html>
       to allow a kernel and its modules to be kept together.
       Changes to config(1) to extend the existing notion of whether or not
       an option is built-in to the kernel, to three states: built-in, not
       built-in but loadable as a module, entirely excluded and not even
       loadable as a module.


15. The existing config(5) framework provides an excellent mechanism
    for managing the content of kernels.  Unfortunately, this mechanism
    does not apply for modules, and instead we need to manually manage
    a list of files to include in the module, the set of compiler
    definitions with which to build those files, and also the set of
    other modules on which a module depends.  We really need a common
    mechanism to define and build modules, whether they are included as
    "built-in" modules or as separately-loadable modules.

    (From John Nemeth) Some sort of mechanism for a (driver) module
    to declare the list of vendor/product/other tuples that it can
    handle would be nice.  Perhaps this would go in the module's .plist
    file?  (See #17 below.)  Then drivers that scan for children might
    be able to search the modules directory for an "appropriate" module
    for each child, and auto-load.
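
    As a rough illustration of the tuple idea, the sketch below keeps a
    per-module match table and looks up the module for a given child.
    It is a user-space model under stated assumptions: the struct
    layouts, the bus strings, the vendor/product numbers and the module
    names "foo" and "bar" are all invented, and nothing here reflects an
    existing .plist format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One tuple a driver module claims to handle (all values invented). */
struct dev_match {
	const char *bus;
	uint16_t vendor;
	uint16_t product;
};

struct mod_desc {
	const char *name;
	const struct dev_match *matches;
	size_t nmatches;
};

static const struct dev_match foo_matches[] = {
	{ "pci", 0x1234, 0x0001 },
	{ "pci", 0x1234, 0x0002 },
};
static const struct dev_match bar_matches[] = {
	{ "usb", 0xabcd, 0x0010 },
};

/* What a scan of the modules directory might have collected. */
static const struct mod_desc known_modules[] = {
	{ "foo", foo_matches, 2 },
	{ "bar", bar_matches, 1 },
};

/* Return the name of the first module claiming the tuple, or NULL. */
const char *
find_module_for(const char *bus, uint16_t vendor, uint16_t product)
{
	size_t nmods = sizeof(known_modules) / sizeof(known_modules[0]);

	assert(bus != NULL);
	for (size_t i = 0; i < nmods; i++) {
		const struct mod_desc *m = &known_modules[i];

		for (size_t j = 0; j < m->nmatches; j++) {
			const struct dev_match *d = &m->matches[j];

			if (strcmp(d->bus, bus) == 0 &&
			    d->vendor == vendor && d->product == product)
				return m->name;
		}
	}
	return NULL;
}
```

    A scanning parent would call find_module_for() once per unclaimed
    child and hand any non-NULL result to the autoload path.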

16. PR kern/52821 exposes another limitation of config(1) WRT modules.
    Here, an explicit device attachment is required, because we cannot
    rely on all kernel configs to contain the attribute at which the
    modular driver wants to attach.  Unfortunately, the explicit
    attachment causes conflicts with built-in drivers.  (See the PR for
    more details.)

17. (From John Nemeth) It would be potentially useful if a "push" from
    the bootloader could also load-and-push a module's .plist (if it
    exists).

18. (From John Nemeth) Some sort of schema for a module to declare the
    options (or other things?) that the module understands.  This could
    result in a module-options editor to manipulate the .plist.

19. (From John Nemeth) Currently, the order of module initialization is
    based on module classes and declared dependencies.  It might be
    useful to have additional classes (or sub-classes) with additional
    invocations of module_class_init(), and it might be useful to have a
    non-dependency mechanism to provide "IF module-A and module-B are
    BOTH present, module-A needs to be initialized before module-B".
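
    One way such a non-dependency ordering rule could behave is
    sketched below: a hint only constrains the order when both of its
    modules are present, and is ignored otherwise.  This is a user-space
    model; struct order_hint and order_modules() are hypothetical names,
    not part of the existing module code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODS 16

struct order_hint {
	const char *first;	/* initialize this one ... */
	const char *second;	/* ... before this one, if both are present */
};

static int
find_mod(const char *const *mods, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(mods[i], name) == 0)
			return (int)i;
	return -1;
}

/*
 * Fill out[] with indices into mods[] in an order that honours every
 * hint whose two modules are both present.  Returns 0 on success and
 * -1 if the applicable hints form a cycle.
 */
int
order_modules(const char *const *mods, size_t n,
    const struct order_hint *hints, size_t nhints, size_t *out)
{
	int done[MAX_MODS] = { 0 };
	size_t emitted = 0;

	assert(n <= MAX_MODS);
	while (emitted < n) {
		size_t before = emitted;

		for (size_t i = 0; i < n; i++) {
			int blocked = 0;

			if (done[i])
				continue;
			for (size_t h = 0; h < nhints; h++) {
				int a = find_mod(mods, n, hints[h].first);
				int b = find_mod(mods, n, hints[h].second);

				/* Applies only when both ends are present. */
				if (a >= 0 && b == (int)i && !done[a])
					blocked = 1;
			}
			if (!blocked) {
				done[i] = 1;
				out[emitted++] = i;
			}
		}
		if (emitted == before)
			return -1;	/* no progress: cyclic hints */
	}
	return 0;
}
```

    Unlike a declared dependency, a hint never causes the other module
    to be loaded; it only affects the order of modules already present.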

20. (Long-ago memory rises to the surface) Note that currently there is
    nothing that requires a module's name to correspond in any way with
    the name of the file from which the module is loaded.  Thus, it is
    possible to attempt to access device /dev/x, discover that there is
    no such device, autoload /stand/.../x/x.kmod, and initialize the
    module loaded, even if the loaded module is for some other device
    entirely!
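
    A check of the kind that is missing could be as small as the string
    comparison sketched here.  kmod_name_matches() is a hypothetical
    helper, not an existing kernel function, and it assumes the module's
    own name is already known to the caller.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if the file named by path (for example ".../x/x.kmod") is
 * named after modname, 0 otherwise.  Purely a string check.
 */
int
kmod_name_matches(const char *path, const char *modname)
{
	const char *base, *dot;
	size_t len;

	assert(path != NULL && modname != NULL);

	/* Strip any leading directories. */
	base = strrchr(path, '/');
	base = (base != NULL) ? base + 1 : path;

	/* Require a ".kmod" suffix. */
	dot = strrchr(base, '.');
	if (dot == NULL || strcmp(dot, ".kmod") != 0)
		return 0;

	/* Compare what is left with the module's declared name. */
	len = (size_t)(dot - base);
	return strlen(modname) == len && strncmp(base, modname, len) == 0;
}
```

    Whether a mismatch should fail the load or merely be logged is a
    policy question this sketch does not try to answer.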

21. We currently do not support "weak" symbols in the in-kernel linker.
    It would take some serious thought to get such support right.  For
    example, consider module A with a weak reference to symbol S which
    is defined in module B.  If module B is loaded first, and then
    module A, the symbol gets resolved.  But if module A is loaded first,
    the symbol won't be resolved.  If we subsequently load module B, we
    would have to "go back" and re-run the linker for module A.

    Additional difficulties arise when the module which defines the
    weak symbol gets unloaded.  Then, you would need to re-run the
    linker and _unresolve_ the weak symbol which is no longer defined.
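
    The load-order problem can be modelled with a toy symbol table, as
    sketched below.  This is a user-space illustration only: struct
    symtab and the sym_*() functions are invented names, and resolving
    an undefined weak reference to 0 follows the usual ELF convention
    rather than anything the in-kernel linker does today.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 8

struct symtab {
	const char *names[MAX_SYMS];
	unsigned long addrs[MAX_SYMS];
	size_t n;
};

/* The defining module was loaded: publish its symbol. */
void
sym_define(struct symtab *t, const char *name, unsigned long addr)
{
	assert(t->n < MAX_SYMS);
	t->names[t->n] = name;
	t->addrs[t->n] = addr;
	t->n++;
}

/* The defining module was unloaded: withdraw the symbol again. */
void
sym_undefine(struct symtab *t, const char *name)
{
	for (size_t i = 0; i < t->n; i++) {
		if (strcmp(t->names[i], name) == 0) {
			/* Move the last entry into the freed slot. */
			t->names[i] = t->names[t->n - 1];
			t->addrs[i] = t->addrs[t->n - 1];
			t->n--;
			return;
		}
	}
}

/* Weak lookup: an undefined symbol resolves to 0 instead of failing. */
unsigned long
sym_resolve_weak(const struct symtab *t, const char *name)
{
	for (size_t i = 0; i < t->n; i++)
		if (strcmp(t->names[i], name) == 0)
			return t->addrs[i];
	return 0;
}
```

    If module A calls sym_resolve_weak() for S before module B has
    called sym_define(), A is left holding 0 and nothing updates it
    later; likewise a value A obtained earlier goes stale after
    sym_undefine().  Both cases are the "re-run the linker" step
    described above.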

22. A fairly large number of modules still require a maximum warning
    level of WARNS=3 due to signed-vs-unsigned integer comparisons.  We
    really ought to clean these up.  (I haven't looked at them in any
    detail, but I have to wonder how code that compiles cleanly in a
    normal kernel has these issues when compiled in a module, when both
    are done with WARNS=5).

23. The current process of "load all the emulation/exec modules in case
    one of them might handle the image currently being exec'd" isn't
    really cool.  (See sys/kern/kern_exec.c?)  It ends up auto-loading
    a whole bunch of modules, involving file-system access, just to have
    most of the modules getting unloaded a few seconds later.  We don't
    have any way to identify which module is needed for which image (ie,
    we can't determine that an image needs compat_linux vs some other
    module).

24. Details are no longer remembered, but there are some issues with
    building xen-variant modules (on amd64, and likely i386).  In some
    cases, wrong headers are included (because a XEN-related #define
    is missing), but even if you add the definition some headers get
    included in the wrong order.  One particular fallout from this is
    the inability to have a compat version of x86_64 cpu-microcode
    module.  PR port-xen/53130

    This was likely fixed by a change Chuck Silvers made on 2020-07-04,
    which removed the differences between the xen and non-xen module
    ABIs.  As of 2021-05-28 the cpu-microcode functionality has once
    again been enabled for i386 and amd64 compat_60 modules.