TODO revision 1.26
o Call module as module.

  Until now, everything is called as attribute.  Separate module from it:

        - Module is a collection of code (*.[cSo]), and provides a function.
          Module can depend on other modules.

        - Attribute provides metadata for modules.  One module can have
          multiple attributes.  Attribute doesn't generate a module (*.o,
          *.ko).

o Emit everything (ioconf.*, Makefile, ...) per-attribute.

  config(9) related metadata (cfdriver, cfattach, cfdata, ...) should be
  collected using linker.  Create ELF sections like
  .{rodata,data}.config.{cfdriver,cfattach,cfdata}.  Provide reference
  symbols (e.g. cfdriverinit[]) using linker script.  Sort entries by name
  to look up entries by binary search in kernel.

o Generate modular(9) related information.  Especially module dependency.

  At this moment modular(9) modules hardcode dependency in *.c using the
  MODULE() macro:

        MODULE(MODULE_CLASS_DRIVER, hdaudio, "pci");

  This information already exists in config(5) definitions (files.*).
  Extend config(5) to be able to specify module's class.

  Ideally this module metadata is kept somewhere in ELF headers, so that
  loaders (e.g. boot(8)) can easily read it.  One idea is to abuse DYNAMIC
  sections to record dependency, as shared libraries do.  (Feasibility
  unknown.)

o Rename "interface attribute" to "bus".

  Instead of

        define  audiobus {}
        attach  audio at audiobus

  Do like this

        defbus  audiobus {}
        attach  audio at audiobus

  Always provide xxxbusprint() (and xxxbussubmatch if multiple children).
  Extend struct cfiattrdata like:

        struct cfiattrdata {
                const char *ci_name;
                cfprint_t ci_print;
                cfsubmatch_t ci_submatch;
                int ci_loclen;
                const struct cflocdesc ci_locdesc[];
        };

o Simplify child configuration API

  With said struct cfiattrdata extension, config_found*() can omit
  print/submatch args.
  If the found child is known (e.g., "pcibus" creating
  "pci"):

        config_found(self, "pcibus");

  If finding unknown children (e.g. "pci" finding pci devices):

        config_find(self, "pci", locs, aux);

o Retire "attach foo at bar with foo_bar.c"

  Most of these should be rewritten by defining a common interface attribute
  "foobus", instead of writing multiple attachments.  com(4), ld(4), ehci(4)
  are typical examples.  For ehci(4), EHCI-capable controller drivers implement
  the "ehcibus" interface, like:

        define  ehcibus {}
        device  imxehci: ehcibus

  These drivers' attach functions call config_found() to attach ehci(4) via
  the "ehcibus" interface attribute, instead of calling ehci_init() directly.
  Same for com(4) (com_attach_subr()) and ld(4) (ldattach()).

o Sort objects in more reasonable order.

  Put machdep.ko in the lowest address.  uvm.ko and kern.ko follow.

  Kill alphabetical sort (${OBJS:O}) in sys/conf/Makefile.kern.inc.

  Use ldscript.  Do like this

        .text :
        AT (ADDR(.text) & 0x0fffffff)
        {
                *(.text.machdep.locore.entry)
                *(.text.machdep.locore)
                *(.text.machdep)
                *(.text)
                *(.text.*)
                :

  Kill linker definitions in sys/conf/Makefile.kern.inc.

o Differentiate "options" and "flags"/"params".

  "options" enable features by adding *.c files (via attributes).

  "flags" and "params" change the contents of *.c files.  They neither add
  *.c files to the resulting kernel nor build attributes (modules).

o Make flags/params per attributes (modules).

  Basically flags and params are cpp(1) #define's generated in opt_*.h.  Make
  them local to one attribute (module).  Flags/params which affect files
  across attributes (modules) are possible, but should be discouraged.

o Generate things only by definitions.

  In the ideal dynamically modular world, "selection" will be done not at
  compile time but at runtime.
  Users select the modules they want by
  dynamically loading them.

  This means that the system provides all choices; that is, build all modules
  in the source tree.  Necessary information is defined in the "definition"
  part.

o Split cfdata.

  cfdata is a set of pattern matching rules to enable devices at runtime device
  auto-configuration.  It is pure data and can (should) be generated separately
  from the code.

o Allow easier adding and removing of options.

  It should be possible to add or remove options, flags, etc.,
  without regard to whether or not they are already defined.
  For example, a configuration like this:

        include GENERIC
        options FOO
        no options BAR

  should work regardless of whether or not options FOO and/or
  options BAR were defined in GENERIC.  It should not give
  errors like "options BAR was already defined" or "options FOO
  was not defined".

o Introduce "class".

  Every module should be classified as at least one class, as modular(9)
  modules already do.  For example, file systems are marked as "vfs", network
  protocols are "netproto".

  Consider merging "devclass" into "class".

  For syntax clarity, class names could be used as a keyword to select the
  class's instance module:

        # Define net80211 module as netproto class
        class netproto
        define net80211: netproto

        # Select net80211 to be builtin
        netproto net80211

  Accordingly device/attach selection syntax should be revisited.

o Support kernel constructor/destructor (.kctors/.kdtors)

  Initialization and finalization should be called via constructors and
  destructors.  Don't hardcode those sequences as sys/kern/init_main.c:main()
  does.

  The order of .kctors/.kdtors is resolved by dependency.  The difference from
  userland is that in the kernel, modules that others depend on are located at
  lower addresses; the "machdep" module is the lowest.
  Thus the lowest entry in .kctors must be
  executed first.

  The .kctors/.kdtors entries are executed by kernel's main() function, unlike
  userland where start code executes .ctors/.dtors before main().  The hardcoded
  sequence of various subsystem initializations in init_main.c:main() will be
  replaced by an array of .kctors invocations, and #ifdef's there will be gone.

o Hide link-set in the final kernel.

  Link-set is used to collect references (pointers) at link time.  It relies on
  the ld(1) behavior that it automatically generates `__start_X' and `__stop_X'
  symbols for the section `X' to reduce coding.

  Don't allow kernel subsystems to create random ELF sections.

  Pre-define all the available link-set names and pre-generate a linker script
  to merge them into .rodata.

  (For modular(9) modules, `link_set_modules' is looked up by kernel loader.
  Provide only it.)

  Provide a way for 3rd party modules to declare extra link-sets.

o Shared kernel objects.

  Since NetBSD has not established a clear kernel ABI, every single kernel
  has to build all the objects on its own.  As a result, similar kernels
  (e.g. evbarm kernels) repeatedly compile similar objects, which is a waste of
  energy & space.

  Share them if possible.  For evb* ports, ideally everything except machdep.ko
  should be shared.

  While leaving optimizations as options (CPU specific optimizations, inlined
  bus_space(9) operations, etc.) for users, the official binaries build
  provided by TNF should be as portable as possible.

o Always use explicit kernel linker script.

  ld(1) has an option -T <ldscript> to use a given linker script.  If not
  specified, a default, built-in linker script, mainly meant for userland
  programs, is used.

  Currently m68k, sh3, and vax don't have kernel linker scripts.
  These work
  because these have no constraints about page boundary; they map and access
  kernel .text/.data in the same way.

o Control ELF sections using linker script.

  Now kernel is linked and built directly from object files (*.o).  Each port
  has an MD linker script, which does everything needed to be done at link
  time.  As a result, they do everything from MI alignment restriction
  (read_mostly, cacheline_aligned) to load address specification for external
  boot loaders.

  Make this into multiple stages to make linkage more structured.  Especially,
  reserve the final link for purely MD purposes.  Note that in modular build,
  *.ko are shared between build of kernel and modular(9) modules (*.kmod).

        Monolithic build:
                *.o        ---> netbsd.ko   Generic MI linkage
                netbsd.ko  ---> netbsd.ro   Kernel MI linkage
                netbsd.ro  ---> netbsd      Kernel MD linkage

        Modular build (kernel):
                *.o        ---> *.ko        Generic + Per-module MI linkage
                *.ko       ---> netbsd.ro   Kernel MI linkage
                netbsd.ro  ---> netbsd      Kernel MD linkage

        Modular build (module):
                *.o        ---> *.ko        Generic + Per-module MI linkage
                *.ko       ---> *.ro        Modular MI linkage
                *.ro       ---> *.kmod      Modular MD linkage

  Generic MI linkage is for processing MI linkage that can be applied generally.
  Data section alignment (.data.read_mostly and .data.cacheline_aligned) is
  processed here.

  Per-module MI linkage is for modules that want some ordering.  For example,
  machdep.ko wants to put entry code at the top of .text and .data.

  Kernel MI linkage is for collecting kernel global section data, which is what
  link-set is used for now.  Once they are collected and symbols to the ranges
  are assigned, those sections are merged into the pre-existing sections
  (.rodata) because link-set sections in "netbsd" will never be interpreted by
  external loaders.
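
  As a rough user-space illustration of the kind of data the kernel MI linkage
  stage collects, the sketch below builds a link-set by hand.  It assumes an
  ELF toolchain (GCC or Clang with GNU ld or a compatible linker) that
  synthesizes __start_X/__stop_X for a section whose name is a valid C
  identifier.  The section name demo_set, struct demo_entry and DEMO_ENTRY()
  are hypothetical and are not the kernel's real link-set macros.

```c
#include <string.h>

/* Hypothetical entry type collected by the demo link-set. */
struct demo_entry {
	const char *name;
};

/*
 * Place a pointer to the entry into the ELF section "demo_set".
 * "used" keeps the otherwise unreferenced pointer from being discarded
 * by the compiler.
 */
#define DEMO_ENTRY(sym, str)						\
	static const struct demo_entry sym = { str };			\
	static const struct demo_entry *const sym##_ptr			\
	    __attribute__((section("demo_set"), used)) = &sym

DEMO_ENTRY(e_machdep, "machdep");
DEMO_ENTRY(e_uvm, "uvm");
DEMO_ENTRY(e_kern, "kern");

/* Boundary symbols generated by the linker for section "demo_set". */
extern const struct demo_entry *const __start_demo_set[];
extern const struct demo_entry *const __stop_demo_set[];

/* Number of pointers the linker gathered into the section. */
static int
demo_set_count(void)
{
	return (int)(__stop_demo_set - __start_demo_set);
}

/* Linear search over the collected entries; returns 1 if name is present. */
static int
demo_set_has(const char *name)
{
	const struct demo_entry *const *p;

	for (p = __start_demo_set; p < __stop_demo_set; p++)
		if (strcmp((*p)->name, name) == 0)
			return 1;
	return 0;
}
```

  Only the range between the two boundary symbols matters to the code that
  walks the set, which is why the section itself can later be merged into
  .rodata as long as the symbols are kept.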

  Kernel MD linkage is used purely for MD purposes, that is, how kernels are
  loaded by external loaders.  It might be possible that one kernel relocatable
  (netbsd.ro) is linked into multiple final kernel images (netbsd) for different
  load addresses.

  Modular MI linkage is to prepare a module to be loadable as modular(9).  This
  may add some extra sections and/or symbols.

  Modular MD linkage is again for pure MD purposes like kernel MD linkage.
  Adjustment and/or optimization may be done.

  Kernel and modular MI linkages may change behavior depending on existence
  of debug information.  In the future .symtab will be copied using linker
  during this stage.

o Fix db_symtab copying (COPY_SYMTAB)

  o Collect all objects and create a relocatable (netbsd.ro).  At this point,
    the number of symbols is known.

  o Relink and allocate .rodata.symtab with the calculated size of .symtab.
    Linker recalculates symbol addresses.

  o Embed the .symtab into .rodata.symtab.

  o Link the final netbsd ELF.

  The make(1) rule (dependency graph) should be identical with/without
  COPY_SYMTAB.  Kill .ifdef COPY_SYMTAB from $S/conf/Makefile.kern.inc.

o Preprocess and generate linker scripts dynamically.

  Include opt_xxx.h and replace some constant values (e.g. COHERENCY_UNIT,
  PAGE_SIZE, KERNEL_BASE_PHYS, KERNEL_BASE_VIRT, ...) with cpp(1).

  Don't unnecessarily define symbols.  Don't use sed(1).

o Clean up linker scripts.

  o Don't specify OUTPUT_FORMAT()/OUTPUT_ARCH()

    These are basically set in compilers/linkers.  If non-default ABI is used,
    command-line arguments should be specified.

  o Remove .rel/.rela handling.

    These are set in relocatable objects, and handled by dynamic linkers.
    Totally irrelevant for kernels.

  o Clean up debug section handling.

  o Document (section boundary) symbols set in linker scripts.

    There must be a reason why symbols are defined and exported.

    PROVIDE() is to define internal symbols.

  o Clean up load addresses.

  o Program headers.

  o According to matt@, .ARM.extab/.ARM.exidx sections are no longer needed.

o Redesign swapnetbsd.c (root/swap device specification)

  Don't build a whole kernel only to specify root/swap devices.

  Make these parameters re-configurable afterwards.

o Namespace.

  Investigate namespace of attributes/modules/options.  Figure out the hidden
  design about these, document it, then re-design it.

  At this moment, all of them share the single "selecttab", which means their
  namespaces are common, but they also have respective tables (attrtab,
  opttab, etc.).

  Selecting an option (addoption()) that is also a module name works only if
  the module doesn't depend on anything, because addoption() doesn't select
  the module and its dependencies (selectattr()).  In other words, an option
  can be safely converted to a module (define) only if it doesn't depend on
  anything.  (One example is DDB.)

o Convert pseudo(dev) attach functions to take (void) (== kernel ctors).

  The pseudo attach function was originally designed to take `int n' as
  the number of instances of the pseudo device.  Now most pseudo
  devices have been converted to be `cloneable', meaning that their
  instances are dynamically allocated at run-time, because guessing how
  many instances users need at compile time is almost impossible.
  Restricting such a pure software resource at compile time is senseless,
  considering that the rest of the world is dynamic.

  Once pseudo attach functions become (void), config(1) no longer
  has to generate iteration to call those functions, by making them part
  of kernel constructors, which are a list of (void) functions.
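
  A minimal sketch of that idea, assuming nothing about the eventual kernel
  interface: the kctor_t type, the pseudo_*_attach() functions and the
  demo_kctors[] table are hypothetical stand-ins for whatever the .kctors
  section would end up containing.

```c
#include <stddef.h>

/* Hypothetical kernel constructor type: every entry takes no arguments. */
typedef void (*kctor_t)(void);

/* Records the order in which the demo constructors ran. */
static int attach_order[8];
static size_t attach_count;

/* Hypothetical pseudo-device attach functions, already taking (void). */
static void pseudo_a_attach(void) { attach_order[attach_count++] = 'a'; }
static void pseudo_b_attach(void) { attach_order[attach_count++] = 'b'; }

/* Stand-in for a .kctors table; lower entries are the ones depended upon. */
static const kctor_t demo_kctors[] = {
	pseudo_a_attach,
	pseudo_b_attach,
};
static const size_t demo_nkctors =
    sizeof(demo_kctors) / sizeof(demo_kctors[0]);

/* Walk the table from the lowest entry upward and return the call count. */
static size_t
run_kctors(const kctor_t *tbl, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		tbl[i]();
	return i;
}
```

  With a uniform (void) signature the caller needs no per-device knowledge,
  so no generated loop over instance counts is required.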

  Some pseudo devices may have dependency/ordering problems, because
  pseudo attach functions have no control over when they are called.  This
  could be solved by converting to kctors, where functions are called in order
  by dependency.

o Enhance ioconf behavior for pseudo-devices

  See "bin/48571: config(1) ioconf is insufficient for pseudo-devices" for
  more details.  In a nutshell, it would be "useful" for config to emit
  the necessary stuff in the generated ioconf.[ch] to enable use of
  config_{init,fini}_component() for attaching and detaching pseudodev's.

  Currently, you need to manually construct your own data structures, and
  manually "attach" them, one at a time.  This leads to duplication of
  code (where multiple drivers contain the same basic logic), and doesn't
  necessarily handle all of the "frobbing" of the kernel lists.

o Don't use -Ttext ${TEXTADDR}.

  Although ld(1)'s `-Ttext ${TEXTADDR}' is an easy way to specify the virtual
  base address of .text at link time, it needs to change command-line; in
  kernel build, Makefile needs to change to reflect kernel's configuration.
  It is simpler to reflect kernel configuration using linker script via assym.h.

o Convert ${DIAGNOSTIC} and ${DEBUG} to flags (defflag).

  Probably generate opt_diagnostic.h/opt_debug.h and include them in
  sys/param.h.

o Strictly define DIAGNOSTIC.

  It is possible to make DIAGNOSTIC kernel and modules binary-compatible with
  non-DIAGNOSTIC ones.  In that case, debug type information should match
  theoretically (not confirmed).

o Use suffix rules.

  Build objects following suffix rules.  Source files are defined as relative to
  $S (e.g. sys/kern/init_main.c) and objects are generated in the corresponding
  subdirectories under kernel build directories (e.g.
  .../compile/GENERIC/sys/kern/init_main.o).  Dig subdirectories from within
  config(1).
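
  The source-to-object mapping described above could look roughly like the
  following sketch.  src_to_obj() is a hypothetical helper, not an existing
  config(1) function, and the build directory is whatever the caller passes
  in.

```c
#include <stdio.h>
#include <string.h>

/*
 * Map a source path relative to $S (e.g. "sys/kern/init_main.c") to an
 * object path under builddir with the suffix replaced by ".o".
 * Returns 0 on success, -1 if src has no suffix or buf is too small.
 */
static int
src_to_obj(const char *builddir, const char *src, char *buf, size_t len)
{
	const char *dot = strrchr(src, '.');
	const char *slash = strrchr(src, '/');
	int n;

	/* Require a suffix in the final path component. */
	if (dot == NULL || (slash != NULL && dot < slash))
		return -1;

	/* Copy everything before the suffix, then append ".o". */
	n = snprintf(buf, len, "%s/%.*s.o", builddir, (int)(dot - src), src);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

  Keeping the subdirectory part of the path is what lets two sources with the
  same base name coexist in one build directory.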

  Debugging (-g) and profiling (-pg) objects could be generated with *.go/*.po
  suffixes as userland libraries do.  Maybe something similar for
  DIAGNOSTIC/DEBUG.

  genassym(1) definitions will be split into per-source instead of the single
  assym.h.  Dependencies are corrected and some of the mysterious dependencies
  on `Makefile' in sys/conf/Makefile.kern.inc can go away.

o Define genassym(1) symbols per file.

  Have each file define symbols that have to be generated by genassym(1) so
  that more accurate dependency is reflected.

  For example, if foo.S needs some symbols, it defines them in foo.assym,
  declaring that foo.S depends on foo.assym.h, and includes foo.assym.h.
  foo.assym.h is generated by following the suffix rule of .assym -> .assym.h.
  When one header is updated, only related *.assym.h files are regenerated,
  instead of rebuilding all MD/*.S files that depend on the global, single
  assym.h.
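
  To make the content of such a per-file header concrete, the sketch below
  prints the kind of constants a foo.assym.h might carry.  It is only an
  illustration of the output, not how genassym(1) itself is implemented;
  struct demo_frame, its members and the DF_* names are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical structure whose layout foo.S would need to know. */
struct demo_frame {
	long df_sp;
	long df_pc;
	int  df_flags;
};

/*
 * Write "#define NAME offset" lines for the members foo.S uses.
 * Returns the snprintf() result: the length that was (or would be) written.
 */
static int
emit_demo_assym(char *buf, size_t len)
{
	return snprintf(buf, len,
	    "#define DF_SP %zu\n"
	    "#define DF_PC %zu\n"
	    "#define DF_FLAGS %zu\n",
	    offsetof(struct demo_frame, df_sp),
	    offsetof(struct demo_frame, df_pc),
	    offsetof(struct demo_frame, df_flags));
}
```

  Because only the members listed here end up in the generated header, a
  change to an unrelated structure would not force foo.S to be rebuilt.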