TODO revision 1.29

o Call module as module.

    Until now, everything has been called an attribute. Separate module
    from it:

    - A module is a collection of code (*.[cSo]) and provides a function.
      A module can depend on other modules.

    - An attribute provides metadata for modules. One module can have
      multiple attributes. An attribute doesn't generate a module (*.o,
      *.ko).

o Emit everything (ioconf.*, Makefile, ...) per-attribute.

    config(9) related metadata (cfdriver, cfattach, cfdata, ...) should be
    collected using the linker. Create ELF sections like
    .{rodata,data}.config.{cfdriver,cfattach,cfdata}. Provide reference
    symbols (e.g. cfdriverinit[]) using a linker script. Sort entries by
    name so that the kernel can look them up by binary search.

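    The lookup side of this idea can be sketched in plain C. This is only
    an illustration: struct cfdriver_entry, cfdriver_table[] and
    cfdriver_lookup() are hypothetical names, and the static array stands
    in for the table that the linker script would assemble from the
    .rodata.config.cfdriver section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct cfdriver. */
struct cfdriver_entry {
	const char *cd_name;
};

/*
 * Stand-in for the linker-collected table.  The entries are assumed to
 * be sorted by name already, as the linker script would arrange.
 */
static const struct cfdriver_entry cfdriver_table[] = {
	{ "audio" }, { "com" }, { "ehci" }, { "ld" }, { "pci" },
};

/* bsearch(3) comparator: the key is a name, the element is an entry. */
static int
cfdriver_cmp(const void *key, const void *elem)
{
	const struct cfdriver_entry *e = elem;

	return strcmp(key, e->cd_name);
}

/* Look up a driver by name with binary search; NULL if not present. */
static const struct cfdriver_entry *
cfdriver_lookup(const char *name)
{
	assert(name != NULL);
	return bsearch(name, cfdriver_table,
	    sizeof(cfdriver_table) / sizeof(cfdriver_table[0]),
	    sizeof(cfdriver_table[0]), cfdriver_cmp);
}
```

    Binary search only works if the sort order used at link time matches
    the comparator used at run time; strcmp(3) order is assumed here.
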
o Generate modular(9) related information, especially module dependency.

    At this moment modular(9) modules hardcode their dependencies in *.c
    using the MODULE() macro:

        MODULE(MODULE_CLASS_DRIVER, hdaudio, "pci");

    This information already exists in config(5) definitions (files.*).
    Extend config(5) to be able to specify a module's class.

    Ideally this module metadata is kept somewhere in the ELF headers, so
    that loaders (e.g. boot(8)) can easily read it. One idea is to abuse
    DYNAMIC sections to record dependencies, as shared libraries do.
    (Feasibility unknown.)

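    As a rough illustration of what generated metadata could look like,
    here is a sketch of a per-module record and a dependency check. All
    names (struct modmeta, enum modclass_sketch, modmeta_requires()) are
    hypothetical; the only thing taken from the text above is that a
    module has a class, a name, and a list of required modules, assumed
    here to be a comma-separated string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical module classes; not the kernel's real enumeration. */
enum modclass_sketch { MODCLASS_DRIVER, MODCLASS_VFS, MODCLASS_MISC };

/* Hypothetical record that config(1) could emit for each module. */
struct modmeta {
	enum modclass_sketch mm_class;
	const char *mm_name;
	const char *mm_required;	/* comma-separated names, or NULL */
};

/* What could be generated for the hdaudio example. */
static const struct modmeta hdaudio_meta = {
	MODCLASS_DRIVER, "hdaudio", "pci"
};

/* Return 1 if `dep' appears in the module's required list, else 0. */
static int
modmeta_requires(const struct modmeta *mm, const char *dep)
{
	const char *p;
	size_t len;

	assert(mm != NULL && dep != NULL);
	len = strlen(dep);
	p = mm->mm_required;
	while (p != NULL && *p != '\0') {
		size_t n = strcspn(p, ",");	/* length of this item */

		if (n == len && strncmp(p, dep, n) == 0)
			return 1;
		p += n;
		if (*p == ',')
			p++;			/* skip the separator */
	}
	return 0;
}
```
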
o Rename "interface attribute" to "bus".

    Instead of

        define audiobus {}
        attach audio at audiobus

    do this:

        defbus audiobus {}
        attach audio at audiobus

    Always provide xxxbusprint() (and xxxbussubmatch() if there are
    multiple children). Extend struct cfiattrdata like:

        struct cfiattrdata {
            const char *ci_name;
            cfprint_t ci_print;
            cfsubmatch_t ci_submatch;
            int ci_loclen;
            const struct cflocdesc ci_locdesc[];
        };

o Simplify child configuration API

    With the struct cfiattrdata extension above, config_found*() can omit
    the print/submatch args. If the found child is known (e.g., "pcibus"
    creating "pci"):

        config_found(self, "pcibus");

    If finding unknown children (e.g. "pci" finding pci devices):

        config_find(self, "pci", locs, aux);

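    A minimal sketch of why the print argument becomes redundant: once the
    interface attribute record carries its own print routine, the caller
    only has to name the attribute. The types below are simplified
    stand-ins, and iattr_lookup() and config_found_sketch() are
    hypothetical helpers, not the kernel's autoconf implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the autoconf types. */
typedef int (*cfprint_t)(void *aux, const char *pnp);

struct cfiattrdata {
	const char *ci_name;
	cfprint_t ci_print;	/* proposed member */
};

static int pcibus_print_calls;	/* counts calls, for illustration only */

static int
pcibusprint(void *aux, const char *pnp)
{
	(void)aux;
	(void)pnp;
	pcibus_print_calls++;
	return 0;
}

/* Hypothetical table of known interface attributes. */
static const struct cfiattrdata iattrs[] = {
	{ "pcibus", pcibusprint },
	{ "audiobus", NULL },
};

/* Find an interface attribute record by name; NULL if unknown. */
static const struct cfiattrdata *
iattr_lookup(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(iattrs) / sizeof(iattrs[0]); i++)
		if (strcmp(iattrs[i].ci_name, name) == 0)
			return &iattrs[i];
	return NULL;
}

/*
 * The print routine is taken from the attribute record instead of
 * being passed by the caller.  Returns 0 on success, -1 if the
 * attribute is unknown.
 */
static int
config_found_sketch(const char *ifattr, void *aux)
{
	const struct cfiattrdata *ci = iattr_lookup(ifattr);

	if (ci == NULL)
		return -1;
	if (ci->ci_print != NULL)
		(void)ci->ci_print(aux, NULL);
	return 0;
}
```
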
o Retire "attach foo at bar with foo_bar.c"

    Most of these should be rewritten by defining a common interface
    attribute "foobus", instead of writing multiple attachments. com(4),
    ld(4), and ehci(4) are typical examples. For ehci(4), EHCI-capable
    controller drivers implement the "ehcibus" interface, like:

        define ehcibus {}
        device imxehci: ehcibus

    These drivers' attach functions call config_found() to attach ehci(4)
    via the "ehcibus" interface attribute, instead of calling ehci_init()
    directly. The same goes for com(4) (com_attach_subr()) and ld(4)
    (ldattach()).

o Sort objects in more reasonable order.

    Put machdep.ko at the lowest address. uvm.ko and kern.ko follow.

    Kill the alphabetical sort (${OBJS:O}) in sys/conf/Makefile.kern.inc.

    Use ldscript, like this:

        .text :
        AT (ADDR(.text) & 0x0fffffff)
        {
            *(.text.machdep.locore.entry)
            *(.text.machdep.locore)
            *(.text.machdep)
            *(.text)
            *(.text.*)
            :

    Kill linker definitions in sys/conf/Makefile.kern.inc.

o Differentiate "options" and "flags"/"params".

    "options" enable features by adding *.c files (via attributes).

    "flags" and "params" change the contents of *.c files. They neither
    add *.c files to the resulting kernel nor build attributes (modules).

o Make flags/params per attribute (module).

    Basically flags and params are cpp(1) #define's generated in opt_*.h.
    Make them local to one attribute (module). Flags/params that affect
    files across attributes (modules) are possible, but should be
    discouraged.

o Generate things only by definitions.

    In the ideal dynamically modular world, "selection" will be done not
    at compile time but at runtime. Users select their wanted modules by
    dynamically loading them.

    This means that the system provides all choices; that is, build all
    modules in the source tree. Necessary information is defined in the
    "definition" part.

o Split cfdata.

    cfdata is a set of pattern matching rules to enable devices at runtime
    device auto-configuration. It is pure data and can (should) be
    generated separately from the code.

o Allow easier adding and removing of options.

    It should be possible to add or remove options, flags, etc.,
    without regard to whether or not they are already defined.
    For example, a configuration like this:

        include GENERIC
        options FOO
        no options BAR

    should work regardless of whether or not options FOO and/or
    options BAR were defined in GENERIC. It should not give
    errors like "options BAR was already defined" or "options FOO
    was not defined".

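    The desired behavior amounts to making both operations idempotent.
    A minimal sketch, assuming a small fixed-size option set; struct
    optset, opt_add() and opt_remove() are hypothetical names and do not
    reflect how config(1) stores options internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAXOPTS 16

/* Hypothetical, fixed-size set of selected option names. */
struct optset {
	const char *names[MAXOPTS];
	size_t count;
};

/* Report whether an option is currently selected. */
static bool
opt_isset(const struct optset *s, const char *name)
{
	for (size_t i = 0; i < s->count; i++)
		if (strcmp(s->names[i], name) == 0)
			return true;
	return false;
}

/* "options FOO": a no-op, not an error, if FOO is already selected. */
static void
opt_add(struct optset *s, const char *name)
{
	if (opt_isset(s, name))
		return;
	assert(s->count < MAXOPTS);
	s->names[s->count++] = name;
}

/* "no options BAR": a no-op, not an error, if BAR is not selected. */
static void
opt_remove(struct optset *s, const char *name)
{
	for (size_t i = 0; i < s->count; i++) {
		if (strcmp(s->names[i], name) == 0) {
			/* Fill the hole with the last entry. */
			s->names[i] = s->names[--s->count];
			return;
		}
	}
}
```
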
o Introduce "class".

    Every module should be classified into at least one class, as
    modular(9) modules already are. For example, file systems are marked
    as "vfs", network protocols are "netproto".

    Consider merging "devclass" into "class".

    For syntax clarity, class names could be used as a keyword to select
    the class's instance module:

        # Define net80211 module as netproto class
        class netproto
        define net80211: netproto

        # Select net80211 to be builtin
        netproto net80211

    Accordingly, the device/attach selection syntax should be revisited.

o Support kernel constructor/destructor (.kctors/.kdtors)

    Initialization and finalization should be called via constructors and
    destructors. Don't hardcode those sequences as
    sys/kern/init_main.c:main() does.

    The order of .kctors/.kdtors is resolved by dependency. The difference
    from userland is that in the kernel, depended-upon modules are located
    at lower addresses; the "machdep" module is the lowest. Thus the
    lowest entry in .kctors must be executed first.

    The .kctors/.kdtors entries are executed by the kernel's main()
    function, unlike userland where start code executes .ctors/.dtors
    before main(). The hardcoded sequence of various subsystem
    initializations in init_main.c:main() will be replaced by an array of
    .kctors invocations, and the #ifdef's there will be gone.

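    The "array of .kctors invocations" can be sketched as below. This is
    an illustration only: kctor_t, run_kctors() and the three constructor
    functions are hypothetical, and an ordinary array stands in for the
    section that the linker would lay out with machdep at the lowest
    address.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*kctor_t)(void);

/* Records the order in which constructors ran, for illustration. */
static int init_order[3];
static int init_count;

static void
record(int id)
{
	assert(init_count < 3);
	init_order[init_count++] = id;
}

static void machdep_ctor(void) { record(1); }
static void uvm_ctor(void) { record(2); }
static void kern_ctor(void) { record(3); }

/*
 * Stand-in for the .kctors section: modules that others depend on
 * come first (lowest address), so machdep leads.
 */
static const kctor_t kctors[] = { machdep_ctor, uvm_ctor, kern_ctor };

/* Run entries from the lowest upward; returns how many were invoked. */
static size_t
run_kctors(const kctor_t *start, const kctor_t *end)
{
	size_t n = 0;

	assert(start <= end);
	for (const kctor_t *p = start; p < end; p++, n++)
		(*p)();
	return n;
}
```

    A matching .kdtors walk would presumably run in the opposite
    direction, so that modules are torn down before what they depend on.
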
o Hide link-set in the final kernel.

    Link-set is used to collect references (pointers) at link time. It
    relies on the ld(1) behavior of automatically generating `__start_X'
    and `__stop_X' symbols for the section `X' to reduce coding.

    Don't allow kernel subsystems to create random ELF sections.

    Pre-define all the available link-set names and pre-generate a linker
    script to merge them into .rodata.

    (For modular(9) modules, `link_set_modules' is looked up by the kernel
    loader. Provide only that.)

    Provide a way for 3rd party modules to declare extra link-sets.

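    For readers unfamiliar with the mechanism, the `__start_X'/`__stop_X'
    behavior can be demonstrated in userland. The sketch assumes an ELF
    target, a GNU-compatible ld(1), and GCC/Clang attribute syntax; the
    section name link_set_demo, the macro DEMO_LINK_SET_ADD() and struct
    sysinit_sketch are hypothetical and are not the kernel's own link-set
    macros.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical payload collected through the link-set. */
struct sysinit_sketch {
	const char *name;
};

/*
 * Place a pointer to `sym' into the section link_set_demo.  The
 * section name is a valid C identifier, which is what makes ld(1)
 * generate the __start_/__stop_ symbols for it.
 */
#define DEMO_LINK_SET_ADD(sym)						\
	static const struct sysinit_sketch * const			\
	    __link_set_demo_##sym					\
	    __attribute__((section("link_set_demo"), used)) = &sym

static const struct sysinit_sketch si_a = { "a" };
static const struct sysinit_sketch si_b = { "b" };
DEMO_LINK_SET_ADD(si_a);
DEMO_LINK_SET_ADD(si_b);

/* Provided by the linker, not defined anywhere in the source. */
extern const struct sysinit_sketch * const __start_link_set_demo[];
extern const struct sysinit_sketch * const __stop_link_set_demo[];

/* Number of pointers the linker collected into the section. */
static size_t
link_set_demo_count(void)
{
	assert(__stop_link_set_demo >= __start_link_set_demo);
	return (size_t)(__stop_link_set_demo - __start_link_set_demo);
}
```

    With a pre-generated linker script, the same start/stop symbols could
    be defined explicitly while the entries themselves are merged into
    .rodata, which is the change proposed above.
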
o Shared kernel objects.

    Since NetBSD has not established a clear kernel ABI, every single
    kernel has to build all of its objects on its own. As a result,
    similar kernels (e.g. evbarm kernels) repeatedly compile similar
    objects, which is a waste of energy and space.

    Share them if possible. For evb* ports, ideally everything except
    machdep.ko should be shared.

    While leaving optimizations (CPU specific optimizations, inlined
    bus_space(9) operations, etc.) as options for users, the official
    binary builds provided by TNF should be as portable as possible.

o Always use explicit kernel linker script.

    ld(1) has an option -T <ldscript> to use a given linker script. If not
    specified, a default, built-in linker script, mainly meant for
    userland programs, is used.

    Currently m68k, sh3, and vax don't have kernel linker scripts. These
    work because they have no page boundary constraints; they map and
    access kernel .text/.data in the same way.

o Pass input files to ${LD} via linker script.

    Instead of passing input files on the command line, output
    "INPUT(xxx.o)" commands, and include them from generated linker
    scripts.

o Generate `*.d' files.

    Output source/object files as raw text instead of `Makefile'. Generate
    `*.d' (make(1) depend) files. make(1) knows which object files are to
    be compiled. With "INPUT(xxx.o)" linker scripts, neither the generated
    `Makefile' nor `Makefile.kern.inc' needs to keep source/object files
    in variables.

o Control ELF sections using linker script.

    Now the kernel is linked and built directly from object files (*.o).
    Each port has an MD linker script, which does everything that needs
    to be done at link time. As a result, these scripts handle everything
    from MI alignment restrictions (read_mostly, cacheline_aligned) to
    load address specification for external boot loaders.

    Make this into multiple stages to make linkage more structured. In
    particular, reserve the final link for purely MD purposes. Note that
    in a modular build, *.ko are shared between the build of the kernel
    and of modular(9) modules (*.kmod).

    Monolithic build:
        *.o       ---> netbsd.ko    Generic MI linkage
        netbsd.ko ---> netbsd.ro    Kernel MI linkage
        netbsd.ro ---> netbsd       Kernel MD linkage

    Modular build (kernel):
        *.o       ---> *.ko         Generic + Per-module MI linkage
        *.ko      ---> netbsd.ro    Kernel MI linkage
        netbsd.ro ---> netbsd       Kernel MD linkage

    Modular build (module):
        *.o       ---> *.ko         Generic + Per-module MI linkage
        *.ko      ---> *.ro         Modular MI linkage
        *.ro      ---> *.kmod       Modular MD linkage

    Generic MI linkage is for processing MI linkage that can be applied
    generally. Data section alignment (.data.read_mostly and
    .data.cacheline_aligned) is processed here.

    Per-module MI linkage is for modules that want some ordering. For
    example, machdep.ko wants to put entry code at the top of .text and
    .data.

    Kernel MI linkage is for collecting kernel global section data, which
    is what link-set is used for now. Once they are collected and symbols
    for the ranges are assigned, those sections are merged into the
    pre-existing sections (.rodata), because link-set sections in "netbsd"
    will never be interpreted by external loaders.

    Kernel MD linkage is used purely for MD purposes, that is, how kernels
    are loaded by external loaders. It might be possible for one kernel
    relocatable (netbsd.ro) to be linked into multiple final kernel images
    (netbsd) for different load addresses.

    Modular MI linkage is to prepare a module to be loadable as
    modular(9). This may add some extra sections and/or symbols.

    Modular MD linkage is again for pure MD purposes, like kernel MD
    linkage. Adjustment and/or optimization may be done.

    Kernel and modular MI linkages may change behavior depending on the
    existence of debug information. In the future .symtab will be copied
    using the linker during this stage.

o Fix db_symtab copying (COPY_SYMTAB)

    o Collect all objects and create a relocatable (netbsd.ro). At this
      point, the number of symbols is known.

    o Relink and allocate .rodata.symtab with the calculated size of
      .symtab. The linker recalculates symbol addresses.

    o Embed the .symtab into .rodata.symtab.

    o Link the final netbsd ELF.

    The make(1) rule (dependency graph) should be identical with/without
    COPY_SYMTAB. Kill .ifdef COPY_SYMTAB from $S/conf/Makefile.kern.inc.

o Preprocess and generate linker scripts dynamically.

    Include opt_xxx.h and replace some constant values (e.g.
    COHERENCY_UNIT, PAGE_SIZE, KERNEL_BASE_PHYS, KERNEL_BASE_VIRT, ...)
    with cpp(1).

    Don't unnecessarily define symbols. Don't use sed(1).

o Clean up linker scripts.

    o Don't specify OUTPUT_FORMAT()/OUTPUT_ARCH()

      These are basically set in compilers/linkers. If a non-default ABI
      is used, command-line arguments should be specified.

    o Remove .rel/.rela handling.

      These are set in relocatable objects, and handled by dynamic
      linkers. Totally irrelevant for kernels.

    o Clean up debug section handling.

    o Document (section boundary) symbols set in linker scripts.

      There must be a reason why symbols are defined and exported.

      PROVIDE() is to define internal symbols.

    o Clean up load addresses.

    o Program headers.

    o According to matt@, .ARM.extab/.ARM.exidx sections are no longer
      needed.

o Redesign swapnetbsd.c (root/swap device specification)

    Don't build a whole kernel only to specify root/swap devices.

    Make these parameters re-configurable afterwards.

o Namespace.

    Investigate the namespace of attributes/modules/options. Figure out
    the hidden design about these, document it, then re-design it.

    At this moment, all of them share the single "selecttab", which means
    their namespaces are common, but they also have respective tables
    (attrtab, opttab, etc.).

    Selecting an option (addoption()) that is also a module name works
    only if the module doesn't depend on anything, because addoption()
    doesn't select the module and its dependencies (selectattr()). In
    other words, an option can be safely converted to a module (define)
    only if it doesn't depend on anything. (One example is DDB.)

o Convert pseudo(dev) attach functions to take (void) (== kernel ctors).

    The pseudo attach function was originally designed to take `int n' as
    the number of instances of the pseudo device. Now most pseudo
    devices have been converted to be `cloneable', meaning that their
    instances are dynamically allocated at run-time, because guessing how
    many instances users will need at compile time is almost impossible.
    Restricting such a pure software resource at compile time is
    senseless, considering that the rest of the world is dynamic.

    Once pseudo attach functions take (void), config(1) no longer
    has to generate the iteration that calls those functions; they can
    become part of the kernel constructors, which are a list of (void)
    functions.

    Some pseudo devices may have dependency/ordering problems, because
    pseudo attach functions have no say in when they are called. This
    could be solved by converting to kctors, where functions are called in
    order by dependency.

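    The difference between the two calling conventions can be sketched as
    follows. The types and function names here (struct pdevinit_sketch,
    pdev_ctor_t, loop_attach_old(), loop_attach_new()) are simplified,
    hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Today's shape: an attach routine plus a compile-time count. */
struct pdevinit_sketch {
	void (*pdev_attach)(int);
	int pdev_count;
};

/* Proposed shape: a plain (void) function, usable as a kernel ctor. */
typedef void (*pdev_ctor_t)(void);

static int old_instances;	/* instances requested the old way */
static int new_calls;		/* attach calls made the new way */

static void loop_attach_old(int n) { old_instances += n; }
static void loop_attach_new(void) { new_calls++; }

/* Old style: a NULL-terminated table that generated code walks. */
static const struct pdevinit_sketch pdevinit_old[] = {
	{ loop_attach_old, 1 },
	{ NULL, 0 },
};

/* New style: nothing but function pointers, like any other ctor. */
static const pdev_ctor_t pdev_ctors[] = { loop_attach_new };

/* The iteration that config(1) has to generate today. */
static void
run_pdevinit_old(const struct pdevinit_sketch *p)
{
	assert(p != NULL);
	for (; p->pdev_attach != NULL; p++)
		(*p->pdev_attach)(p->pdev_count);
}

/* With (void) routines the generic ctor loop is enough. */
static void
run_pdev_ctors(const pdev_ctor_t *c, size_t n)
{
	assert(c != NULL);
	for (size_t i = 0; i < n; i++)
		(*c[i])();
}
```
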
o Enhance ioconf behavior for pseudo-devices

    See "bin/48571: config(1) ioconf is insufficient for pseudo-devices"
    for more details. In a nutshell, it would be "useful" for config to
    emit the necessary stuff in the generated ioconf.[ch] to enable use of
    config_{init,fini}_component() for attaching and detaching
    pseudodev's.

    Currently, you need to manually construct your own data structures,
    and manually "attach" them, one at a time. This leads to duplication
    of code (where multiple drivers contain the same basic logic), and
    doesn't necessarily handle all of the "frobbing" of the kernel lists.

o Don't use -Ttext ${TEXTADDR}.

    Although ld(1)'s `-Ttext ${TEXTADDR}' is an easy way to specify the
    virtual base address of .text at link time, it requires changing the
    command line; in a kernel build, the Makefile has to change to reflect
    the kernel's configuration. It is simpler to reflect the kernel
    configuration using a linker script via assym.h.

o Convert ${DIAGNOSTIC} and ${DEBUG} to flags (defflag).

    Probably generate opt_diagnostic.h/opt_debug.h and include them in
    sys/param.h.

o Strictly define DIAGNOSTIC.

    It is possible to make DIAGNOSTIC kernels and modules
    binary-compatible with non-DIAGNOSTIC ones. In that case, debug type
    information should match in theory (not confirmed).

o Use suffix rules.

    Build objects following suffix rules. Source files are defined
    relative to $S (e.g. sys/kern/init_main.c) and objects are generated
    in the corresponding subdirectories under kernel build directories
    (e.g. .../compile/GENERIC/sys/kern/init_main.o). Dig subdirectories
    from within config(1).

    Debugging (-g) and profiling (-pg) objects could be generated with
    *.go/*.po suffixes as userland libraries do. Maybe something similar
    for DIAGNOSTIC/DEBUG.

    genassym(1) definitions will be split per source file instead of
    living in the single assym.h. Dependencies are corrected and some of
    the mysterious dependencies on `Makefile' in
    sys/conf/Makefile.kern.inc can go away.

o Define genassym(1) symbols per file.

    Have each file define symbols that have to be generated by genassym(1)
    so that more accurate dependency is reflected.

    For example, if foo.S needs some symbols, it defines them in
    foo.assym, declaring that foo.S depends on foo.assym.h, and includes
    foo.assym.h. foo.assym.h is generated by following the suffix rule of
    .assym -> .assym.h. When one header is updated, only related
    *.assym.h files are regenerated, instead of rebuilding all MD/*.S
    files that depend on the global, single assym.h.

o Support library.

    Provide a consistent way to build a library either as .o or .a.

o Accept `.a' suffix.

    Make the "file" command accept the `.a' suffix. Handle it the same way
    as `.o'.

o Clean up ${MD_OBJS} and friends in Makefile.${MACHINE}.

    Don't use ${MD_OBJS}, ${MD_LIBS}, ${MD_SFILES}, and ${MD_CFILES}.

    List files in config(5)'s "file". Override build rules only when
    necessary.

    Rely on the fact that config(1) parses files.${MACHINE} first, outputs
    files in the order it parses files.* (actually include depth), and
    `Makefile.kern.inc' preserves file order to pass to ${LD}.

o Clean up CTF-related rules.

    Don't overwrite compile/link rules conditionally depending on the
    existence of ${CTFCONVERT}/${CTFMERGE}. Give a separate suffix
    (*.ctfo) and define its rules (.c -> .ctfo).
