TODO revision 1.25
1o Call module as module.
2
3  Until now, everything has been called an attribute.  Separate modules from attributes:
4
5	- Module is a collection of code (*.[cSo]), and provides a function.
6	  Module can depend on other modules.
7
8	- Attribute provides metadata for modules.  One module can have
9	  multiple attributes.  Attribute doesn't generate a module (*.o,
10	  *.ko).
11
12o Emit everything (ioconf.*, Makefile, ...) per-attribute.
13
14  config(9) related metadata (cfdriver, cfattach, cfdata, ...) should be
15  collected using linker.  Create ELF sections like
16  .{rodata,data}.config.{cfdriver,cfattach,cfdata}.  Provide reference
17  symbols (e.g. cfdriverinit[]) using linker script.  Sort entries by name
18  to look up entries by binary search in the kernel.
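
  A minimal sketch of the lookup side, assuming the linker has already
  produced a name-sorted array.  The struct, the table contents and the
  function names are hypothetical stand-ins, not the kernel's real
  definitions; only bsearch(3) and strcmp(3) are real interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct cfdriver. */
struct cfdriver_sketch {
	const char *cd_name;
};

/*
 * Stand-in for the array the linker script would expose (cfdriverinit[]
 * in the text).  The entries must already be sorted by name.
 */
static const struct cfdriver_sketch cfdriver_table[] = {
	{ "audio" }, { "com" }, { "ehci" }, { "ld" }, { "pci" },
};

/* bsearch(3) comparator: the key is a name, the element is a table entry. */
static int
cfdriver_cmp(const void *key, const void *elem)
{
	const struct cfdriver_sketch *cd = elem;

	return strcmp(key, cd->cd_name);
}

/* Look up a driver by name; returns NULL if there is no such entry. */
static const struct cfdriver_sketch *
cfdriver_lookup_sketch(const char *name)
{
	assert(name != NULL);
	return bsearch(name, cfdriver_table,
	    sizeof(cfdriver_table) / sizeof(cfdriver_table[0]),
	    sizeof(cfdriver_table[0]), cfdriver_cmp);
}
```

  In a real kernel the table bounds would come from linker-provided symbols
  rather than from sizeof.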
19
20o Generate modular(9) related information.  Especially module dependency.
21
22  At this moment modular(9) modules hardcode dependency in *.c using the
23  MODULE() macro:
24
25	MODULE(MODULE_CLASS_DRIVER, hdaudio, "pci");
26
27  This information already exists in config(5) definitions (files.*).
28  Extend config(5) to be able to specify module's class.
29
30  Ideally these module metadata are kept somewhere in ELF headers, so that
31  loaders (e.g. boot(8)) can easily read them.  One idea is to abuse DYNAMIC
32  sections to record dependencies, as shared libraries do.  (Feasibility
33  unknown.)
34
35o Rename "interface attribute" to "bus".
36
37  Instead of
38
39	define	audiobus {}
40	attach	audio at audiobus
41
42  do this:
43
44	defbus	audiobus {}
45	attach	audio at audiobus
46
47  Always provide xxxbusprint() (and xxxbussubmatch() if there are multiple children).
48  Extend struct cfiattrdata like:
49
50	struct cfiattrdata {
51		const char *ci_name;
52		cfprint_t ci_print;
53		cfsubmatch_t ci_submatch;
54		int ci_loclen;
55		const struct cflocdesc ci_locdesc[];
56	};
57
58o Simplify child configuration API
59
60  With said struct cfiattrdata extension, config_found*() can omit
61  print/submatch args.  If the found child is known (e.g., "pcibus" creating
62  "pci"):
63
64	config_found(self, "pcibus");
65
66  If finding unknown children (e.g. "pci" finding pci devices):
67
68	config_find(self, "pci", locs, aux);
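
  A minimal sketch of how a config_found() without print/submatch arguments
  could find the callback through the bus description.  All types and names
  here are simplified, hypothetical stand-ins (the real cfprint_t and struct
  cfiattrdata differ), and the return values are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical callback type; the kernel's cfprint_t differs. */
typedef int (*cfprint_sketch_t)(void *aux, const char *pnp);

/* Reduced form of the extended struct cfiattrdata from the text. */
struct cfiattrdata_sketch {
	const char *ci_name;
	cfprint_sketch_t ci_print;
};

static int
pcibusprint_sketch(void *aux, const char *pnp)
{
	(void)aux;
	(void)pnp;
	return 1;	/* arbitrary marker value for this sketch */
}

static const struct cfiattrdata_sketch iattr_table[] = {
	{ "pcibus", pcibusprint_sketch },
};

/* Find the interface attribute ("bus") description by name. */
static const struct cfiattrdata_sketch *
iattr_lookup_sketch(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(iattr_table) / sizeof(iattr_table[0]); i++)
		if (strcmp(iattr_table[i].ci_name, name) == 0)
			return &iattr_table[i];
	return NULL;
}

/*
 * Sketch of a config_found() that takes only the bus name: the print
 * callback comes from the bus description instead of from the caller.
 * Returns the callback's result, or -1 if the bus is unknown.
 */
static int
config_found_sketch(void *self, const char *busname)
{
	const struct cfiattrdata_sketch *ci = iattr_lookup_sketch(busname);

	(void)self;
	if (ci == NULL || ci->ci_print == NULL)
		return -1;
	return ci->ci_print(NULL, busname);
}
```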
69
70o Retire "attach foo at bar with foo_bar.c"
71
72  Most of these should be rewritten by defining a common interface attribute
73  "foobus", instead of writing multiple attachments.  com(4), ld(4), ehci(4)
74  are typical examples.  For ehci(4), EHCI-capable controller drivers implement
75  "ehcibus" interface, like:
76
77	define	ehcibus {}
78	device	imxehci: ehcibus
79
80  These drivers' attach functions call config_found() to attach ehci(4) via
81  the "ehcibus" interface attribute, instead of calling ehci_init() directly.
82  Same for com(4) (com_attach_subr()) and ld(4) (ldattach()).
83
84o Sort objects in more reasonable order.
85
86  Put machdep.ko at the lowest address.  uvm.ko and kern.ko follow.
87
88  Kill alphabetical sort (${OBJS:O}) in sys/conf/Makefile.kern.inc.
89
90  Use a linker script, like this:
91
92	.text :
93	AT (ADDR(.text) & 0x0fffffff)
94	{
95	  *(.text.machdep.locore.entry)
96	  *(.text.machdep.locore)
97	  *(.text.machdep)
98	  *(.text)
99	  *(.text.*)
100	  :
101
102  Kill linker definitions in sys/conf/Makefile.kern.inc.
103
104o Differentiate "options" and "flags"/"params".
105
106  "options" enables features by adding *.c files (via attributes).
107
108  "flags" and "params" change the contents of *.c files.  They neither add
109  *.c files to the resulting kernel nor build attributes (modules).
110
111o Make flags/params per attribute (module).
112
113  Basically flags and params are cpp(1) #define's generated in opt_*.h.  Make
114  them local to one attribute (module).  Flags/params which affect files
115  across attributes (modules) are possible, but should be discouraged.
116
117o Generate things only by definitions.
118
119  In the ideal dynamically modular world, "selection" will be done not at
120  compile time but at runtime.  Users select the modules they want by
121  dynamically loading them.
122
123  This means that the system provides all choices; that is, build all modules
124  in the source tree.  Necessary information is defined in the "definition"
125  part.
126
127o Split cfdata.
128
129  cfdata is a set of pattern matching rules to enable devices at runtime device
130  auto-configuration.  It is pure data and can (should) be generated separately
131  from the code.
132
133o Allow easier adding and removing of options.
134
135  It should be possible to add or remove options, flags, etc.,
136  without regard to whether or not they are already defined.
137  For example, a configuration like this:
138
139	include GENERIC
140	options FOO
141	no options BAR
142
143  should work regardless of whether or not options FOO and/or
144  options BAR were defined in GENERIC.  It should not give
145  errors like "options BAR was already defined" or "options FOO
146  was not defined".
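
  A minimal sketch of the desired semantics: adding an option that is already
  selected and removing one that is not are both silent no-ops.  The fixed-size
  set and the function names are hypothetical; config(1) keeps its options in
  its own tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAXOPTS_SKETCH 16

/* Hypothetical option set, kept as a small unordered array of names. */
struct optset_sketch {
	const char *names[MAXOPTS_SKETCH];
	size_t n;
};

static bool
opt_isset(const struct optset_sketch *s, const char *name)
{
	size_t i;

	for (i = 0; i < s->n; i++)
		if (strcmp(s->names[i], name) == 0)
			return true;
	return false;
}

/* "options FOO": succeeds whether or not FOO is already selected. */
static void
opt_add(struct optset_sketch *s, const char *name)
{
	if (opt_isset(s, name))
		return;
	assert(s->n < MAXOPTS_SKETCH);
	s->names[s->n++] = name;
}

/* "no options BAR": succeeds whether or not BAR is selected. */
static void
opt_remove(struct optset_sketch *s, const char *name)
{
	size_t i;

	for (i = 0; i < s->n; i++) {
		if (strcmp(s->names[i], name) == 0) {
			/* Fill the hole with the last entry. */
			s->names[i] = s->names[--s->n];
			return;
		}
	}
}
```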
147
148o Introduce "class".
149
150  Every module should be classified as at least one class, as modular(9)
151  modules already do.  For example, file systems are marked as "vfs", network
152  protocols are "netproto".
153
154  Consider merging "devclass" into "class".
155
156  For syntax clarity, class names could be used as a keyword to select the
157  class's instance module:
158
159	# Define net80211 module as netproto class
160	class netproto
161	define net80211: netproto
162
163	# Select net80211 to be builtin
164	netproto net80211
165
166  Accordingly device/attach selection syntax should be revisited.
167
168o Support kernel constructor/destructor (.kctors/.kdtors)
169
170  Initialization and finalization should be called via constructors and
171  destructors.  Don't hardcode those sequences as sys/kern/init_main.c:main()
172  does.
173
174  The order of .kctors/.kdtors is resolved by dependency.  The difference from
175  userland is that in the kernel, depended-upon modules are located at lower
176  addresses; the "machdep" module is the lowest.  Thus the lowest entry in
177  .kctors must be executed first.
178
179  The .kctors/.kdtors entries are executed by kernel's main() function, unlike
180  userland where start code executes .ctors/.dtors before main().  The hardcoded
181  sequence of various subsystem initializations in init_main.c:main() will be
182  replaced by an array of .kctors invocations, and #ifdef's there will be gone.
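
  A minimal sketch of the loop that main() would run, assuming the .kctors
  contents are visible as an array of (void) function pointers ordered from
  the lowest address up.  The array, the three init functions and the
  recording variables are hypothetical and exist only to show the order.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*kctor_sketch_t)(void);

/* Records the order in which the constructors ran (for illustration). */
static int init_order[3];
static size_t init_count;

static void machdep_init_sketch(void) { init_order[init_count++] = 1; }
static void uvm_init_sketch(void)     { init_order[init_count++] = 2; }
static void kern_init_sketch(void)    { init_order[init_count++] = 3; }

/* Stand-in for the .kctors section contents, lowest address first. */
static const kctor_sketch_t kctors_sketch[] = {
	machdep_init_sketch,
	uvm_init_sketch,
	kern_init_sketch,
};

/* Run every constructor in [start, end), starting with the lowest entry. */
static void
run_kctors_sketch(const kctor_sketch_t *start, const kctor_sketch_t *end)
{
	const kctor_sketch_t *p;

	assert(start <= end);
	for (p = start; p < end; p++)
		(*p)();
}
```

  Destructors would presumably walk a .kdtors array in the opposite order.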
183
184o Hide link-set in the final kernel.
185
186  Link-set is used to collect references (pointers) at link time.  It relies on
187  the ld(1) behavior that it automatically generates `__start_X' and `__stop_X'
188  symbols for the section `X' to reduce coding.
189
190  Don't allow kernel subsystems to create random ELF sections.
191
192  Pre-define all the available link-set names and pre-generate a linker script
193  to merge them into .rodata.
194
195  (For modular(9) modules, `link_set_modules' is looked up by kernel loader.
196  Provide only it.)
197
198  Provide a way for 3rd party modules to declare extra link-set.
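
  A minimal sketch of the `__start_X'/`__stop_X' mechanism described above.
  It assumes an ELF target with GNU ld (or a compatible linker) and the GCC
  section attribute; the section name "set_demo", the macro and the struct
  are hypothetical and are not the kernel's <sys/cdefs.h> link-set macros.

```c
#include <assert.h>
#include <stddef.h>

struct initfn_sketch {
	const char *name;
};

/* Place a pointer to `sym' into the ELF section "set_demo". */
#define SET_DEMO_ADD(sym)						\
	static const struct initfn_sketch *const set_demo_ptr_##sym	\
	    __attribute__((section("set_demo"), used)) = &sym

static const struct initfn_sketch foo = { "foo" };
static const struct initfn_sketch bar = { "bar" };
SET_DEMO_ADD(foo);
SET_DEMO_ADD(bar);

/*
 * The linker generates these two symbols by itself, because the section
 * name "set_demo" is a valid C identifier.
 */
extern const struct initfn_sketch *const __start_set_demo[];
extern const struct initfn_sketch *const __stop_set_demo[];

/* Number of pointers collected into the set at link time. */
static size_t
set_demo_count(void)
{
	return (size_t)(__stop_set_demo - __start_set_demo);
}
```

  Merging such sections into .rodata, as proposed above, would keep the
  boundary symbols but drop the extra section from the final kernel.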
199
200o Shared kernel objects.
201
202  Since NetBSD has not established a clear kernel ABI, every single kernel
203  has to build all the objects on its own.  As a result, similar kernels
204  (e.g. evbarm kernels) repeatedly compile similar objects, which is a waste
205  of energy and space.
206
207  Share them if possible.  For evb* ports, ideally everything except machdep.ko
208  should be shared.
209
210  While leaving optimizations as options (CPU specific optimizations, inlined
211  bus_space(9) operations, etc.) for users, the official binary builds
212  provided by TNF should be as portable as possible.
213
214o Always use explicit kernel linker script.
215
216  ld(1) has an option -T <ldscript> to use a given linker script.  If not
217  specified, a default, built-in linker script, mainly meant for userland
218  programs, is used.
219
220  Currently m68k, sh3, and vax don't have kernel linker scripts.  These work
221  because these have no constraints about page boundary; they map and access
222  kernel .text/.data in the same way.
223
224o Control ELF sections using linker script.
225
226  Currently the kernel is linked directly from object files (*.o).  Each port
227  has an MD linker script, which does everything needed at link time.  As a
228  result, the scripts cover everything from MI alignment restrictions
229  (read_mostly, cacheline_aligned) to load addresses for external boot loaders.
230
231  Split this into multiple stages to make linkage more structured.  Especially,
232  reserve the final link for purely MD purposes.  Note that in a modular build,
233  *.ko are shared between builds of the kernel and modular(9) modules (*.kmod).
234
235	Monolithic build:
236		     *.o  ---> netbsd.ko	Generic MI linkage
237		netbsd.ko ---> netbsd.ro	Kernel MI linkage
238		netbsd.ro ---> netbsd		Kernel MD linkage
239
240	Modular build (kernel):
241		     *.o  --->      *.ko	Generic + Per-module MI linkage
242		     *.ko ---> netbsd.ro	Kernel MI linkage
243		netbsd.ro ---> netbsd		Kernel MD linkage
244
245	Modular build (module):
246		     *.o  --->      *.ko	Generic + Per-module MI linkage
247		     *.ko --->      *.ro	Modular MI linkage
248		     *.ro --->      *.kmod	Modular MD linkage
249
250  Generic MI linkage is for processing MI linkage that can be applied generally.
251  Data section alignment (.data.read_mostly and .data.cacheline_aligned) is
252  processed here.
253
254  Per-module MI linkage is for modules that want some ordering.  For example,
255  machdep.ko wants to put entry code at the top of .text and .data.
256
257  Kernel MI linkage is for collecting kernel global section data, that is what
258  link-set is used for now.  Once they are collected and symbols to the ranges
259  are assigned, those sections are merged into the pre-existing sections
260  (.rodata) because link-set sections in "netbsd" will never be interpreted by
261  external loaders.
262
263  Kernel MD linkage is used purely for MD purposes, that is, how kernels are
264  loaded by external loaders.  It might be possible that one kernel relocatable
265  (netbsd.ro) is linked into multiple final kernel images (netbsd) for different
266  load addresses.
267
268  Modular MI linkage is to prepare a module to be loadable as modular(9).  This
269  may add some extra sections and/or symbols.
270
271  Modular MD linkage is again for pure MD purposes like kernel MD linkage.
272  Adjustment and/or optimization may be done.
273
274  Kernel and modular MI linkages may change behavior depending on existence
275  of debug information.  In the future .symtab will be copied using linker
276  during this stage.
277
278o Fix db_symtab copying (COPY_SYMTAB)
279
280  o Collect all objects and create a relocatable (netbsd.ro).  At this point,
281    the number of symbols is known.
282
283  o Relink and allocate .rodata.symtab with the calculated size of .symtab.
284    Linker recalculates symbol addresses.
285
286  o Embed the .symtab into .rodata.symtab.
287
288  o Link the final netbsd ELF.
289
290  The make(1) rule (dependency graph) should be identical with/without
291  COPY_SYMTAB.  Kill .ifdef COPY_SYMTAB from $S/conf/Makefile.kern.inc.
292
293o Preprocess and generate linker scripts dynamically.
294
295  Include opt_xxx.h and replace some constant values (e.g. COHERENCY_UNIT,
296  PAGE_SIZE, KERNEL_BASE_PHYS, KERNEL_BASE_VIRT, ...) with cpp(1).
297
298  Don't unnecessarily define symbols.  Don't use sed(1).
299
300o Clean up linker scripts.
301
302  o Don't specify OUTPUT_FORMAT()/OUTPUT_ARCH()
303
304    These are basically set in compilers/linkers.  If non-default ABI is used,
305    command-line arguments should be specified.
306
307  o Remove .rel/.rela handling.
308
309    These are set in relocatable objects, and handled by dynamic linkers.
310    Totally irrelevant for kernels.
311
312  o Clean up debug section handling.
313
314  o Document (section boundary) symbols set in linker scripts.
315
316    There must be a reason why symbols are defined and exported.
317
318    PROVIDE() is to define internal symbols.
319
320  o Clean up load addresses.
321
322  o Program headers.
323
324  o According to matt@, .ARM.extab/.ARM.exidx sections are no longer needed.
325
326o Redesign swapnetbsd.c (root/swap device specification)
327
328  Don't build a whole kernel only to specify root/swap devices.
329
330  Make these parameters re-configurable afterwards.
331
332o Namespace.
333
334  Investigate namespace of attributes/modules/options.  Figure out the hidden
335  design about these, document it, then re-design it.
336
337  At this moment, all of them share the single "selecttab", which means their
338  namespaces are common, but they also have respective tables (attrtab,
339  opttab, etc.).
340
341  Selecting an option (addoption()), that is also a module name, works only if
342  the module doesn't depend on anything, because addoption() doesn't select
343  module and its dependencies (selectattr()).  In other words, an option can
344  be safely converted to a module (define) only if it doesn't depend on
345  anything.  (One example is DDB.)
346
347o Convert pseudo(dev) attach functions to take (void) (== kernel ctors).
348
349  The pseudo attach function was originally designed to take `int n' as
350  the number of instances of the pseudo device.  Now most of pseudo
351  devices have been converted to be `cloneable', meaning that their
352  instances are dynamically allocated at run-time, because guessing how
353  many instances users need at compile time is almost impossible.
354  Restricting such a pure software resource at compile time is senseless,
355  considering that the rest of the world is dynamic.
356
357  Once pseudo attach functions become (void), config(1) no longer has to
358  generate a loop to call them; they can become part of the kernel
359  constructors, which are a list of (void) functions.
360
361  Some pseudo devices may have dependency/ordering problems, because
362  pseudo attach functions have no control over when they are called.  This
363  could be solved by converting to kctors, where functions are called in order
364  by dependency.
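
  A minimal sketch of "called in order by dependency": a depth-first
  topological sort over a small dependency matrix.  The module numbering,
  the matrix representation and the function names are hypothetical, and the
  dependency graph is assumed to be acyclic.

```c
#include <assert.h>
#include <stddef.h>

#define NMOD_SKETCH 4

/* Visit module i after everything it depends on; dep[i][j] != 0 means
 * "module i depends on module j". */
static void
kctor_visit_sketch(int i, int dep[NMOD_SKETCH][NMOD_SKETCH], int *done,
    int *order, size_t *n)
{
	int j;

	if (done[i])
		return;
	done[i] = 1;
	for (j = 0; j < NMOD_SKETCH; j++)
		if (dep[i][j])
			kctor_visit_sketch(j, dep, done, order, n);
	order[(*n)++] = i;
}

/*
 * Fill order[] so that every module appears after the modules it depends
 * on.  Returns the number of entries written (always NMOD_SKETCH).
 */
static size_t
kctor_order_sketch(int dep[NMOD_SKETCH][NMOD_SKETCH], int *order)
{
	int done[NMOD_SKETCH] = { 0 };
	size_t n = 0;
	int i;

	assert(order != NULL);
	for (i = 0; i < NMOD_SKETCH; i++)
		kctor_visit_sketch(i, dep, done, order, &n);
	return n;
}
```

  With 0 = a pseudo device depending on 2 = kern, kern depending on 1 = uvm,
  and uvm depending on 3 = machdep, the resulting order is 3, 1, 2, 0.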
365
366o Enhance ioconf behavior for pseudo-devices
367
368  See "bin/48571: config(1) ioconf is insufficient for pseudo-devices" for
369  more details.  In a nutshell, it would be "useful" for config to emit
370  the necessary stuff in the generated ioconf.[ch] to enable use of
371  config_{init,fini}_component() for attaching and detaching pseudodev's.
372
373  Currently, you need to manually construct your own data structures, and
374  manually "attach" them, one at a time.  This leads to duplication of
375  code (where multiple drivers contain the same basic logic), and doesn't
376  necessarily handle all of the "frobbing" of the kernel lists.
377
378o Disallow unknown options.
379
380  Don't accept options that are not defined as either defflag or defparam.
381  Report them and exit.  Don't set ${IDENT} in the generated Makefile.
382
383o Kill makeoptions.
384
385  It adds a variable defined in the generated `Makefile'.  While it looks
386  useful, it is too flexible and easy to abuse.  The `Makefile' should be
387  kept as simple as possible and have nothing that affects output contents.
388  Consider killing `makeoptions' entirely, replacing existing ones with `options'.
389
390o Don't use -Ttext ${TEXTADDR}.
391
392  Although ld(1)'s `-Ttext ${TEXTADDR}' is an easy way to specify the virtual
393  base address of .text at link time, it requires changing the command line;
394  in a kernel build, the Makefile has to change to reflect the kernel's
395  configuration.  It is simpler to reflect it in the linker script via assym.h.
396
397o Convert ${DIAGNOSTIC} and ${DEBUG} to flags (defflag).
398
399  Probably generate opt_diagnostic.h/opt_debug.h and include them in
400  sys/param.h.
401
402o Strictly define DIAGNOSTIC.
403
404  It is possible to make DIAGNOSTIC kernels and modules binary-compatible with
405  non-DIAGNOSTIC ones.  In that case, debug type information should match,
406  in theory (not confirmed).
407
408o Use suffix rules.
409
410  Build objects following suffix rules.  Source files are defined as relative to
411  $S (e.g. sys/kern/init_main.c) and objects are generated in the corresponding
412  subdirectories under kernel build directories (e.g.
413  .../compile/GENERIC/sys/kern/init_main.o).  Dig subdirectories from within
414  config(1).
415
416  Debugging (-g) and profiling (-pg) objects could be generated with *.go/*.po
417  suffixes as userland libraries do.  Maybe something similar for
418  DIAGNOSTIC/DEBUG.
419
420  genassym(1) definitions will be split into per-source instead of the single
421  assym.h.  Dependencies are corrected and some of the mysterious dependencies on
422  `Makefile' in sys/conf/Makefile.kern.inc can go away.
423
424o Define genassym(1) symbols per file.
425
426  Have each file define symbols that have to be generated by genassym(1) so
427  that more accurate dependency is reflected.
428
429  For example, if foo.S needs some symbols, it defines them in foo.assym,
430  declaring that foo.S depends on foo.assym.h, and includes foo.assym.h.
431  foo.assym.h is generated by following the suffix rule of .assym -> .assym.h.
432  When one header is updated, only related *.assym.h files are regenerated,
433  instead of rebuilding all MD/*.S files that depend on the global, single
434  assym.h.
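
  A minimal sketch of what a per-file foo.assym.h could contain: structure
  offsets turned into cpp(1) constants that foo.S can include.  The struct,
  its fields and the helper are hypothetical, and this is not how genassym(1)
  itself extracts values; it only illustrates the kind of output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical structure whose layout foo.S would need to know. */
struct foo_frame_sketch {
	long ff_sp;
	long ff_pc;
};

/*
 * Format one "#define NAME value" line, as foo.assym.h might contain.
 * Returns what snprintf(3) returns: the length of the formatted line.
 */
static int
emit_assym_sketch(char *buf, size_t len, const char *name, size_t value)
{
	assert(buf != NULL && name != NULL);
	return snprintf(buf, len, "#define\t%s\t%zu\n", name, value);
}
```

  A caller would emit one line per symbol, for example with
  offsetof(struct foo_frame_sketch, ff_pc) as the value for FF_PC.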
435