1 $NetBSD: system,v 1.5 2009/01/26 05:09:25 agc Exp $
2
3 NetBSD System Roadmap
4 =====================
5
6 This is a small roadmap document, and deals with the main system
7 aspects of the operating system.
8
9 NetBSD 5.0 will ship with the following main changes to the system:
10
11 1. Modularized scheduler
12 2. Real-time scheduling classes and priorities
13 3. Processor sets, processor affinity and processor control
14 4. Multiprocessor optimized scheduler
15 5. High-performance 1:1 threading implementation
16 6. Pushback of the global kernel lock
17 7. New kernel concurrency model
18 8. Multiprocessor optimized memory allocators
19 9. POSIX asynchronous I/O and message queues
20 10. In-kernel linker
21 11. SysV IPC tuneables
22 12. Improved observability: minidumps, lockstat and tprof
23 13. Power management framework
24
25 The following element has been added to the NetBSD-current tree, and will be
26 in NetBSD 6.0:
27
28 14. 64-bit time values supported
29
30 The following projects are expected to be included in NetBSD 6.0:
31
32 15. Full kernel preemption for real-time threads
33 16. POSIX shared memory
34 17. namei() tactical changes
35 18. Better resource controls
36 19. Improved observability: online crashdumps, remote debugging
37 20. Processor and cache topology aware scheduler
38
39 The timescales for 6.0 are not known at the present time, but we would
40 expect to branch 6.0 late in 2009, with a view to a 6.0 release in
41 early 2010.
42
43 We'll continue to update this roadmap as features and dates get firmed up.
44
45
46 Some explanations
47 =================
48
49 1. Modularized scheduler
50 ------------------------
51
52 Traditionally the only method of control on process scheduling was the
53 'nice' value assigned to each process. The scheduler interface has been
54 redesigned to allow for pluggable schedulers, selected at compile time.
55 At the current time, there are no plans to switch schedulers at run-time,
56 since there is little appreciable gain to be had from that, and the extra
57 performance hit to provide this functionality is thought not to be worth
58 it.
59
60 The in-kernel scheduler interface has been enhanced to provide a framework
61 for adding new schedulers, called the common scheduler framework - more
62 information can be found in the csf(9) manual page.
63
64 Responsible: ad, dsieger, rmind, yamt
65
66 2. Real-time scheduling classes and priorities
67 ----------------------------------------------
68
69 The scheduler has been extended to provide multiple new priority
70 bands, including real-time. POSIX standard interfaces for controlling
71 thread priority and scheduling class have been implemented, along with
72 a command line tool to allow control by the system administrator.
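
As a rough illustration of the POSIX interfaces mentioned above, the
sketch below prepares thread attributes that request the SCHED_FIFO
real-time class. The function name rt_attr_init is hypothetical and not
part of NetBSD; the priority range is queried at run time because it
differs between systems.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/*
 * Hypothetical helper: initialise attr so that a thread created with it
 * asks for SCHED_FIFO at the midpoint of that policy's priority range.
 * Returns 0 on success or an error number; *prio_out gets the priority.
 */
int rt_attr_init(pthread_attr_t *attr, int *prio_out)
{
    struct sched_param sp = { 0 };
    int lo, hi, rc;

    lo = sched_get_priority_min(SCHED_FIFO);
    hi = sched_get_priority_max(SCHED_FIFO);
    if (lo == -1 || hi == -1)
        return errno;
    sp.sched_priority = lo + (hi - lo) / 2;

    rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;

    /* Use the attributes below instead of inheriting the creator's. */
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (rc == 0)
        rc = pthread_attr_setschedpolicy(attr, SCHED_FIFO);
    if (rc == 0)
        rc = pthread_attr_setschedparam(attr, &sp);
    if (rc != 0) {
        pthread_attr_destroy(attr);
        return rc;
    }
    *prio_out = sp.sched_priority;
    return 0;
}
```

Passing the resulting attributes to pthread_create() is the step that
actually requests real-time scheduling, and it normally requires
appropriate privilege.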
73
74 3. Processor sets, processor affinity and processor control
75 -----------------------------------------------------------
76
77 A Solaris and HP-UX compatible interface for defining and controlling
78 processor sets has been added. Processor sets allow applications and
79 the administrator complete flexibility in partitioning CPU resources
80 among applications, down to thread-level granularity.
81
82 A Linux-compatible interface for controlling processor affinity, similar
83 in spirit to processor sets, is also provided.
84
85 A new utility to control CPU status (cpuctl) is provided. cpuctl
86 allows the administrator to enable and disable individual CPUs at
87 the software level, while the system is running. It is expected that
88 this will in time be extended to support full dynamic reconfiguration,
89 in concert with a hypervisor such as Xen.
90
91 4. Multiprocessor optimized scheduler
92 -------------------------------------
93
94 M2 is a pluggable scheduler optimized for multiprocessor systems. It
95 supports the POSIX real-time extensions and the time-sharing class, and
96 implements thread affinity.
97
98 5. High-performance 1:1 threading implementation
99 ------------------------------------------------
100
101 A new lightweight 1:1 threading implementation replaces the M:N based
102 implementation found in NetBSD 4.0 and earlier. The new implementation
103 conforms more closely to the POSIX threads standard, and provides a large
104 performance boost to threaded workloads in both uni- and multi-processor
105 configurations.
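
For illustration only, the sketch below splits a summation across
several POSIX threads; under a 1:1 model each of them is backed by its
own kernel-scheduled thread. It uses only standard pthread calls. The
names parallel_sum, sum_slice and NWORKERS are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

#define NWORKERS 4              /* illustrative thread count */

struct slice {
    const int *data;
    size_t len;
    long sum;
};

/* Thread body: sum one slice of the input. */
static void *sum_slice(void *arg)
{
    struct slice *s = arg;

    s->sum = 0;
    for (size_t i = 0; i < s->len; i++)
        s->sum += s->data[i];
    return NULL;
}

/*
 * Sum data[0..len) using NWORKERS threads. Returns 0 and stores the
 * total in *out, or returns an error number from pthread_create() or
 * pthread_join().
 */
int parallel_sum(const int *data, size_t len, long *out)
{
    pthread_t tid[NWORKERS];
    struct slice part[NWORKERS];
    size_t chunk = len / NWORKERS;
    long total = 0;
    int started = 0, rc = 0;

    for (int i = 0; i < NWORKERS && rc == 0; i++) {
        size_t begin = (size_t)i * chunk;

        part[i].data = data + begin;
        /* The last worker also takes the remainder. */
        part[i].len = (i == NWORKERS - 1) ? len - begin : chunk;
        rc = pthread_create(&tid[i], NULL, sum_slice, &part[i]);
        if (rc == 0)
            started++;
    }
    /* Join every thread that was started, even after a failure. */
    for (int i = 0; i < started; i++) {
        int jrc = pthread_join(tid[i], NULL);

        if (jrc != 0 && rc == 0)
            rc = jrc;
        total += part[i].sum;
    }
    if (rc == 0)
        *out = total;
    return rc;
}
```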
106
107 6. Pushback of the global kernel lock
108 -------------------------------------
109
110 Previously, most access to the kernel was single-threaded on multiprocessor
111 systems by the global kernel_lock. The kernel_lock has been pushed back
112 to the device driver and wire-protocol layers, providing a significant
113 performance boost on heavily loaded multiprocessor systems.
114
115 7. New kernel concurrency model
116 -------------------------------
117
118 The non-preemptive spinlock and "interrupt priority level" synchronization
119 model has been replaced wholesale with a hybrid thread/interrupt model. A
120 full range of new, lightweight synchronization primitives is available to
121 the kernel programmer, including: adaptive mutexes, reader/writer locks,
122 memory barriers, atomic operations, threaded soft interrupts, generic cross
123 calls, workqueues, priority inheritance, and per-CPU storage.
124
125 8. Multiprocessor optimized memory allocators
126 ---------------------------------------------
127
128 The memory allocators in both the kernel and user space are now fully
129 optimized for multiprocessor systems and eliminate the performance
130 degradation typically associated with memory allocators in an MP setting.
131
132 9. POSIX asynchronous I/O and message queues
133 ---------------------------------------------
134
135 A full implementation of the POSIX asynchronous I/O and message
136 queue facilities is now available.
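
A minimal sketch of the asynchronous I/O half of this interface
follows. It writes a buffer to a temporary file with aio_write(), waits
for completion, and reads the data back with aio_read(). The helper
names (aio_prepare, aio_wait, aio_roundtrip) are hypothetical; the
aio_* calls and struct aiocb are the standard POSIX ones.

```c
#define _POSIX_C_SOURCE 200809L
#include <aio.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Describe a transfer of len bytes at offset 0 of fd, with no signal. */
static void aio_prepare(struct aiocb *cb, int fd, void *buf, size_t len)
{
    memset(cb, 0, sizeof *cb);
    cb->aio_fildes = fd;
    cb->aio_buf = buf;
    cb->aio_nbytes = len;
    cb->aio_offset = 0;
    cb->aio_sigevent.sigev_notify = SIGEV_NONE;
}

/* Block until the request finishes; return its byte count or -1. */
static ssize_t aio_wait(struct aiocb *cb)
{
    const struct aiocb *const list[1] = { cb };
    ssize_t n;
    int err;

    while ((err = aio_error(cb)) == EINPROGRESS)
        (void)aio_suspend(list, 1, NULL);  /* retried if interrupted */

    n = aio_return(cb);         /* called once per completed request */
    if (err != 0) {
        errno = err;
        return -1;
    }
    return n;
}

/*
 * Write msg[0..len) to a temporary file asynchronously, read it back
 * into out, and return the number of bytes read (-1 on error).
 */
ssize_t aio_roundtrip(const char *msg, char *out, size_t len)
{
    FILE *tmp = tmpfile();
    struct aiocb cb;
    ssize_t n = -1;
    int fd;

    if (tmp == NULL)
        return -1;
    fd = fileno(tmp);

    /* The cast is safe here: a write request only reads the buffer. */
    aio_prepare(&cb, fd, (void *)msg, len);
    if (aio_write(&cb) == 0 && aio_wait(&cb) == (ssize_t)len) {
        aio_prepare(&cb, fd, out, len);
        if (aio_read(&cb) == 0)
            n = aio_wait(&cb);
    }
    fclose(tmp);
    return n;
}
```

A real application would normally do other work between queueing the
request and collecting its result, rather than waiting immediately.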
137
138 10. In-kernel linker
139 --------------------
140
141 An in-kernel ELF object linker has been added, and a revamped kernel module
142 infrastructure developed to accompany it. It is expected that the kernel
143 will become completely modular over time, while continuing to retain the
144 ability to link to a single binary image for embedded and hobby systems.
145
146 11. SysV IPC tuneables
147 ----------------------
148
149 Parameters for the SVR3-compatible IPC mechanisms can now be tuned
150 completely at runtime.
151
152 12. Improved observability: minidumps, lockstat and tprof
153 ---------------------------------------------------------
154
155 The x86 architecture now supports mini crash-dumps as a support aid for
156 kernel debugging. Only the memory actively in use by the kernel at the
157 time of the crash is dumped to and recovered from disk, an improvement
158 over the traditional scheme where the complete contents of memory are
159 dumped to disk.
160
161 The lockstat and tprof commands have been added to the system. lockstat
162 provides a high-resolution description of lock activity in a running system.
163
164 tprof uses sample-based profiling in conjunction with the available
165 performance counters in order to better profile system activity.
166
167 13. Power management framework
168 ------------------------------
169
170 A new power management framework has been introduced that improves
171 handling of device power state transitions. As power management support
172 is now integrated with the auto-configuration subsystem, the kernel can
173 ensure that a device's parent is powered on before attempting to access
174 the device itself.
175
176 With these changes comes an updated release of the Intel ACPI
177 Component Architecture and an x86 emulator which assists in restoring
178 uninitialized display adapters.
179
180 Leveraging this work, the i386 and amd64 kernels now support suspend
181 to RAM in uni- and multi-processor configurations on ACPI-capable
182 machines. This support has been successfully tested on a wide variety of
183 laptops, including (but not limited to) recent systems from Dell, IBM/Lenovo,
184 Fujitsu, Toshiba, and Sony.
185
186 Responsible: jmcneill, joerg
187
188 14. 64-bit time_t support
189 -------------------------
190
191 A signed 32-bit Unix time_t value will overflow in January 2038. Any
192 calculation that reaches past that date, such as a 30-year mortgage
193 schedule, is already in danger of overflowing. To address this, 64-bit
194 time_t values will be used to hold the number of seconds since 1970.
195
196 Responsible: christos
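
The arithmetic behind the overflow can be sketched as follows. A signed
32-bit counter of seconds since 1970 tops out at 2147483647, which
falls in January 2038 UTC, and a 30-year loan starting in January 2009
already ends beyond that. The function names are hypothetical and the
loan length ignores leap days for simplicity.

```c
#include <stdint.h>
#include <time.h>

/* Returns 1 if a seconds-since-1970 value fits in a signed 32-bit time_t. */
int fits_in_time32(int64_t seconds)
{
    return seconds >= INT32_MIN && seconds <= INT32_MAX;
}

/* Calendar year (UTC) of the last second a signed 32-bit time_t can hold. */
int last_time32_year(void)
{
    time_t last = (time_t)INT32_MAX;    /* fits in any time_t width */
    struct tm *utc = gmtime(&last);

    return utc != NULL ? utc->tm_year + 1900 : -1;
}

/* End of a loan lasting the given number of 365-day years. */
int64_t loan_end(int64_t start, int years)
{
    return start + (int64_t)years * 365 * 86400;
}
```

For example, with a start of 1232928000 (26 January 2009, 00:00 UTC),
fits_in_time32(loan_end(1232928000, 30)) is false, while the start
value itself still fits.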
197
198 15. Full kernel preemption for real-time threads
199 ------------------------------------------------
200
201 With the revamp of the kernel concurrency model, much of the kernel is fully
202 multi-threaded and can therefore be preempted at any time. In support of
203 lower context switch and dispatch times for real-time threads, full kernel
204 preemption is being implemented.
205
206 16. POSIX shared memory
207 -----------------------
208
209 Implement the POSIX shared memory facilities, which can be used to create
210 shared memory objects and map them into the address space of a
211 process.
212
213 Responsible: rmind
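
From an application's point of view, the facility could be used roughly
as sketched below: shm_open() creates or opens a named object,
ftruncate() sizes it, and mmap() adds it to the address space. The
helper name shm_map is hypothetical; the calls themselves are the
standard POSIX ones.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create (or open) the named shared memory object,
 * size it to len bytes, and map it read/write into this process.
 * Returns the mapping, or NULL on failure.
 */
void *shm_map(const char *name, size_t len)
{
    void *p = MAP_FAILED;
    int fd;

    fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd == -1)
        return NULL;
    if (ftruncate(fd, (off_t)len) == 0)
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);      /* the mapping stays valid after the close */
    return p == MAP_FAILED ? NULL : p;
}
```

Two processes (or two calls in one process) that map the same name see
the same memory. The object persists until it is removed with
shm_unlink(), and each mapping is released with munmap().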
214
215 17. Incremental namei improvements, Phase 1
216 -------------------------------------------
217
218 Implement the rest of the changes to namei outlined in Message-ID:
219 <20080319053709.GB3951@netbsd.org>. Simplify the locking and behavior
220 of namei() calls within the kernel to resolve path names within file
221 systems. This phase simplifies the majority of calls to namei().
222
223 Responsible: dholland
224
225 18. Better resource controls
226 ----------------------------
227
228 A resource provisioning and control framework that extends beyond the
229 traditional Unix process limits.
230
231 19. Improved observability: online crashdumps, remote debugging
232 ---------------------------------------------------------------
233
234 XXX crashdumps while the system is running
235 XXX firewire support in libkvm
236
237 20. Processor and cache topology aware scheduler
238 ------------------------------------------------
239
240 Implement the detection of the topology of the processors and caches.
241 Improve the scheduler to make decisions about thread migration
242 according to the topology, to get better thread affinity and less
243 cache thrashing, and thus improve overall performance in modern SMP
244 systems.
245
246 Responsible: rmind
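
To make the idea concrete, here is a toy sketch of a topology-aware
migration decision. It is not the planned implementation: the structure,
its fields, the three-level distance and the function names are all
assumptions made for illustration. The thread is moved to the idle CPU
that is closest to the one it last ran on, on the assumption that
closer CPUs share more cache.

```c
#include <stddef.h>

/* Hypothetical per-CPU topology record; the fields are illustrative. */
struct cpu_topo {
    int package_id;     /* physical package (socket) */
    int core_id;        /* core within the package */
    int idle;           /* non-zero if the CPU has nothing to run */
};

/* Lower is closer: 0 = same core, 1 = same package, 2 = other package. */
static int topo_distance(const struct cpu_topo *a, const struct cpu_topo *b)
{
    if (a->package_id != b->package_id)
        return 2;
    return a->core_id == b->core_id ? 0 : 1;
}

/*
 * Pick the idle CPU closest to the CPU the thread last ran on.
 * Returns its index, or -1 if no other CPU is idle.
 */
int pick_migration_target(const struct cpu_topo *cpus, size_t ncpu,
    size_t last)
{
    int best = -1, best_dist = 3;   /* 3 is worse than any real distance */

    for (size_t i = 0; i < ncpu; i++) {
        int d;

        if (i == last || !cpus[i].idle)
            continue;
        d = topo_distance(&cpus[last], &cpus[i]);
        if (d < best_dist) {
            best_dist = d;
            best = (int)i;
        }
    }
    return best;
}
```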
247
248 21. Incremental namei improvements, Phase 2
249 -------------------------------------------
250
251 Implement the rest of the changes to namei outlined in Message-ID:
252 <20080319053709.GB3951@netbsd.org>. Simplify the locking and behavior
253 of namei() calls within the kernel to resolve path names within file
254 systems.
255
256 Responsible: dholland
257
258
259
260 Andrew Doran
261 Alistair Crooks
262 Sun 25 Jan 2009 21:03:04 PST
263