History log of /src/sys/kern/kern_fork.c
Revision  Date  Author  Comments
 1.232  16-Jul-2025  kre Kernel part of O_CLOFORK implementation (plus kernel revbump)

This is Ricardo Branco's implementation of O_CLOFORK (and
associated fcntl, etc) for NetBSD (with a few minor changes
by me).

For now, the header file symbols that should be exposed to
userland are hidden inside temporary #ifdef _KERNEL blocks,
just to avoid random userland apps, or config scripts, from
seeing any of this before it is better tested.

Userland parts of this will follow soon.

This also bumps the kernel version to 10.99.15 (changes to
data structs, and the signature of fd_dup()).
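
Illustrative sketch only, not the committed kernel code: a toy userland
model of what a close-on-fork flag means for the descriptor table a child
receives. The names toy_fd, toy_fdtab and toy_fd_fork_copy() are
hypothetical and do not correspond to NetBSD's fd_dup() or its data
structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NFILES 8

/* Hypothetical per-descriptor state; not NetBSD's real structures. */
struct toy_fd {
	bool used;	/* slot holds an open file */
	bool clofork;	/* close-on-fork was requested for this slot */
	int  file_id;	/* stand-in for the underlying open file */
};

struct toy_fdtab {
	struct toy_fd slot[TOY_NFILES];
};

/*
 * Build the child's table at fork time: copy every open slot except
 * those marked close-on-fork, which the child never receives.
 */
void
toy_fd_fork_copy(const struct toy_fdtab *parent, struct toy_fdtab *child)
{
	assert(parent != NULL && child != NULL);

	for (size_t i = 0; i < TOY_NFILES; i++) {
		/* Start from an empty slot in the child. */
		child->slot[i] = (struct toy_fd){ false, false, 0 };
		if (parent->slot[i].used && !parent->slot[i].clofork)
			child->slot[i] = parent->slot[i];
	}
}
```

The parent's table is left untouched; only the child's view differs.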
 1.231  14-May-2024  andvar fix typos recently committed by msaitoh in a few more places, as well as a few others.
mainly s/contigous/contiguous/ and s/miliseconds/milliseconds/ in comments.
 1.230  25-Feb-2023  skrll Trailing whitespace
 1.229  01-Jul-2022  prlw1 Uglify code to fix build.
 1.228  01-Jul-2022  riastradh fork(2): Plug leaks in proc_alloc error branch.
 1.227  10-Oct-2021  thorpej Changes to make EVFILT_PROC MP-safe:

Because the locking protocol around processes is somewhat complex
compared to other events that can be posted on kqueues, introduce
new functions for posting NOTE_EXEC, NOTE_EXIT, and NOTE_FORK,
rather than just using the generic knote() function. These functions
KASSERT() their locking expectations, and deal with other complexities
for each situation.

knote_proc_fork(), in particular, needs to handle NOTE_TRACK, which
requires allocation of a new knote to attach to the child process. We
don't want to be allocating memory while holding the parent's p_lock.
Furthermore, we also have to attach the tracking note to the child
process, which means we have to acquire the child's p_lock.

So, to handle all this, we introduce some additional synchronization
infrastructure around the 'knote' structure:

- Add the ability to mark a knote as being in a state of flux. Knotes
in this state are guaranteed not to be detached/deleted, thus allowing
a code path to drop other locks after putting a knote in this state.

- Code paths that wish to detach/delete a knote must first check if the
knote is in-flux. If so, they must wait for it to quiesce. Because
multiple threads of execution may attempt this concurrently, a mechanism
exists for a single LWP to claim the detach responsibility; all other
threads simply wait for the knote to disappear before they can make
further progress.

- When kqueue_scan() encounters an in-flux knote, it simply treats the
situation just like encountering another thread's queue marker -- wait
for the flux to settle and continue on.

(The "in-flux knote" idea was inspired by FreeBSD, but this works differently
from their implementation, as the two kqueue implementations have diverged
quite a bit.)

knote_proc_fork() uses this infrastructure to implement NOTE_TRACK like so:

- Attempt to put the original tracking knote into a state of flux; if this
fails (because the note has a detach pending), we skip all processing
(the original process has lost interest, and we simply won the race).

- Once the note is in-flux, drop the kq and forking process's locks, and
allocate 2 knotes: one to post the NOTE_CHILD event, and one to attach
a new NOTE_TRACK to the child process. Notably, we do NOT go through
kqueue_register() to do this, but rather do all of the work directly
and KASSERT() our assumptions; this allows us to directly control our
interaction with locks. All memory allocations here are performed with
KM_NOSLEEP, in order to prevent holding the original knote in-flux
indefinitely.

- Because the NOTE_TRACK use case adds knotes to kqueues through a
sort of back-door mechanism, we must serialize with the closing of
the destination kqueue's file descriptor, so steal another bit from
the kq_count field to notify other threads that a kqueue is on its
way out to prevent new knotes from being enqueued while the close
path detaches them.

In addition to fixing EVFILT_PROC's reliance on KERNEL_LOCK, this also
fixes a long-standing bug whereby a NOTE_CHILD event could be dropped
if the child process exited before the interested process received the
NOTE_CHILD event (the same knote would be used to deliver the NOTE_EXIT
event, and would clobber the NOTE_CHILD's 'data' field).

Add a bunch of comments to explain what's going on in various critical
sections, and sprinkle additional KASSERT()s to validate assumptions
in several more locations.
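
Illustrative sketch only, not the committed code: the in-flux / detach
rules described above, reduced to a single-threaded state model. The flag
and function names (TOY_KN_INFLUX, TOY_KN_WILLDETACH, toy_kn_*) are
hypothetical. Where the kernel would sleep until the knote quiesces, this
sketch just returns false so the caller knows it must wait. It also assumes
at most one in-flux holder at a time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag names; the real kernel structure differs. */
#define TOY_KN_INFLUX      0x1u	/* knote must not be detached/deleted */
#define TOY_KN_WILLDETACH  0x2u	/* one thread has claimed the detach */

struct toy_knote {
	unsigned flags;
};

/* Try to pin the knote. Fails if a detach is already pending. */
bool
toy_kn_enter_flux(struct toy_knote *kn)
{
	if (kn->flags & TOY_KN_WILLDETACH)
		return false;
	kn->flags |= TOY_KN_INFLUX;
	return true;
}

/* Unpin the knote so that a pending detach can make progress. */
void
toy_kn_leave_flux(struct toy_knote *kn)
{
	assert(kn->flags & TOY_KN_INFLUX);
	kn->flags &= ~TOY_KN_INFLUX;
}

/* Claim detach responsibility; only the first caller gets true. */
bool
toy_kn_claim_detach(struct toy_knote *kn)
{
	if (kn->flags & TOY_KN_WILLDETACH)
		return false;
	kn->flags |= TOY_KN_WILLDETACH;
	return true;
}

/* The detach owner may proceed only once the knote has quiesced. */
bool
toy_kn_detach_may_proceed(const struct toy_knote *kn)
{
	return (kn->flags & TOY_KN_WILLDETACH) != 0 &&
	    (kn->flags & TOY_KN_INFLUX) == 0;
}
```

A caller that fails toy_kn_enter_flux() corresponds to the "skip all
processing" case in the NOTE_TRACK description: the detach won the race.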
 1.226  23-May-2020  ad Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
 1.225  12-May-2020  kamil Remove the stub support of CLONE_PID in clone(2)

CLONE_PID causes the child clonee to share the same process id as cloner.

It was implemented for debugging purposes in the Linux kernel 2.0,
restricted to root only in 2.3.21 and removed from Linux 2.5.16.

The CLONE_PID bit was recycled for CLONE_PIDFD in Linux 5.2.
 1.224  07-May-2020  kamil On debugger attach to a prestarted process don't report SIGTRAP

Introduce PSL_TRACEDCHILD that indicates tracking of birth of a process.
A freshly forked process checks whether it is traced and if so, reports
SIGTRAP + TRAP_CHLD event to a debugger as a result of tracking forks-like
events. There is a time window when a debugger can attach to a newly
created process and receive SIGTRAP + TRAP_CHLD instead of SIGSTOP.

Fixes races in t_ptrace_wait* tests when a test hangs or misbehaves,
especially the ones reported in tracer_sysctl_lookup_without_duplicates.
 1.223  24-Apr-2020  thorpej Overhaul the way LWP IDs are allocated. Instead of each process having its
own LWP ID space, LWP IDs now come from the same number space as PIDs. The
lead LWP of a process gets the PID as its LID. If a multi-LWP process's
lead LWP exits, the PID persists for the process.

In addition to providing system-wide unique thread IDs, this also lets us
eliminate the per-process LWP radix tree, and some associated locks.

Remove the separate "global thread ID" map added previously; it is no longer
needed to provide this functionality.

Nudged in this direction by ad@ and chs@.
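
Illustrative sketch only, not the committed allocator: a toy model of one
number space shared by PIDs and LWP IDs, where the lead LWP's LID equals
the PID. All names (toy_idspace, toy_proc, toy_proc_create(),
toy_lwp_create()) are hypothetical, and IDs are never reused here, which
the real allocator does not promise.

```c
#include <assert.h>

/* Hypothetical single number space shared by PIDs and LWP IDs. */
struct toy_idspace {
	int next;	/* next unused ID; never recycled in this sketch */
};

struct toy_proc {
	int pid;	/* process ID */
	int lead_lid;	/* LID of the lead LWP */
};

/* Hand out the next ID from the shared space. */
int
toy_id_alloc(struct toy_idspace *ids)
{
	return ids->next++;
}

/* New process: the lead LWP's LID is the PID itself. */
struct toy_proc
toy_proc_create(struct toy_idspace *ids)
{
	struct toy_proc p;

	p.pid = toy_id_alloc(ids);
	p.lead_lid = p.pid;
	return p;
}

/* Additional LWP: its LID comes from the same space as PIDs. */
int
toy_lwp_create(struct toy_idspace *ids, const struct toy_proc *p)
{
	int lid = toy_id_alloc(ids);

	/* A shared space means no LID can collide with the owning PID. */
	assert(lid != p->pid);
	return lid;
}
```

Because every ID is drawn from one counter, an LID identifies a thread
system-wide without also naming its process.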
 1.222  14-Apr-2020  kamil Set p_oppid always, not just when a parent is traced

PR kern/55151 by Martin Husemann
 1.221  06-Apr-2020  kamil branches: 1.221.2;
Reintroduce struct proc::p_oppid

Relying on p_opptr is not safe as there is a race between:
- spawner giving birth to a child process and being killed
- spawnee accessing p_opptr and reporting TRAP_CHLD

PR kern/54786 by Andreas Gustafsson
 1.220  05-Apr-2020  christos - Untangle spawn_return by splitting it up to sub-functions.
- Merge the eventswitch parent notification code which was copied in two
places (eventswitchchild)
- Fix bugs in the eventswitch parent notification code:
1. p_slflags should be accessed holding both proc_lock and p->p_lock
2. p->p_opptr can be NULL if the parent was PSL_CHTRACED and exited.

Fixes random crashes in the posix_spawn_kill_spawner unit test which tried
to dereference a NULL pptr.
 1.219  01-Mar-2020  ad child_return():

- This was assuming arg == curlwp, but NULL is passed to lwp_create(), as
evidenced by a random panic during testing. How did this ever work?

- Replace a goto.
 1.218  29-Jan-2020  ad - Track LWPs in a per-process radixtree. It uses no extra memory in the
single threaded case. Replace scans of p->p_lwps with lookups in the
tree. Find free LIDs for new LWPs in the tree. Replace the hashed sleep
queues for park/unpark with lookups in the tree under cover of a RW lock.

- lwp_wait(): if waiting on a specific LWP, find the LWP via tree lookup and
return EINVAL if it's detached, not ESRCH.

- Group the locks in struct proc at the end of the struct in their own cache
line.

- Add some comments.
 1.217  16-Dec-2019  ad branches: 1.217.2;
- Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).
 1.216  23-Nov-2019  ad Minor scheduler cleanup:

- Adapt to cpu_need_resched() changes. Avoid lost & duplicate IPIs and ASTs.
sched_resched_cpu() and sched_resched_lwp() contain the logic for this.
- Changes for LSIDL to make the locking scheme match the intended design.
- Reduce lock contention and false sharing further.
- Numerous small bugfixes, including some corrections for SCHED_FIFO/RT.
- Use setrunnable() in more places, and merge cut & pasted code.
 1.215  12-Oct-2019  kamil Remove p_oppid from struct proc

This field is not needed as it duplicates p_opptr that is already safe to
use, unless proven otherwise.

eventswitch() already contained a check for != initproc (pid1).

Ride ABI bump for 9.99.16.
 1.214  30-Sep-2019  kamil Move TRAP_CHLD/TRAP_LWP ptrace information from struct proc to siginfo

Storing struct ptrace_state information inside struct proc was vulnerable
to synchronization bugs, as multiple events emitted at the same time were
overwriting one another.

Cache the original parent process id in p_oppid. Reusing here p_opptr is
in theory prone to a slight race condition.

Change the semantics of PT_GET_PROCESS_STATE, returning EINVAL for calls
prompting for the value in cases when there wasn't registered an
appropriate event.

Add an alternative approach to check the ptrace_state information, directly
from the siginfo_t value returned from PT_GET_SIGINFO. The original
PT_GET_PROCESS_STATE approach is kept for compat with older NetBSD and
OpenBSD. New code is recommended to keep using PT_GET_PROCESS_STATE.

Add a couple of compile-time asserts for assumptions in the code.

No functional change intended in existing ptrace(2) software.

All ATF ptrace(2) and ATF GDB tests pass.

This change improves reliability of the threading ptrace(2) code.
 1.213  13-Jun-2019  kamil branches: 1.213.2;
Correct use-after-free issue in vfork(2)

In the previous behavior vforking parent was keeping pointer to a child
and checking whether it clears a PL_PPWAIT in its bitfield p_lflag. However
a child can go invalid between exec/exit event from child and waking up
vforked parent and this can cause invalid pointer read and in the worst
scenario kernel crash.

In the new behavior vforked child keeps a reference to vforked parent LWP
and sets a value l_vforkwaiting to false. This means that vforked child
can finish its work, exec/exit and be terminated and once parent will be
woken up it will read its own field whether its child is still blocking.

Add new field in struct lwp: l_vforkwaiting protected by proc_lock.
In future it should be refactored and all PL_PPWAIT users transformed to
l_vforkwaiting and next l_vforkwaiting probably transformed into a bit
field.

This is another attempt of fixing this bug after <rmind> from 2012 in
commit:

Author: rmind <rmind@NetBSD.org>
Date: Sun Jul 22 22:40:18 2012 +0000

fork1: fix use-after-free problems. Addresses PR/46128 from Andrew Doran.
Note: PL_PPWAIT should be fully replaced and modification of l_pflag by
other LWP is undesirable, but this is enough for netbsd-6.

The new version no longer performs unsafe access in l_lflag changing the
LP_VFORKWAIT bit.

Verified with ATF t_vfork and t_ptrace* tests and they are no longer
causing any issues in my local setup.

Fixes PR/46128 by Andrew Doran
 1.212  03-May-2019  kamil Register KTR events for debugger related signals

Register signals for:

- crashes (FPE, SEGV, FPE, ILL, BUS)
- LWP events
- CHLD (FORK/VFORK/VFORK_DONE) events -- temporarily disabled
- EXEC events

While there refactor related functions in order to simplify the code.

Add missing comment documentation for recently added kernel functions.
 1.211  01-May-2019  kamil Correct handling of corner cases in fork1(9) code under a debugger

Correct detaching and SIGKILLing forker and vforker in the middle of its
operation.
 1.210  01-May-2019  kamil Add eventswitch() in signal code

Route all crash and debugger related signal through eventswitch(), that
calls sigswitch() with preprocessed arguments.

This code avoids code duplication and allows to introduce changes that
will affect all callers of sigswitch() in debugger-related events.

No functional change intended.
 1.209  07-Apr-2019  kamil Add a paranoid racy lock check in child_return()

In theory a child could be detached for some reason or another during
the time window between checking for PSL_TRACED and acquiring proc_lock.

Acquire the proc_lock mutex and recheck for PSL_TRACED before emitting
SIGTRAP. sigswitch() must acquire it internally anyway so this does not
have a negative impact and adds an extra sanity check.

For !PSL_TRACED case there is no impact.
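
Illustrative sketch only, not the committed child_return(): the
check, lock, recheck pattern described above, using a pthread mutex as a
stand-in for proc_lock. The names toy_tracee, TOY_PSL_TRACED and
toy_child_report_trap() are hypothetical, and incrementing a counter
stands in for emitting SIGTRAP. The unlocked first read is only a hint;
the decision is made under the lock.

```c
#include <pthread.h>
#include <stdbool.h>

#define TOY_PSL_TRACED 0x1u	/* illustrative value only */

struct toy_tracee {
	pthread_mutex_t lock;		/* stand-in for proc_lock */
	unsigned slflag;		/* stand-in for p_slflag */
	unsigned traps_reported;	/* stand-in for emitted SIGTRAPs */
};

/* Returns true if a trap was reported to the (imaginary) debugger. */
bool
toy_child_report_trap(struct toy_tracee *p)
{
	bool traced;

	/* Cheap unlocked check: the untraced case never takes the lock. */
	if ((p->slflag & TOY_PSL_TRACED) == 0)
		return false;

	/*
	 * The tracer may have detached since the check above, so look
	 * again with the lock held and act while still holding it.
	 */
	pthread_mutex_lock(&p->lock);
	traced = (p->slflag & TOY_PSL_TRACED) != 0;
	if (traced)
		p->traps_reported++;
	pthread_mutex_unlock(&p->lock);

	return traced;
}
```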
 1.208  06-Apr-2019  kamil Centralized shared part of child_return() into MI part

Add a new function md_child_return() for MD specific bits only.

New child_return() is now part of MI and central code that handles
uniformly tracing code (KTR and ptrace(2)).

Synchronize value passed to ktrsysret() among ports to SYS_fork. This is
a traditional value and accessing p_lflag to check for PL_PPWAIT shall
use locking against proc_lock. Returning SYS_fork vs SYS_vfork still isn't
correct enough as there are more entry points to forking code. Instead of
making it too good, just settle with plain SYS_fork for all ports.
 1.207  05-Apr-2019  kamil Correct distinguishing fork/vfork tracing event in fork1(9)

flags can contain a different value than FORK_PPWAIT and bit comparing
with '&&' was a typo.

Detected with __clone(2) usage scenarios.
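
Illustrative sketch only: why '&&' in place of '&' misclassifies any
non-zero flags value as a vfork. The TOY_FORK_* values and function names
are hypothetical stand-ins, not copied from the kernel headers.

```c
#include <stdbool.h>

/* Illustrative flag values; the real definitions live in the kernel. */
#define TOY_FORK_PPWAIT   0x0001
#define TOY_FORK_SHAREVM  0x0002

/* Buggy form: logical AND is true for any non-zero flags value. */
bool
toy_is_vfork_buggy(int flags)
{
	return flags && TOY_FORK_PPWAIT;
}

/* Intended form: test the specific bit. */
bool
toy_is_vfork(int flags)
{
	return (flags & TOY_FORK_PPWAIT) != 0;
}
```

With flags set to TOY_FORK_SHAREVM alone, the buggy form reports a vfork
and the bitwise form does not.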
 1.206  03-Apr-2019  kamil Rework the fork(2)/vfork(2) event signalling under ptrace(2)

Remove the constraint of SIGTRAP event being maskable by a tracee.

Now all SIGTRAP TRAP_CHLD events are delivered to debugger.

This code touches MD specific logic and the child_return routine.
It's an intermediate step with a room for refactoring in future and
right now the least invasive approach. This allows to assert expected
behavior in already existing ATF tests and make the code prettier
in future keeping the same semantics. Probably there is a need for a MI
wrapper of child_return for shared functionality between ports.
 1.205  01-May-2018  kamil branches: 1.205.2;
Implement PTRACE_VFORK

Add support for tracing vfork(2) events in the context of ptrace(2).

This API covers other frontends to fork1(9) like posix_spawn(2) or clone(2),
if they cause parent to wait for exec(2) or exit(2) of the child.

Changes:
- Add new argument to sigswitch() determining whether we need to acquire
the proc_lock or whether it's already held.
- Refactor fork1(9) for fork(2) and vfork(2)-like events.
Call sigswitch() from fork1(9) for forking or vforking parent, instead of
emitting kpsignal(9). We need to emit the signal and suspend the parent,
returning to user and relock proc_lock.
- Add missing prototype for proc_stop_done() in kern_sig.c.
- Make sigswitch a public function accessible from other kernel code
including <sys/signalvar.h>.
- Remove an entry about unimplemented PTRACE_VFORK in the ptrace(2) man page.
- Permit PTRACE_VFORK in the ptrace(2) frontend for userland.
- Remove expected failure for unimplemented PTRACE_VFORK tests in the ATF
ptrace(2) test-suite.
- Relax signal routing constraints under a debugger for a vfork(2)ed child.
This intended to protect from signaling a parent of a vfork(2)ed child that
called PT_TRACE_ME, but wrongly misrouted other signals in vfork(2)
use-cases.

Add XXX comments about still existing problems and future enhancements:
- correct vfork(2) + PT_TRACE_ME handling.
- fork1(9) handling of scenarios when a process is collected in valid but
rare cases.

All ATF ptrace(2) fork[1-8] and vfork[1-8] tests pass.

Fix PR kern/51630 by Kamil Rytarowski (myself).

Sponsored by <The NetBSD Foundation>
 1.204  16-Apr-2018  kamil Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>
 1.203  07-Nov-2017  christos branches: 1.203.2;
Store full executable path in p->p_path as discussed in tech-kern.
This means that the full executable path is always available.

- exec_elf.c: use p->path to set AT_SUN_EXECNAME, and since this is
always set, do so unconditionally.
- kern_exec.c: simplify pathexec, use kmem_strfree where appropriate
and set p->p_path
- kern_exit.c: free p->p_path
- kern_fork.c: set p->p_path for the child.
- kern_proc.c: use p->p_path to return the executable pathname; the
NULL check for p->p_path, should be a KASSERT?
- exec.h: gc ep_path, it is not used anymore
- param.h: bump version, 'struct proc' size change

TODO:
1. reference count the path string, to save copy at fork and free
just before exec?
2. canonicalize the pathname by changing namei() to LOCKPARENT
vnode and then using getcwd() on the parent directory?
 1.202  21-Apr-2017  christos - Propagate the signal mask from the ucontext_t to the newly created thread
as specified by _lwp_create(2)
- Reset the signal stack for threads created with _lwp_create(2)
 1.201  31-Mar-2017  skrll spaces to tab
 1.200  31-Mar-2017  martin PR kern/52117: move stop code for debugged children after fork into MI code.
XXX we might want to revisit this when handling the same event for vfork
better.
 1.199  13-Jan-2017  kamil branches: 1.199.2;
Add support for PTRACE_VFORK_DONE and stub for PTRACE_VFORK in ptrace(2)

PTRACE_VFORK is supposed to be used to track vfork(2)-like events, when
parent gives birth to new process child and stops till it exits or calls
exec().
Currently PTRACE_VFORK is a stub.

PTRACE_VFORK_DONE is notification to notify a debugger that a parent has
resumed after vfork(2)-like action.
PTRACE_VFORK_DONE throws SIGTRAP with TRAP_CHLD.

Sponsored by <The NetBSD Foundation>
 1.198  10-Jan-2017  kamil Introduce new si_code for SIGTRAP: TRAP_CHLD - process child trap

The SIGTRAP signal is thrown from the kernel if EVENT_MASK (ptrace_event)
enables PTRACE_FORK. This new si_code helps debuggers to distinguish the
exact source of signal delivered for a debugger.

Another purpose of TRAP_CHLD is to retain the same behavior inside the
NetBSD kernel for process child traps and have an interface to monitor it.

Retrieving exact event and extended properties of process child trap is
available with PT_GET_PROCESS_STATE.

There is no behavior change for existing software.

This si_code value is NetBSD extension.

Sponsored by <The NetBSD Foundation>
 1.197  09-Jan-2017  kamil Cleanup dead code after revert of racy vfork(2) commit

This removes dead code introduced with the following commit:

date: 2012-07-27 22:52:49 +0200; author: christos; state: Exp; lines: +8 -2;
revert racy vfork() parent-blocking-before-child-execs-or-exits code.
ok rmind
 1.196  04-Nov-2016  christos deduplicate the complex lock reparent dance.
 1.195  09-Jan-2016  dholland branches: 1.195.2;
When doing an unlock/relock dance to avoid lock inversion, it's important
to relock the lock you unlocked. Otherwise the lock you unlocked won't
walk the walk, not by a long chalk, and you'll end up getting mocked.

From Mateusz Guzik of FreeBSD via freenode.

XXX: pullup-6 and -7
 1.194  02-Oct-2015  christos Change SDT (Statically Defined Tracing) probes to use link sets so that it
is easier to add probes. (From FreeBSD)
 1.193  22-Nov-2013  christos branches: 1.193.6;
convert vmem, signals, powerhooks from CIRCLEQ -> TAILQ.
 1.192  09-Jun-2013  riz branches: 1.192.2;
Add another field to the SDT_PROBE_DEFINE macro, so our DTrace probes
can be named the same as those on other platforms.

For example, proc:::exec-success, not proc:::exec_success.

Implementation follows the same basic principle as FreeBSD's; add
another field to the SDT_PROBE_DEFINE macro which is the name
as exposed to userland.
 1.191  27-Jul-2012  christos branches: 1.191.2;
revert racy vfork() parent-blocking-before-child-execs-or-exits code.
ok rmind
 1.190  22-Jul-2012  rmind fork1: fix use-after-free problems. Addresses PR/46128 from Andrew Doran.
Note: PL_PPWAIT should be fully replaced and modification of l_pflag by
other LWP is undesirable, but this is enough for netbsd-6.
 1.189  13-Mar-2012  elad Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.
 1.188  02-Mar-2012  rmind - Add __cacheline_aligned for nprocs, make fork_tfmrate static.
- Fix indentation, remove whitespaces and redundant brackets.
 1.187  02-Feb-2012  christos Disable PTRACE_FORK for vforked() children, because the parent is waiting
and will not receive the SIGTRAP in time.
 1.186  02-Sep-2011  christos branches: 1.186.2; 1.186.6;
Add support for PTRACE_FORK. NB: This does not (yet) work for vfork(), because:
1. When we vfork() PL_PPWAIT is set, and that makes us do regular disposition
of the TRAP signal, and not indirect through the debugger.
2. The parent needs to keep running, so that the debugger can release it.
Unfortunately, with vfork() we cannot release the parent because it will
eventually core-dump since the parent and the child cannot run on the
same address space.
 1.185  23-Aug-2011  christos don't use lwp_setprivate in fork, but copy the private lwp member directly
because userland might have messed with the TLS register without letting
the kernel know. This fixes fork() on amd64. Thanks chuq!
 1.184  14-May-2011  rmind fork1: fix stop-on-fork case, lend a correct lock to LWP for LSSTOP state.

Fixes PR/44935.
 1.183  01-May-2011  rmind - Remove FORK_SHARELIMIT and PL_SHAREMOD, simplify lim_privatise().
- Use kmem(9) for struct plimit::pl_corename.
 1.182  26-Apr-2011  joerg Remove IRIX emulation
 1.181  24-Apr-2011  rmind - Move some checks into mqueue_get() and avoid some duplication.
- Simplify message queue descriptor unlinking and closure operations.
- Update proc_t::p_mqueue_cnt atomically. Inherit it on fork().
- Use separate allocation for the name of message queue.
 1.180  23-Mar-2011  joerg Preserve l_private across forks.
 1.179  18-Jan-2011  matt Copy PK_32 to p2->p_flag instead of doing it in the cpu_proc_fork hook.
 1.178  07-Jul-2010  chs branches: 1.178.2;
many changes for COMPAT_LINUX:
- update the linux syscall table for each platform.
- support new-style (NPTL) linux pthreads on all platforms.
clone() with CLONE_THREAD uses 1 process with many LWPs
instead of separate processes.
- move the contents of sys__lwp_setprivate() into a new
lwp_setprivate() and use that everywhere.
- update linux_release[] and linux32_release[] to "2.6.18".
- adjust placement of emul fork/exec/exit hooks as needed
and adjust other emul code to match.
- convert all struct emul definitions to use named initializers.
- change the pid allocator to allow multiple pids to refer to the same proc.
- remove a few fields from struct proc that are no longer needed.
- disable the non-functional "vdso" code in linux32/amd64,
glibc works fine without it.
- fix a race in the futex code where we could miss a wakeup after
a requeue operation.
- redo futex locking to be a little more efficient.
 1.177  13-Jun-2010  yamt increment p_nrlwps in lwp_create rather than letting callers do so
as it's always decremented by lwp_exit. this fixes error recovery of
eg. aio_procinit.
 1.176  01-Mar-2010  darran branches: 1.176.2;
DTrace: Add an SDT (Statically Defined Tracing) provider framework, and
implement most of the proc provider. Adds proc:::create, exec,
exec_success, exec_failure, signal_send, signal_discard, signal_handle,
lwp_create, lwp_start, lwp_exit.
 1.175  08-Jan-2010  pooka branches: 1.175.2;
The VATTR_NULL/VREF/VHOLD/HOLDRELE() macros lost their will to live
years ago when the kernel was modified to not alter ABI based on
DIAGNOSTIC, and now just call the respective function interfaces
(in lowercase). Plenty of mix'n match upper/lowercase has crept
into the tree since then. Nuke the macros and convert all callsites
to lowercase.

no functional change
 1.174  21-Oct-2009  rmind Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.
 1.173  24-Mar-2009  christos use kauth instead of uid != 0
 1.172  17-Jan-2009  yamt branches: 1.172.2;
malloc -> kmem_alloc.
 1.171  11-Oct-2008  pooka branches: 1.171.2; 1.171.4; 1.171.8; 1.171.10; 1.171.12;
Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.
 1.170  16-Jun-2008  ad branches: 1.170.2;
- PPWAIT need only be locked by proc_lock, so move it to proc::p_lflag.
- Remove a few needless lock acquires from exec/fork/exit.
- Sprinkle branch hints.

No functional change.
 1.169  02-Jun-2008  ad branches: 1.169.2;
Most contention on proc_lock is from getppid(), so cache the parent's PID.
 1.168  02-Jun-2008  ad If vfork(), we want the LWP to run fast and on the same CPU
as its parent, so that it can reuse the VM context and cache
footprint on the local CPU.
 1.167  31-May-2008  ad Hold proc_lock when sleeping on p_waitcv, not proc::p_lock.
 1.166  27-May-2008  ad tsleep -> kpause
 1.165  27-May-2008  ad Start profiling clock on new process before setting it running, in case
there is a preemption.
 1.164  19-May-2008  ad Reduce ifdefs due to MULTIPROCESSOR slightly.
 1.163  28-Apr-2008  martin branches: 1.163.2;
Remove clause 3 and 4 from TNF licenses
 1.162  24-Apr-2008  ad branches: 1.162.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
 1.161  24-Apr-2008  ad Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
 1.160  23-Mar-2008  ad branches: 1.160.2;
Undo 1.150 (Don't make root an exception when enforcing rlimits). No other
Unix behaves this way and it breaks too many things, e.g. web servers.
 1.159  21-Mar-2008  ad Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
 1.158  24-Feb-2008  dsl Set p->p_trace_enabled in fork and whenever the controlling flags change
instead of doing it in syscall_intern().
Note that syscall_intern() must still be called when the flags change
since many ports use a different copy of the syscall entry code when
tracing is enabled.
 1.157  28-Jan-2008  ad branches: 1.157.2; 1.157.6;
Authorize using the LWP cached credentials, not process credentials.
 1.156  07-Jan-2008  elad Make fork use kauth.

Been running in my tree for over a month at least.

Reviewed and okay yamt@, and special thanks to him as well as rittera@
for making this possible through fixing NDIS to not call fork1() with
l1 != curlwp.
 1.155  02-Jan-2008  ad Merge vmlocking2 to head.
 1.154  31-Dec-2007  ad Remove systrace. Ok core@.
 1.153  20-Dec-2007  dsl Convert all the system call entry points from:
int foo(struct lwp *l, void *v, register_t *retval)
to:
int foo(struct lwp *l, const struct foo_args *uap, register_t *retval)
Fixup compat code to not write into 'uap' and (in some cases) to actually
pass a correctly formatted 'uap' structure with the right name to the
next routine.
A few 'compat' routines that just call standard ones have been deleted.
All the 'compat' code compiles (along with the kernels required to test
build it).
98% done by automated scripts.
 1.152  05-Dec-2007  ad branches: 1.152.4;
Match the docs: MUTEX_DRIVER/SPIN are now only for porting code written
for Solaris.
 1.151  04-Dec-2007  ad Use atomics to maintain nprocs.
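
Illustrative sketch only, not the committed code: maintaining a process
count with atomic read-modify-write operations instead of a lock. It uses
C11 <stdatomic.h> rather than the kernel's own atomic primitives, and the
names toy_nprocs, toy_proc_reserve(), toy_proc_release() and
toy_proc_count() are hypothetical.

```c
#include <stdatomic.h>

static atomic_int toy_nprocs;	/* stand-in for the global process count */

/*
 * Reserve a process slot. Returns 0 on success, or -1 if the limit
 * would be exceeded, in which case the increment is backed out.
 */
int
toy_proc_reserve(int maxproc)
{
	/* fetch_add returns the old value; add 1 to get the new count. */
	int count = atomic_fetch_add(&toy_nprocs, 1) + 1;

	if (count > maxproc) {
		atomic_fetch_sub(&toy_nprocs, 1);
		return -1;
	}
	return 0;
}

/* Give a slot back, e.g. when a process exits. */
void
toy_proc_release(void)
{
	atomic_fetch_sub(&toy_nprocs, 1);
}

/* Snapshot of the current count. */
int
toy_proc_count(void)
{
	return atomic_load(&toy_nprocs);
}
```

Incrementing first and backing out on failure avoids a separate
check-then-increment window in which two callers could both pass the
limit test.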
 1.150  03-Dec-2007  elad Don't make root an exception when enforcing rlimits.

Suggested by yamt@ months ago; okay christos@.
 1.149  03-Dec-2007  ad branches: 1.149.2;
Soft interrupts can now take proclist_lock, so there is no need to
double-lock alllwp or allproc.
 1.148  27-Nov-2007  ad Tidy up the sigacts locking a bit: sigacts can be shared between
multiple processes.
 1.147  07-Nov-2007  ad Merge from vmlocking:

- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
 1.146  06-Nov-2007  ad Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.
 1.145  24-Oct-2007  ad branches: 1.145.2;
Make ras_lookup() lockless.
 1.144  29-Sep-2007  dsl branches: 1.144.2;
Change the way p->p_limit (and hence p->p_rlimit) is locked.
Should fix PR/36939 and make the rlimit code MP safe.
Posted for comment to tech-kern (none received!)

The p_limit field (for a process) is only changed once (on the first
write), and a reference to the old structure is kept (for code paths
that have cached the pointer).
Only p->p_limit is now locked by p->p_mutex, and since the referenced memory
will not go away, is only needed if the pointer is to be changed.
The contents of 'struct plimit' are all locked by pl_mutex, except that the
code doesn't bother to acquire it for reads (which are basically atomic).
Add FORK_SHARELIMIT that causes fork1() to share the limits between parent
and child, use it for the IRIX_PR_SULIMIT.
Fix borked test for both IRIX_PR_SUMASK and IRIX_PR_SDIR being set.
 1.143  21-Sep-2007  dsl branches: 1.143.2;
Rename members of 'struct plimit' so that the fields are 'pl_xxx' and
no longer have the same names as members of 'struct proc'.
 1.142  15-Aug-2007  ad branches: 1.142.2;
Changes to make ktrace LKM friendly and reduce ifdef KTRACE. Proposed
on tech-kern.
 1.141  09-Jul-2007  ad branches: 1.141.2; 1.141.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.140  15-Jun-2007  ad splstatclock, spllock -> splhigh
 1.139  17-May-2007  yamt merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.
 1.138  30-Apr-2007  rmind Import of POSIX Asynchronous I/O.
Seems to be quite stable. Some work still left to do.

Please note, that syscalls are not yet MP-safe, because
of the file and vnode subsystems.

Reviewed by: <tech-kern>, <ad>
 1.137  13-Mar-2007  ad Sync with kern_proc.c: make p2->p_rasmutex a spin mutex at IPL_SCHED.
 1.136  09-Mar-2007  ad branches: 1.136.2; 1.136.4;
- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.
 1.135  04-Mar-2007  christos Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.134  22-Feb-2007  thorpej TRUE -> true, FALSE -> false
 1.133  21-Feb-2007  thorpej Pick up some additional files that were missed before due to conflicts
with newlock2 merge:

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.132  17-Feb-2007  pavel Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.
 1.131  15-Feb-2007  ad branches: 1.131.2;
Restore proc::p_userret in a limited way for Linux compat. XXX
 1.130  09-Feb-2007  ad Merge newlock2 to head.
 1.129  15-Jan-2007  elad Introduce kauth_proc_fork() to control credential inheritance.
 1.128  01-Nov-2006  yamt remove some __unused from function parameters.
 1.127  12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.126  17-Jul-2006  ad branches: 1.126.4; 1.126.6;
- Always make p->p_cred a private copy before modifying.
- Share credentials among processes when forking.
 1.125  07-Jun-2006  kardel merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.124  14-May-2006  elad branches: 1.124.2;
integrate kauth.
 1.123  11-Dec-2005  christos branches: 1.123.4; 1.123.6; 1.123.8; 1.123.10; 1.123.12;
merge ktrace-lwp.
 1.122  17-May-2005  cube branches: 1.122.2;
Add P_CLDSIGIGN, P_NOCLDSTOP and P_NOCLDWAIT to the list of flags we want
to inherit from the parent process.
 1.121  02-Mar-2005  mycroft branches: 1.121.2;
Copyright maintenance.
 1.120  26-Feb-2005  perry nuke trailing whitespace
 1.119  17-Sep-2004  enami branches: 1.119.4; 1.119.6;
- proc_alloc() already initializes p_stat to SIDL.
- copy unconditionally inherited p_flag bits in a single place.
 1.118  08-Aug-2004  jdolecek Linux enforces CLONE_VM if CLONE_SIGHAND in clone(2) is specified,
follow suit - this is intended to be a Linux-compatible call
 1.117  08-Aug-2004  jdolecek pass the fork flags down the emulation proc fork hook
 1.116  07-Aug-2004  christos PR/26468: Andrew Brown: Setting stopfork can panic the kernel.
When stopfork is set, we need to set p_nrlwps, since we are not going to
be running.
 1.115  06-May-2004  pk Provide a mutex for the process limits data structure.
 1.114  12-Feb-2004  enami branches: 1.114.2;
Also defer the writing of KTR_EMUL entry. Otherwise, the parent process
may sleep with setting KTRFAC_ACTIVE of child process and the child will
run without emitting any ktrace entry.
 1.113  12-Nov-2003  dsl - Count number of zombies and stopped children and requeue them at the top
of the sibling list so that find_stopped_child can be optimised to avoid
traversing the entire sibling list - helps when a process has a lot of
children.
- Modify locking in pfind() and pgfind() so that the caller can rely on the
result being valid, allow caller to request that zombies be findable.
- Rename pfind() to p_find() to ensure we break binary compatibility.
- Remove svr4_pfind since p_find will now do the job.
- Modify some of the SMP locking of the proc lists - signals are still stuffed.

Welcome to 1.6ZF
 1.112  04-Nov-2003  dsl Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)
 1.111  16-Sep-2003  christos add siginfo lock and siginfo queue initialization.
 1.110  07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.109  29-Jun-2003  fvdl branches: 1.109.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.108  28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.107  19-Mar-2003  dsl Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.
 1.106  24-Jan-2003  thorpej Add "fork hooks", a'la "exec hooks" and "exit hooks" which allow
subsystems to do special processing to the parent and/or child at
fork time.
 1.105  18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.104  12-Dec-2002  jdolecek branches: 1.104.2;
replace magic number '500' in pid allocation code with a macro PID_SKIP,
defined in <sys/proc.h> (along PID_MAX, NO_PID)
 1.103  11-Dec-2002  jdolecek put back portion of fork-bomb protection removed in last commit,
and make the sleep length depend on value of variable forkfsleep;
it's set to zero by default (no sleep)
this is a preparation for making the sleep length settable via sysctl
 1.102  11-Dec-2002  groo Remove portion of fork-bomb protection that has unfortunate side effects.
 1.101  05-Dec-2002  jdolecek Couple fork-bomb defense changes:

- leave 5 processes for root-only use, the previous value of 1
was insufficient to execute additional commands once logged in, and
perhaps also not enough to actually login remotely with recent (open)sshd
- protect the log of "proc: table full" with ratecheck(), so that
the message is only logged once per 10 seconds; though syslogd normally
doesn't pass the repeated messages through, this avoids flooding
syslogd and potentially also screen/logs
- If the process hits either system limit of number of processes in system,
or user's limit of same, force the process to sleep for 0.5 seconds
before returning failure. This turns 2000 rampaging fork monsters into
2000 harmlessly snoozing fork monsters.
The sleep is intentionally uninterruptible by signals.

These are not intended as ultimate protection against fork-bombs.
A determined attacker can eat CPU differently than via repeating
fork() calls. But this is good enough to help protect against
programming mistakes or simple-minded tests.

Based on FreeBSD kern_fork.c change in revision 1.132 by
Mike Silbersack <silby at FreeBSD org>

Change also discussed on tech-kern@NetBSD.org, thread
'Fork bomb protection patch'.
 1.100  30-Nov-2002  manu cosmetic fix
 1.99  17-Nov-2002  chs change uvm_uarea_alloc() to indicate whether the returned uarea is already
backed by physical pages (ie. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.
 1.98  13-Nov-2002  provos fix systrace panic that was introduced when postponing pid number allocation
approved itojun
 1.97  07-Nov-2002  manu Added two sysctl-able flags: proc.curproc.stopfork and proc.curproc.stopexec
that can be used to block a process after fork(2) or exec(2) calls. The
new process is created in the SSTOP state and is never scheduled for running.

This feature is designed so that it is easy to attach the process using gdb
before it has done anything.

It works also with sproc, kthread_create, clone...
 1.96  23-Oct-2002  jdolecek merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
 1.95  21-Oct-2002  christos Move pid allocation to the bottom of the process allocation, so that we
don't have to deal with partially initialized proc structs in the scheduler.
Pointed out by: Artur Grabowski and Chuck Silvers.
 1.94  25-Sep-2002  thorpej Don't include <sys/map.h>.
 1.93  22-Sep-2002  chs encapsulate knowledge of uarea allocation in some new functions.
 1.92  28-Aug-2002  gmcgarry MI kernel support for user-level Restartable Atomic Sequences (RAS).
 1.91  06-Aug-2002  pooka Add FORK_CLEANFILES flag to fork1(), which makes the new process start out
with a clean descriptor set (ie. not copied or shared from parent).

for rfork()
 1.90  11-Jul-2002  pooka Add FORK_NOWAIT flag, which sets init as the parent of the forked
process. Useful for FreeBSD rfork() emulation.

ok'd by Christos
 1.89  17-Jun-2002  christos Niels Provos systrace work, ported to NetBSD by kittenz and reworked...
 1.88  08-Dec-2001  thorpej branches: 1.88.4; 1.88.8;
Make the coredump routine exec-format/emulation specific. Split
out traditional NetBSD coredump routines into core_netbsd.c and
netbsd32_core.c (for COMPAT_NETBSD32).
 1.87  12-Nov-2001  lukem add RCSIDs
 1.86  07-Jul-2001  fvdl branches: 1.86.2; 1.86.6;
flags was used uninitialized.
 1.85  01-Jul-2001  thorpej Linux-compatible clone(2) system call, lifted from the Linux
compatibility module. Based on patches from Bang Jun-Young <bjy@mogua.org>.
 1.84  26-Feb-2001  lukem branches: 1.84.2;
minor KNF
 1.83  09-Jan-2001  fvdl Do syscall_intern after p_traceflag has been copied to the new
process (if it is inherited), so that ktrace continues to work
properly on the child.
 1.82  31-Dec-2000  ad PR 4853: we fork a lot more during startup these days. Wrap nextpid to 500.
 1.81  22-Dec-2000  jdolecek split off thread specific stuff from struct sigacts to struct sigctx, leaving
only signal handler array sharable between threads
move other random signal stuff from struct proc to struct sigctx

This addresses kern/10981 by Matthew Orgass.
 1.80  11-Dec-2000  tsutsui Use USPACE_ALIGN for an alignment argument on allocating U-area.
The default value is 0, and could be overridden by machine/vmparam.h.
 1.79  11-Dec-2000  mycroft Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.
 1.78  10-Dec-2000  jdolecek fork1(): write the ktrace entry before the parent is put to sleep for
FORK_PPWAIT case, so that this DTRT for vfork() too
 1.77  27-Nov-2000  nisimura Introduce uvm_km_valloc_align() and use it to grab process's USPACE
aligned on USPACE boundary in kernel virtual address. It's beneficial
for MIPS R4000's paired TLB entry design.
 1.76  08-Nov-2000  chs in fork1(), only make the new proc visible (by giving it a pid
and adding it to allproc) after it's fully initialized.
this prevents the scheduler from coming in via a clock interrupt
and tripping over a partially-initialized proc.
 1.75  07-Nov-2000  jdolecek add void *p_emuldata into struct proc - this can be used to hold per-process
emulation-specific data
add process exit, exec and fork function hooks into struct emul:
* e_proc_fork() - called in fork1() after the new forked process is setup
* e_proc_exec() - called in sys_execve() after the executed process is setup
* e_proc_exit() - called in exit1() after all the other process cleanups are
done, right before machine-dependent switch to new context; also called
for "old" emulation from sys_execve() if emulation of executed program and
the original process is different

This was discussed on tech-kern.
 1.74  07-Nov-2000  jdolecek write KTR_EMUL entry on end of fork1() - primarily usable when the new
process never does execve(2), such as when creating a thread
 1.73  06-Sep-2000  sommerfeld Lock scheduler before putting new proc on run queues.
 1.72  25-Aug-2000  sommerfeld MULTIPROCESSOR: Initialize new proc's p_cpu pointer to NULL, so
anything which looks at it before it runs won't explode.
 1.71  22-Aug-2000  thorpej Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.
 1.70  01-Aug-2000  thorpej ANSI'ify.
 1.69  04-Jul-2000  jdolecek change tablefull() to accept one more parameter - optional hint

use that to inform about way to raise current limit when we reach maximum
number of processes, descriptors or vnodes

XXX hopefully I caught all users of tablefull()
 1.68  27-Jun-2000  mrg remove include of <vm/vm.h>
 1.67  26-Jun-2000  mrg remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.
 1.66  31-May-2000  thorpej branches: 1.66.2;
Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.
 1.65  28-May-2000  thorpej Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.
 1.64  08-May-2000  thorpej branches: 1.64.2;
__predict_false() the test for full process table, user exceeding their
process limit, and USPACE valloc failure.
 1.63  30-Mar-2000  augustss Get rid of register declarations.
 1.62  23-Mar-2000  thorpej New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
 1.61  22-Jul-1999  thorpej branches: 1.61.2;
Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.
 1.60  22-Jul-1999  thorpej Rearrange some code slightly.
 1.59  13-May-1999  thorpej Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
 1.58  13-May-1999  thorpej Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).
 1.57  30-Apr-1999  thorpej Pay attention to FORK_SHARECWD, FORK_SHAREFILES, and FORK_SHARESIGS.
 1.56  30-Apr-1999  thorpej Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).
 1.55  30-Apr-1999  thorpej Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).
 1.54  24-Mar-1999  mrg branches: 1.54.4;
completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.
 1.53  23-Feb-1999  ross Replace the recent scheduler mods with calls to scheduler_{fork,wait}_hook(),
(inlined) so scheduler functionality can be kept in a single .h/.c set.
Also, the wait hook has changed the way it clips the scheduler history.
 1.52  23-Jan-1999  sommerfe Tweak to earlier fix to p_estcpu:
- no longer conditionalized
- when traced, charge time to real parent, not debugger
- make it clear for future rototillers that p_estcpu should be moved
to the "copy" region of struct proc.
 1.51  23-Jan-1999  sommerfe Under control of "slowchild" global, make child process inherit the
scheduling penalty for being cpu-bound (p_estcpu) of its parent.
 1.50  11-Nov-1998  thorpej Move fork_kthread() to a new file, kern_kthread.c, and rename it to
kthread_create(). Implement kthread_exit() (causes a thread to exit).
Set P_NOCLDWAIT on kernel threads, which will cause any of their children
to be reparented to init(8) (which is already prepared to wait out orphaned
processes).
 1.49  11-Nov-1998  thorpej Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().
 1.48  08-Sep-1998  thorpej - Use proclists[], rather than checking allproc and zombproc explicitly.
- Add some comments about locking.
 1.47  31-Aug-1998  thorpej Use the pool allocator and "nointr" pool page allocator for pcred and
plimit structures.
 1.46  13-Aug-1998  eeh Merge paddr_t changes into the main branch.
 1.45  04-Aug-1998  perry Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
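
The argument-order change in the mapping above is easy to get wrong, so
here is a small self-contained sketch. legacy_copy() and
copy_orders_agree() are hypothetical helpers written for this
illustration; they are not part of the commit.

```c
#include <assert.h>
#include <string.h>

/* Legacy-style copy: source first, destination second (bcopy order). */
static void
legacy_copy(const void *src, void *dst, size_t len)
{
	assert(src != NULL && dst != NULL);
	/* memmove tolerates overlapping buffers, as ovbcopy did */
	memmove(dst, src, len);
}

/* Returns 1 if both call styles fill the destination identically. */
static int
copy_orders_agree(void)
{
	const char src[8] = "netbsd";
	char a[8], b[8];

	memset(a, 0, sizeof(a));		/* was: bzero(a, sizeof(a)) */
	memset(b, 0, sizeof(b));
	legacy_copy(src, a, sizeof(src));	/* old order: (src, dst, len) */
	memcpy(b, src, sizeof(src));		/* new order: (dst, src, len) */
	return memcmp(a, b, sizeof(a)) == 0;	/* was: bcmp(a, b, len) == 0 */
}
```

Only the copy routines swap their first two arguments; the compare and
zeroing routines keep the buffer in the first position.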
 1.44  02-Aug-1998  thorpej Use a pool for proc structures.
 1.43  25-Jun-1998  thorpej branches: 1.43.2;
defopt KTRACE
 1.42  02-May-1998  christos New fktrace syscall from Darren Reed [with fixes from me]
 1.41  09-Apr-1998  thorpej Allocate kernel virtual address space for the U-area before allocating
the new proc structure when performing a fork. This makes it much
easier to abort a fork operation and return an error if we run out
of KVA space.

The U-area pages are still wired down in {,u}vm_fork(), as before.
 1.40  01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.39  14-Feb-1998  thorpej Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.
 1.38  10-Feb-1998  mrg - add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.
 1.37  05-Feb-1998  mrg initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)
 1.36  06-Jan-1998  thorpej Allow retval to be NULL, filling it in only if it was passed.
 1.35  05-Jan-1998  thorpej Also pass fork1() a struct proc **, in case the caller wants a pointer
to the newly created process.
 1.34  04-Jan-1998  thorpej New vfork(2) implementation, with semantics matching those of the original
3BSD vfork(2), i.e. share address space w/ parent and block parent.

Keep statistics on the total number of forks, the number of forks that
block the parent, and the number of forks that share the address space
with the parent.
 1.33  03-Jan-1998  thorpej Update for additional argument to vm_fork() ("shared" boolean).
 1.32  19-Jun-1997  pk branches: 1.32.6; 1.32.8;
Remove __FORK_BRAINDAMAGEd code; it's no longer needed.
 1.31  18-Feb-1997  mrg pass P_SUGID to child. from freebsd.
 1.30  09-Oct-1996  mycroft branches: 1.30.6;
Use PHOLD() and PRELE() rather than frobbing p_holdcnt directly.
 1.29  09-Feb-1996  christos branches: 1.29.4;
More proto fixes
 1.28  04-Feb-1996  christos First pass at prototyping
 1.27  10-Dec-1995  mycroft If __FORK_BRAINDAMAGE, continue stuffing retval[1] for the benefit of main().
 1.26  09-Dec-1995  mycroft Only expect vm_fork() to return if __FORK_BRAINDAMAGE is defined.
Use splstatclock() rather than splhigh() in one place.
Eliminate unused third arg to vm_fork().
 1.25  07-Oct-1995  mycroft Prefix names of system call implementation functions with `sys_'.
 1.24  18-Mar-1995  mycroft Clean up comments related to last change, and remove an unneeded
splclock/splx pair.
 1.23  23-Feb-1995  mycroft Move a couple of assignments from the parent to the child.
 1.22  20-Oct-1994  cgd update for new syscall args description mechanism
 1.21  30-Aug-1994  mycroft Make sure p_emul is copied on fork.
 1.20  30-Aug-1994  mycroft Convert process, file, and namei lists and hash tables to use queue.h.
 1.19  29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.18  15-Jun-1994  mycroft Turn P_NOSWAP and P_PHYSIO into a hold count, as suggested by a comment.
 1.17  19-May-1994  cgd update to lite
 1.16  17-May-1994  cgd copyright foo
 1.15  13-May-1994  cgd setrq -> setrunqueue, sched -> scheduler
 1.14  05-May-1994  cgd lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.
 1.13  04-May-1994  cgd Rename a lot of process flags.
 1.12  10-Apr-1994  cgd light clean
 1.11  21-Feb-1994  chopps fix incorrect check of nprocs vs. maxproc.
 1.10  04-Jan-1994  cgd field name change
 1.9  22-Dec-1993  cgd add support for p_vnode, from jsp
 1.8  21-Dec-1993  cgd p_spare is in the 'zero range' now
 1.7  18-Dec-1993  mycroft Canonicalize all #includes.
 1.6  15-Sep-1993  cgd make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...
 1.5  07-Aug-1993  cgd branches: 1.5.2;
merge in changes from netbsd-0-9-ALPHA2
 1.4  27-Jun-1993  andrew branches: 1.4.2;
ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.
 1.3  02-Jun-1993  cgd nextpid & maxproc fixes from ws
 1.2  20-May-1993  cgd add $Id$ strings, and clean up file headers where necessary
 1.1  21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.3  01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2  01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1  21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.4.2.1  02-Aug-1993  cgd mark process as SIDL *before* putting it on the allproc list.
this should protect against people using fields before
they're initialized.
 1.5.2.2  14-Nov-1993  mycroft Canonicalize all #includes.
 1.5.2.1  24-Sep-1993  mycroft Make all files using spl*() #include cpu.h. Changes from trunk.
init_main.c: New method of pseudo-device initialization.
kern_clock.c: hardclock() and softclock() now take a pointer to a clockframe.
softclock() only does callouts.
kern_synch.c: Remove spurious declaration of endtsleep(). Adjust uses of
averunnable for new struct loadav.
subr_prf.c: Allow printf() formats in panic().
tty.c: averunnable changes.
vfs_subr.c: va_size and va_bytes are now quads.
 1.29.4.1  19-Feb-1997  rat Pullup 1.30 -> 1.31 by request from Matt Green. Fixes SUID security
bug.
 1.30.6.1  12-Mar-1997  is Merge in changes from Trunk
 1.32.8.1  04-Feb-1999  cgd pull up revs 1.51-1.52 from trunk (bad)
 1.32.6.1  08-Sep-1997  thorpej Set up the new process's sigacts before calling vm_fork().
 1.43.2.1  30-Jul-1998  eeh Split vm_offset_t and vm_size_t into paddr_t, psize_t, vaddr_t, and vsize_t.
 1.54.4.2  02-Aug-1999  thorpej Update from trunk.
 1.54.4.1  21-Jun-1999  thorpej Sync w/ -current.
 1.61.2.7  12-Mar-2001  bouyer Sync with HEAD.
 1.61.2.6  18-Jan-2001  bouyer Sync with head (for UBC+NFS fixes, mostly).
 1.61.2.5  05-Jan-2001  bouyer Sync with HEAD
 1.61.2.4  13-Dec-2000  bouyer Sync with HEAD (for UBC fixes).
 1.61.2.3  08-Dec-2000  bouyer Sync with HEAD.
 1.61.2.2  22-Nov-2000  bouyer Sync with HEAD.
 1.61.2.1  20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.64.2.1  22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.66.2.1  04-Jul-2000  jdolecek Pullup from trunk [approved by thorpej]:

change tablefull() to accept one more parameter - optional hint

use that to inform about way to raise current limit when we reach maximum
number of processes, descriptors or vnodes
 1.84.2.23  19-Dec-2002  thorpej Sync with HEAD.
 1.84.2.22  15-Dec-2002  thorpej Add an "inmem" argument to newlwp(), and use it to properly set
up L_INMEM in the new LWP. Should fix a problem mentioned to me
by Nick Hudson.
 1.84.2.21  11-Dec-2002  thorpej Sync with HEAD.
 1.84.2.20  11-Dec-2002  thorpej Sync with HEAD.
 1.84.2.19  12-Nov-2002  skrll Reapply a fix lost in the catch up to -current.
 1.84.2.18  11-Nov-2002  nathanw Catch up to -current
 1.84.2.17  18-Oct-2002  nathanw Catch up to -current.
 1.84.2.16  17-Sep-2002  nathanw Catch up to -current.
 1.84.2.15  13-Aug-2002  nathanw Catch up to -current.
 1.84.2.14  01-Aug-2002  nathanw Catch up to -current.
 1.84.2.13  17-Jul-2002  nathanw Whitespace.
 1.84.2.12  12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.84.2.11  24-Jun-2002  nathanw Curproc->curlwp renaming.

Change uses of "curproc->l_proc" back to "curproc", which is more like the
original use. Bare uses of "curproc" are now "curlwp".

"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc : NULL)
so that it is always safe to reference curproc (*de*referencing curproc
is another story, but that's always been true).
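
The point about referencing versus dereferencing curproc can be shown with
a stand-alone sketch. The struct layouts, the global curlwp variable and
current_pid_or() below are simplified stand-ins assumed for this
illustration, not the kernel's definitions; the macro is written with
balanced parentheses.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real kernel structures are far larger. */
struct proc { int p_pid; };
struct lwp { struct proc *l_proc; };

/* Stand-in for the kernel's notion of the current LWP. */
static struct lwp *curlwp;

/* Safe to evaluate even when there is no current LWP. */
#define curproc ((curlwp) ? (curlwp)->l_proc : NULL)

/* Dereference curproc only after checking it for NULL. */
static int
current_pid_or(int fallback)
{
	struct proc *p = curproc;

	return p != NULL ? p->p_pid : fallback;
}
```

Evaluating curproc with curlwp unset yields NULL instead of faulting, so
callers still have to check the result before using it.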
 1.84.2.10  20-Jun-2002  nathanw Oops, added one line too many when PID selection was rearranged.
 1.84.2.9  20-Jun-2002  nathanw Catch up to -current.
 1.84.2.8  29-May-2002  nathanw #include <sys/sa.h> before <sys/syscallargs.h>, to provide sa_upcall_t
now that <sys/param.h> doesn't include <sys/sa.h>.

(Behold the Power of Ed)
 1.84.2.7  08-Jan-2002  nathanw Catch up to -current.
 1.84.2.6  17-Nov-2001  nathanw Implement POSIX realtime timers, and reimplement getitimer() and setitimer()
in terms of them.
 1.84.2.5  14-Nov-2001  nathanw Catch up to -current.
 1.84.2.4  05-Nov-2001  briggs Call uvm_proc_fork() before newlwp() to fork the process's VM space, if
necessary. This ensures that p2 is more-fully initialized before entering
newlwp(). This also ensures cpu_fork() will have access to p_vmspace for
both processes.
 1.84.2.3  05-Nov-2001  briggs KERNEL_PROC_LOCK/KERNEL_PROC_UNLOCK take struct lwp *, not struct proc *.
 1.84.2.2  24-Aug-2001  nathanw Catch up with -current.
 1.84.2.1  05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.86.6.1  12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.86.2.4  10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.86.2.3  06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.86.2.2  10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.86.2.1  10-Jul-2001  lukem add calls to KNOTE(9) as appropriate
 1.88.8.3  29-Aug-2002  gehenna catch up with -current.
 1.88.8.2  15-Jul-2002  gehenna catch up with -current.
 1.88.8.1  20-Jun-2002  gehenna catch up with -current.
 1.88.4.2  10-Mar-2002  thorpej Use a pool cache for turnstiles, saving the memset (missed this in
previous commit).
 1.88.4.1  10-Mar-2002  thorpej First cut implementation of turnstiles, a specialized sleep queue used for
kernel synchronization objects. A detailed description of turnstiles
can be found in:

Solaris Internals: Core Kernel Architecture, by Jim Mauro
and Richard McDougall, section 3.7.

Note this implementation does not yet implement priority inheritance,
nor does it currently differentiate between reader and writer queues
(though they are provided for in the API).
 1.104.2.1  18-Dec-2002  gmcgarry Merge pcred and ucred, and poolify. TBD: check backward compatibility
and factor-out some higher-level functionality.
 1.109.2.7  10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.109.2.6  04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.109.2.5  21-Sep-2004  skrll Fix the sync with head I botched.
 1.109.2.4  18-Sep-2004  skrll Sync with HEAD.
 1.109.2.3  12-Aug-2004  skrll Sync with HEAD.
 1.109.2.2  03-Aug-2004  skrll Sync with HEAD
 1.109.2.1  02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.114.2.3  15-Aug-2004  tron Pull up revision 1.117 (requested by jdolecek in ticket #762):
pass the fork flags down the emulation proc fork hook
 1.114.2.2  12-Aug-2004  jmc Pullup rev 1.116 (requested by atatat in ticket #744)

When stopfork is set, we need to set p_nrlwps, since we are not going to
be running. PR#26468
 1.114.2.1  11-Aug-2004  jmc Pullup rev 1.118 (requested by jdolecek in ticket #741)

Linux enforces CLONE_VM if CLONE_SIGHAND in clone(2) is specified,
follow suit - this is intended to be a Linux-compatible call.
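The flag rule described in the entry above can be sketched as a small check. This is only an illustration, not the code from revision 1.118: the `EX_` flag values and the function name `ex_check_clone_flags` are hypothetical, and returning EINVAL is an assumption modeled on the behaviour documented for Linux clone(2).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag values for illustration; real values come from kernel headers. */
#define EX_CLONE_VM      0x00000100
#define EX_CLONE_SIGHAND 0x00000800

/*
 * Reject a request to share signal handlers unless the address space
 * is shared as well. Returns 0 when the combination is acceptable and
 * EINVAL (an assumption, see the note above) otherwise.
 */
static int
ex_check_clone_flags(int flags)
{
	if ((flags & EX_CLONE_SIGHAND) != 0 && (flags & EX_CLONE_VM) == 0)
		return EINVAL;
	return 0;
}
```

Sharing handlers without sharing memory would leave handler addresses pointing into an address space that can diverge from the caller's, which is why the two flags are tied together.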
 1.119.6.1  19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.119.4.1  29-Apr-2005  kent sync with -current
 1.121.2.2  23-May-2005  riz Revert pullup ticket #316, because it breaks kernel compiles. A more
complete pullup request is forthcoming.
(requested by cube in ticket #341)
 1.121.2.1  22-May-2005  snj Pull up revision 1.122 (requested by cube in ticket #316):
Add P_CLDSIGIGN, P_NOCLDSTOP and P_NOCLDWAIT to the list of flags we want
to inherit from the parent process.
 1.122.2.11  24-Mar-2008  yamt sync with head.
 1.122.2.10  27-Feb-2008  yamt sync with head.
 1.122.2.9  04-Feb-2008  yamt sync with head.
 1.122.2.8  21-Jan-2008  yamt sync with head
 1.122.2.7  07-Dec-2007  yamt sync with head
 1.122.2.6  15-Nov-2007  yamt sync with head.
 1.122.2.5  27-Oct-2007  yamt sync with head.
 1.122.2.4  03-Sep-2007  yamt sync with head.
 1.122.2.3  26-Feb-2007  yamt sync with head.
 1.122.2.2  30-Dec-2006  yamt sync with head.
 1.122.2.1  21-Jun-2006  yamt sync with head.
 1.123.12.1  24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.123.10.2  06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.123.10.1  08-Mar-2006  elad Adapt to kernel authorization KPI.
 1.123.8.3  11-Aug-2006  yamt sync with head
 1.123.8.2  26-Jun-2006  yamt sync with head.
 1.123.8.1  24-May-2006  yamt sync with head.
 1.123.6.2  01-Jun-2006  kardel Sync with head.
 1.123.6.1  04-Feb-2006  simonb Adapt for timecounters: mostly use get*time() and use "time_second"
instead of "time.tv_sec".
 1.123.4.1  09-Sep-2006  rpaulo sync with head
 1.124.2.1  19-Jun-2006  chap Sync with head.
 1.126.6.2  10-Dec-2006  yamt sync with head.
 1.126.6.1  22-Oct-2006  yamt sync with head
 1.126.4.10  05-Feb-2007  ad IPL_STATCLOCK needs to be >= IPL_CLOCK, so assume that proc::p_stmutex is
always a spinlock.
 1.126.4.9  05-Feb-2007  ad - When clearing signals dequeue siginfo first and free later, once
outside the lock perimeter.
- Push kernel_lock back in a couple of places.
- Adjust limcopy() to be MP safe (this needs redoing).
- Fix a couple of bugs noticed along the way.
- Catch up with condvar changes.
 1.126.4.8  01-Feb-2007  ad Sync with head.
 1.126.4.7  30-Jan-2007  ad Remove support for SA. Ok core@.
 1.126.4.6  27-Jan-2007  ad Drop proclist_mutex and proc::p_smutex back to IPL_VM.
 1.126.4.5  29-Dec-2006  ad Checkpoint work in progress.
 1.126.4.4  17-Nov-2006  ad Checkpoint work in progress.
 1.126.4.3  24-Oct-2006  ad - Redo LWP locking slightly and fix some races.
- Fix some locking botches.
- Make signal mask / stack per-proc for SA processes.
- Add _lwp_kill().
 1.126.4.2  21-Oct-2006  ad Checkpoint work in progress on locking and per-LWP signals. Very much a
work in progress and there is still a lot to do.
 1.126.4.1  11-Sep-2006  ad - Convert some lockmgr() locks to mutexes and RW locks.
- Acquire proclist_lock and p_crmutex in some obvious places.
 1.131.2.8  07-May-2007  yamt sync with head.
 1.131.2.7  24-Mar-2007  ad - Ensure that context switch always happens at least at IPL_SCHED, even
if no spin lock is held. Should fix the assertion failure seen on hppa.
- Reduce the amount of spl frobbing in mi_switch.
- Add some comments.

Reviewed by yamt@.
 1.131.2.6  24-Mar-2007  yamt sync with head.
 1.131.2.5  17-Mar-2007  rmind Do not do an implicit enqueue in sched_switch(), move enqueueing back to
the dispatcher. Rename sched_switch() back to sched_nextlwp(). Add a new
argument to sched_enqueue() that indicates the call comes from mi_switch().

Requested by yamt@
 1.131.2.4  12-Mar-2007  rmind Sync with HEAD.
 1.131.2.3  27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.131.2.2  20-Feb-2007  rmind General Common Scheduler Framework (CSF) patch import. Huge thanks to
Daniel Sieger <dsieger at TechFak.Uni-Bielefeld de> for this work.

Short abstract: Split the dispatcher from the scheduler in order to
make the scheduler more modular. Introduce initial API for other
schedulers' implementations.

Discussed in tech-kern@
OK: yamt@, ad@

Note: further work will follow soon.
 1.131.2.1  17-Feb-2007  yamt - separate context switching and thread scheduling.
- introduce idle lwp.
- change some related MD/MI interfaces and implement i386 version.
 1.136.4.1  11-Jul-2007  mjf Sync with head.
 1.136.2.10  01-Nov-2007  ad - Fix interactivity problems under high load. Because soft interrupts
are being stacked on top of regular LWPs, more often than not aston()
was being called on a soft interrupt thread instead of a user thread,
meaning that preemption was not happening on EOI.

- Don't use bool in a couple of data structures. Sub-word writes are not
always atomic and may clobber other fields in the containing word.

- For SCHED_4BSD, make p_estcpu per thread (l_estcpu). Rework how the
dynamic priority level is calculated - it's much better behaved now.

- Kill the l_usrpri/l_priority split now that priorities are no longer
directly assigned by tsleep(). There are three fields describing LWP
priority:

l_priority: Dynamic priority calculated by the scheduler.
This does not change for kernel/realtime threads,
and always stays within the correct band. Eg for
timeshared LWPs it never moves out of the user
priority range. This is basically what l_usrpri
was before.

l_inheritedprio: Lent to the LWP due to priority inheritance
(turnstiles).

l_kpriority: A boolean value set true the first time an LWP
sleeps within the kernel. This indicates that the LWP
should get a priority boost as compensation for blocking.
lwp_eprio() now does the equivalent of sched_kpri() if
the flag is set. The flag is cleared in userret().

- Keep track of scheduling class (OTHER, FIFO, RR) in struct lwp, and use
this to make decisions in a few places where we previously tested for a
kernel thread.

- Partially fix itimers and usr/sys/intr time accounting in the presence
of software interrupts.

- Use kthread_create() to create idle LWPs. Move priority definitions
from the various modules into sys/param.h.

- newlwp -> lwp_create
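The remark in the entry above about sub-word writes can be illustrated with a deterministic simulation. This is a sketch under stated assumptions: it models a machine that can only store whole 32-bit words, so that updating one byte-sized flag becomes a load, a modify and a store of the containing word. The names `ex_set_byte` and `ex_racy_update` are hypothetical and do not appear in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Emulate storing one byte on hardware that writes whole words:
 * take the old word, replace byte 'idx', and return the new word.
 */
static uint32_t
ex_set_byte(uint32_t word, unsigned idx, uint8_t val)
{
	unsigned shift = idx * 8;

	return (word & ~((uint32_t)0xff << shift)) | ((uint32_t)val << shift);
}

/*
 * Interleave two byte updates the way two CPUs might: both load the
 * same old word, then each stores back a full word. The second store
 * is built from a stale copy and wipes out the first update.
 */
static uint32_t
ex_racy_update(uint32_t word)
{
	uint32_t seen_a = word;			/* CPU A loads the word */
	uint32_t seen_b = word;			/* CPU B loads the same old word */

	word = ex_set_byte(seen_a, 0, 1);	/* A sets the flag in byte 0 */
	word = ex_set_byte(seen_b, 1, 1);	/* B sets byte 1 from stale data */
	return word;
}
```

Starting from a zero word, `ex_racy_update(0)` yields 0x00000100: the flag in byte 0 has been lost even though neither writer touched the other's field directly. Giving each independently updated field its own word avoids this.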
 1.136.2.9  25-Oct-2007  ad - Simplify debugger/procfs reference counting of processes. Use a per-proc
rwlock: rw_tryenter(RW_READER) to gain a reference, and rw_enter(RW_WRITER)
by the process itself to drain out reference holders before major changes
like exiting.
- Fix numerous bugs and locking issues in procfs.
- Mark procfs MPSAFE.
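The reference-counting pattern described in the entry above can be sketched in userland with POSIX read/write locks standing in for the kernel's rw_tryenter() and rw_enter(). This is an analogy only, not the kernel code: `struct ex_proc` and the `ex_proc_*` helpers are hypothetical names invented for the illustration.

```c
#define _POSIX_C_SOURCE 200809L	/* expose pthread_rwlock_t under strict ISO C modes */

#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a process with a per-process reference lock. */
struct ex_proc {
	pthread_rwlock_t reflock;
};

static void
ex_proc_init(struct ex_proc *p)
{
	int error = pthread_rwlock_init(&p->reflock, NULL);

	assert(error == 0);
	(void)error;
}

/* Debugger side: try to gain a reference without blocking. */
static bool
ex_proc_hold(struct ex_proc *p)
{
	return pthread_rwlock_tryrdlock(&p->reflock) == 0;
}

/* Debugger side: drop a reference gained with ex_proc_hold(). */
static void
ex_proc_rele(struct ex_proc *p)
{
	(void)pthread_rwlock_unlock(&p->reflock);
}

/* Process side: wait for all holders to leave, then shut new ones out. */
static void
ex_proc_drain(struct ex_proc *p)
{
	(void)pthread_rwlock_wrlock(&p->reflock);
}

/* Process side: allow references again once the major change is done. */
static void
ex_proc_drain_done(struct ex_proc *p)
{
	(void)pthread_rwlock_unlock(&p->reflock);
}
```

The point of the design is that readers never block: an observer that cannot get the read lock simply treats the process as unavailable, while the process itself is the only party that ever waits.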
 1.136.2.8  09-Oct-2007  ad Sync with head.
 1.136.2.7  20-Aug-2007  ad Sync with HEAD.
 1.136.2.6  15-Jul-2007  ad Sync with head.
 1.136.2.5  08-Jun-2007  ad Sync with head.
 1.136.2.4  10-Apr-2007  ad proc_trampoline_mp: don't grab kernel_lock for LWPs marked as MP safe.
 1.136.2.3  05-Apr-2007  ad - Make context switch counters 64-bit, and count the total number of
context switches + voluntary, instead of involuntary + voluntary.
- Add lwp::l_swaplock for uvm.
- PHOLD/PRELE are replaced.
 1.136.2.2  21-Mar-2007  ad - Put a lock around the proc's CWD info (work in progress).
- Replace some more simplelocks.
- Make lbolt a condvar.
 1.136.2.1  13-Mar-2007  ad Sync with head.
 1.141.6.7  09-Dec-2007  jmcneill Sync with HEAD.
 1.141.6.6  27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.141.6.5  11-Nov-2007  joerg Sync with HEAD.
 1.141.6.4  06-Nov-2007  joerg Sync with HEAD.
 1.141.6.3  28-Oct-2007  joerg Sync with HEAD.
 1.141.6.2  02-Oct-2007  joerg Sync with HEAD.
 1.141.6.1  16-Aug-2007  jmcneill Sync with HEAD.
 1.141.2.1  03-Sep-2007  skrll Sync with HEAD.
 1.142.2.4  23-Mar-2008  matt sync with HEAD
 1.142.2.3  09-Jan-2008  matt sync with HEAD
 1.142.2.2  08-Nov-2007  matt sync with -HEAD
 1.142.2.1  06-Nov-2007  matt sync with HEAD
 1.143.2.1  06-Oct-2007  yamt sync with head.
 1.144.2.1  13-Nov-2007  bouyer Sync with HEAD
 1.145.2.4  18-Feb-2008  mjf Sync with HEAD.
 1.145.2.3  27-Dec-2007  mjf Sync with HEAD.
 1.145.2.2  08-Dec-2007  mjf Sync with HEAD.
 1.145.2.1  19-Nov-2007  mjf Sync with HEAD.
 1.149.2.3  26-Dec-2007  ad - Push kernel_lock back into exit, wait and sysctl system calls, mainly
for visibility.
- Serialize calls to brk() from within the same process.
- Mark more syscalls MPSAFE.
 1.149.2.2  26-Dec-2007  ad Sync with head.
 1.149.2.1  08-Dec-2007  ad Sync with head.
 1.152.4.2  08-Jan-2008  bouyer Sync with HEAD
 1.152.4.1  02-Jan-2008  bouyer Sync with HEAD
 1.157.6.5  17-Jan-2009  mjf Sync with HEAD.
 1.157.6.4  29-Jun-2008  mjf Sync with HEAD.
 1.157.6.3  05-Jun-2008  mjf Sync with HEAD.

Also fix build.
 1.157.6.2  02-Jun-2008  mjf Sync with HEAD.
 1.157.6.1  03-Apr-2008  mjf Sync with HEAD.
 1.157.2.1  24-Mar-2008  keiichi sync with head.
 1.160.2.3  17-Jun-2008  yamt sync with head.
 1.160.2.2  04-Jun-2008  yamt sync with head
 1.160.2.1  18-May-2008  yamt sync with head.
 1.162.2.4  11-Aug-2010  yamt sync with head.
 1.162.2.3  11-Mar-2010  yamt sync with head
 1.162.2.2  04-May-2009  yamt sync with head.
 1.162.2.1  16-May-2008  yamt sync with head.
 1.163.2.3  23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.163.2.2  14-May-2008  wrstuden Per discussion with ad, remove most of the #include <sys/sa.h> lines
as they were including sa.h just for the type(s) needed for syscallargs.h.

Instead, create a new file, sys/satypes.h, which contains just the
types needed for syscallargs.h. Yes, there's only one now, but that
may change and it's probably more likely to change if it'd be difficult
to handle. :-)

Per discussion with matt at n dot o, add an include of satypes.h to
sigtypes.h. Upcall handlers are kinda signal handlers, and signalling
is the header file that's already included for syscallargs.h that
most closely matches SA.

This shaves about 3000 lines off of the diff of the branch relative
to the base. That also represents about 18% of the total before this
checkin.

I think this reduction is a very good thing.
 1.163.2.1  10-May-2008  wrstuden Initial checkin of re-adding SA. Everything except kern_sa.c
compiles in GENERIC for i386. This is still a work-in-progress, but
this checkin covers most of the mechanical work (changing signalling
to be able to accommodate SA's process-wide signalling and re-adding
includes of sys/sa.h and savar.h). Subsequent changes will be much
more interesting.

Also, kern_sa.c has received partial cleanup. There's still more
to do, though.
 1.169.2.1  18-Jun-2008  simonb Sync with head.
 1.170.2.1  19-Oct-2008  haad Sync with HEAD.
 1.171.12.2  05-Feb-2012  bouyer Pull up following revision(s) (requested by rmind in ticket #1628):
sys/kern/kern_fork.c: revision 1.184 via patch
fork1: fix stop-on-fork case, lend a correct lock to LWP for LSSTOP state.
Fixes PR/44935.
 1.171.12.1  18-Jun-2011  bouyer Pull up following revision(s) (requested by rmind in ticket #1629):
sys/kern/kern_fork.c: revision 1.181
Inherit proc_t::p_mqueue_cnt on fork().
 1.171.10.1  29-Apr-2011  matt Pull in lwp_setprivate/cpu_lwp_setprivate from -current.
Also pull in lwp_getpcb
 1.171.8.2  05-Feb-2012  bouyer Pull up following revision(s) (requested by rmind in ticket #1628):
sys/kern/kern_fork.c: revision 1.184 via patch
fork1: fix stop-on-fork case, lend a correct lock to LWP for LSSTOP state.
Fixes PR/44935.
 1.171.8.1  18-Jun-2011  bouyer Pull up following revision(s) (requested by rmind in ticket #1629):
sys/kern/kern_fork.c: revision 1.181
Inherit proc_t::p_mqueue_cnt on fork().
 1.171.4.2  05-Feb-2012  bouyer Pull up following revision(s) (requested by rmind in ticket #1628):
sys/kern/kern_fork.c: revision 1.184 via patch
fork1: fix stop-on-fork case, lend a correct lock to LWP for LSSTOP state.
Fixes PR/44935.
 1.171.4.1  18-Jun-2011  bouyer Pull up following revision(s) (requested by rmind in ticket #1629):
sys/kern/kern_fork.c: revision 1.181
Inherit proc_t::p_mqueue_cnt on fork().
 1.171.2.2  28-Apr-2009  skrll Sync with HEAD.
 1.171.2.1  19-Jan-2009  skrll Sync with HEAD.
 1.172.2.1  13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.175.2.2  17-Aug-2010  uebayasi Sync with HEAD.
 1.175.2.1  30-Apr-2010  uebayasi Sync with HEAD.
 1.176.2.4  31-May-2011  rmind sync with head
 1.176.2.3  21-Apr-2011  rmind sync with head
 1.176.2.2  05-Mar-2011  rmind sync with head
 1.176.2.1  03-Jul-2010  rmind sync with head
 1.178.2.1  06-Jun-2011  jruoho Sync with HEAD.
 1.186.6.5  05-Apr-2012  mrg sync to latest -current.
 1.186.6.4  06-Mar-2012  mrg sync to -current
 1.186.6.3  06-Mar-2012  mrg sync to -current
 1.186.6.2  04-Mar-2012  mrg sync to latest -current.
 1.186.6.1  18-Feb-2012  mrg merge to -current.
 1.186.2.3  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.186.2.2  30-Oct-2012  yamt sync with head
 1.186.2.1  17-Apr-2012  yamt sync with head
 1.191.2.3  03-Dec-2017  jdolecek update from HEAD
 1.191.2.2  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.191.2.1  23-Jun-2013  tls resync from head
 1.192.2.1  18-May-2014  rmind sync with head
 1.193.6.5  28-Aug-2017  skrll Sync with HEAD
 1.193.6.4  05-Feb-2017  skrll Sync with HEAD
 1.193.6.3  05-Dec-2016  skrll Sync with HEAD
 1.193.6.2  19-Mar-2016  skrll Sync with HEAD
 1.193.6.1  27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.195.2.3  26-Apr-2017  pgoyette Sync with HEAD
 1.195.2.2  20-Mar-2017  pgoyette Sync with HEAD
 1.195.2.1  07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.199.2.1  21-Apr-2017  bouyer Sync with HEAD
 1.203.2.2  02-May-2018  pgoyette Synch with HEAD
 1.203.2.1  22-Apr-2018  pgoyette Sync with HEAD
 1.205.2.3  21-Apr-2020  martin Sync with HEAD
 1.205.2.2  13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.205.2.1  10-Jun-2019  christos Sync with HEAD
 1.213.2.1  15-Oct-2019  martin Pull up following revision(s) (requested by kamil in ticket #311):

sys/sys/siginfo.h: revision 1.34
sys/kern/sys_ptrace_common.c: revision 1.59
sys/kern/sys_lwp.c: revision 1.70
sys/compat/sys/siginfo.h: revision 1.8
sys/kern/kern_sig.c: revision 1.365
sys/kern/kern_lwp.c: revision 1.203
sys/sys/signalvar.h: revision 1.96
sys/kern/kern_exec.c: revision 1.482
sys/kern/kern_fork.c: revision 1.214

Move TRAP_CHLD/TRAP_LWP ptrace information from struct proc to siginfo

Storing struct ptrace_state information inside struct proc was vulnerable
to synchronization bugs, as multiple events emitted at the same time were
overwriting other ones.

Cache the original parent process id in p_oppid. Reusing p_opptr here is
in theory prone to a slight race condition.

Change the semantics of PT_GET_PROCESS_STATE, returning EINVAL for calls
prompting for the value in cases where no appropriate event was
registered.

Add an alternative approach to check the ptrace_state information, directly
from the siginfo_t value returned from PT_GET_SIGINFO. The original
PT_GET_PROCESS_STATE approach is kept for compat with older NetBSD and
OpenBSD. New code is recommended to keep using PT_GET_PROCESS_STATE.
Add a couple of compile-time asserts for assumptions in the code.

No functional change intended in existing ptrace(2) software.

All ATF ptrace(2) and ATF GDB tests pass.

This change improves reliability of the threading ptrace(2) code.
 1.217.2.1  29-Feb-2020  ad Sync with head.
 1.221.2.2  25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.221.2.1  20-Apr-2020  bouyer Sync with HEAD
