History log of /src/sys/kern/init_sysctl.c
Revision  Date  Author  Comments
 1.228  09-Sep-2023  christos Move the initialization of the random hash for addresses earlier so that
it does not happen under a spin lock context (when it is first used).
 1.227  20-Sep-2020  skrll KNF (sort #includes and remove duplicate sys/cpu.h)
 1.226  23-May-2020  ad Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
 1.225  22-Mar-2020  ad Merge vfs_cache.c from the ad-namecache branch. With this the namecache
index becomes per-directory (initially, a red-black tree). The remaining
changes on the branch to namei()/getcwd() will be merged in the future.
 1.224  18-Jan-2020  skrll Use 4K pages on ARM_MMU_EXTENDED platforms (all armv[67] except RPI) by
creating a new pool l1ttpl for the userland L1 translation table which
needs to be 8KB and 8KB aligned.

Limit the pool to maxproc and add hooks to allow the sysctl changing of
maxproc to adjust the pool.

This comes at a 5% performance penalty for build.sh -j8 kernel on a
Tegra TK1.
 1.223  02-Jan-2020  thorpej branches: 1.223.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.
 1.222  15-Jan-2019  mrg remove kern.panic_now -- crashme panic node replaces it.
 1.221  05-Dec-2018  christos As discussed in tech-kern:

- make sysctl kern.expose_address tri-state:
0: no access
1: access to processes with open /dev/kmem
2: access to everyone
defaults:
0: KASLR kernels
1: non-KASLR kernels

- improve efficiency by calling get_expose_address() per sysctl, not per
process.

- don't expose addresses for linux procfs

- welcome to 8.99.27, changes to fill_*proc ABI
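The tri-state setting described in this entry can be sketched as a small decision function. This is only an illustration of the documented policy, not the kernel's code: the enum names, default_expose_address(), may_expose_address() and the has_kmem_open flag are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three documented settings. */
enum expose_policy {
	EXPOSE_NONE = 0,	/* no access */
	EXPOSE_KMEM = 1,	/* only processes with /dev/kmem open */
	EXPOSE_ALL = 2		/* access for everyone */
};

/* Default from the log entry: 0 on KASLR kernels, 1 on non-KASLR kernels. */
int
default_expose_address(bool kaslr)
{
	return kaslr ? EXPOSE_NONE : EXPOSE_KMEM;
}

/*
 * Decide once per request whether addresses may be shown.
 * has_kmem_open stands in for however the kernel learns that the
 * caller holds /dev/kmem open (an assumption of this sketch).
 */
bool
may_expose_address(int setting, bool has_kmem_open)
{
	assert(setting >= EXPOSE_NONE && setting <= EXPOSE_ALL);

	switch (setting) {
	case EXPOSE_ALL:
		return true;
	case EXPOSE_KMEM:
		return has_kmem_open;
	default:
		return false;
	}
}
```

Evaluating the policy once and passing the boolean down mirrors the "per sysctl, not per process" point made in the entry.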
 1.220  03-Dec-2018  christos Expose addresses depending on the KASLR setting (from mrg@). Restores the
status quo of exposing kernel addresses if there is no KASLR.
 1.219  24-Nov-2018  maxv Fix kernel pointer leaks in the kern.lwp sysctl.
 1.218  05-Oct-2018  christos Provide a sysctl kern.expose_address to expose kernel addresses in
sysctl structure returns for non-root. Defaults to off. Turning it
on will restore sockstat/fstat and friends for regular users.
 1.217  16-Sep-2018  mrg CTL_DEBUG_MAXID is only used to size a static array that the compiler
can do just fine itself. use the compiler and remove the define.
 1.216  03-Sep-2018  riastradh Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
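The truncation hazard behind this rename can be reproduced in a few lines. The sketch below assumes a 32-bit unsigned int; uimin() has the shape described in the entry, while clamp_len_truncating() and clamp_len_exact() are hypothetical helpers added only for the demonstration.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(unsigned int) == 4, "demo assumes 32-bit unsigned int");

/* Function form: both operands are converted to unsigned int. */
unsigned int
uimin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Macro form: no conversion, but an argument may be evaluated twice. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper: the 64-bit arguments are silently cut to 32 bits. */
uint64_t
clamp_len_truncating(uint64_t len, uint64_t limit)
{
	return uimin(len, limit);
}

/* Hypothetical helper: the macro compares the full 64-bit values. */
uint64_t
clamp_len_exact(uint64_t len, uint64_t limit)
{
	return MIN(len, limit);
}
```

With len = 0x100000001 and limit = 16, the function form sees len as 1 and returns 1, while the macro form returns 16.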
 1.215  22-Aug-2018  msaitoh - Cleanup for dynamic sysctl:
- Remove unused *_NAMES macros for sysctl.
- Remove unused *_MAXID for sysctls.
- Move CTL_MACHDEP sysctl definitions for m68k into m68k/include/cpu.h and
use them on all m68k machines.
 1.214  04-Feb-2018  maxv branches: 1.214.2; 1.214.4;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.
 1.213  01-Jun-2017  chs remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.
 1.212  14-Dec-2016  hannken Remove the "target" argument from vfs_drainvnodes() as it is
always equal to "desiredvnodes" and move its definition
from sys/vnode.h to sys/vnode_impl.h.

Extend vfs_drainvnodes() to also wait for deferred vrele to flush
and replace the call to vrele_flush() with a call to vfs_drainvnodes().
 1.211  31-May-2016  pgoyette branches: 1.211.2;
Add a new kern.messages sysctl to allow kernel message verbosity to be
altered after boot.

Fixes PR kern/46539 using patch submitted by Nat Sloss.
 1.210  09-Nov-2015  pgoyette Whether or not the semaphore code is loaded as a module or built-in, its
sysctl data belongs with the module code. Move it from kern/init_sysctl.c
to kern/uipc_sem.c

While here, add a new sysctl variable kern.posix.semcnt (current count of
semaphores) to complement the existing kern.posix.semmax (maximum number
of semaphores).
 1.209  25-Aug-2015  pooka Move a bunch of sysctl nodes from init_sysctl (kitchen sink sysctl file)
to init_sysctl_base (only base kernel defs). Main motivation was to
fix sysconf(_SC_NPROCESSORS) for Rumprun. As reported by neeraj on irc,
it returned -1 before this fix, so we were doing imaginary computing.
 1.208  07-Jul-2015  justin Move hw.machine and hw.machine_arch sysctls to base so rump can use them

This allows uname(3) and uname(1) to work on rump kernels.
 1.207  20-May-2015  pooka group msgbuf sysctls with the msgbuf code
(init_sysctl.c -> subr_log.c)
 1.206  13-May-2015  pgoyette More preparation for modularizing the SYSVxxx options. Here we
change the kern.ipc.sysvxxx sysctls into dynamic values, so each
sub-component of SYSVxxx can declare its own availability.
 1.205  22-Apr-2015  pooka move clock sysctls from init_sysctl.c to kern_clock.c
 1.204  03-Aug-2014  apb branches: 1.204.4;
BUILDINFO part 2: expose sysctl kern.buildinfo
 1.203  08-May-2014  hannken Add a global vnode cache:

- vcache_get() retrieves a referenced and initialised vnode / fs node pair.
- vcache_remove() removes a vnode / fs node pair from the cache.

On cache miss vcache_get() calls new vfs operation vfs_loadvnode() to
initialise a vnode / fs node pair. This call is guaranteed exclusive,
no other thread will try to load this vnode / fs node pair.

Convert ufs/ext2fs, ufs/ffs and ufs/mfs to use this interface.

Remove now unused ufs/ufs_ihash

Discussed on tech-kern.

Welcome to 6.99.41
 1.202  24-Mar-2014  christos branches: 1.202.2;
- create cpu_{g,s}etmodel() and hide cpu_model from direct access.
 1.201  25-Feb-2014  pooka Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
 1.200  25-Feb-2014  justin Add kern.{ostype,osrelease,osrevision,version} kern.domainname,
kern.rawpartition sysctl support to rump kernel.
Moved the sysctl support that is shared between rump and normal
kernels to init_sysctl_base.c as rump cannot use init_sysctl.c
in order to avoid code duplication. Agreed with pooka@.
 1.199  17-Jan-2014  pooka Put cprng sysctls into subr_cprng.c. Also, make sysctl_prng static
in subr_cprng and get rid of SYSCTL_PRIVATE namespace leak macro.

Fixes ping(8) when run against a standalone rump kernel due to appearance
of the kern.urandom sysctl node (in case someone was wondering ...)
 1.198  14-Sep-2013  joerg GC various arrays defined and used in kern_proc.c
 1.197  18-Mar-2013  para branches: 1.197.6;
calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html
 1.196  07-Mar-2013  matt Add a kern.configname sysctl object.
 1.195  21-Feb-2013  pgoyette Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.
 1.194  02-Feb-2013  matt Make the inclusion of <sys/cprng.h> a private matter for sysctl. No reason
to expose the rest of the kernel to it.
 1.193  27-Oct-2012  chs split device_t/softc for all remaining drivers.
replace "struct device *" with "device_t".
use device_xname(), device_unit(), etc.
 1.192  08-Oct-2012  pooka put all kern socket sysctls in the same place
 1.191  03-Oct-2012  mlelstv Add sanity check to sysctl_kern_maxvnodes.
 1.190  02-Jun-2012  dsl branches: 1.190.2;
Add some pre-processor magic to verify that the type of the data item
passed to sysctl_createv() actually matches the declared type for
the item itself.
In the places where the caller specifies a function and a structure
address (typically the 'softc') an explicit (void *) cast is now needed.
Fixes bugs in sys/dev/acpi/asus_acpi.c sys/dev/bluetooth/bcsp.c
sys/kern/vfs_bio.c sys/miscfs/syncfs/sync_subr.c and setting
AcpiGbl_EnableAmlDebugObject.
(mostly passing the address of a uint64_t when typed as CTLTYPE_INT).
I've test built quite a few kernels, but there may be some unfixed MD
fallout. Most likely passing &char[] to char *.
Also add CTLFLAG_UNSIGNED for unsigned decimals - not set yet.
 1.189  07-Apr-2012  christos remove bogus check.
 1.188  10-Mar-2012  joerg P1003_1B_SEMAPHORE is no longer optional.
 1.187  19-Feb-2012  rmind Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.
 1.186  17-Dec-2011  tls branches: 1.186.2;

Separate /dev/random pseudodevice implementation from kernel entropy pool
implementation. Rewrite pseudodevice code to use cprng_strong(9).

The new pseudodevice is cloning, so each caller gets bits from a stream
generated with its own key. Users of /dev/urandom get their generators
keyed on a "best effort" basis -- the kernel will rekey generators
whenever the entropy pool hits the high water mark -- while users of
/dev/random get their generators rekeyed every time key-length bits
are output.

The underlying cprng_strong API can use AES-256 or AES-128, but we use
AES-128 because of concerns about related-key attacks on AES-256. This
improves performance (and reduces entropy pool depletion) significantly
for users of /dev/urandom but does cause users of /dev/random to rekey
twice as often.

Also fixes various bugs (including some missing locking and a reseed-counter
overflow in the CTR_DRBG code) found while testing this.

For long reads, this generator is approximately 20 times as fast as the
old generator (dd with bs=64K yields 53MB/sec on 2Ghz Core2 instead of
2.5MB/sec) and also uses a separate mutex per instance so concurrency
is greatly improved. For reads of typical key sizes for modern
cryptosystems (16-32 bytes) performance is about the same as the old
code: a little better for 32 bytes, a little worse for 16 bytes.
 1.185  20-Nov-2011  tls branches: 1.185.2;
An undocumented behavior of the sysctl kern.arandom node used to allow
sucking up to 8192 bytes out of the kernel arc4random() generator at a
time. Supposedly some very old application code uses this to rekey
other instances of RC4 in userspace (a truly great idea). Reduce the
limit to 256 bytes -- and note that it will probably be reduced to
sizeof(int) in the future, since this node is so documented.
 1.184  19-Nov-2011  tls First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.
 1.183  30-Aug-2011  bouyer branches: 1.183.2;
Add getlabelusesmbr(), as proposed in
http://mail-index.netbsd.org/tech-userlevel/2011/08/25/msg005404.html
This is used by disk tools such as disklabel(8) to dynamically decide if
the underlying platform uses a disklabel-in-mbr-partition or not
(instead of using a compile-time list of ports).
getlabelusesmbr() reads the sysctl kern.labelusesmbr, takes its value from the
machdep #define LABELUSESMBR.
For evbmips, make LABELUSESMBR 1 if the platform uses pmon
as bootloader, and 0 (the previous value) otherwise.
 1.182  23-Jul-2011  jym When KERN_SA is not defined, kern.no_sa_support is a constant (1). So
add CTLFLAG_IMMEDIATE to flags. Make the macro block logically reversed so
it looks more natural when reading.

Reported by Peter Tworek on tech-kern@.
 1.181  24-May-2011  joerg Add some needed __UNCONST
 1.180  02-Apr-2011  rmind vfs_drainvnodes: drop lwp argument, remove variable name in prototype.
 1.179  05-Feb-2011  christos avoid code duplication.
 1.178  28-Jan-2011  pooka migrate compat32 handling with previous

pointed out by Lars Heidieker
 1.177  28-Jan-2011  pooka Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.
 1.176  22-Jan-2011  christos Use the L_ flags instead of the P_ flags for lwps.
 1.175  01-Jul-2010  rmind branches: 1.175.2; 1.175.4;
Remove pfind() and pgfind(), fix locking in various broken uses of these.
Rename real routines to proc_find() and pgrp_find(), remove PFIND_* flags
and have consistent behaviour. Provide proc_find_raw() for special cases.
Fix memory leak in sysctl_proc_corename().

COMPAT_LINUX: rework ptrace() locking, minimise differences between
different versions per-arch.

Note: while this change adds some formal cosmetics for COMPAT_DARWIN and
COMPAT_IRIX - locking there is utterly broken (for ages).

Fixes PR/43176.
 1.174  16-Jun-2010  pooka Set kinfo_lwp to 0 before filling it so that if someone removes
variable assignments from here, kernel memory does not leak to
userspace.

Bug found, a little bit surprisingly, by the atf ps test which failed
due to the column width between the -o holdcnt column being too
wide due to the contents displayed being garbage.
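The defensive zeroing described in this entry follows a common pattern, sketched below. struct demo_kinfo is a hypothetical, cut-down stand-in for the real kinfo_lwp, and demo_fill() is not the kernel's fill routine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for the real kinfo_lwp. */
struct demo_kinfo {
	int32_t  lid;
	uint32_t holdcnt;	/* imagine the fill routine forgets this one */
	char     name[16];
};

void
demo_fill(struct demo_kinfo *ki, int32_t lid, const char *name)
{
	assert(ki != NULL && name != NULL);

	/* Clear everything first, padding and unassigned members included. */
	memset(ki, 0, sizeof(*ki));

	ki->lid = lid;
	/* Copy at most 15 bytes; the final byte stays 0 from the memset. */
	strncpy(ki->name, name, sizeof(ki->name) - 1);
	/* holdcnt is deliberately left alone: it reads as 0, not stale data. */
}
```

If the buffer held old contents before the call, the unassigned holdcnt member still comes back as 0 rather than as leftover memory.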
 1.173  13-Feb-2010  yamt branches: 1.173.2;
sysctl_doeproc: don't follow a possibly stale pointer.
 1.172  13-Jan-2010  pooka branches: 1.172.2;
Minimize unnecessary differences in rump.
 1.171  24-Dec-2009  elad When reporting open files using sysctl, don't use 'filehead' to fetch files,
as we don't have a process context to authorize on. Instead, traverse the
file descriptor table of each process -- as we already do in one case.

Introduce a "marker" we can use to mark files we've seen in an iteration, as
the same file can be referenced more than once.

Hopefully this availability of filtering by process also makes life easier
for those who are interested in implementing process "containers" etc.
 1.170  12-Dec-2009  dsl Report L_INMEM in the lwp info as well.
 1.169  12-Dec-2009  dsl Always set L_INMEM to maintain binary compatibility.
 1.168  21-Oct-2009  rmind Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.
 1.167  16-Sep-2009  pooka Chop init_sysctl into base nodes (init_sysctl_base.c) and the
kitchen sink (init_sysctl.c). Further surgery may be needed down
the line.
 1.166  11-Sep-2009  apb Expose the kernel's boothowto(9) variable through the sysctl
kern.boothowto variable.

Part of the /etc/rc silent changes requested in PR 41946
and proposed in tech-userlevel.
 1.165  16-Aug-2009  christos provide compatibility for the older variant of kern.consdev, which used
a 32 bit dev_t. Reported by mrg.
 1.164  24-May-2009  ad More changes to improve kern_descrip.c.

- Avoid atomics in more places.
- Remove the per-descriptor mutex, and just use filedesc_t::fd_lock.
It was only being used to synchronize close, and in any case we needed
to take fd_lock to free the descriptor slot.
- Optimize certain paths for the <NDFDFILE case.
- Sprinkle more comments and assertions.
- Cache more stuff in filedesc_t.
- Fix numerous minor bugs spotted along the way.
- Restructure how the open files array is maintained, for clarity and so
that we can eliminate the membar_consumer() call in fd_getfile(). This is
mostly syntactic sugar; the main functional change is that fd_nfiles now
lives alongside the open file array.

Some measurements with libmicro:

- simple file syscalls like close() are between 1 and 10% faster.
- some nice improvements, e.g. poll(1000) which is ~50% faster.
 1.163  16-May-2009  yamt sysctl_doeproc:
- simplify.
- KERN_PROC: fix possible stale proc pointer dereference.
- KERN_PROC: don't do copyout with proc_lock held.
 1.162  12-May-2009  yamt don't forget to skip marker processes.
 1.161  04-May-2009  yamt sysctl_doeproc: fix a bug in rev.1.135.
don't forget to mark our marker process PK_MARKER.
this fixes crashes in sched_pstats, etc.
 1.160  29-Mar-2009  mrg - add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)
 1.159  11-Mar-2009  mrg like KERN_FILE2: *do* update "needed" when there is no count. we want
userland to know what sort of size to provide..

while here, slightly normalise the previous to init_sysctl.c.
 1.158  11-Mar-2009  mrg always calculate "needed" for KERN_FILE2 calls. this allows a caller
to get an estimate of the needed space, like the intention is.
 1.157  08-Mar-2009  ad Don't bother with file_t::f_iflags any more, as it's not used.
Noted by mrg@.
 1.156  13-Feb-2009  apb Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.
 1.155  17-Jan-2009  cegger branches: 1.155.2;
whitespace nit
 1.154  17-Jan-2009  yamt malloc -> kmem_alloc.
 1.153  11-Jan-2009  christos merge christos-time_t
 1.152  29-Dec-2008  pooka Rename specfs_lock as device_lock and move it from specfs to devsw.
Relaxes kernel dependency on vfs.
 1.151  28-Nov-2008  elad PR/40002: Daniel Horecki: sockstat doesn't work for user with sysctl
security.curtain=1

If the kauth call failed, we'd silently continue the loop, but the error
code would remain and eventually "leak" to userspace. Reset the error to
zero when continuing.

Tested by snj@ and myself. Okay snj@.
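The bug class fixed in this entry is easy to reproduce in isolation. In the sketch below, demo_authorize() is a hypothetical stand-in for the kauth call (it simply denies odd ids) and demo_count_visible() is not the real sysctl handler.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the authorization check: odd ids are denied. */
int
demo_authorize(int id)
{
	return (id % 2) ? EPERM : 0;
}

/* Count visible entries; a denied entry is skipped, not reported as failure. */
int
demo_count_visible(const int *ids, size_t n, size_t *countp)
{
	int error = 0;
	size_t count = 0;

	assert(countp != NULL);

	for (size_t i = 0; i < n; i++) {
		error = demo_authorize(ids[i]);
		if (error != 0) {
			/* The fix: do not carry the stale code out of the loop. */
			error = 0;
			continue;
		}
		count++;
	}

	*countp = count;
	return error;
}
```

Without the reset, a list whose last entry is denied would return EPERM to the caller even though the visible entries were counted correctly.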
 1.150  12-Nov-2008  ad Allow the POSIX semaphore code to be loaded as a module.
 1.149  22-Oct-2008  ad branches: 1.149.2; 1.149.4;
Set kern.posix_semaphores at runtime so it can be a module.
(Picked wrong header the last time.)
 1.148  22-Oct-2008  ad Set kern.posix_semaphores at runtime so it can be a module.
 1.147  19-Oct-2008  christos rename proc_representative_lwp to proc_active_lwp and clarify it is for
ps display purposes. suggested by rmind.
 1.146  19-Oct-2008  christos Select a "representative" lwp instead of the first lwp in the list. The
first lwp in the list is the last created and in the firefox and gtk-gnash
case this is usually a zombie, so the status in ps was ZLl. This now picks
the lwp in order ONPROC > RUN > SLEEP > STOP > SUSPENDED > IDL > DEAD > ZOMB
and breaks ties using cpticks.
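The selection rule in this entry can be sketched as a single pass over the LWPs. The state codes and struct below are hypothetical, and the sketch assumes that the LWP with more cpticks wins a tie, which the entry does not spell out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state codes, listed from most to least preferred. */
enum demo_state {
	DS_ONPROC, DS_RUN, DS_SLEEP, DS_STOP,
	DS_SUSPENDED, DS_IDL, DS_DEAD, DS_ZOMB
};

struct demo_lwp {
	enum demo_state state;
	unsigned int    cpticks;
};

/* Return the index of the LWP to display, or -1 if there are none. */
int
pick_representative(const struct demo_lwp *l, size_t n)
{
	int best = -1;

	assert(l != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (best == -1 ||
		    l[i].state < l[best].state ||
		    (l[i].state == l[best].state &&
		     l[i].cpticks > l[best].cpticks))	/* assumed tie-break */
			best = (int)i;
	}
	return best;
}
```

Given a zombie, two sleeping LWPs with 3 and 9 cpticks, and a stopped one, the sketch picks the sleeping LWP with 9 cpticks, so the zombie no longer determines the displayed status.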
 1.145  15-Oct-2008  wrstuden Merge wrstuden-revivesa into HEAD.
 1.144  15-Jul-2008  christos make l_flags contain more stuff. Fixes top thread display where system processes
were always displayed.
 1.143  02-Jul-2008  rmind branches: 1.143.2;
Remove proc_representative_lwp(), use a simple LIST_FIRST() instead.
OK by <ad>.
 1.142  16-Jun-2008  ad PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
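The limit described in this entry is plain arithmetic, worked through below. The function names are hypothetical, and the sketch treats "75% of KVA or 75% of physical memory" as 75% of the smaller of the two, which is an interpretation rather than the kernel's exact check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VNODE_COST 2048u	/* worst-case bytes per vnode, per the entry */

/* Largest vnode count whose worst-case footprint stays within 75%. */
uint64_t
demo_max_vnodes(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t limit = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;
	uint64_t max = (limit / 4 * 3) / DEMO_VNODE_COST;

	/* Sanity check: the cap can never exceed the resource itself. */
	assert(max * DEMO_VNODE_COST <= limit);
	return max;
}

/* Would a requested desiredvnodes value be accepted under this rule? */
bool
demo_vnodes_ok(uint64_t requested, uint64_t kva_bytes, uint64_t phys_bytes)
{
	return requested <= demo_max_vnodes(kva_bytes, phys_bytes);
}
```

For 1 GiB of KVA and 4 GiB of physical memory, 75% of 1 GiB is 805306368 bytes, which at 2048 bytes per vnode gives a cap of 393216 vnodes.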
 1.141  16-Jun-2008  ad - PPWAIT need only be locked by proc_lock, so move it to proc::p_lflag.
- Remove a few needless lock acquires from exec/fork/exit.
- Sprinkle branch hints.

No functional change.
 1.140  31-May-2008  ad branches: 1.140.2;
Kill devsw_lock and just use specfs_lock. The two would need merging
in order to prevent unload of modules when a device that they provide
is still open.
 1.139  25-May-2008  christos don't forget to fill in the emulation.
 1.138  12-May-2008  ad Use cpu_index(), not ci_cpuid.
 1.137  30-Apr-2008  ad branches: 1.137.2;
KERN_FILE_BYPID: fix locking botch.
 1.136  29-Apr-2008  ad Don't try grabbing a zombie's p_reflock.
 1.135  29-Apr-2008  ad PR kern/37917 /bin/ps no longer shows zombies
 1.134  28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.133  24-Apr-2008  ad branches: 1.133.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
 1.132  24-Apr-2008  ad Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
 1.131  05-Apr-2008  yamt branches: 1.131.2;
- l_wmesg is not always valid. check l_wchan when using l_wmesg.
should fix a crash reported by Juan RP on current-users@.
- ttyinfo: lock lwp when accessing l_wmesg.
- fill_lwp: add an assertion.
 1.130  04-Apr-2008  cegger use device_xname() where appropriate
OK martin
 1.129  02-Apr-2008  xtraeme Revert rev 1.126-1.128. The original code was correct and rmind and I
didn't look correctly at them.
 1.128  01-Apr-2008  xtraeme When copying l_name and l_wmesg use KI_LNAMELEN and KI_WMESGLEN
respectively, so that we don't care if l_name/wmesg is longer
than kl_name/wmesg and the KASSERTs added in previous can go away.
 1.127  01-Apr-2008  xtraeme Fix previous: use the length of l->l_foo not kl->l_foo and add
two KASSERTs to check for max length limits before copying.

As suggested by rmind@.
 1.126  01-Apr-2008  xtraeme fill_lwp: when copying l_wmesg and l_name, use the size of the string
not of the variable.

Found and ok by rmind@.
 1.125  27-Mar-2008  ad branches: 1.125.2;
Make rusage collection per-LWP and collate in the appropriate places.
cloned threads need a little bit more work but the locking needs to
be fixed first.
 1.124  21-Mar-2008  ad Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
 1.123  27-Feb-2008  matt Convert to ansi definitions from old-style definitions.
 1.122  30-Jan-2008  ad branches: 1.122.2; 1.122.6;
Another locking botch.
 1.121  28-Jan-2008  ad More file/proc locking fixes.
 1.120  23-Jan-2008  elad Tons of process scope changes.

- Add a KAUTH_PROCESS_SCHEDULER action, to handle scheduler related
requests, and add specific requests for set/get scheduler policy and
set/get scheduler parameters.

- Add a KAUTH_PROCESS_KEVENT_FILTER action, to handle kevent(2) related
requests.

- Add a KAUTH_DEVICE_TTY_STI action to handle requests to TIOCSTI.

- Add requests for the KAUTH_PROCESS_CANSEE action, indicating what
process information is being looked at (entry itself, args, env,
open files).

- Add requests for the KAUTH_PROCESS_RLIMIT action indicating set/get.

- Add requests for the KAUTH_PROCESS_CORENAME action indicating set/get.

- Make bsd44 secmodel code handle the newly added requests appropriately.

All of the above make it possible to issue finer-grained kauth(9) calls in
many places, removing some KAUTH_GENERIC_ISSUSER requests.

- Remove the "CAN" from KAUTH_PROCESS_CAN{KTRACE,PROCFS,PTRACE,SIGNAL}.

Discussed with christos@ and yamt@.
 1.119  12-Jan-2008  ad sysctl_kern_proc_args: avoid zero length allocation.
 1.118  07-Jan-2008  ad Patch up sysctl locking:

- Lock processes, credentials, filehead etc correctly.
- Acquire a read hold on sysctl_treelock if only doing a query.
- Don't wire down the output buffer. It doesn't work correctly and the code
regularly does long term sleeps with it held - it's not worth it.
- Don't hold locks other than sysctl_lock while doing copyout().
- Drop sysctl_lock while doing copyout / allocating memory in a few places.
- Don't take kernel_lock for sysctl.
- Fix a number of bugs spotted along the way
 1.117  31-Dec-2007  ad Remove systrace. Ok core@.
 1.116  26-Dec-2007  christos Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]

For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack

For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack

(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.
 1.115  22-Dec-2007  yamt use binuptime for l_stime/l_rtime.
 1.114  10-Dec-2007  elad - Use KAUTH_ARG() instead of casts,
- Don't ignore return value of settime() in sysctl_kern_rtc_offset(), as
suggested by yamt@.

Note: the kauth(9) call in sysctl_kern_rtc_offset() is bogus, but this will
be addressed separately.
 1.113  06-Nov-2007  ad branches: 1.113.2; 1.113.4; 1.113.6;
Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.
 1.112  19-Oct-2007  ad branches: 1.112.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h
 1.111  16-Oct-2007  christos branches: 1.111.2;
Don't fail to produce the argument vector if the program has modified it
by deleting arguments. This is a popular practice, and failing means that
ps(1) prints (programname). For example this is what XtOpenDisplay() does with
-geometry. This used to work before 2.0H, and the behavior is allowed and
hinted by POSIX. Found out by Anon Ymous.
 1.110  16-Oct-2007  christos - fix comment sentence capitalization.
- whitespace cleanup.
No functional changes.
 1.109  15-Oct-2007  ad Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.
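A minimal userland sketch of querying these two values through sysconf(3). The helper names are made up for illustration; only sysconf() and the two _SC_ constants come from the entry above.

```c
#include <unistd.h>

/* Hypothetical helper: processors currently online, or -1 on error. */
long
cpus_online(void)
{
	return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Hypothetical helper: processors configured, or -1 on error. */
long
cpus_configured(void)
{
	return sysconf(_SC_NPROCESSORS_CONF);
}
```

A caller would typically use cpus_online() to size a worker pool and fall back to 1 if it returns -1.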
 1.108  13-Oct-2007  rmind sysctl_kern_lwp: Use a correct variable when rechecking if LWP still
exists after relocking. Found via CID: 4689. OK by <dsl>.
 1.107  08-Oct-2007  ad Merge from vmlocking: don't hold scheduler locks across copyout().
 1.106  28-Sep-2007  joerg Add kern.no_sa_support to easily detect whether a kernel supports
Scheduler Activation or not. This is a negative name as ld.so.conf
conditionals treat undefined sysctls like 0.
 1.105  15-Aug-2007  ad branches: 1.105.2; 1.105.4;
Changes to make ktrace LKM friendly and reduce ifdef KTRACE. Proposed
on tech-kern.
 1.104  06-Aug-2007  yamt branches: 1.104.2;
remove a homegrown definition of CPU_INFO_FOREACH.
 1.103  09-Jul-2007  ad branches: 1.103.2; 1.103.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.102  30-Jun-2007  dsl Add a flags parameter to kauth_cred_get/setgroups() so that sys_get/setgroups
can copy directly to/from userspace.
Avoids exposing the implementation of the group list as an array to code
outside kern_auth.c.
compat code and man page need updating.
 1.101  17-May-2007  yamt merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.
 1.100  30-Apr-2007  dsl Remove proc->p_ru and the 'rusage' pool.
I think it existed to cache the numbers in kernel memory of a zombie when
proc->p_stats was part of the 'u' area - so got freed earlier and wouldn't
(easily) be accessible from a separate process. However since both the
p_ru and p_stats fields are freed at the same time it is no longer needed.
Ride the recent 4.99.19 version change.
 1.99  11-Mar-2007  ad branches: 1.99.2;
Add the LWP's runtime to kinfo_lwp.
 1.98  09-Mar-2007  ad branches: 1.98.2;
- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.
 1.97  17-Feb-2007  pavel Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.
 1.96  15-Feb-2007  ad branches: 1.96.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.
 1.95  09-Feb-2007  ad Merge newlock2 to head.
 1.94  22-Jan-2007  elad Don't rely on KAUTH_PROCESS_CANSEE for environment just yet,
otherwise we're allowing anyone to read the environment unless
curtain is enabled.

From yamt@.
 1.93  27-Nov-2006  elad branches: 1.93.2;
Move Veriexec's sysctl(9) setup routine and helper to kern_verifiedexec.c.
 1.92  25-Nov-2006  christos PR/34837: Mindaguas: Add SysV SHM dynamic reallocation and locking to the
physical memory
 1.91  01-Nov-2006  christos implement kern.arandom properly, instead of lying about it and only filling
the first 4 bytes of the array with random data.
 1.90  29-Oct-2006  christos add the emulation in kinfo_proc2
 1.89  03-Oct-2006  elad Back out previous (p_flag2).

In 30 minutes from now Jason Thorpe will come up with an implementation
of a proplib dictionary in struct proc, so adding an int doesn't really
make any sense.
 1.88  03-Oct-2006  elad Until we figure out the Perfect Way of adding flags to processes, add
a p_flag2. No objections on tech-kern@.

Input from simonb@, thanks!
 1.87  24-Sep-2006  dogcow correct dcopyout #define for !KTRACE case.
 1.86  23-Sep-2006  manu Add a -t+S flag to ktrace for tracing activity related to sysctl. MIB
names will be displayed, with data read and written as well.
 1.85  13-Sep-2006  elad branches: 1.85.2;
Don't use KAUTH_RESULT_* where it's not applicable.
Prompted by yamt@.
 1.84  10-Sep-2006  manu When getting the program argument or environment strings, we previously
assumed that all the strings were stored in a row, separated by NUL chars,
at the address pointed to by argv[0] (or envp[0]).

This was wrong: if the program changed argv[0], we still read the
first string correctly, but the following strings contained unexpected data.

The fix: read the whole argv (or envp) array, then copy the strings one by
one, using their addresses in argv (or envp).
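The idea behind this fix can be sketched in plain userland C. This is an illustration only, not the kernel code: copy_vector() and its parameters are hypothetical names, and the real implementation has to copy the pointers and strings in from another address space. The point is that each string is fetched through its own pointer instead of assuming the strings sit back to back after the first one.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy every string in vec[0..n-1] into out, NUL-separated, following
 * each pointer individually. Returns the number of bytes written, or 0
 * if out is too small (an empty vector also yields 0).
 */
size_t
copy_vector(char *const vec[], size_t n, char *out, size_t outlen)
{
	size_t used = 0;

	for (size_t i = 0; i < n; i++) {
		size_t len = strlen(vec[i]) + 1;	/* include the NUL */

		if (len > outlen - used)
			return 0;
		memcpy(out + used, vec[i], len);
		used += len;
	}
	return used;
}
```

If a program repoints its first argument at a different buffer, the remaining entries are still copied correctly because their own addresses are used.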
 1.83  08-Sep-2006  elad First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)
 1.82  08-Sep-2006  manu When collecting a 32-bit process' argument or environment vector, we need
to convert 32-bit pointers to the 64-bit environment.
 1.81  26-Jul-2006  dogcow branches: 1.81.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.
 1.80  25-Jul-2006  dogcow mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.
 1.79  24-Jul-2006  elad some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.
 1.78  23-Jul-2006  ad Use the LWP cached credentials where sane.
 1.77  17-Jul-2006  ad - Don't cast kauth_cred_t to (struct ucred *), just set pc_ucred = NULL.
- Fill ucred::cr_ref.
 1.76  16-Jul-2006  elad CURTAIN() -> KAUTH_GENERIC_CANSEE.
 1.75  14-Jul-2006  elad move security.setid_core.* to kern.coredump.setid.*, as requested by yamt@.
 1.74  21-Jun-2006  christos Don't leak memory on success. Allocate only the type of struct that we'll
need for efficiency.
 1.73  20-Jun-2006  christos don't allocate too much stuff on the stack.
 1.72  17-Jun-2006  yamt sysctl_security_setidcorename: don't allocate MAXPATHLEN bytes on stack.
 1.71  13-Jun-2006  yamt branches: 1.71.2;
remove unnecessary arguments from kauth_authorize_process.
ie. make it similar to the one found in apple TN.
 1.70  13-Jun-2006  yamt sysctl_kern_file, sysctl_kern_file2: don't abuse kauth_authorize_process
for non-process objects.
 1.69  13-Jun-2006  yamt sysctl_kern_file2: fix an indent.
 1.68  14-May-2006  elad branches: 1.68.2;
integrate kauth.
 1.67  17-Apr-2006  elad Move securelevel-specific stuff to its own file.
 1.66  14-Apr-2006  blymn Make i/o statistics collection more generic, include tape drives and
nfs mounts in the set of devices that statistics will be reported on.
 1.65  01-Apr-2006  christos PR/32809: Pavel Cahyna: Conflicting flags in l_flag and p_flag are causing
ps(1) to print incorrect information. Annotate the flags in the header files
to make sure that flags are not being re-used and move flags so that there
are no conflicts.
 1.64  26-Mar-2006  erh When DIAGNOSTIC is defined, provide a kern.panic_now sysctl to conveniently
and reliably panic the system.
 1.63  01-Mar-2006  yamt branches: 1.63.2; 1.63.4; 1.63.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.
 1.62  04-Feb-2006  yamt for some random places, use PNBUF_GET/PUT rather than
- on-stack buffer
- malloc(MAXPATHLEN)
 1.61  02-Feb-2006  elad branches: 1.61.2;
implement a security.setid_core node as discussed on tech-kern@ and
tech-security@.
 1.60  27-Jan-2006  elad branches: 1.60.2;
remove security node sysctl objects; they are now created using CTL_CREATE.
 1.59  26-Dec-2005  perry branches: 1.59.2;
u_intN_t -> uintN_t
 1.58  11-Dec-2005  christos merge ktrace-lwp.
 1.57  05-Dec-2005  christos - make settime take timespec.
- avoid wrapping of time in settime.
- pass struct proc down so that we can log a detailed message.
 1.56  08-Oct-2005  yamt sysctl_kern_proc_args: don't assume that the process is
resident while we are sleeping.
 1.55  07-Sep-2005  elad Implement curtain in KERN_{PROC,PROC2,FILE,FILE2,PROC_ARGS}.
While I'm here, disable curtain by default.
 1.54  07-Sep-2005  elad Introduce ``security.curtain'', new node for security features and
settings, and new variable for controlling access to objects based
on user-id.
 1.53  06-Sep-2005  rpaulo Implement kern.hardclock_ticks.
 1.52  24-Aug-2005  simonb Fix a tyop in a comment.
 1.51  13-Aug-2005  blymn Remove the tape stats from here, they caused issues on non-scsipi
architectures.
 1.50  08-Aug-2005  blymn Don't include tape stats functions if no devices configured.
 1.49  07-Aug-2005  blymn Add tape statistics gathering functions.
 1.48  29-Jul-2005  elad #ifdef VERIFIED_EXEC
 1.47  16-Jul-2005  christos defopt verified_exec.
 1.46  17-Jun-2005  atatat branches: 1.46.2;
Comment in new cp_id implementation was wrong since I abandoned
rewriting it in favor of some testing and then never got back to it.
It's better now.
 1.45  16-Jun-2005  christos Add a new sysctl 'cp_id' that returns the array of cpu id values. Requested by
me, implemented by atatat.
 1.44  15-Jun-2005  elad Fix sysctl handling for raise-only variables. This affected the veriexec
node entirely. Reported by Nino Dehne.
 1.43  09-Jun-2005  atatat Properly fix the constipated lossage wrt -Wcast-qual and the sysctl
code. I know it's not the prettiest code, but it seems to work rather
well in spite of itself.
 1.42  06-Jun-2005  jdc Revert previous ('_ncpus' is now 'ncpus' again).
MI variable names have precedence.
 1.41  05-Jun-2005  jdc Rename 'ncpus' to '_ncpus', otherwise we shadow sparc/sparc64's 'ncpus'
when MULTIPROCESSOR is defined.
 1.40  29-May-2005  christos - add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.
 1.39  22-May-2005  elad Add indication for number of fingerprinted files on each device.

When a table is created for a new device, a new variable is created
under the kern.veriexec.count node named "dev_<id>". For example,
dev_0, dev_3, etc.
 1.38  19-May-2005  elad Some changes in veriexec.

New features:

- Add a veriexec_report() routine to make most reporting consistent and
remove some common code.
- Add 'strict' mode that controls how veriexec behaves.
- Add sysctl knobs:
o kern.veriexec.verbose controls verbosity levels. Value: 0, 1.
o kern.veriexec.strict controls strict level. Values: 0, 1, 2. See
documentation in sysctl(3) for details.
o kern.veriexec.algorithms returns a string with a space separated
list of supported hashing algorithms in veriexec.
- Updated documentation in man pages for sysctl(3) and sysctl(8).

Bug fixes:

- veriexec_removechk(): Code cleanup + handle FINGERPRINT_NOTEVAL
correctly.
- exec_script(): Don't pass 0 as flag when executing a script; use the
defined VERIEXEC_INDIRECT - which is 1. Makes indirect execution
enforcement work.
- Fix some printing formats and types.
 1.37  18-Apr-2005  mrg be explicit in the description for POSIX saved set-id that this is for
POSIX-style, not sane-style. (ie, add "POSIX " to the description.)
 1.36  11-Mar-2005  atatat branches: 1.36.2;
Revert the change that made kern.file2 and net.*.*.pcblist into nodes
instead of structs. It had other deleterious side-effects that are
rather nasty. Another solution must be found.
 1.35  10-Mar-2005  atatat Change types of kern.file2 and net.*.*.pcblist to NODE
 1.34  09-Mar-2005  atatat Add kern.file2. As kern.proc2 is to kern.proc, so is kern.file2 to
kern.file, namely a 32/64 bit clean sysctl interface to the same data.
It also borrows a few things from struct vnode (if applicable) and
from struct proc, just to tie things together a bit more.

You can walk this list "by file" or "by pid". The former method is
similar to kern.file but omits the filehead, and the latter can give
you duplicates if multiple processes have the same struct file open,
but tells you which process it is.
 1.33  26-Feb-2005  perry nuke trailing whitespace
 1.32  01-Oct-2004  yamt branches: 1.32.4; 1.32.6;
introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
 1.31  27-Jul-2004  atatat branches: 1.31.2;
The message buffer datum instrumented by KERN_MSGBUFSIZE is actually a
long, not an int, and this causes "problems" on LP64be machines
(sparc64, etc). Assign the value to a temporary int and instrument
that instead. Should be fine until someone wants a message buffer
larger than two gigabytes.
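A small sketch of why an int-sized read of a long goes wrong on a big-endian LP64 machine, and of the temporary-int workaround. The byte layout is simulated so the example behaves the same on any host; both function names are hypothetical.

```c
#include <stdint.h>

/*
 * Model what a 32-bit reader sees when it is pointed at a 64-bit value
 * stored in big-endian byte order: it picks up the first four bytes,
 * which hold the HIGH half of the value.
 */
uint32_t
read_int_from_be_long(uint64_t value)
{
	unsigned char bytes[8];
	uint32_t seen = 0;

	/* Lay the value out as a big-endian LP64 machine would. */
	for (int i = 0; i < 8; i++)
		bytes[i] = (unsigned char)(value >> (56 - 8 * i));

	/* Decode only the first four bytes, as an int-sized read does. */
	for (int i = 0; i < 4; i++)
		seen = (seen << 8) | bytes[i];
	return seen;
}

/* The workaround: narrow to a temporary int and export that instead. */
int
narrow_to_int(long value)
{
	int tmp = (int)value;	/* fine while the value fits in an int */

	return tmp;
}
```

For an ordinary buffer size such as 32768, the simulated int-sized read yields 0, while the temporary int carries the expected value.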
 1.30  26-May-2004  christos (off_t)(long) is wrong when it comes to kernel addresses [because on a 32 bit
machine if the high bit is set they turn negative]. Make an intermediate cast
to unsigned long.
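The sign problem can be shown with fixed-width types standing in for the 32-bit case: int32_t plays the role of long, uint32_t the role of unsigned long, and int64_t the role of off_t. The function names are hypothetical.

```c
#include <stdint.h>

/* The broken pattern, modelled on (off_t)(long)addr. */
int64_t
widen_via_signed(uint32_t addr)
{
	/*
	 * Converting an out-of-range value to int32_t is
	 * implementation-defined; common compilers wrap it.
	 */
	return (int64_t)(int32_t)addr;
}

/* The fixed pattern, modelled on (off_t)(unsigned long)addr. */
int64_t
widen_via_unsigned(uint32_t addr)
{
	return (int64_t)addr;
}
```

With an address such as 0xc0000000, the signed route produces a negative offset, while the unsigned route keeps the value intact. Addresses with the high bit clear come out the same either way.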
 1.29  03-May-2004  martin Fix a comment.
Approved by Andrew Brown.
 1.28  23-Apr-2004  simonb s/the the/the/ (only in sources that aren't regularly imported from
elsewhere).
 1.27  16-Apr-2004  atatat Prefer that kern.hostid is printed in hex, not as a signed decimal,
and avoid accidental sign-extension when setting it.
 1.26  08-Apr-2004  atatat Lots of sysctl descriptions (if someone wants to help out here, that
would be good) mostly copied from sysctl(3). This takes care of the
top-level, most of kern.* and hw.* (modulo the ath and bge stuff), and
all of proc.*.

If you don't want the added rodata in your kernel, use "options
SYSCTL_NO_DESCR" in your kernel config.
 1.25  08-Apr-2004  atatat Clear out the struct kinfo_drivers before stuffing things into it.
Avoids leaking garbage from the stack (left over from the earlier
call to sysctl_locate()).
 1.24  24-Mar-2004  atatat branches: 1.24.2;
Tango on sysctl_createv() and flags. The flags have all been renamed,
and sysctl_createv() now uses more arguments.
 1.23  17-Mar-2004  yamt - move kern.somaxkva sysctl stuff from init_sysctl.c to uipc_socket.c.
- when changing its value, wakeup sokva waiters.
 1.22  21-Feb-2004  atatat Use KERN_PROCSLOP for struct kinfo_proc and KERN_LWPSLOP for
struct kinfo_lwp, and not vice versa.

Should solve the issue with top dying because it's unable to "allocate
memory".
 1.21  19-Feb-2004  atatat Use new PTRTOUINT64() macro instead of local PTRTOINT64() macro.
 1.20  17-Jan-2004  atatat Avoid dereferencing l...it might be NULL
 1.19  28-Dec-2003  atatat Sysctl functions called for "generic" nodes should forward "query"
requests (where possible), rather than returning errors.
 1.18  28-Dec-2003  atatat Adjust error returns in kern.cp_time when a specific processor is
being requested so that (1) the uniprocessor case and the
multiprocessor case are more similar and (2) so that we return ENOENT
when a non-existent processor is requested (which is both more
sensible and follows the general order of things anyway).
 1.17  28-Dec-2003  atatat Rename sysctl_kern_hostname() to sysctl_setlen() and use it also for
domainname. Note that there's no need to copy rnode since we're not
changing any of it, nor protecting anything from change.

Thanks to martin for initial work.
 1.16  28-Dec-2003  atatat RCSid police
 1.15  28-Dec-2003  martin After changing hostname, adjust hostnamelen.
This closes PR kern/23907.
 1.14  26-Dec-2003  martin Make kern.rtc_offset writable at securelevel <= 0.
This allows boot-time adjustment when a machine runs other OSes with
RTC == localtime.
 1.13  20-Dec-2003  yamt update a comment to match with the previous change (rev.1.12).
 1.12  20-Dec-2003  yamt restore functionality to decrease kern.maxvnodes which
has been backed out during sysctl rework.
 1.11  12-Dec-2003  simonb In sysctl_kern_lwp adjust offsets into the mib entries so that
they are now correct. Fixes problems with "ps -s" not working.
Also use KERN_LWPSLOP instead of KERN_PROCSLOP.

Both changes from Andrew Brown.
 1.10  10-Dec-2003  atatat Make kern.dump_on_panic writeable again, too
 1.9  09-Dec-2003  atatat Make kern.sbmax writeable again as well.

From a follow-on to PR kern/23695 by a Mr. Davis, which I missed at a
quick glance.
 1.8  09-Dec-2003  atatat Make kern.logsigexit writeable again.

Fixes PR kern/23695.
 1.7  07-Dec-2003  martin Add missing break.
 1.6  07-Dec-2003  he Also make declaration of sysctl_kern_maxptys() depend on NPTY > 0.
Makes the mvme68k RAMDISK kernel compile again.
 1.5  06-Dec-2003  martin Fix kern.cp_time for MULTIPROCESSOR kernels: calculate size of result
correctly, free original instead of incremented pointer, copy results for
n = -2 case too, so top shows correct stats.
Additionally, rearrange code for better readability (from Andrew).
 1.4  06-Dec-2003  fvdl Include opt_posix.h for the P1003_1B_SEMAPHORE define.
Include <machine/cpu.h> just to be sure.
 1.3  06-Dec-2003  martin We can not count CPUs at sysctl initialization time - so don't make
hw.ncpu an immediate value.
 1.2  06-Dec-2003  atatat #include "opt_multiprocessor.h"

This makes hw.ncpu and kern.cp_time work better on those platforms.
 1.1  04-Dec-2003  atatat Dynamic sysctl.

Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.

Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded. Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.

All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.

PS - I'm sorry, but there's a distinct lack of documentation at the
moment. I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
 1.24.2.5  28-Jul-2004  tron Pull up revision 1.31 (requested by atatat in ticket #720):
The message buffer datum instrumented by KERN_MSGBUFSIZE is actually a
long, not an int, and this causes "problems" on LP64be machines
(sparc64, etc). Assign the value to a temporary int and instrument
that instead. Should be fine until someone wants a message buffer
larger than two gigabytes.
 1.24.2.4  06-May-2004  jmc Pullup rev 1.29 (requested by martin in ticket #257)

Fix a comment.
 1.24.2.3  21-Apr-2004  jmc Pullup rev 1.27 (requested by atatat in ticket #150)

Prefer that kern.hostid is printed in hex, not as a signed decimal,
and avoid accidental sign-extension when setting it.
 1.24.2.2  21-Apr-2004  jmc Pullup rev 1.26 (requested by atatat in ticket #93)

Lots of sysctl descriptions mostly copied from sysctl(3).
 1.24.2.1  08-Apr-2004  jmc Pullup rev 1.25 (requested by atatat in ticket #85)

Clear out the struct kinfo_drivers before stuffing things into it.
Avoids leaking garbage from the stack (left over from the earlier
call to sysctl_locate()).
 1.31.2.9  11-Dec-2005  christos Sync with head.
 1.31.2.8  10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.31.2.7  01-Apr-2005  skrll Sync with HEAD.
 1.31.2.6  04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.31.2.5  19-Oct-2004  skrll Sync with HEAD
 1.31.2.4  21-Sep-2004  skrll Fix the sync with head I botched.
 1.31.2.3  18-Sep-2004  skrll Sync with HEAD.
 1.31.2.2  03-Aug-2004  skrll Sync with HEAD
 1.31.2.1  27-Jul-2004  skrll file init_sysctl.c was added on branch ktrace-lwp on 2004-08-03 10:52:43 +0000
 1.32.6.1  19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.32.4.1  29-Apr-2005  kent sync with -current
 1.36.2.8  20-Mar-2009  msaitoh Pull up following revision(s) (requested by mrg in ticket #1999):
sys/kern/init_sysctl.c: revision 1.158
always calculate "needed" for KERN_FILE2 calls. this allows a caller
to get an estimate of the needed space, like the intention is.
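A rough sketch of the "always calculate needed" pattern, written as ordinary C rather than as the actual sysctl handler. export_records() and its parameters are hypothetical, and returning ENOMEM on truncation is an assumption made for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy as many whole records as fit into buf, but always report the
 * full size through *needed, even when buf is NULL or too small.
 * Returns 0 on a complete copy (or a pure size query) and ENOMEM when
 * the output was truncated.
 */
int
export_records(const void *recs, size_t nrec, size_t recsize,
    void *buf, size_t buflen, size_t *needed)
{
	size_t fit;

	*needed = nrec * recsize;	/* computed unconditionally */
	if (buf == NULL)
		return 0;		/* size query only */

	fit = recsize != 0 ? buflen / recsize : 0;
	if (fit > nrec)
		fit = nrec;
	memcpy(buf, recs, fit * recsize);
	return fit == nrec ? 0 : ENOMEM;
}
```

A caller can pass a NULL buffer first to learn the size, allocate, and call again. Because needed is filled in on every path, a short buffer still tells the caller how much to allocate for a retry.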
 1.36.2.7  07-Apr-2006  tron branches: 1.36.2.7.2;
Pull up following revision(s) (requested by pavel in ticket #1244):
sys/sys/lwp.h: revision 1.36 via patch
sys/sys/proc.h: revision 1.219 via patch
sys/kern/init_sysctl.c: revision 1.65 via patch
PR/32809: Pavel Cahyna: Conflicting flags in l_flag and p_flag are causing
ps(1) to print incorrect information. Annotate the flags in the header files
to make sure that flags are not being re-used and move flags so that there
are no conflicts.
 1.36.2.6  08-Sep-2005  tron branches: 1.36.2.6.2;
Apply patch (requested by elad in ticket #740):
Defopt VERIFIED_EXEC.
 1.36.2.5  23-Aug-2005  tron Backout ticket 685. It causes build failures.
 1.36.2.4  23-Aug-2005  tron Pull up revision 1.47 (requested by elad in ticket #685):
defopt verified_exec.
 1.36.2.3  02-Jul-2005  tron Pull up revision 1.44 (requested by elad in ticket #487):
Fix sysctl handling for raise-only variables. This affected the veriexec
node entirely. Reported by Nino Dehne.
 1.36.2.2  10-Jun-2005  tron Pull up revision 1.39 (requested by elad in ticket #389):
Add indication for number of fingerprinted files on each device.
When a table is created for a new device, a new variable is created
under the kern.veriexec.count node named "dev_<id>". For example,
dev_0, dev_3, etc.
 1.36.2.1  10-Jun-2005  tron Pull up revision 1.38 (requested by elad in ticket #389):
Some changes in veriexec.
New features:
- Add a veriexec_report() routine to make most reporting consistent and
remove some common code.
- Add 'strict' mode that controls how veriexec behaves.
- Add sysctl knobs:
o kern.veriexec.verbose controls verbosity levels. Value: 0, 1.
o kern.veriexec.strict controls strict level. Values: 0, 1, 2. See
documentation in sysctl(3) for details.
o kern.veriexec.algorithms returns a string with a space separated
list of supported hashing algorithms in veriexec.
- Updated documentation in man pages for sysctl(3) and sysctl(8).
Bug fixes:
- veriexec_removechk(): Code cleanup + handle FINGERPRINT_NOTEVAL
correctly.
- exec_script(): Don't pass 0 as flag when executing a script; use the
defined VERIEXEC_INDIRECT - which is 1. Makes indirect execution
enforcement work.
- Fix some printing formats and types.
 1.36.2.7.2.1  27-Mar-2009  msaitoh Pull up following revision(s) (requested by mrg in ticket #1999):
sys/kern/init_sysctl.c: revision 1.158
always calculate "needed" for KERN_FILE2 calls. this allows a caller
to get an estimate of the needed space, like the intention is.
 1.36.2.6.2.1  27-Mar-2009  msaitoh Pull up following revision(s) (requested by mrg in ticket #1999):
sys/kern/init_sysctl.c: revision 1.158
always calculate "needed" for KERN_FILE2 calls. this allows a caller
to get an estimate of the needed space, like the intention is.
 1.46.2.11  24-Mar-2008  yamt sync with head.
 1.46.2.10  17-Mar-2008  yamt sync with head.
 1.46.2.9  04-Feb-2008  yamt sync with head.
 1.46.2.8  21-Jan-2008  yamt sync with head
 1.46.2.7  15-Nov-2007  yamt sync with head.
 1.46.2.6  27-Oct-2007  yamt sync with head.
 1.46.2.5  03-Sep-2007  yamt sync with head.
 1.46.2.4  26-Feb-2007  yamt sync with head.
 1.46.2.3  30-Dec-2006  yamt sync with head.
 1.46.2.2  21-Jun-2006  yamt pull init_sysctl.c rev.1.74.
 1.46.2.1  21-Jun-2006  yamt sync with head.
 1.59.2.3  18-Feb-2006  yamt sync with head.
 1.59.2.2  01-Feb-2006  yamt sync with head.
 1.59.2.1  31-Dec-2005  yamt uio_segflg/uio_lwp -> uio_vmspace.
 1.60.2.1  09-Sep-2006  rpaulo sync with head
 1.61.2.2  01-Jun-2006  kardel Sync with head.
 1.61.2.1  22-Apr-2006  simonb Sync with head.
 1.63.6.2  24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.63.6.1  28-Mar-2006  tron Merge 2006-03-28 NetBSD-current into the "peter-altq" branch.
 1.63.4.8  06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.63.4.7  21-Apr-2006  elad fix typo.
 1.63.4.6  21-Apr-2006  elad adjust number of groups to kauth_cred_getgroups() with min() as in the
original code. okay christos@
 1.63.4.5  20-Apr-2006  christos Pass the correct number of groups instead of the size of the groups array
to avoid a KASSERT panic.
 1.63.4.4  19-Apr-2006  elad sync with head.
 1.63.4.3  13-Apr-2006  elad Deprecate use of CURTAIN() where it's easy -- now it's done via kauth(9),
process scope, CANSEE.
 1.63.4.2  14-Mar-2006  elad Use kauth_cred_[sg]etgroups() where appropriate.
 1.63.4.1  08-Mar-2006  elad Adapt to kernel authorization KPI.
 1.63.2.5  14-Sep-2006  yamt sync with head.
 1.63.2.4  11-Aug-2006  yamt sync with head
 1.63.2.3  26-Jun-2006  yamt sync with head.
 1.63.2.2  24-May-2006  yamt sync with head.
 1.63.2.1  01-Apr-2006  yamt sync with head.
 1.68.2.1  19-Jun-2006  chap Sync with head.
 1.71.2.1  13-Jul-2006  gdamore Merge from HEAD.
 1.81.4.11  05-Feb-2007  ad - When clearing signals dequeue siginfo first and free later, once
outside the lock perimeter.
- Push kernel_lock back in a couple of places.
- Adjust limcopy() to be MP safe (this needs redoing).
- Fix a couple of bugs noticed along the way.
- Catch up with condvar changes.
 1.81.4.10  04-Feb-2007  ad Add a compat flag to match LPR_DETACHED.
 1.81.4.9  01-Feb-2007  ad Sync with head.
 1.81.4.8  30-Jan-2007  ad Remove support for SA. Ok core@.
 1.81.4.7  12-Jan-2007  ad Sync with head.
 1.81.4.6  29-Dec-2006  ad Checkpoint work in progress.
 1.81.4.5  18-Nov-2006  ad Sync with head.
 1.81.4.4  17-Nov-2006  ad Checkpoint work in progress.
 1.81.4.3  24-Oct-2006  ad - Redo LWP locking slightly and fix some races.
- Fix some locking botches.
- Make signal mask / stack per-proc for SA processes.
- Add _lwp_kill().
 1.81.4.2  21-Oct-2006  ad Checkpoint work in progress on locking and per-LWP signals. Very much a
work in progress and there is still a lot to do.
 1.81.4.1  11-Sep-2006  ad - Convert some lockmgr() locks to mutexes and RW locks.
- Acquire proclist_lock and p_crmutex in some obvious places.
 1.85.2.2  10-Dec-2006  yamt sync with head.
 1.85.2.1  22-Oct-2006  yamt sync with head
 1.93.2.3  07-Mar-2011  snj Apply patch (requested by joerg in ticket 1419):
Sanitize arguments before memory allocation.
 1.93.2.2  20-Mar-2009  msaitoh Pull up following revision(s) (requested by mrg in ticket #1287):
sys/kern/init_sysctl.c: revision 1.158
always calculate "needed" for KERN_FILE2 calls. this allows a caller
to get an estimate of the needed space, like the intention is.
 1.93.2.1  28-Jan-2007  tron branches: 1.93.2.1.6;
Pull up following revision(s) (requested by elad in ticket #383):
sys/kern/init_sysctl.c: revision 1.94
Don't rely on KAUTH_PROCESS_CANSEE for environment just yet.
From yamt@.
 1.93.2.1.6.2  07-Mar-2011  snj Apply patch (requested by joerg in ticket 1419):
Sanitize arguments before memory allocation.
 1.93.2.1.6.1  27-Mar-2009  msaitoh Pull up following revision(s) (requested by mrg in ticket #1287):
sys/kern/init_sysctl.c: revision 1.158
always calculate "needed" for KERN_FILE2 calls. this allows a caller
to get an estimate of the needed space, like the intention is.
 1.96.2.7  07-May-2007  yamt sync with head.
 1.96.2.6  21-Apr-2007  ad Some changes mainly for top/ps:

- Add an optional name field to struct lwp.
- Count the total number of context switches + involuntary,
not voluntary + involuntary.
- Mark the idle threads as LSIDL when not running, otherwise
they show up funny in a top(1) that shows threads.
- Make pctcpu and cpticks per-LWP attributes.
- Add to kinfo_lwp: cpticks, pctcpu, pid, name.
 1.96.2.5  02-Apr-2007  rmind - Move the ccpu sysctl back to the scheduler-independent part.
- Move the scheduler-independent parts of 4BSD's schedcpu() to
kern_synch.c.
- Add scheduler-specific hook to satisfy individual scheduler's
needs.
- Remove autonice, which is archaic and not useful.

Patch provided by Daniel Sieger.
 1.96.2.4  23-Mar-2007  yamt don't bother to show l_forw and l_back to userland.
 1.96.2.3  12-Mar-2007  rmind Sync with HEAD.
 1.96.2.2  09-Mar-2007  rmind Checkpoint:

- Addition of scheduler-specific pointers in the struct proc, lwp and
schedstate_percpu.
- Addition of sched_lwp_fork(), sched_lwp_exit() and sched_slept() hooks.
- mi_switch() now has only one argument.
- sched_nextlwp(void) becomes sched_switch(struct lwp *) and does an
enqueueing of LWP.
- Addition of general kern.sched sysctl node.
- Remove twice called uvmexp.swtch++, other cleanups.

Discussed on tech-kern@
 1.96.2.1  27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.98.2.10  01-Nov-2007  ad - Fix interactivity problems under high load. Because soft interrupts
are being stacked on top of regular LWPs, more often than not aston()
was being called on a soft interrupt thread instead of a user thread,
meaning that preemption was not happening on EOI.

- Don't use bool in a couple of data structures. Sub-word writes are not
always atomic and may clobber other fields in the containing word.

- For SCHED_4BSD, make p_estcpu per thread (l_estcpu). Rework how the
dynamic priority level is calculated - it's much better behaved now.

- Kill the l_usrpri/l_priority split now that priorities are no longer
directly assigned by tsleep(). There are three fields describing LWP
priority:

l_priority: Dynamic priority calculated by the scheduler.
This does not change for kernel/realtime threads,
and always stays within the correct band. Eg for
timeshared LWPs it never moves out of the user
priority range. This is basically what l_usrpri
was before.

l_inheritedprio: Lent to the LWP due to priority inheritance
(turnstiles).

l_kpriority: A boolean value set true the first time an LWP
sleeps within the kernel. This indicates that the LWP
should get a priority boost as compensation for blocking.
lwp_eprio() now does the equivalent of sched_kpri() if
the flag is set. The flag is cleared in userret().

- Keep track of scheduling class (OTHER, FIFO, RR) in struct lwp, and use
this to make decisions in a few places where we previously tested for a
kernel thread.

- Partially fix itimers and usr/sys/intr time accounting in the presence
of software interrupts.

- Use kthread_create() to create idle LWPs. Move priority definitions
from the various modules into sys/param.h.

- newlwp -> lwp_create
 1.98.2.9  23-Oct-2007  ad Sync with head.
 1.98.2.8  09-Oct-2007  ad Sync with head.
 1.98.2.7  20-Aug-2007  ad Sync with HEAD.
 1.98.2.6  18-Aug-2007  yamt sysctl_kern_lwp:
- don't copyout with p_smutex held.
on i386, copyout can involve pmap_load, which can block.
- don't forget to release mutexes on error.
ok'ed by Andrew Doran.
 1.98.2.5  15-Jul-2007  ad Sync with head.
 1.98.2.4  01-Jul-2007  ad - LW_SELECT is gone
- struct callout -> callout_t
 1.98.2.3  08-Jun-2007  ad Sync with head.
 1.98.2.2  05-Apr-2007  ad - Make context switch counters 64-bit, and count the total number of
context switches + voluntary, instead of involuntary + voluntary.
- Add lwp::l_swaplock for uvm.
- PHOLD/PRELE are replaced.
 1.98.2.1  13-Mar-2007  ad Sync with head.
 1.99.2.1  11-Jul-2007  mjf Sync with head.
 1.103.6.5  06-Nov-2007  joerg Sync with HEAD.
 1.103.6.4  26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.103.6.3  02-Oct-2007  joerg Sync with HEAD.
 1.103.6.2  16-Aug-2007  jmcneill Sync with HEAD.
 1.103.6.1  09-Aug-2007  jmcneill Sync with HEAD.
 1.103.2.2  03-Sep-2007  skrll Sync with HEAD.
 1.103.2.1  15-Aug-2007  skrll Sync with HEAD.
 1.104.2.2  06-Aug-2007  yamt remove a homegrown definition of CPU_INFO_FOREACH.
 1.104.2.1  06-Aug-2007  yamt file init_sysctl.c was added on branch matt-mips64 on 2007-08-06 11:51:47 +0000
 1.105.4.3  18-Oct-2007  yamt sync with head.
 1.105.4.2  14-Oct-2007  yamt sync with head.
 1.105.4.1  06-Oct-2007  yamt sync with head.
 1.105.2.3  23-Mar-2008  matt sync with HEAD
 1.105.2.2  09-Jan-2008  matt sync with HEAD
 1.105.2.1  06-Nov-2007  matt sync with HEAD
 1.111.2.2  13-Nov-2007  bouyer Sync with HEAD
 1.111.2.1  25-Oct-2007  bouyer Sync with HEAD.
 1.112.2.3  18-Feb-2008  mjf Sync with HEAD.
 1.112.2.2  27-Dec-2007  mjf Sync with HEAD.
 1.112.2.1  19-Nov-2007  mjf Sync with HEAD.
 1.113.6.5  23-Jan-2008  bouyer Sync with HEAD.
 1.113.6.4  19-Jan-2008  bouyer Sync with HEAD
 1.113.6.3  08-Jan-2008  bouyer Sync with HEAD
 1.113.6.2  02-Jan-2008  bouyer Sync with HEAD
 1.113.6.1  13-Dec-2007  bouyer Sync with HEAD
 1.113.4.1  11-Dec-2007  yamt sync with head.
 1.113.2.1  26-Dec-2007  ad Sync with head.
 1.122.6.5  17-Jan-2009  mjf Sync with HEAD.
 1.122.6.4  28-Sep-2008  mjf Sync with HEAD.
 1.122.6.3  29-Jun-2008  mjf Sync with HEAD.
 1.122.6.2  02-Jun-2008  mjf Sync with HEAD.
 1.122.6.1  03-Apr-2008  mjf Sync with HEAD.
 1.122.2.1  24-Mar-2008  keiichi sync with head.
 1.125.2.7  30-Dec-2008  christos sync with head.
 1.125.2.6  29-Dec-2008  christos adjust for tdev.
 1.125.2.5  27-Dec-2008  christos merge with head.
 1.125.2.4  20-Nov-2008  christos merge with head.
 1.125.2.3  01-Nov-2008  christos catch up with changes in head.
 1.125.2.2  01-Nov-2008  christos Sync with head.
 1.125.2.1  29-Mar-2008  christos Welcome to the time_t=long long dev_t=uint64_t branch.
 1.131.2.3  17-Jun-2008  yamt sync with head.
 1.131.2.2  04-Jun-2008  yamt sync with head
 1.131.2.1  18-May-2008  yamt sync with head.
 1.133.2.8  11-Aug-2010  yamt sync with head.
 1.133.2.7  11-Mar-2010  yamt sync with head
 1.133.2.6  16-Sep-2009  yamt sync with head
 1.133.2.5  19-Aug-2009  yamt sync with head.
 1.133.2.4  20-Jun-2009  yamt sync with head
 1.133.2.3  16-May-2009  yamt sync with head
 1.133.2.2  04-May-2009  yamt sync with head.
 1.133.2.1  16-May-2008  yamt sync with head.
 1.137.2.5  14-Oct-2008  wrstuden Adapt kern.no_sa_support so that it changes sa_system_disabled
and thus actually controls SA.
 1.137.2.4  18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.137.2.3  23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.137.2.2  14-May-2008  wrstuden Per discussion with ad at n dot o, revert signal mask handling
changes.

The l_sigstk changes are most likely totally un-needed as SA will
never use a signal stack - we send an upcall (or will as other
diffs are brought in).

The l_sigmask changes were too controversial. In all honesty, I
think it's probably best to revert them. The main reason they were
there is the fact that in an SA process, we don't mask signals per
kernel thread, we mask them per user thread. In the kernel, we want
them all to get turned into upcalls. Thus the normal state of
l_sigmask in an SA process is for it to always be empty.

While we are in the process of delivering a signal, we want to
temporarily mask a signal (so we don't recursively exhaust our
upcall stacks). However signal delivery is rare (important, but
rare), and delivering back-to-back signals is even rarer. So rather
than cause every user of a signal mask to be prepared for this very
rare case, we will just add a second check later in the signal
delivery code. Said change is not in this diff.

This also un-compensates all of our compatibility code for dealing
with SA. SA is a NetBSD-specific thing, so there's no need for
Irix, Linux, Solaris, SVR4 and so on to cope with it.

As previously, everything other than kern_sa.c compiles in i386
GENERIC as of this checkin. I will switch to ALL soon for compile
testing.
 1.137.2.1  10-May-2008  wrstuden Initial checkin of re-adding SA. Everything except kern_sa.c
compiles in GENERIC for i386. This is still a work-in-progress, but
this checkin covers most of the mechanical work (changing signalling
to be able to accommodate SA's process-wide signalling and re-adding
includes of sys/sa.h and savar.h). Subsequent changes will be much
more interesting.

Also, kern_sa.c has received partial cleanup. There's still more
to do, though.
 1.140.2.3  18-Jul-2008  simonb Sync with head.
 1.140.2.2  03-Jul-2008  simonb Sync with head.
 1.140.2.1  18-Jun-2008  simonb Sync with head.
 1.143.2.2  13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.143.2.1  19-Oct-2008  haad Sync with HEAD.
 1.149.4.8  07-Mar-2011  snj Apply patch (requested by joerg in ticket 1575):
Sanitize arguments before memory allocation.
 1.149.4.7  01-Jul-2009  snj branches: 1.149.4.7.2;
Pull up following revision(s) (requested by rmind in ticket #839):
sys/kern/init_sysctl.c: revision 1.163
sysctl_doeproc:
- simplify.
- KERN_PROC: fix possible stale proc pointer dereference.
- KERN_PROC: don't do copyout with proc_lock held.
 1.149.4.6  01-Jul-2009  snj Pull up following revision(s) (requested by rmind in ticket #838):
sys/kern/init_sysctl.c: revision 1.162
sys/kern/vfs_trans.c: revision 1.25
don't forget to skip marker processes.
 1.149.4.5  01-Jul-2009  snj Pull up following revision(s) (requested by rmind in ticket #835):
sys/kern/init_sysctl.c: revision 1.161
sysctl_doeproc: fix a bug in rev.1.135.
don't forget to mark our marker process PK_MARKER.
this fixes crashes in sched_pstats, etc.
 1.149.4.4  01-Apr-2009  snj branches: 1.149.4.4.2;
Pull up following revision(s) (requested by mrg in ticket #622):
bin/csh/csh.1: revision 1.46
bin/csh/func.c: revision 1.37
bin/ps/print.c: revision 1.111
bin/ps/ps.c: revision 1.74
bin/sh/miscbltin.c: revision 1.38
bin/sh/sh.1: revision 1.92 via patch
external/bsd/top/dist/machine/m_netbsd.c: revision 1.7
lib/libkvm/kvm_proc.c: revision 1.82
sys/arch/mips/mips/cpu_exec.c: revision 1.55
sys/compat/darwin/darwin_exec.c: revision 1.57
sys/compat/ibcs2/ibcs2_exec.c: revision 1.73
sys/compat/irix/irix_resource.c: revision 1.15
sys/compat/linux/arch/amd64/linux_exec_machdep.c: revision 1.16
sys/compat/linux/arch/i386/linux_exec_machdep.c: revision 1.12
sys/compat/linux/common/linux_limit.h: revision 1.5
sys/compat/osf1/osf1_resource.c: revision 1.14
sys/compat/svr4/svr4_resource.c: revision 1.18
sys/compat/svr4_32/svr4_32_resource.c: revision 1.17
sys/kern/exec_subr.c: revision 1.62
sys/kern/init_sysctl.c: revision 1.160
sys/kern/kern_exec.c: revision 1.288
sys/kern/kern_resource.c: revision 1.151
sys/sys/param.h: patch
sys/sys/resource.h: revision 1.31
sys/sys/sysctl.h: revision 1.184
sys/uvm/uvm_extern.h: revision 1.153
sys/uvm/uvm_glue.c: revision 1.136
sys/uvm/uvm_mmap.c: revision 1.128
usr.bin/systat/ps.c: revision 1.32
- - add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.
- - adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.
- - add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)
- - patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)
- - patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.
- - update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but it's best left
as part of the full-update of compat/freebsd.)
this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.
tested on i386 and sparc64, build tested on several other platforms.
thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)
 1.149.4.3  15-Mar-2009  snj Pull up following revision(s) (requested by mrg in ticket #566):
sys/kern/init_sysctl.c: revision 1.157
sys/kern/kern_descrip.c: revision 1.187
usr.sbin/pstat/pstat.c: revision 1.112
Don't bother with file_t::f_iflags any more, as it's not used.
Noted by mrg@.
 1.149.4.2  15-Mar-2009  snj Pull up following revision(s) (requested by mrg in ticket #565):
sys/kern/init_sysctl.c: revision 1.158
always calculate "needed" for KERN_FILE2 calls. this allows a caller
to get an estimate of the needed space, like the intention is.
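
The "needed" convention described in the entry above can be sketched in
plain C. This is only an illustration under stated assumptions, not the
KERN_FILE2 code: export_records() and its parameters are hypothetical
names, and unlike the real handler it does not do a partial copy when
the buffer is too small. The point shown is that the required size is
always computed and reported, whether or not the caller passed a buffer.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical export routine (not the kernel's): copies nrecs records
 * into buf if it is large enough.  The required size is computed on
 * every call and always written back through lenp, so a caller can
 * probe with buf == NULL to learn how much space to allocate.
 */
int
export_records(const int *recs, size_t nrecs, void *buf, size_t *lenp)
{
	size_t needed = nrecs * sizeof(recs[0]);
	int error = 0;

	if (buf != NULL) {
		if (*lenp < needed)
			error = ENOMEM;		/* caller's buffer is too small */
		else
			memcpy(buf, recs, needed);
	}

	/* Report the size estimate unconditionally. */
	*lenp = needed;
	return error;
}
```

A caller would typically call once with a NULL buffer, allocate the
reported number of bytes, and call again.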
 1.149.4.1  29-Nov-2008  bouyer Pull up following revision(s) (requested by elad in ticket #140):
sys/kern/init_sysctl.c: revision 1.151
PR/40002: Daniel Horecki: sockstat doesn't work for user with sysctl
security.curtain=1
If the kauth call failed, we'd silently continue the loop, but the error
code would remain and eventually "leak" to userspace. Reset the error to
zero when continuing.
Tested by snj@ and myself. Okay snj@.
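
The error "leak" described in the entry above is a general loop pattern
and can be sketched outside the kernel. The names here are hypothetical:
may_see() merely stands in for a per-item authorization check such as
the kauth call, and count_visible() is not the sysctl handler. Without
the reset before "continue", a denial on the last item would be returned
to the caller even though skipping that item is the intended behaviour.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical authorization check: even items are visible, odd are not. */
static int
may_see(int item)
{
	return (item % 2 == 0) ? 0 : EPERM;
}

/*
 * Count the items the caller is allowed to see.  Items that fail the
 * check are skipped silently, so the failure code must be cleared
 * before continuing; otherwise it survives to the final return.
 */
int
count_visible(const int *items, size_t n, size_t *out)
{
	size_t seen = 0;
	int error = 0;

	for (size_t i = 0; i < n; i++) {
		error = may_see(items[i]);
		if (error != 0) {
			error = 0;	/* reset: skipping is not a failure */
			continue;
		}
		seen++;
	}

	*out = seen;
	return error;
}
```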
 1.149.4.7.2.1  07-Mar-2011  snj Apply patch (requested by joerg in ticket 1575):
Sanitize arguments before memory allocation.
 1.149.4.4.2.4  07-Mar-2011  snj Apply patch (requested by joerg in ticket 1575):
Sanitize arguments before memory allocation.
 1.149.4.4.2.3  01-Jul-2009  snj branches: 1.149.4.4.2.3.2;
Pull up following revision(s) (requested by rmind in ticket #839):
sys/kern/init_sysctl.c: revision 1.163
sysctl_doeproc:
- simplify.
- KERN_PROC: fix possible stale proc pointer dereference.
- KERN_PROC: don't do copyout with proc_lock held.
 1.149.4.4.2.2  01-Jul-2009  snj Pull up following revision(s) (requested by rmind in ticket #838):
sys/kern/init_sysctl.c: revision 1.162
sys/kern/vfs_trans.c: revision 1.25
don't forget to skip marker processes.
 1.149.4.4.2.1  01-Jul-2009  snj Pull up following revision(s) (requested by rmind in ticket #835):
sys/kern/init_sysctl.c: revision 1.161
sysctl_doeproc: fix a bug in rev.1.135.
don't forget to mark our marker process PK_MARKER.
this fixes crashes in sched_pstats, etc.
 1.149.4.4.2.3.2.1  21-Apr-2010  matt sync to netbsd-5
 1.149.2.3  28-Apr-2009  skrll Sync with HEAD.
 1.149.2.2  03-Mar-2009  skrll Sync with HEAD.
 1.149.2.1  19-Jan-2009  skrll Sync with HEAD.
 1.155.2.2  23-Jul-2009  jym Sync with HEAD.
 1.155.2.1  13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.172.2.2  17-Aug-2010  uebayasi Sync with HEAD.
 1.172.2.1  30-Apr-2010  uebayasi Sync with HEAD.
 1.173.2.4  31-May-2011  rmind sync with head
 1.173.2.3  21-Apr-2011  rmind sync with head
 1.173.2.2  05-Mar-2011  rmind sync with head
 1.173.2.1  03-Jul-2010  rmind sync with head
 1.175.4.1  08-Feb-2011  bouyer Sync with HEAD
 1.175.2.1  06-Jun-2011  jruoho Sync with HEAD.
 1.183.2.3  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.183.2.2  30-Oct-2012  yamt sync with head
 1.183.2.1  17-Apr-2012  yamt sync with head
 1.185.2.4  29-Apr-2012  mrg sync to latest -current.
 1.185.2.3  11-Mar-2012  mrg sync to latest -current
 1.185.2.2  24-Feb-2012  mrg sync to -current.
 1.185.2.1  18-Feb-2012  mrg merge to -current.
 1.186.2.1  14-Mar-2013  riz Pull up following revision(s) (requested by pgoyette in ticket #837):
sys/compat/common/kern_time_50.c: revision 1.25
sys/kern/init_sysctl.c: revision 1.195
sys/kern/init_main.c: revision 1.447
sys/compat/common/compat_util.h: revision 1.23
sys/compat/common/compat_mod.h: revision 1.1
sys/compat/common/compat_mod.c: revision 1.16
sys/compat/common/compat_mod.c: revision 1.17
sys/compat/common/compat_mod.c: revision 1.18
sys/compat/common/vfs_syscalls_43.c: revision 1.55
Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.
OK christos@
Will request pull-up to 6.0 in a few days.
Wrap sysctl_teardown(&compat_clog) with the appropriate #if defined()s
remove empty #if
 1.190.2.5  03-Dec-2017  jdolecek update from HEAD
 1.190.2.4  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.190.2.3  23-Jun-2013  tls resync from head
 1.190.2.2  25-Feb-2013  tls resync with head
 1.190.2.1  20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.197.6.1  18-May-2014  rmind sync with head
 1.202.2.1  10-Aug-2014  tls Rebase.
 1.204.4.6  28-Aug-2017  skrll Sync with HEAD
 1.204.4.5  05-Feb-2017  skrll Sync with HEAD
 1.204.4.4  09-Jul-2016  skrll Sync with HEAD
 1.204.4.3  27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.204.4.2  22-Sep-2015  skrll Sync with HEAD
 1.204.4.1  06-Jun-2015  skrll Sync with HEAD
 1.211.2.1  07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.214.4.2  08-Apr-2020  martin Merge changes from current as of 20200406
 1.214.4.1  10-Jun-2019  christos Sync with HEAD
 1.214.2.6  18-Jan-2019  pgoyette Synch with HEAD
 1.214.2.5  26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.214.2.4  26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.214.2.3  20-Oct-2018  pgoyette Sync with head
 1.214.2.2  30-Sep-2018  pgoyette Sync with HEAD
 1.214.2.1  06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.223.2.2  25-Jan-2020  ad Sync with head.
 1.223.2.1  08-Jan-2020  ad Redo the namecache to focus on per-directory data structures, removing the
huge hashtable and nasty locking scheme.

Initially this uses rbtrees (because that's what's there). The intent is
to experiment with other data structures.
