History log of /src/sys/miscfs/procfs
Revision  Date  Author  Comments
 1.1 12-Jun-1998  cgd Rework the way kernel include files are installed. In the new method,
as with user-land programs, include files are installed by each directory
in the tree that has includes to install. (This allows more flexibility
as to what gets installed, makes 'partial installs' easier, and gives us
more options as to which machines' includes get installed at any given
time.) The old SYS_INCLUDES={symlinks,copies} behaviours are _both_
still supported, though at least one bug in the 'symlinks' case is
fixed by this change. Include files can't be built before installation,
so directories that have includes as targets (e.g. dev/pci) have to move
those targets into a different Makefile.
 1.6 17-Apr-2003  jdolecek g/c, it's outdated and the info wouldn't belong here anyway
 1.5 12-Mar-1999  christos PR/7143: Jaromir Dolecek: Add procfs/cmdline from Linux emulation
 1.4 29-Jun-1994  cgd New RCS ID's, take two. They're more aesthetically pleasant, and use 'NetBSD'
 1.3 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.2 20-Jan-1994  ws Make procfs really work for debugging.
Implement note & notepg files in procfs.
 1.1 05-Jan-1994  cgd branches: 1.1.1;
add new procfs code, from Jan-Simon Pendry, jsp@sequent.com.
This is pretty-much "virgin", so that diffs can be done later.
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.13 30-Mar-2019  christos add a node for the process resource limits.
 1.12 28-Aug-2017  kamil branches: 1.12.4;
Remove the filesystem tracing feature

This is a legacy interface from 4.4BSD, and it was
introduced to overcome shortcomings of ptrace(2) at that time, which are
no longer relevant (performance). Today /proc/#/ctl offers a narrow
subset of ptrace(2) commands and is not suitable for modern
applications beyond simplistic tracing scenarios.

This removal will simplify kernel internals. Users will still be able to
use all the other /proc files.

This change won't affect other procfs files or the Linux compat
features within mount_procfs(8). /proc/#/ctl isn't available on Linux.

Remove:
- /proc/#/ctl from mount_procfs(8)
- P_FSTRACE note from the documentation of ps(1)
- /proc/#/ctl and filesystem tracing documentation from mount_procfs(8)
- KAUTH_REQ_PROCESS_PROCFS_CTL documentation from kauth(9)
- source code file miscfs/procfs/procfs_ctl.c
- PFSctl and procfs_doctl() from sys/miscfs/procfs/procfs.h
- KAUTH_REQ_PROCESS_PROCFS_CTL from sys/sys/kauth.h
- PSL_FSTRACE (0x00010000) from sys/sys/proc.h
- P_FSTRACE (0x00010000) from sys/sys/sysctl.h

Reduce code complexity after removal of this functionality.

Update TODO.ptrace accordingly: remove two entries about /proc tracing.

Do not keep legacy notes as comments in the headers about removed
PSL_FSTRACE / P_FSTRACE, as this interface had very few users
(close or equal to zero).

Proposed on tech-kern@.

All filesystem tracing utility users are encouraged to switch to ptrace(2).

Sponsored by <The NetBSD Foundation>
 1.11 30-Mar-2017  christos branches: 1.11.6;
add an auxv node.
 1.10 02-Nov-2016  pgoyette branches: 1.10.2;
* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.
 1.9 11-Oct-2014  uebayasi branches: 1.9.2; 1.9.4;
Define filesystem attributes with vfs dependency.
 1.8 30-Aug-2006  cube branches: 1.8.102;
Restore dependency on PTRACE for PROCFS.
Bump required config(1) version.
 1.7 30-Aug-2006  jnemeth revert previous as it breaks the build due to invalid syntax
 1.6 29-Aug-2006  matt Make PTRACE and COREDUMP optional. Make the default (status quo) by putting
them in conf/std.
 1.5 11-Dec-2005  christos branches: 1.5.4; 1.5.8;
merge ktrace-lwp.
 1.4 26-Feb-2005  perry branches: 1.4.4;
nuke trailing whitespace
 1.3 03-Jan-2003  christos branches: 1.3.2; 1.3.10; 1.3.12;
Implement /proc/<pid>/fd/<n>. This is work in progress. Questionable things:
- Is it ok to convert DTYPE_PIPE to VFIFO and DTYPE_SOCKET to VSOCK?
- XXX: Avoid locking issue in ls -Rl /proc by avoiding curproc
- Does I/O to pipes work?
- XXX: Are there security implications?
 1.2 09-May-2002  thorpej branches: 1.2.6; 1.2.8;
Move code shared by procfs and the kernel proper out of procfs and
into the kernel proper (renaming functions from procfs_* to process_*).
 1.1 16-Apr-2002  thorpej Cleanup how file system configuration information is declared, grouping
related information together, with the file system code itself.

This is just low-hanging fruit -- more to come.
 1.2.8.2 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.2.8.1 09-May-2002  jdolecek file files.procfs was added on branch kqueue on 2002-06-23 17:50:12 +0000
 1.2.6.3 07-Jan-2003  thorpej Sync with HEAD.
 1.2.6.2 20-Jun-2002  nathanw Catch up to -current.
 1.2.6.1 09-May-2002  nathanw file files.procfs was added on branch nathanw_sa on 2002-06-20 03:48:00 +0000
 1.3.12.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.3.10.1 29-Apr-2005  kent sync with -current
 1.3.2.1 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.4.4.1 30-Dec-2006  yamt sync with head.
 1.5.8.1 03-Sep-2006  yamt sync with head.
 1.5.4.1 09-Sep-2006  rpaulo sync with head
 1.8.102.1 03-Dec-2017  jdolecek update from HEAD
 1.9.4.2 26-Apr-2017  pgoyette Sync with HEAD
 1.9.4.1 04-Nov-2016  pgoyette Sync with HEAD
 1.9.2.2 28-Aug-2017  skrll Sync with HEAD
 1.9.2.1 05-Dec-2016  skrll Sync with HEAD
 1.10.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.11.6.1 12-Apr-2018  martin Pull up following revision(s) (requested by kamil in ticket #713):

sys/modules/procfs/Makefile: revision 1.4
sys/miscfs/procfs/procfs_vfsops.c: revision 1.98
bin/ps/ps.1: revision 1.108
sys/compat/linux/arch/i386/linux_ptrace.c: revision 1.32
sys/miscfs/procfs/procfs_vnops.c: revision 1.198
sys/kern/sys_ptrace_common.c: revision 1.23
sys/kern/sys_ptrace_common.c: revision 1.24
sbin/mount_procfs/mount_procfs.8: revision 1.36
sys/kern/sys_ptrace_common.c: revision 1.25
sys/kern/sys_ptrace.c: revision 1.5
sys/compat/linux/arch/powerpc/linux_ptrace.c: revision 1.30
sys/sys/proc.h: revision 1.342
sys/kern/sys_ptrace_common.c: revision 1.26
sys/miscfs/procfs/procfs_ctl.c: file removal
sys/kern/sys_ptrace_common.c: revision 1.27
sys/miscfs/procfs/procfs_subr.c: revision 1.109
sys/kern/sys_ptrace_common.c: revision 1.28
sys/secmodel/extensions/secmodel_extensions.c: revision 1.8
sys/kern/sys_ptrace_common.c: revision 1.29
sys/sys/ptrace.h: revision 1.62
sys/compat/netbsd32/netbsd32_signal.c: revision 1.45
share/man/man9/kauth.9: revision 1.109
sys/miscfs/procfs/files.procfs: revision 1.12
sys/compat/netbsd32/netbsd32.h: revision 1.115
sys/miscfs/procfs/procfs.h: revision 1.72
sys/compat/netbsd32/netbsd32_ptrace.c: revision 1.5
sys/kern/kern_sig.c: revision 1.337
sys/sys/kauth.h: revision 1.75
sys/sys/sysctl.h: revision 1.224
sys/kern/sys_ptrace_common.c: revision 1.30
sys/kern/sys_ptrace_common.c: revision 1.31
sys/kern/sys_ptrace_common.c: revision 1.32
sys/kern/sys_ptrace_common.c: revision 1.33
sys/compat/linux/arch/arm/linux_ptrace.c: revision 1.20
sys/kern/sys_ptrace_common.c: revision 1.34
sys/kern/sys_ptrace_common.c: revision 1.36
sys/kern/kern_proc.c: revision 1.207
sys/kern/kern_exit.c: revision 1.269
doc/TODO.ptrace: revision 1.29

Make {s,g}et{db,fp,}regs work again for PK_32 processes
XXX: pullup-8

add disgusting magic to handle compat_netbsd32 as a module.

use process_*reg32 instead of struct *reg32.

Remove the filesystem tracing feature

This is a legacy interface from 4.4BSD, and it was
introduced to overcome shortcomings of ptrace(2) at that time, which are
no longer relevant (performance). Today /proc/#/ctl offers a narrow
subset of ptrace(2) commands and is not suitable for modern
applications beyond simplistic tracing scenarios.

This removal will simplify kernel internals. Users will still be able to
use all the other /proc files.

This change won't affect other procfs files or the Linux compat
features within mount_procfs(8). /proc/#/ctl isn't available on Linux.

Remove:
- /proc/#/ctl from mount_procfs(8)
- P_FSTRACE note from the documentation of ps(1)
- /proc/#/ctl and filesystem tracing documentation from mount_procfs(8)
- KAUTH_REQ_PROCESS_PROCFS_CTL documentation from kauth(9)
- source code file miscfs/procfs/procfs_ctl.c
- PFSctl and procfs_doctl() from sys/miscfs/procfs/procfs.h
- KAUTH_REQ_PROCESS_PROCFS_CTL from sys/sys/kauth.h
- PSL_FSTRACE (0x00010000) from sys/sys/proc.h
- P_FSTRACE (0x00010000) from sys/sys/sysctl.h

Reduce code complexity after removal of this functionality.

Update TODO.ptrace accordingly: remove two entries about /proc tracing.

Do not keep legacy notes as comments in the headers about removed
PSL_FSTRACE / P_FSTRACE, as this interface had very few users
(close or equal to zero).

Proposed on tech-kern@.

All filesystem tracing utility users are encouraged to switch to ptrace(2).

Sponsored by <The NetBSD Foundation>

untangle the mess:
- factor out common code
- break each ptrace subcall to its own sub-function
.. more to come ...
- reduce ifdef ugliness by moving it up top.
- factor out PT_IO and make PT_{READ,WRITE}_{I,D} use it
- factor out PT_DUMPCORE
- factor out sendsig code
.. more to come ...

handle siginfo requests for ptrace32

ptrace: Partially undo PT_{READ,WRITE}_{I,D} and unbreak these commands

The refactored code did not work and was generating EFAULT.

Sponsored by <The NetBSD Foundation>

Merge the code back; the problem was that since we are reading/writing
to a kernel address for PT_{READ,WRITE}_{I,D} we need the kernel vmspace.
provide separate read and write functions to accommodate register functions
that need a size argument.

don't ignore error from copyout_piod

Use the proper process (the tracee) to get information about lwps and
registers and the tracer for vmspace.

Add new sysctl(3) entry: security.models.extensions.user_set_dbregs

Model this new sysctl(3) entry after "user_set_cpu_affinity" in the same
level of sysctl(3) switches.

Allow reading Debug Registers unconditionally (no change here). This is
convenient as even if a user of a debugger does not use hardware assisted
watchpoints/breakpoints, a debugger can still prompt these values to store
in an internal cache with context of registers. Reading them should have
no security concerns.

Add a paranoid MI switch that prohibits by default setting these registers
by a regular user (non-superuser). Make this switch disabled by default.
There are enough reserved bits out there to allow using them
unconditionally on hardened hosts.

Features shipped with Debug Registers are optional features in debuggers.
There is no reduction in elementary functionality.

Reviewed by <christos>

Sponsored by <The NetBSD Foundation>
 1.12.4.1 10-Jun-2019  christos Sync with HEAD
 1.7 05-Jan-1994  mycroft Clean up deleted files.
 1.6 07-Sep-1993  ws Changes to VFS readdir semantics
NFS changes for better cookie support
ISOFS changes for better Rockridge support and support for generation numbers
 1.5 26-Aug-1993  pk Implement setattr: mode for process entries; mode + uid/gid for the
PROCFS root directory.
Fixed omission in pfs_root() which came to light as a result of the above:
hold on to vnode for root dir.
 1.4 25-Aug-1993  pk Fixed improperly initialized nfsnode in pfs_lookup()
 1.3 24-Aug-1993  pk copyright update.
 1.2 24-Aug-1993  pk Rcs Id added.
 1.1 24-Aug-1993  pk Initial version of a proc filesystem.
 1.87 01-Jul-2024  christos Add linux POSIX message queue support (Ricardo Branco)
 1.86 12-May-2024  christos branches: 1.86.2;
PR/58227: Ricardo Branco: Add support for proc/sysvipc in Linux emulator
 1.85 12-May-2024  christos PR/58240: Ricardo Branco: Add support for proc/self/limits as used by Linux
 1.84 17-Jan-2024  hannken Using the exechook to revoke procfs nodes is racy and may deadlock:

one thread runs doexechooks() -> procfs_revoke_vnodes() and wants to suspend
the file system for vgone(), while another thread runs a forced unmount,
has the file system suspended, tries to disestablish the exechook and
waits for doexechooks() to complete.

Establish/disestablish the exechook on module load/unload instead of
mount/unmount and use the hashmap to access all procfs nodes for this pid.

May fix PR kern/57775 ""panic: unmount: dangling vnode" while umounting procfs"
 1.83 17-Jan-2024  hannken Add a hashmap to access all procfs nodes by pid.
 1.82 19-Jan-2022  martin branches: 1.82.4;
Now that an inline function dereferences it, make sure struct proc
is declared by including sys/proc.h here.
 1.81 17-Jan-2022  bouyer If the calling process is running under linux emulation, make /proc/xxx/fd/
return only symlinks pointing to the original file in the filesystem,
instead of a hard link. This matches the linux behavior, and some
linux programs rely on it (they unconditionally call readlink() on
/proc/xxx/fd/yy and don't deal with it returning EINVAL).
Proposed on tech-kern@ in
http://mail-index.netbsd.org/tech-kern/2022/01/11/msg027877.html
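
The readlink() behaviour described in revision 1.81 can be illustrated with a small userland sketch. The helper name ex_read_fd_link is hypothetical and not part of the NetBSD tree; it only shows how a caller might distinguish "not a symlink" (EINVAL) from other failures when reading an entry such as /proc/<pid>/fd/<n>.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): read the target of a symlink
 * such as /proc/<pid>/fd/<n> into out and NUL-terminate it.
 * Returns 0 on success or a negated errno value on failure.
 */
static int
ex_read_fd_link(const char *path, char *out, size_t outlen)
{
	ssize_t n;

	assert(path != NULL && out != NULL);
	if (outlen == 0)
		return -EINVAL;

	/* readlink(2) does not NUL-terminate, so reserve one byte. */
	n = readlink(path, out, outlen - 1);
	if (n < 0)
		return -errno;	/* -EINVAL when path is not a symlink */

	out[n] = '\0';
	return 0;
}
```

A program that calls readlink() unconditionally, as the commit message describes, would see the EINVAL case whenever the entry is not a symlink.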
 1.80 29-Apr-2020  riastradh Put forward declaration a little further forward to unbreak build.
 1.79 29-Apr-2020  thorpej If the procfs mount is marked as linux-compat, then allow proc lookup
by any LWP ID in the proc, not just the canonical PID.
 1.78 17-Jan-2020  ad VFS_VGET(), VFS_ROOT(), VFS_FHTOVP(): give them a "int lktype" argument, to
allow us to get shared locks (or no lock) on the returned vnode. Matches
FreeBSD.
 1.77 26-Sep-2019  christos branches: 1.77.2;
Rewrite the procfs_fileno as an inline function to make it more clear what
it does...
 1.76 25-Apr-2019  mlelstv Restore mapping of file id to pid/type/fd.
Use 64bit file id to allow for 32bit fd and 25-26bit pid.
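
As a rough illustration of the idea in revision 1.76, the sketch below packs a pid, a node type, and an fd into one 64-bit file id and unpacks them again. The field widths and bit positions (32-bit fd, 5-bit type, 26-bit pid) and all ex_* names are assumptions made for this example; they are not taken from procfs.h and the real layout may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field widths, chosen for illustration only. */
#define EX_FD_BITS	32
#define EX_TYPE_BITS	5
#define EX_PID_BITS	26

#define EX_TYPE_SHIFT	EX_FD_BITS
#define EX_PID_SHIFT	(EX_FD_BITS + EX_TYPE_BITS)

/* Pack (pid, type, fd) into a single 64-bit id; 63 bits are used. */
static inline uint64_t
ex_fileno(uint32_t pid, uint32_t type, uint32_t fd)
{
	assert(pid < (UINT32_C(1) << EX_PID_BITS));
	assert(type < (UINT32_C(1) << EX_TYPE_BITS));
	return ((uint64_t)pid << EX_PID_SHIFT) |
	    ((uint64_t)type << EX_TYPE_SHIFT) | (uint64_t)fd;
}

/* Recover each field from a packed id. */
static inline uint32_t
ex_fileno_pid(uint64_t id)
{
	return (uint32_t)(id >> EX_PID_SHIFT);
}

static inline uint32_t
ex_fileno_type(uint64_t id)
{
	return (uint32_t)((id >> EX_TYPE_SHIFT) &
	    ((UINT64_C(1) << EX_TYPE_BITS) - 1));
}

static inline uint32_t
ex_fileno_fd(uint64_t id)
{
	return (uint32_t)(id & UINT32_MAX);
}
```

Because each field occupies its own bit range, the mapping is reversible, which is the property the commit message refers to as mapping a file id back to pid/type/fd.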
 1.75 30-Mar-2019  christos add a node for the process resource limits.
 1.74 31-Dec-2017  christos branches: 1.74.4;
rename some "cmdline" stuff now that it is used to print environment too
 1.73 31-Dec-2017  christos Add an environ node
 1.72 28-Aug-2017  kamil Remove the filesystem tracing feature

This is a legacy interface from 4.4BSD, and it was
introduced to overcome shortcomings of ptrace(2) at that time, which are
no longer relevant (performance). Today /proc/#/ctl offers a narrow
subset of ptrace(2) commands and is not suitable for modern
applications beyond simplistic tracing scenarios.

This removal will simplify kernel internals. Users will still be able to
use all the other /proc files.

This change won't affect other procfs files or the Linux compat
features within mount_procfs(8). /proc/#/ctl isn't available on Linux.

Remove:
- /proc/#/ctl from mount_procfs(8)
- P_FSTRACE note from the documentation of ps(1)
- /proc/#/ctl and filesystem tracing documentation from mount_procfs(8)
- KAUTH_REQ_PROCESS_PROCFS_CTL documentation from kauth(9)
- source code file miscfs/procfs/procfs_ctl.c
- PFSctl and procfs_doctl() from sys/miscfs/procfs/procfs.h
- KAUTH_REQ_PROCESS_PROCFS_CTL from sys/sys/kauth.h
- PSL_FSTRACE (0x00010000) from sys/sys/proc.h
- P_FSTRACE (0x00010000) from sys/sys/sysctl.h

Reduce code complexity after removal of this functionality.

Update TODO.ptrace accordingly: remove two entries about /proc tracing.

Do not keep legacy notes as comments in the headers about removed
PSL_FSTRACE / P_FSTRACE, as this interface had very few users
(close or equal to zero).

Proposed on tech-kern@.

All filesystem tracing utility users are encouraged to switch to ptrace(2).

Sponsored by <The NetBSD Foundation>
 1.71 30-Mar-2017  christos branches: 1.71.6;
add an auxv node.
 1.70 27-Jul-2014  hannken branches: 1.70.4; 1.70.8; 1.70.12;
Change procfs from hashlist to vcache.
- Key is (type, pid, fd)
- Remove argument "p" from procfs_allocvp(). It is only used
when "type == PFSfd". Lookup the proc with proc_find() when
procfs_loadvnode() needs it.
- Use a vfs_vnode_iterator for procfs_revoke_vnodes().
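
A (type, pid, fd) key like the one named in revision 1.70 could look roughly like the sketch below. It assumes the cache compares keys as opaque bytes, which is why the key is zeroed before its fields are set. The struct and function names are hypothetical and are not the definitions used in the NetBSD sources.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical lookup key; field names are illustrative only. */
struct ex_procfs_key {
	int	ek_type;	/* node type */
	pid_t	ek_pid;		/* owning process */
	int	ek_fd;		/* only meaningful for fd nodes */
};

static void
ex_key_init(struct ex_procfs_key *key, int type, pid_t pid, int fd)
{
	assert(key != NULL);
	/* Zero first so that any padding bytes compare equal. */
	memset(key, 0, sizeof(*key));
	key->ek_type = type;
	key->ek_pid = pid;
	key->ek_fd = fd;
}

/* Byte-wise comparison, as a cache keyed on raw bytes would do. */
static int
ex_key_equal(const struct ex_procfs_key *a, const struct ex_procfs_key *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```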
 1.69 05-Apr-2014  christos branches: 1.69.2;
On my 24 proc box I got ENOSPC, so make the routine return the size it wants
and try again.
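
The "return the size it wants and try again" approach from revision 1.69 is a common pattern; a minimal userland sketch follows. Both ex_fill and ex_fill_alloc are hypothetical stand-ins written for this example and do not correspond to the actual procfs routines.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical producer: copies a fixed message into buf. It always
 * reports the size it needs in *needed and returns ENOSPC if buf is
 * too small.
 */
static int
ex_fill(char *buf, size_t len, size_t *needed)
{
	static const char msg[] = "cpu0 cpu1 cpu2 cpu3";

	*needed = sizeof(msg);
	if (len < sizeof(msg))
		return ENOSPC;
	memcpy(buf, msg, sizeof(msg));
	return 0;
}

/* Start small; on ENOSPC, grow to the size the producer asked for. */
static char *
ex_fill_alloc(void)
{
	size_t len = 8, needed = 0;
	char *buf = NULL, *nbuf;
	int error;

	for (;;) {
		nbuf = realloc(buf, len);
		if (nbuf == NULL) {
			free(buf);
			return NULL;
		}
		buf = nbuf;
		error = ex_fill(buf, len, &needed);
		if (error != ENOSPC)
			break;
		assert(needed > len);	/* producer must ask for more */
		len = needed;
	}
	if (error != 0) {
		free(buf);
		return NULL;
	}
	return buf;	/* caller frees */
}
```

Letting the producer report its required size avoids guessing a buffer size that fails on machines with many CPUs.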
 1.68 28-May-2012  christos branches: 1.68.2; 1.68.4;
add a task process subdirectory for emul linux
 1.67 27-Sep-2011  christos branches: 1.67.2; 1.67.6;
define PROCFS_MAXNAMLEN and use it.
 1.66 04-Sep-2011  jmcneill PR# kern/45021: Please support /emul/linux/proc/version

Add /proc/version for procfs with -o linux. The version reported depends
on the emulation type of the calling process:

$ cat /proc/version
NetBSD version 5.99.55 (netbsd@localhost) (gcc version 4.1.3 20080704 prerelease (NetBSD nb2 20081120)) NetBSD 5.99.55 (GENERIC) #39: Sun Sep 4 09:10:05 EDT 2011

$ /emul/linux/bin/cat /proc/version
Linux version 2.6.18 (linux@localhost) (gcc version 4.1.3 20080704 prerelease (NetBSD nb2 20081120)) #0 Wed Mar 3 03:03:03 PST 2010

$ /emul/linux32/bin/cat /proc/version
Linux version 2.6.18 (linux32@localhost) (gcc version 4.1.3 20080704 prerelease (NetBSD nb2 20081120)) #0 Wed Mar 3 03:03:03 PST 2010
 1.65 28-Jun-2008  rumble Create sysctl entries during module initialisation and destroy them
appropriately.

Many of these file systems are now ready for modularisation.
 1.64 24-May-2007  agc branches: 1.64.28; 1.64.32; 1.64.34; 1.64.36;
Extend the Linux emulation of /proc to include

/proc/stat
/proc/loadavg and
/proc/<pid>/statm.

These are only present when -o linux is specified as a mount option
to procfs.

Factor out some common code so that it can be used by a number of
functions.

XXX The values returned in the statm emulation need to be verified.
 1.63 09-Feb-2007  ad branches: 1.63.6; 1.63.8;
Merge newlock2 to head.
 1.62 29-Oct-2006  christos add an "emul" file node.
 1.61 25-Oct-2006  christos 1. fix procfs_validfile{,_linux} to test for NULL pointers properly.
2. make "exe" entry be a symlink to the executable, instead of pointing
directly to the vnode of the executable.
3. factor out commonly used code.
 1.60 20-Sep-2006  manu Emulate Linux's /proc/devices
 1.59 11-Dec-2005  christos branches: 1.59.20; 1.59.22;
merge ktrace-lwp.
 1.58 01-Oct-2005  atatat Add "cwd" and "root" symlinks to each process's directory. The cwd
link points to the process's current working directory, and the root
link points to the process's root directory. What else would you
expect?

For directories that are out of reach (caller is in a chroot, target
process is in a different chroot, etc), the links point to "/"
instead.
 1.57 30-Aug-2005  xtraeme Remove __P()
 1.56 20-Sep-2004  jdolecek branches: 1.56.12;
add 'mounts' file for -o linux, which lists all currently mounted
filesystems; Linux glibc statvfs() uses this to get some of mount flags,
and this file is also useful as /emul/linux/etc/mtab (via symlink)
 1.55 27-Aug-2004  skrll Do previous slightly differently - just pass a struct lwp * and derive the
struct proc *.

OK'd by Jaromir.
 1.54 21-Aug-2004  jdolecek fix process used for /proc/<pid>/stat contents - it should be process
<pid>, not the current process looking at the information
 1.53 20-May-2004  atatat Tweak sysctl setup functions (the macros, actually) for use in lkms,
and tweak lkminit_*.c (where applicable) to call them, and to call
sysctl_teardown() when being unloaded.

This consists of (1) making setup functions not be static when being
compiled as lkms (change to sys/sysctl.h), (2) making prototypes
visible for the various setup functions in header files (changes to
various header files), and (3) making simple "load" and "unload"
functions in the actual lkminit stuff.

linux_sysctl.c also needs its root exposed (ie, made not static) for
this (when built as an lkm).
 1.52 10-Dec-2003  drochner branches: 1.52.2;
a little bit more namespace sanity
 1.51 03-Oct-2003  yamt terminate snprintb 'new' format strings correctly.
(fixes overrun in mount_*)
 1.50 27-Sep-2003  mycroft Put pfsnode in the #ifdef _KERNEL too, so this actually compiles.
 1.49 27-Sep-2003  darcy Changes as discussed with itojun on tech-kern. I have modified the enums
to have KFS or PFS differentiators. Further I have wrapped the enum in
procfs in "#ifdef _KERNEL" as it is done in kernfs.

To see the discussion go to http://mail-index.NetBSD.org/tech-kern/2003/09/
and look for "Mismatched enums in include files" in the list.
 1.48 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.47 29-Jun-2003  fvdl branches: 1.47.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.46 29-Jun-2003  thorpej Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.45 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.44 28-May-2003  christos Add /proc/<pid>/stat for linux compat. j2sdk1.4.2 depends on it.
 1.43 18-Apr-2003  jdolecek change PROCFS_FILENO() to use 5 bits for 'type', since there are more than
16 types nowadays (i.e. Pfd is 17)
 1.42 17-Apr-2003  jdolecek use fd_getfile() in procfs_getfp(), and FILE_USE()/FILE_UNUSE() the
returned file descriptor pointer appropriately
 1.41 25-Feb-2003  jrf This addresses PR kern/19989. Thanks to hamajima@nagoya.ydc.co.jp for submitting this patch which enables /proc/uptime for linux emul. Patch reviewed by atatat@netbsd.org and tron@netbsd.org, approved by tron@netbsd.org.
 1.40 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.39 03-Jan-2003  christos Implement /proc/<pid>/fd/<n>. This is work in progress. Questionable things:
- Is it ok to convert DTYPE_PIPE to VFIFO and DTYPE_SOCKET to VSOCK?
- XXX: Avoid locking issue in ls -Rl /proc by avoiding curproc
- Does I/O to pipes work?
- XXX: Are there security implications?
 1.38 21-Sep-2002  christos MNT_GETARGS support
 1.37 09-May-2002  thorpej Move code shared by procfs and the kernel proper out of procfs and
into the kernel proper (renaming functions from procfs_* to process_*).
 1.36 05-Dec-2001  thorpej * Allow machine-dependent code to specify hooks for ptrace(2)
(__HAVE_PTRACE_MACHDEP) and procfs (__HAVE_PROCFS_MACHDEP).
These changes will allow platforms like x86 (XMM) and PowerPC
(AltiVec) to export extended register sets in a sane manner.

* Use __HAVE_PTRACE_MACHDEP to export x86 XMM registers (standard
FP + SSE/SSE2) using PT_{GET,SET}XMMREGS (in the machdep
ptrace request space).
* Use __HAVE_PROCFS_MACHDEP to export x86 XMM registers via
/proc/N/xmmregs in procfs.
 1.35 15-Sep-2001  chs add a new VFS op, vfs_reinit, which is called when desiredvnodes is
adjusted via sysctl. file systems that have hash tables which are
sized based on the value of this variable now resize those hash tables
using the new value. the max number of FFS softdeps is also recalculated.

convert various file systems to use the <sys/queue.h> macros for
their hash tables.
 1.34 29-Mar-2001  fvdl branches: 1.34.2; 1.34.4;
For -o linux mounts, add some code to emulate /proc/#/maps.
Needs NAMECACHE_ENTER_REVERSE to include filenames.
 1.33 25-Jan-2001  jdolecek branches: 1.33.2;
g/c pmnt_mp in struct procfs_args
 1.32 18-Jan-2001  jdolecek constify
 1.31 17-Jan-2001  fvdl Add a few linux-style files, only enabled when -o linux is specified
for the mount. Currently these are /proc/cpuinfo and /proc/meminfo.
The former only does something on i386 right now.
 1.30 24-Nov-2000  chs remove dead code and other misc cleanup.
 1.29 16-Mar-2000  jdolecek branches: 1.29.4;
Add new VFS op routine - vfs_done and call it on filesystem detach
in vfs_detach(). vfs_done may free global filesystem's resources,
typically those allocated in respective filesystem's init function.
Needed so those filesystems which went in via LKM have a chance to
clean up after themselves before unloading. This fixes random panics
when LKM for filesystem using pools was loaded and unloaded several
times.

For each leaf filesystem, add appropriate vfs_done routine.
 1.28 25-Jan-2000  fvdl At mount/unmount time, add an exec hook to revoke all vnodes iff the
process is about to exec a sugid binary.

To speed up things, use hashing for vnode allocation, like other filesystems
do. This avoids walking the whole procfs node list in the revoke case too.
 1.27 02-Sep-1999  thorpej branches: 1.27.2;
Make /proc/self a symlink to /proc/curproc. I've observed Linux programs
that expect /proc/self/cmdline to exist.
 1.26 24-Mar-1999  mrg branches: 1.26.2;
completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.
 1.25 13-Mar-1999  thorpej Expose procfs_rwmem(). (This function will go away entirely when we
delete Mach VM.)
 1.24 12-Mar-1999  christos PR/7143: Jaromir Dolecek: Add procfs/cmdline from Linux emulation
 1.23 25-Jan-1999  msaitoh Add /proc/#/map. From FreeBSD.
 1.22 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.21 01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.20 27-Aug-1997  thorpej Fix a reversed argument which caused procfs_checkioperm() to always return
"OK". Add a few comments to avoid further confusion.
 1.19 12-Aug-1997  thorpej Fix the procfs hole described on current-users, similar to a fix for
FreeBSD by Sean Eric Fagan, but a bit different. This makes the checks
in the same places as sef's FreeBSD patch, but does not hardcode the
"kmem" group into the kernel, and also does a check identical to the
(3) and (4) checks in the NetBSD ptrace(2):

(1) it's not owned by you, or is set-id on exec (unless
you're root), or

(2) it's init, which controls the security level of the
entire system, and the system was not compiled with
permanently insecure mode turned on.
 1.18 08-May-1997  mycroft branches: 1.18.4;
Pass the vnode type to vaccess(), and use it when checking VEXEC. Make sure
that the mode bits passed to vaccess() and returned by foo_getattr() contain
only permission bits.
 1.17 12-Feb-1996  christos close PR/2063: procfs_rw prototyped twice with different prototypes
 1.16 09-Feb-1996  christos miscfs prototype changes
 1.15 09-Feb-1996  mycroft Fix vop_link, vop_symlink, and vop_remove semantics in several ways:
* Change the argument names to vop_link so they actually make sense.
* Implement vop_link and vop_symlink for all file systems, so they do proper
cleanup.
* Require the file system to decide whether or not linking and unlinking of
directories is allowed, and disable it for all current file systems.
 1.14 09-Oct-1995  mycroft Add support for cookies, mostly from Greg Hudson.
 1.13 29-Mar-1995  briggs KERNEL -> _KERNEL
 1.12 29-Oct-1994  cgd light clean; make sure headers are properly included, types are OK, etc.
 1.11 29-Jun-1994  cgd New RCS ID's, take two. They're more aesthetically pleasant, and use 'NetBSD'
 1.10 15-Jun-1994  mycroft Minor update from JSP after merging my changes.
 1.9 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.8 12-Apr-1994  cgd be a bit smarter about determining if files shouldn't be seen by the user.
Also, DON'T allow a lookup to succeed on a file that's not visible!
 1.7 06-Feb-1994  ws If you add files, be sure to have enough bits to encode an inode number!
 1.6 28-Jan-1994  cgd make a fpregs file.
 1.5 20-Jan-1994  ws Make procfs really work for debugging.
Implement note & notepg files in procfs.
 1.4 11-Jan-1994  ws Fix ugliness left over from my last mod
 1.3 09-Jan-1994  ws Bug fixes and enhancements:
Make NFS serving work (BUT DON'T USE "attach" TO /proc/*/ctl FOR NOW!!!)
Make `curproc' a symbolic link
Add `.' and `..' entries to the directories.
Return better guesses on the size of the files.
 1.2 05-Jan-1994  cgd fix UFS vs 'real' fs type mixups
 1.1 05-Jan-1994  cgd branches: 1.1.1;
add new procfs code, from Jan-Simon Pendry, jsp@sequent.com.
This is pretty-much "virgin", so that diffs can be done later.
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.18.4.2 28-Aug-1997  thorpej Update marc-pcmcia branch from trunk.
 1.18.4.1 23-Aug-1997  thorpej Update marc-pcmcia branch from trunk.
 1.26.2.1 01-Feb-2000  he Pull up revision 1.28 (via patch, requested by fvdl):
Close procfs security hole. Fixes SA#2000-001.
 1.27.2.5 21-Apr-2001  bouyer Sync with HEAD
 1.27.2.4 11-Feb-2001  bouyer Sync with HEAD.
 1.27.2.3 18-Jan-2001  bouyer Sync with head (for UBC+NFS fixes, mostly).
 1.27.2.2 08-Dec-2000  bouyer Sync with HEAD.
 1.27.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.29.4.1 30-Mar-2001  he Pull up revision 1.31 (requested by fvdl):
Add some required Linux emulation bits to support the Linux
version of VMware.
 1.33.2.10 07-Jan-2003  thorpej Sync with HEAD.
 1.33.2.9 15-Oct-2002  nathanw Make _validfoo() routines go back to taking a proc.
 1.33.2.8 06-Oct-2002  thorpej Sync with HEAD.
 1.33.2.7 20-Jun-2002  nathanw Catch up to -current.
 1.33.2.6 01-Apr-2002  nathanw procfs_domem() should take proc *, proc *; not proc *, lwp *.
 1.33.2.5 09-Jan-2002  nathanw Adapt procfs_machdep_rw() to LWPs.
 1.33.2.4 08-Jan-2002  nathanw Catch up to -current.
 1.33.2.3 21-Sep-2001  nathanw Catch up to -current.
 1.33.2.2 09-Apr-2001  nathanw Catch up with -current.
 1.33.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.34.4.1 01-Oct-2001  fvdl Catch up with -current.
 1.34.2.3 10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.34.2.2 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.34.2.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.47.2.8 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.47.2.7 24-Sep-2004  skrll Sync with HEAD.
 1.47.2.6 21-Sep-2004  skrll Fix the sync with head I botched.
 1.47.2.5 18-Sep-2004  skrll Sync with HEAD.
 1.47.2.4 24-Aug-2004  skrll Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.47.2.3 18-Aug-2004  skrll Revert to passing struct proc for {exit,exec}hook.
 1.47.2.2 03-Aug-2004  skrll Sync with HEAD
 1.47.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.52.2.1 23-May-2004  tron Pull up revision 1.53 (requested by atatat in ticket #374):
Tweak sysctl setup functions (the macros, actually) for use in lkms,
and tweak lkminit_*.c (where applicable) to call them, and to call
sysctl_teardown() when being unloaded.
This consists of (1) making setup functions not be static when being
compiled as lkms (change to sys/sysctl.h), (2) making prototypes
visible for the various setup functions in header files (changes to
various header files), and (3) making simple "load" and "unload"
functions in the actual lkminit stuff.
linux_sysctl.c also needs its root exposed (ie, made not static) for
this (when built as an lkm).
 1.56.12.4 03-Sep-2007  yamt sync with head.
 1.56.12.3 26-Feb-2007  yamt sync with head.
 1.56.12.2 30-Dec-2006  yamt sync with head.
 1.56.12.1 21-Jun-2006  yamt sync with head.
 1.59.22.2 10-Dec-2006  yamt sync with head.
 1.59.22.1 22-Oct-2006  yamt sync with head
 1.59.20.2 18-Nov-2006  ad Sync with head.
 1.59.20.1 17-Nov-2006  ad Checkpoint work in progress.
 1.63.8.1 11-Jul-2007  mjf Sync with head.
 1.63.6.1 08-Jun-2007  ad Sync with head.
 1.64.36.1 03-Jul-2008  simonb Sync with head.
 1.64.34.1 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.64.32.1 04-May-2009  yamt sync with head.
 1.64.28.1 29-Jun-2008  mjf Sync with HEAD.
 1.67.6.1 02-Jun-2012  mrg sync to latest -current.
 1.67.2.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.67.2.1 30-Oct-2012  yamt sync with head
 1.68.4.1 18-May-2014  rmind sync with head
 1.68.2.2 03-Dec-2017  jdolecek update from HEAD
 1.68.2.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.69.2.1 10-Aug-2014  tls Rebase.
 1.70.12.1 21-Apr-2017  bouyer Sync with HEAD
 1.70.8.1 26-Apr-2017  pgoyette Sync with HEAD
 1.70.4.1 28-Aug-2017  skrll Sync with HEAD
 1.71.6.1 12-Apr-2018  martin Pull up following revision(s) (requested by kamil in ticket #713):

sys/modules/procfs/Makefile: revision 1.4
sys/miscfs/procfs/procfs_vfsops.c: revision 1.98
bin/ps/ps.1: revision 1.108
sys/compat/linux/arch/i386/linux_ptrace.c: revision 1.32
sys/miscfs/procfs/procfs_vnops.c: revision 1.198
sys/kern/sys_ptrace_common.c: revision 1.23
sys/kern/sys_ptrace_common.c: revision 1.24
sbin/mount_procfs/mount_procfs.8: revision 1.36
sys/kern/sys_ptrace_common.c: revision 1.25
sys/kern/sys_ptrace.c: revision 1.5
sys/compat/linux/arch/powerpc/linux_ptrace.c: revision 1.30
sys/sys/proc.h: revision 1.342
sys/kern/sys_ptrace_common.c: revision 1.26
sys/miscfs/procfs/procfs_ctl.c: file removal
sys/kern/sys_ptrace_common.c: revision 1.27
sys/miscfs/procfs/procfs_subr.c: revision 1.109
sys/kern/sys_ptrace_common.c: revision 1.28
sys/secmodel/extensions/secmodel_extensions.c: revision 1.8
sys/kern/sys_ptrace_common.c: revision 1.29
sys/sys/ptrace.h: revision 1.62
sys/compat/netbsd32/netbsd32_signal.c: revision 1.45
share/man/man9/kauth.9: revision 1.109
sys/miscfs/procfs/files.procfs: revision 1.12
sys/compat/netbsd32/netbsd32.h: revision 1.115
sys/miscfs/procfs/procfs.h: revision 1.72
sys/compat/netbsd32/netbsd32_ptrace.c: revision 1.5
sys/kern/kern_sig.c: revision 1.337
sys/sys/kauth.h: revision 1.75
sys/sys/sysctl.h: revision 1.224
sys/kern/sys_ptrace_common.c: revision 1.30
sys/kern/sys_ptrace_common.c: revision 1.31
sys/kern/sys_ptrace_common.c: revision 1.32
sys/kern/sys_ptrace_common.c: revision 1.33
sys/compat/linux/arch/arm/linux_ptrace.c: revision 1.20
sys/kern/sys_ptrace_common.c: revision 1.34
sys/kern/sys_ptrace_common.c: revision 1.36
sys/kern/kern_proc.c: revision 1.207
sys/kern/kern_exit.c: revision 1.269
doc/TODO.ptrace: revision 1.29

Make {s,g}et{db,fp,}regs work again for PK_32 processes
XXX: pullup-8

add disgusting magic to handle compat_netbsd32 as a module.

use process_*reg32 instead of struct *reg32.

Remove the filesystem tracing feature

This is a legacy interface from 4.4BSD, and it was
introduced to overcome shortcomings of ptrace(2) at that time, which are
no longer relevant (performance). Today /proc/#/ctl offers a narrow
subset of ptrace(2) commands and is not applicable for modern
applications use beyond simplistic tracing scenarios.

This removal will simplify kernel internals. Users will still be able to
use all the other /proc files.

This change won't affect other procfs files or Linux compat
features within mount_procfs(8). /proc/#/ctl isn't available on Linux.

Remove:
- /proc/#/ctl from mount_procfs(8)
- P_FSTRACE note from the documentation of ps(1)
- /proc/#/ctl and filesystem tracing documentation from mount_procfs(8)
- KAUTH_REQ_PROCESS_PROCFS_CTL documentation from kauth(9)
- source code file miscfs/procfs/procfs_ctl.c
- PFSctl and procfs_doctl() from sys/miscfs/procfs/procfs.h
- KAUTH_REQ_PROCESS_PROCFS_CTL from sys/sys/kauth.h
- PSL_FSTRACE (0x00010000) from sys/sys/proc.h
- P_FSTRACE (0x00010000) from sys/sys/sysctl.h

Reduce code complexity after removal of this functionality.

Update TODO.ptrace accordingly: remove two entries about /proc tracing.

Do not keep legacy notes as comments in the headers about removed
PSL_FSTRACE / P_FSTRACE, as this interface had very few users
(close or equal to zero).

Proposed on tech-kern@.

All filesystem tracing utility users are encouraged to switch to ptrace(2).

Sponsored by <The NetBSD Foundation>

untangle the mess:
- factor out common code
- break each ptrace subcall to its own sub-function
.. more to come ...
- reduce ifdef ugliness by moving it up top.
- factor out PT_IO and make PT_{READ,WRITE}_{I,D} use it
- factor out PT_DUMPCORE
- factor out sendsig code
.. more to come ...

handle siginfo requests for ptrace32

ptrace: Partially undo PT_{READ,WRITE}_{I,D} and unbreak these commands

The refactored code did not work and was generating EFAULT.

Sponsored by <The NetBSD Foundation>

Merge the code back; the problem was that since we are reading/writing
to a kernel address for PT_{READ,WRITE}_{I,D} we need the kernel vmspace.
provide separate read and write functions to accommodate register functions
that need a size argument.

don't ignore error from copyout_piod

Use the proper process (the tracee) to get information about lwps and
registers and the tracer for vmspace.

Add new sysctl(3) entry: security.models.extensions.user_set_dbregs

Model this new sysctl(3) entry after "user_set_cpu_affinity" in the same
level of sysctl(3) switches.

Allow reading Debug Registers unconditionally (no change here). This is
convenient as even if a user of a debugger does not use hardware assisted
watchpoints/breakpoints, a debugger can still prompt these values to store
in an internal cache with context of registers. Reading them should have
no security concerns.

Add a paranoid MI switch that prohibits by default setting these registers
by a regular user (non-superuser). Make this switch disabled by default.
There are enough reserved bits out there to allow using them
unconditionally on hardened hosts.

Features shipped with Debug Registers are optional features in debuggers.
There is no reduction in elementary functionality.

Reviewed by <christos>

Sponsored by <The NetBSD Foundation>
 1.74.4.3 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.74.4.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.74.4.1 10-Jun-2019  christos Sync with HEAD
 1.77.2.1 17-Jan-2020  ad Sync with head.
 1.82.4.1 18-Apr-2024  martin Pull up following revision(s) (requested by hannken in ticket #668):

sys/miscfs/procfs/procfs.h: revision 1.83
sys/miscfs/procfs/procfs.h: revision 1.84
sys/kern/vfs_mount.c: revision 1.104
sys/miscfs/procfs/procfs_vnops.c: revision 1.230
sys/kern/init_main.c: revision 1.547
sys/kern/kern_hook.c: revision 1.15
sys/miscfs/procfs/procfs_vfsops.c: revision 1.112
sys/miscfs/procfs/procfs_vfsops.c: revision 1.113
sys/miscfs/procfs/procfs_vfsops.c: revision 1.114
sys/miscfs/procfs/procfs_subr.c: revision 1.117

Print dangling vnode before panic() to help debug.

PR kern/57775 ""panic: unmount: dangling vnode" while umounting procfs"
Protect kernel hooks exechook, exithook and forkhook with rwlock.

Lock as writer on establish/disestablish and as reader on list traverse.

For exechook ride "exec_lock" as it is already taken as reader when
traversing the list. Add local locks for exithook and forkhook.

Move exec_init before signal_init as signal_init calls exechook_establish()
that needs "exec_lock".

PR kern/39913 "exec, fork, exit hooks need locking"

Add a hashmap to access all procfs nodes by pid.

Using the exechook to revoke procfs nodes is racy and may deadlock:
one thread runs doexechooks() -> procfs_revoke_vnodes() and wants to suspend
the file system for vgone(), while another thread runs a forced unmount,
has the file system suspended, tries to disestablish the exechook and
waits for doexechooks() to complete.

Establish/disestablish the exechook on module load/unload instead of
mount/unmount and use the hashmap to access all procfs nodes for this pid.

May fix PR kern/57775 ""panic: unmount: dangling vnode" while umounting procfs"

Remove all procfs nodes for this process on process exit.
 1.86.2.1 02-Aug-2025  perseant Sync with HEAD
 1.4 27-Sep-2019  christos Instead of casting to size_t, cast to uintmax_t to prevent truncation
(pointed out by chuq). In all these cases uio_offset can't be negative.
 1.3 26-Sep-2019  christos fix sign-compare issues: uio->uio_offset (off_t) is compared with (size_t):
cast the offset to size_t.
 1.2 30-Mar-2017  christos branches: 1.2.4; 1.2.6; 1.2.14; 1.2.18; 1.2.22;
remove comment.
 1.1 30-Mar-2017  christos add an auxv node.
 1.2.22.1 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.2.18.2 03-Dec-2017  jdolecek update from HEAD
 1.2.18.1 30-Mar-2017  jdolecek file procfs_auxv.c was added on branch tls-maxphys on 2017-12-03 11:38:48 +0000
 1.2.14.2 28-Aug-2017  skrll Sync with HEAD
 1.2.14.1 30-Mar-2017  skrll file procfs_auxv.c was added on branch nick-nhusb on 2017-08-28 17:53:09 +0000
 1.2.6.2 26-Apr-2017  pgoyette Sync with HEAD
 1.2.6.1 30-Mar-2017  pgoyette file procfs_auxv.c was added on branch pgoyette-localcount on 2017-04-26 02:53:28 +0000
 1.2.4.2 21-Apr-2017  bouyer Sync with HEAD
 1.2.4.1 30-Mar-2017  bouyer file procfs_auxv.c was added on branch bouyer-socketcan on 2017-04-21 16:54:04 +0000
 1.33 18-May-2024  thorpej Remove unnecessary include of <sys/malloc.h>.
 1.32 27-Sep-2019  christos Instead of casting to size_t, cast to uintmax_t to prevent truncation
(pointed out by chuq). In all these cases uio_offset can't be negative.
 1.31 26-Sep-2019  christos fix sign-compare issues: uio->uio_offset (off_t) is compared with (size_t):
cast the offset to size_t.
 1.30 31-Dec-2017  christos branches: 1.30.4;
rename some "cmdline" stuff now that it is used to print environment too
 1.29 31-Dec-2017  christos Add an environ node
 1.28 04-Mar-2011  joerg Refactor ps_strings access. Based on PK_32, write either the normal
version or the 32bit compat layout in execve1. Introduce a new function
copyin_psstrings for reading it back from userland and converting it to
the native layout. Refactor procfs to share most of the code with the
kern.proc_args sysctl handler.

This material is based upon work partially supported by
The NetBSD Foundation under a contract with Joerg Sonnenberger.
 1.27 28-Apr-2008  martin branches: 1.27.22; 1.27.28; 1.27.30;
Remove clause 3 and 4 from TNF licenses
 1.26 17-Feb-2007  pavel branches: 1.26.38; 1.26.40; 1.26.42;
Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.
 1.25 09-Feb-2007  ad branches: 1.25.2;
Merge newlock2 to head.
 1.24 28-Dec-2006  elad PR/32877: Geoff C. Wing: mount_procfs(8) doesn't null-terminate cmdline
output

Patch applied, thanks!
 1.23 16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.22 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.21 01-Mar-2006  yamt branches: 1.21.14; 1.21.16;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.
 1.20 11-Dec-2005  christos branches: 1.20.2; 1.20.4; 1.20.6;
merge ktrace-lwp.
 1.19 26-Feb-2005  perry branches: 1.19.4;
nuke trailing whitespace
 1.18 22-Apr-2004  itojun branches: 1.18.4; 1.18.6;
sprintf -> snprintf
 1.17 29-Jun-2003  fvdl branches: 1.17.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.16 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.15 07-Nov-2002  thorpej Fix signed/unsigned comparison warnings.
 1.14 09-May-2002  thorpej Move code shared by procfs and the kernel proper out of procfs and
into the kernel proper (renaming functions from procfs_* to process_*).
 1.13 15-Nov-2001  lukem don't need <sys/types.h> when including <sys/param.h>
 1.12 10-Nov-2001  lukem add RCSIDs
 1.11 28-Sep-2000  eeh branches: 1.11.2; 1.11.4; 1.11.8;
Add support for variable end of user stacks needed to support COMPAT_NETBSD32:

`struct vmspace' has a new field `vm_minsaddr' which is the user TOS.

PS_STRINGS is deprecated in favor of curproc->p_pstr which is derived
from `vm_minsaddr'.

Bump the kernel version number.
 1.10 26-Sep-2000  thorpej PHOLD/PRELE around uvm_io() to user address space is unnecessary. There
is nothing in the U-area that we need.
 1.9 28-Jun-2000  mrg <vm/vm.h> -> <uvm/uvm_extern.h>
 1.8 01-Jun-2000  simonb branches: 1.8.2;
Fix a possible kernel memory leak - if the cmdline of a process was
requested after it had started to exit but before it became a zombie
a page of kernel memory wouldn't be free'd.
 1.7 16-May-2000  simonb branches: 1.7.2;
Apply patch from Robert Elz in PR kern/10113. This fixes two problems
with procfs's cmdline - from the PR:

The cmdline implementation in procfs is bogus. It's possible that
part of the fix is a workaround of a UVM problem - that is, when
(internally) accessing the top of the process VM (the end of the
args) a request for I/O of a PAGE_SIZE'd block starting at less
than a PAGE_SIZE from the end of the mem space returns EINVAL
rather than the data that is available. Whether this is a bug
in UVM or not depends upon how it is defined to work, and I was
unable to determine that. (Simon Burge found that problem, and
provided the basis of the workaround/fix).

Then, the cmdline function is unable to read more than one
page of args, and a good thing too, as the way it is written
attempting to get more than that would reference into lala land.

And, on an attempt to read a lot of data when the above is
fixed, most of the data won't be returned, only the final block
of any read.

Tested on alpha, pmax, i386 and sparc.
 1.6 22-Jul-1999  thorpej branches: 1.6.2;
Rework the process exit path, in preparation for making process exit
and PID allocation MP-safe. A new process state is added: SDEAD. This
state indicates that a process is dead, but not yet a zombie (has not
yet been processed by the process reaper).

SDEAD processes exist on both the zombproc list (via p_list) and deadproc
(via p_hash; the proc has been removed from the pidhash earlier in the exit
path). When the reaper deals with a process, it changes the state to
SZOMB, so that wait4 can process it.

Add a P_ZOMBIE() macro, which treats a proc in SZOMB or SDEAD as a zombie,
and update various parts of the kernel to reflect the new state.
 1.5 27-Apr-1999  thorpej Fix excessive memory usage, and fix handling of SZOMB processes. PR #7164,
Jaromir Dolecek.
 1.4 24-Mar-1999  mrg branches: 1.4.2;
completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.
 1.3 13-Mar-1999  thorpej malloc the arg temporary buffer, rather than declaring it as an automatic
array of ARG_MAX size. ARG_MAX is currently 256k, which causes a rather
serious stack overflow (kernel stacks are not very large, usually 8k).

Fixes memory corruption problems observed after accessing /proc/1/cmdline
during tests. Problem in my case manifested itself as massive lossage
in ffs_sync(), resulting in a crash, and sometimes, pooched file systems.

XXX This could, and probably should, be rewritten to use a much smaller
temporary buffer, and a loop around uiomove().
 1.2 13-Mar-1999  thorpej Some changes to `cmdline' to make it work properly:
- Don't error out on P_SYSTEM or SZOMB processes; instead, do what ps(1)
would do, i.e. the p_comm in parentheses.
- Use uvm_io() (or procfs_rwmem() if !UVM) to read the target process's
psstrings and argument vector. Using copyin() is problematic, because
it operates on the current process! That is, the old code would
always get the `cmdline' of the process reading the file, not that of
the target process.
 1.1 12-Mar-1999  christos PR/7143: Jaromir Dolecek: Add procfs/cmdline from Linux emulation
 1.4.2.2 01-Jun-2000  he Pull up revision 1.8 (requested by simonb):
Fix a possible kernel memory leak - if the command line of a
process was requested after it had started to exit but before it
became a zombie a page of kernel memory would not be freed.
 1.4.2.1 27-Apr-1999  perry branches: 1.4.2.1.2;
pullup 1.4->1.5 (thorpej)
 1.4.2.1.2.2 02-Aug-1999  thorpej Update from trunk.
 1.4.2.1.2.1 21-Jun-1999  thorpej Sync w/ -current.
 1.6.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.7.2.1 22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.8.2.1 18-Oct-2000  tv Pullup by patch [eeh]:
Support userspace at multiple addresses by making PSSTRINGS variable (using
p_psstr), and fix stackgap_init() appropriately.
 1.11.8.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.11.4.2 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.11.4.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.11.2.4 11-Nov-2002  nathanw Catch up to -current
 1.11.2.3 20-Jun-2002  nathanw Catch up to -current.
 1.11.2.2 08-Jan-2002  nathanw Catch up to -current.
 1.11.2.1 14-Nov-2001  nathanw Catch up to -current.
 1.17.2.5 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.17.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.17.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.17.2.2 03-Aug-2004  skrll Sync with HEAD
 1.17.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.18.6.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.18.4.1 29-Apr-2005  kent sync with -current
 1.19.4.3 26-Feb-2007  yamt sync with head.
 1.19.4.2 30-Dec-2006  yamt sync with head.
 1.19.4.1 21-Jun-2006  yamt sync with head.
 1.20.6.1 22-Apr-2006  simonb Sync with head.
 1.20.4.1 09-Sep-2006  rpaulo sync with head
 1.20.2.1 15-Jan-2006  yamt convert procfs.
 1.21.16.2 10-Dec-2006  yamt sync with head.
 1.21.16.1 22-Oct-2006  yamt sync with head
 1.21.14.3 12-Jan-2007  ad Sync with head.
 1.21.14.2 18-Nov-2006  ad Sync with head.
 1.21.14.1 17-Nov-2006  ad Checkpoint work in progress.
 1.25.2.1 27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.26.42.1 16-May-2008  yamt sync with head.
 1.26.40.1 18-May-2008  yamt sync with head.
 1.26.38.1 02-Jun-2008  mjf Sync with HEAD.
 1.27.30.1 05-Mar-2011  bouyer Sync with HEAD
 1.27.28.1 06-Jun-2011  jruoho Sync with HEAD.
 1.27.22.1 05-Mar-2011  rmind sync with head
 1.30.4.1 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.49 28-Aug-2017  kamil Remove the filesystem tracing feature

This is a legacy interface from 4.4BSD, and it was
introduced to overcome shortcomings of ptrace(2) at that time, which are
no longer relevant (performance). Today /proc/#/ctl offers a narrow
subset of ptrace(2) commands and is not applicable for modern
applications use beyond simplistic tracing scenarios.

This removal will simplify kernel internals. Users will still be able to
use all the other /proc files.

This change won't affect other procfs files or Linux compat
features within mount_procfs(8). /proc/#/ctl isn't available on Linux.

Remove:
- /proc/#/ctl from mount_procfs(8)
- P_FSTRACE note from the documentation of ps(1)
- /proc/#/ctl and filesystem tracing documentation from mount_procfs(8)
- KAUTH_REQ_PROCESS_PROCFS_CTL documentation from kauth(9)
- source code file miscfs/procfs/procfs_ctl.c
- PFSctl and procfs_doctl() from sys/miscfs/procfs/procfs.h
- KAUTH_REQ_PROCESS_PROCFS_CTL from sys/sys/kauth.h
- PSL_FSTRACE (0x00010000) from sys/sys/proc.h
- P_FSTRACE (0x00010000) from sys/sys/sysctl.h

Reduce code complexity after removal of this functionality.

Update TODO.ptrace accordingly: remove two entries about /proc tracing.

Do not keep legacy notes as comments in the headers about removed
PSL_FSTRACE / P_FSTRACE, as this interface had very few users
(close or equal to zero).

Proposed on tech-kern@.

All filesystem tracing utility users are encouraged to switch to ptrace(2).

Sponsored by <The NetBSD Foundation>
 1.48 04-Apr-2016  christos branches: 1.48.10;
Split p_xstat (composite wait(2) status code, or signal number depending
on context) into:
1. p_xexit: exit code
2. p_xsig: signal number
3. p_sflag & WCOREFLAG bit to indicate that the process core-dumped.

Fix the documentation of the flag bits in <sys/proc.h>
 1.47 21-Oct-2009  rmind branches: 1.47.22; 1.47.40;
Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids a few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.
 1.46 14-Mar-2009  dsl ANSIfy another 1261 function definitions.
The only ones left in sys are beyond my sed script!
(or in sys/dist or sys/external)
Mostly they have function pointer parameters.
 1.45 24-Apr-2008  ad branches: 1.45.2; 1.45.10; 1.45.16;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
 1.44 24-Apr-2008  ad Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
 1.43 23-Jan-2008  elad branches: 1.43.6; 1.43.8;
Tons of process scope changes.

- Add a KAUTH_PROCESS_SCHEDULER action, to handle scheduler related
requests, and add specific requests for set/get scheduler policy and
set/get scheduler parameters.

- Add a KAUTH_PROCESS_KEVENT_FILTER action, to handle kevent(2) related
requests.

- Add a KAUTH_DEVICE_TTY_STI action to handle requests to TIOCSTI.

- Add requests for the KAUTH_PROCESS_CANSEE action, indicating what
process information is being looked at (entry itself, args, env,
open files).

- Add requests for the KAUTH_PROCESS_RLIMIT action indicating set/get.

- Add requests for the KAUTH_PROCESS_CORENAME action indicating set/get.

- Make bsd44 secmodel code handle the newly added requests appropriately.

All of the above make it possible to issue finer-grained kauth(9) calls in
many places, removing some KAUTH_GENERIC_ISSUSER requests.

- Remove the "CAN" from KAUTH_PROCESS_CAN{KTRACE,PROCFS,PTRACE,SIGNAL}.

Discussed with christos@ and yamt@.
 1.42 07-Nov-2007  ad branches: 1.42.6;
Merge from vmlocking:

- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
 1.41 09-Jul-2007  ad branches: 1.41.6; 1.41.8; 1.41.12; 1.41.14;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.40 09-Mar-2007  ad branches: 1.40.2; 1.40.4;
- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.
 1.39 09-Feb-2007  ad branches: 1.39.2;
Merge newlock2 to head.
 1.38 19-Dec-2006  elad Some changes to get rid of another KAUTH_GENERIC_ISSUSER usage:
- Make procfs_control() in procfs_ctl.c static,
- Add an argument to the above, 'pfs', for the pfsnode,
- Add another request type to KAUTH_PROCESS_CANPROCFS named
KAUTH_REQ_PROCESS_CANPROCFS_CTL (and update documentation),
- Use the above combination in a call to kauth_authorize_process().
 1.37 22-Nov-2006  elad branches: 1.37.2;
Remove redundant securelevel check; this is already done in procfs_rw()
and we can't get here (procfs_control()) without being there first.

Pointed out by yamt@.
 1.36 16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.35 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.34 03-Sep-2006  christos branches: 1.34.2; 1.34.4;
add missing initializers
 1.33 23-Jul-2006  ad Use the LWP cached credentials where sane.
 1.32 14-May-2006  elad integrate kauth.
 1.31 05-Mar-2006  christos branches: 1.31.2; 1.31.4;
cleanup more SET/CLR/ISSET lossage
 1.30 11-Dec-2005  christos branches: 1.30.4; 1.30.6; 1.30.8;
merge ktrace-lwp.
 1.29 30-Aug-2005  xtraeme Remove __P()
 1.28 26-Feb-2005  perry branches: 1.28.4;
nuke trailing whitespace
 1.27 07-Aug-2003  agc branches: 1.27.8; 1.27.10;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.26 29-Jun-2003  fvdl branches: 1.26.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.25 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.24 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.23 25-Jul-2002  jdolecek branches: 1.23.2;
Make sure that the pointer to old parent process for ptraced children
gets reset properly when the old parent exits before the child. A flag
is set in old parent process when the child is reparented in ptrace(2).
If it's set when process is exiting, all running processes have their
'old parent process' pointer checked and reset if appropriate. Also
change to use 'struct proc *' pointer directly, rather than pid_t.
This fixes security/14444 by David Sainty.

Reviewed by Christos Zoulas.
 1.22 11-Jan-2002  christos branches: 1.22.8; 1.22.10;
Apply the same P_INEXEC test to avoid the execve/trace problem using
the procfs ptrace calls.
 1.21 05-Dec-2001  thorpej * Allow machine-dependent code to specify hooks for ptrace(2)
(__HAVE_PTRACE_MACHDEP) and procfs (__HAVE_PROCFS_MACHDEP).
These changes will allow platforms like x86 (XMM) and PowerPC
(AltiVec) to export extended register sets in a sane manner.

* Use __HAVE_PTRACE_MACHDEP to export x86 XMM registers (standard
FP + SSE/SSE2) using PT_{GET,SET}XMMREGS (in the machdep
ptrace request space).
* Use __HAVE_PROCFS_MACHDEP to export x86 XMM registers via
/proc/N/xmmregs in procfs.
 1.20 10-Nov-2001  lukem add RCSIDs
 1.19 18-Jan-2001  jdolecek branches: 1.19.2; 1.19.4; 1.19.8;
constify
 1.18 20-Aug-2000  thorpej Add a lock around the scheduler, and use it as necessary, including
in the non-MULTIPROCESSOR case (LOCKDEBUG requires it). Scheduler
lock is held upon entry to mi_switch() and cpu_switch(), and
cpu_switch() releases the lock before returning.

Largely from Bill Sommerfeld, with some minor bug fixes and
machine-dependent code hacking from me.
 1.17 22-Jul-1999  thorpej branches: 1.17.2; 1.17.12;
Rework the process exit path, in preparation for making process exit
and PID allocation MP-safe. A new process state is added: SDEAD. This
state indicates that a process is dead, but not yet a zombie (has not
yet been processed by the process reaper).

SDEAD processes exist on both the zombproc list (via p_list) and deadproc
(via p_hash; the proc has been removed from the pidhash earlier in the exit
path). When the reaper deals with a process, it changes the state to
SZOMB, so that wait4 can process it.

Add a P_ZOMBIE() macro, which treats a proc in SZOMB or SDEAD as a zombie,
and update various parts of the kernel to reflect the new state.
 1.16 28-Apr-1997  mycroft branches: 1.16.16; 1.16.18;
Reinstate P_FSTRACE, with different semantics:
* Never send a SIGCHLD to the parent if P_FSTRACE is set.
* Do not permit mixing ptrace(2) and procfs; only permit using the one that
was attached.
 1.15 28-Apr-1997  mycroft Fix several deficiencies, as compared to ptrace(2):
* Did not check for P_SUGID on ATTACH.
* Did not check for tracing of init on ATTACH.
* Did not turn off single-step mode on RUN or DETACH.
* Might have screwed up reparenting in some cases.
* Allowed anyone to detach the process.
 1.14 09-Feb-1996  christos miscfs prototype changes
 1.13 13-Aug-1995  mycroft Lock the process in core before operating on it.
 1.12 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.11 15-Jun-1994  mycroft Minor update from JSP after merging my changes.
 1.10 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.9 07-May-1994  cgd setrun rename
 1.8 04-May-1994  cgd Rename a lot of process flags.
 1.7 20-Jan-1994  ws Make procfs really work for debugging.
Implement note & notepg files in procfs.
 1.6 09-Jan-1994  cgd fix some of my more recent botches, and clean up slightly.
 1.5 09-Jan-1994  cgd oops. fix that last
 1.4 09-Jan-1994  cgd minor cleanup; kill a few assignments
 1.3 09-Jan-1994  ws Bug fixes and enhancements:
Make NFS serving work (BUT DON'T USE "attach" TO /proc/*/ctl FOR NOW!!!)
Make `curproc' a symbolic link
Add `.' and `..' entries to the directories.
Return better guesses on the size of the files.
 1.2 08-Jan-1994  cgd reorganization of ptrace/procfs code
 1.1 05-Jan-1994  cgd branches: 1.1.1;
add new procfs code, from Jan-Simon Pendry, jsp@sequent.com.
This is pretty-much "virgin", so that diffs can be done later.
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.16.18.1 02-Aug-1999  thorpej Update from trunk.
 1.16.16.1 14-Jan-2002  he Pull up revision 1.22 (requested by he):
Fix a ptrace/execve race condition which could be used to modify
the child process' image during execve. This would be a security
issue due to setuid programs.
 1.17.12.1 12-Jan-2002  he Pull up revision 1.22 (requested by christos):
Fix a ptrace/execve race condition which could be used to modify
the child process' image during execve. This would be a security
issue due to setuid programs.
 1.17.2.2 11-Feb-2001  bouyer Sync with HEAD.
 1.17.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.19.8.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.19.4.3 06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.19.4.2 11-Feb-2002  jdolecek Sync w/ -current.
 1.19.4.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.19.2.6 01-Aug-2002  nathanw Catch up to -current.
 1.19.2.5 28-Feb-2002  nathanw Catch up to -current.
 1.19.2.4 11-Jan-2002  nathanw More catchup.
 1.19.2.3 08-Jan-2002  nathanw Catch up to -current.
 1.19.2.2 14-Nov-2001  nathanw Catch up to -current.
 1.19.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.22.10.1 29-Jul-2002  lukem Pull up revision 1.23 (requested by jdolecek in ticket #557):
Make sure that the pointer to old parent process for ptraced children
gets reset properly when the old parent exits before the child. A flag
is set in old parent process when the child is reparented in ptrace(2).
If it's set when process is exiting, all running processes have their
'old parent process' pointer checked and reset if appropriate. Also
change to use 'struct proc *' pointer directly, rather than pid_t.
This fixes security/14444 by David Sainty.
Reviewed by Christos Zoulas.
 1.22.8.1 29-Aug-2002  gehenna catch up with -current.
 1.23.2.1 18-Dec-2002  gmcgarry Merge pcred and ucred, and poolify. TBD: check backward compatibility
and factor-out some higher-level functionality.
 1.26.2.6 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.26.2.5 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.26.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.26.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.26.2.2 03-Aug-2004  skrll Sync with HEAD
 1.26.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review. I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.27.10.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.27.8.1 29-Apr-2005  kent sync with -current
 1.28.4.6 04-Feb-2008  yamt sync with head.
 1.28.4.5 15-Nov-2007  yamt sync with head.
 1.28.4.4 03-Sep-2007  yamt sync with head.
 1.28.4.3 26-Feb-2007  yamt sync with head.
 1.28.4.2 30-Dec-2006  yamt sync with head.
 1.28.4.1 21-Jun-2006  yamt sync with head.
 1.30.8.4 03-Sep-2006  yamt sync with head.
 1.30.8.3 11-Aug-2006  yamt sync with head
 1.30.8.2 24-May-2006  yamt sync with head.
 1.30.8.1 13-Mar-2006  yamt sync with head.
 1.30.6.2 01-Jun-2006  kardel Sync with head.
 1.30.6.1 22-Apr-2006  simonb Sync with head.
 1.30.4.1 09-Sep-2006  rpaulo sync with head
 1.31.4.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.31.2.3 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.31.2.2 10-Mar-2006  elad generic_authorize() -> kauth_authorize_generic().
 1.31.2.1 08-Mar-2006  elad Adapt to kernel authorization KPI.
 1.34.4.3 21-Dec-2006  yamt sync with head.
 1.34.4.2 10-Dec-2006  yamt sync with head.
 1.34.4.1 22-Oct-2006  yamt sync with head
 1.34.2.5 12-Jan-2007  ad Sync with head.
 1.34.2.4 29-Dec-2006  ad Checkpoint work in progress.
 1.34.2.3 18-Nov-2006  ad Sync with head.
 1.34.2.2 17-Nov-2006  ad Checkpoint work in progress.
 1.34.2.1 21-Oct-2006  ad - Make this compile. XXX Needs more work on locking.
- Do FILE_UNUSE() as the current LWP, otherwise we will wipe out the
target's advisory locks. XXX Double check.
 1.37.2.1 04-Jan-2007  bouyer Pull up following revision(s) (requested by hubert in ticket #334):
share/man/man9/kauth.9: revision 1.39
sys/miscfs/procfs/procfs_ctl.c: revision 1.38
sys/sys/kauth.h: revision 1.27
Some changes to get rid of another KAUTH_GENERIC_ISSUSER usage:
- Make procfs_control() in procfs_ctl.c static,
- Add an argument to the above, 'pfs', for the pfsnode,
- Add another request type to KAUTH_PROCESS_CANPROCFS named
KAUTH_REQ_PROCESS_CANPROCFS_CTL (and update documentation),
- Use the above combination in a call to kauth_authorize_process().
 1.39.2.1 12-Mar-2007  rmind Sync with HEAD.
 1.40.4.1 11-Jul-2007  mjf Sync with head.
 1.40.2.3 25-Oct-2007  ad - Simplify debugger/procfs reference counting of processes. Use a per-proc
rwlock: rw_tryenter(RW_READER) to gain a reference, and rw_enter(RW_WRITER)
by the process itself to drain out reference holders before major changes
like exiting.
- Fix numerous bugs and locking issues in procfs.
- Mark procfs MPSAFE.
 1.40.2.2 15-Jul-2007  ad Sync with head.
 1.40.2.1 05-Apr-2007  ad Compile fixes.
 1.41.14.2 18-Feb-2008  mjf Sync with HEAD.
 1.41.14.1 19-Nov-2007  mjf Sync with HEAD.
 1.41.12.1 13-Nov-2007  bouyer Sync with HEAD
 1.41.8.2 23-Mar-2008  matt sync with HEAD
 1.41.8.1 08-Nov-2007  matt sync with -HEAD
 1.41.6.1 11-Nov-2007  joerg Sync with HEAD.
 1.42.6.1 23-Jan-2008  bouyer Sync with HEAD.
 1.43.8.1 18-May-2008  yamt sync with head.
 1.43.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.45.16.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.45.10.1 28-Apr-2009  skrll Sync with HEAD.
 1.45.2.2 11-Mar-2010  yamt sync with head
 1.45.2.1 04-May-2009  yamt sync with head.
 1.47.40.1 22-Apr-2016  skrll Sync with HEAD
 1.47.22.1 03-Dec-2017  jdolecek update from HEAD
 1.48.10.1 12-Apr-2018  martin Pull up following revision(s) (requested by kamil in ticket #713):

sys/modules/procfs/Makefile: revision 1.4
sys/miscfs/procfs/procfs_vfsops.c: revision 1.98
bin/ps/ps.1: revision 1.108
sys/compat/linux/arch/i386/linux_ptrace.c: revision 1.32
sys/miscfs/procfs/procfs_vnops.c: revision 1.198
sys/kern/sys_ptrace_common.c: revision 1.23
sys/kern/sys_ptrace_common.c: revision 1.24
sbin/mount_procfs/mount_procfs.8: revision 1.36
sys/kern/sys_ptrace_common.c: revision 1.25
sys/kern/sys_ptrace.c: revision 1.5
sys/compat/linux/arch/powerpc/linux_ptrace.c: revision 1.30
sys/sys/proc.h: revision 1.342
sys/kern/sys_ptrace_common.c: revision 1.26
sys/miscfs/procfs/procfs_ctl.c: file removal
sys/kern/sys_ptrace_common.c: revision 1.27
sys/miscfs/procfs/procfs_subr.c: revision 1.109
sys/kern/sys_ptrace_common.c: revision 1.28
sys/secmodel/extensions/secmodel_extensions.c: revision 1.8
sys/kern/sys_ptrace_common.c: revision 1.29
sys/sys/ptrace.h: revision 1.62
sys/compat/netbsd32/netbsd32_signal.c: revision 1.45
share/man/man9/kauth.9: revision 1.109
sys/miscfs/procfs/files.procfs: revision 1.12
sys/compat/netbsd32/netbsd32.h: revision 1.115
sys/miscfs/procfs/procfs.h: revision 1.72
sys/compat/netbsd32/netbsd32_ptrace.c: revision 1.5
sys/kern/kern_sig.c: revision 1.337
sys/sys/kauth.h: revision 1.75
sys/sys/sysctl.h: revision 1.224
sys/kern/sys_ptrace_common.c: revision 1.30
sys/kern/sys_ptrace_common.c: revision 1.31
sys/kern/sys_ptrace_common.c: revision 1.32
sys/kern/sys_ptrace_common.c: revision 1.33
sys/compat/linux/arch/arm/linux_ptrace.c: revision 1.20
sys/kern/sys_ptrace_common.c: revision 1.34
sys/kern/sys_ptrace_common.c: revision 1.36
sys/kern/kern_proc.c: revision 1.207
sys/kern/kern_exit.c: revision 1.269
doc/TODO.ptrace: revision 1.29

Make {s,g}et{db,fp,}regs work again for PK_32 processes
XXX: pullup-8

add disgusting magic to handle compat_netbsd32 as a module.

use process_*reg32 instead of struct *reg32.

Remove the filesystem tracing feature

This is a legacy interface from 4.4BSD, and it was
introduced to overcome shortcomings of ptrace(2) at that time, which are
no longer relevant (performance). Today /proc/#/ctl offers a narrow
subset of ptrace(2) commands and is not applicable to modern
applications beyond simplistic tracing scenarios.

This removal will simplify kernel internals. Users will still be able to
use all the other /proc files.

This change affects neither other procfs files nor the Linux compat
features within mount_procfs(8). /proc/#/ctl isn't available on Linux.

Remove:
- /proc/#/ctl from mount_procfs(8)
- P_FSTRACE note from the documentation of ps(1)
- /proc/#/ctl and filesystem tracing documentation from mount_procfs(8)
- KAUTH_REQ_PROCESS_PROCFS_CTL documentation from kauth(9)
- source code file miscfs/procfs/procfs_ctl.c
- PFSctl and procfs_doctl() from sys/miscfs/procfs/procfs.h
- KAUTH_REQ_PROCESS_PROCFS_CTL from sys/sys/kauth.h
- PSL_FSTRACE (0x00010000) from sys/sys/proc.h
- P_FSTRACE (0x00010000) from sys/sys/sysctl.h

Reduce code complexity after removal of this functionality.

Update TODO.ptrace accordingly: remove two entries about /proc tracing.

Do not keep legacy notes as comments in the headers about removed
PSL_FSTRACE / P_FSTRACE, as this interface had very few users
(close or equal to zero).
Proposed on tech-kern@.

All filesystem tracing utility users are encouraged to switch to ptrace(2).

Sponsored by <The NetBSD Foundation>

untangle the mess:
- factor out common code
- break each ptrace subcall to its own sub-function
.. more to come ...
- reduce ifdef ugliness by moving it up top.
- factor out PT_IO and make PT_{READ,WRITE}_{I,D} use it
- factor out PT_DUMPCORE
- factor out sendsig code
.. more to come ...

handle siginfo requests for ptrace32

ptrace: Partially undo PT_{READ,WRITE}_{I,D} and unbreak these commands

The refactored code did not work and was generating EFAULT.

Sponsored by <The NetBSD Foundation>

Merge the code back; the problem was that since we are reading/writing
to a kernel address for PT_{READ,WRITE}_{I,D} we need the kernel vmspace.
provide separate read and write functions to accommodate register functions
that need a size argument.

don't ignore error from copyout_piod

Use the proper process (the tracee) to get information about lwps and
registers and the tracer for vmspace.

Add new sysctl(3) entry: security.models.extensions.user_set_dbregs

Model this new sysctl(3) entry after "user_set_cpu_affinity" in the same
level of sysctl(3) switches.

Allow Debug Registers to be read unconditionally (no change here). This is
convenient as even if a user of a debugger does not use hardware assisted
watchpoints/breakpoints, a debugger can still prompt these values to store
in an internal cache with context of registers. Reading them should have
no security concerns.

Add a paranoid MI switch that prohibits by default setting these registers
by a regular user (non-superuser). Make this switch disabled by default.
There are enough reserved bits out there to allow using them
unconditionally on hardened hosts.

Features shipped with Debug Registers are optional features in debuggers.
There is no reduction in elementary functionality.

Reviewed by <christos>

Sponsored by <The NetBSD Foundation>
 1.14 28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.13 21-Mar-2008  ad branches: 1.13.2; 1.13.4;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
 1.12 07-Nov-2007  ad branches: 1.12.14;
Merge from vmlocking:

- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
 1.11 09-Feb-2007  ad branches: 1.11.6; 1.11.18; 1.11.20; 1.11.24; 1.11.26;
Merge newlock2 to head.
 1.10 16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.9 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.8 23-Jul-2006  ad branches: 1.8.4; 1.8.6;
Use the LWP cached credentials where sane.
 1.7 14-May-2006  elad integrate kauth.
 1.6 11-Dec-2005  christos branches: 1.6.4; 1.6.6; 1.6.8; 1.6.10; 1.6.12;
merge ktrace-lwp.
 1.5 29-Jun-2003  fvdl branches: 1.5.2; 1.5.18;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.4 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.3 08-May-2003  nakayama Add breaks which were forgotten in rev. 1.2 change.
Inspired from a report by HIRATSUKA Kouichirou in tech-pkg-ja mailing list.
 1.2 17-Apr-2003  jdolecek use fd_getfile() in procfs_getfp(), and FILE_USE()/FILE_UNUSE() the
returned file descriptor pointer appropriately
 1.1 03-Jan-2003  christos branches: 1.1.2;
Implement /proc/<pid>/fd/<n>. This is work in progress. Questionable things:
- Is it ok to convert DTYPE_PIPE to VFIFO and DTYPE_SOCKET to VSOCK?
- XXX: Avoid locking issue in ls -Rl /proc by avoiding curproc
- Does I/O to pipes work?
- XXX: Are there security implications?
 1.1.2.2 07-Jan-2003  thorpej Sync with HEAD.
 1.1.2.1 03-Jan-2003  thorpej file procfs_fd.c was added on branch nathanw_sa on 2003-01-07 21:41:13 +0000
 1.5.18.5 24-Mar-2008  yamt sync with head.
 1.5.18.4 15-Nov-2007  yamt sync with head.
 1.5.18.3 26-Feb-2007  yamt sync with head.
 1.5.18.2 30-Dec-2006  yamt sync with head.
 1.5.18.1 21-Jun-2006  yamt sync with head.
 1.5.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.5.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.5.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review. I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.6.12.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.6.10.1 08-Mar-2006  elad Adapt to kernel authorization KPI.
 1.6.8.2 11-Aug-2006  yamt sync with head
 1.6.8.1 24-May-2006  yamt sync with head.
 1.6.6.1 01-Jun-2006  kardel Sync with head.
 1.6.4.1 09-Sep-2006  rpaulo sync with head
 1.8.6.2 10-Dec-2006  yamt sync with head.
 1.8.6.1 22-Oct-2006  yamt sync with head
 1.8.4.3 18-Nov-2006  ad Sync with head.
 1.8.4.2 17-Nov-2006  ad Checkpoint work in progress.
 1.8.4.1 21-Oct-2006  ad - Make this compile. XXX Needs more work on locking.
- Do FILE_UNUSE() as the current LWP, otherwise we will wipe out the
target's advisory locks. XXX Double check.
 1.11.26.1 19-Nov-2007  mjf Sync with HEAD.
 1.11.24.1 13-Nov-2007  bouyer Sync with HEAD
 1.11.20.1 08-Nov-2007  matt sync with -HEAD
 1.11.18.1 11-Nov-2007  joerg Sync with HEAD.
 1.11.6.1 25-Oct-2007  ad - Simplify debugger/procfs reference counting of processes. Use a per-proc
rwlock: rw_tryenter(RW_READER) to gain a reference, and rw_enter(RW_WRITER)
by the process itself to drain out reference holders before major changes
like exiting.
- Fix numerous bugs and locking issues in procfs.
- Mark procfs MPSAFE.
 1.12.14.2 02-Jun-2008  mjf Sync with HEAD.
 1.12.14.1 03-Apr-2008  mjf Sync with HEAD.
 1.13.4.1 16-May-2008  yamt sync with head.
 1.13.2.1 18-May-2008  yamt sync with head.
 1.17 16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.16 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.15 11-Dec-2005  christos branches: 1.15.20; 1.15.22;
merge ktrace-lwp.
 1.14 07-Aug-2003  agc branches: 1.14.16;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.13 29-Jun-2003  fvdl branches: 1.13.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.12 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.11 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.10 09-May-2002  thorpej Move code shared by procfs and the kernel proper out of procfs and
into the kernel proper (renaming functions from procfs_* to process_*).
 1.9 05-Dec-2001  thorpej * Allow machine-dependent code to specify hooks for ptrace(2)
(__HAVE_PTRACE_MACHDEP) and procfs (__HAVE_PROCFS_MACHDEP).
These changes will allow platforms like x86 (XMM) and PowerPC
(AltiVec) to export extended register sets in a sane manner.

* Use __HAVE_PTRACE_MACHDEP to export x86 XMM registers (standard
FP + SSE/SSE2) using PT_{GET,SET}XMMREGS (in the machdep
ptrace request space).
* Use __HAVE_PROCFS_MACHDEP to export x86 XMM registers via
/proc/N/xmmregs in procfs.
 1.8 10-Nov-2001  lukem add RCSIDs
 1.7 17-Jan-2001  fvdl branches: 1.7.2; 1.7.4; 1.7.8;
Add a few linux-style files, only enabled when -o linux is specified
for the mount. Currently these are /proc/cpuinfo and /proc/meminfo.
The former only does something on i386 right now.
 1.6 27-Aug-1997  thorpej branches: 1.6.18; 1.6.28;
Fix a reversed argument which caused procfs_checkioperm() to always return
"OK". Add a few comments to avoid further confusion.
 1.5 12-Aug-1997  thorpej Fix the procfs hole described on current-users, similar to a fix for
FreeBSD by Sean Eric Fagan, but a bit different. This makes the checks
in the same places as sef's FreeBSD patch, but does not hardcode the
"kmem" group into the kernel, and also does a check identical to the
(3) and (4) checks in the NetBSD ptrace(2):

(1) it's not owned by you, or is set-id on exec (unless
you're root), or

(2) it's init, which controls the security level of the
entire system, and the system was not compiled with
permanently insecure mode turned on.
 1.4 13-Aug-1995  mycroft branches: 1.4.14;
Lock the process in core before operating on it.
 1.3 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.2 15-Jun-1994  mycroft Minor update from JSP after merging my changes.
 1.1 08-Jun-1994  mycroft branches: 1.1.1;
Update to 4.4-Lite fs code, with local changes.
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.4.14.2 28-Aug-1997  thorpej Update marc-pcmcia branch from trunk.
 1.4.14.1 23-Aug-1997  thorpej Update marc-pcmcia branch from trunk.
 1.6.28.1 30-Mar-2001  he Pull up revision 1.7 (requested by fvdl):
Add some required Linux emulation bits to support the Linux
version of VMware.
 1.6.18.1 18-Jan-2001  bouyer Sync with head (for UBC+NFS fixes, mostly).
 1.7.8.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.7.4.2 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.7.4.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.7.2.6 15-Oct-2002  nathanw Make _validfoo() routines go back to taking a proc.
 1.7.2.5 12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.7.2.4 20-Jun-2002  nathanw Catch up to -current.
 1.7.2.3 08-Jan-2002  nathanw Catch up to -current.
 1.7.2.2 14-Nov-2001  nathanw Catch up to -current.
 1.7.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.13.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.13.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.13.2.2 03-Aug-2004  skrll Sync with HEAD
 1.13.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review. I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.14.16.2 30-Dec-2006  yamt sync with head.
 1.14.16.1 21-Jun-2006  yamt sync with head.
 1.15.22.2 10-Dec-2006  yamt sync with head.
 1.15.22.1 22-Oct-2006  yamt sync with head
 1.15.20.1 18-Nov-2006  ad Sync with head.
 1.5 12-May-2024  christos PR/58240: Ricardo Branco: Add support for proc/self/limits as used by Linux
 1.4 23-May-2020  ad Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
 1.3 27-Sep-2019  christos Instead of casting to size_t, cast to uintmax_t to prevent truncation
(pointed out by chuq). In all these cases uio_offset can't be negative.
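
The truncation concern behind this change can be sketched in portable C. The example below is an illustration only: int64_t stands in for off_t and uint32_t for a 32-bit size_t, and both function names are hypothetical, not taken from procfs:

```c
#include <stdint.h>

/*
 * Stand-ins for illustration: int64_t plays the role of a 64-bit
 * off_t, uint32_t the role of size_t on a 32-bit platform.
 */

/* Casting the offset down to the narrower type silences the
 * sign-compare warning but discards the high bits. */
static int
past_end_truncating(int64_t offset, uint32_t len)
{
	return (uint32_t)offset >= len;
}

/* Casting up to uintmax_t keeps every bit of a non-negative offset;
 * len is converted to uintmax_t for the comparison. */
static int
past_end_widening(int64_t offset, uint32_t len)
{
	return (uintmax_t)offset >= len;
}
```

With an offset of 2^32 and a length of 100, the truncating variant sees offset 0 and reports "not past the end", while the widening variant gives the expected answer. Both casts assume the offset is never negative, as the commit message notes.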
 1.2 26-Sep-2019  christos fix sign-compare issues: uio->uio_offset (off_t) is compared with (size_t):
cast the offset to size_t.
 1.1 30-Mar-2019  christos branches: 1.1.4;
add a node for the process resource limits.
 1.1.4.3 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.1.4.2 10-Jun-2019  christos Sync with HEAD
 1.1.4.1 30-Mar-2019  christos file procfs_limit.c was added on branch phil-wifi on 2019-06-10 22:09:06 +0000
 1.90 14-Sep-2024  pgoyette Define dependencies based on build options.
 1.89 01-Jul-2024  christos Add linux POSIX message queue support (Ricardo Branco)
 1.88 12-May-2024  christos branches: 1.88.2;
PR/58227: Ricardo Branco: Add support for proc/sysvipc in Linux emulator
 1.87 05-Sep-2020  riastradh Round of uvm.h cleanup.

The poorly named uvm.h is generally supposed to be for uvm-internal
users only.

- Narrow it to files that actually need it -- mostly files that need
to query whether curlwp is the pagedaemon, which should maybe be
exposed by an external header.

- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
by it. We should split up uvm_extern.h but this will serve for now
to reduce the uvm.h dependencies.

- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
UVMHIST(ubchist), since ubchist is declared in uvm.h but the
reference evaporates if UVMHIST is not defined, so we reduce header
file dependencies.

- Make uvm_device.h and uvm_swap.h independently includable while
here.

ok chs@
 1.86 11-Jun-2020  ad Counter tweaks:

- Don't need to count anonpages+filepages any more; clean+unknown+dirty for
each kind of page can be summed to get the totals.

- Track the number of free pages with a counter so that it's one less thing
for the allocator to do, which opens up further options there.

- Remove cpu_count_sync_one(). It has no users and doesn't save a whole lot.
For the cheap option, give cpu_count_sync() a boolean parameter indicating
that a cached value is okay, and rate limit the updates for cached values
to hz.
 1.85 11-Jun-2020  ad uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.
 1.84 31-May-2020  rin struct statvfs is too large for stack. Use malloc(9) instead.

XXX
Switch to kmem(9) for this entire file.

Frame size, e.g. for m68k, becomes:
3292 --> 12
 1.83 23-May-2020  ad Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
 1.82 20-Apr-2020  martin Add missing include of <sys/atomic.h> to fix the build
 1.81 19-Apr-2020  thorpej - Only increment nprocs when we're creating a new process, not just
when allocating a PID.
- Per above, proc_free_pid() no longer decrements nprocs. It's now done
in proc_free() right after proc_free_pid().
- Ensure nprocs is accessed using atomics everywhere.
 1.80 02-Jan-2020  thorpej branches: 1.80.6;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.
 1.79 31-Dec-2019  ad Rename uvm_free() -> uvm_availmem().
 1.78 21-Dec-2019  ad uvmexp.free -> uvm_free()
 1.77 16-Dec-2019  ad - Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).
 1.76 07-Sep-2019  chs have procfs_do_pid_stat() pass the proc's map to get_proc_size_info(),
rather than having the latter look up the map again and not check
for an error.
 1.75 23-Aug-2019  maxv Fix info leaks.
 1.74 05-Dec-2018  christos branches: 1.74.4;
As discussed in tech-kern:

- make sysctl kern.expose_address tri-state:
0: no access
1: access to processes with open /dev/kmem
2: access to everyone
defaults:
0: KASLR kernels
1: non-KASLR kernels

- improve efficiency by calling get_expose_address() per sysctl, not per
process.

- don't expose addresses for linux procfs

- welcome to 8.99.27, changes to fill_*proc ABI
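
The tri-state policy listed above can be sketched as a small decision function. This is a userland illustration under stated assumptions: the enum names, may_expose_address(), and default_expose_mode() are hypothetical and do not reproduce the kernel's get_expose_address() logic:

```c
#include <stdbool.h>

/* Hypothetical names for the three kern.expose_address settings. */
enum expose_mode {
	EXPOSE_NONE = 0,	/* no access */
	EXPOSE_KMEM = 1,	/* only processes with /dev/kmem open */
	EXPOSE_ALL = 2		/* access for everyone */
};

/* Decide whether kernel addresses may be shown to a caller.
 * has_kmem_open is an assumed input saying the caller holds an
 * open /dev/kmem descriptor. Unknown modes are treated as "no". */
static bool
may_expose_address(int mode, bool has_kmem_open)
{
	switch (mode) {
	case EXPOSE_ALL:
		return true;
	case EXPOSE_KMEM:
		return has_kmem_open;
	case EXPOSE_NONE:
	default:
		return false;
	}
}

/* Defaults from the commit message: 0 on KASLR kernels, 1 otherwise. */
static int
default_expose_mode(bool kaslr)
{
	return kaslr ? EXPOSE_NONE : EXPOSE_KMEM;
}
```

Evaluating the mode once per sysctl request and passing the result down, rather than recomputing it per process, matches the efficiency point made in the message.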
 1.73 13-Apr-2017  hannken branches: 1.73.4; 1.73.10; 1.73.12;
Switch procfs_domounts() to mountlist iterator.
 1.72 28-Mar-2016  mlelstv branches: 1.72.2; 1.72.4;
Align /proc/<pid>/statm data with /proc/<pid>/stat and
provide RSS information. There is no data about shared
pages.

Helps PR 50801.
 1.71 24-Jul-2015  maxv Unused inits (harmless).

Found by Brainy.
 1.70 10-Aug-2014  matt branches: 1.70.2; 1.70.4; 1.70.10;
#include <sys/cpu.h>
 1.69 12-Jul-2014  njoly Use kproc2 to provide sensible information for /proc/<pid>/stat.
 1.68 30-Jun-2014  njoly Use NZERO instead of hard-coded "20" value.
 1.67 05-Apr-2014  christos branches: 1.67.2;
On my 24 proc box I got ENOSPC, so make the routine return the size it wants
and try again.
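
The "return the size it wants and try again" pattern can be sketched as follows. This is a generic userland illustration, not the procfs code: produce(), produce_with_retry(), and the sample data are hypothetical:

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical producer: always reports the size it needs, and
 * fails with ENOSPC when the caller's buffer is too small. */
static int
produce(char *buf, size_t buflen, size_t *needed)
{
	static const char data[] = "cpu0 1 2 3\ncpu1 4 5 6\n";

	*needed = sizeof(data);
	if (buflen < sizeof(data))
		return ENOSPC;
	memcpy(buf, data, sizeof(data));
	return 0;
}

/* Call the producer, growing the buffer to the reported size on
 * ENOSPC. initial must be non-zero. Returns a malloc'ed string the
 * caller frees, or NULL on failure. */
static char *
produce_with_retry(size_t initial)
{
	size_t len = initial, needed = 0;
	char *buf = NULL;

	for (;;) {
		char *nbuf = realloc(buf, len);
		if (nbuf == NULL) {
			free(buf);
			return NULL;
		}
		buf = nbuf;

		int error = produce(buf, len, &needed);
		if (error == 0)
			return buf;
		/* Give up on other errors or if no progress is possible. */
		if (error != ENOSPC || needed <= len) {
			free(buf);
			return NULL;
		}
		len = needed;
	}
}
```

Having the producer report the required size lets the caller converge in one retry instead of guessing a fixed buffer size that breaks on machines with many CPUs.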
 1.66 27-Nov-2013  christos Change the queue.3 *_END(&head) macros to NULL. Since we don't have CIRCLEQ
anymore, all the macros expand to NULL anyway, so this improves readability.
Requested by rmind@
 1.65 23-Nov-2013  christos change the mountlist CIRCLEQ into a TAILQ
 1.64 19-Dec-2011  christos branches: 1.64.6; 1.64.10;
don't produce different output if we are super user.
 1.63 16-Dec-2011  christos provide a root entry if one was not found.
 1.62 15-Dec-2011  christos PR/45700: use dostatvfs instead of grabbing the latest cached copy of
struct statvfs from the mount point, so that chroot is handled properly.
 1.61 04-Sep-2011  jmcneill branches: 1.61.2; 1.61.6;
PR# kern/45021: Please support /emul/linux/proc/version

Add /proc/version for procfs with -o linux. The version reported depends
on the emulation type of the calling process:

$ cat /proc/version
NetBSD version 5.99.55 (netbsd@localhost) (gcc version 4.1.3 20080704 prerelease (NetBSD nb2 20081120)) NetBSD 5.99.55 (GENERIC) #39: Sun Sep 4 09:10:05 EDT 2011

$ /emul/linux/bin/cat /proc/version
Linux version 2.6.18 (linux@localhost) (gcc version 4.1.3 20080704 prerelease (NetBSD nb2 20081120)) #0 Wed Mar 3 03:03:03 PST 2010

$ /emul/linux32/bin/cat /proc/version
Linux version 2.6.18 (linux32@localhost) (gcc version 4.1.3 20080704 prerelease (NetBSD nb2 20081120)) #0 Wed Mar 3 03:03:03 PST 2010
 1.60 28-Aug-2011  jmcneill both LINUX_USRSTACK32 and USRSTACK32 need to be defined for linux32
 1.59 20-Dec-2010  matt Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits. Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.
 1.58 19-Oct-2009  dholland branches: 1.58.4;
Avoid leaking pages. Fixes PR 42053 from SHIMIZU Ryo.
 1.57 11-Jan-2009  christos this change was somehow missed.
 1.56 11-Jan-2009  christos merge christos-time_t
 1.55 29-Dec-2008  pooka Rename specfs_lock as device_lock and move it from specfs to devsw.
Relaxes kernel dependency on vfs.
 1.54 31-May-2008  ad branches: 1.54.6; 1.54.8; 1.54.14;
Kill devsw_lock and just use specfs_lock. The two would need merging
in order to prevent unload of modules when a device that they provide
is still open.
 1.53 06-May-2008  ad branches: 1.53.2;
PR kern/38141 lookup/vfs_busy acquire rwlock recursively

Simplify the mount locking. Remove all the crud to deal with recursion on
the mount lock, and crud to deal with unmount as another weirdo lock.

Hopefully this will once and for all fix the deadlocks with this. With this
commit there are two locks on each mount:

- krwlock_t mnt_unmounting. This is used to prevent unmount across critical
sections like getnewvnode(). It's only ever read locked with rw_tryenter(),
and is only ever write locked in dounmount(). A write hold can't be taken
on this lock if the current LWP could hold a vnode lock.

- kmutex_t mnt_updating. This is taken by threads updating the mount, for
example when going r/o -> r/w, and is only present to serialize updates.
In order to take this lock, a read hold must first be taken on
mnt_unmounting, and the two need to be held across the operation.

One effect of this change: previously if an unmount failed, we would make a
half hearted attempt to back out of it gracefully, but that was unlikely to
work in a lot of cases. Now while an unmount that will be aborted is in
progress, new file operations within the mount will fail instead of being
delayed. That is unlikely to be a problem though, because if the admin
requests unmount of a file system then s(he) has made a decision to deny
access to the resource.
 1.52 30-Apr-2008  ad PR kern/38135 vfs_busy/vfs_trybusy confusion

The previous fix worked, but it opened a window where mounts could have
disappeared from mountlist while the caller was traversing it using
vfs_trybusy(). Fix that.
 1.51 29-Apr-2008  ad kern/38135 vfs_busy/vfs_trybusy confusion

The symptom was that sometimes file systems would occasionally not appear
in output from 'df' or 'mount' if the system was busy. Resolution:

- Make mount locks work somewhat like vm_map locks.
- vfs_trybusy() now only fails if the mount is gone, or if someone is
unmounting the file system. Simple contention on mnt_lock doesn't
cause it to fail.
- vfs_busy() will wait even if the file system is being unmounted.
 1.50 24-Apr-2008  ad branches: 1.50.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
 1.49 24-Apr-2008  ad Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
 1.48 30-Jan-2008  ad branches: 1.48.6; 1.48.8; 1.48.10;
PR kern/37706 (forced unmount of file systems is unsafe):

- Do reference counting for 'struct mount'. Each vnode associated with a
mount takes a reference, and in turn the mount takes a reference to the
vfsops.
- Now that mounts are reference counted, replace the overcomplicated mount
locking inherited from 4.4BSD with a recursable rwlock.
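
The reference-counting scheme described in 1.48 can be sketched in user space as below. This is only an illustration under stated assumptions: `struct mount_sketch` and the `mount_sketch_*` helpers are hypothetical names, not the kernel's interface, and the real code also takes a reference from the mount to the vfsops, which is omitted here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for 'struct mount'; not the kernel structure. */
struct mount_sketch {
	atomic_uint refcnt;	/* one reference per holder */
	bool destroyed;		/* set when the last reference is dropped */
};

/* The creator of the mount owns the initial reference. */
void
mount_sketch_init(struct mount_sketch *mp)
{
	atomic_init(&mp->refcnt, 1);
	mp->destroyed = false;
}

/* Called when a vnode becomes associated with the mount. */
void
mount_sketch_hold(struct mount_sketch *mp)
{
	atomic_fetch_add(&mp->refcnt, 1);
}

/* Drop one reference; the structure goes away only with the last one. */
void
mount_sketch_rele(struct mount_sketch *mp)
{
	unsigned int old = atomic_fetch_sub(&mp->refcnt, 1);

	assert(old > 0);
	if (old == 1)
		mp->destroyed = true;	/* real code would free here */
}
```

With this shape, a forced unmount that drops the initial reference cannot pull the structure out from under a vnode that still holds its own reference.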
 1.47 22-Dec-2007  yamt procfs_douptime: simply use microuptime() instead of a mysterious calculation.
 1.46 22-Dec-2007  yamt procfs_docpustat: g/c a write-only variable.
 1.45 12-Nov-2007  ad branches: 1.45.2; 1.45.6;
Revision 1.42 was lost. Pointed out by Nicolas Joly:

This was using mutex_exit where mutex_enter was required.
 1.44 11-Nov-2007  christos report the proper stack size on 32 bit emulations.
 1.43 07-Nov-2007  ad Merge from vmlocking:

- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
 1.42 11-Oct-2007  ad branches: 1.42.2; 1.42.4;
This was using mutex_exit where mutex_enter was required.
 1.41 10-Oct-2007  ad Merge from vmlocking:

- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
 1.40 08-Oct-2007  ad Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.
 1.39 26-May-2007  agc branches: 1.39.6; 1.39.8; 1.39.10;
In /proc/<pid>/statm, avoid leaking buffer space if the attempt to get
vmspace information fails.

Return the nice value properly to userland via the /proc/<pid>/stat entry.

Use vm sizes from vmspace, rather than rusage structs, for the same
reasons as mentioned previously - see the comment in
kvm_proc.c::kvm_getproc2() about rusage values and zombie processes.
 1.38 25-May-2007  agc Use a bit more common code for the MULTIPROCESSOR and !MULTIPROCESSOR
cases.

Use the lwp's priority when returning the priority value, rather than
returning the nice value.
 1.37 25-May-2007  agc Various changes for better Linux emulation:

+ in /proc/<pid>/statm emulation, use the memory values from vmspace,
rather than struct rusage, since the rusage values appear to be 0 for
all processes except zombies. cf dsl's comment in
kvm_proc.c::kvm_getproc2()

+ in /proc/<pid>/stat, instead of returning the tv_sec value, return the
number of ticks we've had (roughly equivalent to the Linux jiffies).
Calculate these values from the tv_usec values.

Also:

+ enclose CPU_INFO_ITERATOR and CPU_INFO_FOREACH usage in #ifdef
MULTIPROCESSOR, at the request of Nick Hudson

Together, these changes allow htop to work on NetBSD.
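
As a rough illustration of the seconds-to-ticks change in 1.37, the conversion could look like the sketch below. The helper name `timeval_to_ticks` and the exact formula are assumptions; the log does not show the expression the commit actually uses, and this version combines both `tv_sec` and `tv_usec`.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

/*
 * Convert a timeval to clock ticks at 'hz' ticks per second.
 * Hypothetical helper; the committed code may compute this differently.
 */
uint64_t
timeval_to_ticks(const struct timeval *tv, unsigned int hz)
{
	assert(tv->tv_usec >= 0 && tv->tv_usec < 1000000);

	/* Whole seconds contribute hz ticks each; the rest is scaled down. */
	return (uint64_t)tv->tv_sec * hz +
	    ((uint64_t)tv->tv_usec * hz) / 1000000;
}
```

At hz = 100, a value of 2.5 seconds maps to 250 ticks, which is the granularity Linux tools such as htop expect from the stat fields.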
 1.36 24-May-2007  dogcow use PRIu64, not llu, to unbork on 64-bit platforms.
 1.35 24-May-2007  agc Extend the Linux emulation of /proc to include

/proc/stat
/proc/loadavg and
/proc/<pid>/statm.

These are only present when -o linux is specified as a mount option
to procfs.

Factor out some common code so that it can be used by a number of
functions.

XXX The values returned in the statm emulation need to be verified.
 1.34 01-Apr-2007  christos return a page less than the actual top of stack so that linux-java works.
 1.33 09-Mar-2007  ad branches: 1.33.2; 1.33.4;
- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.
 1.32 09-Feb-2007  ad branches: 1.32.2;
Merge newlock2 to head.
 1.31 24-Dec-2006  elad Add two comments. No functional change.
 1.30 16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.29 27-Oct-2006  christos don't allocate large buffers on the stack.
 1.28 23-Oct-2006  elad PR/34888: Nicolas Joly: kernel panic while trying to access
/emul/linux/proc/0/stat

Patch applied, thanks for the report!
 1.27 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.26 20-Sep-2006  manu Emulate Linux's /proc/devices
 1.25 24-Jun-2006  christos branches: 1.25.4; 1.25.6;
PR/33815: Nicolas Joly: /emul/linux/proc/#/stat always report current
process status
 1.24 11-Dec-2005  christos branches: 1.24.4; 1.24.8; 1.24.16;
merge ktrace-lwp.
 1.23 29-May-2005  christos branches: 1.23.2;
- sprinkle const
- avoid shadowed variables.
 1.22 01-Mar-2005  christos branches: 1.22.2; 1.22.4;
Remove bogus len setting noted by J. Chapman Flack.
 1.21 27-Feb-2005  christos Give more space for cpu info and allocate it dynamically.
 1.20 26-Feb-2005  perry nuke trailing whitespace
 1.19 20-Sep-2004  jdolecek branches: 1.19.4; 1.19.6;
add 'mounts' file for -o linux, which lists all currently mounted
filesystems; Linux glibc statvfs() uses this to get some of mount flags,
and this file is also useful as /emul/linux/etc/mtab (via symlink)
 1.18 27-Aug-2004  skrll Do previous slightly differently - just pass a struct lwp * and derive the
struct proc *.

OK'd by Jaromir.
 1.17 21-Aug-2004  jdolecek fix process used for /proc/<pid>/stat contents - it should be process
<pid>, not the current process looking at the information
 1.16 22-Apr-2004  itojun sprintf -> snprintf
 1.15 30-Oct-2003  christos branches: 1.15.2;
t_pgrp can be null.
 1.14 21-Aug-2003  he Add casts of LINUX_USRSTACK and USRSTACK to handle the cases
where these are not constants.
 1.13 09-Aug-2003  christos LINUX_USRSTACK is only defined on i386. Thanks Izumi!
 1.12 09-Aug-2003  christos Only choose the linux usrstack if the netbsd usrstack was higher.
 1.11 09-Aug-2003  christos Change the way we compute the top of the stack. This makes java-1.4.2 work.
 1.10 29-Jun-2003  fvdl branches: 1.10.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.9 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.8 29-May-2003  hannken Change "%qu" to "PRIu64" to make it compile on sparc64.
 1.7 28-May-2003  christos Add /proc/<pid>/stat for linux compat. j2sdk1.4.2 depends on it.
 1.6 27-Feb-2003  hannken Change "%llu" to "PRIu64" to make it compile on sparc64.
 1.5 25-Feb-2003  jrf This addresses PR kern/19989. Thanks to hamajima@nagoya.ydc.co.jp for submitting this patch which enables /proc/uptime for linux emul. Patch reviewed by atatat@netbsd.org and tron@netbsd.org, approved by tron@netbsd.org.
 1.4 09-Dec-2001  chs replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names.
 1.3 10-Nov-2001  lukem add RCSIDs
 1.2 18-Jan-2001  tv branches: 1.2.2; 1.2.4; 1.2.6; 1.2.10;
No-op revision to force update of this file to a non-"-kk" version.
 1.1 17-Jan-2001  fvdl branches: 1.1.2;
Add a few linux-style files, only enabled when -o linux is specified
for the mount. Currently these are /proc/cpuinfo and /proc/meminfo.
The former only does something on i386 right now.
 1.1.2.2 18-Jan-2001  bouyer Sync with head (for UBC+NFS fixes, mostly).
 1.1.2.1 17-Jan-2001  bouyer file procfs_linux.c was added on branch thorpej_scsipi on 2001-01-18 09:23:48 +0000
 1.2.10.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.2.6.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.2.4.2 30-Mar-2001  he Pull up revisions 1.1-1.2 (new, via patch, requested by fvdl):
Add some required Linux emulation bits to support the Linux
version of VMware.
 1.2.4.1 18-Jan-2001  he file procfs_linux.c was added on branch netbsd-1-5 on 2001-03-30 21:48:11 +0000
 1.2.2.2 08-Jan-2002  nathanw Catch up to -current.
 1.2.2.1 14-Nov-2001  nathanw Catch up to -current.
 1.10.2.7 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.10.2.6 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.10.2.5 24-Sep-2004  skrll Sync with HEAD.
 1.10.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.10.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.10.2.2 03-Aug-2004  skrll Sync with HEAD
 1.10.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.15.2.3 29-Oct-2006  tron Pull up following revision(s) (requested by adrianp in ticket #10739):
sys/miscfs/procfs/procfs_linux.c: revision 1.28
PR/34888: Nicolas Joly: kernel panic while trying to access
/emul/linux/proc/0/stat
Patch applied, thanks for the report!
 1.15.2.2 30-Aug-2004  tron branches: 1.15.2.2.2; 1.15.2.2.4;
Pull up revision 1.18 via patch (requested by jdolecek in ticket #799):
Do previous slightly differently - just pass a struct lwp * and derive the
struct proc *.
OK'd by Jaromir.
 1.15.2.1 30-Aug-2004  tron Pull up revision 1.17 (requested by jdolecek in ticket #799):
fix process used for /proc/<pid>/stat contents - it should be process
<pid>, not the current process looking at the information
 1.15.2.2.4.1 29-Oct-2006  tron Pull up following revision(s) (requested by adrianp in ticket #10739):
sys/miscfs/procfs/procfs_linux.c: revision 1.28
PR/34888: Nicolas Joly: kernel panic while trying to access
/emul/linux/proc/0/stat
Patch applied, thanks for the report!
 1.15.2.2.2.1 29-Oct-2006  tron Pull up following revision(s) (requested by adrianp in ticket #10739):
sys/miscfs/procfs/procfs_linux.c: revision 1.28
PR/34888: Nicolas Joly: kernel panic while trying to access
/emul/linux/proc/0/stat
Patch applied, thanks for the report!
 1.19.6.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.19.4.1 29-Apr-2005  kent sync with -current
 1.22.4.1 24-Oct-2006  ghen Pull up following revision(s) (requested by elad in ticket #1567):
sys/miscfs/procfs/procfs_linux.c: revision 1.28
PR/34888: Nicolas Joly: kernel panic while trying to access
/emul/linux/proc/0/stat
Patch applied, thanks for the report!
 1.22.2.1 24-Oct-2006  ghen Pull up following revision(s) (requested by elad in ticket #1567):
sys/miscfs/procfs/procfs_linux.c: revision 1.28
PR/34888: Nicolas Joly: kernel panic while trying to access
/emul/linux/proc/0/stat
Patch applied, thanks for the report!
 1.23.2.8 04-Feb-2008  yamt sync with head.
 1.23.2.7 21-Jan-2008  yamt sync with head
 1.23.2.6 15-Nov-2007  yamt sync with head.
 1.23.2.5 27-Oct-2007  yamt sync with head.
 1.23.2.4 03-Sep-2007  yamt sync with head.
 1.23.2.3 26-Feb-2007  yamt sync with head.
 1.23.2.2 30-Dec-2006  yamt sync with head.
 1.23.2.1 21-Jun-2006  yamt sync with head.
 1.24.16.1 13-Jul-2006  gdamore Merge from HEAD.
 1.24.8.1 26-Jun-2006  yamt sync with head.
 1.24.4.1 09-Sep-2006  rpaulo sync with head
 1.25.6.2 10-Dec-2006  yamt sync with head.
 1.25.6.1 22-Oct-2006  yamt sync with head
 1.25.4.4 12-Jan-2007  ad Sync with head.
 1.25.4.3 18-Nov-2006  ad Sync with head.
 1.25.4.2 17-Nov-2006  ad Checkpoint work in progress.
 1.25.4.1 21-Oct-2006  ad - Make this compile. XXX Needs more work on locking.
- Do FILE_UNUSE() as the current LWP, otherwise we will wipe out the
target's advisory locks. XXX Double check.
 1.32.2.2 15-Apr-2007  yamt sync with head.
 1.32.2.1 12-Mar-2007  rmind Sync with HEAD.
 1.33.4.1 11-Jul-2007  mjf Sync with head.
 1.33.2.5 25-Oct-2007  ad - Simplify debugger/procfs reference counting of processes. Use a per-proc
rwlock: rw_tryenter(RW_READER) to gain a reference, and rw_enter(RW_WRITER)
by the process itself to drain out reference holders before major changes
like exiting.
- Fix numerous bugs and locking issues in procfs.
- Mark procfs MPSAFE.
 1.33.2.4 14-Jul-2007  ad Make it possible to track time spent by soft interrupts as is done for
normal LWPs, and provide a sysctl to switch it on/off. Not enabled by
default because microtime() is not free. XXX Not happy with this but
I want to get it out of my local tree for the time being.
 1.33.2.3 08-Jun-2007  ad Sync with head.
 1.33.2.2 10-Apr-2007  ad Sync with head.
 1.33.2.1 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.39.10.1 14-Oct-2007  yamt sync with head.
 1.39.8.4 23-Mar-2008  matt sync with HEAD
 1.39.8.3 09-Jan-2008  matt sync with HEAD
 1.39.8.2 08-Nov-2007  matt sync with -HEAD
 1.39.8.1 06-Nov-2007  matt sync with HEAD
 1.39.6.3 14-Nov-2007  joerg Sync with HEAD.
 1.39.6.2 11-Nov-2007  joerg Sync with HEAD.
 1.39.6.1 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.42.4.3 18-Feb-2008  mjf Sync with HEAD.
 1.42.4.2 27-Dec-2007  mjf Sync with HEAD.
 1.42.4.1 19-Nov-2007  mjf Sync with HEAD.
 1.42.2.1 13-Nov-2007  bouyer Sync with HEAD
 1.45.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.45.2.1 26-Dec-2007  ad Sync with head.
 1.48.10.2 04-Jun-2008  yamt sync with head
 1.48.10.1 18-May-2008  yamt sync with head.
 1.48.8.3 30-Dec-2008  christos sync with head.
 1.48.8.2 01-Nov-2008  christos Sync with head.
 1.48.8.1 29-Mar-2008  christos Welcome to the time_t=long long dev_t=uint64_t branch.
 1.48.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.48.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.50.2.3 11-Mar-2010  yamt sync with head
 1.50.2.2 04-May-2009  yamt sync with head.
 1.50.2.1 16-May-2008  yamt sync with head.
 1.53.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.54.14.3 29-Apr-2011  matt Use _KERNEL_OPT
 1.54.14.2 05-Feb-2011  cliff - include opt_multiprocessor.h for explicit MULTIPROCESSOR dependency
 1.54.14.1 21-Apr-2010  matt sync to netbsd-5
 1.54.8.1 27-Oct-2009  bouyer Pull up following revision(s) (requested by markd in ticket #1113):
sys/miscfs/procfs/procfs_linux.c: revision 1.58
Avoid leaking pages. Fixes PR 42053 from SHIMIZU Ryo.
 1.54.6.1 19-Jan-2009  skrll Sync with HEAD.
 1.58.4.1 05-Mar-2011  rmind sync with head
 1.61.6.1 18-Feb-2012  mrg merge to -current.
 1.61.2.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.61.2.1 17-Apr-2012  yamt sync with head
 1.64.10.1 18-May-2014  rmind sync with head
 1.64.6.2 03-Dec-2017  jdolecek update from HEAD
 1.64.6.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.67.2.1 10-Aug-2014  tls Rebase.
 1.70.10.1 21-Jan-2020  martin Pull up the following, requested by christos in ticket #1720:

sys/compat/common/kern_sig_43.c 1.36
sys/compat/linux/arch/amd64/linux_machdep.c 1.59
sys/compat/linux/common/linux_fcntl.h 1.18
sys/compat/linux/common/linux_file64.c 1.62
sys/compat/linux/common/linux_ipc.c 1.57
sys/compat/linux/common/linux_misc.c 1.243
sys/compat/linux/common/linux_signal.c 1.81
sys/compat/linux/common/linux_socket.c 1.149 (patch)
sys/compat/linux/common/linux_socket.h 1.24
sys/compat/linux/common/linux_statfs.h 1.7
sys/compat/linux/common/linux_termios.c 1.38
sys/compat/linux/common/linux_termios.h 1.22
sys/compat/linux32/common/linux32_dirent.c 1.20
sys/compat/linux32/common/linux32_ioctl.c 1.14
sys/compat/linux32/common/linux32_misc.c 1.27
sys/compat/linux32/common/linux32_signal.c 1.20
sys/compat/linux32/common/linux32_sysinfo.c 1.8
sys/compat/linux32/common/linux32_termios.c 1.15
sys/compat/linux32/common/linux32_utsname.c 1.10
sys/compat/netbsd32/netbsd32_compat_20.c 1.39
sys/compat/netbsd32/netbsd32_compat_43.c 1.59
sys/compat/netbsd32/netbsd32_compat_50.c 1.44
sys/compat/ossaudio/ossaudio.c 1.75
sys/kern/sysv_shm.c 1.138
sys/miscfs/procfs/procfs_linux.c 1.75 (patch)
sys/sys/shm.h 1.54 (patch)

Fix various info leaks, out of bound access, usage of uninitialized
values and direct access to userland variables from kernel space
and memory leaks in system calls implemented for the compatibility
subsystems.
 1.70.4.3 28-Aug-2017  skrll Sync with HEAD
 1.70.4.2 22-Apr-2016  skrll Sync with HEAD
 1.70.4.1 22-Sep-2015  skrll Sync with HEAD
 1.70.2.1 21-Jan-2020  martin Pull up the following, requested by christos in ticket #1720:

sys/compat/common/kern_sig_43.c 1.36
sys/compat/linux/arch/amd64/linux_machdep.c 1.59
sys/compat/linux/common/linux_fcntl.h 1.18
sys/compat/linux/common/linux_file64.c 1.62
sys/compat/linux/common/linux_ipc.c 1.57
sys/compat/linux/common/linux_misc.c 1.243
sys/compat/linux/common/linux_signal.c 1.81
sys/compat/linux/common/linux_socket.c 1.149 (patch)
sys/compat/linux/common/linux_socket.h 1.24
sys/compat/linux/common/linux_statfs.h 1.7
sys/compat/linux/common/linux_termios.c 1.38
sys/compat/linux/common/linux_termios.h 1.22
sys/compat/linux32/common/linux32_dirent.c 1.20
sys/compat/linux32/common/linux32_ioctl.c 1.14
sys/compat/linux32/common/linux32_misc.c 1.27
sys/compat/linux32/common/linux32_signal.c 1.20
sys/compat/linux32/common/linux32_sysinfo.c 1.8
sys/compat/linux32/common/linux32_termios.c 1.15
sys/compat/linux32/common/linux32_utsname.c 1.10
sys/compat/netbsd32/netbsd32_compat_20.c 1.39
sys/compat/netbsd32/netbsd32_compat_43.c 1.59
sys/compat/netbsd32/netbsd32_compat_50.c 1.44
sys/compat/ossaudio/ossaudio.c 1.75
sys/kern/sysv_shm.c 1.138
sys/miscfs/procfs/procfs_linux.c 1.75 (patch)
sys/sys/shm.h 1.54 (patch)

Fix various info leaks, out of bound access, usage of uninitialized
values and direct access to userland variables from kernel space
and memory leaks in system calls implemented for the compatibility
subsystems.
 1.72.4.1 21-Apr-2017  bouyer Sync with HEAD
 1.72.2.1 26-Apr-2017  pgoyette Sync with HEAD
 1.73.12.4 21-Apr-2020  martin Sync with HEAD
 1.73.12.3 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.73.12.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.73.12.1 10-Jun-2019  christos Sync with HEAD
 1.73.10.1 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.73.4.2 21-Jan-2020  martin Pull up the following, requested by christos in ticket #1487:

sys/compat/common/kern_sig_43.c 1.36
sys/compat/linux/arch/amd64/linux_machdep.c 1.59
sys/compat/linux/common/linux_fcntl.h 1.18
sys/compat/linux/common/linux_file64.c 1.62
sys/compat/linux/common/linux_ipc.c 1.57
sys/compat/linux/common/linux_misc.c 1.243
sys/compat/linux/common/linux_signal.c 1.81
sys/compat/linux/common/linux_socket.c 1.149
sys/compat/linux/common/linux_socket.h 1.24
sys/compat/linux/common/linux_statfs.h 1.7
sys/compat/linux/common/linux_termios.c 1.38
sys/compat/linux/common/linux_termios.h 1.22
sys/compat/linux32/common/linux32_dirent.c 1.20
sys/compat/linux32/common/linux32_ioctl.c 1.14
sys/compat/linux32/common/linux32_misc.c 1.27
sys/compat/linux32/common/linux32_signal.c 1.20
sys/compat/linux32/common/linux32_sysinfo.c 1.8
sys/compat/linux32/common/linux32_termios.c 1.15
sys/compat/linux32/common/linux32_utsname.c 1.10
sys/compat/netbsd32/netbsd32_compat_20.c 1.39
sys/compat/netbsd32/netbsd32_compat_43.c 1.59
sys/compat/netbsd32/netbsd32_compat_50.c 1.44
sys/compat/ossaudio/ossaudio.c 1.75
sys/kern/sysv_shm.c 1.138
sys/miscfs/procfs/procfs_linux.c 1.75 (patch)
sys/sys/shm.h 1.54

Fix various info leaks, out of bound access, usage of uninitialized
values and direct access to userland variables from kernel space
and memory leaks in system calls implemented for the compatibility
subsystems.
 1.73.4.1 10-Sep-2019  martin Pull up following revision(s) (requested by chs in ticket #1370):

sys/miscfs/procfs/procfs_linux.c: revision 1.76

have procfs_do_pid_stat() pass the proc's map to get_proc_size_info(),
rather than having the latter look up the map again and not check
for an error.
 1.74.4.2 13-Sep-2019  martin Pull up following revision(s) (requested by maxv in ticket #194):

sys/compat/linux/common/linux_socket.c: revision 1.146
sys/compat/linux/common/linux_socket.c: revision 1.147
sys/compat/linux/common/linux_socket.c: revision 1.148
sys/compat/linux/common/linux_socket.c: revision 1.149
sys/compat/linux/arch/amd64/linux_machdep.c: revision 1.59
sys/compat/linux32/common/linux32_sysinfo.c: revision 1.8
sys/kern/sysv_shm.c: revision 1.138
sys/compat/linux/common/linux_file64.c: revision 1.61
sys/compat/linux/common/linux_file64.c: revision 1.62
sys/compat/netbsd32/netbsd32_compat_43.c: revision 1.58
sys/compat/linux32/common/linux32_dirent.c: revision 1.20
sys/compat/linux32/common/linux32_utsname.c: revision 1.10
sys/compat/linux/common/linux_termios.h: revision 1.22
sys/compat/linux32/common/linux32_termios.c: revision 1.15
sys/compat/linux32/common/linux32_misc.c: revision 1.27
sys/compat/linux32/common/linux32_ioctl.c: revision 1.14
sys/compat/linux/common/linux_statfs.h: revision 1.7
sys/compat/linux/common/linux_ipc.c: revision 1.57
sys/compat/linux/common/linux_fcntl.h: revision 1.18
sys/compat/linux/common/linux_socket.h: revision 1.24
sys/sys/shm.h: revision 1.54
sys/compat/ossaudio/ossaudio.c: revision 1.75
sys/compat/linux32/common/linux32_signal.c: revision 1.20
sys/miscfs/procfs/procfs_linux.c: revision 1.75
sys/compat/linux/common/linux_signal.c: revision 1.81
sys/compat/linux/common/linux_termios.c: revision 1.38
sys/compat/linux/common/linux_misc.c: revision 1.241
sys/compat/linux/common/linux_misc.c: revision 1.242
sys/compat/linux/common/linux_misc.c: revision 1.243
sys/compat/linux/common/linux_misc.c: revision 1.244

Fix info leaks.

Fix stupid bugs in linux_sys_shmctl(): the index could be out of bound
(page fault) and there was no proper locking.
Maybe we should just remove LINUX_SHM_STAT, like compat_linux32.

Remove printf.

When dealing with an unknown value, set -1, to prevent (harmless)
uninitialized accesses later.

Add a default case, don't call sys_ioctl() with an uninitialized 'com'
argument.

Fix error handling, returns an errno, not -1.

Put the printf under DEBUG_LINUX.


Hum, don't forget the 'pid' argument, otherwise we're not gonna go very
far.

Don't read data from userland directly. This simply does not work on any
recent x86 CPU (thanks to SMAP) and all architectures that forbid direct
access to userland from the kernel. But I guess no one noticed because no
one ever uses compat_linux, right?

Hum, don't pass an mbuf to realloc(). Inspired by copyin32_msg_control().

Fix memory leak.

I don't see the point in having this useless printf, but add a '\n' to it,
so that it at least displays useless stuff correctly.

Hum, remove incorrect assignment. Userland could have passed a smaller
namelen, and the uninitialized bytes from sb_data were being used later in
the network stack.
 1.74.4.1 10-Sep-2019  martin Pull up following revision(s) (requested by chs in ticket #190):

sys/miscfs/procfs/procfs_linux.c: revision 1.76

have procfs_do_pid_stat() pass the proc's map to get_proc_size_info(),
rather than having the latter look up the map again and not check
for an error.
 1.80.6.2 25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.80.6.1 20-Apr-2020  bouyer Sync with HEAD
 1.88.2.1 02-Aug-2025  perseant Sync with HEAD
 1.47 27-Sep-2019  christos Instead of casting to size_t, cast to uintmax_t to prevent truncation
(pointed out by chuq). In all these cases uio_offset can't be negative.
 1.46 26-Sep-2019  christos fix sign-compare issues: uio->uio_offset (off_t) is compared with (size_t):
cast the offset to size_t.
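
The difference between the two casts in 1.46 and 1.47 can be shown with a small sketch. Here `int64_t` stands in for `off_t` and `uint32_t` stands in for a 32-bit `size_t`; both function names are hypothetical and exist only to contrast the casts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Narrow cast, as with a 32-bit size_t: the high bits are dropped. */
bool
past_end_narrow(int64_t offset, uint32_t len)
{
	return (uint32_t)offset >= len;
}

/* Wide cast: uintmax_t holds any non-negative 64-bit offset intact. */
bool
past_end_wide(int64_t offset, uint32_t len)
{
	assert(offset >= 0);	/* the log notes the offset can't be negative */
	return (uintmax_t)offset >= len;
}
```

For an offset of 2^32 + 1 and a length of 10, the narrow version sees an offset of 1 and wrongly reports that the read is still inside the buffer, while the wide version reports that it is past the end.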
 1.45 17-Oct-2014  christos branches: 1.45.20;
Maps don't change that frequently between reads, so don't give up and
do what linux does (support reading from an offset).
 1.44 18-Mar-2014  riastradh branches: 1.44.4; 1.44.8;
Merge riastradh-drm2 to HEAD.
 1.43 18-Jul-2013  ryo PR/48048: Add a missing vm_map_unlock_read() and uvmspace_free() to the ENOMEM error case in procfs_domap()
 1.42 06-May-2012  christos branches: 1.42.2; 1.42.4; 1.42.10;
- match format with the linux map printing
- fix PK_32 map printing for linux processes
should fix 32 bit java stack guard setting.
 1.41 16-Oct-2011  hannken branches: 1.41.2; 1.41.6; 1.41.8; 1.41.12; 1.41.14;
VOP_GETATTR() needs a shared lock at least.
 1.40 26-Jul-2011  yamt fix a botch in PRIxVADDR change (rev.1.38)
 1.39 15-Sep-2010  jym Use PRIxVADDR to print vaddr_t elements. Wrap lines.
 1.38 14-Dec-2009  uebayasi branches: 1.38.2; 1.38.4;
gimpy invented PRIxVADDR format specifier.
 1.37 11-Jan-2009  christos merge christos-time_t
 1.36 25-Jul-2008  christos branches: 1.36.2; 1.36.6; 1.36.12;
use bufsize instead of BUFFERSIZE
 1.35 25-Jul-2008  christos Handle files with a large number of mappings gracefully. Reported by Nicolas
Joly.
 1.34 15-Dec-2007  christos branches: 1.34.6; 1.34.10; 1.34.12; 1.34.14; 1.34.16;
use vnode_to_path.
 1.33 26-Nov-2007  pooka branches: 1.33.2; 1.33.6;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.32 21-Jul-2007  pooka branches: 1.32.4; 1.32.6; 1.32.12; 1.32.14;
nuke homegrown getcwd_common() decl
 1.31 01-Apr-2007  christos branches: 1.31.4;
Instead of reading and writing little by little, allocate memory and
write the whole map in one shot so that we don't have to deal with the
map changing under us. Fixes the linux emulated jdk-1.6 where it was
losing the last map entry and could not find the stack on startup.
 1.30 18-Feb-2007  ad branches: 1.30.4; 1.30.6;
procfs_map():

- Drop the target's vm_map lock before calling uiomove(). We could
deadlock if inspecting /proc/curproc/map.
- If the vm_map might have changed, restart the operation, but give
up after 250 retries if the map keeps changing. XXX This is not
ideal.
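
The restart-and-give-up behaviour described in 1.30 follows a common optimistic pattern, sketched below in user space. Everything here is an assumption made for illustration: `struct map_sketch`, its `timestamp` field, the `step` callback that models work done with the lock dropped, and the `EAGAIN` return value are not taken from the commit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAP_MAX_RETRIES 250	/* retry limit named in the log message */

/* Hypothetical stand-in for a map with a modification counter. */
struct map_sketch {
	unsigned int timestamp;	/* bumped on every modification */
	int value;
};

/* Models the work done while the map lock is dropped (e.g. a copy-out). */
typedef void (*unlocked_step_fn)(struct map_sketch *);

void step_quiet(struct map_sketch *m) { (void)m; }
void step_racy(struct map_sketch *m) { m->timestamp++; m->value++; }

/* Snapshot the value; restart if the map changed during the unlocked step. */
int
read_stable(struct map_sketch *m, unlocked_step_fn step, int *out)
{
	assert(m != NULL && step != NULL && out != NULL);

	for (int attempt = 0; attempt <= MAP_MAX_RETRIES; attempt++) {
		unsigned int seen = m->timestamp;
		int snapshot = m->value;

		step(m);			/* lock is dropped here */
		if (m->timestamp == seen) {
			*out = snapshot;	/* nothing changed: accept */
			return 0;
		}
	}
	return EAGAIN;	/* assumed error code; the map kept changing */
}
```

The loop makes one initial attempt plus up to 250 retries, so a map that changes on every pass is abandoned rather than spun on forever.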
 1.29 17-Feb-2007  pavel Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.
 1.28 09-Feb-2007  ad branches: 1.28.2;
Merge newlock2 to head.
 1.27 16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.26 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.25 23-Jul-2006  ad branches: 1.25.4; 1.25.6;
Use the LWP cached credentials where sane.
 1.24 14-May-2006  elad integrate kauth.
 1.23 11-Dec-2005  christos branches: 1.23.4; 1.23.6; 1.23.8; 1.23.10; 1.23.12;
merge ktrace-lwp.
 1.22 30-Aug-2005  xtraeme Remove __P()
 1.21 26-Feb-2005  perry branches: 1.21.4;
nuke trailing whitespace
 1.20 07-Aug-2003  agc branches: 1.20.8; 1.20.10;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.19 29-Jun-2003  fvdl branches: 1.19.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.18 29-Jun-2003  thorpej Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.17 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.16 07-Nov-2002  thorpej Fix signed/unsigned comparison warnings.
 1.15 10-Nov-2001  lukem add RCSIDs
 1.14 06-Nov-2001  simonb Remove some variables that are set but never used.
 1.13 02-Jun-2001  chs branches: 1.13.2; 1.13.6;
replace vm_map{,_entry}_t with struct vm_map{,_entry} *.
 1.12 02-Apr-2001  pk Cast `field-width' arguments to type `int'.
 1.11 29-Mar-2001  fvdl For -o linux mounts, add some code to emulate /proc/#/maps.
Needs NAMECACHE_ENTER_REVERSE to include filenames.
 1.10 17-Jan-2001  fvdl branches: 1.10.2;
Add a few linux-style files, only enabled when -o linux is specified
for the mount. Currently these are /proc/cpuinfo and /proc/meminfo.
The former only does something on i386 right now.
 1.9 24-Nov-2000  chs remove dead code and other misc cleanup.
 1.8 28-Jun-2000  mrg <vm/vm.h> -> <uvm/uvm_extern.h>
 1.7 27-Jun-2000  mrg remove redundant <vm/pmap.h> includes. <vm/pmap.h> -> <uvm/uvm_pmap.h>
 1.6 25-Jun-2000  mrg remove some redundant <vm/vm_xxx.h> includes
 1.5 10-Apr-1999  drochner branches: 1.5.2; 1.5.12;
remove unneeded <vm/vm_object.h>
 1.4 24-Mar-1999  mrg branches: 1.4.4;
completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.
 1.3 03-Feb-1999  msaitoh sprintf->snprintf
 1.2 28-Jan-1999  drochner make it compile with !UVM
 1.1 25-Jan-1999  msaitoh Add /proc/#/map. From FreeBSD.
 1.4.4.1 21-Jun-1999  thorpej Sync w/ -current.
 1.5.12.1 30-Mar-2001  he Pull up revision 1.10 (requested by fvdl):
Add some required Linux emulation bits to support the Linux
version of VMware.
 1.5.2.4 21-Apr-2001  bouyer Sync with HEAD
 1.5.2.3 18-Jan-2001  bouyer Sync with head (for UBC+NFS fixes, mostly).
 1.5.2.2 08-Dec-2000  bouyer Sync with HEAD.
 1.5.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.10.2.8 11-Nov-2002  nathanw Catch up to -current
 1.10.2.7 15-Oct-2002  nathanw Make _validfoo() routines go back to taking a proc.
 1.10.2.6 12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.10.2.5 24-Jun-2002  nathanw Curproc->curlwp renaming.

Change uses of "curproc->l_proc" back to "curproc", which is more like the
original use. Bare uses of "curproc" are now "curlwp".

"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc : NULL)
so that it is always safe to reference curproc (*de*referencing curproc
is another story, but that's always been true).
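
A minimal user-space sketch of the macro described in 1.10.2.5 follows. The two structures are cut-down stand-ins, and the global `curlwp` variable is an assumption made so the example compiles on its own; `current_pid` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures. */
struct proc { int p_pid; };
struct lwp { struct proc *l_proc; };

/* Stand-in for the current LWP pointer; a plain global in this sketch. */
struct lwp *curlwp;

/* Safe to reference even when there is no current LWP. */
#define curproc ((curlwp) ? (curlwp)->l_proc : NULL)

/* Dereferencing is still the caller's responsibility to guard. */
int
current_pid(void)
{
	struct proc *p = curproc;

	assert(p != NULL);
	return p->p_pid;
}
```

Evaluating `curproc` with no current LWP yields NULL instead of faulting, which is the guarantee the commit message describes; dereferencing the result still needs a check.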
 1.10.2.4 14-Nov-2001  nathanw Catch up to -current.
 1.10.2.3 21-Jun-2001  nathanw Catch up to -current.
 1.10.2.2 09-Apr-2001  nathanw Catch up with -current.
 1.10.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.13.6.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.13.2.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.19.2.7 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.19.2.6 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.19.2.5 21-Sep-2004  skrll Fix the sync with head I botched.
 1.19.2.4 18-Sep-2004  skrll Sync with HEAD.
 1.19.2.3 24-Aug-2004  skrll Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.19.2.2 03-Aug-2004  skrll Sync with HEAD
 1.19.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.20.10.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.20.8.1 29-Apr-2005  kent sync with -current
 1.21.4.6 21-Jan-2008  yamt sync with head
 1.21.4.5 07-Dec-2007  yamt sync with head
 1.21.4.4 03-Sep-2007  yamt sync with head.
 1.21.4.3 26-Feb-2007  yamt sync with head.
 1.21.4.2 30-Dec-2006  yamt sync with head.
 1.21.4.1 21-Jun-2006  yamt sync with head.
 1.23.12.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.23.10.1 08-Mar-2006  elad Adapt to kernel authorization KPI.
 1.23.8.2 11-Aug-2006  yamt sync with head
 1.23.8.1 24-May-2006  yamt sync with head.
 1.23.6.1 01-Jun-2006  kardel Sync with head.
 1.23.4.1 09-Sep-2006  rpaulo sync with head
 1.25.6.2 10-Dec-2006  yamt sync with head.
 1.25.6.1 22-Oct-2006  yamt sync with head
 1.25.4.1 17-Nov-2006  ad Checkpoint work in progress.
 1.28.2.2 15-Apr-2007  yamt sync with head.
 1.28.2.1 27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.30.6.1 11-Jul-2007  mjf Sync with head.
 1.30.4.2 20-Aug-2007  ad Sync with HEAD.
 1.30.4.1 10-Apr-2007  ad Sync with head.
 1.31.4.1 15-Aug-2007  skrll Sync with HEAD.
 1.32.14.2 21-Jul-2007  pooka nuke homegrown getcwd_common() decl
 1.32.14.1 21-Jul-2007  pooka file procfs_map.c was added on branch matt-mips64 on 2007-07-21 22:47:37 +0000
 1.32.12.2 27-Dec-2007  mjf Sync with HEAD.
 1.32.12.1 08-Dec-2007  mjf Sync with HEAD.
 1.32.6.1 09-Jan-2008  matt sync with HEAD
 1.32.4.1 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.33.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.33.2.1 26-Dec-2007  ad Sync with head.
 1.34.16.1 19-Oct-2008  haad Sync with HEAD.
 1.34.14.1 28-Jul-2008  simonb Sync with head.
 1.34.12.1 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.34.10.3 09-Oct-2010  yamt sync with head
 1.34.10.2 11-Mar-2010  yamt sync with head
 1.34.10.1 04-May-2009  yamt sync with head.
 1.34.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.34.6.1 28-Sep-2008  mjf Sync with HEAD.
 1.36.12.2 21-Apr-2010  matt sync to netbsd-5
 1.36.12.1 24-Aug-2009  matt Fix some vaddr_t/vaddr_t type droppings.
 1.36.6.2 09-Nov-2008  christos account for major and minor being unsigned long long
 1.36.6.1 25-Jul-2008  christos file procfs_map.c was added on branch christos-time_t on 2008-11-09 02:05:20 +0000
 1.36.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.38.4.1 05-Mar-2011  rmind sync with head
 1.38.2.1 22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.41.14.1 29-Jul-2013  msaitoh Pull up following revision(s) (requested by ryo in ticket #917):
sys/miscfs/procfs/procfs_map.c: revision 1.43
PR/48048: Add a missing vm_map_unlock_read() and uvmspace_free() to the ENOMEM
error case in procfs_domap().
 1.41.12.1 29-Jul-2013  msaitoh Pull up following revision(s) (requested by ryo in ticket #917):
sys/miscfs/procfs/procfs_map.c: revision 1.43
PR/48048: Add a missing vm_map_unlock_read() and uvmspace_free() to the ENOMEM
error case in procfs_domap().
 1.41.8.2 06-Jul-2017  snj Pull up following revision(s) (requested by tsutsui in ticket #1434):
sys/miscfs/procfs/procfs_map.c: revision 1.45
Maps don't change that frequently between reads, so don't give up and
do what linux does (support reading from an offset).
 1.41.8.1 29-Jul-2013  msaitoh Pull up following revision(s) (requested by ryo in ticket #917):
sys/miscfs/procfs/procfs_map.c: revision 1.43
PR/48048: Add a missing vm_map_unlock_read() and uvmspace_free() to the ENOMEM error case in procfs_domap().
 1.41.6.1 02-Jun-2012  mrg sync to latest -current.
 1.41.2.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.41.2.1 23-May-2012  yamt sync with head.
 1.42.10.1 23-Jul-2013  riastradh sync with HEAD
 1.42.4.1 28-Aug-2013  rmind sync with head
 1.42.2.2 03-Dec-2017  jdolecek update from HEAD
 1.42.2.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.44.8.1 13-Mar-2017  skrll Sync with netbsd-7-1-RELEASE
 1.44.4.1 14-Feb-2017  snj Pull up following revision(s) (requested by chs in ticket #1358):
sys/miscfs/procfs/procfs_map.c: revision 1.45
Maps don't change that frequently between reads, so don't give up and
do what linux does (support reading from an offset).
 1.45.20.1 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.37 16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.36 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.35 11-Dec-2005  christos branches: 1.35.20; 1.35.22;
merge ktrace-lwp.
 1.34 07-Aug-2003  agc branches: 1.34.16;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.33 29-Jun-2003  fvdl branches: 1.33.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.32 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.31 09-May-2002  thorpej Move code shared by procfs and the kernel proper out of procfs and
into the kernel proper (renaming functions from procfs_* to process_*).
 1.30 12-Jan-2002  christos When checking for permissions, include the P_INEXEC test and return
EAGAIN if the process is exec'ing.
 1.29 10-Nov-2001  lukem add RCSIDs
 1.28 06-Nov-2001  simonb In procfs_domem() the addr variable is only needed if PMAP_NEED_PROCWR is
defined.
 1.27 24-Nov-2000  chs branches: 1.27.2; 1.27.4; 1.27.8;
remove dead code and other misc cleanup.
 1.26 26-Sep-2000  thorpej PHOLD/PRELE around uvm_io() to user address space is unnecessary. There
is nothing in the U-area that we need.
 1.25 28-Jun-2000  mrg <vm/vm.h> -> <uvm/uvm_extern.h>
 1.24 26-Jun-2000  mrg remove/move more mach vm header files:

<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.
 1.23 25-Mar-1999  sommerfe branches: 1.23.2; 1.23.8; 1.23.18;
Disallow tracing of processes unless tracer's root directory is at or
above tracee's root directory.
 1.22 24-Mar-1999  mrg completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.
 1.21 13-Mar-1999  thorpej Expose procfs_rwmem(). (This function will go away entirely when we
delete Mach VM.)
 1.20 25-Feb-1999  is Machine independent part of fix for PR 6152 (gdb doesn't work on machines
with UVM and separate I&D-Cache). Mostly by Michael Hitch, but pass struct
proc * instead of the pmap. Reason: said machine will need a method to do
the syncing operation for "curproc", too; this way more code can be shared.
 1.19 13-Aug-1998  eeh Merge paddr_t changes into the main branch.
 1.18 10-Feb-1998  mrg branches: 1.18.2;
- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.
 1.17 05-Feb-1998  mrg initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)
 1.16 13-Sep-1997  enami Use the same indentation as the other two places, sys_ptrace() and
procfs_control().

Ok'ed by Jason R. Thorpe.
 1.15 10-Sep-1997  christos PR/4098: Alan Barrett: Fix diagnostic printf formatting.
 1.14 27-Aug-1997  thorpej Fix a reversed argument which caused procfs_checkioperm() to always return
"OK". Add a few comments to avoid further confusion.
 1.13 13-Aug-1997  explorer Move procfs_checkioperm() from procfs_subr.c to procfs_mem.c, since _subr is
not included in a kernel without procfs, and it seems wrong to pull
all of procfs_subr.c in for just that one function. Perhaps this
should go into a new file instead?
 1.12 12-Aug-1997  thorpej Fix the procfs hole described on current-users, similar to a fix for
FreeBSD by Sean Eric Fagan, but a bit different. This makes the checks
in the same places as sef's FreeBSD patch, but does not hardcode the
"kmem" group into the kernel, and also does a check identical to the
(3) and (4) checks in the NetBSD ptrace(2):

(1) it's not owned by you, or is set-id on exec (unless
you're root), or

(2) it's init, which controls the security level of the
entire system, and the system was not compiled with
permanently insecure mode turned on.
 1.11 13-Oct-1996  christos branches: 1.11.10;
backout previous kprintf changes
 1.10 10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.9 11-Jun-1996  mycroft Add a missing PHOLD()/PRELE() pair.
 1.8 09-Feb-1996  christos branches: 1.8.4;
miscfs prototype changes
 1.7 05-Jan-1995  chopps initialize variable as pointed out by David Jones <dej@qpoint.torfree.net>
this should fix pr #699
 1.6 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.5 15-Jun-1994  mycroft Minor update from JSP after merging my changes.
 1.4 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.3 17-Mar-1994  briggs PG_COW -> PG_COPYONWRITE to match earlier changes in vm_page.h.
 1.2 05-Jan-1994  cgd make it compile (cleanly) for us
 1.1 05-Jan-1994  cgd branches: 1.1.1;
add new procfs code, from Jan-Simon Pendry, jsp@sequent.com.
This is pretty-much "virgin", so that diffs can be done later.
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.8.4.1 10-Dec-1996  mycroft From trunk:
Add a missing PHOLD()/PRELE() pair.
 1.11.10.3 16-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.11.10.2 28-Aug-1997  thorpej Update marc-pcmcia branch from trunk.
 1.11.10.1 23-Aug-1997  thorpej Update marc-pcmcia branch from trunk.
 1.18.2.1 30-Jul-1998  eeh Split vm_offset_t and vm_size_t into paddr_t, psize_t, vaddr_t, and vsize_t.
 1.23.18.1 14-Jan-2002  he Pull up revision 1.30 (requested by christos):
Fix a ptrace/execve race condition which could be used to modify
the child process' image during execve. This would be a security
issue due to setuid programs.
 1.23.8.2 08-Dec-2000  bouyer Sync with HEAD.
 1.23.8.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.23.2.1 14-Jan-2002  he Pull up revision 1.30 (requested by he):
Fix a ptrace/execve race condition which could be used to modify
the child process' image during execve. This would be a security
issue due to setuid programs.
 1.27.8.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.27.4.3 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.27.4.2 11-Feb-2002  jdolecek Sync w/ -current.
 1.27.4.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.27.2.7 12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.27.2.6 20-Jun-2002  nathanw Catch up to -current.
 1.27.2.5 01-Apr-2002  nathanw Missed l => p conversion in previous.
 1.27.2.4 01-Apr-2002  nathanw procfs_domem() should take proc *, proc *; not proc *, lwp *.
 1.27.2.3 28-Feb-2002  nathanw Catch up to -current.
 1.27.2.2 14-Nov-2001  nathanw Catch up to -current.
 1.27.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.33.2.5 24-Feb-2005  skrll Reduce diff to HEAD
 1.33.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.33.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.33.2.2 03-Aug-2004  skrll Sync with HEAD
 1.33.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.34.16.2 30-Dec-2006  yamt sync with head.
 1.34.16.1 21-Jun-2006  yamt sync with head.
 1.35.22.2 10-Dec-2006  yamt sync with head.
 1.35.22.1 22-Oct-2006  yamt sync with head
 1.35.20.1 18-Nov-2006  ad Sync with head.
 1.15 16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.14 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.13 11-Dec-2005  christos branches: 1.13.20; 1.13.22;
merge ktrace-lwp.
 1.12 07-Aug-2003  agc branches: 1.12.16;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.11 29-Jun-2003  fvdl branches: 1.11.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.10 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.9 10-Nov-2001  lukem add RCSIDs
 1.8 29-Jun-1994  cgd branches: 1.8.46; 1.8.48; 1.8.52;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.7 15-Jun-1994  mycroft Minor update from JSP after merging my changes.
 1.6 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.5 05-May-1994  cgd lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.
 1.4 04-May-1994  cgd kill obvious bug; glad to know this was tested!
 1.3 04-May-1994  cgd Rename a lot of process flags.
 1.2 20-Jan-1994  ws Make procfs really work for debugging.
Implement note & notepg files in procfs.
 1.1 05-Jan-1994  cgd branches: 1.1.1;
add new procfs code, from Jan-Simon Pendry, jsp@sequent.com.
This is pretty-much "virgin", so that diffs can be done later.
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.8.52.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.8.48.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.8.46.1 14-Nov-2001  nathanw Catch up to -current.
 1.11.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.11.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.11.2.2 03-Aug-2004  skrll Sync with HEAD
 1.11.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.12.16.2 30-Dec-2006  yamt sync with head.
 1.12.16.1 21-Jun-2006  yamt sync with head.
 1.13.22.2 10-Dec-2006  yamt sync with head.
 1.13.22.1 22-Oct-2006  yamt sync with head
 1.13.20.1 18-Nov-2006  ad Sync with head.
 1.23 16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.22 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.21 11-Dec-2005  christos branches: 1.21.20; 1.21.22;
merge ktrace-lwp.
 1.20 07-Aug-2003  agc branches: 1.20.16;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.19 29-Jun-2003  fvdl branches: 1.19.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.18 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.17 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.16 09-May-2002  thorpej Move code shared by procfs and the kernel proper out of procfs and
into the kernel proper (renaming functions from procfs_* to process_*).
 1.15 12-Jan-2002  christos Don't hide the real return code with EPERM.
 1.14 05-Dec-2001  thorpej * Allow machine-dependent code to specify hooks for ptrace(2)
(__HAVE_PTRACE_MACHDEP) and procfs (__HAVE_PROCFS_MACHDEP).
These changes will allow platforms like x86 (XMM) and PowerPC
(AltiVec) to export extended register sets in a sane manner.

* Use __HAVE_PTRACE_MACHDEP to export x86 XMM registers (standard
FP + SSE/SSE2) using PT_{GET,SET}XMMREGS (in the machdep
ptrace request space).
* Use __HAVE_PROCFS_MACHDEP to export x86 XMM registers via
/proc/N/xmmregs in procfs.
 1.13 10-Nov-2001  lukem add RCSIDs
 1.12 17-Jan-2001  fvdl branches: 1.12.2; 1.12.4; 1.12.8;
Add a few linux-style files, only enabled when -o linux is specified
for the mount. Currently these are /proc/cpuinfo and /proc/meminfo.
The former only does something on i386 right now.
 1.11 27-Aug-1997  thorpej branches: 1.11.12; 1.11.18; 1.11.28;
Fix a reversed argument which caused procfs_checkioperm() to always return
"OK". Add a few comments to avoid further confusion.
 1.10 12-Aug-1997  thorpej Fix the procfs hole described on current-users, similar to a fix for
FreeBSD by Sean Eric Fagan, but a bit different. This makes the checks
in the same places as sef's FreeBSD patch, but does not hardcode the
"kmem" group into the kernel, and also does a check identical to the
(3) and (4) checks in the NetBSD ptrace(2):

(1) it's not owned by you, or is set-id on exec (unless
you're root), or

(2) it's init, which controls the security level of the
entire system, and the system was not compiled with
permanently insecure mode turned on.
 1.9 13-Aug-1995  mycroft branches: 1.9.14;
Lock the process in core before operating on it.
 1.8 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.7 15-Jun-1994  mycroft Minor update from JSP after merging my changes.
 1.6 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.5 04-May-1994  cgd Rename a lot of process flags.
 1.4 12-Apr-1994  cgd be a bit smarter about determining if files shouldn't be seen by the user.
Also, DON'T allow a lookup to succeed on a file that's not visible!
 1.3 28-Jan-1994  cgd make a fpregs file.
 1.2 08-Jan-1994  cgd reorganization of ptrace/procfs code
 1.1 05-Jan-1994  cgd branches: 1.1.1;
add new procfs code, from Jan-Simon Pendry, jsp@sequent.com.
This is pretty-much "virgin", so that diffs can be done later.
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.9.14.2 28-Aug-1997  thorpej Update marc-pcmcia branch from trunk.
 1.9.14.1 23-Aug-1997  thorpej Update marc-pcmcia branch from trunk.
 1.11.28.2 14-Jan-2002  he Pull up revision 1.15 (requested by christos):
Fix a ptrace/execve race condition which could be used to modify
the child process' image during execve. This would be a security
issue due to setuid programs.
 1.11.28.1 30-Mar-2001  he Pull up revision 1.12 (requested by fvdl):
Add some required Linux emulation bits to support the Linux
version of VMware.
 1.11.18.1 18-Jan-2001  bouyer Sync with head (for UBC+NFS fixes, mostly).
 1.11.12.1 14-Jan-2002  he Pull up revision 1.15 (requested by he):
Fix a ptrace/execve race condition which could be used to modify
the child process' image during execve. This would be a security
issue due to setuid programs.
 1.12.8.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.12.4.3 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.12.4.2 11-Feb-2002  jdolecek Sync w/ -current.
 1.12.4.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.12.2.7 15-Oct-2002  nathanw Make _validfoo() routines go back to taking a proc.
 1.12.2.6 12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.12.2.5 20-Jun-2002  nathanw Catch up to -current.
 1.12.2.4 28-Feb-2002  nathanw Catch up to -current.
 1.12.2.3 08-Jan-2002  nathanw Catch up to -current.
 1.12.2.2 14-Nov-2001  nathanw Catch up to -current.
 1.12.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.19.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.19.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.19.2.2 03-Aug-2004  skrll Sync with HEAD
 1.19.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.20.16.2 30-Dec-2006  yamt sync with head.
 1.20.16.1 21-Jun-2006  yamt sync with head.
 1.21.22.2 10-Dec-2006  yamt sync with head.
 1.21.22.1 22-Oct-2006  yamt sync with head
 1.21.20.1 18-Nov-2006  ad Sync with head.
 1.40 23-May-2020  ad Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
 1.39 29-Sep-2017  kre Use %ju and (intmax_t) to unbreak i386 build.
 1.38 29-Sep-2017  christos Split the status printing routines (one for NetBSD and one for Linux) for
simplicity (Robert Swindells)
 1.37 14-Nov-2016  kre Return the "true" parent's pid as the parent pid (ppid) via the
various sysctl/procfs interfaces that allow it to be interrogated.
(This is rather than the temporary parent's pid when a process is
being traced and has been reparented.)

XXX The ppid in elf32 core files has not been similarly adjusted,
XXX Should it be ?
 1.36 21-Oct-2009  rmind branches: 1.36.22; 1.36.40; 1.36.44;
Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.
 1.35 11-Jan-2009  christos merge christos-time_t
 1.34 24-Apr-2008  ad branches: 1.34.2; 1.34.10;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
 1.33 24-Apr-2008  ad Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
 1.32 09-Mar-2007  ad branches: 1.32.36; 1.32.38; 1.32.40;
- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.
 1.31 17-Feb-2007  pavel Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.
 1.30 09-Feb-2007  ad branches: 1.30.2;
Merge newlock2 to head.
 1.29 16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.28 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.27 14-May-2006  elad branches: 1.27.8; 1.27.10;
integrate kauth.
 1.26 11-Dec-2005  christos branches: 1.26.4; 1.26.6; 1.26.8; 1.26.10; 1.26.12;
merge ktrace-lwp.
 1.25 29-May-2005  christos branches: 1.25.2;
- sprinkle const
- avoid shadowed variables.
 1.24 26-Feb-2005  perry nuke trailing whitespace
 1.23 22-Apr-2004  itojun branches: 1.23.4; 1.23.6;
sprintf -> snprintf
 1.22 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.21 29-Jun-2003  fvdl branches: 1.21.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.20 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.19 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.18 07-Nov-2002  thorpej Fix signed/unsigned comparison warnings.
 1.17 10-Nov-2001  lukem add RCSIDs
 1.16 30-Dec-2000  david branches: 1.16.2; 1.16.4; 1.16.6; 1.16.8;
Increase psbuf size as in FreeBSD patch. We don't have jail(8), so the
recent bugtraq exploit doesn't apply, but it could be exploitable in
other ways.
 1.15 09-Aug-1998  perry branches: 1.15.12;
bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.14 14-Feb-1998  thorpej Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.
 1.13 13-Oct-1996  christos backout previous kprintf changes
 1.12 10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.11 16-Mar-1996  christos Fix printf format follies.
 1.10 01-Jun-1995  jtc Moved egid credential from cr_groups[0] to new field cr_gid. POSIX.1
requires that sgid executables and the setuid() syscall *not* change
the supplemental group list.
 1.9 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.8 15-Jun-1994  mycroft Minor update from JSP after merging my changes.
 1.7 15-Jun-1994  mycroft Fix a bug pointed out by JSP.
 1.6 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.5 05-May-1994  cgd lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.
 1.4 04-May-1994  cgd Rename a lot of process flags.
 1.3 10-Jan-1994  ws Fix sign extension bug
 1.2 09-Jan-1994  ws Bug fixes and enhancements:
Make NFS serving work (BUT DON'T USE "attach" TO /proc/*/ctl FOR NOW!!!)
Make `curproc' a symbolic link
Add `.' and `..' entries to the directories.
Return better guesses on the size of the files.
 1.1 05-Jan-1994  cgd branches: 1.1.1;
add new procfs code, from Jan-Simon Pendry, jsp@sequent.com.
This is pretty-much "virgin", so that diffs can be done later.
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.15.12.1 05-Jan-2001  bouyer Sync with HEAD
 1.16.8.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.16.6.2 13-Oct-2001  fvdl Revert the t_dev -> t_devvp change in struct tty. The way that tty
structs are currently used (especially by console ttys) aren't
ready for it, and this will require quite a few changes.
 1.16.6.1 07-Sep-2001  thorpej Commit my "devvp" changes to the thorpej-devvp branch. This
replaces the use of dev_t in most places with a struct vnode *.

This will form the basic infrastructure for real cloning device
support (besides being architecturally cleaner -- it'll be good
to get away from using numbers to represent objects).
 1.16.4.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.16.2.3 11-Nov-2002  nathanw Catch up to -current
 1.16.2.2 14-Nov-2001  nathanw Catch up to -current.
 1.16.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.21.2.6 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.21.2.5 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.21.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.21.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.21.2.2 03-Aug-2004  skrll Sync with HEAD
 1.21.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.23.6.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.23.4.1 29-Apr-2005  kent sync with -current
 1.25.2.4 03-Sep-2007  yamt sync with head.
 1.25.2.3 26-Feb-2007  yamt sync with head.
 1.25.2.2 30-Dec-2006  yamt sync with head.
 1.25.2.1 21-Jun-2006  yamt sync with head.
 1.26.12.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.26.10.2 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.26.10.1 08-Mar-2006  elad Adapt to kernel authorization KPI.
 1.26.8.1 24-May-2006  yamt sync with head.
 1.26.6.1 01-Jun-2006  kardel Sync with head.
 1.26.4.1 09-Sep-2006  rpaulo sync with head
 1.27.10.2 10-Dec-2006  yamt sync with head.
 1.27.10.1 22-Oct-2006  yamt sync with head
 1.27.8.3 18-Nov-2006  ad Sync with head.
 1.27.8.2 17-Nov-2006  ad Checkpoint work in progress.
 1.27.8.1 21-Oct-2006  ad - Make this compile. XXX Needs more work on locking.
- Do FILE_UNUSE() as the current LWP, otherwise we will wipe out the
target's advisory locks. XXX Double check.
 1.30.2.2 12-Mar-2007  rmind Sync with HEAD.
 1.30.2.1 27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.32.40.1 18-May-2008  yamt sync with head.
 1.32.38.3 09-Nov-2008  christos account for major and minor being unsigned long long
 1.32.38.2 01-Nov-2008  christos Sync with head.
 1.32.38.1 29-Mar-2008  christos Welcome to the time_t=long long dev_t=uint64_t branch.
 1.32.36.2 17-Jan-2009  mjf Sync with HEAD.
 1.32.36.1 02-Jun-2008  mjf Sync with HEAD.
 1.34.10.1 19-Jan-2009  skrll Sync with HEAD.
 1.34.2.2 11-Mar-2010  yamt sync with head
 1.34.2.1 04-May-2009  yamt sync with head.
 1.36.44.1 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.36.40.1 05-Dec-2016  skrll Sync with HEAD
 1.36.22.1 03-Dec-2017  jdolecek update from HEAD
 1.120 01-Jul-2024  christos Add linux POSIX message queue support (Ricardo Branco)
 1.119 12-May-2024  christos branches: 1.119.2;
PR/58227: Ricardo Branco: Add support for proc/sysvipc in Linux emulator
 1.118 12-May-2024  christos PR/58240: Ricardo Branco: Add support for proc/self/limits as used by Linux
 1.117 17-Jan-2024  hannken Using the exechook to revoke procfs nodes is racy and may deadlock:

one thread runs doexechooks() -> procfs_revoke_vnodes() and wants to suspend
the file system for vgone(), while another thread runs a forced unmount,
has the file system suspended, tries to disestablish the exechook and
waits for doexechooks() to complete.

Establish/disestablish the exechook on module load/unload instead of
mount/unmount and use the hashmap to access all procfs nodes for this pid.

May fix PR kern/57775 ""panic: unmount: dangling vnode" while umounting procfs"
 1.116 23-May-2020  ad branches: 1.116.20;
Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
 1.115 29-Apr-2020  thorpej If the procfs mount is marked as linux-compat, then allow proc lookup
by any LWP ID in the proc, not just the canonical PID.
 1.114 26-Sep-2019  christos fix sign-compare issues: uio->uio_offset (off_t) is compared with (size_t):
cast the offset to size_t.
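
A minimal sketch of the comparison pattern described in 1.114, assuming a
hypothetical helper named bytes_available() that is not part of the NetBSD
tree. It rejects a negative off_t first (the early check mentioned in
procfs_subr.c 1.59) and only then casts to size_t, so both operands of the
comparison are unsigned:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper (illustration only): number of bytes that can be
 * read from a buffer of `len` bytes when starting at `offset`.
 */
static size_t
bytes_available(off_t offset, size_t len)
{
	/* Reject negative offsets before casting; the cast would wrap them. */
	if (offset < 0)
		return 0;
	/* Both operands are unsigned now, so there is no sign-compare issue. */
	if ((size_t)offset >= len)
		return 0;
	/* Invariant established by the two checks above. */
	assert((size_t)offset < len);
	return len - (size_t)offset;
}
```

The order matters: casting before the negative check would turn -1 into a
very large size_t and hide the error.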
 1.113 30-Mar-2019  christos add a node for the process resource limits.
 1.112 16-Apr-2018  hannken branches: 1.112.2;
Change procfs_revoke_vnodes() to use vrecycle()/vgone() instead
of VOP_REVOKE().

Gets rid of a bunch of suspensions on /proc as vrecycle() will
succeed most of the time and we suspend at most once per call.
 1.111 31-Dec-2017  christos branches: 1.111.2;
rename some "cmdline" stuff now that it is used to print environment too
 1.110 31-Dec-2017  christos Add an environ node
 1.109 28-Aug-2017  kamil Remove the filesystem tracing feature

This is a legacy interface from 4.4BSD, and it was
introduced to overcome shortcomings of ptrace(2) at that time, which are
no longer relevant (performance). Today /proc/#/ctl offers a narrow
subset of ptrace(2) commands and is not applicable for modern
applications use beyond simplistic tracing scenarios.

This removal will simplify kernel internals. Users will still be able to
use all the other /proc files.

This change affects neither the other procfs files nor the Linux compat
features within mount_procfs(8). /proc/#/ctl isn't available on Linux.

Remove:
- /proc/#/ctl from mount_procfs(8)
- P_FSTRACE note from the documentation of ps(1)
- /proc/#/ctl and filesystem tracing documentation from mount_procfs(8)
- KAUTH_REQ_PROCESS_PROCFS_CTL documentation from kauth(9)
- source code file miscfs/procfs/procfs_ctl.c
- PFSctl and procfs_doctl() from sys/miscfs/procfs/procfs.h
- KAUTH_REQ_PROCESS_PROCFS_CTL from sys/sys/kauth.h
- PSL_FSTRACE (0x00010000) from sys/sys/proc.h
- P_FSTRACE (0x00010000) from sys/sys/sysctl.h

Reduce code complexity after removal of this functionality.

Update TODO.ptrace accordingly: remove two entries about /proc tracing.

Do not keep legacy notes as comments in the headers about removed
PSL_FSTRACE / P_FSTRACE, as this interface had very few users
(close or equal to zero).

Proposed on tech-kern@.

All filesystem tracing utility users are encouraged to switch to ptrace(2).

Sponsored by <The NetBSD Foundation>
 1.108 01-Apr-2017  riastradh branches: 1.108.6;
KASSERT(mutex_owned(vp->v_interlock)) in vnode iterator selector.
 1.107 30-Mar-2017  christos add an auxv node.
 1.106 10-Nov-2014  maxv branches: 1.106.2; 1.106.4; 1.106.6;
Do not uselessly include <sys/malloc.h>.
 1.105 27-Jul-2014  hannken branches: 1.105.2;
Change procfs from hashlist to vcache.
- Key is (type, pid, fd)
- Remove argument "p" from procfs_allocvp(). It is only used
when "type == PFSfd". Lookup the proc with proc_find() when
procfs_loadvnode() needs it.
- Use a vfs_vnode_iterator for procfs_revoke_vnodes().
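
The (type, pid, fd) key from 1.105 can be pictured roughly as follows. This
is a sketch under assumptions, not the procfs code: the names node_type,
node_key, node_key_init() and node_key_equal() are hypothetical, and the
convention of storing -1 in the fd field for non-descriptor nodes is assumed
for illustration. The point it shows is that a key compared bytewise must be
zeroed before its fields are set, so that padding never influences a lookup:

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical node types; the real enum lives in procfs.h. */
enum node_type { NODE_PROC, NODE_FD, NODE_STATUS };

/* Hypothetical key mirroring the (type, pid, fd) triple. */
struct node_key {
	enum node_type nk_type;
	pid_t nk_pid;
	int nk_fd;	/* assumed: -1 when the node is not a descriptor */
};

static void
node_key_init(struct node_key *key, enum node_type type, pid_t pid, int fd)
{
	assert(key != NULL);
	/* Zero first so padding bytes never affect a bytewise compare. */
	memset(key, 0, sizeof(*key));
	key->nk_type = type;
	key->nk_pid = pid;
	key->nk_fd = fd;
}

/* Two keys name the same node only if all three fields match. */
static int
node_key_equal(const struct node_key *a, const struct node_key *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```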
 1.104 07-Feb-2014  hannken branches: 1.104.2;
Change vnode operation lookup to return the resulting vnode *vpp unlocked.
Change cache_lookup() to return an unlocked vnode.

Discussed on tech-kern@

Welcome to 6.99.31
 1.103 29-Oct-2013  hannken Vnode API cleanup pass 1.

- Make these defines and functions private to vfs_vnode.c:

VC_MASK, VC_LOCK, DOCLOSE, VI_INACTREDO and VI_INACTNOW
vclean() and vrelel()

- Remove the long time unused lwp argument from vrecycle().

- Remove vtryget(), it is responsible for ugly hacks and doesn't
look that effective.

Presented on tech-kern.

Welcome to 6.99.25
 1.102 25-Nov-2012  christos branches: 1.102.2;
do something reasonable with kernel semaphores.
 1.101 28-May-2012  christos branches: 1.101.2;
add a task process subdirectory for emul linux
 1.100 04-Sep-2011  jmcneill branches: 1.100.2; 1.100.6;
PR# kern/45021: Please support /emul/linux/proc/version

Add /proc/version for procfs with -o linux. The version reported depends
on the emulation type of the calling process:

$ cat /proc/version
NetBSD version 5.99.55 (netbsd@localhost) (gcc version 4.1.3 20080704 prerelease (NetBSD nb2 20081120)) NetBSD 5.99.55 (GENERIC) #39: Sun Sep 4 09:10:05 EDT 2011

$ /emul/linux/bin/cat /proc/version
Linux version 2.6.18 (linux@localhost) (gcc version 4.1.3 20080704 prerelease (NetBSD nb2 20081120)) #0 Wed Mar 3 03:03:03 PST 2010

$ /emul/linux32/bin/cat /proc/version
Linux version 2.6.18 (linux32@localhost) (gcc version 4.1.3 20080704 prerelease (NetBSD nb2 20081120)) #0 Wed Mar 3 03:03:03 PST 2010
 1.99 12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.98 21-Jul-2010  hannken branches: 1.98.6;
Make holding v_interlock mandatory for callers of vget().

Announced some time ago on tech-kern.
 1.97 01-Jul-2010  hannken Remove vlockmgr(). Generic vnode lock operations now use a rwlock located
in the vnode. All LK_* flags move from sys/lock.h to sys/vnode.h. Calls
to vlockmgr() in file systems get replaced with VOP_LOCK() or VOP_UNLOCK().

Welcome to 5.99.34.

Discussed on tech-kern.
 1.96 01-Jul-2010  rmind Remove pfind() and pgfind(), fix locking in various broken uses of these.
Rename real routines to proc_find() and pgrp_find(), remove PFIND_* flags
and have consistent behaviour. Provide proc_find_raw() for special cases.
Fix memory leak in sysctl_proc_corename().

COMPAT_LINUX: rework ptrace() locking, minimise differences between
different versions per-arch.

Note: while this change adds some formal cosmetics for COMPAT_DARWIN and
COMPAT_IRIX - locking there is utterly broken (for ages).

Fixes PR/43176.
 1.95 15-Mar-2009  cegger branches: 1.95.2; 1.95.4;
ansify function definitions
 1.94 14-Mar-2009  dsl Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
 1.93 17-Dec-2008  cegger branches: 1.93.2;
kill MALLOC and FREE macros.
 1.92 05-Sep-2008  skrll branches: 1.92.2;
PR/39324 kernel diagnostic assertion "l->l_stat != LSZOMB" failed.

Ignore procs with zero or all LSZOMB LWPs. Get a non-LSZOMB LWP to perform
operations against as part of the deal.

procfs really needs to be updated to support multi-threading fully.
Hi Antti!
 1.91 02-Jul-2008  rmind branches: 1.91.2;
Remove proc_representative_lwp(), use a simple LIST_FIRST() instead.
OK by <ad>.
 1.90 05-May-2008  ad branches: 1.90.2; 1.90.4;
- Convert hashinit() to use kmem_alloc(). The hash tables can be large
and it's better to not have them in kmem_map.
- Convert a couple of minor items along the way to kmem_alloc().
- Fix some memory leaks.
 1.89 28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.88 24-Apr-2008  ad branches: 1.88.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
 1.87 24-Apr-2008  ad Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
 1.86 21-Mar-2008  ad branches: 1.86.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
 1.85 30-Jan-2008  ad branches: 1.85.6;
Replace struct lock on vnodes with a simpler lock object built on
krwlock_t. This is a step towards removing lockmgr and simplifying
vnode locking. Discussed on tech-kern.
 1.84 23-Jan-2008  elad Tons of process scope changes.

- Add a KAUTH_PROCESS_SCHEDULER action, to handle scheduler related
requests, and add specific requests for set/get scheduler policy and
set/get scheduler parameters.

- Add a KAUTH_PROCESS_KEVENT_FILTER action, to handle kevent(2) related
requests.

- Add a KAUTH_DEVICE_TTY_STI action to handle requests to TIOCSTI.

- Add requests for the KAUTH_PROCESS_CANSEE action, indicating what
process information is being looked at (entry itself, args, env,
open files).

- Add requests for the KAUTH_PROCESS_RLIMIT action indicating set/get.

- Add requests for the KAUTH_PROCESS_CORENAME action indicating set/get.

- Make bsd44 secmodel code handle the newly added requests appropriately.

All of the above make it possible to issue finer-grained kauth(9) calls in
many places, removing some KAUTH_GENERIC_ISSUSER requests.

- Remove the "CAN" from KAUTH_PROCESS_CAN{KTRACE,PROCFS,PTRACE,SIGNAL}.

Discussed with christos@ and yamt@.
 1.83 02-Jan-2008  ad Merge vmlocking2 to head.
 1.82 07-Nov-2007  ad branches: 1.82.2; 1.82.6;
Merge from vmlocking:

- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
 1.81 10-Oct-2007  ad branches: 1.81.2; 1.81.4;
Merge from vmlocking:

- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
 1.80 24-May-2007  agc branches: 1.80.6; 1.80.8; 1.80.10;
Extend the Linux emulation of /proc to include

/proc/stat
/proc/loadavg and
/proc/<pid>/statm.

These are only present when -o linux is specified as a mount option
to procfs.

Factor out some common code so that it can be used by a number of
functions.

XXX The values returned in the statm emulation need to be verified.
 1.79 09-Mar-2007  ad branches: 1.79.2; 1.79.4;
- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.
 1.78 27-Feb-2007  ad Destroy the hash locks on final unmount.
 1.77 17-Feb-2007  pavel Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.
 1.76 15-Feb-2007  ad branches: 1.76.2;
Replace some uses of lockmgr() / simplelocks.
 1.75 09-Feb-2007  ad Merge newlock2 to head.
 1.74 24-Dec-2006  christos fix permissions on /proc/<pid> node. From elad.
 1.73 28-Nov-2006  elad Move ktrace, ptrace, systrace, and procfs to use kauth(9).

First, remove process_checkioperm() calls from MD code. Similar checks
using kauth(9) routines (on the process scope, using appropriate action)
are done in the callers.

Add secmodel back-end to handle each subsystem.
 1.72 16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.71 29-Oct-2006  christos add an "emul" file node.
 1.70 25-Oct-2006  christos 1. fix procfs_validfile{,_linux} to test for NULL pointers properly.
2. make "exe" entry be a symlink to the executable, instead of pointing
directly to the vnode of the executable.
3. factor out commonly used code.
 1.69 20-Sep-2006  manu Emulate Linux's /proc/devices
 1.68 01-Mar-2006  yamt branches: 1.68.14; 1.68.16;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.
 1.67 11-Dec-2005  christos branches: 1.67.2; 1.67.4; 1.67.6;
merge ktrace-lwp.
 1.66 01-Oct-2005  atatat Add "cwd" and "root" symlinks to each process's directory. The cwd
link points to the process's current working directory, and the root
link points to the process's root directory. What else would you
expect?

For directories that are out of reach (caller is in a chroot, target
process is in a different chroot, etc), the links point to "/"
instead.
 1.65 30-Aug-2005  xtraeme Remove __P()
 1.64 29-May-2005  christos branches: 1.64.2;
- sprinkle const
- avoid shadowed variables.
 1.63 26-Feb-2005  perry nuke trailing whitespace
 1.62 20-Sep-2004  jdolecek branches: 1.62.4; 1.62.6;
add 'mounts' file for -o linux, which lists all currently mounted
filesystems; Linux glibc statvfs() uses this to get some of mount flags,
and this file is also useful as /emul/linux/etc/mtab (via symlink)
 1.61 27-Aug-2004  skrll Do previous slightly differently - just pass a struct lwp * and derive the
struct proc *.

OK'd by Jaromir.
 1.60 21-Aug-2004  jdolecek fix process used for /proc/<pid>/stat contents - it should be process
<pid>, not the current process looking at the information
 1.59 14-May-2004  christos Simplify the code by:
1. Checking for a negative uio_offset at the beginning. This really does
not affect us in most cases because we check that later too.
2. Checking for attempts to write to init sooner and in all cases.
 1.58 27-Sep-2003  darcy branches: 1.58.2; 1.58.4;
Changes as discussed with itojun on tech-kern. I have modified the enums
to have KFS or PFS differentiators. Further I have wrapped the enum in
procfs in "#ifdef _KERNEL" as it is done in kernfs.

To see the discussion go to http://mail-index.NetBSD.org/tech-kern/2003/09/
and look for "Mismatched enums in include files" in the list.
 1.57 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.56 29-Jun-2003  fvdl branches: 1.56.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.55 29-Jun-2003  thorpej Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.54 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.53 28-May-2003  christos Add /proc/<pid>/stat for linux compat. j2sdk1.4.2 depends on it.
 1.52 18-Apr-2003  christos Make the mode of /proc/<pid>/fd dr-x------
 1.51 18-Apr-2003  christos Make symlinks for directories that point to the actual directory.
Make symlinks to [kqueue] and [misc] for kqueue and misc fds.
 1.50 17-Apr-2003  jdolecek do not show nodes corresponding to directory descriptors for process
in fd/ subdirectory, nor allow lookup/open for the nodes
this fixes PR kern/21187 for good, and also avoids interesting directory
locking issues
 1.49 17-Apr-2003  jdolecek use fd_getfile() in procfs_getfp(), and FILE_USE()/FILE_UNUSE() the
returned file descriptor pointer appropriately
 1.48 15-Mar-2003  enami Release the hash lock on failure.
 1.47 04-Mar-2003  tron Teach procfs_allocvp() about Puptime to avoid panics if "/proc/uptime"
is opened.
 1.46 25-Feb-2003  jrf This addresses PR kern/19989. Thanks to hamajima@nagoya.ydc.co.jp for submitting this patch which enables /proc/uptime for linux emul. Patch reviewed by atatat@netbsd.org and tron@netbsd.org, approved by tron@netbsd.org.
 1.45 03-Feb-2003  jdolecek don't bother special-casing DTYPE_KQUEUE/DTYPE_MISC nor panic for unknown
descriptors; just return with EOPNOTSUPP for any unsupported descriptor type
 1.44 03-Feb-2003  jdolecek procfs_allocvp():
* do not set *vpp unless successful, otherwise we'd trigger
DIAGNOSTIC panic in lookup(9) on error return
* on error, make sure to free malloc'ed memory and ungetnewvnode() the
previously acquired vnode

this fixes panic on 'tail -f <file> &; ls -l /proc/$!/fd' reported by
Andrew Brown

fix reviewed by Christos Zoulas
 1.43 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.42 03-Jan-2003  christos Implement /proc/<pid>/fd/<n>. This is work in progress. Questionable things:
- Is it ok to convert DTYPE_PIPE to VFIFO and DTYPE_SOCKET to VSOCK?
- XXX: Avoid locking issue in ls -Rl /proc by avoiding curproc
- Does I/O to pipes work?
- XXX: Are there security implications?
 1.41 07-Nov-2002  thorpej Fix a signed/unsigned comparison warning.
 1.40 05-Dec-2001  thorpej * Allow machine-dependent code to specify hooks for ptrace(2)
(__HAVE_PTRACE_MACHDEP) and procfs (__HAVE_PROCFS_MACHDEP).
These changes will allow platforms like x86 (XMM) and PowerPC
(AltiVec) to export extended register sets in a sane manner.

* Use __HAVE_PTRACE_MACHDEP to export x86 XMM registers (standard
FP + SSE/SSE2) using PT_{GET,SET}XMMREGS (in the machdep
ptrace request space).
* Use __HAVE_PROCFS_MACHDEP to export x86 XMM registers via
/proc/N/xmmregs in procfs.
 1.39 10-Nov-2001  lukem add RCSIDs
 1.38 15-Sep-2001  chs branches: 1.38.2;
add a new VFS op, vfs_reinit, which is called when desiredvnodes is
adjusted via sysctl. file systems that have hash tables which are
sized based on the value of this variable now resize those hash tables
using the new value. the max number of FFS softdeps is also recalculated.

convert various file systems to use the <sys/queue.h> macros for
their hash tables.
 1.37 29-Mar-2001  fvdl branches: 1.37.2; 1.37.4;
For -o linux mounts, add some code to emulate /proc/#/maps.
Needs NAMECACHE_ENTER_REVERSE to include filenames.
 1.36 18-Jan-2001  jdolecek branches: 1.36.2;
constify
 1.35 17-Jan-2001  fvdl Add a few linux-style files, only enabled when -o linux is specified
for the mount. Currently these are /proc/cpuinfo and /proc/meminfo.
The former only does something on i386 right now.
 1.34 27-Nov-2000  chs Initial integration of the Unified Buffer Cache project.
 1.33 24-Nov-2000  chs remove dead code and other misc cleanup.
 1.32 08-Nov-2000  ad Update for hashinit() change.
 1.31 16-Mar-2000  jdolecek branches: 1.31.4;
Add new VFS op routine - vfs_done and call it on filesystem detach
in vfs_detach(). vfs_done may free global filesystem's resources,
typically those allocated in respective filesystem's init function.
Needed so those filesystems which went in via LKM have a chance to
clean after themselves before unloading. This fixes random panics
when LKM for filesystem using pools was loaded and unloaded several
times.

For each leaf filesystem, add appropriate vfs_done routine.
 1.30 25-Feb-2000  fvdl Fix pasto: some lines of the procfs hash code were copied from the
UFS code, and I forgot to rename the "ihash" variable, causing
weird effects, because 3/4th of the UFS hash table would become
unreachable after procfs was loaded as an LKM.
 1.29 25-Jan-2000  fvdl At mount/unmount time, add an exec hook to revoke all vnodes iff the
process is about to exec a sugid binary.

To speed up things, use hashing for vnode allocation, like other filesystems
do. This avoids walking the whole procfs node list in the revoke case too.
 1.28 02-Sep-1999  thorpej branches: 1.28.2;
Make /proc/self a symlink to /proc/curproc. I've observed Linux programs
that expect /proc/self/cmdline to exist.
 1.27 08-Jul-1999  wrstuden Bump osrelease to 1.4E. Add layerfs files, remove null_subr.c.

Update coda to new struct lock in struct vnode.

make fdescfs, kernfs, portalfs, and procfs actually lock their vnodes.
It's not that hard.

Make unionfs set v_vnlock = NULL so any overlayed fs will call its
VOP_LOCK.
 1.26 12-Mar-1999  christos branches: 1.26.2; 1.26.4;
PR/7143: Jaromir Dolecek: Add procfs/cmdline from Linux emulation
 1.25 25-Jan-1999  msaitoh Add /proc/#/map. From FreeBSD.
 1.24 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.23 01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.22 30-Oct-1997  mycroft Make the curproc link executable.
 1.21 13-Aug-1997  explorer branches: 1.21.4;
Move procfs_checkioperm() from procfs_subr.c to procfs_mem.c, since _subr is
not included in a kernel without procfs, and it seems wrong to pull
all of procfs_subr.c in for just that one function. Perhaps this
should go into a new file instead?
 1.20 12-Aug-1997  thorpej Fix the procfs hole described on current-users, similar to a fix for
FreeBSD by Sean Eric Fagan, but a bit different. This makes the checks
in the same places as sef's FreeBSD patch, but does not hardcode the
"kmem" group into the kernel, and also does a check identical to the
(3) and (4) checks in the NetBSD ptrace(2):

(1) it's not owned by you, or is set-id on exec (unless
you're root), or

(2) it's init, which controls the security level of the
entire system, and the system was not compiled with
permanently insecure mode turned on.
 1.19 25-Jun-1997  mycroft branches: 1.19.4;
Don't allow writes to init's memory or registers while in secure mode.
 1.18 05-May-1997  mycroft Need stat.h.
 1.17 05-May-1997  mycroft Eliminate bogus uses of V{READ,WRITE,EXEC}. Use S_I[RWX]{USR,GRP,OTH} where
appropriate.
 1.16 25-Oct-1996  cgd remove bogus cast of second arg to bcmp(). (nm_name is a const char*,
and was being unnecessarily cast to 'char *'; -Wcast-qual.)
 1.15 12-Feb-1996  christos close PR/2063: procfs_rw prototyped twice with different prototypes
 1.14 09-Feb-1996  christos miscfs prototype changes
 1.13 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.12 15-Jun-1994  mycroft Minor update from JSP after merging my changes.
 1.11 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.10 25-Apr-1994  cgd some prototype cleanup, eliminate/replace bogus types (e.g. quad and
u_quad) -> use better types (e.g. quad_t & u_quad_t in inodes),
some cleanup.
 1.9 28-Jan-1994  cgd make a fpregs file.
 1.8 20-Jan-1994  ws Make procfs really work for debugging.
Implement note & notepg files in procfs.
 1.7 10-Jan-1994  mycroft Add a missing break so my machine doesn't panic.
 1.6 09-Jan-1994  ws Bug fixes and enhancements:
Make NFS serving work (BUT DON'T USE "attach" TO /proc/*/ctl FOR NOW!!!)
Make `curproc' a symbolic link
Add `.' and `..' entries to the directories.
Return better guesses on the size of the files.
 1.5 05-Jan-1994  cgd add new procfs code, from Jan-Simon Pendry, jsp@sequent.com.
This is pretty-much "virgin", so that diffs can be done later.
 1.4 18-Dec-1993  mycroft Canonicalize all #includes.
 1.3 24-Aug-1993  pk branches: 1.3.2;
copyright update.
 1.2 24-Aug-1993  pk Rcs Id added.
 1.1 24-Aug-1993  pk branches: 1.1.1;
Initial version of a proc filesystem.
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.3.2.1 14-Nov-1993  mycroft Canonicalize all #includes.
 1.19.4.1 23-Aug-1997  thorpej Update marc-pcmcia branch from trunk.
 1.21.4.1 30-Oct-1997  mellon Pull rev 1.22 up from trunk (mycroft)
 1.26.4.1 02-Aug-1999  thorpej Update from trunk.
 1.26.2.2 28-Feb-2000  he Pull up revision 1.30 (requested by fvdl):
Fix a critical typo in the earlier procfs security fix.
 1.26.2.1 01-Feb-2000  he Pull up revision 1.29 (via patch, requested by fvdl):
Close procfs security hole. Fixes SA#2000-001.
 1.28.2.6 21-Apr-2001  bouyer Sync with HEAD
 1.28.2.5 11-Feb-2001  bouyer Sync with HEAD.
 1.28.2.4 18-Jan-2001  bouyer Sync with head (for UBC+NFS fixes, mostly).
 1.28.2.3 08-Dec-2000  bouyer Sync with HEAD.
 1.28.2.2 22-Nov-2000  bouyer Sync with HEAD.
 1.28.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.31.4.1 30-Mar-2001  he Pull up revision 1.35 (requested by fvdl):
Add some required Linux emulation bits to support the Linux
version of VMware.
 1.36.2.9 07-Jan-2003  thorpej Sync with HEAD.
 1.36.2.8 11-Nov-2002  nathanw Catch up to -current
 1.36.2.7 01-Apr-2002  nathanw procfs_domem() should take proc *, proc *; not proc *, lwp *.
 1.36.2.6 09-Jan-2002  nathanw Use proc_representative_lwp() instead of bailing out.
Adapt PROCFS_MACHDEP to lwps.
 1.36.2.5 08-Jan-2002  nathanw Catch up to -current.
 1.36.2.4 14-Nov-2001  nathanw Catch up to -current.
 1.36.2.3 21-Sep-2001  nathanw Catch up to -current.
 1.36.2.2 09-Apr-2001  nathanw Catch up with -current.
 1.36.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.37.4.2 01-Oct-2001  fvdl Catch up with -current.
 1.37.4.1 26-Sep-2001  fvdl * add a VCLONED vnode flag that indicates a vnode representing a cloned
device.
* rename REVOKEALL to REVOKEALIAS, and add a REVOKECLONE flag, to pass
to VOP_REVOKE
* the revoke system call will revoke all aliases, as before, but not the
clones
* vdevgone is called when detaching a device, so make it use REVOKECLONE
to get rid of all clones as well
* clean up all uses of VOP_OPEN wrt. locking.
* add a few VOPS to spec_vnops that need to do something when it's a
clone vnode (access and getattr)
* add a copy of the vnode vattr structure of the original 'master' vnode
to the specinfo of a cloned vnode. could possibly redirect getattr to
the 'master' vnode, but this has issues with revoke
* add a vdev_reassignvp function that disassociates a vnode from its
original device, and reassociates it with the specified dev_t. to be
used by cloning devices only, in case a new minor is allocated.
* change all direct references in drivers to v_devcookie and v_rdev
to vdev_privdata(vp) and vdev_rdev(vp). for diagnostic purposes
when debugging race conditions that still exist wrt. locking and
revoking vnodes.
* make the locking state of a vnode consistent when passed to
d_open and d_close (unlocked). locked would be better, but has
some deadlock issues
 1.37.2.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.38.2.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.56.2.9 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.56.2.8 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.56.2.7 24-Sep-2004  skrll Sync with HEAD.
 1.56.2.6 21-Sep-2004  skrll Fix the sync with head I botched.
 1.56.2.5 18-Sep-2004  skrll Sync with HEAD.
 1.56.2.4 24-Aug-2004  skrll Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.56.2.3 18-Aug-2004  skrll Revert to passing struct proc for {exit,exec}hook.
 1.56.2.2 03-Aug-2004  skrll Sync with HEAD
 1.56.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review; I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.58.4.1 31-Aug-2005  tron Pull up following revision(s) (requested by christos in ticket #5634):
sys/miscfs/procfs/procfs_subr.c: revision 1.59
Simplify the code by:
1. Checking for a negative uio_offset at the beginning. This really does
not affect us in most cases because we check that later too.
2. Checking for attempts to write to init sooner and in all cases.
 1.58.2.1 31-Aug-2005  tron Pull up following revision(s) (requested by christos in ticket #5634):
sys/miscfs/procfs/procfs_subr.c: revision 1.59
Simplify the code by:
1. Checking for a negative uio_offset at the beginning. This really does
not affect us in most cases because we check that later too.
2. Checking for attempts to write to init sooner and in all cases.
 1.62.6.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.62.4.1 29-Apr-2005  kent sync with -current
 1.64.2.9 24-Mar-2008  yamt sync with head.
 1.64.2.8 04-Feb-2008  yamt sync with head.
 1.64.2.7 21-Jan-2008  yamt sync with head
 1.64.2.6 15-Nov-2007  yamt sync with head.
 1.64.2.5 27-Oct-2007  yamt sync with head.
 1.64.2.4 03-Sep-2007  yamt sync with head.
 1.64.2.3 26-Feb-2007  yamt sync with head.
 1.64.2.2 30-Dec-2006  yamt sync with head.
 1.64.2.1 21-Jun-2006  yamt sync with head.
 1.67.6.1 22-Apr-2006  simonb Sync with head.
 1.67.4.1 09-Sep-2006  rpaulo sync with head
 1.67.2.1 15-Jan-2006  yamt convert procfs.
 1.68.16.2 10-Dec-2006  yamt sync with head.
 1.68.16.1 22-Oct-2006  yamt sync with head
 1.68.14.6 12-Jan-2007  ad Sync with head.
 1.68.14.5 29-Dec-2006  ad Checkpoint work in progress.
 1.68.14.4 18-Nov-2006  ad Sync with head.
 1.68.14.3 17-Nov-2006  ad Checkpoint work in progress.
 1.68.14.2 24-Oct-2006  ad - Redo LWP locking slightly and fix some races.
- Fix some locking botches.
- Make signal mask / stack per-proc for SA processes.
- Add _lwp_kill().
 1.68.14.1 21-Oct-2006  ad - Make this compile. XXX Needs more work on locking.
- Do FILE_UNUSE() as the current LWP, otherwise we will wipe out the
target's advisory locks. XXX Double check.
 1.76.2.2 12-Mar-2007  rmind Sync with HEAD.
 1.76.2.1 27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.79.4.1 11-Jul-2007  mjf Sync with head.
 1.79.2.4 25-Oct-2007  ad - Simplify debugger/procfs reference counting of processes. Use a per-proc
rwlock: rw_tryenter(RW_READER) to gain a reference, and rw_enter(RW_WRITER)
by the process itself to drain out reference holders before major changes
like exiting.
- Fix numerous bugs and locking issues in procfs.
- Mark procfs MPSAFE.
 1.79.2.3 17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.79.2.2 08-Jun-2007  ad Sync with head.
 1.79.2.1 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.80.10.1 14-Oct-2007  yamt sync with head.
 1.80.8.4 23-Mar-2008  matt sync with HEAD
 1.80.8.3 09-Jan-2008  matt sync with HEAD
 1.80.8.2 08-Nov-2007  matt sync with -HEAD
 1.80.8.1 06-Nov-2007  matt sync with HEAD
 1.80.6.2 11-Nov-2007  joerg Sync with HEAD.
 1.80.6.1 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.81.4.2 18-Feb-2008  mjf Sync with HEAD.
 1.81.4.1 19-Nov-2007  mjf Sync with HEAD.
 1.81.2.1 13-Nov-2007  bouyer Sync with HEAD
 1.82.6.2 23-Jan-2008  bouyer Sync with HEAD.
 1.82.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.82.2.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.85.6.4 17-Jan-2009  mjf Sync with HEAD.
 1.85.6.3 28-Sep-2008  mjf Sync with HEAD.
 1.85.6.2 02-Jun-2008  mjf Sync with HEAD.
 1.85.6.1 03-Apr-2008  mjf Sync with HEAD.
 1.86.2.1 18-May-2008  yamt sync with head.
 1.88.2.3 11-Aug-2010  yamt sync with head.
 1.88.2.2 04-May-2009  yamt sync with head.
 1.88.2.1 16-May-2008  yamt sync with head.
 1.90.4.1 03-Jul-2008  simonb Sync with head.
 1.90.2.1 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.91.2.1 19-Oct-2008  haad Sync with HEAD.
 1.92.2.2 28-Apr-2009  skrll Sync with HEAD.
 1.92.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.93.2.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.95.4.4 19-May-2011  rmind Implement sharing of vnode_t::v_interlock amongst vnodes:
- Lock is shared amongst UVM objects using uvm_obj_setlock() or getnewvnode().
- Adjust vnode cache to handle unsharing, add VI_LOCKSHARE flag for that.
- Use sharing in tmpfs and layerfs for underlying object.
- Simplify locking in ubc_fault().
- Sprinkle some asserts.

Discussed with ad@.
 1.95.4.3 05-Mar-2011  rmind sync with head
 1.95.4.2 03-Jul-2010  rmind sync with head
 1.95.4.1 16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.95.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.98.6.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.100.6.1 02-Jun-2012  mrg sync to latest -current.
 1.100.2.3 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.100.2.2 16-Jan-2013  yamt sync with (a bit old) head
 1.100.2.1 30-Oct-2012  yamt sync with head
 1.101.2.3 03-Dec-2017  jdolecek update from HEAD
 1.101.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.101.2.1 25-Feb-2013  tls resync with head
 1.102.2.1 18-May-2014  rmind sync with head
 1.104.2.1 10-Aug-2014  tls Rebase.
 1.105.2.1 17-Jan-2015  martin Pull up following revision(s) (requested by maxv in ticket #427):
sys/compat/svr4/svr4_schedctl.c: revision 1.8
sys/netinet/tcp_timer.c: revision 1.88
sys/miscfs/genfs/layer_vfsops.c: revision 1.45
sys/compat/svr4/svr4_ioctl.c: revision 1.37
sys/ufs/chfs/chfs_vfsops.c: revision 1.14
sys/miscfs/fdesc/fdesc_vfsops.c: revision 1.91
sys/compat/linux/arch/i386/linux_ptrace.c: revision 1.30
sys/compat/common/kern_time_50.c: revision 1.28
sys/netinet6/ip6_forward.c: revision 1.74
sys/miscfs/umapfs/umap_vnops.c: revision 1.57
sys/compat/svr4/svr4_fcntl.c: revision 1.74
distrib/sets/lists/comp/mi: revision 1.1931
sys/netinet6/udp6_output.c: revision 1.46
sys/fs/puffs/puffs_compat.c: revision 1.3
sys/fs/udf/udf_rename.c: revision 1.11
sys/compat/svr4/svr4_filio.c: revision 1.24
sys/fs/udf/udf_rename.c: revision 1.12
sys/netinet/tcp_usrreq.c: revision 1.202
sys/miscfs/umapfs/umap_subr.c: revision 1.29
sys/compat/linux/common/linux_fadvise64.c: revision 1.3
sys/netinet/if_atm.c: revision 1.34
sys/miscfs/procfs/procfs_subr.c: revision 1.106
sys/miscfs/genfs/layer_subr.c: revision 1.37
sys/netinet/tcp_sack.c: revision 1.30
sys/compat/freebsd/freebsd_misc.c: revision 1.33
sys/compat/freebsd/freebsd_file.c: revision 1.33
sys/ufs/chfs/chfs_vnode.c: revision 1.12
sys/compat/svr4/svr4_ttold.c: revision 1.34
sys/compat/linux/common/linux_file.c: revision 1.114
sys/compat/linux/arch/mips/linux_machdep.c: revision 1.43
sys/compat/linux/common/linux_signal.c: revision 1.76
sys/compat/common/compat_util.c: revision 1.46
sys/compat/linux/arch/arm/linux_ptrace.c: revision 1.18
sys/compat/svr4/svr4_sockio.c: revision 1.36
sys/compat/linux/arch/arm/linux_machdep.c: revision 1.32
sys/compat/svr4/svr4_signal.c: revision 1.66
sys/kern/kern_exec.c: revision 1.410
sys/fs/puffs/puffs_vfsops.c: revision 1.115
sys/compat/svr4/svr4_exec_elf64.c: revision 1.15
sys/compat/linux/arch/i386/linux_machdep.c: revision 1.159
sys/compat/linux/arch/alpha/linux_machdep.c: revision 1.50
sys/compat/linux32/common/linux32_misc.c: revision 1.24
sys/netinet/in_pcb.c: revision 1.153
sys/sys/malloc.h: revision 1.116
sys/compat/common/if_43.c: revision 1.9
share/man/man9/Makefile: revision 1.380
sys/netinet/tcp_vtw.c: revision 1.12
sys/miscfs/umapfs/umap_vfsops.c: revision 1.95
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.186
sys/compat/common/uipc_syscalls_43.c: revision 1.46
sys/ufs/ext2fs/ext2fs_vnops.c: revision 1.115
sys/fs/puffs/puffs_msgif.c: revision 1.97
sys/compat/svr4/svr4_ipc.c: revision 1.27
sys/compat/linux/common/linux_exec.c: revision 1.117
sys/ufs/ext2fs/ext2fs_readwrite.c: revision 1.66
sys/netinet/tcp_output.c: revision 1.179
sys/compat/svr4/svr4_termios.c: revision 1.28
sys/fs/udf/udf_strat_bootstrap.c: revision 1.4
sys/fs/puffs/puffs_subr.c: revision 1.67
sys/fs/puffs/puffs_node.c: revision 1.36
sys/miscfs/overlay/overlay_vnops.c: revision 1.21
sys/fs/cd9660/cd9660_node.c: revision 1.34
sys/netinet/raw_ip.c: revision 1.146
sys/sys/mallocvar.h: revision 1.13
sys/miscfs/overlay/overlay_vfsops.c: revision 1.63
share/man/man9/malloc.9: revision 1.50
sys/netinet6/dest6.c: revision 1.18
sys/compat/linux/common/linux_uselib.c: revision 1.33
sys/compat/linux/common/linux_socket.c: revision 1.120
share/man/man9/malloc.9: revision 1.51
sys/netinet/tcp_subr.c: revision 1.257
sys/compat/linux/common/linux_socketcall.c: revision 1.45
sys/compat/linux/common/linux_fadvise64_64.c: revision 1.3
sys/compat/freebsd/freebsd_ipc.c: revision 1.17
sys/compat/linux/common/linux_misc_notalpha.c: revision 1.109
sys/compat/linux/arch/alpha/linux_pipe.c: revision 1.17
sys/netinet6/in6_pcb.c: revision 1.132
sys/netinet6/in6_ifattach.c: revision 1.94
sys/compat/svr4/svr4_exec_elf32.c: revision 1.15
sys/miscfs/nullfs/null_vfsops.c: revision 1.90
sys/fs/cd9660/cd9660_util.c: revision 1.12
sys/compat/linux/arch/powerpc/linux_machdep.c: revision 1.48
sys/compat/freebsd/freebsd_exec_elf32.c: revision 1.20
sys/miscfs/procfs/procfs_vfsops.c: revision 1.94
sys/compat/linux/arch/powerpc/linux_ptrace.c: revision 1.28
sys/compat/linux/common/linux_sched.c: revision 1.67
sys/compat/linux/common/linux_exec_aout.c: revision 1.67
sys/compat/linux/common/linux_pipe.c: revision 1.67
sys/compat/linux/common/linux_llseek.c: revision 1.34
sys/compat/linux/arch/mips/linux_ptrace.c: revision 1.10
Do not uselessly include <sys/malloc.h>.
Cleanup:
- remove struct kmembuckets (dead)
- correctly deadify MALLOC_XX
- remove MALLOC_DEFINE_LIMIT and MALLOC_JUSTDEFINE_LIMIT (dead)
- remove malloc_roundup(), malloc_type_setlimit(), MALLOC_DEFINE_LIMIT()
and MALLOC_JUSTDEFINE_LIMIT() from man 9 malloc
New sentence, new line. Bump date for previous.
Obsolete malloc_roundup(9), malloc_type_setlimit(9) and MALLOC_DEFINE_LIMIT(9)
man pages.
 1.106.6.1 21-Apr-2017  bouyer Sync with HEAD
 1.106.4.1 26-Apr-2017  pgoyette Sync with HEAD
 1.106.2.1 28-Aug-2017  skrll Sync with HEAD
 1.108.6.2 17-Apr-2018  martin Pull up following revision(s) (requested by hannken in ticket #772):

sys/miscfs/procfs/procfs_subr.c: revision 1.112

Change procfs_revoke_vnodes() to use vrecycle()/vgone() instead
of VOP_REVOKE().

Gets rid of a bunch of suspensions on /proc as vrecycle() will
succeed most of the time and we suspend at most once per call.
 1.108.6.1 12-Apr-2018  martin Pull up following revision(s) (requested by kamil in ticket #713):

sys/modules/procfs/Makefile: revision 1.4
sys/miscfs/procfs/procfs_vfsops.c: revision 1.98
bin/ps/ps.1: revision 1.108
sys/compat/linux/arch/i386/linux_ptrace.c: revision 1.32
sys/miscfs/procfs/procfs_vnops.c: revision 1.198
sys/kern/sys_ptrace_common.c: revision 1.23
sys/kern/sys_ptrace_common.c: revision 1.24
sbin/mount_procfs/mount_procfs.8: revision 1.36
sys/kern/sys_ptrace_common.c: revision 1.25
sys/kern/sys_ptrace.c: revision 1.5
sys/compat/linux/arch/powerpc/linux_ptrace.c: revision 1.30
sys/sys/proc.h: revision 1.342
sys/kern/sys_ptrace_common.c: revision 1.26
sys/miscfs/procfs/procfs_ctl.c: file removal
sys/kern/sys_ptrace_common.c: revision 1.27
sys/miscfs/procfs/procfs_subr.c: revision 1.109
sys/kern/sys_ptrace_common.c: revision 1.28
sys/secmodel/extensions/secmodel_extensions.c: revision 1.8
sys/kern/sys_ptrace_common.c: revision 1.29
sys/sys/ptrace.h: revision 1.62
sys/compat/netbsd32/netbsd32_signal.c: revision 1.45
share/man/man9/kauth.9: revision 1.109
sys/miscfs/procfs/files.procfs: revision 1.12
sys/compat/netbsd32/netbsd32.h: revision 1.115
sys/miscfs/procfs/procfs.h: revision 1.72
sys/compat/netbsd32/netbsd32_ptrace.c: revision 1.5
sys/kern/kern_sig.c: revision 1.337
sys/sys/kauth.h: revision 1.75
sys/sys/sysctl.h: revision 1.224
sys/kern/sys_ptrace_common.c: revision 1.30
sys/kern/sys_ptrace_common.c: revision 1.31
sys/kern/sys_ptrace_common.c: revision 1.32
sys/kern/sys_ptrace_common.c: revision 1.33
sys/compat/linux/arch/arm/linux_ptrace.c: revision 1.20
sys/kern/sys_ptrace_common.c: revision 1.34
sys/kern/sys_ptrace_common.c: revision 1.36
sys/kern/kern_proc.c: revision 1.207
sys/kern/kern_exit.c: revision 1.269
doc/TODO.ptrace: revision 1.29

Make {s,g}et{db,fp,}regs work again for PK_32 processes
XXX: pullup-8

add disgusting magic to handle compat_netbsd32 as a module.

use process_*reg32 instead of struct *reg32.

Remove the filesystem tracing feature

This is a legacy interface from 4.4BSD, and it was
introduced to overcome shortcomings of ptrace(2) at that time, which are
no longer relevant (performance). Today /proc/#/ctl offers a narrow
subset of ptrace(2) commands and is not suitable for modern
applications beyond simplistic tracing scenarios.

This removal will simplify kernel internals. Users will still be able to
use all the other /proc files.

This change won't affect other procfs files or the Linux compat
features within mount_procfs(8). /proc/#/ctl isn't available on Linux.

Remove:
- /proc/#/ctl from mount_procfs(8)
- P_FSTRACE note from the documentation of ps(1)
- /proc/#/ctl and filesystem tracing documentation from mount_procfs(8)
- KAUTH_REQ_PROCESS_PROCFS_CTL documentation from kauth(9)
- source code file miscfs/procfs/procfs_ctl.c
- PFSctl and procfs_doctl() from sys/miscfs/procfs/procfs.h
- KAUTH_REQ_PROCESS_PROCFS_CTL from sys/sys/kauth.h
- PSL_FSTRACE (0x00010000) from sys/sys/proc.h
- P_FSTRACE (0x00010000) from sys/sys/sysctl.h

Reduce code complexity after removal of this functionality.

Update TODO.ptrace accordingly: remove two entries about /proc tracing.

Do not keep legacy notes as comments in the headers about removed
PSL_FSTRACE / P_FSTRACE, as this interface had very few users
(close or equal to zero).

Proposed on tech-kern@.

All filesystem tracing utility users are encouraged to switch to ptrace(2).

Sponsored by <The NetBSD Foundation>

untangle the mess:
- factor out common code
- break each ptrace subcall to its own sub-function
.. more to come ...
- reduce ifdef ugliness by moving it up top.
- factor out PT_IO and make PT_{READ,WRITE}_{I,D} use it
- factor out PT_DUMPCORE
- factor out sendsig code
.. more to come ...

handle siginfo requests for ptrace32

ptrace: Partially undo PT_{READ,WRITE}_{I,D} and unbreak these commands

The refactored code did not work and was generating EFAULT.

Sponsored by <The NetBSD Foundation>

Merge the code back; the problem was that since we are reading/writing
to a kernel address for PT_{READ,WRITE}_{I,D} we need the kernel vmspace.
Provide separate read and write functions to accommodate register functions
that need a size argument.

don't ignore error from copyout_piod

Use the proper process (the tracee) to get information about lwps and
registers and the tracer for vmspace.

Add new sysctl(3) entry: security.models.extensions.user_set_dbregs

Model this new sysctl(3) entry after "user_set_cpu_affinity" in the same
level of sysctl(3) switches.

Allow Debug Registers to be read unconditionally (no change here). This is
convenient because even if a debugger user does not use hardware assisted
watchpoints/breakpoints, a debugger can still fetch these values to store
in an internal cache with context of registers. Reading them should raise
no security concerns.

Add a paranoid MI switch that prohibits by default setting these registers
by a regular user (non-superuser). Make this switch disabled by default.
There are enough reserved bits out there to allow using them
unconditionally on hardened hosts.

Features shipped with Debug Registers are optional features in debuggers.
There is no reduction in elementary functionality.

Reviewed by <christos>

Sponsored by <The NetBSD Foundation>
 1.111.2.1 22-Apr-2018  pgoyette Sync with HEAD
 1.112.2.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.112.2.1 10-Jun-2019  christos Sync with HEAD
 1.116.20.1 18-Apr-2024  martin Pull up following revision(s) (requested by hannken in ticket #668):

sys/miscfs/procfs/procfs.h: revision 1.83
sys/miscfs/procfs/procfs.h: revision 1.84
sys/kern/vfs_mount.c: revision 1.104
sys/miscfs/procfs/procfs_vnops.c: revision 1.230
sys/kern/init_main.c: revision 1.547
sys/kern/kern_hook.c: revision 1.15
sys/miscfs/procfs/procfs_vfsops.c: revision 1.112
sys/miscfs/procfs/procfs_vfsops.c: revision 1.113
sys/miscfs/procfs/procfs_vfsops.c: revision 1.114
sys/miscfs/procfs/procfs_subr.c: revision 1.117

Print dangling vnode before panic() to help debug.

PR kern/57775 ""panic: unmount: dangling vnode" while umounting procfs"
Protect kernel hooks exechook, exithook and forkhook with rwlock.

Lock as writer on establish/disestablish and as reader on list traverse.

For exechook ride "exec_lock" as it is already taken as reader when
traversing the list. Add local locks for exithook and forkhook.

Move exec_init before signal_init as signal_init calls exechook_establish()
that needs "exec_lock".

PR kern/39913 "exec, fork, exit hooks need locking"

Add a hashmap to access all procfs nodes by pid.

Using the exechook to revoke procfs nodes is racy and may deadlock:
one thread runs doexechooks() -> procfs_revoke_vnodes() and wants to suspend
the file system for vgone(), while another thread runs a forced unmount,
has the file system suspended, tries to disestablish the exechook and
waits for doexechooks() to complete.

Establish/disestablish the exechook on module load/unload instead of
mount/unmount and use the hashmap to access all procfs nodes for this pid.

May fix PR kern/57775 ""panic: unmount: dangling vnode" while umounting procfs"

Remove all procfs nodes for this process on process exit.
 1.119.2.1 02-Aug-2025  perseant Sync with HEAD
 1.120 14-Sep-2024  pgoyette Define dependencies based on build options.
 1.119 09-Sep-2024  pgoyette Now we have another dependency for the SYSV_* stuff.
 1.118 09-Sep-2024  pgoyette procfs grew a new dependency
 1.117 01-Jul-2024  christos Add linux POSIX message queue support (Ricardo Branco)
 1.116 12-May-2024  christos branches: 1.116.2;
PR/58227: Ricardo Branco: Add support for proc/sysvipc in Linux emulator
 1.115 12-May-2024  christos PR/58240: Ricardo Branco: Add support for proc/self/limits as used by Linux
 1.114 17-Jan-2024  hannken Remove all procfs nodes for this process on process exit.
 1.113 17-Jan-2024  hannken Using the exechook to revoke procfs nodes is racy and may deadlock:

one thread runs doexechooks() -> procfs_revoke_vnodes() and wants to suspend
the file system for vgone(), while another thread runs a forced unmount,
has the file system suspended, tries to disestablish the exechook and
waits for doexechooks() to complete.

Establish/disestablish the exechook on module load/unload instead of
mount/unmount and use the hashmap to access all procfs nodes for this pid.

May fix PR kern/57775 ""panic: unmount: dangling vnode" while umounting procfs"
 1.112 17-Jan-2024  hannken Add a hashmap to access all procfs nodes by pid.
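A minimal sketch of the idea, assuming a fixed-size chained hash table keyed by pid. All names, the bucket count and the node layout are hypothetical and are not taken from the procfs sources.

```c
#include <stddef.h>
#include <sys/types.h>

#define SKETCH_HASH_SIZE 64	/* assumed bucket count, a power of two */

struct sketch_node {
	pid_t pid;			/* owning process */
	struct sketch_node *next;	/* next node in the same bucket */
};

struct sketch_pidhash {
	struct sketch_node *bucket[SKETCH_HASH_SIZE];
};

static size_t
sketch_bucket(pid_t pid)
{
	return (size_t)pid & (SKETCH_HASH_SIZE - 1);
}

/* Link a node at the head of the bucket for its pid. */
static void
sketch_insert(struct sketch_pidhash *h, struct sketch_node *n)
{
	size_t b = sketch_bucket(n->pid);

	n->next = h->bucket[b];
	h->bucket[b] = n;
}

/* Unlink a node; does nothing if the node is not in the table. */
static void
sketch_remove(struct sketch_pidhash *h, struct sketch_node *n)
{
	struct sketch_node **pp = &h->bucket[sketch_bucket(n->pid)];

	while (*pp != NULL && *pp != n)
		pp = &(*pp)->next;
	if (*pp == n)
		*pp = n->next;
}

/* Count the nodes of one pid by walking only that pid's bucket. */
static int
sketch_count(const struct sketch_pidhash *h, pid_t pid)
{
	const struct sketch_node *n;
	int count = 0;

	for (n = h->bucket[sketch_bucket(pid)]; n != NULL; n = n->next)
		if (n->pid == pid)
			count++;
	return count;
}
```

The point of such a table is that finding every node of one process only touches a single bucket rather than every node of the mount.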
 1.111 17-Jan-2022  bouyer branches: 1.111.4;
If the calling process is running under linux emulation, make /proc/xxx/fd/
return only symlinks pointing to the original file in the filesystem,
instead of a hard link. This matches the linux behavior, and some
linux programs rely on it (they unconditionally call readlink() on
/proc/xxx/fd/yy and don't deal with it returning EINVAL).
Proposed on tech-kern@ in
http://mail-index.netbsd.org/tech-kern/2022/01/11/msg027877.html
 1.110 28-Dec-2020  riastradh Fix procfs environ node.
 1.109 23-May-2020  ad branches: 1.109.2;
Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
 1.108 29-Apr-2020  thorpej If the procfs mount is marked as linux-compat, then allow proc lookup
by any LWP ID in the proc, not just the canonical PID.
 1.107 20-Apr-2020  htodd Sort include files.
 1.106 20-Apr-2020  htodd Add missing include to fix build.
 1.105 19-Apr-2020  thorpej - Only increment nprocs when we're creating a new process, not just
when allocating a PID.
- Per above, proc_free_pid() no longer decrements nprocs. It's now done
in proc_free() right after proc_free_pid().
- Ensure nprocs is accessed using atomics everywhere.
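The accounting rule in this entry can be sketched with C11 atomics: allocating a PID leaves the counter alone, and only process creation and destruction adjust it. The names below are hypothetical, and <stdatomic.h> stands in for the kernel's own atomic operations.

```c
#include <stdatomic.h>

static atomic_int sketch_nprocs;	/* stand-in for the process counter */
static int sketch_next_pid = 1;		/* trivial, non-thread-safe PID source */

/* Allocating a PID does not change the process count. */
static int
sketch_alloc_pid(void)
{
	return sketch_next_pid++;
}

/* Only creating a new process increments the counter. */
static void
sketch_proc_created(void)
{
	atomic_fetch_add(&sketch_nprocs, 1);
}

/* Freeing a process decrements it, separately from releasing the PID. */
static void
sketch_proc_freed(void)
{
	atomic_fetch_sub(&sketch_nprocs, 1);
}

/* Every reader also goes through an atomic load. */
static int
sketch_nprocs_read(void)
{
	return atomic_load(&sketch_nprocs);
}
```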
 1.104 04-Apr-2020  ad branches: 1.104.2;
Merge the remaining changes from the ad-namecache branch, affecting namei()
and getcwd():

- push vnode locking back as far as possible.
- do most lookups directly in the namecache, avoiding vnode locks & refs.
- don't block new refs to vnodes across VOP_INACTIVE().
- get shared locks for VOP_LOOKUP() if the file system supports it.
- correct lock types for VOP_ACCESS() / VOP_GETATTR() in a few places.

Possible future enhancements:

- make the lookups lockless.
- support dotdot lookups by being lockless and inferring absence of chroot.
- maybe make it work for layered file systems.
- avoid vnode references at the root & cwd.
 1.103 16-Mar-2020  pgoyette Use the module subsystem's ability to process SYSCTL_SETUP() entries to
automate installation of sysctl nodes.

Note that there are still a number of device and pseudo-device modules
that create entries tied to individual device units, rather than to the
module itself. These are not changed.
 1.102 17-Jan-2020  ad VFS_VGET(), VFS_ROOT(), VFS_FHTOVP(): give them a "int lktype" argument, to
allow us to get shared locks (or no lock) on the returned vnode. Matches
FreeBSD.
 1.101 30-Mar-2019  christos branches: 1.101.4; 1.101.6;
add a node for the process resource limits.
 1.100 31-Dec-2017  christos branches: 1.100.4;
rename some "cmdline" stuff now that it is used to print environment too
 1.99 31-Dec-2017  christos Add an environ node
 1.98 28-Aug-2017  kamil Remove the filesystem tracing feature

This is a legacy interface from 4.4BSD, and it was
introduced to overcome shortcomings of ptrace(2) at that time, which are
no longer relevant (performance). Today /proc/#/ctl offers a narrow
subset of ptrace(2) commands and is not suitable for modern
applications beyond simplistic tracing scenarios.

This removal will simplify kernel internals. Users will still be able to
use all the other /proc files.

This change won't affect other procfs files or the Linux compat
features within mount_procfs(8). /proc/#/ctl isn't available on Linux.

Remove:
- /proc/#/ctl from mount_procfs(8)
- P_FSTRACE note from the documentation of ps(1)
- /proc/#/ctl and filesystem tracing documentation from mount_procfs(8)
- KAUTH_REQ_PROCESS_PROCFS_CTL documentation from kauth(9)
- source code file miscfs/procfs/procfs_ctl.c
- PFSctl and procfs_doctl() from sys/miscfs/procfs/procfs.h
- KAUTH_REQ_PROCESS_PROCFS_CTL from sys/sys/kauth.h
- PSL_FSTRACE (0x00010000) from sys/sys/proc.h
- P_FSTRACE (0x00010000) from sys/sys/sysctl.h

Reduce code complexity after removal of this functionality.

Update TODO.ptrace accordingly: remove two entries about /proc tracing.

Do not keep legacy notes as comments in the headers about removed
PSL_FSTRACE / P_FSTRACE, as this interface had very few users
(close or equal to zero).

Proposed on tech-kern@.

All filesystem tracing utility users are encouraged to switch to ptrace(2).

Sponsored by <The NetBSD Foundation>
 1.97 30-Mar-2017  christos branches: 1.97.6;
add an auxv node.
 1.96 17-Feb-2017  hannken Add generic genfs_suspendctl() and use it for all file systems.
Layered file systems need work.
 1.95 03-Nov-2016  pgoyette branches: 1.95.2;
Module procfs needs ptrace_common for process_do{,fp}regs
 1.94 10-Nov-2014  maxv branches: 1.94.2; 1.94.4;
Do not uselessly include <sys/malloc.h>.
 1.93 05-Sep-2014  matt Try not to use f_data, use f_{vnode,socket,pipe,mqueue,kqueue,ksem} to get
a correctly typed pointer.
 1.92 27-Jul-2014  hannken branches: 1.92.2;
Change procfs from hashlist to vcache.
- Key is (type, pid, fd)
- Remove argument "p" from procfs_allocvp(). It is only used
when "type == PFSfd". Lookup the proc with proc_find() when
procfs_loadvnode() needs it.
- Use a vfs_vnode_iterator for procfs_revoke_vnodes().
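To illustrate a (type, pid, fd) key, here is a small sketch that assumes the cache compares keys byte-wise. The structure, enum and helper names are hypothetical. Zeroing the key before filling it in keeps any padding bytes from making two logically equal keys compare unequal.

```c
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical node types; the real procfs has many more. */
enum sketch_pfstype { SKETCH_PFSPROC, SKETCH_PFSFD };

/* Hypothetical cache key made of the three components named above. */
struct sketch_pfskey {
	enum sketch_pfstype type;
	pid_t pid;
	int fd;		/* only meaningful for SKETCH_PFSFD, else -1 */
};

/* Build a key with every byte, including padding, in a defined state. */
static void
sketch_key_init(struct sketch_pfskey *k, enum sketch_pfstype type,
    pid_t pid, int fd)
{
	memset(k, 0, sizeof(*k));
	k->type = type;
	k->pid = pid;
	k->fd = fd;
}

/* Byte-wise comparison, as a key/length based cache would do. */
static bool
sketch_key_equal(const struct sketch_pfskey *a, const struct sketch_pfskey *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```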
 1.91 16-Apr-2014  maxv An (un)privileged user can easily make the kernel dereference a NULL
pointer.

The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).

ok christos@
 1.90 23-Mar-2014  hannken branches: 1.90.2;
Change all vfsops to use C99 designated initializers.

No functional changes intended.
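For readers unfamiliar with the construct, this is what a C99 designated initializer looks like on an operations table. The structure and its members are hypothetical and much smaller than the real vfsops.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced operations table. */
struct sketch_fsops {
	const char *name;
	int (*mount)(void);
	int (*unmount)(void);
	int (*sync)(void);
};

static int
sketch_mount(void)
{
	return 0;
}

static int
sketch_unmount(void)
{
	return 0;
}

/*
 * Each member is named explicitly, so the initializer does not depend
 * on member order. Members that are not named (here .sync) are
 * implicitly initialized to NULL.
 */
static const struct sketch_fsops sketch_ops = {
	.name = "sketchfs",
	.mount = sketch_mount,
	.unmount = sketch_unmount,
};
```

With positional initializers, adding or reordering a member silently shifts every later entry; naming the members avoids that, which is why the conversion carries no functional change.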
 1.89 25-Feb-2014  pooka Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
 1.88 07-Feb-2014  hannken Change vnode operation lookup to return the resulting vnode *vpp unlocked.
Change cache_lookup() to return an unlocked vnode.

Discussed on tech-kern@

Welcome to 6.99.31
 1.87 30-Apr-2012  rmind branches: 1.87.2; 1.87.4;
- Replace some malloc(9) uses with kmem(9).
- G/C M_IPMOPTS, M_IPMADDR and M_BWMETER.
 1.86 27-Sep-2011  christos branches: 1.86.2; 1.86.6; 1.86.8; 1.86.12; 1.86.14;
define PROCFS_MAXNAMLEN and use it.
 1.85 30-Nov-2009  pooka Introduce genfs_statvfs() as pretty much a no-info statvfs and
convert several pseudo file systems to use it.
 1.84 02-Oct-2009  elad Put procfs policy back in the subsystem.
 1.83 15-Mar-2009  cegger ansify function definitions
 1.82 14-Mar-2009  dsl Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
 1.81 28-Jun-2008  rumble branches: 1.81.4; 1.81.6; 1.81.10; 1.81.16; 1.81.20;
Create sysctl entries during module initialisation and destroy them
appropriately.

Many of these file systems are now ready for modularisation.
 1.80 13-May-2008  simonb branches: 1.80.2;
mnt_data is a pointer, set it to NULL not 0 when we're finished with it.
 1.79 10-May-2008  rumble Convert file systems to dynamically attach with the new module interface.
Make VFS hooks dynamic while we're here and say farewell to VFS_ATTACH and
VFS_HOOKS_ATTACH linksets.

As a consequence, most of the file systems can now be loaded as new style
modules.

Quick sanity check by ad@.
 1.78 29-Apr-2008  ad branches: 1.78.2;
PR kern/38057 ffs makes assumptions about devvp file system
PR kern/33406 softdeps get stuck in endless loop

Introduce VFS_FSYNC() and call it when syncing a block device, if it
has a mounted file system.
 1.77 28-Jan-2008  dholland branches: 1.77.6; 1.77.8; 1.77.10;
Fix some race conditions in rename.
Introduce a per-FS rename lock and new vfsops to manipulate it.
Get this lock while renaming. Also add another relookup() in do_sys_rename,
which is a hack to kludge around some of the worst deficiencies of
ufs_rename.
reviewed-by: pooka (and an earlier rev by ad)
posted on tech-kern with no objections.
 1.76 26-Dec-2007  ad Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.
 1.75 26-Nov-2007  pooka branches: 1.75.2; 1.75.6;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.74 31-Jul-2007  pooka branches: 1.74.2; 1.74.4; 1.74.10; 1.74.12;
* nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
use VFS_PROTOS() instead of manually prototyping the methods
 1.73 26-Jul-2007  pooka Use eopnotsupp() instead of vfs_stdsuspendctl() and retire the latter.
 1.72 17-Jul-2007  pooka branches: 1.72.2;
Make set_statvfs_info() take a parameter for the vfs name instead
of always retrieving it from mp->mnt_op->vfs_name

christos ok
 1.71 12-Jul-2007  dsl Change the VFS_MOUNT() interface so that the 'data' buffer passed to the
fs code is a kernel buffer, pass through the length of the buffer as well.
Since the length of the userspace buffer isn't (yet) passed through the mount
system call, add a field to the vfsops structure containing the default length.
Split sys_mount() for calls from compat code.
Ride one of the recent kernel version changes - old fs LKMs will load, but
sys_mount() will reject any attempt to use them.
 1.70 09-Feb-2007  ad branches: 1.70.6;
Merge newlock2 to head.
 1.69 19-Jan-2007  hannken New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).
 1.68 09-Dec-2006  chs a smorgasbord of improvements to vnode locking and path lookup:
- LOCKPARENT is no longer relevant for lookup(), relookup() or VOP_LOOKUP().
these now always return the parent vnode locked. namei() works as before.
lookup() and various other paths no longer acquire vnode locks in the
wrong order via vrele(). fixes PR 32535.
as a nice side effect, path lookup is also up to 25% faster.
- the above allows us to get rid of PDIRUNLOCK.
- also get rid of WANTPARENT (just use LOCKPARENT and unlock it).
- remove an assumption in layer_node_find() that all file systems implement
a recursive VOP_LOCK() (unionfs doesn't).
- require that all file systems supply vfs_vptofh and vfs_fhtovp routines.
fill in eopnotsupp() for file systems that don't support being exported
and remove the checks for NULL. (layerfs calls these without checking.)
- in union_lookup1(), don't change refcounts in the ISDOTDOT case, just
adjust which vnode is locked. fixes PR 33374.
- apply fixes for ufs_rename() from ufs_vnops.c rev. 1.61 to ext2fs_rename().
 1.67 16-Nov-2006  christos branches: 1.67.2;
__unused removal on arguments; approved by core.
 1.66 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.65 03-Sep-2006  christos branches: 1.65.2; 1.65.4;
add missing initializers
 1.64 14-May-2006  elad integrate kauth.
 1.63 11-Dec-2005  christos branches: 1.63.4; 1.63.6; 1.63.8; 1.63.10; 1.63.12;
merge ktrace-lwp.
 1.62 23-Sep-2005  jmmv Apply the NFS exports list rototill patch:

- Remove all NFS related stuff from file system specific code.
- Drop the vfs_checkexp hook and generalize it in the new nfs_check_export
function, thus removing redundancy from all file systems.
- Move all NFS export-related stuff from kern/vfs_subr.c to the new
file sys/nfs/nfs_export.c. The former was becoming large and its code
is always compiled, regardless of the build options. Using the latter,
the code is only compiled in when NFSSERVER is enabled. While doing this,
also make some functions in nfs_subs.c conditional to NFSSERVER.
- Add a new command in nfssvc(2), called NFSSVC_SETEXPORTSLIST, that takes a
path and a set of export entries. At the moment it can only clear the
exports list or append entries, one by one, but it is done in a way that
allows setting the whole set of entries atomically in the future (see the
comment in mountd_set_exports_list or in doc/TODO).
- Change mountd(8) to use the nfssvc(2) system call instead of mount(2) so
that it becomes file system agnostic. In fact, all this whole thing was
done to remove a 'XXX' block from this utility!
- Change the mount*, newfs and fsck* userland utilities to not deal with NFS
exports initialization; done internally by the kernel when initializing
the NFS support for each file system.
- Implement an interface for VFS (called VFS hooks) so that several kernel
subsystems can run arbitrary code upon receipt of specific VFS events.
At the moment, this only provides support for unmount and is used to
destroy NFS exports lists from the file systems being unmounted, though it
has room for extension.

Thanks go to yamt@, chs@, thorpej@, wrstuden@ and others for their comments
and advice in the development of this patch.
 1.61 30-Aug-2005  xtraeme Remove __P()
 1.60 29-Mar-2005  thorpej branches: 1.60.2;
- Define a VFS_ATTACH() macro that places a reference to a vfsops structure
into the "vfsops" link set.
- Use VFS_ATTACH() where vfsops are declared for individual file systems.
- In vfsinit(), traverse the "vfsops" link set, rather than vfs_list_initial[].
 1.59 02-Jan-2005  thorpej branches: 1.59.2;
Add the system call and VFS infrastructure for file system extended
attributes.

From FreeBSD.
 1.58 13-Sep-2004  jdolecek set mp->mnt_stat.f_namemax on filesystem mount, for use by statvfs
 1.57 25-May-2004  hannken Add ffs internal snapshots. Written by Marshall Kirk McKusick for FreeBSD.

- Not enabled by default. Needs kernel option FFS_SNAPSHOT.
- Change parameters of ffs_blkfree.
- Let the copy-on-write functions return an error so spec_strategy
may fail if the copy-on-write fails.
- Change genfs_*lock*() to use vp->v_vnlock instead of &vp->v_lock.
- Add flag B_METAONLY to VOP_BALLOC to return indirect block buffer.
- Add a function ffs_checkfreefile needed for snapshot creation.
- Add special handling of snapshot files:
Snapshots may not be opened for writing and the attributes are read-only.
Use the mtime as the time this snapshot was taken.
Deny mtime updates for snapshot files.
- Add function transferlockers to transfer any waiting processes from
one lock to another.
- Add vfsop VFS_SNAPSHOT to take a snapshot and make it accessible through
a vnode.
- Add snapshot support to ls, fsck_ffs and dump.

Welcome to 2.0F.

Approved by: Jason R. Thorpe <thorpej@netbsd.org>
 1.56 25-May-2004  atatat Sysctl descriptions under vfs subtree
 1.55 27-Apr-2004  jrf First pass for some caddr_t removal and changes to get rid of it where we
no longer use and/or need it

- removed casts from unionfs, deadfs and fdesc
(there are more to hunt down still)
- changed vfs_quotactl args argument from caddr_t to void *
- changed vfs_quotactl structures/callers to reflect the api change

Compiled fine and ran for about a day. Approved/reviewed by
christos@netbsd.org and gimpy@netbsd.org.
 1.54 21-Apr-2004  christos add sys/dirent.h
 1.53 21-Apr-2004  christos Replace the statfs() family of system calls with statvfs().
Retain binary compatibility.
 1.52 24-Mar-2004  atatat branches: 1.52.2;
Tango on sysctl_createv() and flags. The flags have all been renamed,
and sysctl_createv() now uses more arguments.
 1.51 04-Dec-2003  atatat Dynamic sysctl.

Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.

Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded. Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.

All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.

PS - I'm sorry, but there's a distinct lack of documentation at the
moment. I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
 1.50 27-Sep-2003  darcy Changes as discussed with itojun on tech-kern. I have modified the enums
to have KFS or PFS differentiators. Further I have wrapped the enum in
procfs in "#ifdef _KERNEL" as it is done in kernfs.

To see the discussion go to http://mail-index.NetBSD.org/tech-kern/2003/09/
and look for "Mismatched enums in include files" in the list.
 1.49 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.48 29-Jun-2003  fvdl branches: 1.48.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.47 29-Jun-2003  thorpej Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.46 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.45 16-Apr-2003  christos PR/1796: John Kohl: statfs misbehaves under chrooted environments.

- Under chroot it displays only the visible filesystems with appropriate paths.
- The statfs f_mntonname gets adjusted to contain the real path from root.
- While there, fixed a bug in ext2fs, locking problems with vfs_getfsstat(),
and factored out some of the vfsop statfs() code to copy_statfs_info(). This
fixes the problem where some filesystems forgot to set fsid.
- Made coda look more like a normal fs.
 1.44 03-Jan-2003  christos Implement /proc/<pid>/fd/<n>. This is work in progress. Questionable things:
- Is it ok to convert DTYPE_PIPE to VFIFO and DTYPE_SOCKET to VSOCK?
- XXX: Avoid locking issue in ls -Rl /proc by avoiding curproc
- Does I/O to pipes work?
- XXX: Are there security implications?
 1.43 21-Sep-2002  christos MNT_GETARGS support
 1.42 30-Jul-2002  soren Die, qaddr_t, die! - mnt_data in struct mount is already effectively
a void *, so stop pretending otherwise.
 1.41 10-Nov-2001  lukem branches: 1.41.8;
add RCSIDs
 1.40 15-Sep-2001  chs branches: 1.40.2;
add a new VFS op, vfs_reinit, which is called when desiredvnodes is
adjusted via sysctl. file systems that have hash tables which are
sized based on the value of this variable now resize those hash tables
using the new value. the max number of FFS softdeps is also recalculated.

convert various file systems to use the <sys/queue.h> macros for
their hash tables.
 1.39 30-May-2001  mrg branches: 1.39.2; 1.39.4;
use _KERNEL_OPT
 1.38 25-Jan-2001  jdolecek branches: 1.38.2;
g/c pmnt_mp in struct procfs_args
 1.37 22-Jan-2001  jdolecek make filesystem vnodeop, specop, fifoop and vnodeopv_* arrays const
 1.36 17-Jan-2001  fvdl Add a few linux-style files, only enabled when -o linux is specified
for the mount. Currently these are /proc/cpuinfo and /proc/meminfo.
The former only does something on i386 right now.
 1.35 28-Jun-2000  mrg <vm/vm.h> -> <uvm/uvm_extern.h>
 1.34 10-Jun-2000  assar branches: 1.34.2;
make vfs_getnewfsid only take one argument and fetch the name of the
filesystem from the supplied mount argument. also make makefstype
take a const parameter. update all the callers.
 1.33 16-Mar-2000  jdolecek branches: 1.33.2;
Add new VFS op routine - vfs_done and call it on filesystem detach
in vfs_detach(). vfs_done may free global filesystem's resources,
typically those allocated in respective filesystem's init function.
Needed so those filesystems which went in via LKM have a chance to
clean after themselves before unloading. This fixes random panics
when LKM for filesystem using pools was loaded and unloaded several
times.

For each leaf filesystem, add appropriate vfs_done routine.
 1.32 25-Jan-2000  fvdl At mount/unmount time, add an exec hook to revoke all vnodes iff the
process is about to exec a sugid binary.

To speed up things, use hashing for vnode allocation, like other filesystems
do. This avoids walking the whole procfs node list in the revoke case too.
 1.31 26-Feb-1999  wrstuden branches: 1.31.2; 1.31.8; 1.31.14;
Modify vfsops to separate vfs_fhtovp() into two routines. vfs_fhtovp() now
only handles the file handle to vnode conversion, and a new call,
vfs_checkexp(), performs the export verification.
 1.30 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.29 05-Jul-1998  jonathan * defopt COMPAT_{09,10,11,12,13} and COMPAT_NOMID.
TODO: revisit interaction between native compat and emul compat usage.
 1.28 01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.27 18-Feb-1998  thorpej Place a pointer to an array of our vnodeopv_desc *'s in our vfsops
structure, for use by vfs_attach().
 1.26 22-Dec-1996  cgd Change the second and third args to struct vfsops' (*vfs_mount)() to
'const char *', and 'void *', respectively. The second arg is taken directly
from user arguments, and is const there, so must be const in the prototypes
and functions. The third arg is also taken directly from user arguments.
It doesn't have to be changed, but since it's cleaner to keep the type
the same as the user arg's type, and I'm already making the 'const char *'
change...
 1.25 09-Feb-1996  christos miscfs prototype changes
 1.24 18-Jun-1995  cgd don't assume the f_fsnamelen is nul-truncated or longer than MFSNAMELEN
 1.23 09-Mar-1995  mycroft copy*str() should use size_t.
 1.22 18-Jan-1995  mycroft Clean up the code to frob mnt_stat a (tiny) bit.
 1.21 15-Dec-1994  mycroft Call foo_statfs() from a common place when mounting.
 1.20 15-Sep-1994  mycroft Fix typo.
 1.19 15-Sep-1994  mycroft stat the file system at mount time, for `df -n', et al.
 1.18 29-Jun-1994  cgd branches: 1.18.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.17 15-Jun-1994  mycroft Minor update from JSP after merging my changes.
 1.16 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.15 23-Apr-1994  cgd make fs types consistent over new kernels. also, some proto foo.
 1.14 21-Apr-1994  cgd Convert mount, vnode, and buf structs to use <sys/queue.h>. Also,
some knf and structure frobbing to do along with it.
 1.13 15-Apr-1994  cgd forgot these...
 1.12 14-Apr-1994  cgd fs types are names now.
 1.11 20-Jan-1994  ws Make procfs really work for debugging.
Implement note & notepg files in procfs.
 1.10 09-Jan-1994  ws Bug fixes and enhancements:
Make NFS serving work (BUT DON'T USE "attach" TO /proc/*/ctl FOR NOW!!!)
Make `curproc' a symbolic link
Add `.' and `..' entries to the directories.
Return better guesses on the size of the files.
 1.9 05-Jan-1994  cgd add new procfs code, from Jan-Simon Pendry, jsp@sequent.com.
This is pretty-much "virgin", so that diffs can be done later.
 1.8 18-Dec-1993  mycroft Canonicalize all #includes.
 1.7 26-Aug-1993  pk branches: 1.7.2;
Implement setattr: mode for process entries; mode + uid/gid for the
PROCFS root directory.
Fixed omission in pfs_root() which came to light as a result of the above:
hold on to vnode for root dir.
 1.6 25-Aug-1993  mycroft Um, last change was wrong. Instead, add 3 to the number of inodes (forget
about the root directory, too).
 1.5 25-Aug-1993  mycroft Subtract two from the free count for `.' and `..', to maintain the fiction that
this is a real file system.
 1.4 24-Aug-1993  pk Fill inode fields in procfs_statfs(), instead of block fields
 1.3 24-Aug-1993  pk copyright update.
 1.2 24-Aug-1993  pk Rcs Id added.
 1.1 24-Aug-1993  pk branches: 1.1.1;
Initial version of a proc filesystem.
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.7.2.2 28-Dec-1993  pk Return ENODEV rather than EOPNOTSUP for unsupported operations.
 1.7.2.1 14-Nov-1993  mycroft Canonicalize all #includes.
 1.18.2.1 16-Sep-1994  cgd from trunk, per mycroft
 1.31.14.1 21-Dec-1999  wrstuden Initial commit of recent changes to make DEV_BSIZE go away.

Runs on i386, needs work on other arch's. Main kernel routines should be
fine, but a number of the stand programs need help.

cd, fd, ccd, wd, and sd have been updated. sd has been tested with non-512
byte block devices. vnd, raidframe, and lfs need work.

Non 2**n block support is automatic for LKM's and conditional for kernels
on "options NON_PO2_BLOCKS".
 1.31.8.3 11-Feb-2001  bouyer Sync with HEAD.
 1.31.8.2 18-Jan-2001  bouyer Sync with head (for UBC+NFS fixes, mostly).
 1.31.8.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.31.2.1 01-Feb-2000  he Pull up revision 1.32 (via patch, requested by fvdl):
Close procfs security hole. Fixes SA#2000-001.
 1.33.2.1 22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.34.2.1 30-Mar-2001  he Pull up revision 1.36 (requested by fvdl):
Add some required Linux emulation bits to support the Linux
version of VMware.
 1.38.2.6 07-Jan-2003  thorpej Sync with HEAD.
 1.38.2.5 18-Oct-2002  nathanw Catch up to -current.
 1.38.2.4 01-Aug-2002  nathanw Catch up to -current.
 1.38.2.3 14-Nov-2001  nathanw Catch up to -current.
 1.38.2.2 21-Sep-2001  nathanw Catch up to -current.
 1.38.2.1 21-Jun-2001  nathanw Catch up to -current.
 1.39.4.1 01-Oct-2001  fvdl Catch up with -current.
 1.39.2.3 10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototill work
 1.39.2.2 06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.39.2.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.40.2.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.41.8.1 29-Aug-2002  gehenna catch up with -current.
 1.48.2.8 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.48.2.7 01-Apr-2005  skrll Sync with HEAD.
 1.48.2.6 17-Jan-2005  skrll Sync with HEAD.
 1.48.2.5 21-Sep-2004  skrll Fix the sync with head I botched.
 1.48.2.4 18-Sep-2004  skrll Sync with HEAD.
 1.48.2.3 24-Aug-2004  skrll Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.48.2.2 03-Aug-2004  skrll Sync with HEAD
 1.48.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.52.2.1 29-May-2004  tron Pull up revision 1.56 (requested by atatat in ticket #393):
Sysctl descriptions under vfs subtree
 1.59.2.1 29-Apr-2005  kent sync with -current
 1.60.2.7 04-Feb-2008  yamt sync with head.
 1.60.2.6 21-Jan-2008  yamt sync with head
 1.60.2.5 07-Dec-2007  yamt sync with head
 1.60.2.4 03-Sep-2007  yamt sync with head.
 1.60.2.3 26-Feb-2007  yamt sync with head.
 1.60.2.2 30-Dec-2006  yamt sync with head.
 1.60.2.1 21-Jun-2006  yamt sync with head.
 1.63.12.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.63.10.2 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.63.10.1 08-Mar-2006  elad Adapt to kernel authorization KPI.
 1.63.8.2 03-Sep-2006  yamt sync with head.
 1.63.8.1 24-May-2006  yamt sync with head.
 1.63.6.1 01-Jun-2006  kardel Sync with head.
 1.63.4.1 09-Sep-2006  rpaulo sync with head
 1.65.4.2 10-Dec-2006  yamt sync with head.
 1.65.4.1 22-Oct-2006  yamt sync with head
 1.65.2.4 01-Feb-2007  ad Sync with head.
 1.65.2.3 12-Jan-2007  ad Sync with head.
 1.65.2.2 18-Nov-2006  ad Sync with head.
 1.65.2.1 17-Nov-2006  ad Checkpoint work in progress.
 1.67.2.1 17-Feb-2007  tron Apply patch (requested by chs in ticket #422):
- Fix various deadlock problems with nullfs and unionfs.
- Speed up path lookups by up to 25%.
 1.70.6.3 25-Oct-2007  ad - Simplify debugger/procfs reference counting of processes. Use a per-proc
rwlock: rw_tryenter(RW_READER) to gain a reference, and rw_enter(RW_WRITER)
by the process itself to drain out reference holders before major changes
like exiting.
- Fix numerous bugs and locking issues in procfs.
- Mark procfs MPSAFE.
 1.70.6.2 20-Aug-2007  ad Sync with HEAD.
 1.70.6.1 15-Jul-2007  ad Sync with head.
 1.72.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.74.12.2 31-Jul-2007  pooka * nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
use VFS_PROTOS() instead of manually prototyping the methods
 1.74.12.1 31-Jul-2007  pooka file procfs_vfsops.c was added on branch matt-mips64 on 2007-07-31 21:14:17 +0000
 1.74.10.2 18-Feb-2008  mjf Sync with HEAD.
 1.74.10.1 08-Dec-2007  mjf Sync with HEAD.
 1.74.4.2 23-Mar-2008  matt sync with HEAD
 1.74.4.1 09-Jan-2008  matt sync with HEAD
 1.74.2.1 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.75.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.75.2.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.77.10.3 11-Mar-2010  yamt sync with head
 1.77.10.2 04-May-2009  yamt sync with head.
 1.77.10.1 16-May-2008  yamt sync with head.
 1.77.8.1 18-May-2008  yamt sync with head.
 1.77.6.2 29-Jun-2008  mjf Sync with HEAD.
 1.77.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.78.2.2 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.78.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.80.2.1 03-Jul-2008  simonb Sync with head.
 1.81.20.1 28-Apr-2014  sborrill Pull up the following revision(s) (requested by maxv in ticket #1901):
sys/kern/vfs_syscalls.c: revision 1.478, 1.480 via patch
sys/coda/coda_vfsops.c: revision 1.81
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.110 via patch
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59 via patch
sys/fs/udf/udf_vfsops.c: revision 1.67
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/kern/vfs_syscalls.c: revision 1.479
sys/miscfs/nullfs/null_vfsops.c: revision 1.88 via patch
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/nfs/nfs_vfsops.c: revision 1.227
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/ufs/mfs/mfs_vfsops.c: revision 1.107

Due to missing checks in the mount syscall, and a wrong assumption on the
file systems side, the kernel could allocate an unbounded or zero-sized
memory buffer, and could dereference a NULL pointer when particular
arguments are given by a user.
 1.81.16.1 28-Apr-2014  sborrill Pull up the following revision(s) (requested by maxv in ticket #1901):
sys/kern/vfs_syscalls.c: revision 1.478, 1.480 via patch
sys/coda/coda_vfsops.c: revision 1.81
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.110 via patch
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59 via patch
sys/fs/udf/udf_vfsops.c: revision 1.67
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/kern/vfs_syscalls.c: revision 1.479
sys/miscfs/nullfs/null_vfsops.c: revision 1.88 via patch
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/nfs/nfs_vfsops.c: revision 1.227
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/ufs/mfs/mfs_vfsops.c: revision 1.107

Due to missing checks in the mount syscall, and a wrong assumption on the
file systems side, the kernel could allocate an unbounded or zero-sized
memory buffer, and could dereference a NULL pointer when particular
arguments are given by a user.
 1.81.10.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.81.6.1 25-Apr-2014  sborrill Pull up the following revision(s) (requested by maxv in ticket #1901):
sys/kern/vfs_syscalls.c: revision 1.478, 1.480 via patch
sys/coda/coda_vfsops.c: revision 1.81
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.110 via patch
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59 via patch
sys/fs/udf/udf_vfsops.c: revision 1.67
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/kern/vfs_syscalls.c: revision 1.479
sys/miscfs/nullfs/null_vfsops.c: revision 1.88 via patch
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/nfs/nfs_vfsops.c: revision 1.227
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/ufs/mfs/mfs_vfsops.c: revision 1.107

Due to missing checks in the mount syscall, and a wrong assumption on the
file systems side, the kernel could allocate an unbounded or zero-sized
memory buffer, and could dereference a NULL pointer when particular
arguments are given by a user.
 1.81.4.1 28-Apr-2009  skrll Sync with HEAD.
 1.86.14.1 21-Apr-2014  bouyer Pull up following revision(s) (requested by maxv in ticket #1050):
sys/ufs/chfs/chfs_vfsops.c: revision 1.11
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/fs/nilfs/nilfs_vfsops.c: revision 1.16
sys/ufs/mfs/mfs_vfsops.c: revision 1.107
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/kern/vfs_syscalls.c: revision 1.478
sys/kern/vfs_syscalls.c: revision 1.479
sys/fs/puffs/puffs_vfsops.c: revision 1.110
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/nfs/nfs_vfsops.c: revision 1.227
sys/fs/v7fs/v7fs_vfsops.c: revision 1.10
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/miscfs/nullfs/null_vfsops.c: revision 1.88
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50
sys/coda/coda_vfsops.c: revision 1.81
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/kern/vfs_syscalls.c: revision 1.480
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/kern/vfs_syscalls.c: revision 1.482
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vfsops.c: revision 1.12
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/udf/udf_vfsops.c: revision 1.67
Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.
ok christos@
An (un)privileged user can easily make the kernel dereference a NULL
pointer.
The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).
ok christos@
Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.
If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
 1.86.12.1 21-Apr-2014  bouyer Pull up following revision(s) (requested by maxv in ticket #1050):
sys/ufs/chfs/chfs_vfsops.c: revision 1.11
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/fs/nilfs/nilfs_vfsops.c: revision 1.16
sys/ufs/mfs/mfs_vfsops.c: revision 1.107
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/kern/vfs_syscalls.c: revision 1.478
sys/kern/vfs_syscalls.c: revision 1.479
sys/fs/puffs/puffs_vfsops.c: revision 1.110
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/nfs/nfs_vfsops.c: revision 1.227
sys/fs/v7fs/v7fs_vfsops.c: revision 1.10
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/miscfs/nullfs/null_vfsops.c: revision 1.88
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50
sys/coda/coda_vfsops.c: revision 1.81
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/kern/vfs_syscalls.c: revision 1.480
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/kern/vfs_syscalls.c: revision 1.482
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vfsops.c: revision 1.12
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/udf/udf_vfsops.c: revision 1.67
Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.
ok christos@
An (un)privileged user can easily make the kernel dereference a NULL
pointer.
The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).
ok christos@
Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.
If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
 1.86.8.1 21-Apr-2014  bouyer Pull up following revision(s) (requested by maxv in ticket #1050):
sys/ufs/chfs/chfs_vfsops.c: revision 1.11
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/fs/nilfs/nilfs_vfsops.c: revision 1.16
sys/ufs/mfs/mfs_vfsops.c: revision 1.107
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/kern/vfs_syscalls.c: revision 1.478
sys/kern/vfs_syscalls.c: revision 1.479
sys/fs/puffs/puffs_vfsops.c: revision 1.110
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/nfs/nfs_vfsops.c: revision 1.227
sys/fs/v7fs/v7fs_vfsops.c: revision 1.10
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/miscfs/nullfs/null_vfsops.c: revision 1.88
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50
sys/coda/coda_vfsops.c: revision 1.81
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/kern/vfs_syscalls.c: revision 1.480
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/kern/vfs_syscalls.c: revision 1.482
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vfsops.c: revision 1.12
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/udf/udf_vfsops.c: revision 1.67
Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.
ok christos@
An (un)privileged user can easily make the kernel dereference a NULL
pointer.
The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).
ok christos@
Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.
If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
 1.86.6.1 02-Jun-2012  mrg sync to latest -current.
 1.86.2.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.86.2.1 23-May-2012  yamt sync with head.
 1.87.4.1 18-May-2014  rmind sync with head
 1.87.2.2 03-Dec-2017  jdolecek update from HEAD
 1.87.2.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.90.2.1 10-Aug-2014  tls Rebase.
 1.92.2.1 17-Jan-2015  martin Pull up following revision(s) (requested by maxv in ticket #427):
sys/compat/svr4/svr4_schedctl.c: revision 1.8
sys/netinet/tcp_timer.c: revision 1.88
sys/miscfs/genfs/layer_vfsops.c: revision 1.45
sys/compat/svr4/svr4_ioctl.c: revision 1.37
sys/ufs/chfs/chfs_vfsops.c: revision 1.14
sys/miscfs/fdesc/fdesc_vfsops.c: revision 1.91
sys/compat/linux/arch/i386/linux_ptrace.c: revision 1.30
sys/compat/common/kern_time_50.c: revision 1.28
sys/netinet6/ip6_forward.c: revision 1.74
sys/miscfs/umapfs/umap_vnops.c: revision 1.57
sys/compat/svr4/svr4_fcntl.c: revision 1.74
distrib/sets/lists/comp/mi: revision 1.1931
sys/netinet6/udp6_output.c: revision 1.46
sys/fs/puffs/puffs_compat.c: revision 1.3
sys/fs/udf/udf_rename.c: revision 1.11
sys/compat/svr4/svr4_filio.c: revision 1.24
sys/fs/udf/udf_rename.c: revision 1.12
sys/netinet/tcp_usrreq.c: revision 1.202
sys/miscfs/umapfs/umap_subr.c: revision 1.29
sys/compat/linux/common/linux_fadvise64.c: revision 1.3
sys/netinet/if_atm.c: revision 1.34
sys/miscfs/procfs/procfs_subr.c: revision 1.106
sys/miscfs/genfs/layer_subr.c: revision 1.37
sys/netinet/tcp_sack.c: revision 1.30
sys/compat/freebsd/freebsd_misc.c: revision 1.33
sys/compat/freebsd/freebsd_file.c: revision 1.33
sys/ufs/chfs/chfs_vnode.c: revision 1.12
sys/compat/svr4/svr4_ttold.c: revision 1.34
sys/compat/linux/common/linux_file.c: revision 1.114
sys/compat/linux/arch/mips/linux_machdep.c: revision 1.43
sys/compat/linux/common/linux_signal.c: revision 1.76
sys/compat/common/compat_util.c: revision 1.46
sys/compat/linux/arch/arm/linux_ptrace.c: revision 1.18
sys/compat/svr4/svr4_sockio.c: revision 1.36
sys/compat/linux/arch/arm/linux_machdep.c: revision 1.32
sys/compat/svr4/svr4_signal.c: revision 1.66
sys/kern/kern_exec.c: revision 1.410
sys/fs/puffs/puffs_vfsops.c: revision 1.115
sys/compat/svr4/svr4_exec_elf64.c: revision 1.15
sys/compat/linux/arch/i386/linux_machdep.c: revision 1.159
sys/compat/linux/arch/alpha/linux_machdep.c: revision 1.50
sys/compat/linux32/common/linux32_misc.c: revision 1.24
sys/netinet/in_pcb.c: revision 1.153
sys/sys/malloc.h: revision 1.116
sys/compat/common/if_43.c: revision 1.9
share/man/man9/Makefile: revision 1.380
sys/netinet/tcp_vtw.c: revision 1.12
sys/miscfs/umapfs/umap_vfsops.c: revision 1.95
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.186
sys/compat/common/uipc_syscalls_43.c: revision 1.46
sys/ufs/ext2fs/ext2fs_vnops.c: revision 1.115
sys/fs/puffs/puffs_msgif.c: revision 1.97
sys/compat/svr4/svr4_ipc.c: revision 1.27
sys/compat/linux/common/linux_exec.c: revision 1.117
sys/ufs/ext2fs/ext2fs_readwrite.c: revision 1.66
sys/netinet/tcp_output.c: revision 1.179
sys/compat/svr4/svr4_termios.c: revision 1.28
sys/fs/udf/udf_strat_bootstrap.c: revision 1.4
sys/fs/puffs/puffs_subr.c: revision 1.67
sys/fs/puffs/puffs_node.c: revision 1.36
sys/miscfs/overlay/overlay_vnops.c: revision 1.21
sys/fs/cd9660/cd9660_node.c: revision 1.34
sys/netinet/raw_ip.c: revision 1.146
sys/sys/mallocvar.h: revision 1.13
sys/miscfs/overlay/overlay_vfsops.c: revision 1.63
share/man/man9/malloc.9: revision 1.50
sys/netinet6/dest6.c: revision 1.18
sys/compat/linux/common/linux_uselib.c: revision 1.33
sys/compat/linux/common/linux_socket.c: revision 1.120
share/man/man9/malloc.9: revision 1.51
sys/netinet/tcp_subr.c: revision 1.257
sys/compat/linux/common/linux_socketcall.c: revision 1.45
sys/compat/linux/common/linux_fadvise64_64.c: revision 1.3
sys/compat/freebsd/freebsd_ipc.c: revision 1.17
sys/compat/linux/common/linux_misc_notalpha.c: revision 1.109
sys/compat/linux/arch/alpha/linux_pipe.c: revision 1.17
sys/netinet6/in6_pcb.c: revision 1.132
sys/netinet6/in6_ifattach.c: revision 1.94
sys/compat/svr4/svr4_exec_elf32.c: revision 1.15
sys/miscfs/nullfs/null_vfsops.c: revision 1.90
sys/fs/cd9660/cd9660_util.c: revision 1.12
sys/compat/linux/arch/powerpc/linux_machdep.c: revision 1.48
sys/compat/freebsd/freebsd_exec_elf32.c: revision 1.20
sys/miscfs/procfs/procfs_vfsops.c: revision 1.94
sys/compat/linux/arch/powerpc/linux_ptrace.c: revision 1.28
sys/compat/linux/common/linux_sched.c: revision 1.67
sys/compat/linux/common/linux_exec_aout.c: revision 1.67
sys/compat/linux/common/linux_pipe.c: revision 1.67
sys/compat/linux/common/linux_llseek.c: revision 1.34
sys/compat/linux/arch/mips/linux_ptrace.c: revision 1.10
Do not uselessly include <sys/malloc.h>.
Cleanup:
- remove struct kmembuckets (dead)
- correctly deadify MALLOC_XX
- remove MALLOC_DEFINE_LIMIT and MALLOC_JUSTDEFINE_LIMIT (dead)
- remove malloc_roundup(), malloc_type_setlimit(), MALLOC_DEFINE_LIMIT()
and MALLOC_JUSTDEFINE_LIMIT() from man 9 malloc
New sentence, new line. Bump date for previous.
Obsolete malloc_roundup(9), malloc_type_setlimit(9) and MALLOC_DEFINE_LIMIT(9)
man pages.
 1.94.4.3 26-Apr-2017  pgoyette Sync with HEAD
 1.94.4.2 20-Mar-2017  pgoyette Sync with HEAD
 1.94.4.1 04-Nov-2016  pgoyette Sync with HEAD
 1.94.2.2 28-Aug-2017  skrll Sync with HEAD
 1.94.2.1 05-Dec-2016  skrll Sync with HEAD
 1.95.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.97.6.1 12-Apr-2018  martin Pull up following revision(s) (requested by kamil in ticket #713):

sys/modules/procfs/Makefile: revision 1.4
sys/miscfs/procfs/procfs_vfsops.c: revision 1.98
bin/ps/ps.1: revision 1.108
sys/compat/linux/arch/i386/linux_ptrace.c: revision 1.32
sys/miscfs/procfs/procfs_vnops.c: revision 1.198
sys/kern/sys_ptrace_common.c: revision 1.23
sys/kern/sys_ptrace_common.c: revision 1.24
sbin/mount_procfs/mount_procfs.8: revision 1.36
sys/kern/sys_ptrace_common.c: revision 1.25
sys/kern/sys_ptrace.c: revision 1.5
sys/compat/linux/arch/powerpc/linux_ptrace.c: revision 1.30
sys/sys/proc.h: revision 1.342
sys/kern/sys_ptrace_common.c: revision 1.26
sys/miscfs/procfs/procfs_ctl.c: file removal
sys/kern/sys_ptrace_common.c: revision 1.27
sys/miscfs/procfs/procfs_subr.c: revision 1.109
sys/kern/sys_ptrace_common.c: revision 1.28
sys/secmodel/extensions/secmodel_extensions.c: revision 1.8
sys/kern/sys_ptrace_common.c: revision 1.29
sys/sys/ptrace.h: revision 1.62
sys/compat/netbsd32/netbsd32_signal.c: revision 1.45
share/man/man9/kauth.9: revision 1.109
sys/miscfs/procfs/files.procfs: revision 1.12
sys/compat/netbsd32/netbsd32.h: revision 1.115
sys/miscfs/procfs/procfs.h: revision 1.72
sys/compat/netbsd32/netbsd32_ptrace.c: revision 1.5
sys/kern/kern_sig.c: revision 1.337
sys/sys/kauth.h: revision 1.75
sys/sys/sysctl.h: revision 1.224
sys/kern/sys_ptrace_common.c: revision 1.30
sys/kern/sys_ptrace_common.c: revision 1.31
sys/kern/sys_ptrace_common.c: revision 1.32
sys/kern/sys_ptrace_common.c: revision 1.33
sys/compat/linux/arch/arm/linux_ptrace.c: revision 1.20
sys/kern/sys_ptrace_common.c: revision 1.34
sys/kern/sys_ptrace_common.c: revision 1.36
sys/kern/kern_proc.c: revision 1.207
sys/kern/kern_exit.c: revision 1.269
doc/TODO.ptrace: revision 1.29

Make {s,g}et{db,fp,}regs work again for PK_32 processes
XXX: pullup-8

add disgusting magic to handle compat_netbsd32 as a module.

use process_*reg32 instead of struct *reg32.

Remove the filesystem tracing feature

This is a legacy interface from 4.4BSD, and it was
introduced to overcome shortcomings of ptrace(2) at that time, which are
no longer relevant (performance). Today /proc/#/ctl offers a narrow
subset of ptrace(2) commands and is not applicable for modern
applications use beyond simplistic tracing scenarios.

This removal will simplify kernel internals. Users will still be able to
use all the other /proc files.

This change won't affect other procfs files or the Linux compat
features within mount_procfs(8). /proc/#/ctl isn't available on Linux.

Remove:
- /proc/#/ctl from mount_procfs(8)
- P_FSTRACE note from the documentation of ps(1)
- /proc/#/ctl and filesystem tracing documentation from mount_procfs(8)
- KAUTH_REQ_PROCESS_PROCFS_CTL documentation from kauth(9)
- source code file miscfs/procfs/procfs_ctl.c
- PFSctl and procfs_doctl() from sys/miscfs/procfs/procfs.h
- KAUTH_REQ_PROCESS_PROCFS_CTL from sys/sys/kauth.h
- PSL_FSTRACE (0x00010000) from sys/sys/proc.h
- P_FSTRACE (0x00010000) from sys/sys/sysctl.h

Reduce code complexity after removal of this functionality.

Update TODO.ptrace accordingly: remove two entries about /proc tracing.

Do not keep legacy notes as comments in the headers about removed
PSL_FSTRACE / P_FSTRACE, as this interface had little number of users
(close or equal to zero).

Proposed on tech-kern@.

All filesystem tracing utility users are encouraged to switch to ptrace(2).

Sponsored by <The NetBSD Foundation>

untangle the mess:
- factor out common code
- break each ptrace subcall to its own sub-function
.. more to come ...
- reduce ifdef ugliness by moving it up top.
- factor out PT_IO and make PT_{READ,WRITE}_{I,D} use it
- factor out PT_DUMPCORE
- factor out sendsig code
.. more to come ...

handle siginfo requests for ptrace32

ptrace: Partially undo PT_{READ,WRITE}_{I,D} and unbreak these commands

The refactored code did not work and was generating EFAULT.

Sponsored by <The NetBSD Foundation>

Merge the code back; the problem was that since we are reading/writing
to a kernel address for PT_{READ,WRITE}_{I,D} we need the kernel vmspace.
provide separate read and write functions to accommodate register functions
that need a size argument.

don't ignore error from copyout_piod

Use the proper process (the tracee) to get information about lwps and
registers and the tracer for vmspace.

Add new sysctl(3) entry: security.models.extensions.user_set_dbregs

Model this new sysctl(3) entry after "user_set_cpu_affinity" in the same
level of sysctl(3) switches.

Allow reading Debug Registers unconditionally (no change here). This is
convenient as even if a user of a debugger does not use hardware assisted
watchpoints/breakpoints, a debugger can still prompt these values to store
in an internal cache with context of registers. Reading them should have
no security concerns.

Add a paranoid MI switch that prohibits by default setting these registers
by a regular user (non-superuser). Make this switch disabled by default.
There are enough reserved bits out there to allow using them
unconditionally on hardened hosts.

Features shipped with Debug Registers are optional features in debuggers.
There is no reduction in elementary functionality.

Reviewed by <christos>

Sponsored by <The NetBSD Foundation>
 1.100.4.3 21-Apr-2020  martin Sync with HEAD
 1.100.4.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.100.4.1 10-Jun-2019  christos Sync with HEAD
 1.101.6.2 19-Jan-2020  ad Set IMNT_SHRLOOKUP and use it for the in-cache case. Need to check what
more can be done with tmpfs though, it can probably do the whole lookup.
 1.101.6.1 17-Jan-2020  ad Sync with head.
 1.101.4.1 04-Feb-2021  martin Pull up following revision(s) (requested by riastradh in ticket #1195):

sys/miscfs/procfs/procfs_vfsops.c: revision 1.110

Fix procfs environ node.
 1.104.2.1 20-Apr-2020  bouyer Sync with HEAD
 1.109.2.1 03-Jan-2021  thorpej Sync w/ HEAD.
 1.111.4.3 16-Sep-2024  martin Pull up following revision(s) (requested by pgoyette in ticket #868):

sys/miscfs/procfs/procfs_vfsops.c: revision 1.120 (via patch)

Define dependencies based on build options.
 1.111.4.2 13-Sep-2024  martin Pull up following revision(s) (requested by pgoyette in ticket #857):

sys/modules/procfs/Makefile: revision 1.8
sys/miscfs/procfs/procfs_vfsops.c: revision 1.118
sys/miscfs/procfs/procfs_vfsops.c: revision 1.119

procfs grew a new dependency

Include the SYSV_* entries for modular procfs

Now we have another dependency for the SYSV_* stuff.
 1.111.4.1 18-Apr-2024  martin Pull up following revision(s) (requested by hannken in ticket #668):

sys/miscfs/procfs/procfs.h: revision 1.83
sys/miscfs/procfs/procfs.h: revision 1.84
sys/kern/vfs_mount.c: revision 1.104
sys/miscfs/procfs/procfs_vnops.c: revision 1.230
sys/kern/init_main.c: revision 1.547
sys/kern/kern_hook.c: revision 1.15
sys/miscfs/procfs/procfs_vfsops.c: revision 1.112
sys/miscfs/procfs/procfs_vfsops.c: revision 1.113
sys/miscfs/procfs/procfs_vfsops.c: revision 1.114
sys/miscfs/procfs/procfs_subr.c: revision 1.117

Print dangling vnode before panic() to help debug.

PR kern/57775 ""panic: unmount: dangling vnode" while umounting procfs"
Protect kernel hooks exechook, exithook and forkhook with rwlock.

Lock as writer on establish/disestablish and as reader on list traverse.

For exechook ride "exec_lock" as it is already taken as reader when
traversing the list. Add local locks for exithook and forkhook.

Move exec_init before signal_init as signal_init calls exechook_establish()
that needs "exec_lock".

PR kern/39913 "exec, fork, exit hooks need locking"

Add a hashmap to access all procfs nodes by pid.

Using the exechook to revoke procfs nodes is racy and may deadlock:
one thread runs doexechooks() -> procfs_revoke_vnodes() and wants to suspend
the file system for vgone(), while another thread runs a forced unmount,
has the file system suspended, tries to disestablish the exechook and
waits for doexechooks() to complete.

Establish/disestablish the exechook on module load/unload instead of
mount/unmount and use the hashmap to access all procfs nodes for this pid.

May fix PR kern/57775 ""panic: unmount: dangling vnode" while umounting procfs"

Remove all procfs nodes for this process on process exit.
 1.116.2.1 02-Aug-2025  perseant Sync with HEAD
 1.233 01-Jul-2024  christos Add linux POSIX message queue support (Ricardo Branco)
 1.232 12-May-2024  christos branches: 1.232.2;
PR/58227: Ricardo Branco: Add support for proc/sysvipc in Linux emulator
 1.231 12-May-2024  christos PR/58240: Ricardo Branco: Add support for proc/self/limits as used by Linux
 1.230 17-Jan-2024  hannken Add a hashmap to access all procfs nodes by pid.
 1.229 17-Jun-2022  shm branches: 1.229.4;
Add missing permission check
 1.228 27-Mar-2022  christos dedup the eofs link/symlink methods
 1.227 17-Jan-2022  bouyer If the calling process is running under linux emulation, make /proc/xxx/fd/
return only symlinks pointing to the original file in the filesystem,
instead of a hard link. This matches the linux behavior, and some
linux programs rely on it (they unconditionally call readlink() on
/proc/xxx/fd/yy and don't deal with it returning EINVAL).
Proposed on tech-kern@ in
http://mail-index.netbsd.org/tech-kern/2022/01/11/msg027877.html
 1.226 14-Jan-2022  christos Fix emul and exe DT_ types (from RVP, as was the previous commit)
 1.225 14-Jan-2022  christos Put the appropriate DT_ constant in the dirent structure depending on the
file type.
 1.224 11-Jan-2022  christos remove redundant error initialization and break earlier. (from rvp)
 1.223 11-Jan-2022  hannken Use a single "p" variable.

Should fix PR kern/56614: kernel panic on tmux
 1.222 10-Jan-2022  christos use a single nc variable.
 1.221 10-Jan-2022  christos Fix locking in the error path (from RVP). Centralize unlock code.
 1.220 08-Dec-2021  andvar s/efficent/efficient/ in comments.
 1.219 05-Oct-2021  christos PR/53299: RVP: kernfs and procfs are broken when sysctl security.curtain
is enabled
 1.218 18-Jul-2021  dholland Abolish all the silly indirection macros for initializing vnode ops tables.

These are things of the form #define foofs_op genfs_op, or #define
foofs_op genfs_eopnotsupp, or similar. They serve no purpose besides
obfuscation, and have gotten cutpasted all over everywhere.
 1.217 29-Jun-2021  dholland - Add a new vnode op: VOP_PARSEPATH.
- Move namei_getcomponent to genfs_vnops.c and call it genfs_parsepath.
- Add a parsepath entry to every vnode ops table.

VOP_PARSEPATH takes a directory vnode to be searched and a complete
following path and chooses how much of that path to consume. To begin
with, all parsepath calls are genfs_parsepath, which locates the first
'/' as always.

Note that the call doesn't take the whole struct componentname, only
the string. The other bits of struct componentname should not be
needed and there's no reason to cause potential complications by
exposing them.
 1.216 28-Jun-2021  chs VOP_BMAP() may be called via ioctl(FIOGETBMAP) on any vnode that applications
can open. Change various pseudo-fs *_bmap methods to return an error instead
of panicking.

Reported-by: syzbot+8289a3eaf2ba60958c87@syzkaller.appspotmail.com
 1.215 27-Jun-2020  christos branches: 1.215.6;
Introduce genfs_pathconf() and use it for the default case in all filesystems.
 1.214 23-May-2020  ad Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
 1.213 16-May-2020  christos Add ACL support for FFS. From FreeBSD.
 1.212 29-Apr-2020  thorpej If the procfs mount is marked as linux-compat, then allow proc lookup
by any LWP ID in the proc, not just the canonical PID.
 1.211 21-Apr-2020  ad Revert the changes made in February to make cwdinfo use mostly lockless,
which relied on taking extra vnode refs.

Having benchmarked various experimental changes over the past few months it
seems that it's better to avoid vnode refs as much as possible. cwdi_lock
as a RW lock already did that to some extent for getcwd() and will permit
the same for namei() too.
 1.210 24-Feb-2020  ad branches: 1.210.4;
v_interlock -> vmobjlock
 1.209 23-Feb-2020  ad Merge from ad-namecache:

- Have a stab at clustering the members of vnode_t and vnode_impl_t in a
more cache-conscious way. With that done, go back to adjusting v_usecount
with atomics and keep vi_lock directly in vnode_impl_t (saves KVA).

- Allow VOP_LOCK(LK_NONE) for the benefit of VFS_VGET() and VFS_ROOT().
Make sure LK_UPGRADE always comes with LK_NOWAIT.

- Make cwdinfo use mostly lockless.
 1.208 01-Feb-2020  riastradh Load struct filedesc::fd_dt with atomic_load_consume.

Exceptions: when fd_refcnt <= 1, or when holding fd_lock.

While here:

- Restore KASSERT(mutex_owned(&fdp->fd_lock)) in fd_unused.
=> This is used only in fd_close and fd_abort, where it holds.
- Move bounds check assertion in fd_putfile to where it matters.
- Store fd_dt with atomic_store_release.
- Move load of fd_dt under lock in knote_fdclose.
- Omit membar_consumer in fdesc_readdir.
=> atomic_load_consume serves the same purpose now.
=> Was needed only on alpha anyway.
 1.207 29-Aug-2019  hannken branches: 1.207.2;
Add missing operation VOP_GETPAGES() returning EFAULT.

Without this operation posix_fadvise(..., POSIX_FADV_WILLNEED)
would leave the v_interlock held.

Observed by maxv@
 1.206 30-Mar-2019  christos branches: 1.206.4;
add a node for the process resource limits.
 1.205 14-Oct-2018  jdolecek remove M_CANFAIL flag for malloc(9) - it was completely ignored, so had
actually no effect
 1.204 03-Sep-2018  riastradh Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
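
The truncation hazard described in the 1.204 message can be sketched as
follows. This is an illustration only, not the libkern source:
uimin_sketch(), MIN_SKETCH() and the two clamp_* helpers are hypothetical
names, and the example assumes a platform where unsigned int is 32 bits
wide and uint64_t is wider.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a min() that is defined on unsigned int,
 * as the commit message describes. Not the actual libkern code.
 */
static unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Macro form quoted in the commit message: the comparison happens in
 * the operands' own type, so nothing is truncated, but each argument
 * may be evaluated more than once.
 */
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))

/*
 * Clamp a 64-bit length through the function. Both arguments are
 * implicitly converted to unsigned int first, so the upper 32 bits
 * are silently dropped (assuming a 32-bit unsigned int).
 */
static uint64_t
clamp_with_function(uint64_t len, uint64_t limit)
{
	return uimin_sketch(len, limit);
}

/* Clamp the same values through the macro: no conversion, no truncation. */
static uint64_t
clamp_with_macro(uint64_t len, uint64_t limit)
{
	return MIN_SKETCH(len, limit);
}
```

With len = 2^32 + 5 and limit = 100, clamp_with_function() compares the
truncated value 5 against 100 and returns 5, while clamp_with_macro()
returns 100. The explicit "ui" prefix makes that conversion visible at the
call site, which is the point of the rename.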
 1.203 07-Apr-2018  hannken branches: 1.203.2;
Lock the target cwdi and take an additional reference to the
vnode we are interested in to prevent it from disappearing
before getcwd_common().

Should fix PR kern/53096 (netbsd-8 crash on heavy disk I/O)
 1.202 31-Dec-2017  christos branches: 1.202.2;
Add an environ node
 1.201 01-Dec-2017  christos Allow procfs_kqfilter, since we allow poll. "go" does it.
 1.200 08-Nov-2017  christos fix locking, remove error(1) comments.
 1.199 08-Nov-2017  christos use p->p_path, remove unused code.
 1.198 28-Aug-2017  kamil Remove the filesystem tracing feature

This is a legacy interface from 4.4BSD, and it was
introduced to overcome shortcomings of ptrace(2) at that time, which are
no longer relevant (performance). Today /proc/#/ctl offers a narrow
subset of ptrace(2) commands and is not applicable for modern
applications use beyond simplistic tracing scenarios.

This removal will simplify kernel internals. Users will still be able to
use all the other /proc files.

This change won't affect other procfs files or the Linux compat
features within mount_procfs(8). /proc/#/ctl isn't available on Linux.

Remove:
- /proc/#/ctl from mount_procfs(8)
- P_FSTRACE note from the documentation of ps(1)
- /proc/#/ctl and filesystem tracing documentation from mount_procfs(8)
- KAUTH_REQ_PROCESS_PROCFS_CTL documentation from kauth(9)
- source code file miscfs/procfs/procfs_ctl.c
- PFSctl and procfs_doctl() from sys/miscfs/procfs/procfs.h
- KAUTH_REQ_PROCESS_PROCFS_CTL from sys/sys/kauth.h
- PSL_FSTRACE (0x00010000) from sys/sys/proc.h
- P_FSTRACE (0x00010000) from sys/sys/sysctl.h

Reduce code complexity after removal of this functionality.

Update TODO.ptrace accordingly: remove two entries about /proc tracing.

Do not keep legacy notes as comments in the headers about removed
PSL_FSTRACE / P_FSTRACE, as this interface had little number of users
(close or equal to zero).

Proposed on tech-kern@.

All filesystem tracing utility users are encouraged to switch to ptrace(2).

Sponsored by <The NetBSD Foundation>
 1.197 26-May-2017  riastradh branches: 1.197.2;
Make VOP_RECLAIM do the last unlock of the vnode.

VOP_RECLAIM naturally has exclusive access to the vnode, so having it
locked on entry is not strictly necessary -- but it means if there
are any final operations that must be done on the vnode, such as
ffs_update, requiring exclusive access to it, we can now kassert that
the vnode is locked in those operations.

We can't just have the caller release the last lock because some file
systems don't use genfs_lock, and require the vnode to remain valid
for VOP_UNLOCK to work, notably unionfs.
 1.196 11-Apr-2017  riastradh Make VOP_INACTIVE preserve vnode lock on return.

Discussed on tech-kern:
https://mail-index.netbsd.org/tech-kern/2017/04/01/msg021751.html

Ride 7.99.68, a bumpy bus of incremental vfs improvements!
 1.195 30-Mar-2017  christos add an auxv node.
 1.194 20-Aug-2016  hannken branches: 1.194.2;
Remove now obsolete operation vcache_remove().

Welcome to 7.99.36
 1.193 20-Apr-2015  riastradh branches: 1.193.2;
Make VOP_LINK return directory still locked and referenced.

Ride 7.99.10 bump.
 1.192 05-Sep-2014  matt branches: 1.192.2;
Try not to use f_data, use f_{vnode,socket,pipe,mqueue,kqueue,ksem} to get
a correctly typed pointer.
 1.191 27-Jul-2014  hannken branches: 1.191.2; 1.191.4; 1.191.8;
Change procfs from hashlist to vcache.
- Key is (type, pid, fd)
- Remove argument "p" from procfs_allocvp(). It is only used
when "type == PFSfd". Lookup the proc with proc_find() when
procfs_loadvnode() needs it.
- Use a vfs_vnode_iterator for procfs_revoke_vnodes().
 1.190 25-Jul-2014  dholland Add VOP_FALLOCATE and VOP_FDISCARD to every vnode ops table I can
find.

The filesystem ones all call genfs_eopnotsupp - right now I am only
implementing the plumbing and we can implement fallocate and/or
fdiscard for files later.

The device ones call spec_fallocate (which is also genfs_eopnotsupp)
and spec_fdiscard, which dispatches to the device-level op.

The fifo ones all call vn_fifo_bypass, which also ends up being
EOPNOTSUPP.
 1.189 07-Feb-2014  hannken branches: 1.189.2;
Change vnode operation lookup to return the resulting vnode *vpp unlocked.
Change cache_lookup() to return an unlocked vnode.

Discussed on tech-kern@

Welcome to 6.99.31
 1.188 23-Jan-2014  hannken Change vnode operations create, mknod, mkdir and symlink to return
the resulting vnode *vpp unlocked.

Discussed on tech-kern@

Welcome to 6.99.30
 1.187 17-Jan-2014  hannken Change vnode operations create, mknod, mkdir and symlink to keep the
directory node dvp locked on return.

Discussed on tech-kern@

Welcome to 6.99.29
 1.186 18-Mar-2013  plunky branches: 1.186.6;
C99 section 6.7.2.3 (Tags) Note 3 states that:

A type specifier of the form

enum identifier

without an enumerator list shall only appear after the type it
specifies is complete.

which means that we cannot pass an "enum vtype" argument to
kauth_access_action() without fully specifying the type first.
Unfortunately there is a complicated include file loop which
makes that difficult, so convert this minimal function into a
macro (and capitalize it).

(ok elad@)
 1.185 25-Nov-2012  christos do something reasonable with kernel semaphores.
 1.184 28-May-2012  christos branches: 1.184.2;
add a task process subdirectory for emul linux
 1.183 13-Mar-2012  elad Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.
 1.182 04-Sep-2011  jmcneill branches: 1.182.2; 1.182.6;
PR# kern/45021: Please support /emul/linux/proc/version

Add /proc/version for procfs with -o linux. The version reported depends
on the emulation type of the calling process:

$ cat /proc/version
NetBSD version 5.99.55 (netbsd@localhost) (gcc version 4.1.3 20080704 prerelease (NetBSD nb2 20081120)) NetBSD 5.99.55 (GENERIC) #39: Sun Sep 4 09:10:05 EDT 2011

$ /emul/linux/bin/cat /proc/version
Linux version 2.6.18 (linux@localhost) (gcc version 4.1.3 20080704 prerelease (NetBSD nb2 20081120)) #0 Wed Mar 3 03:03:03 PST 2010

$ /emul/linux32/bin/cat /proc/version
Linux version 2.6.18 (linux32@localhost) (gcc version 4.1.3 20080704 prerelease (NetBSD nb2 20081120)) #0 Wed Mar 3 03:03:03 PST 2010
 1.181 23-Jun-2011  christos From Aleksey Cheusov: Don't make it easy for compromised systems to bypass
ASLR protections by providing the mapping addresses of programs to everyone.
 1.180 01-Jul-2010  rmind Remove pfind() and pgfind(), fix locking in various broken uses of these.
Rename real routines to proc_find() and pgrp_find(), remove PFIND_* flags
and have consistent behaviour. Provide proc_find_raw() for special cases.
Fix memory leak in sysctl_proc_corename().

COMPAT_LINUX: rework ptrace() locking, minimise differences between
different versions per-arch.

Note: while this change adds some formal cosmetics for COMPAT_DARWIN and
COMPAT_IRIX - locking there is utterly broken (for ages).

Fixes PR/43176.
 1.179 24-Jun-2010  hannken Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
 1.178 08-Jun-2010  hannken Procfs_lookup() does not lookup directory descriptors in the fd/
subdirectory. There is no need for recursive vnode locking here.

Ok: Christos Zoulas <christos@netbsd.org>
 1.177 08-Jan-2010  pooka branches: 1.177.2; 1.177.4;
The VATTR_NULL/VREF/VHOLD/HOLDRELE() macros lost their will to live
years ago when the kernel was modified to not alter ABI based on
DIAGNOSTIC, and now just call the respective function interfaces
(in lowercase). Plenty of mix'n match upper/lowercase has crept
into the tree since then. Nuke the macros and convert all callsites
to lowercase.

no functional change
 1.176 03-Jul-2009  elad Where possible, extract the file-system's access() routine to two internal
functions: the first checking if the operation is possible (regardless of
permissions), the second checking file-system permissions, ACLs, etc.

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/06/21/msg005311.html
 1.175 23-Jun-2009  elad Move the implementation of vaccess() to genfs_can_access(), in line with
the other routines of the same spirit.

Adjust file-system code to use it.

Keep vaccess() for KPI compatibility and to keep element of least
surprise. A "diagnostic" message warning that vaccess() is deprecated will
be printed when it's used (obviously, only in DIAGNOSTIC kernels).

No objections on tech-kern@:

http://mail-index.netbsd.org/tech-kern/2009/06/21/msg005310.html
 1.174 24-May-2009  ad More changes to improve kern_descrip.c.

- Avoid atomics in more places.
- Remove the per-descriptor mutex, and just use filedesc_t::fd_lock.
It was only being used to synchronize close, and in any case we needed
to take fd_lock to free the descriptor slot.
- Optimize certain paths for the <NDFDFILE case.
- Sprinkle more comments and assertions.
- Cache more stuff in filedesc_t.
- Fix numerous minor bugs spotted along the way.
- Restructure how the open files array is maintained, for clarity and so
that we can eliminate the membar_consumer() call in fd_getfile(). This is
mostly syntactic sugar; the main functional change is that fd_nfiles now
lives alongside the open file array.

Some measurements with libmicro:

- simple file syscalls like close() are between 1 and 10% faster.
- some nice improvements, e.g. poll(1000) which is ~50% faster.
 1.173 17-Dec-2008  cegger branches: 1.173.2;
kill MALLOC and FREE macros.
 1.172 05-Sep-2008  skrll branches: 1.172.2;
PR/39324 kernel diagnostic assertion "l->l_stat != LSZOMB" failed.

Ignore procs with zero or all LSZOMB LWPs. Get a non-LSZOMB LWP to perform
operations against as part of the deal.

procfs really needs to be updated to support multi-threading fully.
Hi Antti!
 1.171 05-Sep-2008  skrll ANSIfy
 1.170 02-Jul-2008  rmind branches: 1.170.2;
Remove proc_representative_lwp(), use a simple LIST_FIRST() instead.
OK by <ad>.
 1.169 28-Apr-2008  martin branches: 1.169.2; 1.169.4;
Remove clause 3 and 4 from TNF licenses
 1.168 24-Apr-2008  ad branches: 1.168.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
 1.167 24-Apr-2008  ad Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
 1.166 21-Mar-2008  ad branches: 1.166.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
 1.165 23-Jan-2008  elad branches: 1.165.6;
Tons of process scope changes.

- Add a KAUTH_PROCESS_SCHEDULER action, to handle scheduler related
requests, and add specific requests for set/get scheduler policy and
set/get scheduler parameters.

- Add a KAUTH_PROCESS_KEVENT_FILTER action, to handle kevent(2) related
requests.

- Add a KAUTH_DEVICE_TTY_STI action to handle requests to TIOCSTI.

- Add requests for the KAUTH_PROCESS_CANSEE action, indicating what
process information is being looked at (entry itself, args, env,
open files).

- Add requests for the KAUTH_PROCESS_RLIMIT action indicating set/get.

- Add requests for the KAUTH_PROCESS_CORENAME action indicating set/get.

- Make bsd44 secmodel code handle the newly added requests appropriately.

All of the above make it possible to issue finer-grained kauth(9) calls in
many places, removing some KAUTH_GENERIC_ISSUSER requests.

- Remove the "CAN" from KAUTH_PROCESS_CAN{KTRACE,PROCFS,PTRACE,SIGNAL}.

Discussed with christos@ and yamt@.
 1.164 02-Jan-2008  ad Merge vmlocking2 to head.
 1.163 26-Nov-2007  pooka branches: 1.163.2; 1.163.6;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.162 09-Nov-2007  christos make the last argument of procfs_dir size_t
 1.161 07-Nov-2007  ad Merge from vmlocking:

- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
 1.160 10-Oct-2007  ad branches: 1.160.2; 1.160.4;
Merge from vmlocking:

- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
 1.159 08-Oct-2007  ad Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.
 1.158 22-Jul-2007  pooka branches: 1.158.4; 1.158.6; 1.158.8; 1.158.10;
Don't allow getcwd() on procfs vnodes and provide "/" as the path
instead of the result from getcwd(). This works around locking
panics caused by namei calling VOP_READLINK while holding on to a
directory lock and getcwd() trying to acquire that lock. The real
fix would be to get rid of getcwd() calls within VOPs (not locking
safe), but that's not a viable option in the netbsd-4 timeframe.

Suggestion for workaround from David Holland.
 1.157 24-May-2007  agc branches: 1.157.2;
Extend the Linux emulation of /proc to include

/proc/stat
/proc/loadavg and
/proc/<pid>/statm.

These are only present when -o linux is specified as a mount option
to procfs.

Factor out some common code so that it can be used by a number of
functions.

XXX The values returned in the statm emulation need to be verified.
 1.156 04-Apr-2007  rmind Unfortunately, missed procfs_proc_unlock() in previous.
Pointed out by pooka@
 1.155 04-Apr-2007  rmind procfs_readlink: Handle a possible failure of fd_getfile(); also, we
do not need to check for error again.
CID: 4436
 1.154 09-Mar-2007  ad branches: 1.154.2; 1.154.4;
- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.
 1.153 04-Mar-2007  christos Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.152 03-Mar-2007  salo Don't prepend rootvnode to the path in non-NULL case for exe links.
It breaks procfs in chroot.

from <christos>, tested by me.
 1.151 19-Feb-2007  pooka When checking for file validity under pid/, do proper proc->lwp
lookup (fsvo proper) instead of fiddling directly with the lwp
list.
 1.150 18-Feb-2007  pooka Don't check for validity of p in lookup for root nodes, since it
will always be NULL. Rather, just call pt_valid with NULL directly
and let it decide if we're a linux mount or not.
 1.149 17-Feb-2007  pavel Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.
 1.148 16-Feb-2007  pooka branches: 1.148.2;
In lookup, when checking for procfs process node validity, target the
process we're trying to get information about through procfs, not
the caller of lookup.

fixes 'ls -l /proc/*/file' panic, which would occur when trying to
lookup "file" for a kernel thread, which doesn't have p->p_textvp.
 1.147 15-Feb-2007  ad Need to acquire procp->p_mutex for procfs_dir().
 1.146 11-Feb-2007  ad Eliminate a couple of reference count and mutex leaks.
 1.145 09-Feb-2007  ad Merge newlock2 to head.
 1.144 25-Dec-2006  elad PR/35226: Johann Franz: Problems with permissions in
/usr/pkg/emul/linux/proc .

Okay mlelstv@
 1.143 09-Dec-2006  chs a smorgasbord of improvements to vnode locking and path lookup:
- LOCKPARENT is no longer relevant for lookup(), relookup() or VOP_LOOKUP().
these now always return the parent vnode locked. namei() works as before.
lookup() and various other paths no longer acquire vnode locks in the
wrong order via vrele(). fixes PR 32535.
as a nice side effect, path lookup is also up to 25% faster.
- the above allows us to get rid of PDIRUNLOCK.
- also get rid of WANTPARENT (just use LOCKPARENT and unlock it).
- remove an assumption in layer_node_find() that all file systems implement
a recursive VOP_LOCK() (unionfs doesn't).
- require that all file systems supply vfs_vptofh and vfs_fhtovp routines.
fill in eopnotsupp() for file systems that don't support being exported
and remove the checks for NULL. (layerfs calls these without checking.)
- in union_lookup1(), don't change refcounts in the ISDOTDOT case, just
adjust which vnode is locked. fixes PR 33374.
- apply fixes for ufs_rename() from ufs_vnops.c rev. 1.61 to ext2fs_rename().
 1.142 04-Dec-2006  christos From Nicolas Joly: restore previous behavior in procfs_validfile_linux, since
readdir passes a NULL lwp.
 1.141 03-Dec-2006  elad Move kauth(9) call to where it belongs. Noticed by Nicolas Joly, thanks!
 1.140 28-Nov-2006  elad branches: 1.140.2;
Move ktrace, ptrace, systrace, and procfs to use kauth(9).

First, remove process_checkioperm() calls from MD code. Similar checks
using kauth(9) routines (on the process scope, using appropriate action)
are done in the callers.

Add secmodel back-end to handle each subsystem.
 1.139 25-Nov-2006  skrll Expose the 'exe' symlink to the process realpath in NetBSD as well. An
example user is gdb.

OK'd by christos.
 1.138 16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.137 29-Oct-2006  christos add an "emul" file node.
 1.136 25-Oct-2006  christos 1. fix procfs_validfile{,_linux} to test for NULL pointers properly.
2. make "exe" entry be a symlink to the executable, instead of pointing
directly to the vnode of the executable.
3. factor out commonly used code.
 1.135 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.134 20-Sep-2006  manu Emulate Linux's /proc/devices
 1.133 13-Jun-2006  yamt branches: 1.133.6; 1.133.8;
use KAUTH_PROCESS_CANSEE rather than CURTAIN where appropriate.
 1.132 13-Jun-2006  yamt remove unnecessary arguments from kauth_authorize_process.
ie. make it similar to the one found in apple TN.
 1.131 07-Jun-2006  kardel merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.130 14-May-2006  elad branches: 1.130.2;
integrate kauth.
 1.129 02-Feb-2006  christos branches: 1.129.2; 1.129.4; 1.129.6; 1.129.8;
PR/32692: Matthew Mondor: linux compatibility in /proc/self should point
directly to the directory containing the pid instead of pointing to
/proc/curproc, because some programs rely on calling readlink on /proc/self
to get the pid.
 1.128 11-Dec-2005  christos branches: 1.128.2; 1.128.4;
merge ktrace-lwp.
 1.127 02-Nov-2005  yamt merge yamt-vop branch. remove following VOPs.

VOP_BLKATOFF
VOP_VALLOC
VOP_BALLOC
VOP_REALLOCBLKS
VOP_VFREE
VOP_TRUNCATE
VOP_UPDATE
 1.126 01-Oct-2005  atatat branches: 1.126.2;
Add "cwd" and "root" symlinks to each process's directory. The cwd
link points to the process's current working directory, and the root
link points to the process's root directory. What else would you
expect?

For directories that are out of reach (caller is in a chroot, target
process is in a different chroot, etc), the links point to "/"
instead.
 1.125 11-Sep-2005  elad Implement curtain for procfs.
 1.124 30-Aug-2005  xtraeme Remove __P()
 1.123 29-May-2005  christos branches: 1.123.2;
- sprinkle const
- avoid shadowed variables.
 1.122 02-Apr-2005  christos PR/29782: Martin Husemann: procfs can not unmount when some process has its
current directory in curproc. Fix from Pedro Martelletto:
We cannot call vgone() from procfs_inactive() if we are coming from
vclean(). that's what's probably causing the deadlock.
 1.121 26-Feb-2005  perry nuke trailing whitespace
 1.120 04-Oct-2004  yamt branches: 1.120.4; 1.120.6;
procfs_readdir:
- return correct cookie when buffer size is small.
- simplify logic.
 1.119 04-Oct-2004  yamt procfs_readdir: remove a redundant assignment.
 1.118 02-Oct-2004  yamt procfs_getattr: correct size of /proc/self.
 1.117 01-Oct-2004  yamt procfs_readdir:
- fix a locking problem, using proclist_foreach_call. PR/27098.
- correct snprintf size argument.
 1.116 01-Oct-2004  yamt procfs_readdir: fix an offset handling bug after addition of /proc/self.
 1.115 01-Oct-2004  yamt procfs_readdir: use a list macro.
 1.114 20-Sep-2004  jdolecek add 'mounts' file for -o linux, which lists all currently mounted
filesystems; Linux glibc statvfs() uses this to get some of mount flags,
and this file is also useful as /emul/linux/etc/mtab (via symlink)
 1.113 29-Apr-2004  jrf Removed remaining caddr_t casts we do not need in miscfs. Recompiled
kernel and ran for a day or so. There are still some caddr_t types in
the arguments of some calls, I will do those separately (later) as
they touch a lot more of the system.
Approved by christos@NetBSD.org.
 1.112 22-Apr-2004  itojun sprintf -> snprintf
 1.111 15-Feb-2004  jdolecek unlock the descriptor table simple lock after fd_getfile() call in
procfs_readdir()
fixes procfs locking problems reported on current-users@, problem place
found by enami tsugutomo
 1.110 30-Oct-2003  simonb Remove some assigned-to but otherwise unused variables.
 1.109 27-Sep-2003  darcy Changes as discussed with itojun on tech-kern. I have modified the enums
to have KFS or PFS differentiators. Further I have wrapped the enum in
procfs in "#ifdef _KERNEL" as it is done in kernfs.

To see the discussion go to http://mail-index.NetBSD.org/tech-kern/2003/09/
and look for "Mismatched enums in include files" in the list.
 1.108 07-Sep-2003  itojun remove meaningless line (variable overwritten 2 lines below)
 1.107 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.106 29-Jun-2003  fvdl branches: 1.106.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.105 29-Jun-2003  thorpej Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.104 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.103 28-May-2003  christos Add /proc/<pid>/stat for linux compat. j2sdk1.4.2 depends on it.
 1.102 18-Apr-2003  christos Make symlinks for directories that point to the actual directory.
Make symlinks to [kqueue] and [misc] for kqueue and misc fds.
 1.101 17-Apr-2003  jdolecek do not show nodes corresponding to directory descriptors for process
in fd/ subdirectory, nor allow lookup/open for the nodes
this fixes PR kern/21187 for good, and also avoids interesting directory
locking issues
 1.100 17-Apr-2003  jdolecek procfs_readdir(): in Pfd case, only show descriptors of types we know
how to represent (vnodes, fifo, pipes); also use fd_getfile() et al

this avoids annoying EOPNOTSUPP error messages from ls -F and such
 1.99 17-Apr-2003  jdolecek procfs_lookup(): use fd_getfile() et al in Pfd case
 1.98 17-Apr-2003  jdolecek use fd_getfile() in procfs_getfp(), and FILE_USE()/FILE_UNUSE() the
returned file descriptor pointer appropriately
 1.97 17-Apr-2003  jdolecek make some local arrays/variables static + const
 1.96 10-Apr-2003  jdolecek use former genfs_eopnotsupp_rele() as genfs_eopnotsupp(), so that vnodes
are vput()/vrele()d as necessary - some filesystems did use the wrong
one for some ops, and it's just safer to not take the chance

based on suggestion by Bill Studenmund
 1.95 05-Apr-2003  dsl Remove pointless check against PID_MAX. Let pfind() do the validation.
(The new pid allocation code may decide to allocate pids above PID_MAX.)
 1.94 25-Feb-2003  jrf This addresses PR kern/19989. Thanks to hamajima@nagoya.ydc.co.jp for submitting this patch which enables /proc/uptime for linux emul. Patch reviewed by atatat@netbsd.org and tron@netbsd.org, approved by tron@netbsd.org.
 1.93 04-Jan-2003  martin Cast off_t expression to long long to match format even on 64 bit
platforms.

Shouldn't we introduce a PRIoff_t macro to create such format strings?
 1.92 03-Jan-2003  christos add LK_CANRECURSE in the locking of /proc/<pid>/fd/<n> and remove the curproc
kludge. Thanks to fvdl.
 1.91 03-Jan-2003  christos Implement /proc/<pid>/fd/<n>. This is work in progress. Questionable things:
- Is it ok to convert DTYPE_PIPE to VFIFO and DTYPE_SOCKET to VSOCK?
- XXX: Avoid locking issue in ls -Rl /proc by avoiding curproc
- Does I/O to pipes work?
- XXX: Are there security implications?
 1.90 03-Aug-2002  simonb Just use the "time" variable in the *_getattr functions instead of a call
to (the potentially expensive) microtime().
 1.89 09-May-2002  thorpej branches: 1.89.2;
Move code shared by procfs and the kernel proper out of procfs and
into the kernel proper (renaming functions from procfs_* to process_*).
 1.88 12-Jan-2002  christos Don't hide the real return code with EPERM.
 1.87 06-Dec-2001  chs add a VOP_PUTPAGES method for all the filesystems that don't have pages,
just unlock the interlock.
 1.86 05-Dec-2001  thorpej * Allow machine-dependent code to specify hooks for ptrace(2)
(__HAVE_PTRACE_MACHDEP) and procfs (__HAVE_PROCFS_MACHDEP).
These changes will allow platforms like x86 (XMM) and PowerPC
(AltiVec) to export extended register sets in a sane manner.

* Use __HAVE_PTRACE_MACHDEP to export x86 XMM registers (standard
FP + SSE/SSE2) using PT_{GET,SET}XMMREGS (in the machdep
ptrace request space).
* Use __HAVE_PROCFS_MACHDEP to export x86 XMM registers via
/proc/N/xmmregs in procfs.
 1.85 10-Nov-2001  lukem add RCSIDs
 1.84 06-Nov-2001  simonb Remove some variables that are set but never used.
 1.83 31-Aug-2001  chs branches: 1.83.2; 1.83.4;
map files are zero-length.
 1.82 03-Jun-2001  chs branches: 1.82.2;
procfs_bmap() should never be called, make it a "bad op".
let procfs_mmap() use the default error method.
 1.81 14-Apr-2001  kleink In procfs_readdir(), give /proc/# directories DT_DIR (rather than DT_REG).
 1.80 30-Mar-2001  fvdl Bump va_blocksize for the map files some more, so that programs with
quite a few mappings have a chance of being handled correctly if
st_blksize is looked at.
 1.79 29-Mar-2001  fvdl For -o linux mounts, add some code to emulate /proc/#/maps.
Needs NAMECACHE_ENTER_REVERSE to include filenames.
 1.78 21-Feb-2001  jdolecek branches: 1.78.2;
make some more constant arrays 'const'
 1.77 22-Jan-2001  jdolecek make filesystem vnodeop, specop, fifoop and vnodeopv_* arrays const
 1.76 17-Jan-2001  fvdl Add a few linux-style files, only enabled when -o linux is specified
for the mount. Currently these are /proc/cpuinfo and /proc/meminfo.
The former only does something on i386 right now.
 1.75 24-Nov-2000  chs remove dead code and other misc cleanup.
 1.74 09-Aug-2000  tv Only show the "exe" entry to Linux processes, suggested by christos.
Since there are actually three struct emul's for linux, use the e_name
field to determine eligibility with strcmp().
 1.73 09-Aug-2000  tv Some versions of Linux libc look for /proc/.../exe instead of /proc/../file.
Add an entry for "exe" that is the same as "file", provided only if
COMPAT_LINUX is set.
 1.72 03-Aug-2000  thorpej MALLOC()/FREE() are not to be used for variable sized allocations.
 1.71 28-Jun-2000  mrg <vm/vm.h> -> <uvm/uvm_extern.h>
 1.70 30-Mar-2000  simonb branches: 1.70.4;
Delete duplicate declaration of atopid().
 1.69 02-Sep-1999  thorpej branches: 1.69.2; 1.69.8;
Make /proc/self a symlink to /proc/curproc. I've observed Linux programs
that expect /proc/self/cmdline to exist.
 1.68 25-Aug-1999  sommerfeld Change variable used for directory offset from "int" to "off_t".
Overkill, but avoids a host of truncation problems.
 1.67 24-Aug-1999  sommerfeld Fix PR8270:

Problem turned out to be due to improper handling of reads beyond EOF:
they should just return without error with the uio unchanged, and the
caller will recognize this as a zero-byte return (EOF).

The previous fix to protect directory reads against bogus uio_offset
values returned EINVAL, which broke mount -o union, which only
union'ed in the lower directory if the upper directory cleanly
returned EOF.

While we're here, protect kernfs as well.
 1.66 14-Aug-1999  christos protect against large uio_offset
 1.65 03-Aug-1999  wrstuden Add support for fcntl(2) to generate VOP_FCNTL calls. Any fcntl
call with F_FSCTL set and F_SETFL calls generate calls to a new
fileop fo_fcntl. Add genfs_fcntl() and soo_fcntl() which return 0
for F_SETFL and EOPNOTSUPP otherwise. Have all leaf filesystems
use genfs_fcntl().

Reviewed by: thorpej
Tested by: wrstuden
 1.64 25-Jul-1999  thorpej Add calls to lock the proclist as appropriate.
 1.63 14-Jul-1999  thorpej Fix a paste-o in procfs_lookup() introduced with the vnode locking changes.
Fixes PR #7961, Mario Kemper <magick@bundy.lip.owl.de>.
 1.62 08-Jul-1999  wrstuden Bump osrelease to 1.4E. Add layerfs files, remove null_subr.c.

Update coda to new struct lock in struct vnode.

make fdescfs, kernfs, portalfs, and procfs actually lock their vnodes.
It's not that hard.

Make unionfs set v_vnlock = NULL so any overlayed fs will call its
VOP_LOCK.
 1.61 12-Mar-1999  christos branches: 1.61.2; 1.61.4;
PR/7143: Jaromir Dolecek: Add procfs/cmdline from Linux emulation
 1.60 25-Jan-1999  msaitoh Add /proc/#/map. From FreeBSD.
 1.59 08-Sep-1998  thorpej - Use proclists[], rather than checking allproc and zombproc explicitly.
- Add some comments about locking.
 1.58 13-Aug-1998  kleink Per POSIX, fail with EINVAL if advisory locking is attempted on a file type
that doesn't support it, rather than using a homegrown EBADF or EOPNOTSUPP.
 1.57 10-Aug-1998  matthias create miscfs/genfs/genfs_vnops.c:genfs_enoioctl and make all the other
filesystems use it instead of a private version.
 1.56 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.55 03-Aug-1998  kleink Recognize _PC_SYNC_IO.
 1.54 21-Apr-1998  fvdl procfs_readdir: in case of error, check if cookies actually have
been allocated before freeing them. From Wolfgang Solfrank.
 1.53 01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.52 10-Oct-1997  fvdl Bump last argument to VOP_READDIR to off_t (from u_long).
 1.51 27-Aug-1997  thorpej Fix a reversed argument which caused procfs_checkioperm() to always return
"OK". Add a few comments to avoid further confusion.
 1.50 12-Aug-1997  thorpej Fix the procfs hole described on current-users, similar to a fix for
FreeBSD by Sean Eric Fagan, but a bit different. This makes the checks
in the same places as sef's FreeBSD patch, but does not hardcode the
"kmem" group into the kernel, and also does a check identical to the
(3) and (4) checks in the NetBSD ptrace(2):

(1) it's not owned by you, or is set-id on exec (unless
you're root), or

(2) it's init, which controls the security level of the
entire system, and the system was not compiled with
permanently insecure mode turned on.
 1.49 08-May-1997  mycroft branches: 1.49.4;
Pass the vnode type to vaccess(), and use it when checking VEXEC. Make sure
that the mode bits passed to vaccess() and returned by foo_getattr() contain
only permission bits.
 1.48 05-May-1997  mycroft Need stat.h.
 1.47 05-May-1997  mycroft Eliminate bogus uses of V{READ,WRITE,EXEC}. Use S_I[RWX]{USR,GRP,OTH} where
appropriate.
 1.46 28-Apr-1997  mycroft Minor code cleanup.
 1.45 25-Oct-1996  cgd define path name string variables that we should not (and, thankfully, do
not) modify as 'const char *' rather than 'char *'.
 1.44 13-Oct-1996  christos backout previous kprintf changes
 1.43 10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.42 07-Sep-1996  mycroft Implement poll(2).
 1.41 01-Sep-1996  mycroft Add a set of generic file system operations that most file systems use.
Also, fix some time stamp bogosities.
 1.40 16-Mar-1996  christos Fix printf format follies.
 1.39 13-Feb-1996  mycroft GC *_nullop(). Minor nits.
 1.38 12-Feb-1996  christos close PR/2063: procfs_rw prototyped twice with different prototypes
 1.37 09-Feb-1996  christos miscfs prototype changes
 1.36 09-Feb-1996  mycroft Fix vop_link, vop_symlink, and vop_remove semantics in several ways:
* Change the argument names to vop_link so they actually make sense.
* Implement vop_link and vop_symlink for all file systems, so they do proper
cleanup.
* Require the file system to decide whether or not linking and unlinking of
directories is allowed, and disable it for all current file systems.
 1.35 09-Oct-1995  mycroft Use the index number as the cookie, rather than multiplying by UIO_MX.
 1.34 09-Oct-1995  mycroft Add support for cookies, mostly from Greg Hudson.
 1.33 15-Apr-1995  cgd fix timeval vs. timespec warnings
 1.32 03-Feb-1995  mycroft Return EROFS rather than ENOENT in many cases. Also some cosmetic cleanup.
 1.31 27-Dec-1994  mycroft Format police.
 1.30 24-Dec-1994  ws Implement and use a common access checking routine
 1.29 14-Dec-1994  mycroft Remove a_fp.
 1.28 14-Nov-1994  christos fixed struct comment
 1.27 30-Oct-1994  cgd be more careful with types, also pull in headers where necessary.
 1.26 20-Oct-1994  cgd update for new syscall args description mechanism
 1.25 30-Aug-1994  mycroft Convert process, file, and namei lists and hash tables to use queue.h.
 1.24 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.23 16-Jun-1994  mycroft Remove an unneeded test.
 1.22 15-Jun-1994  mycroft Minor update from JSP after merging my changes.
 1.21 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.20 05-May-1994  cgd lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.
 1.19 15-Apr-1994  cgd forgot these...
 1.18 12-Apr-1994  cgd be a bit smarter about determining if files shouldn't be seen by the user.
Also, DON'T allow a lookup to succeed on a file that's not visible!
 1.17 15-Feb-1994  mycroft Undo last change; executable is `file', not `a.out'.
 1.16 14-Feb-1994  ws Rename file -> a.out
 1.15 14-Feb-1994  ws Don't try to show a file for a process if there is none
 1.14 28-Jan-1994  cgd make a fpregs file.
 1.13 20-Jan-1994  ws Make procfs really work for debugging.
Implement note & notepg files in procfs.
 1.12 09-Jan-1994  ws Bug fixes and enhancements:
Make NFS serving work (BUT DON'T USE "attach" TO /proc/*/ctl FOR NOW!!!)
Make `curproc' a symbolic link
Add `.' and `..' entries to the directories.
Return better guesses on the size of the files.
 1.11 05-Jan-1994  cgd return size of 'reg' from getattr()
 1.10 05-Jan-1994  cgd make it compile (cleanly) for us
 1.9 05-Jan-1994  cgd add new procfs code, from Jan-Simon Pendry, jsp@sequent.com.
This is pretty-much "virgin", so that diffs can be done later.
 1.8 18-Dec-1993  mycroft Canonicalize all #includes.
 1.7 16-Sep-1993  cgd kill volatile warning.
 1.6 07-Sep-1993  ws branches: 1.6.2;
Changes to VFS readdir semantics
NFS changes for better cookie support
ISOFS changes for better Rockridge support and support for generation numbers
 1.5 26-Aug-1993  pk Implement setattr: mode for process entries; mode + uid/gid for the
PROCFS root directory.
Fixed omission in pfs_root() which came to light as a result of the above:
hold on to vnode for root dir.
 1.4 25-Aug-1993  pk Fixed improperly initialized nfsnode in pfs_lookup()
 1.3 24-Aug-1993  pk copyright update.
 1.2 24-Aug-1993  pk Rcs Id added.
 1.1 24-Aug-1993  pk branches: 1.1.1;
Initial version of a proc filesystem.
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.6.2.2 14-Nov-1993  mycroft Canonicalize all #includes.
 1.6.2.1 24-Sep-1993  mycroft Changes from trunk.
 1.49.4.3 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.49.4.2 28-Aug-1997  thorpej Update marc-pcmcia branch from trunk.
 1.49.4.1 23-Aug-1997  thorpej Update marc-pcmcia branch from trunk.
 1.61.4.1 02-Aug-1999  thorpej Update from trunk.
 1.61.2.2 14-Jan-2002  he Pull up revision 1.88 (via patch, requested by he):
Fix a ptrace/execve race condition which could be used to modify
the child process' image during execve. This would be a security
issue due to setuid programs.
 1.61.2.1 28-Aug-1999  he Pull up revisions 1.66-1.68:
Protect {fdesc,kernfs,procfs}_readdir against directory seeks
with bogus offsets. (sommerfeld)
 1.69.8.1 21-Dec-1999  wrstuden Initial commit of recent changes to make DEV_BSIZE go away.

Runs on i386, needs work on other arch's. Main kernel routines should be
fine, but a number of the stand programs need help.

cd, fd, ccd, wd, and sd have been updated. sd has been tested with non-512
byte block devices. vnd, raidframe, and lfs need work.

Non 2**n block support is automatic for LKM's and conditional for kernels
on "options NON_PO2_BLOCKS".
 1.69.2.6 21-Apr-2001  bouyer Sync with HEAD
 1.69.2.5 12-Mar-2001  bouyer Sync with HEAD.
 1.69.2.4 11-Feb-2001  bouyer Sync with HEAD.
 1.69.2.3 18-Jan-2001  bouyer Sync with head (for UBC+NFS fixes, mostly).
 1.69.2.2 08-Dec-2000  bouyer Sync with HEAD.
 1.69.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.70.4.2 14-Jan-2002  he Pull up revision 1.88 (via patch, requested by christos):
Fix a ptrace/execve race condition which could be used to modify
the child process' image during execve. This would be a security
issue due to setuid programs.
 1.70.4.1 30-Mar-2001  he Pull up revisions 1.74-1.76 (via patch, requested by fvdl):
Add some required Linux emulation bits to support the Linux
version of VMware.
 1.78.2.13 07-Jan-2003  thorpej Sync with HEAD.
 1.78.2.12 15-Oct-2002  nathanw Make all the procfs_validfoo() routines go back to taking a proc
instead of an lwp; they aren't doing anything useful with the LWP.

Revert changes that changed /proc/curproc to /proc/curlwp, and broke it in
the process.
 1.78.2.11 13-Aug-2002  nathanw Catch up to -current.
 1.78.2.10 12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.78.2.9 24-Jun-2002  nathanw Curproc->curlwp renaming.

Change uses of "curproc->l_proc" back to "curproc", which is more like the
original use. Bare uses of "curproc" are now "curlwp".

"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc : NULL)
so that it is always safe to reference curproc (*de*referencing curproc
is another story, but that's always been true).
 1.78.2.8 20-Jun-2002  nathanw Catch up to -current.
 1.78.2.7 28-Feb-2002  nathanw Catch up to -current.
 1.78.2.6 08-Jan-2002  nathanw Catch up to -current.
 1.78.2.5 14-Nov-2001  nathanw Catch up to -current.
 1.78.2.4 21-Sep-2001  nathanw Catch up to -current.
 1.78.2.3 21-Jun-2001  nathanw Catch up to -current.
 1.78.2.2 09-Apr-2001  nathanw Catch up with -current.
 1.78.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.82.2.5 06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.82.2.4 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.82.2.3 11-Feb-2002  jdolecek Sync w/ -current.
 1.82.2.2 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.82.2.1 13-Sep-2001  thorpej Update the kqueue branch to HEAD.
 1.83.4.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.83.2.1 18-Sep-2001  fvdl Various changes to make cloning devices possible:

* Add an extra argument (struct vnode **) to VOP_OPEN. If it is
not NULL, specfs will create a cloned (aliased) vnode during
the call, and return it there. The caller should release and
unlock the original vnode if a new vnode was returned. The
new vnode is returned locked.

* Add a flag field to the cdevsw and bdevsw structures.
DF_CLONING indicates that it wants a new vnode for each
open (XXX is there a better way? devprop?)

* If a device is cloning, always call the close entry
point for a VOP_CLOSE.


Also, rewrite cons.c to do the right thing with vnodes. Use VOPs
rather than direct device entry calls. Suggested by mycroft@

Light to moderate testing done on an i386 system (arch doesn't matter
though, these are MI changes).
 1.89.2.1 29-Aug-2002  gehenna catch up with -current.
 1.106.2.10 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.106.2.9 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.106.2.8 27-Oct-2004  skrll Fix various comments that describe the argument structures
 1.106.2.7 19-Oct-2004  skrll Sync with HEAD
 1.106.2.6 24-Sep-2004  skrll Sync with HEAD.
 1.106.2.5 21-Sep-2004  skrll Fix the sync with head I botched.
 1.106.2.4 18-Sep-2004  skrll Sync with HEAD.
 1.106.2.3 24-Aug-2004  skrll Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.106.2.2 03-Aug-2004  skrll Sync with HEAD
 1.106.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.120.6.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.120.4.1 29-Apr-2005  kent sync with -current
 1.123.2.10 24-Mar-2008  yamt sync with head.
 1.123.2.9 04-Feb-2008  yamt sync with head.
 1.123.2.8 21-Jan-2008  yamt sync with head
 1.123.2.7 07-Dec-2007  yamt sync with head
 1.123.2.6 15-Nov-2007  yamt sync with head.
 1.123.2.5 27-Oct-2007  yamt sync with head.
 1.123.2.4 03-Sep-2007  yamt sync with head.
 1.123.2.3 26-Feb-2007  yamt sync with head.
 1.123.2.2 30-Dec-2006  yamt sync with head.
 1.123.2.1 21-Jun-2006  yamt sync with head.
 1.126.2.1 20-Oct-2005  yamt adapt procfs.
 1.128.4.1 09-Sep-2006  rpaulo sync with head
 1.128.2.1 18-Feb-2006  yamt sync with head.
 1.129.8.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.129.6.3 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.129.6.2 10-Mar-2006  elad process_authorize() -> kauth_authorize_process(), to be closer to the
original and as requested by yamt@ and thorpej@.
 1.129.6.1 08-Mar-2006  elad Adapt to kernel authorization KPI.
 1.129.4.2 26-Jun-2006  yamt sync with head.
 1.129.4.1 24-May-2006  yamt sync with head.
 1.129.2.2 01-Jun-2006  kardel Sync with head.
 1.129.2.1 04-Feb-2006  simonb Adapt for timecounters: mostly use get*time() and use "time_second"
instead of "time.tv_sec".
 1.130.2.1 19-Jun-2006  chap Sync with head.
 1.133.8.2 10-Dec-2006  yamt sync with head.
 1.133.8.1 22-Oct-2006  yamt sync with head
 1.133.6.7 12-Jan-2007  ad Sync with head.
 1.133.6.6 29-Dec-2006  ad Checkpoint work in progress.
 1.133.6.5 18-Nov-2006  ad Sync with head.
 1.133.6.4 17-Nov-2006  ad Checkpoint work in progress.
 1.133.6.3 24-Oct-2006  ad - Redo LWP locking slightly and fix some races.
- Fix some locking botches.
- Make signal mask / stack per-proc for SA processes.
- Add _lwp_kill().
 1.133.6.2 21-Oct-2006  ad - Make this compile. XXX Needs more work on locking.
- Do FILE_UNUSE() as the current LWP, otherwise we will wipe out the
target's advisory locks. XXX Double check.
 1.133.6.1 11-Sep-2006  ad - Convert some locks to mutexes and RW locks.
- Use the proclist_lock to protect pgrps and sessions in some places.
 1.140.2.7 27-Sep-2007  xtraeme Pull up following revision(s) (requested by martti in ticket #905):
sys/miscfs/procfs/procfs_vnops.c: revision 1.152

Don't prepend rootvnode to the path in non-NULL case for exe links.
It breaks procfs in chroot.
from <christos>, tested by me.
 1.140.2.6 23-Jul-2007  liamjfoy Pull up following revision(s) (requested by pooka in ticket #785):
sys/miscfs/procfs/procfs_vnops.c: revision 1.158
Don't allow getcwd() on procfs vnodes and provide "/" as the path
instead of the result from getcwd(). This works around locking
panics caused by namei calling VOP_READLINK while holding on to a
directory lock and getcwd() trying to acquire that lock. The real
fix would be to get rid of getcwd() calls within VOPs (not locking
safe), but that's not a viable option in the netbsd-4 timeframe.
Suggestion for workaround from David Holland.
 1.140.2.5 31-Mar-2007  bouyer branches: 1.140.2.5.2;
pull up the following revisions (requested by pooka in ticket #537):
sys/miscfs/procfs/procfs_vnops.c 1.148, 1.150-1.151 via patch
Fixes a panic when doing stat */exe.
 1.140.2.4 17-Feb-2007  tron Apply patch (requested by chs in ticket #422):
- Fix various deadlock problems with nullfs and unionfs.
- Speed up path lookups by up to 25%.
 1.140.2.3 03-Jan-2007  tron Pull up following revision(s) (requested by elad in ticket #308):
sys/secmodel/bsd44/secmodel_bsd44_suser.c: revision 1.21 via patch
sys/miscfs/procfs/procfs_vnops.c: revision 1.144
PR/35226: Johann Franz: Problems with permissions in
/usr/pkg/emul/linux/proc .
Okay mlelstv@
 1.140.2.2 03-Jan-2007  tron Pull up following revision(s) (requested by elad in ticket #307):
sys/miscfs/procfs/procfs_vnops.c: revision 1.142
From Nicolas Joly: restore previous behavior in procfs_validfile_linux,
since readdir passes a NULL lwp.
 1.140.2.1 06-Dec-2006  tron Pull up following revision(s) (requested by elad in ticket #248):
sys/miscfs/procfs/procfs_vnops.c: revision 1.141
Move kauth(9) call to where it belongs. Noticed by Nicolas Joly, thanks!
 1.140.2.5.2.2 30-Sep-2007  wrstuden Catch up on netbsd-4 as of a few days ago.
 1.140.2.5.2.1 03-Sep-2007  wrstuden Sync w/ NetBSD-4-RC_1
 1.148.2.3 15-Apr-2007  yamt sync with head.
 1.148.2.2 12-Mar-2007  rmind Sync with HEAD.
 1.148.2.1 27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.154.4.1 11-Jul-2007  mjf Sync with head.
 1.154.2.7 25-Oct-2007  ad - Simplify debugger/procfs reference counting of processes. Use a per-proc
rwlock: rw_tryenter(RW_READER) to gain a reference, and rw_enter(RW_WRITER)
by the process itself to drain out reference holders before major changes
like exiting.
- Fix numerous bugs and locking issues in procfs.
- Mark procfs MPSAFE.
 1.154.2.6 16-Sep-2007  ad Checkpoint work in progress on the vnode lifecycle and reference counting
stuff. This makes it work properly without kernel_lock and fixes a few
quite old bugs. See vfs_subr.c 1.283.2.17 for details.
 1.154.2.5 20-Aug-2007  ad Sync with HEAD.
 1.154.2.4 17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.154.2.3 08-Jun-2007  ad Sync with head.
 1.154.2.2 10-Apr-2007  ad Sync with head.
 1.154.2.1 21-Mar-2007  ad - Replace more simple_locks, and fix up in a few places.
- Use condition variables.
- LOCK_ASSERT -> KASSERT.
 1.157.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.158.10.2 22-Jul-2007  pooka Don't allow getcwd() on procfs vnodes and provide "/" as the path
instead of the result from getcwd(). This works around locking
panics caused by namei calling VOP_READLINK while holding on to a
directory lock and getcwd() trying to acquire that lock. The real
fix would be to get rid of getcwd() calls within VOPs (not locking
safe), but that's not a viable option in the netbsd-4 timeframe.

Suggestion for workaround from David Holland.
 1.158.10.1 22-Jul-2007  pooka file procfs_vnops.c was added on branch matt-mips64 on 2007-07-22 13:37:14 +0000
 1.158.8.1 14-Oct-2007  yamt sync with head.
 1.158.6.4 23-Mar-2008  matt sync with HEAD
 1.158.6.3 09-Jan-2008  matt sync with HEAD
 1.158.6.2 08-Nov-2007  matt sync with -HEAD
 1.158.6.1 06-Nov-2007  matt sync with HEAD
 1.158.4.3 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.158.4.2 11-Nov-2007  joerg Sync with HEAD.
 1.158.4.1 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.160.4.3 18-Feb-2008  mjf Sync with HEAD.
 1.160.4.2 08-Dec-2007  mjf Sync with HEAD.
 1.160.4.1 19-Nov-2007  mjf Sync with HEAD.
 1.160.2.1 13-Nov-2007  bouyer Sync with HEAD
 1.163.6.2 23-Jan-2008  bouyer Sync with HEAD.
 1.163.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.163.2.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.165.6.4 17-Jan-2009  mjf Sync with HEAD.
 1.165.6.3 28-Sep-2008  mjf Sync with HEAD.
 1.165.6.2 02-Jun-2008  mjf Sync with HEAD.
 1.165.6.1 03-Apr-2008  mjf Sync with HEAD.
 1.166.2.1 18-May-2008  yamt sync with head.
 1.168.2.6 11-Aug-2010  yamt sync with head.
 1.168.2.5 11-Mar-2010  yamt sync with head
 1.168.2.4 18-Jul-2009  yamt sync with head.
 1.168.2.3 20-Jun-2009  yamt sync with head
 1.168.2.2 04-May-2009  yamt sync with head.
 1.168.2.1 16-May-2008  yamt sync with head.
 1.169.4.1 03-Jul-2008  simonb Sync with head.
 1.169.2.1 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.170.2.1 19-Oct-2008  haad Sync with HEAD.
 1.172.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.173.2.1 23-Jul-2009  jym Sync with HEAD.
 1.177.4.1 03-Jul-2010  rmind sync with head
 1.177.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.182.6.2 02-Jun-2012  mrg sync to latest -current.
 1.182.6.1 05-Apr-2012  mrg sync to latest -current.
 1.182.2.4 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.182.2.3 16-Jan-2013  yamt sync with (a bit old) head
 1.182.2.2 30-Oct-2012  yamt sync with head
 1.182.2.1 17-Apr-2012  yamt sync with head
 1.184.2.4 03-Dec-2017  jdolecek update from HEAD
 1.184.2.3 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.184.2.2 23-Jun-2013  tls resync from head
 1.184.2.1 25-Feb-2013  tls resync with head
 1.186.6.1 18-May-2014  rmind sync with head
 1.189.2.1 10-Aug-2014  tls Rebase.
 1.191.8.1 29-Aug-2019  martin Pull up following revision(s) (requested by hannken in ticket #1703):

sys/miscfs/kernfs/kernfs_vnops.c: revision 1.161
sys/miscfs/procfs/procfs_vnops.c: revision 1.207

Add missing operation VOP_GETPAGES() returning EFAULT.

Without this operation posix_fadvise(..., POSIX_FADV_WILLNEED)
would leave the v_interlock held.

Observed by maxv@
 1.191.4.1 29-Aug-2019  martin Pull up following revision(s) (requested by hannken in ticket #1703):

sys/miscfs/kernfs/kernfs_vnops.c: revision 1.161
sys/miscfs/procfs/procfs_vnops.c: revision 1.207

Add missing operation VOP_GETPAGES() returning EFAULT.

Without this operation posix_fadvise(..., POSIX_FADV_WILLNEED)
would leave the v_interlock held.

Observed by maxv@
 1.191.2.1 29-Aug-2019  martin Pull up following revision(s) (requested by hannken in ticket #1703):

sys/miscfs/kernfs/kernfs_vnops.c: revision 1.161
sys/miscfs/procfs/procfs_vnops.c: revision 1.207

Add missing operation VOP_GETPAGES() returning EFAULT.

Without this operation posix_fadvise(..., POSIX_FADV_WILLNEED)
would leave the v_interlock held.

Observed by maxv@
 1.192.2.3 28-Aug-2017  skrll Sync with HEAD
 1.192.2.2 05-Oct-2016  skrll Sync with HEAD
 1.192.2.1 06-Jun-2015  skrll Sync with HEAD
 1.193.2.1 26-Apr-2017  pgoyette Sync with HEAD
 1.194.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.197.2.4 17-Jun-2022  martin Pull up following revision(s) (requested by shm in ticket #1748):

sys/miscfs/procfs/procfs_vnops.c: revision 1.229

Add missing permission check
 1.197.2.3 29-Aug-2019  martin Pull up following revision(s) (requested by hannken in ticket #1346):

sys/miscfs/kernfs/kernfs_vnops.c: revision 1.161
sys/miscfs/procfs/procfs_vnops.c: revision 1.207

Add missing operation VOP_GETPAGES() returning EFAULT.

Without this operation posix_fadvise(..., POSIX_FADV_WILLNEED)
would leave the v_interlock held.

Observed by maxv@
 1.197.2.2 12-Apr-2018  martin Pull up following revision(s) (requested by kamil in ticket #713):

sys/modules/procfs/Makefile: revision 1.4
sys/miscfs/procfs/procfs_vfsops.c: revision 1.98
bin/ps/ps.1: revision 1.108
sys/compat/linux/arch/i386/linux_ptrace.c: revision 1.32
sys/miscfs/procfs/procfs_vnops.c: revision 1.198
sys/kern/sys_ptrace_common.c: revision 1.23
sys/kern/sys_ptrace_common.c: revision 1.24
sbin/mount_procfs/mount_procfs.8: revision 1.36
sys/kern/sys_ptrace_common.c: revision 1.25
sys/kern/sys_ptrace.c: revision 1.5
sys/compat/linux/arch/powerpc/linux_ptrace.c: revision 1.30
sys/sys/proc.h: revision 1.342
sys/kern/sys_ptrace_common.c: revision 1.26
sys/miscfs/procfs/procfs_ctl.c: file removal
sys/kern/sys_ptrace_common.c: revision 1.27
sys/miscfs/procfs/procfs_subr.c: revision 1.109
sys/kern/sys_ptrace_common.c: revision 1.28
sys/secmodel/extensions/secmodel_extensions.c: revision 1.8
sys/kern/sys_ptrace_common.c: revision 1.29
sys/sys/ptrace.h: revision 1.62
sys/compat/netbsd32/netbsd32_signal.c: revision 1.45
share/man/man9/kauth.9: revision 1.109
sys/miscfs/procfs/files.procfs: revision 1.12
sys/compat/netbsd32/netbsd32.h: revision 1.115
sys/miscfs/procfs/procfs.h: revision 1.72
sys/compat/netbsd32/netbsd32_ptrace.c: revision 1.5
sys/kern/kern_sig.c: revision 1.337
sys/sys/kauth.h: revision 1.75
sys/sys/sysctl.h: revision 1.224
sys/kern/sys_ptrace_common.c: revision 1.30
sys/kern/sys_ptrace_common.c: revision 1.31
sys/kern/sys_ptrace_common.c: revision 1.32
sys/kern/sys_ptrace_common.c: revision 1.33
sys/compat/linux/arch/arm/linux_ptrace.c: revision 1.20
sys/kern/sys_ptrace_common.c: revision 1.34
sys/kern/sys_ptrace_common.c: revision 1.36
sys/kern/kern_proc.c: revision 1.207
sys/kern/kern_exit.c: revision 1.269
doc/TODO.ptrace: revision 1.29

Make {s,g}et{db,fp,}regs work again for PK_32 processes
XXX: pullup-8

add disgusting magic to handle compat_netbsd32 as a module.

use process_*reg32 instead of struct *reg32.

Remove the filesystem tracing feature

This is a legacy interface from 4.4BSD, and it was
introduced to overcome shortcomings of ptrace(2) at that time, which are
no longer relevant (performance). Today /proc/#/ctl offers a narrow
subset of ptrace(2) commands and is not applicable for modern
applications use beyond simplistic tracing scenarios.

This removal will simplify kernel internals. Users will still be able to
use all the other /proc files.

This change won't affect other procfs files or the Linux compat
features within mount_procfs(8). /proc/#/ctl isn't available on Linux.

Remove:
- /proc/#/ctl from mount_procfs(8)
- P_FSTRACE note from the documentation of ps(1)
- /proc/#/ctl and filesystem tracing documentation from mount_procfs(8)
- KAUTH_REQ_PROCESS_PROCFS_CTL documentation from kauth(9)
- source code file miscfs/procfs/procfs_ctl.c
- PFSctl and procfs_doctl() from sys/miscfs/procfs/procfs.h
- KAUTH_REQ_PROCESS_PROCFS_CTL from sys/sys/kauth.h
- PSL_FSTRACE (0x00010000) from sys/sys/proc.h
- P_FSTRACE (0x00010000) from sys/sys/sysctl.h

Reduce code complexity after removal of this functionality.

Update TODO.ptrace accordingly: remove two entries about /proc tracing.

Do not keep legacy notes as comments in the headers about removed
PSL_FSTRACE / P_FSTRACE, as this interface had very few users
(close or equal to zero).
Proposed on tech-kern@.

All filesystem tracing utility users are encouraged to switch to ptrace(2).

Sponsored by <The NetBSD Foundation>

untangle the mess:
- factor out common code
- break each ptrace subcall to its own sub-function
.. more to come ...
- reduce ifdef ugliness by moving it up top.
- factor out PT_IO and make PT_{READ,WRITE}_{I,D} use it
- factor out PT_DUMPCORE
- factor out sendsig code
.. more to come ...

handle siginfo requests for ptrace32

ptrace: Partially undo PT_{READ,WRITE}_{I,D} and unbreak these commands

The refactored code did not work and was generating EFAULT.

Sponsored by <The NetBSD Foundation>

Merge the code back; the problem was that since we are reading/writing
to a kernel address for PT_{READ,WRITE}_{I,D} we need the kernel vmspace.
Provide separate read and write functions to accommodate register functions
that need a size argument.

don't ignore error from copyout_piod

Use the proper process (the tracee) to get information about lwps and
registers and the tracer for vmspace.

Add new sysctl(3) entry: security.models.extensions.user_set_dbregs

Model this new sysctl(3) entry after "user_set_cpu_affinity" in the same
level of sysctl(3) switches.

Allow reading Debug Registers unconditionally (no change here). This is
convenient as even if a user of a debugger does not use hardware assisted
watchpoints/breakpoints, a debugger can still prompt these values to store
in an internal cache with context of registers. Reading them should have
no security concerns.

Add a paranoid MI switch that prohibits by default setting these registers
by a regular user (non-superuser). Make this switch disabled by default.
There are enough reserved bits out there to allow using them
unconditionally on hardened hosts.

Features shipped with Debug Registers are optional features in debuggers.
There is no reduction in elementary functionality.

Reviewed by <christos>

Sponsored by <The NetBSD Foundation>
 1.197.2.1 08-Apr-2018  snj Pull up following revision(s) (requested by hannken in ticket #702):
sys/miscfs/procfs/procfs_vnops.c: 1.203
Lock the target cwdi and take an additional reference to the
vnode we are interested in to prevent it from disappearing
before getcwd_common().
Should fix PR kern/53096 (netbsd-8 crash on heavy disk I/O)
 1.202.2.3 20-Oct-2018  pgoyette Sync with head
 1.202.2.2 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.202.2.1 16-Apr-2018  pgoyette Sync with HEAD, resolve some conflicts
 1.203.2.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.203.2.1 10-Jun-2019  christos Sync with HEAD
 1.206.4.3 20-Nov-2024  martin Pull up following revision(s) (requested by riastradh in ticket #1921):

sys/kern/kern_event.c: revision 1.106
sys/kern/sys_select.c: revision 1.51
sys/kern/subr_exec_fd.c: revision 1.10
sys/kern/sys_aio.c: revision 1.46
sys/kern/kern_descrip.c: revision 1.244
sys/kern/kern_descrip.c: revision 1.245
sys/ddb/db_xxx.c: revision 1.72
sys/ddb/db_xxx.c: revision 1.73
sys/miscfs/fdesc/fdesc_vnops.c: revision 1.132
sys/kern/uipc_usrreq.c: revision 1.195
sys/kern/sys_descrip.c: revision 1.36
sys/kern/uipc_usrreq.c: revision 1.196
sys/kern/uipc_socket2.c: revision 1.135
sys/kern/uipc_socket2.c: revision 1.136
sys/kern/kern_sig.c: revision 1.383
sys/kern/kern_sig.c: revision 1.384
sys/compat/netbsd32/netbsd32_ioctl.c: revision 1.107
sys/miscfs/procfs/procfs_vnops.c: revision 1.208
sys/kern/subr_exec_fd.c: revision 1.9
sys/kern/kern_descrip.c: revision 1.252
(all via patch)

Load struct filedesc::fd_dt with atomic_load_consume.

Exceptions: when fd_refcnt <= 1, or when holding fd_lock.

While here:
- Restore KASSERT(mutex_owned(&fdp->fd_lock)) in fd_unused.
=> This is used only in fd_close and fd_abort, where it holds.
- Move bounds check assertion in fd_putfile to where it matters.
- Store fd_dt with atomic_store_release.
- Move load of fd_dt under lock in knote_fdclose.
- Omit membar_consumer in fdesc_readdir.
=> atomic_load_consume serves the same purpose now.
=> Was needed only on alpha anyway.

Load struct fdfile::ff_file with atomic_load_consume.
Exceptions: when we're only testing whether it's there, not about to
dereference it.

Note: We do not use atomic_store_release to set it because the
preceding mutex_exit should be enough.

(That said, it's not clear the mutex_enter/exit is needed unless
refcnt > 0 already, in which case maybe it would be a win to switch
from the membar implied by mutex_enter to the membar implied by
atomic_store_release -- which I would generally expect to be much
cheaper. And a little clearer without a long comment.)

kern_descrip.c: Fix membars around reference count decrement.

In general, the `last one out hit the lights' style of reference
counting (as opposed to the `whoever's destroying must wait for
pending users to finish' style) requires memory barriers like so:

	... usage of resources associated with object ...
	membar_release();
	if (atomic_dec_uint_nv(&obj->refcnt) != 0)
		return;
	membar_acquire();
	... freeing of resources associated with object ...

This way, all usage happens-before all freeing. This fixes several
errors:
- fd_close failed to ensure whatever its caller did would
happen-before the freeing, in the case where another thread is
concurrently trying to close the fd (ff->ff_file == NULL).
Fix: Add membar_release before atomic_dec_uint(&ff->ff_refcnt) in
that branch.
- fd_close failed to ensure all loads its caller had issued will have
happened-before the freeing, in the case where the fd is still in
use by another thread (fdp->fd_refcnt > 1 and ff->ff_refcnt-- > 0).
Fix: Change membar_producer to membar_release before
atomic_dec_uint(&ff->ff_refcnt).
- fd_close failed to ensure that any usage of fp by other callers
would happen-before any freeing it does.
Fix: Add membar_acquire after atomic_dec_uint_nv(&ff->ff_refcnt).
- fd_free failed to ensure that any usage of fdp by other callers
would happen-before any freeing it does.
Fix: Add membar_acquire after atomic_dec_uint_nv(&fdp->fd_refcnt).

While here, change membar_exit -> membar_release. No semantic
change, just updating away from the legacy API.
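
For readers outside the kernel, the "last one out hit the lights" pattern described in this entry can be approximated with C11 atomics. This is a minimal sketch under stated assumptions: a release-ordered decrement stands in for membar_release() followed by atomic_dec_uint_nv(), an acquire fence stands in for membar_acquire(), and struct obj with its obj_* helpers are hypothetical names, not NetBSD code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; not a NetBSD structure. */
struct obj {
	atomic_uint refcnt;
	int *payload;		/* resource freed by the last holder */
};

static struct obj *
obj_create(int value)
{
	struct obj *o = malloc(sizeof(*o));

	if (o == NULL)
		return NULL;
	o->payload = malloc(sizeof(*o->payload));
	if (o->payload == NULL) {
		free(o);
		return NULL;
	}
	*o->payload = value;
	atomic_init(&o->refcnt, 1);	/* the creator holds one reference */
	return o;
}

static void
obj_hold(struct obj *o)
{
	/* The caller already holds a reference, so relaxed is enough. */
	atomic_fetch_add_explicit(&o->refcnt, 1, memory_order_relaxed);
}

/* Returns true if this call dropped the last reference and freed o. */
static bool
obj_rele(struct obj *o)
{
	/* Release: all of our prior usage is ordered before the decrement. */
	unsigned old = atomic_fetch_sub_explicit(&o->refcnt, 1,
	    memory_order_release);

	assert(old != 0);	/* dropping a reference we never held */
	if (old != 1)
		return false;	/* other holders remain */

	/* Acquire: every other holder's usage is ordered before freeing. */
	atomic_thread_fence(memory_order_acquire);
	free(o->payload);
	free(o);
	return true;
}
```

With both halves in place, every holder's use of the object happens-before the free, whichever thread ends up performing it.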
 1.206.4.2 17-Jun-2022  martin Pull up following revision(s) (requested by shm in ticket #1475):

sys/miscfs/procfs/procfs_vnops.c: revision 1.229

Add missing permission check
 1.206.4.1 01-Sep-2019  martin Pull up following revision(s) (requested by hannken in ticket #132):
sys/miscfs/kernfs/kernfs_vnops.c: revision 1.161
sys/miscfs/procfs/procfs_vnops.c: revision 1.207
Add missing operation VOP_GETPAGES() returning EFAULT.
Without this operation posix_fadvise(..., POSIX_FADV_WILLNEED)
would leave the v_interlock held.
Observed by maxv@
 1.207.2.2 29-Feb-2020  ad Sync with head.
 1.207.2.1 25-Jan-2020  ad Make cwdinfo use mostly lockless, and largely hide the details in vfs_cwd.c.
 1.210.4.1 25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.215.6.1 01-Aug-2021  thorpej Sync with HEAD.
 1.229.4.1 18-Apr-2024  martin Pull up following revision(s) (requested by hannken in ticket #668):

sys/miscfs/procfs/procfs.h: revision 1.83
sys/miscfs/procfs/procfs.h: revision 1.84
sys/kern/vfs_mount.c: revision 1.104
sys/miscfs/procfs/procfs_vnops.c: revision 1.230
sys/kern/init_main.c: revision 1.547
sys/kern/kern_hook.c: revision 1.15
sys/miscfs/procfs/procfs_vfsops.c: revision 1.112
sys/miscfs/procfs/procfs_vfsops.c: revision 1.113
sys/miscfs/procfs/procfs_vfsops.c: revision 1.114
sys/miscfs/procfs/procfs_subr.c: revision 1.117

Print dangling vnode before panic() to help debug.

PR kern/57775 ""panic: unmount: dangling vnode" while umounting procfs"

Protect kernel hooks exechook, exithook and forkhook with rwlock.

Lock as writer on establish/disestablish and as reader on list traverse.

For exechook ride "exec_lock" as it is already taken as reader when
traversing the list. Add local locks for exithook and forkhook.

Move exec_init before signal_init as signal_init calls exechook_establish()
that needs "exec_lock".

PR kern/39913 "exec, fork, exit hooks need locking"

Add a hashmap to access all procfs nodes by pid.

Using the exechook to revoke procfs nodes is racy and may deadlock:
one thread runs doexechooks() -> procfs_revoke_vnodes() and wants to suspend
the file system for vgone(), while another thread runs a forced unmount,
has the file system suspended, tries to disestablish the exechook and
waits for doexechooks() to complete.

Establish/disestablish the exechook on module load/unload instead of
mount/unmount and use the hashmap to access all procfs nodes for this pid.

May fix PR kern/57775 ""panic: unmount: dangling vnode" while umounting procfs"

Remove all procfs nodes for this process on process exit.
 1.232.2.1 02-Aug-2025  perseant Sync with HEAD
