History log of /src/sys/nfs
Revision  Date  Author  Comments
 1.4 27-Dec-2006  yamt remove nqnfs.
 1.3 03-Jan-2006  yamt branches: 1.3.18;
don't install nfs_var.h.
 1.2 26-Nov-2002  lukem branches: 1.2.22; 1.2.34;
Remove KDIR=, since SYS_INCLUDE=symlinks and KDIR are not supported any more.
 1.1 12-Jun-1998  cgd branches: 1.1.26;
Rework the way kernel include files are installed. In the new method,
as with user-land programs, include files are installed by each directory
in the tree that has includes to install. (This allows more flexibility
as to what gets installed, makes 'partial installs' easier, and gives us
more options as to which machines' includes get installed at any given
time.) The old SYS_INCLUDES={symlinks,copies} behaviours are _both_
still supported, though at least one bug in the 'symlinks' case is
fixed by this change. Include files can't be built before installation,
so directories that have includes as targets (e.g. dev/pci) have to move
those targets into a different Makefile.
 1.1.26.1 11-Dec-2002  thorpej Sync with HEAD.
 1.2.34.1 15-Jan-2006  yamt sync with head.
 1.2.22.2 30-Dec-2006  yamt sync with head.
 1.2.22.1 21-Jun-2006  yamt sync with head.
 1.3.18.1 12-Jan-2007  ad Sync with head.
 1.15 17-May-2018  thorpej Default NFS mounts to using TCP transport instead of UDP.
PR kern/53166
 1.14 11-Oct-2014  uebayasi branches: 1.14.18;
Define filesystem attributes with vfs dependency.
 1.13 02-Mar-2010  pooka branches: 1.13.20;
don't create unused fs_nfs.h
 1.12 02-Mar-2010  pooka Get rid of dependency on fs_nfs.h, i.e. source modules with
conditional content depending on if the NFS client is wanted or
not. The server can now be made an independent module not depending
on the nfs client.

Tested with rump_nfs (standalone client), rump_nfsd (standalone
nfsd) and a qemu installation with both the client and the server.
 1.11 31-Dec-2009  christos branches: 1.11.2;
handle the nuidhash_max lossage differently
 1.10 31-Dec-2009  christos nuidhash_max is needed by sys_nfssvc
 1.9 19-Nov-2008  ad Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.8 27-Dec-2006  yamt branches: 1.8.40; 1.8.44; 1.8.50; 1.8.54;
remove nqnfs.
 1.7 11-Dec-2005  christos branches: 1.7.20;
merge ktrace-lwp.
 1.6 23-Sep-2005  jmmv Apply the NFS exports list rototill patch:

- Remove all NFS related stuff from file system specific code.
- Drop the vfs_checkexp hook and generalize it in the new nfs_check_export
function, thus removing redundancy from all file systems.
- Move all NFS export-related stuff from kern/vfs_subr.c to the new
file sys/nfs/nfs_export.c. The former was becoming large and its code
is always compiled, regardless of the build options. Using the latter,
the code is only compiled in when NFSSERVER is enabled. While doing this,
also make some functions in nfs_subs.c conditional to NFSSERVER.
- Add a new command in nfssvc(2), called NFSSVC_SETEXPORTSLIST, that takes a
path and a set of export entries. At the moment it can only clear the
exports list or append entries, one by one, but it is done in a way that
allows setting the whole set of entries atomically in the future (see the
comment in mountd_set_exports_list or in doc/TODO).
- Change mountd(8) to use the nfssvc(2) system call instead of mount(2) so
that it becomes file system agnostic. In fact, all this whole thing was
done to remove a 'XXX' block from this utility!
- Change the mount*, newfs and fsck* userland utilities to not deal with NFS
exports initialization; done internally by the kernel when initializing
the NFS support for each file system.
- Implement an interface for VFS (called VFS hooks) so that several kernel
subsystems can run arbitrary code upon receipt of specific VFS events.
At the moment, this only provides support for unmount and is used to
destroy NFS exports lists from the file systems being unmounted, though it
has room for extension.

Thanks go to yamt@, chs@, thorpej@, wrstuden@ and others for their comments
and advice in the development of this patch.
 1.5 26-Feb-2005  perry branches: 1.5.4;
nuke trailing whitespace
 1.4 12-Dec-2004  bouyer branches: 1.4.2; 1.4.4;
The macro used for static server address is NFS_BOOTSTATIC_SERVADDR, not
NFS_BOOTSTATIC_SADDR. From Xen source distribution.
XXX NFS_BOOTSTATIC* doesn't seem to be documented anywhere ...
 1.3 11-Mar-2004  cl branches: 1.3.6;
Add static nfs boot configuration, from the kernel config file or from
a driver selectable callback function. This is used in the Xen port to
allow controlling the domain's network setup from the domain building
environment at domain creation (vs. having to maintain/change this on a
dhcp server). The Xen network driver parses a command line passed in
from the domain builder.
 1.2 23-Oct-2002  jdolecek branches: 1.2.6;
merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
 1.1 16-Apr-2002  thorpej branches: 1.1.6; 1.1.8;
Cleanup how file system configuration information is declared, grouping
related information together, with the file system code itself.

This is just low-hanging fruit -- more to come.
 1.1.8.3 30-Sep-2002  jdolecek add support for kevents to NFS
to detect file changes on server by other NFS clients, polling kernel thread
is used to periodically check for attribute changes of watched files;
the NFS server is only contacted when the vnode expires from local attrcache
(which takes 5-60 seconds currently), to keep network&CPU overhead low

the routine checking for remote changes is quite simplistic, but hopefully
doing its job well enough
 1.1.8.2 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.1.8.1 16-Apr-2002  jdolecek file files.nfs was added on branch kqueue on 2002-06-23 17:51:46 +0000
 1.1.6.3 11-Nov-2002  nathanw Catch up to -current
 1.1.6.2 20-Jun-2002  nathanw Catch up to -current.
 1.1.6.1 16-Apr-2002  nathanw file files.nfs was added on branch nathanw_sa on 2002-06-20 03:50:00 +0000
 1.2.6.6 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.2.6.5 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.2.6.4 18-Dec-2004  skrll Sync with HEAD.
 1.2.6.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.2.6.2 18-Sep-2004  skrll Sync with HEAD.
 1.2.6.1 03-Aug-2004  skrll Sync with HEAD
 1.3.6.1 06-Apr-2005  tron Pull up revision 1.4 (requested by bouyer in ticket #1036):
The macro used for static server address is NFS_BOOTSTATIC_SERVADDR, not
NFS_BOOTSTATIC_SADDR. From Xen source distribution.
XXX NFS_BOOTSTATIC* doesn't seem to be documented anywhere ...
 1.4.4.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.4.2.1 29-Apr-2005  kent sync with -current
 1.5.4.2 30-Dec-2006  yamt sync with head.
 1.5.4.1 21-Jun-2006  yamt sync with head.
 1.7.20.1 12-Jan-2007  ad Sync with head.
 1.8.54.1 19-Jan-2009  skrll Sync with HEAD.
 1.8.50.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.8.44.2 11-Mar-2010  yamt sync with head
 1.8.44.1 04-May-2009  yamt sync with head.
 1.8.40.1 17-Jan-2009  mjf Sync with HEAD.
 1.11.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.13.20.1 03-Dec-2017  jdolecek update from HEAD
 1.14.18.1 21-May-2018  pgoyette Sync with HEAD
 1.10 07-Dec-2024  riastradh sys/nfs/nfs: Add some missing includes and include guards.

Fix up some minor KNF issues while here.

No functional change intended (except to enable things to build that
might not have built before because of previously required #include
ordering).
 1.9 14-Mar-2009  dsl branches: 1.9.100;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.8 11-Dec-2005  christos branches: 1.8.74; 1.8.84; 1.8.90;
merge ktrace-lwp.
 1.7 22-May-2004  jonathan branches: 1.7.12;
Eliminate several uses of `curproc' from the socket-layer code and from NFS.

Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded by Matt Thomas.

Changes to soreceive() and (*dom->dom_externalize)() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize(). Also, now that soreceive() uses
its uio->uio_procp value, pass that same value downward to
(*pr->pru_usrreq)() calls for consistency, instead of (struct proc *)0.

Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req->r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.

Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
 1.6 05-May-2003  yamt branches: 1.6.2;
keep things not needed by userland in #ifdef _KERNEL.
(e.g. prototypes for in-kernel functions)
 1.5 20-Oct-1996  fvdl Enhancements from Matthias Drochner:
- Try V3 first for diskless booting. Fall back to V2 if V3 fails.
- optionally (option NFS_BOOT_TCP) try a TCP mount first
for diskless booting. Fall back to UDP if it fails.
- Enable switching between UDP and TCP for remounts.
 1.4 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.3 24-Apr-1995  gwr Fixed RPC code to deal with RPC messages larger than one mbuf.
 1.2 26-Oct-1994  cgd new RCS ID format.
 1.1 26-Sep-1994  gwr Do the first BOOTPARAM RPC call to the broadcast address instead of
using the address of the RARP server because a BOOTPARAM server
might not be running on the machine that sent the RARP reply.
 1.6.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.6.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.6.2.1 03-Aug-2004  skrll Sync with HEAD
 1.7.12.1 21-Jun-2006  yamt sync with head.
 1.8.90.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.8.84.1 28-Apr-2009  skrll Sync with HEAD.
 1.8.74.1 04-May-2009  yamt sync with head.
 1.9.100.1 02-Aug-2025  perseant Sync with HEAD
 1.44 20-Oct-2024  mlelstv MBUFTRACE
 1.43 05-Jul-2024  rin sys: Drop redundant NULL check before m_freem(9)

m_freem(9) safely has accepted NULL argument at least since 4.2BSD:
https://www.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/src/sys/sys/uipc_mbuf.c

Compile-tested on amd64/ALL.

Suggested by knakahara@
 1.42 10-Jun-2016  ozaki-r branches: 1.42.54;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.
 1.41 21-May-2015  rtr change nfs_boot_sendrecv to take sockaddr_in * instead of mbuf *

fixes the m_serv leak (a single mbuf) in kern/subr_tftproot.c
 1.40 09-May-2015  rtr when calling nfs_boot_sendrecv pass NULL for pointers instead of 0
 1.39 27-Mar-2015  hikaru m_pullup() is called in rcvproc callback functions,
so nfs_boot_sendrecv() should keep track of the head of mbuf chain.
fixes kern/48746
 1.38 06-Mar-2015  maxv Fix uninitialized variable.

Found by The Brainy Code Scanner in FreeBSD.
 1.37 15-Mar-2009  cegger branches: 1.37.18; 1.37.22; 1.37.38; 1.37.40;
ansify function definitions
 1.36 14-Mar-2009  dsl ANSIfy another 1261 function definitions.
The only ones left in sys are beyond my sed script!
(or in sys/dist or sys/external)
Mostly they have function pointer parameters.
 1.35 14-Mar-2009  dsl Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
 1.34 14-Mar-2009  dsl Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.33 24-Apr-2008  ad branches: 1.33.2; 1.33.10; 1.33.16;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
 1.32 04-Mar-2007  christos branches: 1.32.36; 1.32.38;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.31 15-Apr-2006  christos branches: 1.31.14;
Coverity CID 2445: Only set from_p if we succeed so that we free it on error.
 1.30 11-Dec-2005  christos branches: 1.30.4; 1.30.6; 1.30.8; 1.30.10; 1.30.12;
merge ktrace-lwp.
 1.29 26-Feb-2005  perry branches: 1.29.4;
nuke trailing whitespace
 1.28 22-May-2004  jonathan branches: 1.28.4; 1.28.6;
Eliminate several uses of `curproc' from the socket-layer code and from NFS.

Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded by Matt Thomas.

Changes to soreceive() and (*dom->dom_externalize)() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize(). Also, now that soreceive() uses
its uio->uio_procp value, pass that same value downward to
(*pr->pru_usrreq)() calls for consistency, instead of (struct proc *)0.

Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req->r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.

Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
 1.27 26-Feb-2003  matt branches: 1.27.2;
Add MBUFTRACE kernel option.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
 1.26 22-Sep-2002  jdolecek don't seem to need <sys/conf.h> or <net/if.h> here
 1.25 10-Nov-2001  lukem add RCSIDs
 1.24 12-Jun-2001  wiz branches: 1.24.2; 1.24.6;
receive, not recieve
 1.23 09-Aug-1998  perry branches: 1.23.24;
bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.22 01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.21 30-Sep-1997  drochner Use functions (shared with DHCP boot) in nfs_boot.c.
 1.20 29-Aug-1997  gwr Supporting changes for the new BOOTP support in nfs_mountroot.
 1.19 20-Oct-1996  fvdl branches: 1.19.10;
Enhancements from Matthias Drochner:
- Try V3 first for diskless booting. Fall back to V2 if V3 fails.
- optionally (option NFS_BOOT_TCP) try a TCP mount first
for diskless booting. Fall back to UDP if it fails.
- Enable switching between UDP and TCP for remounts.
 1.18 13-Oct-1996  christos revert kprintf changes
 1.17 10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.16 14-Aug-1996  thorpej Another %lx -> %x for ntohl()
 1.15 10-Jul-1996  cgd print result of ntohl/htonl as a long.
 1.14 14-Jun-1996  cgd avoid unnecessary checks of m_get/MGET/etc.'s return values. When
they're called with M_WAIT, they are defined to never return NULL.
 1.13 07-Jun-1996  cgd fix two bugs (the latter potentially fatal) in xdr_string_encode():
(1) if length needed was > MCLBYTES, an mbuf would be lost, and
(2) the wrong check was being used to determine if MCLGET succeeded.
 1.12 18-Feb-1996  fvdl branches: 1.12.4;
Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.11 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.10 08-Aug-1995  gwr Use RPCAUTH_UNIX in requests to please picky NFS servers.
 1.9 20-May-1995  mycroft Use fxdr_*() and txdr_*() macros to do byte order conversions.
 1.8 24-Apr-1995  gwr Fixed RPC code to deal with RPC messages larger than one mbuf.
 1.7 26-Sep-1994  gwr Do the first BOOTPARAM RPC call to the broadcast address instead of
using the address of the RARP server because a BOOTPARAM server
might not be running on the machine that sent the RARP reply.
 1.6 12-Aug-1994  cgd fix typo
 1.5 11-Aug-1994  gwr Diskless boot will now bind the local socket to a reserved port to
satisfy picky servers. Also fix some missing initializations.
(Thanks to Chuck Cranor for PR#394 -- now fixed.)
 1.4 30-Jun-1994  pk branches: 1.4.2;
error codes are in network order too.
 1.3 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.2 13-Jun-1994  gwr New diskless boot code (uses RARP, bootparamd).
 1.1 18-Apr-1994  glass revised nfs diskless support. uses bootp+rpc to gather parameters
 1.4.2.2 12-Aug-1994  mycroft update from trunk
 1.4.2.1 11-Aug-1994  mycroft update from trunk
 1.12.4.1 07-Jun-1996  cgd pull up from trunk:
>fix two bugs (the latter potentially fatal) in xdr_string_encode():
>(1) if length needed was > MCLBYTES, an mbuf would be lost, and
>(2) the wrong check was being used to determine if MCLGET succeeded.
 1.19.10.2 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.19.10.1 01-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.23.24.3 18-Oct-2002  nathanw Catch up to -current.
 1.23.24.2 14-Nov-2001  nathanw Catch up to -current.
 1.23.24.1 21-Jun-2001  nathanw Catch up to -current.
 1.24.6.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.24.2.2 10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototill work
 1.24.2.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.27.2.4 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.27.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.27.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.27.2.1 03-Aug-2004  skrll Sync with HEAD
 1.28.6.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.28.4.1 29-Apr-2005  kent sync with -current
 1.29.4.2 03-Sep-2007  yamt sync with head.
 1.29.4.1 21-Jun-2006  yamt sync with head.
 1.30.12.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.30.10.1 19-Apr-2006  elad sync with head.
 1.30.8.1 24-May-2006  yamt sync with head.
 1.30.6.1 22-Apr-2006  simonb Sync with head.
 1.30.4.1 09-Sep-2006  rpaulo sync with head
 1.31.14.1 12-Mar-2007  rmind Sync with HEAD.
 1.32.38.1 18-May-2008  yamt sync with head.
 1.32.36.1 02-Jun-2008  mjf Sync with HEAD.
 1.33.16.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.33.10.1 28-Apr-2009  skrll Sync with HEAD.
 1.33.2.1 04-May-2009  yamt sync with head.
 1.37.40.3 09-Jul-2016  skrll Sync with HEAD
 1.37.40.2 06-Jun-2015  skrll Sync with HEAD
 1.37.40.1 06-Apr-2015  skrll Sync with HEAD
 1.37.38.2 21-Apr-2015  snj Pull up following revision(s) (requested by maxv in ticket #713):
sys/dev/ic/bwi.c: revision 1.26
sys/nfs/krpc_subr.c: revision 1.38
Fix uninitialized variable.
Found by The Brainy Code Scanner in FreeBSD.
--
Fix a double free. "Suggested" by Brainy.
ok rjs@ riastradh@
 1.37.38.1 06-Apr-2015  snj Pull up following revision(s) (requested by hikaru in ticket #656):
sys/kern/subr_tftproot.c: revision 1.14
sys/nfs/krpc_subr.c: revision 1.39
sys/nfs/nfs_boot.c: revision 1.82
sys/nfs/nfs_bootdhcp.c: revision 1.53
sys/nfs/nfsdiskless.h: revision 1.31
m_pullup() is called in rcvproc callback functions,
so nfs_boot_sendrecv() should keep track of the head of mbuf chain.
fixes kern/48746
 1.37.22.1 03-Dec-2017  jdolecek update from HEAD
 1.37.18.1 16-Apr-2015  msaitoh Pull up following revision(s) (requested by hikaru in ticket #1287):
sys/kern/subr_tftproot.c: revision 1.14 via patch
sys/nfs/nfsdiskless.h: revision 1.31
sys/nfs/nfs_boot.c: revision 1.82
sys/nfs/krpc_subr.c: revision 1.39
sys/nfs/nfs_bootdhcp.c: revision 1.53
m_pullup() is called in rcvproc callback functions,
so nfs_boot_sendrecv() should keep track of the head of mbuf chain.
fixes kern/48746
 1.42.54.1 02-Aug-2025  perseant Sync with HEAD
 1.81 07-Dec-2024  riastradh sys/nfs/nfs: Add some missing includes and include guards.

Fix up some minor KNF issues while here.

No functional change intended (except to enable things to build that
might not have built before because of previously required #include
ordering).
 1.80 05-Dec-2021  msaitoh branches: 1.80.10;
s/runable/runnable/
 1.79 04-Jun-2021  hannken Add flag/command NFSSVC_REPLACEEXPORTSLIST to nfssvc(2) system call.

Works like NFSSVC_SETEXPORTSLIST but supports "mel_nexports > 1"
and will atomically update the complete exports list for a file system.
 1.78 22-Aug-2018  msaitoh branches: 1.78.16; 1.78.20;
- Cleanup for dynamic sysctl:
- Remove unused *_NAMES macros for sysctl.
- Remove unused *_MAXID for sysctls.
- Move CTL_MACHDEP sysctl definitions for m68k into m68k/include/cpu.h and
use them on all m68k machines.
 1.77 25-Jan-2018  riastradh branches: 1.77.2; 1.77.4;
Use a random opaque cookie, not kva pointer, for nfssvc(2).

(What were they smoking?!)

I suspect most of this is actually dead code that wasn't properly
amputated along with the rest of the gangrene of NFSKERB a decade
ago, but I'm out of time to investigate further. If someone else
wants to kill NFSSVC_AUTHIN/NFSSVC_AUTHINFAIL and the rest of the
tentacular kerberosity, be my guest.

Noted by Silvio Cesare of InfoSect.
 1.76 21-Jan-2018  christos PR/40491: From Tobias Ulmer in tech-kern@:
1. Protect the nfs request queue with its own mutex
2. make the nfs_receive queue check for signals so that intr mounts
can be interrupted.
XXX: pullup-8
 1.75 20-Apr-2015  riastradh branches: 1.75.10;
Nix LEASE_READ/LEASE_WRITE from <sys/vnode.h>.
 1.74 24-Apr-2014  christos branches: 1.74.4;
PR/48426: Dimitris Karagkasidis: Convert to sized, unsigned types.
Ideally we could use uint64_t, but for compatibility and performance
we don't (for now)
 1.73 01-Mar-2013  joerg branches: 1.73.6; 1.73.10;
Retire OSI network stack. OK core@
 1.72 02-Mar-2010  pooka branches: 1.72.10; 1.72.20;
Get rid of dependency on fs_nfs.h, i.e. source modules with
conditional content depending on if the NFS client is wanted or
not. The server can now be made an independent module not depending
on the nfs client.

Tested with rump_nfs (standalone client), rump_nfsd (standalone
nfsd) and a qemu installation with both the client and the server.
 1.71 19-Jan-2010  yamt branches: 1.71.2;
remove unused r_timer member.
 1.70 22-Oct-2008  matt Change NFS to use a RB-tree for its FH->nfsnode lookups.
 1.69 04-Dec-2007  yamt branches: 1.69.12; 1.69.16; 1.69.22;
merge non-intrusive nfs changes from vmlocking.
 1.68 28-Oct-2007  yamt branches: 1.68.2; 1.68.4;
make NFS_ATTRTIMEO a function.
 1.67 02-Jun-2007  yamt branches: 1.67.6; 1.67.8; 1.67.12;
add some #include.
 1.66 01-Jun-2007  dogcow it seems like a good idea to include <sys/condvar.h>, as we're using them...
 1.65 01-Jun-2007  yamt use mutex and condvar.
 1.64 30-Apr-2007  yamt remove R_GETONEREP.
 1.63 04-Mar-2007  christos branches: 1.63.2; 1.63.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.62 28-Dec-2006  yamt branches: 1.62.2;
remove several nqnfs definitions.
 1.61 27-Dec-2006  yamt remove nqnfs.
 1.60 04-Sep-2006  yamt branches: 1.60.2;
remove (void *) cast from NFSRVFH_DATA as it sometimes
discards const qualifier. pointed out by Havard Eidnes.
(it wasn't detected by in-tree gcc4. seems like a compiler bug.)
 1.59 02-Sep-2006  yamt nfsd: deal with variable-sized filehandles.
 1.58 13-Jul-2006  martin Fix alignment problems for fhandle_t, exposed by gcc4.1.

While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ,
version the getfh(2) syscall and explicitly pass the size available in
the filehandle from userland.

Discussed on tech-kern, with lots of help from yamt (thanks!).
 1.57 07-Jun-2006  kardel branches: 1.57.2;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.56 14-May-2006  elad branches: 1.56.2;
integrate kauth.
 1.55 03-Jan-2006  yamt branches: 1.55.2; 1.55.4; 1.55.6; 1.55.8; 1.55.10;
move function prototypes from nfs.h to nfs_var.h.
 1.54 03-Jan-2006  yamt nfssvc_nfsd: reduce a chance for a slow peer to capture all our threads.
instead of sleeping to wait for the socket to send our reply,
just hand-off our reply to the thread which is holding the socket.
 1.53 03-Jan-2006  yamt improve nfsd locking.
- don't bother to take nfs_sndlock when doing nfsrv_rcv.
unlike client, we never reconnect.
- nfsrv_getstream: fix the case that m_split sleeps.
- free socket in nfsrv_slpderef rather than nfsrv_zapsock.
fix race with nfssvc_nfsd.
- while i'm here, remove NFSD_WAITING and NFSD_REQINPROG
as they are redundant.
- some comments and assertions.
 1.52 11-Dec-2005  christos branches: 1.52.2;
merge ktrace-lwp.
 1.51 25-Sep-2005  jmmv Add some COMPAT_30 code to let old mountd binaries work after the NFS
exports rototill.
 1.50 23-Sep-2005  jmmv Apply the NFS exports list rototill patch:

- Remove all NFS related stuff from file system specific code.
- Drop the vfs_checkexp hook and generalize it in the new nfs_check_export
function, thus removing redundancy from all file systems.
- Move all NFS export-related stuff from kern/vfs_subr.c to the new
file sys/nfs/nfs_export.c. The former was becoming large and its code
is always compiled, regardless of the build options. Using the latter,
the code is only compiled in when NFSSERVER is enabled. While doing this,
also make some functions in nfs_subs.c conditional to NFSSERVER.
- Add a new command in nfssvc(2), called NFSSVC_SETEXPORTSLIST, that takes a
path and a set of export entries. At the moment it can only clear the
exports list or append entries, one by one, but it is done in a way that
allows setting the whole set of entries atomically in the future (see the
comment in mountd_set_exports_list or in doc/TODO).
- Change mountd(8) to use the nfssvc(2) system call instead of mount(2) so
that it becomes file system agnostic. In fact, all this whole thing was
done to remove a 'XXX' block from this utility!
- Change the mount*, newfs and fsck* userland utilities to not deal with NFS
exports initialization; done internally by the kernel when initializing
the NFS support for each file system.
- Implement an interface for VFS (called VFS hooks) so that several kernel
subsystems can run arbitrary code upon receipt of specific VFS events.
At the moment, this only provides support for unmount and is used to
destroy NFS exports lists from the file systems being unmounted, though it
has room for extension.

Thanks go to yamt@, chs@, thorpej@, wrstuden@ and others for their comments
and advice in the development of this patch.
 1.49 18-Sep-2005  christos Allow turning off the attribute cache.
 1.48 26-Oct-2004  yamt branches: 1.48.10; 1.48.12;
remove #if 0'ed out definition of VA_EXCLUSIVE.
 1.47 26-Oct-2004  yamt remove an unused macro, NMOD.
 1.46 12-May-2004  yamt g/c unused NFS_*ALLOC defines.
 1.45 10-May-2004  yamt don't do kludge for a reply to a retransmitted request
unless we actually retransmitted the request.
 1.44 06-Dec-2003  jonathan branches: 1.44.2; 1.44.4;
Commit message for previous revision to sys/nfs/nfs.h:

Increase NFS_MAXRAHEAD to 32. With 32k read or write requests, that
amounts to 1 Mbyte of read-ahead, enough to cover about 10 ms latency
at gigabit Ethernet speeds. Increase the table of nfsiod kthreads
(NFS_MAXASYNCDAEMON) from 20 to 128, to match the raised value of
NFS_MAXRAHEAD. (Making the limit dynamic requires replacing the
compile-time array with a dynamic structure.)

Add a comment explaining that each read-ahead requires an I/O thread.

Wrap both parameters with an #ifdef <parameter>/#endif, to allow
hand-tuned values or (later) a kernel config-file option override.
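
The arithmetic and the override wrapping described in revision 1.44 can be
sketched as follows. This is only an illustration, not the contents of
nfs.h: the values 32 and 128 come from the log message, the #ifndef
spelling of the guard is an assumption, and readahead_window_bytes() is a
hypothetical helper that does not exist in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Values from the log message. The #ifndef guards are an assumption about
 * how the wrapping is spelled; they let a build supply its own value, for
 * example with -DNFS_MAXRAHEAD=16.
 */
#ifndef NFS_MAXRAHEAD
#define NFS_MAXRAHEAD		32	/* maximum read-ahead requests */
#endif
#ifndef NFS_MAXASYNCDAEMON
#define NFS_MAXASYNCDAEMON	128	/* maximum async I/O threads */
#endif

/*
 * Hypothetical helper: bytes outstanding when every read-ahead slot holds
 * one request of request_size bytes.
 */
static size_t
readahead_window_bytes(size_t request_size)
{
	/* Guard against overflow in the multiplication below. */
	assert(request_size <= SIZE_MAX / NFS_MAXRAHEAD);
	return (size_t)NFS_MAXRAHEAD * request_size;
}
```

With 32k requests the helper yields 32 * 32768 = 1048576 bytes. At
125,000,000 bytes per second (gigabit Ethernet, ignoring framing overhead)
that is roughly 8.4 ms of data, in line with the "about 10 ms" in the log
message.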
 1.43 06-Dec-2003  jonathan *** empty log message ***
 1.42 26-Sep-2003  yamt change n_mtime from time_t to timespec in order to improve
cache consistency.
(1 second granularity is too loose these days.)
 1.41 16-Aug-2003  yamt current trylater/jukebox retry delay is way too long and
it has a bug in the backoff calculation. so,
- clip it to 1-60 sec. (suggested by Rick Macklem)
- use a constant multiplier instead of nfs_backoff, which
is already exponential.
- move some related constant definitions to nfs.h from nqnfs.h and
prefix with NFS_ instead of NQ_ because they are not nqnfs-specific.
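
The retry policy described in revision 1.41 (grow the delay by a constant
multiplier and clip it to 1-60 seconds) might look roughly like the sketch
below. The macro and function names are hypothetical and the multiplier of
2 is an assumption; the log message gives only the 1-60 second bounds.

```c
#include <assert.h>

/* Hypothetical names; only the 1 and 60 second bounds come from the log. */
#define RETRY_DELAY_MIN	1	/* seconds */
#define RETRY_DELAY_MAX	60	/* seconds */
#define RETRY_DELAY_MUL	2	/* assumed constant multiplier */

/*
 * Return the delay to use for the next retry, given the delay used for
 * the previous one. The result is always within [MIN, MAX].
 */
static int
next_retry_delay(int current)
{
	int next;

	assert(current >= 0 && current <= RETRY_DELAY_MAX);
	next = current * RETRY_DELAY_MUL;
	if (next < RETRY_DELAY_MIN)
		next = RETRY_DELAY_MIN;
	if (next > RETRY_DELAY_MAX)
		next = RETRY_DELAY_MAX;
	return next;
}
```

Starting from 0, successive calls give 1, 2, 4, 8, 16, 32, 60, 60, and so
on, so the delay never exceeds one minute.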
 1.40 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.39 29-Jun-2003  fvdl branches: 1.39.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.38 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.37 25-Jun-2003  yamt - instead of scanning a list when looking up
{an idle thread, a socket with pending requests},
maintain dedicated lists of them.
- add spin locks.
 1.36 24-Apr-2003  drochner Change some subordinate functions to take a "struct nfsnode" argument
instead of "struct vnode". This saves a number of pointer dereferences;
it sums up to about half a kB for me. And it paves the way for future
fixes.
While cleaning up, eliminate a write-only member of "struct nfsreq"
and a pointless assignment in the NFS_V2_ONLY case.
 1.35 26-Feb-2003  matt Add MBUFTRACE kernel option.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
 1.34 01-Dec-2002  matt Don't define VA_EXCLUSIVE if it's not defined. If we do, it'll be a
different value from the one in <sys/vnode.h>
 1.33 12-May-2002  matt Eliminate commons
 1.32 29-Nov-2001  christos use struct uucred in nfsd_svcargs so that we don't break the sys_nfssvc() ABI.
 1.31 15-Sep-2001  chs add a new VFS op, vfs_reinit, which is called when desiredvnodes is
adjusted via sysctl. file systems that have hash tables which are
sized based on the value of this variable now resize those hash tables
using the new value. the max number of FFS softdeps is also recalculated.

convert various file systems to use the <sys/queue.h> macros for
their hash tables.
 1.30 03-Aug-2001  jdolecek branches: 1.30.2;
bound check mount args more thoroughly
 1.29 01-Jul-2001  gmcgarry branches: 1.29.2;
Introduce NFS_DEFAULT_NIOTHREADS to define the default number
of nfs_niothreads instead of hard-coding 4.

This change has the advantage that the default can be specified
at compile time. If the root filesystem is mounted over NFS
we don't have an opportunity to use the syscall to limit the
number of threads. Useful on small-memory machines.
 1.28 03-Apr-2001  chs remove a temporary hack now that it's fixed for real. fixes PR 11731.
 1.27 02-Apr-2001  fvdl Set default NFS read and write sizes back to 8k, because a lot of
(old) hardware can't handle more.
 1.26 25-Mar-2001  matt Allow the default NFS_RSIZE and NFS_WSIZE to be overridden.
 1.25 27-Nov-2000  chs branches: 1.25.2;
Initial integration of the Unified Buffer Cache project.
 1.24 19-Sep-2000  fvdl Bump some defaults and maximums to better values.
 1.23 19-Sep-2000  bjh21 New kernel option, NFS_V2_ONLY, which aims to reduce the NFS client to just
that required to support NFSv2 mounts. Not finished yet, but already
provides some 44k of saving in code size on arm26. More savings, and some
documentation, are still to come.
 1.22 09-Jun-2000  fvdl branches: 1.22.2;
Some tweaks to enable NFS over IPv6. The special-casing of AF_INET
should really be removed.
 1.21 15-Apr-2000  tsarna branches: 1.21.2;
Death to nfsiod!

It is replaced by kernel threads that do the same thing. The number of
kernel threads used is set with the vfs.nfs.iothreads sysctl.
 1.20 13-Nov-1998  thorpej branches: 1.20.10;
Clean up the NFS sysctl variables.
 1.19 11-Sep-1998  mycroft Substantial signal handling changes:
* Increase the size of sigset_t to accommodate 128 signals -- adding new
versions of sys_setprocmask(), sys_sigaction(), sys_sigpending() and
sys_sigsuspend() to handle the changed arguments.
* Abstract the guts of sys_sigaltstack(), sys_setprocmask(), sys_sigaction(),
sys_sigpending() and sys_sigsuspend() into separate functions, and call them
from all the emulations rather than hard-coding everything. (Avoids using
the stackgap crap for these system calls.)
* Add a new flag (p_checksig) to indicate that a process may have signals
pending and userret() needs to do the full (slow) check.
* Eliminate SAS_ALTSTACK; it's exactly the inverse of SS_DISABLE.
* Correct emulation bugs with restoring SS_ONSTACK.
* Make the signal mask in the sigcontext always use the emulated mask format.
* Store signals internally in sigaction structures, rather than maintaining a
bunch of little sigsets for each SA_* bit.
* Keep track of where we put the signal trampoline, rather than figuring it out
in *_sendsig().
* Issue a warning when a non-emulated sigaction bit is observed.
* Add missing emulated signals, and a native SIGPWR (currently not used).
* Implement the `not reset when caught' semantics for relevant signals.

Note: Only code touched by the i386 port has been modified. Other ports and
emulations need to be updated.
 1.18 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.17 19-Oct-1997  fvdl * Implement optional 32 <-> 64 bit directory cookie translation. This uses
the directory cache as translation table. See nfs_subs.c for comments.
Makes the code a bit more complex to look at than I would have liked,
but doesn't affect the speed of the default behavior.
* Optimize caching behavior a bit when buffers are invalidated.
* Save some RPCs in readdir operations by not bothering if there is
a small amount left to do to fill the buffer. It'll be done in the
next RPC with a larger chunk anyway. Wastes a bit of buffer space
but is faster.
* Make n_vattr an allocated vattr struct. This avoids nfsnode bloat,
and is friendlier to the malloc routines.
 1.16 10-Oct-1997  fvdl * New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.15 24-Jun-1997  fvdl branches: 1.15.4; 1.15.6;
Add a few defines for WebNFS support.
 1.14 12-May-1997  fvdl Store RPC procnum consistently as a u_int32_t. This is as it should be,
and avoids possible server crashes due to bogus comparisons. Partly
from BSDI.
 1.13 10-Dec-1996  mycroft Allocate real malloc types for NFS, rather than using M_TEMP.
 1.12 02-Dec-1996  thorpej NFS performance improvement from Doug Rabson/FreeBSD:

Improve the queuing algorithms used by NFS' asynchronous i/o. The
existing mechanism uses a global queue for some buffers and the
vp->b_dirtyblkhd queue for others. This turns sequential writes into
randomly ordered writes to the server, affecting both read and write
performance. The existing mechanism also copes badly with hung
servers, tending to block accesses to other servers when all the iods
are waiting for a hung server.

The new mechanism uses a queue for each mount point. All asynchronous
i/o goes through this queue which preserves the ordering of requests.
A simple mechanism ensures that the iods are shared out fairly between
active mount points.

Reviewed/integrated/approved by Frank van der Linden <fvdl@netbsd.org>
 1.11 27-May-1996  fvdl Align things right in NWDELAYHASH (for the Alpha). This fixes crashes in
the server code. From John Birell.
 1.10 18-Feb-1996  fvdl branches: 1.10.4;
Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.9 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.8 26-Mar-1995  jtc KERNEL -> _KERNEL
 1.7 18-Aug-1994  mycroft More LIST/CIRCLEQ migration.
 1.6 17-Aug-1994  mycroft Convert some more lists and queues.
 1.5 17-Aug-1994  mycroft Change the reply list to a TAILQ.
 1.4 29-Jun-1994  cgd branches: 1.4.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.3 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.2 20-May-1993  cgd more rcs id adding and header cleanup. i like vi macros!
 1.1 20-Apr-1993  mycroft branches: 1.1.1;
Restore files lost during crash.
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.4.2.1 19-Aug-1994  mycroft update from trunk
 1.10.4.2 10-Dec-1996  mycroft From trunk:
Allocate real malloc types for NFS, rather than using M_TEMP.
 1.10.4.1 27-May-1996  fvdl From trunk:

Align things right in NWDELAYHASH (for the Alpha). This fixes crashes in
the server code. From John Birell.
 1.15.6.1 08-Sep-1997  thorpej Significantly restructure the way signal state for a process is stored.
Rather than using bitmasks to redundantly store the information kept
in the process's sigacts (because the sigacts was kept in the u-area),
hang sigacts directly off the process, and access it directly.

Simplify signal setup code tremendously by storing information in
the sigacts as an array of struct sigactions, rather than in a different
format, since userspace uses sigactions.

Make sigacts sharable by adding reference counting.
 1.15.4.1 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.20.10.4 21-Apr-2001  bouyer Sync with HEAD
 1.20.10.3 27-Mar-2001  bouyer Sync with HEAD.
 1.20.10.2 08-Dec-2000  bouyer Sync with HEAD.
 1.20.10.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.21.2.1 22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.22.2.4 09-Feb-2002  he Pull up revision 1.32 (requested by he):
Widen cr_ref to prevent overflow.
(Missed in previous round of commits on this issue.)
 1.22.2.3 16-Aug-2001  tv Pullup [jdolecek]:

sys/miscfs/umapfs/umap_vfsops.c 1.29-1.30
sys/kern/vfs_subr.c 1.156
sys/nfs/nfs.h 1.30

Bounds check mount args.
 1.22.2.2 06-Apr-2001  he Pull up revisions 1.26-1.27 (requested by fvdl):
Adjust default NFS operation size back to 8KB on all systems
except on i386.
 1.22.2.1 14-Dec-2000  he Pull up revision 1.24 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.25.2.6 11-Dec-2002  thorpej Sync with HEAD.
 1.25.2.5 20-Jun-2002  nathanw Catch up to -current.
 1.25.2.4 08-Jan-2002  nathanw Catch up to -current.
 1.25.2.3 21-Sep-2001  nathanw Catch up to -current.
 1.25.2.2 24-Aug-2001  nathanw Catch up with -current.
 1.25.2.1 09-Apr-2001  nathanw Catch up with -current.
 1.29.2.3 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.29.2.2 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.29.2.1 25-Aug-2001  thorpej Merge Aug 24 -current into the kqueue branch.
 1.30.2.1 01-Oct-2001  fvdl Catch up with -current.
 1.39.2.6 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.39.2.5 02-Nov-2004  skrll Sync with HEAD.
 1.39.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.39.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.39.2.2 03-Aug-2004  skrll Sync with HEAD
 1.39.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.44.4.2 27-Oct-2005  riz Pull up following revision(s) (requested by christos in ticket #5863):
sys/nfs/nfs_subs.c: revision 1.152 via patch
sys/nfs/nfs.h: revision 1.49
sys/nfs/nfs_vfsops.c: revision 1.149 via patch
usr.sbin/amd/include/config.h: revision 1.36
sys/nfs/nfs_vnops.c: revision 1.227 via patch
sys/nfs/nfsmount.h: revision 1.34
Allow the attribute cache to be turned off, and allow amd to do it.
 1.44.4.1 11-Jan-2005  jmc Pullup patch (requested by yamy in ticket #1078)

Don't do kludge for a reply to a retransmitted request
unless we actually retransmitted the request.
 1.44.2.1 11-Jan-2005  jmc Pullup patch (requested by yamy in ticket #1078)

Don't do kludge for a reply to a retransmitted request
unless we actually retransmitted the request.
 1.48.12.7 27-Feb-2008  yamt revert incomplete nfs client locking for now.
 1.48.12.6 15-Feb-2008  yamt - sprinkle some locks.
- disable MNT_UPDATE because it involves too much locking headache.
- don't overwrite other bits in v_vflags when setting VV_ROOT.
 1.48.12.5 07-Dec-2007  yamt sync with head
 1.48.12.4 15-Nov-2007  yamt sync with head.
 1.48.12.3 03-Sep-2007  yamt sync with head.
 1.48.12.2 30-Dec-2006  yamt sync with head.
 1.48.12.1 21-Jun-2006  yamt sync with head.
 1.48.10.1 26-Sep-2005  tron Pull up following revision(s) (requested by christos in ticket #816):
sys/nfs/nfs.h: revision 1.49
sys/nfs/nfsmount.h: revision 1.34
Allow turning off the attribute cache.
 1.52.2.1 15-Jan-2006  yamt sync with head.
 1.55.10.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.55.8.3 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.55.8.2 12-Mar-2006  elad Get rid of NFSW_SAMECRED() that uses memcmp() to compare two credentials,
and use a new nfsrv_samecred(), using kauth(9).

Note that the NFSW_SAMECRED() macro used to check nd_flag of both
descriptors for NB_KERBAUTH too; we don't do that. [documented]

Based on code in FreeBSD, thanks to Jeff Roberson.
 1.55.8.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.55.6.5 14-Sep-2006  yamt sync with head.
 1.55.6.4 03-Sep-2006  yamt sync with head.
 1.55.6.3 11-Aug-2006  yamt sync with head
 1.55.6.2 26-Jun-2006  yamt sync with head.
 1.55.6.1 24-May-2006  yamt sync with head.
 1.55.4.2 01-Jun-2006  kardel Sync with head.
 1.55.4.1 04-Feb-2006  simonb Adapt for timecounters: mostly use get*time() and use "time_second"
instead of "time.tv_sec".
 1.55.2.1 09-Sep-2006  rpaulo sync with head
 1.56.2.1 19-Jun-2006  chap Sync with head.
 1.57.2.1 13-Jul-2006  gdamore Merge from HEAD.
 1.60.2.1 12-Jan-2007  ad Sync with head.
 1.62.2.2 07-May-2007  yamt sync with head.
 1.62.2.1 12-Mar-2007  rmind Sync with HEAD.
 1.63.4.1 11-Jul-2007  mjf Sync with head.
 1.63.2.3 26-Aug-2007  yamt - mark nfssvc(2) MPSAFE and move the most of nfsd out of the kernel lock.
- remove unused ns_solock.
- remove some of KERNEL_LOCK/UNLOCK which are not necessary on this branch.
 1.63.2.2 09-Jun-2007  ad Sync with head.
 1.63.2.1 08-Jun-2007  ad Sync with head.
 1.67.12.1 13-Nov-2007  bouyer Sync with HEAD
 1.67.8.2 09-Jan-2008  matt sync with HEAD
 1.67.8.1 06-Nov-2007  matt sync with HEAD
 1.67.6.2 09-Dec-2007  jmcneill Sync with HEAD.
 1.67.6.1 29-Oct-2007  joerg Sync with HEAD.
 1.68.4.2 08-Dec-2007  ad Sync with head.
 1.68.4.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.68.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.69.22.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.69.16.3 11-Mar-2010  yamt sync with head
 1.69.16.2 04-May-2009  yamt sync with head.
 1.69.16.1 27-Apr-2008  yamt commit some work-in-progress changes to make nfs client mp-safe to a branch,
so that they won't get lost.
- sprinkle some locking
- mark the filesystem, nfstimer callout, and kq kthread mp-safe
- add assertions and comments
- disable upgrade mount for now
- some unrelated cosmetic changes
 1.69.12.1 17-Jan-2009  mjf Sync with HEAD.
 1.71.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.72.20.3 03-Dec-2017  jdolecek update from HEAD
 1.72.20.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.72.20.1 23-Jun-2013  tls resync from head
 1.72.10.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.73.10.1 10-Aug-2014  tls Rebase.
 1.73.6.1 18-May-2014  rmind sync with head
 1.74.4.1 06-Jun-2015  skrll Sync with HEAD
 1.75.10.1 08-Jun-2018  martin Pull up following revision(s) (requested by maya in ticket #856):

sys/nfs/nfs.h: revision 1.76
sys/nfs/nfs_subs.c: revision 1.230
sys/nfs/nfs_socket.c: revision 1.199
sys/nfs/nfs_clntsocket.c: revision 1.6

PR/40491: From Tobias Ulmer in tech-kern@:
1. Protect the nfs request queue with its own mutex
2. make the nfs_receive queue check for signals so that intr mounts
can be interrupted.

XXX: pullup-8
 1.77.4.1 10-Jun-2019  christos Sync with HEAD
 1.77.2.1 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.78.20.1 06-Jun-2021  cjep sync with head
 1.78.16.1 17-Jun-2021  thorpej Sync w/ HEAD.
 1.80.10.1 02-Aug-2025  perseant Sync with HEAD
 1.202 13-Feb-2024  andvar s/Enque/Enqueue/ in comment.
 1.201 24-Jun-2022  hannken Remove an incorrect assertion.

Just issue a readahead near the end of the vnode and enqueue an async read.
Now let nfs_setattr() truncate the vnode, set its new size and
nfs_vinvalbuf() waits for the pages from the readahead to become unbusy.

The async read gets processed and returns with uio_resid > 0 because there
is a hole and no write after the hole has been pushed yet. As the vnode
size already got truncated to the new size the KASSERT() incorrectly fires.
 1.200 20-Oct-2021  thorpej Overhaul of the EVFILT_VNODE kevent(2) filter:

- Centralize vnode kevent handling in the VOP_*() wrappers, rather than
forcing each individual file system to deal with it (except VOP_RENAME(),
because VOP_RENAME() is a mess and we currently have 2 different ways
of handling it; at least it's reasonably well-centralized in the "new"
way).
- Add support for NOTE_OPEN, NOTE_CLOSE, NOTE_CLOSE_WRITE, and NOTE_READ,
compatible with the same events in FreeBSD.
- Track which kevent notifications clients are interested in receiving
to avoid doing work for events no one cares about (avoiding, e.g.
taking locks and traversing the klist to send a NOTE_WRITE when
someone is merely watching for a file to be deleted, for example).

In support of the above:

- Add support in vnode_if.sh for specifying PRE- and POST-op handlers,
to be invoked before and after vop_pre() and vop_post(), respectively.
Basic idea from FreeBSD, but implemented differently.
- Add support in vnode_if.sh for specifying CONTEXT fields in the
vop_*_args structures. These context fields are used to convey information
between the file system VOP function and the VOP wrapper, but do not
occupy an argument slot in the VOP_*() call itself. These context fields
are initialized and subsequently interpreted by PRE- and POST-op handlers.
- Version VOP_REMOVE(), using a context field for the file system to report
back the resulting link count of the target vnode. Return this in tmpfs,
udf, nfs, chfs, ext2fs, lfs, and ufs.

NetBSD 9.99.92.
 1.199 05-Sep-2020  riastradh Round of uvm.h cleanup.

The poorly named uvm.h is generally supposed to be for uvm-internal
users only.

- Narrow it to files that actually need it -- mostly files that need
to query whether curlwp is the pagedaemon, which should maybe be
exposed by an external header.

- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
by it. We should split up uvm_extern.h but this will serve for now
to reduce the uvm.h dependencies.

- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
UVMHIST(ubchist), since ubchist is declared in uvm.h but the
reference evaporates if UVMHIST is not defined, so we reduce header
file dependencies.

- Make uvm_device.h and uvm_swap.h independently includable while
here.

ok chs@
 1.198 23-May-2020  ad Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
 1.197 17-May-2020  ad Start trying to reduce cache misses on vm_page during fault processing.

- Make PGO_LOCKED getpages imply PGO_NOBUSY and remove the latter. Mark
pages busy only when there's actually I/O to do.

- When doing COW on a uvm_object, don't mess with neighbouring pages. In
all likelihood they're already entered.

- Don't mess with neighbouring VAs that have existing mappings as replacing
those mappings with same can be quite costly.

- Don't enqueue pages for neighbour faults unless not enqueued already, and
don't activate centre pages unless uvmpdpol says it's useful.

Also:

- Make PGO_LOCKED getpages on UAOs work more like vnodes: do gang lookup in
the radix tree, and don't allocate new pages.

- Fix many assertion failures around faults/loans with tmpfs.
 1.196 23-Apr-2020  ad PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)

- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
somewhere. Use it to decide whether to do direct-mapped copy, rather than
poking around directly in the vnode in ubc_uiomove(), which is ugly and
doesn't work for tmpfs. It would be nicer to contain all this in UVM but
the filesystem provides the needed locking here (VV_MAPPED) and to
reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS(). Pass in UBC_ISMAPPED where
appropriate.
 1.195 22-Mar-2020  ad branches: 1.195.2;
Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core. Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.
 1.194 23-Feb-2020  ad UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
 1.193 15-Jan-2020  ad Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.
 1.192 13-Dec-2019  ad branches: 1.192.2;
Break the global uvm_pageqlock into a per-page identity lock and a private
lock for use of the pagedaemon policy code. Discussed on tech-kern.

PR kern/54209: NetBSD 8 large memory performance extremely low
PR kern/54210: NetBSD-8 processes presumably not exiting
PR kern/54727: writing a large file causes unreasonable system behaviour
 1.191 15-Jul-2015  manu branches: 1.191.18;
Fix soft NFS force unmount

For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that await an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.

Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.

Reviewed by Chuck Silvers.
 1.190 05-Sep-2014  matt branches: 1.190.2;
Don't use catch as a variable name.
 1.189 12-Aug-2013  hannken branches: 1.189.4;
Function nfs_vinvalbuf() ignores errors from vinvalbuf() and therefore
delayed write errors may get lost.
Change nfs_vinvalbuf() to keep errors from vinvalbuf() for fsync() or close().

Presented on tech-kern@

Fix for PR kern/47980 (NFS over-quota not detected if utimes() called
before fsync()/close())
 1.188 27-Sep-2011  christos branches: 1.188.2; 1.188.8; 1.188.12; 1.188.14; 1.188.16; 1.188.22;
use NFS_MAXNAMLEN for all names.
 1.187 19-Jun-2011  rmind - Fix a silly bug: remove umap from uobj in ubc_release() UBC_UNMAP case.
- Use UBC_WANT_UNMAP() consistently.

ARM (PMAP_CACHE_VIVT case) works again.
 1.186 12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.185 12-Jun-2010  jakllsch branches: 1.185.6;
Fix memory leak during some NFS writes.
 1.184 23-Apr-2010  pooka Enforce RLIMIT_FSIZE before VOP_WRITE. This adds support to file
system drivers where it was missing from and fixes one buggy
implementation. The arguably weird semantics of the check are
maintained (v_size vs. va_bytes, overwrite).
 1.183 14-Mar-2009  dsl branches: 1.183.2; 1.183.4;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.182 13-Mar-2009  yamt nfs_bioread: don't truncate values in a debug printf.
 1.181 19-Nov-2008  ad branches: 1.181.4;
Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.180 31-Oct-2008  christos - allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic
 1.179 17-Oct-2008  christos branches: 1.179.2; 1.179.4;
Requested by yamt:
- In getpages don't allocate if we are not locked
- Use kmem_alloc instead of malloc and don't sleep

Also provide a 64 entry stack array so we don't have to allocate in the
common case.
 1.178 17-Oct-2008  dogcow it appears the previous commit's sacrifice was "successful compilation with
NFS_V2_ONLY defined".
 1.177 16-Oct-2008  christos Another sacrifice to the stack protector gods.
 1.176 16-Oct-2008  christos don't use variable allocation on the stack.
 1.175 24-Apr-2008  ad branches: 1.175.2; 1.175.8;
Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
 1.174 29-Mar-2008  yamt branches: 1.174.2;
ansify. from Christoph Egger.
 1.173 02-Jan-2008  yamt branches: 1.173.6;
use kmem_alloc instead of malloc.
 1.172 02-Jan-2008  ad Merge vmlocking2 to head.
 1.171 04-Dec-2007  yamt branches: 1.171.4;
merge non-intrusive nfs changes from vmlocking.
 1.170 26-Nov-2007  pooka branches: 1.170.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.169 28-Oct-2007  yamt branches: 1.169.2;
make NFS_ATTRTIMEO a function.
 1.168 10-Oct-2007  ad branches: 1.168.2;
Merge from vmlocking:

- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
 1.167 08-Oct-2007  ad Merge brelse() changes from the vmlocking branch.
 1.166 10-Aug-2007  yamt branches: 1.166.2; 1.166.4;
- instead of scanning an array of iods, maintain a list of idle iods.
- make nfs_getset_niothreads MP friendly.
 1.165 08-Aug-2007  yamt push kernel_lock a little.
 1.164 29-Jul-2007  ad branches: 1.164.4; 1.164.6;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
 1.163 27-Jul-2007  yamt use ubc_uiomove for read as well.
 1.162 27-Jul-2007  yamt ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.
 1.161 20-Jul-2007  yamt - fix decreasing of vfs.nfs.iothreads after the recent partial merge
of vmlocking.
- don't make nfsiod exit with requests left.
- make NFSSVC_BIOD a dummy so that nfsiod can be simplified.
 1.160 17-Jul-2007  yamt branches: 1.160.2;
remove (void)0; nonsense.
 1.159 17-Jul-2007  yamt fix a typo in a comment.
 1.158 12-Jul-2007  rmind nfs_asyncio: fix the locking in error case, problem was introduced
in 1.153 revision, where ltsleep() was replaced with condvar.

Problem found and fix provided by David A. Holland, PR/36610.
Actually, relock is not needed here, and mutex would be unlocked
only on nfs_sigintr() fail case.
 1.157 09-Jul-2007  ad Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.156 12-Jun-2007  yamt nfs_write:
- IO_SYNC: don't bother to flush dirty pages before copying data from
user buffer.
- IO_APPEND: don't invalidate pages blindly. PR/28472 from Brian Marcotte.
 1.155 05-Jun-2007  yamt improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.
 1.154 09-May-2007  yamt nfs_write: report an error correctly in the case of IO_SYNC.
 1.153 29-Apr-2007  yamt use mutex and condvar.
 1.152 19-Apr-2007  yamt hold proclist_mutex when calling psignal().
 1.151 04-Mar-2007  christos branches: 1.151.2; 1.151.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.150 27-Feb-2007  yamt nfs_getpages: fix an inverted condition in rev.1.147.
 1.149 22-Feb-2007  thorpej TRUE -> true, FALSE -> false
 1.148 21-Feb-2007  thorpej Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.147 15-Feb-2007  yamt branches: 1.147.2;
use mutex and rwlock rather than lockmgr.
 1.146 27-Dec-2006  yamt remove nqnfs.
 1.145 23-Jul-2006  ad branches: 1.145.4;
Use the LWP cached credentials where sane.
 1.144 30-Jun-2006  yamt fix handling of NFSERR_NOTSUPP and NFSERR_BAD_COOKIE,
which have been broken since nfs_socket.c rev.1.115.
 1.143 14-May-2006  elad branches: 1.143.4;
integrate kauth.
 1.142 01-Mar-2006  yamt branches: 1.142.2; 1.142.4; 1.142.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.
 1.141 14-Jan-2006  yamt branches: 1.141.2; 1.141.4;
nfs_doio_read: clear uio_resid when filling a hole.
 1.140 13-Dec-2005  reinoud branches: 1.140.2;
Fix a panic introduced when the ktrace-lwp branch was merged. The
shortcut to the process of the passed lwp panicked the kernel since lwp
could/can be passed as NULL in VOP_WRITE().

This was happening when ktracing to NFS. The function ktrwrite() sets the
uio_lwp to NULL and then calls VOP_WRITE() with this argument. nfs_write()
then accessed lwp *l->l_proc which panicked.

Thanks to David Laight for his help on tracking it down.
 1.139 11-Dec-2005  christos merge ktrace-lwp.
 1.138 29-Nov-2005  yamt merge yamt-readahead branch.
 1.137 04-Nov-2005  yamt branches: 1.137.2;
nfs_bioread: push delayed truncation and tweak loop accordingly.
PR/31926 from Jed Davis.
 1.136 06-Oct-2005  yamt nfs_bioread: handle file truncation on the server a little more gracefully.
 1.135 01-Oct-2005  jdolecek use killproc() for killing the process due to text file modification, so
that it's logged too

PR: 17392 by Greg A. Woods
 1.134 19-Aug-2005  yamt fix some simple bugs in the 64bit ino_t changes.
- edp -> dp
- * -> +
 1.133 19-Aug-2005  christos 64 bit inode changes.
 1.132 21-Jul-2005  yamt use a correct credential for readlink. discussed on source-changes@.
 1.131 21-Jul-2005  yamt nfs_doio_read: revert readlink part of 1.129 and 1.130 because they were wrong.
 1.130 07-Jul-2005  christos Back to using curproc in the VLNK case when uiop->uio_procp == NULL,
and explain why we need to.
 1.129 07-Jul-2005  christos 1. use p = uio->uio_procp consistently and eliminate suspicious uses
of curproc (where uio->uio_procp should be used?). Don't do this
for nfs_commit(), because yamt says it is possibly wrong.
2. nfs_doio() does not use struct proc; remove it and the code to compute it.
3. use copyin_proc() and copyout_proc() instead of copyin() and copyout().
4. check return of copyout_proc(). and mark return from copyin_proc() XXX
5. Eliminate the p == curproc assertion check from nfs_write;
nfs_read does not have it and we might be called in a different
process context anyway (PR 20138).
 1.128 26-Feb-2005  perry branches: 1.128.2; 1.128.4;
nuke trailing whitespace
 1.127 27-Jan-2005  yamt - simplify nfs_bio.c rev.1.126
- add an assertion.

no functional changes.
 1.126 27-Jan-2005  yamt nfs_bioread:
- if a buffer is still empty after successful nfs_doio, it implies EOF.
- don't cache blocks beyond EOF.
 1.125 26-Jan-2005  yamt handle a really empty directory, which doesn't have even the dot entry.
 1.124 09-Jan-2005  chs branches: 1.124.2; 1.124.4;
adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.
 1.123 14-Dec-2004  yamt - centralize code to invalidate stale cache.
- don't ignore errors when invalidating buffers in nfs_open.
 1.122 26-Oct-2004  yamt since daddr_t is 64-bit these days, simply use nfs directory cookies
as buffer cache indexes. regress/sys/fs/getdents is now supposed to work.
fix PR/27112.
 1.121 17-Sep-2004  skrll There's no need to pass a proc value when using UIO_SYSSPACE with
vn_rdwr(9) and uiomove(9).

OK'd by Jason Thorpe
 1.120 15-Sep-2004  yamt fix access-after-free bugs in dircache code by refcounting nfsdircache.
PR/26864.
 1.119 18-Jul-2004  yamt nfs_doio_read: on short read, zero out the rest of the buffer unconditionally.
we can't rely on n_size here because it can be changed under us.
 1.118 11-Jun-2004  yamt nfs_doio_read: use np->n_rcred instead of curproc->p_ucred for VDIR.

XXX maybe it's better to use a cred passed by VOP_READDIR.
 1.117 23-May-2004  christos cut down another 7K by more NFS_V2_ONLY ifdefs.
 1.116 12-Mar-2004  yamt branches: 1.116.2;
introduce a macro NFS_INVALIDATE_ATTRCACHE and use it
instead of "n_attrstamp = 0".
 1.115 10-Jan-2004  yamt comments in nfs_doio_write.
 1.114 07-Dec-2003  fvdl Unix semantics dictate that access checks for files are done when it
is opened. An open file can always be read from and/or written to,
depending on how it was opened.

Therefore, the read/write/commit RPCs should never return EACCES,
as they are only performed on files that have been successfully opened
already.

This change improves the current situation and works in most cases.
It simply always uses the most recently known owner/group of the file,
iff the authentication mechanism is AUTH_UNIX (in other cases, the
creds for a successful open are used, but note that no other cases
are currently implemented).

A retry mechanism can be used to catch a few more cases, but this is
a good improvement for now.
 1.113 17-Nov-2003  jonathan Fix hanging-paren typo.
 1.112 17-Nov-2003  jonathan Change previous patch to have same effect as patch posted to
tech-kern. Suggested reformatting inadvertently changed the meaning of
the code, as noted by YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>.
 1.111 17-Nov-2003  jonathan Commit fix for NFS write deadlock, on filesystems mounted via
local-loopback (lo0). As posted for review on tech-kern 2003-18-09,
with a long comment explaining (one of) the deadlock scenarios.

I've used this since shortly after 2002-09-12, without noticing
performance degradation or instability for non-loopback mounts.
 1.110 26-Sep-2003  yamt change n_mtime from time_t to timespec in order to improve
cache consistency.
(1 second granularity is too loose these days.)
 1.109 17-Sep-2003  yamt don't call nfs_delayedtruncate() from nfs_getpages().
it causes simplelock deadlock.
 1.108 26-Aug-2003  pk VOP_PUTPAGES() must be called with the vnode's interlock held.
 1.107 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.106 03-Aug-2003  pk Make life slightly easier for the compiler's optimisation routines.
 1.105 29-Jun-2003  fvdl branches: 1.105.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.104 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.103 22-May-2003  yamt interlock for NFLUSHINPROG/NFLUSHWANT.
 1.102 21-May-2003  yamt eliminate memcpy in the common and easy case of write.
 1.101 16-May-2003  yamt correct a KASSERT.
 1.100 15-May-2003  yamt acquire vmobjlock when touching pg->flags.
 1.99 07-May-2003  yamt simple lock for nfs iod.
 1.98 03-May-2003  yamt - check page's offset in the object as well. (pointed out by Chuck Silvers.)
- remove false assertion.
 1.97 03-May-2003  yamt - if writerpc ends with a stable result, no need to commit them anymore.
- add comments.
 1.96 03-May-2003  yamt better handling of write verifier change.
 1.95 18-Apr-2003  yamt fix a use of an uninitialized variable.
 1.94 15-Apr-2003  yamt remove line-wrapping that is no longer needed.
 1.93 12-Apr-2003  yamt fix a typo in the previous.
 1.92 12-Apr-2003  yamt set b_resid correctly.
 1.91 12-Apr-2003  yamt split nfs_doio to nfs_doio_{phys,read,write} to avoid too deep indents.
 1.90 12-Apr-2003  yamt - do FILESYNC writes if we're freeing the page or the page doesn't
belong to us. otherwise, data will be lost on server crash.
- use b_bcount instead of b_bufsize to determine
how many pages we should deal with.

based on a patch from Chuck Silvers.
discussed on tech-kern.
 1.89 09-Apr-2003  yamt rename a very confusing variable name.
(must_commit -> stalewriteverf)
 1.88 09-Apr-2003  yamt when commit fails and we fall back to write, reset 'off' and 'cnt'
because they can be changed in the 'needcommit' path.
 1.87 09-Apr-2003  yamt group per-iod data together.
 1.86 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.85 29-Oct-2002  yamt fix panic (assertion failure) on error case.
if uiomove fails, we should clean up pages past eof.

the problem reported by kay.
ok'ed by Chuck Silvers.
 1.84 23-Oct-2002  jdolecek merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
 1.83 21-Oct-2002  yamt fix a page locking deadlock problem for nfs.

add a flag that specifies whether the file can be truncated safely or not
to nfsm_loadattr and friends. when it isn't safe, just mark the nfsnode
as "should be truncated later".

ok'ed by Frank van der Linden and Chuck Silvers.
close kern/18036.
 1.82 01-Sep-2002  bouyer nfs_doio(): handle the case where nfs_writerpc() returned error != 0.
Fix kern/18125. OK'd by thorpej and chs.
 1.81 06-May-2002  enami branches: 1.81.4;
Remove wrong assertion in previous commit.
 1.80 06-May-2002  enami The per nfsnode n_commitlock is a sleep lock, but we can't sleep if
in a PGO_LOCKED getpages request. So, just make the lock fail and tell
the caller that there are no pages available if we can't acquire it.
The caller will call us again soon without PGO_LOCKED. Reviewed by chuq.
 1.79 10-Apr-2002  chs only use UBC_FAULTBUSY to access offsets past EOF,
otherwise we can deadlock trying to busy the same page in uiomove().
 1.78 25-Mar-2002  chs remove PGO_WEAK, it isn't needed anymore.
 1.77 23-Mar-2002  chs only do v3 stuff for v3 filesystems.
 1.76 16-Mar-2002  chs make sure that if NMODIFIED is clear, all pages attached to the vnode are
clean and without writable mappings. if we try to flush dirty pages past
EOF to the server when NMODIFIED is clear, we'll update the attrcache before
doing the write, which will try to free the pages past EOF and deadlock.
to deal with this, we write-protect pages before we send them to the server,
and restrict ourselves to creating read-only mappings if NMODIFIED isn't set.
score another one for enami.
 1.75 31-Jan-2002  chs use curproc instead of b_proc for NFS. that's what we want for sync commits
and it doesn't cause any problems for async commits.
 1.74 26-Jan-2002  chs re-enable NFSv3 commit RPCs by abandoning my new approach in favor of
frank's scheme, with one new twist: don't wait until we've totally run
out of free pages before committing, but instead notice when we've built
up a largish range of uncommitted pages and commit only the older half of
the range, which is likely to already be on disk on the server.
 1.73 31-Dec-2001  chs fix locking in nfs_getpages().
 1.72 30-Nov-2001  chs call VOP_PUTPAGES() directly instead of indirecting through
the UVM pager op vector.
 1.71 10-Nov-2001  lukem add RCSIDs
 1.70 13-Oct-2001  simonb branches: 1.70.2;
Remove some variables that are only ever set and never referenced.
 1.69 15-Sep-2001  chs a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
 1.68 27-Jun-2001  thorpej branches: 1.68.2; 1.68.4;
Make sure to add NFS vnodes to the syncerd worklist.
 1.67 26-May-2001  chs replace vm_page_t with struct vm_page *.
 1.66 16-Apr-2001  chs reads at or after EOF should "succeed".
 1.65 03-Apr-2001  chs handle partially full directory buffers by only using (b_bcount - b_resid)
bytes of data from the buffer.
 1.64 10-Mar-2001  chs eliminate the VM_PAGER_* error codes in favor of the traditional E* codes.
the mapping is:

VM_PAGER_OK       0
VM_PAGER_BAD      <unused>
VM_PAGER_FAIL     <unused>
VM_PAGER_PEND     0 (see below)
VM_PAGER_ERROR    EIO
VM_PAGER_AGAIN    EAGAIN
VM_PAGER_UNLOCK   EBUSY
VM_PAGER_REFAULT  ERESTART

for async i/o requests, it used to be possible for the request to
be converted to sync, and the pager would return VM_PAGER_OK or VM_PAGER_PEND
to indicate whether the caller should perform post-i/o cleanup.
this is no longer allowed; pagers must now return 0 to indicate that
the async i/o was successfully started, and the caller never needs to
worry about doing the post-i/o cleanup.
 1.63 27-Feb-2001  chs branches: 1.63.2;
min() -> MIN(), max() -> MAX().
fixes more problems with file offsets > 4GB.
 1.62 18-Feb-2001  chs fix a couple more bugs:
- in nfs_getpages(), unbusy any pages that we don't free in the error path.
- in nfs_putpages(), only call biowait() if we actually started any i/os.
 1.61 05-Feb-2001  chs fix several bugs:
- in the cases where we skip over the i/o loop, increment npages by ridx
so that when the cleanup code starts processing the pgs array at index 0
it'll actually process all of the pages.
- process the PG_RELEASED flag when unbusying pages.
- add some missing MP locking.
- use MIN() and MAX() instead of min() and max() since the latter are
functions which take arguments of type "int" but we call them with
values of type "off_t", so the values could be truncated.
 1.60 30-Jan-2001  thorpej Make sure bp->b_proc is initialized. Should fix a deref-garbage-pointer
problem reported by msaitoh@netbsd.org. NOTE: These are marked XXXUBC
since the code that allocates the bufs is new with UBC, but it may be
the case that bp->b_proc needs to be initialized to curproc (it's used
in a call to nfs_sigintr()).
 1.59 07-Jan-2001  enami Use uvm_aio_biodone instead of uvm_aio_aiodone for top-level buf
so that uvmexp.paging is updated if this i/o was initiated by
the pagedaemon.
 1.58 27-Dec-2000  chs fix several bugs:
- fix math when skipping writing pages that just need a commit.
- clear the needcommit stuff and PG_RDONLY flags on pages returned for
overwrite requests as well as for normal write faults.
- bail out of nfs_write() if we get an error.
- remove a bogus attempt to clean up after failed uiomove()s.
- bring over a workaround for a lock-ordering problem from the genfs code.
- add some missing MP locking.
 1.57 13-Dec-2000  jdolecek <sys/trace.h> is not needed here
 1.56 09-Dec-2000  chs only zero the part of the page after EOF if we're actually
initializing the page.
 1.55 04-Dec-2000  fvdl Initialize 'error' to 0, so that nfs_putpages doesn't return garbage
when pages already have been committed and nothing needs to be done.
 1.54 27-Nov-2000  chs Initial integration of the Unified Buffer Cache project.
 1.53 19-Sep-2000  bjh21 Extend NFS_V2_ONLY to remove NQNFS lease support as well. Saves another 10k.
 1.52 19-Sep-2000  fvdl Move handling of B_NEEDCOMMIT buffers to nfs_doio, so that bawrite() calls
for them are actually done asynchronously. Idea taken from FreeBSD.

Do away with nfs_writebp completely, it's not needed anymore.

Keep an eye on the range of a file that needs to be committed, and
do it in heaps.
 1.51 19-Sep-2000  bjh21 New kernel option, NFS_V2_ONLY, which aims to reduce the NFS client to just
that required to support NFSv2 mounts. Not finished yet, but already
provides some 44k of saving in code size on arm26. More savings, and some
documentation, are still to come.
 1.50 27-Jun-2000  mrg remove include of <vm/vm.h>
 1.49 18-May-2000  pk branches: 1.49.4;
Fix printf() format.
 1.48 30-Mar-2000  augustss Remove register declarations.
 1.47 23-Nov-1999  fvdl Be more careful to block bio interrupts for some data structures. There
were at least a few missed cases where vp->v_{clean,dirty}blkhd were
unprotected since the softdep/trickle sync merge.
 1.46 15-Nov-1999  fvdl Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O
 1.45 24-Mar-1999  mrg branches: 1.45.4; 1.45.8; 1.45.10; 1.45.14;
completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.
 1.44 09-Aug-1998  perry branches: 1.44.2;
bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.43 21-Jun-1998  fvdl Fix possible overflow problem in read size computation.
 1.42 10-Feb-1998  mrg - add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.
 1.41 05-Feb-1998  mrg initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)
 1.40 23-Nov-1997  fvdl Move the EOF check after getting a block out of the if() that determines
whether we get it off the wire. An nfsiod might have been busy with
it, and finished while we were waiting for it in nfs_getcacheblk, so
we need to check for EOF again no matter what.
 1.39 23-Oct-1997  fvdl Oops. Fix goof in previous change.
 1.38 22-Oct-1997  fvdl Just return immediately in nfs_bioread if we got an empty buffer because
of EOF on a directory.
 1.37 20-Oct-1997  thorpej branches: 1.37.2;
Fix alignment problems. From Frank van der Linden <fvdl@NetBSD.ORG>.
 1.36 19-Oct-1997  fvdl Only do readaheads when reading sequential blocks; check v_lastr to
achieve this. Improves performance for demand paging. From Chris Demetriou.
 1.35 19-Oct-1997  fvdl * Implement optional 32 <-> 64 bit directory cookie translation. This uses
the directory cache as translation table. See nfs_subs.c for comments.
Makes the code a bit more complex to look at than I would have liked,
but doesn't affect the speed of the default behavior.
* Optimize caching behavior a bit when buffers are invalidated.
* Save some RPCs in readdir operations by not bothering if there is
a small amount left to do to fill the buffer. It'll be done in the
next RPC with a larger chunk anyway. Wastes a bit of buffer space
but is faster.
* Make n_vattr an allocated vattr struct. This avoids nfsnode bloat,
and is friendlier to the malloc routines.
 1.34 10-Oct-1997  fvdl * New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.33 17-Jul-1997  fvdl branches: 1.33.2;
* Deal with servers that don't give complete FSINFO (like NT)
From Olaf Seibert <rhialto@polder.ubc.kun.nl> (PR 3687)
* Make an attempt to check the maximum filesize before attempting
a write to the server, as write RPCs will typically happen
asynchronously, and the process will not see the error.
Fixes problems with unexpectedly truncated files at 4G
* Pass up errors in nfs_writerpc correctly
 1.32 04-Jul-1997  drochner Don't cast 64bit (off_t) file sizes to vm_offset_t (32bit on many
architectures), truncate them intelligently instead.
The truncation is centralized in vnode_pager.c.
This prevents wrap-around effects when parts of large (>2^32 byte) files
are mmapped.
Don't allow mmap above the numerical range of vm_offset_t.
This is considered a temporary solution until the vm system handles the
object sizes/offsets more cleanly.
 1.31 20-Apr-1997  fvdl Only wake up one nfsiod when there is an async write to do. (from FreeBSD).
 1.30 02-Dec-1996  thorpej NFS performance improvement from Doug Rabson/FreeBSD:

Improve the queuing algorithms used by NFS' asynchronous i/o. The
existing mechanism uses a global queue for some buffers and the
vp->v_dirtyblkhd queue for others. This turns sequential writes into
randomly ordered writes to the server, affecting both read and write
performance. The existing mechanism also copes badly with hung
servers, tending to block accesses to other servers when all the iods
are waiting for a hung server.

The new mechanism uses a queue for each mount point. All asynchronous
i/o goes through this queue which preserves the ordering of requests.
A simple mechanism ensures that the iods are shared out fairly between
active mount points.

Reviewed/integrated/approved by Frank van der Linden <fvdl@netbsd.org>
 1.29 13-Oct-1996  christos revert kprintf changes
 1.28 10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.27 02-Jul-1996  fvdl Don't mistake a non-async block that needs to be committed for an
interrupted write.
 1.26 23-May-1996  fvdl * Make mounts with symlinks work (needed for direct mounts with amd). PR #1917
* Never change the NQNFS flag and/or version when just doing an update mount.
Fixes a problem that made diskless booting impossible under some
circumstances.
 1.25 29-Feb-1996  fvdl branches: 1.25.4;
Make sure to clear B_NEEDCOMMIT in the right spot. Fix 'officially blessed'
by Rick Macklem. Fixes PR kern/2128.
 1.24 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.23 09-Feb-1996  christos nfs prototype changes
 1.22 01-Feb-1996  jtc Rename struct timespec fields to conform to POSIX.1b
 1.21 24-Jul-1995  cgd avoid unnecessary aging of buffers. This used to make sense, when buffer
caches were much smaller, but makes little sense now, and will become more
useless as RAM (and buffer cache) sizes grow. Suggested by Bob Baron.
 1.20 18-Mar-1995  gwr Make call to nfs_writerpc() consistent with others.
 1.19 12-Jan-1995  mycroft Add two missing brelse() calls. From Rick Macklem.
 1.18 10-Jan-1995  mycroft Make sure readdir requests are only truncated on block boundaries.
 1.17 20-Jul-1994  mycroft Fix a problem with write-behind causing processes to be killed occasionally.
From Rick Macklem.
 1.16 12-Jul-1994  cgd minor cache consistency fix
 1.15 29-Jun-1994  cgd branches: 1.15.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.14 22-Jun-1994  pk straighten out diskless swap code somewhat.
 1.13 15-Jun-1994  mycroft Turn P_NOSWAP and P_PHYSIO into a hold count, as suggested by a comment.
 1.12 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.11 24-May-1994  cgd MIN -> min, MAX -> max
 1.10 25-Apr-1994  cgd some prototype cleanup, eliminate/replace bogus types (e.g. quad and
u_quad) -> use better types (e.g. quad_t & u_quad_t in inodes),
some cleanup.
 1.9 21-Apr-1994  cgd Convert mount, vnode, and buf structs to use <sys/queue.h>. Also,
some knf and structure frobbing to do along with it.
 1.8 18-Dec-1993  mycroft Canonicalize all #includes.
 1.7 03-Sep-1993  jtc branches: 1.7.2;
Include systm.h to get prototypes (and possibly inlines) of *max functions.
 1.6 13-Jul-1993  cgd get rid of some more bogus changes from a week ago
 1.5 13-Jul-1993  cgd diskless changes made last time were hosed; were using NULL for
"no credentials" rather than NOCRED.
 1.4 07-Jul-1993  cgd changes from ws to support diskless booting... these are "OK" on inspection
and after testing... (actually, currently, none of the changed
code is even used...)
 1.3 30-Jun-1993  andrew Paul Kranenburg's VM deadlock fixes. (patchkit patch 00147, part 2)
 1.2 20-May-1993  cgd more rcs id adding and header cleanup. i like vi macros!
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.3 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.7.2.1 14-Nov-1993  mycroft Canonicalize all #includes.
 1.15.2.2 20-Jul-1994  cgd from trunk, per mycroft
 1.15.2.1 12-Jul-1994  cgd consistency fix, from trunk
 1.25.4.2 08-Jul-1996  jtc Pulled up from rev 1.27 by request from Frank van der Linden
 1.25.4.1 25-May-1996  fvdl Pull in bugfixes from main branch.
 1.33.2.1 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.37.2.3 24-Nov-1997  mellon Pull rev 1.40 up from trunk (fvdl)
 1.37.2.2 23-Oct-1997  mellon Pull rev 1.39 up from trunk
 1.37.2.1 23-Oct-1997  mellon Pull rev 1.38 from main trunk
 1.44.2.5 30-May-1999  chs vm_page's blkno is gone.
 1.44.2.4 30-Apr-1999  chs change ubc_alloc()'s length arg to be a pointer instead of the value.
the pointed-to value is the total desired length on input,
and is updated to the length that will fit in the returned window.
this allows callers of ubc_alloc() to be ignorant of the window size.
 1.44.2.3 25-Feb-1999  chs major overhaul of getpages and putpages functions.
 1.44.2.2 16-Nov-1998  chs set NMODIFIED in nfs_write().
putpage is now called with uobj unlocked.
remove some debugging printfs.
 1.44.2.1 09-Nov-1998  chs initial snapshot. lots left to do.
 1.45.14.2 27-Dec-1999  wrstuden Pull up to last week's -current.
 1.45.14.1 21-Dec-1999  wrstuden Initial commit of recent changes to make DEV_BSIZE go away.

Runs on i386, needs work on other arch's. Main kernel routines should be
fine, but a number of the stand programs need help.

cd, fd, ccd, wd, and sd have been updated. sd has been tested with non-512
byte block devices. vnd, raidframe, and lfs need work.

Non 2**n block support is automatic for LKM's and conditional for kernels
on "options NON_PO2_BLOCKS".
 1.45.10.1 19-Oct-1999  fvdl Bring in Kirk McKusick's FFS softdep code on a branch.
 1.45.8.8 21-Apr-2001  bouyer Sync with HEAD
 1.45.8.7 12-Mar-2001  bouyer Sync with HEAD.
 1.45.8.6 11-Feb-2001  bouyer Sync with HEAD.
 1.45.8.5 18-Jan-2001  bouyer Sync with head (for UBC+NFS fixes, mostly).
 1.45.8.4 05-Jan-2001  bouyer Sync with HEAD
 1.45.8.3 13-Dec-2000  bouyer Sync with HEAD (for UBC fixes).
 1.45.8.2 08-Dec-2000  bouyer Sync with HEAD.
 1.45.8.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.45.4.5 31-Aug-1999  perseant Rudimentary support for LFS under UBC:

- LFS-specific VOP_BALLOC and VOP_PUTPAGES vnode ops.

- getblk VREG panic #ifdef'd out (can be reinstated when Ifile is
internalized and Ifile can be made another type from VREG)

- interface to VOP_PUTPAGES changed to pass all pager flags, not
just sync. FS putpages routines must know about the pager flags.

- new LFS magic disk address, -2 ("unwritten"), meaning accounted for
but not assigned to a fixed disk location (since LFS does these two
things separately, and the previous accounting method using buffer
headers no longer will work). Changed references to (foo == (daddr_t)-1)
to (foo < 0). Since disk drivers reject all addresses < 0, this should
not present a problem for other FSs.
 1.45.4.4 31-Jul-1999  chs in nfs_getpages(), deal with extending writes better.
also, return errnos instead of VM_PAGER_*.
 1.45.4.3 11-Jul-1999  chs remove uvm_vnp_uncache(), it's no longer needed.
 1.45.4.2 04-Jul-1999  chs update uvm_pagermapin() to match new args.
 1.45.4.1 07-Jun-1999  chs merge everything from chs-ubc branch.
 1.49.4.1 14-Dec-2000  he Pull up revision 1.52 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.63.2.19 11-Nov-2002  nathanw Catch up to -current
 1.63.2.18 22-Oct-2002  thorpej Sync with HEAD.
 1.63.2.17 17-Sep-2002  nathanw Catch up to -current.
 1.63.2.16 15-Jul-2002  nathanw Whitespace.
 1.63.2.15 12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.63.2.14 24-Jun-2002  nathanw Curproc->curlwp renaming.

Change uses of "curproc->l_proc" back to "curproc", which is more like the
original use. Bare uses of "curproc" are now "curlwp".

"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc : NULL)
so that it is always safe to reference curproc (*de*referencing curproc
is another story, but that's always been true).
 1.63.2.13 20-Jun-2002  nathanw Catch up to -current.
 1.63.2.12 17-Apr-2002  nathanw Catch up to -current.
 1.63.2.11 01-Apr-2002  nathanw Catch up to -current.
(CVS: It's not just a program. It's an adventure!)
 1.63.2.10 28-Feb-2002  nathanw curproc ==> curproc->l_proc
 1.63.2.9 28-Feb-2002  nathanw Catch up to -current.
 1.63.2.8 08-Jan-2002  nathanw Catch up to -current.
 1.63.2.7 14-Nov-2001  nathanw Catch up to -current.
 1.63.2.6 22-Oct-2001  nathanw Catch up to -current.
 1.63.2.5 21-Sep-2001  nathanw Catch up to -current.
 1.63.2.4 24-Aug-2001  nathanw Catch up with -current.
 1.63.2.3 21-Jun-2001  nathanw Catch up to -current.
 1.63.2.2 09-Apr-2001  nathanw Catch up with -current.
 1.63.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.68.4.1 01-Oct-2001  fvdl Catch up with -current.
 1.68.2.5 30-Sep-2002  jdolecek add support for kevents to NFS
to detect file changes on server by other NFS clients, polling kernel thread
is used to periodically check for attribute changes of watched files;
the NFS server is only contacted when the vnode expires from local attrcache
(which takes 5-60 seconds currently), to keep network&CPU overhead low

the routine checking for remote changes is quite simplistic, but hopefully
doing its job well enough
 1.68.2.4 06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.68.2.3 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.68.2.2 11-Feb-2002  jdolecek Sync w/ -current.
 1.68.2.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.70.2.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.81.4.1 01-Sep-2002  lukem Pull up revision 1.82 (requested by bouyer in ticket #752):
nfs_doio(): handle the case where nfs_writerpc() returned error != 0.
Fix kern/18125. OK'd by thorpej and chs.
 1.105.2.12 11-Dec-2005  christos Sync with head.
 1.105.2.11 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.105.2.10 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.105.2.9 04-Feb-2005  skrll Sync with HEAD.
 1.105.2.8 17-Jan-2005  skrll Sync with HEAD.
 1.105.2.7 18-Dec-2004  skrll Sync with HEAD.
 1.105.2.6 02-Nov-2004  skrll Sync with HEAD.
 1.105.2.5 30-Oct-2004  skrll Correct panic message s/proc/lwp/
 1.105.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.105.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.105.2.2 03-Aug-2004  skrll Sync with HEAD
 1.105.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.116.2.3 01-Dec-2005  riz Pull up following revision(s) (requested by jld in ticket #8826):
sys/nfs/nfs_bio.c: revisions 1.136-1.137 via patch
The problem (kern/31926): under certain conditions, which could be
reliably reproduced, NFS reads would occasionally return zeroes instead
of some of the file data, or fail with EINVAL.
 1.116.2.2 18-Sep-2004  he branches: 1.116.2.2.2;
Pull up revision 1.120 (requested by yamt in ticket #858):
Fix access-after-free bugs in dircache code by reference
counting nfsdircache. Fixes PR#26864.
 1.116.2.1 21-Jun-2004  tron Pull up revision 1.118 (requested by yamt in ticket #513):
nfs_doio_read: use np->n_rcred instead of curproc->p_ucred for VDIR.
XXX maybe it's better to use a cred passed by VOP_READDIR.
 1.116.2.2.2.2 01-Dec-2005  riz Pull up following revision(s) (requested by jld in ticket #8826):
sys/nfs/nfs_bio.c: revisions 1.136-1.137 via patch
The problem (kern/31926): under certain conditions, which could be
reliably reproduced, NFS reads would occasionally return zeroes instead
of some of the file data, or fail with EINVAL.
 1.116.2.2.2.1 30-Jan-2005  he branches: 1.116.2.2.2.1.2;
Pull up revision 1.122 (requested by yamt in ticket #968):
Since daddr_t is 64-bit these days, simply use nfs directory
cookies as buffer cache indexes. This should make the
regress/sys/fs/getdents test work. Fixes PR#27112.
 1.116.2.2.2.1.2.1 01-Dec-2005  riz Pull up following revision(s) (requested by jld in ticket #8826):
sys/nfs/nfs_bio.c: revisions 1.136-1.137 via patch
The problem (kern/31926): under certain conditions, which could be
reliably reproduced, NFS reads would occasionally return zeroes instead
of some of the file data, or fail with EINVAL.
 1.124.4.2 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.124.4.1 12-Feb-2005  yamt sync with head.
 1.124.2.1 29-Apr-2005  kent sync with -current
 1.128.4.8 21-Jan-2008  yamt sync with head
 1.128.4.7 07-Dec-2007  yamt sync with head
 1.128.4.6 15-Nov-2007  yamt sync with head.
 1.128.4.5 27-Oct-2007  yamt sync with head.
 1.128.4.4 03-Sep-2007  yamt sync with head.
 1.128.4.3 26-Feb-2007  yamt sync with head.
 1.128.4.2 30-Dec-2006  yamt sync with head.
 1.128.4.1 21-Jun-2006  yamt sync with head.
 1.128.2.2 21-Nov-2005  tron Pull up following revision(s) (requested by yamt in ticket #980):
sys/nfs/nfs_bio.c: revision 1.137
nfs_bioread: push delayed truncation and tweak loop accordingly.
PR/31926 from Jed Davis.
 1.128.2.1 21-Nov-2005  tron Pull up following revision(s) (requested by yamt in ticket #980):
sys/nfs/nfs_bio.c: revision 1.136
nfs_bioread: handle file truncation on the server a little more gracefully.
 1.137.2.3 19-Nov-2005  yamt - as read-ahead context is per-vnode now,
there are less reasons to make VOP_READ call uvm_ra_request explicitly.
move it to pager (uvn_get) so that it can handle accesses via mmap as well.
- pass advice to pager via ubc.
- tweak DPRINTF.

XXX can be disturbed by PGO_LOCKED.

XXX it's controversial where it should be done.
(uvm_fault, uvn_get or genfs_getpages.)
 1.137.2.2 18-Nov-2005  yamt - associate read-ahead context to vnode, rather than file.
- revert VOP_READ prototype.
 1.137.2.1 15-Nov-2005  yamt adapt ffs, lfs, nfs.
 1.140.2.2 15-Jan-2006  yamt sync with head.
 1.140.2.1 31-Dec-2005  yamt - adapt nfs.
- nfs_doio_read: #if 0 out "killproc if text is modified" part of
the code as it's broken. (a process reading the modified text is not
necessarily a process which is using the file as a text.)
 1.141.4.2 01-Jun-2006  kardel Sync with head.
 1.141.4.1 22-Apr-2006  simonb Sync with head.
 1.141.2.1 09-Sep-2006  rpaulo sync with head
 1.142.6.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.142.4.2 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.142.4.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.142.2.2 11-Aug-2006  yamt sync with head
 1.142.2.1 24-May-2006  yamt sync with head.
 1.143.4.1 13-Jul-2006  gdamore Merge from HEAD.
 1.145.4.1 12-Jan-2007  ad Sync with head.
 1.147.2.4 17-May-2007  yamt sync with head.
 1.147.2.3 07-May-2007  yamt sync with head.
 1.147.2.2 12-Mar-2007  rmind Sync with HEAD.
 1.147.2.1 28-Feb-2007  yamt sync with head. (somehow missed in the previous)
 1.151.4.1 11-Jul-2007  mjf Sync with head.
 1.151.2.12 24-Aug-2007  ad Sync with buffer cache locking changes. See buf.h/vfs_bio.c for details.
Some minor portions are incomplete and need to be verified as a whole.
 1.151.2.11 20-Aug-2007  ad Sync with HEAD.
 1.151.2.10 19-Aug-2007  ad - Back out the biodone() changes.
- Eliminate B_ERROR (from HEAD).
 1.151.2.9 15-Jul-2007  ad Sync with head.
 1.151.2.8 18-Jun-2007  yamt fix merge botches.
 1.151.2.7 17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.151.2.6 09-Jun-2007  ad Sync with head.
 1.151.2.5 08-Jun-2007  ad Sync with head.
 1.151.2.4 13-May-2007  ad - Pass the error number and residual count to biodone(), and let it handle
setting error indicators. Prepare to eliminate B_ERROR.
- Add a flag argument to brelse() to be set into the buf's flags, instead
of doing it directly. Typically used to set B_INVAL.
- Add a "struct cpu_info *" argument to kthread_create(), to be used to
create bound threads. Change "bool mpsafe" to "int flags".
- Allow exit of LWPs in the IDL state when (l != curlwp).
- More locking fixes & conversion to the new API.
 1.151.2.3 09-Apr-2007  ad - Add two new arguments to kthread_create1: pri_t pri, bool mpsafe.
- Fork kthreads off proc0 as new LWPs, not new processes.
 1.151.2.2 21-Mar-2007  ad - Replace more simple_locks, and fix up in a few places.
- Use condition variables.
- LOCK_ASSERT -> KASSERT.
 1.151.2.1 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.160.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.164.6.2 29-Jul-2007  ad It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
 1.164.6.1 29-Jul-2007  ad file nfs_bio.c was added on branch matt-mips64 on 2007-07-29 13:31:13 +0000
 1.164.4.6 09-Dec-2007  jmcneill Sync with HEAD.
 1.164.4.5 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.164.4.4 29-Oct-2007  joerg Sync with HEAD.
 1.164.4.3 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.164.4.2 16-Aug-2007  jmcneill Sync with HEAD.
 1.164.4.1 09-Aug-2007  jmcneill Sync with HEAD.
 1.166.4.1 14-Oct-2007  yamt sync with head.
 1.166.2.2 09-Jan-2008  matt sync with HEAD
 1.166.2.1 06-Nov-2007  matt sync with HEAD
 1.168.2.1 13-Nov-2007  bouyer Sync with HEAD
 1.169.2.2 18-Feb-2008  mjf Sync with HEAD.
 1.169.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.170.2.2 08-Dec-2007  ad Sync with head.
 1.170.2.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.171.4.1 02-Jan-2008  bouyer Sync with HEAD
 1.173.6.3 17-Jan-2009  mjf Sync with HEAD.
 1.173.6.2 02-Jun-2008  mjf Sync with HEAD.
 1.173.6.1 03-Apr-2008  mjf Sync with HEAD.
 1.174.2.1 18-May-2008  yamt sync with head.
 1.175.8.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.175.8.1 19-Oct-2008  haad Sync with HEAD.
 1.175.2.4 10-Oct-2010  yamt some locking changes
 1.175.2.3 26-Sep-2010  yamt locking changes
 1.175.2.2 11-Aug-2010  yamt sync with head.
 1.175.2.1 04-May-2009  yamt sync with head.
 1.179.4.2 16-Jul-2010  riz Pull up following revision(s) (requested by jakllsch in ticket #1417):
sys/nfs/nfs_bio.c: revision 1.185
Fix memory leak during some NFS writes.
 1.179.4.1 02-Nov-2008  snj branches: 1.179.4.1.4;
Pull up following revision(s) (requested by tron in ticket #9):
sys/nfs/nfs_bio.c: revision 1.180
sys/miscfs/genfs/genfs_io.c: revision 1.14
sys/uvm/uvm_extern.h: revision 1.149
- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic
 1.179.4.1.4.1 20-May-2011  matt bring matt-nb5-mips64 up to date with netbsd-5-1-RELEASE (except compat).
 1.179.2.2 28-Apr-2009  skrll Sync with HEAD.
 1.179.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.181.4.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.183.4.3 03-Jul-2010  rmind sync with head
 1.183.4.2 30-May-2010  rmind sync with head
 1.183.4.1 16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.183.2.2 17-Aug-2010  uebayasi Sync with HEAD.
 1.183.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.185.6.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.188.22.1 07-Sep-2013  bouyer Pull up following revision(s) (requested by hannken in ticket #933):
sys/nfs/nfs_bio.c: revision 1.189
Function nfs_vinvalbuf() ignores errors from vinvalbuf() and therefore
delayed write errors may get lost.
Change nfs_vinvalbuf() to keep errors from vinvalbuf() for fsync() or
close().

Presented on tech-kern@

Fix for PR kern/47980 (NFS over-quota not detected if utimes() called
before fsync()/close())
 1.188.16.1 28-Aug-2013  rmind sync with head
 1.188.14.1 07-Sep-2013  bouyer Pull up following revision(s) (requested by hannken in ticket #933):
sys/nfs/nfs_bio.c: revision 1.189
Function nfs_vinvalbuf() ignores errors from vinvalbuf() and therefore
delayed write errors may get lost.
Change nfs_vinvalbuf() to keep errors from vinvalbuf() for fsync() or
close().

Presented on tech-kern@

Fix for PR kern/47980 (NFS over-quota not detected if utimes() called
before fsync()/close())
 1.188.12.2 03-Dec-2017  jdolecek update from HEAD
 1.188.12.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.188.8.1 07-Sep-2013  bouyer Pull up following revision(s) (requested by hannken in ticket #933):
sys/nfs/nfs_bio.c: revision 1.189
Function nfs_vinvalbuf() ignores errors from vinvalbuf() and therefore
delayed write errors may get lost.
Change nfs_vinvalbuf() to keep errors from vinvalbuf() for fsync() or
close().

Presented on tech-kern@

Fix for PR kern/47980 (NFS over-quota not detected if utimes() called
before fsync()/close())
 1.188.2.4 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.188.2.3 25-Jan-2012  yamt uvm_loanabj: take an access pattern hint.
 1.188.2.2 04-Jan-2012  yamt enable O->A loaning read for a few filesystems.
 1.188.2.1 02-Nov-2011  yamt page cache related changes

- maintain object pages in radix tree rather than rb tree.
- reduce unnecessary page scan in putpages. esp. when an object has a ton of
pages cached but only a few of them are dirty.
- reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.
- fix nfs commit range tracking.
- fix nfs write clustering. XXX hack
 1.189.4.1 04-Nov-2015  riz Pull up following revision(s) (requested by manu in ticket #882):
sbin/umount/umount.c: revision 1.48
sys/nfs/nfsmount.h: revision 1.53
sys/nfs/nfs_var.h: revision 1.94
sys/nfs/nfs_iod.c: revision 1.7
sys/nfs/nfs_socket.c: revision 1.197
sys/nfs/nfs_bio.c: revision 1.191
sys/nfs/nfs_vfsops.c: revision 1.230
sys/nfs/nfs_clntsocket.c: revision 1.3
Remove useless and harmful sync(2) call in umount(8)
Remove sync(2) call before unmount(2) in umount(8). This sync(2) is useless
since unmount(2) will perform a VFS_SYNC anyway.
Moreover, this sync(2) may be harmful, as there are some situations where
it cannot return (unreachable NFS server, for instance), causing umount -f
to be ineffective.
Fix soft NFS force unmount
For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that await an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.
Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.
Reviewed by Chuck Silvers.
 1.190.2.1 22-Sep-2015  skrll Sync with HEAD
 1.191.18.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.192.2.2 29-Feb-2020  ad Sync with head.
 1.192.2.1 17-Jan-2020  ad Sync with head.
 1.195.2.1 25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.90 05-Jul-2024  rin sys: Drop redundant NULL check before m_freem(9)

m_freem(9) has safely accepted a NULL argument at least since 4.2BSD:
https://www.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/src/sys/sys/uipc_mbuf.c

Compile-tested on amd64/ALL.

Suggested by knakahara@
 1.89 20-Sep-2022  knakahara branches: 1.89.10;
Remove routes on an address removal if the routes reference the address. Implemented by ozaki-r@n.o.

A route whose gateway is on a connected route can become invalid if the
connected route is deleted, i.e., an associated address is removed.
Traditionally NetBSD doesn't sweep such a route on the address removal. Sending
packets over the route fails with "No route to host". Also the route holds an
orphan ifaddr as rt_ifa that is destructed, say, by in_purgeaddr.

If the same address is assigned again in such a state, there can be two
different ifaddr objects with the same address. Until recently this was not a
big problem because we could send packets anyway. However, after MP-ification
of the network stack, we can't send packets because we strictly check if rt_ifa
(i.e., the (old) ifaddr) is valid.

This change automatically removes such routes on a removal of an associated
address to avoid keeping inconsistent routes.
 1.88 17-May-2018  thorpej Default NFS mounts to using TCP transport instead of UDP.
PR kern/53166
 1.87 15-Nov-2016  ozaki-r branches: 1.87.14;
Don't use rt_walktree to delete routes

Some functions use rt_walktree to scan the routing table and delete
matched routes. However, we shouldn't use rt_walktree to delete
routes because rt_walktree is recursive to the routing table (radix
tree) and isn't friendly to MP-ification. rt_walktree allows a caller
to pass a callback function to delete a matched entry. The callback
function is called from an API of the radix tree (rn_walktree) but
also calls an API of the radix tree to delete an entry.

This change adds a new API of the radix tree, rn_search_matched,
which returns a matched entry that is selected by a callback
function passed by a caller and the caller itself deletes the
entry. By using the API, we can avoid the recursive form.
 1.86 07-Jul-2016  msaitoh branches: 1.86.2;
KNF. Remove extra spaces. No functional change.
 1.85 21-May-2015  rtr change nfs_boot_sendrecv to take sockaddr_in * instead of mbuf *

fixes m_serv leak (a single mbuf) in kern/subr_tftproot.c
 1.84 09-May-2015  rtr change sosend() to accept sockaddr * instead of mbuf * for nam.

bump to 7.99.16
 1.83 03-Apr-2015  rtr * change pr_bind to accept struct sockaddr * instead of struct mbuf *
* update protocol bind implementations to use/expect sockaddr *
instead of mbuf *
* introduce sockaddr_big struct for storage of addr data passed via
sys_bind; sockaddr_big is of sufficient size and alignment to
accommodate all addr data sizes received.
* modify sys_bind to allocate sockaddr_big instead of using an mbuf.
* bump kernel version to 7.99.9 for change to pr_bind() parameter type.

Patch posted to tech-net@
http://mail-index.netbsd.org/tech-net/2015/03/15/msg005004.html

The choice to use a new structure sockaddr_big has been retained since
changing sockaddr_storage size would lead to unnecessary ABI change. The
use of the new structure does not preclude future work that increases
the size of sockaddr_storage and at that time sockaddr_big may be
trivially replaced.

Tested by mrg@ and myself, discussed with rmind@, posted to tech-net@
 1.82 27-Mar-2015  hikaru m_pullup() is called in rcvproc callback functions,
so nfs_boot_sendrecv() should keep track of the head of mbuf chain.
fixes kern/48746
 1.81 25-Oct-2013  martin branches: 1.81.4; 1.81.6;
Mark a potentially unused variable
 1.80 04-Oct-2010  cyber branches: 1.80.8; 1.80.14; 1.80.18; 1.80.22;
Add support to honor MTU settings from DHCP during netboot.

Defines IP_MIN_MTU as 576.

Glanced over quickly by martin@ and joerg@.
 1.79 04-Mar-2009  nisimura branches: 1.79.2; 1.79.4;
Update comments to reflect realities; the filenames were changed in 1997
and another one was added years ago.
 1.78 19-Nov-2008  ad branches: 1.78.4;
Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.77 27-Oct-2008  cegger make this build again without having
NFS_BOOT_BOOTSTATIC, NFS_BOOT_BOOTP, NFS_BOOT_DHCP or NFS_BOOT_BOOTPARAM
defined.

I uncovered this case when compiling rump.
 1.76 27-Oct-2008  cegger change nfs boot behaviour to automatically try the next boot method if the boot information is incomplete.
That way, it is possible to combine static and dhcp boot:
For example, to boot diskless you can specify the nfs-server and the rootpath statically. All other information will be taken via dhcp.

Patch has been presented on port-xen, tech-kern and tech-net:
http://mail-index.netbsd.org/port-xen/2008/10/24/msg004488.html
http://mail-index.netbsd.org/tech-kern/2008/10/24/msg003255.html
http://mail-index.netbsd.org/tech-net/2008/10/24/msg000864.html

No comments, no objections.
 1.75 24-Oct-2008  cegger branches: 1.75.2;
- ansify function definition
- de- __P
- u_int32_t -> uint32_t

No functional changes.
 1.74 06-Aug-2008  plunky Convert socket options code to use a sockopt structure
instead of laying everything into an mbuf.

approved by core
 1.73 22-May-2008  dyoung branches: 1.73.4;
Delete unnecessary cast to void *.
 1.72 28-Apr-2008  martin branches: 1.72.2;
Remove clause 3 and 4 from TNF licenses
 1.71 24-Apr-2008  ad branches: 1.71.2;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
 1.70 05-Apr-2008  cegger branches: 1.70.2;
use aprint_*_dev and device_xname
 1.69 31-Aug-2007  dyoung branches: 1.69.20;
Use sockaddr_in_init() and ifreq_setaddr() to initialize a sockaddr_in
and an ifreq.ifr_addr, respectively. Get rid of an extraneous
cast---down the elevator shaft! Change 'ireq' to 'ifr', which is
what the kernel calls a temporary struct ifreq virtually everywhere
else.
 1.68 19-Jul-2007  dyoung branches: 1.68.4; 1.68.6; 1.68.8;
Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

Introduce rt_walktree() for walking the routing table and
applying a function to each rtentry. Replace most
rn_walktree() calls with it.

Use rt_getkey()/rt_setkey() to get/set a route's destination.
Keep a pointer to the sockaddr key in the rtentry, so that
rtentry users do not have to grovel in the radix_node for
the key.

Add a RTM_GET method to rtrequest. Use that instead of
radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

Constify. KNF. Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
et cetera. Use NULL instead of 0 for null pointers. Use
__arraycount(). Reduce gratuitous parenthesization.

Stop using variadic arguments for rip6_output(), it is
unnecessary.

Remove the unnecessary rtentry member rt_genmask and the
code to maintain it, since nothing actually used it.

Make rt_maskedcopy() easier to read by using meaningful variable
names.

Extract a subroutine intern_netmask() for looking up a netmask in
the masks table.

Start converting backslash-ridden IPv6 macros in
sys/netinet6/in6_var.h into inline subroutines that one
can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
 1.67 09-Jun-2007  dyoung branches: 1.67.2;
Cosmetic: de-__P() et cetera.
 1.66 08-May-2007  manu Fix build (broken by a fix introduced in the wrong file...)
 1.65 08-May-2007  manu Add the TFTPROOT kernel option for TFTP'ing root RAMdisk at root mount time.
This allows working around situations where a kernel with embedded RAMdisk
cannot be booted by the bootloader because the RAMdisk is too big.
 1.64 04-Mar-2007  christos branches: 1.64.2; 1.64.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.63 01-Mar-2006  yamt branches: 1.63.18; 1.63.20;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.
 1.62 11-Dec-2005  christos branches: 1.62.2; 1.62.4; 1.62.6;
merge ktrace-lwp.
 1.61 22-May-2004  jonathan branches: 1.61.12;
Eliminate several uses of `curproc' from the socket-layer code and from NFS.

Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded by Matt Thomas.

Changes to soreceive() and (*dom->dom_externalize)() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize. Also, now that soreceive() uses
its uio->uio_procp value, pass that same value downward to
(*pr->pru_usrreq)() calls for consistency, instead of (struct proc *)0.

Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req->r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.

Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
 1.60 11-Mar-2004  cl Add static nfs boot configuration, from the kernel config file or from
a driver selectable callback function. This is used in the Xen port to
allow controlling the domain's network setup from the domain building
environment at domain creation (vs. having to maintain/change this on a
dhcp server). The Xen network driver parses a command line passed in
from the domain builder.
 1.59 29-Jun-2003  fvdl branches: 1.59.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.58 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.57 10-Nov-2001  lukem add RCSIDs
 1.56 19-Jan-2001  enami branches: 1.56.2; 1.56.4; 1.56.8;
Use tsleep instead of delay; since we're mounting root, we can sleep
and there is no reason to use delay.
 1.55 10-Dec-2000  fvdl Make sobind() take a struct proc *. It already took curproc and
passed it down to the appropriate usrreq function, and this
allows usage for contexts that need to be explicitly different
from curproc (like in the NFS code when binding to a reserved port).
 1.54 19-Sep-2000  bjh21 New kernel option, NFS_V2_ONLY, which aims to reduce the NFS client to just
that required to support NFSv2 mounts. Not finished yet, but already
provides some 44k of saving in code size on arm26. More savings, and some
documentation, are still to come.
 1.53 29-Mar-2000  simonb branches: 1.53.4;
Don't need to include <sys/conf.h> here.
 1.52 03-Sep-1999  drochner branches: 1.52.2;
Wait some seconds after the interface is brought up before packets
are sent. Needed at least for if_ti to get the link up.
 1.51 07-Jul-1999  drochner mount diskless root with "NFSMNT_NOCONN" (which is default in "mount_nfs"
for quite a while) to allow certain servers (multihomed, as our DEC NSE
cluster) to be used as root filesystem without special tweaks
 1.50 21-Feb-1999  drochner branches: 1.50.2; 1.50.4;
restructure the diskless NFS boot code to keep track of the used
interface and the address allocated, to roll everything back if the
mount fails:
-put an interface pointer into "struct nfs_diskless" to have it
available for cleanup, don't pass it around anymore where the
"struct nfs_diskless" is already passed
-add a "cleanup" function which shuts the interface down
-in the protocol-specific parts, either return with "everything
ready" or "completely shut down"
-use common functions for interface initialization and shutdown
-add a function to delete all routes associated with an interface
(why is this necessary and not done by ~IFF_UP?)
g/c diskless swap stuff
general cleanup
 1.49 13-Sep-1998  christos Fix copyright spacing.
 1.48 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.47 13-Jun-1998  drochner Fix last change: If BOOTP/DHCP was successful, don't try RARP/BOOTPARAM.
 1.46 13-Jun-1998  tv Clean up boogered gcc warning workaround (remove goto completely) and remove
a redundant `if'.
 1.45 25-Apr-1998  matt Adapt to new sosend/soreceive and upcall (now down in sowakeup)
 1.44 01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.43 28-Feb-1998  cgd be a bit more clear about what protocols will be tried in the
BOOTP/DHCP case.
 1.42 12-Jan-1998  scottr Consolidate NFS_BOOT_* options into opt_nfs_boot.h
 1.41 11-Jan-1998  scottr Make NFS_BOOT_DHCP work as expected.
 1.40 09-Jan-1998  drochner Use new options "NFS_BOOT_BOOTP" and "NFS_BOOT_BOOTPARAM" for parts
conditional on a particular configuration method.
The global flags "nfs_boot_rfc951" and "nfs_boot_bootparam" control
independently if the functions are actually called. (Previous meaning
of "nfs_boot_rfc951" was "either-or".)
 1.39 30-Sep-1997  drochner Factor out some functions used by bootparam and DHCP boot.
 1.38 13-Sep-1997  thorpej Correct a comment regarding the sense of the nfs_boot_rfc951 global.
 1.37 09-Sep-1997  gwr Move the call to nfs_boot_getfh() from nfs_vfsops.c to nfs_boot.c
(just for better isolation - it can now be static)
 1.36 02-Sep-1997  gwr Change test from NETHER to NARP (revarpwhoami is in if_arp.c)
 1.35 29-Aug-1997  gwr Supporting changes for the new BOOTP support in nfs_mountroot.
 1.34 14-Aug-1997  drochner 1. Allow to set a netmask (option NFS_BOOT_NETMASK) for the booting
interface. Without this, NFS_BOOT_NETMASK could be useless in
a subnetted environment.
2. Comment out unneeded NFS swap related stuff.
Closes PR kern/3918.
 1.33 27-May-1997  gwr branches: 1.33.4;
Minor reorganization of nfs_mountroot code to simplify BOOTP support.
The RPC/bootparamd calls to get the root and swap paths are now done
in nfs_boot_init() instead of nfs_boot_getfh(), so the latter now just
does the RPC/mountd call. Also changed some panics into error returns.
 1.32 17-Mar-1997  thorpej Add some missing "\n"'s.
 1.31 15-Mar-1997  is New ARP system, supports IPv4 over any hardware link.

Some of the stuff (e.g., rarpd, bootpd, dhcpd etc., libsa) still will
only support Ethernet. Tcpdump itself should be ok, but libpcap needs
lot of work.

For the detailed change history, look at the commit log entries for
the is-newarp branch.
 1.30 31-Jan-1997  thorpej branches: 1.30.2;
- Don't look for a "suitable interface"; we're now given the name of the
network interface to use.
- If any part of the NFS root mount process fails, don't panic.
Simply return the appropriate error and let the caller recover.
 1.29 20-Oct-1996  fvdl branches: 1.29.2;
Enhancements from Matthias Drochner:
- Try V3 first for diskless booting. Fall back to V2 if V3 fails.
- optionally (option NFS_BOOT_TCP) try a TCP mount first
for diskless booting. Fall back to UDP if it fails.
- Enable switching between UDP and TCP for remounts.
 1.28 13-Oct-1996  christos revert kprintf changes
 1.27 10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.26 07-May-1996  thorpej Changed struct ifnet to have a pointer to the softc of the underlying
device and a printable "external name" (name + unit number), thus eliminating
if_name and if_unit. Updated interface to (*if_watchdog)() and (*if_reset)()
to take a struct ifnet *, rather than a unit number.
 1.25 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.24 16-Feb-1996  gwr Add stub for nfs_boot_getfh if NETHER==0
 1.23 13-Feb-1996  gwr Do the RPC to bootparamd a little later (just before the mountd call)
so that we do not ask for the "swap" path when swapping on disk.
 1.22 10-Feb-1996  pk Don't return EBADRPC if we have something else.
 1.21 09-Feb-1996  christos nfs prototype changes
 1.20 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.19 12-Jun-1995  mycroft Various cleanup, including:
* Convert several data structures to use queue.h.
* Split in_pcbnotify() into two parts; one for notifying a specific PCB, and
one for notifying all PCBs for a particular foreign address.
 1.18 23-May-1995  cgd don't blindly set IFF_UP; or it with old flags
 1.17 20-May-1995  mycroft Use fxdr_*() and txdr_*() macros to do byte order conversions.
 1.16 24-Apr-1995  gwr Fixed RPC code to deal with RPC messages larger than one mbuf.
 1.15 28-Mar-1995  gwr Cosmetic changes suggested by Adam.
 1.14 18-Mar-1995  gwr Do the printf "root/swap on" elsewhere to avoid confusion.
 1.13 16-Feb-1995  pk Working "config generic" support; from Theo.
 1.12 29-Oct-1994  cgd fix a couple of obvious, painful endianness bugs introduced in last commit.
 1.11 26-Sep-1994  gwr Do the first BOOTPARAM RPC call to the broadcast address instead of
using the address of the RARP server because a BOOTPARAM server
might not be running on the machine that sent the RARP reply.
 1.10 11-Aug-1994  mycroft char * --> caddr_t, where appropriate.
 1.9 11-Aug-1994  gwr Diskless boot will now bind the local socket to a reserved port to
satisfy picky servers. Also fix some missing initializations.
(Thanks to Chuck Cranor for PR#394 -- now fixed.)
 1.8 19-Jul-1994  gwr Fix the conditionally compiled code inside #ifdef NFS_BOOT_GATEWAY
and make some printf args use host byteorder.
 1.7 16-Jul-1994  paulus If we don't have ethernet, nfs_boot_init reduces to just a panic.
This is so I don't get an undefined symbol compiling a kernel with
NFSCLIENT but no ethernet.
 1.6 29-Jun-1994  deraadt branches: 1.6.2;
knf
 1.5 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.4 21-Jun-1994  pk Construct mountpath for remote root.
 1.3 13-Jun-1994  gwr New diskless boot code (uses RARP, bootparamd).
 1.2 05-May-1994  cgd lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.
 1.1 18-Apr-1994  glass revised nfs diskless support. uses bootp+rpc to gather parameters
 1.6.2.3 11-Aug-1994  mycroft update from trunk
 1.6.2.2 19-Jul-1994  cgd from trunk, per gwr.
 1.6.2.1 16-Jul-1994  cgd update from trunk, per paulus
 1.29.2.1 14-Jan-1997  thorpej Snapshot of work-in-progress, committed to private branch.

These changes implement machine-independent root device and file system
selection. Notable features:

- All ports behave in a consistent manner regarding root
device selection.
- No more "options GENERIC"; all kernels have the ability
to boot with RB_ASKNAME to select root device and file system
type.
- Root file system type can be wildcarded; a machine-independent
function will try all possible file systems for the selected
root device until one succeeds.
- If the root file system fails to mount, the operator will
be given the chance to select a new root device and file
system type, rather than having the machine simply panic.
- nfs_mountroot() no longer panics if any part of the NFS
mount process fails; it now returns an error, giving the
operator a chance to recover.
- New, more consistent, config(8) grammar. The constructs:

config netbsd swap generic
config netbsd root on nfs

have been replaced with:

config netbsd root on ? type ?
config netbsd root on ? type nfs

Additionally, the operator may select or wildcard root file
system type in the kernel configuration file:

config netbsd root on cd0a type cd9660

config(8) now requires that a "root" specification be
made. "root" may be wired down or wildcarded. "swap" and
"dump" specifications are optional, and follow previous
semantics.

- config(8) has a new "file-system" keyword, used to configure
file systems into the kernel. Eventually, this will be used
to generate the default vfssw[].

- "options NFSCLIENT" is obsolete, and is replaced by
"file-system NFS". "options NFSSERVER" still exists, since
NFS server support is independent of the NFS file system
client.

- sys/arch/<foo>/<foo>/swapgeneric.c is no longer used, and
will be removed; all information is now generated by config(8).

As of this commit, all ports except arm32 have been updated to use
the new setroot(). Only SPARC, i386, and Alpha ports have been
tested at this time. Port masters should test these changes on their
ports, and report any problems back to me.

More changes are on their way, including RB_ASKNAME support in
nfs_mountroot() (to prompt for server address and path) and, potentially,
the ability to select rarp/bootparam or bootp in nfs_mountroot().
 1.30.2.2 10-Mar-1997  is netinet/if_ether.h => netinet/if_inarp.h
 1.30.2.1 07-Feb-1997  is Snapshot of new ARP code.

Our old ARP code was hardwired for 6-byte length medium
addresses, while the protocol is designed for any size.

This snapshot contains a first hack at getting rid of
Ethernet specific data structures. The ep driver is updated
(and tested on the PCI bus), the iy and fpa drivers have been
updated, but not real life tested yet.

If you want to test this with other drivers, you have to update
them first yourself, and probably tag the relevant directories.
Better contact me if you want to do this.
 1.33.4.5 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.33.4.4 16-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.33.4.3 04-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.33.4.2 01-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.33.4.1 23-Aug-1997  thorpej Update marc-pcmcia branch from trunk.
 1.50.4.1 02-Aug-1999  thorpej Update from trunk.
 1.50.2.1 05-Oct-1999  he Pull up revisions 1.51-1.52 (requested by drochner):
Mount diskless root with "noconn" option to allow easier use of
multi-homed servers.
Wait a while between bringing up interface and using it, to allow
e.g. if_ti driver to establish the link.
 1.52.2.3 11-Feb-2001  bouyer Sync with HEAD.
 1.52.2.2 13-Dec-2000  bouyer Sync with HEAD (for UBC fixes).
 1.52.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.53.4.1 15-Dec-2000  he Pull up revision 1.55 (requested by fvdl):
Fix NFS+tcp client hangs on server or network outage. Again,
please note that this introduces yet another kernel interface
change: sobind() gains an argument.
 1.56.8.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.56.4.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.56.2.4 12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.56.2.3 24-Jun-2002  nathanw Curproc->curlwp renaming.

Change uses of "curproc->l_proc" back to "curproc", which is more like the
original use. Bare uses of "curproc" are now "curlwp".

"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc : NULL)
so that it is always safe to reference curproc (*de*referencing curproc
is another story, but that's always been true).
 1.56.2.2 14-Nov-2001  nathanw Catch up to -current.
 1.56.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.59.2.5 04-Feb-2005  skrll Adapt to branch.
 1.59.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.59.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.59.2.2 03-Aug-2004  skrll Sync with HEAD
 1.59.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.61.12.2 03-Sep-2007  yamt sync with head.
 1.61.12.1 21-Jun-2006  yamt sync with head.
 1.62.6.1 22-Apr-2006  simonb Sync with head.
 1.62.4.1 09-Sep-2006  rpaulo sync with head
 1.62.2.1 31-Dec-2005  yamt - adapt nfs.
- nfs_doio_read: #if 0 out "killproc if text is modified" part of
the code as it's broken. (a process reading the modified text is not
necessarily a process which is using the file as a text.)
 1.63.20.2 17-May-2007  yamt sync with head.
 1.63.20.1 12-Mar-2007  rmind Sync with HEAD.
 1.63.18.1 13-May-2007  jdc Pull up revisions 1.65-1.66 (requested by manu in ticket #635).

Add the TFTPROOT kernel option for TFTP'ing root RAMdisk at root mount time.
This allows working around situations where a kernel with embedded RAMdisk
cannot be booted by the bootloader because the RAMdisk is too big.

Fix build (broken by a fix introduced in the wrong file...)
 1.64.4.1 11-Jul-2007  mjf Sync with head.
 1.64.2.4 09-Oct-2007  ad Sync with head.
 1.64.2.3 20-Aug-2007  ad Sync with HEAD.
 1.64.2.2 15-Jul-2007  ad Sync with head.
 1.64.2.1 08-Jun-2007  ad Sync with head.
 1.67.2.2 03-Sep-2007  skrll Sync with HEAD.
 1.67.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.68.8.2 19-Jul-2007  dyoung Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

Introduce rt_walktree() for walking the routing table and
applying a function to each rtentry. Replace most
rn_walktree() calls with it.

Use rt_getkey()/rt_setkey() to get/set a route's destination.
Keep a pointer to the sockaddr key in the rtentry, so that
rtentry users do not have to grovel in the radix_node for
the key.

Add a RTM_GET method to rtrequest. Use that instead of
radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

Constify. KNF. Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
et cetera. Use NULL instead of 0 for null pointers. Use
__arraycount(). Reduce gratuitous parenthesization.

Stop using variadic arguments for rip6_output(), it is
unnecessary.

Remove the unnecessary rtentry member rt_genmask and the
code to maintain it, since nothing actually used it.

Make rt_maskedcopy() easier to read by using meaningful variable
names.

Extract a subroutine intern_netmask() for looking up a netmask in
the masks table.

Start converting backslash-ridden IPv6 macros in
sys/netinet6/in6_var.h into inline subroutines that one
can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
 1.68.8.1 19-Jul-2007  dyoung file nfs_boot.c was added on branch matt-mips64 on 2007-07-19 20:49:01 +0000
 1.68.6.1 06-Nov-2007  matt sync with HEAD
 1.68.4.1 03-Sep-2007  jmcneill Sync with HEAD.
 1.69.20.3 17-Jan-2009  mjf Sync with HEAD.
 1.69.20.2 28-Sep-2008  mjf Sync with HEAD.
 1.69.20.1 02-Jun-2008  mjf Sync with HEAD.
 1.70.2.2 04-Jun-2008  yamt sync with head
 1.70.2.1 18-May-2008  yamt sync with head.
 1.71.2.3 09-Oct-2010  yamt sync with head
 1.71.2.2 04-May-2009  yamt sync with head.
 1.71.2.1 16-May-2008  yamt sync with head.
 1.72.2.2 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.72.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.73.4.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.73.4.1 19-Oct-2008  haad Sync with HEAD.
 1.75.2.2 28-Apr-2009  skrll Sync with HEAD.
 1.75.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.78.4.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.79.4.1 05-Mar-2011  rmind sync with head
 1.79.2.1 22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.80.22.1 18-May-2014  rmind sync with head
 1.80.18.2 03-Dec-2017  jdolecek update from HEAD
 1.80.18.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.80.14.1 16-Apr-2015  msaitoh Pull up following revision(s) (requested by hikaru in ticket #1287):
sys/kern/subr_tftproot.c: revision 1.14 via patch
sys/nfs/nfsdiskless.h: revision 1.31
sys/nfs/nfs_boot.c: revision 1.82
sys/nfs/krpc_subr.c: revision 1.39
sys/nfs/nfs_bootdhcp.c: revision 1.53
m_pullup() is called in rcvproc callback functions,
so nfs_boot_sendrecv() should keep track of the head of mbuf chain.
fixes kern/48746
 1.80.8.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.81.6.3 05-Dec-2016  skrll Sync with HEAD
 1.81.6.2 06-Jun-2015  skrll Sync with HEAD
 1.81.6.1 06-Apr-2015  skrll Sync with HEAD
 1.81.4.1 06-Apr-2015  snj Pull up following revision(s) (requested by hikaru in ticket #656):
sys/kern/subr_tftproot.c: revision 1.14
sys/nfs/krpc_subr.c: revision 1.39
sys/nfs/nfs_boot.c: revision 1.82
sys/nfs/nfs_bootdhcp.c: revision 1.53
sys/nfs/nfsdiskless.h: revision 1.31
m_pullup() is called in rcvproc callback functions,
so nfs_boot_sendrecv() should keep track of the head of mbuf chain.
fixes kern/48746
 1.86.2.1 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.87.14.1 21-May-2018  pgoyette Sync with HEAD
 1.89.10.1 02-Aug-2025  perseant Sync with HEAD
 1.60 20-Oct-2024  mlelstv MBUFTRACE
 1.59 05-Jul-2024  rin sys: Drop redundant NULL check before m_freem(9)

m_freem(9) has safely accepted a NULL argument at least since 4.2BSD:
https://www.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/src/sys/sys/uipc_mbuf.c

Compile-tested on amd64/ALL.

Suggested by knakahara@
 1.58 13-May-2024  msaitoh branches: 1.58.2;
s/contigous/contiguous/ in comment.
 1.57 24-Dec-2022  andvar s/reqest/request/ in comment.
 1.56 10-Jun-2016  ozaki-r Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receive interface of an mbuf.
They are the counterparts of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif handling and reduce the diff of
the upcoming change.

No functional change.
 1.55 21-May-2015  rtr change nfs_boot_sendrecv to take sockaddr_in * instead of mbuf *

fixes the m_serv leak (a single mbuf) in kern/subr_tftproot.c
 1.54 09-May-2015  rtr when calling nfs_boot_sendrecv pass NULL for pointers instead of 0
 1.53 27-Mar-2015  hikaru m_pullup() is called in rcvproc callback functions,
so nfs_boot_sendrecv() should keep track of the head of mbuf chain.
fixes kern/48746
 1.52 04-Oct-2010  cyber branches: 1.52.14; 1.52.18; 1.52.34; 1.52.36;
Add support to honor MTU settings from DHCP during netboot.

Defines IP_MIN_MTU as 576.

Glanced over quickly by martin@ and joerg@.
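A rough sketch of how a DHCP-supplied MTU might be checked against a minimum. Only the value 576 comes from the commit message; the function choose_mtu(), its parameters, and the policy of keeping the current MTU for out-of-range offers are assumptions for illustration, not the logic in nfs_bootdhcp.c.

```c
#include <assert.h>

#define EX_IP_MIN_MTU 576u	/* value named in the commit message */

/*
 * Hypothetical check: return the MTU to use. The current MTU is kept
 * when the offered value is below the minimum or above what the link
 * supports; otherwise the offered value is accepted.
 */
static unsigned
choose_mtu(unsigned offered, unsigned cur_mtu, unsigned link_max)
{
	assert(cur_mtu <= link_max);
	if (offered < EX_IP_MIN_MTU || offered > link_max)
		return cur_mtu;
	return offered;
}
```

For example, choose_mtu(1400, 1500, 1500) accepts the offer, while choose_mtu(100, 1500, 1500) keeps 1500.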
 1.51 10-Jul-2009  roy branches: 1.51.2; 1.51.4;
Use a function to add extra data to the vendor area so that data added
remains constant for both DISCOVER and REQUEST messages.
 1.50 10-Jul-2009  roy Protect against short IP addresses in the DHCP message.
 1.49 10-Jul-2009  roy When using DHCP, request the parameters that we need. Fixes PR kern/38830.
Thanks to Tim McIntosh.
 1.48 06-May-2009  cegger correct previous: use %zu for BOOTP_SIZE_(MIN,MAX).
Pointed out by David Holland
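The fix above follows the general C rule that size_t values are printed with %zu, since %d expects an int and the two types differ in width on LP64 platforms. A small sketch, assuming stand-in limits; EX_SIZE_MIN, EX_SIZE_MAX, their values, and format_limits() are hypothetical, not the real BOOTP_SIZE_MIN and BOOTP_SIZE_MAX definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for size limits of type size_t. */
#define EX_SIZE_MIN ((size_t)300)
#define EX_SIZE_MAX ((size_t)1400)

/* Format both limits into out; %zu matches the size_t arguments. */
static int
format_limits(char *out, size_t outlen)
{
	assert(out != NULL && outlen > 0);
	return snprintf(out, outlen, "bootp size range: %zu..%zu",
	    EX_SIZE_MIN, EX_SIZE_MAX);
}
```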
 1.47 05-May-2009  cegger buildfix: use %d for BOOTP_SIZE_(MIN,MAX).
Makes i386 ALL kernel build again.
 1.46 02-May-2009  manu - Silence warning when running with debug enabled
- Remind the administrator about the required DHCP option when some are
missing, instead of silently failing, you stupid computer!
 1.45 19-Nov-2008  ad branches: 1.45.4;
Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.44 27-Oct-2008  cegger change nfs boot behaviour to automatically try the next boot method if the boot information is too incomplete to succeed.
That way, it is possible to combine static and dhcp boot:
For example, to boot diskless you can specify the nfs-server and the rootpath statically. All other information will be taken via dhcp.

Patch has been presented on port-xen, tech-kern and tech-net:
http://mail-index.netbsd.org/port-xen/2008/10/24/msg004488.html
http://mail-index.netbsd.org/tech-kern/2008/10/24/msg003255.html
http://mail-index.netbsd.org/tech-net/2008/10/24/msg000864.html

No comments, no objections.
 1.43 24-Oct-2008  cegger branches: 1.43.2;
- ansify function definition
- de- __P
- u_int32_t -> uint32_t

No functional changes.
 1.42 06-Aug-2008  plunky Convert socket options code to use a sockopt structure
instead of laying everything into an mbuf.

approved by core
 1.41 20-Jul-2008  uwe When doing pointer arithmetic to compute limit cast bootp to pointer
type of correct signedness. Caught by lint.
 1.40 09-May-2008  rumble branches: 1.40.2; 1.40.4;
Fix compilation with DEBUG_NFS_BOOT_DHCP and ssp.
 1.39 28-Apr-2008  martin branches: 1.39.2;
Remove clause 3 and 4 from TNF licenses
 1.38 24-Apr-2008  ad branches: 1.38.2;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
 1.37 20-Dec-2007  dyoung branches: 1.37.6; 1.37.8;
Constify.
 1.36 29-Aug-2007  dyoung branches: 1.36.8; 1.36.12;
Constify: LLADDR() -> CLLADDR().
 1.35 08-May-2007  manu branches: 1.35.2; 1.35.6; 1.35.8;
Fix build (broken by a fix introduced in the wrong file...)
 1.34 08-May-2007  manu Add the TFTPROOT kernel option for TFTP'ing root RAMdisk at root mount time.
This allows working around situations where a kernel with embedded RAMdisk
cannot be booted by the bootloader because the RAMdisk is too big.
 1.33 04-Mar-2007  christos branches: 1.33.2; 1.33.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.32 09-Nov-2006  yamt branches: 1.32.2; 1.32.4;
remove some __unused in function parameters.
 1.31 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.30 16-Mar-2006  christos branches: 1.30.10; 1.30.12;
Don't use DEBUG, add a new DEBUG_NFS_BOOT_DHCP variable to provide more
information. Print more information about what fails.
 1.29 11-Dec-2005  christos branches: 1.29.4; 1.29.6; 1.29.8; 1.29.10;
merge ktrace-lwp.
 1.28 26-Feb-2005  perry branches: 1.28.4;
nuke trailing whitespace
 1.27 22-May-2004  jonathan branches: 1.27.4; 1.27.6;
Eliminate several uses of `curproc' from the socket-layer code and from NFS.

Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded by Matt Thomas.

Changes to soreceive() and (*dom->dom_externalize)() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize. Also, now soreceive() uses
its uio->uio_procp value, pass that same value downward to
(*pr->pru_usrreq)() calls for consistency, instead of (struct proc * )0.

Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req->r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.

Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
 1.26 06-May-2004  drochner remove duplicated snprintf(vci, ...)
 1.25 21-Apr-2004  itojun kill sprintf, use snprintf
 1.24 29-Jun-2003  fvdl branches: 1.24.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.23 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.22 26-Feb-2003  matt Add MBUFTRACE kernel option.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
 1.21 10-Jun-2002  drochner increase IP TTL for BOOTP/DHCP request packets to avoid dumb
routers dropping the packet
(seems to be a problem with Cisco and its "helper-address" feature;
a Cabletron SSR I tested with didn't have this problem)
 1.20 12-May-2002  simonb branches: 1.20.2;
In bootpcheck(), make sure we m_pullup() all of the bootp header that we
actually examine.
While here, toss out home-grown ofs() macro and use offsetof().
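A short sketch of the standard offsetof() macro from <stddef.h>, which replaces a home-grown offset macro. The struct ex_hdr layout and ex_bytes_needed() are hypothetical and much smaller than the real bootp header; the idea shown is computing how many leading bytes must be contiguous before the fixed fields can be examined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified header; not the real struct bootp. */
struct ex_hdr {
	uint8_t  op;
	uint8_t  htype;
	uint16_t secs;
	uint32_t xid;
	uint8_t  vend[64];
};

/*
 * Number of leading bytes that cover every field before vend,
 * i.e. the length that would have to be contiguous to read them.
 */
static size_t
ex_bytes_needed(void)
{
	return offsetof(struct ex_hdr, vend);
}
```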
 1.19 20-Mar-2002  thorpej Add a NetBSD Vendor Class Identifier option as proposed on tech-net in
message <20020216172527.C23901@dr-evil.shagadelic.org>.
 1.18 10-Nov-2001  lukem add RCSIDs
 1.17 02-Jun-2001  kim branches: 1.17.2; 1.17.6;
It is misleading that the kernel outputs "DHCP server:" followed by
the value of "next-server" from the DHCP (or BOOTP) reply. This is
not the DHCP server's IP address (except by chance), so instead of
"server" make it print "next-server".
 1.16 05-Dec-2000  drochner branches: 1.16.2;
add a kernel configuration option to set the string passed in bp_file
in diskless BOOTP/DHCP configuration - good for booting different
userland versions depending on the kernel version
 1.15 28-May-2000  gmcgarry Allow nfs root over token ring. Closes PR6629.
 1.14 29-Mar-2000  simonb branches: 1.14.2;
Don't need to include <sys/conf.h> here.
 1.13 20-Jan-2000  enami If server name field is overloaded for other purpose, or it just contains
NULL string, don't use it as server name.
 1.12 07-May-1999  drochner branches: 1.12.2;
print diskless boot related IP addresses in dot notation
 1.11 21-Feb-1999  drochner branches: 1.11.4;
restructure the diskless NFS boot code to keep track of the used
interface and the address allocated, to roll everything back if the
mount fails:
-put an interface pointer into "struct nfs_diskless" to have it
available for cleanup, don't pass it around anymore where the
"struct nfs_diskless" is already passed
-add a "cleanup" function which shuts the interface down
-in the protocol-specific parts, either return with "everything
ready" or "completely shut down"
-use common functions for interface initialization and shutdown
-add a function to delete all routes associated with an interface
(why is this necessary and not done by ~IFF_UP?)
g/c diskless swap stuff
general cleanup
 1.10 12-Feb-1999  thorpej Fix printf format problems on Alpha.
 1.9 13-Sep-1998  christos Fix copyright spacing.
 1.8 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.7 24-Apr-1998  drochner -catch zero and broadcast IP addresses sent by a DHCP server
-warn about them (and other invalid replies)
-print address of BOOTP/DHCP server for better problem tracking
-KNF
 1.6 01-Mar-1998  ross Sweep up some miscellaneous leftover lite2 integration shrapnel.
 1.5 12-Jan-1998  scottr Consolidate NFS_BOOT_* options into opt_nfs_boot.h
 1.4 11-Jan-1998  scottr Make NFS_BOOT_DHCP work as expected.
 1.3 09-Jan-1998  drochner Use interface type to select "hardware type" in bootp header.
 1.2 30-Sep-1997  drochner branches: 1.2.2;
Make this file deserve its name: add DHCP support, conditionalized
with NFS_BOOT_DHCP.
Don't increment xid between retries anymore, it is not required and
it increases the response time in case of a slow server.
Use common code with bootparam boot.
 1.1 29-Aug-1997  gwr branches: 1.1.2;
Add support for nfs_mountroot using BOOTP based on the contributions
of Tor Egge (closes PR kern/2351).
 1.1.2.3 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.1.2.2 01-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.1.2.1 29-Aug-1997  thorpej file nfs_bootdhcp.c was added on branch marc-pcmcia on 1997-09-01 21:02:54 +0000
 1.2.2.1 08-May-1998  mycroft Pull up 1.7, per request of drochner.
 1.11.4.1 21-Jun-1999  thorpej Sync w/ -current.
 1.12.2.2 08-Dec-2000  bouyer Sync with HEAD.
 1.12.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.14.2.1 22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.16.2.4 20-Jun-2002  nathanw Catch up to -current.
 1.16.2.3 01-Apr-2002  nathanw Catch up to -current.
(CVS: It's not just a program. It's an adventure!)
 1.16.2.2 14-Nov-2001  nathanw Catch up to -current.
 1.16.2.1 21-Jun-2001  nathanw Catch up to -current.
 1.17.6.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.17.2.2 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.17.2.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.20.2.1 20-Jun-2002  gehenna catch up with -current.
 1.24.2.5 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.24.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.24.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.24.2.2 03-Aug-2004  skrll Sync with HEAD
 1.24.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.27.6.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.27.4.1 29-Apr-2005  kent sync with -current
 1.28.4.4 21-Jan-2008  yamt sync with head
 1.28.4.3 03-Sep-2007  yamt sync with head.
 1.28.4.2 30-Dec-2006  yamt sync with head.
 1.28.4.1 21-Jun-2006  yamt sync with head.
 1.29.10.1 19-Apr-2006  elad sync with head.
 1.29.8.1 01-Apr-2006  yamt sync with head.
 1.29.6.1 22-Apr-2006  simonb Sync with head.
 1.29.4.1 09-Sep-2006  rpaulo sync with head
 1.30.12.2 10-Dec-2006  yamt sync with head.
 1.30.12.1 22-Oct-2006  yamt sync with head
 1.30.10.1 18-Nov-2006  ad Sync with head.
 1.32.4.2 17-May-2007  yamt sync with head.
 1.32.4.1 12-Mar-2007  rmind Sync with HEAD.
 1.32.2.1 13-May-2007  jdc Pull up revisions 1.34-1.35 (requested by manu in ticket #635).

Add the TFTPROOT kernel option for TFTP'ing root RAMdisk at root mount time.
This allows working around situations where a kernel with embedded RAMdisk
cannot be booted by the bootloader because the RAMdisk is too big.

Fix build (broken by a fix introduced in the wrong file...)
 1.33.4.1 11-Jul-2007  mjf Sync with head.
 1.33.2.2 09-Oct-2007  ad Sync with head.
 1.33.2.1 08-Jun-2007  ad Sync with head.
 1.35.8.2 09-Jan-2008  matt sync with HEAD
 1.35.8.1 06-Nov-2007  matt sync with HEAD
 1.35.6.1 03-Sep-2007  jmcneill Sync with HEAD.
 1.35.2.1 03-Sep-2007  skrll Sync with HEAD.
 1.36.12.1 02-Jan-2008  bouyer Sync with HEAD
 1.36.8.1 26-Dec-2007  ad Sync with head.
 1.37.8.1 18-May-2008  yamt sync with head.
 1.37.6.3 17-Jan-2009  mjf Sync with HEAD.
 1.37.6.2 28-Sep-2008  mjf Sync with HEAD.
 1.37.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.38.2.5 09-Oct-2010  yamt sync with head
 1.38.2.4 18-Jul-2009  yamt sync with head.
 1.38.2.3 16-May-2009  yamt sync with head
 1.38.2.2 04-May-2009  yamt sync with head.
 1.38.2.1 16-May-2008  yamt sync with head.
 1.39.2.2 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.39.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.40.4.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.40.4.1 19-Oct-2008  haad Sync with HEAD.
 1.40.2.1 28-Jul-2008  simonb Sync with head.
 1.43.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.45.4.2 23-Jul-2009  jym Sync with HEAD.
 1.45.4.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.51.4.1 05-Mar-2011  rmind sync with head
 1.51.2.1 22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.52.36.3 09-Jul-2016  skrll Sync with HEAD
 1.52.36.2 06-Jun-2015  skrll Sync with HEAD
 1.52.36.1 06-Apr-2015  skrll Sync with HEAD
 1.52.34.1 06-Apr-2015  snj Pull up following revision(s) (requested by hikaru in ticket #656):
sys/kern/subr_tftproot.c: revision 1.14
sys/nfs/krpc_subr.c: revision 1.39
sys/nfs/nfs_boot.c: revision 1.82
sys/nfs/nfs_bootdhcp.c: revision 1.53
sys/nfs/nfsdiskless.h: revision 1.31
m_pullup() is called in rcvproc callback functions,
so nfs_boot_sendrecv() should keep track of the head of mbuf chain.
fixes kern/48746
 1.52.18.1 03-Dec-2017  jdolecek update from HEAD
 1.52.14.1 16-Apr-2015  msaitoh Pull up following revision(s) (requested by hikaru in ticket #1287):
sys/kern/subr_tftproot.c: revision 1.14 via patch
sys/nfs/nfsdiskless.h: revision 1.31
sys/nfs/nfs_boot.c: revision 1.82
sys/nfs/krpc_subr.c: revision 1.39
sys/nfs/nfs_bootdhcp.c: revision 1.53
m_pullup() is called in rcvproc callback functions,
so nfs_boot_sendrecv() should keep track of the head of mbuf chain.
fixes kern/48746
 1.58.2.1 02-Aug-2025  perseant Sync with HEAD
 1.40 05-Jul-2024  rin sys: Drop redundant NULL check before m_freem(9)

m_freem(9) has safely accepted a NULL argument at least since 4.2BSD:
https://www.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/src/sys/sys/uipc_mbuf.c

Compile-tested on amd64/ALL.

Suggested by knakahara@
 1.39 29-Jun-2019  kamil branches: 1.39.34;
Appease GCC and initialize arps_ip

Fixes build as GCC errors with maybe-uninitialized that is a false
positive.
 1.38 12-Sep-2013  drochner branches: 1.38.30;
tyop in comment, from Eivind Evensen via OpenBSD
 1.37 21-Mar-2010  chs branches: 1.37.8; 1.37.18; 1.37.22;
in nfs_bootparam(), set the corresponding flag for each field that we fill in.
 1.36 02-Mar-2010  pooka branches: 1.36.2;
Get rid of dependency on fs_nfs.h, i.e. source modules with
conditional content depending on if the NFS client is wanted or
not. The server can now be made an independent module not depending
on the nfs client.

Tested with rump_nfs (standalone client), rump_nfsd (standalone
nfsd) and a qemu installation with both the client and the server.
 1.35 19-Nov-2008  ad branches: 1.35.6;
Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.34 27-Oct-2008  cegger change nfs boot behaviour to automatically try the next boot method if the boot information is too incomplete to succeed.
That way, it is possible to combine static and dhcp boot:
For example, to boot diskless you can specify the nfs-server and the rootpath statically. All other information will be taken via dhcp.

Patch has been presented on port-xen, tech-kern and tech-net:
http://mail-index.netbsd.org/port-xen/2008/10/24/msg004488.html
http://mail-index.netbsd.org/tech-kern/2008/10/24/msg003255.html
http://mail-index.netbsd.org/tech-net/2008/10/24/msg000864.html

No comments, no objections.
 1.33 24-Oct-2008  cegger branches: 1.33.2;
- ansify function definition
- de- __P
- u_int32_t -> uint32_t

No functional changes.
 1.32 28-Apr-2008  martin branches: 1.32.6;
Remove clause 3 and 4 from TNF licenses
 1.31 02-Jan-2008  yamt branches: 1.31.6; 1.31.8; 1.31.10;
use kmem_alloc instead of malloc.
 1.30 04-Mar-2007  christos branches: 1.30.16; 1.30.22; 1.30.28;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.29 15-Apr-2006  christos branches: 1.29.14;
m_freem takes one arg.
 1.28 15-Apr-2006  christos s/mfree/m_freem/
 1.27 15-Apr-2006  christos Don't leak mbufs on error.
 1.26 11-Dec-2005  christos branches: 1.26.4; 1.26.6; 1.26.8; 1.26.10; 1.26.12;
merge ktrace-lwp.
 1.25 29-May-2005  christos branches: 1.25.2;
- sprinkle const
- avoid shadowed variables
- mark bad const use with XXXUNCONST
 1.24 22-May-2004  jonathan Eliminate several uses of `curproc' from the socket-layer code and from NFS.

Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded by Matt Thomas.

Changes to soreceive() and (*dom->dom_externalize)() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize. Also, now soreceive() uses
its uio->uio_procp value, pass that same value downward to
(*pr->pru_usrreq)() calls for consistency, instead of (struct proc * )0.

Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req->r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.

Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
 1.23 29-Jun-2003  fvdl branches: 1.23.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.22 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.21 01-Feb-2003  thorpej Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.
 1.20 22-Sep-2002  jdolecek <sys/conf.h> and <sys/device.h> are not needed here
 1.19 10-Nov-2001  lukem add RCSIDs
 1.18 13-Oct-2001  simonb branches: 1.18.2;
Remove so variables that are only ever set and never referenced.
 1.17 03-Oct-2000  chs branches: 1.17.2; 1.17.4;
include opt_inet.h, needed by previous check-in.
 1.16 02-Oct-2000  itojun perform reverse ARP only if INET is compiled into the kernel
 1.15 26-Jul-1999  enami branches: 1.15.2;
Don't use the result of inet_ntoa after calling inet_ntoa again,
since they share the same static storage.
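
The pitfall fixed in 1.15 can be sketched in userland C. This is only an
illustration using the POSIX inet_ntoa() from <arpa/inet.h>, not the kernel
code touched by the commit; format_pair() is a hypothetical helper. It copies
the first result out of the shared static buffer before the second call.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/*
 * Format two IPv4 addresses into `out` as "a -> b".
 * inet_ntoa() returns a pointer to static storage, so the first
 * result is copied into a local buffer before the second call
 * overwrites it.  (format_pair is a hypothetical name.)
 */
static void
format_pair(struct in_addr a, struct in_addr b, char *out, size_t outlen)
{
	char first[INET_ADDRSTRLEN];

	/* Save the first string before calling inet_ntoa() again. */
	snprintf(first, sizeof(first), "%s", inet_ntoa(a));
	snprintf(out, outlen, "%s -> %s", first, inet_ntoa(b));
}
```

Holding on to the pointer returned by the first call instead of copying it
would leave both names pointing at the same buffer.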
 1.14 07-May-1999  drochner -print diskless boot related IP addresses in dot notation
-arrange gateway code to fall back to the old method if the new "getfile"
is not answered (and both are enabled -- allow to switch off the new
method for symmetry)
-handle error if setting the netmask fails
 1.13 12-Apr-1999  ross libkern just got an inet_addr(), but it won't compile, no prototype. Cleanup...
* Add prototype to libkern.h.
* Remove the almost-identical-copy from libsa/net.[ch].
* Change its type back to the (wrong, but harmless) historical one. (u_long)
* Kill the XXX local prototype in nfs_bootparam.c
 1.12 11-Apr-1999  gwr Enable the code that gets our gateway+netmask from the
bootparam server using the "gateway" pseudo file.
(Compatible with sys/lib/libsa/dev_net.c)
 1.11 21-Feb-1999  drochner branches: 1.11.4;
restructure the diskless NFS boot code to keep track of the used
interface and the address allocated, to roll everything back if the
mount fails:
-put an interface pointer into "struct nfs_diskless" to have it
available for cleanup, don't pass it around anymore where the
"struct nfs_diskless" is already passed
-add a "cleanup" function which shuts the interface down
-in the protocol-specific parts, either return with "everything
ready" or "completely shut down"
-use common functions for interface initialization and shutdown
-add a function to delete all routes associated with an interface
(why is this necessary and not done by ~IFF_UP?)
g/c diskless swap stuff
general cleanup
 1.10 13-Sep-1998  christos Fix copyright spacing.
 1.9 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.8 13-Jun-1998  tv Fix boogered gcc warning workaround the right way.
 1.7 01-Mar-1998  veego Add two includes for the 'struct nfs_args' so it compiles again.
 1.6 12-Jan-1998  scottr Consolidate NFS_BOOT_* options into opt_nfs_boot.h
 1.5 09-Jan-1998  drochner Conditionalize call to RARP, check interface type.
(This file can now be included even if no ARP capable interfaces are
defined.)
 1.4 12-Dec-1997  gwr Temporarily disable the bootparam "gateway" support.
 1.3 10-Dec-1997  gwr Change the format of the bootparam "gateway" parameter string to
gateway=server:255.255.255.0 because that is the preferred format,
and the sys/libsa code already knows how to parse that format.
(Copied ip_convert here from the libsa code.)
 1.2 09-Sep-1997  gwr branches: 1.2.2;
Circumvent the lack of a reliable gateway/netmask value in the
Sun RPC bootparam/whoami return by requesting a "pseudo file"
named "gateway" and using its contents as the gateway:netmask
Example /etc/bootparams line: client gateway=router:0xfffffff0
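
As a rough sketch of how a "gateway:netmask" value such as
router:0xfffffff0 could be taken apart, the following C helper splits the
string at the first ':'. It assumes the pseudo file's contents arrive as one
NUL-terminated string; split_gateway() and its buffers are hypothetical and
are not the ip_convert code mentioned in rev 1.3.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split "gateway:netmask" at the first ':' into two caller-supplied
 * buffers.  Returns 0 on success, -1 if there is no ':' or if either
 * part does not fit.  (Hypothetical helper for illustration only.)
 */
static int
split_gateway(const char *s, char *gw, size_t gwlen,
    char *mask, size_t masklen)
{
	const char *colon = strchr(s, ':');
	size_t n;

	if (colon == NULL)
		return -1;
	n = (size_t)(colon - s);
	if (n >= gwlen || strlen(colon + 1) >= masklen)
		return -1;
	memcpy(gw, s, n);
	gw[n] = '\0';
	strcpy(mask, colon + 1);	/* length checked above */
	return 0;
}
```

Converting the two parts to addresses (a host name or dotted quad, and a hex
or dotted mask) is left out of the sketch.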
 1.1 29-Aug-1997  gwr branches: 1.1.2;
Support for RARP,RPC/bootparam moved from nfs_boot.c to here.
 1.1.2.3 16-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.1.2.2 01-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.1.2.1 29-Aug-1997  thorpej file nfs_bootparam.c was added on branch marc-pcmcia on 1997-09-01 21:02:56 +0000
 1.2.2.1 14-Dec-1997  mellon Pull rev 1.3 and 1.4 up from trunk (gwr)
 1.11.4.2 02-Aug-1999  thorpej Update from trunk.
 1.11.4.1 21-Jun-1999  thorpej Sync w/ -current.
 1.15.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.17.4.2 10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.17.4.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.17.2.3 18-Oct-2002  nathanw Catch up to -current.
 1.17.2.2 14-Nov-2001  nathanw Catch up to -current.
 1.17.2.1 22-Oct-2001  nathanw Catch up to -current.
 1.18.2.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.23.2.5 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.23.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.23.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.23.2.2 03-Aug-2004  skrll Sync with HEAD
 1.23.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.25.2.3 21-Jan-2008  yamt sync with head
 1.25.2.2 03-Sep-2007  yamt sync with head.
 1.25.2.1 21-Jun-2006  yamt sync with head.
 1.26.12.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.26.10.1 19-Apr-2006  elad sync with head.
 1.26.8.1 24-May-2006  yamt sync with head.
 1.26.6.1 22-Apr-2006  simonb Sync with head.
 1.26.4.1 09-Sep-2006  rpaulo sync with head
 1.29.14.1 12-Mar-2007  rmind Sync with HEAD.
 1.30.28.1 02-Jan-2008  bouyer Sync with HEAD
 1.30.22.1 18-Feb-2008  mjf Sync with HEAD.
 1.30.16.1 09-Jan-2008  matt sync with HEAD
 1.31.10.4 11-Aug-2010  yamt sync with head.
 1.31.10.3 11-Mar-2010  yamt sync with head
 1.31.10.2 04-May-2009  yamt sync with head.
 1.31.10.1 16-May-2008  yamt sync with head.
 1.31.8.1 18-May-2008  yamt sync with head.
 1.31.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.31.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.32.6.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.33.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.35.6.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.36.2.1 30-May-2010  rmind sync with head
 1.37.22.1 18-May-2014  rmind sync with head
 1.37.18.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.37.8.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.38.30.1 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.39.34.1 02-Aug-2025  perseant Sync with HEAD
 1.8 23-Oct-2009  snj Remove 3rd and 4th clauses. OK cl@ (copyright holder).
 1.7 19-Nov-2008  ad Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.6 27-Oct-2008  cegger change nfs boot behaviour to automatically try the next boot method if the boot information is too incomplete to succeed.
That way, it is possible to combine static and dhcp boot:
For example, to boot diskless you can specify the nfs-server and the rootpath statically. All other information will be taken via dhcp.

Patch has been presented on port-xen, tech-kern and tech-net:
http://mail-index.netbsd.org/port-xen/2008/10/24/msg004488.html
http://mail-index.netbsd.org/tech-kern/2008/10/24/msg003255.html
http://mail-index.netbsd.org/tech-net/2008/10/24/msg000864.html

No comments, no objections.
 1.5 08-Jul-2007  bouyer branches: 1.5.28; 1.5.32; 1.5.38; 1.5.42;
Add a new BOOTSTATIC flag, NFS_BOOTSTATIC_NOSTATIC, which causes
nfs_bootstatic() to abort with EOPNOTSUPP. This allows a callback to
say that there is no bootstatic config, and the next NFS boot method should
be tried.
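
The fall-through described in 1.5 can be pictured with a small C sketch. It
is a simplification under stated assumptions: the function-pointer table and
the names try_boot_methods(), method_nostatic() and method_ok() are
hypothetical, and the sketch moves on after any failure, which may be broader
than what the kernel does.

```c
#include <errno.h>
#include <stddef.h>

/* A boot method returns 0 on success or an errno value on failure. */
typedef int (*boot_method_fn)(void);

/*
 * Try each boot method in order and stop at the first success.
 * Returns 0 on success, otherwise the error from the last method
 * tried (ENOENT if the table is empty; that choice is arbitrary).
 */
static int
try_boot_methods(const boot_method_fn *methods, size_t n)
{
	int error = ENOENT;

	for (size_t i = 0; i < n; i++) {
		error = methods[i]();
		if (error == 0)
			return 0;
		/* e.g. EOPNOTSUPP: no static config, try the next one */
	}
	return error;
}

/* Stand-in for a static method whose callback reports "no config". */
static int
method_nostatic(void)
{
	return EOPNOTSUPP;
}

/* Stand-in for a later method (such as DHCP) that succeeds. */
static int
method_ok(void)
{
	return 0;
}
```

With the table { method_nostatic, method_ok } the first entry reports
EOPNOTSUPP and the second one completes the boot configuration.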
 1.4 04-Mar-2007  christos branches: 1.4.2; 1.4.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.3 11-Dec-2005  christos branches: 1.3.26;
merge ktrace-lwp.
 1.2 26-Feb-2005  perry branches: 1.2.4;
nuke trailing whitespace
 1.1 11-Mar-2004  cl branches: 1.1.4; 1.1.10; 1.1.12;
Add static nfs boot configuration, from the kernel config file or from
a driver selectable callback function. This is used in the Xen port to
allow controlling the domain's network setup from the domain building
environment at domain creation (vs. having to maintain/change this on a
dhcp server). The Xen network driver parses a command line passed in
from the domain builder.
 1.1.12.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.1.10.1 29-Apr-2005  kent sync with -current
 1.1.4.6 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.1.4.5 04-Feb-2005  skrll Adapt to branch.
 1.1.4.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.1.4.3 18-Sep-2004  skrll Sync with HEAD.
 1.1.4.2 03-Aug-2004  skrll Sync with HEAD
 1.1.4.1 11-Mar-2004  skrll file nfs_bootstatic.c was added on branch ktrace-lwp on 2004-08-03 10:56:17 +0000
 1.2.4.2 03-Sep-2007  yamt sync with head.
 1.2.4.1 21-Jun-2006  yamt sync with head.
 1.3.26.1 12-Mar-2007  rmind Sync with HEAD.
 1.4.4.1 11-Jul-2007  mjf Sync with head.
 1.4.2.1 15-Jul-2007  ad Sync with head.
 1.5.42.1 19-Jan-2009  skrll Sync with HEAD.
 1.5.38.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.5.32.2 11-Mar-2010  yamt sync with head
 1.5.32.1 04-May-2009  yamt sync with head.
 1.5.28.1 17-Jan-2009  mjf Sync with HEAD.
 1.7 05-Jul-2024  rin sys: Drop redundant NULL check before m_freem(9)

m_freem(9) safely has accepted NULL argument at least since 4.2BSD:
https://www.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/src/sys/sys/uipc_mbuf.c

Compile-tested on amd64/ALL.

Suggested by knakahara@
 1.6 21-Jan-2018  christos branches: 1.6.40;
PR/40491: From Tobias Ulmer in tech-kern@:
1. Protect the nfs request queue with its own mutex
2. make the nfs_receive queue check for signals so that intr mounts
can be interrupted.
XXX: pullup-8
 1.5 17-Jun-2016  christos branches: 1.5.10;
Serialize all access to the NFS request queue via splsoftnet(). Fixes random
crashes.
XXX: Pullup-7
 1.4 13-Jun-2016  christos Simplify, no functional change.
 1.3 15-Jul-2015  manu Fix soft NFS force unmount

For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that awaited NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.

Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.

Reviewed by Chuck Silvers.
 1.2 05-Sep-2014  matt branches: 1.2.2;
Don't use catch as a variable name.
 1.1 02-Mar-2010  pooka branches: 1.1.2; 1.1.6; 1.1.24; 1.1.40; 1.1.42;
Get rid of dependency on fs_nfs.h, i.e. source modules with
conditional content depending on if the NFS client is wanted or
not. The server can now be made an independent module not depending
on the nfs client.

Tested with rump_nfs (standalone client), rump_nfsd (standalone
nfsd) and a qemu installation with both the client and the server.
 1.1.42.1 10-Jul-2016  martin Pull up following revision(s) (requested by christos in ticket #1184):
sys/nfs/nfs_socket.c: revision 1.198
sys/nfs/nfs_clntsocket.c: revision 1.5
Serialize all access to the NFS request queue via splsoftnet(). Fixes random
crashes.
 1.1.40.2 10-Jul-2016  martin Pull up following revision(s) (requested by christos in ticket #1184):
sys/nfs/nfs_socket.c: revision 1.198
sys/nfs/nfs_clntsocket.c: revision 1.5
Serialize all access to the NFS request queue via splsoftnet(). Fixes random
crashes.
XXX: Pullup-7
 1.1.40.1 04-Nov-2015  riz Pull up following revision(s) (requested by manu in ticket #882):
sbin/umount/umount.c: revision 1.48
sys/nfs/nfsmount.h: revision 1.53
sys/nfs/nfs_var.h: revision 1.94
sys/nfs/nfs_iod.c: revision 1.7
sys/nfs/nfs_socket.c: revision 1.197
sys/nfs/nfs_bio.c: revision 1.191
sys/nfs/nfs_vfsops.c: revision 1.230
sys/nfs/nfs_clntsocket.c: revision 1.3
Remove useless and harmful sync(2) call in umount(8)
Remove sync(2) call before unmount(2) in umount(8). This sync(2) is useless
since unmount(2) will perform a VFS_SYNC anyway.
Moreover, this sync(2) may be harmful, as there are some situations where
it cannot return (unreachable NFS server, for instance), causing umount -f
to be ineffective.
Fix soft NFS force unmount
For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that awaited NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.
Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.
Reviewed by Chuck Silvers.
 1.1.24.1 03-Dec-2017  jdolecek update from HEAD
 1.1.6.2 30-Apr-2010  uebayasi Sync with HEAD.
 1.1.6.1 02-Mar-2010  uebayasi file nfs_clntsocket.c was added on branch uebayasi-xip on 2010-04-30 14:44:22 +0000
 1.1.2.3 21-Mar-2010  yamt fix merge botches
 1.1.2.2 11-Mar-2010  yamt sync with head
 1.1.2.1 02-Mar-2010  yamt file nfs_clntsocket.c was added on branch yamt-nfs-mp on 2010-03-11 15:04:31 +0000
 1.2.2.2 09-Jul-2016  skrll Sync with HEAD
 1.2.2.1 22-Sep-2015  skrll Sync with HEAD
 1.5.10.1 08-Jun-2018  martin Pull up following revision(s) (requested by maya in ticket #856):

sys/nfs/nfs.h: revision 1.76
sys/nfs/nfs_subs.c: revision 1.230
sys/nfs/nfs_socket.c: revision 1.199
sys/nfs/nfs_clntsocket.c: revision 1.6

PR/40491: From Tobias Ulmer in tech-kern@:
1. Protect the nfs request queue with its own mutex
2. make the nfs_receive queue check for signals so that intr mounts
can be interrupted.

XXX: pullup-8
 1.6.40.1 02-Aug-2025  perseant Sync with HEAD
 1.7 21-Mar-2023  christos PR/57279: Izumi Tsutsui: Fix some {int,long} -> time_t. Still things will
break eventually because parts of the nfs protocol assume time_t will fit
in 32 bits.
 1.6 28-Feb-2022  hannken branches: 1.6.4;
Revert the hack from the last commit now that VOP_UNLOCK()
no longer may hold v_interlock or vmobjlock.
 1.5 14-Jan-2022  christos This is a temporary hack to avoid nfs crashes related to nfs_delaytruncate.
 1.4 23-Feb-2020  ad UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
 1.3 03-Sep-2018  riastradh branches: 1.3.6;
Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
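
The truncation concern behind the rename can be shown with a short C sketch.
It assumes a platform where unsigned int is 32 bits; uimin_sketch() and
MIN_SKETCH() are stand-ins written for this example, not the libkern
definitions.

```c
#include <stdint.h>

/* Stand-in with the unsigned int signature described above. */
static inline unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Macro form: operands keep their own type, but are evaluated twice. */
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))

/*
 * The 64-bit arguments are implicitly converted to unsigned int at
 * the call, so the upper 32 bits are silently dropped (assuming a
 * 32-bit unsigned int).
 */
static uint64_t
truncating_min(uint64_t a, uint64_t b)
{
	return uimin_sketch(a, b);
}

/* The macro compares the full 64-bit values. */
static uint64_t
widening_min(uint64_t a, uint64_t b)
{
	return MIN_SKETCH(a, b);
}
```

For the inputs 0x100000001 and 5, the function-based version compares 1 with
5 after truncation, while the macro compares the original values.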
 1.2 12-Jun-2011  rmind branches: 1.2.52; 1.2.54;
Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.1 02-Mar-2010  pooka branches: 1.1.2; 1.1.4; 1.1.6; 1.1.12;
Get rid of dependency on fs_nfs.h, i.e. source modules with
conditional content depending on if the NFS client is wanted or
not. The server can now be made an independent module not depending
on the nfs client.

Tested with rump_nfs (standalone client), rump_nfsd (standalone
nfsd) and a qemu installation with both the client and the server.
 1.1.12.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.1.6.2 30-Apr-2010  uebayasi Sync with HEAD.
 1.1.6.1 02-Mar-2010  uebayasi file nfs_clntsubs.c was added on branch uebayasi-xip on 2010-04-30 14:44:22 +0000
 1.1.4.1 16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.1.2.4 10-Oct-2010  yamt some locking changes
 1.1.2.3 26-Sep-2010  yamt locking changes
 1.1.2.2 11-Mar-2010  yamt sync with head
 1.1.2.1 02-Mar-2010  yamt file nfs_clntsubs.c was added on branch yamt-nfs-mp on 2010-03-11 15:04:31 +0000
 1.2.54.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.2.54.1 10-Jun-2019  christos Sync with HEAD
 1.2.52.1 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.3.6.1 29-Feb-2020  ad Sync with head.
 1.6.4.1 20-Sep-2024  martin Pull up following revision(s) (requested by rin in ticket #880):

sys/nfs/nfs_iod.c: revision 1.9
sys/nfs/nfs_vfsops.c: revision 1.245
sys/nfs/nfs_clntsubs.c: revision 1.7

PR/57279: Izumi Tsutsui: Fix some {int,long} -> time_t. Still things will
break eventually because parts of the nfs protocol assume time_t will fit
in 32 bits.
 1.63 04-Jun-2021  hannken Add flag/command NFSSVC_REPLACEEXPORTSLIST to nfssvc(2) system call.

Works like NFSSVC_SETEXPORTSLIST but supports "mel_nexports > 1"
and will atomically update the complete exports list for a file system.
 1.62 17-Jan-2020  ad branches: 1.62.10; 1.62.14;
VFS_VGET(), VFS_ROOT(), VFS_FHTOVP(): give them a "int lktype" argument, to
allow us to get shared locks (or no lock) on the returned vnode. Matches
FreeBSD.
 1.61 22-Dec-2019  ad branches: 1.61.2;
Make mntvnode_lock per-mount, and address false sharing of struct mount.
 1.60 17-Apr-2017  hannken branches: 1.60.12;
Remove unused argument "nextp" from vfs_busy() and vfs_unbusy().
Remove argument "keepref" from vfs_unbusy() and add vfs_ref() where needed.
 1.59 20-Nov-2016  maxv branches: 1.59.2;
Memory leak, found by Mootja.
 1.58 14-Dec-2013  christos branches: 1.58.6; 1.58.10;
don't allow the nfs server module to unload if it has exported filesystems.
 1.57 23-Nov-2013  christos convert from CIRCLEQ to TAILQ
 1.56 15-Sep-2013  martin Remove __CT_LOCAL_.. hack
 1.55 14-Sep-2013  martin Guard a function local CTASSERT with pro/epilogue
 1.54 30-Aug-2013  dholland Use __CTASSERT instead of handrolled version.
 1.53 30-Aug-2013  dholland more typos in comments
 1.52 30-Aug-2013  dholland typo in comment
 1.51 27-Sep-2011  christos branches: 1.51.2; 1.51.12; 1.51.16;
use NFS_MAXNAMLEN for all names.
 1.50 31-Mar-2011  dyoung Hide the radix-trie implementation of the forwarding table so that we
will have an easier time replacing it with something different, even if
it is a second radix-trie implementation.

sys/net/route.c and sys/net/rtsock.c no longer operate directly on
radix_nodes or radix_node_heads.

Hopefully this will reduce the temptation to implement multipath or
source-based routing using grotty hacks to the grotty old radix-trie
code, too. :-)
 1.49 19-Nov-2010  dholland branches: 1.49.2;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.
 1.48 07-Jul-2009  christos branches: 1.48.4;
The compatibility call to re-export from sys_mount() calls
mountd_set_exports_list, with the mnt_updating mutex held. Account for that
to avoid a locking against myself panic.
 1.47 23-May-2009  ad Broken assertion.
 1.46 23-May-2009  ad - Cosmetic change to previous.
- Add a comment.
 1.45 23-May-2009  ad - Fix a race between umount()/mount() and nfssvc().
- Toss netexport state on nfsserver module unload.
 1.44 17-Dec-2008  cegger branches: 1.44.2;
kill MALLOC and FREE macros.
 1.43 28-Nov-2008  pooka Use kmem instead of malloc to avoid hassle with dynamically attaching
a malloc type. Makes nfsserver-as-a-module work.

reported on current-users & tested by Geoff Wing
 1.42 25-Nov-2008  pooka Comment police. No functional or GENERIC size change.
 1.41 25-Nov-2008  pooka When testing if a file system handles file handles (ha ha ha), be
content with just the vp-to-fh-size check.
(removes a very weird error path in the code)
 1.40 19-Nov-2008  ad Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.39 14-Nov-2008  ad Remove COMPAT ifdefs that might as well be comments (i.e., they cost us
almost nothing).
 1.38 10-May-2008  rumble branches: 1.38.4; 1.38.6;
Convert file systems to dynamically attach with the new module interface.
Make VFS hooks dynamic while we're here and say farewell to VFS_ATTACH and
VFS_HOOKS_ATTACH linksets.

As a consequence, most of the file systems can now be loaded as new style
modules.

Quick sanity check by ad@.
 1.37 06-May-2008  ad branches: 1.37.2;
PR kern/38141 lookup/vfs_busy acquire rwlock recursively

Simplify the mount locking. Remove all the crud to deal with recursion on
the mount lock, and crud to deal with unmount as another weirdo lock.

Hopefully this will once and for all fix the deadlocks with this. With this
commit there are two locks on each mount:

- krwlock_t mnt_unmounting. This is used to prevent unmount across critical
sections like getnewvnode(). It's only ever read locked with rw_tryenter(),
and is only ever write locked in dounmount(). A write hold can't be taken
on this lock if the current LWP could hold a vnode lock.

- kmutex_t mnt_updating. This is taken by threads updating the mount, for
example when going r/o -> r/w, and is only present to serialize updates.
In order to take this lock, a read hold must first be taken on
mnt_unmounting, and the two need to be held across the operation.

One effect of this change: previously if an unmount failed, we would make a
half hearted attempt to back out of it gracefully, but that was unlikely to
work in a lot of cases. Now while an unmount that will be aborted is in
progress, new file operations within the mount will fail instead of being
delayed. That is unlikely to be a problem though, because if the admin
requests unmount of a file system then s(he) has made a decision to deny
access to the resource.
 1.36 30-Apr-2008  ad PR kern/38135 vfs_busy/vfs_trybusy confusion

The previous fix worked, but it opened a window where mounts could have
disappeared from mountlist while the caller was traversing it using
vfs_trybusy(). Fix that.
 1.35 29-Apr-2008  ad kern/38135 vfs_busy/vfs_trybusy confusion

The symptom was that sometimes file systems would occasionally not appear
in output from 'df' or 'mount' if the system was busy. Resolution:

- Make mount locks work somewhat like vm_map locks.
- vfs_trybusy() now only fails if the mount is gone, or if someone is
unmounting the file system. Simple contention on mnt_lock doesn't
cause it to fail.
- vfs_busy() will wait even if the file system is being unmounted.
 1.34 28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.33 28-Feb-2008  elad branches: 1.33.2; 1.33.4;
Introduce a new kauth action, KAUTH_NETWORK_NFS, and two requests,
KAUTH_REQ_NETWORK_NFS_EXPORT and KAUTH_REQ_NETWORK_NFS_SVC, and use them
to replace two KAUTH_GENERIC_ISSUSER calls in the NFS code.

Also replace two more with KAUTH_SYSTEM_MKNOD, where appropriate.

Documentation and examples updated. More to come.
 1.32 30-Jan-2008  ad branches: 1.32.2; 1.32.6;
PR kern/37706 (forced unmount of file systems is unsafe):

- Do reference counting for 'struct mount'. Each vnode associated with a
mount takes a reference, and in turn the mount takes a reference to the
vfsops.
- Now that mounts are reference counted, replace the overcomplicated mount
locking inherited from 4.4BSD with a recursable rwlock.
 1.31 08-Dec-2007  pooka Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
 1.30 12-Jul-2007  dsl branches: 1.30.6; 1.30.8; 1.30.14; 1.30.16;
Change the VFS_MOUNT() interface so that the 'data' buffer passed to the
fs code is a kernel buffer, pass through the length of the buffer as well.
Since the length of the userspace buffer isn't (yet) passed through the mount
system call, add a field to the vfsops structure containing the default length.
Split sys_mount() for calls from compat code.
Ride one of the recent kernel version changes - old fs LKMs will load, but
sys_mount() will reject any attempt to use them.
 1.29 09-Jul-2007  ad Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.28 09-Jun-2007  dyoung There is only one radix trie walker, and it is rn_walktree(), so
use that instead of the indirect function call through rnh,
rnh->rnh_walktree.
 1.27 04-Mar-2007  christos branches: 1.27.2; 1.27.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.26 05-Feb-2007  chs branches: 1.26.2;
add back a mistakenly removed vput().
 1.25 04-Feb-2007  chs more fixes for the new vnode locking scheme:
- don't use SAVESTART in calls to relookup() from unionfs,
just vref() the desired vnode when we need to.
- fix locking and refcounting in the unionfs EEXIST error cases.
- release any vnode locks before calling VFS_ROOT(), vfs_busy() is enough.
this allows us to simplify union_root() and fix PR 3006.
- union_lock() doesn't handle shared lock requests correctly,
so convert them to exclusive instead. fixes PR 34775.
- in relookup(), avoid reusing "dp" for different purposes,
the error handling wasn't right. (actually just get rid of dp.)
also, change relookup() to ignore LOCKLEAF and always return the
vnode locked since the callers already expect this.
 1.24 04-Jan-2007  elad Consistent usage of KAUTH_GENERIC_ISSUSER.
 1.23 09-Dec-2006  chs a smorgasbord of improvements to vnode locking and path lookup:
- LOCKPARENT is no longer relevant for lookup(), relookup() or VOP_LOOKUP().
these now always return the parent vnode locked. namei() works as before.
lookup() and various other paths no longer acquire vnode locks in the
wrong order via vrele(). fixes PR 32535.
as a nice side effect, path lookup is also up to 25% faster.
- the above allows us to get rid of PDIRUNLOCK.
- also get rid of WANTPARENT (just use LOCKPARENT and unlock it).
- remove an assumption in layer_node_find() that all file systems implement
a recursive VOP_LOCK() (unionfs doesn't).
- require that all file systems supply vfs_vptofh and vfs_fhtovp routines.
fill in eopnotsupp() for file systems that don't support being exported
and remove the checks for NULL. (layerfs calls these without checking.)
- in union_lookup1(), don't change refcounts in the ISDOTDOT case, just
adjust which vnode is locked. fixes PR 33374.
- apply fixes for ufs_rename() from ufs_vnops.c rev. 1.61 to ext2fs_rename().
 1.22 09-Nov-2006  yamt branches: 1.22.2;
remove some __unused in function parameters.
 1.21 31-Oct-2006  mjf Revert the changes I introduced trying to solve tmpfs' NFS export problem.
Requested by yamt@
 1.20 30-Oct-2006  jmmv Fix a typo in a comment.
 1.19 24-Oct-2006  mjf Add support to allow a file system to not permit being exported over NFS.

Approved by elad@ and wrstuden@
 1.18 22-Oct-2006  pooka kauth_cred_uucvt() -> kauth_uucred_to_cred(), introduce kauth_cred_to_uucred()

per tech-kern proposal
 1.17 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.16 23-Jul-2006  ad branches: 1.16.4; 1.16.6;
Use the LWP cached credentials where sane.
 1.15 13-Jul-2006  martin Fix alignment problems for fhandle_t, exposed by gcc4.1.

While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ,
version the getfh(2) syscall and explicitly pass the size available in
the filehandle from userland.

Discussed on tech-kern, with lots of help from yamt (thanks!).
 1.14 17-Jun-2006  yamt branches: 1.14.2;
- introduce vfs_composefh() and use it where appropriate.
- fix lock/unlock mismatch in sys_getfh.
 1.13 18-May-2006  yamt branches: 1.13.2; 1.13.4;
- fix some leaks in nfsd, introduced by kauth changes.
- simplify code.
- add some assertions.
- wrap some long lines.
- remove an unnecessary ";".
 1.12 18-May-2006  yamt - nfs_export_unmount: don't forget to free exports.
- rename clear_exports for consistency.
 1.11 14-May-2006  elad integrate kauth.
 1.10 27-Mar-2006  martin KASSERT that the returned file id length from VPTOFH is <= the
maximum allowed value (_VFS_MAXFIDSZ).
 1.9 05-Jan-2006  yamt branches: 1.9.2; 1.9.4; 1.9.6; 1.9.8; 1.9.10;
mountd_set_exports_list: check if VFS_VPTOFH actually works.
 1.8 05-Jan-2006  yamt ensure the export list is not changed during nfsd operations.
 1.7 15-Dec-2005  yamt branches: 1.7.2;
netcred_lookup: remove a wrong assertion and handle the case properly.
 1.6 11-Dec-2005  christos merge ktrace-lwp.
 1.5 22-Nov-2005  yamt - reduce number of linear search per rpc.
- coalesce mount_netexport_pair into netexport.
 1.4 20-Nov-2005  yamt fix bugs introduced by the recent exports list rototill.
- mountd_set_exports_list: don't discard error.
- nfs_update_exports_30: check the correct flag.
 1.3 25-Sep-2005  jmmv branches: 1.3.6; 1.3.8;
Add some COMPAT_30 code to let old mountd binaries work after the NFS
exports rototill.
 1.2 23-Sep-2005  jmmv Remove the mount<->netexport entry from the map during umount... otherwise
we leave a dangling pointer in the list *ouch*.
 1.1 23-Sep-2005  jmmv Apply the NFS exports list rototill patch:

- Remove all NFS related stuff from file system specific code.
- Drop the vfs_checkexp hook and generalize it in the new nfs_check_export
function, thus removing redundancy from all file systems.
- Move all NFS export-related stuff from kern/vfs_subr.c to the new
file sys/nfs/nfs_export.c. The former was becoming large and its code
is always compiled, regardless of the build options. Using the latter,
the code is only compiled in when NFSSERVER is enabled. While doing this,
also make some functions in nfs_subs.c conditional to NFSSERVER.
- Add a new command in nfssvc(2), called NFSSVC_SETEXPORTSLIST, that takes a
path and a set of export entries. At the moment it can only clear the
exports list or append entries, one by one, but it is done in a way that
allows setting the whole set of entries atomically in the future (see the
comment in mountd_set_exports_list or in doc/TODO).
- Change mountd(8) to use the nfssvc(2) system call instead of mount(2) so
that it becomes file system agnostic. In fact, all this whole thing was
done to remove a 'XXX' block from this utility!
- Change the mount*, newfs and fsck* userland utilities to not deal with NFS
exports initialization; done internally by the kernel when initializing
the NFS support for each file system.
- Implement an interface for VFS (called VFS hooks) so that several kernel
subsystems can run arbitrary code upon receipt of specific VFS events.
At the moment, this only provides support for unmount and is used to
destroy NFS exports lists from the file systems being unmounted, though it
has room for extension.

Thanks go to yamt@, chs@, thorpej@, wrstuden@ and others for their comments
and advice in the development of this patch.
 1.3.8.1 22-Nov-2005  yamt sync with head.
 1.3.6.3 11-Dec-2005  christos Sync with head.
 1.3.6.2 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.3.6.1 25-Sep-2005  skrll file nfs_export.c was added on branch ktrace-lwp on 2005-11-10 14:11:55 +0000
 1.7.2.1 15-Jan-2006  yamt sync with head.
 1.9.10.2 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.9.10.1 28-Mar-2006  tron Merge 2006-03-28 NetBSD-current into the "peter-altq" branch.
 1.9.8.6 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.9.8.5 19-Apr-2006  elad sync with head.
 1.9.8.4 12-Mar-2006  elad Rename kauth_cred_compare() to kauth_cred_uucmp(), and kauth_cred_convert()
to kauth_cred_uucvt(). This makes it clearer that we're working on struct
uucred.

Inspired by comments from yamt@.
 1.9.8.3 10-Mar-2006  elad Some cleanup.

kauth_cred_setrefcnt() was only called after kauth_cred_convert() in NFS
code to convert a struct uucred to kauth_cred_t. Since there's no valid
use for such a function, make kauth_cred_convert() set the reference
count to 1 and eliminate the need for kauth_cred_setrefcnt() entirely.

Motivated by comments from yamt@ and thorpej@.
 1.9.8.2 10-Mar-2006  elad generic_authorize() -> kauth_authorize_generic().
 1.9.8.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.9.6.4 11-Aug-2006  yamt sync with head
 1.9.6.3 26-Jun-2006  yamt sync with head.
 1.9.6.2 24-May-2006  yamt sync with head.
 1.9.6.1 01-Apr-2006  yamt sync with head.
 1.9.4.2 01-Jun-2006  kardel Sync with head.
 1.9.4.1 22-Apr-2006  simonb Sync with head.
 1.9.2.1 09-Sep-2006  rpaulo sync with head
 1.13.4.1 13-Jul-2006  gdamore Merge from HEAD.
 1.13.2.1 19-Jun-2006  chap Sync with head.
 1.14.2.8 17-Mar-2008  yamt sync with head.
 1.14.2.7 04-Feb-2008  yamt sync with head.
 1.14.2.6 21-Jan-2008  yamt sync with head
 1.14.2.5 03-Sep-2007  yamt sync with head.
 1.14.2.4 26-Feb-2007  yamt sync with head.
 1.14.2.3 30-Dec-2006  yamt sync with head.
 1.14.2.2 21-Jun-2006  yamt sync with head.
 1.14.2.1 17-Jun-2006  yamt file nfs_export.c was added on branch yamt-lazymbuf on 2006-06-21 15:11:58 +0000
 1.16.6.2 10-Dec-2006  yamt sync with head.
 1.16.6.1 22-Oct-2006  yamt sync with head
 1.16.4.3 09-Feb-2007  ad Sync with HEAD.
 1.16.4.2 12-Jan-2007  ad Sync with head.
 1.16.4.1 18-Nov-2006  ad Sync with head.
 1.22.2.1 17-Feb-2007  tron Apply patch (requested by chs in ticket #422):
- Fix various deadlock problems with nullfs and unionfs.
- Speed up path lookups by up to 25%.
 1.26.2.1 12-Mar-2007  rmind Sync with HEAD.
 1.27.4.1 11-Jul-2007  mjf Sync with head.
 1.27.2.2 15-Jul-2007  ad Sync with head.
 1.27.2.1 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.30.16.1 26-Dec-2007  ad Sync with head.
 1.30.14.1 18-Feb-2008  mjf Sync with HEAD.
 1.30.8.2 23-Mar-2008  matt sync with HEAD
 1.30.8.1 09-Jan-2008  matt sync with HEAD
 1.30.6.1 09-Dec-2007  jmcneill Sync with HEAD.
 1.32.6.3 17-Jan-2009  mjf Sync with HEAD.
 1.32.6.2 02-Jun-2008  mjf Sync with HEAD.
 1.32.6.1 03-Apr-2008  mjf Sync with HEAD.
 1.32.2.1 24-Mar-2008  keiichi sync with head.
 1.33.4.4 18-Jul-2009  yamt sync with head.
 1.33.4.3 20-Jun-2009  yamt sync with head
 1.33.4.2 04-May-2009  yamt sync with head.
 1.33.4.1 16-May-2008  yamt sync with head.
 1.33.2.1 18-May-2008  yamt sync with head.
 1.37.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.38.6.1 19-Jan-2009  skrll Sync with HEAD.
 1.38.4.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.44.2.1 23-Jul-2009  jym Sync with HEAD.
 1.48.4.2 21-Apr-2011  rmind sync with head
 1.48.4.1 05-Mar-2011  rmind sync with head
 1.49.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.51.16.1 18-May-2014  rmind sync with head
 1.51.12.2 03-Dec-2017  jdolecek update from HEAD
 1.51.12.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.51.2.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.58.10.2 26-Apr-2017  pgoyette Sync with HEAD
 1.58.10.1 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.58.6.2 28-Aug-2017  skrll Sync with HEAD
 1.58.6.1 05-Dec-2016  skrll Sync with HEAD
 1.59.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.60.12.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.61.2.1 17-Jan-2020  ad Sync with head.
 1.62.14.1 06-Jun-2021  cjep sync with head
 1.62.10.1 17-Jun-2021  thorpej Sync w/ HEAD.
 1.9 21-Mar-2023  christos PR/57279: Izumi Tsutsui: Fix some {int,long} -> time_t. Still things will
break eventually because parts of the nfs protocol assume time_t will fit
in 32 bits.
 1.8 03-Sep-2018  riastradh branches: 1.8.30;
Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
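
The truncation hazard described in this commit can be sketched as follows. This is a minimal illustration, not the libkern source: uimin here is a local stand-in with the unsigned int signature the message describes, MIN is the type-preserving macro form quoted above, and clamp_truncating/clamp_preserving are hypothetical helpers. It assumes a platform where unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the libkern helper: defined on unsigned int only. */
static inline unsigned int
uimin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Type-preserving macro, in the style some subsystems define locally. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper: clamps a 64-bit length with the unsigned int
 * function. Both arguments are implicitly converted to unsigned int
 * (modulo 2^32 when unsigned int is 32 bits) before the comparison.
 */
static uint64_t
clamp_truncating(uint64_t len, uint64_t limit)
{
	return uimin(len, limit);
}

/* Hypothetical helper: same clamp, but the macro compares full-width values. */
static uint64_t
clamp_preserving(uint64_t len, uint64_t limit)
{
	return MIN(len, limit);
}
```

With a length of 0x100000001 and a limit of 5, the function-based clamp compares the truncated value 1 against 5, while the macro compares the full 64-bit values. The macro form evaluates its arguments more than once, which is the multiple-evaluation risk the message mentions.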
 1.7 15-Jul-2015  manu branches: 1.7.16; 1.7.18;
Fix soft NFS force unmount

For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that await an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.

Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.

Reviewed by Chuck Silvers.
 1.6 25-Oct-2013  martin branches: 1.6.4; 1.6.6;
Mark a diagnostic-only variable
 1.5 14-Sep-2013  martin Avoid unused variable warnings
 1.4 31-Dec-2009  christos branches: 1.4.12; 1.4.22; 1.4.26;
handle the nuidhash_max lossage differently
 1.3 14-Mar-2009  dsl branches: 1.3.2;
ANSIfy another 1261 function definitions.
The only ones left in sys are beyond my sed script!
(or in sys/dist or sys/external)
Mostly they have function pointer parameters.
 1.2 14-Mar-2009  dsl Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
 1.1 19-Nov-2008  ad branches: 1.1.4; 1.1.6; 1.1.8; 1.1.10;
Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.1.10.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.1.8.3 28-Apr-2009  skrll Sync with HEAD.
 1.1.8.2 19-Jan-2009  skrll Sync with HEAD.
 1.1.8.1 19-Nov-2008  skrll file nfs_iod.c was added on branch nick-hppapmap on 2009-01-19 13:20:20 +0000
 1.1.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.1.6.1 19-Nov-2008  mjf file nfs_iod.c was added on branch mjf-devfs2 on 2009-01-17 13:29:34 +0000
 1.1.4.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.1.4.1 19-Nov-2008  haad file nfs_iod.c was added on branch haad-dm on 2008-12-13 01:15:28 +0000
 1.3.2.4 11-Mar-2010  yamt sync with head
 1.3.2.3 16-Jul-2009  yamt fix a merge botch.
 1.3.2.2 04-May-2009  yamt sync with head.
 1.3.2.1 14-Mar-2009  yamt file nfs_iod.c was added on branch yamt-nfs-mp on 2009-05-04 08:14:22 +0000
 1.4.26.1 18-May-2014  rmind sync with head
 1.4.22.2 03-Dec-2017  jdolecek update from HEAD
 1.4.22.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.4.12.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.6.6.1 22-Sep-2015  skrll Sync with HEAD
 1.6.4.1 04-Nov-2015  riz Pull up following revision(s) (requested by manu in ticket #882):
sbin/umount/umount.c: revision 1.48
sys/nfs/nfsmount.h: revision 1.53
sys/nfs/nfs_var.h: revision 1.94
sys/nfs/nfs_iod.c: revision 1.7
sys/nfs/nfs_socket.c: revision 1.197
sys/nfs/nfs_bio.c: revision 1.191
sys/nfs/nfs_vfsops.c: revision 1.230
sys/nfs/nfs_clntsocket.c: revision 1.3
Remove useless and harmful sync(2) call in umount(8)
Remove sync(2) call before unmount(2) in umount(8). This sync(2) is useless
since unmount(2) will perform a VFS_SYNC anyway.
But moreover, this sync(2) may be harmful, as there are some situations where
it cannot return (unreachable NFS server, for instance), causing umount -f
to be ineffective.
Fix soft NFS force unmount
For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that await an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.
Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.
Reviewed by Chuck Silvers.
 1.7.18.1 10-Jun-2019  christos Sync with HEAD
 1.7.16.1 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.8.30.1 20-Sep-2024  martin Pull up following revision(s) (requested by rin in ticket #880):

sys/nfs/nfs_iod.c: revision 1.9
sys/nfs/nfs_vfsops.c: revision 1.245
sys/nfs/nfs_clntsubs.c: revision 1.7

PR/57279: Izumi Tsutsui: Fix some {int,long} -> time_t. Still things will
break eventually because parts of the nfs protocol assume time_t will fit
in 32 bits.
 1.32 20-Oct-2021  thorpej Overhaul of the EVFILT_VNODE kevent(2) filter:

- Centralize vnode kevent handling in the VOP_*() wrappers, rather than
forcing each individual file system to deal with it (except VOP_RENAME(),
because VOP_RENAME() is a mess and we currently have 2 different ways
of handling it; at least it's reasonably well-centralized in the "new"
way).
- Add support for NOTE_OPEN, NOTE_CLOSE, NOTE_CLOSE_WRITE, and NOTE_READ,
compatible with the same events in FreeBSD.
- Track which kevent notifications clients are interested in receiving
to avoid doing work for events no one cares about (avoiding, e.g.
taking locks and traversing the klist to send a NOTE_WRITE when
someone is merely watching for a file to be deleted, for example).

In support of the above:

- Add support in vnode_if.sh for specifying PRE- and POST-op handlers,
to be invoked before and after vop_pre() and vop_post(), respectively.
Basic idea from FreeBSD, but implemented differently.
- Add support in vnode_if.sh for specifying CONTEXT fields in the
vop_*_args structures. These context fields are used to convey information
between the file system VOP function and the VOP wrapper, but do not
occupy an argument slot in the VOP_*() call itself. These context fields
are initialized and subsequently interpreted by PRE- and POST-op handlers.
- Version VOP_REMOVE(); it uses a context field for the file system to report
back the resulting link count of the target vnode. Return this in tmpfs,
udf, nfs, chfs, ext2fs, lfs, and ufs.

NetBSD 9.99.92.
 1.31 11-Oct-2021  thorpej Mark the EVFILT_VNODE filters MP-safe.
 1.30 11-Oct-2021  thorpej Setting EV_EOF requires modifying kn->kn_flags. However, that relies on
holding the kq_lock of that note's kq. Rather than exposing this directly,
add new knote_set_eof() and knote_clear_eof() functions that handle the
necessary locking and don't leak as many implementation details to modules.

NetBSD 9.99.91
 1.29 10-Oct-2021  thorpej Must hold kn->kn_kq->kq_lock to modify kn->kn_flags.
 1.28 26-Sep-2021  thorpej Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89
 1.27 05-Sep-2020  riastradh Round of uvm.h cleanup.

The poorly named uvm.h is generally supposed to be for uvm-internal
users only.

- Narrow it to files that actually need it -- mostly files that need
to query whether curlwp is the pagedaemon, which should maybe be
exposed by an external header.

- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
by it. We should split up uvm_extern.h but this will serve for now
to reduce the uvm.h dependencies.

- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
UVMHIST(ubchist), since ubchist is declared in uvm.h but the
reference evaporates if UVMHIST is not defined, so we reduce header
file dependencies.

- Make uvm_device.h and uvm_swap.h independently includable while
here.

ok chs@
 1.26 25-Oct-2017  maya Use C99 initializer for filterops

Mostly done with spatch with touchups for indentation

@@
expression a;
identifier b,c,d;
identifier p;
@@
const struct filterops p =
- { a, b, c, d
+ {
+ .f_isfd = a,
+ .f_attach = b,
+ .f_detach = c,
+ .f_event = d,
};
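
The effect of the semantic patch above can be sketched in plain C. This is a minimal sketch under stated assumptions: filterops_sketch and knote_sketch are hypothetical stand-ins, not the kernel's struct filterops or struct knote, and the function pointer signatures are invented for illustration. Only the four field names come from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's knote; contents are irrelevant here. */
struct knote_sketch {
	int dummy;
};

/* Hypothetical stand-in with the four fields named in the patch. */
struct filterops_sketch {
	int f_isfd;
	int (*f_attach)(struct knote_sketch *);
	void (*f_detach)(struct knote_sketch *);
	int (*f_event)(struct knote_sketch *, long);
};

static void
sketch_detach(struct knote_sketch *kn)
{
	(void)kn;
}

static int
sketch_event(struct knote_sketch *kn, long hint)
{
	(void)kn;
	return hint != 0;
}

/* Before: positional initializer, meaning depends on field order. */
static const struct filterops_sketch positional =
	{ 1, NULL, sketch_detach, sketch_event };

/* After: C99 designated initializer, each value is tied to a field name. */
static const struct filterops_sketch designated = {
	.f_isfd = 1,
	.f_attach = NULL,
	.f_detach = sketch_detach,
	.f_event = sketch_event,
};
```

Both objects hold the same values. The designated form keeps working if fields are later reordered or added, since any field not named is zero-initialized.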
 1.25 24-Oct-2011  hannken branches: 1.25.12;
VOP_GETATTR() needs a shared lock at least.

As nfs_kqpoll() ignores the return value from VOP_GETATTR(), initialize
the attributes to zero -- nfs_kqfilter() does the same.
 1.24 12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.23 19-Nov-2008  ad branches: 1.23.8; 1.23.14;
Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.22 28-Apr-2008  martin branches: 1.22.6; 1.22.8;
Remove clause 3 and 4 from TNF licenses
 1.21 21-Mar-2008  ad branches: 1.21.2; 1.21.4;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
 1.20 12-Feb-2008  yamt branches: 1.20.6;
nfs_kqfilter: fix lock/unlock mismatch. PR/38003 from Geoff C. Wing.
 1.19 05-Feb-2008  ad Lock v_knlist with the vnode interlock. PR kern/37881.
 1.18 02-Jan-2008  yamt use kmem_alloc instead of malloc.
 1.17 05-Dec-2007  pooka branches: 1.17.4;
Do not "return 1" from kqfilter for errors. That value is passed
directly to the userland caller and results in a mysterious EPERM.
Instead, return EINVAL or something else sensible depending on the
case.
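
Why a bare "return 1" surfaces as EPERM can be sketched as follows. This is an illustration only: sketch_kqfilter_old, sketch_kqfilter_new and the sketch_filter enum are hypothetical and do not reproduce nfs_kqfilter(). It assumes the usual errno numbering in which EPERM has the value 1, as on NetBSD and other common Unix-like systems.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical filter identifiers, for illustration only. */
enum sketch_filter {
	SKETCH_READ,
	SKETCH_VNODE,
	SKETCH_UNKNOWN
};

/*
 * Old style: an unsupported filter yields a bare 1. If the value is
 * passed through as an errno, the caller sees EPERM (assumed to be 1).
 */
static int
sketch_kqfilter_old(enum sketch_filter f)
{
	switch (f) {
	case SKETCH_READ:
	case SKETCH_VNODE:
		return 0;
	default:
		return 1;
	}
}

/* New style: return a named errno that describes the actual problem. */
static int
sketch_kqfilter_new(enum sketch_filter f)
{
	switch (f) {
	case SKETCH_READ:
	case SKETCH_VNODE:
		return 0;
	default:
		return EINVAL;
	}
}
```

Returning a named constant makes the failure self-describing at the call site and in userland, instead of relying on whatever errno happens to share the literal value.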
 1.16 26-Nov-2007  pooka branches: 1.16.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.15 09-Jul-2007  ad branches: 1.15.6; 1.15.8; 1.15.14;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.14 29-Apr-2007  yamt use mutex and condvar.
 1.13 09-Nov-2006  yamt branches: 1.13.4; 1.13.8; 1.13.10;
remove some __unused in function parameters.
 1.12 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.11 23-Jul-2006  ad branches: 1.11.4; 1.11.6;
Use the LWP cached credentials where sane.
 1.10 14-May-2006  elad integrate kauth.
 1.9 11-Dec-2005  christos branches: 1.9.4; 1.9.6; 1.9.8; 1.9.10; 1.9.12;
merge ktrace-lwp.
 1.8 26-Feb-2005  perry branches: 1.8.4;
nuke trailing whitespace
 1.7 30-Oct-2003  simonb branches: 1.7.8; 1.7.10;
Remove some assigned-to but otherwise unused variables.
 1.6 29-Jun-2003  fvdl branches: 1.6.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.5 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.4 27-Mar-2003  jdolecek nfs_kqfilter(): add the knote to v_klist only if guaranteed to return success
 1.3 27-Feb-2003  jdolecek fix typo in comment
 1.2 23-Oct-2002  jdolecek branches: 1.2.2;
merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
 1.1 30-Sep-2002  jdolecek branches: 1.1.2;
file nfs_kq.c was initially added on branch kqueue.
 1.1.2.2 02-Oct-2002  jdolecek knote data is now 64bit, g/c obsolete comment
 1.1.2.1 30-Sep-2002  jdolecek add support for kevents to NFS
to detect file changes on server by other NFS clients, polling kernel thread
is used to periodically check for attribute changes of watched files;
the NFS server is only contacted when the vnode expires from local attrcache
(which takes 5-60 seconds currently), to keep network&CPU overhead low

the routine checking for remote changes is quite simplistic, but hopefully
doing its job well enough
 1.2.2.2 11-Nov-2002  nathanw Catch up to -current
 1.2.2.1 23-Oct-2002  nathanw file nfs_kq.c was added on branch nathanw_sa on 2002-11-11 22:16:08 +0000
 1.6.2.5 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.6.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.6.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.6.2.2 03-Aug-2004  skrll Sync with HEAD
 1.6.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.7.10.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.7.8.1 29-Apr-2005  kent sync with -current
 1.8.4.8 24-Mar-2008  yamt sync with head.
 1.8.4.7 27-Feb-2008  yamt sync with head.
 1.8.4.6 11-Feb-2008  yamt sync with head.
 1.8.4.5 21-Jan-2008  yamt sync with head
 1.8.4.4 07-Dec-2007  yamt sync with head
 1.8.4.3 03-Sep-2007  yamt sync with head.
 1.8.4.2 30-Dec-2006  yamt sync with head.
 1.8.4.1 21-Jun-2006  yamt sync with head.
 1.9.12.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.9.10.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.9.8.2 11-Aug-2006  yamt sync with head
 1.9.8.1 24-May-2006  yamt sync with head.
 1.9.6.1 01-Jun-2006  kardel Sync with head.
 1.9.4.1 09-Sep-2006  rpaulo sync with head
 1.11.6.2 10-Dec-2006  yamt sync with head.
 1.11.6.1 22-Oct-2006  yamt sync with head
 1.11.4.1 18-Nov-2006  ad Sync with head.
 1.13.10.1 11-Jul-2007  mjf Sync with head.
 1.13.8.4 08-Jun-2007  ad Sync with head.
 1.13.8.3 13-May-2007  ad - Pass the error number and residual count to biodone(), and let it handle
setting error indicators. Prepare to eliminate B_ERROR.
- Add a flag argument to brelse() to be set into the buf's flags, instead
of doing it directly. Typically used to set B_INVAL.
- Add a "struct cpu_info *" argument to kthread_create(), to be used to
create bound threads. Change "bool mpsafe" to "int flags".
- Allow exit of LWPs in the IDL state when (l != curlwp).
- More locking fixes & conversion to the new API.
 1.13.8.2 10-Apr-2007  ad Nuke the deferred kthread creation stuff, as it's no longer needed.
Pointed out by thorpej@.
 1.13.8.1 09-Apr-2007  ad - Add two new arguments to kthread_create1: pri_t pri, bool mpsafe.
- Fork kthreads off proc0 as new LWPs, not new processes.
 1.13.4.1 07-May-2007  yamt sync with head.
 1.15.14.2 18-Feb-2008  mjf Sync with HEAD.
 1.15.14.1 08-Dec-2007  mjf Sync with HEAD.
 1.15.8.2 23-Mar-2008  matt sync with HEAD
 1.15.8.1 09-Jan-2008  matt sync with HEAD
 1.15.6.2 09-Dec-2007  jmcneill Sync with HEAD.
 1.15.6.1 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.16.2.1 08-Dec-2007  ad Sync with head.
 1.17.4.1 02-Jan-2008  bouyer Sync with HEAD
 1.20.6.3 17-Jan-2009  mjf Sync with HEAD.
 1.20.6.2 02-Jun-2008  mjf Sync with HEAD.
 1.20.6.1 03-Apr-2008  mjf Sync with HEAD.
 1.21.4.5 11-Aug-2010  yamt sync with head.
 1.21.4.4 21-Mar-2010  yamt lock vnode when calling VOP_GETATTR.
 1.21.4.3 04-May-2009  yamt sync with head.
 1.21.4.2 16-May-2008  yamt sync with head.
 1.21.4.1 27-Apr-2008  yamt commit some work-in-progress changes to make nfs client mp-safe to a branch,
so that they won't get lost.
- sprinkle some locking
- mark the filesystem, nfstimer callout, and kq kthread mp-safe
- add assertions and comments
- disable upgrade mount for now
- some unrelated cosmetic changes
 1.21.2.1 18-May-2008  yamt sync with head.
 1.22.8.1 19-Jan-2009  skrll Sync with HEAD.
 1.22.6.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.23.14.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.23.8.1 16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.25.12.1 03-Dec-2017  jdolecek update from HEAD
 1.126 01-May-2020  hannken Resolve delayed truncation from nfs_inactive() too.

Should prevent "locking against self" from nfs_unlock().
 1.125 24-Feb-2020  ad v_interlock -> vmobjlock
 1.124 18-Oct-2019  msaitoh branches: 1.124.2;
s/initalize/initialize/ in comment or printf message.
 1.123 28-May-2018  chs branches: 1.123.2;
add a genfs method to allow a file system to limit the range of pages
that are given to a single GOP_WRITE() call. needed by ZFS.
 1.122 26-May-2017  riastradh branches: 1.122.8;
Eliminate crusty debugging sludge.

We have a mostly sane vnode lifecycle now. If this needs debugging,
it should be done once at the call site of VOP_RECLAIM.
 1.121 26-May-2017  riastradh Make VOP_RECLAIM do the last unlock of the vnode.

VOP_RECLAIM naturally has exclusive access to the vnode, so having it
locked on entry is not strictly necessary -- but it means if there
are any final operations that must be done on the vnode, such as
ffs_update, requiring exclusive access to it, we can now kassert that
the vnode is locked in those operations.

We can't just have the caller release the last lock because some file
systems don't use genfs_lock, and require the vnode to remain valid
for VOP_UNLOCK to work, notably unionfs.
 1.120 11-Apr-2017  riastradh Make VOP_INACTIVE preserve vnode lock on return.

Discussed on tech-kern:
https://mail-index.netbsd.org/tech-kern/2017/04/01/msg021751.html

Ride 7.99.68, a bumpy bus of incremental vfs improvements!
 1.119 20-Aug-2016  hannken branches: 1.119.2;
Remove now obsolete operation vcache_remove().

Welcome to 7.99.36
 1.118 30-May-2014  hannken branches: 1.118.4; 1.118.8;
Change NFS from rbtree to vcache.
 1.117 27-Feb-2014  hannken branches: 1.117.2;
The current implementation of vn_lock() is racy. Modification of
the vnode operations vector for active vnodes is unsafe because it
is not known whether deadfs or the original file system will be
called.

- Pass down LK_RETRY to the lock operation (hint for deadfs only).

- Change deadfs lock operation to return ENOENT if LK_RETRY is unset.

- Change all other lock operations to check for dead vnode once
the vnode is locked and unlock and return ENOENT in this case.

With these changes in place vnode lock operations will never succeed
after vclean() has marked the vnode as VI_XLOCK and before vclean()
has changed the operations vector.

Addresses PR kern/37706 (Forced unmount of file systems is unsafe)

Discussed on tech-kern.

Welcome to 6.99.33
 1.116 12-Jun-2011  rmind branches: 1.116.2; 1.116.12; 1.116.16;
Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.115 19-May-2011  rmind branches: 1.115.2;
Remove cache_purge(9) calls from reclamation routines in the file systems,
as vclean(9) performs it for us since Lite2 merge.
 1.114 24-Sep-2010  rmind branches: 1.114.2;
Fixes/improvements to RB-tree implementation:
1. Fix inverted node order, so that negative value from comparison operator
would represent lower (left) node, and positive - higher (right) node.
2. Add an argument (i.e. "context"), passed to comparison operators.
3. Change rb_tree_insert_node() to return a node - either inserted one or
already existing one.
4. Amend the interface to manipulate the actual object, instead of the
rb_node (in a similar way as Patricia-tree interface does).
5. Update all RB-tree users accordingly.

XXX: Perhaps rename rb.h to rbtree.h, since cleaning-up..

1-3 address the PR/43488 by Jeremy Huddleston.

Passes RB-tree regression tests.
Reviewed by: matt@, christos@
 1.113 21-Jul-2010  hannken Make holding v_interlock mandatory for callers of vget().

Announced some time ago on tech-kern.
 1.112 01-Jul-2010  hannken Remove vlockmgr(). Generic vnode lock operations now use a rwlock located
in the vnode. All LK_* flags move from sys/lock.h to sys/vnode.h. Calls
to vlockmgr() in file systems get replaced with VOP_LOCK() or VOP_UNLOCK().

Welcome to 5.99.34.

Discussed on tech-kern.
 1.111 24-Jun-2010  hannken Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
 1.110 15-Mar-2009  cegger branches: 1.110.2; 1.110.4;
ansify function definitions
 1.109 14-Mar-2009  dsl Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
 1.108 02-Jan-2009  ad branches: 1.108.2;
- Don't vput() a vnode that we do not hold locked.
- Eliminate one of the few remaining uses of LK_CANRECURSE.
 1.107 19-Nov-2008  ad Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.106 22-Oct-2008  matt branches: 1.106.2; 1.106.4;
Don't need nfs_vfs_reinit anymore since we don't resize tables anymore.
Move reinit code to init case.
 1.105 22-Oct-2008  matt Change NFS to use a RB-tree for its FH->nfsnode lookups.
 1.104 30-Sep-2008  pooka Initialize nfsnode pools and malloc type dynamically in the
constructor instead of depending on link sets. Consequently, rename
nfs_nh{init,reinit,done} to nfs_node_{init,reinit,done}, respectively,
to better convey the function.
 1.103 24-May-2008  tron branches: 1.103.4;
Make sure that we flush the NFS directory cache in case of an NFS mount
using the translate cookie option during unmount. This fixes PR kern/38100.
Patch suggested by Michael van Elst during Hackathon 11.
 1.102 05-May-2008  ad branches: 1.102.2;
- Convert hashinit() to use kmem_alloc(). The hash tables can be large
and it's better to not have them in kmem_map.
- Convert a couple of minor items along the way to kmem_alloc().
- Fix some memory leaks.
 1.101 30-Jan-2008  ad branches: 1.101.6; 1.101.8; 1.101.10;
Replace struct lock on vnodes with a simpler lock object built on
krwlock_t. This is a step towards removing lockmgr and simplifying
vnode locking. Discussed on tech-kern.
 1.100 26-Jan-2008  ad - Make nfsnode hash MPSAFE.
- Replace use of lockmgr().
 1.99 17-Jan-2008  ad Correct test of v_usecount.
 1.98 02-Jan-2008  yamt use kmem_alloc instead of malloc.
 1.97 02-Jan-2008  ad Merge vmlocking2 to head.
 1.96 26-Nov-2007  pooka branches: 1.96.2; 1.96.6;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.95 06-Aug-2007  yamt branches: 1.95.2; 1.95.8; 1.95.10;
nfs_inactive: turn a panic into a printf for now, as it isn't critical.
PR/36572 from Martin Husemann.
 1.94 12-Jun-2007  yamt branches: 1.94.2; 1.94.6;
nfs_inactive: don't clear NTRUNCDELAYED erroneously.
(fix cache consistency problems like NUL bytes near EOF.)
 1.93 12-Mar-2007  ad branches: 1.93.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.
 1.92 21-Feb-2007  thorpej branches: 1.92.4;
Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.91 20-Feb-2007  ad Call genfs_node_destroy() where appropriate.
 1.90 15-Feb-2007  yamt branches: 1.90.2;
use mutex and rwlock rather than lockmgr.
 1.89 28-Dec-2006  yamt remove several nqnfs definitions.
 1.88 27-Dec-2006  yamt remove nqnfs.
 1.87 09-Nov-2006  yamt remove some __unused in function parameters.
 1.86 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.85 23-Jul-2006  ad branches: 1.85.4; 1.85.6;
Use the LWP cached credentials where sane.
 1.84 14-May-2006  elad integrate kauth.
 1.83 30-Mar-2006  yamt some cleanups after the introduction of GOP_SIZE_MEM flag.
- remove GOP_SIZE_READ/GOP_SIZE_WRITE flags.
they have not been used since the change.
- ufs_balloc_range: remove code which has been no-op since the change.
thanks Konrad Schroder for explaining the original intention of the code.
- ffs_gop_size: don't extend past eof, in the case of GOP_SIZE_MEM.
otherwise genfs_getpages ends up allocating pages past eof unnecessarily.
 1.82 02-Jan-2006  yamt branches: 1.82.2; 1.82.4; 1.82.6; 1.82.8; 1.82.10;
nfs_inactive:
- use LK_CANRECURSE instead of LK_RECURSEFAIL.
PR/32435 from Valeriy E. Ushakov.
- panic explicitly if the parent directory has been revoked.
add an XXX comment.
 1.81 11-Dec-2005  christos branches: 1.81.2;
merge ktrace-lwp.
 1.80 28-Jun-2005  yamt branches: 1.80.2;
- constify genfs_ops.
- use member designators.
 1.79 26-Feb-2005  perry branches: 1.79.2;
nuke trailing whitespace
 1.78 27-Jan-2005  yamt keep directory eof cache when inactivating vnode
because there's no reason to throw it away.
(fix an unintended side effect of nfs_subs.c rev.1.144.)
 1.77 25-Apr-2004  simonb branches: 1.77.4; 1.77.6;
Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.
 1.76 20-Apr-2004  yamt nfs_inactive: inactivate the vp before doing the sillyrename work.
vp can be reclaimed soon after it's unlocked.
 1.75 05-Apr-2004  yamt nfs_readdirplusrpc: fix a deadlock problem.
don't wait for vnode lock to load attributes.
otherwise, because READDIRPLUS returns DOTDOT entry as well,
we violate locking order.
 1.74 05-Apr-2004  yamt don't issue VOP_GETATTR blindly in nfs_nget().
in many cases, GETATTR RPCs here are redundant because the caller has
postop_attr. instead, make sure the resulting vnode has a valid
attribute in nfs_lookup().
 1.73 12-Mar-2004  yamt branches: 1.73.2;
shrink sizeof struct nfsnode by putting exclusive members into union.
 1.72 23-Jan-2004  wrstuden Adjust sillyrename cleanup code to deal with the parent vnode
already being locked by our thread. VOP_INACTIVATE() makes no
statement as to the lock state of the parent, yet this code assumed
we had it unlocked.

With this change, we let vn_lock() fail with EDEADLK if we already
have the parent locked. We then handle the rename cleanup, and on
the way out just vrele() the parent vnode, not vput() it.

Fixes a case seen by Steve Woodford at Wasabisystems dot com where
we'd panic while running a pkgsrc configure test that verified
fork() functionality. I expect the problem is a result of the recent
exit() changes and the performance of the machines he tested on.

Specifically we would crash during an nfs_remove(). As best I can
tell, when nfs_remove() tested to see if we should rename or we
should remove, v_usecount was > 1 and vattr.va_nlink was 1. Thus
we did the sillyrename in nfs_remove(). However by the time we got
down to the vput(vp), v_usecount had dropped to one and thus vput()
triggered the VOP_INACTIVATE() code path. nfs_inactive() tries to
lock the parent to undo the sillyrename, and deadlocks as we still
have it locked.
 1.71 07-Dec-2003  fvdl Unix semantics dictate that access checks for files are done when it
is opened. An open file can always be read from and/or written to,
depending on how it was opened.

Therefore, the read/write/commit RPCs should never return EACCES,
as they are only performed on files that have been successfully opened
already.

This change improves the current situation and works in most cases.
It simply always uses the most recently known owner/group of the file,
iff the authentication mechanism is AUTH_UNIX (in other cases, the
creds for a successful open are used, but note that no other cases
are currently implemented).

A retry mechanism can be used to catch a few more cases, but this is
a good improvement for now.
 1.70 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.69 30-Jul-2003  yamt vrecycle removed nfs vnodes.
not perfect, but enough for most cases.
 1.68 29-Jun-2003  fvdl branches: 1.68.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.67 29-Jun-2003  thorpej Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.66 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.65 22-May-2003  yamt avoid double free with xlatecookie.
 1.64 21-May-2003  yamt remove local definitions of TRUE and FALSE.
 1.63 07-May-2003  yamt use hashdone to free hashinit'ed memory.
 1.62 02-Apr-2003  yamt use queue manipulation macros.
 1.61 17-Feb-2003  perseant Add code to UBCify LFS. This is still behind "#ifdef LFS_UBC" for now
(there are still some details to work out) but expect that to go
away soon. To support these basic changes (creation of lfs_putpages,
lfs_gop_write, mods to lfs_balloc) several other changes were made, to
wit:

* Create a writer daemon kernel thread whose purpose is to handle page
writes for the pagedaemon, but which also takes over some of the
functions of lfs_check(). This thread is started the first time an
LFS is mounted.

* Add a "flags" parameter to GOP_SIZE. Current values are
GOP_SIZE_READ, meaning that the call should return the size of the
in-core version of the file, and GOP_SIZE_WRITE, meaning that it
should return the on-disk size. One of GOP_SIZE_READ or
GOP_SIZE_WRITE must be specified.

* Instead of using malloc(...M_WAITOK) for everything, reserve enough
resources to get by and use malloc(...M_NOWAIT), using the reserves if
necessary. Use the pool subsystem for structures small enough that
this is feasible. This also obsoletes LFS_THROTTLE.

And a few that are not strictly necessary:

* Moves the LFS inode extensions off onto a separately allocated
structure; getting closer to LFS as an LKM. "Welcome to 1.6O."

* Unified GOP_ALLOC between FFS and LFS.

* Update LFS copyright headers to correct values.

* Actually cast to unsigned in lfs_shellsort, like the comment says.

* Keep track of which segments were empty before the previous
checkpoint; any segments that pass two checkpoints both dirty and
empty can be summarily cleaned. Do this. Right now lfs_segclean
still works, but this should be turned into an effectless
compatibility syscall.
 1.60 15-Feb-2003  drochner Don't remove the nfsnode from the hash chain in nfs_inactive.
It will never get back... it will not be found in nfs_nget, a new
nfsnode+vnode is allocated instead, which causes a node leak, and
also makes the mountpointness of the vnode to be forgotten, breaking
filesystem crossing lookups through this vnode.
 1.59 12-Feb-2003  fvdl Move purging the dircache and removing a vnode from the nqnfs timer queue
into nfs_inactive, this is a better place for it.

This doesn't actually solve the actual problem, which appears to be a race
condition with unmounting and vnode recycling somewhere, but it fixes
it in the sense that nfs_reclaim will not reference a bad v_mount anymore.
 1.58 10-Feb-2003  christos move the MALLOC decl for DIROFFS to nfs_subs.c
 1.57 01-Feb-2003  thorpej Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.
 1.56 01-Dec-2002  matt Make sure these all agree on the same definitions of various variables.
 1.55 01-Oct-2002  christos back out previous. the problem is not that v_mount is null, but that it appears
corrupted.
 1.54 30-Sep-2002  christos deal with v_mount == NULL in nfs_reclaim(). We should not be touching this
anyway, but nq-nfs wants us to.
 1.53 16-Mar-2002  chs branches: 1.53.6;
make sure that if NMODIFIED is clear, all pages attached to the vnode are
clean and without writable mappings. if we try to flush dirty pages past
EOF to the server when NMODIFIED is clear, we'll update the attrcache before
doing the write, which will try to free the pages past EOF and deadlock.
to deal with this, we write-protect pages before we send them to the server,
and restrict ourselves to creating read-only mappings if NMODIFIED isn't set.
score another one for enami.
 1.52 08-Mar-2002  thorpej Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.
 1.51 26-Jan-2002  chs re-enable NFSv3 commit RPCs by abandoning my new approach in favor of
frank's scheme, with one new twist: don't wait until we've totally run
out of free pages before committing, but instead notice when we've built
up a largish range of uncommitted pages and commit only the older half of
the range, which is likely to already be on disk on the server.
 1.50 21-Jan-2002  fvdl VOP_UNLOCK + vgone --> vput, since the vnode will already have
a reference.
 1.49 18-Jan-2002  fvdl Unlock vnode before calling vgone() in case of getattr failure during
nfs_nget. Fixes problem reported by Chuck Cranor.
 1.48 06-Dec-2001  lukem Replace nfs_hash() (with its extremely bad hash) with a macro to call
hash32_buf() to obtain a 32 bit hash. On some tests I ran I obtained
a 30x improvement in hash distribution and a 6x reduction in time spent
in nfs_nget().
 1.47 10-Nov-2001  lukem add RCSIDs
 1.46 15-Sep-2001  chs branches: 1.46.2;
a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
 1.45 15-Sep-2001  chs add a new VFS op, vfs_reinit, which is called when desiredvnodes is
adjusted via sysctl. file systems that have hash tables which are
sized based on the value of this variable now resize those hash tables
using the new value. the max number of FFS softdeps is also recalculated.

convert various file systems to use the <sys/queue.h> macros for
their hash tables.
 1.44 03-May-2001  fvdl branches: 1.44.2; 1.44.4;
Drop vnode lock before removing the sillyrename file, to avoid a
lock-o-death.
 1.43 20-Apr-2001  fvdl On VOP_GETATTR failure in nfs_nget, call vgone() to get rid
of the vnode that was just created. Suggested by Enami.
 1.42 20-Apr-2001  fvdl Unlock the hash lock before returning an error in nfs_nget.
From IWAMOTO Toshihiro.
 1.41 07-Feb-2001  tsutsui branches: 1.41.2;
Fix nested extern declaration of prtactive.
 1.40 06-Feb-2001  fvdl In nfs_inactive there's no need anymore for an extra refcount around
nfs_vinvalbuf, since it has a real lock on the vnode now, so getnewvnode
will not hijack it.
 1.39 06-Feb-2001  fvdl Do actual vnode locking for NFS.
 1.38 27-Nov-2000  chs Initial integration of the Unified Buffer Cache project.
 1.37 08-Nov-2000  ad Update for hashinit() change.
 1.36 19-Sep-2000  fvdl Initialize the lock needed to serialize commits for one NFS node.
 1.35 19-Sep-2000  bjh21 New kernel option, NFS_V2_ONLY, which aims to reduce the NFS client to just
that required to support NFSv2 mounts. Not finished yet, but already
provides some 44k of saving in code size on arm26. More savings, and some
documentation, are still to come.
 1.34 03-Aug-2000  thorpej MALLOC()/FREE() are not to be used for variable size allocations.
 1.33 30-Mar-2000  augustss branches: 1.33.4;
Remove register declarations.
 1.32 30-Mar-2000  simonb Delete redundant decl of nfsv2_vnodeop_p, it's in <nfs/nfsnode.h>.
 1.31 16-Mar-2000  jdolecek Add new VFS op routine - vfs_done and call it on filesystem detach
in vfs_detach(). vfs_done may free global filesystem's resources,
typically those allocated in respective filesystem's init function.
Needed so those filesystems which went in via LKM have a chance to
clean after themselves before unloading. This fixes random panics
when LKM for filesystem using pools was loaded and unloaded several
times.

For each leaf filesystem, add appropriate vfs_done routine.
 1.30 29-Nov-1999  fvdl Insert an extra VOP_ACCESS check in nfs_lookup, to avoid cached access
mishaps for lookup and getattr. Closes PR 8884.

While at it, cache access RPCs.
 1.29 08-Jul-1999  wrstuden branches: 1.29.2; 1.29.8;
Modify file systems to deal with struct lock in struct vnode. All leaf
fs's other than nfs use genfs_lock() for locking.

Modify lookup routines to set PDIRUNLOCK when they unlock the parent.
 1.28 01-Sep-1998  thorpej branches: 1.28.2; 1.28.6; 1.28.8;
Use the pool allocator and the "nointr" pool page allocator for NFS nodes
and vattr structures.
 1.27 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.26 01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.25 07-Feb-1998  chs add flags arg to hashinit(), to pass to malloc().
 1.24 19-Oct-1997  fvdl * Implement optional 32 <-> 64 bit directory cookie translation. This uses
the directory cache as translation table. See nfs_subs.c for comments.
Makes the code a bit more complex to look at than I would have liked,
but doesn't affect the speed of the default behavior.
* Optimize caching behavior a bit when buffers are invalidated.
* Save some RPCs in readdir operations by not bothering if there is
a small amount left to do to fill the buffer. It'll be done in the
next RPC with a larger chunk anyway. Wastes a bit of buffer space
but is faster.
* Make n_vattr an allocated vattr struct. This avoids nfsnode bloat,
and is friendlier to the malloc routines.
 1.23 10-Oct-1997  fvdl * New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.22 07-Jul-1997  fvdl branches: 1.22.2;
Do locking around nfsnode hashing (perhaps even right this time!)
 1.21 07-Jul-1997  fvdl Revert until I have time to fix it today (lock applied wrongly).
 1.20 06-Jul-1997  fvdl Put lock around nfs node hashing to avoid race conditions, as MALLOC
or getnewvnode may block.
 1.19 22-Feb-1997  fvdl Fixes from BSDI (thanks go to Keith Bostic). Original RCS messages:

date: 1996/09/06 03:00:31; author: donn; state: Exp; lines: +1 -2
Because NFS doesn't implement vnode locking, nfs_inactive() doesn't really
have the vnode locked and hence it can't reliably access the vnode after
it performs a blocking operation. We remove one blocking call and push
the no-op VOP_UNLOCK higher so that we don't access the vnode after we
delete the sillyrename file. This should prevent crashes we've seen in
which the vnode turned into a UFS vnode and caused a panic in ufs_unlock()
when we tried to 'unlock' it.

date: 1996/09/25 19:15:21; author: cp; state: Exp; lines: +4 -0
Kirk's change to not corrupt files after a delete.

date: 1996/11/08 19:53:45; author: donn; state: Exp; lines: +16 -4
Kirk's change to solve the paradox that vclean() calls nfs_inactive()
with VXLOCK set on the vnode, and nfs_inactive() was calling vget()
to get a reference on the vnode, which in turn hung on VXLOCK.
Nfs_inactive() now checks v_usecount to make sure that the vnode
is not coming from vclean() before it does a vget().
 1.18 12-Feb-1997  fvdl Don't set sillyrename field to 0 for directories, as it's in a union with
the head of the cookie list. Fixes PR 3215, fix supplied by
Hiroshi Tezuka <tezuka@trc.rwcp.or.jp>. Should also fix M_NFSDIROFF
memory leak.
 1.17 01-Sep-1996  mycroft branches: 1.17.4;
Add a set of generic file system operations that most file systems use.
Also, fix some time stamp bogosities.
 1.16 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.15 09-Feb-1996  christos nfs prototype changes
 1.14 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.13 18-Aug-1994  mycroft More LIST/CIRCLEQ migration.
 1.12 29-Jun-1994  cgd branches: 1.12.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.11 13-Jun-1994  mycroft Undo last change.
 1.10 13-Jun-1994  gwr Fix unresolved: prtactive
 1.9 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.8 25-Apr-1994  cgd some prototype cleanup, eliminate/replace bogus types (e.g. quad and
u_quad) -> use better types (e.g. quad_t & u_quad_t in inodes),
some cleanup.
 1.7 21-Apr-1994  cgd blow away all vestiges of nfsnode locking.
(1) it's unnecessary
(2) it causes machines to hang (yup!)
(3) it'd be gone in a few days anyway (it'd been yanked out
of 4.4-Lite by macklem long ago)
It was only there because macklem couldn't originally decide if things
should be locked, or not...
 1.6 01-Mar-1994  pk Enable nfs_lock(); useful when IO_APPEND'ing.
 1.5 15-Feb-1994  pk Update {a,m}time vnode attributes on special files a la ufs_vnode.c,
but make it a non-urgent operation, to leave us some performance.
 1.4 18-Dec-1993  mycroft Canonicalize all #includes.
 1.3 28-Jul-1993  cgd branches: 1.3.2;
incorporate changes from 0-9-base to 0-9-ALPHA
 1.2 20-May-1993  cgd branches: 1.2.2;
more rcs id adding and header cleanup. i like vi macros!
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.3 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.2.2.1 24-Jul-1993  cgd clean the nfsnode's lockf field after getting a new vnode;
this probably explains some strange NFS-related lockf crashes on pain,
and UFS does it, so it can't hurt.
 1.3.2.1 14-Nov-1993  mycroft Canonicalize all #includes.
 1.12.2.1 19-Aug-1994  mycroft update from trunk
 1.17.4.1 12-Mar-1997  is Merge in changes from Trunk
 1.22.2.1 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.28.8.2 02-Aug-1999  thorpej Update from trunk.
 1.28.8.1 07-Jun-1999  chs merge everything from chs-ubc branch.
 1.28.6.1 05-Jan-2000  he Pull up revision 1.30 (requested by fvdl):
Insert an extra VOP_ACCESS check in nfs_lookup, preventing cached
access mishaps for lookup and getattr. Fixes PR#8884.
 1.28.2.1 30-May-1999  chs there's a new rule that all vnodes must call uvm_vnp_setsize()
before anyone can possibly access them, so do this in nfs_nget().
 1.29.8.1 27-Dec-1999  wrstuden Pull up to last week's -current.
 1.29.2.6 23-Apr-2001  bouyer Sync with HEAD.
 1.29.2.5 21-Apr-2001  bouyer Sync with HEAD
 1.29.2.4 11-Feb-2001  bouyer Sync with HEAD.
 1.29.2.3 08-Dec-2000  bouyer Sync with HEAD.
 1.29.2.2 22-Nov-2000  bouyer Sync with HEAD.
 1.29.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.33.4.1 14-Dec-2000  he Pull up revision 1.36 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.41.2.13 11-Dec-2002  thorpej Sync with HEAD.
 1.41.2.12 18-Oct-2002  nathanw Catch up to -current.
 1.41.2.11 15-Jul-2002  nathanw Whitespace.
 1.41.2.10 12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.41.2.9 24-Jun-2002  nathanw Curproc->curlwp renaming.

Change uses of "curproc->l_proc" back to "curproc", which is more like the
original use. Bare uses of "curproc" are now "curlwp".

"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc : NULL)
so that it is always safe to reference curproc (*de*referencing curproc
is another story, but that's always been true).
 1.41.2.8 01-Apr-2002  nathanw Catch up to -current.
(CVS: It's not just a program. It's an adventure!)
 1.41.2.7 28-Feb-2002  nathanw Catch up to -current.
 1.41.2.6 08-Jan-2002  nathanw Catch up to -current.
 1.41.2.5 14-Nov-2001  nathanw Catch up to -current.
 1.41.2.4 25-Sep-2001  nathanw Fix typo in previous commit.
 1.41.2.3 21-Sep-2001  nathanw Catch up to -current.
 1.41.2.2 21-Jun-2001  nathanw Catch up to -current.
 1.41.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.44.4.1 01-Oct-2001  fvdl Catch up with -current.
 1.44.2.5 10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.44.2.4 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.44.2.3 16-Mar-2002  jdolecek Catch up with -current.
 1.44.2.2 11-Feb-2002  jdolecek Sync w/ -current.
 1.44.2.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.46.2.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.53.6.1 28-Jul-2003  he Apply patch (requested by christos in ticket #1171):
Apply a stopgap fix preventing a panic for non-NQNFS when
nfs_reclaim is called on a vnode of an unmounted NFS file
system.
 1.68.2.8 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.68.2.7 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.68.2.6 04-Feb-2005  skrll Sync with HEAD.
 1.68.2.5 21-Sep-2004  skrll Fix the sync with head I botched.
 1.68.2.4 18-Sep-2004  skrll Sync with HEAD.
 1.68.2.3 24-Aug-2004  skrll Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.68.2.2 03-Aug-2004  skrll Sync with HEAD
 1.68.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.73.2.2 10-Jul-2004  tron Pull up revision 1.75 (requested by tls in ticket #634):
nfs_readdirplusrpc: fix a deadlock problem.
don't wait for vnode lock to load attributes.
otherwise, because READDIRPLUS returns DOTDOT entry as well,
we violate locking order.
 1.73.2.1 10-Jul-2004  tron Pull up revision 1.74 (requested by tls in ticket #634):
don't issue VOP_GETATTR blindly in nfs_nget().
in many cases, GETATTR RPCs here are redundant because the caller has
postop_attr. instead, make sure the resulting vnode has a valid
attribute in nfs_lookup().
 1.77.6.2 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.77.6.1 12-Feb-2005  yamt sync with head.
 1.77.4.1 29-Apr-2005  kent sync with -current
 1.79.2.1 24-Aug-2005  riz Pull up following revision(s) (requested by yamt in ticket #688):
sys/miscfs/genfs/genfs_vnops.c: revision 1.98 via patch
sys/ufs/ffs/ffs_vfsops.c: revision 1.165
sys/ufs/lfs/lfs_extern.h: revision 1.69
sys/fs/filecorefs/filecore_vfsops.c: revision 1.20
sys/nfs/nfs_node.c: revision 1.80
sys/fs/smbfs/smbfs_node.c: revision 1.24
sys/fs/cd9660/cd9660_vfsops.c: revision 1.24
sys/fs/msdosfs/msdosfs_denode.c: revision 1.8
sys/miscfs/genfs/genfs_node.h: revision 1.6
sys/ufs/lfs/lfs_vfsops.c: revision 1.183
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.86
sys/fs/adosfs/advfsops.c: revision 1.23
sys/fs/ntfs/ntfs_vfsops.c: revision 1.31
- constify genfs_ops.
- use member designators.

sys/miscfs/genfs/genfs_vnops.c: revision 1.99 via patch
genfs_getpages: don't forget to put the vnode onto the syncer's work queue
even in the case of PGO_LOCKED.

sys/uvm/uvm_bio.c: revision 1.40
sys/uvm/uvm_pager.h: revision 1.29
sys/miscfs/genfs/genfs_vnops.c: revision 1.100 via patch
sys/ufs/ufs/ufs_inode.c: revision 1.50
- introduce PGO_NOBLOCKALLOC and use it for ubc mapping
to prevent unnecessary block allocations in the case that
page size > block size.
- ufs_balloc_range: use VM_PROT_WRITE+PGO_NOBLOCKALLOC rather than
VM_PROT_READ.

sys/uvm/uvm_fault.c: revision 1.96
sys/miscfs/genfs/genfs_vnops.c: revision 1.101 via patch
sys/uvm/uvm_object.h: revision 1.19
sys/miscfs/genfs/genfs_node.h: revision 1.7
ensure that vnodes with dirty pages are always on syncer's queue.
- genfs_putpages: wait for i/o completion of PG_RELEASED/PG_PAGEOUT pages by
setting "wasclean" false when encountering them.
suggested by Stephan Uphoff in PR/24596 (1).
- genfs_putpages: write protect pages when cleaning out, if
we're going to take the vnode off the syncer's queue.
uvm_fault: don't write-map pages unless its vnode is already on
the syncer's queue.
fix PR/24596 (3) but in the different way from the suggested fix.
(to keep our current behaviour, ie. not to require explicit msync.
discussed on tech-kern@.)
- genfs_putpages: don't mistakenly take a vnode off the queue
by introducing a generation number in genfs_node.
genfs_getpages: increment the generation number.
suggested by Stephan Uphoff in PR/24596 (2).
- add some assertions.

sys/miscfs/genfs/genfs_vnops.c: revision 1.102 via patch
genfs_putpages: don't bother to clean the vnode unless VONWORKLST.

sys/ufs/ffs/ffs_vnops.c: revision 1.71
ffs_full_fsync: because VBLK/VCHR can be mmap'ed,
do VOP_PUTPAGES for them as well.

sys/uvm/uvm_fault.c: revision 1.97
uvm_fault: check a correct object in the case of layered filesystems.
fix PR/30811 from Jukka Salmi.

sys/uvm/uvm_object.h: revision 1.20
sys/ufs/ffs/ffs_vfsops.c: revision 1.167
sys/uvm/uvm_bio.c: revision 1.41
sys/ufs/ufs/ufs_vnops.c: revision 1.129
sys/uvm/uvm_mmap.c: revision 1.92
sys/uvm/uvm_fault.c: revision 1.98
sys/kern/vfs_subr.c: revision 1.252
sys/fs/msdosfs/denode.h: revision 1.5
sys/miscfs/genfs/genfs_vnops.c: revision 1.103 via patch
sys/fs/msdosfs/msdosfs_denode.c: revision 1.9
sys/sys/vnode.h: revision 1.141
sys/ufs/ufs/ufs_inode.c: revision 1.51
sys/ufs/ufs/ufs_extern.h: revision 1.45 via patch
sys/miscfs/genfs/genfs_node.h: revision 1.8
sys/ufs/lfs/lfs_vfsops.c: revision 1.184
sys/uvm/uvm_pager.h: revision 1.30
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.87
update file timestamps for nfsd loaned-read and mmap.
PR/25279. discussed on tech-kern@.

sys/miscfs/genfs/genfs_vnops.c: revision 1.104 via patch
don't write-protect wired pages. pointed by Chuck Silvers.
for now, leave a vnode on the syncer's queue, as suggested by him.

sys/ufs/ffs/ffs_vnops.c: revision 1.72
revert VCHR part of ffs_vnops.c 1.71.
as VCHR uses the device pager, no point to call VOP_PUTPAGES here.
pointed by Chuck Silvers.
 1.80.2.7 04-Feb-2008  yamt sync with head.
 1.80.2.6 21-Jan-2008  yamt sync with head
 1.80.2.5 07-Dec-2007  yamt sync with head
 1.80.2.4 03-Sep-2007  yamt sync with head.
 1.80.2.3 26-Feb-2007  yamt sync with head.
 1.80.2.2 30-Dec-2006  yamt sync with head.
 1.80.2.1 21-Jun-2006  yamt sync with head.
 1.81.2.1 15-Jan-2006  yamt sync with head.
 1.82.10.2 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.82.10.1 31-Mar-2006  tron Merge 2006-03-31 NetBSD-current into the "peter-altq" branch.
 1.82.8.3 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.82.8.2 19-Apr-2006  elad sync with head.
 1.82.8.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.82.6.3 11-Aug-2006  yamt sync with head
 1.82.6.2 24-May-2006  yamt sync with head.
 1.82.6.1 01-Apr-2006  yamt sync with head.
 1.82.4.2 01-Jun-2006  kardel Sync with head.
 1.82.4.1 22-Apr-2006  simonb Sync with head.
 1.82.2.1 09-Sep-2006  rpaulo sync with head
 1.85.6.2 10-Dec-2006  yamt sync with head.
 1.85.6.1 22-Oct-2006  yamt sync with head
 1.85.4.2 12-Jan-2007  ad Sync with head.
 1.85.4.1 18-Nov-2006  ad Sync with head.
 1.90.2.2 24-Mar-2007  yamt sync with head.
 1.90.2.1 28-Feb-2007  yamt sync with head. (somehow missed in the previous)
 1.92.4.4 16-Sep-2007  ad Checkpoint work in progress on the vnode lifecycle and reference counting
stuff. This makes it work properly without kernel_lock and fixes a few
quite old bugs. See vfs_subr.c 1.283.2.17 for details.
 1.92.4.3 20-Aug-2007  ad Sync with HEAD.
 1.92.4.2 15-Jul-2007  ad Sync with head.
 1.92.4.1 13-Mar-2007  ad Sync with head.
 1.93.2.1 11-Jul-2007  mjf Sync with head.
 1.94.6.2 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.94.6.1 09-Aug-2007  jmcneill Sync with HEAD.
 1.94.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.95.10.2 06-Aug-2007  yamt nfs_inactive: turn a panic into a printf for now, as it isn't critical.
PR/36572 from Martin Husemann.
 1.95.10.1 06-Aug-2007  yamt file nfs_node.c was added on branch matt-mips64 on 2007-08-06 11:55:09 +0000
 1.95.8.2 18-Feb-2008  mjf Sync with HEAD.
 1.95.8.1 08-Dec-2007  mjf Sync with HEAD.
 1.95.2.2 23-Mar-2008  matt sync with HEAD
 1.95.2.1 09-Jan-2008  matt sync with HEAD
 1.96.6.2 19-Jan-2008  bouyer Sync with HEAD
 1.96.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.96.2.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.101.10.6 10-Oct-2010  yamt some locking changes
 1.101.10.5 09-Oct-2010  yamt sync with head
 1.101.10.4 11-Aug-2010  yamt sync with head.
 1.101.10.3 04-May-2009  yamt sync with head.
 1.101.10.2 16-May-2008  yamt sync with head.
 1.101.10.1 27-Apr-2008  yamt commit some work-in-progress changes to make nfs client mp-safe to a branch,
so that they won't get lost.
- sprinkle some locking
- mark the filesystem, nfstimer callout, and kq kthread mp-safe
- add assertions and comments
- disable upgrade mount for now
- some unrelated cosmetic changes
 1.101.8.2 04-Jun-2008  yamt sync with head
 1.101.8.1 18-May-2008  yamt sync with head.
 1.101.6.3 17-Jan-2009  mjf Sync with HEAD.
 1.101.6.2 05-Oct-2008  mjf Sync with HEAD.
 1.101.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.102.2.2 10-Oct-2008  skrll Sync with HEAD.
 1.102.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.103.4.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.103.4.1 19-Oct-2008  haad Sync with HEAD.
 1.106.4.1 02-Feb-2009  snj Pull up following revision(s) (requested by ad in ticket #344):
sys/nfs/nfs_node.c: revision 1.108
sys/nfs/nfsnode.h: revision 1.69
- Don't vput() a vnode that we do not hold locked.
- Eliminate one of the few remaining uses of LK_CANRECURSE.
 1.106.2.2 28-Apr-2009  skrll Sync with HEAD.
 1.106.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.108.2.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.110.4.6 31-May-2011  rmind sync with head
 1.110.4.5 22-May-2011  rmind nfs_gop_write: acquire the lock for pmap_page_protect() operation.
 1.110.4.4 19-May-2011  rmind Implement sharing of vnode_t::v_interlock amongst vnodes:
- Lock is shared amongst UVM objects using uvm_obj_setlock() or getnewvnode().
- Adjust vnode cache to handle unsharing, add VI_LOCKSHARE flag for that.
- Use sharing in tmpfs and layerfs for underlying object.
- Simplify locking in ubc_fault().
- Sprinkle some asserts.

Discussed with ad@.
 1.110.4.3 05-Mar-2011  rmind sync with head
 1.110.4.2 03-Jul-2010  rmind sync with head
 1.110.4.1 16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.110.2.2 22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.110.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.114.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.115.2.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.116.16.1 18-May-2014  rmind sync with head
 1.116.12.2 03-Dec-2017  jdolecek update from HEAD
 1.116.12.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.116.2.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.117.2.1 10-Aug-2014  tls Rebase.
 1.118.8.1 26-Apr-2017  pgoyette Sync with HEAD
 1.118.4.2 28-Aug-2017  skrll Sync with HEAD
 1.118.4.1 05-Oct-2016  skrll Sync with HEAD
 1.119.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.122.8.1 25-Jun-2018  pgoyette Sync with HEAD
 1.123.2.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.123.2.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.124.2.1 29-Feb-2020  ad Sync with head.
 1.75 27-Dec-2006  yamt remove nqnfs.
 1.74 09-Nov-2006  yamt remove some __unused in function parameters.
 1.73 17-Oct-2006  dogcow now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.
 1.72 13-Oct-2006  christos more __unused
 1.71 13-Oct-2006  dogcow more unused variable fallout.
 1.70 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.69 02-Sep-2006  yamt branches: 1.69.2; 1.69.4;
nfsd: deal with variable-sized filehandles.
 1.68 02-Sep-2006  yamt #ifdef out nqsrv_getlease and friends unless defined(NFSSERVER).
 1.67 13-Jul-2006  martin Fix alignment problems for fhandle_t, exposed by gcc4.1.

While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ,
version the getfh(2) syscall and explicitly pass the size available in
the filehandle from userland.

Discussed on tech-kern, with lots of help from yamt (thanks!).
 1.66 17-Jun-2006  yamt - introduce vfs_composefh() and use it where appropriate.
- fix lock/unlock mismatch in sys_getfh.
 1.65 07-Jun-2006  kardel branches: 1.65.2;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.64 20-May-2006  yamt nqsrv_getlease: call nfs_init() to fix NFSSERVER && !NFS case.
 1.63 18-May-2006  yamt branches: 1.63.2;
- fix some leaks in nfsd, introduced by kauth changes.
- simplify code.
- add some assertions.
- wrap some long lines.
- remove an unnecessary ";".
 1.62 14-May-2006  elad integrate kauth.
 1.61 15-Apr-2006  christos Coverity CID 1175: Remove dead code.
 1.60 15-Apr-2006  christos From my posting of April 3 to tech-kern:

My understanding is that the CLRSIG() is supposed to clear the signal
that was sent to the syncer process to prevent it from being delivered
to the syncer process in case unmounting fails, so that the syncer process
does not die while the filesystem is still mounted. The typical scenario
is, the syncer process is tsleep()ing in the kernel, and waking up when
it needs to do work. If someone sends a signal to it, eg. kill -TERM
the mfs process, then the kernel will try to unmount the mfs filesystem
before delivering the signal to the process. If that unmount fails, then
we should not really kill the process because that will hang the mount.
So we call CLRSIG() to stop the signal from being delivered.

So the first call to issignal() will return the signal number that was
sent to the syncer process (unless someone malicious was able to send
a lower numbered signal between the time tsleep() returned and we called
issignal()... something that is not really easy to do). But you are
right, we should not be calling it many times as a side effect of this
macro.

Rewrite CLRSIG() to clear all the signals and call issignal() the correct
number of times.
 1.59 15-Apr-2006  christos Coverity CID 2509: Initialize cache
 1.58 27-Mar-2006  martin KASSERT that the returned file id length from VPTOFH is <= the
maximum allowed value (_VFS_MAXFIDSZ).
 1.57 11-Dec-2005  christos branches: 1.57.4; 1.57.6; 1.57.8; 1.57.10; 1.57.12;
merge ktrace-lwp.
 1.56 22-May-2004  jonathan branches: 1.56.12;
Eliminate several uses of `curproc' from the socket-layer code and from NFS.

Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded by Matt Thomas.

Changes to soreceive() and (*dom->dom_externalize)() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize. Also, now soreceive() uses
its uio->uio_procp value, pass that same value downward to
(*pr->pru_usrreq)() calls for consistency, instead of (struct proc *)0.

Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req->r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.

Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
 1.55 21-Apr-2004  christos use VFS_MAXFIDSZ
 1.54 21-Apr-2004  christos Replace the statfs() family of system calls with statvfs().
Retain binary compatibility.
 1.53 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.52 30-Jul-2003  yamt eliminate v_id.
 1.51 29-Jun-2003  fvdl branches: 1.51.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.50 29-Jun-2003  thorpej Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.49 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.48 21-May-2003  yamt remove local definitions of TRUE and FALSE.
 1.47 24-Apr-2003  drochner Change some subordinate functions to take a "struct nfsnode" argument
instead of "struct vnode". This saves a number of pointer dereferences;
it sums up to about half a kB for me. And it paves the way for future
fixes.
While cleaning up, eliminate a write-only member of "struct nfsreq"
and a pointless assignment in the NFS_V2_ONLY case.
 1.46 02-Apr-2003  yamt use queue manipulation macros.
 1.45 26-Feb-2003  matt Add MBUFTRACE kernel option.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
 1.44 01-Feb-2003  thorpej Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.
 1.43 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.42 01-Dec-2002  matt Make sure these all agree on the same definitions of various variables.
 1.41 21-Oct-2002  yamt fix a page locking deadlock problem for nfs.

add a flag that specifies whether the file can be truncated safely or not
to nfsm_loadattr and friends. when it isn't safe, just mark the nfsnode
as "should be truncated later".

ok'ed by Frank van der Linden and Chuck Silvers.
close kern/18036.
 1.40 12-May-2002  matt Eliminate commons
 1.39 10-Nov-2001  lukem branches: 1.39.4;
add RCSIDs
 1.38 16-Apr-2001  thorpej branches: 1.38.2; 1.38.6;
When unmounting a file system, acquire the syncer_lock before
vfs_busy'ing just before the dounmount() call. This is to avoid
sleeping with the mountlist_slock held -- but we must acquire
syncer_lock before vfs_busy because the syncer itself uses
syncer_lock -> vfs_busy locking order.
 1.37 21-Feb-2001  jdolecek branches: 1.37.2;
make some more constant arrays 'const'
 1.36 06-Feb-2001  fvdl Do actual vnode locking for NFS.
 1.35 24-Nov-2000  chs put more ISO bits under ifdef ISO.
 1.34 19-Sep-2000  bjh21 Extend NFS_V2_ONLY to remove NQNFS lease support as well. Saves another 10k.
 1.33 19-Sep-2000  fvdl Adapt for VOP_FSYNC parameter change.
 1.32 09-Jun-2000  fvdl branches: 1.32.2;
Some tweaks to enable NFS over IPv6. The special-casing of AF_INET
should really be removed.
 1.31 30-Mar-2000  augustss branches: 1.31.2;
Remove register declarations.
 1.30 10-Oct-1999  sommerfeld branches: 1.30.2;
Fix bug in error handling for NFSv3 + nqnfs.
With nfsv2, the nfsm_reply() macro always causes the service routine
to return if error was nonzero.

With nfsv3, we can keep going after nfsm_reply() without returning,
but nqnfsrv_getlease() didn't take this into account, so add a
return(0) after each error-case nfsm_reply(0).
 1.29 25-Mar-1999  sommerfe branches: 1.29.2; 1.29.8;
Fix crash reported in PR7116 on shutdown
 1.28 06-Mar-1999  fair Snatch a patch from OpenBSD to fix PRs 6529 and 7074.
Adjust fxdr_hyper() and txdr_hyper() macros.
 1.27 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.26 25-Jun-1998  thorpej - Rename nqnfs_vop_lease_check() to genfs_lease_check(). If NFSSERVER is
not in the kernel, genfs_lease_check() is simply a no-op. This allows
LKM'd file systems to be exported (previously did not work properly
due to a compile-time decision based on -DNFSSERVER).
- defopt NFSSERVER
 1.25 05-Jun-1998  kleink Convert fsync vnode operator implementations and usage from the old `waitfor'
argument and MNT_WAIT/MNT_NOWAIT to `flags' and FSYNC_WAIT.
 1.24 01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.23 19-Feb-1998  thorpej Include the NFS option header.
 1.22 19-Oct-1997  fvdl * Implement optional 32 <-> 64 bit directory cookie translation. This uses
the directory cache as translation table. See nfs_subs.c for comments.
Makes the code a bit more complex to look at than I would have liked,
but doesn't affect the speed of the default behavior.
* Optimize caching behavior a bit when buffers are invalidated.
* Save some RPCs in readdir operations by not bothering if there is
a small amount left to do to fill the buffer. It'll be done in the
next RPC with a larger chunk anyway. Wastes a bit of buffer space
but is faster.
* Make n_vattr an allocated vattr struct. This avoids nfsnode bloat,
and is friendlier to the malloc routines.
 1.21 10-Oct-1997  fvdl * New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.20 24-Jun-1997  fvdl branches: 1.20.4;
Provide extra arg to nfsrv_fhtovp (just FALSE in this case), it was
extended for WebNFS support.
 1.19 22-Feb-1997  fvdl Fixes from BSDI (thanks go to Keith Bostic). Original RCS messages:

date: 1995/11/30 20:37:53; author: cp; state: Exp; lines: +3 -3
Change splsoftclock() to splnet();
Make nfsrv_getstream create two copies of data when
splitting up an mbuf rather than two references to the
same external buffer. The symptom this fixes is client
hangs.

date: 1996/10/16 00:06:05; author: ewv; state: Exp; lines: +5 -3
Clear pending signal when an unmount fails, this allows us another chance
at the umount after a short sleep. This fixes a problem where /usr is
mounted via nqnfs and the system hangs on shutdown since the umount()
always fails with EBUSY (inetd is still busy on usr) and since we don't
clear the signal we end up stuck looping and never give inetd a chance to
catch its SIGKILL.

date: 1996/10/23 18:22:14; author: donn; state: Exp; lines: +12 -7
Kirk's changes to prevent races when unmounting. (1) Unmount()
and vfs_unmountall() now call vfs_busy() so that they participate
in the mount structure locking scheme. Dounmount() calls vfs_unbusy()
to unlock things, and makes sure to wake up waiters if there's an
error. (2) The MFS and NQNFS daemons also now use vfs_busy() when
unmounting filesystems. Kirk restructured the code so that a
successful unmount by another process won't leave the possibility
that a daemon might reference a mount structure that has been freed.
 1.18 09-Feb-1997  fvdl * Fix some bugs in NQNFS (malformed RPC requests, no directory lease eviction)
* Avoid possible NULL ptr ref in nfs_reply
* Don't ever try to sillyrename directories (from FreeBSD)
 1.17 31-Jan-1997  thorpej branches: 1.17.2;
NFSCLIENT -> NFS.
 1.16 13-Oct-1996  christos branches: 1.16.2;
revert kprintf changes
 1.15 10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.14 18-Feb-1996  fvdl branches: 1.14.4;
Fix a missing 'error =' before 'if (error)'.
 1.13 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.12 09-Feb-1996  christos nfs prototype changes
 1.11 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.10 18-Jun-1995  cgd don't assume the f_fsnamelen is nul-truncated or longer than MFSNAMELEN
 1.9 23-May-1995  mycroft Remove gratuitous extra indirections.
 1.8 18-Jan-1995  mycroft Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.
 1.7 13-Dec-1994  mycroft Turn lease_check() into a vnode op, per CSRG.
 1.6 21-Aug-1994  cgd branches: 1.6.2;
cleanliness; don't wrap lines.
 1.5 18-Aug-1994  mycroft More LIST/CIRCLEQ migration.
 1.4 17-Aug-1994  mycroft Convert some more lists and queues.
 1.3 17-Aug-1994  mycroft Change the reply list to a TAILQ.
 1.2 29-Jun-1994  cgd branches: 1.2.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.1 08-Jun-1994  mycroft branches: 1.1.1;
Update to 4.4-Lite fs code, with local changes.
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.2.2.1 19-Aug-1994  mycroft update from trunk
 1.6.2.2 21-Aug-1994  cgd cleanliness; don't wrap lines.
 1.6.2.1 21-Aug-1994  cgd file nfs_nqlease.c was added on branch netbsd-1-0 on 1994-08-21 21:07:14 +0000
 1.14.4.1 04-Mar-1997  mycroft Pull up bug fixes from -current, per fvdl.
 1.16.2.1 14-Jan-1997  thorpej Snapshot of work-in-progress, committed to private branch.

These changes implement machine-independent root device and file system
selection. Notable features:

- All ports behave in a consistent manner regarding root
device selection.
- No more "options GENERIC"; all kernels have the ability
to boot with RB_ASKNAME to select root device and file system
type.
- Root file system type can be wildcarded; a machine-independent
function will try all possible file systems for the selected
root device until one succeeds.
- If the root file system fails to mount, the operator will
be given the chance to select a new root device and file
system type, rather than having the machine simply panic.
- nfs_mountroot() no longer panics if any part of the NFS
mount process fails; it now returns an error, giving the
operator a chance to recover.
- New, more consistent, config(8) grammar. The constructs:

config netbsd swap generic
config netbsd root on nfs

have been replaced with:

config netbsd root on ? type ?
config netbsd root on ? type nfs

Additionally, the operator may select or wildcard root file
system type in the kernel configuration file:

config netbsd root on cd0a type cd9660

config(8) now requires that a "root" specification be
made. "root" may be wired down or wildcarded. "swap" and
"dump" specifications are optional, and follow previous
semantics.

- config(8) has a new "file-system" keyword, used to configure
file systems into the kernel. Eventually, this will be used
to generate the default vfssw[].

- "options NFSCLIENT" is obsolete, and is replaced by
"file-system NFS". "options NFSSERVER" still exists, since
NFS server support is independent of the NFS file system
client.

- sys/arch/<foo>/<foo>/swapgeneric.c is no longer used, and
will be removed; all information is now generated by config(8).

As of this commit, all ports except arm32 have been updated to use
the new setroot(). Only SPARC, i386, and Alpha ports have been
tested at this time. Port masters should test these changes on their
ports, and report any problems back to me.

More changes are on their way, including RB_ASKNAME support in
nfs_mountroot() (to prompt for server address and path) and, potentially,
the ability to select rarp/bootparam or bootp in nfs_mountroot().
 1.17.2.1 12-Mar-1997  is Merge in changes from Trunk
 1.20.4.1 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.29.8.1 27-Dec-1999  wrstuden Pull up to last week's -current.
 1.29.2.1 10-Oct-1999  cgd pull up rev 1.30 from trunk (requested by sommerfeld):
Fix an odd corner case if you use nfsv3 and nqnfs at the same time:
v3 changes the error-case behavior of the nfsm_reply macro, but the
caller keeps going and in this case you end up calling vput(NULL).
 1.30.2.5 21-Apr-2001  bouyer Sync with HEAD
 1.30.2.4 12-Mar-2001  bouyer Sync with HEAD.
 1.30.2.3 11-Feb-2001  bouyer Sync with HEAD.
 1.30.2.2 08-Dec-2000  bouyer Sync with HEAD.
 1.30.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.31.2.1 22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.32.2.1 14-Dec-2000  he Pull up revision 1.33 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.37.2.7 11-Dec-2002  thorpej Sync with HEAD.
 1.37.2.6 22-Oct-2002  thorpej Sync with HEAD.
 1.37.2.5 12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.37.2.4 20-Jun-2002  nathanw Catch up to -current.
 1.37.2.3 14-Nov-2001  nathanw Catch up to -current.
 1.37.2.2 21-Jun-2001  nathanw Catch up to -current.
 1.37.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.38.6.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.38.2.2 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.38.2.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.39.4.1 11-Mar-2002  thorpej Make syncer_lock an adaptive mutex and rename it to syncer_mutex.
 1.51.2.6 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.51.2.5 21-Sep-2004  skrll Fix the sync with head I botched.
 1.51.2.4 18-Sep-2004  skrll Sync with HEAD.
 1.51.2.3 24-Aug-2004  skrll Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.51.2.2 03-Aug-2004  skrll Sync with HEAD
 1.51.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.56.12.2 30-Dec-2006  yamt sync with head.
 1.56.12.1 21-Jun-2006  yamt sync with head.
 1.57.12.2 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.57.12.1 28-Mar-2006  tron Merge 2006-03-28 NetBSD-current into the "peter-altq" branch.
 1.57.10.3 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.57.10.2 19-Apr-2006  elad sync with head.
 1.57.10.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.57.8.5 03-Sep-2006  yamt sync with head.
 1.57.8.4 11-Aug-2006  yamt sync with head
 1.57.8.3 26-Jun-2006  yamt sync with head.
 1.57.8.2 24-May-2006  yamt sync with head.
 1.57.8.1 01-Apr-2006  yamt sync with head.
 1.57.6.3 01-Jun-2006  kardel Sync with head.
 1.57.6.2 22-Apr-2006  simonb Sync with head.
 1.57.6.1 04-Feb-2006  simonb Adapt for timecounters: mostly use get*time() and use "time_second"
instead of "time.tv_sec".
 1.57.4.1 09-Sep-2006  rpaulo sync with head
 1.63.2.1 19-Jun-2006  chap Sync with head.
 1.65.2.1 13-Jul-2006  gdamore Merge from HEAD.
 1.69.4.2 10-Dec-2006  yamt sync with head.
 1.69.4.1 22-Oct-2006  yamt sync with head
 1.69.2.3 12-Jan-2007  ad Sync with head.
 1.69.2.2 18-Nov-2006  ad Sync with head.
 1.69.2.1 11-Sep-2006  ad - Convert some locks to mutexes and RW locks.
- Use the proclist_lock to protect pgrps and sessions in some places.
 1.184 23-Mar-2023  riastradh nfs: Avoid free of uninitialized on bad name size in create, mknod.

XXX These error branches are a nightmare and need to be more
systematically cleaned up. Even if they are correct now, they are
impossible to audit and extremely fragile in case anyone ever needs
to make other changes to them.

XXX pullup-8
XXX pullup-9
XXX pullup-10
 1.183 27-Apr-2022  hannken branches: 1.183.4;
As VOP_GETATTR() needs at least a shared lock, move the preopattr lookup
inside nfs_namei() where we may lock the start directory without violating
the lock order.
 1.182 16-Sep-2021  andvar fix various typos, mainly in comments.
 1.181 05-Sep-2020  riastradh Round of uvm.h cleanup.

The poorly named uvm.h is generally supposed to be for uvm-internal
users only.

- Narrow it to files that actually need it -- mostly files that need
to query whether curlwp is the pagedaemon, which should maybe be
exposed by an external header.

- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
by it. We should split up uvm_extern.h but this will serve for now
to reduce the uvm.h dependencies.

- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
UVMHIST(ubchist), since ubchist is declared in uvm.h but the
reference evaporates if UVMHIST is not defined, so we reduce header
file dependencies.

- Make uvm_device.h and uvm_swap.h independently includable while
here.

ok chs@
 1.180 04-Apr-2020  mlelstv NFSv2 is limited to 32-bit metadata values. Prevent larger
metadata values from simply being truncated.

-> clamp filesystem block counts to signed 32bit.
-> clamp file sizes to signed 32bit (*)

Some NFSv2 clients also have problems handling buffer sizes larger
than (signed) 16bit.
-> clamp buffer sizes to signed 16bit for better compatibility.

(*) This can lead to erroneous behaviour for files larger than 2GB
that NFSv2 cannot handle but it is still better than before.
An alternative would be to (partially) reject operations on files
larger than 2GB, but that causes other problems.
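
The difference between truncating and clamping in revision 1.180 can be
sketched as follows. This is only an illustration: the helper names
(v2_truncate, v2_clamp32, v2_clamp16) are hypothetical and are not taken
from the NetBSD source, and the real change may be structured differently.

```c
#include <stdint.h>

/*
 * Hypothetical helpers (names are not from the NetBSD source).
 * NFSv2 carries sizes in 32-bit fields, so a plain cast of a larger
 * value keeps only the low 32 bits.  Clamping saturates instead.
 */
static inline uint32_t
v2_truncate(uint64_t v)
{
	return (uint32_t)v;		/* old behaviour: high bits dropped */
}

/* Saturate at the largest signed 32-bit value. */
static inline uint32_t
v2_clamp32(uint64_t v)
{
	return v > (uint64_t)INT32_MAX ? (uint32_t)INT32_MAX : (uint32_t)v;
}

/* Saturate at the largest signed 16-bit value (for buffer sizes). */
static inline uint32_t
v2_clamp16(uint64_t v)
{
	return v > (uint64_t)INT16_MAX ? (uint32_t)INT16_MAX : (uint32_t)v;
}
```

With truncation, a 4GB+1 byte file would be reported as 1 byte long; with
clamping it is reported as 2GB-1, which is wrong but closer to the truth.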
 1.179 17-Jan-2020  ad VFS_VGET(), VFS_ROOT(), VFS_FHTOVP(): give them a "int lktype" argument, to
allow us to get shared locks (or no lock) on the returned vnode. Matches
FreeBSD.
 1.178 02-Jan-2020  thorpej branches: 1.178.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.
 1.177 20-Feb-2019  hannken branches: 1.177.4;
Bracket do_sys_renameat() and nfsrv_rename() with fstrans.

The v_mount field for vnodes on the same file system as "from"
is now stable for referenced vnodes.

VFS_RENAMELOCK no longer may use lock from an unreferenced and
freed "struct mount".
 1.176 03-Feb-2019  mrg - add or adjust /* FALLTHROUGH */ where appropriate
- add __unreachable() after functions that can return but won't in
this case, and thus can't be marked __dead easily
 1.175 03-Sep-2018  riastradh Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
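
The difference between the function form and the macro form described in 1.175 can be sketched in userland as below. The uimin() here is a stand-in written for this note, not the libkern source, and the sketch assumes a platform where unsigned int is 32 bits wide. MIN_MACRO, min_via_function and min_via_macro are hypothetical names.

```c
#include <stdint.h>

/* Stand-in for the libkern function: defined on unsigned int only. */
static unsigned int
uimin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Macro form that some subsystems define. It keeps the operand types,
 * but evaluates its arguments more than once.
 */
#define MIN_MACRO(a, b) ((a) < (b) ? (a) : (b))

/* 64-bit operands are narrowed to unsigned int before the comparison. */
static uint64_t
min_via_function(uint64_t a, uint64_t b)
{
	return uimin((unsigned int)a, (unsigned int)b);
}

/* 64-bit operands are compared at full width. */
static uint64_t
min_via_macro(uint64_t a, uint64_t b)
{
	return MIN_MACRO(a, b);
}
```

For a = 2^32 + 1 and b = 5, the function form compares 1 against 5 and yields 1, while the macro form yields 5. The explicit casts in min_via_function make visible the conversion that would otherwise happen silently at the call.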
 1.174 03-May-2018  hannken branches: 1.174.2;
nfsrv_readlink: stop attaching a zero-length mbuf for zero length symlinks.
 1.173 26-Apr-2017  riastradh branches: 1.173.4; 1.173.10;
Change VOP_REMOVE and VOP_RMDIR to preserve lock/ref on dvp.

No change to vp -- the plan is to replace the node by the
componentname in the vop parameters, and let all directory vops do
lookups internally.

Proposed on tech-kern with no objections:
https://mail-index.netbsd.org/tech-kern/2017/04/17/msg021825.html
 1.172 21-Apr-2015  riastradh Cull unused INRENAME and INRELOOKUP from callers.
 1.171 20-Apr-2015  riastradh Make VOP_LINK return directory still locked and referenced.

Ride 7.99.10 bump.
 1.170 23-Jan-2014  hannken branches: 1.170.6;
Change vnode operations create, mknod, mkdir and symlink to return
the resulting vnode *vpp unlocked.

Discussed on tech-kern@

Welcome to 6.99.30
 1.169 17-Jan-2014  hannken Change vnode operations create, mknod, mkdir and symlink to keep the
directory node dvp locked on return.

Discussed on tech-kern@

Welcome to 6.99.29
 1.168 14-Dec-2013  christos only prevent autounload, not regular unload when we have exports
 1.167 14-Dec-2013  christos don't allow the nfs server module to unload if it has exported filesystems.
 1.166 14-Sep-2013  martin Backout wildcard pragma to kill warnings and instead sprinkle a few dozen
__unused attributes.
Requested by joerg@
 1.165 29-Aug-2012  christos branches: 1.165.2; 1.165.4;
When unloading the nfsserver module, call nfs_fini() so that the nfsrvdescpl
pool gets destroyed. Otherwise we are left with a stray pool that points to
unmapped memory behind (and bad things happen). Typically you get seemingly
random page faults (without printing uvm_fault) that happen in various pool
operations. Most frequent one is the pool_drain() from the page daemon.
 1.164 27-Aug-2012  chs fix error handling in nfsrv_rename(): when the first nfs_namei() fails,
don't try to free the resources allocated by a successful lookup.
 1.163 01-Feb-2012  matt branches: 1.163.2; 1.163.4;
When using socket loaning, make sure the KVA used for the loan has the same
color as the UVA being loaned.
 1.162 21-Nov-2011  hannken branches: 1.162.2;
nfsrv_lookup(): Defer the postopattr lookup on dirp until the
child node is unlocked.

Ok: YAMAMOTO Takashi <yamt@netbsd.org>
 1.161 30-Oct-2011  hannken branches: 1.161.2;
VOP_GETATTR() needs a shared lock at least.
 1.160 08-Aug-2011  dholland nfs_namei() should not return a non-null path buffer except on success,
even though the callers are apparently prepared to cope.

Fixes last tidyup part of PR 44625.
 1.159 18-Apr-2011  dholland Back in -r1.60 of nfs_serv.c (a long time ago) VOP_MKNOD was changed
so nfsd no longer needed to do a lookup() call immediately afterwards
to retrieve the newly created object.

Since that change there has been no way for ISSYMLINK to be set upon
return from VOP_MKNOD (or before the call to VOP_MKNOD either) so
remove the test for it and associated block of dead code.

(I do not understand how this code was reachable before then either.
The logic in question is only reached if no object by that name
existed, and there's no reasonable way that a successful call to
VOP_MKNOD should ever create a symlink. The code appears to come from
4.4lite; maybe they had locking bugs?)
 1.158 11-Apr-2011  dholland Clean up. Move some more code across from nfsd's private entry points.
 1.157 19-Mar-2011  dholland Fix memory leak introduced with the struct pathbuf changes. Hi, me.
Closes PR 44625.
 1.156 05-Feb-2011  yamt typo in a comment
 1.155 02-Jan-2011  dholland branches: 1.155.2; 1.155.4;
Remove remaining references to SAVESTART.
 1.154 02-Jan-2011  dholland Remove the special refcount behavior (adding an extra reference to the
parent dir) associated with SAVESTART in relookup().

Check all call sites to make sure that SAVESTART wasn't set while
calling relookup(); if it was, adjust the refcount behavior. Remove
related references to SAVESTART.

The only code that was reaching the extra ref was msdosfs_rename,
where the refcount behavior was already fairly broken and/or gross;
repair it.

Add a dummy 4th argument to relookup to make sure code that hasn't
been inspected won't compile. (This will go away next time the
relookup semantics change, which they will.)
 1.153 02-Jan-2011  dholland Remove unused nameidata field ni_startdir.
 1.152 30-Nov-2010  dholland Abolish struct componentname's cn_pnbuf. Use the path buffer in the
pathbuf object passed to namei as work space instead. (For now a pnbuf
pointer appears in struct nameidata, to support certain unclean things
that haven't been fixed yet, but it will be going away in the future.)

This removes the need for the SAVENAME and HASBUF namei flags.
 1.151 24-Jun-2010  hannken Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
 1.150 08-Jan-2010  pooka branches: 1.150.2; 1.150.4;
The VATTR_NULL/VREF/VHOLD/HOLDRELE() macros lost their will to live
years ago when the kernel was modified to not alter ABI based on
DIAGNOSTIC, and now just call the respective function interfaces
(in lowercase). Plenty of mix'n match upper/lowercase has crept
into the tree since then. Nuke the macros and convert all callsites
to lowercase.

no functional change
 1.149 23-Dec-2009  pooka Define namei flag INRENAME and set it if a lookup operation is part
of rename. This helps with building better asserts for rename in
the DELETE lookup ... the RENAME lookup is quite obviously a part
of rename.
 1.148 07-Nov-2009  cegger Add a flags argument to pmap_kenter_pa(9).
Patch shown on tech-kern@ http://mail-index.netbsd.org/tech-kern/2009/11/04/msg006434.html
No objections.
 1.147 27-Sep-2009  dholland Rename lookup() to lookup_for_nfsd(), to make it clear just whose
private backdoor entry point this is.

Also, clone the lookup_for_nfsd() entry point as
lookup_for_nfsd_index(), for use by a different call site in nfsd that
does different unclean things with nameidata.
 1.146 23-May-2009  ad - Fix a race between umount()/mount() and nfssvc().
- Toss netexport state on nfsserver module unload.
 1.145 23-May-2009  ad Fix a crash when unloading nfsserver module.
 1.144 10-Apr-2009  bouyer PR kern/41158: nfs_rename() locking against myself
nfsrv_rename() can exit without calling genfs_renamelock_exit() because
the nfsm_reply() can do return (0) on error.
Change nfsm_reply to use 'error = 0; goto nfsmout' instead.
Fix a few places so it's safe to goto nfsmout from nfsm_reply, or other
macros calling it.
As a side effect it could fix a missing vrele(dirp) in various places where
nfsm_reply could return(0).
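
The problem fixed in 1.144 is the general one of a macro that returns from the enclosing function and so skips cleanup. A minimal sketch, with every name (lock_enter, lock_exit, REPLY_RETURN, REPLY_GOTO, rename_leaky, rename_fixed, held_after) invented for illustration and a counter standing in for the rename lock:

```c
#include <stdbool.h>

/* Hypothetical stand-in for a lock that must be released on every path. */
static int lock_depth;

static void lock_enter(void) { lock_depth++; }
static void lock_exit(void) { lock_depth--; }

/* Old style: the macro returns from the enclosing function directly. */
#define REPLY_RETURN(failed) do { \
	if (failed) \
		return 0; \
} while (0)

/* New style: the macro sets the result and jumps to the common exit. */
#define REPLY_GOTO(failed) do { \
	if (failed) { \
		error = 0; \
		goto out; \
	} \
} while (0)

static int
rename_leaky(bool failed)
{
	lock_enter();
	REPLY_RETURN(failed);	/* an early return skips lock_exit() */
	lock_exit();
	return 0;
}

static int
rename_fixed(bool failed)
{
	int error = 0;

	lock_enter();
	REPLY_GOTO(failed);	/* every path now reaches "out" */
out:
	lock_exit();
	return error;
}

/* Runs fn from a clean state and reports how many locks it left held. */
static int
held_after(int (*fn)(bool), bool failed)
{
	lock_depth = 0;
	(void)fn(failed);
	return lock_depth;
}
```

On the failure path held_after(rename_leaky, true) reports one lock still held, while held_after(rename_fixed, true) reports none.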
 1.143 14-Mar-2009  dsl Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
 1.142 11-Jan-2009  christos branches: 1.142.2;
merge christos-time_t
 1.141 03-Dec-2008  pooka nfsd_use_loan: int -> bool
 1.140 27-Nov-2008  pooka Use struct nfs_fattr in struct flrep instead of uint32_t array
acrobatics to get rid of type punning warning.
 1.139 19-Nov-2008  ad Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.138 28-Mar-2008  dholland branches: 1.138.2; 1.138.6; 1.138.12; 1.138.14; 1.138.16;
Yet another rename workaround - this time check for . and .. early because
relookup() objects to being asked to handle them.
 1.137 08-Mar-2008  yamt desupport link/unlink of directories. noted by Elad Efrat.
 1.136 28-Feb-2008  elad Introduce a new kauth action, KAUTH_NETWORK_NFS, and two requests,
KAUTH_REQ_NETWORK_NFS_EXPORT and KAUTH_REQ_NETWORK_NFS_SVC, and use them
to replace two KAUTH_GENERIC_ISSUSER calls in the NFS code.

Also replace two more with KAUTH_SYSTEM_MKNOD, where appropriate.

Documentation and examples updated. More to come.
 1.135 20-Feb-2008  matt branches: 1.135.2; 1.135.6;
Fix extern declaration to match actual declaration (add const).
 1.134 28-Jan-2008  dholland Fix some race conditions in rename.
Introduce a per-FS rename lock and new vfsops to manipulate it.
Get this lock while renaming. Also add another relookup() in do_sys_rename,
which is a hack to kludge around some of the worst deficiencies of
ufs_rename.
reviewed-by: pooka (and an earlier rev by ad)
posted on tech-kern with no objections.
 1.133 22-Dec-2007  yamt nfsrv_create: fix a use-after-release.
 1.132 04-Dec-2007  yamt branches: 1.132.4;
merge non-intrusive nfs changes from vmlocking.
 1.131 26-Nov-2007  pooka branches: 1.131.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.130 10-Oct-2007  ad branches: 1.130.4;
Merge from vmlocking:

- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
 1.129 27-Jul-2007  yamt branches: 1.129.4; 1.129.6; 1.129.8; 1.129.10;
stop nfs tick when we have nothing to do.
 1.128 06-Apr-2007  hannken branches: 1.128.4;
Remove calls to now obsolete vn_start_write() and vn_finished_write().
 1.127 05-Mar-2007  yamt branches: 1.127.2; 1.127.4;
nfsrv_setattr: revive nfsm_srvsattr which was (mistakenly?)
removed with caddr_t.
 1.126 04-Mar-2007  christos Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.125 22-Feb-2007  thorpej TRUE -> true, FALSE -> false
 1.124 20-Feb-2007  pooka after freeing cookies, set the pointer to NULL to prevent dangling reuse
 1.123 04-Feb-2007  chs branches: 1.123.2;
in nfsrv_rename(), change vput(fdirp) back to vrele() since the dirp
returned from nfs_namei() is not locked in this case. fixes PR 35542.
also, apply the same fix here as was made in rename_files() to handle
the case when dvp and vp are the same.
 1.122 04-Jan-2007  elad Consistent usage of KAUTH_GENERIC_ISSUSER.
 1.121 27-Dec-2006  yamt remove nqnfs.
 1.120 09-Dec-2006  chs a smorgasbord of improvements to vnode locking and path lookup:
- LOCKPARENT is no longer relevant for lookup(), relookup() or VOP_LOOKUP().
these now always return the parent vnode locked. namei() works as before.
lookup() and various other paths no longer acquire vnode locks in the
wrong order via vrele(). fixes PR 32535.
as a nice side effect, path lookup is also up to 25% faster.
- the above allows us to get rid of PDIRUNLOCK.
- also get rid of WANTPARENT (just use LOCKPARENT and unlock it).
- remove an assumption in layer_node_find() that all file systems implement
a recursive VOP_LOCK() (unionfs doesn't).
- require that all file systems supply vfs_vptofh and vfs_fhtovp routines.
fill in eopnotsupp() for file systems that don't support being exported
and remove the checks for NULL. (layerfs calls these without checking.)
- in union_lookup1(), don't change refcounts in the ISDOTDOT case, just
adjust which vnode is locked. fixes PR 33374.
- apply fixes for ufs_rename() from ufs_vnops.c rev. 1.61 to ext2fs_rename().
 1.119 09-Nov-2006  yamt branches: 1.119.2;
remove some __unused in function parameters.
 1.118 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.117 02-Sep-2006  yamt branches: 1.117.2; 1.117.4;
nfsd: deal with variable-sized filehandles.
 1.116 02-Sep-2006  christos fix default type decls
fix incomplete initializer
 1.115 20-Jul-2006  christos When there are too many empty entries in a row, and we need to try to
read the next block, free the cookie buffer before doing so to avoid
a memory leak. Reported by Mark Davies.
 1.114 13-Jul-2006  martin Fix alignment problems for fhandle_t, exposed by gcc4.1.

While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ,
version the getfh(2) syscall and explicitly pass the size available in
the filehandle from userland.

Discussed on tech-kern, with lots of help from yamt (thanks!).
 1.113 30-Jun-2006  yamt wrap long lines and fix indents after kauth merge.
 1.112 17-Jun-2006  yamt - introduce vfs_composefh() and use it where appropriate.
- fix lock/unlock mismatch in sys_getfh.
 1.111 09-Jun-2006  christos branches: 1.111.2;
stack police: don't allocate statvfs on the stack.
 1.110 07-Jun-2006  kardel merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.109 14-May-2006  elad branches: 1.109.2;
integrate kauth.
 1.108 10-May-2006  mrg quell GCC 4.1 uninitialised variable warnings.

XXX: we should audit the tree for which old ones are no longer needed
after getting the older compilers out of the tree..
 1.107 15-Apr-2006  christos Coverity CID 735: Remove duplicate code.
 1.106 15-Apr-2006  christos Coverity CID 736: Comment out dead code.
 1.105 15-Apr-2006  christos Coverity CID 1143: Prevent NULL deref.
 1.104 15-Apr-2006  christos Coverity CID 1144: Protect against NULL deref.
 1.103 15-Apr-2006  christos Coverity CID 2510-2514: Always initialize cache.
 1.102 27-Mar-2006  martin KASSERT that the returned file id length from VPTOFH is <= the
maximum allowed value (_VFS_MAXFIDSZ).
 1.101 01-Mar-2006  yamt branches: 1.101.2; 1.101.4; 1.101.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.
 1.100 03-Jan-2006  yamt branches: 1.100.2; 1.100.4;
remove a few unnecessary caddr_t casts.
 1.99 11-Dec-2005  christos branches: 1.99.2;
merge ktrace-lwp.
 1.98 06-Oct-2005  yamt branches: 1.98.6;
- remove a ufs dependency.
- bump readdir block size to 1024. (the same value as userland DIRBLKSIZ)
 1.97 06-Sep-2005  jmmv Set va_type to VLNK before calling VOP_SYMLINK to match the change in the
vfs_syscalls.c file. Pointed out by yamt@.
 1.96 19-Aug-2005  yamt as we now have 64bit ino_t, no need to truncate nfsv3 fileids.
 1.95 18-May-2005  yamt branches: 1.95.2;
nfsrv_mknod: reject device numbers which we can't handle.
 1.94 26-Feb-2005  perry nuke trailing whitespace
 1.93 09-Dec-2004  yamt branches: 1.93.2; 1.93.4;
nfsrv_commit: make cnt unsigned so that very large commit requests can be
handled properly.
 1.92 09-Dec-2004  yamt when calling create-type VOP, make sure that va_mode is set
even when a client doesn't specify it.
(most filesystems get confused if va_mode is VNOVAL.)
 1.91 04-Dec-2004  yamt nfsrv_read: fall back to copying when we fail to loan pages.
(i forgot to commit this with uvm_loan.c rev.1.51.)
 1.90 17-Sep-2004  skrll There's no need to pass a proc value when using UIO_SYSSPACE with
vn_rdwr(9) and uiomove(9).

OK'd by Jason Thorpe
 1.89 31-May-2004  yamt nfsrv_create: fix an LP64 problem for exclusive create.
 1.88 21-Apr-2004  christos Replace the statfs() family of system calls with statvfs().
Retain binary compatibility.
 1.87 07-Jan-2004  yamt branches: 1.87.2;
- get pages to loan out in uvm_loanuobjpages() rather than
having caller (nfsd, in this case) do so.
- tweak locking so that nfs loaned READ works on layered filesystems.
 1.86 05-Nov-2003  hannken Clean up the usage of vn_start_write(). At least one occurrence clobbered
previous error conditions.
If "(flags & (V_WAIT|V_PCATCH)) == V_WAIT" the return value is always zero.
Ignore the return value in these cases.

From Darrin B. Jewell.
 1.85 29-Oct-2003  mycroft Back out the bogus initializer -- the compiler bug is fixed.
 1.84 28-Oct-2003  cl note 'm68k {u,}int64_t used uninitialized' bug.
add reference to gcc bug report.
mark all (known) occurrences.
 1.83 20-Oct-2003  yamt set READres EOF flag correctly.
 1.82 15-Oct-2003  hannken Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>
 1.81 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.80 09-Jul-2003  bouyer nfsrv_commit(): return success and don't do anything for requests which start
past the end of the file. This can happen when two clients are writing to
the same file.
Close PR 21696 by myself, discussed on tech-net in 2003/05 and 2003/06.
Issue raised by Chuck Silvers (commit and truncate ops need to be serialised)
still unaddressed.
 1.79 29-Jun-2003  fvdl branches: 1.79.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.78 29-Jun-2003  thorpej Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.77 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.76 09-Jun-2003  yamt rework zero padding of rpc reply.
- for READ procedure, don't send back more bytes than requested.
- don't have doubtful assumptions on mbuf chain structure.
- rename a function (nfsm_adj -> nfs_zeropad) to avoid confusion as
the semantics of the function was changed.
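
For background on the padding mentioned in 1.76: XDR encodes opaque data in 4-byte units, so a reply carrying len data bytes is followed by zero bytes up to the next multiple of four. The sketch below shows only that arithmetic and the "no more than requested" rule; it does not reproduce nfs_zeropad(), which works on mbuf chains. xdr_pad_len and read_reply_len are hypothetical names.

```c
#include <stddef.h>

#define XDR_UNIT 4	/* XDR encodes data in 4-byte units */

/* Number of zero bytes needed after len bytes of opaque data. */
static size_t
xdr_pad_len(size_t len)
{
	return (XDR_UNIT - (len % XDR_UNIT)) % XDR_UNIT;
}

/*
 * Hypothetical helper: data bytes to send for a READ of `requested`
 * bytes when `available` bytes can be read. Never exceeds the request.
 */
static size_t
read_reply_len(size_t requested, size_t available)
{
	return requested < available ? requested : available;
}
```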
 1.75 29-May-2003  yamt workaround for UBC limit.

while our nfsd announces MAXBSIZE as wtmax for tcp,
VOP_GETPAGES of filesystems that use genfs_getpages can't
handle >= MAX_READ_AHEAD(16) pages at once.
therefore, depending on PAGE_SIZE of the machine and file offset of
a read request, we can't VOP_GETPAGES the range at once.
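
Why the page size and the file offset both matter in 1.75 can be seen by counting the pages a byte range touches. This is a sketch under stated assumptions: the limit of 16 comes from the message above, the 64 KiB request size used in the example is an assumption, and pages_spanned and fits_in_one_call are hypothetical helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pages touched by the byte range [off, off + len); len must be > 0. */
static uint64_t
pages_spanned(uint64_t off, uint64_t len, uint64_t page_size)
{
	uint64_t first = off / page_size;
	uint64_t last = (off + len - 1) / page_size;

	return last - first + 1;
}

/* True if the range stays below the given page-count limit. */
static bool
fits_in_one_call(uint64_t off, uint64_t len, uint64_t page_size,
    uint64_t limit)
{
	return pages_spanned(off, len, page_size) < limit;
}
```

A 64 KiB read spans 16 pages with 4 KiB pages when aligned and 17 when it starts one byte into a page; with 8 KiB pages the same read spans 8 or 9 pages and stays under the limit.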
 1.74 07-May-2003  yamt - indent.
- fix a comment typo (mus -> must)
- remove an unneeded caddr_t cast.
 1.73 04-May-2003  yamt fix handling of the case that readsize == 0.
 1.72 03-May-2003  yamt use uvm page loanout mechanism for nfsd READ procedure processing.

reviewed by Frank van der Linden and Chuck Silvers.
tested by Wojciech Puchar.
 1.71 03-Apr-2003  yamt return rtmax bytes if we get READ requests larger than rtmax.
 1.70 02-Apr-2003  yamt use queue manipulation macros.
 1.69 28-Mar-2003  yamt reply FSINFO rtmax and wtmax for DGRAM properly.
 1.68 28-Mar-2003  yamt reply ENAMETOOLONG properly instead of discarding request as BADRPC.
my own PR20791.
 1.67 26-Feb-2003  matt Add MBUFTRACE kernel option.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
 1.66 01-Dec-2002  matt Make sure these all agree on the same definitions of various variables.
 1.65 27-Sep-2002  bouyer In nfs_commit(), sanity-check what we get from network: if we try to flush
past end of file, or if off + cnt overflows a quad_t, flush to end of file.
 1.64 26-Sep-2002  bouyer In nfsrv_create(), kill an extra PNBUF_PUT() in the NFSv2 mknod case. The
pnbuf has already been freed by VOP_MKNOD. This should have been removed in
rev 1.60.
Should fix PR 18013, OK'd by fvdl.
 1.63 26-Sep-2002  bouyer nfsrv_commit(): Properly handle the case cnt == 0, which means "flush to
end of file". Calling VOP_FSYNC with start == end triggers a DIAGNOSTIC
check. Noticed with NFSv3 Linux clients. OK'd by fvdl.
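
The range handling described in 1.63 and 1.65 can be sketched as a single helper. This is an illustration, not the nfsrv_commit() code: commit_end is a hypothetical name, and the sketch uses unsigned 64-bit arithmetic so the overflow test is well defined, whereas the messages above speak of quad_t.

```c
#include <stdint.h>

/*
 * Hypothetical helper: exclusive end offset to flush for a COMMIT of
 * cnt bytes at off, on a file of filesize bytes.
 */
static uint64_t
commit_end(uint64_t off, uint64_t cnt, uint64_t filesize)
{
	/* cnt == 0 means "flush to end of file". */
	if (cnt == 0)
		return filesize;
	/* off + cnt would wrap around: flush to end of file instead. */
	if (cnt > UINT64_MAX - off)
		return filesize;
	/* The range runs past the end of the file: stop at the end. */
	if (off + cnt > filesize)
		return filesize;
	return off + cnt;
}
```

Handling cnt == 0 first also avoids passing an empty range (start == end) to the sync call, which is the case 1.63 reports as tripping a DIAGNOSTIC check.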
 1.62 10-Nov-2001  lukem branches: 1.62.10;
add RCSIDs
 1.61 23-Sep-2001  chs branches: 1.61.2;
remove SAVESTART from the symlink, mknod and create operations.
it was unnecessary, and removing it also fixes a v_usecount leak
that was introduced in the previous revision.
 1.60 24-Jul-2001  assar branches: 1.60.2;
change vop_symlink and vop_mknod to return vpp (the created node)
refed, so that the caller can actually use it. update callers and
file systems that implement these vnode operations
 1.59 27-Nov-2000  chs branches: 1.59.2; 1.59.4;
Initial integration of the Unified Buffer Cache project.
 1.58 19-Sep-2000  fvdl Adapt for VOP_FSYNC parameter change.
 1.57 03-Aug-2000  thorpej Convert namei pathname buffer allocation to use the pool allocator.
 1.56 03-Aug-2000  thorpej MALLOC()/FREE() are not to be used for variable size allocations.
 1.55 27-Jun-2000  mrg remove include of <vm/vm.h>
 1.54 30-Mar-2000  augustss branches: 1.54.4;
Remove register declarations.
 1.53 30-Mar-2000  simonb Delete redundant decl of nfs_pub - it's in <sys/mount.h>.
 1.52 05-Dec-1999  fvdl The length check for readdirplus entries wasn't right, causing troubles
with 32k readdir sizes. From FreeBSD.
 1.51 04-May-1999  sommerfe branches: 1.51.2; 1.51.8;
Fix vnode lock leak in nfsrv_mknod() if to-be-created vnode already existed.
 1.50 30-Mar-1999  mycroft branches: 1.50.2;
Fix two problems with NFSV3CREATE_GUARDED:
* We shouldn't truncate the file.
* We were leaving the vnode locked (unless the truncate happened to fail).
Solaris clients may cause this under some conditions.
Problem reported by chopps, analysis and fix by me.
 1.49 24-Mar-1999  mrg completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.
 1.48 06-Mar-1999  fair Snatch a patch from OpenBSD to fix PRs 6529 and 7074.
Adjust fxdr_hyper() and txdr_hyper() macros.
 1.47 05-Mar-1999  mycroft Clean up some sign extension bogosity in statfs, so negative numbers are
actually negative on a LP64 client.
 1.46 31-Jan-1999  mrg non-root users can mkfifo over NFS.
 1.45 18-Aug-1998  thorpej Add some braces to make egcs happy (ambiguous else warning).
 1.44 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.43 05-Jun-1998  kleink Convert fsync vnode operator implementations and usage from the old `waitfor'
argument and MNT_WAIT/MNT_NOWAIT to `flags' and FSYNC_WAIT.
 1.42 01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.41 10-Feb-1998  mrg - add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.
 1.40 05-Feb-1998  mrg initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)
 1.39 22-Dec-1997  fvdl Check vnode for VDIR type before doing anything with it in the
NFS readdir service.
 1.38 10-Oct-1997  fvdl branches: 1.38.2;
* New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.37 17-Jul-1997  fvdl branches: 1.37.2;
* Deal with servers that don't give complete FSINFO (like NT)
From Olaf Seibert <rhialto@polder.ubc.kun.nl> (PR 3687)
* Make an attempt to check the maximum filesize before attempting
a write to the server, as write RPCs will typically happen
asynchronously, and the process will not see the error.
Fixes problems with unexpectedly truncated files at 4G
* Pass up errors in nfs_writerpc correctly
 1.36 15-Jul-1997  fvdl A filesystem may not support VFS_VGET (like msdosfs). If it doesn't,
the server code would always skip all dir entries for a readdirplus
operation. To avoid endlessly retrying clients, try VFS_VGET first,
and if it fails, return NFSERR_NOTSUPP so that client will fall
back to normal readdir operations.
 1.35 24-Jun-1997  fvdl Provide the extra arg to nfsrv_fhtovp, signalling if we're dealing with
a request on the public filehandle. Extend the lookup operation to
support WebNFS, including index file support (URL style). Yucky, it's
optional in the spec, but Solaris 2.6 will support it, so..
 1.34 12-May-1997  fvdl In nfsrvw_coalesce, make sure the coalesce list from the nfsd is moved
as well. This fixes client hangs. (from Naofumi Honda
<honda@Kururu.math.sci.hokudai.ac.jp> / NetBSD-pc98)
 1.33 08-May-1997  mycroft Pass the vnode type to vaccess(), and use it when checking VEXEC. Make sure
that the mode bits passed to vaccess() and returned by foo_getattr() contain
only permission bits.
 1.32 08-May-1997  mycroft VEXEC -> VLOOKUP, as appropriate.
 1.31 22-Feb-1997  fvdl Fixes from BSDI (thanks go to Keith Bostic). Original RCS message:

date: 1996/11/20 20:02:54; author: pjd; state: Exp; lines: +7 -4
In nfsrv_access(), if VOP_ACCESS() returns an error and the
error == EPERM or it's not the owner doing the access, return the error.
 1.30 10-Feb-1997  fvdl Move vnode_pager_uncache to a better spot in nfsrv_remove. Also use it
in nfsrv_rename, if the 2nd argument is an existing file and will thus
be removed.
 1.29 31-Jan-1997  fvdl branches: 1.29.2;
nfsrv_readdirplus also suffered from the off-by-one loop problem; fix it too.
 1.28 31-Jan-1997  fvdl Fix order error in loop condition which could cause a crash in nfsrv_readdir().
Fixes PR #3170
 1.27 11-Dec-1996  fvdl Give permission to the owner of the file to preserve semantics only
in the relevant cases (read, write). Fixes PR 3017.
 1.26 01-Jul-1996  fvdl Always call vnode_pager_uncache when removing a file in the server
(same as in sys_unlink()).
 1.25 02-Mar-1996  jtk branches: 1.25.4;
Do not return whiteout directory entries in NFS readdir replies. (The
NFS protocol doesn't know how to deal with them properly, yet.)
 1.24 20-Feb-1996  cgd Third argument to VOP_PATHCONF is a register_t *, and register_t may be
different than 'int'. Do the right thing when declaring variables which
are used this way.
 1.23 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.22 09-Feb-1996  christos nfs prototype changes
 1.21 09-Feb-1996  mycroft Fix vop_link, vop_symlink, and vop_remove semantics in several ways:
* Change the argument names to vop_link so they actually make sense.
* Implement vop_link and vop_symlink for all file systems, so they do proper
cleanup.
* Require the file system to decide whether or not linking and unlinking of
directories is allowed, and disable it for all current file systems.
(Also, remove the cross-device link check, that was moved into the file
systems some time ago.)
 1.20 01-Feb-1996  jtc Rename struct timespec fields to conform to POSIX.1b
 1.19 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.18 23-May-1995  mycroft Remove gratuitous extra indirections.
 1.17 13-Dec-1994  mycroft Sync with CSRG.
 1.16 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.15 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.14 21-Apr-1994  cgd Convert mount, vnode, and buf structs to use <sys/queue.h>. Also,
some knf and structure frobbing to do along with it.
 1.13 10-Apr-1994  cgd patchkit date deletions!
 1.12 12-Mar-1994  cgd fix rcs id
 1.11 09-Mar-1994  ws Make FFS optional
 1.10 18-Dec-1993  mycroft Canonicalize all #includes.
 1.9 26-Nov-1993  ws Bug fixes to ISOFS
 1.8 07-Sep-1993  ws branches: 1.8.2;
Changes to VFS readdir semantics
NFS changes for better cookie support
ISOFS changes for better Rockridge support and support for generation numbers
 1.7 03-Sep-1993  jtc Include systm.h to get prototypes (and possibly inlines) of *max functions.
 1.6 16-Jul-1993  cgd ANSI mods.
(originally committed by andrew on 1993/06/27 06:58:35)
 1.5 16-Jul-1993  cgd fix for macklem's bogus use of the va_flags field, supplied by
John Woods, jfwfrom: @ksr.com. also, fixes the following problems:
the va_gen field is in a similar position
(Suns are going to be reporting the change-date microseconds as their
"generation"), I've supplied my own set of diffs below for your inspection.
Note these aren't even compiled, but they're pretty similar to what I had
to do to our older version of OSF/1 here. (There's also an unrelated change
supplied for xdr_subs.h; the pointer types supplied to the fxdr_time() and
txdr_time() macros are not, in fact, both struct timevals. That turns out
to be one of many tips-of-the-iceberg facing those porting the (old) Berkeley
NFS code to 64-bit machines...)
(originally committed by cgd on 1993/06/03 01:12:42)
 1.4 16-Jul-1993  cgd more rcs id adding and header cleanup. i like vi macros!
(originally committed by cgd on 1993/05/20 03:18:44)
 1.3 10-Apr-1993  glass migrated code to make split possible
 1.2 21-Mar-1993  cgd after 0.2.2 "stable" patches applied
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.4 01-Mar-1998  fvdl Import some files that were changed after Lite2
 1.1.1.3 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.8.2.3 26-Nov-1993  mycroft Merge changes from trunk.
 1.8.2.2 14-Nov-1993  mycroft Canonicalize all #includes.
 1.8.2.1 24-Sep-1993  mycroft Make all files using spl*() #include cpu.h. Changes from trunk.
nfs_vfsops.c, nfsmount.h: Make nfs_quotactl() take an int rather than a uid_t,
as it might be -1.
nfs_vnops.c: va_size and va_bytes are now quads.
 1.25.4.2 04-Mar-1997  mycroft Pull up bug fixes from -current, per fvdl.
 1.25.4.1 11-Dec-1996  mycroft From trunk:
Always call vnode_pager_uncache() when removing a file.
 1.29.2.1 12-Mar-1997  is Merge in changes from Trunk
 1.37.2.1 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.38.2.2 01-Feb-1999  cgd pull up rev 1.46 from trunk (mrg)
 1.38.2.1 22-Dec-1997  perry pullup a fix to a critical NFS bug (fvdl)
 1.50.2.2 16-Dec-1999  he Pull up revision 1.52 (requested by fvdl):
Correct length check in readdirplus, making 32k readdir sizes
work.
 1.50.2.1 04-May-1999  perry branches: 1.50.2.1.2;
pullup 1.50->1.51 (sommerfeld)
 1.50.2.1.2.2 11-Jul-1999  chs remove uvm_vnp_uncache(), it's no longer needed.
 1.50.2.1.2.1 21-Jun-1999  thorpej Sync w/ -current.
 1.51.8.1 27-Dec-1999  wrstuden Pull up to last week's -current.
 1.51.2.2 08-Dec-2000  bouyer Sync with HEAD.
 1.51.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.54.4.1 14-Dec-2000  he Pull up revision 1.58 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.59.4.3 10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.59.4.2 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.59.4.1 03-Aug-2001  lukem update to -current
 1.59.2.5 11-Dec-2002  thorpej Sync with HEAD.
 1.59.2.4 18-Oct-2002  nathanw Catch up to -current.
 1.59.2.3 14-Nov-2001  nathanw Catch up to -current.
 1.59.2.2 26-Sep-2001  nathanw Catch up to -current.
Again.
 1.59.2.1 24-Aug-2001  nathanw Catch up with -current.
 1.60.2.1 01-Oct-2001  fvdl Catch up with -current.
 1.61.2.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.62.10.5 05-Jun-2004  jdc Pull up revision 1.89 (requested by yamt in ticket #1706).

nfsrv_create: fix an LP64 problem for exclusive create.
 1.62.10.4 10-Jul-2003  tron Pull up revision 1.80 (requested by bouyer in ticket #1373):
nfsrv_commit(): return success and don't do anything for requests which start
past the end of the file. This can happen when two clients are writing to
the same file.
Close PR 21696 by myself, discussed on tech-net in 2003/05 and 2003/06.
Issue raised by Chuck Silvers (commit and truncate ops need to be serialised)
still unaddressed.
 1.62.10.3 30-Sep-2002  lukem Pull up revision 1.65 (requested by bouyer in ticket #880):
In nfs_commit(), sanity-check what we get from network: if we try to flush
past end of file, or if off + cnt overflows a quad_t, flush to end of file.
 1.62.10.2 30-Sep-2002  lukem Pull up revision 1.63 (requested by bouyer in ticket #880):
nfsrv_commit(): Properly handle the case cnt == 0, which means "flush to
end of file". Calling VOP_FSYNC with start == end triggers a DIAGNOSTIC
check. Noticed with NFSv3 Linux clients. OK'd by fvdl.
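
The nfsrv_commit() entries above describe three range checks: a request that starts past end of file is a no-op, cnt == 0 means "flush to end of file", and an off + cnt that overflows or runs past end of file is clamped. A minimal sketch of that logic follows. The helper name commit_range and its signature are hypothetical, and it uses unsigned 64-bit offsets for simplicity where the log mentions quad_t; it is not the kernel's code.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's code: decide the byte range to
 * flush for a COMMIT request.  off and cnt come from the network;
 * file_size is the current size of the file.  On success *endp holds
 * the exclusive end of the range and 1 is returned; 0 means there is
 * nothing to flush.
 */
static int
commit_range(uint64_t off, uint32_t cnt, uint64_t file_size, uint64_t *endp)
{
	uint64_t end;

	if (off >= file_size)
		return 0;		/* starts at or past EOF: no-op */

	if (cnt == 0)
		end = file_size;	/* cnt == 0: flush to end of file */
	else if (cnt > UINT64_MAX - off)
		end = file_size;	/* off + cnt would overflow */
	else
		end = off + cnt;

	if (end > file_size)
		end = file_size;	/* never flush past EOF */

	*endp = end;
	return 1;
}
```

Whenever this sketch returns 1 the end is strictly greater than off, so a caller would never pass start == end to the flush routine.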
 1.62.10.1 30-Sep-2002  lukem Pull up revision 1.64 (requested by bouyer in ticket #879):
In nfsrv_create(), kill an extra PNBUF_PUT() in the NFSv2 mknod case. The
pnbuf has already been freed by VOP_MKNOD. This should have been removed in
rev 1.60.
Should fix PR 18013, OK'd by fvdl.
 1.79.2.8 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.79.2.7 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.79.2.6 18-Dec-2004  skrll Sync with HEAD.
 1.79.2.5 21-Sep-2004  skrll Fix the sync with head I botched.
 1.79.2.4 18-Sep-2004  skrll Sync with HEAD.
 1.79.2.3 24-Aug-2004  skrll Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.79.2.2 03-Aug-2004  skrll Sync with HEAD
 1.79.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.87.2.1 05-Jun-2004  jdc Pull up revision 1.89 (requested by yamt in ticket #445).

nfsrv_create: fix an LP64 problem for exclusive create.
 1.93.4.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.93.2.1 29-Apr-2005  kent sync with -current
 1.95.2.12 17-Mar-2008  yamt sync with head.
 1.95.2.11 27-Feb-2008  yamt drop lazy mapping of mbuf external storage for now, because:
- it's currently broken wrt asm code. (cpu_in_cksum)
- there are other approaches worth considering. eg. sf_buf
 1.95.2.10 27-Feb-2008  yamt sync with head.
 1.95.2.9 04-Feb-2008  yamt sync with head.
 1.95.2.8 21-Jan-2008  yamt sync with head
 1.95.2.7 07-Dec-2007  yamt sync with head
 1.95.2.6 27-Oct-2007  yamt sync with head.
 1.95.2.5 03-Sep-2007  yamt sync with head.
 1.95.2.4 26-Feb-2007  yamt sync with head.
 1.95.2.3 30-Dec-2006  yamt sync with head.
 1.95.2.2 21-Jun-2006  yamt sync with head.
 1.95.2.1 07-Jul-2005  yamt nfsrv_read: defer mbuf mapping.
 1.98.6.2 18-Nov-2005  yamt - associate read-ahead context to vnode, rather than file.
- revert VOP_READ prototype.
 1.98.6.1 15-Nov-2005  yamt adapt ffs, lfs, nfs.
 1.99.2.2 15-Jan-2006  yamt sync with head.
 1.99.2.1 31-Dec-2005  yamt - adapt nfs.
- nfs_doio_read: #if 0 out "killproc if text is modified" part of
the code as it's broken. (a process reading the modified text is not
necessarily a process which is using the file as a text.)
 1.100.4.3 01-Jun-2006  kardel Sync with head.
 1.100.4.2 22-Apr-2006  simonb Sync with head.
 1.100.4.1 04-Feb-2006  simonb Adapt for timecounters: mostly use get*time() and use "time_second"
instead of "time.tv_sec".
 1.100.2.1 09-Sep-2006  rpaulo sync with head
 1.101.6.2 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.101.6.1 28-Mar-2006  tron Merge 2006-03-28 NetBSD-current into the "peter-altq" branch.
 1.101.4.6 11-May-2006  elad sync with head
 1.101.4.5 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.101.4.4 19-Apr-2006  elad sync with head.
 1.101.4.3 12-Mar-2006  elad Get rid of NFSW_SAMECRED() that uses memcmp() to compare two credentials,
and use a new nfsrv_samecred(), using kauth(9).

Note that the NFSW_SAMECRED() macro used to check nd_flag of both
descriptors for NB_KERBAUTH too; we don't do that. [documented]

Based on code in FreeBSD, thanks to Jeff Roberson.
 1.101.4.2 10-Mar-2006  elad generic_authorize() -> kauth_authorize_generic().
 1.101.4.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.101.2.5 03-Sep-2006  yamt sync with head.
 1.101.2.4 11-Aug-2006  yamt sync with head
 1.101.2.3 26-Jun-2006  yamt sync with head.
 1.101.2.2 24-May-2006  yamt sync with head.
 1.101.2.1 01-Apr-2006  yamt sync with head.
 1.109.2.1 19-Jun-2006  chap Sync with head.
 1.111.2.1 13-Jul-2006  gdamore Merge from HEAD.
 1.117.4.2 10-Dec-2006  yamt sync with head.
 1.117.4.1 22-Oct-2006  yamt sync with head
 1.117.2.3 09-Feb-2007  ad Sync with HEAD.
 1.117.2.2 12-Jan-2007  ad Sync with head.
 1.117.2.1 18-Nov-2006  ad Sync with head.
 1.119.2.2 10-Mar-2007  bouyer Pull up following revision(s) (requested by chs in ticket #506):
sys/nfs/nfs_serv.c: revision 1.124
after freeing cookies, set the pointer to NULL to prevent dangling reuse
 1.119.2.1 17-Feb-2007  tron Apply patch (requested by chs in ticket #422):
- Fix various deadlock problems with nullfs and unionfs.
- Speed up path lookups by up to 25%.
 1.123.2.3 15-Apr-2007  yamt sync with head.
 1.123.2.2 12-Mar-2007  rmind Sync with HEAD.
 1.123.2.1 28-Feb-2007  yamt sync with head. (somehow missed in the previous)
 1.127.4.1 11-Jul-2007  mjf Sync with head.
 1.127.2.4 26-Aug-2007  yamt - mark nfssvc(2) MPSAFE and move most of nfsd out of the kernel lock.
- remove unused ns_solock.
- remove some of KERNEL_LOCK/UNLOCK which are not necessary on this branch.
 1.127.2.3 20-Aug-2007  ad Sync with HEAD.
 1.127.2.2 17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.127.2.1 10-Apr-2007  ad Sync with head.
 1.128.4.1 15-Aug-2007  skrll Sync with HEAD.
 1.129.10.2 27-Jul-2007  yamt stop nfs tick when we have nothing to do.
 1.129.10.1 27-Jul-2007  yamt file nfs_serv.c was added on branch matt-mips64 on 2007-07-27 10:03:59 +0000
 1.129.8.1 14-Oct-2007  yamt sync with head.
 1.129.6.3 23-Mar-2008  matt sync with HEAD
 1.129.6.2 09-Jan-2008  matt sync with HEAD
 1.129.6.1 06-Nov-2007  matt sync with HEAD
 1.129.4.3 09-Dec-2007  jmcneill Sync with HEAD.
 1.129.4.2 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.129.4.1 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.130.4.3 18-Feb-2008  mjf Sync with HEAD.
 1.130.4.2 27-Dec-2007  mjf Sync with HEAD.
 1.130.4.1 08-Dec-2007  mjf Sync with HEAD.
 1.131.2.3 26-Dec-2007  ad Sync with head.
 1.131.2.2 08-Dec-2007  ad Sync with head.
 1.131.2.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.132.4.1 02-Jan-2008  bouyer Sync with HEAD
 1.135.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.135.6.1 03-Apr-2008  mjf Sync with HEAD.
 1.135.2.1 24-Mar-2008  keiichi sync with head.
 1.138.16.3 30-Sep-2012  bouyer Pull up following revision(s) (requested by chs in ticket #1794):
sys/nfs/nfs_serv.c: revision 1.164 via patch
fix error handling in nfsrv_rename(): when the first nfs_namei() fails,
don't try to free the resources allocated by a successful lookup.
 1.138.16.2 14-Feb-2010  bouyer Pull up following revision(s) (requested by pooka in ticket #1289):
sys/sys/namei.src: revision 1.14
sys/kern/vfs_syscalls.c: revision 1.401
sys/nfs/nfs_serv.c: revision 1.149
sys/sys/namei.h: regen
Define namei flag INRENAME and set it if a lookup operation is part
of rename. This helps with building better asserts for rename in
the DELETE lookup ... the RENAME lookup is quite obviously a part
of rename.
 1.138.16.1 13-Apr-2009  snj branches: 1.138.16.1.4;
Pull up following revision(s) (requested by ad in ticket #700):
sys/nfs/nfs_serv.c: revision 1.144
sys/nfs/nfsm_subs.h: revision 1.51
PR kern/41158: nfs_rename() locking against myself
nfsrv_rename() can exit without calling genfs_renamelock_exit() because
the nfsm_reply() can do return (0) on error.
Change nfsm_reply to use 'error = 0; goto nfsmout' instead.
Fix a few places so it's safe to goto nfsmout from nfsm_reply, or other
macros calling it.
As a side effect it could fix a missing vrele(dirp) in various places where
nfsm_reply could return(0).
 1.138.16.1.4.2 24-Dec-2011  matt Fix call to sokvaalloc (now takes 3 arguments)
 1.138.16.1.4.1 20-Apr-2010  matt Pull in some NFS fixes from netbsd-5.
 1.138.14.2 28-Apr-2009  skrll Sync with HEAD.
 1.138.14.1 19-Jan-2009  skrll Sync with HEAD.
 1.138.12.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.138.6.4 11-Aug-2010  yamt sync with head.
 1.138.6.3 11-Mar-2010  yamt sync with head
 1.138.6.2 20-Jun-2009  yamt sync with head
 1.138.6.1 04-May-2009  yamt sync with head.
 1.138.2.3 27-Dec-2008  christos merge with head.
 1.138.2.2 20-Nov-2008  christos merge with head.
 1.138.2.1 29-Mar-2008  christos Welcome to the time_t=long long dev_t=uint64_t branch.
 1.142.2.2 23-Jul-2009  jym Sync with HEAD.
 1.142.2.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.150.4.3 21-Apr-2011  rmind sync with head
 1.150.4.2 05-Mar-2011  rmind sync with head
 1.150.4.1 03-Jul-2010  rmind sync with head
 1.150.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.155.4.1 08-Feb-2011  bouyer Sync with HEAD
 1.155.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.161.2.3 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.161.2.2 30-Oct-2012  yamt sync with head
 1.161.2.1 17-Apr-2012  yamt sync with head
 1.162.2.1 18-Feb-2012  mrg merge to -current.
 1.163.4.1 01-Nov-2012  matt sync with netbsd-6-0-RELEASE.
 1.163.2.2 03-Sep-2012  riz Pull up following revision(s) (requested by christos in ticket #537):
sys/nfs/nfs_serv.c: revision 1.165
When unloading the nfsserver module, call nfs_fini() so that the nfsrvdescpl
pool gets destroyed. Otherwise we are left with a stray pool that points to
unmapped memory behind (and bad things happen). Typically you get seemingly
random page faults (without printing uvm_fault) that happen in various pool
operations. Most frequent one is the pool_drain() from the page daemon.
 1.163.2.1 03-Sep-2012  riz Pull up following revision(s) (requested by chs in ticket #530):
sys/nfs/nfs_serv.c: revision 1.164
fix error handling in nfsrv_rename(): when the first nfs_namei() fails,
don't try to free the resources allocated by a successful lookup.
 1.165.4.1 18-May-2014  rmind sync with head
 1.165.2.2 03-Dec-2017  jdolecek update from HEAD
 1.165.2.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.170.6.2 28-Aug-2017  skrll Sync with HEAD
 1.170.6.1 06-Jun-2015  skrll Sync with HEAD
 1.173.10.2 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.173.10.1 21-May-2018  pgoyette Sync with HEAD
 1.173.4.1 30-Mar-2023  martin Pull up following revision(s) (requested by riastradh in ticket #1810):

sys/nfs/nfs_serv.c: revision 1.184
sys/nfs/nfs_srvsubs.c: revision 1.17
sys/nfs/nfsm_subs.h: revision 1.56
sys/nfs/nfsm_subs.h: revision 1.57

nfs: Use unsigned fhlen so we don't trip over negative values.

nfs: Avoid integer overflow in nfs_namei bounds check.

nfs: Use unsigned name lengths so we don't trip over negative ones.
- nfsm_strsiz is only used with uint32_t in callers, but let's not
leave it as a rake to step on.
- nfsm_srvnamesiz is abused with signed s. The internal conversion
to unsigned serves to reject both negative and too-large values in
such callers.
XXX Should make all callers use unsigned, rather than flipping back
and forth between signed and unsigned for name lengths.

nfs: Avoid free of uninitialized on bad name size in create, mknod.
XXX These error branches are a nightmare and need to be more
systematically cleaned up. Even if they are correct now, they are
impossible to audit and extremely fragile in case anyone ever needs
to make other changes to them.
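
The note above that an "internal conversion to unsigned serves to reject both negative and too-large values" can be illustrated with a small sketch. The function name_len_ok and the limit NAME_MAX_LEN are hypothetical stand-ins chosen for illustration; this is not the nfsm_srvnamesiz macro itself.

```c
#include <stdint.h>

#define NAME_MAX_LEN 255u	/* illustrative limit, not taken from the source */

/*
 * Hypothetical check, not the nfsm_srvnamesiz macro.  A negative signed
 * length converts to a very large uint32_t, so one unsigned comparison
 * rejects both negative and too-large values.
 */
static int
name_len_ok(int32_t s)
{
	return (uint32_t)s <= NAME_MAX_LEN;
}
```

For example, a length of -1 becomes 4294967295 after the conversion and fails the comparison, just as 256 does.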
 1.174.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.174.2.1 10-Jun-2019  christos Sync with HEAD
 1.177.4.1 30-Mar-2023  martin Pull up following revision(s) (requested by riastradh in ticket #1617):

sys/nfs/nfs_serv.c: revision 1.184
sys/nfs/nfs_srvsubs.c: revision 1.17
sys/nfs/nfsm_subs.h: revision 1.56
sys/nfs/nfsm_subs.h: revision 1.57

nfs: Use unsigned fhlen so we don't trip over negative values.

nfs: Avoid integer overflow in nfs_namei bounds check.

nfs: Use unsigned name lengths so we don't trip over negative ones.
- nfsm_strsiz is only used with uint32_t in callers, but let's not
leave it as a rake to step on.
- nfsm_srvnamesiz is abused with signed s. The internal conversion
to unsigned serves to reject both negative and too-large values in
such callers.
XXX Should make all callers use unsigned, rather than flipping back
and forth between signed and unsigned for name lengths.

nfs: Avoid free of uninitialized on bad name size in create, mknod.
XXX These error branches are a nightmare and need to be more
systematically cleaned up. Even if they are correct now, they are
impossible to audit and extremely fragile in case anyone ever needs
to make other changes to them.
 1.178.2.1 17-Jan-2020  ad Sync with head.
 1.183.4.1 30-Mar-2023  martin Pull up following revision(s) (requested by riastradh in ticket #134):

sys/nfs/nfs_serv.c: revision 1.184
sys/nfs/nfs_srvsubs.c: revision 1.17
sys/nfs/nfsm_subs.h: revision 1.56
sys/nfs/nfsm_subs.h: revision 1.57

nfs: Use unsigned fhlen so we don't trip over negative values.

nfs: Avoid integer overflow in nfs_namei bounds check.

nfs: Use unsigned name lengths so we don't trip over negative ones.
- nfsm_strsiz is only used with uint32_t in callers, but let's not
leave it as a rake to step on.
- nfsm_srvnamesiz is abused with signed s. The internal conversion
to unsigned serves to reject both negative and too-large values in
such callers.
XXX Should make all callers use unsigned, rather than flipping back
and forth between signed and unsigned for name lengths.

nfs: Avoid free of uninitialized on bad name size in create, mknod.
XXX These error branches are a nightmare and need to be more
systematically cleaned up. Even if they are correct now, they are
impossible to audit and extremely fragile in case anyone ever needs
to make other changes to them.
 1.203 22-Feb-2025  mlelstv Poll for interrupted NFS operations if the waiting process hasn't
issued the operation and then won't get the interrupt signal itself.
 1.202 05-Feb-2024  andvar branches: 1.202.2;
fix various typos in comments.
 1.201 09-Apr-2023  riastradh nfs: Simplify assertion. No functional change intended.
 1.200 03-Sep-2018  riastradh Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
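
The truncation hazard described in the entry above can be sketched as follows. The uimin defined here is a local stand-in with the shape the message describes (defined on unsigned int), and the two clamp_* helpers are hypothetical. The sketch assumes a platform where unsigned int is 32 bits wide.

```c
#include <stdint.h>

/* Local stand-in for the libkern function: defined on unsigned int only. */
static unsigned int
uimin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Macro form: compares in the operands' own type, so nothing is truncated. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical caller that passes 64-bit values to the function form. */
static uint64_t
clamp_with_function(uint64_t len, uint64_t limit)
{
	/*
	 * The casts spell out the conversion that would otherwise happen
	 * implicitly and silently at the call: the high 32 bits are lost.
	 */
	return uimin((unsigned int)len, (unsigned int)limit);
}

/* The same clamp using the macro keeps all 64 bits. */
static uint64_t
clamp_with_macro(uint64_t len, uint64_t limit)
{
	return MIN(len, limit);
}
```

With len = 0x100000001 and limit = 0x200000000, the macro form yields 0x100000001, while the function form compares 1 against 0 and yields 0.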
 1.199 21-Jan-2018  christos branches: 1.199.2; 1.199.4;
PR/40491: From Tobias Ulmer in tech-kern@:
1. Protect the nfs request queue with its own mutex
2. make the nfs_receive queue check for signals so that intr mounts
can be interrupted.
XXX: pullup-8
 1.198 17-Jun-2016  christos branches: 1.198.10;
Serialize all access to the NFS request queue via splsoftnet(). Fixes random
crashes.
XXX: Pullup-7
 1.197 15-Jul-2015  manu Fix soft NFS force unmount

For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that awaited an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.

Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.

Reviewed by Chuck Silvers.
 1.196 09-May-2015  rtr change sosend() to accept sockaddr * instead of mbuf * for nam.

bump to 7.99.16
 1.195 02-May-2015  rtr make connect syscall use sockaddr_big and modify pr_{send,connect}
nam parameter type from mbuf * to sockaddr *.

final commit for parameter type changes to protocol user requests

* bump kernel version to 7.99.15 for parameter type changes to pr_{send,connect}
 1.194 03-Apr-2015  rtr * change pr_bind to accept struct sockaddr * instead of struct mbuf *
* update protocol bind implementations to use/expect sockaddr *
instead of mbuf *
* introduce sockaddr_big struct for storage of addr data passed via
sys_bind; sockaddr_big is of sufficient size and alignment to
accommodate all addr data sizes received.
* modify sys_bind to allocate sockaddr_big instead of using an mbuf.
* bump kernel version to 7.99.9 for change to pr_bind() parameter type.

Patch posted to tech-net@
http://mail-index.netbsd.org/tech-net/2015/03/15/msg005004.html

The choice to use a new structure sockaddr_big has been retained since
changing sockaddr_storage size would lead to unnecessary ABI change. The
use of the new structure does not preclude future work that increases
the size of sockaddr_storage and at that time sockaddr_big may be
trivially replaced.

Tested by mrg@ and myself, discussed with rmind@, posted to tech-net@
 1.193 05-Sep-2014  matt branches: 1.193.2;
Don't use catch as a variable name.
 1.192 05-Aug-2014  rtr branches: 1.192.2; 1.192.4;
split PRU_SEND function out of pr_generic() usrreq switches and put into
separate functions

xxx_send(struct socket *, struct mbuf *, struct mbuf *,
struct mbuf *, struct lwp *)

- always KASSERT(solocked(so)) even if not implemented

- replace calls to pr_generic() with req = PRU_SEND with calls to
pr_send()

rename existing functions that operate on PCB for consistency (and to
free up their names for xxx_send() PRUs)

- l2cap_send() -> l2cap_send_pcb()
- sco_send() -> sco_send_pcb()
- rfcomm_send() -> rfcomm_send_pcb()

patch reviewed by rmind
 1.191 18-May-2014  rmind Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw. Welcome to 6.99.62!
 1.190 14-Sep-2013  martin branches: 1.190.2;
Backout wildcard pragma to kill warnings and instead sprinkle a few dozen
__unused attributes.
Requested by joerg@
 1.189 23-Mar-2011  tls branches: 1.189.4; 1.189.14; 1.189.18;
As suggested by matt@: change socket buffer reservations for NFS send/receive
to 3 times max RPC size rather than 2 times. Avoids nasty TCP stalls observed
at Panix. Will require increase to sbmax via sysctl for those running really
huge NFS rsize/wsize (>64K).
 1.188 17-Dec-2010  yamt branches: 1.188.2;
nfs_rcvunlock: don't wake up all waiters.
 1.187 02-Mar-2010  pooka branches: 1.187.2;
Get rid of dependency on fs_nfs.h, i.e. source modules with
conditional content depending on if the NFS client is wanted or
not. The server can now be made an independent module not depending
on the nfs client.

Tested with rump_nfs (standalone client), rump_nfsd (standalone
nfsd) and a qemu installation with both the client and the server.
 1.186 13-Feb-2010  yamt nfs_msg: #if 0 out tprintf for now and comment why.
 1.185 19-Jan-2010  yamt branches: 1.185.2;
nfs_request: fix races which break congestion window and make nfs client stuck.
 1.184 31-Dec-2009  christos appease gcc.
 1.183 06-Dec-2009  dyoung For readability's sake, write NULL instead of (type *)0.
 1.182 05-Nov-2009  bouyer Handle EWOULDBLOCK the same way as EPIPE. It seems the TCP socket layer
can return EWOULDBLOCK on some occasion when the connection is broken.
 1.181 16-Oct-2009  pooka If send fails with EMSGSIZE for whatever reason, it's unlikely to
succeed no matter how hard we retry. So just fail the request.
 1.180 14-Mar-2009  dsl ANSIfy another 1261 function definitions.
The only ones left in sys are beyond my sed script!
(or in sys/dist or sys/external)
Mostly they have function pointer parameters.
 1.179 14-Mar-2009  dsl Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
 1.178 04-Feb-2009  ad branches: 1.178.2;
PR kern/40491 5.0: nfs timer can crash/break on smp

Hack around it by acquiring softnet_lock around the client-side timer loop.
 1.177 21-Jan-2009  yamt restore the pre socket locking patch signal behaviour.
this fixes a busy-loop in nfs_connect.
 1.176 18-Jan-2009  mrg Actually enforce the maximum timeout (60s by default) rather
than backing off to 256*SRTT. This is why it sometimes could take
hours for a NFS mount to come back when the server returned.

contributed anonymously.
 1.175 23-Nov-2008  mrg avoid noisy nfs_timer/nfs_reply DEBUG output that occurs when the
NFS server goes away. use ratelimit(9) and only print the console
error once every 10 seconds. PR#31562.
 1.174 19-Nov-2008  ad Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.173 07-Oct-2008  pooka branches: 1.173.2; 1.173.4;
nuke outdated comment
 1.172 30-Sep-2008  pooka Apply #ifdef modern art to make NFSSERVER-without-NFS possible.
 1.171 06-Aug-2008  plunky Convert socket options code to use a sockopt structure
instead of laying everything into an mbuf.

approved by core
 1.170 24-Apr-2008  ad branches: 1.170.2; 1.170.4; 1.170.8;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
 1.169 10-Apr-2008  yamt branches: 1.169.2;
- make nfs_receive and nfs_reply static.
- ansify.
 1.168 28-Mar-2008  yamt whitespace.
 1.167 02-Jan-2008  yamt branches: 1.167.6;
use kmem_alloc instead of malloc.
 1.166 02-Jan-2008  ad Merge vmlocking2 to head.
 1.165 04-Dec-2007  yamt branches: 1.165.4;
merge non-intrusive nfs changes from vmlocking.
 1.164 21-Oct-2007  yamt branches: 1.164.2; 1.164.4;
remove lwp argument from nfs_reconnect and always use &lwp0
because who triggers a reconnect doesn't really matter here. PR/37145.
 1.163 05-Aug-2007  yamt branches: 1.163.2; 1.163.6; 1.163.8;
use kpause rather than lbolt.
 1.162 02-Aug-2007  yamt branches: 1.162.2;
nfsdsock_unlock: add an assertion.
 1.161 27-Jul-2007  yamt stop nfs tick when we have nothing to do.
 1.160 09-Jul-2007  ad branches: 1.160.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.159 30-Jun-2007  dsl Updates for changed prototype of kauth_cred_set/getgroups().
 1.158 22-Jun-2007  yamt - nfsrv_slpderef: fix a locking botch.
(should fix "slp->ns_sref == 0" assertion failures in nfsrv_init)
- add some related assertions.
 1.157 01-Jun-2007  yamt use mutex and condvar.
 1.156 01-Jun-2007  yamt nfsdsock_lock: fix an inverted check of SLP_VALID.
 1.155 28-May-2007  yamt - remove nfs_exit exit hook. ok'ed by christos@.
- as far as i understand the code, it shouldn't be necessary
because nfs_request can't return without removing its request
and r->r_lwp is either curlwp or NULL.
- even if it's necessary, leaking requests is not the correct way
to recover from the condition.
- nfs_request: add a related assertion.
 1.154 02-May-2007  yamt nfs_rcvlock: fix NFSMNT_INT check, which has been broken since rev.1.39.
 1.153 02-May-2007  yamt - nfs_reply: keep rcvlock longer so that lwps which have already received
their reply won't be stuck in nfs_receive.
- nfs_rcvlock: check exceptions before sleeping on the lock.
- nfs_rcvunlock: use cv_broadcast rather than cv_signal to ensure that
lwps which received their reply get woken up.
 1.152 30-Apr-2007  yamt remove R_GETONEREP.
 1.151 29-Apr-2007  yamt use condvar.
 1.150 29-Apr-2007  yamt use mutex and condvar.
 1.149 12-Mar-2007  ad branches: 1.149.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.
 1.148 04-Mar-2007  christos branches: 1.148.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.147 22-Feb-2007  thorpej TRUE -> true, FALSE -> false
 1.146 21-Feb-2007  thorpej Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.145 09-Feb-2007  ad branches: 1.145.2;
Merge newlock2 to head.
 1.144 27-Dec-2006  yamt remove nqnfs.
 1.143 06-Dec-2006  yamt nfs_disconnect: 2 -> SHUT_RDWR. no functional change.
 1.142 06-Dec-2006  yamt nfsrv_rcv: claim ownership of received mbufs.
 1.141 09-Nov-2006  yamt remove some __unused in function parameters.
 1.140 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.139 10-Oct-2006  dogcow change the MOWNER_INIT define to take two args; fix extant struct mowner
decls to use it. Makes options MBUFTRACE compile again and not whinge about
missing structure declarations. (Also makes initialization consistent.)
 1.138 02-Sep-2006  yamt branches: 1.138.2; 1.138.4;
nfsdreq_free: remove an assertion which is not true.
 1.137 15-Jul-2006  yamt nfs_getreq: fix a kauth fallout.
pointed by nanashi-san. http://pc8.2ch.net/test/read.cgi/unix/1145181361/786
 1.136 30-Jun-2006  yamt nfs_request: don't bother to handle NFSERR_STALEWRITEVERF
because it isn't a real nfs error value.
 1.135 07-Jun-2006  kardel branches: 1.135.2;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.134 28-May-2006  blymn Clean up bogus whitespace
 1.133 28-May-2006  yamt nfs_request: use kauth_cred_free rather than kauth_cred_destroy.
 1.132 19-May-2006  yamt branches: 1.132.2;
- fix compilation problem for !NFSSERVER && NFS.
pointed by Tom Spindler on source-changes@.
- make nfs_srvdesc_pool static.
 1.131 18-May-2006  yamt - fix some leaks in nfsd, introduced by kauth changes.
- simplify code.
- add some assertions.
- wrap some long lines.
- remove an unnecessary ";".
 1.130 14-May-2006  elad integrate kauth.
 1.129 10-May-2006  mrg quell GCC 4.1 uninitialised variable warnings.

XXX: we should audit the tree for which old ones are no longer needed
after getting the older compilers out of the tree..
 1.128 15-Apr-2006  dogcow #if -> #ifdef
 1.127 15-Apr-2006  christos Coverity CID 734: Define NFS_TEST_HEAVY for testing nfsds, and use this to
ifdef out dead code. XXX: Why is this turned on by default?
 1.126 01-Mar-2006  rpaulo branches: 1.126.2; 1.126.4; 1.126.6;
Back out revision 1.125 and 1.124. The code for checking if
slp->ns_reclen == 0, was already there since "Linux sometimes
generates 0-lenght records.".

Bad Rui...
 1.125 01-Mar-2006  rpaulo In nfsrv_getstream(), ns_reclen will never be negative due to the
previous assignment (recmark & ~0x80000000).
Pointed out by Christos.
 1.124 01-Mar-2006  rpaulo From FreeBSD SA-06:10
Correct a remote kernel panic when processing zero-length RPC records
via TCP.
 1.123 01-Mar-2006  yamt merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.
 1.122 03-Jan-2006  yamt branches: 1.122.2; 1.122.4;
nfssvc_nfsd: reduce a chance for a slow peer to capture all our threads.
instead of sleeping to wait for the socket to send our reply,
just hand-off our reply to the thread which is holding the socket.
 1.121 03-Jan-2006  yamt improve nfsd locking.
- don't bother to take nfs_sndlock when doing nfsrv_rcv.
unlike client, we never reconnect.
- nfsrv_getstream: fix the case that m_split sleeps.
- free socket in nfsrv_slpderef rather than nfsrv_zapsock.
fix race with nfssvc_nfsd.
- while i'm here, remove NFSD_WAITING and NFSD_REQINPROG
as they are redundant.
- some comments and assertions.
 1.120 30-Dec-2005  jmmv branches: 1.120.2;
Avoid dereferencing the lwp parameter in nfs_receive, as it is always NULL.
Solves a crash when mounting NFS shares. (The proc parameter used before
the conversion to lwp's was NULL too, so the addition of 'l->l_proc' in the
code was extra.)
 1.119 11-Dec-2005  christos merge ktrace-lwp.
 1.118 21-Nov-2005  yamt use c99 initializers for proct.
 1.117 25-Sep-2005  tron branches: 1.117.6;
Correct typo in last commit to fix compilation error.
 1.116 25-Sep-2005  christos Add missing TIMEDOUT and IO errors.
 1.115 25-Sep-2005  christos Convert from nfs error values to regular errno's. Although most values of
nfs errors are chosen to be the same as errno, some of them are not and
it is better for portability to do the conversion anyway. Also a server
can return a bad error number that can cause the server to crash, because
it can have the high bits that are used internally set. This was the case
with amd. Finally nfs_request() should return a valid errno, because we
can return a bogus value to userland. Thanks to rpaulo for debugging this.
 1.114 29-May-2005  christos branches: 1.114.2;
- sprinkle const
- avoid shadowed variables
- mark bad const use with XXXUNCONST
 1.113 29-Mar-2005  yamt nfsrv_rcv: don't do so_receive from socket upcall context.
while there's little benefit, it complicates locking and confuses
flow control.
 1.112 26-Feb-2005  perry branches: 1.112.2;
nuke trailing whitespace
 1.111 17-Sep-2004  skrll branches: 1.111.4; 1.111.6;
There's no need to pass a proc value when using UIO_SYSSPACE with
vn_rdwr(9) and uiomove(9).

OK'd by Jason Thorpe
 1.110 24-Aug-2004  yamt nfs_request: a workaround for servers doing "maproot".
for i/o requests which are expected not to fail due to permission
to mimic unix file open semantics (READ, WRITE, COMMIT),
try two credentials. namely, the file owner's one and open time one.
remember which credential worked on a per-file basis and try it first
next time to minimize number of retries.
ideas from Chuck Silvers. PR/23716 and PR/24987.
 1.109 18-Aug-2004  yamt remove a "proc botch" debug printf. ok'ed by Jonathan Stone.
 1.108 24-Jun-2004  jonathan Rename MBUFTRACE helper function m_claim() to m_claimm(),
for consistency with M_FREE() and m_freem(). Affected files:

sys/mbuf.h
kern/uipc_socket2.c
kern/uipc_mbuf.c
net/if_ethersubr.c
netatalk/ddp_input.c
nfs/nfs_socket.c
 1.107 24-May-2004  jonathan Change DIAGNOSTIC warning in nfs_send() about NULL rep->r_procp: the
warning is triggered pervasively, so print it only once per boot.
(The callers who pass NULL r_procps should soon be fixed to pass a
valid struct proc* ).
 1.106 23-May-2004  yamt - for tcp, use SO_RCVTIMEO to recover from server crash.
otherwise we can be stuck in soreceive forever.
the problem is pointed by Minoura Makoto. PR/25662
- clear r_rexmit on reconnect and clear r_rtt and R_TIMING on retransmit
so that the above (and soft mounts) are happy.
 1.105 22-May-2004  jonathan Eliminate several uses of `curproc' from the socket-layer code and from NFS.

Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded by Matt Thomas.

Changes to soreceive() and (*dom->dom_externalize)() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize. Also, now soreceive() uses
its uio->uio_procp value, pass that same value downward to
(*pr->pru_usrreq)() calls for consistency, instead of (struct proc *)0.

Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req-> r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.

Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
 1.104 10-May-2004  yamt don't do kludge for a reply to a retransmitted request
unless we actually retransmitted the request.
 1.103 21-Apr-2004  christos Replace the statfs() family of system calls with statvfs().
Retain binary compatibility.
 1.102 17-Mar-2004  yamt branches: 1.102.2;
nfs_sndlock: fix nfsd null dereference.
 1.101 10-Mar-2004  matt Don't report EPIPE errors on nfs sockets. These can be due to idle tcp
mounts which will be closed by netapp, solaris, etc. if left idle too long.
 1.100 07-Dec-2003  fvdl Unix semantics dictate that access checks for files are done when it
is opened. An open file can always be read from and/or written to,
depending on how it was opened.

Therefore, the read/write/commit RPCs should never return EACCES,
as they are only performed on files that have been successfully opened
already.

This change improves the current situation and works in most cases.
It simply always uses the most recently known owner/group of the file,
iff the authentication mechanism is AUTH_UNIX (in other cases, the
creds for a successful open are used, but note that no other cases
are currently implemented).

A retry mechanism can be used to catch a few more cases, but this is
a good improvement for now.
 1.99 09-Oct-2003  yamt for nfs_timer_ch, use callout_schedule rather than callout_reset
as the former is a little more efficient.
 1.98 16-Aug-2003  yamt use sizeof() instead of a hardcoded constant.
 1.97 16-Aug-2003  yamt current trylater/jukebox retry delay is way too long and
it has a bug in the backoff calculation. so,
- clip it to 1-60 sec. (suggested by Rick Macklem)
- use a constant multiplier instead of nfs_backoff, which
is already exponential.
- move some related constant definitions to nfs.h from nqnfs.h and
prefix with NFS_ instead of NQ_ because they are not nqnfs-specific.
 1.96 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.95 23-Jul-2003  yamt when rexmitting a request due to NFSERR_JUKEBOX,
use a new xid as RFC1813 says.
 1.94 23-Jul-2003  yamt fix parenthesis mismatch in rev.1.93.
 1.93 23-Jul-2003  yamt use sizeof() instead of hardcoding the size of the array.
 1.92 29-Jun-2003  fvdl branches: 1.92.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.91 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.90 25-Jun-2003  yamt - instead of scanning a list when looking up
{an idle thread, a socket with pending requests},
maintain dedicated lists of them.
- add spin locks.
 1.89 23-Jun-2003  martin Make sure to include opt_foo.h if a defflag option FOO is used.
 1.88 22-May-2003  yamt poolify nfsrv_descript.
 1.87 22-May-2003  yamt interlock for nfs_rcvlock.
 1.86 21-May-2003  yamt - use FREE not free for MALLOC'ed memory.
- remove unneeded caddr_t casts.
 1.85 21-May-2003  yamt remove local definitions of TRUE and FALSE.
 1.84 21-May-2003  yamt indent
 1.83 24-Apr-2003  drochner Change some subordinate functions to take a "struct nfsnode" argument
instead of "struct vnode". This saves a number of pointer dereferences;
it sums up to about half a kB for me. And it paves the way for future
fixes.
While cleaning up, eliminate a write-only member of "struct nfsreq"
and a pointless assignment in the NFS_V2_ONLY case.
 1.82 15-Apr-2003  yamt fix indent.
 1.81 03-Apr-2003  yamt use m_copydata and m_split instead of similar inlined ones.
 1.80 02-Apr-2003  yamt use queue manipulation macros.
 1.79 26-Feb-2003  matt Add MBUFTRACE kernel option.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
 1.78 01-Feb-2003  thorpej Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.
 1.77 01-Dec-2002  matt Make sure these all agree on the same definitions of various variables.
 1.76 27-Sep-2002  provos remove trailing \n in panic(). approved perry.
 1.75 02-Aug-2002  fvdl Initialize recm to NULL inside the loop, so that a record length of
NULL will not accidentally append bogus data (the previous record).

Derived from a fix by Matt Dillon in FreeBSD.
 1.74 12-May-2002  matt branches: 1.74.2; 1.74.4;
Eliminate commons
 1.73 17-Mar-2002  christos use the exithook mechanism to remove the exiting process from the list
of processes to be signalled in a soft mount.
 1.72 27-Feb-2002  lukem nfs_connect(): if NFSMNT_RESVPORT is set, set IP_PORTRANGE_LOW on the socket
rather than using home-grown code to find a free reserved socket.
this also results in nfs pcb's having the INP_ANONPORT and INP_LOWPORT flags
set, which is useful for netstat(1) to know.
 1.71 22-Jan-2002  minoura Back out the previous.
It was my misreading from the lack of mbuf usage...
Sorry for the mess.
 1.70 21-Jan-2002  minoura Correctly write back the updated value of the local variable to the
struct nfssvc_sock.
Affected only when a recordmark of RPC over TCP is fragmented to
multiple mbufs. I do not know whether this code has ever been executed :)
 1.69 10-Nov-2001  lukem add RCSIDs
 1.68 13-Oct-2001  simonb branches: 1.68.2;
Don't initialise the 5th element of some 4 element arrays.
 1.67 09-May-2001  fvdl branches: 1.67.2;
Suppress another case of a potentially noisy error message which
isn't fatal.
 1.66 21-Feb-2001  jdolecek branches: 1.66.2;
make some more constant arrays 'const'
 1.65 27-Dec-2000  jdolecek update commented out code to recent changes of signal structures
 1.64 27-Dec-2000  bjh21 Extra diagnostic assertion: subtle pmap bugs can ultimately lead to trying
to use NULL credentials for NFS ops, so spot them before we dereference them.
 1.63 10-Dec-2000  fvdl Make sobind() take a struct proc *. It already took curproc and
passed it down to the appropriate usrreq function, and this
allows usage for contexts that need to be explicitly different
from curproc (like in the NFS code when binding to a reserved port).
 1.62 27-Sep-2000  fvdl Avoid unused variables for V2_ONLY case.
 1.61 19-Sep-2000  bjh21 Extend NFS_V2_ONLY to remove NQNFS lease support as well. Saves another 10k.
 1.60 19-Sep-2000  fvdl "ENOBUFS" on socket writes isn't really fatal; we may just be too fast
for the driver. Don't log the error, just try again. Could try to
be smart and do a backoff, but it's probably not worth the trouble.
 1.59 19-Sep-2000  bjh21 New kernel option, NFS_V2_ONLY, which aims to reduce the NFS client to just
that required to support NFSv2 mounts. Not finished yet, but already
provides some 44k of saving in code size on arm26. More savings, and some
documentation, are still to come.
 1.58 27-Jun-2000  mrg remove include of <vm/vm.h>
 1.57 09-Jun-2000  fvdl branches: 1.57.2;
Some tweaks to enable NFS over IPv6. The special-casing of AF_INET
should really be removed.
 1.56 27-May-2000  thorpej branches: 1.56.2;
sleep() -> tsleep()
 1.55 30-Mar-2000  augustss Remove register declarations.
 1.54 23-Mar-2000  thorpej New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
 1.53 29-Aug-1999  sommerfeld branches: 1.53.2;
Fix overzealous DIAGNOSTIC check in nfs_disconnect()
(fix pr8249, 8288)
 1.52 30-Jul-1999  fvdl Don't try to copy an mbuf that may have been freed in case of an error.
 1.51 04-Jul-1999  sommerfeld kern/5591: Fix race in the NFS socket code during umount -f and system
shutdown:

During an unmount, wake up all the processes which are waiting to lock
the socket for receive, and wait for them (and the process blocked in
soreceive, if any) to go away before blowing away the socket and the
mount structure.
 1.50 06-Mar-1999  fair branches: 1.50.2; 1.50.4;
Snatch a patch from OpenBSD to fix PRs 6529 and 7074.
Adjust fxdr_hyper() and txdr_hyper() macros.
 1.49 12-Feb-1999  thorpej Fix printf format warnings on Alpha.
 1.48 12-Nov-1998  fvdl Use different names for the "nfscon" label to tsleep(), so that it can
be seen in which one a process is sleeping.
 1.47 11-Sep-1998  mycroft Substantial signal handling changes:
* Increase the size of sigset_t to accommodate 128 signals -- adding new
versions of sys_setprocmask(), sys_sigaction(), sys_sigpending() and
sys_sigsuspend() to handle the changed arguments.
* Abstract the guts of sys_sigaltstack(), sys_setprocmask(), sys_sigaction(),
sys_sigpending() and sys_sigsuspend() into separate functions, and call them
from all the emulations rather than hard-coding everything. (Avoids using
the stackgap crap for these system calls.)
* Add a new flag (p_checksig) to indicate that a process may have signals
pending and userret() needs to do the full (slow) check.
* Eliminate SAS_ALTSTACK; it's exactly the inverse of SS_DISABLE.
* Correct emulation bugs with restoring SS_ONSTACK.
* Make the signal mask in the sigcontext always use the emulated mask format.
* Store signals internally in sigaction structures, rather than maintaining a
bunch of little sigsets for each SA_* bit.
* Keep track of where we put the signal trampoline, rather than figuring it out
in *_sendsig().
* Issue a warning when a non-emulated sigaction bit is observed.
* Add missing emulated signals, and a native SIGPWR (currently not used).
* Implement the `not reset when caught' semantics for relevant signals.

Note: Only code touched by the i386 port has been modified. Other ports and
emulations need to be updated.
 1.46 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.45 20-Jul-1998  fvdl Avoid possibly overflowing an mbuf. From Dan S. Decasper, via Chuck Cranor.
 1.44 25-Jun-1998  thorpej defopt NFSSERVER
 1.43 25-Apr-1998  matt Adapt to new sosend/soreceive and upcall (now down in sowakeup)
 1.42 19-Feb-1998  thorpej Include the NFS option header.
 1.41 30-Jan-1998  fvdl Only take the receive lock before disconnecting when doing it from
nfs_decode_args. Otherwise we might just end up locking against ourselves.

XXX workaround, will do ok for now. Proper fix forthcoming.
 1.40 16-Nov-1997  fvdl Make sure the receive lock is taken when disconnecting a socket. Also
change a check for a 'connected' socket to use the socket rather than
the mount flags.

From Matthias Drochner.
 1.39 10-Oct-1997  fvdl branches: 1.39.2;
* New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.38 22-May-1997  gwr branches: 1.38.4; 1.38.6;
Temporary work-around for PR kern/3579 (from Jonathan Stone).
 1.37 12-May-1997  fvdl * If nfs_reconnect fails, be sure to release the sndlock, otherwise no
other requests will get through and the mount point will be effectively dead.
This could happen for mounts using TCP and -i and/or -s.
* Reserve enough space for UDP sockets. Fixes PR 3008, from Naofumi Honda.
 1.36 08-Apr-1997  fvdl Avoid nfsiods acquiring/releasing a lock, then acquiring it again, before
anyone else can get to it, by checking if a reply was received, and it
has thus become unnecessary to take the lock. From FreeBSD.

XXX I don't really like this, "locks" potentially suffer from the same
problem throughout the whole kernel; they should probably be FIFO everywhere.
 1.35 22-Feb-1997  fvdl Fixes from BSDI (thanks go to Keith Bostic). Original RCS messages:

date: 1995/11/30 20:37:03; author: cp; state: Exp; lines: +25 -14
Change splsoftclock() to splnet();
Make nfsrv_getstream create two copies of data when
splitting up an mbuf rather than two references to the
same external buffer. The symptom this fixes is client
hangs.

date: 1997/02/10 18:41:13; author: cp; state: Exp; lines: +4 -1
Make nfs_realign go away on sparc and add functionality to nfsm_disct.
 1.34 09-Feb-1997  fvdl * Fix some bugs in NQNFS (malformed RPC requests, no directory lease eviction)
* Avoid possible NULL ptr ref in nfs_reply
* Don't ever try to sillyrename directories (from FreeBSD)
 1.33 04-Feb-1997  fvdl branches: 1.33.2;
* Make sure a new socket is created when switching to/from NOCONN with
a mount
* Add extra printf statements to hopefully get some more info on lockups,
specifically when a send error is ignored.
 1.32 31-Jan-1997  thorpej NFSCLIENT -> NFS.
 1.31 13-Oct-1996  christos branches: 1.31.2;
revert kprintf changes
 1.30 10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.29 02-Jul-1996  fvdl Remove bogus check on record length < NFS_MINPACKET.
(From Guy Harris via Rick Macklem).
 1.28 22-May-1996  mycroft Pass a proc pointer down to the usrreq and pcbbind functions for PRU_ATTACH, PRU_BIND and
PRU_CONTROL. The usrreq interface really needs to be split up, but this will have to wait.
Remove SS_PRIV completely.
 1.27 15-Apr-1996  thorpej branches: 1.27.4;
Make this compile again on a SPARC if NFSCLIENT is defined without
NFSSERVER. (-Wall unused variable lossage)
 1.26 25-Feb-1996  fvdl Oops. Do previous fix on the right line this time.. (thanks Charles)
 1.25 25-Feb-1996  fvdl Call soreserve() with the right size for receives (from pk).
 1.24 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.23 09-Feb-1996  christos nfs prototype changes
 1.22 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.21 13-Aug-1995  mycroft splnet --> splsoftnet
 1.20 02-Jun-1995  mycroft Fix more off by one errors.
 1.19 02-Jun-1995  mycroft Fix another off by one error.
 1.18 02-Jun-1995  mycroft Imported group list now starts at offset 0, not 1.
 1.17 17-Aug-1994  mycroft Convert some more lists and queues.
 1.16 17-Aug-1994  mycroft Change the reply list to a TAILQ.
 1.15 29-Jun-1994  cgd branches: 1.15.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.14 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.13 24-May-1994  cgd MIN -> min, MAX -> max
 1.12 05-May-1994  cgd lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.
 1.11 10-Apr-1994  cgd patchkit date deletions!
 1.10 22-Dec-1993  cgd minor cleanup
 1.9 18-Dec-1993  mycroft Canonicalize all #includes.
 1.8 07-Sep-1993  ws branches: 1.8.2;
Changes to VFS readdir semantics
NFS changes for better cookie support
ISOFS changes for better Rockridge support and support for generation numbers
 1.7 06-Sep-1993  mycroft Make nfs_timer() return void.
 1.6 03-Sep-1993  jtc Include systm.h to get prototypes (and possibly inlines) of *max functions.
 1.5 22-May-1993  cgd add include of select.h if necessary for protos, or delete if extraneous
 1.4 18-May-1993  cgd make kernel select interface be one-stop shopping & clean it all up.
 1.3 10-Apr-1993  glass migrated code to make split possible
 1.2 21-Mar-1993  cgd after 0.2.2 "stable" patches applied
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.8.2.2 14-Nov-1993  mycroft Canonicalize all #includes.
 1.8.2.1 24-Sep-1993  mycroft Make all files using spl*() #include cpu.h. Changes from trunk.
nfs_vfsops.c, nfsmount.h: Make nfs_quotactl() take an int rather than a uid_t,
as it might be -1.
nfs_vnops.c: va_size and va_bytes are now quads.
 1.15.2.1 19-Aug-1994  mycroft update from trunk
 1.27.4.4 04-Mar-1997  mycroft Pull up bug fixes from -current, per fvdl.
 1.27.4.3 11-Dec-1996  mycroft From trunk:
Eliminate SS_PRIV; instead, pass down a proc pointer to the usrreq methods
that need it.
Fix numerous memory leaks and bogus return values.
 1.27.4.2 10-Jul-1996  jtc Patch from frank needed to compile without corresponding network changes
 1.27.4.1 08-Jul-1996  jtc Pulled up from rev 1.29 by request from Frank van der Linden
 1.31.2.1 14-Jan-1997  thorpej Snapshot of work-in-progress, committed to private branch.

These changes implement machine-independent root device and file system
selection. Notable features:

- All ports behave in a consistent manner regarding root
device selection.
- No more "options GENERIC"; all kernels have the ability
to boot with RB_ASKNAME to select root device and file system
type.
- Root file system type can be wildcarded; a machine-independent
function will try all possible file systems for the selected
root device until one succeeds.
- If the root file system fails to mount, the operator will
be given the chance to select a new root device and file
system type, rather than having the machine simply panic.
- nfs_mountroot() no longer panics if any part of the NFS
mount process fails; it now returns an error, giving the
operator a chance to recover.
- New, more consistent, config(8) grammar. The constructs:

config netbsd swap generic
config netbsd root on nfs

have been replaced with:

config netbsd root on ? type ?
config netbsd root on ? type nfs

Additionally, the operator may select or wildcard root file
system type in the kernel configuration file:

config netbsd root on cd0a type cd9660

config(8) now requires that a "root" specification be
made. "root" may be wired down or wildcarded. "swap" and
"dump" specifications are optional, and follow previous
semantics.

- config(8) has a new "file-system" keyword, used to configure
file systems into the kernel. Eventually, this will be used
to generate the default vfssw[].

- "options NFSCLIENT" is obsolete, and is replaced by
"file-system NFS". "options NFSSERVER" still exists, since
NFS server support is independent of the NFS file system
client.

- sys/arch/<foo>/<foo>/swapgeneric.c is no longer used, and
will be removed; all information is now generated by config(8).

As of this commit, all ports except arm32 have been updated to use
the new setroot(). Only SPARC, i386, and Alpha ports have been
tested at this time. Port masters should test these changes on their
ports, and report any problems back to me.

More changes are on their way, including RB_ASKNAME support in
nfs_mountroot() (to prompt for server address and path) and, potentially,
the ability to select rarp/bootparam or bootp in nfs_mountroot().
 1.33.2.1 12-Mar-1997  is Merge in changes from Trunk
 1.38.6.1 08-Sep-1997  thorpej Significantly restructure the way signal state for a process is stored.
Rather than using bitmasks to redundantly store the information kept
in the process's sigacts (because the sigacts was kept in the u-area),
hang sigacts directly off the process, and access it directly.

Simplify signal setup code tremendously by storing information in
the sigacts as an array of struct sigactions, rather than in a different
format, since userspace uses sigactions.

Make sigacts sharable by adding reference counting.
 1.38.4.1 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.39.2.2 07-Feb-1998  mellon Pull up 1.41 (fvdl)
 1.39.2.1 17-Nov-1997  thorpej Sync w/ trunk (fvdl)
 1.50.4.1 02-Aug-1999  thorpej Update from trunk.
 1.50.2.2 09-Nov-1999  he Pull up revisions 1.52-1.53 (requested by fvdl):
Fix overzealous DIAGNOSTIC check in nfs_disconnect() and
don't try to copy a possibly freed mbuf. Fixes PR#8249, PR#8288
and PR#8766.
 1.50.2.1 05-Nov-1999  cgd pull up rev 1.51 from trunk (requested by fvdl):
Avoid a panic when forcibly unmounting a hung NFS mount, e.g. at
reboot.
 1.53.2.4 12-Mar-2001  bouyer Sync with HEAD.
 1.53.2.3 05-Jan-2001  bouyer Sync with HEAD
 1.53.2.2 13-Dec-2000  bouyer Sync with HEAD (for UBC fixes).
 1.53.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.56.2.1 22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.57.2.3 04-Sep-2002  itojun pullup sys/nfs/nfs_socket.c 1.75 (fvdl)

Initialize recm to NULL inside the loop, so that a record length of
NULL will not accidentally append bogus data (the previous record).

Derived from a fix by Matt Dillon in FreeBSD.
 1.57.2.2 15-Dec-2000  he Pull up revision 1.63 (requested by fvdl):
Fix NFS+tcp client hangs on server or network outage. Again,
please note that this introduces yet another kernel interface
change: sobind() gains an argument.
 1.57.2.1 14-Dec-2000  he Pull up revision 1.60 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.66.2.12 11-Dec-2002  thorpej Sync with HEAD.
 1.66.2.11 18-Oct-2002  nathanw Catch up to -current.
 1.66.2.10 13-Aug-2002  nathanw Catch up to -current.
 1.66.2.9 12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.66.2.8 24-Jun-2002  nathanw Curproc->curlwp renaming.

Change uses of "curproc->l_proc" back to "curproc", which is more like the
original use. Bare uses of "curproc" are now "curlwp".

"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc : NULL)
so that it is always safe to reference curproc (*de*referencing curproc
is another story, but that's always been true).
 1.66.2.7 20-Jun-2002  nathanw Catch up to -current.
 1.66.2.6 01-Apr-2002  nathanw Catch up to -current.
(CVS: It's not just a program. It's an adventure!)
 1.66.2.5 28-Feb-2002  nathanw Catch up to -current.
 1.66.2.4 14-Nov-2001  nathanw Catch up to -current.
 1.66.2.3 22-Oct-2001  nathanw Catch up to -current.
 1.66.2.2 21-Jun-2001  nathanw Catch up to -current.
 1.66.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.67.2.6 10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.67.2.5 06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.67.2.4 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.67.2.3 16-Mar-2002  jdolecek Catch up with -current.
 1.67.2.2 11-Feb-2002  jdolecek Sync w/ -current.
 1.67.2.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.68.2.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.74.4.1 02-Aug-2002  lukem Pull up revision 1.75 (requested by fvdl in ticket #604):
Initialize recm to NULL inside the loop, so that a record length of
NULL will not accidentally append bogus data (the previous record).
Derived from a fix by Matt Dillon in FreeBSD.
 1.74.2.1 29-Aug-2002  gehenna catch up with -current.
 1.92.2.12 11-Dec-2005  christos Sync with head.
 1.92.2.11 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.92.2.10 01-Apr-2005  skrll Sync with HEAD.
 1.92.2.9 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.92.2.8 07-Feb-2005  skrll Don't deref a NULL struct lwp *.
 1.92.2.7 21-Sep-2004  skrll Fix the sync with head I botched.
 1.92.2.6 18-Sep-2004  skrll Sync with HEAD.
 1.92.2.5 03-Sep-2004  skrll Sync with HEAD
 1.92.2.4 25-Aug-2004  skrll Sync with HEAD.
 1.92.2.3 18-Aug-2004  skrll Revert to passing struct proc for {exit,exec}hook.
 1.92.2.2 03-Aug-2004  skrll Sync with HEAD
 1.92.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.102.2.4 11-Jan-2005  jmc Pullup patch (requested by yamy in ticket #1078)

Don't do kludge for a reply to a retransmitted request
unless we actually retransmitted the request.
 1.102.2.3 30-Aug-2004  tron branches: 1.102.2.3.2;
Pull up revision 1.110 via patch (requested by yamt in ticket #803):
nfs_request: a workaround for servers doing "maproot".
for i/o requests which are expected not to fail due to permission
to mimic unix file open semantics (READ, WRITE, COMMIT),
try two credentials. namely, the file owner's one and open time one.
remember which credential worked in per-file basis and try it first
next time to minimize number of retries.
ideas from Chuck Silvers. PR/23716 and PR/24987.
 1.102.2.2 14-Jul-2004  tron Pull up revision 1.108 (requested by jonathan in ticket #648):
Rename MBUFTRACE helper function m_claim() to m_claimm(),
for consistency with M_FREE() and m_freem(). Affected files:
sys/mbuf.h
kern/uipc_socket2.c
kern/uipc_mbuf.c
net/if_ethersubr.c
netatalk/ddp_input.c
nfs/nfs_socket.c
 1.102.2.1 10-Jul-2004  tron Pull up revision 1.106 via patch (requested by yamt in ticket #617):
- for tcp, use SO_RCVTIMEO to recover from server crash.
otherwise we can be stuck in soreceive forever.
the problem was pointed out by Minoura Makoto. PR/25662
- clear r_rexmit on reconnect and clear r_rtt and R_TIMING on retransmit
so that the above (and soft mounts) are happy.
 1.102.2.3.2.1 11-Jan-2005  jmc Pullup patch (requested by yamy in ticket #1078)

Don't do kludge for a reply to a retransmitted request
unless we actually retransmitted the request.
 1.111.6.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.111.4.1 29-Apr-2005  kent sync with -current
 1.112.2.2 15-Dec-2005  tron Pull up following revision(s) (requested by christos in ticket #1055):
sys/nfs/nfs_socket.c: revision 1.115 via patch
Convert from nfs error values to regular errno's. Although most values of
nfs errors are chosen to be the same as errno, some of them are not and
it is better for portability to do the conversion anyway. Also a server
can return a bad error number that can cause the server to crash, because
it can have the high bits that are used internally set. This was the case
with amd. Finally nfs_request() should return a valid errno, because we
can return a bogus value to userland. Thanks to rpaulo for debugging this.
 1.112.2.1 04-Apr-2005  tron Pull up revision 1.113 (requested by yamt in ticket #89):
nfsrv_rcv: don't do so_receive from socket upcall context.
while there's little benefit, it complicates locking and confuses
flow control.
 1.114.2.9 27-Feb-2008  yamt revert incomplete nfs client locking for now.
 1.114.2.8 15-Feb-2008  yamt - sprinkle some locks.
- disable MNT_UPDATE because it involves too much locking headache.
- don't overwrite other bits in v_vflags when setting VV_ROOT.
 1.114.2.7 21-Jan-2008  yamt sync with head
 1.114.2.6 07-Dec-2007  yamt sync with head
 1.114.2.5 27-Oct-2007  yamt sync with head.
 1.114.2.4 03-Sep-2007  yamt sync with head.
 1.114.2.3 26-Feb-2007  yamt sync with head.
 1.114.2.2 30-Dec-2006  yamt sync with head.
 1.114.2.1 21-Jun-2006  yamt sync with head.
 1.117.6.1 22-Nov-2005  yamt sync with head.
 1.120.2.2 15-Jan-2006  yamt sync with head.
 1.120.2.1 31-Dec-2005  yamt - adapt nfs.
- nfs_doio_read: #if 0 out "killproc if text is modified" part of
the code as it's broken. (a process reading the modified text is not
necessarily a process which is using the file as a text.)
 1.122.4.3 01-Jun-2006  kardel Sync with head.
 1.122.4.2 22-Apr-2006  simonb Sync with head.
 1.122.4.1 04-Feb-2006  simonb Adapt for timecounters: mostly use get*time() and use "time_second"
instead of "time.tv_sec".
 1.122.2.1 09-Sep-2006  rpaulo sync with head
 1.126.6.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.126.4.12 11-May-2006  elad sync with head
 1.126.4.11 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.126.4.10 19-Apr-2006  elad sync with head.
 1.126.4.9 14-Apr-2006  elad Plug some possible leaks of kauth_cred_t.
 1.126.4.8 13-Apr-2006  elad Add a missing kauth_cred_alloc() and remove some misleading code and
comment.
 1.126.4.7 31-Mar-2006  elad oops. fix function names... pointed out by yamt@, thanks!
 1.126.4.6 31-Mar-2006  elad fill up real/saved [ug]ids too.
 1.126.4.5 14-Mar-2006  elad Use kauth_cred_[sg]etgroups() where appropriate.
 1.126.4.4 10-Mar-2006  elad Cleanup more interface abuse.

Make nfsrv_setcred() take a kauth_cred_t * as outcred. The original code
just modified it directly; we can't do that, nor do we want to.

Get rid of another case of kauth_cred_zero() followed by kauth_cred_hold()
and use kauth_cred_clone() to make sure we don't leave out important
members.

Add another DIAGNOSTIC check for reference count of above one.

Again, this should be tested.
 1.126.4.3 10-Mar-2006  elad Okay, what I've done here is pretty bogus, and trying to use kauth_cred_t
as something it's not. Then again, it was part of a fast sweep so I have
good excuses. :)

DON'T kauth_cred_zero() and then kauth_cred_hold(); that's guaranteed to
trip over trying to lock an uninitialized lock. Also, kauth_cred_t now
contains more than just a struct ucred, so treat it properly.

Use a call to kauth_cred_copy() to ensure we have reference count of
one, and sprinkle some DIAGNOSTIC checks to let us know if we ever leak
memory.

Also, don't forget to set proper values for the real/saved user- and
group-ids.

This should be tested at some point...
 1.126.4.2 10-Mar-2006  elad Remove some #if 0'd code.

There's no need to call kauth_cred_setngroups() here because right above
we use kauth_cred_addgroup() that does the management of ngroups for us.

Also, for the same reason, there's no need to call nfsrvw_sort(), because
we are guaranteed to have the group list sorted at all times.
 1.126.4.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.126.2.4 03-Sep-2006  yamt sync with head.
 1.126.2.3 11-Aug-2006  yamt sync with head
 1.126.2.2 26-Jun-2006  yamt sync with head.
 1.126.2.1 24-May-2006  yamt sync with head.
 1.132.2.2 19-Jun-2006  chap Sync with head.
 1.132.2.1 19-May-2006  chap file nfs_socket.c was added on branch chap-midi on 2006-06-19 04:10:37 +0000
 1.135.2.1 13-Jul-2006  gdamore Merge from HEAD.
 1.138.4.2 10-Dec-2006  yamt sync with head.
 1.138.4.1 22-Oct-2006  yamt sync with head
 1.138.2.3 12-Jan-2007  ad Sync with head.
 1.138.2.2 18-Nov-2006  ad Sync with head.
 1.138.2.1 21-Oct-2006  ad Update for sigpending1() change.
 1.145.2.4 07-May-2007  yamt sync with head.
 1.145.2.3 24-Mar-2007  yamt sync with head.
 1.145.2.2 12-Mar-2007  rmind Sync with HEAD.
 1.145.2.1 28-Feb-2007  yamt sync with head. (somehow missed in the previous)
 1.148.2.10 01-Nov-2007  yamt fix a compilation problem w/o NFSSERVER.
reported by Juan RP via Andrew Doran.
 1.148.2.9 27-Aug-2007  yamt - fix/add assertions.
- fix numnfsrvcache.
 1.148.2.8 26-Aug-2007  yamt - mark nfssvc(2) MPSAFE and move most of nfsd out of the kernel lock.
- remove unused ns_solock.
- remove some of KERNEL_LOCK/UNLOCK which are not necessary on this branch.
 1.148.2.7 20-Aug-2007  ad Sync with HEAD.
 1.148.2.6 15-Jul-2007  ad Sync with head.
 1.148.2.5 01-Jul-2007  ad Adapt to callout API change.
 1.148.2.4 09-Jun-2007  ad Sync with head.
 1.148.2.3 08-Jun-2007  ad Sync with head.
 1.148.2.2 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.148.2.1 13-Mar-2007  ad Sync with head.
 1.149.2.1 11-Jul-2007  mjf Sync with head.
 1.160.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.162.2.3 09-Dec-2007  jmcneill Sync with HEAD.
 1.162.2.2 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.162.2.1 09-Aug-2007  jmcneill Sync with HEAD.
 1.163.8.2 05-Aug-2007  yamt use kpause rather than lbolt.
 1.163.8.1 05-Aug-2007  yamt file nfs_socket.c was added on branch matt-mips64 on 2007-08-05 09:40:40 +0000
 1.163.6.1 25-Oct-2007  bouyer Sync with HEAD.
 1.163.2.2 09-Jan-2008  matt sync with HEAD
 1.163.2.1 06-Nov-2007  matt sync with HEAD
 1.164.4.4 29-Dec-2007  yamt to prepare merge, put nfsd back under kernel_lock for now.
 1.164.4.3 08-Dec-2007  ad Sync with head.
 1.164.4.2 04-Dec-2007  yamt apply the following change, which seems to have gotten lost during
vmlocking -> vmlocking2 transition.

Module Name: src
Committed By: yamt
Date: Sun Oct 21 08:23:20 UTC 2007

Modified Files:
src/sys/nfs: nfs_socket.c nfs_var.h

Log Message:
remove lwp argument from nfs_reconnect and always use &lwp0
because who triggers a reconnect doesn't really matter here. PR/37145.


To generate a diff of this commit:
cvs rdiff -r1.163 -r1.164 src/sys/nfs/nfs_socket.c
cvs rdiff -r1.72 -r1.73 src/sys/nfs/nfs_var.h

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
 1.164.4.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.164.2.2 18-Feb-2008  mjf Sync with HEAD.
 1.164.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.165.4.1 02-Jan-2008  bouyer Sync with HEAD
 1.167.6.5 17-Jan-2009  mjf Sync with HEAD.
 1.167.6.4 05-Oct-2008  mjf Sync with HEAD.
 1.167.6.3 28-Sep-2008  mjf Sync with HEAD.
 1.167.6.2 02-Jun-2008  mjf Sync with HEAD.
 1.167.6.1 03-Apr-2008  mjf Sync with HEAD.
 1.169.2.1 18-May-2008  yamt sync with head.
 1.170.8.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.170.8.1 19-Oct-2008  haad Sync with HEAD.
 1.170.4.2 10-Oct-2008  skrll Sync with HEAD.
 1.170.4.1 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.170.2.7 11-Mar-2010  yamt sync with head
 1.170.2.6 19-Jul-2009  yamt debug printf and comments. no functional changes.
 1.170.2.5 16-Jul-2009  yamt remove sndlock. it's superseded by nm_solock.
suggested by Andrew Doran.
 1.170.2.4 04-May-2009  yamt fix a merge botch.
 1.170.2.3 04-May-2009  yamt fix merge botches.
 1.170.2.2 04-May-2009  yamt sync with head.
 1.170.2.1 27-Apr-2008  yamt commit some work-in-progress changes to make nfs client mp-safe to a branch,
so that they won't get lost.
- sprinkle some locking
- mark the filesystem, nfstimer callout, and kq kthread mp-safe
- add assertions and comments
- disable upgrade mount for now
- some unrelated cosmetic changes
 1.173.4.9 24-Apr-2011  riz Pull up following revision(s) (requested by tls in ticket #1600):
sys/nfs/nfs_socket.c: revision 1.189
As suggested by matt@: change socket buffer reservations for NFS send/receive
to 3 times max RPC size rather than 2 times. Avoids nasty TCP stalls observed
at Panix. Will require increase to sbmax via sysctl for those running really
huge NFS rsize/wsize (>64K).
 1.173.4.8 29-Mar-2011  riz Pull up following revision(s) (requested by tls in ticket #1583):
sys/nfs/nfs_socket.c: revision 1.186
nfs_msg: #if 0 out tprintf for now and comment why.
 1.173.4.7 29-Mar-2011  riz Pull up following revision(s) (requested by tls in ticket #1582):
sys/nfs/nfs_socket.c: revision 1.185
nfs_request: fix races which break congestion window and make nfs client stuck.
 1.173.4.6 29-Mar-2011  riz Pull up following revision(s) (requested by tls in ticket #1581):
sys/nfs/nfs_socket.c: revision 1.181
If send fails with EMSGSIZE for whatever reason, it's unlikely to
succeed no matter how hard we retry. So just fail the request.
 1.173.4.5 29-Mar-2011  riz Pull up following revision(s) (requested by tls in ticket #1580):
sys/nfs/nfs_socket.c: revision 1.175
avoid noisy nfs_timer/nfs_reply DEBUG output that occurs when the
NFS server goes away. use ratelimit(9) and only print the console
error once every 10 seconds. PR#31562.
 1.173.4.4 13-Nov-2009  sborrill Pull up the following revision(s) (requested by bouyer in ticket #1128):
sys/nfs/nfs_socket.c: revision 1.182

Handle EWOULDBLOCK the same way as EPIPE. It seems the TCP socket layer
can return EWOULDBLOCK on some occasions when the connection is broken.
 1.173.4.3 06-Feb-2009  snj branches: 1.173.4.3.4;
Pull up following revision(s) (requested by ad in ticket #412):
sys/nfs/nfs_socket.c: revision 1.178
PR kern/40491 5.0: nfs timer can crash/break on smp
Hack around it by acquiring softnet_lock around the client-side timer loop.
 1.173.4.2 02-Feb-2009  snj Pull up following revision(s) (requested by yamt in ticket #393):
sys/kern/uipc_socket.c: revision 1.185
sys/kern/uipc_socket2.c: revision 1.101
sys/kern/uipc_syscalls.c: revision 1.135
sys/miscfs/portal/portal_vnops.c: revision 1.81
sys/netsmb/smb_trantcp.c: revision 1.40
sys/nfs/nfs_socket.c: revision 1.177
sys/sys/socketvar.h: revision 1.118
restore the pre socket locking patch signal behaviour.
this fixes a busy-loop in nfs_connect.
 1.173.4.1 02-Feb-2009  snj Pull up following revision(s) (requested by mrg in ticket #390):
sys/nfs/nfs_socket.c: revision 1.176
Actually enforce the maximum timeout (60s by default) rather
than backing off to 256*SRTT. This is why it sometimes could take
hours for a NFS mount to come back when the server returned.
contributed anonymously.
 1.173.4.3.4.1 20-Apr-2010  matt Pull in some NFS fixes from netbsd-5.
 1.173.2.3 28-Apr-2009  skrll Sync with HEAD.
 1.173.2.2 03-Mar-2009  skrll Sync with HEAD.
 1.173.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.178.2.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.185.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.187.2.2 21-Apr-2011  rmind sync with head
 1.187.2.1 05-Mar-2011  rmind sync with head
 1.188.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.189.18.2 18-May-2014  rmind sync with head
 1.189.18.1 28-Aug-2013  rmind Checkpoint work in progress:
- Initial split of the protocol user-request method into the following
methods: pr_attach, pr_detach and pr_generic for the old pr_usrreq.
- Adjust socreate(9) and sonewconn(9) to call pr_attach without the
socket lock held (as a preparation for the locking scheme adjustment).
- Adjust all pr_attach routines to assert that PCB is not set.
- Sprinkle various comments, document some routines and their locking.
- Remove M_PCB, replace with kmem(9).
- Fix a few bugs spotted on the way.
 1.189.14.2 03-Dec-2017  jdolecek update from HEAD
 1.189.14.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.189.4.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.190.2.1 10-Aug-2014  tls Rebase.
 1.192.4.1 10-Jul-2016  martin Pull up following revision(s) (requested by christos in ticket #1184):
sys/nfs/nfs_socket.c: revision 1.198
sys/nfs/nfs_clntsocket.c: revision 1.5
Serialize all access to the NFS request queue via splsoftnet(). Fixes random
crashes.
 1.192.2.2 10-Jul-2016  martin Pull up following revision(s) (requested by christos in ticket #1184):
sys/nfs/nfs_socket.c: revision 1.198
sys/nfs/nfs_clntsocket.c: revision 1.5
Serialize all access to the NFS request queue via splsoftnet(). Fixes random
crashes.
XXX: Pullup-7
 1.192.2.1 04-Nov-2015  riz Pull up following revision(s) (requested by manu in ticket #882):
sbin/umount/umount.c: revision 1.48
sys/nfs/nfsmount.h: revision 1.53
sys/nfs/nfs_var.h: revision 1.94
sys/nfs/nfs_iod.c: revision 1.7
sys/nfs/nfs_socket.c: revision 1.197
sys/nfs/nfs_bio.c: revision 1.191
sys/nfs/nfs_vfsops.c: revision 1.230
sys/nfs/nfs_clntsocket.c: revision 1.3
Remove useless and harmful sync(2) call in umount(8)
Remove sync(2) call before unmount(2) in umount(8). This sync(2) is useless
since unmount(2) will perform a VFS_SYNC anyway.
But moreover, this sync(2) may be harmful, as there are some situations where
it cannot return (unreachable NFS server, for instance), causing umount -f
to be ineffective.
Fix soft NFS force unmount
For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that await an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.
Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.
Reviewed by Chuck Silvers.
 1.193.2.4 09-Jul-2016  skrll Sync with HEAD
 1.193.2.3 22-Sep-2015  skrll Sync with HEAD
 1.193.2.2 06-Jun-2015  skrll Sync with HEAD
 1.193.2.1 06-Apr-2015  skrll Sync with HEAD
 1.198.10.1 08-Jun-2018  martin Pull up following revision(s) (requested by maya in ticket #856):

sys/nfs/nfs.h: revision 1.76
sys/nfs/nfs_subs.c: revision 1.230
sys/nfs/nfs_socket.c: revision 1.199
sys/nfs/nfs_clntsocket.c: revision 1.6

PR/40491: From Tobias Ulmer in tech-kern@:
1. Protect the nfs request queue with its own mutex
2. make the nfs_receive queue check for signals so that intr mounts
can be interrupted.

XXX: pullup-8
 1.199.4.1 10-Jun-2019  christos Sync with HEAD
 1.199.2.1 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.202.2.1 02-Aug-2025  perseant Sync with HEAD
 1.45 15-Mar-2009  cegger ansify function definitions
 1.44 14-Mar-2009  dsl Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
 1.43 19-Nov-2008  ad branches: 1.43.4;
Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.42 05-May-2008  ad branches: 1.42.6; 1.42.8;
- Convert hashinit() to use kmem_alloc(). The hash tables can be large
and it's better to not have them in kmem_map.
- Convert a couple of minor items along the way to kmem_alloc().
- Fix some memory leaks.
 1.41 04-Dec-2007  yamt branches: 1.41.12; 1.41.14; 1.41.16;
merge non-intrusive nfs changes from vmlocking.
 1.40 01-Jun-2007  yamt branches: 1.40.6; 1.40.8; 1.40.14; 1.40.16;
use mutex and condvar.
 1.39 12-Mar-2007  ad branches: 1.39.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.
 1.38 04-Mar-2007  christos branches: 1.38.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.37 22-Feb-2007  thorpej TRUE -> true, FALSE -> false
 1.36 05-Feb-2007  yamt branches: 1.36.2;
have a mowner dedicated to nfs reply cache.
 1.35 05-Feb-2007  yamt nfsrv_updatecache: actually use reply caches for connection oriented protocols.
 1.34 17-Jan-2007  yamt plug mbuf leaks.
 1.33 27-Dec-2006  yamt remove nqnfs.
 1.32 11-Dec-2005  christos branches: 1.32.20; 1.32.24; 1.32.26;
merge ktrace-lwp.
 1.31 21-May-2004  yamt branches: 1.31.12;
enable reply cache for connection oriented protocols as well.
linux retransmits rpcs even when using tcp.
 1.30 20-Nov-2003  yamt branches: 1.30.2;
comments.
 1.29 20-Nov-2003  yamt fix a race case of nfsrv_getcache.
 1.28 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.27 21-May-2003  yamt branches: 1.27.2;
simplelock for nfsd request cache.
 1.26 21-May-2003  yamt - KNF.
- remove unneeded casts.
 1.25 21-May-2003  yamt poolify nfsd request cache.
 1.24 21-May-2003  yamt indent.
 1.23 21-May-2003  yamt remove local definitions of TRUE and FALSE.
 1.22 02-Apr-2003  yamt use queue manipulation macros.
 1.21 26-Feb-2003  matt Add MBUFTRACE kernel option.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
 1.20 01-Dec-2002  matt Make sure these all agree on the same definitions of various variables.
 1.19 10-Nov-2001  lukem branches: 1.19.10;
add RCSIDs
 1.18 21-Feb-2001  jdolecek branches: 1.18.2; 1.18.4; 1.18.8;
make some more constant arrays 'const'
 1.17 08-Nov-2000  ad Update for hashinit() change.
 1.16 30-Mar-2000  augustss Remove register declarations.
 1.15 09-Aug-1998  perry branches: 1.15.12;
bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.14 05-Jul-1998  jonathan defopt ISO TPIP.
 1.13 07-Feb-1998  chs add flags arg to hashinit(), to pass to malloc().
 1.12 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.11 09-Feb-1996  christos nfs prototype changes
 1.10 13-Dec-1994  mycroft Sync with CSRG.
 1.9 17-Aug-1994  mycroft Use LIST and TAILQ for hash chain and LRU chain, respectively.
 1.8 29-Jun-1994  cgd branches: 1.8.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.7 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.6 18-Dec-1993  mycroft Canonicalize all #includes.
 1.5 07-Sep-1993  ws branches: 1.5.2;
Changes to VFS readdir semantics
NFS changes for better cookie support
ISOFS changes for better Rockridge support and support for generation numbers
 1.4 22-May-1993  cgd add include of select.h if necessary for protos, or delete if extraneous
 1.3 21-May-1993  cgd add rcsid again; fix RCS+crash fuckup
 1.2 10-Apr-1993  glass migrated code to make split possible
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.3 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.5.2.1 14-Nov-1993  mycroft Canonicalize all #includes.
 1.8.2.1 19-Aug-1994  mycroft update from trunk
 1.15.12.3 12-Mar-2001  bouyer Sync with HEAD.
 1.15.12.2 22-Nov-2000  bouyer Sync with HEAD.
 1.15.12.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.18.8.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.18.4.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.18.2.2 11-Dec-2002  thorpej Sync with HEAD.
 1.18.2.1 14-Nov-2001  nathanw Catch up to -current.
 1.19.10.1 05-Jun-2004  jdc Pull up revision 1.31 (via patch) (requested by yamt in ticket #1705).

enable reply cache for connection oriented protocols as well.
linux retransmits rpcs even when using tcp.
 1.27.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.27.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.27.2.1 03-Aug-2004  skrll Sync with HEAD
 1.30.2.1 05-Jun-2004  jdc Pull up revision 1.31 (requested by yamt in ticket #444).

enable reply cache for connection oriented protocols as well.
linux retransmits rpcs even when using tcp.
 1.31.12.4 07-Dec-2007  yamt sync with head
 1.31.12.3 03-Sep-2007  yamt sync with head.
 1.31.12.2 26-Feb-2007  yamt sync with head.
 1.31.12.1 30-Dec-2006  yamt sync with head.
 1.32.26.1 03-Sep-2007  wrstuden Sync w/ NetBSD-4-RC_1
 1.32.24.2 05-Jun-2007  bouyer Pull up following revision(s) (requested by yamt in ticket #707):
sys/nfs/nfs_srvcache.c: revision 1.35
nfsrv_updatecache: actually use reply caches for connection oriented protocols.
 1.32.24.1 05-Jun-2007  bouyer Pull up following revision(s) (requested by yamt in ticket #704):
sys/nfs/nfs_srvcache.c: revision 1.34
plug mbuf leaks.
 1.32.20.3 09-Feb-2007  ad Sync with HEAD.
 1.32.20.2 01-Feb-2007  ad Sync with head.
 1.32.20.1 12-Jan-2007  ad Sync with head.
 1.36.2.3 24-Mar-2007  yamt sync with head.
 1.36.2.2 12-Mar-2007  rmind Sync with HEAD.
 1.36.2.1 28-Feb-2007  yamt sync with head. (somehow missed in the previous)
 1.38.2.5 27-Aug-2007  yamt - fix/add assertions.
- fix numnfsrvcache.
 1.38.2.4 26-Aug-2007  yamt - mark nfssvc(2) MPSAFE and move most of nfsd out of the kernel lock.
- remove unused ns_solock.
- remove some of KERNEL_LOCK/UNLOCK which are not necessary on this branch.
 1.38.2.3 09-Jun-2007  ad Sync with head.
 1.38.2.2 05-Apr-2007  ad Compile fixes.
 1.38.2.1 13-Mar-2007  ad Sync with head.
 1.39.2.1 11-Jul-2007  mjf Sync with head.
 1.40.16.2 08-Dec-2007  ad Sync with head.
 1.40.16.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.40.14.1 08-Dec-2007  mjf Sync with HEAD.
 1.40.8.1 09-Jan-2008  matt sync with HEAD
 1.40.6.1 09-Dec-2007  jmcneill Sync with HEAD.
 1.41.16.2 04-May-2009  yamt sync with head.
 1.41.16.1 16-May-2008  yamt sync with head.
 1.41.14.1 18-May-2008  yamt sync with head.
 1.41.12.2 17-Jan-2009  mjf Sync with HEAD.
 1.41.12.1 02-Jun-2008  mjf Sync with HEAD.
 1.42.8.2 28-Apr-2009  skrll Sync with HEAD.
 1.42.8.1 19-Jan-2009  skrll Sync with HEAD.
 1.42.6.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.43.4.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.6 05-Jul-2024  rin sys: Drop redundant NULL check before m_freem(9)

m_freem(9) safely has accepted NULL argument at least since 4.2BSD:
https://www.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/src/sys/sys/uipc_mbuf.c

Compile-tested on amd64/ALL.

Suggested by knakahara@
 1.5 20-Dec-2022  hannken branches: 1.5.6;
When partitioning a mbuf chain with m_split() the last mbuf of the returned
tail chain is not necessarily the same as the last mbuf of the initial chain.

Always set "slp->ns_rawend" to the last mbuf of the tail chain to prevent
mbuf leaks and corruption.
 1.4 03-Sep-2009  tls branches: 1.4.68; 1.4.94;
Missed this file in previous commit, accidentally checked in fix to local
repository copy! Sorry about that, folks.
 1.3 14-Mar-2009  dsl branches: 1.3.2;
Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
 1.2 14-Mar-2009  dsl Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.1 19-Nov-2008  ad branches: 1.1.4; 1.1.6; 1.1.8; 1.1.10;
Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.1.10.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.1.8.3 28-Apr-2009  skrll Sync with HEAD.
 1.1.8.2 19-Jan-2009  skrll Sync with HEAD.
 1.1.8.1 19-Nov-2008  skrll file nfs_srvsocket.c was added on branch nick-hppapmap on 2009-01-19 13:20:20 +0000
 1.1.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.1.6.1 19-Nov-2008  mjf file nfs_srvsocket.c was added on branch mjf-devfs2 on 2009-01-17 13:29:34 +0000
 1.1.4.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.1.4.1 19-Nov-2008  haad file nfs_srvsocket.c was added on branch haad-dm on 2008-12-13 01:15:28 +0000
 1.3.2.4 16-Sep-2009  yamt sync with head
 1.3.2.3 04-May-2009  yamt fix merge botches.
 1.3.2.2 04-May-2009  yamt sync with head.
 1.3.2.1 14-Mar-2009  yamt file nfs_srvsocket.c was added on branch yamt-nfs-mp on 2009-05-04 08:14:22 +0000
 1.4.94.1 20-Dec-2022  martin Pull up following revision(s) (requested by hannken in ticket #12):

sys/nfs/nfs_srvsocket.c: revision 1.5

When partitioning a mbuf chain with m_split() the last mbuf of the returned
tail chain is not necessarily the same as the last mbuf of the initial chain.

Always set "slp->ns_rawend" to the last mbuf of the tail chain to prevent
mbuf leaks and corruption.
 1.4.68.1 20-Dec-2022  martin Pull up following revision(s) (requested by hannken in ticket #1555):

sys/nfs/nfs_srvsocket.c: revision 1.5

When partitioning a mbuf chain with m_split() the last mbuf of the returned
tail chain is not necessarily the same as the last mbuf of the initial chain.

Always set "slp->ns_rawend" to the last mbuf of the tail chain to prevent
mbuf leaks and corruption.
 1.5.6.1 02-Aug-2025  perseant Sync with HEAD
 1.17 23-Mar-2023  riastradh nfs: Avoid integer overflow in nfs_namei bounds check.

XXX pullup-8
XXX pullup-9
XXX pullup-10
 1.16 27-Apr-2022  hannken branches: 1.16.4;
As VOP_GETATTR() needs a shared lock at least move the preopattr lookup
inside nfs_namei() where we may lock the start directory without violating
the lock order.
 1.15 17-Jan-2020  ad VFS_VGET(), VFS_ROOT(), VFS_FHTOVP(): give them a "int lktype" argument, to
allow us to get shared locks (or no lock) on the returned vnode. Matches
FreeBSD.
 1.14 05-Nov-2012  dholland branches: 1.14.30; 1.14.38; 1.14.42; 1.14.44;
Rename the new ni_startdir (the slot used to hold the starting point
for openat() and friends) to ni_atdir to avoid confusion with a
previously existing (and, alas, still documented) ni_startdir field
that meant something else entirely.
 1.13 13-Oct-2012  dholland Replace hack implementation of NDAT() for "nameiat" with a proper one.
(This change requires a kernel bump.)
 1.12 27-Sep-2011  christos branches: 1.12.2; 1.12.12;
use NFS_MAXPATHLEN instead of MAXPATHLEN
 1.11 08-Aug-2011  dholland nfs_namei() should not return a non-null path buffer except on success,
even though the callers are apparently prepared to cope.

Fixes last tidyup part of PR 44625.
 1.10 11-Apr-2011  dholland Clean up. Move some more code across from nfsd's private entry points.
 1.9 19-Mar-2011  dholland Fix memory leak introduced with the struct pathbuf changes. Hi, me.
Closes PR 44625.
 1.8 30-Nov-2010  dholland branches: 1.8.2;
Abolish struct componentname's cn_pnbuf. Use the path buffer in the
pathbuf object passed to namei as work space instead. (For now a pnbuf
pointer appears in struct nameidata, to support certain unclean things
that haven't been fixed yet, but it will be going away in the future.)

This removes the need for the SAVENAME and HASBUF namei flags.
 1.7 19-Nov-2010  dholland Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.
 1.6 24-Jun-2010  hannken Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
 1.5 27-Sep-2009  dholland branches: 1.5.2; 1.5.4;
Move a big wodge of symlink-following code from nfsd to inside
lookup_for_nfsd(). This code is, or at least should be, the same as
the regular symlink-following code plus an extra flag nfsd needs.

The two lots of code can/will be merged in the future.
 1.4 27-Sep-2009  dholland Rename lookup() to lookup_for_nfsd(), to make it clear just whose
private backdoor entry point this is.

Also, clone the lookup_for_nfsd() entry point as
lookup_for_nfsd_index(), for use by a different call site in nfsd that
does different unclean things with nameidata.
 1.3 04-May-2009  yamt branches: 1.3.2;
when freeing cn_pnbuf, make it NULL if DIAGNOSTIC.
 1.2 14-Mar-2009  dsl ANSIfy another 1261 function definitions.
The only ones left in sys are beyond my sed script!
(or in sys/dist or sys/external)
Mostly they have function pointer parameters.
 1.1 19-Nov-2008  ad branches: 1.1.4; 1.1.6; 1.1.8; 1.1.10;
Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.1.10.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.1.8.3 28-Apr-2009  skrll Sync with HEAD.
 1.1.8.2 19-Jan-2009  skrll Sync with HEAD.
 1.1.8.1 19-Nov-2008  skrll file nfs_srvsubs.c was added on branch nick-hppapmap on 2009-01-19 13:20:20 +0000
 1.1.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.1.6.1 19-Nov-2008  mjf file nfs_srvsubs.c was added on branch mjf-devfs2 on 2009-01-17 13:29:34 +0000
 1.1.4.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.1.4.1 19-Nov-2008  haad file nfs_srvsubs.c was added on branch haad-dm on 2008-12-13 01:15:28 +0000
 1.3.2.4 11-Aug-2010  yamt sync with head.
 1.3.2.3 11-Mar-2010  yamt sync with head
 1.3.2.2 04-May-2009  yamt sync with head.
 1.3.2.1 04-May-2009  yamt file nfs_srvsubs.c was added on branch yamt-nfs-mp on 2009-05-04 08:14:22 +0000
 1.5.4.3 21-Apr-2011  rmind sync with head
 1.5.4.2 05-Mar-2011  rmind sync with head
 1.5.4.1 03-Jul-2010  rmind sync with head
 1.5.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.8.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.12.12.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.12.2.2 16-Jan-2013  yamt sync with (a bit old) head
 1.12.2.1 30-Oct-2012  yamt sync with head
 1.14.44.1 17-Jan-2020  ad Sync with head.
 1.14.42.1 30-Mar-2023  martin Pull up following revision(s) (requested by riastradh in ticket #1617):

sys/nfs/nfs_serv.c: revision 1.184
sys/nfs/nfs_srvsubs.c: revision 1.17
sys/nfs/nfsm_subs.h: revision 1.56
sys/nfs/nfsm_subs.h: revision 1.57

nfs: Use unsigned fhlen so we don't trip over negative values.

nfs: Avoid integer overflow in nfs_namei bounds check.

nfs: Use unsigned name lengths so we don't trip over negative ones.
- nfsm_strsiz is only used with uint32_t in callers, but let's not
leave it as a rake to step on.
- nfsm_srvnamesiz is abused with signed s. The internal conversion
to unsigned serves to reject both negative and too-large values in
such callers.
XXX Should make all callers use unsigned, rather than flipping back
and forth between signed and unsigned for name lengths.

nfs: Avoid free of uninitialized on bad name size in create, mknod.
XXX These error branches are a nightmare and need to be more
systematically cleaned up. Even if they are correct now, they are
impossible to audit and extremely fragile in case anyone ever needs
to make other changes to them.
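
The two length-checking ideas in the pull-up above (converting to unsigned to reject negative lengths, and avoiding overflow in a bounds check) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from nfs_srvsubs.c or nfsm_subs.h: the names len_ok and name_fits and their parameters are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: validate a signed length against a maximum.
 * Converting to unsigned turns a negative s into a very large value,
 * so one comparison rejects both negative and too-large lengths.
 */
static bool
len_ok(int32_t s, uint32_t maxlen)
{
	return (uint32_t)s <= maxlen;
}

/*
 * Hypothetical helper: check that a name of len bytes fits in a
 * buffer of bufsize bytes of which used bytes are already consumed.
 */
static bool
name_fits(uint32_t len, size_t used, size_t bufsize)
{
	/* Reject inconsistent state instead of letting bufsize - used wrap. */
	if (used > bufsize)
		return false;
	/*
	 * Compare against the remaining space rather than computing
	 * used + len, which could wrap around for a huge len.
	 */
	return len <= bufsize - used;
}
```

Comparing against the remaining space keeps every intermediate value in range, which is why this form is usually preferred over adding the untrusted length to an offset first.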
 1.14.38.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.14.30.1 30-Mar-2023  martin Pull up following revision(s) (requested by riastradh in ticket #1810):

sys/nfs/nfs_serv.c: revision 1.184
sys/nfs/nfs_srvsubs.c: revision 1.17
sys/nfs/nfsm_subs.h: revision 1.56
sys/nfs/nfsm_subs.h: revision 1.57

nfs: Use unsigned fhlen so we don't trip over negative values.

nfs: Avoid integer overflow in nfs_namei bounds check.

nfs: Use unsigned name lengths so we don't trip over negative ones.
- nfsm_strsiz is only used with uint32_t in callers, but let's not
leave it as a rake to step on.
- nfsm_srvnamesiz is abused with signed s. The internal conversion
to unsigned serves to reject both negative and too-large values in
such callers.
XXX Should make all callers use unsigned, rather than flipping back
and forth between signed and unsigned for name lengths.

nfs: Avoid free of uninitialized on bad name size in create, mknod.
XXX These error branches are a nightmare and need to be more
systematically cleaned up. Even if they are correct now, they are
impossible to audit and extremely fragile in case anyone ever needs
to make other changes to them.
 1.16.4.1 30-Mar-2023  martin Pull up following revision(s) (requested by riastradh in ticket #134):

sys/nfs/nfs_serv.c: revision 1.184
sys/nfs/nfs_srvsubs.c: revision 1.17
sys/nfs/nfsm_subs.h: revision 1.56
sys/nfs/nfsm_subs.h: revision 1.57

nfs: Use unsigned fhlen so we don't trip over negative values.

nfs: Avoid integer overflow in nfs_namei bounds check.

nfs: Use unsigned name lengths so we don't trip over negative ones.
- nfsm_strsiz is only used with uint32_t in callers, but let's not
leave it as a rake to step on.
- nfsm_srvnamesiz is abused with signed s. The internal conversion
to unsigned serves to reject both negative and too-large values in
such callers.
XXX Should make all callers use unsigned, rather than flipping back
and forth between signed and unsigned for name lengths.

nfs: Avoid free of uninitialized on bad name size in create, mknod.
XXX These error branches are a nightmare and need to be more
systematically cleaned up. Even if they are correct now, they are
impossible to audit and extremely fragile in case anyone ever needs
to make other changes to them.
 1.242 09-Feb-2022  andvar s/ony/only/
 1.241 05-Sep-2020  riastradh Round of uvm.h cleanup.

The poorly named uvm.h is generally supposed to be for uvm-internal
users only.

- Narrow it to files that actually need it -- mostly files that need
to query whether curlwp is the pagedaemon, which should maybe be
exposed by an external header.

- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
by it. We should split up uvm_extern.h but this will serve for now
to reduce the uvm.h dependencies.

- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
UVMHIST(ubchist), since ubchist is declared in uvm.h but the
reference evaporates if UVMHIST is not defined, so we reduce header
file dependencies.

- Make uvm_device.h and uvm_swap.h independently includable while
here.

ok chs@
 1.240 25-May-2020  ad - Alter the convention for uvm_page_array slightly, so the basic search
parameters can't change part way through a search: move the "uobj" and
"flags" arguments over to uvm_page_array_init() and store those with the
array.

- With that, detect when it's not possible to find any more pages in the
tree with the given search parameters, and avoid repeated tree lookups if
the caller loops over uvm_page_array_fill_and_peek().
 1.239 04-Apr-2020  mlelstv NFSv2 is limited to 32-bit metadata fields. Prevent larger
metadata values from simply being truncated.

-> clamp filesystem block counts to signed 32bit.
-> clamp file sizes to signed 32bit (*)

Some NFSv2 clients also have problems handling buffer sizes larger
than (signed) 16bit.
-> clamp buffer sizes to signed 16bit for better compatibility.

(*) This can lead to erroneous behaviour for files larger than 2GB
that NFSv2 cannot handle but it is still better than before.
An alternative would be to (partially) reject operations on files
larger than 2GB, but that causes other problems.
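
The clamping described in this commit can be sketched as a saturating conversion. This is only an illustration: the helper names clamp_s32 and clamp_s16 are hypothetical, and the real change may structure the limits differently.

```c
#include <stdint.h>

/*
 * Hypothetical helper: saturate a 64-bit value at the largest value a
 * signed 32-bit field can hold, instead of letting the high bits be
 * silently dropped.
 */
static uint32_t
clamp_s32(uint64_t v)
{
	return v > INT32_MAX ? (uint32_t)INT32_MAX : (uint32_t)v;
}

/* Same idea for values limited to the signed 16-bit range. */
static uint32_t
clamp_s16(uint64_t v)
{
	return v > INT16_MAX ? (uint32_t)INT16_MAX : (uint32_t)v;
}
```

With plain truncation a 4 GB size would appear as 0; with saturation it appears as the largest representable value, which is wrong but far less misleading.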
 1.238 08-Mar-2020  mgorny Update NFS errno mapping and add assert for correctness

Add the mapping for errno values missing in nfsrv_v2errmap[]. While
at it, add a compile-time assert to make sure that the array does not
become out-of-date again.
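
A compile-time check of this kind can be sketched as below. The table contents, EX_ELAST, EX_DEFAULT and ex_map_errno are made-up stand-ins, not the real nfsrv_v2errmap[] values, and C11 _Static_assert is used here in place of whatever assertion macro the kernel source uses.

```c
#include <stddef.h>

#define EX_ELAST	4	/* hypothetical: highest errno value covered */
#define EX_DEFAULT	(-1)	/* hypothetical: result for unmapped errors */

/* Toy mapping table indexed by errno value; entries are illustrative. */
static const short ex_errmap[] = {
	0,	/* no error */
	10,	/* errno 1 */
	20,	/* errno 2 */
	30,	/* errno 3 */
	40,	/* errno 4 */
};

/* The build fails if an errno is added without extending the table. */
_Static_assert(sizeof(ex_errmap) / sizeof(ex_errmap[0]) == EX_ELAST + 1,
    "ex_errmap is out of date");

static short
ex_map_errno(int error)
{
	if (error < 0 || error > EX_ELAST)
		return EX_DEFAULT;
	return ex_errmap[error];
}
```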
 1.237 24-Feb-2020  ad v_interlock -> vmobjlock
 1.236 15-Dec-2019  ad branches: 1.236.2;
Merge from yamt-pagecache:

- do gang lookup of pages using radixtree.
- remove now unused uvm_object::uo_memq and vm_page::listq.queue.
 1.235 22-Dec-2018  maxv Replace M_ALIGN and MH_ALIGN by m_align.
 1.234 22-Dec-2018  maxv Replace: M_MOVE_PKTHDR -> m_move_pkthdr. No functional change, since the
former is a macro to the latter.
 1.233 03-Sep-2018  riastradh Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
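
The truncation hazard described in this commit can be shown with a small sketch. The names ex_uimin, EX_MIN and truncating_min are stand-ins written for this example, not the libkern definitions, and the example assumes a 32-bit unsigned int.

```c
#include <stdint.h>

/* Stand-in for a min function that is defined on unsigned int only. */
static unsigned int
ex_uimin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Macro form: evaluates arguments twice, but keeps their full width. */
#define EX_MIN(a, b)	((a) < (b) ? (a) : (b))

/*
 * Passing 64-bit values through the unsigned int function drops the
 * high bits. The casts make explicit what an implicit conversion at
 * the call site would do silently.
 */
static uint64_t
truncating_min(uint64_t a, uint64_t b)
{
	return ex_uimin((unsigned int)a, (unsigned int)b);
}
```

For a = 2^32 + 5 and b = 100, truncating_min yields 5 while EX_MIN yields 100, which is the kind of silent error the rename is meant to make visible.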
 1.232 08-May-2018  maxv branches: 1.232.2;
Use M_MOVE_PKTHDR.
 1.231 26-Apr-2018  maxv Hum. This should be M_READONLY, not M_ROMAP.

M_ROMAP tells us whether the mbuf storage is mapped on a read-only page.
But an mbuf can still be read-only in the sense that the storage is
shared with other mbufs.
 1.230 21-Jan-2018  christos branches: 1.230.2;
PR/40491: From Tobias Ulmer in tech-kern@:
1. Protect the nfs request queue with its own mutex
2. make the nfs_receive queue check for signals so that intr mounts
can be interrupted.
XXX: pullup-8
 1.229 01-Apr-2017  riastradh branches: 1.229.6;
KASSERT(mutex_owned(vp->v_interlock)) in vnode iterator selector.
 1.228 10-Jun-2016  ozaki-r branches: 1.228.2; 1.228.4;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.
 1.227 10-Aug-2014  tls branches: 1.227.4;
Merge tls-earlyentropy branch into HEAD.
 1.226 24-May-2014  christos Introduce a selector function to the vfs vnode iterator so that we don't
need to vget() vnodes that we are not interested in, and optimize locking
a bit. Iterator changes reviewed by Hannken (thanks), the rest of the bugs
are mine.
 1.225 17-Mar-2014  hannken branches: 1.225.2;
Change nfs_clearcommit() to use vfs_vnode_iterator.
 1.224 18-Sep-2013  pgoyette knf (blank line even if there are no local declarations)
 1.223 18-Sep-2013  christos Use reference counting to keep track of construction and destruction of the
structures used by both the nfs server and client code. Tested by pgoyette@

1. mount remote fs via nfs (my /home directory), which autoloads nfs module
2. manually modload nfsserver
3. wait a bit
4. manually modunload nfsserver
5. wait a couple minutes
6. verify that client access still works (/bin/ls ~paul home dir)
7. manually modload nfsserver again
8. start an nfsd process
9. wait a bit
10. kill nfsd process
11. wait
12. manually modunload nfsserver again
13. verify continued client access

XXX: Note that nfs_vfs_init() calls nfs_init(), but nfs_vfs_done() does not
call nfs_fini(). Also note that the destruction order is wrong in it,
but probably does not matter. "someone" (!= me) should fix it :-) and
run the above tests.
 1.222 19-Nov-2011  tls branches: 1.222.8; 1.222.12;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.
 1.221 12-Jun-2011  rmind branches: 1.221.2;
Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.220 06-Nov-2010  uebayasi branches: 1.220.6;
Include uvm/uvm.h to use UVM internal type (struct vm_page).
 1.219 02-Mar-2010  pooka branches: 1.219.2;
Get rid of dependency on fs_nfs.h, i.e. source modules with
conditional content depending on if the NFS client is wanted or
not. The server can now be made an independent module not depending
on the nfs client.

Tested with rump_nfs (standalone client), rump_nfsd (standalone
nfsd) and a qemu installation with both the client and the server.
 1.218 31-Dec-2009  christos branches: 1.218.2;
put nuidhash_max in a file that is shared between server and client code.
 1.217 14-May-2009  yamt nfs_clearcommit: fix a race with vnode cleaning.
 1.216 15-Mar-2009  cegger ansify function definitions
 1.215 14-Mar-2009  dsl ANSIfy another 1261 function definitions.
The only ones left in sys are beyond my sed script!
(or in sys/dist or sys/external)
Mostly they have function pointer parameters.
 1.214 14-Mar-2009  dsl Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
 1.213 14-Mar-2009  dsl Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.212 17-Dec-2008  cegger branches: 1.212.2;
kill MALLOC and FREE macros.
 1.211 28-Nov-2008  pooka g/c unused malloc types
 1.210 19-Nov-2008  ad Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.209 22-Oct-2008  matt branches: 1.209.2;
Don't need nfs_vfs_reinit anymore since we don't resize tables anymore.
Move reinit code to init case.
 1.208 22-Oct-2008  matt Change NFS to use a RB-tree for its FH->nfsnode lookups.
 1.207 09-Oct-2008  pooka Use atomic op to get next xid. Initialize value with arc4random()
at nfs init time instead system time based trickery intermingled
with the runtime code.

le bouef: kills last simple_lock from nfs
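
Allocating the next xid with an atomic operation can be sketched as follows. C11 <stdatomic.h> stands in for the kernel's own atomic primitives, the names ex_xid, ex_xid_init and ex_xid_next are hypothetical, and the seed is passed in by the caller rather than taken from arc4random().

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical global transaction id counter. */
static _Atomic uint32_t ex_xid;

/* Called once at init time with a random starting value. */
static void
ex_xid_init(uint32_t seed)
{
	atomic_store(&ex_xid, seed);
}

/*
 * Return a fresh xid. The increment and the read happen as one atomic
 * step, so concurrent callers never receive the same value and no
 * lock is needed around the counter.
 */
static uint32_t
ex_xid_next(void)
{
	return atomic_fetch_add(&ex_xid, 1) + 1;
}
```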
 1.206 30-Sep-2008  pooka Initialize nfsnode pools and malloc type dynamically in the
constructor instead of depending on link sets. Consequently, rename
nfs_nh{init,reinit,done} to nfs_node_{init,reinit,done}, respectively,
to better convey the function.
 1.205 15-Jul-2008  christos explicitly set birthtime to VNOVAL, since there is no such thing in nfsv{2,3}
 1.204 04-Jun-2008  ad branches: 1.204.2; 1.204.4;
vm_page: put TAILQ_ENTRY into a union with LIST_ENTRY, so we can use both.
 1.203 10-May-2008  rumble Convert file systems to dynamically attach with the new module interface.
Make VFS hooks dynamic while we're here and say farewell to VFS_ATTACH and
VFS_HOOKS_ATTACH linksets.

As a consequence, most of the file systems can now be loaded as new style
modules.

Quick sanity check by ad@.
 1.202 05-May-2008  ad branches: 1.202.2;
- Convert hashinit() to use kmem_alloc(). The hash tables can be large
and it's better to not have them in kmem_map.
- Convert a couple of minor items along the way to kmem_alloc().
- Fix some memory leaks.
 1.201 24-Mar-2008  yamt branches: 1.201.2; 1.201.4;
merge yamt-lazymbuf branch.
 1.200 05-Mar-2008  elad Nuke a KAUTH_GENERIC_ISSUSER, this time in favor of an euid == 0, as
the traditional NFS maproot functionality goes.

Put in a note explaining why and who, also mark for future greps.

Okay yamt@.
 1.199 13-Feb-2008  yamt branches: 1.199.2; 1.199.6;
reject files larger than nm_maxfilesize.
 1.198 28-Jan-2008  yamt nfs_check_wccdata: unifdef wcc kludge messages.
 1.197 24-Jan-2008  ad specfs changes for PR kern/37717 (raidclose() is no longer called on
shutdown). There are still problems with device access and a PR will be
filed.

- Kill checkalias(). Allow multiple vnodes to reference a single device.

- Don't play dangerous tricks with block vnodes to ensure that only one
vnode can describe a block device. Instead, prohibit concurrent opens of
block devices. As a bonus remove the unreliable code that prevents
multiple file system mounts on the same device. It's no longer needed.

- Track opens by vnode and by device. Issue cdev_close() when the last open
goes away, instead of abusing vnode::v_usecount to tell if the device is
open.
 1.196 02-Jan-2008  yamt use kmem_alloc instead of malloc.
 1.195 02-Jan-2008  ad Merge vmlocking2 to head.
 1.194 08-Dec-2007  pooka branches: 1.194.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
 1.193 26-Nov-2007  pooka branches: 1.193.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.192 28-Oct-2007  yamt branches: 1.192.2;
make NFS_ATTRTIMEO a function.
 1.191 27-Jul-2007  yamt branches: 1.191.4; 1.191.6; 1.191.10; 1.191.12;
stop nfs tick when we have nothing to do.
 1.190 09-Jul-2007  ad branches: 1.190.2;
Fix build when not !NFSSERVER.
 1.189 09-Jul-2007  ad Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.188 06-Jun-2007  yamt nfs_getattrcache: simplify. no functional changes.
 1.187 28-May-2007  yamt - remove nfs_exit exit hook. ok'ed by christos@.
- as far as i understand the code, it shouldn't be necessary
because nfs_request can't return without removing its request
and r->r_lwp is either curlwp or NULL.
- even if it's necessary, leaking requests is not the correct way
to recover from the condition.
- nfs_request: add a related assertion.
 1.186 29-Apr-2007  yamt use mutex and condvar.
 1.185 22-Apr-2007  dsl Change the way that emulations locate files within the emulation root to
avoid having to allocate space in the 'stackgap'
- which is very LWP unfriendly.
The additional code for non-emulation namei() is trivial, the reduction for
the emulations is massive.
The vnode for a process's emulation root is saved in the cwdi structure
during process exec.
If the emulation root and the TRYEMULROOT flag are set, namei() will do an initial
search for absolute pathnames in the emulation root, if that fails it will
retry from the normal root.
".." at the emulation root will always go to the real root, even in the middle
of paths and when expanding symlinks.
Absolute symlinks found using absolute paths in the emulation root will be
relative to the emulation root (so /usr/lib/xxx.so -> /lib/xxx.so links
inside the emulation root don't need changing).
If the root of the emulation would be returned (for an emulation lookup), then
the real root is returned instead (matching the behaviour of emul_lookup,
but being a cheap comparison here) so that programs that scan "../.."
looking for the root directory don't loop forever.
The target for symbolic links is no longer mangled (it used to get the
CHECK_ALT_xxx() treatment, so could get /emul/xxx prepended).
CHECK_ALT_xxx() are no more. Most of the change is deleting them, and adding
TRYEMULROOT to the flags to NDINIT().
A lot of the emulation system call stubs could now be deleted.
 1.184 09-Mar-2007  yamt branches: 1.184.2; 1.184.4;
nfs_check_wccdata: print timestamps.
 1.183 04-Mar-2007  christos Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.182 22-Feb-2007  thorpej TRUE -> true, FALSE -> false
 1.181 21-Feb-2007  thorpej Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.180 15-Feb-2007  yamt branches: 1.180.2;
use mutex and rwlock rather than lockmgr.
 1.179 27-Dec-2006  yamt remove nqnfs.
 1.178 09-Dec-2006  chs a smorgasbord of improvements to vnode locking and path lookup:
- LOCKPARENT is no longer relevant for lookup(), relookup() or VOP_LOOKUP().
these now always return the parent vnode locked. namei() works as before.
lookup() and various other paths no longer acquire vnode locks in the
wrong order via vrele(). fixes PR 32535.
as a nice side effect, path lookup is also up to 25% faster.
- the above allows us to get rid of PDIRUNLOCK.
- also get rid of WANTPARENT (just use LOCKPARENT and unlock it).
- remove an assumption in layer_node_find() that all file systems implement
a recursive VOP_LOCK() (unionfs doesn't).
- require that all file systems supply vfs_vptofh and vfs_fhtovp routines.
fill in eopnotsupp() for file systems that don't support being exported
and remove the checks for NULL. (layerfs calls these without checking.)
- in union_lookup1(), don't change refcounts in the ISDOTDOT case, just
adjust which vnode is locked. fixes PR 33374.
- apply fixes for ufs_rename() from ufs_vnops.c rev. 1.61 to ext2fs_rename().
 1.177 09-Nov-2006  yamt branches: 1.177.2;
remove some __unused in function parameters.
 1.176 20-Oct-2006  reinoud Replace the LIST structure mp->mnt_vnodelist with a TAILQ structure since all
vnodes were synced and processed backwards. This meant that the last
accessed node was processed first and the earliest last.

An extra benefit is the removal of the ugly hack from the Berkeley days on
LFS.

In the process, I've also replaced the various hand-written loops
with the TAILQ_FOREACH() macro.
 1.175 17-Oct-2006  dogcow now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.
 1.174 14-Oct-2006  yamt grab glock when calling uvm_vnp_setsize, so that it doesn't interfere
with mmap'ed accesses. this fixes an assertion failure in nfs_doio_read.
("vp->v_size >= uiop->uio_offset + uiop->uio_resid")
 1.173 13-Oct-2006  christos more __unused
 1.172 13-Oct-2006  dogcow more unused variable fallout.
 1.171 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.170 04-Sep-2006  yamt branches: 1.170.2; 1.170.4;
nfs_ispublicfh: fix another instance of cast-qual.
 1.169 04-Sep-2006  yamt remove (void *) cast from NFSRVFH_DATA as it sometimes
discards const qualifier. pointed out by Havard Eidnes.
(it wasn't detected by in-tree gcc4. seems like a compiler bug.)
 1.168 02-Sep-2006  yamt nfsd: deal with variable-sized filehandles.
 1.167 02-Sep-2006  christos fix default type decls
fix incomplete initializer
 1.166 13-Jul-2006  martin Fix alignment problems for fhandle_t, exposed by gcc4.1.

While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ,
version the getfh(2) syscall and explicitly pass the size available in
the filehandle from userland.

Discussed on tech-kern, with lots of help from yamt (thanks!).
 1.165 07-Jun-2006  kardel branches: 1.165.2;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.164 19-May-2006  yamt branches: 1.164.2;
- fix compilation problem for !NFSSERVER && NFS.
pointed by Tom Spindler on source-changes@.
- make nfs_srvdesc_pool static.
 1.163 18-May-2006  yamt - fix some leaks in nfsd, introduced by kauth changes.
- simplify code.
- add some assertions.
- wrap some long lines.
- remove an unnecessary ";".
 1.162 14-May-2006  elad integrate kauth.
 1.161 14-May-2006  christos XXX: GCC uninitialized
 1.160 15-Apr-2006  christos Coverity CID 1141: Add a KASSERT before deref.
 1.159 15-Apr-2006  christos Coverity CID 1142: Add a KASSERT before deref.
 1.158 01-Mar-2006  yamt branches: 1.158.2; 1.158.4; 1.158.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.
 1.157 16-Jan-2006  yamt branches: 1.157.2; 1.157.4;
- tweak RUN_ONCE api to allow init_func to return an error.
- physio: handle failure of workqueue_create.
 1.156 11-Dec-2005  christos branches: 1.156.2;
merge ktrace-lwp.
 1.155 25-Nov-2005  thorpej Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().
 1.154 22-Nov-2005  yamt - reduce number of linear search per rpc.
- coalesce mount_netexport_pair into netexport.
 1.153 23-Sep-2005  jmmv branches: 1.153.6;
Apply the NFS exports list rototill patch:

- Remove all NFS related stuff from file system specific code.
- Drop the vfs_checkexp hook and generalize it in the new nfs_check_export
function, thus removing redundancy from all file systems.
- Move all NFS export-related stuff from kern/vfs_subr.c to the new
file sys/nfs/nfs_export.c. The former was becoming large and its code
is always compiled, regardless of the build options. Using the latter,
the code is only compiled in when NFSSERVER is enabled. While doing this,
also make some functions in nfs_subs.c conditional to NFSSERVER.
- Add a new command in nfssvc(2), called NFSSVC_SETEXPORTSLIST, that takes a
path and a set of export entries. At the moment it can only clear the
exports list or append entries, one by one, but it is done in a way that
allows setting the whole set of entries atomically in the future (see the
comment in mountd_set_exports_list or in doc/TODO).
- Change mountd(8) to use the nfssvc(2) system call instead of mount(2) so
that it becomes file system agnostic. In fact, all this whole thing was
done to remove a 'XXX' block from this utility!
- Change the mount*, newfs and fsck* userland utilities to not deal with NFS
exports initialization; done internally by the kernel when initializing
the NFS support for each file system.
- Implement an interface for VFS (called VFS hooks) so that several kernel
subsystems can run arbitrary code upon receipt of specific VFS events.
At the moment, this only provides support for unmount and is used to
destroy NFS exports lists from the file systems being unmounted, though it
has room for extension.

Thanks go to yamt@, chs@, thorpej@, wrstuden@ and others for their comments
and advice in the development of this patch.
 1.152 19-Sep-2005  christos ATTRTIMEO takes 2 args.
 1.151 19-Aug-2005  yamt as we now have 64bit ino_t, no need to truncate nfsv3 fileids.
 1.150 07-Jul-2005  christos 1. use p = uio->uio_procp consistently and eliminate suspicious uses
of curproc (where uio->uio_procp should be used?). Don't do this
for nfs_commit(), because yamt says it is possibly wrong.
2. nfs_doio() does not use struct proc; remove it and the code to compute it.
3. use copyin_proc() and copyout_proc() instead of copyin() and copyout().
4. check return of copyout_proc(). and mark return from copyin_proc() XXX
5. Eliminate the p == curproc assertion check from nfs_write;
nfs_read does not have it and we might be called in a different
process context anyway (PR 20138).
 1.149 29-May-2005  christos branches: 1.149.2;
- sprinkle const
- avoid shadowed variables
- mark bad const use with XXXUNCONST
 1.148 26-Feb-2005  perry branches: 1.148.2;
nuke trailing whitespace
 1.147 28-Jan-2005  yamt nfs_namei: return EACCES for empty filenames as rfc1813 says.
 1.146 28-Jan-2005  yamt nfs_clearcommit: don't attempt to clear commit info (n_pushlo, etc)
unless the vnode is of VREG. union members used to keep commit info
are used for other purposes in the case of !VREG.
 1.145 27-Jan-2005  yamt keep directory eof cache when inactivating vnode
because there's no reason to throw it away.
(fix an unintended side effect of nfs_subs.c rev.1.144.)
 1.144 26-Jan-2005  yamt handle a really empty directory, which doesn't have even the dot entry.
 1.143 25-Jan-2005  yamt branches: 1.143.2;
nfs_check_wccdata: comment.
 1.142 21-Jan-2005  yamt s/time/mono_time/ for n_attrstamp and n_accstamp. (parts of) PR/25641.
 1.141 19-Jan-2005  yamt implement inaccurate mtime/ctime detection.
namely, if mtime or ctime are same between pre_op_attr and post_op_attr
when we expected them to be changed, don't trust the server.
 1.140 09-Jan-2005  yamt branches: 1.140.2;
invalidate cache if filesize is changed besides our activity
because it means that we're out of sync with the server.
 1.139 06-Jan-2005  yamt nfs_loadattrcache: invalidate access cache when ctime is changed.
 1.138 26-Oct-2004  yamt since daddr_t is 64-bit these days, simply use nfs directory cookies
as buffer cache indexes. regress/sys/fs/getdents is now supposed to work.
fix PR/27112.
 1.137 03-Oct-2004  yamt nfs_enterdircache: initialize dc_flags of a newly allocated dircache entry.
provided by Greg Oster.
 1.136 17-Sep-2004  skrll There's no need to pass a proc value when using UIO_SYSSPACE with
vn_rdwr(9) and uiomove(9).

OK'd by Jason Thorpe
 1.135 15-Sep-2004  yamt fix access-after-free bugs in dircache code by refcounting nfsdircache.
PR/26864.
 1.134 14-Jun-2004  yamt nfs_searchdircache: fix a null dereference in the case that
offset!=0 and dircache hasn't been initialized yet.
 1.133 21-Apr-2004  christos Replace the statfs() family of system calls with statvfs().
Retain binary compatibility.
 1.132 19-Mar-2004  yamt branches: 1.132.2;
nfs_getattrcache: deal with timer wraparound.
 1.131 12-Mar-2004  yamt shrink sizeof struct nfsnode by putting exclusive members into union.
 1.130 29-Nov-2003  yamt nfs_zeropad: remove an unneeded substitution (and clean up a little.)
 1.129 02-Oct-2003  itojun plug mbuf leak due to manual mbuf handling. PR kern/13807.
(martti confirmed that it stabilizes the situation described in kern/13807)
 1.128 26-Sep-2003  yamt change n_mtime from time_t to timespec in order to improve
cache consistency.
(1 second granularity is too loose these days.)
 1.127 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.126 23-Jul-2003  yamt when rexmitting a request due to NFSERR_JUKEBOX,
use a new xid as RFC1813 says.
 1.125 29-Jun-2003  fvdl branches: 1.125.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.124 29-Jun-2003  thorpej Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.123 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.122 09-Jun-2003  yamt rework zero padding of rpc reply.
- for READ procedure, don't send back more bytes than requested.
- don't have doubtful assumptions on mbuf chain structure.
- rename a function (nfsm_adj -> nfs_zeropad) to avoid confusion as
the semantics of the function was changed.
 1.121 22-May-2003  yamt poolify nfsrv_descript.
 1.120 22-May-2003  yamt avoid double free with xlatecookie.
 1.119 07-May-2003  yamt simple lock for nfs iod.
 1.118 03-May-2003  yamt tweak nfsm_adj to pay attention to read only mbufs.
 1.117 03-May-2003  yamt more comment.
 1.116 03-May-2003  yamt better handling of write verifier change.
 1.115 24-Apr-2003  drochner Change some subordinate functions to take a "struct nfsnode" argument
instead of "struct vnode". This saves a number of pointer dereferences;
it sums up to about half a kB for me. And it paves the way for future
fixes.
While cleaning up, eliminate a write-only member of "struct nfsreq"
and a pointless assignment in the NFS_V2_ONLY case.
 1.114 16-Apr-2003  yamt sync a comment with reality.
 1.113 02-Apr-2003  yamt use queue manipulation macros.
 1.112 01-Apr-2003  yamt add an assertion.
 1.111 31-Mar-2003  yamt rename fvdl_debug to NFS_DEBUG_COMMIT.

ok'ed by fvdl.
 1.110 28-Mar-2003  yamt reply ENAMETOOLONG properly instead of discarding request as BADRPC.
my own PR20791.
 1.109 26-Feb-2003  matt Add MBUFTRACE kernel option.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
 1.108 10-Feb-2003  christos move the MALLOC decl for DIROFFS to nfs_subs.c
 1.107 01-Dec-2002  matt Make sure these all agree on the same definitions of various variables.
 1.106 23-Oct-2002  jdolecek merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
 1.105 21-Oct-2002  yamt fix a page locking deadlock problem for nfs.

add a flag that specifies if the file can be truncated safely or not
to nfsm_loadattr and friends. when it isn't safe, just mark the nfsnode
as "should be truncated later".

ok'ed by Frank van der Linden and Chuck Silvers.
close kern/18036.
 1.104 23-Aug-2002  enami s/FREE/PNBUF_PUT/
 1.103 17-Mar-2002  christos branches: 1.103.4; 1.103.6;
use the exithook mechanism to remove the exiting process from the list
of processes to be signalled in a soft mount.
 1.102 11-Mar-2002  jdolecek Cosmetic change for nfs_enterdircache() - since 'blkno' is last arg,
define its type last, too.
Noted in kern/14742 by John Franklin.
 1.101 28-Feb-2002  fvdl Invalidate the access cache when loading a new set of attributes into
the attribute cache. Fixes access cache problem seen by
Nathan Funk of the UofS, relayed by Greg Oster.
 1.100 26-Jan-2002  chs re-enable NFSv3 commit RPCs by abandoning my new approach in favor of
frank's scheme, with one new twist: don't wait until we've totally run
out of free pages before committing, but instead notice when we've built
up a largish range of uncommitted pages and commit only the older half of
the range, which is likely to already be on disk on the server.
 1.99 10-Nov-2001  lukem add RCSIDs
 1.98 27-Sep-2001  fvdl branches: 1.98.2;
Always initialize ni_rootdir in nfs_namei. From Andrei Petrov.
 1.97 15-Sep-2001  chs a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
 1.96 15-Sep-2001  chs add a new VFS op, vfs_reinit, which is called when desiredvnodes is
adjusted via sysctl. file systems that have hash tables which are
sized based on the value of this variable now resize those hash tables
using the new value. the max number of FFS softdeps is also recalculated.

convert various file systems to use the <sys/queue.h> macros for
their hash tables.
 1.95 07-Jun-2001  lukem branches: 1.95.2; 1.95.4;
delint lvalue cast abuse
 1.94 21-Apr-2001  bjh21 In nfs_loadattrcache(), if checkalias() gives us a new vnode, lock it. This
prevents us losing the locked state of the old vnode.

fvdl thinks the old vnode is certain to be locked at this point. I've put in
a KASSERT to be on the safe side.

This seems to fix PR kern/12661.
 1.93 23-Mar-2001  fvdl Same change as in the UFS code: unlock vnode before setting v_op
to spec_vnode_ops. From Bill Studenmund.
 1.92 21-Feb-2001  jdolecek branches: 1.92.2;
make some more constant arrays 'const'
 1.91 14-Feb-2001  fvdl Fix some possible locking errors in nfs_namei (XXX this function should die)
 1.90 18-Jan-2001  jdolecek constify
 1.89 27-Nov-2000  chs Initial integration of the Unified Buffer Cache project.
 1.88 08-Nov-2000  ad Update for hashinit() change.
 1.87 24-Oct-2000  fvdl Do not accept vnode type changes to an active node. This may wreak
havoc if the server erroneously uses the same filehandle for
different files. This changes back revision 1.28; the PR that
that revision fixed doesn't apply anymore, it has been verified
not to be a problem with this change.
 1.86 27-Sep-2000  fvdl Avoid unused variables for V2_ONLY case.
 1.85 24-Sep-2000  enami Don't bother to clear commit information for the vnode of type VNON.
It is not necessary since it is a vnode being initialized and it shouldn't
be done since filesystem private data may not be assigned yet.
 1.84 19-Sep-2000  bjh21 Extend NFS_V2_ONLY to remove NQNFS lease support as well. Saves another 10k.
 1.83 19-Sep-2000  fvdl Add functions to deal with keeping track of commit ranges.
 1.82 19-Sep-2000  bjh21 New kernel option, NFS_V2_ONLY, which aims to reduce the NFS client to just
that required to support NFSv2 mounts. Not finished yet, but already
provides some 44k of saving in code size on arm26. More savings, and some
documentation, are still to come.
 1.81 03-Aug-2000  thorpej Convert namei pathname buffer allocation to use the pool allocator.
 1.80 03-Aug-2000  thorpej MALLOC()/FREE() are not to be used for variable size allocations.
 1.79 28-Jun-2000  mrg remove include of <vm/vm.h>
 1.78 27-Jun-2000  mrg remove include of <vm/vm.h>
 1.77 20-Jun-2000  mrg branches: 1.77.2;
disable the bloated NFS structure check on 64bit sparc64.
 1.76 09-Jun-2000  fvdl Some tweaks to enable NFS over IPv6. The special-casing of AF_INET
should really be removed.
 1.75 30-Mar-2000  augustss branches: 1.75.2;
Remove register declarations.
 1.74 30-Mar-2000  simonb Delete redundant decl of nfs_pub - it's in <sys/mount.h>.
Delete redundant decls of nfsv{2,3}_type - they're in <nfs/nfsproto.h>.
 1.73 16-Mar-2000  jdolecek Add new VFS op routine - vfs_done and call it on filesystem detach
in vfs_detach(). vfs_done may free global filesystem's resources,
typically those allocated in respective filesystem's init function.
Needed so those filesystems which went in via LKM have a chance to
clean after themselves before unloading. This fixes random panics
when LKM for filesystem using pools was loaded and unloaded several
times.

For each leaf filesystem, add appropriate vfs_done routine.
 1.72 01-Nov-1999  fvdl Stuff values in va_blocksize that are closer to reality.
 1.71 06-Sep-1999  is branches: 1.71.2; 1.71.4; 1.71.6;
Don't truncate minor numbers >= 256.
Problem reported by Saitoh Masanobu, fix by Frank van der Linden.
 1.70 08-Jul-1999  wrstuden Modify file systems to deal with struct lock in struct vnode. All leaf
fs's other than nfs use genfs_lock() for locking.

Modify lookup routines to set PDIRUNLOCK when they unlock the parent.
 1.69 24-Mar-1999  mrg branches: 1.69.2; 1.69.4;
completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.
 1.68 16-Mar-1999  fvdl ..JUKEBOX can happen on writes too.
 1.67 16-Mar-1999  fvdl The JUKEBOX error may be returned by the read operation, so don't
filter it out.
 1.66 06-Mar-1999  fair Snatch a patch from OpenBSD to fix PRs 6529 and 7074.
Adjust fxdr_hyper() and txdr_hyper() macros.
 1.65 27-Feb-1999  wrstuden Rationalize the vfs_checkexp macro to be VFS_CHECKEXP.
 1.64 26-Feb-1999  wrstuden Modify vfsops to separate vfs_fhtovp() into two routines. vfs_fhtovp() now
only handles the file handle to vnode conversion, and a new call,
vfs_checkexp(), performs the export verification.
 1.63 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.62 05-Jul-1998  jonathan defopt ISO TPIP.
 1.61 25-Jun-1998  thorpej defopt NFSSERVER
 1.60 24-Jun-1998  sommerfe Always include fifos; "not an option any more".
 1.59 22-Jun-1998  sommerfe defopt for options FIFO
 1.58 08-May-1998  kleink Fix some arithmetics lossage on typeless pointers.
 1.57 03-Mar-1998  fvdl Only free cookies on error when they were actually allocated by the readdir vop.
 1.56 01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.55 19-Feb-1998  thorpej Include the NFS option header.
 1.54 10-Feb-1998  mrg - add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.
 1.53 07-Feb-1998  chs add flags arg to hashinit(), to pass to malloc().
 1.52 06-Feb-1998  mikel ELAST incremented, update nfsrv_v2errmap[] initialization
 1.51 05-Feb-1998  mrg initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)
 1.50 22-Jan-1998  fvdl Refuse to create entries in the dir cache for offset 0. This is a special
case anyway, and amd(8) erroneously returns some entries with cookie 0.
Fixes PR 4844
 1.49 19-Oct-1997  fvdl branches: 1.49.2;
* Implement optional 32 <-> 64 bit directory cookie translation. This uses
the directory cache as translation table. See nfs_subs.c for comments.
Makes the code a bit more complex to look at than I would have liked,
but doesn't affect the speed of the default behavior.
* Optimize caching behavior a bit when buffers are invalidated.
* Save some RPCs in readdir operations by not bothering if there is
a small amount left to do to fill the buffer. It'll be done in the
next RPC with a larger chunk anyway. Wastes a bit of buffer space
but is faster.
* Make n_vattr an allocated vattr struct. This avoids nfsnode bloat,
and is friendlier to the malloc routines.
 1.48 11-Oct-1997  fvdl Move cookie heuristic function inside ifdef NFS, to make a kernel with server
code but without client code link again. From Erik Bertelsen, PR 4259
 1.47 10-Oct-1997  fvdl Fix uninitialized var warning (did not appear on i386, but did on sparc).
 1.46 10-Oct-1997  fvdl * New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.45 14-Jul-1997  fvdl branches: 1.45.2;
Don't assume that pointers into mbuf data remain valid across nfsm_dissect.
In readdirplus, don't keep such pointers but store the file attributes
in a variable instead until they are needed. Change nfsm_loadattr*
a bit so it can accept a direct pointer to an nfs_fattr structure.
 1.44 04-Jul-1997  drochner Don't cast 64bit (off_t) file sizes to vm_offset_t (32bit on many
architectures), truncate them intelligently instead.
The truncation is done centralized in vnode_pager.c.
This prevents wrap-around effects when parts of large (>2^32 byte) files
are mmapped.
Don't allow to mmap above the numerical range of vm_offset_t.
This is considered a temporary solution until the vm system handles the
object sizes/offsets more cleanly.
 1.43 24-Jun-1997  fvdl Extend lookup handling for WebNFS. This means that nfs_namei deals
with full pathname lookups if a public filehandle is used, and that
it translates the '%' escapes (URL-style) in the same case. Also,
make nfsrv_fhtovp convert the public filehandle to the vp of the
publicly exported filesystem, as stored in the nfs_pub structure.
 1.42 08-May-1997  mycroft Pass the vnode type to vaccess(), and use it when checking VEXEC. Make sure
that the mode bits passed to vaccess() and returned by foo_getattr() contain
only permission bits.
 1.41 27-Mar-1997  thorpej Update for new mbuf code.
 1.40 23-Mar-1997  fvdl Check for the use of reserved ports on a per-request basis, unless
MNT_EXNORESPORT is specified. The check is cheap and doesn't impose
any extra overhead.
 1.39 22-Feb-1997  fvdl Fixes from BSDI (thanks go to Keith Bostic). Original RCS messages:

date: 1997/02/10 18:41:14; author: cp; state: Exp; lines: +110 -46
Make nfs_realign go away on sparc and add functionality to nfsm_disct.
 1.38 31-Jan-1997  thorpej branches: 1.38.2;
NFSCLIENT -> NFS.
 1.37 09-Dec-1996  fvdl branches: 1.37.2;
Comment change in previous made for some bad english..
 1.36 09-Dec-1996  fvdl Move '#ifdef NFSSERVER' back to the right spot; NQNFS datastructures need
to be initialized on both the client and the server side. Remove misleading
comment about this being just server stuff.
 1.35 03-Dec-1996  thorpej Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.
 1.34 02-Dec-1996  thorpej NFS performance improvement from Doug Rabson/FreeBSD:

Improve the queuing algorithms used by NFS' asynchronous i/o. The
existing mechanism uses a global queue for some buffers and the
vp->b_dirtyblkhd queue for others. This turns sequential writes into
randomly ordered writes to the server, affecting both read and write
performance. The existing mechanism also copes badly with hung
servers, tending to block accesses to other servers when all the iods
are waiting for a hung server.

The new mechanism uses a queue for each mount point. All asynchronous
i/o goes through this queue which preserves the ordering of requests.
A simple mechanism ensures that the iods are shared out fairly between
active mount points.

Reviewed/integrated/approved by Frank van der Linden <fvdl@netbsd.org>
 1.33 25-Oct-1996  cgd make the namei struct members ni_dirp and ni_next, and the componentname
struct member cn_nameptr 'const', since they should never be used to
modify the path name. (Only the pathname buffer, cn_pnbuf, should be
modified.) Propagate the const poisoning to code that uses the namei
and componentname structs.
 1.32 13-Oct-1996  christos revert kprintf changes
 1.31 10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.30 07-Jul-1996  fvdl Start XIDs at a value based on the current time, not 0. This avoids nasty
XID confusions with servers that cache them over a long period and
with clients that reboot quickly.

Problems: because of the sanity check that is done by comparing the system
time with filesystem time, XIDs will start at 0 until root is mounted,
which means it isn't completely safe for diskless setups. But it's clearly
better than it was. It would also be cleaner if all XID handling (more
generally, all RPC handling) within the kernel went through the
same functions.
 1.29 01-Jul-1996  fvdl We're only handling uio with iovcnt == 1, so don't ever attempt to increment
uio_iov, this will get us into nasty trouble. (Thanks to Matthias Drochner for
tracking this down).
 1.28 23-May-1996  fvdl * Make mounts with symlinks work (needed for direct mounts with amd). PR #1917
* Never change the NQNFS flag and/or version when just doing an update mount.
Fixes a problem that made diskless booting impossible under some
circumstances.
 1.27 03-Apr-1996  thorpej branches: 1.27.4;
Make these link in the absence of "options FIFO".
 1.26 13-Mar-1996  fvdl Disable invalidating of directory offsets cookies. Should fix one or two
directory problems.

XXX There is no clean solution to the cookie/cookieverifier validity mess.
Together with the disabled strict cookie check, this puts us back at
what v2 did in this case. Slightly better solution possible by
consequently storing 64bit cookies in other places too.
 1.25 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.24 09-Feb-1996  christos nfs prototype changes
 1.23 01-Feb-1996  jtc Rename struct timespec fields to conform to POSIX.1b
 1.22 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.21 08-Sep-1995  ws No point in computing cn_hash here,
as it gets immediately recomputed in lookup
 1.20 02-Jun-1995  mycroft Exported group list now starts at offset 0, not 1.
 1.19 01-Jun-1995  jtc Moved egid credential from cr_groups[0] to new field cr_gid. POSIX.1
requires that sgid executables and the setuid() syscall *not* change
the supplemental group list.
 1.18 18-Aug-1994  mycroft More LIST/CIRCLEQ migration.
 1.17 17-Aug-1994  mycroft Convert some more lists and queues.
 1.16 17-Aug-1994  mycroft Change the reply list to a TAILQ.
 1.15 22-Jul-1994  mycroft Set the group list length when copying credentials.
 1.14 29-Jun-1994  cgd branches: 1.14.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.13 13-Jun-1994  mycroft Move a misplaced #endif.
 1.12 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.11 09-Mar-1994  ws Make FFS optional
 1.10 06-Feb-1994  mycroft Eliminate some more uses of b_actl.
 1.9 18-Dec-1993  mycroft Canonicalize all #includes.
 1.8 07-Sep-1993  ws branches: 1.8.2;
Changes to VFS readdir semantics
NFS changes for better cookie support
ISOFS changes for better Rockridge support and support for generation numbers
 1.7 02-Aug-1993  mycroft Make bpos arg to nfsm_reqh a caddr_t*, not a caddr_t**, as that's what it
is actually passed.
 1.6 13-Jul-1993  cgd bpos is really a caddr_t **. doesn't really make a diff to the code
generated...
 1.5 13-Jul-1993  cgd diskless changes made last time were hosed; were using NULL for
"no credentials" rather than NOCRED.
 1.4 07-Jul-1993  cgd changes from ws to support diskless booting... these are "OK" on inspection
and after testing... (actually, currently, none of the changed
code is even used...)
 1.3 21-May-1993  cgd add rcsid again; fix RCS+crash fuckup
 1.2 10-Apr-1993  glass migrated code to make split possible
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.3 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.8.2.2 14-Nov-1993  mycroft Canonicalize all #includes.
 1.8.2.1 24-Sep-1993  mycroft Make all files using spl*() #include cpu.h. Changes from trunk.
nfs_vfsops.c, nfsmount.h: Make nfs_quotactl() take an int rather than a uid_t,
as it might be -1.
nfs_vnops.c: va_size and va_bytes are now quads.
 1.14.2.2 19-Aug-1994  mycroft update from trunk
 1.14.2.1 24-Jul-1994  cgd from trunk.
 1.27.4.3 08-Jul-1996  jtc Pulled up from rev 1.30 by request from Frank van der Linden
 1.27.4.2 02-Jul-1996  jtc Pulled up from rev 1.29 by request from fvdl
 1.27.4.1 25-May-1996  fvdl Pull in bugfixes from main branch.
 1.37.2.1 14-Jan-1997  thorpej Snapshot of work-in-progress, committed to private branch.

These changes implement machine-independent root device and file system
selection. Notable features:

- All ports behave in a consistent manner regarding root
device selection.
- No more "options GENERIC"; all kernels have the ability
to boot with RB_ASKNAME to select root device and file system
type.
- Root file system type can be wildcarded; a machine-independent
function will try all possible file systems for the selected
root device until one succeeds.
- If the root file system fails to mount, the operator will
be given the chance to select a new root device and file
system type, rather than having the machine simply panic.
- nfs_mountroot() no longer panics if any part of the NFS
mount process fails; it now returns an error, giving the
operator a chance to recover.
- New, more consistent, config(8) grammar. The constructs:

config netbsd swap generic
config netbsd root on nfs

have been replaced with:

config netbsd root on ? type ?
config netbsd root on ? type nfs

Additionally, the operator may select or wildcard root file
system type in the kernel configuration file:

config netbsd root on cd0a type cd9660

config(8) now requires that a "root" specification be
made. "root" may be wired down or wildcarded. "swap" and
"dump" specifications are optional, and follow previous
semantics.

- config(8) has a new "file-system" keyword, used to configure
file systems into the kernel. Eventually, this will be used
to generate the default vfssw[].

- "options NFSCLIENT" is obsolete, and is replaced by
"file-system NFS". "options NFSSERVER" still exists, since
NFS server support is independent of the NFS file system
client.

- sys/arch/<foo>/<foo>/swapgeneric.c is no longer used, and
will be removed; all information is now generated by config(8).

As of this commit, all ports except arm32 have been updated to use
the new setroot(). Only SPARC, i386, and Alpha ports have been
tested at this time. Port masters should test these changes on their
ports, and report any problems back to me.

More changes are on their way, including RB_ASKNAME support in
nfs_mountroot() (to prompt for server address and path) and, potentially,
the ability to select rarp/bootparam or bootp in nfs_mountroot().
 1.38.2.1 12-Mar-1997  is Merge in changes from Trunk
 1.45.2.1 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.49.2.1 29-Jan-1998  mellon Pull up 1.50 (fvdl)
 1.69.4.1 02-Aug-1999  thorpej Update from trunk.
 1.69.2.1 10-Sep-1999  he Pull up revision 1.71:
Don't truncate minor numbers > 255 on a NFS client. (is)
 1.71.6.1 27-Dec-1999  wrstuden Pull up to last week's -current.
 1.71.4.1 15-Nov-1999  fvdl Sync with -current
 1.71.2.7 23-Apr-2001  bouyer Sync with HEAD.
 1.71.2.6 27-Mar-2001  bouyer Sync with HEAD.
 1.71.2.5 12-Mar-2001  bouyer Sync with HEAD.
 1.71.2.4 11-Feb-2001  bouyer Sync with HEAD.
 1.71.2.3 08-Dec-2000  bouyer Sync with HEAD.
 1.71.2.2 22-Nov-2000  bouyer Sync with HEAD.
 1.71.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.75.2.1 22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.77.2.5 15-Nov-2001  he Pull up revision 1.98 (requested by fvdl):
Always initialize ni_rootdir in nfs_namei(). This fixes a problem
where use of ``..'' would instead use the information for ``.'',
when using NetBSD/alpha NFS servers, and fixes PR#11618 and
PR#12953.
 1.77.2.4 06-Apr-2001  he Pull up revision 1.93 (requested by wrstuden):
Explicitly VOP_UNLOCK before setting v_op to spec_vnode_ops_p.
Works around a lock leak and eventual kernel panic.
 1.77.2.3 14-Dec-2000  he Pull up revisions 1.85,1.87 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.77.2.2 30-Oct-2000  tv Pullup 1.87 [fvdl]:
Do not accept vnode type changes to an active node. This may wreak
havoc if the server erroneously uses the same filehandle for
different files. This changes back revision 1.28; the PR that
that revision fixed doesn't apply anymore, it has been verified
not to be a problem with this change.
 1.77.2.1 20-Jun-2000  tv file nfs_subs.c was added on branch netbsd-1-5 on 2000-10-30 22:22:58 +0000
 1.92.2.11 11-Dec-2002  thorpej Sync with HEAD.
 1.92.2.10 11-Nov-2002  nathanw Catch up to -current
 1.92.2.9 22-Oct-2002  thorpej Sync with HEAD.
 1.92.2.8 27-Aug-2002  nathanw Catch up to -current.
 1.92.2.7 01-Apr-2002  nathanw Catch up to -current.
(CVS: It's not just a program. It's an adventure!)
 1.92.2.6 28-Feb-2002  nathanw Catch up to -current.
 1.92.2.5 14-Nov-2001  nathanw Catch up to -current.
 1.92.2.4 08-Oct-2001  nathanw Catch up to -current.
 1.92.2.3 21-Sep-2001  nathanw Catch up to -current.
 1.92.2.2 21-Jun-2001  nathanw Catch up to -current.
 1.92.2.1 09-Apr-2001  nathanw Catch up with -current.
 1.95.4.1 01-Oct-2001  fvdl Catch up with -current.
 1.95.2.6 30-Sep-2002  jdolecek add support for kevents to NFS
to detect file changes on server by other NFS clients, polling kernel thread
is used to periodically check for attribute changes of watched files;
the NFS server is only contacted when the vnode expires from local attrcache
(which takes 5-60 seconds currently), to keep network&CPU overhead low

the routine checking for remote changes is quite simplistic, but hopefully
doing its job well enough
 1.95.2.5 06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.95.2.4 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.95.2.3 16-Mar-2002  jdolecek Catch up with -current.
 1.95.2.2 11-Feb-2002  jdolecek Sync w/ -current.
 1.95.2.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.98.2.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.103.6.1 04-Oct-2003  tron Pull up revision 1.129 (requested by martti in ticket #1506):
plug mbuf leak due to manual mbuf handling. PR kern/13807.
(martti confirmed that it stabilizes the situation described in kern/13807)
 1.103.4.1 29-Aug-2002  gehenna catch up with -current.
 1.125.2.13 11-Dec-2005  christos Sync with head.
 1.125.2.12 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.125.2.11 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.125.2.10 04-Feb-2005  skrll Sync with HEAD.
 1.125.2.9 24-Jan-2005  skrll Sync with HEAD.
 1.125.2.8 17-Jan-2005  skrll Sync with HEAD.
 1.125.2.7 02-Nov-2004  skrll Sync with HEAD.
 1.125.2.6 19-Oct-2004  skrll Sync with HEAD
 1.125.2.5 21-Sep-2004  skrll Fix the sync with head I botched.
 1.125.2.4 18-Sep-2004  skrll Sync with HEAD.
 1.125.2.3 24-Aug-2004  skrll Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.125.2.2 03-Aug-2004  skrll Sync with HEAD
 1.125.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.132.2.5 16-Mar-2005  tron Pull up revision 1.146 (requested by yamt in ticket #1148):
nfs_clearcommit: don't attempt to clear commit info (n_pushlo, etc)
unless the vnode is of VREG. union members used to keep commit info
are used for other purposes in the case of !VREG.
 1.132.2.4 11-Jan-2005  jmc Pullup rev 1.140 (requested by yamt in ticket #1079)

Invalidate the cache if the file size changed other than through our own
activity, because it means that we're out of sync with the server.
 1.132.2.3 04-Oct-2004  jmc branches: 1.132.2.3.2;
Pullup rev 1.137 (requested by yamt in ticket #889)

nfs_enterdircache: initialize dc_flags of a newly allocated dircache entry.
 1.132.2.2 18-Sep-2004  he Pull up revision 1.135 (requested by yamt in ticket #858):
Fix access-after-free bugs in dircache code by reference
counting nfsdircache. Fixes PR#26864.
 1.132.2.1 21-Jun-2004  tron Pull up revision 1.134 (requested by yamt in ticket #515):
nfs_searchdircache: fix a null dereference in the case that
offset!=0 and dircache hasn't been initialized yet.
 1.132.2.3.2.4 27-Oct-2005  riz Pull up following revision(s) (requested by christos in ticket #5863):
sys/nfs/nfs_subs.c: revision 1.152 via patch
sys/nfs/nfs.h: revision 1.49
sys/nfs/nfs_vfsops.c: revision 1.149 via patch
usr.sbin/amd/include/config.h: revision 1.36
sys/nfs/nfs_vnops.c: revision 1.227 via patch
sys/nfs/nfsmount.h: revision 1.34
Allow the attribute cache to be turned off, and allow amd to do it.
 1.132.2.3.2.3 16-Mar-2005  tron Pull up revision 1.146 (requested by yamt in ticket #1148):
nfs_clearcommit: don't attempt to clear commit info (n_pushlo, etc)
unless the vnode is of VREG. union members used to keep commit info
are used for other purposes in the case of !VREG.
 1.132.2.3.2.2 30-Jan-2005  he Pull up revision 1.138 (requested by yamt in ticket #968):
Since daddr_t is 64-bit these days, simply use nfs directory
cookies as buffer cache indexes. This should make the
regress/sys/fs/getdents test work. Fixes PR#27112.
 1.132.2.3.2.1 11-Jan-2005  jmc Pullup rev 1.140 (requested by yamt in ticket #1079)

Invalidate the cache if the file size changed other than through our own
activity, because it means that we're out of sync with the server.
 1.140.2.1 29-Apr-2005  kent sync with -current
 1.143.2.3 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.143.2.2 12-Feb-2005  yamt sync with head.
 1.143.2.1 25-Jan-2005  yamt file nfs_subs.c was added on branch yamt-km on 2005-02-12 18:17:55 +0000
 1.148.2.1 27-Sep-2005  tron Pull up following revision(s) (requested by christos in ticket #816):
sys/nfs/nfs_subs.c: revision 1.152
ATTRTIMEO takes 2 args.
 1.149.2.13 17-Mar-2008  yamt sync with head.
 1.149.2.12 27-Feb-2008  yamt revert incomplete nfs client locking for now.
 1.149.2.11 27-Feb-2008  yamt sync with head.
 1.149.2.10 15-Feb-2008  yamt - sprinkle some locks.
- disable MNT_UPDATE because it involves too much locking headache.
- don't overwrite other bits in v_vflags when setting VV_ROOT.
 1.149.2.9 04-Feb-2008  yamt sync with head.
 1.149.2.8 21-Jan-2008  yamt sync with head
 1.149.2.7 07-Dec-2007  yamt sync with head
 1.149.2.6 15-Nov-2007  yamt sync with head.
 1.149.2.5 03-Sep-2007  yamt sync with head.
 1.149.2.4 26-Feb-2007  yamt sync with head.
 1.149.2.3 30-Dec-2006  yamt sync with head.
 1.149.2.2 21-Jun-2006  yamt sync with head.
 1.149.2.1 07-Jul-2005  yamt adapt to mbuf.h changes.
 1.153.6.2 29-Nov-2005  yamt sync with head.
 1.153.6.1 22-Nov-2005  yamt sync with head.
 1.156.2.2 01-Feb-2006  yamt sync with head.
 1.156.2.1 31-Dec-2005  yamt - adapt nfs.
- nfs_doio_read: #if 0 out "killproc if text is modified" part of
the code as it's broken. (a process reading the modified text is not
necessarily a process which is using the file as a text.)
 1.157.4.3 01-Jun-2006  kardel Sync with head.
 1.157.4.2 22-Apr-2006  simonb Sync with head.
 1.157.4.1 04-Feb-2006  simonb Adapt for timecounters: mostly use get*time() and use "time_second"
instead of "time.tv_sec".
 1.157.2.1 09-Sep-2006  rpaulo sync with head
 1.158.6.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.158.4.9 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.158.4.8 19-Apr-2006  elad sync with head.
 1.158.4.7 14-Mar-2006  elad Use kauth_cred_clone() where appropriate.
 1.158.4.6 13-Mar-2006  elad kauth_cred_clone() takes care of the groups for us, remove redundant code.
 1.158.4.5 12-Mar-2006  elad Fix group cleaning loop, as pointed out by yamt@.

While I'm here, add an XXX near the second loop that watches NGROUPS; this
should be internal to kauth(9).
 1.158.4.4 11-Mar-2006  elad Replace check for euid == 0 with kauth_authorize_generic().
 1.158.4.3 10-Mar-2006  elad Cleanup more interface abuse.

Make nfsrv_setcred() take a kauth_cred_t * as outcred. The original code
just modified it directly; we can't do that, nor do we want to.

Get rid of another case of kauth_cred_zero() followed by kauth_cred_hold()
and use kauth_cred_clone() to make sure we don't leave out important
members.

Add another DIAGNOSTIC check for reference count of above one.

Again, this should be tested.
 1.158.4.2 10-Mar-2006  elad Remove some more no longer needed calls to kauth_cred_setngroups() and
nfsrvw_sort().
 1.158.4.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.158.2.5 14-Sep-2006  yamt sync with head.
 1.158.2.4 03-Sep-2006  yamt sync with head.
 1.158.2.3 11-Aug-2006  yamt sync with head
 1.158.2.2 26-Jun-2006  yamt sync with head.
 1.158.2.1 24-May-2006  yamt sync with head.
 1.164.2.2 19-Jun-2006  chap Sync with head.
 1.164.2.1 19-May-2006  chap file nfs_subs.c was added on branch chap-midi on 2006-06-19 04:10:37 +0000
 1.165.2.1 13-Jul-2006  gdamore Merge from HEAD.
 1.170.4.2 10-Dec-2006  yamt sync with head.
 1.170.4.1 22-Oct-2006  yamt sync with head
 1.170.2.2 12-Jan-2007  ad Sync with head.
 1.170.2.1 18-Nov-2006  ad Sync with head.
 1.177.2.1 17-Feb-2007  tron Apply patch (requested by chs in ticket #422):
- Fix various deadlock problems with nullfs and unionfs.
- Speed up path lookups by up to 25%.
 1.180.2.3 07-May-2007  yamt sync with head.
 1.180.2.2 12-Mar-2007  rmind Sync with HEAD.
 1.180.2.1 28-Feb-2007  yamt sync with head. (somehow missed in the previous)
 1.184.4.1 11-Jul-2007  mjf Sync with head.
 1.184.2.9 24-Oct-2007  ad Do locking / use marker vnodes when traversing mountpoint vnode lists.
 1.184.2.8 16-Sep-2007  ad Checkpoint work in progress on the vnode lifecycle and reference counting
stuff. This makes it work properly without kernel_lock and fixes a few
quite old bugs. See vfs_subr.c 1.283.2.17 for details.
 1.184.2.7 20-Aug-2007  ad Sync with HEAD.
 1.184.2.6 15-Jul-2007  ad Sync with head.
 1.184.2.5 15-Jul-2007  ad Sync with head.
 1.184.2.4 01-Jul-2007  ad Adapt to callout API change.
 1.184.2.3 18-Jun-2007  yamt fix a merge botch.
 1.184.2.2 09-Jun-2007  ad Sync with head.
 1.184.2.1 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.190.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.191.12.2 27-Jul-2007  yamt stop nfs tick when we have nothing to do.
 1.191.12.1 27-Jul-2007  yamt file nfs_subs.c was added on branch matt-mips64 on 2007-07-27 10:03:59 +0000
 1.191.10.1 13-Nov-2007  bouyer Sync with HEAD
 1.191.6.3 23-Mar-2008  matt sync with HEAD
 1.191.6.2 09-Jan-2008  matt sync with HEAD
 1.191.6.1 06-Nov-2007  matt sync with HEAD
 1.191.4.3 09-Dec-2007  jmcneill Sync with HEAD.
 1.191.4.2 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.191.4.1 29-Oct-2007  joerg Sync with HEAD.
 1.192.2.3 18-Feb-2008  mjf Sync with HEAD.
 1.192.2.2 27-Dec-2007  mjf Sync with HEAD.
 1.192.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.193.2.2 26-Dec-2007  ad Sync with head.
 1.193.2.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.194.4.1 02-Jan-2008  bouyer Sync with HEAD
 1.199.6.6 17-Jan-2009  mjf Sync with HEAD.
 1.199.6.5 05-Oct-2008  mjf Sync with HEAD.
 1.199.6.4 28-Sep-2008  mjf Sync with HEAD.
 1.199.6.3 05-Jun-2008  mjf Sync with HEAD.

Also fix build.
 1.199.6.2 02-Jun-2008  mjf Sync with HEAD.
 1.199.6.1 03-Apr-2008  mjf Sync with HEAD.
 1.199.2.1 24-Mar-2008  keiichi sync with head.
 1.201.4.7 10-Oct-2010  yamt some locking changes
 1.201.4.6 26-Sep-2010  yamt locking changes
 1.201.4.5 11-Mar-2010  yamt sync with head
 1.201.4.4 16-May-2009  yamt sync with head
 1.201.4.3 04-May-2009  yamt sync with head.
 1.201.4.2 16-May-2008  yamt sync with head.
 1.201.4.1 27-Apr-2008  yamt commit some work-in-progress changes to make nfs client mp-safe to a branch,
so that they won't get lost.
- sprinkle some locking
- mark the filesystem, nfstimer callout, and kq kthread mp-safe
- add assertions and comments
- disable upgrade mount for now
- some unrelated cosmetic changes
 1.201.2.2 17-Jun-2008  yamt sync with head.
 1.201.2.1 18-May-2008  yamt sync with head.
 1.202.2.3 10-Oct-2008  skrll Sync with HEAD.
 1.202.2.2 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.202.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.204.4.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.204.4.1 19-Oct-2008  haad Sync with HEAD.
 1.204.2.1 18-Jul-2008  simonb Sync with head.
 1.209.2.2 28-Apr-2009  skrll Sync with HEAD.
 1.209.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.212.2.2 23-Jul-2009  jym Sync with HEAD.
 1.212.2.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.218.2.2 27-May-2010  uebayasi Include uvm/uvm.h, because this touches uvm internal.
 1.218.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.219.2.2 05-Mar-2011  rmind sync with head
 1.219.2.1 16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.220.6.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.221.2.6 22-May-2014  yamt fix a merge botch
 1.221.2.5 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.221.2.4 17-Apr-2012  yamt sync with head
 1.221.2.3 26-Nov-2011  yamt update after radixtree.h api changes
 1.221.2.2 06-Nov-2011  yamt remove pg->listq and uobj->memq
 1.221.2.1 02-Nov-2011  yamt page cache related changes

- maintain object pages in radix tree rather than rb tree.
- reduce unnecessary page scan in putpages. esp. when an object has a ton of
pages cached but only a few of them are dirty.
- reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.
- fix nfs commit range tracking.
- fix nfs write clustering. XXX hack
 1.222.12.1 18-May-2014  rmind sync with head
 1.222.8.2 03-Dec-2017  jdolecek update from HEAD
 1.222.8.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.225.2.2 10-Aug-2014  tls Rebase.
 1.225.2.1 17-Jul-2014  tls Adjustments to the "earlyentropy" branch in response to the various
discussions beginning with my initial proposal
http://mail-index.netbsd.org/tech-kern/2014/04/08/msg016876.html and
particularly the long discussion of cprng_fast() performance (e.g.
https://mail-index.netbsd.org/tech-crypto/2014/04/21/msg000642.html).

In particular:

* Per-CPU, lockless cprng_fast replacement using Dennis Ferguson's
"ccrand" implementation of ChaCha8.

* libkern arc4random() is gone, gone, gone.

* Entropy estimator reverted to 32-bit recordkeeping and timestamps
per Dennis' comments and analysis.

* LZF entropy estimator removed: it required a great deal of state,
and rejected only truly pathological input.

I have not yet reverted the changes that provide LZF in the kernel
as generic functionality; I will likely revert those changes prior
to any merge of this branch to HEAD.
 1.227.4.2 28-Aug-2017  skrll Sync with HEAD
 1.227.4.1 09-Jul-2016  skrll Sync with HEAD
 1.228.4.1 21-Apr-2017  bouyer Sync with HEAD
 1.228.2.1 26-Apr-2017  pgoyette Sync with HEAD
 1.229.6.1 08-Jun-2018  martin Pull up following revision(s) (requested by maya in ticket #856):

sys/nfs/nfs.h: revision 1.76
sys/nfs/nfs_subs.c: revision 1.230
sys/nfs/nfs_socket.c: revision 1.199
sys/nfs/nfs_clntsocket.c: revision 1.6

PR/40491: From Tobias Ulmer in tech-kern@:
1. Protect the nfs request queue with its own mutex
2. make the nfs_receive queue check for signals so that intr mounts
can be interrupted.

XXX: pullup-8
 1.230.2.4 26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.230.2.3 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.230.2.2 21-May-2018  pgoyette Sync with HEAD
 1.230.2.1 02-May-2018  pgoyette Synch with HEAD
 1.232.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.232.2.1 10-Jun-2019  christos Sync with HEAD
 1.236.2.1 29-Feb-2020  ad Sync with head.
 1.164 05-Jul-2024  rin sys: Drop redundant NULL check before m_freem(9)

m_freem(9) has safely accepted a NULL argument at least since 4.2BSD:
https://www.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/src/sys/sys/uipc_mbuf.c

Compile-tested on amd64/ALL.

Suggested by knakahara@
 1.163 04-Jun-2021  hannken branches: 1.163.16;
Add flag/command NFSSVC_REPLACEEXPORTSLIST to nfssvc(2) system call.

Works like NFSSVC_SETEXPORTSLIST but supports "mel_nexports > 1"
and will atomically update the complete exports list for a file system.
 1.162 14-Mar-2020  ad branches: 1.162.8; 1.162.12;
- Hide the details of SPCF_SHOULDYIELD and related behind a couple of small
functions: preempt_point() and preempt_needed().

- preempt(): if the LWP has exceeded its timeslice in kernel, strip it of
any priority boost gained earlier from blocking.
 1.161 03-Feb-2019  mrg - add or adjust /* FALLTHROUGH */ where appropriate
- add __unreachable() after functions that can return but won't in
this case, and thus can't be marked __dead easily
 1.160 16-Mar-2018  christos branches: 1.160.2;
PR/53103: Timo Buhrmester: linux emulation of sendto(2) broken

The sockargs refactoring broke it, because sockargs only works with a user
address. Added an argument to sockargs to indicate where the address is
coming from. Welcome to 8.99.14.
 1.159 25-Jan-2018  riastradh branches: 1.159.2;
Use a random opaque cookie, not kva pointer, for nfssvc(2).

(What were they smoking?!)

I suspect most of this is actually dead code that wasn't properly
amputated along with the rest of the gangrene of NFSKERB a decade
ago, but I'm out of time to investigate further. If someone else
wants to kill NFSSVC_AUTHIN/NFSSVC_AUTHINFAIL and the rest of the
tentacular kerberosity, be my guest.

Noted by Silvio Cesare of InfoSect.
 1.158 12-Feb-2017  maxv Memory leak, found by Mootja; not tested, but obvious enough.
 1.157 10-Jun-2016  ozaki-r branches: 1.157.2; 1.157.4;
Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) the receive interface of an mbuf.
They are counterparts of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif operations and reduce the
diff of the upcoming change.

No functional change.
 1.156 22-Jun-2015  mrg add netbsd32 support for nfssvc(2). we do this by defining 5 copyin/out
functions that do all the ugly work, are just plain copyin/out for the
native system calls, and do the necessary translations for netbsd32.

with this i'm able to run 32 bit nfsd and mountd on 64 bit kernel and
mount the file systems remotely.
 1.155 05-Sep-2014  matt branches: 1.155.2;
Try not to use f_data, use f_{vnode,socket,pipe,mqueue,kqueue,ksem} to get
a correctly typed pointer.
 1.154 27-Nov-2013  christos branches: 1.154.4;
CID 271162: NULL deref check
 1.153 31-Dec-2009  christos branches: 1.153.12; 1.153.22; 1.153.26;
put nuidhash_max in a file that is shared between server and client code.
 1.152 31-Dec-2009  christos handle the nuidhash_max lossage differently
 1.151 21-Oct-2009  rmind Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids a few scans of the LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.
 1.150 14-Sep-2009  pooka Remove stale comment about super user. no functional change
 1.149 07-Jul-2009  christos The compatibility call to re-export from sys_mount() calls
mountd_set_exports_list, with the mnt_updating mutex held. Account for that
to avoid a "locking against myself" panic.
 1.148 23-May-2009  ad Remove pointless error check.
 1.147 10-Apr-2009  bouyer PR kern/41154: possible races in NFS server code

Fix some of the races (but probably not all of them) in the NFS server code.
nfssvc_nfsd(): change a splsoftclock()/splx() to mutex_enter/exit(&nfsd_lock)
(I guess it was forgotten when the nfsd code was made SMP safe)
m_freem(nd_nam) in nfsrv_slpderef() instead of nfsrv_zapsock() to
avoid possible use after free in nfssvc_nfsd()
Fix nfsrv_slpderef() to not release nfsd_lock before testing SLP_VALID
and reacquiring it just after. This could cause a use after free
of the slp if one thread is in nfsrv_slpderef() and the other one grabs
slp from nfssvc_sockpending and zaps it.
 1.146 15-Mar-2009  cegger ansify function definitions
 1.145 14-Mar-2009  dsl Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
 1.144 14-Mar-2009  dsl Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.143 28-Nov-2008  pooka branches: 1.143.4;
g/c unused malloc types
 1.142 19-Nov-2008  ad Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.141 14-Nov-2008  ad Remove COMPAT ifdefs that might as well be comments (i.e., they cost us
almost nothing).
 1.140 09-Oct-2008  christos branches: 1.140.2; 1.140.4;
do the proper ifdef dance for non-inet families
 1.139 28-Sep-2008  pooka Don't free nd_mrep in case of no reply. It is (at least in one
case) freed already within the rpc handler.

XXX: this line and another was originally committed with "don't
leak mbufs", but given that currently it can double-free an mbuf
and essentially crash the system, I'll opt for the leak. Needless
to say, this needs revisiting, but that requires a large scale
campaign due to the sticky nature of nfsm love.
 1.138 06-Aug-2008  plunky Convert socket options code to use a sockopt structure
instead of laying everything into an mbuf.

approved by core
 1.137 24-Jun-2008  ad branches: 1.137.2;
Replace references to getsock/getvnode.
 1.136 20-May-2008  ad branches: 1.136.2;
Make it compile.
 1.135 28-Apr-2008  yamt branches: 1.135.2;
as softint network processing is now safe to block,
make some mutexes adaptive.
 1.134 24-Apr-2008  ad branches: 1.134.2;
Merge the socket locking patch:

- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
 1.133 23-Mar-2008  rmind branches: 1.133.2;
G/C l->l_locks.
OK by <ad>.
 1.132 21-Mar-2008  ad Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
 1.131 28-Feb-2008  elad Introduce a new kauth action, KAUTH_NETWORK_NFS, and two requests,
KAUTH_REQ_NETWORK_NFS_EXPORT and KAUTH_REQ_NETWORK_NFS_SVC, and use them
to replace two KAUTH_GENERIC_ISSUSER calls in the NFS code.

Also replace two more with KAUTH_SYSTEM_MKNOD, where appropriate.

Documentation and examples updated. More to come.
 1.130 02-Jan-2008  yamt branches: 1.130.2; 1.130.6;
use kmem_alloc instead of malloc.
 1.129 02-Jan-2008  ad Merge vmlocking2 to head.
 1.128 20-Dec-2007  dsl Convert all the system call entry points from:
int foo(struct lwp *l, void *v, register_t *retval)
to:
int foo(struct lwp *l, const struct foo_args *uap, register_t *retval)
Fixup compat code to not write into 'uap' and (in some cases) to actually
pass a correctly formatted 'uap' structure with the right name to the
next routine.
A few 'compat' routines that just call standard ones have been deleted.
All the 'compat' code compiles (along with the kernels required to test
build it).
98% done by automated scripts.
 1.127 04-Dec-2007  yamt branches: 1.127.4;
merge non-intrusive nfs changes from vmlocking.
 1.126 22-Nov-2007  yamt branches: 1.126.2;
nfssvc_nfsd: remove a wrong assertion.
 1.125 08-Oct-2007  ad branches: 1.125.2; 1.125.4;
Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.
 1.124 10-Aug-2007  yamt branches: 1.124.2; 1.124.4;
- instead of scanning an array of iods, maintain a list of idle iods.
- make nfs_getset_niothreads MP friendly.
 1.123 08-Aug-2007  yamt push kernel_lock a little.
 1.122 02-Aug-2007  yamt branches: 1.122.2; 1.122.4;
nfsrv_slpderef: add an assertion.
 1.121 02-Aug-2007  yamt nfssvc_nfsd: fix a wrong assertion. PR/36710 from Tobias Nygren.
 1.120 02-Aug-2007  yamt nfsrv_zapsock: update SLP_DOREC for consistency.
 1.119 02-Aug-2007  yamt nfssvc_nfsd: don't leave sockets with SLP_DISCONN.
 1.118 20-Jul-2007  yamt - fix decreasing of vfs.nfs.iothreads after the recent partial merge
of vmlocking.
- don't make nfsiod exit with requests left.
- make NFSSVC_BIOD a dummy so that nfsiod can be simplified.
 1.117 09-Jul-2007  ad branches: 1.117.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.116 22-Jun-2007  yamt - nfsrv_slpderef: fix a locking botch.
(should fix "slp->ns_sref == 0" assertion failures in nfsrv_init)
- add some related assertions.
 1.115 19-Jun-2007  yamt nfssvc_iod: fix nm_bufqiods accounting on exit.
 1.114 12-Jun-2007  yamt - nfssvc_nfsd: clear nfsd_slp when exiting.
(fix an assertion failure in rev.1.112.)

- nfsrv_init: add assertions.
 1.113 01-Jun-2007  yamt nfssvc_nfsd: check SPCF_SHOULDYIELD and yield cpu.
 1.112 01-Jun-2007  yamt nfssvc_nfsd: add assertions.
 1.111 01-Jun-2007  yamt use mutex and condvar.
 1.110 30-Apr-2007  yamt fix a lock leak in rev.1.109. pointed out by Mindaugas R.
 1.109 29-Apr-2007  yamt use mutex and condvar.
 1.108 19-Apr-2007  yamt hold proclist_mutex when calling psignal().
 1.107 04-Mar-2007  christos branches: 1.107.2; 1.107.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.106 22-Feb-2007  thorpej TRUE -> true, FALSE -> false
 1.105 09-Feb-2007  ad branches: 1.105.2;
Merge newlock2 to head.
 1.104 17-Jan-2007  yamt plug a mbuf leak.
 1.103 04-Jan-2007  elad Consistent usage of KAUTH_GENERIC_ISSUSER.
 1.102 28-Dec-2006  yamt remove several nqnfs definitions.
 1.101 27-Dec-2006  yamt - remove the rest of nqnfs.
- reject NFSMNT_MNTD and NFSMNT_KERB. (no users in tree.)
 1.100 27-Dec-2006  yamt remove nqnfs.
 1.99 09-Nov-2006  yamt branches: 1.99.2; 1.99.4;
remove some __unused in function parameters.
 1.98 22-Oct-2006  pooka kauth_cred_uucvt() -> kauth_uucred_to_cred(), introduce kauth_cred_to_uucred()

per tech-kern proposal
 1.97 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.96 23-Jul-2006  ad branches: 1.96.4; 1.96.6;
Use the LWP cached credentials where sane.
 1.95 07-Jun-2006  kardel merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.94 19-May-2006  yamt branches: 1.94.2;
- fix compilation problem for !NFSSERVER && NFS.
pointed out by Tom Spindler on source-changes@.
- make nfs_srvdesc_pool static.
 1.93 18-May-2006  yamt - fix some leaks in nfsd, introduced by kauth changes.
- simplify code.
- add some assertions.
- wrap some long lines.
- remove an unnecessary ";".
 1.92 14-May-2006  elad integrate kauth.
 1.91 10-May-2006  mrg quell GCC 4.1 uninitialised variable warnings.

XXX: we should audit the tree for which old ones are no longer needed
after getting the older compilers out of the tree..
 1.90 15-Apr-2006  christos Coverity CID 1162: Prevent NULL deref.
 1.89 15-Apr-2006  christos Coverity CID 1165: Cannot have nfsiod without an lwp, so remove the superfluous
test.
 1.88 05-Jan-2006  yamt branches: 1.88.2; 1.88.4; 1.88.6; 1.88.8; 1.88.10;
ensure the export list is not changed during nfsd operations.
 1.87 03-Jan-2006  yamt fix a deadlock due to a spl problem.
 1.86 03-Jan-2006  yamt nfssvc_nfsd: reduce a chance for a slow peer to capture all our threads.
instead of sleeping to wait for the socket to send our reply,
just hand-off our reply to the thread which is holding the socket.
 1.85 03-Jan-2006  yamt improve nfsd locking.
- don't bother to take nfs_sndlock when doing nfsrv_rcv.
unlike client, we never reconnect.
- nfsrv_getstream: fix the case that m_split sleeps.
- free socket in nfsrv_slpderef rather than nfsrv_zapsock.
fix race with nfssvc_nfsd.
- while i'm here, remove NFSD_WAITING and NFSD_REQINPROG
as they are redundant.
- some comments and assertions.
 1.84 11-Dec-2005  christos branches: 1.84.2;
merge ktrace-lwp.
 1.83 25-Nov-2005  thorpej Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().
 1.82 23-Sep-2005  jmmv branches: 1.82.6;
Apply the NFS exports list rototill patch:

- Remove all NFS related stuff from file system specific code.
- Drop the vfs_checkexp hook and generalize it in the new nfs_check_export
function, thus removing redundancy from all file systems.
- Move all NFS export-related stuff from kern/vfs_subr.c to the new
file sys/nfs/nfs_export.c. The former was becoming large and its code
is always compiled, regardless of the build options. Using the latter,
the code is only compiled in when NFSSERVER is enabled. While doing this,
also make some functions in nfs_subs.c conditional to NFSSERVER.
- Add a new command in nfssvc(2), called NFSSVC_SETEXPORTSLIST, that takes a
path and a set of export entries. At the moment it can only clear the
exports list or append entries, one by one, but it is done in a way that
allows setting the whole set of entries atomically in the future (see the
comment in mountd_set_exports_list or in doc/TODO).
- Change mountd(8) to use the nfssvc(2) system call instead of mount(2) so
that it becomes file system agnostic. In fact, all this whole thing was
done to remove a 'XXX' block from this utility!
- Change the mount*, newfs and fsck* userland utilities to not deal with NFS
exports initialization; done internally by the kernel when initializing
the NFS support for each file system.
- Implement an interface for VFS (called VFS hooks) so that several kernel
subsystems can run arbitrary code upon receipt of specific VFS events.
At the moment, this only provides support for unmount and is used to
destroy NFS exports lists from the file systems being unmounted, though it
has room for extension.

Thanks go to yamt@, chs@, thorpej@, wrstuden@ and others for their comments
and advice in the development of this patch.
 1.81 11-Sep-2005  rpaulo Wrap a multiple line comment so that it doesn't go beyond 80 columns.
 1.80 03-Aug-2005  onoe Fix mbuf leak in nfssvc_nfsd().
 1.79 07-Jul-2005  christos 1. use p = uio->uio_procp consistently and eliminate suspicious uses
of curproc (where uio->uio_procp should be used?). Don't do this
for nfs_commit(), because yamt says it is possibly wrong.
2. nfs_doio() does not use struct proc; remove it and the code to compute it.
3. use copyin_proc() and copyout_proc() instead of copyin() and copyout().
4. check the return of copyout_proc(), and mark the return from copyin_proc() XXX
5. Eliminate the p == curproc assertion check from nfs_write;
nfs_read does not have it and we might be called in a different
process context anyway (PR 20138).
 1.78 26-Feb-2005  perry branches: 1.78.4;
nuke trailing whitespace
 1.77 10-Jun-2004  yamt branches: 1.77.4; 1.77.6;
make sure that nfssvc sockets are zapped before being freed.
 1.76 10-Jun-2004  yamt nfsrv_zapsock: fix an inverted condition in nfs_syscall.c rev.1.74.
 1.75 22-May-2004  jonathan Eliminate several uses of `curproc' from the socket-layer code and from NFS.

Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded by Matt Thomas.

Changes to soreceive() and (*dom->dom_externalize)() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize. Also, now soreceive() uses
its uio->uio_procp value, pass that same value downward to
(*pr->pru_usrreq)() calls for consistency, instead of (struct proc * )0.

Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req->r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.

Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
 1.74 17-Mar-2004  yamt branches: 1.74.2;
nfsrv_zapsock: zap an nfsd socket only if it's valid.
 1.73 17-Mar-2004  yamt nfsrv_zapsock: remove slp from nfssvc_sockpending before zapping.
 1.72 17-Mar-2004  yamt SHUT_RDWR rather than bare 2.
 1.71 07-Dec-2003  thorpej Fix a couple of small whitespace errors.
 1.70 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.69 29-Jun-2003  fvdl branches: 1.69.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.68 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.67 26-Jun-2003  yamt add appropriate #ifdef's.
pointed by Simon Burge on source-changes
and by some people in private mail.
 1.66 25-Jun-2003  yamt - instead of scanning a list when looking up
{an idle thread, a socket with pending requests},
maintain a dedicated list of them.
- add spin locks.
 1.65 26-May-2003  yamt factor duplicated code into a function, nfsrv_sockalloc.
 1.64 22-May-2003  yamt poolify nfsrv_descript.
 1.63 22-May-2003  yamt indent (nfssvc_nfsd)
 1.62 21-May-2003  yamt - use FREE not free for MALLOC'ed memory.
- remove unneeded caddr_t casts.
 1.61 21-May-2003  yamt remove local definitions of TRUE and FALSE.
 1.60 07-May-2003  yamt simple lock for nfs iod.
 1.59 07-May-2003  yamt indent.
 1.58 09-Apr-2003  yamt group per-iod data together.
 1.57 02-Apr-2003  yamt use queue manipulation macros.
 1.56 31-Mar-2003  yamt adapt to file interlock.
 1.55 26-Feb-2003  matt Add MBUFTRACE kernel option.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
 1.54 01-Feb-2003  thorpej Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.
 1.53 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.52 14-Sep-2002  chs pick up a fix from openbsd:
revision 1.33
date: 2002/07/24 23:32:11; author: nordin; state: Exp; lines: +3 -3
Use sizeof(array) instead of sizeof(array *) for bcopy length. ok deraadt@
 1.51 12-May-2002  matt Eliminate commons
 1.50 29-Nov-2001  christos sprinkle crcvt()
 1.49 10-Nov-2001  lukem add RCSIDs
 1.48 27-Nov-2000  chs branches: 1.48.2; 1.48.4; 1.48.8;
Initial integration of the Unified Buffer Cache project.
 1.47 24-Nov-2000  chs put more ISO bits under ifdef ISO.
 1.46 24-Oct-2000  matt Change a DIAGNOSTIC panic slightly to print the locked vnodes and to just
print a diagnostic but not panic.
 1.45 23-Oct-2000  chs fix nfs iod management so we don't lose i/os when iods die.
 1.44 19-Sep-2000  fvdl Don't do write gathering for v3; it makes no sense. Unless the client
is broken and does sync writes all the time, but that's the client's
fault.
 1.43 19-Sep-2000  bjh21 New kernel option, NFS_V2_ONLY, which aims to reduce the NFS client to just
that required to support NFSv2 mounts. Not finished yet, but already
provides some 44k of saving in code size on arm26. More savings, and some
documentation, are still to come.
 1.42 23-Aug-2000  nathanw Fix typo in comment.
 1.41 09-Jun-2000  fvdl branches: 1.41.2;
Some tweaks to enable NFS over IPv6. The special-casing of AF_INET
should really be removed.
 1.40 07-May-2000  tsarna branches: 1.40.2;
Auto-adjusting vfs.nfs.iothreads: when mounting the first nfs
filesystem, if the number of threads is "-1", meaning it's never been
set, then set it to 4. You can override by setting this to some other
number (including 0) before or after mounting, of course.

Thanks to whoever it was that suggested this on ICB... sorry I don't
remember who.
 1.39 15-Apr-2000  tsarna Death to nfsiod!

It is replaced by kernel threads that do the same thing. The number of
kernel threads used is set with the vfs.nfs.iothreads sysctl.
 1.38 30-Mar-2000  augustss Remove register declarations.
 1.37 30-Mar-2000  simonb Delete redundant decl of nfs_pub - it's in <sys/mount.h>.
Delete redundant decl of nfsrv_zapsock() - it's in <nfs/nfs_var.h>.
 1.36 29-Jun-1999  wrstuden branches: 1.36.2;
Add fhopen, fhstat, fhstatfs syscalls. Also move getfh in from the nfs
syscall code.
 1.35 05-May-1999  thorpej Add "use counting" to file entries. When closing a file, and its reference
count is 0, wait for use count to drain before finishing the close.

This is necessary in order for multiple processes to safely share file
descriptor tables.
 1.34 04-May-1999  sommerfe Include checks (under DIAGNOSTIC) to catch vnode lock leaks soon after
they happen (while we still know which remote op is to blame for it),
instead of later when we trip over the already-locked vnode.
 1.33 08-Nov-1998  mycroft branches: 1.33.6;
Do not permit the u area for nfsd or nfsiod to be swapped out.
 1.32 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.31 05-Jul-1998  jonathan defopt ISO TPIP.
 1.30 25-Jun-1998  thorpej defopt NFSSERVER
 1.29 25-Apr-1998  matt Adapt to new sosend/soreceive and upcall (now down in sowakeup)
 1.28 19-Feb-1998  thorpej Include the NFS option header.
 1.27 10-Oct-1997  fvdl branches: 1.27.2;
* New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.26 24-Jun-1997  fvdl branches: 1.26.4;
Invalidate nfs_pub info when reinitting the NFS server.
 1.25 24-Mar-1997  mycroft KNF police.
 1.24 31-Jan-1997  thorpej NFSCLIENT -> NFS.
 1.23 03-Dec-1996  thorpej branches: 1.23.2;
Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.
 1.22 02-Dec-1996  thorpej NFS performance improvement from Doug Rabson/FreeBSD:

Improve the queuing algorithms used by NFS' asynchronous i/o. The
existing mechanism uses a global queue for some buffers and the
vp->b_dirtyblkhd queue for others. This turns sequential writes into
randomly ordered writes to the server, affecting both read and write
performance. The existing mechanism also copes badly with hung
servers, tending to block accesses to other servers when all the iods
are waiting for a hung server.

The new mechanism uses a queue for each mount point. All asynchronous
i/o goes through this queue which preserves the ordering of requests.
A simple mechanism ensures that the iods are shared out fairly between
active mount points.

Reviewed/integrated/approved by Frank van der Linden <fvdl@netbsd.org>
 1.21 13-Oct-1996  christos revert kprintf changes
 1.20 10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.19 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.18 09-Feb-1996  christos nfs prototype changes
 1.17 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.16 07-Oct-1995  mycroft Prefix names of system call implementation functions with `sys_'.
 1.15 19-Sep-1995  thorpej Make system calls conform to a standard prototype and bring those
prototypes into scope.
 1.14 13-Aug-1995  mycroft splnet --> splsoftnet
 1.13 20-Oct-1994  cgd update for new syscall args description mechanism
 1.12 17-Aug-1994  mycroft Convert some more lists and queues.
 1.11 29-Jun-1994  cgd branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasing, and use 'NetBSD'
 1.10 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.9 14-Feb-1994  cgd be more intelligent with credentials, so nfsd's don't inherit
strange credentials. This doesn't actually have any effect on
performance, because the remote cred is used for all operations,
anyway. however, it makes "ps" et al. look normal, because the
proc's ucred is no longer clobbered.
 1.8 06-Feb-1994  mycroft Eliminate some more uses of b_actl.
 1.7 18-Dec-1993  mycroft Canonicalize all #includes.
 1.6 17-Jul-1993  mycroft branches: 1.6.4;
Finish moving struct definitions outside of function declarations.
 1.5 22-May-1993  cgd add include of select.h if necessary for protos, or delete if extraneous
 1.4 18-May-1993  cgd make kernel select interface be one-stop shopping & clean it all up.
 1.3 10-Apr-1993  glass migrated code to make split possible
 1.2 21-Mar-1993  cgd after 0.2.2 "stable" patches applied
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.3 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.6.4.1 14-Nov-1993  mycroft Canonicalize all #includes.
 1.11.2.1 19-Aug-1994  mycroft update from trunk
 1.23.2.1 14-Jan-1997  thorpej Snapshot of work-in-progress, committed to private branch.

These changes implement machine-independent root device and file system
selection. Notable features:

- All ports behave in a consistent manner regarding root
device selection.
- No more "options GENERIC"; all kernels have the ability
to boot with RB_ASKNAME to select root device and file system
type.
- Root file system type can be wildcarded; a machine-independent
function will try all possible file systems for the selected
root device until one succeeds.
- If the root file system fails to mount, the operator will
be given the chance to select a new root device and file
system type, rather than having the machine simply panic.
- nfs_mountroot() no longer panics if any part of the NFS
mount process fails; it now returns an error, giving the
operator a chance to recover.
- New, more consistent, config(8) grammar. The constructs:

config netbsd swap generic
config netbsd root on nfs

have been replaced with:

config netbsd root on ? type ?
config netbsd root on ? type nfs

Additionally, the operator may select or wildcard root file
system type in the kernel configuration file:

config netbsd root on cd0a type cd9660

config(8) now requires that a "root" specification be
made. "root" may be wired down or wildcarded. "swap" and
"dump" specifications are optional, and follow previous
semantics.

- config(8) has a new "file-system" keyword, used to configure
file systems into the kernel. Eventually, this will be used
to generate the default vfssw[].

- "options NFSCLIENT" is obsolete, and is replaced by
"file-system NFS". "options NFSSERVER" still exists, since
NFS server support is independent of the NFS file system
client.

- sys/arch/<foo>/<foo>/swapgeneric.c is no longer used, and
will be removed; all information is now generated by config(8).

As of this commit, all ports except arm32 have been updated to use
the new setroot(). Only SPARC, i386, and Alpha ports have been
tested at this time. Port masters should test these changes on their
ports, and report any problems back to me.

More changes are on their way, including RB_ASKNAME support in
nfs_mountroot() (to prompt for server address and path) and, potentially,
the ability to select rarp/bootparam or bootp in nfs_mountroot().
 1.26.4.1 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.27.2.1 08-Nov-1998  cgd pull up rev 1.33 from trunk (mycroft)
 1.33.6.2 25-Oct-2000  he Pull up revision 1.45 (via patch, requested by chs):
Fix a bug where NFS async I/O requests will be lost if the number
of I/O daemons is ever reduced to a smaller but non-zero number.
 1.33.6.1 04-May-1999  perry branches: 1.33.6.1.2;
pullup 1.33->1.34 (sommerfeld)
 1.33.6.1.2.2 01-Jul-1999  thorpej Sync w/ -current.
 1.33.6.1.2.1 21-Jun-1999  thorpej Sync w/ -current.
 1.36.2.2 08-Dec-2000  bouyer Sync with HEAD.
 1.36.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.40.2.1 22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.41.2.3 09-Feb-2002  he Pull up revision 1.50 (requested by christos):
Widen cr_ref to prevent overflow.
 1.41.2.2 14-Dec-2000  he Pull up revisions 1.44-1.45 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.41.2.1 24-Oct-2000  tv Pullup 1.45 [chs]:
fix nfs iod management so we don't lose i/os when iods die.
 1.48.8.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.48.4.3 10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.48.4.2 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.48.4.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.48.2.9 17-Sep-2002  nathanw Catch up to -current.
 1.48.2.8 12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.48.2.7 24-Jun-2002  nathanw Curproc->curlwp renaming.

Change uses of "curproc->l_proc" back to "curproc", which is more like the
original use. Bare uses of "curproc" are now "curlwp".

"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc : NULL)
so that it is always safe to reference curproc (*de*referencing curproc
is another story, but that's always been true).
 1.48.2.6 20-Jun-2002  nathanw Catch up to -current.
 1.48.2.5 29-May-2002  nathanw #include <sys/sa.h> before <sys/syscallargs.h>, to provide sa_upcall_t
now that <sys/param.h> doesn't include <sys/sa.h>.

(Behold the Power of Ed)
 1.48.2.4 08-Jan-2002  nathanw Catch up to -current.
 1.48.2.3 27-Nov-2001  thorpej Make lockmgr() lwp-aware:
- Locks are counted against LWPs, not procs.
- When we record the lockholder in the lock structure, we need to
also record the lwpid.
- When we are checking who holds the lock, also consider lwpid.

Fixes a "locking against myself" panic reported by Allen Briggs that
could be easily triggered by redirecting the output of an LWP-using
program to a file.
 1.48.2.2 14-Nov-2001  nathanw Catch up to -current.
 1.48.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.69.2.7 11-Dec-2005  christos Sync with head.
 1.69.2.6 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.69.2.5 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.69.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.69.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.69.2.2 03-Aug-2004  skrll Sync with HEAD
 1.69.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.74.2.1 14-Jun-2004  jmc Pullup rev 1.76 (requested by yamt in ticket #466)

nfsrv_zapsock: fix an inverted condition in nfs_syscall.c rev.1.74.
 1.77.6.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.77.4.1 29-Apr-2005  kent sync with -current
 1.78.4.11 24-Mar-2008  yamt sync with head.
 1.78.4.10 17-Mar-2008  yamt sync with head.
 1.78.4.9 27-Feb-2008  yamt revert incomplete nfs client locking for now.
 1.78.4.8 15-Feb-2008  yamt - sprinkle some locks.
- disable MNT_UPDATE because it involves too much locking headache.
- don't overwrite other bits in v_vflags when setting VV_ROOT.
 1.78.4.7 21-Jan-2008  yamt sync with head
 1.78.4.6 07-Dec-2007  yamt sync with head
 1.78.4.5 27-Oct-2007  yamt sync with head.
 1.78.4.4 03-Sep-2007  yamt sync with head.
 1.78.4.3 26-Feb-2007  yamt sync with head.
 1.78.4.2 30-Dec-2006  yamt sync with head.
 1.78.4.1 21-Jun-2006  yamt sync with head.
 1.82.6.1 29-Nov-2005  yamt sync with head.
 1.84.2.1 15-Jan-2006  yamt sync with head.
 1.88.10.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.88.8.9 11-May-2006  elad sync with head
 1.88.8.8 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.88.8.7 19-Apr-2006  elad sync with head.
 1.88.8.6 14-Apr-2006  elad Plug some possible leaks of kauth_cred_t.
 1.88.8.5 12-Mar-2006  elad Rename kauth_cred_compare() to kauth_cred_uucmp(), and kauth_cred_convert()
to kauth_cred_uucvt(). This makes it clearer that we're working on struct
uucred.

Inspired by comments from yamt@.
 1.88.8.4 10-Mar-2006  elad Cleanup more interface abuse.

Make nfsrv_setcred() take a kauth_cred_t * as outcred. The original code
just modified it directly; we can't do that, nor do we want to.

Get rid of another case of kauth_cred_zero() followed by kauth_cred_hold()
and use kauth_cred_clone() to make sure we don't leave out important
members.

Add another DIAGNOSTIC check for reference count of above one.

Again, this should be tested.
 1.88.8.3 10-Mar-2006  elad Some cleanup.

kauth_cred_setrefcnt() was only called after kauth_cred_convert() in NFS
code to convert a struct uucred to kauth_cred_t. Since there's no valid
use for such a function, make kauth_cred_convert() set the reference
count to 1 and eliminate the need for kauth_cred_setrefcnt() entirely.

Motivated by comments from yamt@ and thorpej@.
 1.88.8.2 10-Mar-2006  elad generic_authorize() -> kauth_authorize_generic().
 1.88.8.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.88.6.3 11-Aug-2006  yamt sync with head
 1.88.6.2 26-Jun-2006  yamt sync with head.
 1.88.6.1 24-May-2006  yamt sync with head.
 1.88.4.3 01-Jun-2006  kardel Sync with head.
 1.88.4.2 22-Apr-2006  simonb Sync with head.
 1.88.4.1 04-Feb-2006  simonb Adapt for timecounters: mostly use get*time() and use "time_second"
instead of "time.tv_sec".
 1.88.2.1 09-Sep-2006  rpaulo sync with head
 1.94.2.2 19-Jun-2006  chap Sync with head.
 1.94.2.1 19-May-2006  chap file nfs_syscalls.c was added on branch chap-midi on 2006-06-19 04:10:37 +0000
 1.96.6.2 10-Dec-2006  yamt sync with head.
 1.96.6.1 22-Oct-2006  yamt sync with head
 1.96.4.4 01-Feb-2007  ad Sync with head.
 1.96.4.3 30-Jan-2007  ad Remove support for SA. Ok core@.
 1.96.4.2 12-Jan-2007  ad Sync with head.
 1.96.4.1 18-Nov-2006  ad Sync with head.
 1.99.4.1 03-Sep-2007  wrstuden Sync w/ NetBSD-4-RC_1
 1.99.2.1 05-Jun-2007  bouyer Pull up following revision(s) (requested by yamt in ticket #705):
sys/nfs/nfs_syscalls.c: revision 1.104
plug a mbuf leak.
 1.105.2.3 07-May-2007  yamt sync with head.
 1.105.2.2 12-Mar-2007  rmind Sync with HEAD.
 1.105.2.1 28-Feb-2007  yamt sync with head. (somehow missed in the previous)
 1.107.4.1 11-Jul-2007  mjf Sync with head.
 1.107.2.13 27-Oct-2007  yamt call soshutdown with kernel_lock held.
 1.107.2.12 26-Aug-2007  yamt - mark nfssvc(2) MPSAFE and move the most of nfsd out of the kernel lock.
- remove unused ns_solock.
- remove some of KERNEL_LOCK/UNLOCK which are not necessary on this branch.
 1.107.2.11 20-Aug-2007  ad Sync with HEAD.
 1.107.2.10 15-Jul-2007  ad Sync with head.
 1.107.2.9 18-Jun-2007  yamt fix a merge botch.
 1.107.2.8 09-Jun-2007  ad Sync with head.
 1.107.2.7 08-Jun-2007  ad Sync with head.
 1.107.2.6 13-May-2007  ad - Pass the error number and residual count to biodone(), and let it handle
setting error indicators. Prepare to eliminate B_ERROR.
- Add a flag argument to brelse() to be set into the buf's flags, instead
of doing it directly. Typically used to set B_INVAL.
- Add a "struct cpu_info *" argument to kthread_create(), to be used to
create bound threads. Change "bool mpsafe" to "int flags".
- Allow exit of LWPs in the IDL state when (l != curlwp).
- More locking fixes & conversion to the new API.
 1.107.2.5 10-Apr-2007  ad Nuke the deferred kthread creation stuff, as it's no longer needed.
Pointed out by thorpej@.
 1.107.2.4 09-Apr-2007  ad - Add two new arguments to kthread_create1: pri_t pri, bool mpsafe.
- Fork kthreads off proc0 as new LWPs, not new processes.
 1.107.2.3 05-Apr-2007  ad Compile fixes.
 1.107.2.2 21-Mar-2007  ad - Put a lock around the proc's CWD info (work in progress).
- Replace some more simplelocks.
- Make lbolt a condvar.
 1.107.2.1 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.117.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.122.4.2 02-Aug-2007  yamt nfsrv_slpderef: add an assertion.
 1.122.4.1 02-Aug-2007  yamt file nfs_syscalls.c was added on branch matt-mips64 on 2007-08-02 12:46:04 +0000
 1.122.2.5 09-Dec-2007  jmcneill Sync with HEAD.
 1.122.2.4 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.122.2.3 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.122.2.2 16-Aug-2007  jmcneill Sync with HEAD.
 1.122.2.1 09-Aug-2007  jmcneill Sync with HEAD.
 1.124.4.1 14-Oct-2007  yamt sync with head.
 1.124.2.3 23-Mar-2008  matt sync with HEAD
 1.124.2.2 09-Jan-2008  matt sync with HEAD
 1.124.2.1 06-Nov-2007  matt sync with HEAD
 1.125.4.3 18-Feb-2008  mjf Sync with HEAD.
 1.125.4.2 27-Dec-2007  mjf Sync with HEAD.
 1.125.4.1 08-Dec-2007  mjf Sync with HEAD.
 1.125.2.1 22-Nov-2007  bouyer Sync with HEAD
 1.126.2.4 29-Dec-2007  yamt to prepare merge, put nfsd back under kernel_lock for now.
 1.126.2.3 26-Dec-2007  ad Sync with head.
 1.126.2.2 08-Dec-2007  ad Sync with head.
 1.126.2.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.127.4.1 02-Jan-2008  bouyer Sync with HEAD
 1.130.6.6 17-Jan-2009  mjf Sync with HEAD.
 1.130.6.5 05-Oct-2008  mjf Sync with HEAD.
 1.130.6.4 28-Sep-2008  mjf Sync with HEAD.
 1.130.6.3 29-Jun-2008  mjf Sync with HEAD.
 1.130.6.2 02-Jun-2008  mjf Sync with HEAD.
 1.130.6.1 03-Apr-2008  mjf Sync with HEAD.
 1.130.2.1 24-Mar-2008  keiichi sync with head.
 1.133.2.2 04-Jun-2008  yamt sync with head
 1.133.2.1 18-May-2008  yamt sync with head.
 1.134.2.7 11-Mar-2010  yamt sync with head
 1.134.2.6 16-Sep-2009  yamt sync with head
 1.134.2.5 18-Jul-2009  yamt sync with head.
 1.134.2.4 20-Jun-2009  yamt sync with head
 1.134.2.3 04-May-2009  yamt sync with head.
 1.134.2.2 16-May-2008  yamt sync with head.
 1.134.2.1 27-Apr-2008  yamt commit some work-in-progress changes to make nfs client mp-safe to a branch,
so that they won't get lost.
- sprinkle some locking
- mark the filesystem, nfstimer callout, and kq kthread mp-safe
- add assertions and comments
- disable upgrade mount for now
- some unrelated cosmetic changes
 1.135.2.5 10-Oct-2008  skrll Sync with HEAD.
 1.135.2.4 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.135.2.3 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.135.2.2 14-May-2008  wrstuden Per discussion with ad, remove most of the #include <sys/sa.h> lines
as they were including sa.h just for the type(s) needed for syscallargs.h.

Instead, create a new file, sys/satypes.h, which contains just the
types needed for syscallargs.h. Yes, there's only one now, but that
may change and it's probably more likely to change if it'd be difficult
to handle. :-)

Per discussion with matt at n dot o, add an include of satypes.h to
sigtypes.h. Upcall handlers are kinda signal handlers, and signalling
is the header file that's already included for syscallargs.h that
most closely matches SA.

This shaves about 3000 lines off of the diff of the branch relative
to the base. That also represents about 18% of the total before this
checkin.

I think this reduction is a very good thing.
 1.135.2.1 10-May-2008  wrstuden Initial checkin of re-adding SA. Everything except kern_sa.c
compiles in GENERIC for i386. This is still a work-in-progress, but
this checkin covers most of the mechanical work (changing signalling
to be able to accommodate SA's process-wide signalling and re-adding
includes of sys/sa.h and savar.h). Subsequent changes will be much
more interesting.

Also, kern_sa.c has received partial cleanup. There's still more
to do, though.
 1.136.2.1 27-Jun-2008  simonb Sync with head.
 1.137.2.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.137.2.1 19-Oct-2008  haad Sync with HEAD.
 1.140.4.1 13-Apr-2009  snj Pull up following revision(s) (requested by ad in ticket #701):
sys/nfs/nfs_syscalls.c: revision 1.147
PR kern/41154: possible races in NFS server code
Fix some of the races (but probably not all of them) in the NFS server code.
nfssvc_nfsd(): change a splsoftclock()/splx() to mutex_enter/exit(&nfsd_lock)
(I guess it was forgotten when the nfsd code was made SMP safe)
m_freem(nd_nam) in nfsrv_slpderef() instead of nfsrv_zapsock() to
avoid possible use after free in nfssvc_nfsd()
Fix nfsrv_slpderef() to not release nfsd_lock before testing SLP_VALID
and reacquiring it just after. This could cause a use after free
of the slp if one thread is in nfsrv_slpderef() and the other one grabs
slp from nfssvc_sockpending and zaps it.
 1.140.2.2 28-Apr-2009  skrll Sync with HEAD.
 1.140.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.143.4.2 23-Jul-2009  jym Sync with HEAD.
 1.143.4.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.153.26.1 18-May-2014  rmind sync with head
 1.153.22.2 03-Dec-2017  jdolecek update from HEAD
 1.153.22.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.153.12.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.154.4.1 04-Nov-2015  riz Pull up following revision(s) (requested by mrg in ticket #956):
sys/compat/netbsd32/files.netbsd32: revision 1.36
sys/compat/netbsd32/netbsd32_sysent.c: revision 1.115
sys/compat/netbsd32/netbsd32_syscallargs.h: revision 1.116
sys/nfs/nfs_var.h: revision 1.93
sys/compat/netbsd32/netbsd32_conv.h: revision 1.30
sys/compat/netbsd32/netbsd32_syscall.h: revision 1.116
sys/compat/netbsd32/netbsd32_syscalls.c: revision 1.115
sys/compat/netbsd32/netbsd32_nfssvc.c: revision 1.1
sys/compat/netbsd32/netbsd32_nfssvc.c: revision 1.3
sys/nfs/nfs_syscalls.c: revision 1.156
sys/compat/netbsd32/syscalls.master: revision 1.108
sys/compat/netbsd32/netbsd32.h: revision 1.107
add netbsd32 support for nfssvc(2). we do this by defining 5 copyin/out
functions that do all the ugly work, are just plain copyin/out for the
native system calls, and do the necessary translations for netbsd32.
with this i'm able to run 32 bit nfsd and mountd on 64 bit kernel and
mount the file systems remotely.
don't copy the first netbsd32_export_args nexports times, but actually
advance the userland pointer each entry through the loop. oops.
 1.155.2.3 28-Aug-2017  skrll Sync with HEAD
 1.155.2.2 09-Jul-2016  skrll Sync with HEAD
 1.155.2.1 22-Sep-2015  skrll Sync with HEAD
 1.157.4.1 21-Apr-2017  bouyer Sync with HEAD
 1.157.2.1 20-Mar-2017  pgoyette Sync with HEAD
 1.159.2.1 22-Mar-2018  pgoyette Synch with HEAD, resolve conflicts
 1.160.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.160.2.1 10-Jun-2019  christos Sync with HEAD
 1.162.12.1 06-Jun-2021  cjep sync with head
 1.162.8.1 17-Jun-2021  thorpej Sync w/ HEAD.
 1.163.16.1 02-Aug-2025  perseant Sync with HEAD
 1.97 07-Dec-2024  riastradh sys/nfs/nfs: Add some missing includes and include guards.

Fix up some minor KNF issues while here.

No functional change intended (except to enable things to build that
might not have built before because of previously required #include
ordering).
 1.96 27-Apr-2022  hannken branches: 1.96.10;
As VOP_GETATTR() needs at least a shared lock, move the preopattr lookup
inside nfs_namei() where we may lock the start directory without violating
the lock order.
 1.95 04-Jun-2021  hannken Add flag/command NFSSVC_REPLACEEXPORTSLIST to nfssvc(2) system call.

Works like NFSSVC_SETEXPORTSLIST but supports "mel_nexports > 1"
and will atomically update the complete exports list for a file system.
 1.94 15-Jul-2015  manu branches: 1.94.34; 1.94.38;
Fix soft NFS force unmount

For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that await an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.

Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.

Reviewed by Chuck Silvers.
 1.93 22-Jun-2015  mrg add netbsd32 support for nfssvc(2). we do this by defining 5 copyin/out
functions that do all the ugly work, are just plain copyin/out for the
native system calls, and do the necessary translations for netbsd32.

with this i'm able to run 32 bit nfsd and mountd on 64 bit kernel and
mount the file systems remotely.
 1.92 30-May-2014  hannken branches: 1.92.2; 1.92.4;
Change NFS from rbtree to vcache.
 1.91 14-Dec-2013  christos branches: 1.91.2;
don't allow the nfs server module to unload if it has exported filesystems.
 1.90 02-Mar-2010  pooka branches: 1.90.10; 1.90.20; 1.90.24;
Get rid of dependency on fs_nfs.h, i.e. source modules with
conditional content depending on if the NFS client is wanted or
not. The server can now be made an independent module not depending
on the nfs client.

Tested with rump_nfs (standalone client), rump_nfsd (standalone
nfsd) and a qemu installation with both the client and the server.
 1.89 03-Sep-2009  tls branches: 1.89.2;
...and one more missed in the earlier commit (sigh). Kernels should build
again now.
 1.88 07-Jul-2009  christos The compatibility call to re-export from sys_mount() calls
mountd_set_exports_list, with the mnt_updating mutex held. Account for that
to avoid a locking against myself panic.
 1.87 23-May-2009  ad - Fix a race between umount()/mount() and nfssvc().
- Toss netexport state on nfsserver module unload.
 1.86 14-Mar-2009  dsl Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.85 28-Nov-2008  pooka branches: 1.85.4;
g/c unused malloc types
 1.84 19-Nov-2008  ad Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.83 14-Nov-2008  ad Remove COMPAT ifdefs that might as well be comments (i.e., they cost us
almost nothing).
 1.82 22-Oct-2008  matt branches: 1.82.2;
Don't need nfs_vfs_reinit anymore since we don't resize tables anymore.
Move reinit code to init case.
 1.81 22-Oct-2008  matt Change NFS to use a RB-tree for its FH->nfsnode lookups.
 1.80 30-Sep-2008  pooka Initialize nfsnode pools and malloc type dynamically in the
constructor instead of depending on link sets. Consequently, rename
nfs_nh{init,reinit,done} to nfs_node_{init,reinit,done}, respectively,
to better convey the function.
 1.79 28-Apr-2008  martin branches: 1.79.2; 1.79.6;
Remove clause 3 and 4 from TNF licenses
 1.78 10-Apr-2008  yamt branches: 1.78.2; 1.78.4;
- make nfs_receive and nfs_reply static.
- ansify.
 1.77 02-Jan-2008  yamt branches: 1.77.6;
use kmem_alloc instead of malloc.
 1.76 20-Dec-2007  dsl Convert all the system call entry points from:
int foo(struct lwp *l, void *v, register_t *retval)
to:
int foo(struct lwp *l, const struct foo_args *uap, register_t *retval)
Fixup compat code to not write into 'uap' and (in some cases) to actually
pass a correctly formatted 'uap' structure with the right name to the
next routine.
A few 'compat' routines that just call standard ones have been deleted.
All the 'compat' code compiles (along with the kernels required to test
build it).
98% done by automated scripts.
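The signature change described in the 1.76 log can be sketched as follows. This is only an illustration: demo_lwp, demo_register_t, demo_foo_args, demo_foo_old and demo_foo_new are hypothetical stand-ins, not the kernel's real types or entry points.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types named in the log. */
typedef long demo_register_t;
struct demo_lwp { int l_id; };
struct demo_foo_args { int fd; };

/* Old style: arguments arrive as an untyped pointer that must be cast. */
int
demo_foo_old(struct demo_lwp *l, void *v, demo_register_t *retval)
{
	struct demo_foo_args *uap = v;

	(void)l;
	*retval = uap->fd;
	return 0;
}

/*
 * New style: arguments arrive as a typed, const pointer, so the
 * compiler rejects code that tries to write into 'uap'.
 */
int
demo_foo_new(struct demo_lwp *l, const struct demo_foo_args *uap,
    demo_register_t *retval)
{
	(void)l;
	*retval = uap->fd;
	return 0;
}
```

With the typed form, a handler that assigned to uap->fd would fail to compile, which is presumably why the compat code had to stop writing into 'uap'.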
 1.75 04-Dec-2007  yamt branches: 1.75.4;
merge non-intrusive nfs changes from vmlocking.
 1.74 28-Oct-2007  yamt branches: 1.74.2; 1.74.4;
make NFS_ATTRTIMEO a function.
 1.73 21-Oct-2007  yamt remove lwp argument from nfs_reconnect and always use &lwp0
because who triggers a reconnect doesn't really matter here. PR/37145.
 1.72 10-Aug-2007  yamt branches: 1.72.2; 1.72.6;
- instead of scanning an array of iods, maintain a list of idle iods.
- make nfs_getset_niothreads MP friendly.
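The idle-list idea in the 1.72 log can be sketched as below. All names (demo_iod, demo_idle_list and the two helpers) are hypothetical and this is not the kernel's data structure; the sketch also omits the locking that MP-friendly code would need.

```c
#include <stddef.h>

/* Hypothetical worker record; not the kernel's iod structure. */
struct demo_iod {
	int id;
	struct demo_iod *next_idle;	/* link used only while idle */
};

/* Head of the idle list; NULL means every worker is busy. */
struct demo_idle_list {
	struct demo_iod *head;
};

/* Mark a worker idle by pushing it onto the list: O(1). */
void
demo_iod_put_idle(struct demo_idle_list *l, struct demo_iod *iod)
{
	iod->next_idle = l->head;
	l->head = iod;
}

/* Take any idle worker, or NULL if none: O(1), no array scan. */
struct demo_iod *
demo_iod_get_idle(struct demo_idle_list *l)
{
	struct demo_iod *iod = l->head;

	if (iod != NULL) {
		l->head = iod->next_idle;
		iod->next_idle = NULL;
	}
	return iod;
}
```

Compared with scanning a fixed array for a free slot, finding an idle worker here costs the same regardless of how many workers exist.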
 1.71 27-Jul-2007  yamt branches: 1.71.4; 1.71.6;
stop nfs tick when we have nothing to do.
 1.70 20-Jul-2007  yamt - fix decreasing of vfs.nfs.iothreads after the recent partial merge
of vmlocking.
- don't make nfsiod exit with requests left.
- make NFSSVC_BIOD a dummy so that nfsiod can be simplified.
 1.69 12-Jul-2007  dsl branches: 1.69.2;
Change the VFS_MOUNT() interface so that the 'data' buffer passed to the
fs code is a kernel buffer, and pass the length of the buffer through as well.
Since the length of the userspace buffer isn't (yet) passed through the mount
system call, add a field to the vfsops structure containing the default length.
Split sys_mount() for calls from compat code.
Ride one of the recent kernel version changes - old fs LKMs will load, but
sys_mount() will reject any attempt to use them.
 1.68 28-May-2007  yamt - remove nfs_exit exit hook. ok'ed by christos@.
- as far as i understand the code, it shouldn't be necessary
because nfs_request can't return without removing its request
and r->r_lwp is either curlwp or NULL.
- even if it's necessary, leaking requests is not the correct way
to recover from the condition.
- nfs_request: add a related assertion.
 1.67 29-Apr-2007  yamt use mutex and condvar.
 1.66 04-Mar-2007  christos branches: 1.66.2; 1.66.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.65 21-Feb-2007  thorpej Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.64 27-Dec-2006  yamt branches: 1.64.2;
remove nqnfs.
 1.63 02-Sep-2006  yamt branches: 1.63.2;
nfsd: deal with variable-sized filehandles.
 1.62 01-Jul-2006  yamt if a file is sillyrename'ed because it's a destination of rename,
make sillyrename (try to) use LINK operation rather than RENAME.
PR/33861 from Jed Davis. he provided almost the same patch.
according to him, it also happens to be what opensolaris does in this case.

from the PR:
> In nfs_rename(), if the destination appears to exist and is "in use"
> (this check is apparently satisfied even if the file isn't in use by
> anything except the rename itself), it will sillyrename it, then delete
> the sillyrenamed file even if the rename fails -- for instance, because
> the "from" file no longer exists on the server.

> mkdir a b; touch a/x; perl -e 'fork(); rename("a/x","b/x") or die "$!\n"'
>
> Afterwards, neither a/x nor b/x will exist.

> 1) Lookup of b/x; fails with NOENT.
> 2) Rename from a/x to b/x; succeeds.
> 3) Lookup of b/x; fails with NOENT.
> 4) Rename from b/x to b/.nfsA23a3; succeeds.
> 5) Rename from a/x to b/x; fails with NOENT.
> 6) Remove of b/.nfsA23a3; succeeds.
 1.61 19-May-2006  yamt branches: 1.61.2; 1.61.4;
- fix compilation problem for !NFSSERVER && NFS.
pointed out by Tom Spindler on source-changes@.
- make nfs_srvdesc_pool static.
 1.60 18-May-2006  yamt - fix some leaks in nfsd, introduced by kauth changes.
- simplify code.
- add some assertions.
- wrap some long lines.
- remove an unnecessary ";".
 1.59 14-May-2006  elad integrate kauth.
 1.58 05-Jan-2006  yamt branches: 1.58.2; 1.58.4; 1.58.6; 1.58.8; 1.58.10;
ensure the export list is not changed during nfsd operations.
 1.57 03-Jan-2006  yamt de-__P.
 1.56 03-Jan-2006  yamt move function prototypes from nfs.h to nfs_var.h.
 1.55 11-Dec-2005  christos branches: 1.55.2;
merge ktrace-lwp.
 1.54 22-Nov-2005  yamt - reduce number of linear search per rpc.
- coalesce mount_netexport_pair into netexport.
 1.53 25-Sep-2005  jmmv branches: 1.53.6;
Add some COMPAT_30 code to let old mountd binaries work after the NFS
exports rototill.
 1.52 23-Sep-2005  jmmv Apply the NFS exports list rototill patch:

- Remove all NFS related stuff from file system specific code.
- Drop the vfs_checkexp hook and generalize it in the new nfs_check_export
function, thus removing redundancy from all file systems.
- Move all NFS export-related stuff from kern/vfs_subr.c to the new
file sys/nfs/nfs_export.c. The former was becoming large and its code
is always compiled, regardless of the build options. Using the latter,
the code is only compiled in when NFSSERVER is enabled. While doing this,
also make some functions in nfs_subs.c conditional to NFSSERVER.
- Add a new command in nfssvc(2), called NFSSVC_SETEXPORTSLIST, that takes a
path and a set of export entries. At the moment it can only clear the
exports list or append entries, one by one, but it is done in a way that
allows setting the whole set of entries atomically in the future (see the
comment in mountd_set_exports_list or in doc/TODO).
- Change mountd(8) to use the nfssvc(2) system call instead of mount(2) so
that it becomes file system agnostic. In fact, all this whole thing was
done to remove a 'XXX' block from this utility!
- Change the mount*, newfs and fsck* userland utilities to not deal with NFS
exports initialization; done internally by the kernel when initializing
the NFS support for each file system.
- Implement an interface for VFS (called VFS hooks) so that several kernel
subsystems can run arbitrary code upon receipt of specific VFS events.
At the moment, this only provides support for unmount and is used to
destroy NFS exports lists from the file systems being unmounted, though it
has room for extension.

Thanks go to yamt@, chs@, thorpej@, wrstuden@ and others for their comments
and advice in the development of this patch.
 1.51 07-Jul-2005  christos 1. use p = uio->uio_procp consistently and eliminate suspicious uses
of curproc (where uio->uio_procp should be used?). Don't do this
for nfs_commit(), because yamt says it is possibly wrong.
2. nfs_doio() does not use struct proc; remove it and the code to compute it.
3. use copyin_proc() and copyout_proc() instead of copyin() and copyout().
4. check return of copyout_proc(). and mark return from copyin_proc() XXX
5. Eliminate the p == curproc assertion check from nfs_write;
nfs_read does not have it and we might be called in a different
process context anyway (PR 20138).
 1.50 29-May-2005  christos branches: 1.50.2;
- sprinkle const
- avoid shadowed variables
- mark bad const use with XXXUNCONST
 1.49 27-Jan-2005  yamt branches: 1.49.4; 1.49.6;
keep directory eof cache when inactivating vnode
because there's no reason to throw it away.
(fix an unintended side effect of nfs_subs.c rev.1.144.)
 1.48 19-Jan-2005  yamt branches: 1.48.2;
implement inaccurate mtime/ctime detection.
namely, if mtime or ctime are the same between pre_op_attr and post_op_attr
when we expected them to be changed, don't trust the server.
 1.47 14-Dec-2004  yamt branches: 1.47.2;
- centralize code to invalidate stale cache.
- don't ignore errors when invalidating buffers in nfs_open.
 1.46 15-Sep-2004  yamt fix access-after-free bugs in dircache code by refcounting nfsdircache.
PR/26864.
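The reference-counting approach named in the 1.46 log can be sketched in general terms as below. The names demo_dircache, demo_dircache_alloc, demo_dircache_hold and demo_dircache_rele are hypothetical; this shows the technique only, not the actual nfsdircache code, and it ignores locking.

```c
#include <stdlib.h>

/* Hypothetical cache entry with a reference count. */
struct demo_dircache {
	unsigned int refcount;
	long cookie;
};

/* Allocate an entry holding one reference for the creator. */
struct demo_dircache *
demo_dircache_alloc(long cookie)
{
	struct demo_dircache *e = malloc(sizeof(*e));

	if (e == NULL)
		return NULL;
	e->refcount = 1;
	e->cookie = cookie;
	return e;
}

/* Take an extra reference before using the entry elsewhere. */
void
demo_dircache_hold(struct demo_dircache *e)
{
	e->refcount++;
}

/* Drop a reference; free on the last one. Returns 1 if freed. */
int
demo_dircache_rele(struct demo_dircache *e)
{
	if (--e->refcount == 0) {
		free(e);
		return 1;
	}
	return 0;
}
```

An entry removed from the cache while another user still holds a reference stays valid until that user releases it, which is how this pattern avoids access after free.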
 1.45 22-May-2004  jonathan Eliminate several uses of `curproc' from the socket-layer code and from NFS.

Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded by Matt Thomas.

Changes to soreceive() and (*dom->dom_externalize)() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize. Also, now soreceive() uses
its uio->uio_procp value, pass that same value downward to
(*pr->pru_usrreq)() calls for consistency, instead of (struct proc *)0.

Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req->r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.

Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
 1.44 10-May-2004  yamt don't do kludge for a reply to a retransmitted request
unless we actually retransmitted the request.
 1.43 05-Apr-2004  yamt nfs_readdirplusrpc: fix a deadlock problem.
don't wait for vnode lock to load attributes.
otherwise, because READDIRPLUS returns DOTDOT entry as well,
we violate locking order.
 1.42 23-Jul-2003  yamt branches: 1.42.2;
when rexmitting a request due to NFSERR_JUKEBOX,
use a new xid as RFC1813 says.
 1.41 29-Jun-2003  fvdl branches: 1.41.2;
Back out the lwp/ktrace changes. They contained a lot of colateral damage,
and need to be examined and discussed more.
 1.40 29-Jun-2003  thorpej Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.39 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.38 09-Jun-2003  yamt rework zero padding of rpc reply.
- for READ procedure, don't send back more bytes than requested.
- don't have doubtful assumptions on mbuf chain structure.
- rename a function (nfsm_adj -> nfs_zeropad) to avoid confusion as
the semantics of the function was changed.
 1.37 22-May-2003  yamt poolify nfsrv_descript.
 1.36 22-May-2003  yamt avoid double free with xlatecookie.
 1.35 22-May-2003  yamt interlock for nfs_rcvlock.
 1.34 21-May-2003  yamt eliminate memcpy in the common and easy case of write.
 1.33 07-May-2003  yamt simple lock for nfs iod.
 1.32 05-May-2003  yamt keep things not needed by userland in #ifdef _KERNEL.
(e.g. prototypes for in-kernel functions)
 1.31 03-May-2003  yamt better handling of write verifier change.
 1.30 24-Apr-2003  drochner Change some subordinate functions to take a "struct nfsnode" argument
instead of "struct vnode". This saves a number of pointer dereferences;
it sums up to about half a kB for me. And it paves the way for future
fixes.
While cleaning up, eliminate a write-only member of "struct nfsreq"
and a pointless assignment in the NFS_V2_ONLY case.
 1.29 28-Mar-2003  yamt i forgot to check this in with the previous (reply ENAMETOOLONG properly).
 1.28 02-Feb-2003  christos protect <sys/mallocvar.h> ifdef _KERNEL
 1.27 01-Feb-2003  thorpej Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.
 1.26 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.25 23-Oct-2002  jdolecek merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework.
Currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals.

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
 1.24 21-Oct-2002  yamt fix a page locking deadlock problem for nfs.

add a flag that specifies whether the file can be truncated safely or not
to nfsm_loadattr and friends. when it isn't safe, just mark the nfsnode
as "should be truncated later".

ok'ed by Frank van der Linden and Chuck Silvers.
close kern/18036.
 1.23 17-Mar-2002  christos use the exithook mechanism to remove the exiting process from the list
of processes to be signalled in a soft mount.
 1.22 11-Mar-2002  jdolecek nfs_enterdircache() had last two parameter types swapped
Noted in kern/14742 by John Franklin.
 1.21 05-Dec-2001  lukem don't need nfs_hash prototype here
 1.20 15-Sep-2001  chs add a new VFS op, vfs_reinit, which is called when desiredvnodes is
adjusted via sysctl. file systems that have hash tables which are
sized based on the value of this variable now resize those hash tables
using the new value. the max number of FFS softdeps is also recalculated.

convert various file systems to use the <sys/queue.h> macros for
their hash tables.
 1.19 27-Nov-2000  chs branches: 1.19.2; 1.19.4; 1.19.6;
Initial integration of the Unified Buffer Cache project.
 1.18 19-Sep-2000  fvdl Add prototypes for commitrange functions.
 1.17 15-Apr-2000  tsarna branches: 1.17.4;
Death to nfsiod!

It is replaced by kernel threads that do the same thing. The number of
kernel threads used is set with the vfs.nfs.iothreads sysctl.
 1.16 16-Mar-2000  jdolecek Add new VFS op routine - vfs_done and call it on filesystem detach
in vfs_detach(). vfs_done may free global filesystem's resources,
typically those allocated in respective filesystem's init function.
Needed so those filesystems which went in via LKM have a chance to
clean after themselves before unloading. This fixes random panics
when LKM for filesystem using pools was loaded and unloaded several
times.

For each leaf filesystem, add appropriate vfs_done routine.
 1.15 05-Sep-1998  christos branches: 1.15.12;
Assign copyright to TNF.
 1.14 25-Jun-1998  thorpej - Rename nqnfs_vop_lease_check() to genfs_lease_check(). If NFSSERVER is
not in the kernel, genfs_lease_check() is simply a no-op. This allows
LKM'd file systems to be exported (previously did not work properly
due to a compile-time decision based on -DNFSSERVER).
- defopt NFSSERVER
 1.13 29-Mar-1998  mrg add forward decl for union nethostaddr.
 1.12 30-Jan-1998  fvdl Only take the receive lock before disconnecting when doing it from
nfs_decode_args. Otherwise we might just end up locking against ourselves.

XXX workaround, will do ok for now. Proper fix forthcoming.
 1.11 19-Oct-1997  fvdl branches: 1.11.2;
* Implement optional 32 <-> 64 bit directory cookie translation. This uses
the directory cache as translation table. See nfs_subs.c for comments.
Makes the code a bit more complex to look at than I would have liked,
but doesn't affect the speed of the default behavior.
* Optimize caching behavior a bit when buffers are invalidated.
* Save some RPCs in readdir operations by not bothering if there is
a small amount left to do to fill the buffer. It'll be done in the
next RPC with a larger chunk anyway. Wastes a bit of buffer space
but is faster.
* Make n_vattr an allocated vattr struct. This avoids nfsnode bloat,
and is friendlier to the malloc routines.
 1.10 10-Oct-1997  fvdl * New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.9 29-Aug-1997  gwr The nfs_boot_xxx functions are declared in nfsdiskless.h
so remove the duplicate declarations here.
 1.8 14-Jul-1997  fvdl branches: 1.8.2;
Don't assume that pointers into mbuf data remain valid across nfsm_dissect.
In readdirplus, don't keep such pointers but store the file attributes
in a variable instead until they are needed. Change nfsm_loadattr*
a bit so it can accept a direct pointer to an nfs_fattr structure.
 1.7 24-Jun-1997  fvdl Add prototype for nfs_ispublicfh, change the ones for nfs_namei and
nfsrv_fhtovp.
 1.6 11-Dec-1996  fvdl Give permission to the owner of the file to preserve semantics only
in the relevant cases (read, write). Fixes PR 3017.
 1.5 25-Oct-1996  cgd make the namei struct members ni_dirp and ni_next, and the componentname
struct member cn_nameptr 'const', since they should never be used to
modify the path name. (Only the pathname buffer, cn_pnbuf, should be
modified.) Propagate the const poisoning to code that uses the namei
and componentname structs.
 1.4 01-Sep-1996  mycroft Add a set of generic file system operations that most file systems use.
Also, fix some time stamp bogosities.
 1.3 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.2 13-Feb-1996  christos add 2 missing fwd struct declarations
 1.1 09-Feb-1996  christos nfs prototype changes
 1.8.2.2 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.8.2.1 01-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.11.2.1 07-Feb-1998  mellon Pull up 1.12
 1.15.12.2 08-Dec-2000  bouyer Sync with HEAD.
 1.15.12.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.17.4.1 14-Dec-2000  he Pull up revision 1.18 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.19.6.1 01-Oct-2001  fvdl Catch up with -current.
 1.19.4.4 30-Sep-2002  jdolecek add support for kevents to NFS
to detect file changes on server by other NFS clients, polling kernel thread
is used to periodically check for attribute changes of watched files;
the NFS server is only contacted when the vnode expires from local attrcache
(which takes 5-60 seconds currently), to keep network&CPU overhead low

the routine checking for remote changes is quite simplistic, but hopefully
doing its job well enough
 1.19.4.3 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.19.4.2 16-Mar-2002  jdolecek Catch up with -current.
 1.19.4.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.19.2.6 11-Nov-2002  nathanw Catch up to -current
 1.19.2.5 22-Oct-2002  thorpej Sync with HEAD.
 1.19.2.4 01-Apr-2002  nathanw Catch up to -current.
(CVS: It's not just a program. It's an adventure!)
 1.19.2.3 08-Jan-2002  nathanw Catch up to -current.
 1.19.2.2 21-Sep-2001  nathanw Catch up to -current.
 1.19.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.41.2.11 11-Dec-2005  christos Sync with head.
 1.41.2.10 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.41.2.9 04-Feb-2005  skrll Sync with HEAD.
 1.41.2.8 24-Jan-2005  skrll Sync with HEAD.
 1.41.2.7 18-Dec-2004  skrll Sync with HEAD.
 1.41.2.6 21-Sep-2004  skrll Fix the sync with head I botched.
 1.41.2.5 18-Sep-2004  skrll Sync with HEAD.
 1.41.2.4 24-Aug-2004  skrll Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.41.2.3 18-Aug-2004  skrll Revert to passing struct proc for {exit,exec}hook.
 1.41.2.2 03-Aug-2004  skrll Sync with HEAD
 1.41.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.42.2.3 11-Jan-2005  jmc Pullup patch (requested by yamy in ticket #1078)

Don't do kludge for a reply to a retransmitted request
unless we actually retransmitted the request.
 1.42.2.2 18-Sep-2004  he branches: 1.42.2.2.2;
Pull up revision 1.46 (requested by yamt in ticket #858):
Fix access-after-free bugs in dircache code by reference
counting nfsdircache. Fixes PR#26864.
 1.42.2.1 10-Jul-2004  tron Pull up revision 1.43 (requested by tls in ticket #634):
nfs_readdirplusrpc: fix a deadlock problem.
don't wait for vnode lock to load attributes.
otherwise, because READDIRPLUS returns DOTDOT entry as well,
we violate locking order.
 1.42.2.2.2.1 11-Jan-2005  jmc Pullup patch (requested by yamy in ticket #1078)

Don't do kludge for a reply to a retransmitted request
unless we actually retransmitted the request.
 1.47.2.1 29-Apr-2005  kent sync with -current
 1.48.2.1 12-Feb-2005  yamt sync with head.
 1.49.6.1 16-Jul-2006  ghen Pull up following revision(s) (requested by jld in ticket #1424):
sys/nfs/nfs_vnops.c: revision 1.240 via patch
sys/nfs/nfs_var.h: revision 1.62 via patch
Fix race condition in NFS renaming that could cause the renamed file to be
deleted (PR/33861).
 1.49.4.1 16-Jul-2006  ghen Pull up following revision(s) (requested by jld in ticket #1424):
sys/nfs/nfs_vnops.c: revision 1.240 via patch
sys/nfs/nfs_var.h: revision 1.62 via patch
Fix race condition in NFS renaming that could cause the renamed file to be
deleted (PR/33861).
 1.50.2.8 21-Jan-2008  yamt sync with head
 1.50.2.7 07-Dec-2007  yamt sync with head
 1.50.2.6 15-Nov-2007  yamt sync with head.
 1.50.2.5 27-Oct-2007  yamt sync with head.
 1.50.2.4 03-Sep-2007  yamt sync with head.
 1.50.2.3 26-Feb-2007  yamt sync with head.
 1.50.2.2 30-Dec-2006  yamt sync with head.
 1.50.2.1 21-Jun-2006  yamt sync with head.
 1.53.6.4 22-Nov-2005  yamt sync with head.
 1.53.6.3 22-Nov-2005  yamt remove uvm_ractx forward decl. which is no longer used.
 1.53.6.2 18-Nov-2005  yamt - associate read-ahead context to vnode, rather than file.
- revert VOP_READ prototype.
 1.53.6.1 15-Nov-2005  yamt adapt ffs, lfs, nfs.
 1.55.2.1 15-Jan-2006  yamt sync with head.
 1.58.10.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.58.8.3 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.58.8.2 10-Mar-2006  elad Cleanup more interface abuse.

Make nfsrv_setcred() take a kauth_cred_t * as outcred. The original code
just modified it directly; we can't do that, nor do we want to.

Get rid of another case of kauth_cred_zero() followed by kauth_cred_hold()
and use kauth_cred_clone() to make sure we don't leave out important
members.

Add another DIAGNOSTIC check for reference count of above one.

Again, this should be tested.
 1.58.8.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.58.6.3 03-Sep-2006  yamt sync with head.
 1.58.6.2 11-Aug-2006  yamt sync with head
 1.58.6.1 24-May-2006  yamt sync with head.
 1.58.4.1 01-Jun-2006  kardel Sync with head.
 1.58.2.1 09-Sep-2006  rpaulo sync with head
 1.61.4.1 13-Jul-2006  gdamore Merge from HEAD.
 1.61.2.2 19-May-2006  yamt - fix compilation problem for !NFSSERVER && NFS.
pointed out by Tom Spindler on source-changes@.
- make nfs_srvdesc_pool static.
 1.61.2.1 19-May-2006  yamt file nfs_var.h was added on branch chap-midi on 2006-05-19 13:53:12 +0000
 1.63.2.1 12-Jan-2007  ad Sync with head.
 1.64.2.3 07-May-2007  yamt sync with head.
 1.64.2.2 12-Mar-2007  rmind Sync with HEAD.
 1.64.2.1 28-Feb-2007  yamt sync with head. (somehow missed in the previous)
 1.66.4.1 11-Jul-2007  mjf Sync with head.
 1.66.2.5 26-Aug-2007  yamt - mark nfssvc(2) MPSAFE and move the most of nfsd out of the kernel lock.
- remove unused ns_solock.
- remove some of KERNEL_LOCK/UNLOCK which are not necessary on this branch.
 1.66.2.4 20-Aug-2007  ad Sync with HEAD.
 1.66.2.3 15-Jul-2007  ad Sync with head.
 1.66.2.2 09-Jun-2007  ad Sync with head.
 1.66.2.1 08-Jun-2007  ad Sync with head.
 1.69.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.71.6.2 27-Jul-2007  yamt stop nfs tick when we have nothing to do.
 1.71.6.1 27-Jul-2007  yamt file nfs_var.h was added on branch matt-mips64 on 2007-07-27 10:03:59 +0000
 1.71.4.4 09-Dec-2007  jmcneill Sync with HEAD.
 1.71.4.3 29-Oct-2007  joerg Sync with HEAD.
 1.71.4.2 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.71.4.1 16-Aug-2007  jmcneill Sync with HEAD.
 1.72.6.2 13-Nov-2007  bouyer Sync with HEAD
 1.72.6.1 25-Oct-2007  bouyer Sync with HEAD.
 1.72.2.2 09-Jan-2008  matt sync with HEAD
 1.72.2.1 06-Nov-2007  matt sync with HEAD
 1.74.4.4 26-Dec-2007  ad Sync with head.
 1.74.4.3 08-Dec-2007  ad Sync with head.
 1.74.4.2 04-Dec-2007  yamt apply the following change, which seems to get lost during
vmlocking -> vmlocking2 transition.

Module Name: src
Committed By: yamt
Date: Sun Oct 21 08:23:20 UTC 2007

Modified Files:
src/sys/nfs: nfs_socket.c nfs_var.h

Log Message:
remove lwp argument from nfs_reconnect and always use &lwp0
because who triggers a reconnect doesn't really matter here. PR/37145.


To generate a diff of this commit:
cvs rdiff -r1.163 -r1.164 src/sys/nfs/nfs_socket.c
cvs rdiff -r1.72 -r1.73 src/sys/nfs/nfs_var.h

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
 1.74.4.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.74.2.3 18-Feb-2008  mjf Sync with HEAD.
 1.74.2.2 27-Dec-2007  mjf Sync with HEAD.
 1.74.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.75.4.1 02-Jan-2008  bouyer Sync with HEAD
 1.77.6.3 17-Jan-2009  mjf Sync with HEAD.
 1.77.6.2 05-Oct-2008  mjf Sync with HEAD.
 1.77.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.78.4.8 11-Mar-2010  yamt sync with head
 1.78.4.7 16-Sep-2009  yamt sync with head
 1.78.4.6 18-Jul-2009  yamt sync with head.
 1.78.4.5 20-Jun-2009  yamt sync with head
 1.78.4.4 04-May-2009  yamt fix merge botches.
 1.78.4.3 04-May-2009  yamt sync with head.
 1.78.4.2 16-May-2008  yamt sync with head.
 1.78.4.1 27-Apr-2008  yamt commit some work-in-progress changes to make nfs client mp-safe to a branch,
so that they won't get lost.
- sprinkle some locking
- mark the filesystem, nfstimer callout, and kq kthread mp-safe
- add assertions and comments
- disable upgrade mount for now
- some unrelated cosmetic changes
 1.78.2.1 18-May-2008  yamt sync with head.
 1.79.6.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.79.6.1 19-Oct-2008  haad Sync with HEAD.
 1.79.2.1 10-Oct-2008  skrll Sync with HEAD.
 1.82.2.2 28-Apr-2009  skrll Sync with HEAD.
 1.82.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.85.4.2 23-Jul-2009  jym Sync with HEAD.
 1.85.4.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.89.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.90.24.1 18-May-2014  rmind sync with head
 1.90.20.2 03-Dec-2017  jdolecek update from HEAD
 1.90.20.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.90.10.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.91.2.1 10-Aug-2014  tls Rebase.
 1.92.4.1 22-Sep-2015  skrll Sync with HEAD
 1.92.2.2 04-Nov-2015  riz Pull up following revision(s) (requested by mrg in ticket #956):
sys/compat/netbsd32/files.netbsd32: revision 1.36
sys/compat/netbsd32/netbsd32_sysent.c: revision 1.115
sys/compat/netbsd32/netbsd32_syscallargs.h: revision 1.116
sys/nfs/nfs_var.h: revision 1.93
sys/compat/netbsd32/netbsd32_conv.h: revision 1.30
sys/compat/netbsd32/netbsd32_syscall.h: revision 1.116
sys/compat/netbsd32/netbsd32_syscalls.c: revision 1.115
sys/compat/netbsd32/netbsd32_nfssvc.c: revision 1.1
sys/compat/netbsd32/netbsd32_nfssvc.c: revision 1.3
sys/nfs/nfs_syscalls.c: revision 1.156
sys/compat/netbsd32/syscalls.master: revision 1.108
sys/compat/netbsd32/netbsd32.h: revision 1.107
add netbsd32 support for nfssvc(2). we do this by defining 5 copyin/out
functions that do all the ugly work, are just plain copyin/out for the
native system calls, and do the necessary translations for netbsd32.
with this i'm able to run 32 bit nfsd and mountd on 64 bit kernel and
mount the file systems remotely.
don't copy the first netbsd32_export_args nexports times, but actually
advance the userland pointer each entry through the loop. oops.
 1.92.2.1 04-Nov-2015  riz Pull up following revision(s) (requested by manu in ticket #882):
sbin/umount/umount.c: revision 1.48
sys/nfs/nfsmount.h: revision 1.53
sys/nfs/nfs_var.h: revision 1.94
sys/nfs/nfs_iod.c: revision 1.7
sys/nfs/nfs_socket.c: revision 1.197
sys/nfs/nfs_bio.c: revision 1.191
sys/nfs/nfs_vfsops.c: revision 1.230
sys/nfs/nfs_clntsocket.c: revision 1.3
Remove useless and harmful sync(2) call in umount(8)
Remove sync(2) call before unmount(2) in umount(8). This sync(2) is useless
since unmount(2) will perform a VFS_SYNC anyway.
But moreover, this sync(2) may be harmful, as there are some situations where
it cannot return (unreachable NFS server, for instance), causing umount -f
to be ineffective.
Fix soft NFS force unmount
For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that await an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.
Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by waiting for asynchronous I/O
to drain before proceeding.
Reviewed by Chuck Silvers.
 1.94.38.1 06-Jun-2021  cjep sync with head
 1.94.34.1 17-Jun-2021  thorpej Sync w/ HEAD.
 1.96.10.1 02-Aug-2025  perseant Sync with HEAD
 1.246 13-May-2024  msaitoh ficticious -> fictitious in comment.
 1.245 21-Mar-2023  christos PR/57279: Izumi Tsutsui: Fix some {int,long} -> time_t. Still things will
break eventually because parts of the nfs protocol assume time_t will fit
in 32 bits.
 1.244 17-Mar-2023  mlelstv Avoid overflow of nfs_commitsize on machines with > 32GB RAM.
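
As a rough illustration of how a size derived from the page count can overflow on large-memory machines, here is a sketch. The formula (page count shifted by `PAGE_SHIFT - 4`) and the 4 KiB page size are assumptions made for this example, not a quotation of the kernel source:

```c
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12	/* assumed 4 KiB pages */

/* 32-bit arithmetic: the shift wraps once the result needs more than 32 bits. */
static uint32_t
commit_bytes_32(uint32_t npages)
{
	return npages << (PAGE_SHIFT_SKETCH - 4);
}

/* Widen first, then shift: the full value survives. */
static uint64_t
commit_bytes_64(uint32_t npages)
{
	return (uint64_t)npages << (PAGE_SHIFT_SKETCH - 4);
}
```

With 2^24 pages (64 GiB at the assumed page size), the 32-bit variant wraps to zero while the widened variant keeps the full value.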
 1.243 13-Jun-2021  mlelstv branches: 1.243.10;
Don't pretend that files are limited to 1TB on NFSv3.
 1.242 02-Apr-2021  christos branches: 1.242.2;
Set f_namemax during mount time like all the other filesystems so that
it gets the right data in copy_statvfs_info(). Otherwise f_namemax
can end up being 0. To reproduce: unmount the remote filesystem, remount
it, and kill -HUP mountd to refresh exports.
 1.241 13-Apr-2020  ad branches: 1.241.2; 1.241.4;
Replace most uses of vp->v_usecount with a call to vrefcnt(vp), a function
that hides the details and does atomic_load_relaxed(). Signature matches
FreeBSD.
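
The idea of hiding a counter behind a relaxed-load accessor can be sketched in portable C11. Here `struct vnode_sketch` and `vrefcnt_sketch()` are hypothetical names, and `<stdatomic.h>` stands in for the kernel's atomic_load_relaxed():

```c
#include <stdatomic.h>

/* Hypothetical cut-down vnode; the real structure has many more fields. */
struct vnode_sketch {
	atomic_uint v_usecount;
};

/* Accessor in the spirit of vrefcnt(): one relaxed load, no locking. */
static unsigned int
vrefcnt_sketch(struct vnode_sketch *vp)
{
	return atomic_load_explicit(&vp->v_usecount, memory_order_relaxed);
}
```

Callers then read the count through the accessor instead of touching the field, so the field's representation can change later without editing every user.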
 1.240 16-Mar-2020  pgoyette branches: 1.240.2;
Use the module subsystem's ability to process SYSCTL_SETUP() entries to
automate installation of sysctl nodes.

Note that there are still a number of device and pseudo-device modules
that create entries tied to individual device units, rather than to the
module itself. These are not changed.
 1.239 27-Feb-2020  ad Tighten up the locking around vp->v_iflag a little more after the recent
split of vmobjlock & v_interlock.
 1.238 17-Jan-2020  ad VFS_VGET(), VFS_ROOT(), VFS_FHTOVP(): give them a "int lktype" argument, to
allow us to get shared locks (or no lock) on the returned vnode. Matches
FreeBSD.
 1.237 03-Sep-2018  riastradh branches: 1.237.4; 1.237.6;
Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
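
The truncation described above can be reproduced in user space. This sketch assumes a platform where unsigned int is 32 bits wide; `clamp_truncating()` and `clamp_exact()` are illustrative names only:

```c
#include <stdint.h>

/* Same shape as the libkern helper: both operands are unsigned int. */
static unsigned int
uimin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Macro form: no conversion, but arguments may be evaluated twice. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

static uint64_t
clamp_truncating(uint64_t len, uint64_t limit)
{
	/* Both arguments are silently converted to unsigned int first. */
	return uimin(len, limit);
}

static uint64_t
clamp_exact(uint64_t len, uint64_t limit)
{
	/* The comparison happens in the operands' own 64-bit type. */
	return MIN(len, limit);
}
```

For a length of 2^32 + 5 and a limit of 100, the function version compares 5 against 100 and returns 5, whereas the macro returns 100.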
 1.236 16-Mar-2018  christos branches: 1.236.2;
PR/53103: Timo Buhrmester: linux emulation of sendto(2) broken

The sockargs refactoring broke it, because sockargs only works with a user
address. Added an argument to sockargs to indicate where the address is
coming from. Welcome to 8.99.14.
 1.235 17-Apr-2017  hannken branches: 1.235.10;
Remove unused argument "nextp" from vfs_busy() and vfs_unbusy().
Remove argument "keepref" from vfs_unbusy() and add vfs_ref() where needed.
 1.234 17-Apr-2017  hannken Add vfs_ref(mp) and vfs_rele(mp) to add or remove a reference to
struct mount. Rename vfs_destroy(mp) to vfs_rele(mp) and replace
incrementing mp->mnt_refcnt with vfs_ref(mp).
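
A reference/release pair of this kind can be sketched with C11 atomics. The `_sketch` names and the `destroyed` flag are hypothetical; the real vfs_ref()/vfs_rele() operate on struct mount and free it rather than setting a flag:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reduced mount structure. */
struct mount_sketch {
	atomic_uint mnt_refcnt;
	bool destroyed;		/* stands in for actually freeing the mount */
};

/* Take an additional reference. */
static void
vfs_ref_sketch(struct mount_sketch *mp)
{
	atomic_fetch_add(&mp->mnt_refcnt, 1);
}

/* Drop a reference; whoever drops the last one tears the mount down. */
static void
vfs_rele_sketch(struct mount_sketch *mp)
{
	/* atomic_fetch_sub returns the value held before the decrement. */
	if (atomic_fetch_sub(&mp->mnt_refcnt, 1) == 1)
		mp->destroyed = true;
}
```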
 1.233 01-Apr-2017  riastradh KASSERT(mutex_owned(vp->v_interlock)) in vnode iterator selector.
 1.232 17-Feb-2017  hannken Add generic genfs_suspendctl() and use it for all file systems.
Layered file systems need work.
 1.231 02-Nov-2015  pgoyette branches: 1.231.2; 1.231.4;
Don't forget to call nfs_fini() when we're finished. Without this,
we leave a dangling pool nfsrvdescpl around.
 1.230 15-Jul-2015  manu Fix soft NFS force unmount

For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that await an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.

Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.

Reviewed by Chuck Silvers.
 1.229 30-May-2014  hannken branches: 1.229.2; 1.229.4;
Change NFS from rbtree to vcache.
 1.228 24-May-2014  christos Introduce a selector function to the vfs vnode iterator so that we don't
need to vget() vnodes that we are not interested in, and optimize locking
a bit. Iterator changes reviewed by Hannken (thanks), the rest of the bugs
are mine.
 1.227 16-Apr-2014  maxv An (un)privileged user can easily make the kernel dereference a NULL
pointer.

The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).

ok christos@
 1.226 23-Mar-2014  hannken branches: 1.226.2;
Change all vfsops to use C99 designated initializers.

No functional changes intended.
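
For readers unfamiliar with the style, a C99 designated initializer for an operations table looks roughly like this. `struct fsops_sketch` and its members are hypothetical and far smaller than the real struct vfsops:

```c
#include <stddef.h>

/* Hypothetical, much reduced operations table. */
struct fsops_sketch {
	const char *fs_name;
	int (*fs_mount)(const char *path);
	int (*fs_unmount)(int flags);
	int (*fs_sync)(void);
};

static int
demo_mount(const char *path)
{
	return path != NULL ? 0 : -1;
}

static int
demo_unmount(int flags)
{
	(void)flags;
	return 0;
}

/* Members are named, so order does not matter and omitted ones are NULL. */
static const struct fsops_sketch demo_fsops = {
	.fs_name = "demo",
	.fs_unmount = demo_unmount,
	.fs_mount = demo_mount,
};
```

Compared with positional initializers, adding or reordering members of the structure no longer silently shifts every later entry.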
 1.225 17-Mar-2014  hannken Change nfs_sync() to use vfs_vnode_iterator.
 1.224 25-Feb-2014  pooka Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
 1.223 23-Nov-2013  christos change the mountlist CIRCLEQ into a TAILQ
 1.222 14-Sep-2013  martin Remove unused variable
 1.221 22-Jan-2013  dholland branches: 1.221.2;
Stuff UFS_ in front of a few of ufs's symbols to reduce namespace
pollution. Specifically:
ROOTINO -> UFS_ROOTINO
WINO -> UFS_WINO
NXADDR -> UFS_NXADDR
NDADDR -> UFS_NDADDR
NIADDR -> UFS_NIADDR
MAXSYMLINKLEN -> UFS_MAXSYMLINKLEN
MAXSYMLINKLEN_UFS[12] -> UFS[12]_MAXSYMLINKLEN (for consistency)

Sort out ext2fs's misuse of NDADDR and NIADDR; fortunately, these have
the same values in ext2fs and ffs.

No functional change intended.
 1.220 24-Oct-2011  hannken branches: 1.220.2; 1.220.8; 1.220.12; 1.220.14; 1.220.16;
VOP_GETATTR() needs a shared lock at least.

As nfs_kqpoll() ignores the return value from VOP_GETATTR() initialize
the attributes to zero -- nfs_kqfilter() does the same.
 1.219 07-Oct-2011  hannken As vnalloc() always allocates with PR_WAITOK there is no longer the need
to test its result for NULL.
 1.218 12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.217 12-Aug-2010  pooka branches: 1.217.6;
Do not return a garbage vnode in vpp if fhtovp fails.

Fixes PR kern/43745 for nfs.
 1.216 21-Jul-2010  hannken Make holding v_interlock mandatory for callers of vget().

Announced some time ago on tech-kern.
 1.215 09-Jul-2010  hannken nfs_unmount(): No need to take a second reference for the root node.

nfs_root(): Replace vget() with vref()/vn_lock(), this node already
has a reference.
 1.214 24-Jun-2010  hannken Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
 1.213 24-Jun-2010  hannken Clean up vnode lock operations:

- VOP_LOCK(vp, flags): Limit the set of allowed flags to LK_EXCLUSIVE,
LK_SHARED and LK_NOWAIT. LK_INTERLOCK is no longer allowed as it
makes no sense here.

- VOP_ISLOCKED(vp): Remove the for some time unused return value
LK_EXCLOTHER. Mark this operation as "diagnostic only".
Making a lock decision based on this operation is no longer allowed.

Discussed on tech-kern.
 1.212 15-May-2010  dholland nfs_statvfs should return NFS_MAXNAMLEN, not MAXNAMLEN.
(Compile-tested only, but that should be ok)
 1.211 02-Mar-2010  pooka branches: 1.211.2;
Get rid of dependency on fs_nfs.h, i.e. source modules with
conditional content depending on if the NFS client is wanted or
not. The server can now be made an independent module not depending
on the nfs client.

Tested with rump_nfs (standalone client), rump_nfsd (standalone
nfsd) and a qemu installation with both the client and the server.
 1.210 15-Mar-2009  cegger branches: 1.210.2;
ansify function definitions
 1.209 14-Mar-2009  dsl ANSIfy another 1261 function definitions.
The only ones left in sys are beyond my sed script!
(or in sys/dist or sys/external)
Mostly they have function pointer parameters.
 1.208 14-Mar-2009  dsl Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
 1.207 14-Mar-2009  dsl Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.206 17-Dec-2008  cegger branches: 1.206.2;
kill MALLOC and FREE macros.
 1.205 19-Nov-2008  ad Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.204 14-Nov-2008  ad Remove COMPAT ifdefs that might as well be comments (i.e., they cost us
almost nothing).
 1.203 22-Oct-2008  matt branches: 1.203.2; 1.203.4; 1.203.10; 1.203.14;
Don't need nfs_vfs_reinit anymore since we don't resize tables anymore.
Move reinit code to init case.
 1.202 22-Oct-2008  matt Change NFS to use a RB-tree for its FH->nfsnode lookups.
 1.201 30-Sep-2008  pooka Since the nfs root vnode is eternally constant, fully initialize
it in mountfs instead of deferring part of the initialization to
VFS_ROOT(). Fixes theoretical future bugs for nfs roots.
 1.200 10-May-2008  rumble branches: 1.200.4;
Convert file systems to dynamically attach with the new module interface.
Make VFS hooks dynamic while we're here and say farewell to VFS_ATTACH and
VFS_HOOKS_ATTACH linksets.

As a consequence, most of the file systems can now be loaded as new style
modules.

Quick sanity check by ad@.
 1.199 06-May-2008  ad branches: 1.199.2;
PR kern/38141 lookup/vfs_busy acquire rwlock recursively

Simplify the mount locking. Remove all the crud to deal with recursion on
the mount lock, and crud to deal with unmount as another weirdo lock.

Hopefully this will once and for all fix the deadlocks with this. With this
commit there are two locks on each mount:

- krwlock_t mnt_unmounting. This is used to prevent unmount across critical
sections like getnewvnode(). It's only ever read locked with rw_tryenter(),
and is only ever write locked in dounmount(). A write hold can't be taken
on this lock if the current LWP could hold a vnode lock.

- kmutex_t mnt_updating. This is taken by threads updating the mount, for
example when going r/o -> r/w, and is only present to serialize updates.
In order to take this lock, a read hold must first be taken on
mnt_unmounting, and the two need to be held across the operation.

One effect of this change: previously if an unmount failed, we would make a
half hearted attempt to back out of it gracefully, but that was unlikely to
work in a lot of cases. Now while an unmount that will be aborted is in
progress, new file operations within the mount will fail instead of being
delayed. That is unlikely to be a problem though, because if the admin
requests unmount of a file system then s(he) has made a decision to deny
access to the resource.
 1.198 30-Apr-2008  ad PR kern/38135 vfs_busy/vfs_trybusy confusion

The previous fix worked, but it opened a window where mounts could have
disappeared from mountlist while the caller was traversing it using
vfs_trybusy(). Fix that.
 1.197 29-Apr-2008  ad PR kern/38057 ffs makes assumptions about devvp file system
PR kern/33406 softdeps get stuck in endless loop

Introduce VFS_FSYNC() and call it when syncing a block device, if it
has a mounted file system.
 1.196 13-Feb-2008  yamt branches: 1.196.6; 1.196.8; 1.196.10;
reject files larger than nm_maxfilesize.
 1.195 13-Feb-2008  yamt nfs_mountroot: kmem_alloc+memset -> kmem_zalloc
 1.194 30-Jan-2008  ad PR kern/37706 (forced unmount of file systems is unsafe):

- Do reference counting for 'struct mount'. Each vnode associated with a
mount takes a reference, and in turn the mount takes a reference to the
vfsops.
- Now that mounts are reference counted, replace the overcomplicated mount
locking inherited from 4.4BSD with a recursable rwlock.
 1.193 28-Jan-2008  dholland Fix some race conditions in rename.
Introduce a per-FS rename lock and new vfsops to manipulate it.
Get this lock while renaming. Also add another relookup() in do_sys_rename,
which is a hack to kludge around some of the worst deficiencies of
ufs_rename.
reviewed-by: pooka (and an earlier rev by ad)
posted on tech-kern with no objections.
 1.192 20-Jan-2008  joerg Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.
 1.191 03-Jan-2008  pooka valloc -> vnalloc, vfree -> vnfree
Avoids collision with userland valloc(3).

no functional change
ad ok
 1.190 02-Jan-2008  yamt use kmem_alloc instead of malloc.
 1.189 02-Jan-2008  ad Merge vmlocking2 to head.
 1.188 26-Nov-2007  pooka branches: 1.188.2; 1.188.6;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.187 28-Oct-2007  yamt branches: 1.187.2;
make NFS_ATTRTIMEO a function.
 1.186 10-Oct-2007  ad branches: 1.186.2;
Merge from vmlocking:

- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
 1.185 06-Sep-2007  rmind branches: 1.185.2;
nfs_mount: Plug a possible leak.
Introduced in rev 1.114.
From CID: 4534
 1.184 10-Aug-2007  yamt branches: 1.184.2;
- instead of scanning an array of iods, maintain a list of idle iods.
- make nfs_getset_niothreads MP friendly.
 1.183 05-Aug-2007  yamt branches: 1.183.2;
use kpause rather than lbolt.
 1.182 31-Jul-2007  pooka branches: 1.182.2;
* nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
use VFS_PROTOS() instead of manually prototyping the methods
 1.181 26-Jul-2007  pooka Use eopnotsupp() instead of vfs_stdsuspendctl() and retire the latter.
 1.180 20-Jul-2007  pooka In sync, skip over vnodes based on if they are clean rather than
if they have pages.
 1.179 17-Jul-2007  pooka branches: 1.179.2;
Make set_statvfs_info() take a parameter for the vfs name instead
of always retrieving it from mp->mnt_op->vfs_name

christos ok
 1.178 12-Jul-2007  dsl Change the VFS_MOUNT() interface so that the 'data' buffer passed to the
fs code is a kernel buffer, pass through the length of the buffer as well.
Since the length of the userspace buffer isn't (yet) passed through the mount
system call, add a field to the vfsops structure containing the default length.
Split sys_mount() for calls from compat code.
Ride one of the recent kernel version changes - old fs LKMs will load, but
sys_mount() will reject any attempt to use them.
 1.177 29-Apr-2007  yamt don't forget to destroy mutex and condvar.
 1.176 29-Apr-2007  yamt use condvar.
 1.175 29-Apr-2007  yamt use mutex and condvar.
 1.174 04-Mar-2007  christos branches: 1.174.2; 1.174.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.173 22-Feb-2007  thorpej TRUE -> true, FALSE -> false
 1.172 15-Feb-2007  yamt branches: 1.172.2;
use mutex and rwlock rather than lockmgr.
 1.171 19-Jan-2007  hannken New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).
 1.170 27-Dec-2006  yamt - remove the rest of nqnfs.
- reject NFSMNT_MNTD and NFSMNT_KERB. (no users in tree.)
 1.169 27-Dec-2006  yamt remove nqnfs.
 1.168 09-Nov-2006  yamt remove some __unused in function parameters.
 1.167 25-Oct-2006  reinoud Revisit mnt_vnodelist TAILQ patch. Remove all suspicious TAILQ_FOREACH()
loops where vnodes can get removed or added during the loops. This could
lead to panics on unmount since nodes are skipped or otherwise
TAILQ_NEXT(0xdeadbeef, ...) was dereferenced.
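
The hazard is that a plain TAILQ_FOREACH() reads the current element's link after the loop body has run, which is invalid if the body removed or freed that element. A single-threaded user-space sketch of the safe pattern follows; `struct node` and `remove_even()` are hypothetical, and the kernel case additionally involves the list changing while a thread sleeps:

```c
#include <stdlib.h>
#include <sys/queue.h>

struct node {
	int id;
	TAILQ_ENTRY(node) link;
};
TAILQ_HEAD(nodelist, node);

/* Remove and free every node with an even id; return how many went. */
static int
remove_even(struct nodelist *head)
{
	struct node *np, *next;
	int removed = 0;

	/* Fetch the successor before the current node can be freed. */
	for (np = TAILQ_FIRST(head); np != NULL; np = next) {
		next = TAILQ_NEXT(np, link);
		if (np->id % 2 == 0) {
			TAILQ_REMOVE(head, np, link);
			free(np);
			removed++;
		}
	}
	return removed;
}
```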
 1.166 20-Oct-2006  reinoud Replace the LIST structure mp->mnt_vnodelist with a TAILQ structure since all
vnodes were synced and processed backwards. This meant that the last
accessed node was processed first and the earliest last.

An extra benefit is the removal of the ugly hack from the Berkeley days on
LFS.

In the process, I've also replaced the various hand-written loops
with the TAILQ_FOREACH() macro.
 1.165 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.164 02-Sep-2006  yamt branches: 1.164.2; 1.164.4;
nfs_fhtovp: try to detect stale or invalid handles by issuing VOP_GETATTR.
 1.163 02-Sep-2006  yamt implement vptofh and fhtovp for nfs.
 1.162 02-Sep-2006  christos fix default type decls
fix incomplete initializer
 1.161 24-Aug-2006  christos Don't free what we did not allocate.
 1.160 23-Aug-2006  christos Change iostat_alloc() to take the parent pointer and the name directly, so
that callers are not responsible for initializing the fields. Store the name
inside the struct instead of maintaining a pointer to external storage, or
leaked memory (nfs case).
 1.159 23-Jul-2006  ad Use the LWP cached credentials where sane.
 1.158 13-Jul-2006  martin Fix alignment problems for fhandle_t, exposed by gcc4.1.

While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ,
version the getfh(2) syscall and explicitly pass the size available in
the filehandle from userland.

Discussed on tech-kern, with lots of help from yamt (thanks!).
 1.157 07-Jun-2006  kardel branches: 1.157.2;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.156 20-May-2006  yamt mountnfs: reject wrongly-sized filehandle for nfsv2.
 1.155 14-May-2006  elad branches: 1.155.2;
integrate kauth.
 1.154 20-Apr-2006  blymn Prefix iostat structure elements with io_
 1.153 14-Apr-2006  blymn Make i/o statistics collection more generic, include tape drives and
nfs mounts in the set of devices that statistics will be reported on.
 1.152 21-Feb-2006  thorpej branches: 1.152.2; 1.152.4; 1.152.6;
Use device_class() instead of accessing dv_class directly.
 1.151 11-Dec-2005  christos branches: 1.151.2; 1.151.4; 1.151.6;
merge ktrace-lwp.
 1.150 23-Sep-2005  jmmv Apply the NFS exports list rototill patch:

- Remove all NFS related stuff from file system specific code.
- Drop the vfs_checkexp hook and generalize it in the new nfs_check_export
function, thus removing redundancy from all file systems.
- Move all NFS export-related stuff from kern/vfs_subr.c to the new
file sys/nfs/nfs_export.c. The former was becoming large and its code
is always compiled, regardless of the build options. Using the latter,
the code is only compiled in when NFSSERVER is enabled. While doing this,
also make some functions in nfs_subs.c conditional to NFSSERVER.
- Add a new command in nfssvc(2), called NFSSVC_SETEXPORTSLIST, that takes a
path and a set of export entries. At the moment it can only clear the
exports list or append entries, one by one, but it is done in a way that
allows setting the whole set of entries atomically in the future (see the
comment in mountd_set_exports_list or in doc/TODO).
- Change mountd(8) to use the nfssvc(2) system call instead of mount(2) so
that it becomes file system agnostic. In fact, all this whole thing was
done to remove a 'XXX' block from this utility!
- Change the mount*, newfs and fsck* userland utilities to not deal with NFS
exports initialization; done internally by the kernel when initializing
the NFS support for each file system.
- Implement an interface for VFS (called VFS hooks) so that several kernel
subsystems can run arbitrary code upon receipt of specific VFS events.
At the moment, this only provides support for unmount and is used to
destroy NFS exports lists from the file systems being unmounted, though it
has room for extension.

Thanks go to yamt@, chs@, thorpej@, wrstuden@ and others for their comments
and advice in the development of this patch.
 1.149 19-Sep-2005  christos ATTRTIMEO takes 2 args.
 1.148 09-Jun-2005  atatat branches: 1.148.2;
Properly fix the constipated lossage wrt -Wcast-qual and the sysctl
code. I know it's not the prettiest code, but it seems to work rather
well in spite of itself.
 1.147 29-May-2005  christos - sprinkle const
- avoid shadowed variables
- mark bad const use with XXXUNCONST
 1.146 29-Mar-2005  thorpej - Define a VFS_ATTACH() macro that places a reference to a vfsops structure
into the "vfsops" link set.
- Use VFS_ATTACH() where vfsops are declared for individual file systems.
- In vfsinit(), traverse the "vfsops" link set, rather than vfs_list_initial[].
 1.145 26-Feb-2005  perry branches: 1.145.2;
nuke trailing whitespace
 1.144 02-Jan-2005  thorpej branches: 1.144.2; 1.144.4;
Add the system call and VFS infrastructure for file system extended
attributes.

From FreeBSD.
 1.143 15-Aug-2004  mycroft Fixing age old cruft:
* Rather than using mnt_maxsymlinklen to indicate that a file system returns
d_type fields(!), add a new internal flag, IMNT_DTYPE.

Add 3 new elements to ufsmount:
* um_maxsymlinklen, replaces mnt_maxsymlinklen (which never should have existed
in the first place).
* um_dirblksiz, which tracks the current directory block size, eliminating the
FS-specific checks littered throughout the code. This may be used later to
make the block size variable.
* um_maxfilesize, which is the maximum file size, possibly adjusted lower due
to implementation issues.

Sync some bug fixes from FFS into ext2fs, particularly:
* ffs_lookup.c 1.21, 1.28, 1.33, 1.48
* ffs_inode.c 1.43, 1.44, 1.45, 1.66, 1.67
* ffs_vnops.c 1.84, 1.85, 1.86

Clean up some crappy pointer frobnication.
 1.142 12-Jul-2004  yamt nfs_fsinfo: when changing rsize/wsize,
keep mnt_fs_bshift in-sync. otherwise genfs_getpages behaves badly.
 1.141 05-Jul-2004  pk Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().
 1.140 25-May-2004  hannken Add ffs internal snapshots. Written by Marshall Kirk McKusick for FreeBSD.

- Not enabled by default. Needs kernel option FFS_SNAPSHOT.
- Change parameters of ffs_blkfree.
- Let the copy-on-write functions return an error so spec_strategy
may fail if the copy-on-write fails.
- Change genfs_*lock*() to use vp->v_vnlock instead of &vp->v_lock.
- Add flag B_METAONLY to VOP_BALLOC to return indirect block buffer.
- Add a function ffs_checkfreefile needed for snapshot creation.
- Add special handling of snapshot files:
Snapshots may not be opened for writing and the attributes are read-only.
Use the mtime as the time this snapshot was taken.
Deny mtime updates for snapshot files.
- Add function transferlockers to transfer any waiting processes from
one lock to another.
- Add vfsop VFS_SNAPSHOT to take a snapshot and make it accessible through
a vnode.
- Add snapshot support to ls, fsck_ffs and dump.

Welcome to 2.0F.

Approved by: Jason R. Thorpe <thorpej@netbsd.org>
 1.139 25-May-2004  atatat Sysctl descriptions under vfs subtree
 1.138 22-May-2004  jonathan Eliminate several uses of `curproc' from the socket-layer code and from NFS.

Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded by Matt Thomas.

Changes to soreceive() and (*dom->dom_externalize)() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize. Also, now soreceive() uses
its uio->uio_procp value, pass that same value downward to
(*pr->pru_usrreq)() calls for consistency, instead of (struct proc *)0.

Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req-> r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.

Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
 1.137 27-Apr-2004  jrf First pass for some caddr_t removal and changes to get rid of it where we
no longer use and/or need it

- removed casts from unionfs, deadfs and fdesc
(there are more to hunt down still)
- changed vfs_quotactl args argument from caddr_t to void *
- changed vfs_quotactl structures/callers to reflect the api change

Compiled fine and ran for about a day. Approved/reviewed by
christos@netbsd.org and gimpy@netbsd.org.
 1.136 21-Apr-2004  christos Replace the statfs() family of system calls with statvfs().
Retain binary compatibility.
 1.135 24-Mar-2004  atatat branches: 1.135.2;
Tango on sysctl_createv() and flags. The flags have all been renamed,
and sysctl_createv() now uses more arguments.
 1.134 04-Dec-2003  atatat Dynamic sysctl.

Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.

Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded. Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.

All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.

PS - I'm sorry, but there's a distinct lack of documentation at the
moment. I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
 1.133 02-Oct-2003  itojun plug mbuf leak due to manual mbuf handling. PR kern/13807.
(martti confirmed that it stabilizes the situation described in kern/13807)
 1.132 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.131 29-Jun-2003  fvdl branches: 1.131.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.130 29-Jun-2003  thorpej Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.129 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.128 21-May-2003  yamt remove local definitions of TRUE and FALSE.
 1.127 03-May-2003  yamt better handling of write verifier change.
 1.126 24-Apr-2003  drochner Change some subordinate functions to take a "struct nfsnode" argument
instead of "struct vnode". This saves a number of pointer dereferences;
it sums up to about half a kB for me. And it paves the way for future
fixes.
While cleaning up, eliminate a write-only member of "struct nfsreq"
and a pointless assignment in the NFS_V2_ONLY case.
 1.125 16-Apr-2003  christos PR/1796: John Kohl: statfs misbehaves under chrooted environments.

- Under chroot it displays only the visible filesystems with appropriate paths.
- The statfs f_mntonname gets adjusted to contain the real path from root.
- While there, fixed a bug in ext2fs, locking problems with vfs_getfsstat(),
and factored out some of the vfsop statfs() code to copy_statfs_info(). This
fixes the problem where some filesystems forgot to set fsid.
- Made coda look more like a normal fs.
 1.124 02-Apr-2003  yamt use queue manipulation macros.
 1.123 28-Mar-2003  yamt if rsize was explicitly specified by mount_nfs,
prefer it to rtpref from nfsd. the same for wsize and wtpref.

ok'ed by fvdl.
 1.122 26-Feb-2003  matt Add MBUFTRACE kernel option.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
 1.121 01-Feb-2003  thorpej Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.
 1.120 24-Nov-2002  scw Fix an uninitialised variable warning.
 1.119 21-Oct-2002  yamt fix a page locking deadlock problem for nfs.

add a flag that specifies whether the file can be truncated safely or not
to nfsm_loadattr and friends. when it isn't safe, just mark the nfsnode
as "should be truncated later".

ok'ed by Frank van der Linden and Chuck Silvers.
close kern/18036.
 1.118 21-Oct-2002  enami When printing filesystem specific parameters, also print the address and
port of server numerically.
 1.117 01-Oct-2002  christos forgot to set deadthresh; thanks to YAMAMOTO Takashi.
 1.116 21-Sep-2002  christos MNT_GETARGS support
 1.115 30-Jul-2002  soren Die, qaddr_t, die! - mnt_data in struct mount is already effectively
a void *, so stop pretending otherwise.
 1.114 26-Jul-2002  enami Synchronize code and comment again to prevent mbuf leak. Sprinkle some
KNF while I'm here.
 1.113 25-Jul-2002  jdolecek Reduce stack usage on the NFS mount code path. This fixes kernel stack
overflow when using IPsec on vax, as reported by Olaf Seibert on
current-users@.
 1.112 04-Dec-2001  christos branches: 1.112.8; 1.112.10;
PR/14817: Gregory McGarry: NFS_V2_ONLY doesn't seem to work.
 1.111 10-Nov-2001  lukem add RCSIDs
 1.110 08-Oct-2001  chs branches: 1.110.2;
revert a change that I accidentally included with ubcperf.
 1.109 20-Sep-2001  chs fix nfs_bmap() so that it works for both genfs_{get,put}pages() and swap/vnd.
 1.108 15-Sep-2001  chs a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
 1.107 15-Sep-2001  chs add a new VFS op, vfs_reinit, which is called when desiredvnodes is
adjusted via sysctl. file systems that have hash tables which are
sized based on the value of this variable now resize those hash tables
using the new value. the max number of FFS softdeps is also recalculated.

convert various file systems to use the <sys/queue.h> macros for
their hash tables.
 1.106 30-Jul-2001  jdolecek branches: 1.106.2;
Check the passed file handle length _before_, not _after_ copyin()
 1.105 30-Jul-2001  fvdl Check the length of a passed in filehandle to the mount call before
doing a copyin. From Ken Ashcraft @ Stanford via Constantine Sapuntzakis.
 1.104 01-Jul-2001  gmcgarry branches: 1.104.2;
Introduce NFS_DEFAULT_NIOTHREADS to define the default number
of nfs_niothreads instead of hard-coding 4.

This change has the advantage that the default can be specified
at compile time. If the root filesystem is mounted over NFS
we don't have an opportunity to use the syscall to limit the
number of threads. Useful on small-memory machines.
 1.103 30-May-2001  mrg use _KERNEL_OPT
 1.102 28-Apr-2001  bjh21 When NFS_V2_ONLY is defined, refuse to mount NFSv3 and NQNFS filesystems,
rather than pretending they're NFSv2 and hoping for the best. Fix based on
that supplied by Christian Groessler.
 1.101 12-Feb-2001  fvdl branches: 1.101.2;
Instead of storing the filehandle in the mount structure, store the
vnode pointer. This avoids a locking problem with nfs_nget, and
can be done because we always have a reference on the root vnode
of the filesystem.
 1.100 06-Feb-2001  fvdl Do actual vnode locking for NFS.
 1.99 22-Jan-2001  jdolecek make filesystem vnodeop, specop, fifoop and vnodeopv_* arrays const
 1.98 10-Dec-2000  chs in *_sync(), don't skip vnodes which have (potentially dirty) pages.
 1.97 27-Nov-2000  chs Initial integration of the Unified Buffer Cache project.
 1.96 19-Sep-2000  fvdl Update for VOP_FSYNC parameter change.
 1.95 19-Sep-2000  bjh21 New kernel option, NFS_V2_ONLY, which aims to reduce the NFS client to just
that required to support NFSv2 mounts. Not finished yet, but already
provides some 44k of saving in code size on arm26. More savings, and some
documentation, are still to come.
 1.94 23-Aug-2000  enami Update nfs mount flags correctly. Fixes a bug introduced in rev. 1.65.
 1.93 30-Jul-2000  simonb Remove inclusion of <uvm/uvm_extern.h> that was there only to keep
<sys/sysctl.h> happy.
 1.92 27-Jun-2000  mrg remove include of <vm/vm.h>
 1.91 10-Jun-2000  assar branches: 1.91.2;
make vfs_getnewfsid only take one argument and fetch the name of the
filesystem from the supplied mount argument. also make makefstype
take a const parameter. update all the callers.
 1.90 07-May-2000  tsarna branches: 1.90.2;
Auto-adjusting vfs.nfs.iothreads: when mounting the first nfs
filesystem, if the number of threads is "-1", meaning it's never been
set, then set it to 4. You can override by setting this to some other
number (including 0) before or after mounting, of course.

Thanks to whoever it was that suggested this on ICB... sorry I don't
remember who.
 1.89 15-Apr-2000  tsarna Death to nfsiod!

It is replaced by kernel threads that do the same thing. The number of
kernel threads used is set with the vfs.nfs.iothreads sysctl.
 1.88 30-Mar-2000  augustss Remove register declarations.
 1.87 29-Mar-2000  simonb Don't need to include <sys/conf.h> here.
 1.86 16-Mar-2000  jdolecek Add new VFS op routine - vfs_done and call it on filesystem detach
in vfs_detach(). vfs_done may free global filesystem's resources,
typically those allocated in respective filesystem's init function.
Needed so those filesystems which went in via LKM have a chance to
clean after themselves before unloading. This fixes random panics
when LKM for filesystem using pools was loaded and unloaded several
times.

For each leaf filesystem, add appropriate vfs_done routine.
 1.85 15-Nov-1999  fvdl Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O
 1.84 29-Aug-1999  sommerfeld branches: 1.84.2; 1.84.4; 1.84.8;
Once the mount structure is definitely doomed, always set the
NFSMNT_DISMNT bit in it so that any waiters can go away cleanly.
(formerly, we did this only in the NQNFS/KERB cases).
 1.83 06-Mar-1999  fair branches: 1.83.2; 1.83.4;
Snatch a patch from OpenBSD to fix PRs 6529 and 7074.
Adjust fxdr_hyper() and txdr_hyper() macros.
 1.82 05-Mar-1999  mycroft Clean up some sign extension bogosity in statfs, so negative numbers are
actually negative on a LP64 client.
 1.81 26-Feb-1999  wrstuden Modify vfsops to separate vfs_fhtovp() into two routines. vfs_fhtovp() now
only handles the file handle to vnode conversion, and a new call,
vfs_checkexp(), performs the export verification.
 1.80 21-Feb-1999  drochner -call nfs_boot_cleanup() if mount failed
-g/c diskless swap initialization
 1.79 12-Nov-1998  fvdl Use different names for the "nfscon" label to tsleep(), so that it can
be seen in which one a process is sleeping.
 1.78 28-Sep-1998  drochner Use the "atime" instead of "mtime" of the remote root directory as
base for inittodr() - it is closer to the current time.
 1.77 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.76 05-Jul-1998  jonathan * defopt COMPAT_{09,10,11,12,13} and COMPAT_NOMID.
TODO: revisit interaction between native compat and emul compat usage.
 1.75 24-Jun-1998  sommerfe Always include fifos; "not an option any more".
 1.74 22-Jun-1998  sommerfe defopt for options FIFO
 1.73 05-Jun-1998  kleink Convert fsync vnode operator implementations and usage from the old `waitfor'
argument and MNT_WAIT/MNT_NOWAIT to `flags' and FSYNC_WAIT.
 1.72 24-Mar-1998  fvdl Re-instate call to "safe" disconnect function that got lost during the
Lite2 merge.
 1.71 03-Mar-1998  thorpej Historical practice assumes that NFS root mounts are initially read/write.
 1.70 03-Mar-1998  fvdl Don't try to apply the cookie endian heuristic on a mounted file (e.g.
a swapfile). From Matthias Drochner.
 1.69 01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.68 18-Feb-1998  thorpej Place a pointer to an array of our vnodeopv_desc *'s in our vfsops
structure, for use by vfs_attach().
 1.67 30-Jan-1998  fvdl Only take the receive lock before disconnecting when doing it from
nfs_decode_args. Otherwise we might just end up locking against ourselves.

XXX workaround, will do ok for now. Proper fix forthcoming.
 1.66 19-Oct-1997  fvdl branches: 1.66.2;
* Implement optional 32 <-> 64 bit directory cookie translation. This uses
the directory cache as translation table. See nfs_subs.c for comments.
Makes the code a bit more complex to look at than I would have liked,
but doesn't affect the speed of the default behavior.
* Optimize caching behavior a bit when buffers are invalidated.
* Save some RPCs in readdir operations by not bothering if there is
a small amount left to do to fill the buffer. It'll be done in the
next RPC with a larger chunk anyway. Wastes a bit of buffer space
but is faster.
* Make n_vattr an allocated vattr struct. This avoids nfsnode bloat,
and is friendlier to the malloc routines.
 1.65 10-Oct-1997  fvdl * New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.64 09-Sep-1997  gwr Move the call to nfs_boot_getfh() from nfs_vfsops.c to nfs_boot.c
(just for better isolation - it can now be static)
 1.63 29-Aug-1997  gwr Supporting changes for the new BOOTP support in nfs_mountroot.
 1.62 18-Jul-1997  christos branches: 1.62.2;
Fix reversed test for version 3 that broke nfs version 2 mounts.
 1.61 17-Jul-1997  fvdl * Deal with servers that don't give complete FSINFO (like NT)
From Olaf Seibert <rhialto@polder.ubc.kun.nl> (PR 3687)
* Make an attempt to check the maximum filesize before attempting
a write to the server, as write RPCs will typically happen
asynchronously, and the process will not see the error.
Fixes problems with unexpectedly truncated files at 4G
* Pass up errors in nfs_writerpc correctly
 1.60 12-Jun-1997  mrg remove swap configuration.
 1.59 27-May-1997  gwr Minor reorganization of nfs_mountroot code to simplify BOOTP support.
The RPC/bootparamd calls to get the root and swap paths are now done
in nfs_boot_init() instead of nfs_boot_getfh(), so the latter now just
does the RPC/mountd call. Also changed some panics into error returns.
 1.58 22-Feb-1997  fvdl Silently clear NFSMNT_NOCONN if it's a TCP mount.
 1.57 04-Feb-1997  fvdl branches: 1.57.2; 1.57.4;
* Make sure a new socket is created when switching to/from NOCONN with
a mount
* Add extra printf statements to hopefully get some more info on lockups,
specifically when a send error is ignored.
 1.56 31-Jan-1997  thorpej - Add nfs_mountroot to nfs_vfsops.
- Only attempt to mount NFS root on a DV_IFNET class device.
- If nfs_boot_init() fails, return the error code to the caller.
 1.55 22-Dec-1996  cgd branches: 1.55.2;
Change the second and third args to struct vfsops' (*vfs_mount)() to
'const char *', and 'void *', respectively. The second arg is taken directly
from user arguments, and is const there, so must be const in the prototypes
and functions. The third arg is also taken directly from user arguments.
It doesn't have to be changed, but since it's cleaner to keep the type
the same as the user arg's type, and I'm already making the 'const char *'
change...
 1.54 03-Dec-1996  thorpej Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.
 1.53 02-Dec-1996  thorpej NFS performance improvement from Doug Rabson/FreeBSD:

Improve the queuing algorithms used by NFS' asynchronous i/o. The
existing mechanism uses a global queue for some buffers and the
vp->b_dirtyblkhd queue for others. This turns sequential writes into
randomly ordered writes to the server, affecting both read and write
performance. The existing mechanism also copes badly with hung
servers, tending to block accesses to other servers when all the iods
are waiting for a hung server.

The new mechanism uses a queue for each mount point. All asynchronous
i/o goes through this queue which preserves the ordering of requests.
A simple mechanism ensures that the iods are shared out fairly between
active mount points.

Reviewed/integrated/approved by Frank van der Linden <fvdl@netbsd.org>
 1.52 20-Oct-1996  fvdl Enhancements from Matthias Drochner:
- Try V3 first for diskless booting. Fall back to V2 if V3 fails.
- optionally (option NFS_BOOT_TCP) try a TCP mount first
for diskless booting. Fall back to UDP if it fails.
- Enable switching between UDP and TCP for remounts.
 1.51 13-Oct-1996  christos revert kprintf changes
 1.50 10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.49 24-Jun-1996  pk Ignore the mountpoint's `v_usecount' in nfs_unmount() if MNT_FORCE is on.
This takes care of two related problems:
- `umount -f' wouldn't work if someone's working directory is
the filesystem root.
- vfs_unmountall() would complain about a busy `/' on a
diskless setup.
 1.48 14-Jun-1996  cgd avoid unnecessary checks of m_get/MGET/etc.'s return values. When
they're called with M_WAIT, they are defined to never return NULL.
 1.47 23-May-1996  fvdl * Make mounts with symlinks work (needed for direct mounts with amd). PR #1917
* Never change the NQNFS flag and/or version when just doing an update mount.
Fixes a problem that made diskless booting impossible under some
circumstances.
 1.46 24-Mar-1996  fvdl branches: 1.46.4;
Return earlier on error in nfs_statfs. Should fix problem reported by
both mrg and cgd.
 1.45 17-Mar-1996  christos Fix printf format strings.
 1.44 13-Mar-1996  fvdl Make readdirsize default to rsize if rsize is explicitly specified,
and readdirsize isn't.
 1.43 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.42 13-Feb-1996  gwr Do the RPC to bootparamd a little later (just before the mountd call)
so that we do not ask for the "swap" path when swapping on disk.
 1.41 09-Feb-1996  christos nfs prototype changes
 1.40 01-Feb-1996  jtc Rename struct timespec fields to conform to POSIX.1b
 1.39 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.38 13-Aug-1995  mycroft splnet --> splsoftnet
 1.37 18-Jun-1995  cgd don't assume the f_fsnamelen is nul-truncated or longer than MFSNAMELEN
 1.36 02-Jun-1995  mycroft Fix more off by one errors.
 1.35 18-Mar-1995  gwr Print the "root/swap on ..." messages here.
Add NFS_BOOT_OPTIONS for things like NFSMNT_NOCONN.
 1.34 09-Mar-1995  mycroft copy*str() should use size_t.
 1.33 18-Jan-1995  mycroft Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.
 1.32 23-Aug-1994  pk branches: 1.32.2;
When updating an NFS mountpoint, we cannot just increase `rsize' or `wsize'
without also adjusting the corresponding socket buffers. We could probably
call sbrelease/sbreserve/soreserve ourselves without much harm, but we'd
have to duplicate much of the logic in nfs_connect(). In stead, blow the
socket away entirely and let nfs_connect() do its job again.
 1.31 18-Aug-1994  mycroft More LIST/CIRCLEQ migration.
 1.30 14-Aug-1994  gwr Add the option NFS_BOOT_RWSIZE to allow diskless boot configuration
to start with a reduced NFS read and write size (needed for wd8003).
 1.29 12-Aug-1994  cgd kill two errant spaces.
 1.28 11-Aug-1994  gwr Diskless boot will now bind the local socket to a reserved port to
satisfy picky servers. Also fix some missing initializations.
(Thanks to Chuck Cranor for PR#394 -- now fixed.)
 1.27 03-Jul-1994  mycroft branches: 1.27.2;
Save FS type at mount time for some later tests.
 1.26 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.25 28-Jun-1994  gwr Minor nits:
replace p->p_cred->pc_ucred with p->p_ucred
replace x / DEV_BSIZE with x >> DEV_BSHIFT
 1.24 22-Jun-1994  pk straighten out diskless swap code somewhat.
 1.23 14-Jun-1994  gwr Fix false "hits" in the attribute cache when booting diskless.
(Yet another thing that breaks when time.tv_sec is near zero...)
 1.22 13-Jun-1994  gwr New diskless boot code (uses RARP, bootparamd).
 1.21 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.20 18-May-1994  cgd put sync printing in one place
 1.19 13-May-1994  mycroft Trivial function name change.
 1.18 11-May-1994  mycroft Cast some args to caddr_t.
 1.17 23-Apr-1994  cgd make fs types consistent over new kernels. also, some proto foo.
 1.16 21-Apr-1994  cgd Convert mount, vnode, and buf structs to use <sys/queue.h>. Also,
some knf and structure frobbing to do along with it.
 1.15 18-Apr-1994  glass revised nfs diskless support. uses bootp+rpc to gather parameters
 1.14 14-Apr-1994  cgd fs types are names now.
 1.13 10-Apr-1994  cgd make damn sure nothing's holding on the the mount point vnode
 1.12 31-Mar-1994  glass make panic string unique
 1.11 21-Dec-1993  cgd oops; fix last
 1.10 21-Dec-1993  cgd from jsp: Changed to get attributes of root node and
generate correct type, rather than assuming it's a directory.
This allows Amd direct mounts to work correctly.
 1.9 18-Dec-1993  mycroft Canonicalize all #includes.
 1.8 07-Dec-1993  pk Exclusive access when manipulating flag field in mount structure.
 1.7 07-Dec-1993  pk Don't allow the NFS_LOCKBITS to be set or reset from user land.
Allow other flags (SOFT,HARD,SPONGY, etc) to be altered by `mount -u'.
 1.6 06-Dec-1993  pk Allow changing of various NFS parameters by using `mount -u ...'.
 1.5 19-Nov-1993  cgd patch from Ukai Fumitoshi <ukai@kmc.kyoto-u.ac.jp>
to do the right thing with NFS fsid's and getnewfsid()
 1.4 13-Jul-1993  cgd branches: 1.4.4;
diskless changes made last time were hosed; were using NULL for
"no credentials" rather than NOCRED.
 1.3 07-Jul-1993  cgd changes from ws to support diskless booting... these are "OK" on inspection
and after testing... (actually, currently, none of the changed
code is even used...)
 1.2 20-May-1993  cgd more rcs id adding and header cleanup. i like vi macros!
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.3 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.4.4.5 21-Dec-1993  cgd update from trunk
 1.4.4.4 21-Dec-1993  cgd update from trunk
 1.4.4.3 20-Nov-1993  cgd update from trunk
 1.4.4.2 14-Nov-1993  mycroft Canonicalize all #includes.
 1.4.4.1 24-Sep-1993  mycroft Make all files using spl*() #include cpu.h. Changes from trunk.
nfs_vfsops.c, nfsmount.h: Make nfs_quotactl() take an int rather than a uid_t,
as it might be -1.
nfs_vnops.c: va_size and va_bytes are now quads.
 1.27.2.4 19-Aug-1994  mycroft update from trunk
 1.27.2.3 14-Aug-1994  mycroft update from trunk
 1.27.2.2 12-Aug-1994  mycroft update from trunk
 1.27.2.1 11-Aug-1994  mycroft update from trunk
 1.32.2.2 23-Aug-1994  pk When updating an NFS mountpoint, we cannot just increase `rsize' or `wsize'
without also adjusting the corresponding socket buffers. We could probably
call sbrelease/sbreserve/soreserve ourselves without much harm, but we'd
have to duplicate much of the logic in nfs_connect(). Instead, blow the
socket away entirely and let nfs_connect() do its job again.
 1.32.2.1 23-Aug-1994  pk file nfs_vfsops.c was added on branch netbsd-1-0 on 1994-08-23 09:31:01 +0000
 1.46.4.2 11-Dec-1996  mycroft From trunk:
Ignore reference count when using MNT_FORCE.
 1.46.4.1 25-May-1996  fvdl Pull in bugfixes from main branch.
 1.55.2.1 14-Jan-1997  thorpej Snapshot of work-in-progress, committed to private branch.

These changes implement machine-independent root device and file system
selection. Notable features:

- All ports behave in a consistent manner regarding root
device selection.
- No more "options GENERIC"; all kernels have the ability
to boot with RB_ASKNAME to select root device and file system
type.
- Root file system type can be wildcarded; a machine-independent
function will try all possible file systems for the selected
root device until one succeeds.
- If the root file system fails to mount, the operator will
be given the chance to select a new root device and file
system type, rather than having the machine simply panic.
- nfs_mountroot() no longer panics if any part of the NFS
mount process fails; it now returns an error, giving the
operator a chance to recover.
- New, more consistent, config(8) grammar. The constructs:

config netbsd swap generic
config netbsd root on nfs

have been replaced with:

config netbsd root on ? type ?
config netbsd root on ? type nfs

Additionally, the operator may select or wildcard root file
system type in the kernel configuration file:

config netbsd root on cd0a type cd9660

config(8) now requires that a "root" specification be
made. "root" may be wired down or wildcarded. "swap" and
"dump" specifications are optional, and follow previous
semantics.

- config(8) has a new "file-system" keyword, used to configure
file systems into the kernel. Eventually, this will be used
to generate the default vfssw[].

- "options NFSCLIENT" is obsolete, and is replaced by
"file-system NFS". "options NFSSERVER" still exists, since
NFS server support is independent of the NFS file system
client.

- sys/arch/<foo>/<foo>/swapgeneric.c is no longer used, and
will be removed; all information is now generated by config(8).

As of this commit, all ports except arm32 have been updated to use
the new setroot(). Only SPARC, i386, and Alpha ports have been
tested at this time. Port masters should test these changes on their
ports, and report any problems back to me.

More changes are on their way, including RB_ASKNAME support in
nfs_mountroot() (to prompt for server address and path) and, potentially,
the ability to select rarp/bootparam or bootp in nfs_mountroot().
 1.57.4.1 02-Mar-1997  mrg swap configuration is no longer done at boot time.
 1.57.2.1 12-Mar-1997  is Merge in changes from Trunk
 1.62.2.3 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.62.2.2 16-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.62.2.1 01-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.66.2.1 07-Feb-1998  mellon Pull up 1.67 (fvdl)
 1.83.4.1 04-Jul-1999  chs initialize new struct mount fields in nfs_mountfs().
 1.83.2.1 05-Nov-1999  cgd pull up rev 1.84 from trunk (requested by fvdl):
Avoid a panic when forcibly unmounting a hung NFS mount, e.g. at
reboot.
 1.84.8.2 27-Dec-1999  wrstuden Pull up to last week's -current.
 1.84.8.1 21-Dec-1999  wrstuden Initial commit of recent changes to make DEV_BSIZE go away.

Runs on i386, needs work on other arch's. Main kernel routines should be
fine, but a number of the stand programs need help.

cd, fd, ccd, wd, and sd have been updated. sd has been tested with non-512
byte block devices. vnd, raidframe, and lfs need work.

Non 2**n block support is automatic for LKM's and conditional for kernels
on "options NON_PO2_BLOCKS".
 1.84.4.1 19-Oct-1999  fvdl Bring in Kirk McKusick's FFS softdep code on a branch.
 1.84.2.5 12-Mar-2001  bouyer Sync with HEAD.
 1.84.2.4 11-Feb-2001  bouyer Sync with HEAD.
 1.84.2.3 13-Dec-2000  bouyer Sync with HEAD (for UBC fixes).
 1.84.2.2 08-Dec-2000  bouyer Sync with HEAD.
 1.84.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.90.2.1 22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.91.2.1 14-Dec-2000  he Pull up revision 1.96 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.101.2.13 11-Dec-2002  thorpej Sync with HEAD.
 1.101.2.12 22-Oct-2002  thorpej Sync with HEAD.
 1.101.2.11 18-Oct-2002  nathanw Catch up to -current.
 1.101.2.10 01-Aug-2002  nathanw Catch up to -current.
 1.101.2.9 12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.101.2.8 24-Jun-2002  nathanw Curproc->curlwp renaming.

Change uses of "curproc->l_proc" back to "curproc", which is more like the
original use. Bare uses of "curproc" are now "curlwp".

"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc : NULL)
so that it is always safe to reference curproc (*de*referencing curproc
is another story, but that's always been true).
 1.101.2.7 08-Jan-2002  nathanw Catch up to -current.
 1.101.2.6 14-Nov-2001  nathanw Catch up to -current.
 1.101.2.5 22-Oct-2001  nathanw Catch up to -current.
 1.101.2.4 21-Sep-2001  nathanw Catch up to -current.
 1.101.2.3 24-Aug-2001  nathanw Catch up with -current.
 1.101.2.2 21-Jun-2001  nathanw Catch up to -current.
 1.101.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.104.2.4 10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.104.2.3 06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.104.2.2 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.104.2.1 03-Aug-2001  lukem update to -current
 1.106.2.2 11-Oct-2001  fvdl Catch up with -current. Fix some bogons in the sparc64 kbd/ms
attach code. cd18xx conversion provided by mrg.
 1.106.2.1 01-Oct-2001  fvdl Catch up with -current.
 1.110.2.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.112.10.3 04-Oct-2003  tron Pull up revision 1.133 (requested by martti in ticket #1506):
plug mbuf leak due to manual mbuf handling. PR kern/13807.
(martti confirmed that it stabilizes the situation described in kern/13807)
 1.112.10.2 29-Jul-2002  lukem Pull up revision 1.114 (requested by enami in ticket #555):
Synchronize code and comment again to prevent mbuf leak. Sprinkle some
KNF while I'm here.
 1.112.10.1 29-Jul-2002  lukem Pull up revision 1.113 (requested by jaromir in ticket #555):
Reduce stack usage on the NFS mount code path. This fixes kernel stack
overflow when using IPsec on vax, as reported by Olaf Seibert on
current-users@.
 1.112.8.1 29-Aug-2002  gehenna catch up with -current.
 1.131.2.10 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.131.2.9 01-Apr-2005  skrll Sync with HEAD.
 1.131.2.8 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.131.2.7 17-Jan-2005  skrll Sync with HEAD.
 1.131.2.6 21-Sep-2004  skrll Fix the sync with head I botched.
 1.131.2.5 18-Sep-2004  skrll Sync with HEAD.
 1.131.2.4 25-Aug-2004  skrll Sync with HEAD.
 1.131.2.3 24-Aug-2004  skrll Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.131.2.2 03-Aug-2004  skrll Sync with HEAD
 1.131.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.135.2.1 29-May-2004  tron branches: 1.135.2.1.2;
Pull up revision 1.139 (requested by atatat in ticket #393):
Sysctl descriptions under vfs subtree
 1.135.2.1.2.1 27-Oct-2005  riz Pull up following revision(s) (requested by christos in ticket #5863):
sys/nfs/nfs_subs.c: revision 1.152 via patch
sys/nfs/nfs.h: revision 1.49
sys/nfs/nfs_vfsops.c: revision 1.149 via patch
usr.sbin/amd/include/config.h: revision 1.36
sys/nfs/nfs_vnops.c: revision 1.227 via patch
sys/nfs/nfsmount.h: revision 1.34
Allow the attribute cache to be turned off, and allow amd to do it.
 1.144.4.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.144.2.1 29-Apr-2005  kent sync with -current
 1.145.2.1 27-Sep-2005  tron Pull up following revision(s) (requested by christos in ticket #816):
sys/nfs/nfs_vfsops.c: revision 1.149
sys/nfs/nfs_vnops.c: revision 1.227
ATTRTIMEO takes 2 args.
 1.148.2.12 27-Feb-2008  yamt revert incomplete nfs client locking for now.
 1.148.2.11 27-Feb-2008  yamt sync with head.
 1.148.2.10 15-Feb-2008  yamt - sprinkle some locks.
- disable MNT_UPDATE because it involves too much locking headache.
- don't overwrite other bits in v_vflags when setting VV_ROOT.
 1.148.2.9 04-Feb-2008  yamt sync with head.
 1.148.2.8 21-Jan-2008  yamt sync with head
 1.148.2.7 07-Dec-2007  yamt sync with head
 1.148.2.6 15-Nov-2007  yamt sync with head.
 1.148.2.5 27-Oct-2007  yamt sync with head.
 1.148.2.4 03-Sep-2007  yamt sync with head.
 1.148.2.3 26-Feb-2007  yamt sync with head.
 1.148.2.2 30-Dec-2006  yamt sync with head.
 1.148.2.1 21-Jun-2006  yamt sync with head.
 1.151.6.3 01-Jun-2006  kardel Sync with head.
 1.151.6.2 22-Apr-2006  simonb Sync with head.
 1.151.6.1 04-Feb-2006  simonb In the timecounter case, call tc_setclock() instead of setting
time.tv_sec/tv_nsec directly.
 1.151.4.1 09-Sep-2006  rpaulo sync with head
 1.151.2.1 01-Mar-2006  yamt sync with head.
 1.152.6.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.152.4.5 11-May-2006  elad sync with head
 1.152.4.4 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.152.4.3 19-Apr-2006  elad sync with head.
 1.152.4.2 14-Apr-2006  elad Store real/saved user/group ids too.
 1.152.4.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.152.2.4 03-Sep-2006  yamt sync with head.
 1.152.2.3 11-Aug-2006  yamt sync with head
 1.152.2.2 26-Jun-2006  yamt sync with head.
 1.152.2.1 24-May-2006  yamt sync with head.
 1.155.2.1 19-Jun-2006  chap Sync with head.
 1.157.2.1 13-Jul-2006  gdamore Merge from HEAD.
 1.164.4.2 10-Dec-2006  yamt sync with head.
 1.164.4.1 22-Oct-2006  yamt sync with head
 1.164.2.3 01-Feb-2007  ad Sync with head.
 1.164.2.2 12-Jan-2007  ad Sync with head.
 1.164.2.1 18-Nov-2006  ad Sync with head.
 1.172.2.3 07-May-2007  yamt sync with head.
 1.172.2.2 12-Mar-2007  rmind Sync with HEAD.
 1.172.2.1 28-Feb-2007  yamt sync with head. (somehow missed in the previous)
 1.174.4.1 11-Jul-2007  mjf Sync with head.
 1.174.2.10 25-Oct-2007  ad Fix up mnt_vnodelist handling.
 1.174.2.9 24-Oct-2007  ad Do locking / use marker vnodes when traversing mountpoint vnode lists.
 1.174.2.8 09-Oct-2007  ad Sync with head.
 1.174.2.7 16-Sep-2007  ad Checkpoint work in progress on the vnode lifecycle and reference counting
stuff. This makes it work properly without kernel_lock and fixes a few
quite old bugs. See vfs_subr.c 1.283.2.17 for details.
 1.174.2.6 20-Aug-2007  ad Sync with HEAD.
 1.174.2.5 15-Jul-2007  ad Sync with head.
 1.174.2.4 18-Jun-2007  yamt fix merge botches.
 1.174.2.3 17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.174.2.2 08-Jun-2007  ad Sync with head.
 1.174.2.1 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.179.2.2 10-Sep-2007  skrll Sync with HEAD.
 1.179.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.182.2.6 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.182.2.5 29-Oct-2007  joerg Sync with HEAD.
 1.182.2.4 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.182.2.3 02-Oct-2007  joerg Sync with HEAD.
 1.182.2.2 16-Aug-2007  jmcneill Sync with HEAD.
 1.182.2.1 09-Aug-2007  jmcneill Sync with HEAD.
 1.183.2.2 05-Aug-2007  yamt use kpause rather than lbolt.
 1.183.2.1 05-Aug-2007  yamt file nfs_vfsops.c was added on branch matt-mips64 on 2007-08-05 09:40:41 +0000
 1.184.2.3 23-Mar-2008  matt sync with HEAD
 1.184.2.2 09-Jan-2008  matt sync with HEAD
 1.184.2.1 06-Nov-2007  matt sync with HEAD
 1.185.2.1 14-Oct-2007  yamt sync with head.
 1.186.2.1 13-Nov-2007  bouyer Sync with HEAD
 1.187.2.2 18-Feb-2008  mjf Sync with HEAD.
 1.187.2.1 08-Dec-2007  mjf Sync with HEAD.
 1.188.6.3 23-Jan-2008  bouyer Sync with HEAD.
 1.188.6.2 08-Jan-2008  bouyer Sync with HEAD
 1.188.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.188.2.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.196.10.8 09-Oct-2010  yamt sync with head
 1.196.10.7 11-Aug-2010  yamt sync with head.
 1.196.10.6 11-Mar-2010  yamt sync with head
 1.196.10.5 24-Jun-2009  yamt lock vnode when calling VOP_GETATTR because there's no reasonable way for
an implementation of VOP_GETATTR to prevent the vnode from being revoked.
 1.196.10.4 24-Jun-2009  yamt nfs_mount: re-enable MNT_UPDATE. it's broken as it is in trunk.
 1.196.10.3 04-May-2009  yamt sync with head.
 1.196.10.2 16-May-2008  yamt sync with head.
 1.196.10.1 27-Apr-2008  yamt commit some work-in-progress changes to make nfs client mp-safe to a branch,
so that they won't get lost.
- sprinkle some locking
- mark the filesystem, nfstimer callout, and kq kthread mp-safe
- add assertions and comments
- disable upgrade mount for now
- some unrelated cosmetic changes
 1.196.8.1 18-May-2008  yamt sync with head.
 1.196.6.3 17-Jan-2009  mjf Sync with HEAD.
 1.196.6.2 05-Oct-2008  mjf Sync with HEAD.
 1.196.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.199.2.2 10-Oct-2008  skrll Sync with HEAD.
 1.199.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.200.4.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.200.4.1 19-Oct-2008  haad Sync with HEAD.
 1.203.14.1 28-Apr-2014  sborrill Pull up the following revision(s) (requested by maxv in ticket #1901):
sys/kern/vfs_syscalls.c: revision 1.478, 1.480 via patch
sys/coda/coda_vfsops.c: revision 1.81
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.110 via patch
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59 via patch
sys/fs/udf/udf_vfsops.c: revision 1.67
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/kern/vfs_syscalls.c: revision 1.479
sys/miscfs/nullfs/null_vfsops.c: revision 1.88 via patch
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/nfs/nfs_vfsops.c: revision 1.227
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/ufs/mfs/mfs_vfsops.c: revision 1.107

Due to missing checks in the mount syscall, and a wrong assumption on the
file systems side, the kernel could allocate an unbounded or zero-sized
memory buffer, and could dereference a NULL pointer when particular
arguments are given by a user.
 1.203.10.1 28-Apr-2014  sborrill Pull up the following revision(s) (requested by maxv in ticket #1901):
sys/kern/vfs_syscalls.c: revision 1.478, 1.480 via patch
sys/coda/coda_vfsops.c: revision 1.81
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.110 via patch
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59 via patch
sys/fs/udf/udf_vfsops.c: revision 1.67
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/kern/vfs_syscalls.c: revision 1.479
sys/miscfs/nullfs/null_vfsops.c: revision 1.88 via patch
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/nfs/nfs_vfsops.c: revision 1.227
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/ufs/mfs/mfs_vfsops.c: revision 1.107

Due to missing checks in the mount syscall, and a wrong assumption on the
file systems side, the kernel could allocate an unbounded or zero-sized
memory buffer, and could dereference a NULL pointer when particular
arguments are given by a user.
 1.203.4.1 25-Apr-2014  sborrill Pull up the following revision(s) (requested by maxv in ticket #1901):
sys/kern/vfs_syscalls.c: revision 1.478, 1.480 via patch
sys/coda/coda_vfsops.c: revision 1.81
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.110 via patch
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59 via patch
sys/fs/udf/udf_vfsops.c: revision 1.67
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/kern/vfs_syscalls.c: revision 1.479
sys/miscfs/nullfs/null_vfsops.c: revision 1.88 via patch
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/nfs/nfs_vfsops.c: revision 1.227
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/ufs/mfs/mfs_vfsops.c: revision 1.107

Due to missing checks in the mount syscall, and a wrong assumption on the
file systems side, the kernel could allocate an unbounded or zero-sized
memory buffer, and could dereference a NULL pointer when particular
arguments are given by a user.
 1.203.2.2 28-Apr-2009  skrll Sync with HEAD.
 1.203.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.206.2.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.210.2.2 17-Aug-2010  uebayasi Sync with HEAD.
 1.210.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.211.2.4 05-Mar-2011  rmind sync with head
 1.211.2.3 03-Jul-2010  rmind sync with head
 1.211.2.2 30-May-2010  rmind sync with head
 1.211.2.1 16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.217.6.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.220.16.1 21-Apr-2014  bouyer Pull up following revision(s) (requested by maxv in ticket #1050):
sys/ufs/chfs/chfs_vfsops.c: revision 1.11
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/fs/nilfs/nilfs_vfsops.c: revision 1.16
sys/ufs/mfs/mfs_vfsops.c: revision 1.107
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/kern/vfs_syscalls.c: revision 1.478
sys/kern/vfs_syscalls.c: revision 1.479
sys/fs/puffs/puffs_vfsops.c: revision 1.110
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/nfs/nfs_vfsops.c: revision 1.227
sys/fs/v7fs/v7fs_vfsops.c: revision 1.10
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/miscfs/nullfs/null_vfsops.c: revision 1.88
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50
sys/coda/coda_vfsops.c: revision 1.81
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/kern/vfs_syscalls.c: revision 1.480
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/kern/vfs_syscalls.c: revision 1.482
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vfsops.c: revision 1.12
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/udf/udf_vfsops.c: revision 1.67
Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.
ok christos@
An (un)privileged user can easily make the kernel dereference a NULL
pointer.
The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).
ok christos@
Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.
If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
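The checks described in this pull-up can be sketched in user-space C as follows. This is an illustration only, not the code from vfs_syscalls.c: the names MOUNT_DATA_MAX, mount_buf, copy_mount_data and free_mount_data are hypothetical, the limit value is an assumption, and malloc(3)/free(3) stand in for kmem_alloc(9)/kmem_free(9).

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical upper bound on mount data; the real limit may differ. */
#define MOUNT_DATA_MAX 4096

struct mount_buf {
	void *data;
	size_t alloc_len;	/* length used at allocation; never modified */
};

/*
 * Validate a caller-supplied (data, data_len) pair before allocating.
 * Returns 0 on success or an errno value on failure.
 */
int
copy_mount_data(const void *data, size_t data_len, struct mount_buf *mb)
{
	if (data == NULL)
		return EINVAL;	/* this fs needs data but none was given */
	if (data_len == 0 || data_len > MOUNT_DATA_MAX)
		return EINVAL;	/* reject zero-sized and huge requests */

	mb->data = malloc(data_len);
	if (mb->data == NULL)
		return ENOMEM;
	mb->alloc_len = data_len;
	memcpy(mb->data, data, data_len);
	return 0;
}

/*
 * Release the buffer. With kmem_free(9) the recorded alloc_len would be
 * passed here rather than a length the fs may have changed; free(3)
 * does not take a size.
 */
void
free_mount_data(struct mount_buf *mb)
{
	free(mb->data);
	mb->data = NULL;
	mb->alloc_len = 0;
}
```

Keeping the allocation length in its own field, separate from any length the file system is allowed to modify, is one way to avoid the alloc/free size mismatch the message describes.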
 1.220.14.1 21-Apr-2014  bouyer Pull up following revision(s) (requested by maxv in ticket #1050):
sys/ufs/chfs/chfs_vfsops.c: revision 1.11
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/fs/nilfs/nilfs_vfsops.c: revision 1.16
sys/ufs/mfs/mfs_vfsops.c: revision 1.107
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/kern/vfs_syscalls.c: revision 1.478
sys/kern/vfs_syscalls.c: revision 1.479
sys/fs/puffs/puffs_vfsops.c: revision 1.110
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/nfs/nfs_vfsops.c: revision 1.227
sys/fs/v7fs/v7fs_vfsops.c: revision 1.10
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/miscfs/nullfs/null_vfsops.c: revision 1.88
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50
sys/coda/coda_vfsops.c: revision 1.81
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/kern/vfs_syscalls.c: revision 1.480
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/kern/vfs_syscalls.c: revision 1.482
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vfsops.c: revision 1.12
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/udf/udf_vfsops.c: revision 1.67
Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.
ok christos@
An (un)privileged user can easily make the kernel dereference a NULL
pointer.
The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).
ok christos@
Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.
If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
 1.220.12.3 03-Dec-2017  jdolecek update from HEAD
 1.220.12.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.220.12.1 25-Feb-2013  tls resync with head
 1.220.8.1 21-Apr-2014  bouyer Pull up following revision(s) (requested by maxv in ticket #1050):
sys/ufs/chfs/chfs_vfsops.c: revision 1.11
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/fs/nilfs/nilfs_vfsops.c: revision 1.16
sys/ufs/mfs/mfs_vfsops.c: revision 1.107
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/kern/vfs_syscalls.c: revision 1.478
sys/kern/vfs_syscalls.c: revision 1.479
sys/fs/puffs/puffs_vfsops.c: revision 1.110
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/nfs/nfs_vfsops.c: revision 1.227
sys/fs/v7fs/v7fs_vfsops.c: revision 1.10
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/miscfs/nullfs/null_vfsops.c: revision 1.88
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50
sys/coda/coda_vfsops.c: revision 1.81
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/kern/vfs_syscalls.c: revision 1.480
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/kern/vfs_syscalls.c: revision 1.482
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vfsops.c: revision 1.12
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/udf/udf_vfsops.c: revision 1.67
Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.
ok christos@
An (un)privileged user can easily make the kernel dereference a NULL
pointer.
The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).
ok christos@
Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.
If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
 1.220.2.2 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.220.2.1 23-Jan-2013  yamt sync with head
 1.221.2.1 18-May-2014  rmind sync with head
 1.226.2.1 10-Aug-2014  tls Rebase.
 1.229.4.3 28-Aug-2017  skrll Sync with HEAD
 1.229.4.2 27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.229.4.1 22-Sep-2015  skrll Sync with HEAD
 1.229.2.2 08-Nov-2015  riz Pull up following revision(s) (requested by pgoyette in ticket #1021):
sys/nfs/nfs_vfsops.c: revision 1.231
Don't forget to call nfs_fini() when we're finished. Without this,
we leave a dangling pool nfsrvdescpl around.
 1.229.2.1 04-Nov-2015  riz Pull up following revision(s) (requested by manu in ticket #882):
sbin/umount/umount.c: revision 1.48
sys/nfs/nfsmount.h: revision 1.53
sys/nfs/nfs_var.h: revision 1.94
sys/nfs/nfs_iod.c: revision 1.7
sys/nfs/nfs_socket.c: revision 1.197
sys/nfs/nfs_bio.c: revision 1.191
sys/nfs/nfs_vfsops.c: revision 1.230
sys/nfs/nfs_clntsocket.c: revision 1.3
Remove useless and harmful sync(2) call in umount(8)
Remove sync(2) call before unmount(2) in umount(8). This sync(2) is useless
since unmount(2) will perform a VFS_SYNC anyway.
But moreover, this sync(2) may be harmful, as there are some situations where
it cannot return (unreachable NFS server, for instance), causing umount -f
to be ineffective.
Fix soft NFS force unmount
For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that await an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.
Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.
Reviewed by Chuck Silvers.
 1.231.4.1 21-Apr-2017  bouyer Sync with HEAD
 1.231.2.2 26-Apr-2017  pgoyette Sync with HEAD
 1.231.2.1 20-Mar-2017  pgoyette Sync with HEAD
 1.235.10.2 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.235.10.1 22-Mar-2018  pgoyette Synch with HEAD, resolve conflicts
 1.236.2.3 21-Apr-2020  martin Sync with HEAD
 1.236.2.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.236.2.1 10-Jun-2019  christos Sync with HEAD
 1.237.6.2 29-Feb-2020  ad Sync with head.
 1.237.6.1 17-Jan-2020  ad Sync with head.
 1.237.4.1 04-May-2022  martin Pull up following revision(s) (requested by gavan in ticket #1441):

sys/nfs/nfs_vfsops.c: revision 1.243

Don't pretend that files are limited to 1TB on NFSv3.
 1.240.2.1 20-Apr-2020  bouyer Sync with HEAD
 1.241.4.1 03-Apr-2021  thorpej Sync with HEAD.
 1.241.2.1 03-Apr-2021  thorpej Sync with HEAD.
 1.242.2.1 17-Jun-2021  thorpej Sync w/ HEAD.
 1.243.10.2 20-Sep-2024  martin Pull up following revision(s) (requested by rin in ticket #880):

sys/nfs/nfs_iod.c: revision 1.9
sys/nfs/nfs_vfsops.c: revision 1.245
sys/nfs/nfs_clntsubs.c: revision 1.7

PR/57279: Izumi Tsutsui: Fix some {int,long} -> time_t. Still things will
break eventually because parts of the nfs protocol assume time_t will fit
in 32 bits.
 1.243.10.1 20-Sep-2024  martin Pull up following revision(s) (requested by rin in ticket #879):

sys/nfs/nfs_vfsops.c: revision 1.244

Avoid overflow of nfs_commitsize on machines with > 32GB RAM.
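A rough sketch of how such an overflow can arise, assuming the commit size is derived as one sixteenth of physical memory from a page count and stored in a 32-bit signed int. The 4 KiB page size, the function name and the clamping strategy are all assumptions made for illustration; the actual change in revision 1.244 may differ. With 32 GB of RAM and 4 KiB pages there are 2^23 pages, and shifting that left by 8 gives 2^31, one more than INT_MAX.

```c
#include <limits.h>
#include <stdint.h>

#define PAGE_SHIFT_ASSUMED 12	/* assumption: 4 KiB pages */

/*
 * Compute (npages * page size) / 16 in 64-bit arithmetic and clamp the
 * result to INT_MAX so it always fits a 32-bit signed variable.
 */
int
commitsize_clamped(uint64_t npages)
{
	uint64_t bytes = npages << (PAGE_SHIFT_ASSUMED - 4);

	return bytes > (uint64_t)INT_MAX ? INT_MAX : (int)bytes;
}
```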
 1.325 10-Dec-2023  schmonz NFS client: fix interop with macOS 14 servers.

Symptom: a bunch of "Cannot open `.' (Invalid argument)".

thorpej@ analysis and fix: on the first request to read a given
directory, make sure READDIR and READDIRPLUS cookie verifiers are
being set to 0. This is in RFC1813 and macOS must have gotten
stricter about it.

Verified on 10.0_RC1/aarch64 to fix the reproducers in PR kern/57691 as
well as the original use case in which I met the bug: pkg_rr once again
runs to completion.
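The rule from RFC 1813 can be sketched as below. The struct and function names are hypothetical and much simpler than the real client code; only the 8-byte verifier size comes from the RFC.

```c
#include <stdint.h>
#include <string.h>

#define NFS3_COOKIEVERFSIZE 8	/* size of cookieverf3 in RFC 1813 */

/* Hypothetical, simplified READDIR argument block. */
struct readdir_args {
	uint64_t cookie;
	unsigned char cookieverf[NFS3_COOKIEVERFSIZE];
};

/*
 * Fill in the cookie and verifier for a READDIR/READDIRPLUS request.
 * A cookie of 0 means "start of directory"; in that case the verifier
 * is zeroed instead of reusing whatever an earlier reply left behind.
 */
void
readdir_args_init(struct readdir_args *args, uint64_t cookie,
    const unsigned char saved_verf[NFS3_COOKIEVERFSIZE])
{
	args->cookie = cookie;
	if (cookie == 0)
		memset(args->cookieverf, 0, sizeof(args->cookieverf));
	else
		memcpy(args->cookieverf, saved_verf, sizeof(args->cookieverf));
}
```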
 1.324 24-May-2022  andvar branches: 1.324.4;
fix various typos in comments, docs and log messages.
 1.323 30-Mar-2022  christos restructure so we abort/unlock properly on failure.
 1.322 27-Mar-2022  christos add a kauth vnode check for creating links
 1.321 20-Oct-2021  thorpej Overhaul of the EVFILT_VNODE kevent(2) filter:

- Centralize vnode kevent handling in the VOP_*() wrappers, rather than
forcing each individual file system to deal with it (except VOP_RENAME(),
because VOP_RENAME() is a mess and we currently have 2 different ways
of handling it; at least it's reasonably well-centralized in the "new"
way).
- Add support for NOTE_OPEN, NOTE_CLOSE, NOTE_CLOSE_WRITE, and NOTE_READ,
compatible with the same events in FreeBSD.
- Track which kevent notifications clients are interested in receiving
to avoid doing work for events no one cares about (avoiding, e.g.
taking locks and traversing the klist to send a NOTE_WRITE when
someone is merely watching for a file to be deleted, for example).

In support of the above:

- Add support in vnode_if.sh for specifying PRE- and POST-op handlers,
to be invoked before and after vop_pre() and vop_post(), respectively.
Basic idea from FreeBSD, but implemented differently.
- Add support in vnode_if.sh for specifying CONTEXT fields in the
vop_*_args structures. These context fields are used to convey information
between the file system VOP function and the VOP wrapper, but do not
occupy an argument slot in the VOP_*() call itself. These context fields
are initialized and subsequently interpreted by PRE- and POST-op handlers.
- Version VOP_REMOVE() to use a context field for the file system to report
back the resulting link count of the target vnode. Return this in tmpfs,
udf, nfs, chfs, ext2fs, lfs, and ufs.

NetBSD 9.99.92.
 1.320 18-Jul-2021  dholland Abolish all the silly indirection macros for initializing vnode ops tables.

These are things of the form #define foofs_op genfs_op, or #define
foofs_op genfs_eopnotsupp, or similar. They serve no purpose besides
obfuscation, and have gotten cutpasted all over everywhere.
 1.319 18-Jul-2021  dholland Use macros for the canned parts of device and fifo vnode op tables.

Add GENFS_SPECOP_ENTRIES and GENFS_FIFOOP_ENTRIES macros that contain
the portion of the vnode ops table declaration that is
(conservatively) the same in every fs. Use these in every fs that
supports devices and/or fifos with separate ops tables.

Note that ptyfs works differently (it has one type of vnode with
open-coded dispatch to the specfs code, which I haven't changed in
this commit) and rump/librump/rumpvfs/rumpfs.c has an indirect dynamic
dispatch that already does more or less the same thing, which I also
haven't changed.

Also note that this anticipates a few bits in the next changeset here
and there, and adds missing but unreachable calls in some cases (e.g.
most fses weren't defining whiteout on devices and fifos, but it isn't
reachable there), and it changes parsepath on devices and fifos to
genfs_badop from genfs_parsepath (but it's not reachable there
either).

It appears that devices in kernfs were missing kqfilter, so it's
possible that if you try to use kqueue on /kern/rootdev that it'll
explode.

And finally note that the ops declaration tables aren't
order-dependent. (Other than vop_default_desc has to come first.)
Otherwise this wouldn't work.
 1.318 29-Jun-2021  dholland - Add a new vnode op: VOP_PARSEPATH.
- Move namei_getcomponent to genfs_vnops.c and call it genfs_parsepath.
- Add a parsepath entry to every vnode ops table.

VOP_PARSEPATH takes a directory vnode to be searched and a complete
following path and chooses how much of that path to consume. To begin
with, all parsepath calls are genfs_parsepath, which locates the first
'/' as always.

Note that the call doesn't take the whole struct componentname, only
the string. The other bits of struct componentname should not be
needed and there's no reason to cause potential complications by
exposing them.
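The default behaviour described here (consume everything up to the first '/') might look roughly like the sketch below. The function name and signature are hypothetical and are not those of genfs_parsepath; the point is only that the helper needs the path string and nothing else from struct componentname.

```c
#include <stddef.h>

/*
 * Return how many bytes of 'path' make up its first component, i.e. the
 * offset of the first '/' or of the terminating NUL if there is none.
 */
size_t
first_component_len(const char *path)
{
	size_t len = 0;

	while (path[len] != '\0' && path[len] != '/')
		len++;
	return len;
}
```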
 1.317 05-Sep-2020  riastradh branches: 1.317.6;
Round of uvm.h cleanup.

The poorly named uvm.h is generally supposed to be for uvm-internal
users only.

- Narrow it to files that actually need it -- mostly files that need
to query whether curlwp is the pagedaemon, which should maybe be
exposed by an external header.

- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
by it. We should split up uvm_extern.h but this will serve for now
to reduce the uvm.h dependencies.

- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
UVMHIST(ubchist), since ubchist is declared in uvm.h but the
reference evaporates if UVMHIST is not defined, so we reduce header
file dependencies.

- Make uvm_device.h and uvm_swap.h independently includable while
here.

ok chs@
 1.316 27-Jun-2020  christos Introduce genfs_pathconf() and use it for the default case in all filesystems.
 1.315 16-May-2020  christos Add ACL support for FFS. From FreeBSD.
 1.314 13-Apr-2020  ad Replace most uses of vp->v_usecount with a call to vrefcnt(vp), a function
that hides the details and does atomic_load_relaxed(). Signature matches
FreeBSD.
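As an illustration of the accessor pattern, here is a user-space sketch using C11 <stdatomic.h> in place of the kernel's atomic_load_relaxed(). The struct and function names are hypothetical stand-ins, not the kernel definitions.

```c
#include <stdatomic.h>

/* Hypothetical, stripped-down vnode holding only a use count. */
struct vnode_sketch {
	atomic_uint v_usecount;
};

/* Accessor that hides the field and performs a relaxed atomic load. */
unsigned int
vrefcnt_sketch(struct vnode_sketch *vp)
{
	return atomic_load_explicit(&vp->v_usecount, memory_order_relaxed);
}
```

Callers then read the count through the function, so the representation of the field can change without touching every file system.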
 1.313 23-Feb-2020  ad branches: 1.313.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
 1.312 10-Sep-2019  christos branches: 1.312.2;
remove NCHNAMLEN optimization
 1.311 03-Sep-2018  riastradh branches: 1.311.4;
Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
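The truncation this rename is meant to expose can be reproduced in a few lines. The sketch assumes a platform where unsigned int is 32 bits wide; uimin_sketch and min_u64 are illustrative names, not the libkern functions.

```c
#include <stdint.h>

/* Same shape as the helper described above: defined on unsigned int. */
unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Type-preserving alternative for 64-bit operands. */
uint64_t
min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}
```

Passing the 64-bit value 0x100000001 to uimin_sketch() converts it to 1 before the comparison, so the call with a second argument of 5 yields 1, while min_u64() with the same operands yields 5.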
 1.310 26-Apr-2017  riastradh branches: 1.310.4; 1.310.10; 1.310.12;
Change VOP_REMOVE and VOP_RMDIR to preserve lock/ref on dvp.

No change to vp -- the plan is to replace the node by the
componentname in the vop parameters, and let all directory vops do
lookups internally.

Proposed on tech-kern with no objections:
https://mail-index.netbsd.org/tech-kern/2017/04/17/msg021825.html
 1.309 19-Jan-2016  hannken Return an error if NFSPROC_LOOKUP returns the file handle of the current
directory. Treating it as DOT lookup would put garbage into the name
cache and could panic on future lookups.

Seen with ZFS file system exported from OmniOS, an OpenSolaris derivative.

Fixes PR kern/50664 "cd .." over NFS/ZFS can panic kernel
 1.308 14-May-2015  chs in nfs_writerpc(), avoid a signed/unsigned problem in computing the
number of bytes to back up in the uio when we need to resend a write RPC
(e.g. after a server crash) on a 64-bit platform. should fix PR 35448.
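One way a signed/unsigned mix-up of this kind can show up only on 64-bit platforms is sketched below. This is a generic illustration with hypothetical names, not the code from nfs_writerpc(): negating a 32-bit unsigned count wraps within 32 bits, and the result is then zero-extended when it is added to a 64-bit position.

```c
#include <stdint.h>

/* Advance a 64-bit position by n bytes, as a uio-style helper might. */
static uint64_t
advance(uint64_t pos, uint64_t n)
{
	return pos + n;
}

/*
 * Buggy shape: -backup is evaluated as a 32-bit unsigned value
 * (2^32 - backup) and zero-extended, so the position moves forward.
 */
uint64_t
back_up_buggy(uint64_t pos, uint32_t backup)
{
	return advance(pos, -backup);
}

/* Fixed shape: widen first, then subtract. */
uint64_t
back_up_fixed(uint64_t pos, uint32_t backup)
{
	return pos - (uint64_t)backup;
}
```

If the position were only 32 bits wide, the wrapped value would still produce the intended result modulo 2^32, which is why the problem is specific to 64-bit platforms.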
 1.307 20-Apr-2015  riastradh Make VOP_LINK return directory still locked and referenced.

Ride 7.99.10 bump.
 1.306 25-Jul-2014  dholland branches: 1.306.2; 1.306.4;
Add VOP_FALLOCATE and VOP_FDISCARD to every vnode ops table I can
find.

The filesystem ones all call genfs_eopnotsupp - right now I am only
implementing the plumbing and we can implement fallocate and/or
fdiscard for files later.

The device ones call spec_fallocate (which is also genfs_eopnotsupp)
and spec_fdiscard, which dispatches to the device-level op.

The fifo ones all call vn_fifo_bypass, which also ends up being
EOPNOTSUPP.
 1.305 05-Jul-2014  hannken Use vcache_rekey_* for nfs_lookitup() in the "*npp != NULL" case.
 1.304 07-Feb-2014  hannken branches: 1.304.2;
Change vnode operation lookup to return the resulting vnode *vpp unlocked.
Change cache_lookup() to return an unlocked vnode.

Discussed on tech-kern@

Welcome to 6.99.31
 1.303 23-Jan-2014  hannken Change vnode operations create, mknod, mkdir and symlink to return
the resulting vnode *vpp unlocked.

Discussed on tech-kern@

Welcome to 6.99.30
 1.302 17-Jan-2014  hannken Change vnode operations create, mknod, mkdir and symlink to keep the
directory node dvp locked on return.

Discussed on tech-kern@

Welcome to 6.99.29
 1.301 15-Nov-2013  nisimura add one more __unused attribute to shut gcc4.8 off.
 1.300 14-Sep-2013  martin Backout wildcard pragma to kill warnings and instead sprinkle a few dozen
__unused attributes.
Requested by joerg@
 1.299 18-Mar-2013  plunky branches: 1.299.6;
C99 section 6.7.2.3 (Tags) Note 3 states that:

A type specifier of the form

enum identifier

without an enumerator list shall only appear after the type it
specifies is complete.

which means that we cannot pass an "enum vtype" argument to
kauth_access_action() without fully specifying the type first.
Unfortunately there is a complicated include file loop which
makes that difficult, so convert this minimal function into a
macro (and capitalize it).

(ok elad@)
 1.298 07-Nov-2012  macallan fix crash in nfs client lookups, dholland says 'my fault'
 1.297 05-Nov-2012  dholland Excise struct componentname from the namecache.

This uglifies the interface, because several operations need to be
passed the namei flags and cache_lookup also needs for the time being
to be passed cnp->cn_nameiop. Nonetheless, it's a net benefit.

The glop should be able to go away eventually but requires structural
cleanup elsewhere first.

This change requires a kernel bump.
 1.296 05-Nov-2012  dholland Disentangle the namecache from the internals of namei.

- Move the namecache's hash computation to inside the namecache code,
instead of being spread out all over the place. Remove cn_hash from
struct componentname and delete all uses of it.

- It is no longer necessary (if it ever was) for cache_lookup and
cache_lookup_raw to clear MAKEENTRY from cnp->cn_flags for the cases
that cache_enter already checks for.

- Rearrange the interface of cache_lookup (and cache_lookup_raw) to
make it somewhat simpler, to exclude certain nonexistent error
conditions, and (most importantly) to make it not require write access
to cnp->cn_flags.

This change requires a kernel bump.
 1.295 22-Jul-2012  rmind branches: 1.295.2;
Move some of the tests for MAKEENTRY into cache_enter(9). Make some
variables in vfs_cache.c static, __read_mostly, etc.

No objection on tech-kern@.
 1.294 27-Apr-2012  drochner fix access permission check which got broken by some kauth rework
in March, affected mostly systems with NFS root fs
 1.293 28-Nov-2011  tls branches: 1.293.2; 1.293.4;
Remove arc4random() and arc4randbytes() from the kernel API. Replace
arc4random() hacks in rump with stubs that call the host arc4random() to
get numbers that are hopefully actually random (arc4random() keyed with
stack junk is not). This should fix some of the currently failing anita
tests -- we should no longer generate duplicate "random" MAC addresses in
the test environment.
 1.292 27-Sep-2011  christos branches: 1.292.2;
use NFS_MAXPATHLEN instead of MAXPATHLEN
 1.291 12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.290 24-Apr-2011  rmind branches: 1.290.2;
sys_link: prevent hard links on directories (cross-mount operations are
already prevented). File systems are no longer responsible to check this.
Clean up and add asserts (note that dvp == vp cannot happen in vop_link).

OK dholland@
 1.289 14-Dec-2010  cegger branches: 1.289.2;
Initialize mutex and cv after sanity checks
 1.288 14-Dec-2010  cegger back out rev. 1.285. The problem I try to hunt down
in PR 42455 is not in the network stack as shown by PR 44206.
 1.287 30-Nov-2010  dholland Abolish the SAVENAME and HASBUF flags. There is now always a buffer,
so the path in a struct componentname is now always valid during VOP
calls.
 1.286 30-Nov-2010  dholland Abolish struct componentname's cn_pnbuf. Use the path buffer in the
pathbuf object passed to namei as work space instead. (For now a pnbuf
pointer appears in struct nameidata, to support certain unclean things
that haven't been fixed yet, but it will be going away in the future.)

This removes the need for the SAVENAME and HASBUF namei flags.
 1.285 26-Oct-2010  cegger Add diagnostic check which hits when PR 42455 is reproduced.
Idea from hans@
 1.284 24-Jun-2010  hannken Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
 1.283 29-Mar-2010  pooka Stop exposing fifofs internals and leave only fifo_vnodeop_p visible.
 1.282 08-Jan-2010  pooka branches: 1.282.2; 1.282.4;
The VATTR_NULL/VREF/VHOLD/HOLDRELE() macros lost their will to live
years ago when the kernel was modified to not alter ABI based on
DIAGNOSTIC, and now just call the respective function interfaces
(in lowercase). Plenty of mix'n match upper/lowercase has crept
into the tree since then. Nuke the macros and convert all callsites
to lowercase.

no functional change
 1.281 21-Oct-2009  rmind Remove uarea swap-out functionality:

- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.
 1.280 14-Jul-2009  apb Use pid_t, not short, for a pid.
Part of PR 41255 from Kurt Lidl.
 1.279 23-Jun-2009  elad Move the implementation of vaccess() to genfs_can_access(), in line with
the other routines of the same spirit.

Adjust file-system code to use it.

Keep vaccess() for KPI compatibility and to preserve the principle of least
surprise. A "diagnostic" message warning that vaccess() is deprecated will
be printed when it's used (obviously, only in DIAGNOSTIC kernels).

No objections on tech-kern@:

http://mail-index.netbsd.org/tech-kern/2009/06/21/msg005310.html
 1.278 10-May-2009  yamt nfs_lookup: vn_lock the vnode returned by cache_lookup_raw
before feeding it to VOP_GETATTR. it's necessary because the vnode might
be being cleaned by getcleanvnode.

it's an instance of more general races between vnode reclaim and
unlocked VOPs. however, this one happens somewhat often because it can be
triggered by getnewvnode rather than revoke.
 1.277 10-May-2009  yamt restore lines, esp. a vrele() call, which i mistakenly removed
in the previous. (rev.1.276)
 1.276 04-May-2009  yamt nfs_lookup: handle the case where the vnode returned by cache_lookup_raw is
being reclaimed by another thread. after recent changes in cache_lookup_raw,
there's a race between cache_lookup_raw/vtryget and getcleanvnode/vclean.
PR/41028.
 1.275 04-May-2009  yamt nfs_lookup: add an assertion.
 1.274 04-May-2009  yamt nfs_lookup: comments. no functional changes.
 1.273 14-Mar-2009  dsl ANSIfy another 1261 function definitions.
The only ones left in sys are beyond my sed script!
(or in sys/dist or sys/external)
Mostly they have function pointer parameters.
 1.272 14-Mar-2009  dsl Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
 1.271 14-Mar-2009  dsl Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.270 13-Mar-2009  yamt nfs_lookup: fix a comment.
 1.269 11-Jan-2009  christos branches: 1.269.2;
merge christos-time_t
 1.268 19-Nov-2008  ad branches: 1.268.4;
Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.267 15-Oct-2008  pooka branches: 1.267.2;
For NFSV3CREATE_EXCLUSIVE verifier, just use arc4random() instead
of the first inet address on INET systems (which is likely to be
localhost).
 1.266 13-Feb-2008  yamt branches: 1.266.6; 1.266.10; 1.266.16;
reject files larger than nm_maxfilesize.
 1.265 25-Jan-2008  ad Remove VOP_LEASE. Discussed on tech-kern.
 1.264 02-Jan-2008  yamt use kmem_alloc instead of malloc.
 1.263 02-Jan-2008  ad Merge vmlocking2 to head.
 1.262 17-Dec-2007  yamt nfs_create: try GUARDED if EXCLUSIVE is NOTSUPP.
 1.261 08-Dec-2007  pooka branches: 1.261.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
 1.260 26-Nov-2007  pooka branches: 1.260.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.259 13-Nov-2007  yamt nfs_lookup: fix indent.
 1.258 07-Nov-2007  ad Merge from vmlocking:

- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
 1.257 28-Oct-2007  yamt branches: 1.257.2;
make NFS_ATTRTIMEO a function.
 1.256 09-Jul-2007  ad branches: 1.256.6; 1.256.8; 1.256.12;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.255 29-Apr-2007  yamt don't forget to destroy mutex and condvar.
 1.254 29-Apr-2007  yamt use mutex and condvar.
 1.253 29-Apr-2007  yamt use mutex and condvar.
 1.252 04-Mar-2007  christos branches: 1.252.2; 1.252.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.251 22-Feb-2007  thorpej TRUE -> true, FALSE -> false
 1.250 21-Feb-2007  thorpej Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.249 24-Jan-2007  hubertf branches: 1.249.2;
Remove duplicate #includes, patch contributed in private mail
by Slava Semushin <slava.semushin@gmail.com>.

To verify that no nasty side effects of duplicate includes (or their
removal) have an effect here, I've compiled an i386/ALL kernel with
and without the patch, and the only difference in the resulting .o
files was in shifted line numbers in some assert() calls.
The comparison of the .o files was based on the output of "objdump -D".

Thanks to martin@ for the input on testing.
 1.248 27-Dec-2006  yamt - remove the rest of nqnfs.
- reject NFSMNT_MNTD and NFSMNT_KERB. (no users in tree.)
 1.247 27-Dec-2006  yamt remove nqnfs.
 1.246 09-Dec-2006  chs a smorgasbord of improvements to vnode locking and path lookup:
- LOCKPARENT is no longer relevant for lookup(), relookup() or VOP_LOOKUP().
these now always return the parent vnode locked. namei() works as before.
lookup() and various other paths no longer acquire vnode locks in the
wrong order via vrele(). fixes PR 32535.
as a nice side effect, path lookup is also up to 25% faster.
- the above allows us to get rid of PDIRUNLOCK.
- also get rid of WANTPARENT (just use LOCKPARENT and unlock it).
- remove an assumption in layer_node_find() that all file systems implement
a recursive VOP_LOCK() (unionfs doesn't).
- require that all file systems supply vfs_vptofh and vfs_fhtovp routines.
fill in eopnotsupp() for file systems that don't support being exported
and remove the checks for NULL. (layerfs calls these without checking.)
- in union_lookup1(), don't change refcounts in the ISDOTDOT case, just
adjust which vnode is locked. fixes PR 33374.
- apply fixes for ufs_rename() from ufs_vnops.c rev. 1.61 to ext2fs_rename().
 1.245 09-Nov-2006  yamt branches: 1.245.2;
remove some __unused in function parameters.
 1.244 14-Oct-2006  yamt grab glock when calling uvm_unp_setsize, so that it doesn't interfere
with mmap'ed accesses. this fixes an assertion failure in nfs_doio_read.
("vp->v_size >= uiop->uio_offset + uiop->uio_resid")
 1.243 12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.242 29-Sep-2006  drochner Flush regular files before setattr also if the mode bits are going to
be set. Linux NFS servers (at least) reset suid/sgid bits if a write
happens afterwards. Add a comment why this is done.
This fixes system builds on diskless systems for me where suid bits
were missing after install(1).
Approved by yamt.
 1.241 23-Jul-2006  ad branches: 1.241.4; 1.241.6;
Use the LWP cached credentials where sane.
 1.240 01-Jul-2006  yamt some comments taken from Jed Davis's patch.
 1.239 01-Jul-2006  yamt if a file is sillyrename'ed because it's a destination of rename,
make sillyrename (try to) use LINK operation rather than RENAME.
PR/33861 from Jed Davis. he provided almost the same patch.
according to him, it also happens to be what opensolaris does in this case.

from the PR:
> In nfs_rename(), if the destination appears to exist and is "in use"
> (this check is apparently satisfied even if the file isn't in use by
> anything except the rename itself), it will sillyrename it, then delete
> the sillyrenamed file even if the rename fails -- for instance, because
> the "from" file no longer exists on the server.

> mkdir a b; touch a/x; perl -e 'fork(); rename("a/x","b/x") or die "$!\n"'
>
> Afterwards, neither a/x nor b/x will exist.

> 1) Lookup of b/x; fails with NOENT.
> 2) Rename from a/x to b/x; succeeds.
> 3) Lookup of b/x; fails with NOENT.
> 4) Rename from b/x to b/.nfsA23a3; succeeds.
> 5) Rename from a/x to b/x; fails with NOENT.
> 6) Remove of b/.nfsA23a3; succeeds.
 1.238 30-Jun-2006  yamt fix handling of NFSERR_NOTSUPP and NFSERR_BAD_COOKIE,
which have been broken since nfs_socket.c rev.1.115.
 1.237 07-Jun-2006  kardel branches: 1.237.2;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.236 14-May-2006  elad branches: 1.236.2;
integrate kauth.
 1.235 15-Apr-2006  christos Coverity CID 744: Conditionally define out dead code (only if it is dead)
 1.234 15-Apr-2006  christos Coverity CID 2515-2519: Initialize rexmit on error path.
 1.233 15-Apr-2006  christos Coverity CID 2520: rexmit can be uninitialized on error path.
 1.232 14-Apr-2006  blymn Make i/o statistics collection more generic, include tape drives and
nfs mounts in the set of devices that statistics will be reported on.
 1.231 01-Mar-2006  yamt branches: 1.231.2; 1.231.4; 1.231.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.
 1.230 11-Dec-2005  christos branches: 1.230.2; 1.230.4; 1.230.6;
merge ktrace-lwp.
 1.229 05-Nov-2005  yamt branches: 1.229.2;
nfs_read: return EISDIR rather than EPERM for !VREG files.
 1.228 02-Nov-2005  yamt merge yamt-vop branch. remove following VOPs.

VOP_BLKATOFF
VOP_VALLOC
VOP_BALLOC
VOP_REALLOCBLKS
VOP_VFREE
VOP_TRUNCATE
VOP_UPDATE
 1.227 19-Sep-2005  christos branches: 1.227.2;
ATTRTIMEO takes 2 args.
 1.226 19-Aug-2005  yamt fix some simple bugs in the 64bit ino_t changes.
- edp -> dp
- * -> +
 1.225 19-Aug-2005  christos 64 bit inode changes.
 1.224 21-Jul-2005  yamt use a correct credential for readlink. discussed on source-changes@.
 1.223 07-Jul-2005  christos 1. use p = uio->uio_procp consistently and eliminate suspicious uses
of curproc (where uio->uio_procp should be used?). Don't do this
for nfs_commit(), because yamt says it is possibly wrong.
2. nfs_doio() does not use struct proc; remove it and the code to compute it.
3. use copyin_proc() and copyout_proc() instead of copyin() and copyout().
4. check return of copyout_proc(). and mark return from copyin_proc() XXX
5. Eliminate the p == curproc assertion check from nfs_write;
nfs_read does not have it and we might be called in a different
process context anyway (PR 20138).
 1.222 29-May-2005  christos branches: 1.222.2;
- sprinkle const
- avoid shadowed variables
- mark bad const use with XXXUNCONST
 1.221 17-May-2005  christos Yes, it was a cool trick >20 years ago to use "0123456789abcdef"[a] to
implement xtoa(), but I think defining the same string 50 times is a bit
too much. Defined HEXDIGITS and hexdigits in subr_prf.c and use it...
 1.220 26-Feb-2005  perry branches: 1.220.2;
nuke trailing whitespace
 1.219 26-Jan-2005  yamt nfs_readdirrpc, nfs_readdirplusrpc:
avoid infinite loops when getting readdir response without
any entries or eof. PR/28971.
 1.218 26-Jan-2005  yamt handle a really empty directory, which doesn't have even the dot entry.
 1.217 21-Jan-2005  yamt branches: 1.217.2;
s/time/mono_time/ for n_attrstamp and n_accstamp. (parts of) PR/25641.
 1.216 19-Jan-2005  yamt implement inaccurate mtime/ctime detection.
namely, if mtime or ctime are same between pre_op_attr and post_op_attr
when we expected them to be changed, don't trust the server.
 1.215 08-Jan-2005  yamt branches: 1.215.2;
nfs_lookup: check n_nctime for positive entries as well to improve
cache consistency.
 1.214 17-Dec-2004  yamt revive spec vop_bwrite as it's needed for block devices.
PR/28684 from Jukka Salmi.
 1.213 14-Dec-2004  yamt redirect some VOPs which shouldn't be used for nfs
to genfs_badop (ie. panic).
 1.212 14-Dec-2004  yamt - centerize code to invalidate stale cache.
- don't ignore errors when invalidating buffers in nfs_open.
 1.211 03-Oct-2004  yamt nfs_readdirrpc, nfs_readdirplusrpc:
don't expose kernel garbage data to userland.
 1.210 01-Oct-2004  yamt nfs_writerpc: fix PHOLD leak on error.
 1.209 23-Sep-2004  yamt nfs_readdirplusrpc: fix spurious EBUSYs.
 1.208 20-Jul-2004  yamt nfs_readdirplusrpc: fix a very long-standing cache corruption bug.
in the case of !bigenough, don't fill d_type or dnlc with bogus data.
 1.207 20-Jul-2004  yamt revert nfs_vnops.c rev.1.189.
it's no longer needed because cache_enter() has been changed to handle
duplicated entries by itself.
 1.206 18-Jul-2004  yamt nfs_commit: use NAC_NOTRUNC when loading an attribute
as we're called holding pages locked.
 1.205 08-Jul-2004  yamt - include opt_inet.h for INET.
- catchup to in_ifaddr -> in_ifaddrhead rename.

XXX the address on the top of in_ifaddrhead is likely 127.0.0.1.
using it to construct the verifier doesn't make much sense.
maybe it's better to use some uuid or ip_randomid-like method.
 1.204 08-Jul-2004  yamt nfs_create: after an exclusive create rpc, make sure to update
timestamps, which were likely used to store the verifier.
reported by Mark Davies. PR/26200
 1.203 27-Jun-2004  yamt nfs_lookup: use cache_lookup_raw() so that:
- "intrusive" dirops now have more chances to get benefits from dnlc.
- fixes a deadlock due to vnode locking order inversion.

nfs_create and others: purge stale dnlc entries
as nfs_lookup() no longer does it automatically.
 1.202 16-Jun-2004  yamt nfs_lookup: maintain PDIRUNLOCK even in the case of success to make
layered filesystems happy.
 1.201 27-May-2004  yamt remove an unused instance of VOP_UPDATE.
 1.200 23-May-2004  christos cut down another 7K by more NFS_V2_ONLY ifdefs.
 1.199 17-May-2004  yamt #if 0 out CREATE optimization for now because it has a problem in the case
of O_CREAT|O_TRUNC, which is hard to be fixed without changing upper layer.
 1.198 10-May-2004  yamt nfs_lookup: handle "." by ourselves as RFC1813 3.2 says.
 1.197 10-May-2004  yamt don't do kludge for a reply to a retransmitted request
unless we actually retransmitted the request.
 1.196 08-May-2004  yamt nfs_lookup: avoid CREATE optimization for DOTDOT.
creating a DOTDOT entry has no sense and will fail anyway.
 1.195 08-May-2004  yamt nfs_mkdir: handle the "." case.
 1.194 08-May-2004  yamt nfs_lookitup: handle "." correctly rather than returning garbage on the stack.
 1.193 07-May-2004  yamt check read only mount appropriately.
(fix a bug of nfs_vnops.c rev.1.192.)
pointed by Rob Quinn on current-users@.
 1.192 06-May-2004  yamt because nfsv3 has the same CREATE semantics as ours,
we don't have to issue LOOKUP RPCs beforehand.
 1.191 05-Apr-2004  yamt nfs_readdirplusrpc: fix a deadlock problem.
don't wait for vnode lock to load attributes.
otherwise, because READDIRPLUS returns DOTDOT entry as well,
we violate locking order.
 1.190 05-Apr-2004  yamt don't issue VOP_GETATTR blindly in nfs_nget().
in many cases, GETATTR RPCs here are redundant because the caller has
postop_attr. instead, make sure the resulting vnode has a valid
attribute in nfs_lookup().
 1.189 05-Apr-2004  yamt nfs_readdirplusrpc: purge existing namecache entry before entering a new one.
otherwise we'll get duplicated entries.
 1.188 05-Apr-2004  yamt when entering a namecache entry for nfs, ensure to update the appropriate
timestamp in the nfsnode so that we don't get namecache-miss when
looking up the node we just created.
 1.187 05-Apr-2004  yamt avoid unnecessary namecache purges in some places.
 1.186 12-Mar-2004  yamt branches: 1.186.2;
shrink sizeof struct nfsnode by putting exclusive members into union.
 1.185 12-Mar-2004  yamt introduce a macro NFS_INVALIDATE_ATTRCACHE and use it
instead of "n_attrstamp = 0".
 1.184 07-Dec-2003  fvdl Unix semantics dictate that access checks for files are done when it
is opened. An open file can always be read from and/or written to,
depending on how it was opened.

Therefore, the read/write/commit RPCs should never return EACCES,
as they are only performed on files that have been successfully opened
already.

This change improves the current situation and works in most cases.
It simply always uses the most recently known owner/group of the file,
iff the authentication mechanism is AUTH_UNIX (in other cases, the
creds for a successful open are used, but note that no other cases
are currently implemented).

A retry mechanism can be used to catch a few more cases, but this is
a good improvement for now.
 1.183 29-Nov-2003  yamt pad requests correctly in the zerocopy case of write rpc.
 1.182 25-Oct-2003  christos fix uninitialized variable
 1.181 26-Sep-2003  yamt do delayed truncation in nfs_getattr.
 1.180 26-Sep-2003  yamt change n_mtime from time_t to timespec in order to improve
cache consistency.
(1 second granularity is too loose these days.)
 1.179 25-Sep-2003  enami Make negative name cache works again.
 1.178 17-Sep-2003  yamt change nctime to timespec from time_t.
there can be too many activities in a second.
 1.177 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.176 30-Jul-2003  yamt vrecycle removed nfs vnodes.
not perfect, but enough for most cases.
 1.175 29-Jun-2003  fvdl branches: 1.175.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.174 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.173 27-Jun-2003  yamt if nfs_writerpc() split an unstable write into several rpcs and
write verifier was changed, we should restart from the first.
 1.172 27-Jun-2003  yamt indent.
 1.171 03-Jun-2003  yamt fix a problem in 'protected' case of writerpc.

retransmitted mbufs can survive even after requests themselves
finished. so, before unbusying pages, make sure that mbufs referring to them
go away.

pointed by enami tsugutomo on port-mips.
 1.170 27-May-2003  yamt fix a memory leak bug that i introduced in rev.1.167.
patch provided by enami tsugutomo on current-users.
 1.169 26-May-2003  yamt when a result of NFSv3 READLINK is too long for us,
return ENAMETOOLONG rather than EBADRPC.
(it's our implementation limit, not protocol limit.)
 1.168 21-May-2003  yamt remove local definitions of TRUE and FALSE.
 1.167 21-May-2003  yamt eliminate memcpy in the common and easy case of write.
 1.166 03-May-2003  yamt better handling of write verifier change.
 1.165 24-Apr-2003  drochner Change some subordinate functions to take a "struct nfsnode" argument
instead of "struct vnode". This saves a number of pointer dereferences;
it sums up to about half a kB for me. And it paves the way for future
fixes.
While cleaning up, eliminate a write-only member of "struct nfsreq"
and a pointless assignment in the NFS_V2_ONLY case.
 1.164 09-Apr-2003  yamt rename a very confusing variable name.
(must_commit -> stalewriteverf)
 1.163 09-Apr-2003  yamt put per-iod data together.
 1.162 09-Apr-2003  yamt rename nm_verf to nm_writeverf because it's confusing with nm_verf{str,len}.
 1.161 02-Apr-2003  yamt use queue manipulation macros.
 1.160 31-Mar-2003  yamt rename fvdl_debug to NFS_DEBUG_COMMIT.

ok'ed by fvdl.
 1.159 26-Feb-2003  matt Add MBUFTRACE kernel option.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
 1.158 18-Feb-2003  jdolecek add missing dot in comment
 1.157 01-Dec-2002  matt Make sure these all agree on the same definitions of various variables.
 1.156 23-Oct-2002  jdolecek merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
 1.155 22-Oct-2002  yamt fix panic introduced by my previous commit.

for device special files, VOP_UNLOCK is called
by nfs_loadattrcache with v_data == 0.

reported and tested by Matthias Drochner.
 1.154 22-Oct-2002  simonb "nmp" in nfs_lookup() is set but not used, remove it.
 1.153 21-Oct-2002  yamt fix a page locking deadlock problem for nfs.

add a flag that specify if the file can be truncated safely or not
to nfsm_loadattr and friends. when it isn't safe, just mark the nfsnode
as "should be truncated later".

ok'ed by Frank van der Linden and Chuck Silvers.
close kern/18036.
 1.152 18-Oct-2002  thorpej nfs_remove(): Don't vput() the vnode twice if vp == dvp, vrele() and
vput() instead.
 1.151 19-May-2002  tls branches: 1.151.2;
Fix client-side lockmgr: locking against myself panic immediately upon an
attempt to NFS-mount a filesystem with the -l (use ReaddirPlus RPC) option.

Fix from Bill Sommerfeld.
 1.150 12-May-2002  matt branches: 1.150.2;
Eliminate commons
 1.149 28-Feb-2002  fvdl Invalidate the access cache when loading a new set of attributes into
the attribute cache. Fixes access cache problem seen by
Nathan Funk of the UofS, relayed by Greg Oster.
 1.148 15-Dec-2001  fvdl Set np->n_size before calling nfs_vinvalbuf, to avoid recursion
and confusion about the actual filesize. From Matt Dillon's
similar change in FreeBSD.

XXX n_size is really redundant in -current and must die. This commit
XXX is more of a placeholder for a pullup into the 1.5 branch.
 1.147 08-Dec-2001  lukem - Implement
uint32_t namei_hash(const char *p, const char **ep)
which determines the equivalent MI hash32_str() hash for p.
If *ep != NULL, calculate the hash to the character before ep.
If *ep == NULL, calculate the hash to the first / or NUL found, and
point *ep to that location.
- Use namei_hash() to calculate cn_hash in lookup() and relookup().
Hash distribution goes from 35-40% to 55-70%, with similar profiled
time spent in cache_lookup() and cache_enter() on my P3-600.
- Use namei_hash() to calculate cn_hash in nfs_readdirplusrpc(),
instead of homegrown code (that differed from that in lookup() !)
namei_hash() has better spread and is faster than previous code
(which used a non-constant multiplication).
 1.146 04-Dec-2001  christos PR/14817: Gregory McGarry: NFS_V2_ONLY doesn't seem to work.
 1.145 30-Nov-2001  chs call VOP_PUTPAGES() directly instead of indirecting through
the UVM pager op vector.
 1.144 29-Nov-2001  christos PR/14776: Emmanuel Dreyfus: cross device hard link causes panic.
Call VOP_ABORTOP on the right vnode damnit!
 1.143 10-Nov-2001  lukem add RCSIDs
 1.142 07-Nov-2001  bjh21 Diagnostic panics should be enabled when DIAGNOSTIC is defined, not undefined.
 1.141 13-Oct-2001  simonb branches: 1.141.2;
Remove some variables that are only ever set and never referenced.
 1.140 22-Sep-2001  sommerfeld Add fifo_putpages() placebo so that the vnode's uobj is unlocked.
 1.139 20-Sep-2001  chs fix nfs_bmap() so that it works for both genfs_{get,put}pages() and swap/vnd.
 1.138 15-Sep-2001  chs a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
 1.137 17-Aug-2001  chs branches: 1.137.2;
add getpages/putpages entries for spec vnodes.
 1.136 14-Aug-2001  itojun don't panic on mknod(2) over NFS. PR 13705.
 1.135 24-Jul-2001  assar change vop_symlink and vop_mknod to return vpp (the created node)
refed, so that the caller can actually use it. update callers and
file systems that implement these vnode operations
 1.134 07-Jun-2001  lukem branches: 1.134.2;
delint lvalue cast abuse
 1.133 28-May-2001  chs add a genfs_mmap() and change all of the disk-based filesystems
to implement VOP_MMAP() with the genfs version, in preparation for
actually using this VOP.
 1.132 14-May-2001  fvdl Lock vp in nfs_link while we're busy with it (doing VOP_FSYNC, etc).
 1.131 20-Apr-2001  fvdl Don't forget to unlock the vnode returned by cache_lookup if the
subsequent access check fails. Don't overwrite the error code
returned by cache_lookup. Remove a piece of redundant code.

Should fix kern/12680.
 1.130 11-Feb-2001  enami branches: 1.130.2;
Unlock the rename target vnode after sillyrename'ing it.
 1.129 06-Feb-2001  fvdl Get locking in rmdir right. Don't unlock a vnode when passing its
associated nfsnode to nfs_lookitup, it is not needed, and fixes
nfs_remove.
 1.128 06-Feb-2001  fvdl Do actual vnode locking for NFS.
 1.127 22-Jan-2001  jdolecek make filesystem vnodeop, specop, fifoop and vnodeopv_* arrays const
 1.126 12-Dec-2000  chs initialize read creds in nfs_open() too.
 1.125 30-Nov-2000  chs in nfs_open(), initialize the write creds if we're opening for writing.
otherwise we would never set them if we only modify the file via mmap().
 1.124 27-Nov-2000  chs Initial integration of the Unified Buffer Cache project.
 1.123 08-Nov-2000  chs in nfs_flush(), only play games with B_NOCACHE for VREG vnodes.
if we do this for VBLK vnodes which are in use by softdep mounts,
brelse() will mark the buffer B_INVAL as well, which makes the
softdep code very unhappy.
 1.122 02-Oct-2000  itojun check in_ifaddr only if INET is compiled
 1.121 19-Sep-2000  bjh21 Extend NFS_V2_ONLY to remove NQNFS lease support as well. Saves another 10k.
 1.120 19-Sep-2000  fvdl Update for VOP_FSYNC parameter change. Simplify nfs_flush.
 1.119 19-Sep-2000  bjh21 New kernel option, NFS_V2_ONLY, which aims to reduce the NFS client to just
that required to support NFSv2 mounts. Not finished yet, but already
provides some 44k of saving in code size on arm26. More savings, and some
documentation, are still to come.
 1.118 19-Sep-2000  fvdl Fix bug in access cache that might result in permission being denied
needlessly. From Matthias Drochner.
 1.117 03-Aug-2000  thorpej Convert namei pathname buffer allocation to use the pool allocator.
 1.116 03-Aug-2000  thorpej MALLOC()/FREE() are not to be used for variable size allocations.
 1.115 22-Jul-2000  jdolecek change the lf_advlock() arguments from

int lf_advlock __P((struct lockf **,
off_t, caddr_t, int, struct flock *, int));
to

int lf_advlock __P((struct vop_advlock_args *, struct lockf **, off_t));

This matches common usage and is also compatible with similar change
in FreeBSD (though they use u_quad_t as last arg).
 1.114 27-Jun-2000  mrg remove include of <vm/vm.h>
 1.113 26-May-2000  enami branches: 1.113.4;
- Try to commit another buffer even if the previous commit failed, except in
the case that the write verf has changed. Suggested by mycroft@netbsd.org.
- Reset wcred to NULL (i.e., write credential isn't decided) every time
before gathering buffers for a new commit, so that there is a chance that
the commit request is merged.
 1.112 25-May-2000  enami In nfs_flush, if the previous commit succeeded and we may have more
uncommitted dirty buffers, attempt to commit them.
 1.111 30-Mar-2000  augustss Remove more register declarations.
 1.110 30-Mar-2000  augustss Remove register declarations.
 1.109 30-Mar-2000  simonb Delete redundant decls of fifo_vnodeop_p - it's in <miscfs/fifofs/fifo.h>.
Don't need <sys/conf.h> here.
 1.108 29-Nov-1999  fvdl Insert an extra VOP_ACCESS check in nfs_lookup, to avoid cached access
mishaps for lookup and getattr. Closes PR 8884.

While at it, cache access RPCs.
 1.107 15-Nov-1999  fvdl Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O
 1.106 05-Sep-1999  jdolecek branches: 1.106.2; 1.106.4; 1.106.8;
Adapt to cache_lookup() changes.
XXX I had no chance to actually test the changes for nfs, but hopefully I got
it right.

Tested by: jdolecek
Reviewed by: wrstuden
 1.105 03-Aug-1999  wrstuden Add support for fcntl(2) to generate VOP_FCNTL calls. Any fcntl
call with F_FSCTL set and F_SETFL calls generate calls to a new
fileop fo_fcntl. Add genfs_fcntl() and soo_fcntl() which return 0
for F_SETFL and EOPNOTSUPP otherwise. Have all leaf filesystems
use genfs_fcntl().

Reviewed by: thorpej
Tested by: wrstuden
 1.104 02-Aug-1999  wrstuden Teach nfs_lookup() to set PDIRUNLOCK when appropriate. Should resolve
PR 8051 by Konrad Schroder.
 1.103 29-Jul-1999  thorpej In nfs_create(), make sure error is reset to 0 if we restart the operation.
 1.102 08-Jul-1999  wrstuden Teach nfs_lookup to clear PDIRUNLOCK.
 1.101 29-May-1999  fvdl Be more correct with attribute structures for setattr RPCs and friends,
so that picky servers (e.g. Solaris 7) don't refuse our requests. Move
some code into a macro, and a bit of KNF. From OpenBSD.
 1.100 24-Mar-1999  mrg branches: 1.100.2; 1.100.4; 1.100.6;
completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.
 1.99 22-Mar-1999  kleink Add _PC_FILESIZEBITS to pathconf vnop.
 1.98 06-Mar-1999  fair Snatch a patch from OpenBSD to fix PRs 6529 and 7074.
Adjust fxdr_hyper() and txdr_hyper() macros.
 1.97 09-Aug-1998  perry branches: 1.97.2;
bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.96 08-Aug-1998  kleink Pathconf: for V2 mounts, revert back to failing with EINVAL if an RPC would
be necessary to obtain the information, as this fits the pathconf semantics
of `no association supported' better than `no limit available.'
 1.95 07-Aug-1998  kleink Add client pathconf support.
 1.94 24-Jun-1998  sommerfe Always include fifos; "not an option any more".
 1.93 22-Jun-1998  sommerfe defopt for options FIFO
 1.92 05-Jun-1998  kleink Convert fsync vnode operator implementations and usage from the old `waitfor'
argument and MNT_WAIT/MNT_NOWAIT to `flags' and FSYNC_WAIT.
 1.91 08-May-1998  kleink Fix some arithmetics lossage on typeless pointers.
 1.90 03-Mar-1998  fvdl Fix cookie handling I messed up totally when doing the Lite2 thing.
(Hello McFly? Anybody home?)
 1.89 01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.88 10-Feb-1998  mrg - add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.
 1.87 05-Feb-1998  mrg initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)
 1.86 20-Oct-1997  thorpej Fix alignment problems. From Frank van der Linden <fvdl@NetBSD.ORG>.
 1.85 19-Oct-1997  fvdl * Implement optional 32 <-> 64 bit directory cookie translation. This uses
the directory cache as translation table. See nfs_subs.c for comments.
Makes the code a bit more complex to look at than I would have liked,
but doesn't affect the speed of the default behavior.
* Optimize caching behavior a bit when buffers are invalidated.
* Save some RPCs in readdir operations by not bothering if there is
a small amount left to do to fill the buffer. It'll be done in the
next RPC with a larger chunk anyway. Wastes a bit of buffer space
but is faster.
* Make n_vattr an allocated vattr struct. This avoids nfsnode bloat,
and is friendlier to the malloc routines.
 1.84 17-Oct-1997  christos u_short -> mode_t
 1.83 13-Oct-1997  fvdl Get rid of some MARKCACHED calls I thought better of. Make sure d_reclen
is aligned for off_t access, or things will break on the Alpha.
 1.82 12-Oct-1997  fvdl Do negative lookup caching. Use a timestamp of the oldest negative cache
entry, so it can be checked against directory modification time for
validity.
 1.81 10-Oct-1997  fvdl * New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.80 17-Jul-1997  fvdl branches: 1.80.2;
* Deal with servers that don't give complete FSINFO (like NT)
From Olaf Seibert <rhialto@polder.ubc.kun.nl> (PR 3687)
* Make an attempt to check the maximum filesize before attempting
a write to the server, as write RPCs will typically happen
asynchronously, and the process will not see the error.
Fixes problems with unexpectedly truncated files at 4G
* Pass up errors in nfs_writerpc correctly
 1.79 14-Jul-1997  fvdl Don't assume that pointers into mbuf data remain valid across nfsm_dissect.
In readdirplus, don't keep such pointers but store the file attributes
in a variable instead until they are needed. Change nfsm_loadattr*
a bit so it can accept a direct pointer to an nfs_fattr structure.
 1.78 04-Jul-1997  drochner Don't cast 64bit (off_t) file sizes to vm_offset_t (32bit on many
architectures), truncate them intelligently instead.
The truncation is done centralized in vnode_pager.c.
This prevents wrap-over effects when parts of large (>2^32 byte) files
are mmapped.
Don't allow mmap above the numerical range of vm_offset_t.
This is considered a temporary solution until the vm system handles the
object sizes/offsets more cleanly.
 1.77 30-Jun-1997  fvdl Immediately return EPERM for a VOP_REMOVE on a directory.
 1.76 12-May-1997  fvdl clear B_AGE for non-flush writes, buffers seem to be reused
too quickly, disturbing NFS performance (XXXX needs further analysis
and a _real_ fix)
 1.75 08-May-1997  mycroft Need stat.h.
 1.74 08-May-1997  mycroft Pass the vnode type to vaccess(), and use it when checking VEXEC. Make sure
that the mode bits passed to vaccess() and returned by foo_getattr() contain
only permission bits.
 1.73 08-May-1997  mycroft VEXEC -> VLOOKUP, as appropriate.
 1.72 05-Mar-1997  mycroft In nfs_link(), check for a cross-device mount *before* looking in the
v_data field.
 1.71 22-Feb-1997  fvdl Fixes from BSDI (thanks go to Keith Bostic). Original RCS messages:

date: 1996/07/23 17:14:46; author: donn; state: Exp; lines: +6 -4
Be sure to push out the last page of the file before truncating it.

date: 1996/10/14 22:41:20; author: donn; state: Exp; lines: +2 -2
From Chris: Nfs_link() called vput() on the wrong vnode when aborting
from a cross-device link, which could (and did) lead to crashes.

date: 1996/10/24 16:43:43; author: pjd; state: Exp; lines: +6 -2
Return EOPNOTSUPP when trying to do a setattr with flags.

===

Also (from BSDI too, but the RCS message did not quite describe the change
to this particular file well): move the EROFS a bit further down to
let VOP_ACCESS do its work and return an 'expected' error value to
a possible layered filesystem.
 1.70 09-Feb-1997  fvdl * Fix some bugs in NQNFS (malformed RPC requests, no directory lease eviction)
* Avoid possible NULL ptr ref in nfs_reply
* Don't ever try to sillyrename directories (from FreeBSD)
 1.69 02-Dec-1996  thorpej branches: 1.69.4;
NFS performance improvement from Doug Rabson/FreeBSD:

Improve the queuing algorithms used by NFS' asynchronous i/o. The
existing mechanism uses a global queue for some buffers and the
vp->b_dirtyblkhd queue for others. This turns sequential writes into
randomly ordered writes to the server, affecting both read and write
performance. The existing mechanism also copes badly with hung
servers, tending to block accesses to other servers when all the iods
are waiting for a hung server.

The new mechanism uses a queue for each mount point. All asynchronous
i/o goes through this queue which preserves the ordering of requests.
A simple mechanism ensures that the iods are shared out fairly between
active mount points.

Reviewed/integrated/approved by Frank van der Linden <fvdl@netbsd.org>
 1.68 25-Oct-1996  cgd make the namei struct members ni_dirp and ni_next, and the componentname
struct member cn_nameptr 'const', since they should never be used to
modify the path name. (Only the pathname buffer, cn_pnbuf, should be
modified.) Propagate the const poisoning to code that uses the namei
and componentname structs.
 1.67 13-Oct-1996  christos revert kprintf changes
 1.66 10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.65 07-Sep-1996  mycroft Implement poll(2).
 1.64 01-Sep-1996  mycroft Add a set of generic file system operations that most file systems use.
Also, fix some time stamp bogosities.
 1.63 07-Jul-1996  fvdl Use the right time for v3 setattr operation.
 1.62 11-May-1996  mycroft branches: 1.62.4;
Change VOP_UPDATE() semantics:
* Make 2nd and 3rd args timespecs, not timevals.
* Consistently pass a Boolean as the 4th arg (except in LFS).
Also, fix ffs_update() and lfs_update() to actually change the nsec fields.
 1.61 03-Apr-1996  thorpej Make these link in the absence of "options FIFO".
 1.60 05-Mar-1996  jtk fix panic "leaf should be empty" on diagnostic kernels when unlinking on
a read-only file system.
 1.59 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.58 09-Feb-1996  christos nfs prototype changes
 1.57 09-Feb-1996  mycroft Fix vop_link, vop_symlink, and vop_remove semantics in several ways:
* Change the argument names to vop_link so they actually make sense.
* Implement vop_link and vop_symlink for all file systems, so they do proper
cleanup.
* Require the file system to decide whether or not linking and unlinking of
directories is allowed, and disable it for all current file systems.
 1.56 01-Feb-1996  jtc Rename struct timespec fields to conform to POSIX.1b
 1.55 31-Jan-1996  mycroft Don't specify a uid or gid in create operations; let the server fill it in.
 1.54 31-Jan-1996  mycroft Correct some uses of -1 and VNOVAL.
 1.53 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.52 14-Oct-1995  ghudson Add cookie support. Stash cookies in the word prior to the end of
each entry, and read them out in nfs_readdir().

Caveat: our current caching method for directory blocks uses the
server offset of the first directory entry as an identifier, so a
Linux emulation getdirentries() will wind up retrieving one block from
the NFS server for each directory entry, unnecessarily thrashing the
cache. The situation isn't as bad for other emulations.

Instead of getblk(), we need to write a routine to scan each cache
block associated with vp to find a cookie that matches at some
directory entry. Some later time.
 1.51 09-Oct-1995  mycroft branches: 1.51.2;
For now, return EINVAL if the client needs cookies.
 1.50 18-Mar-1995  gwr Initialize fields in the RPC data where we were sending garbage.
 1.49 10-Jan-1995  mycroft Make sure readdir requests are only truncated on block boundaries.
 1.48 29-Dec-1994  mycroft Minor consistency nits.
 1.47 29-Dec-1994  mycroft Remove a bit of redundant code.
 1.46 27-Dec-1994  mycroft Format police.
 1.45 27-Dec-1994  mycroft Fix typos in last change.
 1.44 24-Dec-1994  ws Implement and use a common access checking routine
 1.43 13-Dec-1994  mycroft Turn lease_check() into a vnode op, per CSRG.
 1.42 13-Dec-1994  mycroft Remove an old `#ifdef notyet'.
 1.41 20-Oct-1994  cgd update for new syscall args description mechanism
 1.40 30-Aug-1994  pk mknod() must release its new vnode.
 1.39 21-Aug-1994  mycroft Don't attempt to use IO_APPEND for NQNFS, as suggested by Rick Macklem.
 1.38 13-Aug-1994  pk Files with > 1 links can always be removed on the server, even if a
"silly name" exists.
 1.37 08-Aug-1994  deraadt delete unused extern decl
 1.36 12-Jul-1994  mycroft Bug fix from Rick Macklem for a problem with linking to an open file.
 1.35 03-Jul-1994  mycroft branches: 1.35.2;
Fix problem with O_TRUNC and NFS device nodes.
 1.34 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.33 22-Jun-1994  pk straighten out diskless swap code somewhat.
 1.32 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.31 19-May-1994  cgd stupidity for prototypes...
 1.30 25-Apr-1994  cgd some prototype cleanup, eliminate/replace bogus types (e.g. quad and
u_quad) -> use better types (e.g. quad_t & u_quad_t in inodes),
some cleanup.
 1.29 21-Apr-1994  cgd blow away all vestiges of nfsnode locking.
(1) it's unnecessary
(2) it causes machines to hang (yup!)
(3) it'd be gone in a few days anyway (it'd been yanked out
of 4.4-Lite by macklem long ago)
It was only there because macklem couldn't originally decide if things
should be locked, or not...
 1.28 21-Apr-1994  cgd Convert mount, vnode, and buf structs to use <sys/queue.h>. Also,
some knf and structure frobbing to do along with it.
 1.27 14-Apr-1994  pk Remove bogus type translation; instead, use IFTOVT again to go from
`NFS mode bits' to `vnode type'.
Use aliased vnode consistently.
 1.26 27-Mar-1994  cgd expand uid_t/gid_t/off_t
 1.25 09-Mar-1994  ws Make FFS optional
 1.24 15-Feb-1994  mycroft Macros bite again.
 1.23 15-Feb-1994  mycroft Format police.
 1.22 15-Feb-1994  pk Update {a,m}time vnode attributes on special files a la ufs_vnode.c,
but make it a non-urgent operation, to leave us some performance.
 1.21 06-Feb-1994  mycroft Eliminate some more uses of b_actl.
 1.20 10-Jan-1994  pk reparations...
 1.19 10-Jan-1994  pk Don't deny unlink()s of files with the "silly" bit on, but still have > 1 links,
but avoid doing gratuitous (possibly expensive) get_attr() calls.
 1.18 04-Jan-1994  cgd add support for union and loopback mounts, from jsp
 1.17 22-Dec-1993  cgd fix nfs_print, add cross-device link checking (From jsp)
 1.16 18-Dec-1993  mycroft Canonicalize all #includes.
 1.15 16-Dec-1993  pk Avoid dereferencing NULL pointer in nfs_doio() when B_PHYS is on.
Remove comment talking about nfsiomaps that we don't have.
Always use credentials that are in the buffer header, instead of trying
to get them from pageproc, which may once have been necessary to push pages
to swap (cannot imagine anyone having exercised this over NFS though).
 1.14 07-Dec-1993  cgd fix a goof that i made; return *before* nfs_lock() is called...
 1.13 20-Nov-1993  cgd do something better with lookup return values; suggested by BSDI's msdosfs mod
 1.12 12-Nov-1993  cgd new specfs.h and fifo.h locations
 1.11 07-Sep-1993  ws branches: 1.11.2;
Changes to VFS readdir semantics
NFS changes for better cookie support
ISOFS changes for better Rockridge support and support for generation numbers
 1.10 02-Aug-1993  mycroft Make return type of nfs_print be a void, not an int.
 1.9 13-Jul-1993  cgd get rid of some more bogus changes from a week ago
 1.8 13-Jul-1993  cgd diskless changes made last time were hosed; were using NULL for
"no credentials" rather than NOCRED.
 1.7 07-Jul-1993  cgd changes from ws to support diskless booting... these are "OK" on inspection
and after testing... (actually, currently, none of the changed
code is even used...)
 1.6 03-Jun-1993  cgd fix for macklem's bogus use of the va_flags field, supplied by
John Woods, jfwfrom: @ksr.com. also, fixes the following problems:
the va_gen field is in a similar position
(Suns are going to be reporting the change-date microseconds as their
"generation"), I've supplied my own set of diffs below for your inspection.
Note these aren't even compiled, but they're pretty similar to what I had
to do to our older version of OSF/1 here. (There's also an unrelated change
supplied for xdr_subs.h; the pointer types supplied to the fxdr_time() and
txdr_time() macros are not, in fact, both struct timevals. That turns out
to be one of many tips-of-the-iceberg facing those porting the (old) Berkeley
NFS code to 64-bit machines...)
 1.5 22-May-1993  cgd add Yuval Yarom's changes (originally for BSD/386) for advisory record
locking on NFS files. Note that this DOES NOT support network locking,
only local advisory locks.
 1.4 21-May-1993  cgd add rcsid again; fix RCS+crash fuckup
 1.3 10-Apr-1993  glass migrated code to make split possible
 1.2 21-Mar-1993  cgd after 0.2.2 "stable" patches applied
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.4 01-Mar-1998  fvdl Import some files that were changed after Lite2
 1.1.1.3 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.11.2.5 07-Dec-1993  cgd update from trunk
 1.11.2.4 20-Nov-1993  cgd do something better with lookup return values; suggested by BSDI's msdosfs mod
 1.11.2.3 12-Nov-1993  cgd new specfs.h and fifo.h locations, and include file syntax updates
 1.11.2.2 16-Oct-1993  mycroft Nuke #ifndef hp300 & i386 block of variable definitions that were never used.
 1.11.2.1 24-Sep-1993  mycroft Make all files using spl*() #include cpu.h. Changes from trunk.
nfs_vfsops.c, nfsmount.h: Make nfs_quotactl() take an int rather than a uid_t,
as it might be -1.
nfs_vnops.c: va_size and va_bytes are now quads.
 1.35.2.4 06-Oct-1994  mycroft Update from trunk.
 1.35.2.3 21-Aug-1994  mycroft update from trunk
 1.35.2.2 09-Aug-1994  mycroft update from trunk
 1.35.2.1 12-Jul-1994  cgd linking to open file fix, from trunk.
 1.51.2.2 02-Feb-1996  mycroft Bring in changes for mondo patch 2.
 1.51.2.1 17-Oct-1995  ghudson Update from main branch to get cookie support into the NFS readdir
vnode operation.
 1.62.4.3 05-Mar-1997  mycroft Pull up nfs_link() fix.
 1.62.4.2 04-Mar-1997  mycroft Pull up bug fixes from -current, per fvdl.
 1.62.4.1 08-Jul-1996  jtc Pulled up from rev 1.63 by request from Frank van der Linden
 1.69.4.1 12-Mar-1997  is Merge in changes from Trunk
 1.80.2.1 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.97.2.1 09-Nov-1998  chs initial snapshot. lots left to do.
 1.100.6.1 30-Nov-1999  itojun bring in latest KAME (as of 19991130, KAME/NetBSD141) into kame branch
just for reference purposes.
This commit includes 1.4 -> 1.4.1 sync for kame branch.

The branch does not compile at all (due to the lack of ALTQ and some other
source code). Please do not try to modify the branch, this is just for
reference purposes.

synchronization to latest KAME will take place on HEAD branch soon.
 1.100.4.5 02-Aug-1999  thorpej Update from trunk.
 1.100.4.4 11-Jul-1999  chs remove uvm_vnp_uncache(), it's no longer needed.
 1.100.4.3 04-Jul-1999  chs support VOP_BALLOC().
 1.100.4.2 21-Jun-1999  thorpej Sync w/ -current.
 1.100.4.1 07-Jun-1999  chs merge everything from chs-ubc branch.
 1.100.2.3 15-Aug-2000  he Apply patch (requested by fvdl):
Be careful about data consistency across operations which may
block. Should fix some reported nfs_lookup panics.
 1.100.2.2 05-Jan-2000  he Pull up revision 1.108 (via patch, requested by fvdl):
Insert an extra VOP_ACCESS check in nfs_lookup, preventing cached
access mishaps for lookup and getattr. Fixes PR#8884.
 1.100.2.1 22-Jun-1999  perry pullup 1.100->1.101 (fvdl): fix file creation with a Solaris 7 server
 1.106.8.2 27-Dec-1999  wrstuden Pull up to last week's -current.
 1.106.8.1 21-Dec-1999  wrstuden Initial commit of recent changes to make DEV_BSIZE go away.

Runs on i386, needs work on other arch's. Main kernel routines should be
fine, but a number of the stand programs need help.

cd, fd, ccd, wd, and sd have been updated. sd has been tested with non-512
byte block devices. vnd, raidframe, and lfs need work.

Non 2**n block support is automatic for LKM's and conditional for kernels
on "options NON_PO2_BLOCKS".
 1.106.4.1 19-Oct-1999  fvdl Bring in Kirk McKusick's FFS softdep code on a branch.
 1.106.2.6 23-Apr-2001  bouyer Sync with HEAD.
 1.106.2.5 11-Feb-2001  bouyer Sync with HEAD.
 1.106.2.4 13-Dec-2000  bouyer Sync with HEAD (for UBC fixes).
 1.106.2.3 08-Dec-2000  bouyer Sync with HEAD.
 1.106.2.2 22-Nov-2000  bouyer Sync with HEAD.
 1.106.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.113.4.4 29-Jan-2002  he Pull up revision 1.148 (requested by fvdl):
Set np->n_size before calling nfs_vinvalbuf. Avoids confusion as
to the actual file size.
 1.113.4.3 14-Dec-2000  he Pull up revisions 1.120,1.123 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.113.4.2 19-Sep-2000  fvdl Revision 1.118:
Fix bug in access cache that might result in permission being denied
needlessly. From Matthias Drochner.
(approved by thorpej)
 1.113.4.1 30-Jul-2000  jdolecek Pullup from trunk (approved by thorpej):
Change lf_advlock() to:
int lf_advlock (struct vop_advlock_args *, struct lockf **, off_t)

This matches its usage. Change inspired by FreeBSD, though we use
off_t instead of u_quad_t as the last argument.

sys/lockf.h rev. 1.9
msdosfs/msdosfs_vnops.c rev. 1.99
kern/vfs_lockf.c rev. 1.17
miscfs/specfs/spec_vnops.c rev. 1.49
nfs/nfs_vnops.c rev. 1.115
ufs/ext2fs/ext2fs_vnops.c rev. 1.28
ufs/ufs/ufs_vnops.c rev. 1.72
 1.130.2.17 11-Dec-2002  thorpej Sync with HEAD.
 1.130.2.16 11-Nov-2002  nathanw Catch up to -current
 1.130.2.15 22-Oct-2002  thorpej Sync with HEAD.
 1.130.2.14 18-Oct-2002  thorpej Sync with HEAD.
 1.130.2.13 15-Jul-2002  nathanw Whitespace.
 1.130.2.12 12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.130.2.11 24-Jun-2002  nathanw Curproc->curlwp renaming.

Change uses of "curproc->l_proc" back to "curproc", which is more like the
original use. Bare uses of "curproc" are now "curlwp".

"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc : NULL)
so that it is always safe to reference curproc (*de*referencing curproc
is another story, but that's always been true).
 1.130.2.10 20-Jun-2002  nathanw Catch up to -current.
 1.130.2.9 01-Apr-2002  nathanw Catch up to -current.
(CVS: It's not just a program. It's an adventure!)
 1.130.2.8 08-Jan-2002  nathanw Catch up to -current.
 1.130.2.7 14-Nov-2001  nathanw Catch up to -current.
 1.130.2.6 22-Oct-2001  nathanw Catch up to -current.
 1.130.2.5 26-Sep-2001  nathanw Catch up to -current.
Again.
 1.130.2.4 21-Sep-2001  nathanw Catch up to -current.
 1.130.2.3 24-Aug-2001  nathanw Catch up with -current.
 1.130.2.2 21-Jun-2001  nathanw Catch up to -current.
 1.130.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.134.2.8 30-Sep-2002  jdolecek add support for kevents to NFS
to detect file changes on server by other NFS clients, polling kernel thread
is used to periodically check for attribute changes of watched files;
the NFS server is only contacted when the vnode expires from local attrcache
(which takes 5-60 seconds currently), to keep network&CPU overhead low

the routine checking for remote changes is quite simplistic, but hopefully
doing its job well enough
 1.134.2.7 23-Sep-2002  jdolecek add spec kqfilter vnode op
 1.134.2.6 22-Sep-2002  jdolecek add fifo_kqfilter() to fifo ops, to switch on support for kevents
 1.134.2.5 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.134.2.4 16-Mar-2002  jdolecek Catch up with -current.
 1.134.2.3 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.134.2.2 25-Aug-2001  thorpej Merge Aug 24 -current into the kqueue branch.
 1.134.2.1 03-Aug-2001  lukem update to -current
 1.137.2.2 01-Oct-2001  fvdl Catch up with -current.
 1.137.2.1 18-Sep-2001  fvdl Various changes to make cloning devices possible:

* Add an extra argument (struct vnode **) to VOP_OPEN. If it is
not NULL, specfs will create a cloned (aliased) vnode during
the call, and return it there. The caller should release and
unlock the original vnode if a new vnode was returned. The
new vnode is returned locked.

* Add a flag field to the cdevsw and bdevsw structures.
DF_CLONING indicates that it wants a new vnode for each
open (XXX is there a better way? devprop?)

* If a device is cloning, always call the close entry
point for a VOP_CLOSE.


Also, rewrite cons.c to do the right thing with vnodes. Use VOPs
rather than direct device entry calls. Suggested by mycroft@

Light to moderate testing done on an i386 system (arch doesn't matter
though, these are MI changes).
 1.141.2.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.150.2.1 30-May-2002  gehenna Catch up with -current.
 1.151.2.1 18-Oct-2002  thorpej Pullup revision 1.152. Original log message:

nfs_remove(): Don't vput() the vnode twice if vp == dvp, vrele() and
vput() instead.
 1.175.2.12 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.175.2.11 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.175.2.10 04-Feb-2005  skrll Sync with HEAD.
 1.175.2.9 24-Jan-2005  skrll Sync with HEAD.
 1.175.2.8 17-Jan-2005  skrll Sync with HEAD.
 1.175.2.7 18-Dec-2004  skrll Sync with HEAD.
 1.175.2.6 19-Oct-2004  skrll Sync with HEAD
 1.175.2.5 24-Sep-2004  skrll Sync with HEAD.
 1.175.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.175.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.175.2.2 03-Aug-2004  skrll Sync with HEAD
 1.175.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.186.2.10 16-Mar-2005  tron Pull up revision 1.219 (requested by yamt in ticket #1208):
nfs_readdirrpc, nfs_readdirplusrpc:
avoid infinite loops when getting readdir response without
any entries or eof. PR/28971.
 1.186.2.9 16-Mar-2005  tron Pull up revision 1.206 (requested by yamt in ticket #1134):
nfs_commit: use NAC_NOTRUNC when loading an attribute
as we're called holding pages locked.
 1.186.2.8 11-Jan-2005  jmc Pullup patch (requested by yamt in ticket #1078)

Don't do kludge for a reply to a retransmitted request
unless we actually retransmitted the request.
 1.186.2.7 11-Jan-2005  jmc Pullup patch (requested by yamt in ticket #1077)

nfs_lookup: check n_nctime for positive entries as well to improve
cache consistency.
 1.186.2.6 10-Jul-2004  tron branches: 1.186.2.6.2;
Pull up revision 1.204 via patch (requested by yamt in ticket #636):
nfs_create: after an exclusive create rpc, make sure to update
timestamps, which were likely used to store the verifier.
reported by Mark Davies. PR/26200
 1.186.2.5 10-Jul-2004  tron Pull up revision 1.191 (requested by tls in ticket #634):
nfs_readdirplusrpc: fix a deadlock problem.
don't wait for vnode lock to load attributes.
otherwise, because READDIRPLUS returns DOTDOT entry as well,
we violate locking order.
 1.186.2.4 10-Jul-2004  tron Pull up revision 1.190 (requested by tls in ticket #634):
don't issue VOP_GETATTR blindly in nfs_nget().
in many cases, GETATTR RPCs here are redundant because the caller has
postop_attr. instead, make sure the resulting vnode has a valid
attribute in nfs_lookup().
 1.186.2.3 10-Jul-2004  tron Pull up revision 1.189 (requested by tls in ticket #634):
nfs_readdirplusrpc: purge existing namecache entry before entering a new one.
otherwise we'll get duplicated entries.
 1.186.2.2 10-Jul-2004  tron Pull up revision 1.188 (requested by tls in ticket #634):
when entering a namecache entry for nfs, make sure to update the appropriate
timestamp in the nfsnode so that we don't get namecache-miss when
looking up the node we just created.
 1.186.2.1 10-Jul-2004  tron Pull up revision 1.187 (requested by tls in ticket #634):
avoid unnecessary namecache purges in some places.
 1.186.2.6.2.5 27-Oct-2005  riz Pull up following revision(s) (requested by christos in ticket #5863):
sys/nfs/nfs_subs.c: revision 1.152 via patch
sys/nfs/nfs.h: revision 1.49
sys/nfs/nfs_vfsops.c: revision 1.149 via patch
usr.sbin/amd/include/config.h: revision 1.36
sys/nfs/nfs_vnops.c: revision 1.227 via patch
sys/nfs/nfsmount.h: revision 1.34
Allow the attribute cache to be turned off, and allow amd to do it.
 1.186.2.6.2.4 16-Mar-2005  tron Pull up revision 1.219 (requested by yamt in ticket #1208):
nfs_readdirrpc, nfs_readdirplusrpc:
avoid infinite loops when getting readdir response without
any entries or eof. PR/28971.
 1.186.2.6.2.3 16-Mar-2005  tron Pull up revision 1.206 (requested by yamt in ticket #1134):
nfs_commit: use NAC_NOTRUNC when loading an attribute
as we're called holding pages locked.
 1.186.2.6.2.2 11-Jan-2005  jmc Pullup patch (requested by yamt in ticket #1078)

Don't do kludge for a reply to a retransmitted request
unless we actually retransmitted the request.
 1.186.2.6.2.1 11-Jan-2005  jmc Pullup patch (requested by yamt in ticket #1077)

nfs_lookup: check n_nctime for positive entries as well to improve
cache consistency.
 1.215.2.1 29-Apr-2005  kent sync with -current
 1.217.2.2 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.217.2.1 12-Feb-2005  yamt sync with head.
 1.220.2.2 16-Jul-2006  ghen Pull up following revision(s) (requested by jld in ticket #1424):
sys/nfs/nfs_vnops.c: revision 1.240 via patch
sys/nfs/nfs_var.h: revision 1.62 via patch
Fix race condition in NFS renaming that could cause the renamed file to be
deleted (PR/33861).
 1.220.2.1 27-Sep-2005  tron branches: 1.220.2.1.2;
Pull up following revision(s) (requested by christos in ticket #816):
sys/nfs/nfs_vfsops.c: revision 1.149
sys/nfs/nfs_vnops.c: revision 1.227
ATTRTIMEO takes 2 args.
 1.220.2.1.2.1 16-Jul-2006  ghen Pull up following revision(s) (requested by jld in ticket #1424):
sys/nfs/nfs_vnops.c: revision 1.240 via patch
sys/nfs/nfs_var.h: revision 1.62 via patch
Fix race condition in NFS renaming that could cause the renamed file to be
deleted (PR/33861).
 1.222.2.11 27-Feb-2008  yamt revert incomplete nfs client locking for now.
 1.222.2.10 27-Feb-2008  yamt sync with head.
 1.222.2.9 15-Feb-2008  yamt - sprinkle some locks.
- disable MNT_UPDATE because it involves too much locking headache.
- don't overwrite other bits in v_vflags when setting VV_ROOT.
 1.222.2.8 04-Feb-2008  yamt sync with head.
 1.222.2.7 21-Jan-2008  yamt sync with head
 1.222.2.6 07-Dec-2007  yamt sync with head
 1.222.2.5 15-Nov-2007  yamt sync with head.
 1.222.2.4 03-Sep-2007  yamt sync with head.
 1.222.2.3 26-Feb-2007  yamt sync with head.
 1.222.2.2 30-Dec-2006  yamt sync with head.
 1.222.2.1 21-Jun-2006  yamt sync with head.
 1.227.2.1 20-Oct-2005  yamt adapt nfs.
 1.229.2.3 22-Nov-2005  yamt remove a whitespace change which is not related to this branch.
 1.229.2.2 18-Nov-2005  yamt - associate read-ahead context to vnode, rather than file.
- revert VOP_READ prototype.
 1.229.2.1 15-Nov-2005  yamt adapt ffs, lfs, nfs.
 1.230.6.3 01-Jun-2006  kardel Sync with head.
 1.230.6.2 22-Apr-2006  simonb Sync with head.
 1.230.6.1 04-Feb-2006  simonb Adapt for timecounters: mostly use get*time() and use "time_second"
instead of "time.tv_sec".
 1.230.4.1 09-Sep-2006  rpaulo sync with head
 1.230.2.2 15-Jan-2006  yamt rename VMSPACE_IS_KERNEL to VMSPACE_IS_KERNEL_P. ("predicate")
suggested by Matt Thomas.
 1.230.2.1 31-Dec-2005  yamt - adapt nfs.
- nfs_doio_read: #if 0 out "killproc if text is modified" part of
the code as it's broken. (a process reading the modified text is not
necessarily a process which is using the file as a text.)
 1.231.6.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.231.4.3 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.231.4.2 19-Apr-2006  elad sync with head.
 1.231.4.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.231.2.3 11-Aug-2006  yamt sync with head
 1.231.2.2 26-Jun-2006  yamt sync with head.
 1.231.2.1 24-May-2006  yamt sync with head.
 1.236.2.1 19-Jun-2006  chap Sync with head.
 1.237.2.1 13-Jul-2006  gdamore Merge from HEAD.
 1.241.6.2 10-Dec-2006  yamt sync with head.
 1.241.6.1 22-Oct-2006  yamt sync with head
 1.241.4.3 01-Feb-2007  ad Sync with head.
 1.241.4.2 12-Jan-2007  ad Sync with head.
 1.241.4.1 18-Nov-2006  ad Sync with head.
 1.245.2.1 17-Feb-2007  tron Apply patch (requested by chs in ticket #422):
- Fix various deadlock problems with nullfs and unionfs.
- Speed up path lookups by up to 25%.
 1.249.2.3 07-May-2007  yamt sync with head.
 1.249.2.2 12-Mar-2007  rmind Sync with HEAD.
 1.249.2.1 28-Feb-2007  yamt sync with head. (somehow missed in the previous)
 1.252.4.1 11-Jul-2007  mjf Sync with head.
 1.252.2.4 01-Sep-2007  ad Update for pool_cache API changes.
 1.252.2.3 08-Jun-2007  ad Sync with head.
 1.252.2.2 05-Apr-2007  ad Compile fixes.
 1.252.2.1 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.256.12.1 13-Nov-2007  bouyer Sync with HEAD
 1.256.8.4 23-Mar-2008  matt sync with HEAD
 1.256.8.3 09-Jan-2008  matt sync with HEAD
 1.256.8.2 08-Nov-2007  matt sync with -HEAD
 1.256.8.1 06-Nov-2007  matt sync with HEAD
 1.256.6.5 09-Dec-2007  jmcneill Sync with HEAD.
 1.256.6.4 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.256.6.3 14-Nov-2007  joerg Sync with HEAD.
 1.256.6.2 11-Nov-2007  joerg Sync with HEAD.
 1.256.6.1 29-Oct-2007  joerg Sync with HEAD.
 1.257.2.4 18-Feb-2008  mjf Sync with HEAD.
 1.257.2.3 27-Dec-2007  mjf Sync with HEAD.
 1.257.2.2 08-Dec-2007  mjf Sync with HEAD.
 1.257.2.1 19-Nov-2007  mjf Sync with HEAD.
 1.260.2.2 26-Dec-2007  ad Sync with head.
 1.260.2.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.261.4.1 02-Jan-2008  bouyer Sync with HEAD
 1.266.16.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.266.16.1 19-Oct-2008  haad Sync with HEAD.
 1.266.10.9 10-Oct-2010  yamt some locking changes
 1.266.10.8 26-Sep-2010  yamt locking changes
 1.266.10.7 11-Aug-2010  yamt sync with head.
 1.266.10.6 11-Mar-2010  yamt sync with head
 1.266.10.5 18-Jul-2009  yamt sync with head.
 1.266.10.4 24-Jun-2009  yamt lock vnode when calling VOP_GETATTR because there's no reasonable way for
an implementation of VOP_GETATTR to prevent the vnode from being revoked.
 1.266.10.3 16-May-2009  yamt sync with head
 1.266.10.2 04-May-2009  yamt sync with head.
 1.266.10.1 27-Apr-2008  yamt commit some work-in-progress changes to make nfs client mp-safe to a branch,
so that they won't get lost.
- sprinkle some locking
- mark the filesystem, nfstimer callout, and kq kthread mp-safe
- add assertions and comments
- disable upgrade mount for now
- some unrelated cosmetic changes
 1.266.6.1 17-Jan-2009  mjf Sync with HEAD.
 1.267.2.2 28-Apr-2009  skrll Sync with HEAD.
 1.267.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.268.4.2 29-Dec-2008  christos fix printf format.
 1.268.4.1 19-Nov-2008  christos file nfs_vnops.c was added on branch christos-time_t on 2008-12-29 00:14:38 +0000
 1.269.2.2 23-Jul-2009  jym Sync with HEAD.
 1.269.2.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.282.4.5 31-May-2011  rmind sync with head
 1.282.4.4 05-Mar-2011  rmind sync with head
 1.282.4.3 03-Jul-2010  rmind sync with head
 1.282.4.2 30-May-2010  rmind sync with head
 1.282.4.1 16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.282.2.3 06-Nov-2010  uebayasi Sync with HEAD.
 1.282.2.2 17-Aug-2010  uebayasi Sync with HEAD.
 1.282.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.289.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.290.2.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.292.2.5 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.292.2.4 16-Jan-2013  yamt sync with (a bit old) head
 1.292.2.3 30-Oct-2012  yamt sync with head
 1.292.2.2 23-May-2012  yamt sync with head.
 1.292.2.1 17-Apr-2012  yamt sync with head
 1.293.4.2 14-Jul-2016  snj Pull up following revision(s) (requested by hannken in ticket #1363):
sys/nfs/nfs_vnops.c: revision 1.309
Return an error if NFSPROC_LOOKUP returns the file handle of the current
directory. Treating it as DOT lookup would put garbage into the name
cache and could panic on future lookups.
Seen with ZFS file system exported from OmniOS, an OpenSolaris derivative.
Fixes PR kern/50664 "cd .." over NFS/ZFS can panic kernel
 1.293.4.1 12-Aug-2012  martin branches: 1.293.4.1.4; 1.293.4.1.6;
Pull up following revision(s) (requested by manu in ticket #484):
sys/fs/nilfs/nilfs_vnops.c: revision 1.18
sys/ufs/ufs/ufs_lookup.c: revision 1.117
sys/nfs/nfs_vnops.c: revision 1.295
sys/ufs/chfs/chfs_vnops.c: revision 1.8
sys/ufs/ext2fs/ext2fs_lookup.c: revision 1.70
sys/fs/unionfs/unionfs_vnops.c: revision 1.6
sys/kern/vfs_cache.c: revision 1.89
sys/fs/efs/efs_vnops.c: revision 1.26
sys/fs/hfs/hfs_vnops.c: revision 1.26
sys/fs/adosfs/adlookup.c: revision 1.16
sys/fs/puffs/puffs_vnops.c: revision 1.168
sys/fs/tmpfs/tmpfs_vnops.c: revision 1.98
sys/fs/ntfs/ntfs_vnops.c: revision 1.52
sys/fs/cd9660/cd9660_lookup.c: revision 1.20
sys/fs/msdosfs/msdosfs_lookup.c: revision 1.24
sys/fs/smbfs/smbfs_vnops.c: revision 1.80
sys/fs/udf/udf_vnops.c: revision 1.72
sys/fs/filecorefs/filecore_lookup.c: revision 1.14
sys/fs/puffs/puffs_node.c: revision 1.25
Move some of the tests for MAKEENTRY into cache_enter(9). Make some
variables in vfs_cache.c static, __read_mostly, etc.
No objection on tech-kern@.
 1.293.4.1.6.1 14-Jul-2016  snj Pull up following revision(s) (requested by hannken in ticket #1363):
sys/nfs/nfs_vnops.c: revision 1.309
Return an error if NFSPROC_LOOKUP returns the file handle of the current
directory. Treating it as DOT lookup would put garbage into the name
cache and could panic on future lookups.
Seen with ZFS file system exported from OmniOS, an OpenSolaris derivative.
Fixes PR kern/50664 "cd .." over NFS/ZFS can panic kernel
 1.293.4.1.4.1 14-Jul-2016  snj Pull up following revision(s) (requested by hannken in ticket #1363):
sys/nfs/nfs_vnops.c: revision 1.309
Return an error if NFSPROC_LOOKUP returns the file handle of the current
directory. Treating it as DOT lookup would put garbage into the name
cache and could panic on future lookups.
Seen with ZFS file system exported from OmniOS, an OpenSolaris derivative.
Fixes PR kern/50664 "cd .." over NFS/ZFS can panic kernel
 1.293.2.1 29-Apr-2012  mrg sync to latest -current.
 1.295.2.4 03-Dec-2017  jdolecek update from HEAD
 1.295.2.3 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.295.2.2 23-Jun-2013  tls resync from head
 1.295.2.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.299.6.1 18-May-2014  rmind sync with head
 1.304.2.1 10-Aug-2014  tls Rebase.
 1.306.4.3 28-Aug-2017  skrll Sync with HEAD
 1.306.4.2 19-Mar-2016  skrll Sync with HEAD
 1.306.4.1 06-Jun-2015  skrll Sync with HEAD
 1.306.2.2 06-Feb-2016  snj Pull up following revision(s) (requested by hannken in ticket #1094):
sys/nfs/nfs_vnops.c: revision 1.309
Return an error if NFSPROC_LOOKUP returns the file handle of the current
directory. Treating it as DOT lookup would put garbage into the name
cache and could panic on future lookups.
Seen with ZFS file system exported from OmniOS, an OpenSolaris derivative.
Fixes PR kern/50664 "cd .." over NFS/ZFS can panic kernel
 1.306.2.1 19-May-2015  snj branches: 1.306.2.1.2;
Pull up following revision(s) (requested by chs in ticket #769):
sys/nfs/nfs_vnops.c: revision 1.308
in nfs_writerpc(), avoid a signed/unsigned problem in computing the
number of bytes to back up in the uio when we need to resend a write RPC
(eg. after a server crash) on a 64-bit platform. should fix PR 35448.
 1.306.2.1.2.1 06-Feb-2016  snj Pull up following revision(s) (requested by hannken in ticket #1094):
sys/nfs/nfs_vnops.c: revision 1.309
Return an error if NFSPROC_LOOKUP returns the file handle of the current
directory. Treating it as DOT lookup would put garbage into the name
cache and could panic on future lookups.
Seen with ZFS file system exported from OmniOS, an OpenSolaris derivative.
Fixes PR kern/50664 "cd .." over NFS/ZFS can panic kernel
 1.310.12.4 21-Apr-2020  martin Sync with HEAD
 1.310.12.3 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.310.12.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.310.12.1 10-Jun-2019  christos Sync with HEAD
 1.310.10.1 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.310.4.1 12-Dec-2023  martin Pull up following revision(s) (requested by schmonz in ticket #1927):

sys/nfs/nfs_vnops.c: revision 1.325

NFS client: fix interop with macOS 14 servers.

Symptom: a bunch of "Cannot open `.' (Invalid argument)".
thorpej@ analysis and fix: on the first request to read a given
directory, make sure READDIR and READDIRPLUS cookie verifiers are
being set to 0. This is in RFC1813 and macOS must have gotten
stricter about it.

Verified on 10.0_RC1/aarch64 to fix the reproducers in PR kern/57691 as
well as the original use case in which I met the bug: pkg_rr once again
runs to completion.
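
The cookie verifier rule described in this entry can be sketched as follows. This is a minimal illustration rather than the code from revision 1.325: the struct and function names are hypothetical, and only the rule itself (zero verifier on the first request, server-supplied verifier afterwards) and the 8-byte verifier size come from RFC 1813.

```c
#include <stdint.h>
#include <string.h>

/* RFC 1813 defines NFS3_COOKIEVERFSIZE as 8; the name below is made up. */
#define EXAMPLE_COOKIEVERFSIZE 8

/* Hypothetical stand-in for the cookie part of READDIR/READDIRPLUS args. */
struct example_readdir_args {
	uint64_t cookie;
	unsigned char cookieverf[EXAMPLE_COOKIEVERFSIZE];
};

/*
 * Fill in the cookie and verifier for the next request.  A cookie of 0
 * means "start of directory", so the verifier must be all zeroes; for
 * later requests the verifier last returned by the server is echoed back.
 */
static void
example_readdir_prepare(struct example_readdir_args *args, uint64_t cookie,
    const unsigned char *server_verf)
{
	args->cookie = cookie;
	if (cookie == 0)
		memset(args->cookieverf, 0, sizeof(args->cookieverf));
	else
		memcpy(args->cookieverf, server_verf,
		    sizeof(args->cookieverf));
}
```

A caller would invoke example_readdir_prepare() with cookie 0 for the first READDIR of a directory, and with the last returned cookie and verifier for each continuation.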
 1.311.4.1 11-Dec-2023  martin Pull up following revision(s) (requested by schmonz in ticket #1778):

sys/nfs/nfs_vnops.c: revision 1.325

NFS client: fix interop with macOS 14 servers.

Symptom: a bunch of "Cannot open `.' (Invalid argument)".
thorpej@ analysis and fix: on the first request to read a given
directory, make sure READDIR and READDIRPLUS cookie verifiers are
being set to 0. This is in RFC1813 and macOS must have gotten
stricter about it.

Verified on 10.0_RC1/aarch64 to fix the reproducers in PR kern/57691 as
well as the original use case in which I met the bug: pkg_rr once again
runs to completion.
 1.312.2.1 29-Feb-2020  ad Sync with head.
 1.313.4.1 20-Apr-2020  bouyer Sync with HEAD
 1.317.6.1 01-Aug-2021  thorpej Sync with HEAD.
 1.324.4.1 11-Dec-2023  martin Pull up following revision(s) (requested by schmonz in ticket #490):

sys/nfs/nfs_vnops.c: revision 1.325

NFS client: fix interop with macOS 14 servers.

Symptom: a bunch of "Cannot open `.' (Invalid argument)".
thorpej@ analysis and fix: on the first request to read a given
directory, make sure READDIR and READDIRPLUS cookie verifiers are
being set to 0. This is in RFC1813 and macOS must have gotten
stricter about it.

Verified on 10.0_RC1/aarch64 to fix the reproducers in PR kern/57691 as
well as the original use case in which I met the bug: pkg_rr once again
runs to completion.
 1.4 08-Jun-1994  mycroft Clean up deleted files.
 1.3 20-May-1993  cgd more rcs id adding and header cleanup. i like vi macros!
 1.2 20-Apr-1993  mycroft Add consistent multiple-inclusion protection (repeat).
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.33 07-Dec-2024  riastradh sys/nfs/nfs: Add some missing includes and include guards.

Fix up some minor KNF issues while here.

No functional change intended (except to enable things to build that
might not have built before because of previously required #include
ordering).
 1.32 21-May-2015  rtr branches: 1.32.54;
change nfs_boot_sendrecv to take sockaddr_in * instead of mbuf *

fixes a single-mbuf leak (m_serv) in kern/subr_tftproot.c
 1.31 27-Mar-2015  hikaru m_pullup() is called in rcvproc callback functions,
so nfs_boot_sendrecv() should keep track of the head of mbuf chain.
fixes kern/48746
 1.30 04-Oct-2010  cyber branches: 1.30.14; 1.30.18; 1.30.34; 1.30.36;
Add support to honor MTU settings from DHCP during netboot.

Defines IP_MIN_MTU as 576.

Glanced over quickly by martin@ and joerg@.
 1.29 27-Oct-2008  cegger branches: 1.29.12; 1.29.14;
change nfs boot behaviour to automatically try the next boot method if the boot information is too incomplete to succeed.
That way, it is possible to combine static and dhcp boot:
For example, to boot diskless you can specify the nfs-server and the rootpath statically. All other information will be taken via dhcp.

Patch has been presented on port-xen, tech-kern and tech-net:
http://mail-index.netbsd.org/port-xen/2008/10/24/msg004488.html
http://mail-index.netbsd.org/tech-kern/2008/10/24/msg003255.html
http://mail-index.netbsd.org/tech-net/2008/10/24/msg000864.html

No comments, no objections.
 1.28 24-Oct-2008  cegger branches: 1.28.2;
- ansify function definition
- de- __P
- u_int32_t -> uint32_t

No functional changes.
 1.27 28-Apr-2008  martin branches: 1.27.6;
Remove clause 3 and 4 from TNF licenses
 1.26 08-Jul-2007  bouyer branches: 1.26.28; 1.26.30; 1.26.32;
Add a new BOOTSTATIC flag, NFS_BOOTSTATIC_NOSTATIC, which causes
nfs_bootstatic() to abort with EOPNOTSUPP. This allows a callback to
say that there is no bootstatic config, and the next NFS boot method should
be tried.
 1.25 08-May-2007  manu Add the TFTPROOT kernel option for TFTP'ing root RAMdisk at root mount time.
This allows working around situations where a kernel with embedded RAMdisk
cannot be booted by the bootloader because the RAMdisk is too big.
 1.24 11-Dec-2005  christos branches: 1.24.24; 1.24.26; 1.24.30; 1.24.32;
merge ktrace-lwp.
 1.23 22-May-2004  jonathan branches: 1.23.12;
Eliminate several uses of `curproc' from the socket-layer code and from NFS.

Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded by Matt Thomas.

Changes to soreceive() and (*dom->dom_externalize)() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize. Also, now that soreceive() uses
its uio->uio_procp value, pass that same value downward to
(*pr->pru_usrreq)() calls for consistency, instead of (struct proc *)0.

Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req-> r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.

Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
 1.22 01-May-2004  matt Commons are not allowed in header files. extern it.
 1.21 11-Mar-2004  cl Add static nfs boot configuration, from the kernel config file or from
a driver selectable callback function. This is used in the Xen port to
allow controlling the domain's network setup from the domain building
environment at domain creation (vs. having to maintain/change this on a
dhcp server). The Xen network driver parses a command line passed in
from the domain builder.
 1.20 29-Jun-2003  fvdl branches: 1.20.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.19 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.18 05-May-2003  yamt keep things not needed by userland in #ifdef _KERNEL.
(e.g. prototypes for in-kernel functions)
 1.17 01-Dec-2002  matt Add multiple inclusion protection.
 1.16 21-Feb-1999  drochner branches: 1.16.20;
restructure the diskless NFS boot code to keep track of the used
interface and the address allocated, to roll everything back if the
mount fails:
-put an interface pointer into "struct nfs_diskless" to have it
available for cleanup, don't pass it around anymore where the
"struct nfs_diskless" is already passed
-add a "cleanup" function which shuts the interface down
-in the protocol-specific parts, either return with "everything
ready" or "completely shut down"
-use common functions for interface initialization and shutdown
-add a function to delete all routes associated with an interface
(why is this necessary and not done by ~IFF_UP?)
g/c diskless swap stuff
general cleanup
 1.15 30-Sep-1997  drochner Factor out some functions used by bootparam and DHCP boot.
 1.14 09-Sep-1997  gwr Move the call to nfs_boot_getfh() from nfs_vfsops.c to nfs_boot.c
(just for better isolation - it can now be static)
 1.13 29-Aug-1997  gwr Supporting changes for the new BOOTP support in nfs_mountroot.
 1.12 14-Aug-1997  drochner 1. Allow setting a netmask (option NFS_BOOT_NETMASK) for the booting
interface. Without this, NFS_BOOT_NETMASK could be useless in
a subnetted environment.
2. Comment out unneeded NFS swap related stuff.
Closes PR kern/3918.
 1.11 27-May-1997  gwr branches: 1.11.4;
Minor reorganization of nfs_mountroot code to simplify BOOTP support.
The RPC/bootparamd calls to get the root and swap paths are now done
in nfs_boot_init() instead of nfs_boot_getfh(), so the latter now just
does the RPC/mountd call. Also changed some panics into error returns.
 1.10 20-Oct-1996  fvdl Enhancements from Matthias Drochner:
- Try V3 first for diskless booting. Fall back to V2 if V3 fails.
- optionally (option NFS_BOOT_TCP) try a TCP mount first
for diskless booting. Fall back to UDP if it fails.
- Enable switching between UDP and TCP for remounts.
 1.9 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.8 13-Feb-1996  gwr Do the RPC to bootparamd a little later (just before the mountd call)
so that we do not ask for the "swap" path when swapping on disk.
 1.7 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.6 13-Jun-1994  gwr New diskless boot code (uses RARP, bootparamd).
 1.5 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.4 07-Jul-1993  cgd changes from ws to support diskless booting... these are "OK" on inspection
and after testing... (actually, currently, none of the changed
code is even used...)
 1.3 20-May-1993  cgd more rcs id adding and header cleanup. i like vi macros!
 1.2 20-Apr-1993  cgd re-merged include file changes which got eaten by crash
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.3 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.11.4.4 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.11.4.3 16-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.11.4.2 01-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.11.4.1 23-Aug-1997  thorpej Update marc-pcmcia branch from trunk.
 1.16.20.1 11-Dec-2002  thorpej Sync with HEAD.
 1.20.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.20.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.20.2.2 03-Aug-2004  skrll Sync with HEAD
 1.20.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.23.12.2 03-Sep-2007  yamt sync with head.
 1.23.12.1 21-Jun-2006  yamt sync with head.
 1.24.32.1 11-Jul-2007  mjf Sync with head.
 1.24.30.2 15-Jul-2007  ad Sync with head.
 1.24.30.1 08-Jun-2007  ad Sync with head.
 1.24.26.1 17-May-2007  yamt sync with head.
 1.24.24.1 13-May-2007  jdc Pull up revision 1.25 (requested by manu in ticket #635).

Add the TFTPROOT kernel option for TFTP'ing root RAMdisk at root mount time.
This allows working around situations where a kernel with embedded RAMdisk
cannot be booted by the bootloader because the RAMdisk is too big.
 1.26.32.3 09-Oct-2010  yamt sync with head
 1.26.32.2 04-May-2009  yamt sync with head.
 1.26.32.1 16-May-2008  yamt sync with head.
 1.26.30.1 18-May-2008  yamt sync with head.
 1.26.28.2 17-Jan-2009  mjf Sync with HEAD.
 1.26.28.1 02-Jun-2008  mjf Sync with HEAD.
 1.27.6.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.28.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.29.14.1 05-Mar-2011  rmind sync with head
 1.29.12.1 22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.30.36.2 06-Jun-2015  skrll Sync with HEAD
 1.30.36.1 06-Apr-2015  skrll Sync with HEAD
 1.30.34.1 06-Apr-2015  snj Pull up following revision(s) (requested by hikaru in ticket #656):
sys/kern/subr_tftproot.c: revision 1.14
sys/nfs/krpc_subr.c: revision 1.39
sys/nfs/nfs_boot.c: revision 1.82
sys/nfs/nfs_bootdhcp.c: revision 1.53
sys/nfs/nfsdiskless.h: revision 1.31
m_pullup() is called in rcvproc callback functions,
so nfs_boot_sendrecv() should keep track of the head of mbuf chain.
fixes kern/48746
 1.30.18.1 03-Dec-2017  jdolecek update from HEAD
 1.30.14.1 16-Apr-2015  msaitoh Pull up following revision(s) (requested by hikaru in ticket #1287):
sys/kern/subr_tftproot.c: revision 1.14 via patch
sys/nfs/nfsdiskless.h: revision 1.31
sys/nfs/nfs_boot.c: revision 1.82
sys/nfs/krpc_subr.c: revision 1.39
sys/nfs/nfs_bootdhcp.c: revision 1.53
m_pullup() is called in rcvproc callback functions,
so nfs_boot_sendrecv() should keep track of the head of mbuf chain.
fixes kern/48746
 1.32.54.1 02-Aug-2025  perseant Sync with HEAD
 1.4 08-Jun-1994  mycroft Clean up deleted files.
 1.3 20-May-1993  cgd more rcs id adding and header cleanup. i like vi macros!
 1.2 20-Apr-1993  mycroft Add consistent multiple-inclusion protection (repeat).
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.59 07-Dec-2024  riastradh sys/nfs/nfs: Add some missing includes and include guards.

Fix up some minor KNF issues while here.

No functional change intended (except to enable things to build that
might not have built before because of previously required #include
ordering).
 1.58 05-Jul-2024  rin sys: Drop redundant NULL check before m_freem(9)

m_freem(9) safely has accepted NULL argument at least since 4.2BSD:
https://www.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/src/sys/sys/uipc_mbuf.c

Compile-tested on amd64/ALL.

Suggested by knakahara@
 1.57 23-Mar-2023  riastradh branches: 1.57.6;
nfs: Use unsigned name lengths so we don't trip over negative ones.

- nfsm_strsiz is only used with uint32_t in callers, but let's not
leave it as a rake to step on.

- nfsm_srvnamesiz is abused with signed s. The internal conversion
to unsigned serves to reject both negative and too-large values in
such callers.

XXX Should make all callers use unsigned, rather than flipping back
and forth between signed and unsigned for name lengths.

XXX pullup-8
XXX pullup-9
XXX pullup-10
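
The "internal conversion to unsigned" mentioned in this entry can be sketched in isolation as follows. This is an illustration under assumptions, not the nfsm_srvnamesiz macro itself: the function name and the limit constant are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical upper bound for this sketch. */
#define EXAMPLE_MAXNAMLEN 255

/*
 * Check a length that arrived in a signed variable.  Converting to
 * uint32_t maps every negative value to something >= 2^31, so a single
 * unsigned comparison rejects both negative and too-large lengths.
 */
static bool
example_name_len_ok(int32_t s)
{
	return (uint32_t)s <= EXAMPLE_MAXNAMLEN;
}
```

With this check, a length of -1 is treated as 4294967295 and fails the comparison, the same way 256 does.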
 1.56 23-Mar-2023  riastradh nfs: Use unsigned fhlen so we don't trip over negative values.

XXX pullup-8
XXX pullup-9
XXX pullup-10
 1.55 12-Aug-2021  andvar branches: 1.55.4;
s/directry/directory/
 1.54 04-Apr-2020  mlelstv NFSv2 is limited to use only 32bit in metadata. Prevent larger
metadata values from being simply truncated.

-> clamp filesystem block counts to signed 32bit.
-> clamp file sizes to signed 32bit (*)

Some NFSv2 clients also have problems handling buffer sizes larger
than (signed) 16bit.
-> clamp buffer sizes to signed 16bit for better compatibility.

(*) This can lead to erroneous behaviour for files larger than 2GB
that NFSv2 cannot handle but it is still better than before.
An alternative would be to (partially) reject operations on files
larger than 2GB, but that causes other problems.
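
The difference between clamping and truncating can be sketched as follows. The helper names are hypothetical and this is not the code from the commit; it only illustrates the saturating behaviour the entry describes.

```c
#include <stdint.h>

/*
 * Clamp a 64-bit quantity (file size, block count) to the largest value
 * a signed 32-bit field can hold, instead of letting the high bits be
 * silently dropped by a plain cast.
 */
static uint32_t
example_clamp_int32(uint64_t v)
{
	return v > INT32_MAX ? (uint32_t)INT32_MAX : (uint32_t)v;
}

/* Same idea for buffer sizes, limited to the signed 16-bit range. */
static uint32_t
example_clamp_int16(uint64_t v)
{
	return v > INT16_MAX ? (uint32_t)INT16_MAX : (uint32_t)v;
}
```

For a 4GB file, a plain cast to 32 bits would report a size of 0, while the clamped value reports 2147483647, which is wrong but far less misleading to the client.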
 1.53 14-Sep-2013  martin branches: 1.53.22; 1.53.30; 1.53.34;
Backout wildcard pragma to kill warnings and instead sprinkle a few dozen
__unused attributes.
Requested by joerg@
 1.52 14-Sep-2013  martin Silence gcc 4.8.1 warnings
 1.51 10-Apr-2009  bouyer PR kern/41158: nfs_rename() locking against myself
nfsrv_rename() can exit without calling genfs_renamelock_exit() because
the nfsm_reply() can do return (0) on error.
Change nfsm_reply to use 'error = 0; goto nfsmout' instead.
Fix a few places so it's safe to goto nfsmout from nfsm_reply, or other
macros calling it.
As a side effect it could fix a missing vrele(dirp) in various places where
nfsm_reply could return(0).
 1.50 04-Mar-2007  christos branches: 1.50.40; 1.50.50; 1.50.52; 1.50.56;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.49 22-Feb-2007  thorpej TRUE -> true, FALSE -> false
 1.48 21-Feb-2007  thorpej Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.47 02-Sep-2006  yamt branches: 1.47.8;
nfsd: deal with variable-sized filehandles.
 1.46 08-Aug-2006  yamt nfsm_srvfhtom: ensure that padding bytes in nfsv2 file handles are zero.
 1.45 07-Jun-2006  kardel branches: 1.45.4;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.44 11-Dec-2005  christos branches: 1.44.4; 1.44.6; 1.44.8; 1.44.14;
merge ktrace-lwp.
 1.43 31-Oct-2005  thorpej Fix paste-o in the NFSV3SATTRTIME_TOSERVER case of mtime handling (need
to set va_mtime, not va_atime).
 1.42 01-Oct-2005  yamt branches: 1.42.2;
nfsm_srvsattr: use nanotime(9) rather than time(9) for NFSV3SATTRTIME_TOSERVER.
 1.41 29-May-2005  christos branches: 1.41.2;
- sprinkle const
- avoid shadowed variables
- mark bad const use with XXXUNCONST
 1.40 26-Feb-2005  perry nuke trailing whitespace
 1.39 19-Jan-2005  yamt branches: 1.39.2;
implement inaccurate mtime/ctime detection.
namely, if mtime or ctime are same between pre_op_attr and post_op_attr
when we expected them to be changed, don't trust the server.
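The check described in revision 1.39 can be sketched roughly as follows. This is an assumption-laden illustration rather than the committed code: `ts_equal` and `time_looks_inaccurate` are hypothetical names, and the real implementation works on the pre_op_attr and post_op_attr data carried in the NFSv3 reply.

```c
#include <stdbool.h>
#include <time.h>

/* Compare two timestamps at full (nanosecond) resolution. */
static bool ts_equal(const struct timespec *a, const struct timespec *b)
{
	return a->tv_sec == b->tv_sec && a->tv_nsec == b->tv_nsec;
}

/*
 * Hypothetical helper: if an operation was expected to change the
 * timestamp but the value before and after is identical, the server's
 * clock granularity is too coarse for the timestamp to be trusted.
 */
static bool time_looks_inaccurate(const struct timespec *pre,
    const struct timespec *post, bool expected_change)
{
	return expected_change && ts_equal(pre, post);
}
```

A caller would presumably fall back to more conservative cache invalidation whenever the helper returns true.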
 1.38 29-Sep-2004  yamt branches: 1.38.4;
g/c NFSMINOFF, which is unused and identical with MRESETDATA.
 1.37 10-May-2004  yamt don't do kludge for a reply to a retransmitted request
unless we actually retransmitted the request.
 1.36 05-Apr-2004  yamt nfsm_mtofh: handle the case where the filehandle exists but fattr does not.
 1.35 05-Apr-2004  yamt nfsm_wcc_data: update n_ctime and n_nctime if no one other than us
changed the file in the meantime so that we won't invalidate caches
unnecessarily due to our own activities.
 1.34 19-Mar-2004  yamt branches: 1.34.2;
comments on some nfsm_ macros.
 1.33 15-Mar-2004  yamt some comments on cryptic nfsm_ macros.
 1.32 26-Sep-2003  yamt change n_mtime from time_t to timespec in order to improve
cache consistency.
(1 second granularity is too loose these days.)
 1.31 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.30 29-Jun-2003  fvdl branches: 1.30.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.29 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.28 09-Jun-2003  yamt rework zero padding of rpc reply.
- for READ procedure, don't send back more bytes than requested.
- don't have doubtful assumptions on mbuf chain structure.
- rename a function (nfsm_adj -> nfs_zeropad) to avoid confusion as
the semantics of the function was changed.
 1.27 06-May-2003  yamt remove nfsm_srvstrsiz as it's no longer used.
 1.26 24-Apr-2003  drochner Change some subordinate functions to take a "struct nfsnode" argument
instead of "struct vnode". This saves a number of pointer dereferences;
it sums up to about half a kB for me. And it paves the way for future
fixes.
While cleaning up, eliminate a write-only member of "struct nfsreq"
and a pointless assignment in the NFS_V2_ONLY case.
 1.25 28-Mar-2003  yamt reply ENAMETOOLONG properly instead of discarding request as BADRPC.
my own PR20791.
 1.24 26-Feb-2003  matt Fix typo.
 1.23 26-Feb-2003  matt Add MBUFTRACE kernel option.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
 1.22 21-Oct-2002  yamt fix a page locking deadlock problem for nfs.

add a flag that specify if the file can be truncated safely or not
to nfsm_loadattr and friends. when it isn't safe, just mark the nfsnode
as "should be truncated later".

ok'ed by Frank van der Linden and Chuck Silvers.
close kern/18036.
 1.21 03-Apr-2002  wrstuden In the SETATTR code, if the changes to a & m time are exclusively
set via NFSV3SATTRTIME_TOSERVER and not NFSV3SATTRTIME_TOCLIENT,
add VA_UTIMES_NULL to the va_vflags. This reflects our policy
where we're much more liberal about who can set a & m times to 'now'
than we are about who can set them to a specific time.

Should close PR 15597 from Martin Husemann. Patch is based on the
one Matthias Drochner gave in the PR.
 1.20 29-May-1999  fvdl branches: 1.20.14; 1.20.16;
Be more correct with attribute structures for setattr RPCs and friends,
so that picky servers (e.g. Solaris 7) don't refuse our requests. Move
some code into a macro, and a bit of KNF. From OpenBSD.
 1.19 06-Mar-1999  fair branches: 1.19.2; 1.19.4; 1.19.6;
Snatch a patch from OpenBSD to fix PRs 6529 and 7074.
Adjust fxdr_hyper() and txdr_hyper() macros.
 1.18 09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.17 14-Jul-1997  fvdl Don't assume that pointers into mbuf data remain valid across nfsm_dissect.
In readdirplus, don't keep such pointers but store the file attributes
in a variable instead until they are needed. Change nfsm_loadattr*
a bit so it can accept a direct pointer to an nfs_fattr structure.
 1.16 24-Jun-1997  fvdl Let nfsm_srvmtofh deal with the public filehandle, convert to all zeroes
for both v2 and v3 internally.
 1.15 27-Mar-1997  thorpej Don't assume mbuf external storage is MCLBYTES.
 1.14 24-Feb-1997  fvdl Use ALIGNED_POINTER to see whether mbuf data needs to be realigned.
 1.13 22-Feb-1997  fvdl Cast pointer to u_long, not int, when doing the alignment check.
Fixes warnings on the Alpha. Needs a better solution.
 1.12 22-Feb-1997  fvdl Fixes from BSDI (thanks go to Keith Bostic). Original RCS message:

date: 1997/02/10 18:41:15; author: cp; state: Exp; lines: +8 -2
Make nfs_realign go away on sparc and add functionality to nfsm_disct.

===
[XXX this introduces an ifdef __i386__, see the comment. Should be changed]
 1.11 25-Oct-1996  cgd branches: 1.11.4;
make the namei struct members ni_dirp and ni_next, and the componentname
struct member cn_nameptr 'const', since they should never be used to
modify the path name. (Only the pathname buffer, cn_pnbuf, should be
modified.) Propagate the const poisoning to code that uses the namei
and componentname structs.
 1.10 20-Mar-1996  fvdl Make sure not to free the reply mbuf twice. Should fix PR #2240
 1.9 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.8 09-Feb-1996  christos nfs prototype changes
 1.7 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.6 23-May-1995  mycroft Remove gratuitous extra indirections.
 1.5 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.4 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.3 03-Jun-1993  cgd fix for macklem's bogus use of the va_flags field, supplied by
John Woods, jfwfrom: @ksr.com. also, fixes the following problems:
the va_gen field is in a similar position
(Suns are going to be reporting the change-date microseconds as their
"generation"), I've supplied my own set of diffs below for your inspection.
Note these aren't even compiled, but they're pretty similar to what I had
to do to our older version of OSF/1 here. (There's also an unrelated change
supplied for xdr_subs.h; the pointer types supplied to the fxdr_time() and
txdr_time() macros are not, in fact, both struct timevals. That turns out
to be one of many tips-of-the-iceberg facing those porting the (old) Berkeley
NFS code to 64-bit machines...)
 1.2 20-May-1993  cgd more rcs id adding and header cleanup. i like vi macros!
 1.1 20-Apr-1993  mycroft branches: 1.1.1;
Restore files lost during crash.
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.11.4.1 12-Mar-1997  is Merge in changes from Trunk
 1.19.6.1 30-Nov-1999  itojun bring in latest KAME (as of 19991130, KAME/NetBSD141) into kame branch
just for reference purposes.
This commit includes 1.4 -> 1.4.1 sync for kame branch.

The branch does not compile at all (due to the lack of ALTQ and some other
source code). Please do not try to modify the branch, this is just for
reference purposes.

synchronization to latest KAME will take place on HEAD branch soon.
 1.19.4.1 21-Jun-1999  thorpej Sync w/ -current.
 1.19.2.1 22-Jun-1999  perry pullup 1.19->1.20 (fvdl): fix file creation with a Solaris 7 server
 1.20.16.1 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.20.14.2 22-Oct-2002  thorpej Sync with HEAD.
 1.20.14.1 17-Apr-2002  nathanw Catch up to -current.
 1.30.2.8 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.30.2.7 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.30.2.6 24-Jan-2005  skrll Sync with HEAD.
 1.30.2.5 19-Oct-2004  skrll Sync with HEAD
 1.30.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.30.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.30.2.2 03-Aug-2004  skrll Sync with HEAD
 1.30.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.34.2.3 11-Jan-2005  jmc Pullup patch (requested by yamy in ticket #1078)

Don't do kludge for a reply to a retransmitted request
unless we actually retransmitted the request.
 1.34.2.2 10-Jul-2004  tron branches: 1.34.2.2.2;
Pull up revision 1.36 (requested by tls in ticket #634):
nfsm_mtofh: handle the case where the filehandle exists but fattr does not.
 1.34.2.1 10-Jul-2004  tron Pull up revision 1.35 (requested by tls in ticket #634):
nfsm_wcc_data: update n_ctime and n_nctime if no one other than us
changed the file in the meantime so that we won't invalidate caches
unnecessarily due to our own activities.
 1.34.2.2.2.1 11-Jan-2005  jmc Pullup patch (requested by yamy in ticket #1078)

Don't do kludge for a reply to a retransmitted request
unless we actually retransmitted the request.
 1.38.4.1 29-Apr-2005  kent sync with -current
 1.39.2.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.41.2.4 03-Sep-2007  yamt sync with head.
 1.41.2.3 26-Feb-2007  yamt sync with head.
 1.41.2.2 30-Dec-2006  yamt sync with head.
 1.41.2.1 21-Jun-2006  yamt sync with head.
 1.42.2.1 02-Nov-2005  yamt sync with head.
 1.44.14.1 19-Jun-2006  chap Sync with head.
 1.44.8.3 03-Sep-2006  yamt sync with head.
 1.44.8.2 11-Aug-2006  yamt sync with head
 1.44.8.1 26-Jun-2006  yamt sync with head.
 1.44.6.1 04-Feb-2006  simonb Adapt for timecounters: mostly use get*time() and use "time_second"
instead of "time.tv_sec".
 1.44.4.1 09-Sep-2006  rpaulo sync with head
 1.45.4.1 16-Aug-2006  tron Pull up following revision(s) (requested by yamt in ticket #24):
sys/nfs/nfsm_subs.h: revision 1.46
nfsm_srvfhtom: ensure that padding bytes in nfsv2 file handles are zero.
 1.47.8.2 12-Mar-2007  rmind Sync with HEAD.
 1.47.8.1 28-Feb-2007  yamt sync with head. (somehow missed in the previous)
 1.50.56.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.50.52.1 13-Apr-2009  snj Pull up following revision(s) (requested by ad in ticket #700):
sys/nfs/nfs_serv.c: revision 1.144
sys/nfs/nfsm_subs.h: revision 1.51
PR kern/41158: nfs_rename() locking against myself
nfsrv_rename() can exit without calling genfs_renamelock_exit() because
the nfsm_reply() can do return (0) on error.
Change nfsm_reply to use 'error = 0; goto nfsmout' instead.
Fix a few places so it's safe to goto nfsmout from nfsm_reply, or other
macros calling it.
As a side effect it could fix a missing vrele(dirp) in various places where
nfsm_reply could return(0).
 1.50.50.1 28-Apr-2009  skrll Sync with HEAD.
 1.50.40.1 04-May-2009  yamt sync with head.
 1.53.34.1 30-Mar-2023  martin Pull up following revision(s) (requested by riastradh in ticket #1617):

sys/nfs/nfs_serv.c: revision 1.184
sys/nfs/nfs_srvsubs.c: revision 1.17
sys/nfs/nfsm_subs.h: revision 1.56
sys/nfs/nfsm_subs.h: revision 1.57

nfs: Use unsigned fhlen so we don't trip over negative values.

nfs: Avoid integer overflow in nfs_namei bounds check.

nfs: Use unsigned name lengths so we don't trip over negative ones.
- nfsm_strsiz is only used with uint32_t in callers, but let's not
leave it as a rake to step on.
- nfsm_srvnamesiz is abused with signed s. The internal conversion
to unsigned serves to reject both negative and too-large values in
such callers.
XXX Should make all callers use unsigned, rather than flipping back
and forth between signed and unsigned for name lengths.

nfs: Avoid free of uninitialized on bad name size in create, mknod.
XXX These error branches are a nightmare and need to be more
systematically cleaned up. Even if they are correct now, they are
impossible to audit and extremely fragile in case anyone ever needs
to make other changes to them.
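The "internal conversion to unsigned" mentioned for nfsm_srvnamesiz can be illustrated with a small sketch. The names `NAME_MAX_SKETCH` and `name_len_ok` are hypothetical and the limit of 255 is an assumption chosen for the example; the point is only that a single unsigned comparison rejects both negative and too-large lengths.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical upper bound on a name length, for illustration only. */
#define NAME_MAX_SKETCH 255u

/*
 * The caller holds a signed length, as some nfsm_srvnamesiz callers do.
 * Converting to uint32_t maps every negative value to something of at
 * least 2^31, so the one comparison below rejects negative lengths as
 * well as lengths above the limit.
 */
static bool name_len_ok(int32_t s)
{
	uint32_t len = (uint32_t)s;

	return len <= NAME_MAX_SKETCH;
}
```

With this shape, `name_len_ok(-1)` and `name_len_ok(256)` both fail while `name_len_ok(255)` passes, without a separate sign test.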
 1.53.30.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.53.22.1 30-Mar-2023  martin Pull up following revision(s) (requested by riastradh in ticket #1810):

sys/nfs/nfs_serv.c: revision 1.184
sys/nfs/nfs_srvsubs.c: revision 1.17
sys/nfs/nfsm_subs.h: revision 1.56
sys/nfs/nfsm_subs.h: revision 1.57

nfs: Use unsigned fhlen so we don't trip over negative values.

nfs: Avoid integer overflow in nfs_namei bounds check.

nfs: Use unsigned name lengths so we don't trip over negative ones.
- nfsm_strsiz is only used with uint32_t in callers, but let's not
leave it as a rake to step on.
- nfsm_srvnamesiz is abused with signed s. The internal conversion
to unsigned serves to reject both negative and too-large values in
such callers.
XXX Should make all callers use unsigned, rather than flipping back
and forth between signed and unsigned for name lengths.

nfs: Avoid free of uninitialized on bad name size in create, mknod.
XXX These error branches are a nightmare and need to be more
systematically cleaned up. Even if they are correct now, they are
impossible to audit and extremely fragile in case anyone ever needs
to make other changes to them.
 1.55.4.1 30-Mar-2023  martin Pull up following revision(s) (requested by riastradh in ticket #134):

sys/nfs/nfs_serv.c: revision 1.184
sys/nfs/nfs_srvsubs.c: revision 1.17
sys/nfs/nfsm_subs.h: revision 1.56
sys/nfs/nfsm_subs.h: revision 1.57

nfs: Use unsigned fhlen so we don't trip over negative values.

nfs: Avoid integer overflow in nfs_namei bounds check.

nfs: Use unsigned name lengths so we don't trip over negative ones.
- nfsm_strsiz is only used with uint32_t in callers, but let's not
leave it as a rake to step on.
- nfsm_srvnamesiz is abused with signed s. The internal conversion
to unsigned serves to reject both negative and too-large values in
such callers.
XXX Should make all callers use unsigned, rather than flipping back
and forth between signed and unsigned for name lengths.

nfs: Avoid free of uninitialized on bad name size in create, mknod.
XXX These error branches are a nightmare and need to be more
systematically cleaned up. Even if they are correct now, they are
impossible to audit and extremely fragile in case anyone ever needs
to make other changes to them.
 1.57.6.1 02-Aug-2025  perseant Sync with HEAD
 1.54 07-Dec-2024  riastradh sys/nfs/nfs: Add some missing includes and include guards.

Fix up some minor KNF issues while here.

No functional change intended (except to enable things to build that
might not have built before because of previously required #include
ordering).
 1.53 15-Jul-2015  manu branches: 1.53.54;
Fix soft NFS force unmount

For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that await an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.

Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.

Reviewed by Chuck Silvers.
 1.52 30-May-2014  hannken branches: 1.52.2; 1.52.4;
Change NFS from rbtree to vcache.
 1.51 22-Jan-2011  matt branches: 1.51.14; 1.51.28;
Add the ability to mount NFS filesystems in COMPAT_NETBSD32
If in the kernel and NFS_ARGS_ONLY, just export struct nfs_args and its flags.
 1.50 25-Sep-2010  matt branches: 1.50.2; 1.50.4;
Rename rb.h to rbtree.h, as it is more appropriate (c.f. ptree.h). Also
helps find code that hasn't been updated to use the new rbtree API.
 1.49 14-Mar-2009  dsl branches: 1.49.2; 1.49.4;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.48 22-Oct-2008  matt branches: 1.48.2; 1.48.8;
Don't need nfs_vfs_reinit anymore since we don't resize tables anymore.
Move reinit code to init case.
 1.47 22-Oct-2008  matt Change NFS to use a RB-tree for its FH->nfsnode lookups.
 1.46 31-Jul-2007  pooka branches: 1.46.24; 1.46.28; 1.46.34; 1.46.36;
* nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
use VFS_PROTOS() instead of manually prototyping the methods
 1.45 12-Jul-2007  dsl branches: 1.45.2;
Change the VFS_MOUNT() interface so that the 'data' buffer passed to the
fs code is a kernel buffer, pass through the length of the buffer as well.
Since the length of the userspace buffer isn't (yet) passed through the mount
system call, add a field to the vfsops structure containing the default length.
Split sys_mount() for calls from compat code.
Ride one of the recent kernel version changes - old fs LKMs will load, but
sys_mount() will reject any attempt to use them.
 1.44 29-Apr-2007  yamt use condvar.
 1.43 29-Apr-2007  yamt use mutex and condvar.
 1.42 15-Feb-2007  yamt branches: 1.42.2; 1.42.6; 1.42.8;
use mutex and rwlock rather than lockmgr.
 1.41 27-Dec-2006  yamt remove nqnfs.
 1.40 13-Jul-2006  martin branches: 1.40.4;
Fix alignment problems for fhandle_t, exposed by gcc4.1.

While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ,
version the getfh(2) syscall and explicitly pass the size available in
the filehandle from userland.

Discussed on tech-kern, with lots of help from yamt (thanks!).
 1.39 14-May-2006  elad branches: 1.39.4;
integrate kauth.
 1.38 14-Apr-2006  blymn Make i/o statistics collection more generic, include tape drives and
nfs mounts in the set of devices that statistics will be reported on.
 1.37 11-Dec-2005  christos branches: 1.37.4; 1.37.6; 1.37.8; 1.37.10; 1.37.12;
merge ktrace-lwp.
 1.36 25-Nov-2005  thorpej Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().
 1.35 23-Sep-2005  jmmv branches: 1.35.6;
Apply the NFS exports list rototill patch:

- Remove all NFS related stuff from file system specific code.
- Drop the vfs_checkexp hook and generalize it in the new nfs_check_export
function, thus removing redundancy from all file systems.
- Move all NFS export-related stuff from kern/vfs_subr.c to the new
file sys/nfs/nfs_export.c. The former was becoming large and its code
is always compiled, regardless of the build options. Using the latter,
the code is only compiled in when NFSSERVER is enabled. While doing this,
also make some functions in nfs_subs.c conditional to NFSSERVER.
- Add a new command in nfssvc(2), called NFSSVC_SETEXPORTSLIST, that takes a
path and a set of export entries. At the moment it can only clear the
exports list or append entries, one by one, but it is done in a way that
allows setting the whole set of entries atomically in the future (see the
comment in mountd_set_exports_list or in doc/TODO).
- Change mountd(8) to use the nfssvc(2) system call instead of mount(2) so
that it becomes file system agnostic. In fact, all this whole thing was
done to remove a 'XXX' block from this utility!
- Change the mount*, newfs and fsck* userland utilities to not deal with NFS
exports initialization; done internally by the kernel when initializing
the NFS support for each file system.
- Implement an interface for VFS (called VFS hooks) so that several kernel
subsystems can run arbitrary code upon receipt of specific VFS events.
At the moment, this only provides support for unmount and is used to
destroy NFS exports lists from the file systems being unmounted, though it
has room for extension.

Thanks go to yamt@, chs@, thorpej@, wrstuden@ and others for their comments
and advice in the development of this patch.
 1.34 18-Sep-2005  christos Allow turning off the attribute cache.
 1.33 19-Jan-2005  yamt branches: 1.33.6; 1.33.8;
implement inaccurate mtime/ctime detection.
namely, if mtime or ctime are same between pre_op_attr and post_op_attr
when we expected them to be changed, don't trust the server.
 1.32 22-May-2004  jonathan branches: 1.32.4;
Eliminate several uses of `curproc' from the socket-layer code and from NFS.

Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded by Matt Thomas.

Changes to soreceive() and (*dom->dom_externalize)() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize. Also, now soreceive() uses
its uio->uio_procp value, pass that same value downward to
(*pr->pru_usrreq)() calls for consistency, instead of (struct proc * )0.

Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req-> r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.

Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
 1.31 27-Apr-2004  jrf First pass for some caddr_t removal and changes to get rid of it where we
no longer use and/or need it

- removed casts from unionfs, deadfs and fdesc
(there are more to hunt down still)
- changed vfs_quotactl args argument from caddr_t to void *
- changed vfs_quotactl structures/callers to reflect the api change

Compiled fine and ran for about a day. Approved/reviewed by
christos@netbsd.org and gimpy@netbsd.org.
 1.30 21-Apr-2004  christos Replace the statfs() family of system calls with statvfs().
Retain binary compatibility.
 1.29 03-Oct-2003  yamt branches: 1.29.4;
terminate snprintb 'new' format strings correctly.
(fixes overrun in mount_*)
 1.28 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.27 29-Jun-2003  fvdl branches: 1.27.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.26 29-Jun-2003  thorpej Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.25 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.24 03-May-2003  yamt better handling of write verifier change.
 1.23 09-Apr-2003  yamt rename nm_verf to nm_writeverf because it's confusing with nm_verf{str,len}.
 1.22 21-Sep-2002  christos MNT_GETARGS support
 1.21 15-Sep-2001  chs add a new VFS op, vfs_reinit, which is called when desiredvnodes is
adjusted via sysctl. file systems that have hash tables which are
sized based on the value of this variable now resize those hash tables
using the new value. the max number of FFS softdeps is also recalculated.

convert various file systems to use the <sys/queue.h> macros for
their hash tables.
 1.20 12-Feb-2001  fvdl branches: 1.20.2; 1.20.4; 1.20.6;
Instead of storing the filehandle in the mount structure, store the
vnode pointer. This avoids a locking problem with nfs_nget, and
can be done because we always have a reference on the root vnode
of the filesystem.
 1.19 16-Mar-2000  jdolecek Add new VFS op routine - vfs_done and call it on filesystem detach
in vfs_detach(). vfs_done may free global filesystem's resources,
typically those allocated in respective filesystem's init function.
Needed so those filesystems which went in via LKM have a chance to
clean after themselves before unloading. This fixes random panics
when LKM for filesystem using pools was loaded and unloaded several
times.

For each leaf filesystem, add appropriate vfs_done routine.
 1.18 04-Jul-1999  sommerfeld branches: 1.18.2;
kern/5591: Fix race in the NFS socket code during umount -f and system
shutdown:

During an unmount, wake up all the processes which are waiting to lock
the socket for receive, and wait for them (and the process blocked in
soreceive, if any) to go away before blowing away the socket and the
mount structure.
 1.17 26-Feb-1999  wrstuden branches: 1.17.2; 1.17.4;
Modify vfsops to separate vfs_fhtovp() into two routines. vfs_fhtovp() now
only handles the file handle to vnode conversion, and a new call,
vfs_checkexp(), performs the export verification.
 1.16 01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.15 10-Oct-1997  fvdl * New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.14 17-Jul-1997  fvdl branches: 1.14.2;
* Deal with servers that don't give complete FSINFO (like NT)
From Olaf Seibert <rhialto@polder.ubc.kun.nl> (PR 3687)
* Make an attempt to check the maximum filesize before attempting
a write to the server, as write RPCs will typically happen
asynchronously, and the process will not see the error.
Fixes problems with unexpectedly truncated files at 4G
* Pass up errors in nfs_writerpc correctly
 1.13 22-Dec-1996  cgd Change the second and third args to struct vfsops' (*vfs_mount)() to
'const char *', and 'void *', respectively. The second arg is taken directly
from user arguments, and is const there, so must be const in the prototypes
and functions. The third arg is also taken directly from user arguments.
It doesn't have to be changed, but since it's cleaner to keep the type
the same as the user arg's type, and I'm already making the 'const char *'
change...
 1.12 03-Dec-1996  thorpej Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.
 1.11 02-Dec-1996  thorpej NFS performance improvement from Doug Rabson/FreeBSD:

Improve the queuing algorithms used by NFS' asynchronous i/o. The
existing mechanism uses a global queue for some buffers and the
vp->b_dirtyblkhd queue for others. This turns sequential writes into
randomly ordered writes to the server, affecting both read and write
performance. The existing mechanism also copes badly with hung
servers, tending to block accesses to other servers when all the iods
are waiting for a hung server.

The new mechanism uses a queue for each mount point. All asynchronous
i/o goes through this queue which preserves the ordering of requests.
A simple mechanism ensures that the iods are shared out fairly between
active mount points.

Reviewed/integrated/approved by Frank van der Linden <fvdl@netbsd.org>
 1.10 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.9 09-Feb-1996  christos nfs prototype changes
 1.8 26-Mar-1995  jtc KERNEL -> _KERNEL
 1.7 13-Dec-1994  mycroft Sync with CSRG.
 1.6 18-Aug-1994  mycroft More LIST/CIRCLEQ migration.
 1.5 29-Jun-1994  cgd branches: 1.5.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.4 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.3 27-Mar-1994  cgd expand uid_t/gid_t/off_t
 1.2 20-May-1993  cgd branches: 1.2.4;
more rcs id adding and header cleanup. i like vi macros!
 1.1 20-Apr-1993  mycroft branches: 1.1.1;
Restore files lost during crash.
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.2.4.1 24-Sep-1993  mycroft Make all files using spl*() #include cpu.h. Changes from trunk.
nfs_vfsops.c, nfsmount.h: Make nfs_quotactl() take an int rather than a uid_t,
as it might be -1.
nfs_vnops.c: va_size and va_bytes are now quads.
 1.5.2.1 19-Aug-1994  mycroft update from trunk
 1.14.2.1 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.17.4.1 02-Aug-1999  thorpej Update from trunk.
 1.17.2.1 05-Nov-1999  cgd pull up rev 1.18 from trunk (requested by fvdl):
Avoid a panic when forcibly unmounting a hung NFS mount, e.g. at
reboot.
 1.18.2.2 12-Mar-2001  bouyer Sync with HEAD.
 1.18.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.20.6.1 01-Oct-2001  fvdl Catch up with -current.
 1.20.4.2 10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototill work
 1.20.4.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.20.2.2 06-Oct-2002  thorpej Sync with HEAD.
 1.20.2.1 21-Sep-2001  nathanw Catch up to -current.
 1.27.2.9 11-Dec-2005  christos Sync with head.
 1.27.2.8 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.27.2.7 24-Jan-2005  skrll Sync with HEAD.
 1.27.2.6 30-Oct-2004  skrll s/p/l/ for the struct lwp * arg.
 1.27.2.5 21-Sep-2004  skrll Fix the sync with head I botched.
 1.27.2.4 18-Sep-2004  skrll Sync with HEAD.
 1.27.2.3 24-Aug-2004  skrll Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.27.2.2 03-Aug-2004  skrll Sync with HEAD
 1.27.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.29.4.1 27-Oct-2005  riz Pull up following revision(s) (requested by christos in ticket #5863):
sys/nfs/nfs_subs.c: revision 1.152 via patch
sys/nfs/nfs.h: revision 1.49
sys/nfs/nfs_vfsops.c: revision 1.149 via patch
usr.sbin/amd/include/config.h: revision 1.36
sys/nfs/nfs_vnops.c: revision 1.227 via patch
sys/nfs/nfsmount.h: revision 1.34
Allow the attribute cache to be turned off, and allow amd to do it.
 1.32.4.1 29-Apr-2005  kent sync with -current
 1.33.8.4 03-Sep-2007  yamt sync with head.
 1.33.8.3 26-Feb-2007  yamt sync with head.
 1.33.8.2 30-Dec-2006  yamt sync with head.
 1.33.8.1 21-Jun-2006  yamt sync with head.
 1.33.6.1 26-Sep-2005  tron Pull up following revision(s) (requested by christos in ticket #816):
sys/nfs/nfs.h: revision 1.49
sys/nfs/nfsmount.h: revision 1.34
Allow turning off the attribute cache.
 1.35.6.1 29-Nov-2005  yamt sync with head.
 1.37.12.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.37.10.2 19-Apr-2006  elad sync with head.
 1.37.10.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.37.8.2 11-Aug-2006  yamt sync with head
 1.37.8.1 24-May-2006  yamt sync with head.
 1.37.6.2 01-Jun-2006  kardel Sync with head.
 1.37.6.1 22-Apr-2006  simonb Sync with head.
 1.37.4.1 09-Sep-2006  rpaulo sync with head
 1.39.4.1 13-Jul-2006  gdamore Merge from HEAD.
 1.40.4.1 12-Jan-2007  ad Sync with head.
 1.42.8.1 11-Jul-2007  mjf Sync with head.
 1.42.6.4 20-Aug-2007  ad Sync with HEAD.
 1.42.6.3 15-Jul-2007  ad Sync with head.
 1.42.6.2 08-Jun-2007  ad Sync with head.
 1.42.6.1 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.42.2.1 07-May-2007  yamt sync with head.
 1.45.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.46.36.2 31-Jul-2007  pooka * nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
use VFS_PROTOS() instead of manually prototyping the methods
 1.46.36.1 31-Jul-2007  pooka file nfsmount.h was added on branch matt-mips64 on 2007-07-31 21:14:20 +0000
 1.46.34.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.46.28.4 09-Oct-2010  yamt sync with head
 1.46.28.3 16-Jul-2009  yamt remove sndlock. it's superseded by nm_solock.
suggested by Andrew Doran.
 1.46.28.2 04-May-2009  yamt sync with head.
 1.46.28.1 27-Apr-2008  yamt commit some work-in-progress changes to make nfs client mp-safe to a branch,
so that they won't get lost.
- sprinkle some locking
- mark the filesystem, nfstimer callout, and kq kthread mp-safe
- add assertions and comments
- disable upgrade mount for now
- some unrelated cosmetic changes
 1.46.24.1 17-Jan-2009  mjf Sync with HEAD.
 1.48.8.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.48.2.1 28-Apr-2009  skrll Sync with HEAD.
 1.49.4.1 05-Mar-2011  rmind sync with head
 1.49.2.1 22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.50.4.1 08-Feb-2011  bouyer Sync with HEAD
 1.50.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.51.28.1 10-Aug-2014  tls Rebase.
 1.51.14.2 03-Dec-2017  jdolecek update from HEAD
 1.51.14.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.52.4.1 22-Sep-2015  skrll Sync with HEAD
 1.52.2.1 04-Nov-2015  riz Pull up following revision(s) (requested by manu in ticket #882):
sbin/umount/umount.c: revision 1.48
sys/nfs/nfsmount.h: revision 1.53
sys/nfs/nfs_var.h: revision 1.94
sys/nfs/nfs_iod.c: revision 1.7
sys/nfs/nfs_socket.c: revision 1.197
sys/nfs/nfs_bio.c: revision 1.191
sys/nfs/nfs_vfsops.c: revision 1.230
sys/nfs/nfs_clntsocket.c: revision 1.3
Remove useless and harmful sync(2) call in umount(8)
Remove sync(2) call before unmount(2) in umount(8). This sync(2) is useless
since unmount(2) will perform a VFS_SYNC anyway.
Moreover, this sync(2) may be harmful, as there are some situations where
it cannot return (unreachable NFS server, for instance), causing umount -f
to be ineffective.
Fix soft NFS force unmount
For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that await an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.
Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.
Reviewed by Chuck Silvers.
 1.53.54.1 02-Aug-2025  perseant Sync with HEAD
 1.77 07-Dec-2024  riastradh sys/nfs/nfs: Add some missing includes and include guards.

Fix up some minor KNF issues while here.

No functional change intended (except to enable things to build that
might not have built before because of previously required #include
ordering).
 1.76 21-Oct-2021  andvar branches: 1.76.10;
fix various typos, mainly in comments, but also in man pages and log messages.
 1.75 18-Jul-2021  dholland Abolish all the silly indirection macros for initializing vnode ops tables.

These are things of the form #define foofs_op genfs_op, or #define
foofs_op genfs_eopnotsupp, or similar. They serve no purpose besides
obfuscation, and have gotten cutpasted all over everywhere.
 1.74 27-May-2021  simonb Remove nfs_putpages() prototype; it's not defined anywhere.
 1.73 30-May-2014  hannken branches: 1.73.44; 1.73.46;
Change NFS from rbtree to vcache.
 1.72 25-Sep-2010  matt branches: 1.72.18; 1.72.32;
Rename rb.h to rbtree.h, as it is more appropriate (c.f. ptree.h). Also
helps find code that hasn't been updated to use the new rbtree API.
 1.71 14-Mar-2009  dsl branches: 1.71.2; 1.71.4;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.70 02-Jan-2009  christos branches: 1.70.2;
protect sillyrename with _KERNEL
 1.69 02-Jan-2009  ad - Don't vput() a vnode that we do not hold locked.
- Eliminate one of the few remaining uses of LK_CANRECURSE.
 1.68 22-Oct-2008  matt branches: 1.68.2; 1.68.4;
Change NFS to use a RB-tree for its FH->nfsnode lookups.
 1.67 25-Jan-2008  ad branches: 1.67.6; 1.67.10; 1.67.16;
Remove VOP_LEASE. Discussed on tech-kern.
 1.66 10-Aug-2007  yamt branches: 1.66.2; 1.66.8;
- instead of scanning an array of iods, maintain a list of idle iods.
- make nfs_getset_niothreads MP friendly.
 1.65 08-Aug-2007  yamt push kernel_lock a little.
 1.64 20-Jul-2007  yamt branches: 1.64.4; 1.64.6;
- fix decreasing of vfs.nfs.iothreads after the recent partial merge
of vmlocking.
- don't make nfsiod exit with requests left.
- make NFSSVC_BIOD a dummy so that nfsiod can be simplified.
 1.63 29-Apr-2007  yamt branches: 1.63.2;
include condvar.h. pointed out by Kurt Schreiner.
 1.62 29-Apr-2007  yamt use mutex and condvar.
 1.61 15-Feb-2007  yamt branches: 1.61.2; 1.61.6; 1.61.8;
use mutex and rwlock rather than lockmgr.
 1.60 28-Dec-2006  yamt remove several nqnfs definitions.
 1.59 27-Dec-2006  yamt remove nqnfs.
 1.58 17-Oct-2006  christos another variable should have been _KERNEL only.
 1.57 17-Oct-2006  christos don't expose kernel variables to userland.
 1.56 14-May-2006  elad branches: 1.56.8; 1.56.10;
integrate kauth.
 1.55 11-Dec-2005  christos branches: 1.55.4; 1.55.6; 1.55.8; 1.55.10; 1.55.12;
merge ktrace-lwp.
 1.54 26-Jan-2005  yamt branches: 1.54.6;
handle a really empty directory, which doesn't have even the dot entry.
 1.53 09-Jan-2005  yamt branches: 1.53.2; 1.53.4;
invalidate cache if filesize is changed besides our activity
because it means that we're out of sync with the server.
 1.52 08-Jan-2005  yamt nfs_lookup: check n_nctime for positive entries as well to improve
cache consistency.
 1.51 14-Dec-2004  yamt redirect some VOPs which shouldn't be used for nfs
to genfs_badop (ie. panic).
 1.50 26-Oct-2004  yamt since daddr_t is 64-bit these days, simply use nfs directory cookies
as buffer cache indexes. regress/sys/fs/getdents is now supposed to work.
fix PR/27112.
 1.49 15-Sep-2004  yamt fix access-after-free bugs in dircache code by refcounting nfsdircache.
PR/26864.
 1.48 24-Aug-2004  yamt nfs_request: a workaround for servers doing "maproot".
for i/o requests which are expected not to fail due to permission
to mimic unix file open semantics (READ, WRITE, COMMIT),
try two credentials. namely, the file owner's one and open time one.
remember which credential worked in per-file basis and try it first
next time to minimize number of retries.
ideas from Chuck Silvers. PR/23716 and PR/24987.
 1.47 27-May-2004  yamt remove an unused instance of VOP_UPDATE.
 1.46 12-Mar-2004  yamt branches: 1.46.2;
shrink sizeof struct nfsnode by putting exclusive members into union.
 1.45 12-Mar-2004  yamt introduce a macro NFS_INVALIDATE_ATTRCACHE and use it
instead of "n_attrstamp = 0".
 1.44 26-Sep-2003  yamt change n_mtime from time_t to timespec in order to improve
cache consistency.
(1 second granularity is too loose these days.)
 1.43 17-Sep-2003  yamt change nctime to timespec from time_t.
there can be too many activities in a second.
 1.42 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.41 30-Jul-2003  yamt vrecycle removed nfs vnodes.
not perfect, but enough for most cases.
 1.40 07-May-2003  yamt branches: 1.40.2;
simple lock for nfs iod.
 1.39 09-Apr-2003  yamt update a comment to follow the previous change.
 1.38 09-Apr-2003  yamt make per-iod datas together.
 1.37 01-Dec-2002  matt Make sure these all agree on the same definitions of various variables.
 1.36 23-Oct-2002  jdolecek merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
 1.35 21-Oct-2002  yamt fix a page locking deadlock problem for nfs.

add a flag that specifies whether the file can be truncated safely or not
to nfsm_loadattr and friends. when it isn't safe, just mark the nfsnode
as "should be truncated later".

ok'ed by Frank van der Linden and Chuck Silvers.
close kern/18036.
 1.34 15-Sep-2001  chs a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
 1.33 28-May-2001  chs branches: 1.33.2; 1.33.4;
add a genfs_mmap() and change all of the disk-based filesystems
to implement VOP_MMAP() with the genfs version, in preparation for
actually using this VOP.
 1.32 06-Feb-2001  fvdl branches: 1.32.2;
Do actual vnode locking for NFS.
 1.31 27-Nov-2000  chs Initial integration of the Unified Buffer Cache project.
 1.30 19-Sep-2000  fvdl Add fields to deal with commit ranges.
 1.29 30-Mar-2000  simonb branches: 1.29.4;
Delete redundant decl of nfs_vget() - it's in <nfs/nfsmount.h>.
 1.28 29-Nov-1999  fvdl Insert an extra VOP_ACCESS check in nfs_lookup, to avoid cached access
mishaps for lookup and getattr. Closes PR 8884.

While at it, cache access RPCs.
 1.27 10-Aug-1998  matthias branches: 1.27.2; 1.27.6; 1.27.8; 1.27.12; 1.27.18;
create miscfs/genfs/genfs_vnops.c:genfs_enoioctl and make all the other
filesystems use it instead of a private version.
 1.26 25-Jun-1998  thorpej - Rename nqnfs_vop_lease_check() to genfs_lease_check(). If NFSSERVER is
not in the kernel, genfs_lease_check() is simply a no-op. This allows
LKM'd file systems to be exported (previously did not work properly
due to a compile-time decision based on -DNFSSERVER).
- defopt NFSSERVER
 1.25 01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.24 19-Oct-1997  fvdl * Implement optional 32 <-> 64 bit directory cookie translation. This uses
the directory cache as translation table. See nfs_subs.c for comments.
Makes the code a bit more complex to look at than I would have liked,
but doesn't affect the speed of the default behavior.
* Optimize caching behavior a bit when buffers are invalidated.
* Save some RPCs in readdir operations by not bothering if there is
a small amount left to do to fill the buffer. It'll be done in the
next RPC with a larger chunk anyway. Wastes a bit of buffer space
but is faster.
* Make n_vattr an allocated vattr struct. This avoids nfsnode bloat,
and is friendlier to the malloc routines.
 1.23 16-Oct-1997  christos Fix the location of the NFS_SMALLFH
 1.22 12-Oct-1997  fvdl Do negative lookup caching. Use a timestamp of the oldest negative cache
entry, so it can be checked against directory modification time for
validity.
 1.21 10-Oct-1997  fvdl * New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.20 11-Apr-1997  kleink branches: 1.20.4;
Implement a POSIX compliant genfs VOP_SEEK() and use it in the appropriate
places; by Chris G. Demetriou and myself.
 1.19 02-Dec-1996  thorpej NFS performance improvement from Doug Rabson/FreeBSD:

Improve the queuing algorithms used by NFS' asynchronous i/o. The
existing mechanism uses a global queue for some buffers and the
vp->b_dirtyblkhd queue for others. This turns sequential writes into
randomly ordered writes to the server, affecting both read and write
performance. The existing mechanism also copes badly with hung
servers, tending to block accesses to other servers when all the iods
are waiting for a hung server.

The new mechanism uses a queue for each mount point. All asynchronous
i/o goes through this queue which preserves the ordering of requests.
A simple mechanism ensures that the iods are shared out fairly between
active mount points.

Reviewed/integrated/approved by Frank van der Linden <fvdl@netbsd.org>
 1.18 07-Sep-1996  mycroft Implement poll(2).
 1.17 01-Sep-1996  mycroft Add a set of generic file system operations that most file systems use.
Also, fix some time stamp bogosities.
 1.16 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.15 09-Feb-1996  christos nfs prototype changes
 1.14 26-Mar-1995  jtc KERNEL -> _KERNEL
 1.13 13-Dec-1994  mycroft Sync with CSRG.
 1.12 18-Aug-1994  mycroft More LIST/CIRCLEQ migration.
 1.11 29-Jun-1994  cgd branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.10 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.9 25-Apr-1994  cgd some prototype cleanup, eliminate/replace bogus types (e.g. quad and
u_quad) -> use better types (e.g. quad_t & u_quad_t in inodes),
some cleanup.
 1.8 21-Apr-1994  cgd blow away all vestiges of nfsnode locking.
(1) it's unnecessary
(2) it causes machines to hang (yup!)
(3) it'd be gone in a few days anyway (it'd been yanked out
of 4.4-Lite by macklem long ago)
It was only there because macklem couldn't originally decide if things
should be locked, or not...
 1.7 15-Feb-1994  pk Update {a,m}time vnode attributes on special files a la ufs_vnode.c,
but make it a non-urgent operation, to leave us some performance.
 1.6 22-Dec-1993  cgd change return type of nfs_print back to int
 1.5 07-Sep-1993  ws branches: 1.5.2;
Changes to VFS readdir semantics
NFS changes for better cookie support
ISOFS changes for better Rockridge support and support for generation numbers
 1.4 02-Aug-1993  mycroft Make return type of nfs_print be a void, not an int.
 1.3 22-May-1993  cgd add Yuval Yarom's changes (originally for BSD/386) for advisory record
locking on NFS files. Note that this DOES NOT support network locking,
only local advisory locks.
 1.2 20-May-1993  cgd more rcs id adding and header cleanup. i like vi macros!
 1.1 20-Apr-1993  mycroft branches: 1.1.1;
Restore files lost during crash.
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.5.2.2 19-Dec-1993  pk Undo misguided attempt to use quad_t type for n_size field.
 1.5.2.1 16-Dec-1993  pk Use u_quad for file size and dir offset a la ufs_inode.h.
Must re-address this later as straight u_quad_t doesn't work somehow (make
a Sparc crash with an "alignment fault").
 1.11.2.1 19-Aug-1994  mycroft update from trunk
 1.20.4.1 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.27.18.1 27-Dec-1999  wrstuden Pull up to last week's -current.
 1.27.12.3 11-Feb-2001  bouyer Sync with HEAD.
 1.27.12.2 08-Dec-2000  bouyer Sync with HEAD.
 1.27.12.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.27.8.2 04-Jul-1999  chs support VOP_BALLOC().
 1.27.8.1 07-Jun-1999  chs merge everything from chs-ubc branch.
 1.27.6.1 05-Jan-2000  he Pull up revision 1.28 (requested by fvdl):
Insert an extra VOP_ACCESS check in nfs_lookup, preventing cached
access mishaps for lookup and getattr. Fixes PR#8884.
 1.27.2.1 09-Nov-1998  chs initial snapshot. lots left to do.
 1.29.4.1 14-Dec-2000  he Pull up revision 1.30 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.32.2.5 11-Dec-2002  thorpej Sync with HEAD.
 1.32.2.4 11-Nov-2002  nathanw Catch up to -current
 1.32.2.3 22-Oct-2002  thorpej Sync with HEAD.
 1.32.2.2 21-Sep-2001  nathanw Catch up to -current.
 1.32.2.1 21-Jun-2001  nathanw Catch up to -current.
 1.33.4.1 01-Oct-2001  fvdl Catch up with -current.
 1.33.2.2 30-Sep-2002  jdolecek add support for kevents to NFS
to detect file changes on server by other NFS clients, polling kernel thread
is used to periodically check for attribute changes of watched files;
the NFS server is only contacted when the vnode expires from local attrcache
(which takes 5-60 seconds currently), to keep network&CPU overhead low

the routine checking for remote changes is quite simplistic, but hopefully
doing its job well enough
 1.33.2.1 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.40.2.8 04-Feb-2005  skrll Sync with HEAD.
 1.40.2.7 17-Jan-2005  skrll Sync with HEAD.
 1.40.2.6 18-Dec-2004  skrll Sync with HEAD.
 1.40.2.5 02-Nov-2004  skrll Sync with HEAD.
 1.40.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.40.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.40.2.2 03-Sep-2004  skrll Sync with HEAD
 1.40.2.1 03-Aug-2004  skrll Sync with HEAD
 1.46.2.4 11-Jan-2005  jmc Pullup rev 1.53 (requested by yamt in ticket #1079)

Invalidate cache if filesize is changed besides our activity
because it means that we're out of sync with the server.
 1.46.2.3 11-Jan-2005  jmc Pullup patch (requested by yamt in ticket #1077)

nfs_lookup: check n_nctime for positive entries as well to improve
cache consistency.
 1.46.2.2 18-Sep-2004  he branches: 1.46.2.2.2;
Pull up revision 1.49 (requested by yamt in ticket #858):
Fix access-after-free bugs in dircache code by reference
counting nfsdircache. Fixes PR#26864.
 1.46.2.1 30-Aug-2004  tron Pull up revision 1.48 (requested by yamt in ticket #803):
nfs_request: a workaround for servers doing "maproot".
for i/o requests which are expected not to fail due to permission
to mimic unix file open semantics (READ, WRITE, COMMIT),
try two credentials. namely, the file owner's one and open time one.
remember which credential worked in per-file basis and try it first
next time to minimize number of retries.
ideas from Chuck Silvers. PR/23716 and PR/24987.
 1.46.2.2.2.3 30-Jan-2005  he Pull up revision 1.50 (requested by yamt in ticket #968):
Since daddr_t is 64-bit these days, simply use nfs directory
cookies as buffer cache indexes. This should make the
regress/sys/fs/getdents test work. Fixes PR#27112.
 1.46.2.2.2.2 11-Jan-2005  jmc Pullup rev 1.53 (requested by yamt in ticket #1079)

Invalidate cache if filesize is changed besides our activity
because it means that we're out of sync with the server.
 1.46.2.2.2.1 11-Jan-2005  jmc Pullup patch (requested by yamt in ticket #1077)

nfs_lookup: check n_nctime for positive entries as well to improve
cache consistency.
 1.53.4.1 12-Feb-2005  yamt sync with head.
 1.53.2.1 29-Apr-2005  kent sync with -current
 1.54.6.5 04-Feb-2008  yamt sync with head.
 1.54.6.4 03-Sep-2007  yamt sync with head.
 1.54.6.3 26-Feb-2007  yamt sync with head.
 1.54.6.2 30-Dec-2006  yamt sync with head.
 1.54.6.1 21-Jun-2006  yamt sync with head.
 1.55.12.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.55.10.2 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.55.10.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.55.8.1 24-May-2006  yamt sync with head.
 1.55.6.1 01-Jun-2006  kardel Sync with head.
 1.55.4.1 09-Sep-2006  rpaulo sync with head
 1.56.10.1 22-Oct-2006  yamt sync with head
 1.56.8.2 12-Jan-2007  ad Sync with head.
 1.56.8.1 18-Nov-2006  ad Sync with head.
 1.61.8.1 11-Jul-2007  mjf Sync with head.
 1.61.6.3 20-Aug-2007  ad Sync with HEAD.
 1.61.6.2 08-Jun-2007  ad Sync with head.
 1.61.6.1 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.61.2.1 07-May-2007  yamt sync with head.
 1.63.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.64.6.2 20-Jul-2007  yamt - fix decreasing of vfs.nfs.iothreads after the recent partial merge
of vmlocking.
- don't make nfsiod exit with requests left.
- make NFSSVC_BIOD a dummy so that nfsiod can be simplified.
 1.64.6.1 20-Jul-2007  yamt file nfsnode.h was added on branch matt-mips64 on 2007-07-20 15:36:43 +0000
 1.64.4.2 16-Aug-2007  jmcneill Sync with HEAD.
 1.64.4.1 09-Aug-2007  jmcneill Sync with HEAD.
 1.66.8.1 18-Feb-2008  mjf Sync with HEAD.
 1.66.2.1 23-Mar-2008  matt sync with HEAD
 1.67.16.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.67.10.5 10-Oct-2010  yamt some locking changes
 1.67.10.4 09-Oct-2010  yamt sync with head
 1.67.10.3 26-Sep-2010  yamt locking changes
 1.67.10.2 04-May-2009  yamt sync with head.
 1.67.10.1 27-Apr-2008  yamt commit some work-in-progress changes to make nfs client mp-safe to a branch,
so that they won't get lost.
- sprinkle some locking
- mark the filesystem, nfstimer callout, and kq kthread mp-safe
- add assertions and comments
- disable upgrade mount for now
- some unrelated cosmetic changes
 1.67.6.1 17-Jan-2009  mjf Sync with HEAD.
 1.68.4.2 02-Feb-2009  snj Pull up following revision(s) (requested by ad in ticket #344):
sys/nfs/nfsnode.h: revision 1.70
protect sillyrename with _KERNEL
 1.68.4.1 02-Feb-2009  snj Pull up following revision(s) (requested by ad in ticket #344):
sys/nfs/nfs_node.c: revision 1.108
sys/nfs/nfsnode.h: revision 1.69
- Don't vput() a vnode that we do not hold locked.
- Eliminate one of the few remaining uses of LK_CANRECURSE.
 1.68.2.2 28-Apr-2009  skrll Sync with HEAD.
 1.68.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.70.2.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.71.4.1 05-Mar-2011  rmind sync with head
 1.71.2.1 22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.72.32.1 10-Aug-2014  tls Rebase.
 1.72.18.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.73.46.1 31-May-2021  cjep sync with head
 1.73.44.2 01-Aug-2021  thorpej Sync with HEAD.
 1.73.44.1 17-Jun-2021  thorpej Sync w/ HEAD.
 1.76.10.1 02-Aug-2025  perseant Sync with HEAD
 1.18 07-Dec-2024  riastradh sys/nfs/nfs: Add some missing includes and include guards.

Fix up some minor KNF issues while here.

No functional change intended (except to enable things to build that
might not have built before because of previously required #include
ordering).
 1.17 27-Dec-2006  yamt branches: 1.17.170;
remove nqnfs.
 1.16 02-Sep-2006  yamt branches: 1.16.2;
nfsd: deal with variable-sized filehandles.
 1.15 31-Jul-2006  martin Make filehandles opaque to userland
 1.14 13-Jul-2006  martin Fix alignment problems for fhandle_t, exposed by gcc4.1.

While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ,
version the getfh(2) syscall and explicitly pass the size available in
the filehandle from userland.

Discussed on tech-kern, with lots of help from yamt (thanks!).
 1.13 14-Mar-2006  yamt branches: 1.13.6;
bump NFS_MAXDGRAMDATA from 32k to 60k. (ie. near the protocol limit of udp.)
- it can help performance for some environments.
- administrators should be free to do silly things. :-)
 1.12 11-Dec-2005  christos branches: 1.12.4; 1.12.6; 1.12.8; 1.12.10;
merge ktrace-lwp.
 1.11 25-Sep-2005  christos Add missing TIMEDOUT and IO errors.
 1.10 07-Aug-2003  agc branches: 1.10.14; 1.10.16;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.9 19-Sep-2000  fvdl branches: 1.9.24;
Adapt some defaults/max values to be more realistic.
 1.8 06-Aug-1998  kleink branches: 1.8.12; 1.8.18; 1.8.22;
Like for NFSv2, add a pointer to the NFSv3 RFC, too.
 1.7 19-Oct-1997  fvdl * Implement optional 32 <-> 64 bit directory cookie translation. This uses
the directory cache as translation table. See nfs_subs.c for comments.
Makes the code a bit more complex to look at than I would have liked,
but doesn't affect the speed of the default behavior.
* Optimize caching behavior a bit when buffers are invalidated.
* Save some RPCs in readdir operations by not bothering if there is
a small amount left to do to fill the buffer. It'll be done in the
next RPC with a larger chunk anyway. Wastes a bit of buffer space
but is faster.
* Make n_vattr an allocated vattr struct. This avoids nfsnode bloat,
and is friendlier to the malloc routines.
 1.6 17-Oct-1997  fvdl NFS_SMALLFH should be a multiple of 4; since the nfsnode struct may grow
one more member, just make it the minimum (32) now.
 1.5 17-Oct-1997  christos nfstov_mode converts 32 bits now.
change NFS_SMALLFH from 44 to 38 to accommodate the mode_t and nlink_t changes.
 1.4 12-Oct-1997  fvdl Do negative lookup caching. Use a timestamp of the oldest negative cache
entry, so it can be checked against directory modification time for
validity.
 1.3 10-Oct-1997  fvdl * New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.2 08-May-1997  mycroft branches: 1.2.4;
Pass the vnode type to vaccess(), and use it when checking VEXEC. Make sure
that the mode bits passed to vaccess() and returned by foo_getattr() contain
only permission bits.
 1.1 18-Feb-1996  fvdl branches: 1.1.1;
Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.2.4.1 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.8.22.1 14-Dec-2000  he Pull up revision 1.9 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.8.18.1 21-Dec-1999  wrstuden Initial commit of recent changes to make DEV_BSIZE go away.

Runs on i386, needs work on other arch's. Main kernel routines should be
fine, but a number of the stand programs need help.

cd, fd, ccd, wd, and sd have been updated. sd has been tested with non-512
byte block devices. vnd, raidframe, and lfs need work.

Non 2**n block support is automatic for LKM's and conditional for kernels
on "options NON_PO2_BLOCKS".
 1.8.12.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.9.24.4 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.9.24.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.9.24.2 18-Sep-2004  skrll Sync with HEAD.
 1.9.24.1 03-Aug-2004  skrll Sync with HEAD
 1.10.16.2 30-Dec-2006  yamt sync with head.
 1.10.16.1 21-Jun-2006  yamt sync with head.
 1.10.14.1 16-Dec-2005  tron Pull up following revision(s) (requested by christos in ticket #1055):
sys/nfs/nfsproto.h: revision 1.11
Add missing TIMEDOUT and IO errors.
 1.12.10.1 19-Apr-2006  elad sync with head.
 1.12.8.3 03-Sep-2006  yamt sync with head.
 1.12.8.2 11-Aug-2006  yamt sync with head
 1.12.8.1 01-Apr-2006  yamt sync with head.
 1.12.6.1 22-Apr-2006  simonb Sync with head.
 1.12.4.1 09-Sep-2006  rpaulo sync with head
 1.13.6.1 13-Jul-2006  gdamore Merge from HEAD.
 1.16.2.1 12-Jan-2007  ad Sync with head.
 1.17.170.1 02-Aug-2025  perseant Sync with HEAD
 1.11 07-Dec-2024  riastradh sys/nfs/nfs: Add some missing includes and include guards.

Fix up some minor KNF issues while here.

No functional change intended (except to enable things to build that
might not have built before because of previously required #include
ordering).
 1.10 05-Sep-2014  matt branches: 1.10.56;
Don't nest structure definitions.
 1.9 28-Dec-2006  yamt branches: 1.9.90;
remove several nqnfs definitions.
 1.8 11-Dec-2005  christos branches: 1.8.20;
merge ktrace-lwp.
 1.7 07-Aug-2003  agc branches: 1.7.16;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.6 12-May-2002  matt branches: 1.6.10;
Eliminate commons
 1.5 12-May-1997  fvdl branches: 1.5.34; 1.5.36;
Store RPC procnum consistently as an u_int32_t. This is as it should be,
and avoid possible server crashes due to bogus comparisons. Partly
from BSDI.
 1.4 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.3 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.2 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.1 08-Jun-1994  mycroft branches: 1.1.1;
Update to 4.4-Lite fs code, with local changes.
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.5.36.1 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.5.34.1 20-Jun-2002  nathanw Catch up to -current.
 1.6.10.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.6.10.2 18-Sep-2004  skrll Sync with HEAD.
 1.6.10.1 03-Aug-2004  skrll Sync with HEAD
 1.7.16.1 30-Dec-2006  yamt sync with head.
 1.8.20.1 12-Jan-2007  ad Sync with head.
 1.9.90.1 03-Dec-2017  jdolecek update from HEAD
 1.10.56.1 02-Aug-2025  perseant Sync with HEAD
 1.17 07-Dec-2024  riastradh sys/nfs/nfs: Add some missing includes and include guards.

Fix up some minor KNF issues while here.

No functional change intended (except to enable things to build that
might not have built before because of previously required #include
ordering).
 1.16 04-Dec-2007  yamt branches: 1.16.140;
merge non-intrusive nfs changes from vmlocking.
 1.15 01-Jun-2007  yamt branches: 1.15.6; 1.15.8; 1.15.14; 1.15.16;
use mutex and condvar.
 1.14 28-Dec-2006  yamt branches: 1.14.6; 1.14.8;
remove several nqnfs definitions.
 1.13 11-Dec-2005  christos branches: 1.13.20;
merge ktrace-lwp.
 1.12 07-Aug-2003  agc branches: 1.12.16;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.11 12-May-1997  fvdl branches: 1.11.56;
Store RPC procnum consistently as an u_int32_t. This is as it should be,
and avoid possible server crashes due to bogus comparisons. Partly
from BSDI.
 1.10 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.9 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.8 13-Dec-1994  mycroft Sync with CSRG.
 1.7 18-Aug-1994  mycroft More LIST/CIRCLEQ migration.
 1.6 17-Aug-1994  mycroft Use LIST and TAILQ for hash chain and LRU chain, respectively.
 1.5 29-Jun-1994  cgd branches: 1.5.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.4 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.3 20-May-1993  cgd more rcs id adding and header cleanup. i like vi macros!
 1.2 20-Apr-1993  mycroft Add consistent multiple-inclusion protection (repeat).
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.3 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.5.2.1 19-Aug-1994  mycroft update from trunk
 1.11.56.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.11.56.2 18-Sep-2004  skrll Sync with HEAD.
 1.11.56.1 03-Aug-2004  skrll Sync with HEAD
 1.12.16.3 07-Dec-2007  yamt sync with head
 1.12.16.2 03-Sep-2007  yamt sync with head.
 1.12.16.1 30-Dec-2006  yamt sync with head.
 1.13.20.1 12-Jan-2007  ad Sync with head.
 1.14.8.1 11-Jul-2007  mjf Sync with head.
 1.14.6.3 27-Aug-2007  yamt - fix/add assertions.
- fix numnfsrvcache.
 1.14.6.2 26-Aug-2007  yamt - mark nfssvc(2) MPSAFE and move the most of nfsd out of the kernel lock.
- remove unused ns_solock.
- remove some of KERNEL_LOCK/UNLOCK which are not necessary on this branch.
 1.14.6.1 09-Jun-2007  ad Sync with head.
 1.15.16.2 08-Dec-2007  ad Sync with head.
 1.15.16.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.15.14.1 08-Dec-2007  mjf Sync with HEAD.
 1.15.8.1 09-Jan-2008  matt sync with HEAD
 1.15.6.1 09-Dec-2007  jmcneill Sync with HEAD.
 1.16.140.1 02-Aug-2025  perseant Sync with HEAD
 1.11 18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.10 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.9 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.8 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.7 10-Apr-1994  cgd patchkit date deletions!
 1.6 10-Mar-1994  ws Oops. Bug fix for nfs server. Reported by Theo.
 1.5 09-Mar-1994  ws Make FFS optional
 1.4 20-May-1993  cgd more rcs id adding and header cleanup. i like vi macros!
 1.3 20-Apr-1993  mycroft Add consistent multiple-inclusion protection (repeat).
 1.2 21-Mar-1993  cgd after 0.2.2 "stable" patches applied
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.22 27-Dec-2006  yamt remove nqnfs.
 1.21 02-Sep-2006  yamt branches: 1.21.2;
nfsd: deal with variable-sized filehandles.
 1.20 13-Jul-2006  martin Fix alignment problems for fhandle_t, exposed by gcc4.1.

While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ,
version the getfh(2) syscall and explicitly pass the size available in
the filehandle from userland.

Discussed on tech-kern, with lots of help from yamt (thanks!).
 1.19 07-Jun-2006  kardel branches: 1.19.2;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.18 14-May-2006  elad branches: 1.18.2;
integrate kauth.
 1.17 11-Dec-2005  christos branches: 1.17.4; 1.17.6; 1.17.8; 1.17.10; 1.17.12;
merge ktrace-lwp.
 1.16 21-Apr-2004  christos branches: 1.16.12;
use VFS_MAXFIDSZ
 1.15 16-Aug-2003  yamt current trylater/jukebox retry delay is way too long and
it has a bug in the backoff calculation. so,
- clip it to 1-60 sec. (suggested by Rick Macklem)
- use a constant multiplier instead of nfs_backoff, which
is already exponential.
- move some related constant definitions to nfs.h from nqnfs.h and
prefix with NFS_ instead of NQ_ because they are not nqnfs-specific.
 1.14 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.13 29-Jun-2003  fvdl branches: 1.13.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.12 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.11 05-May-2003  yamt keep things not needed by userland in #ifdef _KERNEL.
(e.g. prototypes for in-kernel functions)
 1.10 24-Apr-2003  drochner Change some subordinate functions to take a "struct nfsnode" argument
instead of "struct vnode". This saves a number of pointer dereferences;
it sums up to about half a kB for me. And it paves the way for future
fixes.
While cleaning up, eliminate a write-only member of "struct nfsreq"
and a pointless assignment in the NFS_V2_ONLY case.
 1.9 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.8 12-May-2002  matt Eliminate commons
 1.7 09-Jun-2000  fvdl branches: 1.7.4; 1.7.6;
Some tweaks to enable NFS over IPv6. The special-casing of AF_INET
should really be removed.
 1.6 18-Feb-1996  fvdl branches: 1.6.30; 1.6.38;
Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.5 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.4 13-Dec-1994  mycroft Sync with CSRG.
 1.3 18-Aug-1994  mycroft More LIST/CIRCLEQ migration.
 1.2 29-Jun-1994  cgd branches: 1.2.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.1 08-Jun-1994  mycroft branches: 1.1.1;
Update to 4.4-Lite fs code, with local changes.
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.1 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.2.2.1 19-Aug-1994  mycroft update from trunk
 1.6.38.1 22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.6.30.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.7.6.1 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.7.4.2 20-Jun-2002  nathanw Catch up to -current.
 1.7.4.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.13.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.13.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.13.2.2 03-Aug-2004  skrll Sync with HEAD
 1.13.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.16.12.2 30-Dec-2006  yamt sync with head.
 1.16.12.1 21-Jun-2006  yamt sync with head.
 1.17.12.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.17.10.1 08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.17.8.4 03-Sep-2006  yamt sync with head.
 1.17.8.3 11-Aug-2006  yamt sync with head
 1.17.8.2 26-Jun-2006  yamt sync with head.
 1.17.8.1 24-May-2006  yamt sync with head.
 1.17.6.2 01-Jun-2006  kardel Sync with head.
 1.17.6.1 04-Feb-2006  simonb Adapt for timecounters: mostly use get*time() and use "time_second"
instead of "time.tv_sec".
 1.17.4.1 09-Sep-2006  rpaulo sync with head
 1.18.2.1 19-Jun-2006  chap Sync with head.
 1.19.2.1 13-Jul-2006  gdamore Merge from HEAD.
 1.21.2.1 12-Jan-2007  ad Sync with head.
 1.13 07-Dec-2024  riastradh sys/nfs/nfs: Add some missing includes and include guards.

Fix up some minor KNF issues while here.

No functional change intended (except to enable things to build that
might not have built before because of previously required #include
ordering).
 1.12 28-Dec-2006  yamt branches: 1.12.170;
remove several nqnfs definitions.
 1.11 11-Dec-2005  christos branches: 1.11.20;
merge ktrace-lwp.
 1.10 26-Feb-2005  perry branches: 1.10.4;
nuke trailing whitespace
 1.9 07-Aug-2003  agc branches: 1.9.8; 1.9.10;
Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.8 18-Feb-1996  fvdl branches: 1.8.64;
Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.7 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.6 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.5 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.4 18-Apr-1994  glass revised nfs diskless support. uses bootp+rpc to gather parameters
 1.3 20-May-1993  cgd more rcs id adding and header cleanup. i like vi macros!
 1.2 20-Apr-1993  cgd re-merged include file changes which got eaten by crash
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.3 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.8.64.4 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.8.64.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.8.64.2 18-Sep-2004  skrll Sync with HEAD.
 1.8.64.1 03-Aug-2004  skrll Sync with HEAD
 1.9.10.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.9.8.1 29-Apr-2005  kent sync with -current
 1.10.4.1 30-Dec-2006  yamt sync with head.
 1.11.20.1 12-Jan-2007  ad Sync with head.
 1.12.170.1 02-Aug-2025  perseant Sync with HEAD
 1.10 31-Jan-1997  thorpej This file is now obsolete.
 1.9 30-Apr-1995  cgd kill unnecessary blank line at end of file
 1.8 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.7 29-Apr-1994  glass i really wish i knew what was wrong
 1.6 18-Apr-1994  glass revised nfs diskless support. uses bootp+rpc to gather parameters
 1.5 01-Mar-1994  glass remove some warnings
 1.4 18-Jan-1994  brezak Include nfs_hack_mountroot() in NFSDISKLESS_HARDWIRE
 1.3 18-Dec-1993  mycroft Canonicalize all #includes.
 1.2 14-Oct-1993  glass this is the disgusting temporary hack to assist people booting over nfs via
hacked structures until netboot works.

the word "abortion" comes to mind.
 1.1 07-Jul-1993  cgd branches: 1.1.4;
changes from ws to support diskless booting... these are "OK" on inspection
and after testing... (actually, currently, none of the changed
code is even used...)
 1.1.4.1 14-Nov-1993  mycroft Canonicalize all #includes.
 1.16 07-Dec-2024  riastradh sys/nfs/nfs: Add some missing includes and include guards.

Fix up some minor KNF issues while here.

No functional change intended (except to enable things to build that
might not have built before because of previously required #include
ordering).
 1.15 11-Dec-2005  christos branches: 1.15.200;
merge ktrace-lwp.
 1.14 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.13 06-Mar-1999  fair branches: 1.13.42;
Snatch a patch from OpenBSD to fix PRs 6529 and 7074.
Adjust fxdr_hyper() and txdr_hyper() macros.
 1.12 10-Oct-1997  fvdl * New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.11 18-Feb-1996  fvdl branches: 1.11.12;
Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.10 01-Feb-1996  jtc Rename struct timespec fields to conform to POSIX.1b
 1.9 19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.8 13-Jan-1995  mycroft Convert unspecified usec value to 0, per discussion with Rick.
 1.7 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.6 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.5 25-Apr-1994  cgd some prototype cleanup, eliminate/replace bogus types (e.g. quad and
u_quad) -> use better types (e.g. quad_t & u_quad_t in inodes),
some cleanup.
 1.4 03-Jun-1993  cgd fix for macklem's bogus use of the va_flags field, supplied by
John Woods, jfwfrom: @ksr.com. also, fixes the following problems:
the va_gen field is in a similar position
(Suns are going to be reporting the change-date microseconds as their
"generation"), I've supplied my own set of diffs below for your inspection.
Note these aren't even compiled, but they're pretty similar to what I had
to do to our older version of OSF/1 here. (There's also an unrelated change
supplied for xdr_subs.h; the pointer types supplied to the fxdr_time() and
txdr_time() macros are not, in fact, both struct timevals. That turns out
to be one of many tips-of-the-iceberg facing those porting the (old) Berkeley
NFS code to 64-bit machines...)
 1.3 20-May-1993  cgd more rcs id adding and header cleanup. i like vi macros!
 1.2 20-Apr-1993  mycroft Add consistent multiple-inclusion protection (repeat).
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.3 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.11.12.1 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.13.42.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.13.42.2 18-Sep-2004  skrll Sync with HEAD.
 1.13.42.1 03-Aug-2004  skrll Sync with HEAD
 1.15.200.1 02-Aug-2025  perseant Sync with HEAD