History log of /src/sys/nfs/nfs_vfsops.c
Revision  Date  Author  Comments
 1.246  13-May-2024  msaitoh ficticious -> fictitious in comment.
 1.245  21-Mar-2023  christos PR/57279: Izumi Tsutsui: Fix some {int,long} -> time_t. Still things will
break eventually because parts of the nfs protocol assume time_t will fit
in 32 bits.
 1.244  17-Mar-2023  mlelstv Avoid overflow of nfs_commitsize on machines with > 32GB RAM.
 1.243  13-Jun-2021  mlelstv branches: 1.243.10;
Don't pretend that files are limited to 1TB on NFSv3.
 1.242  02-Apr-2021  christos branches: 1.242.2;
Set f_namemax during mount time like all the other filesystems so that
it gets the right data in copy_statvfs_info(). Otherwise f_namemax
can end up being 0. To reproduce: unmount the remote filesystem, remount
it, and kill -HUP mountd to refresh exports.
 1.241  13-Apr-2020  ad branches: 1.241.2; 1.241.4;
Replace most uses of vp->v_usecount with a call to vrefcnt(vp), a function
that hides the details and does atomic_load_relaxed(). Signature matches
FreeBSD.
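
The accessor idea in 1.241 can be sketched in portable C11 as follows. This is only an illustration using <stdatomic.h>, not the kernel's own atomic primitives; the names sketch_vnode and sketch_vrefcnt are hypothetical, and the real vrefcnt() signature and struct vnode layout are not shown in this log.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a vnode; only the use count is modelled. */
struct sketch_vnode {
	atomic_uint v_usecount;
};

/*
 * Callers read the count through this function instead of touching
 * the field directly, so the memory ordering is chosen in one place.
 */
static inline unsigned int
sketch_vrefcnt(struct sketch_vnode *vp)
{
	return atomic_load_explicit(&vp->v_usecount, memory_order_relaxed);
}
```

A relaxed load only guarantees that the read itself is not torn; it imposes no ordering on surrounding memory accesses, which is typically enough for a diagnostic or heuristic check.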
 1.240  16-Mar-2020  pgoyette branches: 1.240.2;
Use the module subsystem's ability to process SYSCTL_SETUP() entries to
automate installation of sysctl nodes.

Note that there are still a number of device and pseudo-device modules
that create entries tied to individual device units, rather than to the
module itself. These are not changed.
 1.239  27-Feb-2020  ad Tighten up the locking around vp->v_iflag a little more after the recent
split of vmobjlock & v_interlock.
 1.238  17-Jan-2020  ad VFS_VGET(), VFS_ROOT(), VFS_FHTOVP(): give them a "int lktype" argument, to
allow us to get shared locks (or no lock) on the returned vnode. Matches
FreeBSD.
 1.237  03-Sep-2018  riastradh branches: 1.237.4; 1.237.6;
Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
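
The truncation hazard described in 1.237 can be illustrated with a small sketch. The names sketch_uimin and SKETCH_MIN are hypothetical stand-ins for a helper defined on unsigned int and a type-preserving macro; the sketch assumes unsigned int is 32 bits wide, as on the common 64-bit ABIs.

```c
#include <stdint.h>

/* Hypothetical stand-in for a helper defined on unsigned int only. */
static inline unsigned int
sketch_uimin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Type-preserving macro in the style of MIN(); evaluates arguments twice. */
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * 64-bit values passed to the unsigned int helper lose their upper
 * bits. The casts are written out here; without them the conversion
 * would happen implicitly at the call.
 */
static uint64_t
truncating_min(uint64_t a, uint64_t b)
{
	return sketch_uimin((unsigned int)a, (unsigned int)b);
}

/* The macro compares at the full width of its operands. */
static uint64_t
preserving_min(uint64_t a, uint64_t b)
{
	return SKETCH_MIN(a, b);
}
```

With a = 2^32 and b = 5, the helper sees a as 0 and returns 0, while the macro returns 5. Giving the function an explicit name makes such call sites easier to spot during an audit.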
 1.236  16-Mar-2018  christos branches: 1.236.2;
PR/53103: Timo Buhrmester: linux emulation of sendto(2) broken

The sockargs refactoring broke it, because sockargs only works with a user
address. Added an argument to sockargs to indicate where the address is
coming from. Welcome to 8.99.14.
 1.235  17-Apr-2017  hannken branches: 1.235.10;
Remove unused argument "nextp" from vfs_busy() and vfs_unbusy().
Remove argument "keepref" from vfs_unbusy() and add vfs_ref() where needed.
 1.234  17-Apr-2017  hannken Add vfs_ref(mp) and vfs_rele(mp) to add or remove a reference to
struct mount. Rename vfs_destroy(mp) to vfs_rele(mp) and replace
incrementing mp->mnt_refcnt with vfs_ref(mp).
 1.233  01-Apr-2017  riastradh KASSERT(mutex_owned(vp->v_interlock)) in vnode iterator selector.
 1.232  17-Feb-2017  hannken Add generic genfs_suspendctl() and use it for all file systems.
Layered file systems need work.
 1.231  02-Nov-2015  pgoyette branches: 1.231.2; 1.231.4;
Don't forget to call nfs_fini() when we're finished. Without this,
we leave a dangling pool nfsrvdescpl around.
 1.230  15-Jul-2015  manu Fix soft NFS force unmount

For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that await an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.

Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.

Reviewed by Chuck Silvers.
 1.229  30-May-2014  hannken branches: 1.229.2; 1.229.4;
Change NFS from rbtree to vcache.
 1.228  24-May-2014  christos Introduce a selector function to the vfs vnode iterator so that we don't
need to vget() vnodes that we are not interested in, and optimize locking
a bit. Iterator changes reviewed by Hannken (thanks), the rest of the bugs
are mine.
 1.227  16-Apr-2014  maxv An (un)privileged user can easily make the kernel dereference a NULL
pointer.

The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).

ok christos@
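
The rule stated in 1.227 (the file system, not the caller, must reject a NULL 'data' argument when it needs one) might look like the sketch below. The function sketch_fs_mount and struct sketch_mount_args are hypothetical, and the choice of EINVAL as the error is an assumption; the log does not show the actual check.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical mount arguments; the real argument layout is not shown here. */
struct sketch_mount_args {
	int version;
};

/*
 * A mount entry point that needs its argument block validates the
 * pointer before the first dereference.
 */
static int
sketch_fs_mount(void *data)
{
	struct sketch_mount_args *args = data;

	if (args == NULL)
		return EINVAL;	/* assumed error code */
	if (args->version < 0)
		return EINVAL;
	return 0;
}
```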
 1.226  23-Mar-2014  hannken branches: 1.226.2;
Change all vfsops to use C99 designated initializers.

No functional changes intended.
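
As an illustration of the style adopted in 1.226, a table of operations written with C99 designated initializers could look like this. The structure sketch_vfsops and its members are hypothetical and much smaller than a real vfsops table.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced operations table. */
struct sketch_vfsops {
	const char *vfs_name;
	int (*vfs_mount)(void);
	int (*vfs_unmount)(void);
	int (*vfs_sync)(void);
};

static int sketch_mount(void) { return 0; }
static int sketch_unmount(void) { return 0; }

/*
 * Each member is named at the point of initialization, so adding or
 * reordering members in the structure does not silently shift values.
 * Members left out (vfs_sync here) are zero-initialized, i.e. NULL.
 */
static const struct sketch_vfsops sketch_nfs_vfsops = {
	.vfs_name = "nfs",
	.vfs_mount = sketch_mount,
	.vfs_unmount = sketch_unmount,
};
```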
 1.225  17-Mar-2014  hannken Change nfs_sync() to use vfs_vnode_iterator.
 1.224  25-Feb-2014  pooka Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
 1.223  23-Nov-2013  christos change the mountlist CIRCLEQ into a TAILQ
 1.222  14-Sep-2013  martin Remove unused variable
 1.221  22-Jan-2013  dholland branches: 1.221.2;
Stuff UFS_ in front of a few of ufs's symbols to reduce namespace
pollution. Specifically:
ROOTINO -> UFS_ROOTINO
WINO -> UFS_WINO
NXADDR -> UFS_NXADDR
NDADDR -> UFS_NDADDR
NIADDR -> UFS_NIADDR
MAXSYMLINKLEN -> UFS_MAXSYMLINKLEN
MAXSYMLINKLEN_UFS[12] -> UFS[12]_MAXSYMLINKLEN (for consistency)

Sort out ext2fs's misuse of NDADDR and NIADDR; fortunately, these have
the same values in ext2fs and ffs.

No functional change intended.
 1.220  24-Oct-2011  hannken branches: 1.220.2; 1.220.8; 1.220.12; 1.220.14; 1.220.16;
VOP_GETATTR() needs a shared lock at least.

As nfs_kqpoll() ignores the return value from VOP_GETATTR() initialize
the attributes to zero -- nfs_kqfilter() does the same.
 1.219  07-Oct-2011  hannken As vnalloc() always allocates with PR_WAITOK there is no longer the need
to test its result for NULL.
 1.218  12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.217  12-Aug-2010  pooka branches: 1.217.6;
Do not return a garbage vnode in vpp if fhtovp fails.

Fixes PR kern/43745 for nfs.
 1.216  21-Jul-2010  hannken Make holding v_interlock mandatory for callers of vget().

Announced some time ago on tech-kern.
 1.215  09-Jul-2010  hannken nfs_unmount(): No need to take a second reference for the root node.

nfs_root(): Replace vget() with vref()/vn_lock(), this node already
has a reference.
 1.214  24-Jun-2010  hannken Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
 1.213  24-Jun-2010  hannken Clean up vnode lock operations:

- VOP_LOCK(vp, flags): Limit the set of allowed flags to LK_EXCLUSIVE,
LK_SHARED and LK_NOWAIT. LK_INTERLOCK is no longer allowed as it
makes no sense here.

- VOP_ISLOCKED(vp): Remove the for some time unused return value
LK_EXCLOTHER. Mark this operation as "diagnostic only".
Making a lock decision based on this operation is no longer allowed.

Discussed on tech-kern.
 1.212  15-May-2010  dholland nfs_statvfs should return NFS_MAXNAMLEN, not MAXNAMLEN.
(Compile-tested only, but that should be ok)
 1.211  02-Mar-2010  pooka branches: 1.211.2;
Get rid of dependency on fs_nfs.h, i.e. source modules with
conditional content depending on if the NFS client is wanted or
not. The server can now be made an independent module not depending
on the nfs client.

Tested with rump_nfs (standalone client), rump_nfsd (standalone
nfsd) and a qemu installation with both the client and the server.
 1.210  15-Mar-2009  cegger branches: 1.210.2;
ansify function definitions
 1.209  14-Mar-2009  dsl ANSIfy another 1261 function definitions.
The only ones left in sys are beyond my sed script!
(or in sys/dist or sys/external)
Mostly they have function pointer parameters.
 1.208  14-Mar-2009  dsl Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
 1.207  14-Mar-2009  dsl Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.206  17-Dec-2008  cegger branches: 1.206.2;
kill MALLOC and FREE macros.
 1.205  19-Nov-2008  ad Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.204  14-Nov-2008  ad Remove COMPAT ifdefs that might as well be comments (i.e., they cost us
almost nothing).
 1.203  22-Oct-2008  matt branches: 1.203.2; 1.203.4; 1.203.10; 1.203.14;
Don't need nfs_vfs_reinit anymore since we don't resize tables anymore.
Move reinit code to init case.
 1.202  22-Oct-2008  matt Change NFS to use a RB-tree for its FH->nfsnode lookups.
 1.201  30-Sep-2008  pooka Since the nfs root vnode is eternally constant, fully initialize
it in mountfs instead of deferring part of the initialization to
VFS_ROOT(). Fixes theoretical future bugs for nfs roots.
 1.200  10-May-2008  rumble branches: 1.200.4;
Convert file systems to dynamically attach with the new module interface.
Make VFS hooks dynamic while we're here and say farewell to VFS_ATTACH and
VFS_HOOKS_ATTACH linksets.

As a consequence, most of the file systems can now be loaded as new style
modules.

Quick sanity check by ad@.
 1.199  06-May-2008  ad branches: 1.199.2;
PR kern/38141 lookup/vfs_busy acquire rwlock recursively

Simplify the mount locking. Remove all the crud to deal with recursion on
the mount lock, and crud to deal with unmount as another weirdo lock.

Hopefully this will once and for all fix the deadlocks with this. With this
commit there are two locks on each mount:

- krwlock_t mnt_unmounting. This is used to prevent unmount across critical
sections like getnewvnode(). It's only ever read locked with rw_tryenter(),
and is only ever write locked in dounmount(). A write hold can't be taken
on this lock if the current LWP could hold a vnode lock.

- kmutex_t mnt_updating. This is taken by threads updating the mount, for
example when going r/o -> r/w, and is only present to serialize updates.
In order to take this lock, a read hold must first be taken on
mnt_unmounting, and the two need to be held across the operation.

One effect of this change: previously if an unmount failed, we would make a
half hearted attempt to back out of it gracefully, but that was unlikely to
work in a lot of cases. Now while an unmount that will be aborted is in
progress, new file operations within the mount will fail instead of being
delayed. That is unlikely to be a problem though, because if the admin
requests unmount of a file system then s(he) has made a decision to deny
access to the resource.
 1.198  30-Apr-2008  ad PR kern/38135 vfs_busy/vfs_trybusy confusion

The previous fix worked, but it opened a window where mounts could have
disappeared from mountlist while the caller was traversing it using
vfs_trybusy(). Fix that.
 1.197  29-Apr-2008  ad PR kern/38057 ffs makes assumptions about devvp file system
PR kern/33406 softdeps get stuck in endless loop

Introduce VFS_FSYNC() and call it when syncing a block device, if it
has a mounted file system.
 1.196  13-Feb-2008  yamt branches: 1.196.6; 1.196.8; 1.196.10;
reject files larger than nm_maxfilesize.
 1.195  13-Feb-2008  yamt nfs_mountroot: kmem_alloc+memset -> kmem_zalloc
 1.194  30-Jan-2008  ad PR kern/37706 (forced unmount of file systems is unsafe):

- Do reference counting for 'struct mount'. Each vnode associated with a
mount takes a reference, and in turn the mount takes a reference to the
vfsops.
- Now that mounts are reference counted, replace the overcomplicated mount
locking inherited from 4.4BSD with a recursable rwlock.
 1.193  28-Jan-2008  dholland Fix some race conditions in rename.
Introduce a per-FS rename lock and new vfsops to manipulate it.
Get this lock while renaming. Also add another relookup() in do_sys_rename,
which is a hack to kludge around some of the worst deficiencies of
ufs_rename.
reviewed-by: pooka (and an earlier rev by ad)
posted on tech-kern with no objections.
 1.192  20-Jan-2008  joerg Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.
 1.191  03-Jan-2008  pooka valloc -> vnalloc, vfree -> vnfree
Avoids collision with userland valloc(3).

no functional change
ad ok
 1.190  02-Jan-2008  yamt use kmem_alloc instead of malloc.
 1.189  02-Jan-2008  ad Merge vmlocking2 to head.
 1.188  26-Nov-2007  pooka branches: 1.188.2; 1.188.6;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.187  28-Oct-2007  yamt branches: 1.187.2;
make NFS_ATTRTIMEO a function.
 1.186  10-Oct-2007  ad branches: 1.186.2;
Merge from vmlocking:

- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
 1.185  06-Sep-2007  rmind branches: 1.185.2;
nfs_mount: Plug possible leaks.
Introduced in rev 1.114.
From CID: 4534
 1.184  10-Aug-2007  yamt branches: 1.184.2;
- instead of scanning an array of iods, maintain a list of idle iods.
- make nfs_getset_niothreads MP friendly.
 1.183  05-Aug-2007  yamt branches: 1.183.2;
use kpause rather than lbolt.
 1.182  31-Jul-2007  pooka branches: 1.182.2;
* nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
use VFS_PROTOS() instead of manually prototyping the methods
 1.181  26-Jul-2007  pooka Use eopnotsupp() instead of vfs_stdsuspendctl() and retire the latter.
 1.180  20-Jul-2007  pooka In sync, skip over vnodes based on if they are clean rather than
if they have pages.
 1.179  17-Jul-2007  pooka branches: 1.179.2;
Make set_statvfs_info() take a parameter for the vfs name instead
of always retrieving it from mp->mnt_op->vfs_name

christos ok
 1.178  12-Jul-2007  dsl Change the VFS_MOUNT() interface so that the 'data' buffer passed to the
fs code is a kernel buffer, pass through the length of the buffer as well.
Since the length of the userspace buffer isn't (yet) passed through the mount
system call, add a field to the vfsops structure containing the default length.
Split sys_mount() for calls from compat code.
Ride one of the recent kernel version changes - old fs LKMs will load, but
sys_mount() will reject any attempt to use them.
 1.177  29-Apr-2007  yamt don't forget to destroy mutex and condvar.
 1.176  29-Apr-2007  yamt use condvar.
 1.175  29-Apr-2007  yamt use mutex and condvar.
 1.174  04-Mar-2007  christos branches: 1.174.2; 1.174.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.173  22-Feb-2007  thorpej TRUE -> true, FALSE -> false
 1.172  15-Feb-2007  yamt branches: 1.172.2;
use mutex and rwlock rather than lockmgr.
 1.171  19-Jan-2007  hannken New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).
 1.170  27-Dec-2006  yamt - remove the rest of nqnfs.
- reject NFSMNT_MNTD and NFSMNT_KERB. (no users in tree.)
 1.169  27-Dec-2006  yamt remove nqnfs.
 1.168  09-Nov-2006  yamt remove some __unused in function parameters.
 1.167  25-Oct-2006  reinoud Revisit mnt_vnodelist TAILQ patch. Remove all suspicious TAILQ_FOREACH()
loops where vnodes can get removed or added during the loops. This could
lead to panics on unmount since nodes are skipped or otherwise
TAILQ_NEXT(0xdeadbeef, ...) was dereferenced.
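
The hazard described in 1.167 arises because TAILQ_FOREACH() reads the next pointer from the current element after the loop body has run. One common way to make removal safe is to fetch the successor before the body, as in the sketch below. It uses the userland <sys/queue.h> macros; sketch_node and sketch_reap_clean are hypothetical names, and the sketch does not model locking or sleeping, which the kernel loops also have to handle.

```c
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical list element standing in for a vnode on a mount list. */
struct sketch_node {
	int dirty;
	TAILQ_ENTRY(sketch_node) entries;
};
TAILQ_HEAD(sketch_list, sketch_node);

/* Remove and free every clean node; returns the number removed. */
static int
sketch_reap_clean(struct sketch_list *head)
{
	struct sketch_node *np, *next;
	int removed = 0;

	for (np = TAILQ_FIRST(head); np != NULL; np = next) {
		/* Fetch the successor before np can be unlinked and freed. */
		next = TAILQ_NEXT(np, entries);
		if (!np->dirty) {
			TAILQ_REMOVE(head, np, entries);
			free(np);
			removed++;
		}
	}
	return removed;
}
```

Saving the successor only protects against removal of the current element. If other elements can be removed while the loop body runs, for example because the body sleeps, a different scheme is needed.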
 1.166  20-Oct-2006  reinoud Replace the LIST structure mp->mnt_vnodelist with a TAILQ structure since all
vnodes were synced and processed backwards. This meant that the last
accessed node was processed first and the earliest last.

An extra benefit is the removal of the ugly hack from the Berkeley days on
LFS.

In the process, I've also replaced the various hand-written loop variations
with the TAILQ_FOREACH() macro.
 1.165  12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.164  02-Sep-2006  yamt branches: 1.164.2; 1.164.4;
nfs_fhtovp: try to detect stale or invalid handles by issuing VOP_GETATTR.
 1.163  02-Sep-2006  yamt implement vptofh and fhtovp for nfs.
 1.162  02-Sep-2006  christos fix default type decls
fix incomplete initializer
 1.161  24-Aug-2006  christos Don't free what we did not allocate.
 1.160  23-Aug-2006  christos Change iostat_alloc() to take the parent pointer and the name directly, so
that callers are not responsible for initializing the fields. Store the name
inside the struct instead of maintaining a pointer to external storage, or
leaked memory (nfs case).
 1.159  23-Jul-2006  ad Use the LWP cached credentials where sane.
 1.158  13-Jul-2006  martin Fix alignment problems for fhandle_t, exposed by gcc4.1.

While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ,
version the getfh(2) syscall and explicitly pass the size available in
the filehandle from userland.

Discussed on tech-kern, with lots of help from yamt (thanks!).
 1.157  07-Jun-2006  kardel branches: 1.157.2;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.156  20-May-2006  yamt mountnfs: reject wrongly-sized filehandle for nfsv2.
 1.155  14-May-2006  elad branches: 1.155.2;
integrate kauth.
 1.154  20-Apr-2006  blymn Prefix iostat structure elements with io_
 1.153  14-Apr-2006  blymn Make i/o statistics collection more generic, include tape drives and
nfs mounts in the set of devices that statistics will be reported on.
 1.152  21-Feb-2006  thorpej branches: 1.152.2; 1.152.4; 1.152.6;
Use device_class() instead of accessing dv_class directly.
 1.151  11-Dec-2005  christos branches: 1.151.2; 1.151.4; 1.151.6;
merge ktrace-lwp.
 1.150  23-Sep-2005  jmmv Apply the NFS exports list rototill patch:

- Remove all NFS related stuff from file system specific code.
- Drop the vfs_checkexp hook and generalize it in the new nfs_check_export
function, thus removing redundancy from all file systems.
- Move all NFS export-related stuff from kern/vfs_subr.c to the new
file sys/nfs/nfs_export.c. The former was becoming large and its code
is always compiled, regardless of the build options. Using the latter,
the code is only compiled in when NFSSERVER is enabled. While doing this,
also make some functions in nfs_subs.c conditional to NFSSERVER.
- Add a new command in nfssvc(2), called NFSSVC_SETEXPORTSLIST, that takes a
path and a set of export entries. At the moment it can only clear the
exports list or append entries, one by one, but it is done in a way that
allows setting the whole set of entries atomically in the future (see the
comment in mountd_set_exports_list or in doc/TODO).
- Change mountd(8) to use the nfssvc(2) system call instead of mount(2) so
that it becomes file system agnostic. In fact, all this whole thing was
done to remove a 'XXX' block from this utility!
- Change the mount*, newfs and fsck* userland utilities to not deal with NFS
exports initialization; done internally by the kernel when initializing
the NFS support for each file system.
- Implement an interface for VFS (called VFS hooks) so that several kernel
subsystems can run arbitrary code upon receipt of specific VFS events.
At the moment, this only provides support for unmount and is used to
destroy NFS exports lists from the file systems being unmounted, though it
has room for extension.

Thanks go to yamt@, chs@, thorpej@, wrstuden@ and others for their comments
and advice in the development of this patch.
 1.149  19-Sep-2005  christos ATTRTIMEO takes 2 args.
 1.148  09-Jun-2005  atatat branches: 1.148.2;
Properly fix the constipated lossage wrt -Wcast-qual and the sysctl
code. I know it's not the prettiest code, but it seems to work rather
well in spite of itself.
 1.147  29-May-2005  christos - sprinkle const
- avoid shadowed variables
- mark bad const use with XXXUNCONST
 1.146  29-Mar-2005  thorpej - Define a VFS_ATTACH() macro that places a reference to a vfsops structure
into the "vfsops" link set.
- Use VFS_ATTACH() where vfsops are declared for individual file systems.
- In vfsinit(), traverse the "vfsops" link set, rather than vfs_list_initial[].
 1.145  26-Feb-2005  perry branches: 1.145.2;
nuke trailing whitespace
 1.144  02-Jan-2005  thorpej branches: 1.144.2; 1.144.4;
Add the system call and VFS infrastructure for file system extended
attributes.

From FreeBSD.
 1.143  15-Aug-2004  mycroft Fixing age old cruft:
* Rather than using mnt_maxsymlinklen to indicate that a file system returns
d_type fields(!), add a new internal flag, IMNT_DTYPE.

Add 3 new elements to ufsmount:
* um_maxsymlinklen, replaces mnt_maxsymlinklen (which never should have existed
in the first place).
* um_dirblksiz, which tracks the current directory block size, eliminating the
FS-specific checks littered throughout the code. This may be used later to
make the block size variable.
* um_maxfilesize, which is the maximum file size, possibly adjusted lower due
to implementation issues.

Sync some bug fixes from FFS into ext2fs, particularly:
* ffs_lookup.c 1.21, 1.28, 1.33, 1.48
* ffs_inode.c 1.43, 1.44, 1.45, 1.66, 1.67
* ffs_vnops.c 1.84, 1.85, 1.86

Clean up some crappy pointer frobnication.
 1.142  12-Jul-2004  yamt nfs_fsinfo: when changing rsize/wsize,
keep mnt_fs_bshift in-sync. otherwise genfs_getpages behaves badly.
 1.141  05-Jul-2004  pk Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().
 1.140  25-May-2004  hannken Add ffs internal snapshots. Written by Marshall Kirk McKusick for FreeBSD.

- Not enabled by default. Needs kernel option FFS_SNAPSHOT.
- Change parameters of ffs_blkfree.
- Let the copy-on-write functions return an error so spec_strategy
may fail if the copy-on-write fails.
- Change genfs_*lock*() to use vp->v_vnlock instead of &vp->v_lock.
- Add flag B_METAONLY to VOP_BALLOC to return indirect block buffer.
- Add a function ffs_checkfreefile needed for snapshot creation.
- Add special handling of snapshot files:
Snapshots may not be opened for writing and the attributes are read-only.
Use the mtime as the time this snapshot was taken.
Deny mtime updates for snapshot files.
- Add function transferlockers to transfer any waiting processes from
one lock to another.
- Add vfsop VFS_SNAPSHOT to take a snapshot and make it accessible through
a vnode.
- Add snapshot support to ls, fsck_ffs and dump.

Welcome to 2.0F.

Approved by: Jason R. Thorpe <thorpej@netbsd.org>
 1.139  25-May-2004  atatat Sysctl descriptions under vfs subtree
 1.138  22-May-2004  jonathan Eliminate several uses of `curproc' from the socket-layer code and from NFS.

Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded by Matt Thomas.

Changes to soreceive() and (*dom->dom_externalize)() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize. Also, now soreceive() uses
its uio->uio_procp value, pass that same value downward to
(*pr->pru_usrreq)() calls for consistency, instead of (struct proc *)0.

Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req->r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.

Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
 1.137  27-Apr-2004  jrf First pass for some caddr_t removal and changes to get rid of it where we
no longer use and/or need it

- removed casts from unionfs, deadfs and fdesc
(there are more to hunt down still)
- changed vfs_quotactl args argument from caddr_t to void *
- changed vfs_quotactl structures/callers to reflect the api change

Compiled fine and ran for about a day. Approved/reviewed by
christos@netbsd.org and gimpy@netbsd.org.
 1.136  21-Apr-2004  christos Replace the statfs() family of system calls with statvfs().
Retain binary compatibility.
 1.135  24-Mar-2004  atatat branches: 1.135.2;
Tango on sysctl_createv() and flags. The flags have all been renamed,
and sysctl_createv() now uses more arguments.
 1.134  04-Dec-2003  atatat Dynamic sysctl.

Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.

Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded. Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.

All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.

PS - I'm sorry, but there's a distinct lack of documentation at the
moment. I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
 1.133  02-Oct-2003  itojun plug mbuf leak due to manual mbuf handling. PR kern/13807.
(martti confirmed that it stabilizes the situation described in kern/13807)
 1.132  07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.131  29-Jun-2003  fvdl branches: 1.131.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.130  29-Jun-2003  thorpej Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.129  28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.128  21-May-2003  yamt remove local definitions of TRUE and FALSE.
 1.127  03-May-2003  yamt better handling of write verifier change.
 1.126  24-Apr-2003  drochner Change some subordinate functions to take a "struct nfsnode" argument
instead of "struct vnode". This saves a number of pointer dereferences;
it sums up to about half a kB for me. And it paves the way for future
fixes.
While cleaning up, eliminate a write-only member of "struct nfsreq"
and a pointless assignment in the NFS_V2_ONLY case.
 1.125  16-Apr-2003  christos PR/1796: John Kohl: statfs misbehaves under chrooted environments.

- Under chroot it displays only the visible filesystems with appropriate paths.
- The statfs f_mntonname gets adjusted to contain the real path from root.
- While there, fixed a bug in ext2fs, locking problems with vfs_getfsstat(),
and factored out some of the vfsop statfs() code to copy_statfs_info(). This
fixes the problem where some filesystems forgot to set fsid.
- Made coda look more like a normal fs.
 1.124  02-Apr-2003  yamt use queue manipulation macros.
 1.123  28-Mar-2003  yamt if rsize was explicitly specified by mount_nfs,
prefer it to rtpref from nfsd. the same for wsize and wtpref.

ok'ed by fvdl.
 1.122  26-Feb-2003  matt Add MBUFTRACE kernel option.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
 1.121  01-Feb-2003  thorpej Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.
 1.120  24-Nov-2002  scw Fix an uninitialised variable warning.
 1.119  21-Oct-2002  yamt fix a page locking deadlock problem for nfs.

add a flag that specifies if the file can be truncated safely or not
to nfsm_loadattr and friends. when it isn't safe, just mark the nfsnode
as "should be truncated later".

ok'ed by Frank van der Linden and Chuck Silvers.
close kern/18036.
 1.118  21-Oct-2002  enami When printing filesystem specific parameters, also print the address and
port of server numerically.
 1.117  01-Oct-2002  christos forgot to set deadthresh; thanks to YAMAMOTO Takashi.
 1.116  21-Sep-2002  christos MNT_GETARGS support
 1.115  30-Jul-2002  soren Die, qaddr_t, die! - mnt_data in struct mount is already effectively
a void *, so stop pretending otherwise.
 1.114  26-Jul-2002  enami Synchronize code and comment again to prevent mbuf leak. Sprinkle some
KNF while I'm here.
 1.113  25-Jul-2002  jdolecek Reduce stack usage on the NFS mount code path. This fixes kernel stack
overflow when using IPsec on vax, as reported by Olaf Seibert on
current-users@.
 1.112  04-Dec-2001  christos branches: 1.112.8; 1.112.10;
PR/14817: Gregory McGarry: NFS_V2_ONLY doesn't seem to work.
 1.111  10-Nov-2001  lukem add RCSIDs
 1.110  08-Oct-2001  chs branches: 1.110.2;
revert a change that I accidentally included with ubcperf.
 1.109  20-Sep-2001  chs fix nfs_bmap() so that it works for both genfs_{get,put}pages() and swap/vnd.
 1.108  15-Sep-2001  chs a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
 1.107  15-Sep-2001  chs add a new VFS op, vfs_reinit, which is called when desiredvnodes is
adjusted via sysctl. file systems that have hash tables which are
sized based on the value of this variable now resize those hash tables
using the new value. the max number of FFS softdeps is also recalculated.

convert various file systems to use the <sys/queue.h> macros for
their hash tables.
 1.106  30-Jul-2001  jdolecek branches: 1.106.2;
Check the passed file handle length _before_, not _after_ copyin()
 1.105  30-Jul-2001  fvdl Check the length of a passed in filehandle to the mount call before
doing a copyin. From Ken Ashcraft @ Stanford via Constantine Sapuntzakis.
 1.104  01-Jul-2001  gmcgarry branches: 1.104.2;
Introduce NFS_DEFAULT_NIOTHREADS to define the default number
of nfs_niothreads instead of hard-coding 4.

This change has the advantage that the default can be specified
at compile time. If the root filesystem is mounted over NFS
we don't have an opportunity to use the syscall to limit the
number of threads. Useful on small-memory machines.
 1.103  30-May-2001  mrg use _KERNEL_OPT
 1.102  28-Apr-2001  bjh21 When NFS_V2_ONLY is defined, refuse to mount NFSv3 and NQNFS filesystems,
rather than pretending they're NFSv2 and hoping for the best. Fix based on
that supplied by Christian Groessler.
 1.101  12-Feb-2001  fvdl branches: 1.101.2;
Instead of storing the filehandle in the mount structure, store the
vnode pointer. This avoids a locking problem with nfs_nget, and
can be done because we always have a reference on the root vnode
of the filesystem.
 1.100  06-Feb-2001  fvdl Do actual vnode locking for NFS.
 1.99  22-Jan-2001  jdolecek make filesystem vnodeop, specop, fifoop and vnodeopv_* arrays const
 1.98  10-Dec-2000  chs in *_sync(), don't skip vnodes which have (potentially dirty) pages.
 1.97  27-Nov-2000  chs Initial integration of the Unified Buffer Cache project.
 1.96  19-Sep-2000  fvdl Update for VOP_FSYNC parameter change.
 1.95  19-Sep-2000  bjh21 New kernel option, NFS_V2_ONLY, which aims to reduce the NFS client to just
that required to support NFSv2 mounts. Not finished yet, but already
provides some 44k of saving in code size on arm26. More savings, and some
documentation, are still to come.
 1.94  23-Aug-2000  enami Update nfs mount flags correctly. Fixes a bug introduced in rev. 1.65.
 1.93  30-Jul-2000  simonb Remove inclusion of <uvm/uvm_extern.h> that was there only to keep
<sys/sysctl.h> happy.
 1.92  27-Jun-2000  mrg remove include of <vm/vm.h>
 1.91  10-Jun-2000  assar branches: 1.91.2;
make vfs_getnewfsid only take one argument and fetch the name of the
filesystem from the supplied mount argument. also make makefstype
take a const parameter. update all the callers.
 1.90  07-May-2000  tsarna branches: 1.90.2;
Auto-adjusting vfs.nfs.iothreads: when mounting the first nfs
filesystem, if the number of threads is "-1", meaning it's never been
set, then set it to 4. You can override by setting this to some other
number (including 0) before or after mounting, of course.

Thanks to whoever it was that suggested this on ICB... sorry I don't
remember who.
 1.89  15-Apr-2000  tsarna Death to nfsiod!

It is replaced by kernel threads that do the same thing. The number of
kernel threads used is set with the vfs.nfs.iothreads sysctl.
 1.88  30-Mar-2000  augustss Remove register declarations.
 1.87  29-Mar-2000  simonb Don't need to include <sys/conf.h> here.
 1.86  16-Mar-2000  jdolecek Add new VFS op routine - vfs_done and call it on filesystem detach
in vfs_detach(). vfs_done may free global filesystem's resources,
typically those allocated in respective filesystem's init function.
Needed so those filesystems which went in via LKM have a chance to
clean after themselves before unloading. This fixes random panics
when LKM for filesystem using pools was loaded and unloaded several
times.

For each leaf filesystem, add appropriate vfs_done routine.
 1.85  15-Nov-1999  fvdl Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O
 1.84  29-Aug-1999  sommerfeld branches: 1.84.2; 1.84.4; 1.84.8;
Once the mount structure is definitely doomed, always set the
NFSMNT_DISMNT bit in it so that any waiters can go away cleanly.
(formerly, we did this only in the NQNFS/KERB cases).
 1.83  06-Mar-1999  fair branches: 1.83.2; 1.83.4;
Snatch a patch from OpenBSD to fix PRs 6529 and 7074.
Adjust fxdr_hyper() and txdr_hyper() macros.
 1.82  05-Mar-1999  mycroft Clean up some sign extension bogosity in statfs, so negative numbers are
actually negative on a LP64 client.
 1.81  26-Feb-1999  wrstuden Modify vfsops to separate vfs_fhtovp() into two routines. vfs_fhtovp() now
only handles the file handle to vnode conversion, and a new call,
vfs_checkexp(), performs the export verification.
 1.80  21-Feb-1999  drochner -call nfs_boot_cleanup() if mount failed
-g/c diskless swap initialization
 1.79  12-Nov-1998  fvdl Use different names for the "nfscon" label to tsleep(), so that it can
be seen in which one a process is sleeping.
 1.78  28-Sep-1998  drochner Use the "atime" instead of "mtime" of the remote root directory as
base for inittodr() - it is closer to the current time.
 1.77  09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.76  05-Jul-1998  jonathan * defopt COMPAT_{09,10,11,12,13} and COMPAT_NOMID.
TODO: revisit interaction between native compat and emul compat usage.
 1.75  24-Jun-1998  sommerfe Always include fifos; "not an option any more".
 1.74  22-Jun-1998  sommerfe defopt for options FIFO
 1.73  05-Jun-1998  kleink Convert fsync vnode operator implementations and usage from the old `waitfor'
argument and MNT_WAIT/MNT_NOWAIT to `flags' and FSYNC_WAIT.
 1.72  24-Mar-1998  fvdl Re-instate call to "safe" disconnect function that got lost during the
Lite2 merge.
 1.71  03-Mar-1998  thorpej Historical practice assumes that NFS root mounts are initially read/write.
 1.70  03-Mar-1998  fvdl Don't try to apply the cookie endian heuristic on a mounted file (e.g.
a swapfile). From Matthias Drochner.
 1.69  01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.68  18-Feb-1998  thorpej Place a pointer to an array of our vnodeopv_desc *'s in our vfsops
structure, for use by vfs_attach().
 1.67  30-Jan-1998  fvdl Only take the receive lock before disconnecting when doing it from
nfs_decode_args. Otherwise we might just end up locking against ourselves.

XXX workaround, will do ok for now. Proper fix forthcoming.
 1.66  19-Oct-1997  fvdl branches: 1.66.2;
* Implement optional 32 <-> 64 bit directory cookie translation. This uses
the directory cache as translation table. See nfs_subs.c for comments.
Makes the code a bit more complex to look at than I would have liked,
but doesn't affect the speed of the default behavior.
* Optimize caching behavior a bit when buffers are invalidated.
* Save some RPCs in readdir operations by not bothering if there is
a small amount left to do to fill the buffer. It'll be done in the
next RPC with a larger chunk anyway. Wastes a bit of buffer space
but is faster.
* Make n_vattr an allocated vattr struct. This avoids nfsnode bloat,
and is friendlier to the malloc routines.
 1.65  10-Oct-1997  fvdl * New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.64  09-Sep-1997  gwr Move the call to nfs_boot_getfh() from nfs_vfsops.c to nfs_boot.c
(just for better isolation - it can now be static)
 1.63  29-Aug-1997  gwr Supporting changes for the new BOOTP support in nfs_mountroot.
 1.62  18-Jul-1997  christos branches: 1.62.2;
Fix reversed test for version 3 that broke nfs version 2 mounts.
 1.61  17-Jul-1997  fvdl * Deal with servers that don't give complete FSINFO (like NT)
From Olaf Seibert <rhialto@polder.ubc.kun.nl> (PR 3687)
* Make an attempt to check the maximum filesize before attempting
a write to the server, as write RPCs will typically happen
asynchronously, and the process will not see the error.
Fixes problems with unexpectedly truncated files at 4G
* Pass up errors in nfs_writerpc correctly
 1.60  12-Jun-1997  mrg remove swap configuration.
 1.59  27-May-1997  gwr Minor reorganization of nfs_mountroot code to simplify BOOTP support.
The RPC/bootparamd calls to get the root and swap paths are now done
in nfs_boot_init() instead of nfs_boot_getfh(), so the latter now just
does the RPC/mountd call. Also changed some panics into error returns.
 1.58  22-Feb-1997  fvdl Silently clear NFSMNT_NOCONN if it's a TCP mount.
 1.57  04-Feb-1997  fvdl branches: 1.57.2; 1.57.4;
* Make sure a new socket is created when switching to/from NOCONN with
a mount
* Add extra printf statements to hopefully get some more info on lockups,
specifically when a send error is ignored.
 1.56  31-Jan-1997  thorpej - Add nfs_mountroot to nfs_vfsops.
- Only attempt to mount NFS root on a DV_IFNET class device.
- If nfs_boot_init() fails, return the error code to the caller.
 1.55  22-Dec-1996  cgd branches: 1.55.2;
Change the second and third args to struct vfsops' (*vfs_mount)() to
'const char *', and 'void *', respectively. The second arg is taken directly
from user arguments, and is const there, so must be const in the prototypes
and functions. The third arg is also taken directly from user arguments.
It doesn't have to be changed, but since it's cleaner to keep the type
the same as the user arg's type, and I'm already making the 'const char *'
change...
 1.54  03-Dec-1996  thorpej Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.
 1.53  02-Dec-1996  thorpej NFS performance improvement from Doug Rabson/FreeBSD:

Improve the queuing algorithms used by NFS' asynchronous i/o. The
existing mechanism uses a global queue for some buffers and the
vp->b_dirtyblkhd queue for others. This turns sequential writes into
randomly ordered writes to the server, affecting both read and write
performance. The existing mechanism also copes badly with hung
servers, tending to block accesses to other servers when all the iods
are waiting for a hung server.

The new mechanism uses a queue for each mount point. All asynchronous
i/o goes through this queue which preserves the ordering of requests.
A simple mechanism ensures that the iods are shared out fairly between
active mount points.

Reviewed/integrated/approved by Frank van der Linden <fvdl@netbsd.org>
 1.52  20-Oct-1996  fvdl Enhancements from Matthias Drochner:
- Try V3 first for diskless booting. Fall back to V2 if V3 fails.
- optionally (option NFS_BOOT_TCP) try a TCP mount first
for diskless booting. Fall back to UDP if it fails.
- Enable switching between UDP and TCP for remounts.
 1.51  13-Oct-1996  christos revert kprintf changes
 1.50  10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.49  24-Jun-1996  pk Ignore the mountpoint's `v_usecount' in nfs_unmount() if MNT_FORCE is on.
This takes care of two related problems:
- `umount -f' wouldn't work if someone's working directory is
the filesystem root.
- vfs_unmountall() would complain about a busy `/' on a
diskless setup.
 1.48  14-Jun-1996  cgd avoid unnecessary checks of m_get/MGET/etc.'s return values. When
they're called with M_WAIT, they are defined to never return NULL.
 1.47  23-May-1996  fvdl * Make mounts with symlinks work (needed for direct mounts with amd). PR #1917
* Never change the NQNFS flag and/or version when just doing an update mount.
Fixes a problem that made diskless booting impossible under some
circumstances.
 1.46  24-Mar-1996  fvdl branches: 1.46.4;
Return earlier on error in nfs_statfs. Should fix problem reported by
both mrg and cgd.
 1.45  17-Mar-1996  christos Fix printf format strings.
 1.44  13-Mar-1996  fvdl Make readdirsize default to rsize if rsize is explicitly specified,
and readdirsize isn't.
 1.43  18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.42  13-Feb-1996  gwr Do the RPC to bootparamd a little later (just before the mountd call)
so that we do not ask for the "swap" path when swapping on disk.
 1.41  09-Feb-1996  christos nfs prototype changes
 1.40  01-Feb-1996  jtc Rename struct timespec fields to conform to POSIX.1b
 1.39  19-Dec-1995  cgd changes to make this work on systems where pointers & longs are 64 bits.
This is mostly just changes to make the stuff that goes over the wire
use fixed-size types.
 1.38  13-Aug-1995  mycroft splnet --> splsoftnet
 1.37  18-Jun-1995  cgd don't assume the f_fsnamelen is nul-truncated or longer than MFSNAMELEN
 1.36  02-Jun-1995  mycroft Fix more off by one errors.
 1.35  18-Mar-1995  gwr Print the "root/swap on ..." messages here.
Add NFS_BOOT_OPTIONS for things like NFSMNT_NOCONN.
 1.34  09-Mar-1995  mycroft copy*str() should use size_t.
 1.33  18-Jan-1995  mycroft Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.
 1.32  23-Aug-1994  pk branches: 1.32.2;
When updating an NFS mountpoint, we cannot just increase `rsize' or `wsize'
without also adjusting the corresponding socket buffers. We could probably
call sbrelease/sbreserve/soreserve ourselves without much harm, but we'd
have to duplicate much of the logic in nfs_connect(). Instead, blow the
socket away entirely and let nfs_connect() do its job again.
 1.31  18-Aug-1994  mycroft More LIST/CIRCLEQ migration.
 1.30  14-Aug-1994  gwr Add the option NFS_BOOT_RWSIZE to allow diskless boot configuration
to start with a reduced NFS read and write size (need for wd8003).
 1.29  12-Aug-1994  cgd kill two errant spaces.
 1.28  11-Aug-1994  gwr Diskless boot will now bind the local socket to a reserved port to
satisfy picky servers. Also fix some missing initializations.
(Thanks to Chuck Cranor for PR#394 -- now fixed.)
 1.27  03-Jul-1994  mycroft branches: 1.27.2;
Save FS type at mount time for some later tests.
 1.26  29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.25  28-Jun-1994  gwr Minor nits: replace ... with ...
p->p_cred->pc_ucred  ->  p->p_ucred
x / DEV_BSIZE  ->  x >> DEV_BSHIFT
 1.24  22-Jun-1994  pk straighten out diskless swap code somewhat.
 1.23  14-Jun-1994  gwr Fix false "hits" in the attribute cache when booting diskless.
(Yet another thing that breaks when time.tv_sec is near zero...)
 1.22  13-Jun-1994  gwr New diskless boot code (uses RARP, bootparamd).
 1.21  08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.20  18-May-1994  cgd put sync printing in one place
 1.19  13-May-1994  mycroft Trivial function name change.
 1.18  11-May-1994  mycroft Cast some args to caddr_t.
 1.17  23-Apr-1994  cgd make fs types consistent over new kernels. also, some proto foo.
 1.16  21-Apr-1994  cgd Convert mount, vnode, and buf structs to use <sys/queue.h>. Also,
some knf and structure frobbing to do along with it.
 1.15  18-Apr-1994  glass revised nfs diskless support. uses bootp+rpc to gather parameters
 1.14  14-Apr-1994  cgd fs types are names now.
 1.13  10-Apr-1994  cgd make damn sure nothing's holding on to the mount point vnode
 1.12  31-Mar-1994  glass make panic string unique
 1.11  21-Dec-1993  cgd oops; fix last
 1.10  21-Dec-1993  cgd from jsp: Changed to get attributes of root node and
generate correct type, rather than assuming it's a directory.
This allows Amd direct mounts to work correctly.
 1.9  18-Dec-1993  mycroft Canonicalize all #includes.
 1.8  07-Dec-1993  pk Exclusive access when manipulating flag field in mount structure.
 1.7  07-Dec-1993  pk Don't allow the NFS_LOCKBITS to be set or reset from user land.
Allow other flags (SOFT,HARD,SPONGY, etc) to be altered by `mount -u'.
 1.6  06-Dec-1993  pk Allow changing of various NFS parameters by using `mount -u ...'.
 1.5  19-Nov-1993  cgd patch from Ukai Fumitoshi <ukai@kmc.kyoto-u.ac.jp>
to do the right thing with NFS fsid's and getnewfsid()
 1.4  13-Jul-1993  cgd branches: 1.4.4;
diskless changes made last time were hosed; were using NULL for
"no credentials" rather than NOCRED.
 1.3  07-Jul-1993  cgd changes from ws to support diskless booting... these are "OK" on inspection
and after testing... (actually, currently, none of the changed
code is even used...)
 1.2  20-May-1993  cgd more rcs id adding and header cleanup. i like vi macros!
 1.1  21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.3  01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2  01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1  21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.4.4.5  21-Dec-1993  cgd update from trunk
 1.4.4.4  21-Dec-1993  cgd update from trunk
 1.4.4.3  20-Nov-1993  cgd update from trunk
 1.4.4.2  14-Nov-1993  mycroft Canonicalize all #includes.
 1.4.4.1  24-Sep-1993  mycroft Make all files using spl*() #include cpu.h. Changes from trunk.
nfs_vfsops.c, nfsmount.h: Make nfs_quotactl() take an int rather than a uid_t,
as it might be -1.
nfs_vnops.c: va_size and va_bytes are now quads.
 1.27.2.4  19-Aug-1994  mycroft update from trunk
 1.27.2.3  14-Aug-1994  mycroft update from trunk
 1.27.2.2  12-Aug-1994  mycroft update from trunk
 1.27.2.1  11-Aug-1994  mycroft update from trunk
 1.32.2.2  23-Aug-1994  pk When updating an NFS mountpoint, we cannot just increase `rsize' or `wsize'
without also adjusting the corresponding socket buffers. We could probably
call sbrelease/sbreserve/soreserve ourselves without much harm, but we'd
have to duplicate much of the logic in nfs_connect(). Instead, blow the
socket away entirely and let nfs_connect() do its job again.
 1.32.2.1  23-Aug-1994  pk file nfs_vfsops.c was added on branch netbsd-1-0 on 1994-08-23 09:31:01 +0000
 1.46.4.2  11-Dec-1996  mycroft From trunk:
Ignore reference count when using MNT_FORCE.
 1.46.4.1  25-May-1996  fvdl Pull in bugfixes from main branch.
 1.55.2.1  14-Jan-1997  thorpej Snapshot of work-in-progress, committed to private branch.

These changes implement machine-independent root device and file system
selection. Notable features:

- All ports behave in a consistent manner regarding root
device selection.
- No more "options GENERIC"; all kernels have the ability
to boot with RB_ASKNAME to select root device and file system
type.
- Root file system type can be wildcarded; a machine-independent
function will try all possible file systems for the selected
root device until one succeeds.
- If the root file system fails to mount, the operator will
be given the chance to select a new root device and file
system type, rather than having the machine simply panic.
- nfs_mountroot() no longer panics if any part of the NFS
mount process fails; it now returns an error, giving the
operator a chance to recover.
- New, more consistent, config(8) grammar. The constructs:

config netbsd swap generic
config netbsd root on nfs

have been replaced with:

config netbsd root on ? type ?
config netbsd root on ? type nfs

Additionally, the operator may select or wildcard root file
system type in the kernel configuration file:

config netbsd root on cd0a type cd9660

config(8) now requires that a "root" specification be
made. "root" may be wired down or wildcarded. "swap" and
"dump" specifications are optional, and follow previous
semantics.

- config(8) has a new "file-system" keyword, used to configure
file systems into the kernel. Eventually, this will be used
to generate the default vfssw[].

- "options NFSCLIENT" is obsolete, and is replaced by
"file-system NFS". "options NFSSERVER" still exists, since
NFS server support is independent of the NFS file system
client.

- sys/arch/<foo>/<foo>/swapgeneric.c is no longer used, and
will be removed; all information is now generated by config(8).

As of this commit, all ports except arm32 have been updated to use
the new setroot(). Only SPARC, i386, and Alpha ports have been
tested at this time. Port masters should test these changes on their
ports, and report any problems back to me.

More changes are on their way, including RB_ASKNAME support in
nfs_mountroot() (to prompt for server address and path) and, potentially,
the ability to select rarp/bootparam or bootp in nfs_mountroot().
 1.57.4.1  02-Mar-1997  mrg swap configuration is no longer done at boot time.
 1.57.2.1  12-Mar-1997  is Merge in changes from Trunk
 1.62.2.3  14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.62.2.2  16-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.62.2.1  01-Sep-1997  thorpej Update marc-pcmcia branch from trunk.
 1.66.2.1  07-Feb-1998  mellon Pull up 1.67 (fvdl)
 1.83.4.1  04-Jul-1999  chs initialize new struct mount fields in nfs_mountfs().
 1.83.2.1  05-Nov-1999  cgd pull up rev 1.84 from trunk (requested by fvdl):
Avoid a panic when forcibly unmounting a hung NFS mount, e.g. at
reboot.
 1.84.8.2  27-Dec-1999  wrstuden Pull up to last week's -current.
 1.84.8.1  21-Dec-1999  wrstuden Initial commit of recent changes to make DEV_BSIZE go away.

Runs on i386, needs work on other arch's. Main kernel routines should be
fine, but a number of the stand programs need help.

cd, fd, ccd, wd, and sd have been updated. sd has been tested with non-512
byte block devices. vnd, raidframe, and lfs need work.

Non 2**n block support is automatic for LKM's and conditional for kernels
on "options NON_PO2_BLOCKS".
 1.84.4.1  19-Oct-1999  fvdl Bring in Kirk McKusick's FFS softdep code on a branch.
 1.84.2.5  12-Mar-2001  bouyer Sync with HEAD.
 1.84.2.4  11-Feb-2001  bouyer Sync with HEAD.
 1.84.2.3  13-Dec-2000  bouyer Sync with HEAD (for UBC fixes).
 1.84.2.2  08-Dec-2000  bouyer Sync with HEAD.
 1.84.2.1  20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.90.2.1  22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.91.2.1  14-Dec-2000  he Pull up revision 1.96 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.101.2.13  11-Dec-2002  thorpej Sync with HEAD.
 1.101.2.12  22-Oct-2002  thorpej Sync with HEAD.
 1.101.2.11  18-Oct-2002  nathanw Catch up to -current.
 1.101.2.10  01-Aug-2002  nathanw Catch up to -current.
 1.101.2.9  12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.101.2.8  24-Jun-2002  nathanw Curproc->curlwp renaming.

Change uses of "curproc->l_proc" back to "curproc", which is more like the
original use. Bare uses of "curproc" are now "curlwp".

"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc : NULL)
so that it is always safe to reference curproc (*de*referencing curproc
is another story, but that's always been true).
 1.101.2.7  08-Jan-2002  nathanw Catch up to -current.
 1.101.2.6  14-Nov-2001  nathanw Catch up to -current.
 1.101.2.5  22-Oct-2001  nathanw Catch up to -current.
 1.101.2.4  21-Sep-2001  nathanw Catch up to -current.
 1.101.2.3  24-Aug-2001  nathanw Catch up with -current.
 1.101.2.2  21-Jun-2001  nathanw Catch up to -current.
 1.101.2.1  05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.104.2.4  10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.104.2.3  06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.104.2.2  10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.104.2.1  03-Aug-2001  lukem update to -current
 1.106.2.2  11-Oct-2001  fvdl Catch up with -current. Fix some bogons in the sparc64 kbd/ms
attach code. cd18xx conversion provided by mrg.
 1.106.2.1  01-Oct-2001  fvdl Catch up with -current.
 1.110.2.1  12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.112.10.3  04-Oct-2003  tron Pull up revision 1.133 (requested by martti in ticket #1506):
plug mbuf leak due to manual mbuf handling. PR kern/13807.
(martti confirmed that it stabilizes the situation described in kern/13807)
 1.112.10.2  29-Jul-2002  lukem Pull up revision 1.114 (requested by enami in ticket #555):
Synchronize code and comment again to prevent mbuf leak. Sprinkle some
KNF while I'm here.
 1.112.10.1  29-Jul-2002  lukem Pull up revision 1.113 (requested by jaromir in ticket #555):
Reduce stack usage on the NFS mount code path. This fixes kernel stack
overflow when using IPsec on vax, as reported by Olaf Seibert on
current-users@.
 1.112.8.1  29-Aug-2002  gehenna catch up with -current.
 1.131.2.10  10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.131.2.9  01-Apr-2005  skrll Sync with HEAD.
 1.131.2.8  04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.131.2.7  17-Jan-2005  skrll Sync with HEAD.
 1.131.2.6  21-Sep-2004  skrll Fix the sync with head I botched.
 1.131.2.5  18-Sep-2004  skrll Sync with HEAD.
 1.131.2.4  25-Aug-2004  skrll Sync with HEAD.
 1.131.2.3  24-Aug-2004  skrll Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.131.2.2  03-Aug-2004  skrll Sync with HEAD
 1.131.2.1  02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.135.2.1  29-May-2004  tron branches: 1.135.2.1.2;
Pull up revision 1.139 (requested by atatat in ticket #393):
Sysctl descriptions under vfs subtree
 1.135.2.1.2.1  27-Oct-2005  riz Pull up following revision(s) (requested by christos in ticket #5863):
sys/nfs/nfs_subs.c: revision 1.152 via patch
sys/nfs/nfs.h: revision 1.49
sys/nfs/nfs_vfsops.c: revision 1.149 via patch
usr.sbin/amd/include/config.h: revision 1.36
sys/nfs/nfs_vnops.c: revision 1.227 via patch
sys/nfs/nfsmount.h: revision 1.34
Allow the attribute cache to be turned off, and allow amd to do it.
 1.144.4.1  19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.144.2.1  29-Apr-2005  kent sync with -current
 1.145.2.1  27-Sep-2005  tron Pull up following revision(s) (requested by christos in ticket #816):
sys/nfs/nfs_vfsops.c: revision 1.149
sys/nfs/nfs_vnops.c: revision 1.227
ATTRTIMEO takes 2 args.
 1.148.2.12  27-Feb-2008  yamt revert incomplete nfs client locking for now.
 1.148.2.11  27-Feb-2008  yamt sync with head.
 1.148.2.10  15-Feb-2008  yamt - sprinkle some locks.
- disable MNT_UPDATE because it involves too much locking headache.
- don't overwrite other bits in v_vflags when setting VV_ROOT.
 1.148.2.9  04-Feb-2008  yamt sync with head.
 1.148.2.8  21-Jan-2008  yamt sync with head
 1.148.2.7  07-Dec-2007  yamt sync with head
 1.148.2.6  15-Nov-2007  yamt sync with head.
 1.148.2.5  27-Oct-2007  yamt sync with head.
 1.148.2.4  03-Sep-2007  yamt sync with head.
 1.148.2.3  26-Feb-2007  yamt sync with head.
 1.148.2.2  30-Dec-2006  yamt sync with head.
 1.148.2.1  21-Jun-2006  yamt sync with head.
 1.151.6.3  01-Jun-2006  kardel Sync with head.
 1.151.6.2  22-Apr-2006  simonb Sync with head.
 1.151.6.1  04-Feb-2006  simonb In the timecounter case, call tc_setclock() instead of setting
time.tv_sec/tv_nsec directly.
 1.151.4.1  09-Sep-2006  rpaulo sync with head
 1.151.2.1  01-Mar-2006  yamt sync with head.
 1.152.6.1  24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.152.4.5  11-May-2006  elad sync with head
 1.152.4.4  06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.152.4.3  19-Apr-2006  elad sync with head.
 1.152.4.2  14-Apr-2006  elad Store real/saved user/group ids too.
 1.152.4.1  08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.152.2.4  03-Sep-2006  yamt sync with head.
 1.152.2.3  11-Aug-2006  yamt sync with head
 1.152.2.2  26-Jun-2006  yamt sync with head.
 1.152.2.1  24-May-2006  yamt sync with head.
 1.155.2.1  19-Jun-2006  chap Sync with head.
 1.157.2.1  13-Jul-2006  gdamore Merge from HEAD.
 1.164.4.2  10-Dec-2006  yamt sync with head.
 1.164.4.1  22-Oct-2006  yamt sync with head
 1.164.2.3  01-Feb-2007  ad Sync with head.
 1.164.2.2  12-Jan-2007  ad Sync with head.
 1.164.2.1  18-Nov-2006  ad Sync with head.
 1.172.2.3  07-May-2007  yamt sync with head.
 1.172.2.2  12-Mar-2007  rmind Sync with HEAD.
 1.172.2.1  28-Feb-2007  yamt sync with head. (somehow missed in the previous)
 1.174.4.1  11-Jul-2007  mjf Sync with head.
 1.174.2.10  25-Oct-2007  ad Fix up mnt_vnodelist handling.
 1.174.2.9  24-Oct-2007  ad Do locking / use marker vnodes when traversing mountpoint vnode lists.
 1.174.2.8  09-Oct-2007  ad Sync with head.
 1.174.2.7  16-Sep-2007  ad Checkpoint work in progress on the vnode lifecycle and reference counting
stuff. This makes it work properly without kernel_lock and fixes a few
quite old bugs. See vfs_subr.c 1.283.2.17 for details.
 1.174.2.6  20-Aug-2007  ad Sync with HEAD.
 1.174.2.5  15-Jul-2007  ad Sync with head.
 1.174.2.4  18-Jun-2007  yamt fix merge botches.
 1.174.2.3  17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.174.2.2  08-Jun-2007  ad Sync with head.
 1.174.2.1  13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.179.2.2  10-Sep-2007  skrll Sync with HEAD.
 1.179.2.1  15-Aug-2007  skrll Sync with HEAD.
 1.182.2.6  27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.182.2.5  29-Oct-2007  joerg Sync with HEAD.
 1.182.2.4  26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.182.2.3  02-Oct-2007  joerg Sync with HEAD.
 1.182.2.2  16-Aug-2007  jmcneill Sync with HEAD.
 1.182.2.1  09-Aug-2007  jmcneill Sync with HEAD.
 1.183.2.2  05-Aug-2007  yamt use kpause rather than lbolt.
 1.183.2.1  05-Aug-2007  yamt file nfs_vfsops.c was added on branch matt-mips64 on 2007-08-05 09:40:41 +0000
 1.184.2.3  23-Mar-2008  matt sync with HEAD
 1.184.2.2  09-Jan-2008  matt sync with HEAD
 1.184.2.1  06-Nov-2007  matt sync with HEAD
 1.185.2.1  14-Oct-2007  yamt sync with head.
 1.186.2.1  13-Nov-2007  bouyer Sync with HEAD
 1.187.2.2  18-Feb-2008  mjf Sync with HEAD.
 1.187.2.1  08-Dec-2007  mjf Sync with HEAD.
 1.188.6.3  23-Jan-2008  bouyer Sync with HEAD.
 1.188.6.2  08-Jan-2008  bouyer Sync with HEAD
 1.188.6.1  02-Jan-2008  bouyer Sync with HEAD
 1.188.2.1  04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.196.10.8  09-Oct-2010  yamt sync with head
 1.196.10.7  11-Aug-2010  yamt sync with head.
 1.196.10.6  11-Mar-2010  yamt sync with head
 1.196.10.5  24-Jun-2009  yamt lock vnode when calling VOP_GETATTR because there's no reasonable way for
an implementation of VOP_GETATTR to prevent the vnode from being revoked.
 1.196.10.4  24-Jun-2009  yamt nfs_mount: re-enable MNT_UPDATE. it's broken as it is in trunk.
 1.196.10.3  04-May-2009  yamt sync with head.
 1.196.10.2  16-May-2008  yamt sync with head.
 1.196.10.1  27-Apr-2008  yamt commit some work-in-progress changes to make nfs client mp-safe to a branch,
so that they won't get lost.
- sprinkle some locking
- mark the filesystem, nfstimer callout, and kq kthread mp-safe
- add assertions and comments
- disable upgrade mount for now
- some unrelated cosmetic changes
 1.196.8.1  18-May-2008  yamt sync with head.
 1.196.6.3  17-Jan-2009  mjf Sync with HEAD.
 1.196.6.2  05-Oct-2008  mjf Sync with HEAD.
 1.196.6.1  02-Jun-2008  mjf Sync with HEAD.
 1.199.2.2  10-Oct-2008  skrll Sync with HEAD.
 1.199.2.1  23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.200.4.2  13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.200.4.1  19-Oct-2008  haad Sync with HEAD.
 1.203.14.1  28-Apr-2014  sborrill Pull up the following revision(s) (requested by maxv in ticket #1901):
sys/kern/vfs_syscalls.c: revision 1.478, 1.480 via patch
sys/coda/coda_vfsops.c: revision 1.81
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.110 via patch
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59 via patch
sys/fs/udf/udf_vfsops.c: revision 1.67
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/kern/vfs_syscalls.c: revision 1.479
sys/miscfs/nullfs/null_vfsops.c: revision 1.88 via patch
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/nfs/nfs_vfsops.c: revision 1.227
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/ufs/mfs/mfs_vfsops.c: revision 1.107

Due to missing checks in the mount syscall, and a wrong assumption on the
file systems side, the kernel could allocate an unbounded or zero-sized
memory buffer, and could dereference a NULL pointer when particular
arguments are given by a user.
 1.203.10.1  28-Apr-2014  sborrill Pull up the following revision(s) (requested by maxv in ticket #1901):
sys/kern/vfs_syscalls.c: revision 1.478, 1.480 via patch
sys/coda/coda_vfsops.c: revision 1.81
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.110 via patch
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59 via patch
sys/fs/udf/udf_vfsops.c: revision 1.67
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/kern/vfs_syscalls.c: revision 1.479
sys/miscfs/nullfs/null_vfsops.c: revision 1.88 via patch
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/nfs/nfs_vfsops.c: revision 1.227
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/ufs/mfs/mfs_vfsops.c: revision 1.107

Due to missing checks in the mount syscall, and a wrong assumption on the
file systems side, the kernel could allocate an unbounded or zero-sized
memory buffer, and could dereference a NULL pointer when particular
arguments are given by a user.
 1.203.4.1  25-Apr-2014  sborrill Pull up the following revision(s) (requested by maxv in ticket #1901):
sys/kern/vfs_syscalls.c: revision 1.478, 1.480 via patch
sys/coda/coda_vfsops.c: revision 1.81
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.110 via patch
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59 via patch
sys/fs/udf/udf_vfsops.c: revision 1.67
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/kern/vfs_syscalls.c: revision 1.479
sys/miscfs/nullfs/null_vfsops.c: revision 1.88 via patch
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/nfs/nfs_vfsops.c: revision 1.227
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/ufs/mfs/mfs_vfsops.c: revision 1.107

Due to missing checks in the mount syscall, and a wrong assumption on the
file systems side, the kernel could allocate an unbounded or zero-sized
memory buffer, and could dereference a NULL pointer when particular
arguments are given by a user.
 1.203.2.2  28-Apr-2009  skrll Sync with HEAD.
 1.203.2.1  19-Jan-2009  skrll Sync with HEAD.
 1.206.2.1  13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.210.2.2  17-Aug-2010  uebayasi Sync with HEAD.
 1.210.2.1  30-Apr-2010  uebayasi Sync with HEAD.
 1.211.2.4  05-Mar-2011  rmind sync with head
 1.211.2.3  03-Jul-2010  rmind sync with head
 1.211.2.2  30-May-2010  rmind sync with head
 1.211.2.1  16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.217.6.1  23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.220.16.1  21-Apr-2014  bouyer Pull up following revision(s) (requested by maxv in ticket #1050):
sys/ufs/chfs/chfs_vfsops.c: revision 1.11
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/fs/nilfs/nilfs_vfsops.c: revision 1.16
sys/ufs/mfs/mfs_vfsops.c: revision 1.107
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/kern/vfs_syscalls.c: revision 1.478
sys/kern/vfs_syscalls.c: revision 1.479
sys/fs/puffs/puffs_vfsops.c: revision 1.110
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/nfs/nfs_vfsops.c: revision 1.227
sys/fs/v7fs/v7fs_vfsops.c: revision 1.10
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/miscfs/nullfs/null_vfsops.c: revision 1.88
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50
sys/coda/coda_vfsops.c: revision 1.81
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/kern/vfs_syscalls.c: revision 1.480
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/kern/vfs_syscalls.c: revision 1.482
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vfsops.c: revision 1.12
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/udf/udf_vfsops.c: revision 1.67
Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.
ok christos@
An (un)privileged user can easily make the kernel dereference a NULL
pointer.
The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).
ok christos@
Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.
If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
 1.220.14.1  21-Apr-2014  bouyer Pull up following revision(s) (requested by maxv in ticket #1050):
sys/ufs/chfs/chfs_vfsops.c: revision 1.11
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/fs/nilfs/nilfs_vfsops.c: revision 1.16
sys/ufs/mfs/mfs_vfsops.c: revision 1.107
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/kern/vfs_syscalls.c: revision 1.478
sys/kern/vfs_syscalls.c: revision 1.479
sys/fs/puffs/puffs_vfsops.c: revision 1.110
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/nfs/nfs_vfsops.c: revision 1.227
sys/fs/v7fs/v7fs_vfsops.c: revision 1.10
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/miscfs/nullfs/null_vfsops.c: revision 1.88
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50
sys/coda/coda_vfsops.c: revision 1.81
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/kern/vfs_syscalls.c: revision 1.480
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/kern/vfs_syscalls.c: revision 1.482
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vfsops.c: revision 1.12
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/udf/udf_vfsops.c: revision 1.67
Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.
ok christos@
An (un)privileged user can easily make the kernel dereference a NULL
pointer.
The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).
ok christos@
Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.
If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
 1.220.12.3  03-Dec-2017  jdolecek update from HEAD
 1.220.12.2  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.220.12.1  25-Feb-2013  tls resync with head
 1.220.8.1  21-Apr-2014  bouyer Pull up following revision(s) (requested by maxv in ticket #1050):
sys/ufs/chfs/chfs_vfsops.c: revision 1.11
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/fs/nilfs/nilfs_vfsops.c: revision 1.16
sys/ufs/mfs/mfs_vfsops.c: revision 1.107
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/kern/vfs_syscalls.c: revision 1.478
sys/kern/vfs_syscalls.c: revision 1.479
sys/fs/puffs/puffs_vfsops.c: revision 1.110
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/nfs/nfs_vfsops.c: revision 1.227
sys/fs/v7fs/v7fs_vfsops.c: revision 1.10
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/miscfs/nullfs/null_vfsops.c: revision 1.88
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50
sys/coda/coda_vfsops.c: revision 1.81
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/kern/vfs_syscalls.c: revision 1.480
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/kern/vfs_syscalls.c: revision 1.482
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vfsops.c: revision 1.12
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/udf/udf_vfsops.c: revision 1.67
Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.
ok christos@
An (un)privileged user can easily make the kernel dereference a NULL
pointer.
The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).
ok christos@
Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.
If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
 1.220.2.2  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.220.2.1  23-Jan-2013  yamt sync with head
 1.221.2.1  18-May-2014  rmind sync with head
 1.226.2.1  10-Aug-2014  tls Rebase.
 1.229.4.3  28-Aug-2017  skrll Sync with HEAD
 1.229.4.2  27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.229.4.1  22-Sep-2015  skrll Sync with HEAD
 1.229.2.2  08-Nov-2015  riz Pull up following revision(s) (requested by pgoyette in ticket #1021):
sys/nfs/nfs_vfsops.c: revision 1.231
Don't forget to call nfs_fini() when we're finished. Without this,
we leave a dangling pool nfsrvdescpl around.
 1.229.2.1  04-Nov-2015  riz Pull up following revision(s) (requested by manu in ticket #882):
sbin/umount/umount.c: revision 1.48
sys/nfs/nfsmount.h: revision 1.53
sys/nfs/nfs_var.h: revision 1.94
sys/nfs/nfs_iod.c: revision 1.7
sys/nfs/nfs_socket.c: revision 1.197
sys/nfs/nfs_bio.c: revision 1.191
sys/nfs/nfs_vfsops.c: revision 1.230
sys/nfs/nfs_clntsocket.c: revision 1.3
Remove useless and harmful sync(2) call in umount(8)
Remove sync(2) call before unmount(2) in umount(8). This sync(2) is useless
since unmount(2) will perform a VFS_SYNC anyway.
Moreover, this sync(2) may be harmful, as there are some situations where
it cannot return (unreachable NFS server, for instance), causing umount -f
to be ineffective.
Fix soft NFS force unmount
For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that await an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.
Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.
Reviewed by Chuck Silvers.
 1.231.4.1  21-Apr-2017  bouyer Sync with HEAD
 1.231.2.2  26-Apr-2017  pgoyette Sync with HEAD
 1.231.2.1  20-Mar-2017  pgoyette Sync with HEAD
 1.235.10.2  06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.235.10.1  22-Mar-2018  pgoyette Synch with HEAD, resolve conflicts
 1.236.2.3  21-Apr-2020  martin Sync with HEAD
 1.236.2.2  08-Apr-2020  martin Merge changes from current as of 20200406
 1.236.2.1  10-Jun-2019  christos Sync with HEAD
 1.237.6.2  29-Feb-2020  ad Sync with head.
 1.237.6.1  17-Jan-2020  ad Sync with head.
 1.237.4.1  04-May-2022  martin Pull up following revision(s) (requested by gavan in ticket #1441):

sys/nfs/nfs_vfsops.c: revision 1.243

Don't pretend that files are limited to 1TB on NFSv3.
 1.240.2.1  20-Apr-2020  bouyer Sync with HEAD
 1.241.4.1  03-Apr-2021  thorpej Sync with HEAD.
 1.241.2.1  03-Apr-2021  thorpej Sync with HEAD.
 1.242.2.1  17-Jun-2021  thorpej Sync w/ HEAD.
 1.243.10.2  20-Sep-2024  martin Pull up following revision(s) (requested by rin in ticket #880):

sys/nfs/nfs_iod.c: revision 1.9
sys/nfs/nfs_vfsops.c: revision 1.245
sys/nfs/nfs_clntsubs.c: revision 1.7

PR/57279: Izumi Tsutsui: Fix some {int,long} -> time_t. Still things will
break eventually because parts of the nfs protocol assume time_t will fit
in 32 bits.
 1.243.10.1  20-Sep-2024  martin Pull up following revision(s) (requested by rin in ticket #879):

sys/nfs/nfs_vfsops.c: revision 1.244

Avoid overflow of nfs_commitsize on machines with > 32GB RAM.
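
The overflow mentioned above can be illustrated with a small sketch. It assumes the commit size is derived from the physical page count by a left shift of (page shift - 4), that is, 1/16 of RAM in bytes, and stored in an int. That derivation is an assumption made for illustration; with 4 KB pages it would first overflow a signed 32-bit value at 32 GB of RAM. The helper name commitsize_clamped is hypothetical and is not the kernel's code; it only shows one way to guard the shift.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from nfs_vfsops.c): return 1/16 of
 * physical memory in bytes, clamped to INT_MAX so that the result
 * always fits in an int.
 */
static int
commitsize_clamped(uint64_t npages, unsigned page_shift)
{
	unsigned shift;

	/* Illustration only: assume pages of at least 16 bytes. */
	assert(page_shift >= 4 && page_shift < 32);
	shift = page_shift - 4;

	/* Compare before shifting, so the shift itself cannot overflow. */
	if (npages > ((uint64_t)INT_MAX >> shift))
		return INT_MAX;

	return (int)(npages << shift);
}
```

With 4 KB pages (page_shift of 12), 4 GB of RAM is 1048576 pages and yields 268435456 bytes, while 32 GB (8388608 pages) would need 2^31 and is clamped to INT_MAX instead of wrapping negative.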
