History log of /src/sys/nfs/nfs_bio.c
Revision  Date  Author  Comments
 1.202  13-Feb-2024  andvar s/Enque/Enqueue/ in comment.
 1.201  24-Jun-2022  hannken Remove an incorrect assertion.

Just issue a readahead near the end of the vnode and enqueue an async read.
Now let nfs_setattr() truncate the vnode, set its new size and
nfs_vinvalbuf() waits for the pages from the readahead to become unbusy.

The async read gets processed and returns with uio_resid > 0 because there
is a hole and no write after the hole has been pushed yet. As the vnode
size already got truncated to the new size the KASSERT() incorrectly fires.
 1.200  20-Oct-2021  thorpej Overhaul of the EVFILT_VNODE kevent(2) filter:

- Centralize vnode kevent handling in the VOP_*() wrappers, rather than
forcing each individual file system to deal with it (except VOP_RENAME(),
because VOP_RENAME() is a mess and we currently have 2 different ways
of handling it; at least it's reasonably well-centralized in the "new"
way).
- Add support for NOTE_OPEN, NOTE_CLOSE, NOTE_CLOSE_WRITE, and NOTE_READ,
compatible with the same events in FreeBSD.
- Track which kevent notifications clients are interested in receiving
to avoid doing work for events no one cares about (avoiding, e.g.,
taking locks and traversing the klist to send a NOTE_WRITE when
someone is merely watching for a file to be deleted).

In support of the above:

- Add support in vnode_if.sh for specifying PRE- and POST-op handlers,
to be invoked before and after vop_pre() and vop_post(), respectively.
Basic idea from FreeBSD, but implemented differently.
- Add support in vnode_if.sh for specifying CONTEXT fields in the
vop_*_args structures. These context fields are used to convey information
between the file system VOP function and the VOP wrapper, but do not
occupy an argument slot in the VOP_*() call itself. These context fields
are initialized and subsequently interpreted by PRE- and POST-op handlers.
- Version VOP_REMOVE(); it uses a context field for the file system to report
back the resulting link count of the target vnode. Return this in tmpfs,
udf, nfs, chfs, ext2fs, lfs, and ufs.

NetBSD 9.99.92.
 1.199  05-Sep-2020  riastradh Round of uvm.h cleanup.

The poorly named uvm.h is generally supposed to be for uvm-internal
users only.

- Narrow it to files that actually need it -- mostly files that need
to query whether curlwp is the pagedaemon, which should maybe be
exposed by an external header.

- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
by it. We should split up uvm_extern.h but this will serve for now
to reduce the uvm.h dependencies.

- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
UVMHIST(ubchist), since ubchist is declared in uvm.h but the
reference evaporates if UVMHIST is not defined, so we reduce header
file dependencies.

- Make uvm_device.h and uvm_swap.h independently includable while
here.

ok chs@
 1.198  23-May-2020  ad Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
 1.197  17-May-2020  ad Start trying to reduce cache misses on vm_page during fault processing.

- Make PGO_LOCKED getpages imply PGO_NOBUSY and remove the latter. Mark
pages busy only when there's actually I/O to do.

- When doing COW on a uvm_object, don't mess with neighbouring pages. In
all likelihood they're already entered.

- Don't mess with neighbouring VAs that have existing mappings as replacing
those mappings with same can be quite costly.

- Don't enqueue pages for neighbour faults unless not enqueued already, and
don't activate centre pages unless uvmpdpol says it's useful.

Also:

- Make PGO_LOCKED getpages on UAOs work more like vnodes: do gang lookup in
the radix tree, and don't allocate new pages.

- Fix many assertion failures around faults/loans with tmpfs.
 1.196  23-Apr-2020  ad PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)

- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
somewhere. Use it to decide whether to do direct-mapped copy, rather than
poking around directly in the vnode in ubc_uiomove(), which is ugly and
doesn't work for tmpfs. It would be nicer to contain all this in UVM but
the filesystem provides the needed locking here (VV_MAPPED) and to
reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS(). Pass in UBC_ISMAPPED where
appropriate.
 1.195  22-Mar-2020  ad branches: 1.195.2;
Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core. Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.
 1.194  23-Feb-2020  ad UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
 1.193  15-Jan-2020  ad Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.
 1.192  13-Dec-2019  ad branches: 1.192.2;
Break the global uvm_pageqlock into a per-page identity lock and a private
lock for use of the pagedaemon policy code. Discussed on tech-kern.

PR kern/54209: NetBSD 8 large memory performance extremely low
PR kern/54210: NetBSD-8 processes presumably not exiting
PR kern/54727: writing a large file causes unreasonable system behaviour
 1.191  15-Jul-2015  manu branches: 1.191.18;
Fix soft NFS force unmount

For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that await an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.

Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.

Reviewed by Chuck Silvers.
 1.190  05-Sep-2014  matt branches: 1.190.2;
Don't use catch as a variable name.
 1.189  12-Aug-2013  hannken branches: 1.189.4;
Function nfs_vinvalbuf() ignores errors from vinvalbuf() and therefore
delayed write errors may get lost.
Change nfs_vinvalbuf() to keep errors from vinvalbuf() for fsync() or close().

Presented on tech-kern@

Fix for PR kern/47980 (NFS over-quota not detected if utimes() called
before fsync()/close())
 1.188  27-Sep-2011  christos branches: 1.188.2; 1.188.8; 1.188.12; 1.188.14; 1.188.16; 1.188.22;
use NFS_MAXNAMLEN for all names.
 1.187  19-Jun-2011  rmind - Fix a silly bug: remove umap from uobj in ubc_release() UBC_UNMAP case.
- Use UBC_WANT_UNMAP() consistently.

ARM (PMAP_CACHE_VIVT case) works again.
 1.186  12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.185  12-Jun-2010  jakllsch branches: 1.185.6;
Fix memory leak during some NFS writes.
 1.184  23-Apr-2010  pooka Enforce RLIMIT_FSIZE before VOP_WRITE. This adds support to file
system drivers where it was missing from and fixes one buggy
implementation. The arguably weird semantics of the check are
maintained (v_size vs. va_bytes, overwrite).
 1.183  14-Mar-2009  dsl branches: 1.183.2; 1.183.4;
Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
 1.182  13-Mar-2009  yamt nfs_bioread: don't truncate values in a debug printf.
 1.181  19-Nov-2008  ad branches: 1.181.4;
Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
 1.180  31-Oct-2008  christos - allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic
 1.179  17-Oct-2008  christos branches: 1.179.2; 1.179.4;
Requested by yamt:
- In getpages don't allocate if we are not locked
- Use kmem_alloc instead of malloc and don't sleep

Also provide a 64 entry stack array so we don't have to allocate in the
common case.
 1.178  17-Oct-2008  dogcow it appears the previous commit's sacrifice was "successful compilation with
NFS_V2_ONLY defined".
 1.177  16-Oct-2008  christos Another sacrifice to the stack protector gods.
 1.176  16-Oct-2008  christos don't use variable allocation on the stack.
 1.175  24-Apr-2008  ad branches: 1.175.2; 1.175.8;
Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
 1.174  29-Mar-2008  yamt branches: 1.174.2;
ansify. from Christoph Egger.
 1.173  02-Jan-2008  yamt branches: 1.173.6;
use kmem_alloc instead of malloc.
 1.172  02-Jan-2008  ad Merge vmlocking2 to head.
 1.171  04-Dec-2007  yamt branches: 1.171.4;
merge non-intrusive nfs changes from vmlocking.
 1.170  26-Nov-2007  pooka branches: 1.170.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.169  28-Oct-2007  yamt branches: 1.169.2;
make NFS_ATTRTIMEO a function.
 1.168  10-Oct-2007  ad branches: 1.168.2;
Merge from vmlocking:

- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
 1.167  08-Oct-2007  ad Merge brelse() changes from the vmlocking branch.
 1.166  10-Aug-2007  yamt branches: 1.166.2; 1.166.4;
- instead of scanning an array of iods, maintain a list of idle iods.
- make nfs_getset_niothreads MP friendly.
 1.165  08-Aug-2007  yamt push kernel_lock a little.
 1.164  29-Jul-2007  ad branches: 1.164.4; 1.164.6;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
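
The convention described in rev 1.164 can be sketched as follows. This is an illustrative sketch only: `struct buf_sketch`, `sketch_io_done()` and `sketch_io_failed()` are hypothetical stand-ins, not the kernel's `struct buf` or any real buffer-cache function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct buf. */
struct buf_sketch {
	int b_flags;	/* state bits; their locking belongs to the buffer cache */
	int b_error;	/* 0 on success, otherwise an errno value */
};

/*
 * Completion path in the style the commit describes: report failure
 * through b_error only and leave b_flags untouched.
 */
static void
sketch_io_done(struct buf_sketch *bp, int error)
{
	assert(bp != NULL);
	bp->b_error = error;
}

/* Consumers test b_error rather than a flag bit. */
static int
sketch_io_failed(const struct buf_sketch *bp)
{
	assert(bp != NULL);
	return bp->b_error != 0;
}
```

The point of the design is that whoever completes the request writes a single field it owns, so it does not need to know how `b_flags` is locked.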
 1.163  27-Jul-2007  yamt use ubc_uiomove for read as well.
 1.162  27-Jul-2007  yamt ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.
 1.161  20-Jul-2007  yamt - fix decreasing of vfs.nfs.iothreads after the recent partial merge
of vmlocking.
- don't make nfsiod exit with requests left.
- make NFSSVC_BIOD a dummy so that nfsiod can be simplified.
 1.160  17-Jul-2007  yamt branches: 1.160.2;
remove (void)0; nonsense.
 1.159  17-Jul-2007  yamt fix a typo in a comment.
 1.158  12-Jul-2007  rmind nfs_asyncio: fix the locking in error case, problem was introduced
in 1.153 revision, where ltsleep() was replaced with condvar.

Problem found and fix provided by David A. Holland, PR/36610.
Actually, relock is not needed here, and mutex would be unlocked
only on nfs_sigintr() fail case.
 1.157  09-Jul-2007  ad Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.156  12-Jun-2007  yamt nfs_write:
- IO_SYNC: don't bother to flush dirty pages before copying data from
user buffer.
- IO_APPEND: don't invalidate pages blindly. PR/28472 from Brian Marcotte.
 1.155  05-Jun-2007  yamt improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.
 1.154  09-May-2007  yamt nfs_write: report an error correctly in the case of IO_SYNC.
 1.153  29-Apr-2007  yamt use mutex and condvar.
 1.152  19-Apr-2007  yamt hold proclist_mutex when calling psignal().
 1.151  04-Mar-2007  christos branches: 1.151.2; 1.151.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.150  27-Feb-2007  yamt nfs_getpages: fix an inverted condition in rev.1.147.
 1.149  22-Feb-2007  thorpej TRUE -> true, FALSE -> false
 1.148  21-Feb-2007  thorpej Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.147  15-Feb-2007  yamt branches: 1.147.2;
use mutex and rwlock rather than lockmgr.
 1.146  27-Dec-2006  yamt remove nqnfs.
 1.145  23-Jul-2006  ad branches: 1.145.4;
Use the LWP cached credentials where sane.
 1.144  30-Jun-2006  yamt fix handling of NFSERR_NOTSUPP and NFSERR_BAD_COOKIE,
which have been broken since nfs_socket.c rev.1.115.
 1.143  14-May-2006  elad branches: 1.143.4;
integrate kauth.
 1.142  01-Mar-2006  yamt branches: 1.142.2; 1.142.4; 1.142.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.
 1.141  14-Jan-2006  yamt branches: 1.141.2; 1.141.4;
nfs_doio_read: clear uio_resid when filling a hole.
 1.140  13-Dec-2005  reinoud branches: 1.140.2;
Fix a panic that was introduced when the ktrace-lwp branch was merged. The
shortcut to the process of the passed lwp panicked the kernel since lwp
could/can be passed as NULL in VOP_WRITE().

This was happening when ktracing to NFS. The function ktrwrite() sets the
uio_lwp to NULL and then calls VOP_WRITE() with this argument. nfs_write()
then accessed lwp *l->l_proc, which panicked.

Thanks to David Laight for his help on tracking it down.
 1.139  11-Dec-2005  christos merge ktrace-lwp.
 1.138  29-Nov-2005  yamt merge yamt-readahead branch.
 1.137  04-Nov-2005  yamt branches: 1.137.2;
nfs_bioread: push delayed truncation and tweak loop accordingly.
PR/31926 from Jed Davis.
 1.136  06-Oct-2005  yamt nfs_bioread: handle file truncation on the server a little more gracefully.
 1.135  01-Oct-2005  jdolecek use killproc() for killing the process due to text file modification, so
that it's logged too

PR: 17392 by Greg A. Woods
 1.134  19-Aug-2005  yamt fix some simple bugs in the 64bit ino_t changes.
- edp -> dp
- * -> +
 1.133  19-Aug-2005  christos 64 bit inode changes.
 1.132  21-Jul-2005  yamt use a correct credential for readlink. discussed on source-changes@.
 1.131  21-Jul-2005  yamt nfs_doio_read: revert readlink part of 1.129 and 1.130 because they were wrong.
 1.130  07-Jul-2005  christos Back to using curproc in the VLNK case when uiop->uio_procp == NULL,
and explain why we need to.
 1.129  07-Jul-2005  christos 1. use p = uio->uio_procp consistently and eliminate suspicious uses
of curproc (where uio->uio_procp should be used?). Don't do this
for nfs_commit(), because yamt says it is possibly wrong.
2. nfs_doio() does not use struct proc; remove it and the code to compute it.
3. use copyin_proc() and copyout_proc() instead of copyin() and copyout().
4. check return of copyout_proc(). and mark return from copyin_proc() XXX
5. Eliminate the p == curproc assertion check from nfs_write;
nfs_read does not have it and we might be called in a different
process context anyway (PR 20138).
 1.128  26-Feb-2005  perry branches: 1.128.2; 1.128.4;
nuke trailing whitespace
 1.127  27-Jan-2005  yamt - simplify nfs_bio.c rev.1.126
- add an assertion.

no functional changes.
 1.126  27-Jan-2005  yamt nfs_bioread:
- if a buffer is still empty after successful nfs_doio, it implies EOF.
- don't cache blocks beyond EOF.
 1.125  26-Jan-2005  yamt handle a really empty directory, which doesn't have even the dot entry.
 1.124  09-Jan-2005  chs branches: 1.124.2; 1.124.4;
adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway. there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.
 1.123  14-Dec-2004  yamt - centralize code to invalidate stale cache.
- don't ignore errors when invalidating buffers in nfs_open.
 1.122  26-Oct-2004  yamt since daddr_t is 64-bit these days, simply use nfs directory cookies
as buffer cache indexes. regress/sys/fs/getdents is now supposed to work.
fix PR/27112.
 1.121  17-Sep-2004  skrll There's no need to pass a proc value when using UIO_SYSSPACE with
vn_rdwr(9) and uiomove(9).

OK'd by Jason Thorpe
 1.120  15-Sep-2004  yamt fix access-after-free bugs in dircache code by refcounting nfsdircache.
PR/26864.
 1.119  18-Jul-2004  yamt nfs_doio_read: on short read, zero out the rest of the buffer unconditionally.
we can't rely on n_size here because it can be changed under us.
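
A minimal sketch of the short-read handling described in rev 1.119, assuming a flat byte buffer. `sketch_zero_tail()` and its parameters are hypothetical names chosen for illustration; the real code operates on kernel buffers and uio state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * `buf` holds `len` bytes, of which the read filled only the first `got`.
 * Zero everything after the data that actually arrived, without consulting
 * any cached file size (which, per the commit, may change underneath us).
 */
static void
sketch_zero_tail(unsigned char *buf, size_t len, size_t got)
{
	assert(buf != NULL && got <= len);
	memset(buf + got, 0, len - got);
}
```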
 1.118  11-Jun-2004  yamt nfs_doio_read: use np->n_rcred instead of curproc->p_ucred for VDIR.

XXX maybe it's better to use a cred passed by VOP_READDIR.
 1.117  23-May-2004  christos cut down another 7K by more NFS_V2_ONLY ifdefs.
 1.116  12-Mar-2004  yamt branches: 1.116.2;
introduce a macro NFS_INVALIDATE_ATTRCACHE and use it
instead of "n_attrstamp = 0".
 1.115  10-Jan-2004  yamt comments in nfs_doio_write.
 1.114  07-Dec-2003  fvdl Unix semantics dictate that access checks for files are done when it
is opened. An open file can always be read from and/or written to,
depending on how it was opened.

Therefore, the read/write/commit RPCs should never return EACCES,
as they are only performed on files that have been successfully opened
already.

This change improves the current situation and works in most cases.
It simply always uses the most recently known owner/group of the file,
iff the authentication mechanism is AUTH_UNIX (in other cases, the
creds for a successful open are used, but note that no other cases
are currently implemented).

A retry mechanism can be used to catch a few more cases, but this is
a good improvement for now.
 1.113  17-Nov-2003  jonathan Fix hanging-paren typo.
 1.112  17-Nov-2003  jonathan Change previous patch to have same effect as patch posted to
tech-kern. Suggested reformatting inadvertently changed the meaning of
the code, as noted by YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>.
 1.111  17-Nov-2003  jonathan Commit fix for NFS write deadlock, on filesystems mounted via
local-loopback (lo0). As posted for review on tech-kern 2003-18-09,
with a long comment explaining (one of) the deadlock scenarios.

I've used this since shortly after 2002-09-12, without noticing
performance degradation or instability for non-loopback mounts.
 1.110  26-Sep-2003  yamt change n_mtime from time_t to timespec in order to improve
cache consistency.
(1 second granularity is too loose these days.)
 1.109  17-Sep-2003  yamt don't call nfs_delayedtruncate() from nfs_getpages().
it causes simplelock deadlock.
 1.108  26-Aug-2003  pk VOP_PUTPAGES() must be called with the vnode's interlock held.
 1.107  07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.106  03-Aug-2003  pk Make life slightly easier for the compiler's optimisation routines.
 1.105  29-Jun-2003  fvdl branches: 1.105.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.104  28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.103  22-May-2003  yamt interlock for NFLUSHINPROG/NFLUSHWANT.
 1.102  21-May-2003  yamt eliminate memcpy in the common and easy case of write.
 1.101  16-May-2003  yamt correct a KASSERT.
 1.100  15-May-2003  yamt acquire vmobjlock when touching pg->flags.
 1.99  07-May-2003  yamt simple lock for nfs iod.
 1.98  03-May-2003  yamt - check page's offset in the object as well. (pointed by Chuck Silvers.)
- remove false assertion.
 1.97  03-May-2003  yamt - if writerpc ends with a stable result, no need to commit them anymore.
- add comments.
 1.96  03-May-2003  yamt better handling of write verifier change.
 1.95  18-Apr-2003  yamt fix a use of an uninitialized variable.
 1.94  15-Apr-2003  yamt remove line-wrapping that is no longer needed.
 1.93  12-Apr-2003  yamt fix a typo in the previous.
 1.92  12-Apr-2003  yamt set b_resid correctly.
 1.91  12-Apr-2003  yamt split nfs_doio to nfs_doio_{phys,read,write} to avoid too deep indents.
 1.90  12-Apr-2003  yamt - do FILESYNC writes if we're freeing the page or the page doesn't
belong to us. otherwise, data will be lost on server crash.
- use b_bcount instead of b_bufsize to determine
how many pages we should deal with.

based on a patch from Chuck Silvers.
discussed on tech-kern.
 1.89  09-Apr-2003  yamt rename a very confusing variable name.
(must_commit -> stalewriteverf)
 1.88  09-Apr-2003  yamt when commit failed and fall to write, re-set 'off' and 'cnt'
because it can be changed in 'needcommit' path.
 1.87  09-Apr-2003  yamt make per-iod datas together.
 1.86  18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.85  29-Oct-2002  yamt fix panic (assertion failure) on error case.
if uiomove fails, we should clean up pages past eof.

the problem reported by kay.
ok'ed by Chuck Silvers.
 1.84  23-Oct-2002  jdolecek merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
 1.83  21-Oct-2002  yamt fix a page locking deadlock problem for nfs.

add a flag that specifies whether the file can be truncated safely or not
to nfsm_loadattr and friends. when it isn't safe, just mark the nfsnode
as "should be truncated later".

ok'ed by Frank van der Linden and Chuck Silvers.
close kern/18036.
 1.82  01-Sep-2002  bouyer nfs_doio(): handle the case where nfs_writerpc() returned error != 0.
Fix kern/18125. OK'd by thorpej and chs.
 1.81  06-May-2002  enami branches: 1.81.4;
Remove wrong assertion in previous commit.
 1.80  06-May-2002  enami The per nfsnode n_commitlock is a sleep lock, but we can't sleep in a
PGO_LOCKED getpages request. So, just make the lock fail and tell
the caller that there are no pages available if we can't acquire it.
The caller will call us again soon without PGO_LOCKED. Reviewed by chuq.
 1.79  10-Apr-2002  chs only use UBC_FAULTBUSY to access offsets past EOF,
otherwise we can deadlock trying to busy the same page in uiomove().
 1.78  25-Mar-2002  chs remove PGO_WEAK, it isn't needed anymore.
 1.77  23-Mar-2002  chs only do v3 stuff for v3 filesystems.
 1.76  16-Mar-2002  chs make sure that if NMODIFIED is clear, all pages attached to the vnode are
clean and without writable mappings. if we try to flush dirty pages past
EOF to the server when NMODIFIED is clear, we'll update the attrcache before
doing the write, which will try to free the pages past EOF and deadlock.
to deal with this, we write-protect pages before we send them to the server,
and restrict ourselves to creating read-only mappings if NMODIFIED isn't set.
score another one for enami.
 1.75  31-Jan-2002  chs use curproc instead of b_proc for NFS. that's what we want for sync commits
and it doesn't cause any problems for async commits.
 1.74  26-Jan-2002  chs re-enable NFSv3 commit RPCs by abandoning my new approach in favor of
frank's scheme, with one new twist: don't wait until we've totally run
out of free pages before committing, but instead notice when we've built
up a largish range of uncommitted pages and commit only the older half of
the range, which is likely to already be on disk on the server.
 1.73  31-Dec-2001  chs fix locking in nfs_getpages().
 1.72  30-Nov-2001  chs call VOP_PUTPAGES() directly instead of indirecting through
the UVM pager op vector.
 1.71  10-Nov-2001  lukem add RCSIDs
 1.70  13-Oct-2001  simonb branches: 1.70.2;
Remove some variables that are only ever set and never referenced.
 1.69  15-Sep-2001  chs a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
 1.68  27-Jun-2001  thorpej branches: 1.68.2; 1.68.4;
Make sure to add NFS vnodes to the syncerd worklist.
 1.67  26-May-2001  chs replace vm_page_t with struct vm_page *.
 1.66  16-Apr-2001  chs reads at or after EOF should "succeed".
 1.65  03-Apr-2001  chs handle partially full directory buffers by only using (b_bcount - b_resid)
bytes of data from the buffer.
 1.64  10-Mar-2001  chs eliminate the VM_PAGER_* error codes in favor of the traditional E* codes.
the mapping is:

VM_PAGER_OK       0
VM_PAGER_BAD      <unused>
VM_PAGER_FAIL     <unused>
VM_PAGER_PEND     0 (see below)
VM_PAGER_ERROR    EIO
VM_PAGER_AGAIN    EAGAIN
VM_PAGER_UNLOCK   EBUSY
VM_PAGER_REFAULT  ERESTART

for async i/o requests, it used to be possible for the request to
be converted to sync, and the pager would return VM_PAGER_OK or VM_PAGER_PEND
to indicate whether the caller should perform post-i/o cleanup.
this is no longer allowed; pagers must now return 0 to indicate that
the async i/o was successfully started, and the caller never needs to
worry about doing the post-i/o cleanup.
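
The mapping listed in rev 1.64 can be written out as a small translation function. This is a sketch under stated assumptions: the `SK_PAGER_*` enumerators and `sketch_pager_to_errno()` are hypothetical names, the enumerator values are arbitrary, the `ERESTART` fallback is a placeholder for systems whose userland headers do not expose it, and returning `EINVAL` for the two codes the commit lists as unused is this sketch's own choice.

```c
#include <assert.h>
#include <errno.h>

#ifndef ERESTART
#define ERESTART (-3)	/* placeholder value, assumed for this sketch only */
#endif

/* Hypothetical stand-ins for the retired VM_PAGER_* codes. */
enum sketch_pager_code {
	SK_PAGER_OK,
	SK_PAGER_BAD,
	SK_PAGER_FAIL,
	SK_PAGER_PEND,
	SK_PAGER_ERROR,
	SK_PAGER_AGAIN,
	SK_PAGER_UNLOCK,
	SK_PAGER_REFAULT
};

/* Translate an old pager code to the E* value given in the commit's table. */
static int
sketch_pager_to_errno(enum sketch_pager_code code)
{
	switch (code) {
	case SK_PAGER_OK:
	case SK_PAGER_PEND:	/* async i/o started successfully */
		return 0;
	case SK_PAGER_ERROR:
		return EIO;
	case SK_PAGER_AGAIN:
		return EAGAIN;
	case SK_PAGER_UNLOCK:
		return EBUSY;
	case SK_PAGER_REFAULT:
		return ERESTART;
	case SK_PAGER_BAD:
	case SK_PAGER_FAIL:
		return EINVAL;	/* listed as unused; EINVAL is an arbitrary pick */
	}
	assert(0 && "unknown pager code");
	return EINVAL;
}
```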
 1.63  27-Feb-2001  chs branches: 1.63.2;
min() -> MIN(), max() -> MAX().
fixes more problems with file offsets > 4GB.
 1.62  18-Feb-2001  chs fix a couple more bugs:
- in nfs_getpages(), unbusy any pages that we don't free in the error path.
- in nfs_putpages(), only call biowait() if we actually started any i/os.
 1.61  05-Feb-2001  chs fix several bugs:
- in the cases where we skip over the i/o loop, increment npages by ridx
so that when the cleanup code starts processing the pgs array at index 0
it'll actually process all of the pages.
- process the PG_RELEASED flag when unbusying pages.
- add some missing MP locking.
- use MIN() and MAX() instead of min() and max() since the latter are
functions which take arguments of type "int" but we call them with
values of type "off_t", so the values could be truncated.
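
The truncation hazard named in the last item of rev 1.61 can be reproduced in isolation. The sketch below is illustrative, not kernel code: `sketch_min32()` stands in for a function-style `min()` with 32-bit parameters (unsigned here so the narrowing conversion is well defined in C), `SKETCH_MIN()` stands in for the macro form, and both names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a function-style min() whose parameters are 32 bits wide. */
static uint32_t
sketch_min32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Macro form: the comparison happens in the operands' own type, so
 * 64-bit offsets keep all their bits. Arguments are evaluated twice,
 * so they must not have side effects.
 */
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Clamp via the 32-bit function: the upper 32 bits of both values are lost. */
static int64_t
sketch_clamp_truncating(int64_t off, int64_t limit)
{
	assert(off >= 0 && limit >= 0);
	return (int64_t)sketch_min32((uint32_t)off, (uint32_t)limit);
}

/* Clamp via the macro: the full 64-bit values are compared. */
static int64_t
sketch_clamp_wide(int64_t off, int64_t limit)
{
	return SKETCH_MIN(off, limit);
}
```

With offsets beyond 4 GB the two helpers disagree, which is the class of bug the switch to `MIN()` and `MAX()` addresses.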
 1.60  30-Jan-2001  thorpej Make sure bp->b_proc is initialized. Should fix a deref-garbage-pointer
problem reported by msaitoh@netbsd.org. NOTE: These are marked XXXUBC
since the code that allocates the bufs is new with UBC, but it may be
the case that bp->b_proc needs to be initialized to curproc (it's used
in a call to nfs_sigintr()).
 1.59  07-Jan-2001  enami Use uvm_aio_biodone instead of uvm_aio_aiodone for top-level buf
so that uvmexp.paging is updated if this i/o was initiated by
the pagedaemon.
 1.58  27-Dec-2000  chs fix several bugs:
- fix math when skipping writing pages that just need a commit.
- clear the needcommit stuff and PG_RDONLY flags on pages returned for
overwrite requests as well as for normal write faults.
- bail out of nfs_write() if we get an error.
- remove a bogus attempt to clean up after failed uiomove()s.
- bring over a workaround for a lock-ordering problem from the genfs code.
- add some missing MP locking.
 1.57  13-Dec-2000  jdolecek <sys/trace.h> is not needed here
 1.56  09-Dec-2000  chs only zero the part of the page after EOF if we're actually
initializing the page.
 1.55  04-Dec-2000  fvdl Initialize 'error' to 0, so that nfs_putpages doesn't return garbage
when pages already have been committed and nothing needs to be done.
 1.54  27-Nov-2000  chs Initial integration of the Unified Buffer Cache project.
 1.53  19-Sep-2000  bjh21 Extend NFS_V2_ONLY to remove NQNFS lease support as well. Saves another 10k.
 1.52  19-Sep-2000  fvdl Move handling of B_NEEDCOMMIT buffers to nfs_doio, so that bawrite() calls
for them are actually done asynchronously. Idea taken from FreeBSD.

Do away with nfs_writebp completely, it's not needed anymore.

Keep an eye on the range of a file that needs to be committed, and
do it in heaps.
 1.51  19-Sep-2000  bjh21 New kernel option, NFS_V2_ONLY, which aims to reduce the NFS client to just
that required to support NFSv2 mounts. Not finished yet, but already
provides some 44k of saving in code size on arm26. More savings, and some
documentation, are still to come.
 1.50  27-Jun-2000  mrg remove include of <vm/vm.h>
 1.49  18-May-2000  pk branches: 1.49.4;
Fix printf() format.
 1.48  30-Mar-2000  augustss Remove register declarations.
 1.47  23-Nov-1999  fvdl Be more careful to block bio interrupts for some data structures. There
were at least a few missed cases where vp->v_{clean,dirty}blkhd were
unprotected since the softdep/trickle sync merge.
 1.46  15-Nov-1999  fvdl Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O
 1.45  24-Mar-1999  mrg branches: 1.45.4; 1.45.8; 1.45.10; 1.45.14;
completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.
 1.44  09-Aug-1998  perry branches: 1.44.2;
bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.43  21-Jun-1998  fvdl Fix possible overflow problem in read size computation.
 1.42  10-Feb-1998  mrg - add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.
 1.41  05-Feb-1998  mrg initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)
 1.40  23-Nov-1997  fvdl Move the EOF check after getting a block out of the if() that determines
whether we get it off the wire. An nfsiod might have been busy with
it, and finished while we were waiting for it in nfs_getcacheblk, so
we need to check for EOF again no matter what.
 1.39  23-Oct-1997  fvdl Oops. Fix goof in previous change.
 1.38  22-Oct-1997  fvdl Just return immediately in nfs_bioread if we got an empty buffer because
of EOF on a directory.
 1.37  20-Oct-1997  thorpej branches: 1.37.2;
Fix alignment problems. From Frank van der Linden <fvdl@NetBSD.ORG>.
 1.36  19-Oct-1997  fvdl Only do readaheads when reading sequential blocks; check v_lastr to
achieve this. Improves performance for demand paging. From Chris Demetriou.
 1.35  19-Oct-1997  fvdl * Implement optional 32 <-> 64 bit directory cookie translation. This uses
the directory cache as translation table. See nfs_subs.c for comments.
Makes the code a bit more complex to look at than I would have liked,
but doesn't affect the speed of the default behavior.
* Optimize caching behavior a bit when buffers are invalidated.
* Save some RPCs in readdir operations by not bothering if there is
a small amount left to do to fill the buffer. It'll be done in the
next RPC with a larger chunk anyway. Wastes a bit of buffer space
but is faster.
* Make n_vattr an allocated vattr struct. This avoids nfsnode bloat,
and is friendlier to the malloc routines.
 1.34  10-Oct-1997  fvdl * New directory entry caching system. Provides full caching of any
directory cookie that may be thrown back at us from userspace, up
to a size limit. Fixes double entry problem.
* Split flags for internal and external use in the NFS mount structure.
* Fix some buffer structure fields that weren't being used correctly.
* Fix missing directory cache inval call in nfs_open.
* Limit on NFS_DIRBLKSIZ no longer needed, bumped to the more reasonable
value of 8k.
* Various other things that I forget, all related to the dir caching
somehow, though.
 1.33  17-Jul-1997  fvdl branches: 1.33.2;
* Deal with servers that don't give complete FSINFO (like NT)
From Olaf Seibert <rhialto@polder.ubc.kun.nl> (PR 3687)
* Make an attempt to check the maximum filesize before attempting
a write to the server, as write RPCs will typically happen
asynchronously, and the process will not see the error.
Fixes problems with unexpectedly truncated files at 4G
* Pass up errors in nfs_writerpc correctly
 1.32  04-Jul-1997  drochner Don't cast 64bit (off_t) file sizes to vm_offset_t (32bit on many
architectures), truncate them intelligently instead.
The truncation is done centralized in vnode_pager.c.
This prevents wrap-over effects when parts of large (>2^32 byte) files
are mmapped.
Don't allow mmap above the numerical range of vm_offset_t.
This is considered a temporary solution until the vm system handles the
object sizes/offsets more cleanly.
 1.31  20-Apr-1997  fvdl Only wake up one nfsiod when there is an async write to do. (from FreeBSD).
 1.30  02-Dec-1996  thorpej NFS performance improvement from Doug Rabson/FreeBSD:

Improve the queuing algorithms used by NFS' asynchronous i/o. The
existing mechanism uses a global queue for some buffers and the
vp->b_dirtyblkhd queue for others. This turns sequential writes into
randomly ordered writes to the server, affecting both read and write
performance. The existing mechanism also copes badly with hung
servers, tending to block accesses to other servers when all the iods
are waiting for a hung server.

The new mechanism uses a queue for each mount point. All asynchronous
i/o goes through this queue which preserves the ordering of requests.
A simple mechanism ensures that the iods are shared out fairly between
active mount points.

Reviewed/integrated/approved by Frank van der Linden <fvdl@netbsd.org>
 1.29  13-Oct-1996  christos revert kprintf changes
 1.28  10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.27  02-Jul-1996  fvdl Don't mistake a non-async block that needs to be committed for an
interrupted write.
 1.26  23-May-1996  fvdl * Make mounts with symlinks work (needed for direct mounts with amd). PR #1917
* Never change the NQNFS flag and/or version when just doing an update mount.
Fixes a problem that made diskless booting impossible under some
circumstances.
 1.25  29-Feb-1996  fvdl branches: 1.25.4;
Make sure to clear B_NEEDCOMMIT in the right spot. Fix 'officially blessed'
by Rick Macklem. Fixes PR kern/2128.
 1.24  18-Feb-1996  fvdl Bring in a merge of Rick Macklem's NFSv3 code from Lite2
 1.23  09-Feb-1996  christos nfs prototype changes
 1.22  01-Feb-1996  jtc Rename struct timespec fields to conform to POSIX.1b
 1.21  24-Jul-1995  cgd avoid unnecessary aging of buffers. This used to make sense, when buffer
caches were much smaller, but makes little sense now, and will become more
useless as RAM (and buffer cache) sizes grow. Suggested by Bob Baron.
 1.20  18-Mar-1995  gwr Make call to nfs_writerpc() consistent with others.
 1.19  12-Jan-1995  mycroft Add two missing brelse() calls. From Rick Macklem.
 1.18  10-Jan-1995  mycroft Make sure readdir requests are only truncated on block boundaries.
 1.17  20-Jul-1994  mycroft Fix a problem with write-behind causing processes to be killed occasionally.
From Rick Macklem.
 1.16  12-Jul-1994  cgd minor cache consistency fix
 1.15  29-Jun-1994  cgd branches: 1.15.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.14  22-Jun-1994  pk straighten out diskless swap code somewhat.
 1.13  15-Jun-1994  mycroft Turn P_NOSWAP and P_PHYSIO into a hold count, as suggested by a comment.
 1.12  08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.11  24-May-1994  cgd MIN -> min, MAX -> max
 1.10  25-Apr-1994  cgd some prototype cleanup, eliminate/replace bogus types (e.g. quad and
u_quad) -> use better types (e.g. quad_t & u_quad_t in inodes),
some cleanup.
 1.9  21-Apr-1994  cgd Convert mount, vnode, and buf structs to use <sys/queue.h>. Also,
some knf and structure frobbing to do along with it.
 1.8  18-Dec-1993  mycroft Canonicalize all #includes.
 1.7  03-Sep-1993  jtc branches: 1.7.2;
Include systm.h to get prototypes (and possibly inlines) of *max functions.
 1.6  13-Jul-1993  cgd get rid of some more bogus changes from a week ago
 1.5  13-Jul-1993  cgd diskless changes made last time were hosed; were using NULL for
"no credentials" rather than NOCRED.
 1.4  07-Jul-1993  cgd changes from ws to support diskless booting... these are "OK" on inspection
and after testing... (actually, currently, none of the changed
code is even used...)
 1.3  30-Jun-1993  andrew Paul Kranenburg's VM deadlock fixes. (patchkit patch 00147, part 2)
 1.2  20-May-1993  cgd more rcs id adding and header cleanup. i like vi macros!
 1.1  21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.3  01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2  01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1  21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.7.2.1  14-Nov-1993  mycroft Canonicalize all #includes.
 1.15.2.2  20-Jul-1994  cgd from trunk, per mycroft
 1.15.2.1  12-Jul-1994  cgd consistency fix, from trunk
 1.25.4.2  08-Jul-1996  jtc Pulled up from rev 1.27 by request from Frank van der Linden
 1.25.4.1  25-May-1996  fvdl Pull in bugfixes from main branch.
 1.33.2.1  14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.37.2.3  24-Nov-1997  mellon Pull rev 1.40 up from trunk (fvdl)
 1.37.2.2  23-Oct-1997  mellon Pull rev 1.39 up from trunk
 1.37.2.1  23-Oct-1997  mellon Pull rev 1.38 from main trunk
 1.44.2.5  30-May-1999  chs vm_page's blkno is gone.
 1.44.2.4  30-Apr-1999  chs change ubc_alloc()'s length arg to be a pointer instead of the value.
the pointed-to value is the total desired length on input,
and is updated to the length that will fit in the returned window.
this allows callers of ubc_alloc() to be ignorant of the window size.
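The in/out length convention described above can be sketched as follows. This is not the real ubc_alloc(): window_alloc(), windows_needed(), WINDOW_SIZE and the backing array are hypothetical names chosen for illustration, and int64_t stands in for off_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WINDOW_SIZE 8192	/* hypothetical window size */

static unsigned char backing[WINDOW_SIZE];

/*
 * Hypothetical allocator: on input *lenp is the total desired length,
 * on return it is the length that fits in the returned window.
 */
static void *
window_alloc(int64_t offset, size_t *lenp)
{
	size_t in_window, fits;

	assert(offset >= 0 && lenp != NULL);
	in_window = (size_t)(offset % WINDOW_SIZE);
	fits = WINDOW_SIZE - in_window;
	if (*lenp > fits)
		*lenp = fits;
	return &backing[in_window];
}

/* A caller loops without ever needing to know WINDOW_SIZE. */
static int
windows_needed(int64_t offset, size_t total)
{
	int calls = 0;

	while (total > 0) {
		size_t len = total;	/* ask for everything that is left */

		(void)window_alloc(offset, &len);
		offset += (int64_t)len;	/* advance by what actually fit */
		total -= len;
		calls++;
	}
	return calls;
}
```

The caller simply passes the remaining length on every iteration and advances by whatever length comes back, which is the property the commit message points out.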
 1.44.2.3  25-Feb-1999  chs major overhaul of getpages and putpages functions.
 1.44.2.2  16-Nov-1998  chs set NMODIFIED in nfs_write().
putpage is now called with uobj unlocked.
remove some debugging printfs.
 1.44.2.1  09-Nov-1998  chs initial snapshot. lots left to do.
 1.45.14.2  27-Dec-1999  wrstuden Pull up to last week's -current.
 1.45.14.1  21-Dec-1999  wrstuden Initial commit of recent changes to make DEV_BSIZE go away.

Runs on i386, needs work on other arch's. Main kernel routines should be
fine, but a number of the stand programs need help.

cd, fd, ccd, wd, and sd have been updated. sd has been tested with non-512
byte block devices. vnd, raidframe, and lfs need work.

Non 2**n block support is automatic for LKM's and conditional for kernels
on "options NON_PO2_BLOCKS".
 1.45.10.1  19-Oct-1999  fvdl Bring in Kirk McKusick's FFS softdep code on a branch.
 1.45.8.8  21-Apr-2001  bouyer Sync with HEAD
 1.45.8.7  12-Mar-2001  bouyer Sync with HEAD.
 1.45.8.6  11-Feb-2001  bouyer Sync with HEAD.
 1.45.8.5  18-Jan-2001  bouyer Sync with head (for UBC+NFS fixes, mostly).
 1.45.8.4  05-Jan-2001  bouyer Sync with HEAD
 1.45.8.3  13-Dec-2000  bouyer Sync with HEAD (for UBC fixes).
 1.45.8.2  08-Dec-2000  bouyer Sync with HEAD.
 1.45.8.1  20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.45.4.5  31-Aug-1999  perseant Rudimentary support for LFS under UBC:

- LFS-specific VOP_BALLOC and VOP_PUTPAGES vnode ops.

- getblk VREG panic #ifdef'd out (can be reinstated when Ifile is
internalized and Ifile can be made another type from VREG)

- interface to VOP_PUTPAGES changed to pass all pager flags, not
just sync. FS putpages routines must know about the pager flags.

- new LFS magic disk address, -2 ("unwritten"), meaning accounted for
but not assigned to a fixed disk location (since LFS does these two
things separately, and the previous accounting method using buffer
headers no longer will work). Changed references to (foo == (daddr_t)-1)
to (foo < 0). Since disk drivers reject all addresses < 0, this should
not present a problem for other FSs.
 1.45.4.4  31-Jul-1999  chs in nfs_getpages(), deal with extending writes better.
also, return errnos instead of VM_PAGER_*.
 1.45.4.3  11-Jul-1999  chs remove uvm_vnp_uncache(), it's no longer needed.
 1.45.4.2  04-Jul-1999  chs update uvm_pagermapin() to match new args.
 1.45.4.1  07-Jun-1999  chs merge everything from chs-ubc branch.
 1.49.4.1  14-Dec-2000  he Pull up revision 1.52 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.63.2.19  11-Nov-2002  nathanw Catch up to -current
 1.63.2.18  22-Oct-2002  thorpej Sync with HEAD.
 1.63.2.17  17-Sep-2002  nathanw Catch up to -current.
 1.63.2.16  15-Jul-2002  nathanw Whitespace.
 1.63.2.15  12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.63.2.14  24-Jun-2002  nathanw Curproc->curlwp renaming.

Change uses of "curproc->l_proc" back to "curproc", which is more like the
original use. Bare uses of "curproc" are now "curlwp".

"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc : NULL)
so that it is always safe to reference curproc (*de*referencing curproc
is another story, but that's always been true).
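A minimal user-space sketch of why such a macro is safe to reference but not necessarily to dereference. The struct layouts, the file-scope curlwp variable, and the helpers set_current() and current_pid_or() are simplified assumptions for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. */
struct proc { int p_pid; };
struct lwp { struct proc *l_proc; };

/* Stand-in for the current-LWP pointer; may be NULL (e.g. early in boot). */
static struct lwp *curlwp;

/* Evaluating curproc never dereferences a NULL curlwp. */
#define curproc ((curlwp) ? (curlwp)->l_proc : NULL)

/* Hypothetical helper that switches the "current" LWP. */
static void
set_current(struct lwp *l)
{
	/* In this sketch an LWP always belongs to a process. */
	assert(l == NULL || l->l_proc != NULL);
	curlwp = l;
}

/* Callers still have to check the result before dereferencing it. */
static int
current_pid_or(int fallback)
{
	struct proc *p = curproc;

	return p != NULL ? p->p_pid : fallback;
}
```

Referencing curproc while curlwp is NULL yields NULL rather than a fault; dereferencing that result without the check in current_pid_or() would still be a bug.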
 1.63.2.13  20-Jun-2002  nathanw Catch up to -current.
 1.63.2.12  17-Apr-2002  nathanw Catch up to -current.
 1.63.2.11  01-Apr-2002  nathanw Catch up to -current.
(CVS: It's not just a program. It's an adventure!)
 1.63.2.10  28-Feb-2002  nathanw curproc ==> curproc->l_proc
 1.63.2.9  28-Feb-2002  nathanw Catch up to -current.
 1.63.2.8  08-Jan-2002  nathanw Catch up to -current.
 1.63.2.7  14-Nov-2001  nathanw Catch up to -current.
 1.63.2.6  22-Oct-2001  nathanw Catch up to -current.
 1.63.2.5  21-Sep-2001  nathanw Catch up to -current.
 1.63.2.4  24-Aug-2001  nathanw Catch up with -current.
 1.63.2.3  21-Jun-2001  nathanw Catch up to -current.
 1.63.2.2  09-Apr-2001  nathanw Catch up with -current.
 1.63.2.1  05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.68.4.1  01-Oct-2001  fvdl Catch up with -current.
 1.68.2.5  30-Sep-2002  jdolecek add support for kevents to NFS
to detect file changes on server by other NFS clients, polling kernel thread
is used to periodically check for attribute changes of watched files;
the NFS server is only contacted when the vnode expires from local attrcache
(which takes 5-60 seconds currently), to keep network&CPU overhead low

the routine checking for remote changes is quite simplistic, but hopefully
doing its job well enough
 1.68.2.4  06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.68.2.3  23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.68.2.2  11-Feb-2002  jdolecek Sync w/ -current.
 1.68.2.1  10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.70.2.1  12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.81.4.1  01-Sep-2002  lukem Pull up revision 1.82 (requested by bouyer in ticket #752):
nfs_doio(): handle the case where nfs_writerpc() returned error != 0.
Fix kern/18125. OK'd by thorpej and chs.
 1.105.2.12  11-Dec-2005  christos Sync with head.
 1.105.2.11  10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.105.2.10  04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.105.2.9  04-Feb-2005  skrll Sync with HEAD.
 1.105.2.8  17-Jan-2005  skrll Sync with HEAD.
 1.105.2.7  18-Dec-2004  skrll Sync with HEAD.
 1.105.2.6  02-Nov-2004  skrll Sync with HEAD.
 1.105.2.5  30-Oct-2004  skrll Correct panic message s/proc/lwp/
 1.105.2.4  21-Sep-2004  skrll Fix the sync with head I botched.
 1.105.2.3  18-Sep-2004  skrll Sync with HEAD.
 1.105.2.2  03-Aug-2004  skrll Sync with HEAD
 1.105.2.1  02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.116.2.3  01-Dec-2005  riz Pull up following revision(s) (requested by jld in ticket #8826):
sys/nfs/nfs_bio.c: revisions 1.136-1.137 via patch
The problem (kern/31926): under certain conditions, which could be
reliably reproduced, NFS reads would occasionally return zeroes instead
of some of the file data, or fail with EINVAL.
 1.116.2.2  18-Sep-2004  he branches: 1.116.2.2.2;
Pull up revision 1.120 (requested by yamt in ticket #858):
Fix access-after-free bugs in dircache code by reference
counting nfsdircache. Fixes PR#26864.
 1.116.2.1  21-Jun-2004  tron Pull up revision 1.118 (requested by yamt in ticket #513):
nfs_doio_read: use np->n_rcred instead of curproc->p_ucred for VDIR.
XXX maybe it's better to use a cred passed by VOP_READDIR.
 1.116.2.2.2.2  01-Dec-2005  riz Pull up following revision(s) (requested by jld in ticket #8826):
sys/nfs/nfs_bio.c: revisions 1.136-1.137 via patch
The problem (kern/31926): under certain conditions, which could be
reliably reproduced, NFS reads would occasionally return zeroes instead
of some of the file data, or fail with EINVAL.
 1.116.2.2.2.1  30-Jan-2005  he branches: 1.116.2.2.2.1.2;
Pull up revision 1.122 (requested by yamt in ticket #968):
Since daddr_t is 64-bit these days, simply use nfs directory
cookies as buffer cache indexes. This should make the
regress/sys/fs/getdents test work. Fixes PR#27112.
 1.116.2.2.2.1.2.1  01-Dec-2005  riz Pull up following revision(s) (requested by jld in ticket #8826):
sys/nfs/nfs_bio.c: revisions 1.136-1.137 via patch
The problem (kern/31926): under certain conditions, which could be
reliably reproduced, NFS reads would occasionally return zeroes instead
of some of the file data, or fail with EINVAL.
 1.124.4.2  19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.124.4.1  12-Feb-2005  yamt sync with head.
 1.124.2.1  29-Apr-2005  kent sync with -current
 1.128.4.8  21-Jan-2008  yamt sync with head
 1.128.4.7  07-Dec-2007  yamt sync with head
 1.128.4.6  15-Nov-2007  yamt sync with head.
 1.128.4.5  27-Oct-2007  yamt sync with head.
 1.128.4.4  03-Sep-2007  yamt sync with head.
 1.128.4.3  26-Feb-2007  yamt sync with head.
 1.128.4.2  30-Dec-2006  yamt sync with head.
 1.128.4.1  21-Jun-2006  yamt sync with head.
 1.128.2.2  21-Nov-2005  tron Pull up following revision(s) (requested by yamt in ticket #980):
sys/nfs/nfs_bio.c: revision 1.137
nfs_bioread: push delayed truncation and tweak loop accordingly.
PR/31926 from Jed Davis.
 1.128.2.1  21-Nov-2005  tron Pull up following revision(s) (requested by yamt in ticket #980):
sys/nfs/nfs_bio.c: revision 1.136
nfs_bioread: handle file truncation on the server a little more gracefully.
 1.137.2.3  19-Nov-2005  yamt - as read-ahead context is per-vnode now,
there are fewer reasons to make VOP_READ call uvm_ra_request explicitly.
move it to pager (uvn_get) so that it can handle accesses via mmap as well.
- pass advice to pager via ubc.
- tweak DPRINTF.

XXX can be disturbed by PGO_LOCKED.

XXX it's controversial where it should be done.
(uvm_fault, uvn_get or genfs_getpages.)
 1.137.2.2  18-Nov-2005  yamt - associate read-ahead context to vnode, rather than file.
- revert VOP_READ prototype.
 1.137.2.1  15-Nov-2005  yamt adapt ffs, lfs, nfs.
 1.140.2.2  15-Jan-2006  yamt sync with head.
 1.140.2.1  31-Dec-2005  yamt - adapt nfs.
- nfs_doio_read: #if 0 out "killproc if text is modified" part of
the code as it's broken. (a process reading the modified text is not
necessarily a process which is using the file as a text.)
 1.141.4.2  01-Jun-2006  kardel Sync with head.
 1.141.4.1  22-Apr-2006  simonb Sync with head.
 1.141.2.1  09-Sep-2006  rpaulo sync with head
 1.142.6.1  24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.142.4.2  06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.142.4.1  08-Mar-2006  elad Adapt to kernel authorization KPI.

This could use some testing...
 1.142.2.2  11-Aug-2006  yamt sync with head
 1.142.2.1  24-May-2006  yamt sync with head.
 1.143.4.1  13-Jul-2006  gdamore Merge from HEAD.
 1.145.4.1  12-Jan-2007  ad Sync with head.
 1.147.2.4  17-May-2007  yamt sync with head.
 1.147.2.3  07-May-2007  yamt sync with head.
 1.147.2.2  12-Mar-2007  rmind Sync with HEAD.
 1.147.2.1  28-Feb-2007  yamt sync with head. (somehow missed in the previous)
 1.151.4.1  11-Jul-2007  mjf Sync with head.
 1.151.2.12  24-Aug-2007  ad Sync with buffer cache locking changes. See buf.h/vfs_bio.c for details.
Some minor portions are incomplete and need to be verified as a whole.
 1.151.2.11  20-Aug-2007  ad Sync with HEAD.
 1.151.2.10  19-Aug-2007  ad - Back out the biodone() changes.
- Eliminate B_ERROR (from HEAD).
 1.151.2.9  15-Jul-2007  ad Sync with head.
 1.151.2.8  18-Jun-2007  yamt fix merge botches.
 1.151.2.7  17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.151.2.6  09-Jun-2007  ad Sync with head.
 1.151.2.5  08-Jun-2007  ad Sync with head.
 1.151.2.4  13-May-2007  ad - Pass the error number and residual count to biodone(), and let it handle
setting error indicators. Prepare to eliminate B_ERROR.
- Add a flag argument to brelse() to be set into the buf's flags, instead
of doing it directly. Typically used to set B_INVAL.
- Add a "struct cpu_info *" argument to kthread_create(), to be used to
create bound threads. Change "bool mpsafe" to "int flags".
- Allow exit of LWPs in the IDL state when (l != curlwp).
- More locking fixes & conversion to the new API.
 1.151.2.3  09-Apr-2007  ad - Add two new arguments to kthread_create1: pri_t pri, bool mpsafe.
- Fork kthreads off proc0 as new LWPs, not new processes.
 1.151.2.2  21-Mar-2007  ad - Replace more simple_locks, and fix up in a few places.
- Use condition variables.
- LOCK_ASSERT -> KASSERT.
 1.151.2.1  13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.160.2.1  15-Aug-2007  skrll Sync with HEAD.
 1.164.6.2  29-Jul-2007  ad It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
 1.164.6.1  29-Jul-2007  ad file nfs_bio.c was added on branch matt-mips64 on 2007-07-29 13:31:13 +0000
 1.164.4.6  09-Dec-2007  jmcneill Sync with HEAD.
 1.164.4.5  27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.164.4.4  29-Oct-2007  joerg Sync with HEAD.
 1.164.4.3  26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.164.4.2  16-Aug-2007  jmcneill Sync with HEAD.
 1.164.4.1  09-Aug-2007  jmcneill Sync with HEAD.
 1.166.4.1  14-Oct-2007  yamt sync with head.
 1.166.2.2  09-Jan-2008  matt sync with HEAD
 1.166.2.1  06-Nov-2007  matt sync with HEAD
 1.168.2.1  13-Nov-2007  bouyer Sync with HEAD
 1.169.2.2  18-Feb-2008  mjf Sync with HEAD.
 1.169.2.1  08-Dec-2007  mjf Sync with HEAD.
 1.170.2.2  08-Dec-2007  ad Sync with head.
 1.170.2.1  04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.171.4.1  02-Jan-2008  bouyer Sync with HEAD
 1.173.6.3  17-Jan-2009  mjf Sync with HEAD.
 1.173.6.2  02-Jun-2008  mjf Sync with HEAD.
 1.173.6.1  03-Apr-2008  mjf Sync with HEAD.
 1.174.2.1  18-May-2008  yamt sync with head.
 1.175.8.2  13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.175.8.1  19-Oct-2008  haad Sync with HEAD.
 1.175.2.4  10-Oct-2010  yamt some locking changes
 1.175.2.3  26-Sep-2010  yamt locking changes
 1.175.2.2  11-Aug-2010  yamt sync with head.
 1.175.2.1  04-May-2009  yamt sync with head.
 1.179.4.2  16-Jul-2010  riz Pull up following revision(s) (requested by jakllsch in ticket #1417):
sys/nfs/nfs_bio.c: revision 1.185
Fix memory leak during some NFS writes.
 1.179.4.1  02-Nov-2008  snj branches: 1.179.4.1.4;
Pull up following revision(s) (requested by tron in ticket #9):
sys/nfs/nfs_bio.c: revision 1.180
sys/miscfs/genfs/genfs_io.c: revision 1.14
sys/uvm/uvm_extern.h: revision 1.149
- allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic
 1.179.4.1.4.1  20-May-2011  matt bring matt-nb5-mips64 up to date with netbsd-5-1-RELEASE (except compat).
 1.179.2.2  28-Apr-2009  skrll Sync with HEAD.
 1.179.2.1  19-Jan-2009  skrll Sync with HEAD.
 1.181.4.1  13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.183.4.3  03-Jul-2010  rmind sync with head
 1.183.4.2  30-May-2010  rmind sync with head
 1.183.4.1  16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.183.2.2  17-Aug-2010  uebayasi Sync with HEAD.
 1.183.2.1  30-Apr-2010  uebayasi Sync with HEAD.
 1.185.6.1  23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.188.22.1  07-Sep-2013  bouyer Pull up following revision(s) (requested by hannken in ticket #933):
sys/nfs/nfs_bio.c: revision 1.189
Function nfs_vinvalbuf() ignores errors from vinvalbuf() and therefore
delayed write errors may get lost.
Change nfs_vinvalbuf() to keep errors from vinvalbuf() for fsync() or
close().

Presented on tech-kern@

Fix for PR kern/47980 (NFS over-quota not detected if utimes() called
before fsync()/close())
 1.188.16.1  28-Aug-2013  rmind sync with head
 1.188.14.1  07-Sep-2013  bouyer Pull up following revision(s) (requested by hannken in ticket #933):
sys/nfs/nfs_bio.c: revision 1.189
Function nfs_vinvalbuf() ignores errors from vinvalbuf() and therefore
delayed write errors may get lost.
Change nfs_vinvalbuf() to keep errors from vinvalbuf() for fsync() or
close().

Presented on tech-kern@

Fix for PR kern/47980 (NFS over-quota not detected if utimes() called
before fsync()/close())
 1.188.12.2  03-Dec-2017  jdolecek update from HEAD
 1.188.12.1  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.188.8.1  07-Sep-2013  bouyer Pull up following revision(s) (requested by hannken in ticket #933):
sys/nfs/nfs_bio.c: revision 1.189
Function nfs_vinvalbuf() ignores errors from vinvalbuf() and therefore
delayed write errors may get lost.
Change nfs_vinvalbuf() to keep errors from vinvalbuf() for fsync() or
close().

Presented on tech-kern@

Fix for PR kern/47980 (NFS over-quota not detected if utimes() called
before fsync()/close())
 1.188.2.4  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.188.2.3  25-Jan-2012  yamt uvm_loanobj: take an access pattern hint.
 1.188.2.2  04-Jan-2012  yamt enable O->A loaning read for a few filesystems.
 1.188.2.1  02-Nov-2011  yamt page cache related changes

- maintain object pages in radix tree rather than rb tree.
- reduce unnecessary page scan in putpages. esp. when an object has a ton of
pages cached but only a few of them are dirty.
- reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.
- fix nfs commit range tracking.
- fix nfs write clustering. XXX hack
 1.189.4.1  04-Nov-2015  riz Pull up following revision(s) (requested by manu in ticket #882):
sbin/umount/umount.c: revision 1.48
sys/nfs/nfsmount.h: revision 1.53
sys/nfs/nfs_var.h: revision 1.94
sys/nfs/nfs_iod.c: revision 1.7
sys/nfs/nfs_socket.c: revision 1.197
sys/nfs/nfs_bio.c: revision 1.191
sys/nfs/nfs_vfsops.c: revision 1.230
sys/nfs/nfs_clntsocket.c: revision 1.3
Remove useless and harmful sync(2) call in umount(8)
Remove sync(2) call before unmount(2) in umount(8). This sync(2) is useless
since unmount(2) will perform a VFS_SYNC anyway.
Moreover, this sync(2) may be harmful, as there are some situations where
it cannot return (unreachable NFS server, for instance), causing umount -f
to be ineffective.
Fix soft NFS force unmount
For many reasons, forcibly unmounting a soft NFS mount could hang forever.
Here are the fixes:
- Introduce decent timeouts in operations that awaited an NFS server reply.
- On timeout, fail operations on soft mounts with EIO.
- Introduce NFSMNT_DISMNTFORCE to let the filesystem know that a
force unmount is ongoing. This causes timeouts to be reduced and
prevents the NFS client from attempting to reconnect to the NFS server.
Also fix a race condition where some asynchronous I/O could reference
destroyed mount structures. We fix this by awaiting asynchronous I/O
to drain before proceeding.
Reviewed by Chuck Silvers.
 1.190.2.1  22-Sep-2015  skrll Sync with HEAD
 1.191.18.1  08-Apr-2020  martin Merge changes from current as of 20200406
 1.192.2.2  29-Feb-2020  ad Sync with head.
 1.192.2.1  17-Jan-2020  ad Sync with head.
 1.195.2.1  25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
