History log of /src/sys/fs/puffs
Revision  Date  Author  Comments
 1.1 22-Oct-2006  pooka branches: 1.1.2; 1.1.6; 1.1.8;
kernel portion of puffs - the Pass-to-Userspace Framework File System.
It contains the VFS attachment and userspace message-passing interface.

This work was initially started and completed for Google SoC 2005
and tweaked to work a bit better in the past few weeks. While
being far from complete, it is functional and stable enough to
host a fairly general-purpose in-memory file system in
userspace. Even so, puffs should be considered experimental and
no binary compatibility for interfaces or crash-freedom or zero
security implications should be relied upon just yet.

The GSoC project was mentored by William Studenmund and the final
review for the code was done by Christos.
 1.1.8.2 30-Dec-2006  yamt sync with head.
 1.1.8.1 22-Oct-2006  yamt file Makefile was added on branch yamt-lazymbuf on 2006-12-30 20:50:00 +0000
 1.1.6.2 10-Dec-2006  yamt sync with head.
 1.1.6.1 22-Oct-2006  yamt file Makefile was added on branch yamt-splraiseipl on 2006-12-10 07:18:38 +0000
 1.1.2.2 18-Nov-2006  ad Sync with head.
 1.1.2.1 22-Oct-2006  ad file Makefile was added on branch newlock2 on 2006-11-18 21:39:20 +0000
 1.1 18-May-2009  pooka branches: 1.1.2;
add some todo-items, based on a file which was lingering in my
local tree for apparently almost two years now
 1.1.2.2 20-Jun-2009  yamt sync with head
 1.1.2.1 18-May-2009  yamt file TODO was added on branch yamt-nfs-mp on 2009-06-20 07:20:30 +0000
 1.10 05-Feb-2019  pgoyette It turns out we do want the puffs compat code in any kernel which
has built-in compat_50 regardless of whether the kernel also has
puffs.

Should finally fix PR kern/53943
 1.9 04-Feb-2019  wiz try '&' instead of '&&'
 1.8 04-Feb-2019  pgoyette Don't include puffs_compat in a kernel unless the filesystem is
selected along with COMPAT_50. Also, don't include puffs_compat
in the main puffs filesystem module; it is part of the compat_50
module.

Should address PR kern/53943
 1.7 27-Jan-2019  pgoyette Merge the [pgoyette-compat] branch
 1.6 11-Oct-2014  uebayasi branches: 1.6.18; 1.6.20;
Define filesystem attributes with vfs dependency.
 1.5 06-Jul-2010  pooka branches: 1.5.18;
remember to add the new file to the build
 1.4 10-Nov-2007  pooka branches: 1.4.18; 1.4.38; 1.4.40;
Part 2/n of extensive changes to request transport to/from userspace:

Rip the transport code completely out of puffs and generalize it
into an independent module which will be used for multiple purposes
in the future. This module is called the Pass-to-Userspace
Transporter (known as "putter" among friends).

This is very much work-in-progress and one dependency with puffs
remains: the request framing format.

The device name is still /dev/puffs, but that will change soon.

Users of puffs need the following in their kernel configs now:
pseudo-device putter
 1.3 27-Sep-2007  pooka branches: 1.3.2; 1.3.4;
Split routines handling nodes from puffs_subr to puffs_node.
No functional change.
 1.2 05-Dec-2006  pooka branches: 1.2.2; 1.2.4; 1.2.10; 1.2.22; 1.2.24; 1.2.26;
shuffle functions around a bit: move the transport (/dev/puffs) to
a different file from the messaging (request contents). no functional
change
 1.1 22-Oct-2006  pooka branches: 1.1.2;
kernel portion of puffs - the Pass-to-Userspace Framework File System.
It contains the VFS attachment and userspace message-passing interface.

This work was initially started and completed for Google SoC 2005
and tweaked to work a bit better in the past few weeks. While
being far from complete, it is functional and stable enough to
host a fairly general-purpose in-memory file system in
userspace. Even so, puffs should be considered experimental and
no binary compatibility for interfaces or crash-freedom or zero
security implications should be relied upon just yet.

The GSoC project was mentored by William Studenmund and the final
review for the code was done by Christos.
 1.1.2.3 12-Jan-2007  ad Sync with head.
 1.1.2.2 18-Nov-2006  ad Sync with head.
 1.1.2.1 22-Oct-2006  ad file files.puffs was added on branch newlock2 on 2006-11-18 21:39:20 +0000
 1.2.26.1 06-Oct-2007  yamt sync with head.
 1.2.24.2 09-Jan-2008  matt sync with HEAD
 1.2.24.1 06-Nov-2007  matt sync with HEAD
 1.2.22.2 11-Nov-2007  joerg Sync with HEAD.
 1.2.22.1 02-Oct-2007  joerg Sync with HEAD.
 1.2.10.1 09-Oct-2007  ad Sync with head.
 1.2.4.4 15-Nov-2007  yamt sync with head.
 1.2.4.3 27-Oct-2007  yamt sync with head.
 1.2.4.2 30-Dec-2006  yamt sync with head.
 1.2.4.1 05-Dec-2006  yamt file files.puffs was added on branch yamt-lazymbuf on 2006-12-30 20:50:00 +0000
 1.2.2.2 10-Dec-2006  yamt sync with head.
 1.2.2.1 05-Dec-2006  yamt file files.puffs was added on branch yamt-splraiseipl on 2006-12-10 07:18:38 +0000
 1.3.4.1 19-Nov-2007  mjf Sync with HEAD.
 1.3.2.1 13-Nov-2007  bouyer Sync with HEAD
 1.4.40.1 05-Mar-2011  rmind sync with head
 1.4.38.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.4.18.1 11-Aug-2010  yamt sync with head.
 1.5.18.1 03-Dec-2017  jdolecek update from HEAD
 1.6.20.1 10-Jun-2019  christos Sync with HEAD
 1.6.18.2 22-Sep-2018  pgoyette Include the compat code whether or not the calling device or filesystem
exists.
 1.6.18.1 24-Mar-2018  pgoyette Add fs/puffs compat_50 to the modules
 1.8 12-Dec-2019  pgoyette Rather than keeping a separate mutex, condvar, and pserialize for each
module hook, we can share a common set of synchronization structures.
This cuts the amount of cacheline_aligned data for these structures by
50%.

Note that we still have a per-hook localcount, since we need to count
individual references.

As discussed with riastradh@

Welcome to 9.99.22 !
 1.7 01-Mar-2019  pgoyette Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.
 1.6 29-Jan-2019  pgoyette Normalize all the compat hooks' names to the form

<subsystem>_<function>_<version>_hook

NFCI

XXX Note that although this introduces a change in the kernel-to-
XXX module interface, we are NOT bumping the kernel version number.
XXX We will bump the version number once the interface stabilizes.
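The naming pattern above can be illustrated with a small token-pasting macro. This is only a sketch: DEMO_HOOK_NAME, demo_hook_fn, demo_double and the hook demo_frob_50_hook are hypothetical names made up for the example and are not the kernel's MODULE_HOOK machinery.

```c
/* Hypothetical helper: builds <subsystem>_<function>_<version>_hook. */
#define DEMO_HOOK_NAME(subsys, func, ver) subsys##_##func##_##ver##_hook

/* Hypothetical hook signature used only for this illustration. */
typedef int (*demo_hook_fn)(int);

/* Stand-in implementation that a compat module might install. */
int
demo_double(int x)
{
	return 2 * x;
}

/* Expands to a variable named demo_frob_50_hook. */
demo_hook_fn DEMO_HOOK_NAME(demo, frob, 50) = demo_double;
```

With a uniform pattern, a caller can derive the hook name from the subsystem, function and compat version without searching for its definition.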
 1.5 27-Jan-2019  pgoyette Merge the [pgoyette-compat] branch
 1.4 22-Apr-2015  pooka branches: 1.4.16; 1.4.18;
sprinkle COMPAT_50
 1.3 10-Nov-2014  maxv branches: 1.3.2;
Do not uselessly include <sys/malloc.h>.
 1.2 11-Jul-2010  pooka branches: 1.2.2; 1.2.4; 1.2.10; 1.2.24; 1.2.40;
Do fhtovp compat translation only for fhtovp ops, not all vfs ops.
Allocate trailing extra buffer for compat op too.
 1.1 06-Jul-2010  pooka Add compat to enable running puffs in a 64bit time_t kernel against
a server which runs in 32bit time_t namespace.
 1.2.40.1 17-Jan-2015  martin Pull up following revision(s) (requested by maxv in ticket #427):
sys/compat/svr4/svr4_schedctl.c: revision 1.8
sys/netinet/tcp_timer.c: revision 1.88
sys/miscfs/genfs/layer_vfsops.c: revision 1.45
sys/compat/svr4/svr4_ioctl.c: revision 1.37
sys/ufs/chfs/chfs_vfsops.c: revision 1.14
sys/miscfs/fdesc/fdesc_vfsops.c: revision 1.91
sys/compat/linux/arch/i386/linux_ptrace.c: revision 1.30
sys/compat/common/kern_time_50.c: revision 1.28
sys/netinet6/ip6_forward.c: revision 1.74
sys/miscfs/umapfs/umap_vnops.c: revision 1.57
sys/compat/svr4/svr4_fcntl.c: revision 1.74
distrib/sets/lists/comp/mi: revision 1.1931
sys/netinet6/udp6_output.c: revision 1.46
sys/fs/puffs/puffs_compat.c: revision 1.3
sys/fs/udf/udf_rename.c: revision 1.11
sys/compat/svr4/svr4_filio.c: revision 1.24
sys/fs/udf/udf_rename.c: revision 1.12
sys/netinet/tcp_usrreq.c: revision 1.202
sys/miscfs/umapfs/umap_subr.c: revision 1.29
sys/compat/linux/common/linux_fadvise64.c: revision 1.3
sys/netinet/if_atm.c: revision 1.34
sys/miscfs/procfs/procfs_subr.c: revision 1.106
sys/miscfs/genfs/layer_subr.c: revision 1.37
sys/netinet/tcp_sack.c: revision 1.30
sys/compat/freebsd/freebsd_misc.c: revision 1.33
sys/compat/freebsd/freebsd_file.c: revision 1.33
sys/ufs/chfs/chfs_vnode.c: revision 1.12
sys/compat/svr4/svr4_ttold.c: revision 1.34
sys/compat/linux/common/linux_file.c: revision 1.114
sys/compat/linux/arch/mips/linux_machdep.c: revision 1.43
sys/compat/linux/common/linux_signal.c: revision 1.76
sys/compat/common/compat_util.c: revision 1.46
sys/compat/linux/arch/arm/linux_ptrace.c: revision 1.18
sys/compat/svr4/svr4_sockio.c: revision 1.36
sys/compat/linux/arch/arm/linux_machdep.c: revision 1.32
sys/compat/svr4/svr4_signal.c: revision 1.66
sys/kern/kern_exec.c: revision 1.410
sys/fs/puffs/puffs_vfsops.c: revision 1.115
sys/compat/svr4/svr4_exec_elf64.c: revision 1.15
sys/compat/linux/arch/i386/linux_machdep.c: revision 1.159
sys/compat/linux/arch/alpha/linux_machdep.c: revision 1.50
sys/compat/linux32/common/linux32_misc.c: revision 1.24
sys/netinet/in_pcb.c: revision 1.153
sys/sys/malloc.h: revision 1.116
sys/compat/common/if_43.c: revision 1.9
share/man/man9/Makefile: revision 1.380
sys/netinet/tcp_vtw.c: revision 1.12
sys/miscfs/umapfs/umap_vfsops.c: revision 1.95
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.186
sys/compat/common/uipc_syscalls_43.c: revision 1.46
sys/ufs/ext2fs/ext2fs_vnops.c: revision 1.115
sys/fs/puffs/puffs_msgif.c: revision 1.97
sys/compat/svr4/svr4_ipc.c: revision 1.27
sys/compat/linux/common/linux_exec.c: revision 1.117
sys/ufs/ext2fs/ext2fs_readwrite.c: revision 1.66
sys/netinet/tcp_output.c: revision 1.179
sys/compat/svr4/svr4_termios.c: revision 1.28
sys/fs/udf/udf_strat_bootstrap.c: revision 1.4
sys/fs/puffs/puffs_subr.c: revision 1.67
sys/fs/puffs/puffs_node.c: revision 1.36
sys/miscfs/overlay/overlay_vnops.c: revision 1.21
sys/fs/cd9660/cd9660_node.c: revision 1.34
sys/netinet/raw_ip.c: revision 1.146
sys/sys/mallocvar.h: revision 1.13
sys/miscfs/overlay/overlay_vfsops.c: revision 1.63
share/man/man9/malloc.9: revision 1.50
sys/netinet6/dest6.c: revision 1.18
sys/compat/linux/common/linux_uselib.c: revision 1.33
sys/compat/linux/common/linux_socket.c: revision 1.120
share/man/man9/malloc.9: revision 1.51
sys/netinet/tcp_subr.c: revision 1.257
sys/compat/linux/common/linux_socketcall.c: revision 1.45
sys/compat/linux/common/linux_fadvise64_64.c: revision 1.3
sys/compat/freebsd/freebsd_ipc.c: revision 1.17
sys/compat/linux/common/linux_misc_notalpha.c: revision 1.109
sys/compat/linux/arch/alpha/linux_pipe.c: revision 1.17
sys/netinet6/in6_pcb.c: revision 1.132
sys/netinet6/in6_ifattach.c: revision 1.94
sys/compat/svr4/svr4_exec_elf32.c: revision 1.15
sys/miscfs/nullfs/null_vfsops.c: revision 1.90
sys/fs/cd9660/cd9660_util.c: revision 1.12
sys/compat/linux/arch/powerpc/linux_machdep.c: revision 1.48
sys/compat/freebsd/freebsd_exec_elf32.c: revision 1.20
sys/miscfs/procfs/procfs_vfsops.c: revision 1.94
sys/compat/linux/arch/powerpc/linux_ptrace.c: revision 1.28
sys/compat/linux/common/linux_sched.c: revision 1.67
sys/compat/linux/common/linux_exec_aout.c: revision 1.67
sys/compat/linux/common/linux_pipe.c: revision 1.67
sys/compat/linux/common/linux_llseek.c: revision 1.34
sys/compat/linux/arch/mips/linux_ptrace.c: revision 1.10
Do not uselessly include <sys/malloc.h>.
Cleanup:
- remove struct kmembuckets (dead)
- correctly deadify MALLOC_XX
- remove MALLOC_DEFINE_LIMIT and MALLOC_JUSTDEFINE_LIMIT (dead)
- remove malloc_roundup(), malloc_type_setlimit(), MALLOC_DEFINE_LIMIT()
and MALLOC_JUSTDEFINE_LIMIT() from man 9 malloc
New sentence, new line. Bump date for previous.
Obsolete malloc_roundup(9), malloc_type_setlimit(9) and MALLOC_DEFINE_LIMIT(9)
man pages.
 1.2.24.1 03-Dec-2017  jdolecek update from HEAD
 1.2.10.2 05-Mar-2011  rmind sync with head
 1.2.10.1 11-Jul-2010  rmind file puffs_compat.c was added on branch rmind-uvmplock on 2011-03-05 20:55:07 +0000
 1.2.4.2 17-Aug-2010  uebayasi Sync with HEAD.
 1.2.4.1 11-Jul-2010  uebayasi file puffs_compat.c was added on branch uebayasi-xip on 2010-08-17 06:47:19 +0000
 1.2.2.2 11-Aug-2010  yamt sync with head.
 1.2.2.1 11-Jul-2010  yamt file puffs_compat.c was added on branch yamt-nfs-mp on 2010-08-11 22:54:34 +0000
 1.3.2.1 06-Jun-2015  skrll Sync with HEAD
 1.4.18.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.4.18.1 10-Jun-2019  christos Sync with HEAD
 1.4.16.6 23-Jan-2019  pgoyette Convert the macros for setting and unsetting a hook to generate
in-line code rather than using an intermediary hook##set routine.
Hooks are set and unset only in one place, so the intermediary
routine provides no benefit. IMHO using the macro at the point-
of-call is more readable than using it elsewhere in the code and
then calling the generated intermediary routine (for which you
won't even find its declaration or definition unless you remember
to search for the HOOK_SET macro instead).

NFC intended, will verify with a bulk build and an atf test run.
 1.4.16.5 14-Jan-2019  pgoyette Create a variant of the HOOK macros that handles hook routines of
type void, and use them where appropriate.
 1.4.16.4 13-Jan-2019  pgoyette Remove the HOOK2 versions of the MODULE_HOOK macros. There were
only a few uses, and using them led to some lack of clarity in the
code. Instead, we now use two separate hooks, with names that
make it clear(er) what we're doing.

This also positions us to start unraveling some of the rtsock_50
mess, which will need (at least) five hooks.
 1.4.16.3 18-Sep-2018  pgoyette The COMPAT_HOOK macros were renamed to MODULE_HOOK, adjust all callers
 1.4.16.2 17-Sep-2018  pgoyette Adapt (most of) the indirect function pointers to the new MP-safe
mechanism. Still remaining are the compat_netbsd32 stuff, and
some usb subroutines.
 1.4.16.1 24-Mar-2018  pgoyette Add fs/puffs compat_50 to the modules
 1.108 01-Feb-2025  andvar s/furher/further/ in comment.
 1.107 09-Feb-2024  andvar branches: 1.107.2;
fix spelling mistakes, mainly in comments and log messages.
 1.106 15-May-2020  maxv hardclock_ticks -> getticks()
 1.105 23-Feb-2020  ad UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
 1.104 01-Mar-2019  pgoyette branches: 1.104.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.
 1.103 29-Jan-2019  pgoyette Normalize all the compat hooks' names to the form

<subsystem>_<function>_<version>_hook

NFCI

XXX Note that although this introduces a change in the kernel-to-
XXX module interface, we are NOT bumping the kernel version number.
XXX We will bump the version number once the interface stabilizes.
 1.102 27-Jan-2019  pgoyette Merge the [pgoyette-compat] branch
 1.101 17-Apr-2017  hannken branches: 1.101.10; 1.101.12;
Add vfs_ref(mp) and vfs_rele(mp) to add or remove a reference to
struct mount. Rename vfs_destroy(mp) to vfs_rele(mp) and replace
incrementing mp->mnt_refcnt with vfs_ref(mp).
 1.100 26-Dec-2016  skrll branches: 1.100.2;
Hold the interlock when calling cv_broadcast as per condvar(9)
 1.99 07-Jul-2016  msaitoh branches: 1.99.2;
KNF. Remove extra spaces. No functional change.
 1.98 06-May-2015  hannken Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@
 1.97 10-Nov-2014  maxv branches: 1.97.2;
Do not uselessly include <sys/malloc.h>.
 1.96 05-Sep-2014  matt Don't use C++ class and this keywords as variables.
 1.95 28-Aug-2014  hannken Change puffs from hashlist to vcache.
- field "pa_nhashbuckets" of struct "puffs_kargs" becomes a no-op.
and should be removed on the next protocol version bump.
 1.94 17-Oct-2013  christos branches: 1.94.4;
- remove unused variables
- add _NOERROR flavor macros for the case where errors are ignored.
 1.93 05-Nov-2012  dholland branches: 1.93.2;
Excise struct componentname from the namecache.

This uglifies the interface, because several operations need to be
passed the namei flags and cache_lookup also needs for the time being
to be passed cnp->cn_nameiop. Nonetheless, it's a net benefit.

The glop should be able to go away eventually but requires structural
cleanup elsewhere first.

This change requires a kernel bump.
 1.92 27-Jul-2012  manu branches: 1.92.2;
Rename slow sopreq queue into node sopreq queue, to reflect the fact that
it is only intended for postponed node reclaims.
When purging the node sopreq queue, do not call puffs_msg_sendresp(), as
it makes no sense.
 1.91 22-Jul-2012  manu Fix hang unmount bug introduced by last commit.

We introduced a slow queue for delayed reclaims, while the existing
queue for unmount, flush and exit has been renamed fast queue. Both
queues had timestamp for when an operation should be done, but it was
useless for the fast queue, which is always used to run an operation
ASAP. And the timestamp test had an error that turned ASAP into "at next
tick", but nobody was there to wake the thread at next tick, hence
the hang. The fix is to remove the useless and buggy timestamp test for
fast queue.
 1.90 21-Jul-2012  manu - Improve PUFFS_KFLAG_CACHE_FS_TTL by reclaiming older inactive nodes.

The normal kernel behavior is to retain inactive nodes in the freelist
until it runs out of vnodes. This has some merit for local filesystems,
where the cost of an allocation is about the same as the cost of a
lookup. But that situation is not true for distributed filesystems.
On the other hand, keeping inactive nodes for a long time holds memory
in the file server process, and when the kernel runs out of vnodes, it
produces reclaim avalanches that increase latency for other operations.

We do not reclaim inactive vnodes immediately either, as they may be
looked up again shortly. Instead we introduce a grace time and we
reclaim nodes that have been inactive beyond the grace time.

- Fix lookup/reclaim race condition.

The above improvement uncovered a race condition between lookup and
reclaim. If we reclaimed a vnode associated with a userland cookie while
a lookup returning that same cookie was in progress, then the kernel ends
up with a vnode associated with a cookie that has been reclaimed in
userland. Next operation on the cookie will crash (or at least confuse)
the filesystem.

We fix this by introducing a lookup count in kernel and userland. On
reclaim, the kernel sends the count, which enables userland to detect
situations where it initiated a lookup that is not completed in kernel.
In such a situation, the reclaim must be ignored, as the node is about
to be looked up again.
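The lookup-count check described above could be sketched roughly as follows, seen from the file server's side. All names here (demo_node, demo_lookup_reply, demo_reclaim) are hypothetical and the real puffs structures and message fields differ; the only point is the comparison of the two counts.

```c
#include <stdbool.h>

/* Hypothetical userland-side record for one cookie. */
struct demo_node {
	unsigned int lookups_answered;	/* lookups the server has replied to */
};

/* The server answers a lookup that returns this cookie. */
void
demo_lookup_reply(struct demo_node *n)
{
	n->lookups_answered++;
}

/*
 * The kernel requests a reclaim and reports how many lookups it has
 * seen complete for this cookie. If the server has answered more
 * lookups than that, one is still in flight in the kernel and the
 * reclaim must be ignored. Returns true if the node may be freed.
 */
bool
demo_reclaim(const struct demo_node *n, unsigned int kernel_lookups_seen)
{
	return n->lookups_answered == kernel_lookups_seen;
}
```

In this sketch a mismatch simply means "keep the node"; the pending lookup will make the kernel reference the cookie again.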
 1.89 19-Oct-2011  manu branches: 1.89.2; 1.89.8;
Remove #ifdef DIAGNOSTIC guards around KASSERT, as the macro contains them
 1.88 18-Oct-2011  manu Make sure pagedaemon does not sleep for memory in puffs_vnop_sleep.
Add KASSERT on any sleeping memory allocation to check it cannot happen again.
 1.87 03-Jul-2011  mrg avoid some uninitialised variable warnings from GCC.
at least the puffs one seems valid, but i'm not 100% sure.
 1.86 12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.85 11-Feb-2011  yamt branches: 1.85.2;
puffs_msg_wait: check PARKFLAG_HASERROR before PARKFLAG_CALL. PR/44240.
 1.84 15-Nov-2010  pooka branches: 1.84.2; 1.84.4;
Apply patch from PR kern/44093 by yamt:

Interrupt server wait only on certain signals (same set as nfs -i)
instead of all signals. According to the PR this helps with
"git clone" run on a puffs file system.
 1.83 12-Nov-2010  pooka Allow clients to reuse a "park".

Patch from <yamt>, fixes PR kern/44086 by him.
 1.82 06-Jul-2010  pooka Remove groolingly spooky variable which has been haunting us for
several years without doing anything useful.
 1.81 06-Jul-2010  pooka Add compat to enable running puffs in a 64bit time_t kernel against
a server which runs in 32bit time_t namespace.
 1.80 14-Jan-2010  pooka branches: 1.80.2; 1.80.4;
In case the operations thread has exited, do not queue any more
operations. This prevents kernel memory leaks (one of which happened
every time the file system was unmounted via PUFFSOP_UNMOUNT ...
and incidentally would've been trivially caught with the old
malloc(9) interface. I wonder if the message is to use a ton of
pools instead of regression-attractive kmem interface).
 1.79 07-Jan-2010  pooka Rename PUFFS_SOPREQ_EXIT to PUFFS_SOPREQSYS_EXIT to better signal
it comes from within the kernel instead of as a direct result of
a user request.

no functional change
 1.78 07-Jan-2010  pooka Fix variable name in my commit tree too.
 1.77 07-Jan-2010  pooka Add a PUFFS_UNMOUNT server->kernel request, which causes the kernel
to initiate self destruct, i.e. unmount(MNT_FORCE). This, however,
is a semi-controlled self-destruct, since all caches are flushed
before the (possibly) violent unmount takes place.
 1.76 07-Dec-2009  pooka Process flush requests from the file server in a separate thread
context. This fixes a long-standing but seldom seen deadlock,
where the kernel was holding pages busy (due to e.g. readahead
request) while waiting for the server to respond, and the server
made a callback into the kernel asking to invalidate those pages.
... or, well, theoretically fixes, since I didn't have any reliable
way of repeating the deadlock and I think I saw it only twice.
 1.75 07-Dec-2009  pooka Need to send protocol layer response instead of transport layer
return value. While there, just collapse all non-supported types
into one entry.
 1.74 05-Nov-2009  pooka Kill suspend support. It was never implemented correctly:
* it depended on the biglock (in a very cruel way)
* it was attached to userspace transactions rather than logical
fs operations

(If someone wants to revisit it some day, most of the stuff can be
reused from cvs history)
 1.73 18-Mar-2009  cegger Ansify function definitions w/o arguments. Generated with sed.
 1.72 25-Sep-2008  ad branches: 1.72.2; 1.72.4; 1.72.8; 1.72.12;
PR kern/39307 (mfs will sometimes panic at umount time)

Change dounmount() so that it never drops the caller provided reference.
Garbage collecting 'struct mount' is up to the caller.
 1.71 06-May-2008  ad branches: 1.71.2; 1.71.6;
PR kern/38141 lookup/vfs_busy acquire rwlock recursively

Simplify the mount locking. Remove all the crud to deal with recursion on
the mount lock, and crud to deal with unmount as another weirdo lock.

Hopefully this will once and for all fix the deadlocks with this. With this
commit there are two locks on each mount:

- krwlock_t mnt_unmounting. This is used to prevent unmount across critical
sections like getnewvnode(). It's only ever read locked with rw_tryenter(),
and is only ever write locked in dounmount(). A write hold can't be taken
on this lock if the current LWP could hold a vnode lock.

- kmutex_t mnt_updating. This is taken by threads updating the mount, for
example when going r/o -> r/w, and is only present to serialize updates.
In order to take this lock, a read hold must first be taken on
mnt_unmounting, and the two need to be held across the operation.

One effect of this change: previously if an unmount failed, we would make a
half hearted attempt to back out of it gracefully, but that was unlikely to
work in a lot of cases. Now while an unmount that will be aborted is in
progress, new file operations within the mount will fail instead of being
delayed. That is unlikely to be a problem though, because if the admin
requests unmount of a file system then s(he) has made a decision to deny
access to the resource.
 1.70 30-Apr-2008  ad PR kern/38135 vfs_busy/vfs_trybusy confusion

The previous fix worked, but it opened a window where mounts could have
disappeared from mountlist while the caller was traversing it using
vfs_trybusy(). Fix that.
 1.69 29-Apr-2008  ad kern/38135 vfs_busy/vfs_trybusy confusion

The symptom was that sometimes file systems would occasionally not appear
in output from 'df' or 'mount' if the system was busy. Resolution:

- Make mount locks work somewhat like vm_map locks.
- vfs_trybusy() now only fails if the mount is gone, or if someone is
unmounting the file system. Simple contention on mnt_lock doesn't
cause it to fail.
- vfs_busy() will wait even if the file system is being unmounted.
 1.68 31-Jan-2008  tnn branches: 1.68.6; 1.68.8; 1.68.10;
- Needs sys/atomic.h for atomic_inc_uint()
- Quench compiler warning about signed/unsigned mismatch when building LKM
 1.67 30-Jan-2008  ad Expunge references to lockmgr.
 1.66 30-Jan-2008  ad Make it compile. I'll leave it to pooka to figure out what is the correct
thing here because I don't understand what this code is doing.
 1.65 30-Jan-2008  ad PR kern/37706 (forced unmount of file systems is unsafe):

- Do reference counting for 'struct mount'. Each vnode associated with a
mount takes a reference, and in turn the mount takes a reference to the
vfsops.
- Now that mounts are reference counted, replace the overcomplicated mount
locking inherited from 4.4BSD with a recursable rwlock.
 1.64 28-Jan-2008  pooka For code clarity typedef void *puffs_cookie_t.

No functional change.
 1.63 02-Jan-2008  pooka silence gcc about break type-punning
 1.62 02-Jan-2008  ad Merge vmlocking2 to head.
 1.61 05-Dec-2007  pooka branches: 1.61.4;
Send a response message for flush operations from the kernel instead
of abusing the return value of write(2).
 1.60 26-Nov-2007  pooka branches: 1.60.2;
In case the userspace wait is interrupted, don't use ERESTART as
the return value, rather use EINTR.

reported by Reinoud
 1.59 20-Nov-2007  pooka Retire M_PUFFS, use kmem(9) instead.
 1.58 17-Nov-2007  pooka fix some debug prints
 1.57 16-Nov-2007  pooka Restructure the messaging interface a bit more: make all interfacing
with the file server happen through puffs_msg_enqueue() and
puffs_msg_wait() instead of having a billion different routines.
Build the existing system upon these two. Most importantly though,
decouple insertion into the op queue from the actual wait. This
is useful for a number of reasons coming soon to a cvs repo near you.
 1.56 12-Nov-2007  pooka Bounds-check responses from userspace.
 1.55 12-Nov-2007  pooka * split the putter header into a kernel version and a userland version
+ install latter to /usr/include/dev/putter
* remove last dependencies to puffs from putter, it's completely
independent now
 1.54 12-Nov-2007  pooka Move putter code from directly under dev/ to dev/putter/

no functional change
 1.53 10-Nov-2007  pooka Part 2/n of extensive changes to request transport to/from userspace:

Rip the transport code completely out of puffs and generalize it
into an independent module which will be used for multiple purposes
in the future. This module is called the Pass-to-Userspace
Transporter (known as "putter" among friends).

This is very much work-in-progress and one dependency with puffs
remains: the request framing format.

The device name is still /dev/puffs, but that will change soon.

Users of puffs need the following in their kernel configs now:
pseudo-device putter
 1.52 07-Nov-2007  ad Merge from vmlocking:

- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
 1.51 04-Nov-2007  pooka branches: 1.51.2;
Make some comments match current reality. No functional change.
 1.50 25-Oct-2007  pooka Reference mountpoint when fetching operations and release waiters
in unmount.
 1.49 21-Oct-2007  pooka Always provide caller information from the kernel based on curlwp.
(but don't deprecate the old puffs_cid interface just yet)
 1.48 19-Oct-2007  pooka When doing a read operation, don't copy the whole kernel buffer to
userspace, since it doesn't contain any information yet. I should
still rework this more so this is just a quickie to get the read/write
style interface more up to speed with the ioctl version.
 1.47 11-Oct-2007  pooka branches: 1.47.2;
Handle suspend and flush requests from the file server.
 1.46 11-Oct-2007  pooka Part 1/n of some pretty extensive changes to how the kernel module
interacts with the userspace file server:

* since the kernel-user communication is not purely request-response
anymore (hasn't been since 2006), try to rename some "request" to
"message". more similar mangling will take place in the future.

* completely rework how messages are allocated. previously most of
them were borrowed from the stack (originally *all* of them),
but now always allocate dynamically. this makes the structure
of the code much cleaner. also makes it possible to fix a
locking order violation. it enables plenty of future enhancements.

* start generalizing the transport interface to be independent of puffs

* move transport interface to read/write instead of ioctl. the
old one had legacy design problems, and besides, ioctl's suck.
implement a very generic version for now; this will be
worked on later hopefully some day reaching "highly optimized".

* implement libpuffs support behind existing library request
interfaces. this will change eventually (I hate those interfaces)
 1.45 09-Oct-2007  pooka g/c vntouser_req(), it's not used anymore
 1.44 04-Oct-2007  pooka g/c the "sizeop" code previous used for ioctl/fcntl. It was already
commented out and has bitrotted beyond all recognition, so it needs
complete rethinking.
 1.43 02-Oct-2007  pooka If kernel resource allocation fails after the file server has
committed something, issue an abort. The abort is done through
the regular op channel, e.g. failed mkdir leads to regular rmdir,
inactive and reclaim. No internal interface is planned currently
for the one file system out of a million which would implement it
to benefit from the one case in a billion where kernel resource
allocation actually does fail and out of that one case in a trillion
where internal vs. external would make a difference.
 1.42 01-Oct-2007  pooka * better error checking: validate error values received from userland
to be valid errno values
* include string describing error in PUFFS_ERR
* get rid of union in puffs_req, it's nothing but trouble
* pass pmp to async i/o callbacks
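Validating an error value received from userland might look like the sketch below. It assumes a hypothetical upper bound DEMO_ERRNO_MAX and a hypothetical function demo_checkerr; which replacement error the kernel actually uses is not stated in the log, so EPROTO here is an assumption made for the example.

```c
#include <errno.h>

/* Assumption: largest errno value accepted in this illustration. */
#define DEMO_ERRNO_MAX 96

/*
 * Return the error unchanged if it looks like a valid errno value
 * (0 meaning success), otherwise substitute a protocol error so that
 * garbage from the file server never reaches the rest of the kernel.
 */
int
demo_checkerr(int error)
{
	if (error < 0 || error > DEMO_ERRNO_MAX)
		return EPROTO;	/* assumed replacement value */
	return error;
}
```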
 1.41 27-Sep-2007  pooka Add error notifications, which are used to deliver errors from the
kernel to the file server for silly things the file server did,
e.g. attempting to create a file with size VSIZENOTSET. The file
server can handle these as it chooses, but the default action is
for it to throw its hands in the air and sing "goodbye, cruel world,
it's over, walk on by".
 1.40 19-Jul-2007  pooka branches: 1.40.4; 1.40.6; 1.40.8; 1.40.10;
add debug printf
 1.39 09-Jul-2007  ad branches: 1.39.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.38 06-Jun-2007  pooka Move puffs to a two clause license where it already isn't so. And
as agc pointed out, even files with the third clause were already
effectively two clause because of a slight bug in the language...
 1.37 18-May-2007  pooka Introduce noref setbacks, which the file server can use to signal
the kernel it has 0 references to the node in question. In other
words, this can be used to avoid inactive(), or, if the file server
does not implement inactive, prompt reclaim for removed nodes.
 1.36 08-May-2007  pooka If the op was interrupted, decrease ops waiting for fetch from the
file server only if the op was still waiting for fetch (as opposed
to waiting for the response). Also, properly flag the possible
following inactive as an op for which we do not want to wait for
the response from the file server.
 1.35 07-May-2007  pooka Introduce puffs "setbacks", which can be used to set certain flags
for nodes upon return from the userspace. Currently it can be used
to indicate that the file server should be notified of "inactive"
in case the file server has opted to not receive inactive every
time the reference count for a vnode drops to zero. (inactive is
a common event, almost never requires any action and must be executed
synchronously, so it is wasteful).

While doing this, cleanup the release-relock nonsense from the
vntouser*() arguments. It was never enabled and the whole LOCKEDVP()
concept was very broken to begin with.
 1.34 01-May-2007  pooka Fix a problem introduced when I converted puffs to use newlock2:
when unmounting the file system in case of a certain timing (and
possibly some other conditions), a thread would wait on a condition
variable, while another thread broadcast the cv and immediately
proceeded to destroy it. The result was a system frozen completely
solid shortly after the process waiting for the cv woke up. So
introduce reference counting to synchronize destruction of the
resources in unmount.

I was able to repeat the problem only on my laptop in some special
cases, so I do not know how common it was. Ironically, killing
the file server process violently instead of unmount() didn't have
this problem because it never entered the unmount path from two
directions.
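The reference counting described in 1.34 can be illustrated with a small userspace sketch. It uses POSIX threads instead of the kernel's newlock2 primitives, and every name in it (struct sketch_mount, sketch_ref(), sketch_rele() and so on) is hypothetical. The idea is that a thread takes a reference before it may sleep on the condition variable and drops it after waking, so the thread that broadcasts can never destroy the cv under a sleeper.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-mount state; not the puffs structure. */
struct sketch_mount {
	pthread_mutex_t lock;
	pthread_cond_t cv;
	int refs;	/* protected by lock */
	bool dying;	/* protected by lock */
};

/* Allocate the state with one reference held by the caller. */
struct sketch_mount *
sketch_create(void)
{
	struct sketch_mount *m = malloc(sizeof(*m));

	if (m == NULL)
		return NULL;
	pthread_mutex_init(&m->lock, NULL);
	pthread_cond_init(&m->cv, NULL);
	m->refs = 1;
	m->dying = false;
	return m;
}

/* Take an extra reference, e.g. before sleeping on the cv. */
void
sketch_ref(struct sketch_mount *m)
{
	pthread_mutex_lock(&m->lock);
	m->refs++;
	pthread_mutex_unlock(&m->lock);
}

/* Drop a reference; only the last holder destroys. Returns true if it did. */
bool
sketch_rele(struct sketch_mount *m)
{
	bool last;

	pthread_mutex_lock(&m->lock);
	assert(m->refs > 0);
	last = (--m->refs == 0);
	pthread_mutex_unlock(&m->lock);

	if (last) {
		/* No other holder exists, so nobody can be using the cv. */
		pthread_cond_destroy(&m->cv);
		pthread_mutex_destroy(&m->lock);
		free(m);
	}
	return last;
}

/* Unmount side: flag the state as dying and wake all sleepers. */
void
sketch_mark_dying(struct sketch_mount *m)
{
	pthread_mutex_lock(&m->lock);
	m->dying = true;
	pthread_cond_broadcast(&m->cv);
	pthread_mutex_unlock(&m->lock);
}

/* Waiter side: the caller must hold a reference for the whole call. */
void
sketch_wait_dying(struct sketch_mount *m)
{
	pthread_mutex_lock(&m->lock);
	while (!m->dying)
		pthread_cond_wait(&m->cv, &m->lock);
	pthread_mutex_unlock(&m->lock);
}
```

A waiter would call sketch_ref(), then sketch_wait_dying(), then sketch_rele(); the unmounting thread calls sketch_mark_dying() followed by its own sketch_rele(). Whichever side drops the last reference performs the destruction.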
 1.33 24-Apr-2007  pooka remember to flag park as done when we're done with it
 1.32 22-Apr-2007  pooka Now that puffs_park is allocated from the heap and actually freed
by the userdead routine, don't do a TAILQ_FOREACH but rather an
honest for loop.
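The reason for 1.32 is that TAILQ_FOREACH reads the next pointer from the current element after the loop body has run, which is unsafe once the body frees that element. A minimal sketch of the manual loop, assuming the <sys/queue.h> TAILQ macros and using hypothetical names (struct sketch_park, sketch_add(), sketch_drain()) rather than the real puffs_park code:

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical queue element; not the real struct puffs_park. */
struct sketch_park {
	int id;
	TAILQ_ENTRY(sketch_park) entries;
};
TAILQ_HEAD(sketch_parkq, sketch_park);

/* Append a heap-allocated element; returns 0 on success, -1 on failure. */
int
sketch_add(struct sketch_parkq *q, int id)
{
	struct sketch_park *park = malloc(sizeof(*park));

	if (park == NULL)
		return -1;
	park->id = id;
	TAILQ_INSERT_TAIL(q, park, entries);
	return 0;
}

/* Remove and free every element; returns the number of elements freed. */
int
sketch_drain(struct sketch_parkq *q)
{
	struct sketch_park *park, *next;
	int n = 0;

	assert(q != NULL);
	for (park = TAILQ_FIRST(q); park != NULL; park = next) {
		/* Fetch the successor before the element is freed. */
		next = TAILQ_NEXT(park, entries);
		TAILQ_REMOVE(q, park, entries);
		free(park);
		n++;
	}
	return n;
}
```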
 1.31 21-Apr-2007  pooka Take care not to access park->park_preq if the waiter is gone, as
that memory is no longer available.
 1.30 20-Apr-2007  pooka don't mutex_enter() manually, we've already park_reference()d a few
lines earlier for entering the same mutex
 1.29 11-Apr-2007  pooka make overspammy debug printf less overspammy
 1.28 04-Apr-2007  pooka Fix one more bug from today's commit: don't remove the op for which
getops runs out of file server buffer space from the request queue.
Otherwise that operation silently vanishes and things go, well, quite
wrong.
 1.27 04-Apr-2007  pooka fix two loop mutex botches in previous
 1.26 04-Apr-2007  pooka Make it possible to interrupt waiters for fs operation completion
again. This is useful until locking is further developed and basically
any deadlocks can be solved by killing appropriate processes.

Thanks especially to Tommi Kyntola and Antti Louko for sitting down
with me and discussing resource ownership and locking strategies
in implementing this.
 1.25 04-Apr-2007  pooka s/ppark/park/ to make all the variable names consistent - park is
always a pointer now. no functional change
 1.24 30-Mar-2007  pooka * abstract ASYNCBIOREAD and let callers freely issue a callback called
from putop. even though there's only one user currently, makes code
more readable
* move "delta" to a standard parameter in vntouser and get rid of the
specialcase vntouser_delta
 1.23 29-Mar-2007  pooka in userdead assign waiter return value only if there is a waiter for
a particular request
 1.22 29-Mar-2007  pooka Convert spinlocks & sleep/wakeup to newlock2 locking stuff. Fix a
bunch of bugs.

* park structures are now always allocated from a pool instead of a
mixed stack/malloc allocation
* get rid of the whole adjbuf concept, always just alloc the maximal
amount of memory to satisfy a request
* little regression: don't allow interrupting wait from file system
to userspace; this had problems already before, but now the problems
really started to shine through. I'll try to make this work again
some day.
* fix bmap to return a sensible value in runp
 1.21 20-Mar-2007  pooka * rework the page cache interaction a bit: cache metadata in the
kernel and flush it out all at once instead of continuous updating
* add support for delivering notifications to the file server about
when a page was written to (but disabled by default for now). the
file server can use this to request flushing or invalidating the
kernel page cache
 1.20 14-Mar-2007  pooka branches: 1.20.2;
Support B_READ|B_ASYNC in strategy by calling biodone() directly
when the file server puts the result.
 1.19 27-Feb-2007  pooka branches: 1.19.2; 1.19.4;
Make wait for the user file server PCATCHable. This makes it
possible to recover the system by just killing processes in case
a file server manages to recurse into itself either by fault of
file server implementation or by pilot error. The downside is that
the code is extremely hard to follow and practically screams out
for newlock2 (in addition to screaming "bug here"). The whole
PCATCH nonsense and induced megacomplexity can hopefully be avoided
in the future by tweaking other parts of the implementation.
 1.18 03-Feb-2007  pooka branches: 1.18.2;
fstrans owner automatically gets a normal lock, don't need to lazy lock

pointed out by hannken
 1.17 29-Jan-2007  hannken Change fstrans enum types to upper case.
No functional change.

From Antti Kantee <pooka@netbsd.org>
 1.16 26-Jan-2007  pooka Initial attempt at suspend/snapshot support for userspace file
servers. This is still pretty much on the level "if it breaks ...".
It should work for single-threaded servers which handle one operation
from start to finish in one go. Also, it does not yet totally
correctly synchronize metadata and data in some cases. So needless
to say, it needs improvement, but it is possible that will have to
wait for some lock revampage.
 1.15 19-Jan-2007  pooka debug print requests going into the queue
 1.14 15-Jan-2007  pooka Store puffs_node's on lists hashed with the cookie value instead
of just one flat list.
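A rough sketch of what hashing nodes by cookie value, as in 1.14, might look like. The bucket count, the hash function and all names here are assumptions for illustration, not the layout puffs actually used:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

#define SKETCH_NBUCKETS 64	/* hypothetical; must be a power of two */

/* Hypothetical node; the cookie is an opaque value chosen by the server. */
struct sketch_node {
	void *cookie;
	LIST_ENTRY(sketch_node) hashent;
};
LIST_HEAD(sketch_bucket, sketch_node);

struct sketch_table {
	struct sketch_bucket buckets[SKETCH_NBUCKETS];
};

/* Map a cookie to a bucket index; low bits are dropped as likely alignment. */
static size_t
sketch_hash(void *cookie)
{
	uintptr_t v = (uintptr_t)cookie;

	return (size_t)((v >> 4) & (SKETCH_NBUCKETS - 1));
}

void
sketch_table_init(struct sketch_table *t)
{
	for (size_t i = 0; i < SKETCH_NBUCKETS; i++)
		LIST_INIT(&t->buckets[i]);
}

/* Link a node into the bucket selected by its cookie. */
void
sketch_insert(struct sketch_table *t, struct sketch_node *n)
{
	assert(n != NULL);
	LIST_INSERT_HEAD(&t->buckets[sketch_hash(n->cookie)], n, hashent);
}

/* Search only the bucket the cookie hashes to; NULL if not present. */
struct sketch_node *
sketch_lookup(struct sketch_table *t, void *cookie)
{
	struct sketch_node *n;

	LIST_FOREACH(n, &t->buckets[sketch_hash(cookie)], hashent) {
		if (n->cookie == cookie)
			return n;
	}
	return NULL;
}
```

Compared with one flat list, a lookup only walks the nodes that share a bucket with the cookie being searched for.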
 1.13 29-Dec-2006  pooka branches: 1.13.2;
Don't allow calls to be queued while MOUNTING. We don't make any
kernel->server calls at that time and it allows a window where
operations use an incorrect root node cookie.

XXX: there's still a (very much smaller and biglock safe) race, but
that's going to be solved by some more thorough restructuring
 1.12 10-Dec-2006  pooka Don't return EWOULDBLOCK in case we have delivered some requests
even if we are operating on a nonblocking descriptor.
 1.11 10-Dec-2006  pooka PCATCH in tsleep while waiting for operations in getop. Otherwise
we could end up in an unkillable deadlock if GETOP was called when
an operation that had locked the root vnode was already in userspace.
 1.10 05-Dec-2006  pooka branches: 1.10.2;
shuffle functions around a bit: move the transport (/dev/puffs) to
a different file from the messaging (request contents). no functional
change
 1.9 05-Dec-2006  pooka Allow multiple requests to be transferred in each GET/PUTOP. For
a single request, the performance is still the same.
 1.8 21-Nov-2006  pooka if we are going to bail due to the mountpoint being gone from under
us while waiting for syncer lock, release the newly acquired syncer
lock prior to bailing
 1.7 21-Nov-2006  pooka cosmetics
 1.6 14-Nov-2006  pooka branches: 1.6.2;
Fix a race condition with unmount where the mountpoint might disappear
from under us while waiting for syncer_lock and before we got to vfs_busy.
This happens easily e.g. when the userspace server loses its will to
live in VOP_RECLAIM, which is called from vflush() in VFS_UNMOUNT. We
get two competing unmounters. When the first one finishes, it releases
syncer_lock. Now the second one tries to vfs_busy(), but is greeted
with garbage in *mp.

XXX: Technically this is a more general issue and should be fixed
elsewhere, but it's hard to trigger it with normal file systems
unless they are unmounted "simultaneously" twice and are dirty
enough for flushing to take a while. So make a note about it in
the little black book next to the poems and postpone the crusade
for now.
 1.5 09-Nov-2006  pooka few renames to better differentiate between mount & start.. plus some
other renaming
 1.4 07-Nov-2006  pooka attach to genfs & support page cache. most noticeable effect is
mmap and therefore execution of binaries starting to work, some
speed improvements with large file I/O also. caching semantics
and error case handling most likely need revisiting.
 1.3 06-Nov-2006  pooka puffs_park always contains a specific puffs_req, so make it a member
instead of a pointer
 1.2 25-Oct-2006  pooka If the control descriptor is closed, mark userspace dead and wakeup
all waiters *before* trying to get the syncer lock necessary for
dounmount(). This prevents a deadlock if the userspace server dies
while the syncer is running.
 1.1 22-Oct-2006  pooka kernel portion of puffs - the Pass-to-Userspace Framework File System.
It contains the VFS attachment and userspace message-passing interface.

This work was initially started and completed for Google SoC 2005
and tweaked to work a bit better in the past few weeks. While
being far from complete, it is functional and stable enough to
host a fairly general-purpose in-memory file system in
userspace. Even so, puffs should be considered experimental and
no binary compatibility for interfaces or crash-freedom or zero
security implications should be relied upon just yet.

The GSoC project was mentored by William Studenmund and the final
review for the code was done by Christos.
 1.6.2.5 09-Feb-2007  ad Sync with HEAD.
 1.6.2.4 01-Feb-2007  ad Sync with head.
 1.6.2.3 12-Jan-2007  ad Sync with head.
 1.6.2.2 18-Nov-2006  ad Sync with head.
 1.6.2.1 14-Nov-2006  ad file puffs_msgif.c was added on branch newlock2 on 2006-11-18 21:39:20 +0000
 1.10.2.3 18-Dec-2006  yamt sync with head.
 1.10.2.2 10-Dec-2006  yamt sync with head.
 1.10.2.1 05-Dec-2006  yamt file puffs_msgif.c was added on branch yamt-splraiseipl on 2006-12-10 07:18:38 +0000
 1.13.2.9 04-Feb-2008  yamt sync with head.
 1.13.2.8 21-Jan-2008  yamt sync with head
 1.13.2.7 07-Dec-2007  yamt sync with head
 1.13.2.6 15-Nov-2007  yamt sync with head.
 1.13.2.5 27-Oct-2007  yamt sync with head.
 1.13.2.4 03-Sep-2007  yamt sync with head.
 1.13.2.3 26-Feb-2007  yamt sync with head.
 1.13.2.2 30-Dec-2006  yamt sync with head.
 1.13.2.1 29-Dec-2006  yamt file puffs_msgif.c was added on branch yamt-lazymbuf on 2006-12-30 20:50:00 +0000
 1.18.2.5 17-May-2007  yamt sync with head.
 1.18.2.4 07-May-2007  yamt sync with head.
 1.18.2.3 15-Apr-2007  yamt sync with head.
 1.18.2.2 24-Mar-2007  yamt sync with head.
 1.18.2.1 12-Mar-2007  rmind Sync with HEAD.
 1.19.4.1 11-Jul-2007  mjf Sync with head.
 1.19.2.8 12-Oct-2007  ad Sync with head.
 1.19.2.7 09-Oct-2007  ad Sync with head.
 1.19.2.6 01-Sep-2007  ad Update for pool_cache API changes.
 1.19.2.5 20-Aug-2007  ad Sync with HEAD.
 1.19.2.4 09-Jun-2007  ad Sync with head.
 1.19.2.3 08-Jun-2007  ad Sync with head.
 1.19.2.2 10-Apr-2007  ad Sync with head.
 1.19.2.1 05-Apr-2007  ad Compile fixes.
 1.20.2.1 29-Mar-2007  reinoud Pullup to -current
 1.39.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.40.10.2 19-Jul-2007  pooka add debug printf
 1.40.10.1 19-Jul-2007  pooka file puffs_msgif.c was added on branch matt-mips64 on 2007-07-19 22:05:23 +0000
 1.40.8.2 14-Oct-2007  yamt sync with head.
 1.40.8.1 06-Oct-2007  yamt sync with head.
 1.40.6.4 23-Mar-2008  matt sync with HEAD
 1.40.6.3 09-Jan-2008  matt sync with HEAD
 1.40.6.2 08-Nov-2007  matt sync with -HEAD
 1.40.6.1 06-Nov-2007  matt sync with HEAD
 1.40.4.10 09-Dec-2007  jmcneill Sync with HEAD.
 1.40.4.9 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.40.4.8 21-Nov-2007  joerg Sync with HEAD.
 1.40.4.7 14-Nov-2007  joerg Sync with HEAD.
 1.40.4.6 11-Nov-2007  joerg Sync with HEAD.
 1.40.4.5 04-Nov-2007  jmcneill Sync with HEAD.
 1.40.4.4 28-Oct-2007  joerg Sync with HEAD.
 1.40.4.3 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.40.4.2 07-Oct-2007  joerg Sync with HEAD.
 1.40.4.1 02-Oct-2007  joerg Sync with HEAD.
 1.47.2.4 21-Nov-2007  bouyer Sync with HEAD
 1.47.2.3 18-Nov-2007  bouyer Sync with HEAD
 1.47.2.2 13-Nov-2007  bouyer Sync with HEAD
 1.47.2.1 25-Oct-2007  bouyer Sync with HEAD.
 1.51.2.3 18-Feb-2008  mjf Sync with HEAD.
 1.51.2.2 08-Dec-2007  mjf Sync with HEAD.
 1.51.2.1 19-Nov-2007  mjf Sync with HEAD.
 1.60.2.1 08-Dec-2007  ad Sync with head.
 1.61.4.2 08-Jan-2008  bouyer Sync with HEAD
 1.61.4.1 02-Jan-2008  bouyer Sync with HEAD
 1.68.10.4 11-Aug-2010  yamt sync with head.
 1.68.10.3 11-Mar-2010  yamt sync with head
 1.68.10.2 04-May-2009  yamt sync with head.
 1.68.10.1 16-May-2008  yamt sync with head.
 1.68.8.1 18-May-2008  yamt sync with head.
 1.68.6.2 28-Sep-2008  mjf Sync with HEAD.
 1.68.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.71.6.1 19-Oct-2008  haad Sync with HEAD.
 1.71.2.1 10-Oct-2008  skrll Sync with HEAD.
 1.72.12.1 21-Apr-2010  matt sync to netbsd-5
 1.72.8.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.72.4.5 02-Nov-2011  riz Pull up following revision(s) (requested by manu in ticket #1679):
sys/fs/puffs/puffs_vnops.c: revision 1.157
sys/fs/puffs/puffs_vnops.c: revision 1.158
sys/fs/puffs/puffs_vnops.c: revision 1.159
sys/fs/puffs/puffs_vfsops.c: revision 1.97
sys/fs/puffs/puffs_vfsops.c: revision 1.99
sys/fs/puffs/puffs_vnops.c: revision 1.160
sys/fs/puffs/puffs_vfsops.c: revision 1.100
sys/miscfs/syncfs/sync_subr.c: revision 1.47
sys/fs/puffs/puffs_node.c: revision 1.21
sys/fs/puffs/puffs_node.c: revision 1.22
sys/fs/puffs/puffs_msgif.c: revision 1.88
sys/fs/puffs/puffs_msgif.c: revision 1.89
sys/fs/puffs/puffs_vnops.c: revision 1.156
Make sure ioflush does not sleep in PUFFS code path, waiting for a mutex,
a memory allocation, or a response from the filesystem.
This avoids deadlocks in the following situations:
1) when memory is low: ioflush waits for the filesystem, the filesystem waits
for memory
2) when the filesystem does not respond (e.g.: network outage on a
distributed filesystem)
Fix the build that was broken by struct lwp *updateproc reference in
RUMP-visible code. Instead of checking that updateproc (aka ioflush,
aka syncer) will not sleep in PUFFS code, I check for any kernel thread:
after all none of them are designed to hang waiting for a remote filesystem
operation to complete.
Roll back the change that forced kernel threads to not sleep in PUFFS.
The change did not reach consensus, since only pagedaemon should need it.
Other threads will tolerate sleeping, and problems here are only symptoms
that something is going wrong in memory management. The cause, not the
symptoms, needs to be fixed.
Make sure pagedaemon does not sleep for memory in puffs_vnop_sleep.
Add KASSERT on any sleeping memory allocation to check it cannot happen again.
Remove #ifdef DIAGNOSTIC guards around KASSERT, as the macro contains them
 1.72.4.4 15-Jul-2011  riz Pull up following revision(s) (requested by manu in ticket #1604):
sys/fs/puffs/puffs_msgif.c: revision 1.84
Apply patch from PR kern/44093 by yamt:
Interrupt server wait only on certain signals (same set as nfs -i)
instead of all signals. According to the PR this helps with
"git clone" run on a puffs file system.
 1.72.4.3 20-May-2011  bouyer Revert ticket 1604, it doesn't build.
 1.72.4.2 19-May-2011  bouyer Pull up following revision(s) (requested by manu in ticket #1604):
sys/fs/puffs/puffs_msgif.c: revision 1.84 via patch
Apply patch from PR kern/44093 by yamt:
Interrupt server wait only on certain signals (same set as nfs -i)
instead of all signals. According to the PR this helps with
"git clone" run on a puffs file system.
 1.72.4.1 09-Jan-2010  snj Pull up following revision(s) (requested by pooka in ticket #1212):
sys/fs/puffs/puffs_msgif.c: revision 1.76 via patch
sys/fs/puffs/puffs_sys.h: revision 1.73 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.84 via patch
Process flush requests from the file server in a separate thread
context. This fixes a long-standing but seldomly seen deadlock,
where the kernel was holding pages busy (due to e.g. readahead
request) while waiting for the server to respond, and the server
made a callback into the kernel asking to invalidate those pages.
... or, well, theoretically fixes, since I didn't have any reliable
way of repeating the deadlock and I think I saw it only twice.
 1.72.2.1 28-Apr-2009  skrll Sync with HEAD.
 1.80.4.2 05-Mar-2011  rmind sync with head
 1.80.4.1 16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.80.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.84.4.1 17-Feb-2011  bouyer Sync with HEAD
 1.84.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.85.2.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.89.8.1 12-Aug-2012  martin Pull up following revision(s) (requested by manu in ticket #438):
lib/libperfuse/perfuse_priv.h: revision 1.31
sys/fs/puffs/puffs_msgif.h: revision 1.80
sys/fs/puffs/puffs_vnops.c: revision 1.171
lib/libpuffs/puffs_ops.3: revision 1.31
sys/fs/puffs/puffs_vnops.c: revision 1.172
sys/fs/puffs/puffs_vnops.c: revision 1.173
sys/fs/puffs/puffs_vnops.c: revision 1.174
usr.sbin/perfused/perfused.c: revision 1.24
sys/fs/puffs/puffs_sys.h: revision 1.80
sys/fs/puffs/puffs_sys.h: revision 1.81
sys/fs/puffs/puffs_sys.h: revision 1.82
lib/libperfuse/subr.c: revision 1.19
lib/libperfuse/perfuse.c: revision 1.30
sys/fs/puffs/puffs_msgif.c: revision 1.90
sys/fs/puffs/puffs_msgif.c: revision 1.91
sys/fs/puffs/puffs_msgif.c: revision 1.92
lib/libperfuse/ops.c: revision 1.59
lib/libpuffs/puffs.3: revision 1.53
lib/libperfuse/debug.c: revision 1.12
lib/libpuffs/puffs.3: revision 1.54
sys/fs/puffs/puffs_vnops.c: revision 1.167
sys/fs/puffs/puffs_msgif.h: revision 1.79
usr.sbin/perfused/msg.c: revision 1.21
sys/fs/puffs/puffs_vfsops.c: revision 1.102
sys/fs/puffs/puffs_vfsops.c: revision 1.103
sys/fs/puffs/puffs_vfsops.c: revision 1.105
lib/libpuffs/puffs.h: revision 1.123
lib/libperfuse/perfuse_if.h: revision 1.20
lib/libperfuse/perfuse.c: revision 1.29
lib/libpuffs/dispatcher.c: revision 1.42
lib/libpuffs/dispatcher.c: revision 1.43
- Fix same vnodes associated with multiple cookies
The scheme used to retrieve known nodes on lookup was flawed, as it only
used parent and name. This produced a different cookie for the same file
if it was renamed, when looking up ../ or when dealing with multiple files
associated with the same name through link(2).
We therefore abandon the use of node name and introduce hashed lists of
inodes. This causes a huge rewrite of reclaim code, which does not attempt
to keep parents allocated until all their children are reclaimed
- Fix race conditions in reclaim
There are a few situations where we issue multiple FUSE operations for
a PUFFS operation. On reclaim, we therefore have to wait for all FUSE
operations to complete, not just the current exchanges. We do this by
introducing node reference count with node_ref() and node_rele().
- Detect data loss caused by FAF
VOP_PUTPAGES causes FAF writes where the kernel does not check the
operation result. At least issue a warning on error.
- Enjoy FAF shortcut on setattr
No need to wait for the result if the kernel does not want it. There is
however an exception for setattr that touches the size, we need to wait
for completion because we have other operations queued for after the
resize.
- Fix fchmod() on write-open file
fchmod() on a node open with write privilege will send setattr with both mode
and size set. This confuses some FUSE filesystems. Therefore we send two FUSE
operations, one for mode, and one for size.
- Remove node TTL handling for netbsd-5 for simplicity sake. The code
still builds on netbsd-5 but does not have the node TTL feature anymore.
It works fine with kernel support on netbsd-6.
- Improve PUFFS_KFLAG_CACHE_FS_TTL by reclaiming older inactive nodes.
The normal kernel behavior is to retain inactive nodes in the freelist
until it runs out of vnodes. This has some merit for local filesystems,
where the cost of an allocation is about the same as the cost of a
lookup. But that situation is not true for distributed filesystems.
On the other hand, keeping inactive nodes for a long time holds memory
in the file server process, and when the kernel runs out of vnodes, it
produces reclaim avalanches that increase latency for other operations.
We do not reclaim inactive vnodes immediately either, as they may be
looked up again shortly. Instead we introduce a grace time and we
reclaim nodes that have been inactive beyond the grace time.
- Fix lookup/reclaim race condition.
The above improvement uncovered a race condition between lookup and
reclaim. If we reclaimed a vnode associated with a userland cookie while
a lookup returning that same cookie was in progress, then the kernel ends
up with a vnode associated with a cookie that has been reclaimed in
userland. Next operation on the cookie will crash (or at least confuse)
the filesystem.
We fix this by introducing a lookup count in kernel and userland. On
reclaim, the kernel sends the count, which enables userland to detect
a situation where it initiated a lookup that is not completed in kernel.
In such a situation, the reclaim must be ignored, as the node is about
to be looked up again.
Fix hang unmount bug introduced by last commit.
We introduced a slow queue for delayed reclaims, while the existing
queue for unmount, flush and exit has been renamed fast queue. Both
queues had timestamp for when an operation should be done, but it was
useless for the fast queue, which is always used to run an operation
ASAP. And the timestamp test had an error that turned ASAP into "at next
tick", but nobody was there to wake the thread at next tick, hence
the hang. The fix is to remove the useless and buggy timestamp test for
fast queue.
Rename slow sopreq queue into node sopreq queue, to reflect the fact that it
is only intended for postponed node reclaims.
When purging the node sopreq queue, do not call puffs_msg_sendresp(), as
it makes no sense.
Fix race condition between (create|mknod|mkdir|symlink) and reclaim, just
like we did it between lookup and reclaim.
Missing bit in previous commit (prevent race between create|mknod|mkdir|symlink
and reclaim)
Bump date for previous.
New sentence, new line; remove trailing whitespace; fix typos;
punctuation nits.
Add PUFFS_KFLAG_CACHE_DOTDOT so that vnodes hold a reference on their
parent, keeping them active, and allowing to lookup .. without sending
a request to the filesystem.
Enable the feature for perfused, as this is how FUSE works.
Missing bit in previous commit (PUFFS_KFLAG_CACHE_DOTDOT option to avoid
looking up ..)
 1.89.2.3 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.89.2.2 16-Jan-2013  yamt sync with (a bit old) head
 1.89.2.1 30-Oct-2012  yamt sync with head
 1.92.2.3 03-Dec-2017  jdolecek update from HEAD
 1.92.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.92.2.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.93.2.1 18-May-2014  rmind sync with head
 1.94.4.2 17-Jan-2015  martin Pull up following revision(s) (requested by maxv in ticket #427):
sys/compat/svr4/svr4_schedctl.c: revision 1.8
sys/netinet/tcp_timer.c: revision 1.88
sys/miscfs/genfs/layer_vfsops.c: revision 1.45
sys/compat/svr4/svr4_ioctl.c: revision 1.37
sys/ufs/chfs/chfs_vfsops.c: revision 1.14
sys/miscfs/fdesc/fdesc_vfsops.c: revision 1.91
sys/compat/linux/arch/i386/linux_ptrace.c: revision 1.30
sys/compat/common/kern_time_50.c: revision 1.28
sys/netinet6/ip6_forward.c: revision 1.74
sys/miscfs/umapfs/umap_vnops.c: revision 1.57
sys/compat/svr4/svr4_fcntl.c: revision 1.74
distrib/sets/lists/comp/mi: revision 1.1931
sys/netinet6/udp6_output.c: revision 1.46
sys/fs/puffs/puffs_compat.c: revision 1.3
sys/fs/udf/udf_rename.c: revision 1.11
sys/compat/svr4/svr4_filio.c: revision 1.24
sys/fs/udf/udf_rename.c: revision 1.12
sys/netinet/tcp_usrreq.c: revision 1.202
sys/miscfs/umapfs/umap_subr.c: revision 1.29
sys/compat/linux/common/linux_fadvise64.c: revision 1.3
sys/netinet/if_atm.c: revision 1.34
sys/miscfs/procfs/procfs_subr.c: revision 1.106
sys/miscfs/genfs/layer_subr.c: revision 1.37
sys/netinet/tcp_sack.c: revision 1.30
sys/compat/freebsd/freebsd_misc.c: revision 1.33
sys/compat/freebsd/freebsd_file.c: revision 1.33
sys/ufs/chfs/chfs_vnode.c: revision 1.12
sys/compat/svr4/svr4_ttold.c: revision 1.34
sys/compat/linux/common/linux_file.c: revision 1.114
sys/compat/linux/arch/mips/linux_machdep.c: revision 1.43
sys/compat/linux/common/linux_signal.c: revision 1.76
sys/compat/common/compat_util.c: revision 1.46
sys/compat/linux/arch/arm/linux_ptrace.c: revision 1.18
sys/compat/svr4/svr4_sockio.c: revision 1.36
sys/compat/linux/arch/arm/linux_machdep.c: revision 1.32
sys/compat/svr4/svr4_signal.c: revision 1.66
sys/kern/kern_exec.c: revision 1.410
sys/fs/puffs/puffs_vfsops.c: revision 1.115
sys/compat/svr4/svr4_exec_elf64.c: revision 1.15
sys/compat/linux/arch/i386/linux_machdep.c: revision 1.159
sys/compat/linux/arch/alpha/linux_machdep.c: revision 1.50
sys/compat/linux32/common/linux32_misc.c: revision 1.24
sys/netinet/in_pcb.c: revision 1.153
sys/sys/malloc.h: revision 1.116
sys/compat/common/if_43.c: revision 1.9
share/man/man9/Makefile: revision 1.380
sys/netinet/tcp_vtw.c: revision 1.12
sys/miscfs/umapfs/umap_vfsops.c: revision 1.95
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.186
sys/compat/common/uipc_syscalls_43.c: revision 1.46
sys/ufs/ext2fs/ext2fs_vnops.c: revision 1.115
sys/fs/puffs/puffs_msgif.c: revision 1.97
sys/compat/svr4/svr4_ipc.c: revision 1.27
sys/compat/linux/common/linux_exec.c: revision 1.117
sys/ufs/ext2fs/ext2fs_readwrite.c: revision 1.66
sys/netinet/tcp_output.c: revision 1.179
sys/compat/svr4/svr4_termios.c: revision 1.28
sys/fs/udf/udf_strat_bootstrap.c: revision 1.4
sys/fs/puffs/puffs_subr.c: revision 1.67
sys/fs/puffs/puffs_node.c: revision 1.36
sys/miscfs/overlay/overlay_vnops.c: revision 1.21
sys/fs/cd9660/cd9660_node.c: revision 1.34
sys/netinet/raw_ip.c: revision 1.146
sys/sys/mallocvar.h: revision 1.13
sys/miscfs/overlay/overlay_vfsops.c: revision 1.63
share/man/man9/malloc.9: revision 1.50
sys/netinet6/dest6.c: revision 1.18
sys/compat/linux/common/linux_uselib.c: revision 1.33
sys/compat/linux/common/linux_socket.c: revision 1.120
share/man/man9/malloc.9: revision 1.51
sys/netinet/tcp_subr.c: revision 1.257
sys/compat/linux/common/linux_socketcall.c: revision 1.45
sys/compat/linux/common/linux_fadvise64_64.c: revision 1.3
sys/compat/freebsd/freebsd_ipc.c: revision 1.17
sys/compat/linux/common/linux_misc_notalpha.c: revision 1.109
sys/compat/linux/arch/alpha/linux_pipe.c: revision 1.17
sys/netinet6/in6_pcb.c: revision 1.132
sys/netinet6/in6_ifattach.c: revision 1.94
sys/compat/svr4/svr4_exec_elf32.c: revision 1.15
sys/miscfs/nullfs/null_vfsops.c: revision 1.90
sys/fs/cd9660/cd9660_util.c: revision 1.12
sys/compat/linux/arch/powerpc/linux_machdep.c: revision 1.48
sys/compat/freebsd/freebsd_exec_elf32.c: revision 1.20
sys/miscfs/procfs/procfs_vfsops.c: revision 1.94
sys/compat/linux/arch/powerpc/linux_ptrace.c: revision 1.28
sys/compat/linux/common/linux_sched.c: revision 1.67
sys/compat/linux/common/linux_exec_aout.c: revision 1.67
sys/compat/linux/common/linux_pipe.c: revision 1.67
sys/compat/linux/common/linux_llseek.c: revision 1.34
sys/compat/linux/arch/mips/linux_ptrace.c: revision 1.10
Do not uselessly include <sys/malloc.h>.
Cleanup:
- remove struct kmembuckets (dead)
- correctly deadify MALLOC_XX
- remove MALLOC_DEFINE_LIMIT and MALLOC_JUSTDEFINE_LIMIT (dead)
- remove malloc_roundup(), malloc_type_setlimit(), MALLOC_DEFINE_LIMIT()
and MALLOC_JUSTDEFINE_LIMIT() from man 9 malloc
New sentence, new line. Bump date for previous.
Obsolete malloc_roundup(9), malloc_type_setlimit(9) and MALLOC_DEFINE_LIMIT(9)
man pages.
 1.94.4.1 29-Aug-2014  martin Pull up following revision(s) (requested by hannken in ticket #67):
sys/fs/puffs/puffs_sys.h: revision 1.86
sys/fs/puffs/puffs_vfsops.c: revision 1.114
sys/fs/puffs/puffs_msgif.c: revision 1.95
sys/fs/puffs/puffs_node.c: revision 1.32
sys/fs/puffs/puffs_vnops.c: revision 1.184
Change puffs from hashlist to vcache.
- field "pa_nhashbuckets" of struct "puffs_kargs" becomes a no-op.
and should be removed on the next protocol version bump.
 1.97.2.4 28-Aug-2017  skrll Sync with HEAD
 1.97.2.3 05-Feb-2017  skrll Sync with HEAD
 1.97.2.2 09-Jul-2016  skrll Sync with HEAD
 1.97.2.1 06-Jun-2015  skrll Sync with HEAD
 1.99.2.2 26-Apr-2017  pgoyette Sync with HEAD
 1.99.2.1 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.100.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.101.12.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.101.12.1 10-Jun-2019  christos Sync with HEAD
 1.101.10.11 22-Jan-2019  pgoyette Convert the MODULE_{,VOID_}HOOK_CALL macros to do everything in-line
rather than defining an intermediate hook##call function. Almost
all of the hooks are called only once, and although we lose the
ability of doing things like

if (MODULE_HOOK_CALL(...) == 0) ...

we simplify things quite a bit. With this change, we no longer need
to have both declaration and definition macros, and the definition
no longer needs to have both prototype argument list and a "real"
argument list.

FWIW, the above if now needs to be written as

int ret;

MODULE_HOOK_CALL(..., ret);
if (ret == 0) ...

with appropriate use of braces {}.
 1.101.10.10 21-Jan-2019  pgoyette No need to declare the hook_call() function for void hooks. So
remove and simplify.
 1.101.10.9 18-Jan-2019  pgoyette Don't restrict hooks to having only int or void types. Pass the hook's
type to the various macros, as needed.

Allows us to reduce diffs to original in at least one or two places (we
no longer have to provide an additional parameter to the hook routine
for returning a non-int return value).
 1.101.10.8 14-Jan-2019  pgoyette Create a variant of the HOOK macros that handles hook routines of
type void, and use them where appropriate.
 1.101.10.7 13-Jan-2019  pgoyette Remove the HOOK2 versions of the MODULE_HOOK macros. There were
only a few uses, and using them led to some lack of clarity in the
code. Instead, we now use two separate hooks, with names that
make it clear(er) what we're doing.

This also positions us to start unraveling some of the rtsock_50
mess, which will need (at least) five hooks.
 1.101.10.6 29-Sep-2018  pgoyette In MODULE_HOOK_CALL_DECL we don't need to provide the actual argument
list for calling the hook function, nor do we need to provide the
default value (for when the hook has not been set).
 1.101.10.5 18-Sep-2018  pgoyette The COMPAT_HOOK macros were renamed to MODULE_HOOK, adjust all callers
 1.101.10.4 18-Sep-2018  pgoyette Split the COMPAT_CALL_HOOK to separate the declaration from the
implementation. Some hooks are called from multiple source files,
and the old method resulted in duplicate implementations.

Implement MP-safe hooks for the usb_subr_30 code. Pass the helper
functions as arguments to the compat code so it does not have to
determine if the kernel contains usb code.
 1.101.10.3 17-Sep-2018  pgoyette Adapt (most of) the indirect function pointers to the new MP-safe
mechanism. Still remaining are the compat_netbsd32 stuff, and
some usb subroutines.
 1.101.10.2 24-Mar-2018  pgoyette Use function pointers to call the compatibility functions.
 1.101.10.1 24-Mar-2018  pgoyette Add fs/puffs compat_50 to the modules
 1.104.6.1 29-Feb-2020  ad Sync with head.
 1.107.2.1 02-Aug-2025  perseant Sync with HEAD
 1.87 03-Dec-2021  pho Avoid using register_t in <fs/puffs/puffs_msgif.h>

The purpose of this header file is to interface between the
kernel-space and user-space, and is #include'd by a user-space header
<puffs.h>. It should therefore not use any kernel-only types, as
it's not reasonable to require user-land filesystems to #define
_KERNTYPES.
 1.86 08-Mar-2021  christos give names to the enums so we can cast by name for lint
 1.85 23-Sep-2019  christos branches: 1.85.8;
Restore binary compatibility by using the statvfs90 structure internally.
 1.84 15-Feb-2015  manu branches: 1.84.18;
Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE

FUSE filesystems do not expect to get metadata updates for [amc]time
and size; they update the values on their own after operations.

The PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem, and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.
 1.83 31-Oct-2014  manu branches: 1.83.2;
Add PUFFS_HAVE_FALLOCATE in puffs_msgif.h so that filesystem can decide
at build time whether fallocate is usable
 1.82 31-Oct-2014  manu Add PUFFS support for fallocate and fdiscard operations
 1.81 16-Aug-2014  manu Add an oflags input field to open requests so that the filesystem can pass
back information about the file. Implement PUFFS_OPEN_IO_DIRECT, which
will force direct IO (bypassing page cache) for the file.
 1.80 10-Aug-2012  manu branches: 1.80.2; 1.80.14;
Add PUFFS_KFLAG_CACHE_DOTDOT so that vnodes hold a reference on their
parent, keeping them active, and allowing to lookup .. without sending
a request to the filesystem.

Enable the feature for perfused, as this is how FUSE works.
 1.79 21-Jul-2012  manu - Improve PUFFS_KFLAG_CACHE_FS_TTL by reclaiming older inactive nodes.

The normal kernel behavior is to retain inactive nodes in the freelist
until it runs out of vnodes. This has some merit for local filesystems,
where the cost of an allocation is about the same as the cost of a
lookup. But that situation is not true for distributed filesystems.
On the other hand, keeping inactive nodes for a long time holds memory
in the file server process, and when the kernel runs out of vnodes, it
produces reclaim avalanches that increase latency for other operations.

We do not reclaim inactive vnodes immediately either, as they may be
looked up again shortly. Instead we introduce a grace time and we
reclaim nodes that have been inactive beyond the grace time.

- Fix lookup/reclaim race condition.

The above improvement uncovered a race condition between lookup and
reclaim. If we reclaimed a vnode associated with a userland cookie while
a lookup returning that same cookie was in progress, then the kernel ends
up with a vnode associated with a cookie that has been reclaimed in
userland. Next operation on the cookie will crash (or at least confuse)
the filesystem.

We fix this by introducing a lookup count in kernel and userland. On
reclaim, the kernel sends the count, which enables userland to detect
a situation where it initiated a lookup that is not completed in kernel.
In such a situation, the reclaim must be ignored, as the node is about
to be looked up again.
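
A minimal sketch of the userland side of this check, assuming a per-node
counter of lookups answered by the server. The struct, field and function
names are hypothetical and are not taken from libpuffs or libperfuse.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland node state; names are illustrative only. */
struct demo_node {
	unsigned int lookups_answered;	/* lookups the server has replied to */
};

/*
 * Decide whether a reclaim message may free the node.
 * kernel_count is the lookup count carried in the reclaim message,
 * i.e. how many lookups the kernel had completed for this cookie.
 */
static bool
demo_reclaim_allowed(const struct demo_node *n, unsigned int kernel_count)
{
	assert(n != NULL);

	/*
	 * The server answered more lookups than the kernel has seen, so
	 * one is still in flight: ignore the reclaim, the node is about
	 * to be looked up again.
	 */
	if (n->lookups_answered > kernel_count)
		return false;
	return true;
}
```

With equal counts the reclaim proceeds as before; the node is kept only
while a lookup reply is still in flight.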
 1.78 08-Apr-2012  manu Add name and attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
 1.77 27-Sep-2011  christos branches: 1.77.2; 1.77.6; 1.77.8;
don't get affected by the NAME_MAX bump. Use the same constant as the
rest of the extattr code.
 1.76 04-Jul-2011  manu Add a flag to VOP_LISTEXTATTR(9) so that the vnode interface can tell the
filesystem in which format extended attribute shall be listed.

There are currently two formats:
- NUL-terminated strings, used for listxattr(2), this is the default.
- one byte length-prefixed, non NUL-terminated strings, used for
extattr_list_file(2), which is obtained by setting the
EXTATTR_LIST_PREFIXLEN flag to VOP_LISTEXTATTR(9)

This approach avoids the need for converting the list back and forth, except
in libperfuse, since FUSE uses NUL-terminated strings, and the kernel may
have requested EXTATTR_LIST_PREFIXLEN.
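
To illustrate the two list formats, here is a rough sketch of the kind of
conversion that a NUL-terminated producer would need when the caller asked
for the length-prefixed form. The function name and signature are
hypothetical; this is not the libperfuse code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert a list of NUL-terminated names ("user.a\0user.b\0") into
 * one-byte length-prefixed names without terminators.
 * Returns the number of bytes written, or -1 if a name is longer than
 * 255 bytes, the last name lacks its NUL, or out is too small.
 */
static long
demo_nul_to_prefixlen(const char *in, size_t inlen,
    unsigned char *out, size_t outlen)
{
	size_t ipos = 0, opos = 0;

	assert(in != NULL && out != NULL);
	while (ipos < inlen) {
		const char *end = memchr(in + ipos, '\0', inlen - ipos);
		size_t n;

		if (end == NULL)
			return -1;	/* unterminated final name */
		n = (size_t)(end - (in + ipos));
		if (n > 255 || opos + 1 + n > outlen)
			return -1;
		out[opos++] = (unsigned char)n;	/* length prefix */
		memcpy(out + opos, in + ipos, n);
		opos += n;
		ipos += n + 1;	/* skip the name and its NUL */
	}
	return (long)opos;
}
```

Both encodings take the same number of bytes per name (the name plus one
byte), so the output needs no more space than the input.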
 1.75 06-Jul-2010  pooka Add compat to enable running puffs in a 64bit time_t kernel against
a server which runs in 32bit time_t namespace.
 1.74 07-Jun-2010  pooka Make retval argument for pathconf a register_t to match VOP_PATHCONF.
This makes the size the same on 64bit archs. Don't bother bumping
any version, since you'd have explicitly had to jump through some
hoops to use pathconf before.
 1.73 21-May-2010  pooka add option string for no attribute cache
(foreseeing the odd event I might actually implement one some day)
 1.72 21-May-2010  pooka Since libpuffs needs a major bump for extattr support anyway, make
some changes to the user-kernel protocol. Namely, try to be a
little more resilient to some future changes.
 1.71 21-May-2010  pooka Support extended attributes.
 1.70 20-May-2010  pooka Fix typo.
 1.69 07-Jan-2010  pooka branches: 1.69.2; 1.69.4;
Add a PUFFS_UNMOUNT server->kernel request, which causes the kernel
to initiate self destruct, i.e. unmount(MNT_FORCE). This, however,
is a semi-controlled self-destruct, since all caches are flushed
before the (possibly) violent unmount takes place.
 1.68 17-Oct-2009  pooka Bump protocol version once more to allow for previous to be pulled
to netbsd-5 (protocols are not compatible due to time_t/dev_t
change).
 1.67 17-Oct-2009  pooka Transmit VOP_ABORTOP() to the server.
 1.66 12-Jan-2009  pooka Bump interface version number for the time_t/dev_t changes.
 1.65 28-Jan-2008  pooka branches: 1.65.6; 1.65.10; 1.65.18; 1.65.20; 1.65.26;
For code clarity typedef void *puffs_cookie_t.

No functional change.
 1.64 08-Dec-2007  pooka Now that "l" is gone both as an argument to operations and from
componentname, remove all vestiges of puffs_cid.
 1.63 05-Dec-2007  pooka Send a response message for flush operations from the kernel instead
of abusing the return value of write(2).
 1.62 04-Dec-2007  pooka Add a bit to differentiate if a message is a request or a response.
 1.61 27-Nov-2007  pooka branches: 1.61.2;
Remove "puffs_cid" from the puffs interface following l-removal
from the kernel vfs interfaces. puffs_cc_getcaller(pcc) can be
used now should the same information be desired.
 1.60 12-Nov-2007  pooka * split the putter header into a kernel version and a userland version
+ install latter to /usr/include/dev/putter
* remove last dependencies to puffs from putter, it's completely
independent now
 1.59 21-Oct-2007  pooka branches: 1.59.2;
Always provide caller information from the kernel based on curlwp.
(but don't deprecate the old puffs_cid interface just yet)
 1.58 19-Oct-2007  pooka When doing a read operation, don't copy the whole kernel buffer to
userspace, since it doesn't contain any information yet. I should
still rework this more so this is just a quickie to get the read/write
style interface more up to speed with the ioctl version.
 1.57 11-Oct-2007  pooka branches: 1.57.2;
g/c garbage
 1.56 11-Oct-2007  pooka Part 1/n of some pretty extensive changes to how the kernel module
interacts with the userspace file server:

* since the kernel-user communication is not purely request-response
anymore (hasn't been since 2006), try to rename some "request" to
"message". more similar mangling will take place in the future.

* completely rework how messages are allocated. previously most of
them were borrowed from the stack (originally *all* of them),
but now always allocate dynamically. this makes the structure
of the code much cleaner. also makes it possible to fix a
locking order violation. it enables plenty of future enhancements.

* start generalizing the transport interface to be independent of puffs

* move transport interface to read/write instead of ioctl. the
old one had legacy design problems, and besides, ioctl's suck.
implement a very generic version for now; this will be
worked on later hopefully some day reaching "highly optimized".

* implement libpuffs support behind existing library request
interfaces. this will change eventually (I hate those interfaces)
 1.55 04-Oct-2007  pooka g/c the "sizeop" code previously used for ioctl/fcntl. It was already
commented out and has bitrotted beyond all recognition, so it needs
complete rethinking.
 1.54 02-Oct-2007  pooka If kernel resource allocation fails after the file server has
committed something, issue an abort. The abort is done through
the regular op channel, e.g. failed mkdir leads to regular rmdir,
inactive and reclaim. No internal interface is planned currently
for the one file system out of a million which would implement it
to benefit from the one case in a billion where kernel resource
allocation actually does fail and out of that one case in a trillion
where internal vs. external would make a difference.
 1.53 01-Oct-2007  pooka * better error checking: validate error values received from userland
to be valid errno values
* include string describing error in PUFFS_ERR
* get rid of union in puffs_req, it's nothing but trouble
* pass pmp to async i/o callbacks
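
As a sketch of the first item, validating an error value from userland
could look roughly like the following. The function name, the upper bound
and the choice of EPROTO as the substitute error are assumptions made for
illustration, not the actual puffs code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical upper bound; a real kernel would use its own errno limit. */
#define DEMO_ERRNO_MAX	255

/*
 * Return the error unchanged if it looks like a valid errno value,
 * otherwise replace it with a generic protocol error.
 */
static int
demo_check_error(int error)
{
	if (error < 0 || error > DEMO_ERRNO_MAX)
		return EPROTO;
	return error;
}
```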
 1.52 27-Sep-2007  pooka nuke trailing , from enum. spotted by xtraeme
 1.51 27-Sep-2007  pooka Add error notifications, which are used to deliver errors from the
kernel to the file server for silly things the file server did,
e.g. attempting to create a file with size VSIZENOTSET. The file
server can handle these as it chooses, but the default action is
for it to throw its hands in the air and sing "goodbye, cruel world,
it's over, walk on by".
 1.50 23-Aug-2007  pooka branches: 1.50.2; 1.50.4;
Add a third type of fh option, passthrough, where the kernel does
not attempt to handle struct fid at all and passes it as such to
userspace.
 1.49 22-Aug-2007  pooka Mimic namei structure changes for puffs. bump both kernel & lib version.
 1.48 15-Aug-2007  pooka Nuke PUFFSLOOKUP_FOO and move to NAMEI_FOO
 1.47 30-Jul-2007  pooka branches: 1.47.4; 1.47.6;
Move PUFFS_TYPEPREFIX to puffs_msgif.h since it's used in a macro there.
 1.46 27-Jul-2007  pooka include <uvm/uvm_prot.h>
 1.45 27-Jul-2007  pooka Change unused fflags parameter in VOP_MMAP to prot and pass in
desired vm protection.
 1.44 19-Jul-2007  pooka define PUFFSREQSIZEOP ioctl, which can be used to fetch the
maximum request size
 1.43 18-Jul-2007  pooka kill MFSNAMELEN limit
 1.42 17-Jul-2007  pooka branches: 1.42.2;
Set a file server supplied file system type in the type field and set
the mntfromname to be the place mounted from instead of the type.
 1.41 16-Jul-2007  pooka 1|2 is more correct when it's 3 instead of 2. This makes calls to
the file server inactive less over-eagerly executed and masks some
problems with the new mounting style. Effectively, it makes some
file systems such as psshfs mountable again (only without -o allops).
 1.40 02-Jul-2007  pooka support turning REQUIREDIR off and extra consume in lookup
 1.39 02-Jul-2007  pooka Get rid of the "int *refs" parameter to inactive: the same can be
accomplished now with puffs_setbacks.
 1.38 01-Jul-2007  pooka Give the file server the ability to request the entire pathname buffer
under lookup by using PUFFS_KFLAG_LOOKUP_FULLPNBUF instead of just the
current component.
 1.37 01-Jul-2007  pooka Instead of supplying a plain pid, supply an abstract struct puffs_cid *,
which can currently be used to query the pid and lwpid.
 1.36 01-Jul-2007  pooka make puffs_cred an opaque type
 1.35 24-Jun-2007  pooka Actually, keep PUFFS_KFLAG_NOCACHE and -o cache around as shorthand
to neither page- nor namecache.
 1.34 24-Jun-2007  pooka Split the NOCACHE option in twain: NOCACHE_NAME & NOCACHE_PAGE.
 1.33 06-Jun-2007  pooka Move puffs to a two clause license where it already isn't so. And
as agc pointed out, even files with the third clause were already
effectively two clause because of a slight bug in the language...
 1.32 18-May-2007  pooka Introduce noref setbacks, which the file server can use to signal
the kernel it has 0 references to the node in question. In other
words, this can be used to avoid inactive(), or, if the file server
does not implement inactive, prompt reclaim for removed nodes.
 1.31 18-May-2007  pooka Support VOP_POLL. This requires some acrobatics on the puffs_node,
as we give a reference to userspace for the puffs_node for the
duration of the poll call. So reference count puffs_node separately
from the parent vnode. vref()/vrele() is not possible due to a possible
surprise visit from VOP_INACTIVE.
 1.30 17-May-2007  pooka Make it possible for the file server to specify the root vnode type
and other information instead of always using VDIR. To make this
possible without races, require all root node information already
in puffs_mount() and nuke puffs_start2() and the associated start
operation completely.

requested/inspired by Tobias Nygren
 1.29 07-May-2007  pooka Introduce puffs "setbacks", which can be used to set certain flags
for nodes upon return from the userspace. Currently it can be used
to indicate that the file server should be notified of "inactive"
in case the file server has opted to not receive inactive every
time the reference count for a vnode drops to zero. (inactive is
a common event, almost never requires any action and must be executed
synchronously, so it is wasteful).

While doing this, cleanup the release-relock nonsense from the
vntouser*() arguments. It was never enabled and the whole LOCKEDVP()
concept was very broken to begin with.
 1.28 22-Apr-2007  pooka define PUFFS_KFLAG_WTCACHE, which makes the page cache write-through
 1.27 16-Apr-2007  pooka Give the file server the ability to specify the file handle length
instead of defining a static length file handle on the framework-level.
 1.26 13-Apr-2007  pooka Allow file servers to request the number of hash cookie buckets for
pnode -> vnode reverse lookup.
 1.25 13-Apr-2007  pooka * add fhlen to kernel argument structure
* rename it to puffs_kargs instead of puffs_args
 1.24 11-Apr-2007  pooka * support VFS_FHTOVP and VFS_VPTOFH
* support cookies for VOP_READDIR

nfs exporting puffs file systems works now
 1.23 06-Apr-2007  pooka actually, we don't need a separate op for flushing the whole page cache
of a node, just use the range op with endoff = 0
 1.22 29-Mar-2007  pooka Convert spinlocks & sleep/wakeup to newlock2 locking stuff. Fix a
bunch of bugs.

* park structures are now always allocated from a pool instead of a
mixed stack/malloc allocation
* get rid of the whole adjbuf concept, always just alloc the maximal
amount of memory to satisfy a request
* little regression: don't allow interrupting wait from file system
to userspace; this had problems already before, but now the problems
really started to shine through. I'll try to make this work again
some day.
* fix bmap to return a sensible value in runp
 1.21 20-Mar-2007  pooka export puffs version of namei ISLASTCN macro to userspace
 1.20 20-Mar-2007  pooka * rework the page cache interaction a bit: cache metadata in the
kernel and flush it out all at once instead of continuous updating
* add support for delivering notifications to the file server about
when a page was written to (but disabled by default for now). the
file server can use this to request flushing or invalidating the
kernel page cache
 1.19 26-Jan-2007  pooka branches: 1.19.2; 1.19.6; 1.19.8; 1.19.10;
Initial attempt at suspend/snapshot support for userspace file
servers. This is still pretty much on the level "if it breaks ...".
It should work for single-threaded servers which handle one operation
from start to finish in one go. Also, it does not yet totally
correctly synchronize metadata and data in some cases. So needless
to say, it needs improvement, but it is possible that will have to
wait for some lock revampage.
 1.18 16-Jan-2007  pooka g/c revoke msg structure
 1.17 09-Jan-2007  pooka comment out flushmulti for now, it's not done and kdump will complain
as mjf noted
 1.16 09-Jan-2007  pooka Introduce flush operations, which the fs server can use to control
kernel caching. Currently supported are only flushing the name
cache for a directory or flushing the name cache for the entire fs.

Also, get rid of PNODE_INACTIVE status, since it was racy and
essentially didn't work. All this on top of being useless in the
first place ....
 1.15 07-Jan-2007  pooka vfs sync, flushes regular file data only (user server can take care of
flushing any metadata it might have hidden away)
 1.14 02-Jan-2007  pooka * check userspace version and prevent incompatible mount
* some general maintenance
 1.13 29-Dec-2006  pooka branches: 1.13.2;
rename the kernel-provided componentname to puffs_kcn; libpuffs now
provides puffs_cn built on top of it
 1.12 07-Dec-2006  pooka branches: 1.12.2;
let implementation ultimately decide if mmap is supported - pass
VOP_MMAP to fs server
 1.11 05-Dec-2006  pooka Allow multiple requests to be transferred in each GET/PUTOP. For
a single request, the performance is still the same.
 1.10 01-Dec-2006  pooka prefix kernel flags with PUFFS_KFLAG to have a separate namespace
from the library flags
 1.9 01-Dec-2006  pooka don't call the fs server for all operations, only those it has told
us that it implements
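
One way to picture this is a per-mount table of implemented operations
that the kernel consults before sending a message. The sketch below uses
hypothetical names throughout and does not reproduce the real puffs data
structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operation numbers, for illustration only. */
enum demo_op {
	DEMO_OP_LOOKUP,
	DEMO_OP_READ,
	DEMO_OP_WRITE,
	DEMO_OP_FSYNC,
	DEMO_OP_MAX
};

/* Hypothetical per-mount state filled in from the server at mount time. */
struct demo_mount {
	uint8_t implemented[DEMO_OP_MAX];	/* nonzero: server handles op */
};

/* Only operations the server declared are worth a round trip. */
static bool
demo_should_call_server(const struct demo_mount *m, enum demo_op op)
{
	assert(m != NULL && op < DEMO_OP_MAX);
	return m->implemented[op] != 0;
}
```

Operations that are not marked can then be answered in the kernel with a
default result instead of a trip to userspace.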
 1.8 18-Nov-2006  pooka branches: 1.8.2;
Require statvfs info from startreq so that we have that info available.
Also, don't pass fsid to userspace and just fill it in the kernel.
 1.7 17-Nov-2006  pooka Introduce uncached operation, makes sense when the file system backend
can be modified from elsewhere than the file system interface
 1.6 09-Nov-2006  pooka few renames to better differentiate between mount & start.. plus some
other renaming
 1.5 07-Nov-2006  pooka attach to genfs & support page cache. most noticeable effect is
mmap and therefore execution of binaries starting to work, some
speed improvements with large file I/O also. caching semantics
and error case handling most likely need revisiting.
 1.4 26-Oct-2006  pooka support specfs
 1.3 25-Oct-2006  pooka pass VOP_INACTIVE() to userspace
 1.2 23-Oct-2006  pooka bump the reqstruct minsize to something more believable (but I should
really fix this, still)
 1.1 22-Oct-2006  pooka kernel portion of puffs - the Pass-to-Userspace Framework File System.
It contains the VFS attachment and userspace message-passing interface.

This work was initially started and completed for Google SoC 2005
and tweaked to work a bit better in the past few weeks. While
being far from complete, it is functional enough to be able and
stable to host a fairly general-purpose in-memory file system in
userspace. Even so, puffs should be considered experimental and
no binary compatibility for interfaces or crash-freedom or zero
security implications should be relied upon just yet.

The GSoC project was mentored by William Studenmund and the final
review for the code was done by Christos.
 1.8.2.4 01-Feb-2007  ad Sync with head.
 1.8.2.3 12-Jan-2007  ad Sync with head.
 1.8.2.2 18-Nov-2006  ad Sync with head.
 1.8.2.1 18-Nov-2006  ad file puffs_msgif.h was added on branch newlock2 on 2006-11-18 21:39:20 +0000
 1.12.2.2 10-Dec-2006  yamt sync with head.
 1.12.2.1 07-Dec-2006  yamt file puffs_msgif.h was added on branch yamt-splraiseipl on 2006-12-10 07:18:38 +0000
 1.13.2.9 04-Feb-2008  yamt sync with head.
 1.13.2.8 21-Jan-2008  yamt sync with head
 1.13.2.7 07-Dec-2007  yamt sync with head
 1.13.2.6 15-Nov-2007  yamt sync with head.
 1.13.2.5 27-Oct-2007  yamt sync with head.
 1.13.2.4 03-Sep-2007  yamt sync with head.
 1.13.2.3 26-Feb-2007  yamt sync with head.
 1.13.2.2 30-Dec-2006  yamt sync with head.
 1.13.2.1 29-Dec-2006  yamt file puffs_msgif.h was added on branch yamt-lazymbuf on 2006-12-30 20:50:00 +0000
 1.19.10.1 29-Mar-2007  reinoud Pullup to -current
 1.19.8.1 11-Jul-2007  mjf Sync with head.
 1.19.6.7 12-Oct-2007  ad Sync with head.
 1.19.6.6 09-Oct-2007  ad Sync with head.
 1.19.6.5 20-Aug-2007  ad Sync with HEAD.
 1.19.6.4 15-Jul-2007  ad Sync with head.
 1.19.6.3 09-Jun-2007  ad Sync with head.
 1.19.6.2 08-Jun-2007  ad Sync with head.
 1.19.6.1 10-Apr-2007  ad Sync with head.
 1.19.2.4 17-May-2007  yamt sync with head.
 1.19.2.3 07-May-2007  yamt sync with head.
 1.19.2.2 15-Apr-2007  yamt sync with head.
 1.19.2.1 24-Mar-2007  yamt sync with head.
 1.42.2.2 03-Sep-2007  skrll Sync with HEAD.
 1.42.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.47.6.2 30-Jul-2007  pooka Move PUFFS_TYPEPREFIX to puffs_msgif.h since it's used in a macro there.
 1.47.6.1 30-Jul-2007  pooka file puffs_msgif.h was added on branch matt-mips64 on 2007-07-30 09:04:59 +0000
 1.47.4.8 09-Dec-2007  jmcneill Sync with HEAD.
 1.47.4.7 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.47.4.6 14-Nov-2007  joerg Sync with HEAD.
 1.47.4.5 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.47.4.4 07-Oct-2007  joerg Sync with HEAD.
 1.47.4.3 02-Oct-2007  joerg Sync with HEAD.
 1.47.4.2 03-Sep-2007  jmcneill Sync with HEAD.
 1.47.4.1 16-Aug-2007  jmcneill Sync with HEAD.
 1.50.4.2 14-Oct-2007  yamt sync with head.
 1.50.4.1 06-Oct-2007  yamt sync with head.
 1.50.2.3 23-Mar-2008  matt sync with HEAD
 1.50.2.2 09-Jan-2008  matt sync with HEAD
 1.50.2.1 06-Nov-2007  matt sync with HEAD
 1.57.2.2 13-Nov-2007  bouyer Sync with HEAD
 1.57.2.1 25-Oct-2007  bouyer Sync with HEAD.
 1.59.2.4 18-Feb-2008  mjf Sync with HEAD.
 1.59.2.3 27-Dec-2007  mjf Sync with HEAD.
 1.59.2.2 08-Dec-2007  mjf Sync with HEAD.
 1.59.2.1 19-Nov-2007  mjf Sync with HEAD.
 1.61.2.2 26-Dec-2007  ad Sync with head.
 1.61.2.1 08-Dec-2007  ad Sync with head.
 1.65.26.1 21-Apr-2010  matt sync to netbsd-5
 1.65.20.3 17-Jul-2011  riz Pull up following revision(s) (requested by manu in ticket #1645):
lib/libc/sys/Makefile.inc 1.207 via patch
lib/libc/sys/extattr_get_file.2 patch
lib/libpuffs/dispatcher.c 1.34,1.36 via patch
lib/libpuffs/puffs.c 1.107 via patch
lib/libpuffs/puffs.h 1.115,1.118 via patch
sys/fs/puffs/puffs_msgif.h 1.71,1.76 via patch
sys/fs/puffs/puffs_vfsops.c 1.88 via patch
sys/fs/puffs/puffs_vnops.c 1.145,1.154 via patch
sys/kern/vfs_xattr.c 1.24-1.27 via patch
sys/kern/vnode_if.c 1.87 via patch
sys/sys/Makefile 1.133 via patch
sys/sys/extattr.h 1.6 via patch
sys/sys/vnode_if.h 1.81 via patch
sys/ufs/ffs/ffs_vnops.c patch
sys/ufs/ufs/ufs_extattr.c 1.31,1.34 via patch

* support extended attributes
* bump major due to structure growth
* add some spare space
* remove ABI silliness
Support extended attributes.
Fix multiple non compliances in our Linux-like extattr API, and make it
public so that it can be used.
Improve a bit listxattr(2). It attempted to list both system and user
extended attributes, and it failed if calling user did not have privilege
for reading system EA. Now we just list user EA and skip system EA if
reading them is not allowed.
Fix bug introduced in previous commit: Do not vrele() a vnode we did not
obtain.
Improve UFS1 extended attributes usability
- autocreate attribute backing file for new attributes
- autoload attributes when issuing extattrctl start
- when autoloading attributes, do not display garbage warning when looking
up entries that got ENOENT
Add a flag to VOP_LISTEXTATTR(9) so that the vnode interface can tell the
filesystem in which format extended attribute shall be listed.
There are currently two formats:
- NUL-terminated strings, used for listxattr(2), this is the default.
- one byte length-prefixed, non NUL-terminated strings, used for
extattr_list_file(2), which is obtained by setting the
EXTATTR_LIST_PREFIXLEN flag to VOP_LISTEXTATTR(9)
This approach avoids the need for converting the list back and forth, except
in libperfuse, since FUSE uses NUL-terminated strings, and the kernel may
have requested EXTATTR_LIST_PREFIXLEN.
 1.65.20.2 14-Dec-2009  sborrill Revert previous version bump which should not have been in the supplied
patch. This maintains compatibility between 5.0 and 5.1 (at the cost of
needing userland libraries recompiled if one's been tracking netbsd-5).
 1.65.20.1 18-Oct-2009  sborrill Pull up the following revisions(s) (requested by pooka in ticket #1100):
lib/libpuffs/dispatcher.c: revision 1.33
lib/libpuffs/puffs.c: revision 1.99
lib/libpuffs/puffs.h: revision 1.111
sys/fs/puffs/puffs_msgif.h: revision 1.67 via patch
sys/fs/puffs/puffs_vnops.c: revision 1.136

Support VOP_ABORTOP() in puffs.
 1.65.18.1 19-Jan-2009  skrll Sync with HEAD.
 1.65.10.3 11-Aug-2010  yamt sync with head.
 1.65.10.2 11-Mar-2010  yamt sync with head
 1.65.10.1 04-May-2009  yamt sync with head.
 1.65.6.1 17-Jan-2009  mjf Sync with HEAD.
 1.69.4.3 05-Mar-2011  rmind sync with head
 1.69.4.2 03-Jul-2010  rmind sync with head
 1.69.4.1 30-May-2010  rmind sync with head
 1.69.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.77.8.4 27-Feb-2015  martin Pull up following revision(s) (requested by manu in ticket #1260):
lib/libpuffs/puffs.3: revision 1.55,1.60
sys/fs/puffs/puffs_msgif.h: revision 1.84
lib/libperfuse/ops.c: revision 1.83
sys/fs/puffs/puffs_sys.h: revision 1.89
sys/fs/puffs/puffs_vfsops.c: revision 1.116
lib/libperfuse/perfuse.c: revision 1.36
sys/fs/puffs/puffs_vnops.c: revision 1.200-1.202

Use more markup. New sentence, new line. Bump date for previous.

Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE
FUSE filesystems do not expect to get metadata updates for [amc]time
and size; they update the values on their own after operations.

The PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem, and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.

Update file size after write without metadata flush
If we do not use metadata flush, we must make sure the size is updated
in the filesystem after a write, otherwise the next GETATTR will get us
a stale value and the file will be truncated.
 1.77.8.3 03-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #1140):
lib/libperfuse/ops.c 1.63-1.69
lib/libperfuse/perfuse.c 1.32-1.33
lib/libperfuse/perfuse_priv.h 1.32-1.34
lib/libperfuse/subr.c 1.20
lib/libpuffs/creds.c 1.16
lib/libpuffs/dispatcher.c 1.47
lib/libpuffs/puffs.h 1.125
lib/libpuffs/puffs_ops.3 1.37-1.38
lib/libpuffs/requests.c 1.24
sys/fs/puffs/puffs_msgif.h 1.81
sys/fs/puffs/puffs_sys.h 1.85
sys/fs/puffs/puffs_vnops.c 1.183
usr.sbin/perfused/msg.c 1.22
Bring libpuffs, libperfuse and perfused on par with -current:
- implement FUSE direct I/O
- remove useless code and warnings
- fix missing GETATTR bugs
- fix extended attribute get and list operations
 1.77.8.2 12-Aug-2012  martin Pull up following revision(s) (requested by manu in ticket #438):
lib/libperfuse/perfuse_priv.h: revision 1.31
sys/fs/puffs/puffs_msgif.h: revision 1.80
sys/fs/puffs/puffs_vnops.c: revision 1.171
lib/libpuffs/puffs_ops.3: revision 1.31
sys/fs/puffs/puffs_vnops.c: revision 1.172
sys/fs/puffs/puffs_vnops.c: revision 1.173
sys/fs/puffs/puffs_vnops.c: revision 1.174
usr.sbin/perfused/perfused.c: revision 1.24
sys/fs/puffs/puffs_sys.h: revision 1.80
sys/fs/puffs/puffs_sys.h: revision 1.81
sys/fs/puffs/puffs_sys.h: revision 1.82
lib/libperfuse/subr.c: revision 1.19
lib/libperfuse/perfuse.c: revision 1.30
sys/fs/puffs/puffs_msgif.c: revision 1.90
sys/fs/puffs/puffs_msgif.c: revision 1.91
sys/fs/puffs/puffs_msgif.c: revision 1.92
lib/libperfuse/ops.c: revision 1.59
lib/libpuffs/puffs.3: revision 1.53
lib/libperfuse/debug.c: revision 1.12
lib/libpuffs/puffs.3: revision 1.54
sys/fs/puffs/puffs_vnops.c: revision 1.167
sys/fs/puffs/puffs_msgif.h: revision 1.79
usr.sbin/perfused/msg.c: revision 1.21
sys/fs/puffs/puffs_vfsops.c: revision 1.102
sys/fs/puffs/puffs_vfsops.c: revision 1.103
sys/fs/puffs/puffs_vfsops.c: revision 1.105
lib/libpuffs/puffs.h: revision 1.123
lib/libperfuse/perfuse_if.h: revision 1.20
lib/libperfuse/perfuse.c: revision 1.29
lib/libpuffs/dispatcher.c: revision 1.42
lib/libpuffs/dispatcher.c: revision 1.43
- Fix same vnodes associated with multiple cookies
The scheme used to retrieve known nodes on lookup was flawed, as it only
used parent and name. This produced a different cookie for the same file
if it was renamed, when looking up ../ or when dealing with multiple files
associated with the same name through link(2).
We therefore abandon the use of node name and introduce hashed lists of
inodes. This causes a huge rewrite of reclaim code, which does not attempt
to keep parents allocated until all their children are reclaimed.
- Fix race conditions in reclaim
There are a few situations where we issue multiple FUSE operations for
a PUFFS operation. On reclaim, we therefore have to wait for all FUSE
operations to complete, not just the current exchanges. We do this by
introducing node reference count with node_ref() and node_rele().
- Detect data loss caused by FAF
VOP_PUTPAGES causes FAF writes where the kernel does not check the
operation result. At least issue a warning on error.
- Enjoy FAF shortcut on setattr
No need to wait for the result if the kernel does not want it. There is
however an exception for setattr that touches the size: we need to wait
for completion because we have other operations queued for after the
resize.
- Fix fchmod() on write-open file
fchmod() on a node open with write privilege will send setattr with both mode
and size set. This confuses some FUSE filesystems. Therefore we send two FUSE
operations, one for mode, and one for size.
- Remove node TTL handling for netbsd-5 for simplicity's sake. The code
still builds on netbsd-5 but does not have the node TTL feature anymore.
It works fine with kernel support on netbsd-6.
- Improve PUFFS_KFLAG_CACHE_FS_TTL by reclaiming older inactive nodes.
The normal kernel behavior is to retain inactive nodes in the freelist
until it runs out of vnodes. This has some merit for local filesystems,
where the cost of an allocation is about the same as the cost of a
lookup. But that situation is not true for distributed filesystems.
On the other hand, keeping inactive nodes for a long time holds memory
in the file server process, and when the kernel runs out of vnodes, it
produces reclaim avalanches that increase latency for other operations.
We do not reclaim inactive vnodes immediately either, as they may be
looked up again shortly. Instead we introduce a grace time and we
reclaim nodes that have been inactive beyond the grace time.
- Fix lookup/reclaim race condition.
The above improvement uncovered a race condition between lookup and
reclaim. If we reclaimed a vnode associated with a userland cookie while
a lookup returning that same cookie was in progress, then the kernel ends
up with a vnode associated with a cookie that has been reclaimed in
userland. Next operation on the cookie will crash (or at least confuse)
the filesystem.
We fix this by introducing a lookup count in kernel and userland. On
reclaim, the kernel sends the count, which enables userland to detect
a situation where it initiated a lookup that is not completed in kernel.
In such a situation, the reclaim must be ignored, as the node is about
to be looked up again.
Fix hang unmount bug introduced by last commit.
We introduced a slow queue for delayed reclaims, while the existing
queue for unmount, flush and exit has been renamed fast queue. Both
queues had a timestamp for when an operation should be done, but it was
useless for the fast queue, which is always used to run an operation
ASAP. And the timestamp test had an error that turned ASAP into "at next
tick", but nobody was there to wake the thread at next tick, hence
the hang. The fix is to remove the useless and buggy timestamp test for
fast queue.
Rename slow sopreq queue into node sopreq queue, to reflect the fact that
it is only intended for postponed node reclaims.
When purging the node sopreq queue, do not call puffs_msg_sendresp(), as
it makes no sense.
Fix race condition between (create|mknod|mkdir|symlink) and reclaim, just
like we did it between lookup and reclaim.
Missing bit in previous commit (prevent race between create|mknod|mkdir|symlink
and reclaim)
Bump date for previous.
New sentence, new line; remove trailing whitespace; fix typos;
punctuation nits.
Add PUFFS_KFLAG_CACHE_DOTDOT so that vnodes hold a reference on their
parent, keeping them active, and allowing lookup of .. without sending
a request to the filesystem.
Enable the feature for perfused, as this is how FUSE works.
Missing bit in previous commit (PUFFS_KFLAG_CACHE_DOTDOT option to avoid
looking up ..)
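The parent-reference idea behind PUFFS_KFLAG_CACHE_DOTDOT can be pictured with a small sketch. The `struct vn` type and the three helpers below are hypothetical and heavily simplified; they only show how holding a reference on the parent lets ".." be answered locally, and do not reproduce the kernel's vnode code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified vnode: only what the sketch needs. */
struct vn {
	struct vn *parent;	/* referenced parent, NULL if not cached */
	unsigned int refs;	/* references held on this vnode */
};

/* Link child to parent, taking a reference that keeps the parent active. */
static void
vn_set_parent(struct vn *child, struct vn *parent)
{
	assert(child->parent == NULL);
	parent->refs++;
	child->parent = parent;
}

/* Resolve ".." from the cached pointer; NULL means "ask the filesystem". */
static struct vn *
vn_lookup_dotdot(const struct vn *v)
{
	return v->parent;
}

/* When the child goes away, drop the reference taken in vn_set_parent(). */
static void
vn_drop_parent(struct vn *child)
{
	if (child->parent != NULL) {
		assert(child->parent->refs > 0);
		child->parent->refs--;
		child->parent = NULL;
	}
}
```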
 1.77.8.1 23-Apr-2012  riz Pull up following revision(s) (requested by manu in ticket #195):
lib/libskey/skeysubr.c: revision 1.27
lib/libkvm/kvm_getloadavg.c: revision 1.11
lib/libwrap/update.c: revision 1.9
lib/liby/yyerror.c: revision 1.9
lib/libpuffs/puffs_ops.3: revision 1.30
lib/libwrap/misc.c: revision 1.10
lib/libwrap/hosts_access.c: revision 1.20
lib/libpuffs/pnode.c: revision 1.11
lib/libperfuse/subr.c: revision 1.17
lib/libpuffs/pnode.c: revision 1.12
lib/libperfuse/subr.c: revision 1.18
lib/libwrap/options.c: revision 1.15
lib/libwrap/fix_options.c: revision 1.11
lib/libperfuse/ops.c: revision 1.52
lib/libperfuse/ops.c: revision 1.53
lib/libperfuse/ops.c: revision 1.54
lib/libwrap/hosts_ctl.c: revision 1.5
lib/libintl/gettext.c: revision 1.27
lib/libwrap/shell_cmd.c: revision 1.6
lib/libpuffs/dispatcher.c: revision 1.39
lib/libperfuse/perfuse_priv.h: revision 1.27
lib/libwrap/socket.c: revision 1.19
lib/libpuffs/puffs.3: revision 1.50
lib/libperfuse/perfuse_priv.h: revision 1.28
lib/libpuffs/puffs_priv.h: revision 1.45
lib/libpuffs/puffs.3: revision 1.51
lib/libperfuse/perfuse_priv.h: revision 1.29
lib/libwrap/percent_x.c: revision 1.5
lib/libpuffs/puffs.3: revision 1.52
lib/libperfuse/debug.c: revision 1.11
sys/fs/puffs/puffs_vnops.c: revision 1.165
lib/libwrap/tcpd.h: revision 1.13
sys/fs/puffs/puffs_vnops.c: revision 1.166
lib/libwrap/eval.c: revision 1.7
sys/fs/puffs/puffs_msgif.h: revision 1.78
sys/fs/puffs/puffs_vfsops.c: revision 1.101
lib/libwrap/rfc931.c: revision 1.9
lib/libwrap/clean_exit.c: revision 1.5
lib/libpuffs/puffs.h: revision 1.120
lib/libc/stdlib/jemalloc.c: revision 1.27
lib/librmt/rmtlib.c: revision 1.26
lib/libpuffs/puffs.h: revision 1.121
sys/fs/puffs/puffs_sys.h: revision 1.79
lib/librumpclient/rumpclient.c: revision 1.48
lib/libwrap/refuse.c: revision 1.5
lib/libperfuse/perfuse.c: revision 1.26
lib/libperfuse/perfuse.c: revision 1.27
tests/fs/puffs/t_fuzz.c: revision 1.5
lib/libperfuse/perfuse.c: revision 1.28
lib/libpuffs/dispatcher.c: revision 1.40
sys/fs/puffs/puffs_node.c: revision 1.24
lib/libwrap/diag.c: revision 1.9
lib/libintl/textdomain.c: revision 1.13
Use C89 function definition
Add name and attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
Add PUFFS_KFLAG_CACHE_FS_TTL flag to puffs_init(3) to use name and
attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
The filesystem updates attributes and TTL using
puffs_pn_getvap(3), puffs_pn_getvattl(3), and puffs_pn_getcnttl(3)
Use new PUFFS_KFLAG_CACHE_FS_TTL option to puffs_init(3) so that
FUSE TTL on name and attributes are used. This saves many PUFFS
operations and improves performance.
PUFFS_KFLAG_CACHE_FS_TTL is #ifdef'ed in many places for now so that
libperfuse can still be used on netbsd-5.
Split file system.
Comma fixes.
Remove dangling "and".
Bump date for previous.
- Make sure update_va does not change vnode size when it should not. For
instance when doing a fault-issued VOP_GETPAGES within VOP_WRITE, changing
size leads to panic: genfs_getpages: past eof.
- Handle ticks wraparound for vnode name and attribute timeout
- When using PUFFS_KFLAG_CACHE_FS_TTL, do not use puffs_node to carry
attribute and TTL for a newly created node. Instead extend puffs_newinfo
and add puffs_newinfo_setva() and puffs_newinfo_setttl()
- Remove node_mk_common_final in libperfuse. It used to set uid/gid for
a newly created vnode but has been made redundant a long time ago since
uid and gid are properly set in FUSE header.
- In libperfuse, check for corner case where opc = 0 on INACTIVE and RECLAIM
(how is it possible? Check for it to avoid a crash anyway)
- In libperfuse, make sure we unlimit RLIMIT_AS and RLIMIT_DATA so that
we do not run out of memory because the kernel is lazy at reclaiming vnodes.
- In libperfuse, cleanup style of perfuse_destroy_pn()
Do not set PUFFS_KFLAG_CACHE_FS_TTL for PUFFS tests
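One common way to handle tick counter wraparound for timeouts, as mentioned in the list above, is to compare tick values by their modular difference rather than directly. The sketch below assumes a free-running 32-bit counter and uses the hypothetical helpers `ttl_deadline()` and `ttl_expired()`; the actual puffs code may do this differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half of the 32-bit counter range; TTLs must stay below this. */
#define TICK_HALF_RANGE UINT32_C(0x80000000)

/* Compute the tick value at which an entry cached at `now` expires. */
static uint32_t
ttl_deadline(uint32_t now, uint32_t ttl_ticks)
{
	assert(ttl_ticks < TICK_HALF_RANGE);
	return (uint32_t)(now + ttl_ticks);	/* wraps modulo 2^32 */
}

/*
 * True once `now` has reached or passed `deadline`. The unsigned
 * difference wraps modulo 2^32, so the test stays correct across a
 * counter wrap as long as the two values are less than 2^31 apart.
 */
static bool
ttl_expired(uint32_t now, uint32_t deadline)
{
	return (uint32_t)(now - deadline) < TICK_HALF_RANGE;
}
```

A plain `now >= deadline` comparison would report an entry as expired immediately whenever the deadline wraps past zero while `now` has not yet wrapped.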
 1.77.6.1 29-Apr-2012  mrg sync to latest -current.
 1.77.2.2 30-Oct-2012  yamt sync with head
 1.77.2.1 17-Apr-2012  yamt sync with head
 1.80.14.3 27-Feb-2015  martin Pull up following revision(s) (requested by manu in ticket #555):
lib/libpuffs/puffs.3: revision 1.60
sys/fs/puffs/puffs_msgif.h: revision 1.84
lib/libperfuse/ops.c: revision 1.83
sys/fs/puffs/puffs_sys.h: revision 1.89
sys/fs/puffs/puffs_vfsops.c: revision 1.116
lib/libperfuse/perfuse.c: revision 1.36
sys/fs/puffs/puffs_vnops.c: revision 1.200-1.202

Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE

FUSE filesystems do not expect to get metadata updates for [amc]time
and size; they update the values on their own after operations.

The PUFFS PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem, and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.

Update file size after write without metadata flush
If we do not use metadata flush, we must make sure the size is updated
in the filesystem after a write, otherwise the next GETATTR will get us
a stale value and the file will be truncated.
 1.80.14.2 05-Nov-2014  snj Pull up following revision(s) (requested by manu in ticket #181):
lib/libperfuse/fuse.h: revision 1.6
lib/libperfuse/ops.c: revision 1.78
lib/libperfuse/perfuse.c: revision 1.35
lib/libperfuse/perfuse_priv.h: revision 1.36
lib/libpuffs/dispatcher.c: revision 1.48
lib/libpuffs/opdump.c: revision 1.37
lib/libpuffs/puffs.c: revision 1.118
lib/libpuffs/puffs.h: revision 1.126
lib/libpuffs/puffs_ops.3: revisions 1.40-1.41
sys/fs/puffs/puffs_msgif.h: revision 1.82-1.83
sys/fs/puffs/puffs_msgif.h: revision 1.82
sys/fs/puffs/puffs_vnops.c: revision 1.196
Add PUFFS support for fallocate and fdiscard operations
--
libpuffs support for fallocate and fdiscard operations
--
Add PUFFS_HAVE_FALLOCATE in puffs_msgif.h so that filesystem can decide
at build time whether fallocate is usable
--
FUSE fallocate support
There seems to be no fdiscard FUSE operation at the moment, hence that
one is left unused.
 1.80.14.1 26-Aug-2014  riz Pull up following revision(s) (requested by manu in ticket #52):
sys/fs/puffs/puffs_msgif.h: revision 1.81
sys/fs/puffs/puffs_sys.h: revision 1.85
sys/fs/puffs/puffs_vnops.c: revision 1.183
Add an oflags input field to open requests so that the filesystem can pass
back information about the file. Implement PUFFS_OPEN_IO_DIRECT, which
will force direct IO (bypassing page cache) for the file.
 1.80.2.1 03-Dec-2017  jdolecek update from HEAD
 1.83.2.1 06-Apr-2015  skrll Sync with HEAD
 1.84.18.1 13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.85.8.1 03-Apr-2021  thorpej Sync with HEAD.
 1.38 08-Feb-2018  dholland Typos.
 1.37 20-Aug-2016  hannken Remove now obsolete operation vcache_remove().

Welcome to 7.99.36
 1.36 10-Nov-2014  maxv branches: 1.36.2;
Do not uselessly include <sys/malloc.h>.
 1.35 04-Nov-2014  manu Fix PUFFS node use-after-reclaim

When puffs_cookie2vnode() misses an entry, vcache_get()
creates a new node (puffs_vfsop_loadvnode being called to
initialize the PUFFS part), then it discovers it is VNON,
and tries to vrele() it. vrele() calls VOP_INACTIVE(),
which led us in puffs_vnop_inactive() where we sent a
request to the filesystem for a node that already had been
reclaimed.

The fix is to check for VNON nodes in puffs_vnop_inactive()
and to return without doing anything. This is suboptimal, but
a better workaround would probably need to modify vcache API,
with an impact on other filesystems. Let us keep it simple.
 1.34 30-Sep-2014  hannken Fix the puffs_sop_thread -> puffs_cookie2vnode path:
- pass the cookie by reference
- add missing mutex_exit()
- update assertion for VNON typed vnodes
 1.33 05-Sep-2014  manu When changing a directory content, update the ctime/mtime in kernel cache,
otherwise the updated ctime/mtime appears after the cached entry expires.
 1.32 28-Aug-2014  hannken Change puffs from hashlist to vcache.
- field "pa_nhashbuckets" of struct "puffs_kargs" becomes a no-op
and should be removed on the next protocol version bump.
 1.31 23-Jan-2014  hannken branches: 1.31.4;
Change vnode operations create, mknod, mkdir and symlink to return
the resulting vnode *vpp unlocked.

Discussed on tech-kern@

Welcome to 6.99.30
 1.30 17-Oct-2013  christos - remove unused variables
- add _NOERROR flavor macros for the case where errors are ignored.
 1.29 06-Mar-2013  yamt branches: 1.29.6;
comments
use sizeof(var) instead of sizeof(type) where possibly confusing
 1.28 05-Nov-2012  dholland Excise struct componentname from the namecache.

This uglifies the interface, because several operations need to be
passed the namei flags and cache_lookup also needs for the time being
to be passed cnp->cn_nameiop. Nonetheless, it's a net benefit.

The glop should be able to go away eventually but requires structural
cleanup elsewhere first.

This change requires a kernel bump.
 1.27 23-Jul-2012  manu branches: 1.27.2;
Backout NCHNAMLEN check for cache_enter. That change collided with rmind's
move of this exact check into cache_enter
 1.26 23-Jul-2012  manu Do not call cache_enter with path components bigger than NCHNAMLEN, as it
panics the kernel.
 1.25 22-Jul-2012  rmind Move some of the tests for MAKEENTRY into cache_enter(9). Make some
variables in vfs_cache.c static, __read_mostly, etc.

No objection on tech-kern@.
 1.24 08-Apr-2012  manu Add name and attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
 1.23 19-Jan-2012  manu branches: 1.23.2;
Fix a race condition where the filesystem looks up a vnode that is
being recycled, producing ENOENT while the file does exist.

Approved by yamt
 1.22 19-Oct-2011  manu branches: 1.22.2; 1.22.6;
Remove #ifdef DIAGNOSTIC guards around KASSERT, as the macro contains them
 1.21 18-Oct-2011  manu Make sure pagedaemon does not sleep for memory in puffs_vnop_sleep.
Add KASSERT on any sleeping memory allocation to check it cannot happen again.
 1.20 29-Aug-2011  manu Add a mutex for operations that touch size (setattr, getattr, write, fsync).

This is required to avoid data corruption bugs, where a getattr slices
itself within a setattr operation, and sets the size to the stale value
it got from the filesystem. That value is smaller than the one set by
setattr, and the call to uvm_vnp_setsize() triggered a spurious truncate.
The result is a chunk of zeroed data in the file.

Such a situation can easily happen when the ioflush thread issues a
VOP_FSYNC/puffs_vnop_sync/flushvncache/dosetattrn while another process
does a sys_stat/VOP_GETATTR/puffs_vnop_getattr.

This mutex on size operation can be removed the day we decide VOP_GETATTR
has to operate on a locked vnode, since the other operations that touch
size already require that.
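The locking pattern described in this entry can be illustrated in userland with a POSIX mutex. Everything below is a hypothetical sketch: `struct node`, its fields and the three functions are invented for illustration, and `fs_size` merely stands in for the value held by the file server. The point is that getattr fetches and applies the size under the same lock that setattr holds, so a stale value cannot be applied after a newer one.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical node: cached size plus the lock serializing size updates. */
struct node {
	pthread_mutex_t size_lock;
	uint64_t cached_size;	/* size as seen by the kernel cache */
	uint64_t fs_size;	/* stand-in for the file server's size */
};

static void
node_init(struct node *n)
{
	int rc = pthread_mutex_init(&n->size_lock, NULL);

	assert(rc == 0);
	(void)rc;
	n->cached_size = 0;
	n->fs_size = 0;
}

/* setattr: update the file server and the cache as one atomic step. */
static void
node_setattr_size(struct node *n, uint64_t size)
{
	pthread_mutex_lock(&n->size_lock);
	n->fs_size = size;
	n->cached_size = size;
	pthread_mutex_unlock(&n->size_lock);
}

/* getattr: fetch and apply under the lock so no setattr slips in between. */
static uint64_t
node_getattr_size(struct node *n)
{
	uint64_t size;

	pthread_mutex_lock(&n->size_lock);
	size = n->fs_size;
	n->cached_size = size;
	pthread_mutex_unlock(&n->size_lock);
	return size;
}
```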
 1.19 30-Jun-2011  wiz dependant -> dependent
 1.18 12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.17 25-Jul-2010  hannken branches: 1.17.6;
It makes no sense to call vget() with LK_RETRY.
 1.16 21-Jul-2010  hannken Make holding v_interlock mandatory for callers of vget().

Announced some time ago on tech-kern.
 1.15 05-Nov-2009  pooka branches: 1.15.2; 1.15.4;
Kill suspend support. It was never implemented correctly:
* it depended on the biglock (in a very cruel way)
* it was attached to userspace transactions rather than logical
fs operations

(If someone wants to revisit it some day, most of the stuff can be
reused from cvs history)
 1.14 30-Sep-2009  pooka * fix a race i introduced almost two years ago in rev 1.116:
operations creating a node cannot be considered outgoing operations,
since after return from userspace they modify file system state
by creating a new node. if we do not protect the file system by
holding the directory lock, a lookup operation might race us into
the kernel and create the node earlier.
* remove pnode from hashlist before sending the reclaim faf off to
userspace. also, hold pmp_lock while frobbing the list.
 1.13 06-May-2008  ad branches: 1.13.10; 1.13.18;
PR kern/37950 Unmounting psshfs immediately panics the machine

puffs_getvnode() was inserting vnodes into mnt_vnodelist without taking
a reference to the mount for each. When vnodes are scrubbed, refs to the
vnode's mount structure are dropped => boom.
 1.12 01-Mar-2008  rmind branches: 1.12.2; 1.12.4;
Welcome to 4.99.55:

- Add a lot of missing selinit() and seldestroy() calls.

- Merge selwakeup() and selnotify() calls into a single selnotify().

- Add an additional 'events' argument to selnotify() call. It will
indicate which event (POLL_IN, POLL_OUT, etc) happened. If unknown,
zero may be used.

Note: please pass appropriate value of 'events' where possible.
Proposed on: <tech-kern>
 1.11 28-Jan-2008  pooka branches: 1.11.2; 1.11.6;
For code clarity typedef void *puffs_cookie_t.

No functional change.
 1.10 24-Jan-2008  ad specfs changes for PR kern/37717 (raidclose() is no longer called on
shutdown). There are still problems with device access and a PR will be
filed.

- Kill checkalias(). Allow multiple vnodes to reference a single device.

- Don't play dangerous tricks with block vnodes to ensure that only one
vnode can describe a block device. Instead, prohibit concurrent opens of
block devices. As a bonus remove the unreliable code that prevents
multiple file system mounts on the same device. It's no longer needed.

- Track opens by vnode and by device. Issue cdev_close() when the last open
goes away, instead of abusing vnode::v_usecount to tell if the device is
open.
 1.9 02-Jan-2008  ad Merge vmlocking2 to head.
 1.8 17-Nov-2007  pooka branches: 1.8.2; 1.8.6;
Make puffs_updatenode() take a puffs_node instead of a vnode. This
way we don't need to worry if a vnode has been reclaimed from under
us.
 1.7 16-Nov-2007  pooka Restructure the messaging interface a bit more: make all interfacing
with the file server happen through puffs_msg_enqueue() and
puffs_msg_wait() instead of having a billion different routines.
Build the existing system upon these two. Most importantly though,
decouple insertion into the op queue from the actual wait. This
is useful for a number of reasons coming soon to a cvs repo near you.
 1.6 11-Oct-2007  pooka branches: 1.6.2; 1.6.4; 1.6.6; 1.6.8;
Part 1/n of some pretty extensive changes to how the kernel module
interacts with the userspace file server:

* since the kernel-user communication is not purely request-response
anymore (hasn't been since 2006), try to rename some "request" to
"message". more similar mangling will take place in the future.

* completely rework how messages are allocated. previously most of
them were borrowed from the stack (originally *all* of them),
but now always allocate dynamically. this makes the structure
of the code much cleaner. also makes it possible to fix a
locking order violation. it enables plenty of future enhancements.

* start generalizing the transport interface to be independent of puffs

* move transport interface to read/write instead of ioctl. the
old one had legacy design problems, and besides, ioctl's suck.
implement a very generic version for now; this will be
worked on later hopefully some day reaching "highly optimized".

* implement libpuffs support behind existing library request
interfaces. this will change eventually (I hate those interfaces)
 1.5 10-Oct-2007  ad Merge from vmlocking:

- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
 1.4 02-Oct-2007  pooka branches: 1.4.2; 1.4.4; 1.4.6;
If kernel resource allocation fails after the file server has
committed something, issue an abort. The abort is done through
the regular op channel, e.g. failed mkdir leads to regular rmdir,
inactive and reclaim. No internal interface is planned currently
for the one file system out of a million which would implement it
to benefit from the one case in a billion where kernel resource
allocation actually does fail and out of that one case in a trillion
where internal vs. external would make a difference.
 1.3 01-Oct-2007  pooka * better error checking: validate error values received from userland
to be valid errno values
* include string describing error in PUFFS_ERR
* get rid of union in puffs_req, it's nothing but trouble
* pass pmp to async i/o callbacks
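Validating an error value received from userland, as the first item above describes, might look roughly like the sketch below. The function name `check_userland_error()`, the bound `HYP_ERRNO_MAX` and the choice of EPROTO as the substitute value are assumptions made for illustration, not a description of the committed code.

```c
#include <assert.h>
#include <errno.h>

/* Assumed upper bound for a valid errno; a real kernel has its own limit. */
#define HYP_ERRNO_MAX 255

/*
 * Return `error` unchanged if it looks like a valid errno value
 * (0 meaning success), otherwise substitute EPROTO so that a buggy or
 * malicious file server cannot hand arbitrary values to the kernel.
 */
static int
check_userland_error(int error)
{
	/* The substitute value must itself pass the check. */
	assert(EPROTO > 0 && EPROTO <= HYP_ERRNO_MAX);

	if (error < 0 || error > HYP_ERRNO_MAX)
		return EPROTO;
	return error;
}
```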
 1.2 27-Sep-2007  pooka comments & other minor maintenance
 1.1 27-Sep-2007  pooka Split routines handling nodes from puffs_subr to puffs_node.
No functional change.
 1.4.6.6 28-Oct-2007  ad Fix up mnt_vnodelist handling.
 1.4.6.5 23-Oct-2007  ad Sync with head.
 1.4.6.4 12-Oct-2007  ad Sync with head.
 1.4.6.3 09-Oct-2007  ad Sync with head.
 1.4.6.2 09-Oct-2007  ad Sync with head.
 1.4.6.1 02-Oct-2007  ad file puffs_node.c was added on branch vmlocking on 2007-10-09 13:44:18 +0000
 1.4.4.3 14-Oct-2007  yamt sync with head.
 1.4.4.2 06-Oct-2007  yamt sync with head.
 1.4.4.1 02-Oct-2007  yamt file puffs_node.c was added on branch yamt-x86pmap on 2007-10-06 15:29:48 +0000
 1.4.2.4 21-Nov-2007  joerg Sync with HEAD.
 1.4.2.3 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.4.2.2 02-Oct-2007  joerg Sync with HEAD.
 1.4.2.1 02-Oct-2007  joerg file puffs_node.c was added on branch jmcneill-pm on 2007-10-02 18:28:52 +0000
 1.6.8.4 23-Mar-2008  matt sync with HEAD
 1.6.8.3 09-Jan-2008  matt sync with HEAD
 1.6.8.2 06-Nov-2007  matt sync with HEAD
 1.6.8.1 11-Oct-2007  matt file puffs_node.c was added on branch matt-armv6 on 2007-11-06 23:31:15 +0000
 1.6.6.2 18-Feb-2008  mjf Sync with HEAD.
 1.6.6.1 19-Nov-2007  mjf Sync with HEAD.
 1.6.4.6 17-Mar-2008  yamt sync with head.
 1.6.4.5 04-Feb-2008  yamt sync with head.
 1.6.4.4 21-Jan-2008  yamt sync with head
 1.6.4.3 07-Dec-2007  yamt sync with head
 1.6.4.2 27-Oct-2007  yamt sync with head.
 1.6.4.1 11-Oct-2007  yamt file puffs_node.c was added on branch yamt-lazymbuf on 2007-10-27 11:35:10 +0000
 1.6.2.1 18-Nov-2007  bouyer Sync with HEAD
 1.8.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.8.2.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.11.6.2 02-Jun-2008  mjf Sync with HEAD.
 1.11.6.1 03-Apr-2008  mjf Sync with HEAD.
 1.11.2.1 24-Mar-2008  keiichi sync with head.
 1.12.4.3 11-Aug-2010  yamt sync with head.
 1.12.4.2 11-Mar-2010  yamt sync with head
 1.12.4.1 16-May-2008  yamt sync with head.
 1.12.2.1 18-May-2008  yamt sync with head.
 1.13.18.1 21-Apr-2010  matt sync to netbsd-5
 1.13.10.4 25-Jan-2012  riz Pull up following revision(s) (requested by manu in ticket #1714):
sys/fs/puffs/puffs_node.c: revision 1.23
Fix a race condition where the filesystem looks up a vnode that is
being recycled, producing ENOENT while the file does exist.
Approved by yamt
 1.13.10.3 02-Nov-2011  riz Pull up following revision(s) (requested by manu in ticket #1679):
sys/fs/puffs/puffs_vnops.c: revision 1.157
sys/fs/puffs/puffs_vnops.c: revision 1.158
sys/fs/puffs/puffs_vnops.c: revision 1.159
sys/fs/puffs/puffs_vfsops.c: revision 1.97
sys/fs/puffs/puffs_vfsops.c: revision 1.99
sys/fs/puffs/puffs_vnops.c: revision 1.160
sys/fs/puffs/puffs_vfsops.c: revision 1.100
sys/miscfs/syncfs/sync_subr.c: revision 1.47
sys/fs/puffs/puffs_node.c: revision 1.21
sys/fs/puffs/puffs_node.c: revision 1.22
sys/fs/puffs/puffs_msgif.c: revision 1.88
sys/fs/puffs/puffs_msgif.c: revision 1.89
sys/fs/puffs/puffs_vnops.c: revision 1.156
Make sure ioflush does not sleep in PUFFS code path, waiting for a mutex,
a memory allocation, or a response from the filesystem.
This avoids deadlocks in the following situations:
1) when memory is low: ioflush waits for the filesystem, the filesystem waits
for memory
2) when the filesystem does not respond (e.g.: network outage on a
distributed filesystem)
Fix the build that was broken by struct lwp *updateproc reference in
RUMP-visible code. Instead of checking that updateproc (aka ioflush,
aka syncer) will not sleep in PUFFS code, I check for any kernel thread:
after all none of them are designed to hang waiting for a remote filesystem
operation to complete.
Roll back the change that forced kernel threads to not sleep in PUFFS.
The change did not reach consensus, since only pagedaemon should need it.
Other threads will tolerate sleeping, and problems here are only symptoms
that something is going wrong in memory management. The cause, not the
symptoms, needs to be fixed.
Make sure pagedaemon does not sleep for memory in puffs_vnop_sleep.
Add KASSERT on any sleeping memory allocation to check it cannot happen again.
Remove #ifdef DIAGNOSTIC guards around KASSERT, as the macro contains them
 1.13.10.2 17-Sep-2011  bouyer Pull up following revision(s) (requested by manu in ticket #1666):
sys/fs/puffs/puffs_sys.h: revision 1.78 via patch
sys/fs/puffs/puffs_node.c: revision 1.20 via patch
sys/fs/puffs/puffs_vnops.c: revision 1.155 via patch
Add a mutex for operations that touch size (setattr, getattr, write, fsync).
This is required to avoid data corruption bugs, where a getattr slices
itself within a setattr operation, and sets the size to the stale value
it got from the filesystem. That value is smaller than the one set by
setattr, and the call to uvm_vnp_setsize() triggered a spurious truncate.
The result is a chunk of zeroed data in the file.
Such a situation can easily happen when the ioflush thread issues a
VOP_FSYNC/puffs_vnop_sync/flushvncache/dosetattrn while another process
does a sys_stat/VOP_GETATTR/puffs_vnop_getattr.
This mutex on size operation can be removed the day we decide VOP_GETATTR
has to operate on a locked vnode, since the other operations that touch
size already require that.
 1.13.10.1 03-Oct-2009  snj Pull up following revision(s) (requested by pooka in ticket #1042):
sys/fs/puffs/puffs_node.c: revision 1.14
sys/fs/puffs/puffs_vnops.c: revision 1.134
* fix a race i introduced almost two years ago in rev 1.116:
operations creating a node cannot be considered outgoing operations,
since after return from userspace they modify file system state
by creating a new node. if we do not protect the file system by
holding the directory lock, a lookup operation might race us into
the kernel and create the node earlier.
* remove pnode from hashlist before sending the reclaim faf off to
userspace. also, hold pmp_lock while frobbing the list.
 1.15.4.3 19-May-2011  rmind Implement sharing of vnode_t::v_interlock amongst vnodes:
- Lock is shared amongst UVM objects using uvm_obj_setlock() or getnewvnode().
- Adjust vnode cache to handle unsharing, add VI_LOCKSHARE flag for that.
- Use sharing in tmpfs and layerfs for underlying object.
- Simplify locking in ubc_fault().
- Sprinkle some asserts.

Discussed with ad@.
 1.15.4.2 05-Mar-2011  rmind sync with head
 1.15.4.1 16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.15.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.17.6.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.22.6.2 29-Apr-2012  mrg sync to latest -current.
 1.22.6.1 18-Feb-2012  mrg merge to -current.
 1.22.2.4 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.22.2.3 16-Jan-2013  yamt sync with (a bit old) head
 1.22.2.2 30-Oct-2012  yamt sync with head
 1.22.2.1 17-Apr-2012  yamt sync with head
 1.23.2.3 03-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #1149):
sys/fs/puffs/puffs_node.c: revision 1.33
sys/fs/puffs/puffs_vnops.c: revision 1.185
When changing a directory content, update the ctime/mtime in kernel cache,
otherwise the updated ctime/mtime appears after the cached entry expires.
 1.23.2.2 12-Aug-2012  martin Pull up following revision(s) (requested by manu in ticket #484):
sys/fs/nilfs/nilfs_vnops.c: revision 1.18
sys/ufs/ufs/ufs_lookup.c: revision 1.117
sys/nfs/nfs_vnops.c: revision 1.295
sys/ufs/chfs/chfs_vnops.c: revision 1.8
sys/ufs/ext2fs/ext2fs_lookup.c: revision 1.70
sys/fs/unionfs/unionfs_vnops.c: revision 1.6
sys/kern/vfs_cache.c: revision 1.89
sys/fs/efs/efs_vnops.c: revision 1.26
sys/fs/hfs/hfs_vnops.c: revision 1.26
sys/fs/adosfs/adlookup.c: revision 1.16
sys/fs/puffs/puffs_vnops.c: revision 1.168
sys/fs/tmpfs/tmpfs_vnops.c: revision 1.98
sys/fs/ntfs/ntfs_vnops.c: revision 1.52
sys/fs/cd9660/cd9660_lookup.c: revision 1.20
sys/fs/msdosfs/msdosfs_lookup.c: revision 1.24
sys/fs/smbfs/smbfs_vnops.c: revision 1.80
sys/fs/udf/udf_vnops.c: revision 1.72
sys/fs/filecorefs/filecore_lookup.c: revision 1.14
sys/fs/puffs/puffs_node.c: revision 1.25
Move some of the tests for MAKEENTRY into cache_enter(9). Make some
variables in vfs_cache.c static, __read_mostly, etc.
No objection on tech-kern@.
 1.23.2.1 23-Apr-2012  riz Pull up following revision(s) (requested by manu in ticket #195):
lib/libskey/skeysubr.c: revision 1.27
lib/libkvm/kvm_getloadavg.c: revision 1.11
lib/libwrap/update.c: revision 1.9
lib/liby/yyerror.c: revision 1.9
lib/libpuffs/puffs_ops.3: revision 1.30
lib/libwrap/misc.c: revision 1.10
lib/libwrap/hosts_access.c: revision 1.20
lib/libpuffs/pnode.c: revision 1.11
lib/libperfuse/subr.c: revision 1.17
lib/libpuffs/pnode.c: revision 1.12
lib/libperfuse/subr.c: revision 1.18
lib/libwrap/options.c: revision 1.15
lib/libwrap/fix_options.c: revision 1.11
lib/libperfuse/ops.c: revision 1.52
lib/libperfuse/ops.c: revision 1.53
lib/libperfuse/ops.c: revision 1.54
lib/libwrap/hosts_ctl.c: revision 1.5
lib/libintl/gettext.c: revision 1.27
lib/libwrap/shell_cmd.c: revision 1.6
lib/libpuffs/dispatcher.c: revision 1.39
lib/libperfuse/perfuse_priv.h: revision 1.27
lib/libwrap/socket.c: revision 1.19
lib/libpuffs/puffs.3: revision 1.50
lib/libperfuse/perfuse_priv.h: revision 1.28
lib/libpuffs/puffs_priv.h: revision 1.45
lib/libpuffs/puffs.3: revision 1.51
lib/libperfuse/perfuse_priv.h: revision 1.29
lib/libwrap/percent_x.c: revision 1.5
lib/libpuffs/puffs.3: revision 1.52
lib/libperfuse/debug.c: revision 1.11
sys/fs/puffs/puffs_vnops.c: revision 1.165
lib/libwrap/tcpd.h: revision 1.13
sys/fs/puffs/puffs_vnops.c: revision 1.166
lib/libwrap/eval.c: revision 1.7
sys/fs/puffs/puffs_msgif.h: revision 1.78
sys/fs/puffs/puffs_vfsops.c: revision 1.101
lib/libwrap/rfc931.c: revision 1.9
lib/libwrap/clean_exit.c: revision 1.5
lib/libpuffs/puffs.h: revision 1.120
lib/libc/stdlib/jemalloc.c: revision 1.27
lib/librmt/rmtlib.c: revision 1.26
lib/libpuffs/puffs.h: revision 1.121
sys/fs/puffs/puffs_sys.h: revision 1.79
lib/librumpclient/rumpclient.c: revision 1.48
lib/libwrap/refuse.c: revision 1.5
lib/libperfuse/perfuse.c: revision 1.26
lib/libperfuse/perfuse.c: revision 1.27
tests/fs/puffs/t_fuzz.c: revision 1.5
lib/libperfuse/perfuse.c: revision 1.28
lib/libpuffs/dispatcher.c: revision 1.40
sys/fs/puffs/puffs_node.c: revision 1.24
lib/libwrap/diag.c: revision 1.9
lib/libintl/textdomain.c: revision 1.13
Use C89 function definition
Add name and attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
Add PUFFS_KFLAG_CACHE_FS_TTL flag to puffs_init(3) to use name and
attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
The filesystem updates attributes and TTL using
puffs_pn_getvap(3), puffs_pn_getvattl(3), and puffs_pn_getcnttl(3)
Use new PUFFS_KFLAG_CACHE_FS_TTL option to puffs_init(3) so that
FUSE TTL on name and attributes are used. This saves many PUFFS
operations and improves performance.
PUFFS_KFLAG_CACHE_FS_TTL is #ifdef'ed in many places for now so that
libperfuse can still be used on netbsd-5.
Split file system.
Comma fixes.
Remove dangling "and".
Bump date for previous.
- Make sure update_va does not change vnode size when it should not. For
instance when doing a fault-issued VOP_GETPAGES within VOP_WRITE, changing
size leads to panic: genfs_getpages: past eof.
- Handle ticks wraparound for vnode name and attribute timeout
- When using PUFFS_KFLAG_CACHE_FS_TTL, do not use puffs_node to carry
attribute and TTL for a newly created node. Instead extend puffs_newinfo
and add puffs_newinfo_setva() and puffs_newinfo_setttl()
- Remove node_mk_common_final in libperfuse. It used to set uid/gid for
a newly created vnode but has been made redundant a long time ago since
uid and gid are properly set in FUSE header.
- In libperfuse, check for corner case where opc = 0 on INACTIVE and RECLAIM
(how is it possible? Check for it to avoid a crash anyway)
- In libperfuse, make sure we unlimit RLIMIT_AS and RLIMIT_DATA so that
we do not run out of memory because the kernel is lazy at reclaiming vnodes.
- In libperfuse, cleanup style of perfuse_destroy_pn()
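The tick wrap-around item in the list above can be illustrated with a small sketch. The kernel change itself is not shown in this log, so the helper names `ttl_deadline` and `ttl_expired` and the 32-bit counter width are assumptions made for illustration only. The technique shown is comparing by signed difference rather than by magnitude, which stays correct when the counter wraps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not the puffs code.
 *
 * Compute the tick value at which a cached entry expires. The addition
 * is allowed to wrap; the TTL must stay below 2^31 ticks so that the
 * signed comparison in ttl_expired() remains unambiguous.
 */
static uint32_t
ttl_deadline(uint32_t now, uint32_t ttl_ticks)
{
	assert(ttl_ticks <= (uint32_t)INT32_MAX);
	return now + ttl_ticks;
}

/*
 * Report whether the deadline has passed. The subtraction is done in
 * unsigned arithmetic and the result is read as a signed value, so a
 * deadline that lies just past a counter wrap is still "in the future".
 * (The unsigned-to-signed conversion is implementation-defined in C but
 * behaves as two's complement on mainstream compilers.)
 */
static bool
ttl_expired(uint32_t now, uint32_t deadline)
{
	return (int32_t)(now - deadline) >= 0;
}
```

A plain `now >= deadline` test would wrongly report an entry as expired whenever the deadline wrapped past zero while `now` had not yet done so.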
Do not set PUFFS_KFLAG_CACHE_FS_TTL for PUFFS tests
 1.27.2.4 03-Dec-2017  jdolecek update from HEAD
 1.27.2.3 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.27.2.2 23-Jun-2013  tls resync from head
 1.27.2.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.29.6.1 18-May-2014  rmind sync with head
 1.31.4.5 17-Jan-2015  martin Pull up following revision(s) (requested by maxv in ticket #427):
sys/compat/svr4/svr4_schedctl.c: revision 1.8
sys/netinet/tcp_timer.c: revision 1.88
sys/miscfs/genfs/layer_vfsops.c: revision 1.45
sys/compat/svr4/svr4_ioctl.c: revision 1.37
sys/ufs/chfs/chfs_vfsops.c: revision 1.14
sys/miscfs/fdesc/fdesc_vfsops.c: revision 1.91
sys/compat/linux/arch/i386/linux_ptrace.c: revision 1.30
sys/compat/common/kern_time_50.c: revision 1.28
sys/netinet6/ip6_forward.c: revision 1.74
sys/miscfs/umapfs/umap_vnops.c: revision 1.57
sys/compat/svr4/svr4_fcntl.c: revision 1.74
distrib/sets/lists/comp/mi: revision 1.1931
sys/netinet6/udp6_output.c: revision 1.46
sys/fs/puffs/puffs_compat.c: revision 1.3
sys/fs/udf/udf_rename.c: revision 1.11
sys/compat/svr4/svr4_filio.c: revision 1.24
sys/fs/udf/udf_rename.c: revision 1.12
sys/netinet/tcp_usrreq.c: revision 1.202
sys/miscfs/umapfs/umap_subr.c: revision 1.29
sys/compat/linux/common/linux_fadvise64.c: revision 1.3
sys/netinet/if_atm.c: revision 1.34
sys/miscfs/procfs/procfs_subr.c: revision 1.106
sys/miscfs/genfs/layer_subr.c: revision 1.37
sys/netinet/tcp_sack.c: revision 1.30
sys/compat/freebsd/freebsd_misc.c: revision 1.33
sys/compat/freebsd/freebsd_file.c: revision 1.33
sys/ufs/chfs/chfs_vnode.c: revision 1.12
sys/compat/svr4/svr4_ttold.c: revision 1.34
sys/compat/linux/common/linux_file.c: revision 1.114
sys/compat/linux/arch/mips/linux_machdep.c: revision 1.43
sys/compat/linux/common/linux_signal.c: revision 1.76
sys/compat/common/compat_util.c: revision 1.46
sys/compat/linux/arch/arm/linux_ptrace.c: revision 1.18
sys/compat/svr4/svr4_sockio.c: revision 1.36
sys/compat/linux/arch/arm/linux_machdep.c: revision 1.32
sys/compat/svr4/svr4_signal.c: revision 1.66
sys/kern/kern_exec.c: revision 1.410
sys/fs/puffs/puffs_vfsops.c: revision 1.115
sys/compat/svr4/svr4_exec_elf64.c: revision 1.15
sys/compat/linux/arch/i386/linux_machdep.c: revision 1.159
sys/compat/linux/arch/alpha/linux_machdep.c: revision 1.50
sys/compat/linux32/common/linux32_misc.c: revision 1.24
sys/netinet/in_pcb.c: revision 1.153
sys/sys/malloc.h: revision 1.116
sys/compat/common/if_43.c: revision 1.9
share/man/man9/Makefile: revision 1.380
sys/netinet/tcp_vtw.c: revision 1.12
sys/miscfs/umapfs/umap_vfsops.c: revision 1.95
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.186
sys/compat/common/uipc_syscalls_43.c: revision 1.46
sys/ufs/ext2fs/ext2fs_vnops.c: revision 1.115
sys/fs/puffs/puffs_msgif.c: revision 1.97
sys/compat/svr4/svr4_ipc.c: revision 1.27
sys/compat/linux/common/linux_exec.c: revision 1.117
sys/ufs/ext2fs/ext2fs_readwrite.c: revision 1.66
sys/netinet/tcp_output.c: revision 1.179
sys/compat/svr4/svr4_termios.c: revision 1.28
sys/fs/udf/udf_strat_bootstrap.c: revision 1.4
sys/fs/puffs/puffs_subr.c: revision 1.67
sys/fs/puffs/puffs_node.c: revision 1.36
sys/miscfs/overlay/overlay_vnops.c: revision 1.21
sys/fs/cd9660/cd9660_node.c: revision 1.34
sys/netinet/raw_ip.c: revision 1.146
sys/sys/mallocvar.h: revision 1.13
sys/miscfs/overlay/overlay_vfsops.c: revision 1.63
share/man/man9/malloc.9: revision 1.50
sys/netinet6/dest6.c: revision 1.18
sys/compat/linux/common/linux_uselib.c: revision 1.33
sys/compat/linux/common/linux_socket.c: revision 1.120
share/man/man9/malloc.9: revision 1.51
sys/netinet/tcp_subr.c: revision 1.257
sys/compat/linux/common/linux_socketcall.c: revision 1.45
sys/compat/linux/common/linux_fadvise64_64.c: revision 1.3
sys/compat/freebsd/freebsd_ipc.c: revision 1.17
sys/compat/linux/common/linux_misc_notalpha.c: revision 1.109
sys/compat/linux/arch/alpha/linux_pipe.c: revision 1.17
sys/netinet6/in6_pcb.c: revision 1.132
sys/netinet6/in6_ifattach.c: revision 1.94
sys/compat/svr4/svr4_exec_elf32.c: revision 1.15
sys/miscfs/nullfs/null_vfsops.c: revision 1.90
sys/fs/cd9660/cd9660_util.c: revision 1.12
sys/compat/linux/arch/powerpc/linux_machdep.c: revision 1.48
sys/compat/freebsd/freebsd_exec_elf32.c: revision 1.20
sys/miscfs/procfs/procfs_vfsops.c: revision 1.94
sys/compat/linux/arch/powerpc/linux_ptrace.c: revision 1.28
sys/compat/linux/common/linux_sched.c: revision 1.67
sys/compat/linux/common/linux_exec_aout.c: revision 1.67
sys/compat/linux/common/linux_pipe.c: revision 1.67
sys/compat/linux/common/linux_llseek.c: revision 1.34
sys/compat/linux/arch/mips/linux_ptrace.c: revision 1.10
Do not uselessly include <sys/malloc.h>.
Cleanup:
- remove struct kmembuckets (dead)
- correctly deadify MALLOC_XX
- remove MALLOC_DEFINE_LIMIT and MALLOC_JUSTDEFINE_LIMIT (dead)
- remove malloc_roundup(), malloc_type_setlimit(), MALLOC_DEFINE_LIMIT()
and MALLOC_JUSTDEFINE_LIMIT() from man 9 malloc
New sentence, new line. Bump date for previous.
Obsolete malloc_roundup(9), malloc_type_setlimit(9) and MALLOC_DEFINE_LIMIT(9)
man pages.
 1.31.4.4 09-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #194):
sys/fs/puffs/puffs_vnops.c: revision 1.197
sys/fs/puffs/puffs_node.c: revision 1.35
Fix PUFFS node use-after-reclaim
When puffs_cookie2vnode() misses an entry, vcache_get()
creates a new node (puffs_vfsop_loadvnode being called to
initialize the PUFFS part), then it discovers it is VNON,
and tries to vrele() it. vrele() calls VOP_INACTIVE(),
which led us in puffs_vnop_inactive() where we sent a
request to the filesystem for a node that already had been
reclaimed.
The fix is to check for VNON nodes in puffs_vnop_inactive()
and to return without doing anything. This is suboptimal, but
a better workaround would probably need to modify vcache API,
with an impact on other filesystems. Let us keep it simple.
 1.31.4.3 30-Sep-2014  martin Pull up following revision(s) (requested by hannken in ticket #67):
sys/fs/puffs/puffs_node.c: revision 1.34
sys/fs/puffs/puffs_vnops.c: revision 1.187
Fix the puffs_sop_thread -> puffs_cookie2vnode path:
- pass the cookie by reference
- add missing mutex_exit()
- update assertion for VNON typed vnodes
 1.31.4.2 10-Sep-2014  martin Pull up following revision(s) (requested by manu in ticket #79):
sys/fs/puffs/puffs_node.c: revision 1.33
sys/fs/puffs/puffs_vnops.c: revision 1.185
When changing directory content, update the ctime/mtime in the kernel
cache; otherwise the updated ctime/mtime appears only after the cached
entry expires.
 1.31.4.1 29-Aug-2014  martin Pull up following revision(s) (requested by hannken in ticket #67):
sys/fs/puffs/puffs_sys.h: revision 1.86
sys/fs/puffs/puffs_vfsops.c: revision 1.114
sys/fs/puffs/puffs_msgif.c: revision 1.95
sys/fs/puffs/puffs_node.c: revision 1.32
sys/fs/puffs/puffs_vnops.c: revision 1.184
Change puffs from hashlist to vcache.
- field "pa_nhashbuckets" of struct "puffs_kargs" becomes a no-op
and should be removed on the next protocol version bump.
 1.36.2.1 05-Oct-2016  skrll Sync with HEAD
 1.67 10-Nov-2014  maxv Do not uselessly include <sys/malloc.h>.
 1.66 16-Nov-2008  pooka branches: 1.66.26; 1.66.42;
more <sys/buf.h> police
 1.65 01-Mar-2008  rmind branches: 1.65.4; 1.65.10; 1.65.12;
Welcome to 4.99.55:

- Add a lot of missing selinit() and seldestroy() calls.

- Merge selwakeup() and selnotify() calls into a single selnotify().

- Add an additional 'events' argument to selnotify() call. It will
indicate which event (POLL_IN, POLL_OUT, etc.) happened. If unknown,
zero may be used.

Note: please pass appropriate value of 'events' where possible.
Proposed on: <tech-kern>
 1.64 28-Jan-2008  pooka branches: 1.64.2; 1.64.6;
For code clarity typedef void *puffs_cookie_t.

No functional change.
 1.63 02-Jan-2008  pooka More type-punning workarounds. Curiously the kernel compilation
flags cause gcc to not complain.
 1.62 08-Dec-2007  pooka branches: 1.62.4;
Now that "l" is gone both as an argument to operations and from
componentname, remove all vestiges of puffs_cid.
 1.61 08-Dec-2007  pooka Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
 1.60 17-Nov-2007  pooka branches: 1.60.2;
Make puffs_updatenode() take a puffs_node instead of a vnode. This
way we don't need to worry if a vnode has been reclaimed from under
us.
 1.59 17-Nov-2007  pooka Implement a biodone callback for async writes similar to reads and
use that when possible.
 1.58 16-Nov-2007  pooka Restructure the messaging interface a bit more: make all interfacing
with the file server happen through puffs_msg_enqueue() and
puffs_msg_wait() instead of having a billion different routines.
Build the existing system upon these two. Most importantly though,
decouple insertion into the op queue from the actual wait. This
is useful for a number of reasons coming soon to a cvs repo near you.
 1.57 11-Oct-2007  pooka branches: 1.57.2; 1.57.4;
Part 1/n of some pretty extensive changes to how the kernel module
interacts with the userspace file server:

* since the kernel-user communication is not purely request-response
anymore (hasn't been since 2006), try to rename some "request" to
"message". more similar mangling will take place in the future.

* completely rework how messages are allocated. previously most of
them were borrowed from the stack (originally *all* of them),
but now always allocate dynamically. this makes the structure
of the code much cleaner. also makes it possible to fix a
locking order violation. it enables plenty of future enhancements.

* start generalizing the transport interface to be independent of puffs

* move transport interface to read/write instead of ioctl. the
old one had legacy design problems, and besides, ioctl's suck.
implement a very generic version for now; this will be
worked on later hopefully some day reaching "highly optimized".

* implement libpuffs support behind existing library request
interfaces. this will change eventually (I hate those interfaces)
 1.56 10-Oct-2007  ad Merge from vmlocking:

- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
 1.55 01-Oct-2007  pooka * better error checking: validate error values received from userland
to be valid errno values
* include string describing error in PUFFS_ERR
* get rid of union in puffs_req, it's nothing but trouble
* pass pmp to async i/o callbacks
 1.54 29-Sep-2007  pooka kill trailing whitespace
 1.53 27-Sep-2007  pooka Split routines handling nodes from puffs_subr to puffs_node.
No functional change.
 1.52 27-Sep-2007  pooka Revert previous, it makes no sense whatsoever.
 1.51 27-Sep-2007  pooka Undo state created in cookie2vnode if an error is returned.
 1.50 27-Sep-2007  pooka Differentiate between cookie2vnode returning an error and
return to caller, address unknown: no such cookie, no such node.
Make the callers use this info to either create a new vnode or bail.
 1.49 27-Sep-2007  pooka Add error notifications, which are used to deliver errors from the
kernel to the file server for silly things the file server did,
e.g. attempting to create a file with size VSIZENOTSET. The file
server can handle these as it chooses, but the default action is
for it to throw its hands in the air and sing "goodbye, cruel world,
it's over, walk on by".
 1.48 27-Sep-2007  pooka Don't forget to insert the root node on the hash list.

... I should remember to test also if unmounting a file system works
before I commit stuff.
 1.47 27-Sep-2007  pooka Fix a race in how new cookies are checked. Previously the checking
was done separate of inserting the cookie into the lookup structure
and without any form of interlock. This could lead to the same
cookie pointing to two different nodes. Remedy the race by creating
a separate "checked and ready to be inserted" cookie list which
serves as an interlock without having to hold a fs-global creation
lock.
 1.46 24-Sep-2007  pooka add a few comments and g/c dead code
 1.45 04-Sep-2007  pooka branches: 1.45.2;
* don't allow the file server to specify a node size to be VSIZENOTSET
* KASSERT that VNOVAL == VSIZENOTSET
 1.44 01-Aug-2007  pooka branches: 1.44.2; 1.44.4; 1.44.6;
add comment to flag a slight problem
 1.43 29-Jul-2007  ad It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
 1.42 22-Jul-2007  pooka Keep track of the maximum size we have supplied the file server (or
it has supplied us). If we fault pages which are at offset >= server
size, but less than the in-kernel vnode size, inform the file server
of the latest developments in file size before issuing the fault.
This avoids confusion with files which are not written start to finish.

fixes kern/36429 by yamt
 1.41 19-Jul-2007  pooka Initialize pnode to 0 after fetching it from the pool. At least
one effect is poll() working much better, as selinfo doesn't contain
random bits.
 1.40 09-Jul-2007  ad branches: 1.40.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.39 02-Jul-2007  pooka check for invalid vtype
 1.38 01-Jul-2007  pooka Give the file server the ability to request the entire pathname buffer
under lookup by using PUFFS_KFLAG_LOOKUP_FULLPNBUF instead of just the
current component.
 1.37 01-Jul-2007  pooka Instead of supplying a plain pid, supply an abstract struct puffs_cid *,
which can currently be used to query the pid and lwpid.
 1.36 01-Jul-2007  pooka make puffs_cred an opaque type
 1.35 24-Jun-2007  pooka Split the NOCACHE option in twain: NOCACHE_NAME & NOCACHE_PAGE.
 1.34 21-Jun-2007  pooka Refactor the pnode2vnode translation slightly so that VFS_ROOT
can use it directly.
 1.33 21-Jun-2007  pooka Reorganize how the root vnode is fetched so that it doesn't always
go through VFS_ROOT() and allow to fetch it without locking it.
This allows us to call the cache flush operations also for the root
vnode and most notably fixes e.g. a "No such file or directory"
for a psshfs root directory ls -l when a file was locally deleted
and remotely re-created.

Also fix some sloppy programming in root node fetch (mostly cosmetic).
 1.32 06-Jun-2007  pooka Move puffs to a two clause license where it already isn't so. And
as agc pointed out, even files with the third clause were already
effectively two clause because of a slight bug in the language...
 1.31 18-May-2007  pooka Support VOP_POLL. This requires some acrobatics on the puffs_node,
as we give a reference to userspace for the puffs_node for the
duration of the poll call. So reference count puffs_node separately
from the parent vnode. vref()/vrele() is not possible due to a possible
surprise visit from VOP_INACTIVE.
 1.30 17-May-2007  pooka Make it possible for the file server to specify the root vnode type
and other information instead of always using VDIR. To make this
possible without races, require all root node information already
in puffs_mount() and nuke puffs_start2() and the associated start
operation completely.

requested/inspired by Tobias Nygren
 1.29 08-May-2007  pooka Adventures in file systems, part (u_quad_t)-1: we can't use the
file system value for the size of device special files, as that
comes from specfs instead of the "host" file system. Therefore,
take care that getattr doesn't override the value of vp->v_size.
 1.28 01-May-2007  pooka Fix a problem introduced when I converted puffs to use newlock2:
when unmounting the file system in case of a certain timing (and
possibly some other conditions), a thread would wait on a condition
variable, while another thread broadcast the cv and immediately
proceeded to destroy it. The result was a system frozen completely
solid shortly after the process waiting for the cv woke up. So
introduce reference counting to synchronize destruction of the
resources in unmount.

I was able to repeat the problem only on my laptop in some special
cases, so I do not know how common it was. Ironically, killing
the file server process violently instead of unmount() didn't have
this problem because it never entered the unmount path from two
directions.
 1.27 30-Mar-2007  pooka * abstract ASYNCBIOREAD and let callers freely issue a callback called
from putop. even though there's only one user currently, makes code
more readable
* move "delta" to a standard parameter in vntouser and get rid of the
specialcase vntouser_delta
 1.26 29-Mar-2007  pooka Convert spinlocks & sleep/wakeup to newlock2 locking stuff. Fix a
bunch of bugs.

* park structures are now always allocated from a pool instead of a
mixed stack/malloc allocation
* get rid of the whole adjbuf concept, always just alloc the maximal
amount of memory to satisfy a request
* little regression: don't allow interrupting wait from file system
to userspace; this had problems already before, but now the problems
really started to shine through. I'll try to make this work again
some day.
* fix bmap to return a sensible value in runp
 1.25 20-Mar-2007  pooka * rework the page cache interaction a bit: cache metadata in the
kernel and flush it out all at once instead of continuous updating
* add support for delivering notifications to the file server about
when a page was written to (but disabled by default for now). the
file server can use this to request flushing or invalidating the
kernel page cache
 1.24 14-Mar-2007  pooka branches: 1.24.2;
Support B_READ|B_ASYNC in strategy by calling biodone() directly
when the file server puts the result.
 1.23 12-Mar-2007  ad branches: 1.23.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.
 1.22 27-Feb-2007  pooka branches: 1.22.2;
Make wait for the user file server PCATCHable. This makes it
possible to recover the system by just killing processes in case
a file server manages to recurse into itself either by fault of
file server implementation or by pilot error. The downside is that
the code is extremely hard to follow and practically screams out
for newlock2 (in addition to screaming "bug here"). The whole
PCATCH nonsense and induced megacomplexity can hopefully be avoided
in the future by tweaking other parts of the implementation.
 1.21 20-Feb-2007  ad Call genfs_node_destroy() where appropriate.
 1.20 16-Feb-2007  pooka branches: 1.20.2;
Check against root node cookie when fetching a new vnode and invoke
VFS_ROOT() if the cookies match. Without this fix, if the root
vnode was reclaimed, doing lookups for dotdot from the root vnode
was possible. In practice this occurred only through getcwd.
 1.19 15-Feb-2007  pooka Hide the debug prints behind PUFFSDEBUG instead of DEBUG. Make the
latter define the former.
 1.18 26-Jan-2007  pooka Initial attempt at suspend/snapshot support for userspace file
servers. This is still pretty much on the level "if it breaks ...".
It should work for single-threaded servers which handle one operation
from start to finish in one go. Also, it does not yet totally
correctly synchronize metadata and data in some cases. So needless
to say, it needs improvement, but it is possible that will have to
wait for some lock revampage.
 1.17 25-Jan-2007  pooka don't hold spinlocks (except vnode interlock) when doing vget()
 1.16 15-Jan-2007  pooka Store puffs_node's on lists hashed with the cookie value instead
of just one flat list.
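As a rough illustration of the change described in revision 1.16 above, the sketch below keeps nodes on per-bucket lists selected by hashing the cookie, so a lookup scans one short chain instead of every node. It is not the puffs implementation: the structure, the bucket count, the hash function, and all names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 16	/* illustrative; must be a power of two */

/* Hypothetical stand-in for a node identified by an opaque cookie. */
struct node {
	void		*cookie;	/* id chosen by the file server */
	struct node	*next;		/* next node in the same bucket */
};

static struct node *buckets[NBUCKETS];

/* Map a cookie to a bucket, discarding low bits that alignment keeps zero. */
static size_t
cookie_hash(void *cookie)
{
	return ((uintptr_t)cookie >> 4) & (NBUCKETS - 1);
}

/* Push a node onto the head of its bucket's list. */
static void
node_insert(struct node *n)
{
	size_t h;

	assert(n != NULL);
	h = cookie_hash(n->cookie);
	n->next = buckets[h];
	buckets[h] = n;
}

/* Walk only the bucket the cookie hashes to; NULL means no such node. */
static struct node *
node_lookup(void *cookie)
{
	struct node *n;

	for (n = buckets[cookie_hash(cookie)]; n != NULL; n = n->next)
		if (n->cookie == cookie)
			return n;
	return NULL;
}
```

Colliding cookies simply share a chain, so correctness does not depend on the quality of the hash, only lookup time does.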
 1.15 15-Jan-2007  pooka * do not accept the directory cookie as the result of a lookup (otherwise
we'd be locking against ourselves)
* do not accept duplicate cookies when creating new nodes
 1.14 09-Jan-2007  pooka Introduce flush operations, which the fs server can use to control
kernel caching. Currently supported are only flushing the name
cache for a directory or flushing the name cache for the entire fs.

Also, get rid of PNODE_INACTIVE status, since it was racy and
essentially didn't work. All this on top of being useless in the
first place ....
 1.13 30-Dec-2006  pooka branches: 1.13.2;
* use PUFFS_KFLAG_NOCACHE to also signal that we don't want the namecache
* enter files into the namecache immediately when new nodes are created
(if it's a caching mount, of course)
 1.12 29-Dec-2006  pooka rename the kernel-provided componentname to puffs_kcn; libpuffs now
provides puffs_cn built on top of it
 1.11 05-Dec-2006  pooka branches: 1.11.2;
shuffle functions around a bit: move the transport (/dev/puffs) to
a different file from the messaging (request contents). no functional
change
 1.10 05-Dec-2006  pooka Allow multiple requests to be transferred in each GET/PUTOP. For
a single request, the performance is still the same.
 1.9 18-Nov-2006  pooka branches: 1.9.2;
As a first generation best-effort hack, use NOCACHE to mean "file
size can change without the kernel knowing" and therefore query
the file size before invoking read or write operations.
 1.8 16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.7 07-Nov-2006  pooka attach to genfs & support page cache. most noticeable effect is
mmap and therefore execution of binaries starting to work, some
speed improvements with large file I/O also. caching semantics
and error case handling most likely need revisiting.
 1.6 27-Oct-2006  pooka fix checkalias true branch: don't unlock or lock twice
 1.5 27-Oct-2006  pooka support fifos
 1.4 26-Oct-2006  pooka support specfs
 1.3 26-Oct-2006  pooka Fix operations creating new nodes to honor the vnode locking protocol
if the userspace server returns an error. Fixes lockups if any
of the following operations failed: create, mknod, mkdir, symlink
 1.2 23-Oct-2006  pooka Apply a little eggwash to a deadlock condition where calling
getnewvnode() while holding on to any vnode lock deadlocks the
system if the file system is being forcibly unmounted.

Normal file systems don't trigger this problem because of two reasons:
1) they don't hold on to vnode locks while idling who-knows-where, so
the race doesn't trigger
2) they aren't usually unmounted with FORCE; puffs is, in case "someone"
manages to make a crashy userspace server

Nevertheless, a real solution is slowly being braised.
 1.1 22-Oct-2006  pooka kernel portion of puffs - the Pass-to-Userspace Framework File System.
It contains the VFS attachment and userspace message-passing interface.

This work was initially started and completed for Google SoC 2005
and tweaked to work a bit better in the past few weeks. While
being far from complete, it is functional and stable enough to
host a fairly general-purpose in-memory file system in
userspace. Even so, puffs should be considered experimental and
no binary compatibility for interfaces or crash-freedom or zero
security implications should be relied upon just yet.

The GSoC project was mentored by William Studenmund and the final
review for the code was done by Christos.
 1.9.2.4 01-Feb-2007  ad Sync with head.
 1.9.2.3 12-Jan-2007  ad Sync with head.
 1.9.2.2 18-Nov-2006  ad Sync with head.
 1.9.2.1 18-Nov-2006  ad file puffs_subr.c was added on branch newlock2 on 2006-11-18 21:39:20 +0000
 1.11.2.2 10-Dec-2006  yamt sync with head.
 1.11.2.1 05-Dec-2006  yamt file puffs_subr.c was added on branch yamt-splraiseipl on 2006-12-10 07:18:38 +0000
 1.13.2.9 17-Mar-2008  yamt sync with head.
 1.13.2.8 04-Feb-2008  yamt sync with head.
 1.13.2.7 21-Jan-2008  yamt sync with head
 1.13.2.6 07-Dec-2007  yamt sync with head
 1.13.2.5 27-Oct-2007  yamt sync with head.
 1.13.2.4 03-Sep-2007  yamt sync with head.
 1.13.2.3 26-Feb-2007  yamt sync with head.
 1.13.2.2 30-Dec-2006  yamt sync with head.
 1.13.2.1 30-Dec-2006  yamt file puffs_subr.c was added on branch yamt-lazymbuf on 2006-12-30 20:50:01 +0000
 1.20.2.6 17-May-2007  yamt sync with head.
 1.20.2.5 07-May-2007  yamt sync with head.
 1.20.2.4 15-Apr-2007  yamt sync with head.
 1.20.2.3 24-Mar-2007  yamt sync with head.
 1.20.2.2 12-Mar-2007  rmind Sync with HEAD.
 1.20.2.1 27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.22.2.14 12-Oct-2007  ad Sync with head.
 1.22.2.13 09-Oct-2007  ad Sync with head.
 1.22.2.12 09-Oct-2007  ad Sync with head.
 1.22.2.11 16-Sep-2007  ad Checkpoint work in progress on the vnode lifecycle and reference counting
stuff. This makes it work properly without kernel_lock and fixes a few
quite old bugs. See vfs_subr.c 1.283.2.17 for details.
 1.22.2.10 20-Aug-2007  ad Sync with HEAD.
 1.22.2.9 19-Aug-2007  ad - Back out the biodone() changes.
- Eliminate B_ERROR (from HEAD).
 1.22.2.8 15-Jul-2007  ad Sync with head.
 1.22.2.7 17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.22.2.6 09-Jun-2007  ad Sync with head.
 1.22.2.5 08-Jun-2007  ad Sync with head.
 1.22.2.4 10-Apr-2007  ad Sync with head.
 1.22.2.3 05-Apr-2007  ad Compile fixes.
 1.22.2.2 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.22.2.1 13-Mar-2007  ad Sync with head.
 1.23.2.1 11-Jul-2007  mjf Sync with head.
 1.24.2.1 29-Mar-2007  reinoud Pullup to -current
 1.40.2.2 10-Sep-2007  skrll Sync with HEAD.
 1.40.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.44.6.2 01-Aug-2007  pooka add comment to flag a slight problem
 1.44.6.1 01-Aug-2007  pooka file puffs_subr.c was added on branch matt-mips64 on 2007-08-01 14:20:46 +0000
 1.44.4.3 23-Mar-2008  matt sync with HEAD
 1.44.4.2 09-Jan-2008  matt sync with HEAD
 1.44.4.1 06-Nov-2007  matt sync with HEAD
 1.44.2.4 09-Dec-2007  jmcneill Sync with HEAD.
 1.44.2.3 21-Nov-2007  joerg Sync with HEAD.
 1.44.2.2 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.44.2.1 02-Oct-2007  joerg Sync with HEAD.
 1.45.2.2 14-Oct-2007  yamt sync with head.
 1.45.2.1 06-Oct-2007  yamt sync with head.
 1.57.4.3 18-Feb-2008  mjf Sync with HEAD.
 1.57.4.2 27-Dec-2007  mjf Sync with HEAD.
 1.57.4.1 19-Nov-2007  mjf Sync with HEAD.
 1.57.2.1 18-Nov-2007  bouyer Sync with HEAD
 1.60.2.1 26-Dec-2007  ad Sync with head.
 1.62.4.1 08-Jan-2008  bouyer Sync with HEAD
 1.64.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.64.6.1 03-Apr-2008  mjf Sync with HEAD.
 1.64.2.1 24-Mar-2008  keiichi sync with head.
 1.65.12.1 19-Jan-2009  skrll Sync with HEAD.
 1.65.10.1 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.65.4.1 04-May-2009  yamt sync with head.
 1.66.42.1 17-Jan-2015  martin Pull up following revision(s) (requested by maxv in ticket #427):
sys/compat/svr4/svr4_schedctl.c: revision 1.8
sys/netinet/tcp_timer.c: revision 1.88
sys/miscfs/genfs/layer_vfsops.c: revision 1.45
sys/compat/svr4/svr4_ioctl.c: revision 1.37
sys/ufs/chfs/chfs_vfsops.c: revision 1.14
sys/miscfs/fdesc/fdesc_vfsops.c: revision 1.91
sys/compat/linux/arch/i386/linux_ptrace.c: revision 1.30
sys/compat/common/kern_time_50.c: revision 1.28
sys/netinet6/ip6_forward.c: revision 1.74
sys/miscfs/umapfs/umap_vnops.c: revision 1.57
sys/compat/svr4/svr4_fcntl.c: revision 1.74
distrib/sets/lists/comp/mi: revision 1.1931
sys/netinet6/udp6_output.c: revision 1.46
sys/fs/puffs/puffs_compat.c: revision 1.3
sys/fs/udf/udf_rename.c: revision 1.11
sys/compat/svr4/svr4_filio.c: revision 1.24
sys/fs/udf/udf_rename.c: revision 1.12
sys/netinet/tcp_usrreq.c: revision 1.202
sys/miscfs/umapfs/umap_subr.c: revision 1.29
sys/compat/linux/common/linux_fadvise64.c: revision 1.3
sys/netinet/if_atm.c: revision 1.34
sys/miscfs/procfs/procfs_subr.c: revision 1.106
sys/miscfs/genfs/layer_subr.c: revision 1.37
sys/netinet/tcp_sack.c: revision 1.30
sys/compat/freebsd/freebsd_misc.c: revision 1.33
sys/compat/freebsd/freebsd_file.c: revision 1.33
sys/ufs/chfs/chfs_vnode.c: revision 1.12
sys/compat/svr4/svr4_ttold.c: revision 1.34
sys/compat/linux/common/linux_file.c: revision 1.114
sys/compat/linux/arch/mips/linux_machdep.c: revision 1.43
sys/compat/linux/common/linux_signal.c: revision 1.76
sys/compat/common/compat_util.c: revision 1.46
sys/compat/linux/arch/arm/linux_ptrace.c: revision 1.18
sys/compat/svr4/svr4_sockio.c: revision 1.36
sys/compat/linux/arch/arm/linux_machdep.c: revision 1.32
sys/compat/svr4/svr4_signal.c: revision 1.66
sys/kern/kern_exec.c: revision 1.410
sys/fs/puffs/puffs_vfsops.c: revision 1.115
sys/compat/svr4/svr4_exec_elf64.c: revision 1.15
sys/compat/linux/arch/i386/linux_machdep.c: revision 1.159
sys/compat/linux/arch/alpha/linux_machdep.c: revision 1.50
sys/compat/linux32/common/linux32_misc.c: revision 1.24
sys/netinet/in_pcb.c: revision 1.153
sys/sys/malloc.h: revision 1.116
sys/compat/common/if_43.c: revision 1.9
share/man/man9/Makefile: revision 1.380
sys/netinet/tcp_vtw.c: revision 1.12
sys/miscfs/umapfs/umap_vfsops.c: revision 1.95
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.186
sys/compat/common/uipc_syscalls_43.c: revision 1.46
sys/ufs/ext2fs/ext2fs_vnops.c: revision 1.115
sys/fs/puffs/puffs_msgif.c: revision 1.97
sys/compat/svr4/svr4_ipc.c: revision 1.27
sys/compat/linux/common/linux_exec.c: revision 1.117
sys/ufs/ext2fs/ext2fs_readwrite.c: revision 1.66
sys/netinet/tcp_output.c: revision 1.179
sys/compat/svr4/svr4_termios.c: revision 1.28
sys/fs/udf/udf_strat_bootstrap.c: revision 1.4
sys/fs/puffs/puffs_subr.c: revision 1.67
sys/fs/puffs/puffs_node.c: revision 1.36
sys/miscfs/overlay/overlay_vnops.c: revision 1.21
sys/fs/cd9660/cd9660_node.c: revision 1.34
sys/netinet/raw_ip.c: revision 1.146
sys/sys/mallocvar.h: revision 1.13
sys/miscfs/overlay/overlay_vfsops.c: revision 1.63
share/man/man9/malloc.9: revision 1.50
sys/netinet6/dest6.c: revision 1.18
sys/compat/linux/common/linux_uselib.c: revision 1.33
sys/compat/linux/common/linux_socket.c: revision 1.120
share/man/man9/malloc.9: revision 1.51
sys/netinet/tcp_subr.c: revision 1.257
sys/compat/linux/common/linux_socketcall.c: revision 1.45
sys/compat/linux/common/linux_fadvise64_64.c: revision 1.3
sys/compat/freebsd/freebsd_ipc.c: revision 1.17
sys/compat/linux/common/linux_misc_notalpha.c: revision 1.109
sys/compat/linux/arch/alpha/linux_pipe.c: revision 1.17
sys/netinet6/in6_pcb.c: revision 1.132
sys/netinet6/in6_ifattach.c: revision 1.94
sys/compat/svr4/svr4_exec_elf32.c: revision 1.15
sys/miscfs/nullfs/null_vfsops.c: revision 1.90
sys/fs/cd9660/cd9660_util.c: revision 1.12
sys/compat/linux/arch/powerpc/linux_machdep.c: revision 1.48
sys/compat/freebsd/freebsd_exec_elf32.c: revision 1.20
sys/miscfs/procfs/procfs_vfsops.c: revision 1.94
sys/compat/linux/arch/powerpc/linux_ptrace.c: revision 1.28
sys/compat/linux/common/linux_sched.c: revision 1.67
sys/compat/linux/common/linux_exec_aout.c: revision 1.67
sys/compat/linux/common/linux_pipe.c: revision 1.67
sys/compat/linux/common/linux_llseek.c: revision 1.34
sys/compat/linux/arch/mips/linux_ptrace.c: revision 1.10
Do not uselessly include <sys/malloc.h>.
Cleanup:
- remove struct kmembuckets (dead)
- correctly deadify MALLOC_XX
- remove MALLOC_DEFINE_LIMIT and MALLOC_JUSTDEFINE_LIMIT (dead)
- remove malloc_roundup(), malloc_type_setlimit(), MALLOC_DEFINE_LIMIT()
and MALLOC_JUSTDEFINE_LIMIT() from man 9 malloc
New sentence, new line. Bump date for previous.
Obsolete malloc_roundup(9), malloc_type_setlimit(9) and MALLOC_DEFINE_LIMIT(9)
man pages.
 1.66.26.1 03-Dec-2017  jdolecek update from HEAD
 1.91 27-Jan-2019  pgoyette Merge the [pgoyette-compat] branch
 1.90 07-Jul-2016  msaitoh branches: 1.90.16; 1.90.18;
KNF. Remove extra spaces. No functional change.
 1.89 15-Feb-2015  manu Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE

FUSE filesystems do not expect to get metadata updates for [amc]time
and size; they update the values on their own after operations.

The PUFFS PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem, and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.
 1.88 05-Oct-2014  apb branches: 1.88.2;
Add close brace, accidentally omitted from previous change.
 1.87 05-Oct-2014  apb Safer definitions of DPRINTF and DPRINTF_VERBOSE.

In the PUFFSDEBUG case, wrap do { ... } while (/*CONSTCOND*/0)
around the definitions. In the non-PUFFSDEBUG case, define them
as ((void)0) instead of as empty.
 1.86 28-Aug-2014  hannken Change puffs from hashlist to vcache.
- field "pa_nhashbuckets" of struct "puffs_kargs" becomes a no-op.
and should be removed on the next protocol version bump.
 1.85 16-Aug-2014  manu Add a oflags input field to open requests so that the filesystem can pass
back information about the file. Implement PUFFS_OPEN_IO_DIRECT, which
will force direct IO (bypassing page cache) for the file.
 1.84 17-Oct-2013  christos branches: 1.84.4;
- remove unused variables
- add _NOERROR flavor macros for the case where errors are ignored.
 1.83 06-Mar-2013  yamt branches: 1.83.6;
comment
 1.82 11-Aug-2012  manu branches: 1.82.2;
Missing bit in previous commit (PUFFS_KFLAG_CACHE_DOTDOT option to avoid
looking up ..)
 1.81 27-Jul-2012  manu Rename slow sopreq queue into node sopreq queue, to reflect the fact that
it is only intended for postponed node reclaims.
When purging the node sopreq queue, do not call puffs_msg_sendresp(), as
it makes no sense.
 1.80 21-Jul-2012  manu - Improve PUFFS_KFLAG_CACHE_FS_TTL by reclaiming older inactive nodes.

The normal kernel behavior is to retain inactive nodes in the freelist
until it runs out of vnodes. This has some merit for local filesystems,
where the cost of an allocation is about the same as the cost of a
lookup. But that situation is not true for distributed filesystems.
On the other hand, keeping inactive nodes for a long time holds memory
in the file server process, and when the kernel runs out of vnodes, it
produces reclaim avalanches that increase latency for other operations.

We do not reclaim inactive vnodes immediately either, as they may be
looked up again shortly. Instead we introduce a grace time and we
reclaim nodes that have been inactive beyond the grace time.

- Fix lookup/reclaim race condition.

The above improvement uncovered a race condition between lookup and
reclaim. If we reclaimed a vnode associated with a userland cookie while
a lookup returning that same cookie was in progress, then the kernel ends
up with a vnode associated with a cookie that has been reclaimed in
userland. Next operation on the cookie will crash (or at least confuse)
the filesystem.

We fix this by introducing a lookup count in kernel and userland. On
reclaim, the kernel sends the count, which enables userland to detect
situations where it initiated a lookup that is not completed in kernel.
In such a situation, the reclaim must be ignored, as the node is about
to be looked up again.
 1.79 08-Apr-2012  manu Add name and attribute cache with filesystem-provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
 1.78 29-Aug-2011  manu branches: 1.78.2; 1.78.6; 1.78.8;
Add a mutex for operations that touch size (setattr, getattr, write, fsync).

This is required to avoid data corruption bugs, where a getattr slices
itself within a setattr operation, and sets the size to the stale value
it got from the filesystem. That value is smaller than the one set by
setattr, and the call to uvm_vnp_setsize() triggered a spurious truncate.
The result is a chunk of zeroed data in the file.

Such a situation can easily happen when the ioflush thread issues a
VOP_FSYNC/puffs_vnop_sync/flushvncache/dosetattrn while another process
does a sys_stat/VOP_GETATTR/puffs_vnop_getattr.

This mutex on size operation can be removed the day we decide VOP_GETATTR
has to operate on a locked vnode, since the other operations that touch
size already require that.
 1.77 11-Jan-2011  kefren add advlock to puffs. ok pooka@
should fix kern/43321
 1.76 06-Jul-2010  pooka Add compat to enable running puffs in a 64bit time_t kernel against
a server which runs in 32bit time_t namespace.
 1.75 07-Jan-2010  pooka branches: 1.75.2; 1.75.4;
Rename PUFFS_SOPREQ_EXIT to PUFFS_SOPREQSYS_EXIT to better signal
it comes from within the kernel instead of as a direct result of
a user request.

no functional change
 1.74 07-Jan-2010  pooka Add a PUFFS_UNMOUNT server->kernel request, which causes the kernel
to initiate self destruct, i.e. unmount(MNT_FORCE). This, however,
is a semi-controlled self-destruct, since all caches are flushed
before the (possibly) violent unmount takes place.
 1.73 07-Dec-2009  pooka Process flush requests from the file server in a separate thread
context. This fixes a long-standing but seldom seen deadlock,
where the kernel was holding pages busy (due to e.g. readahead
request) while waiting for the server to respond, and the server
made a callback into the kernel asking to invalidate those pages.
... or, well, theoretically fixes, since I didn't have any reliable
way of repeating the deadlock and I think I saw it only twice.
 1.72 05-Nov-2009  pooka Kill suspend support. It was never implemented correctly:
* it depended on the biglock (in a very cruel way)
* it was attached to userspace transactions rather than logical
fs operations

(If someone wants to revisit it some day, most of the stuff can be
reused from cvs history)
 1.71 05-Nov-2009  pooka Reinstate PNODE_DYING. vmlocking had a brief hiatus when it was not
a valid optimization, but that's long gone and once VOP_INACTIVE is
called and the file server says that the vnode is going to be recycled,
it really is going to be recycled, extra references gained or not.
 1.70 28-Jan-2008  pooka branches: 1.70.10; 1.70.20; 1.70.28;
For code clarity typedef void *puffs_cookie_t.

No functional change.
 1.69 02-Jan-2008  pooka More type-punning workarounds. Curiously the kernel compilation
flags cause gcc to not complain.
 1.68 02-Jan-2008  ad Merge vmlocking2 to head.
 1.67 08-Dec-2007  pooka branches: 1.67.4;
Now that "l" is gone both as an argument to operations and from
componentname, remove all vestiges of puffs_cid.
 1.66 05-Dec-2007  pooka Send a response message for flush operations from the kernel instead
of abusing the return value of write(2).
 1.65 20-Nov-2007  pooka branches: 1.65.2;
Retire M_PUFFS, use kmem(9) instead.
 1.64 17-Nov-2007  pooka Make puffs_updatenode() take a puffs_node instead of a vnode. This
way we don't need to worry if a vnode has been reclaimed from under
us.
 1.63 17-Nov-2007  pooka Implement a biodone callback for async writes similar to reads and
use that when possible.
 1.62 16-Nov-2007  pooka Restructure the messaging interface a bit more: make all interfacing
with the file server happen through puffs_msg_enqueue() and
puffs_msg_wait() instead of having a billion different routines.
Build the existing system upon these two. Most importantly though,
decouple insertion into the op queue from the actual wait. This
is useful for a number of reasons coming soon to a cvs repo near you.
 1.61 12-Nov-2007  pooka Bounds-check responses from userspace.
 1.60 10-Nov-2007  pooka Part 2/n of extensive changes to request transport to/from userspace:

Rip the transport code completely out of puffs and generalize it
into an independent module which will be used for multiple purposes
in the future. This module is called the Pass-to-Userspace
Transporter (known as "putter" among friends).

This is very much work-in-progress and one dependency with puffs
remains: the request framing format.

The device name is still /dev/puffs, but that will change soon.

Users of puffs need the following in their kernel configs now:
pseudo-device putter
 1.59 11-Oct-2007  pooka branches: 1.59.2; 1.59.4;
Part 1/n of some pretty extensive changes to how the kernel module
interacts with the userspace file server:

* since the kernel-user communication is not purely request-response
anymore (hasn't been since 2006), try to rename some "request" to
"message". more similar mangling will take place in the future.

* completely rework how messages are allocated. previously most of
them were borrowed from the stack (originally *all* of them),
but now always allocate dynamically. this makes the structure
of the code much cleaner. also makes it possible to fix a
locking order violation. it enables plenty of future enhancements.

* start generalizing the transport interface to be independent of puffs

* move transport interface to read/write instead of ioctl. the
old one had legacy design problems, and besides, ioctl's suck.
implement a very generic version for now; this will be
worked on later hopefully some day reaching "highly optimized".

* implement libpuffs support behind existing library request
interfaces. this will change eventually (I hate those interfaces)
 1.58 09-Oct-2007  pooka g/c more unused stuff
 1.57 09-Oct-2007  pooka g/c vntouser_req(), it's not used anymore
 1.56 04-Oct-2007  pooka g/c the "sizeop" code previously used for ioctl/fcntl. It was already
commented out and has bitrotted beyond all recognition, so it needs
complete rethinking.
 1.55 02-Oct-2007  pooka If kernel resource allocation fails after the file server has
committed something, issue an abort. The abort is done through
the regular op channel, e.g. failed mkdir leads to regular rmdir,
inactive and reclaim. No internal interface is planned currently
for the one file system out of a million which would implement it
to benefit from the one case in a billion where kernel resource
allocation actually does fail and out of that one case in a trillion
where internal vs. external would make a difference.
 1.54 01-Oct-2007  pooka * better error checking: validate error values received from userland
to be valid errno values
* include string describing error in PUFFS_ERR
* get rid of union in puffs_req, it's nothing but trouble
* pass pmp to async i/o callbacks
 1.53 27-Sep-2007  pooka Split routines handling nodes from puffs_subr to puffs_node.
No functional change.
 1.52 27-Sep-2007  pooka Differentiate between cookie2vnode returning an error and
return to caller, address unknown: no such cookie, no such node.
Make the callers use this info to either create a new vnode or bail.
 1.51 27-Sep-2007  pooka Add error notifications, which are used to deliver errors from the
kernel to the file server for silly things the file server did,
e.g. attempting to create a file with size VSIZENOTSET. The file
server can handle these as it chooses, but the default action is
for it to throw its hands in the air and sing "goodbye, cruel world,
it's over, walk on by".
 1.50 27-Sep-2007  pooka Fix a race in how new cookies are checked. Previously the checking
was done separate of inserting the cookie into the lookup structure
and without any form of interlock. This could lead to the same
cookie pointing to two different nodes. Remedy the race by creating
a separate "checked and ready to be inserted" cookie list which
serves as an interlock without having to hold a fs-global creation
lock.
 1.49 24-Sep-2007  pooka add a few comments and g/c dead code
 1.48 30-Jul-2007  pooka branches: 1.48.4; 1.48.6; 1.48.8; 1.48.10;
Move PUFFS_TYPEPREFIX to puffs_msgif.h since it's used in a macro there.
 1.47 22-Jul-2007  pooka Keep track of the maximum size we have supplied the file server (or
it has supplied us). If we fault pages which are at offset >= server
size, but less than the in-kernel vnode size, inform the file server
of the latest developments in file size before issuing the fault.
This avoids confusion with files which are not written start to finish.

fixes kern/36429 by yamt
 1.46 17-Jul-2007  pooka branches: 1.46.2;
Set a file server supplied file system type in the type field and set
the mntfromname to be the place mounted from instead of the type.
 1.45 01-Jul-2007  pooka Give the file server the ability to request the entire pathname buffer
under lookup by using PUFFS_KFLAG_LOOKUP_FULLPNBUF instead of just the
current component.
 1.44 01-Jul-2007  pooka Instead of supplying a plain pid, supply an abstract struct puffs_cid *,
which can currently be used to query the pid and lwpid.
 1.43 01-Jul-2007  pooka make puffs_cred an opaque type
 1.42 24-Jun-2007  pooka Split the NOCACHE option in twain: NOCACHE_NAME & NOCACHE_PAGE.
 1.41 21-Jun-2007  pooka Refactor the pnode2vnode translation slightly so that VFS_ROOT
can use it directly.
 1.40 21-Jun-2007  pooka Reorganize how the root vnode is fetched so that it doesn't always
go through VFS_ROOT() and allow to fetch it without locking it.
This allows us to call the cache flush operations also for the root
vnode and most notably fixes e.g. a "No such file or directory"
for a psshfs root directory ls -l when a file was locally deleted
and remotely re-created.

Also fix some sloppy programming in root node fetch (mostly cosmetic).
 1.39 06-Jun-2007  pooka Move puffs to a two clause license where it already isn't so. And
as agc pointed out, even files with the third clause were already
effectively two clause because of a slight bug in the language...
 1.38 19-May-2007  pooka forgot to commit this with puffs_vnops.c 1.72:

Actually, we do need separate "no references in file server" and
"noref + inactive" flags if we wish to correctly support unix open
file semantics and optimize away pre-reclaim cache flushes. So,
add PNODE_DYING which stands for norefs + inactive.
 1.37 18-May-2007  pooka Introduce noref setbacks, which the file server can use to signal
the kernel it has 0 references to the node in question. In other
words, this can be used to avoid inactive(), or, if the file server
does not implement inactive, prompt reclaim for removed nodes.
 1.36 18-May-2007  pooka Support VOP_POLL. This requires some acrobatics on the puffs_node,
as we give a reference to userspace for the puffs_node for the
duration of the poll call. So reference count puffs_node separately
from the parent vnode. vref()/vrele() is not possible due to a possible
surprise visit from VOP_INACTIVE.
 1.35 17-May-2007  pooka Make it possible for the file server to specify the root vnode type
and other information instead of always using VDIR. To make this
possible without races, require all root node information already
in puffs_mount() and nuke puffs_start2() and the associated start
operation completely.

requested/inspired by Tobias Nygren
 1.34 07-May-2007  pooka Introduce puffs "setbacks", which can be used to set certain flags
for nodes upon return from the userspace. Currently it can be used
to indicate that the file server should be notified of "inactive"
in case the file server has opted to not receive inactive every
time the reference count for a vnode drops to zero. (inactive is
a common event, almost never requires any action and must be executed
synchronously, so it is wasteful).

While doing this, cleanup the release-relock nonsense from the
vntouser*() arguments. It was never enabled and the whole LOCKEDVP()
concept was very broken to begin with.
 1.33 01-May-2007  pooka Fix a problem introduced when I converted puffs to use newlock2:
when unmounting the file system in case of a certain timing (and
possibly some other conditions), a thread would wait on a condition
variable, while another thread broadcast the cv and immediately
proceeded to destroy it. The result was a system frozen completely
solid shortly after the process waiting for the cv woke up. So
introduce reference counting to synchronize destruction of the
resources in unmount.

I was able to repeat the problem only on my laptop in some special
cases, so I do not know how common it was. Ironically, killing
the file server process violently instead of unmount() didn't have
this problem because it never entered the unmount path from two
directions.
 1.32 16-Apr-2007  pooka Give the file server the ability to specify the file handle length
instead of defining a static length file handle on the framework-level.
 1.31 13-Apr-2007  pooka * add fhlen to kernel argument structure
* rename it to puffs_kargs instead of puffs_args
 1.30 04-Apr-2007  pooka Make it possible to interrupt waiters for fs operation completion
again. This is useful until locking is further developed and basically
any deadlocks can be solved by killing appropriate processes.

Thanks especially to Tommi Kyntola and Antti Louko for sitting down
with me and discussing resource ownership and locking strategies
in implementing this.
 1.29 30-Mar-2007  pooka * abstract ASYNCBIOREAD and let callers freely issue a callback called
from putop. even though there's only one user currently, makes code
more readable
* move "delta" to a standard parameter in vntouser and get rid of the
specialcase vntouser_delta
 1.28 29-Mar-2007  pooka Convert spinlocks & sleep/wakeup to newlock2 locking stuff. Fix a
bunch of bugs.

* park structures are now always allocated from a pool instead of a
mixed stack/malloc allocation
* get rid of the whole adjbuf concept, always just alloc the maximal
amount of memory to satisfy a request
* little regression: don't allow interrupting wait from file system
to userspace; this had problems already before, but now the problems
really started to shine through. I'll try to make this work again
some day.
* fix bmap to return a sensible value in runp
 1.27 20-Mar-2007  pooka * rework the page cache interaction a bit: cache metadata in the
kernel and flush it out all at once instead of continuous updating
* add support for delivering notifications to the file server about
when a page was written to (but disabled by default for now). the
file server can use this to request flushing or invalidating the
kernel page cache
 1.26 14-Mar-2007  pooka branches: 1.26.2;
Support B_READ|B_ASYNC in strategy by calling biodone() directly
when the file server puts the result.
 1.25 27-Feb-2007  pooka branches: 1.25.2; 1.25.4;
Make wait for the user file server PCATCHable. This makes it
possible to recover the system by just killing processes in case
a file server manages to recurse into itself either by fault of
file server implementation or by pilot error. The downside is that
the code is extremely hard to follow and practically screams out
for newlock2 (in addition to screaming "bug here"). The whole
PCATCH nonsense and induced megacomplexity can hopefully be avoided
in the future by tweaking other parts of the implementation.
 1.24 15-Feb-2007  pooka branches: 1.24.2;
Hide the debug prints behind PUFFSDEBUG instead of DEBUG. Make the
latter define the former.
 1.23 29-Jan-2007  hubertf Remove more duplicate headers.
Patch by Slava Semushin <slava.semushin@gmail.com>

Again, this was tested by comparing obj files from a pristine and a patched
source tree against an i386/ALL kernel, and also for src/sbin/fsck_ffs,
src/sbin/fsdb and src/usr.sbin/makefs. Only changes in assert() line numbers
were detected in 'objdump -d' output.
 1.22 26-Jan-2007  pooka Initial attempt at suspend/snapshot support for userspace file
servers. This is still pretty much on the level "if it breaks ...".
It should work for single-threaded servers which handle one operation
from start to finish in one go. Also, it does not yet totally
correctly synchronize metadata and data in some cases. So needless
to say, it needs improvement, but it is possible that will have to
wait for some lock revampage.
 1.21 21-Jan-2007  pooka optimize a bit: don't flush pages for vnodes which have no references
in the kernel or links in the backend
 1.20 15-Jan-2007  pooka Store puffs_node's on lists hashed with the cookie value instead
of just one flat list.
 1.19 15-Jan-2007  pooka * do not accept the directory cookie as the result of a lookup (otherwise
we'd be locking against ourselves)
* do not accept duplicate cookies when creating new nodes
 1.18 09-Jan-2007  pooka Introduce flush operations, which the fs server can use to control
kernel caching. Currently supported are only flushing the name
cache for a directory or flushing the name cache for the entire fs.

Also, get rid of PNODE_INACTIVE status, since it was racy and
essentially didn't work. All this on top of being useless in the
first place ....
 1.17 02-Jan-2007  pooka * check userspace version and prevent incompatible mount
* some general maintenance
 1.16 30-Dec-2006  pooka branches: 1.16.2;
* use PUFFS_KFLAG_NOCACHE to also signal that we don't want the namecache
* enter files into the namecache immediately when new nodes are created
(if it's a caching mount, of course)
 1.15 29-Dec-2006  pooka rename the kernel-provided componentname to puffs_kcn; libpuffs now
provides puffs_cn built on top of it
 1.14 10-Dec-2006  pooka Fix a race condition that would cause the mountpoint to be cleaned
from under someone waiting for the fs server response in puffs_unmount()
if the descriptor was closed during the response wait (such as a bug
leading to a crash in the fs implementation's unmount()).
 1.13 05-Dec-2006  pooka branches: 1.13.2;
shuffle functions around a bit: move the transport (/dev/puffs) to
a different file from the messaging (request contents). no functional
change
 1.12 05-Dec-2006  pooka Allow multiple requests to be transferred in each GET/PUTOP. For
a single request, the performance is still the same.
 1.11 01-Dec-2006  pooka prefix kernel flags with PUFFS_KFLAG to have a separate namespace
from the library flags
 1.10 01-Dec-2006  pooka don't call the fs server for all operations, only those it has told
us that it implements
 1.9 18-Nov-2006  pooka branches: 1.9.2;
As a first generation best-effort hack, use NOCACHE to mean "file
size can change without the kernel knowing" and therefore query
the file size before invoking read or write operations.
 1.8 17-Nov-2006  pooka Introduce uncached operation, makes sense when the file system backend
can be modified from elsewhere than the file system interface
 1.7 09-Nov-2006  pooka few renames to better differentiate between mount & start.. plus some
other renaming
 1.6 07-Nov-2006  pooka attach to genfs & support page cache. most noticeable effect is
mmap and therefore execution of binaries starting to work, some
speed improvements with large file I/O also. caching semantics
and error case handling most likely need revisiting.
 1.5 06-Nov-2006  pooka puffs_park always contains a specific puffs_req, so make it a member
instead of a pointer
 1.4 06-Nov-2006  pooka make it possible to build & load puffs as an LKM

by Lubomir Kundrak, PR kern/35000
 1.3 27-Oct-2006  pooka support fifos
 1.2 26-Oct-2006  pooka support specfs
 1.1 22-Oct-2006  pooka kernel portion of puffs - the Pass-to-Userspace Framework File System.
It contains the VFS attachment and userspace message-passing interface.

This work was initially started and completed for Google SoC 2005
and tweaked to work a bit better in the past few weeks. While
being far from complete, it is functional and stable enough to
host a fairly general-purpose in-memory file system in
userspace. Even so, puffs should be considered experimental and
no binary compatibility for interfaces or crash-freedom or zero
security implications should be relied upon just yet.

The GSoC project was mentored by William Studenmund and the final
review for the code was done by Christos.
 1.9.2.4 01-Feb-2007  ad Sync with head.
 1.9.2.3 12-Jan-2007  ad Sync with head.
 1.9.2.2 18-Nov-2006  ad Sync with head.
 1.9.2.1 18-Nov-2006  ad file puffs_sys.h was added on branch newlock2 on 2006-11-18 21:39:20 +0000
 1.13.2.3 18-Dec-2006  yamt sync with head.
 1.13.2.2 10-Dec-2006  yamt sync with head.
 1.13.2.1 05-Dec-2006  yamt file puffs_sys.h was added on branch yamt-splraiseipl on 2006-12-10 07:18:38 +0000
 1.16.2.9 04-Feb-2008  yamt sync with head.
 1.16.2.8 21-Jan-2008  yamt sync with head
 1.16.2.7 07-Dec-2007  yamt sync with head
 1.16.2.6 15-Nov-2007  yamt sync with head.
 1.16.2.5 27-Oct-2007  yamt sync with head.
 1.16.2.4 03-Sep-2007  yamt sync with head.
 1.16.2.3 26-Feb-2007  yamt sync with head.
 1.16.2.2 30-Dec-2006  yamt sync with head.
 1.16.2.1 30-Dec-2006  yamt file puffs_sys.h was added on branch yamt-lazymbuf on 2006-12-30 20:50:01 +0000
 1.24.2.5 17-May-2007  yamt sync with head.
 1.24.2.4 07-May-2007  yamt sync with head.
 1.24.2.3 15-Apr-2007  yamt sync with head.
 1.24.2.2 24-Mar-2007  yamt sync with head.
 1.24.2.1 12-Mar-2007  rmind Sync with HEAD.
 1.25.4.1 11-Jul-2007  mjf Sync with head.
 1.25.2.8 12-Oct-2007  ad Sync with head.
 1.25.2.7 09-Oct-2007  ad Sync with head.
 1.25.2.6 16-Sep-2007  ad Checkpoint work in progress on the vnode lifecycle and reference counting
stuff. This makes it work properly without kernel_lock and fixes a few
quite old bugs. See vfs_subr.c 1.283.2.17 for details.
 1.25.2.5 20-Aug-2007  ad Sync with HEAD.
 1.25.2.4 15-Jul-2007  ad Sync with head.
 1.25.2.3 09-Jun-2007  ad Sync with head.
 1.25.2.2 08-Jun-2007  ad Sync with head.
 1.25.2.1 10-Apr-2007  ad Sync with head.
 1.26.2.1 29-Mar-2007  reinoud Pullup to -current
 1.46.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.48.10.2 30-Jul-2007  pooka Move PUFFS_TYPEPREFIX to puffs_msgif.h since it's used in a macro there.
 1.48.10.1 30-Jul-2007  pooka file puffs_sys.h was added on branch matt-mips64 on 2007-07-30 09:04:59 +0000
 1.48.8.2 14-Oct-2007  yamt sync with head.
 1.48.8.1 06-Oct-2007  yamt sync with head.
 1.48.6.3 23-Mar-2008  matt sync with HEAD
 1.48.6.2 09-Jan-2008  matt sync with HEAD
 1.48.6.1 06-Nov-2007  matt sync with HEAD
 1.48.4.7 09-Dec-2007  jmcneill Sync with HEAD.
 1.48.4.6 21-Nov-2007  joerg Sync with HEAD.
 1.48.4.5 14-Nov-2007  joerg Sync with HEAD.
 1.48.4.4 11-Nov-2007  joerg Sync with HEAD.
 1.48.4.3 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.48.4.2 07-Oct-2007  joerg Sync with HEAD.
 1.48.4.1 02-Oct-2007  joerg Sync with HEAD.
 1.59.4.4 18-Feb-2008  mjf Sync with HEAD.
 1.59.4.3 27-Dec-2007  mjf Sync with HEAD.
 1.59.4.2 08-Dec-2007  mjf Sync with HEAD.
 1.59.4.1 19-Nov-2007  mjf Sync with HEAD.
 1.59.2.3 21-Nov-2007  bouyer Sync with HEAD
 1.59.2.2 18-Nov-2007  bouyer Sync with HEAD
 1.59.2.1 13-Nov-2007  bouyer Sync with HEAD
 1.65.2.3 26-Dec-2007  ad Sync with head.
 1.65.2.2 08-Dec-2007  ad Sync with head.
 1.65.2.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.67.4.2 08-Jan-2008  bouyer Sync with HEAD
 1.67.4.1 02-Jan-2008  bouyer Sync with HEAD
 1.70.28.1 21-Apr-2010  matt sync to netbsd-5
 1.70.20.3 17-Sep-2011  bouyer Pull up following revision(s) (requested by manu in ticket #1666):
sys/fs/puffs/puffs_sys.h: revision 1.78 via patch
sys/fs/puffs/puffs_node.c: revision 1.20 via patch
sys/fs/puffs/puffs_vnops.c: revision 1.155 via patch
Add a mutex for operations that touch size (setattr, getattr, write, fsync).
This is required to avoid data corruption bugs, where a getattr slices
itself within a setattr operation, and sets the size to the stale value
it got from the filesystem. That value is smaller than the one set by
setattr, and the call to uvm_vnp_setsize() triggered a spurious truncate.
The result is a chunk of zeroed data in the file.
Such a situation can easily happen when the ioflush thread issues a
VOP_FSYNC/puffs_vnop_sync/flushvncache/dosetattrn while another process
does a sys_stat/VOP_GETATTR/puffs_vnop_getattr.
This mutex on size operation can be removed the day we decide VOP_GETATTR
has to operate on a locked vnode, since the other operations that touch
size already require that.
 1.70.20.2 18-Jun-2011  bouyer Pull up following revision(s) (requested by manu in ticket #1623):
lib/libpuffs/puffs.c: revision 1.116
sys/fs/puffs/puffs_vnops.c: revision 1.151
Call advlock method if supplied
 1.70.20.1 09-Jan-2010  snj Pull up following revision(s) (requested by pooka in ticket #1212):
sys/fs/puffs/puffs_msgif.c: revision 1.76 via patch
sys/fs/puffs/puffs_sys.h: revision 1.73 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.84 via patch
Process flush requests from the file server in a separate thread
context. This fixes a long-standing but seldom seen deadlock,
where the kernel was holding pages busy (due to e.g. readahead
request) while waiting for the server to respond, and the server
made a callback into the kernel asking to invalidate those pages.
... or, well, theoretically fixes, since I didn't have any reliable
way of repeating the deadlock and I think I saw it only twice.
 1.70.10.2 11-Aug-2010  yamt sync with head.
 1.70.10.1 11-Mar-2010  yamt sync with head
 1.75.4.1 05-Mar-2011  rmind sync with head
 1.75.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.78.8.4 27-Feb-2015  martin Pull up following revision(s) (requested by manu in ticket #1260):
lib/libpuffs/puffs.3: revision 1.55, 1.60
sys/fs/puffs/puffs_msgif.h: revision 1.84
lib/libperfuse/ops.c: revision 1.83
sys/fs/puffs/puffs_sys.h: revision 1.89
sys/fs/puffs/puffs_vfsops.c: revision 1.116
lib/libperfuse/perfuse.c: revision 1.36
sys/fs/puffs/puffs_vnops.c: revision 1.200-1.202

Use more markup. New sentence, new line. Bump date for previous.

Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE
FUSE filesystems do not expect to get metadata updates for [amc]time
and size; they update the values on their own after operations.

The PUFFS PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem, and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.

Update file size after write without metadata flush
If we do not use metadata flush, we must make sure the size is updated
in the filesystem after a write, otherwise the next GETATTR will get us
a stale value and the file will be truncated.
 1.78.8.3 03-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #1140):
lib/libperfuse/ops.c 1.63-1.69
lib/libperfuse/perfuse.c 1.32-1.33
lib/libperfuse/perfuse_priv.h 1.32-1.34
lib/libperfuse/subr.c 1.20
lib/libpuffs/creds.c 1.16
lib/libpuffs/dispatcher.c 1.47
lib/libpuffs/puffs.h 1.125
lib/libpuffs/puffs_ops.3 1.37-1.38
lib/libpuffs/requests.c 1.24
sys/fs/puffs/puffs_msgif.h 1.81
sys/fs/puffs/puffs_sys.h 1.85
sys/fs/puffs/puffs_vnops.c 1.183
usr.sbin/perfused/msg.c 1.22
Bring libpuffs, libperfuse and perfused on par with -current:
- implement FUSE direct I/O
- remove useless code and warnings
- fix missing GETATTR bugs
- fix extended attribute get and list operations
 1.78.8.2 12-Aug-2012  martin Pull up following revision(s) (requested by manu in ticket #438):
lib/libperfuse/perfuse_priv.h: revision 1.31
sys/fs/puffs/puffs_msgif.h: revision 1.80
sys/fs/puffs/puffs_vnops.c: revision 1.171
lib/libpuffs/puffs_ops.3: revision 1.31
sys/fs/puffs/puffs_vnops.c: revision 1.172
sys/fs/puffs/puffs_vnops.c: revision 1.173
sys/fs/puffs/puffs_vnops.c: revision 1.174
usr.sbin/perfused/perfused.c: revision 1.24
sys/fs/puffs/puffs_sys.h: revision 1.80
sys/fs/puffs/puffs_sys.h: revision 1.81
sys/fs/puffs/puffs_sys.h: revision 1.82
lib/libperfuse/subr.c: revision 1.19
lib/libperfuse/perfuse.c: revision 1.30
sys/fs/puffs/puffs_msgif.c: revision 1.90
sys/fs/puffs/puffs_msgif.c: revision 1.91
sys/fs/puffs/puffs_msgif.c: revision 1.92
lib/libperfuse/ops.c: revision 1.59
lib/libpuffs/puffs.3: revision 1.53
lib/libperfuse/debug.c: revision 1.12
lib/libpuffs/puffs.3: revision 1.54
sys/fs/puffs/puffs_vnops.c: revision 1.167
sys/fs/puffs/puffs_msgif.h: revision 1.79
usr.sbin/perfused/msg.c: revision 1.21
sys/fs/puffs/puffs_vfsops.c: revision 1.102
sys/fs/puffs/puffs_vfsops.c: revision 1.103
sys/fs/puffs/puffs_vfsops.c: revision 1.105
lib/libpuffs/puffs.h: revision 1.123
lib/libperfuse/perfuse_if.h: revision 1.20
lib/libperfuse/perfuse.c: revision 1.29
lib/libpuffs/dispatcher.c: revision 1.42
lib/libpuffs/dispatcher.c: revision 1.43
- Fix same vnodes associated with multiple cookies
The scheme used to retrieve known nodes on lookup was flawed, as it only
used parent and name. This produced a different cookie for the same file
if it was renamed, when looking up ../ or when dealing with multiple files
associated with the same name through link(2).
We therefore abandon the use of node name and introduce hashed lists of
inodes. This causes a huge rewrite of reclaim code, which does not attempt
to keep parents allocated until all their children are reclaimed.
- Fix race conditions in reclaim
There are a few situations where we issue multiple FUSE operations for
a PUFFS operation. On reclaim, we therefore have to wait for all FUSE
operations to complete, not just the current exchanges. We do this by
introducing node reference count with node_ref() and node_rele().
- Detect data loss caused by FAF
VOP_PUTPAGES causes FAF writes where the kernel does not check the
operation result. At least issue a warning on error.
- Enjoy FAF shortcut on setattr
No need to wait for the result if the kernel does not want it. There is
however an exception for setattr that touches the size: we need to wait
for completion because we have other operations queued for after the
resize.
- Fix fchmod() on write-open file
fchmod() on a node open with write privilege will send setattr with both mode
and size set. This confuses some FUSE filesystems. Therefore we send two FUSE
operations, one for mode, and one for size.
- Remove node TTL handling for netbsd-5 for simplicity's sake. The code
still builds on netbsd-5 but does not have the node TTL feature anymore.
It works fine with kernel support on netbsd-6.
- Improve PUFFS_KFLAG_CACHE_FS_TTL by reclaiming older inactive nodes.
The normal kernel behavior is to retain inactive nodes in the freelist
until it runs out of vnodes. This has some merit for local filesystems,
where the cost of an allocation is about the same as the cost of a
lookup. But that situation is not true for distributed filesystems.
On the other hand, keeping inactive nodes for a long time holds memory
in the file server process, and when the kernel runs out of vnodes, it
produces reclaim avalanches that increase latency for other operations.
We do not reclaim inactive vnodes immediately either, as they may be
looked up again shortly. Instead we introduce a grace time and we
reclaim nodes that have been inactive beyond the grace time.
- Fix lookup/reclaim race condition.
The above improvement uncovered a race condition between lookup and
reclaim. If we reclaimed a vnode associated with a userland cookie while
a lookup returning that same cookie was in progress, then the kernel ends
up with a vnode associated with a cookie that has been reclaimed in
userland. Next operation on the cookie will crash (or at least confuse)
the filesystem.
We fix this by introducing a lookup count in kernel and userland. On
reclaim, the kernel sends the count, which enables userland to detect
a situation where it initiated a lookup that is not completed in kernel.
In such a situation, the reclaim must be ignored, as the node is about
to be looked up again.
Fix hang unmount bug introduced by last commit.
We introduced a slow queue for delayed reclaims, while the existing
queue for unmount, flush and exit has been renamed fast queue. Both
queues had timestamp for when an operation should be done, but it was
useless for the fast queue, which is always used to run an operation
ASAP. And the timestamp test had an error that turned ASAP into "at next
tick", but nobody was there to wake the thread at next tick, hence
the hang. The fix is to remove the useless and buggy timestamp test for
fast queue.
Rename slow sopreq queue into node sopreq queue, to reflect the fact that
it is only intended for postponed node reclaims.
When purging the node sopreq queue, do not call puffs_msg_sendresp(), as
it makes no sense.
Fix race condition between (create|mknod|mkdir|symlink) and reclaim, just
like we did it between lookup and reclaim.
Missing bit in previous commit (prevent race between create|mknod|mkdir|symlink
and reclaim)
Bump date for previous.
New sentence, new line; remove trailing whitespace; fix typos;
punctuation nits.
Add PUFFS_KFLAG_CACHE_DOTDOT so that vnodes hold a reference on their
parent, keeping them active, and allowing to lookup .. without sending
a request to the filesystem.
Enable the feature for perfused, as this is how FUSE works.
Missing bit in previous commit (PUFFS_KFLAG_CACHE_DOTDOT option to avoid
looking up ..)
 1.78.8.1 23-Apr-2012  riz Pull up following revision(s) (requested by manu in ticket #195):
lib/libskey/skeysubr.c: revision 1.27
lib/libkvm/kvm_getloadavg.c: revision 1.11
lib/libwrap/update.c: revision 1.9
lib/liby/yyerror.c: revision 1.9
lib/libpuffs/puffs_ops.3: revision 1.30
lib/libwrap/misc.c: revision 1.10
lib/libwrap/hosts_access.c: revision 1.20
lib/libpuffs/pnode.c: revision 1.11
lib/libperfuse/subr.c: revision 1.17
lib/libpuffs/pnode.c: revision 1.12
lib/libperfuse/subr.c: revision 1.18
lib/libwrap/options.c: revision 1.15
lib/libwrap/fix_options.c: revision 1.11
lib/libperfuse/ops.c: revision 1.52
lib/libperfuse/ops.c: revision 1.53
lib/libperfuse/ops.c: revision 1.54
lib/libwrap/hosts_ctl.c: revision 1.5
lib/libintl/gettext.c: revision 1.27
lib/libwrap/shell_cmd.c: revision 1.6
lib/libpuffs/dispatcher.c: revision 1.39
lib/libperfuse/perfuse_priv.h: revision 1.27
lib/libwrap/socket.c: revision 1.19
lib/libpuffs/puffs.3: revision 1.50
lib/libperfuse/perfuse_priv.h: revision 1.28
lib/libpuffs/puffs_priv.h: revision 1.45
lib/libpuffs/puffs.3: revision 1.51
lib/libperfuse/perfuse_priv.h: revision 1.29
lib/libwrap/percent_x.c: revision 1.5
lib/libpuffs/puffs.3: revision 1.52
lib/libperfuse/debug.c: revision 1.11
sys/fs/puffs/puffs_vnops.c: revision 1.165
lib/libwrap/tcpd.h: revision 1.13
sys/fs/puffs/puffs_vnops.c: revision 1.166
lib/libwrap/eval.c: revision 1.7
sys/fs/puffs/puffs_msgif.h: revision 1.78
sys/fs/puffs/puffs_vfsops.c: revision 1.101
lib/libwrap/rfc931.c: revision 1.9
lib/libwrap/clean_exit.c: revision 1.5
lib/libpuffs/puffs.h: revision 1.120
lib/libc/stdlib/jemalloc.c: revision 1.27
lib/librmt/rmtlib.c: revision 1.26
lib/libpuffs/puffs.h: revision 1.121
sys/fs/puffs/puffs_sys.h: revision 1.79
lib/librumpclient/rumpclient.c: revision 1.48
lib/libwrap/refuse.c: revision 1.5
lib/libperfuse/perfuse.c: revision 1.26
lib/libperfuse/perfuse.c: revision 1.27
tests/fs/puffs/t_fuzz.c: revision 1.5
lib/libperfuse/perfuse.c: revision 1.28
lib/libpuffs/dispatcher.c: revision 1.40
sys/fs/puffs/puffs_node.c: revision 1.24
lib/libwrap/diag.c: revision 1.9
lib/libintl/textdomain.c: revision 1.13
Use C89 function definition
Add name and attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
Add PUFFS_KFLAG_CACHE_FS_TTL flag to puffs_init(3) to use name and
attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
The filesystem updates attributes and TTL using
puffs_pn_getvap(3), puffs_pn_getvattl(3), and puffs_pn_getcnttl(3)
Use new PUFFS_KFLAG_CACHE_FS_TTL option to puffs_init(3) so that
FUSE TTL on name and attributes are used. This saves many PUFFS
operations and improves performance.
PUFFS_KFLAG_CACHE_FS_TTL is #ifdef'ed in many places for now so that
libperfuse can still be used on netbsd-5.
Split file system.
Comma fixes.
Remove dangling "and".
Bump date for previous.
- Make sure update_va does not change vnode size when it should not. For
instance when doing a fault-issued VOP_GETPAGES within VOP_WRITE, changing
size leads to panic: genfs_getpages: past eof.
- Handle ticks wraparound for vnode name and attribute timeout
- When using PUFFS_KFLAG_CACHE_FS_TTL, do not use puffs_node to carry
attribute and TTL for a newly created node. Instead extend puffs_newinfo
and add puffs_newinfo_setva() and puffs_newinfo_setttl()
- Remove node_mk_common_final in libperfuse. It used to set uid/gid for
a newly created vnode but has been made redundant a long time ago since
uid and gid are properly set in FUSE header.
- In libperfuse, check for corner case where opc = 0 on INACTIVE and RECLAIM
(how is it possible? Check for it to avoid a crash anyway)
- In libperfuse, make sure we unlimit RLIMIT_AS and RLIMIT_DATA so that
we do not run out of memory because the kernel is lazy at reclaiming vnodes.
- In libperfuse, cleanup style of perfuse_destroy_pn()
Do not set PUFFS_KFLAG_CACHE_FS_TTL for PUFFS tests
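
The "ticks wraparound" item in the list above concerns timeout checks against a tick counter that eventually overflows. A minimal sketch of one common way to make such a check wrap-safe follows. The function name and the 32-bit counter width are assumptions made for illustration, not the code from this pull-up.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: report whether `deadline` has been reached at
 * time `now`, where both are readings of a free-running 32-bit tick
 * counter. The unsigned subtraction wraps modulo 2^32 and the result
 * is reinterpreted as signed (two's complement assumed), so the
 * answer stays correct across a wrap as long as the two readings are
 * less than 2^31 ticks apart.
 */
static bool
demo_timed_out(uint32_t now, uint32_t deadline)
{
	return (int32_t)(now - deadline) >= 0;
}
```

A plain `now >= deadline` comparison would wrongly report a timeout, or miss one, for any deadline computed shortly before the counter wraps.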
 1.78.6.1 29-Apr-2012  mrg sync to latest -current.
 1.78.2.3 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.78.2.2 30-Oct-2012  yamt sync with head
 1.78.2.1 17-Apr-2012  yamt sync with head
 1.82.2.3 03-Dec-2017  jdolecek update from HEAD
 1.82.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.82.2.1 23-Jun-2013  tls resync from head
 1.83.6.1 18-May-2014  rmind sync with head
 1.84.4.3 27-Feb-2015  martin Pull up following revision(s) (requested by manu in ticket #555):
lib/libpuffs/puffs.3: revision 1.60
sys/fs/puffs/puffs_msgif.h: revision 1.84
lib/libperfuse/ops.c: revision 1.83
sys/fs/puffs/puffs_sys.h: revision 1.89
sys/fs/puffs/puffs_vfsops.c: revision 1.116
lib/libperfuse/perfuse.c: revision 1.36
sys/fs/puffs/puffs_vnops.c: revision 1.200-1.202

Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE

FUSE filesystems do not expect to get metadata updates for [amc]time
and size; they update the values on their own after operations.

The PUFFS PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem, and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.

Update file size after write without metadata flush
If we do not use metadata flush, we must make sure the size is updated
in the filesystem after a write, otherwise the next GETATTR will get us
a stale value and the file will be truncated.
 1.84.4.2 29-Aug-2014  martin Pull up following revision(s) (requested by hannken in ticket #67):
sys/fs/puffs/puffs_sys.h: revision 1.86
sys/fs/puffs/puffs_vfsops.c: revision 1.114
sys/fs/puffs/puffs_msgif.c: revision 1.95
sys/fs/puffs/puffs_node.c: revision 1.32
sys/fs/puffs/puffs_vnops.c: revision 1.184
Change puffs from hashlist to vcache.
- field "pa_nhashbuckets" of struct "puffs_kargs" becomes a no-op.
and should be removed on the next protocol version bump.
 1.84.4.1 26-Aug-2014  riz Pull up following revision(s) (requested by manu in ticket #52):
sys/fs/puffs/puffs_msgif.h: revision 1.81
sys/fs/puffs/puffs_sys.h: revision 1.85
sys/fs/puffs/puffs_vnops.c: revision 1.183
Add a oflags input field to open requests so that the filesystem can pass
back information about the file. Implement PUFFS_OPEN_IO_DIRECT, which
will force direct IO (bypassing page cache) for the file.
 1.88.2.2 09-Jul-2016  skrll Sync with HEAD
 1.88.2.1 06-Apr-2015  skrll Sync with HEAD
 1.90.18.1 10-Jun-2019  christos Sync with HEAD
 1.90.16.3 14-Jan-2019  pgoyette Create a variant of the HOOK macros that handles hook routines of
type void, and use them where appropriate.
 1.90.16.2 17-Sep-2018  pgoyette Adapt (most of) the indirect function pointers to the new MP-safe
mechanism. Still remaining are the compat_netbsd32 stuff, and
some usb subroutines.
 1.90.16.1 24-Mar-2018  pgoyette Add fs/puffs compat_50 to the modules
 1.28 10-Nov-2007  pooka Part 2/n of extensive changes to request transport to/from userspace:

Rip the transport code completely out of puffs and generalize it
into an independent module which will be used for multiple purposes
in the future. This module is called the Pass-to-Userspace
Transporter (known as "putter" among friends).

This is very much work-in-progress and one dependency with puffs
remains: the request framing format.

The device name is still /dev/puffs, but that will change soon.

Users of puffs need the following in their kernel configs now:
pseudo-device putter
 1.27 11-Oct-2007  pooka branches: 1.27.2; 1.27.4;
Handle suspend and flush requests from the file server.
 1.26 11-Oct-2007  pooka Part 1/n of some pretty extensive changes to how the kernel module
interacts with the userspace file server:

* since the kernel-user communication is not purely request-response
anymore (hasn't been since 2006), try to rename some "request" to
"message". more similar mangling will take place in the future.

* completely rework how messages are allocated. previously most of
them were borrowed from the stack (originally *all* of them),
but now always allocate dynamically. this makes the structure
of the code much cleaner. also makes it possible to fix a
locking order violation. it enables plenty of future enhancements.

* start generalizing the transport interface to be independent of puffs

* move transport interface to read/write instead of ioctl. the
old one had legacy design problems, and besides, ioctl's suck.
implement a very generic version for now; this will be
worked on later hopefully some day reaching "highly optimized".

* implement libpuffs support behind existing library request
interfaces. this will change eventually (I hate those interfaces)
 1.25 04-Oct-2007  pooka g/c the "sizeop" code previously used for ioctl/fcntl. It was already
commented out and has bitrotted beyond all recognition, so it needs
complete rethinking.
 1.24 27-Sep-2007  pooka Differentiate between cookie2vnode returning an error and
return to caller, address unknown: no such cookie, no such node.
Make the callers use this info to either create a new vnode or bail.
 1.23 27-Sep-2007  pooka Fix a race in how new cookies are checked. Previously the checking
was done separate of inserting the cookie into the lookup structure
and without any form of interlock. This could lead to the same
cookie pointing to two different nodes. Remedy the race by creating
a separate "checked and ready to be inserted" cookie list which
serves as an interlock without having to hold a fs-global creation
lock.
 1.22 19-Jul-2007  pooka branches: 1.22.4; 1.22.6; 1.22.8; 1.22.10;
define PUFFSREQSIZEOP ioctl, which can be used to fetch the
maximum request size
 1.21 09-Jul-2007  ad branches: 1.21.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.20 21-Jun-2007  pooka Refactor the pnode2vnode translation slightly so that VFS_ROOT
can use it directly.
 1.19 06-Jun-2007  pooka Move puffs to a two clause license where it already isn't so. And
as agc pointed out, even files with the third clause were already
effectively two clause because of a slight bug in the language...
 1.18 17-May-2007  pooka Make it possible for the file server to specify the root vnode type
and other information instead of always using VDIR. To make this
possible without races, require all root node information already
in puffs_mount() and nuke puffs_start2() and the associated start
operation completely.

requested/inspired by Tobias Nygren
 1.17 01-May-2007  pooka Fix a problem introduced when I converted puffs to use newlock2:
when unmounting the file system in case of a certain timing (and
possibly some other conditions), a thread would wait on a condition
variable, while another thread broadcast the cv and immediately
proceeded to destroy it. The result was a system frozen completely
solid shortly after the process waiting for the cv woke up. So
introduce reference counting to synchronize destruction of the
resources in unmount.

I was able to repeat the problem only on my laptop in some special
cases, so I do not know how common it was. Ironically, killing
the file server process violently instead of unmount() didn't have
this problem because it never entered the unmount path from two
directions.
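
The fix described in this entry, reference counting so that a condition variable is only destroyed after every waiter is done with it, can be sketched in userland with POSIX threads. This is only an illustration of the idea: all `demo_` names are hypothetical, and the kernel code uses its own mutex and condvar primitives rather than pthreads.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-mount state shared by the unmount path and waiters. */
struct demo_mount {
	pthread_mutex_t lock;
	pthread_cond_t cv;
	int refs;		/* one per thread that may still touch cv */
	bool dying;
};

static struct demo_mount *
demo_create(void)
{
	struct demo_mount *m = malloc(sizeof(*m));

	if (m == NULL)
		return NULL;
	pthread_mutex_init(&m->lock, NULL);
	pthread_cond_init(&m->cv, NULL);
	m->refs = 1;		/* reference owned by the creator */
	m->dying = false;
	return m;
}

static void
demo_hold(struct demo_mount *m)
{
	pthread_mutex_lock(&m->lock);
	m->refs++;
	pthread_mutex_unlock(&m->lock);
}

/* Drop one reference; only the last holder destroys the resources. */
static bool
demo_rele(struct demo_mount *m)
{
	bool last;

	pthread_mutex_lock(&m->lock);
	last = (--m->refs == 0);
	pthread_mutex_unlock(&m->lock);
	if (last) {
		pthread_cond_destroy(&m->cv);
		pthread_mutex_destroy(&m->lock);
		free(m);
	}
	return last;
}

/* Waiter side: the caller must hold its own reference while waiting. */
static void
demo_wait_dying(struct demo_mount *m)
{
	pthread_mutex_lock(&m->lock);
	while (!m->dying)
		pthread_cond_wait(&m->cv, &m->lock);
	pthread_mutex_unlock(&m->lock);
}

/* Unmount side: wake everyone, then drop (not destroy) our reference. */
static bool
demo_unmount(struct demo_mount *m)
{
	pthread_mutex_lock(&m->lock);
	m->dying = true;
	pthread_cond_broadcast(&m->cv);
	pthread_mutex_unlock(&m->lock);
	return demo_rele(m);
}
```

The broadcasting thread no longer destroys the condition variable itself; whichever thread drops the last reference does, so a waiter that wakes up late still finds valid memory.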
 1.16 16-Apr-2007  pooka fix comment in previous
 1.15 16-Apr-2007  pooka Allow to set non-blocking mode for transport fd even if the file
system is not yet mounted.
 1.14 06-Apr-2007  pooka support flushing pagecache
 1.13 06-Apr-2007  pooka actually, we don't need a separate op for flushing the whole page cache
of a node, just use the range op with endoff = 0
 1.12 06-Apr-2007  pooka * enable PUFFS_INVAL_PAGECACHE_NODE_RANGE
* add input parameter validation
 1.11 30-Mar-2007  pooka g/c some commented ltsleep calls accidentally left from newlock2 adaptation
 1.10 29-Mar-2007  pooka Convert spinlocks & sleep/wakeup to newlock2 locking stuff. Fix a
bunch of bugs.

* park structures are now always allocated from a pool instead of a
mixed stack/malloc allocation
* get rid of the whole adjbuf concept, always just alloc the maximal
amount of memory to satisfy a request
* little regression: don't allow interrupting wait from file system
to userspace; this had problems already before, but now the problems
really started to shine through. I'll try to make this work again
some day.
* fix bmap to return a sensible value in runp
 1.9 20-Mar-2007  pooka * rework the page cache interaction a bit: cache metadata in the
kernel and flush it out all at once instead of continuous updating
* add support for delivering notifications to the file server about
when a page was written to (but disabled by default for now). the
file server can use this to request flushing or invalidating the
kernel page cache
 1.8 16-Feb-2007  hannken branches: 1.8.2; 1.8.6; 1.8.8; 1.8.10;
Make fstrans(9) the default helper for file system suspension.
Replaces the now obsolete vn_start_write()/vn_finished_write().
 1.7 09-Feb-2007  ad Merge newlock2 to head.
 1.6 28-Jan-2007  pooka don't need pi_lock for struct member access, so don't take it
 1.5 26-Jan-2007  pooka Initial attempt at suspend/snapshot support for userspace file
servers. This is still pretty much on the level "if it breaks ...".
It should work for single-threaded servers which handle one operation
from start to finish in one go. Also, it does not yet totally
correctly synchronize metadata and data in some cases. So needless
to say, it needs improvement, but it is possible that will have to
wait for some lock revampage.
 1.4 09-Jan-2007  pooka branches: 1.4.2;
Introduce flush operations, which the fs server can use to control
kernel caching. Currently supported are only flushing the name
cache for a directory or flushing the name cache for the entire fs.

Also, get rid of PNODE_INACTIVE status, since it was racy and
essentially didn't work. All this on top of being useless in the
first place ....
 1.3 10-Dec-2006  pooka branches: 1.3.2;
* free puffs_instance structure in all cases when closing the descriptor
* comment, rcsid & kassert police
 1.2 10-Dec-2006  pooka Fix a race condition that would cause the mountpoint to be cleaned
from under someone waiting for the fs server response in puffs_unmount()
if the descriptor was closed during the response wait (such as a bug
leading to a crash in fs implementation unmount()).
 1.1 05-Dec-2006  pooka branches: 1.1.2;
shuffle functions around a bit: move the transport (/dev/puffs) to
a different file from the messaging (request contents). no functional
change
 1.1.2.3 18-Dec-2006  yamt sync with head.
 1.1.2.2 10-Dec-2006  yamt sync with head.
 1.1.2.1 05-Dec-2006  yamt file puffs_transport.c was added on branch yamt-splraiseipl on 2006-12-10 07:18:38 +0000
 1.3.2.6 15-Nov-2007  yamt sync with head.
 1.3.2.5 27-Oct-2007  yamt sync with head.
 1.3.2.4 03-Sep-2007  yamt sync with head.
 1.3.2.3 26-Feb-2007  yamt sync with head.
 1.3.2.2 30-Dec-2006  yamt sync with head.
 1.3.2.1 10-Dec-2006  yamt file puffs_transport.c was added on branch yamt-lazymbuf on 2006-12-30 20:50:01 +0000
 1.4.2.4 01-Feb-2007  ad Sync with head.
 1.4.2.3 18-Jan-2007  christos make things compile.
 1.4.2.2 12-Jan-2007  ad Sync with head.
 1.4.2.1 09-Jan-2007  ad file puffs_transport.c was added on branch newlock2 on 2007-01-12 01:04:05 +0000
 1.8.10.1 29-Mar-2007  reinoud Pullup to -current
 1.8.8.1 11-Jul-2007  mjf Sync with head.
 1.8.6.11 12-Oct-2007  ad Sync with head.
 1.8.6.10 09-Oct-2007  ad Sync with head.
 1.8.6.9 20-Aug-2007  ad Sync with HEAD.
 1.8.6.8 15-Jul-2007  ad Sync with head.
 1.8.6.7 09-Jun-2007  ad Sync with head.
 1.8.6.6 08-Jun-2007  ad Sync with head.
 1.8.6.5 13-May-2007  ad - Pass the error number and residual count to biodone(), and let it handle
setting error indicators. Prepare to eliminate B_ERROR.
- Add a flag argument to brelse() to be set into the buf's flags, instead
of doing it directly. Typically used to set B_INVAL.
- Add a "struct cpu_info *" argument to kthread_create(), to be used to
create bound threads. Change "bool mpsafe" to "int flags".
- Allow exit of LWPs in the IDL state when (l != curlwp).
- More locking fixes & conversion to the new API.
 1.8.6.4 10-Apr-2007  ad Sync with head.
 1.8.6.3 10-Apr-2007  ad Nuke the deferred kthread creation stuff, as it's no longer needed.
Pointed out by thorpej@.
 1.8.6.2 09-Apr-2007  ad - Add two new arguments to kthread_create1: pri_t pri, bool mpsafe.
- Fork kthreads off proc0 as new LWPs, not new processes.
 1.8.6.1 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.8.2.3 07-May-2007  yamt sync with head.
 1.8.2.2 15-Apr-2007  yamt sync with head.
 1.8.2.1 24-Mar-2007  yamt sync with head.
 1.21.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.22.10.2 19-Jul-2007  pooka define PUFFSREQSIZEOP ioctl, which can be used to fetch the
maximum request size
 1.22.10.1 19-Jul-2007  pooka file puffs_transport.c was added on branch matt-mips64 on 2007-07-19 07:52:46 +0000
 1.22.8.2 14-Oct-2007  yamt sync with head.
 1.22.8.1 06-Oct-2007  yamt sync with head.
 1.22.6.2 23-Mar-2008  matt sync with HEAD
 1.22.6.1 06-Nov-2007  matt sync with HEAD
 1.22.4.4 11-Nov-2007  joerg Sync with HEAD.
 1.22.4.3 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.22.4.2 07-Oct-2007  joerg Sync with HEAD.
 1.22.4.1 02-Oct-2007  joerg Sync with HEAD.
 1.27.4.1 19-Nov-2007  mjf Sync with HEAD.
 1.27.2.1 13-Nov-2007  bouyer Sync with HEAD
 1.126 01-Apr-2021  christos Put a copy of our existing data first in the non-error case (noticed by RVP).
 1.125 27-Feb-2020  ad branches: 1.125.6; 1.125.8;
Tighten up the locking around vp->v_iflag a little more after the recent
split of vmobjlock & v_interlock.
 1.124 17-Jan-2020  ad VFS_VGET(), VFS_ROOT(), VFS_FHTOVP(): give them a "int lktype" argument, to
allow us to get shared locks (or no lock) on the returned vnode. Matches
FreeBSD.
 1.123 27-Sep-2019  christos branches: 1.123.2;
Fix copying issue that was causing errors in unit_test puffs_tstavfs by
removing code.
 1.122 23-Sep-2019  christos Restore binary compatibility by using the statvfs90 structure internally.
 1.121 28-May-2018  chs branches: 1.121.2;
add a genfs method to allow a file system to limit the range of pages
that are given to a single GOP_WRITE() call. needed by ZFS.
 1.120 01-Apr-2017  riastradh branches: 1.120.12;
KASSERT(mutex_owned(vp->v_interlock)) in vnode iterator selector.
 1.119 17-Feb-2017  hannken Add generic genfs_suspendctl() and use it for all file systems.
Layered file systems need work.
 1.118 20-Dec-2015  christos branches: 1.118.2; 1.118.4;
PR/50573: Andreas Gustafsson: puffs can crash kernel for lack of argument
checking
 1.117 16-Feb-2015  martin Remove debug printf
 1.116 15-Feb-2015  manu Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE

FUSE filesystems do not expect to get metadata updates for [amc]time
and size; they update the values on their own after operations.

The PUFFS PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem, and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.
 1.115 10-Nov-2014  maxv branches: 1.115.2;
Do not uselessly include <sys/malloc.h>.
 1.114 28-Aug-2014  hannken Change puffs from hashlist to vcache.
- field "pa_nhashbuckets" of struct "puffs_kargs" becomes a no-op.
and should be removed on the next protocol version bump.
 1.113 25-May-2014  christos branches: 1.113.2;
use standard dirty vnode test.
 1.112 25-May-2014  hannken The pageflush_selector gets a vnode with v_interlock held.
Remove the mutex_enter()/mutex_exit() and simplify.

Hi christos...
 1.111 24-May-2014  christos Introduce a selector function to the vfs vnode iterator so that we don't
need to vget() vnodes that we are not interested in, and optimize locking
a bit. Iterator changes reviewed by Hannken (thanks), the rest of the bugs
are mine.
 1.110 16-Apr-2014  maxv An (un)privileged user can easily make the kernel dereference a NULL
pointer.

The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).

ok christos@
 1.109 23-Mar-2014  hannken branches: 1.109.2;
Change all vfsops to use C99 designated initializers.

No functional changes intended.
 1.108 17-Mar-2014  hannken Change pageflush() to use vfs_vnode_iterator.
 1.107 16-Jan-2013  pooka branches: 1.107.2;
Do the protocol consistency check hack only when compiling ELF.
 1.106 09-Aug-2012  manu branches: 1.106.2;
Backout previous bugfix attempt for unmounts. That change did not
address the real problem.
 1.105 27-Jul-2012  manu Rename slow sopreq queue into node sopreq queue, to reflect the fact that
it is only intended for postponed node reclaims.
When purging the node sopreq queue, do not call puffs_msg_sendresp(), as
it makes no sense.
 1.104 27-Jul-2012  manu puffs mounts share global pools. This means that the puffs_vfsops cannot
be vfs_detach'ed by module autounload before puffs_vfsop_unmount() completes
and has freed resources from the pools. By holding a reference on
puffs_vfsops from each mount, we ensure that no race can occur here.

Works around the crash in kern/46734
 1.103 22-Jul-2012  manu Fix hang unmount bug introduced by last commit.

We introduced a slow queue for delayed reclaims, while the existing
queue for unmount, flush and exit has been renamed fast queue. Both
queues had timestamp for when an operation should be done, but it was
useless for the fast queue, which is always used to run an operation
ASAP. And the timestamp test had an error that turned ASAP into "at next
tick", but nobody was there to wake the thread at next tick, hence
the hang. The fix is to remove the useless and buggy timestamp test for
fast queue.
 1.102 21-Jul-2012  manu - Improve PUFFS_KFLAG_CACHE_FS_TTL by reclaiming older inactive nodes.

The normal kernel behavior is to retain inactive nodes in the freelist
until it runs out of vnodes. This has some merit for local filesystems,
where the cost of an allocation is about the same as the cost of a
lookup. But that situation is not true for distributed filesystems.
On the other hand, keeping inactive nodes for a long time holds memory
in the file server process, and when the kernel runs out of vnodes, it
produces reclaim avalanches that increase latency for other operations.

We do not reclaim inactive vnodes immediately either, as they may be
looked up again shortly. Instead we introduce a grace time and we
reclaim nodes that have been inactive beyond the grace time.

- Fix lookup/reclaim race condition.

The above improvement uncovered a race condition between lookup and
reclaim. If we reclaimed a vnode associated with a userland cookie while
a lookup returning that same cookie was in progress, then the kernel ends
up with a vnode associated with a cookie that has been reclaimed in
userland. Next operation on the cookie will crash (or at least confuse)
the filesystem.

We fix this by introducing a lookup count in kernel and userland. On
reclaim, the kernel sends the count, which enables userland to detect
a situation where it initiated a lookup that is not completed in kernel.
In such a situation, the reclaim must be ignored, as the node is about
to be looked up again.
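
The lookup-count check described in this entry can be sketched from the userland side as follows. This is an illustration under assumptions only: the structure, field and function names are hypothetical and do not reproduce the libpuffs or kernel code.

```c
#include <stdbool.h>

/* Hypothetical userland node state; names are illustrative only. */
struct demo_node {
	unsigned int lookups_answered;	/* lookup replies sent for this cookie */
	bool reclaimed;
};

/*
 * Handle a reclaim that carries the number of lookups the kernel has
 * completed for this cookie. Returns true if the node was released,
 * false if the reclaim was ignored.
 */
static bool
demo_reclaim(struct demo_node *n, unsigned int kernel_lookups)
{
	if (kernel_lookups < n->lookups_answered) {
		/*
		 * A lookup reply is still in flight: the kernel is about
		 * to use this cookie again, so the node must be kept.
		 */
		return false;
	}
	n->reclaimed = true;
	return true;
}
```

Comparing the two counters lets the file server tell a stale reclaim from a final one without taking any lock shared with the kernel.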
 1.101 08-Apr-2012  manu Add name and attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
 1.100 19-Oct-2011  manu branches: 1.100.2; 1.100.6; 1.100.8;
Remove #ifdef DIAGNOSTIC guards around KASSERT, as the macro contains them
 1.99 18-Oct-2011  manu Make sure pagedaemon does not sleep for memory in puffs_vnop_sleep.
Add KASSERT on any sleeping memory allocation to check it cannot happen again.
 1.98 07-Oct-2011  hannken As vnalloc() always allocates with PR_WAITOK there is no longer the need
to test its result for NULL.
 1.97 21-Sep-2011  manu Make sure ioflush does not sleep in PUFFS code path, waiting for a mutex,
a memory allocation, or a response from the filesystem.

This avoids deadlocks in the following situations:
1) when memory is low: ioflush waits for the filesystem, the filesystem waits
for memory
2) when the filesystem does not respond (e.g.: network outage on a
distributed filesystem)
 1.96 12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.95 21-Jul-2010  hannken branches: 1.95.6;
Make holding v_interlock mandatory for callers of vget().

Announced some time ago on tech-kern.
 1.94 15-Jul-2010  pooka f_namemax is one of the static fields overridden by copy_statvfs_info(),
so be sure to set it to the value coming from the file server as
part of mount args.

exposed, like so many other problems, by njoly's tests
 1.93 06-Jul-2010  pooka Add compat to enable running puffs in a 64bit time_t kernel against
a server which runs in 32bit time_t namespace.
 1.92 06-Jul-2010  pooka ctassert size of some key structures does not change
 1.91 06-Jul-2010  pooka Make sure that pa_spare is zero-filled and does not contain any
garbage which might disrupt future use.
 1.90 24-Jun-2010  hannken Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
 1.89 21-May-2010  pooka Since libpuffs needs a major bump for extattr support anyway, make
some changes to the user-kernel protocol. Namely, try to be a
little more resilient to future changes.
 1.88 21-May-2010  pooka Support extended attributes.
 1.87 17-Feb-2010  pooka branches: 1.87.2;
* add a rant about why MPSAFE isn't enabled even though puffs code is
* predict_false that we are mounting when calling statvfs
* KNF
 1.86 14-Jan-2010  pooka branches: 1.86.2;
In case the operations thread has exited, do not queue any more
operations. This prevents kernel memory leaks (one of which happened
every time the file system was unmounted via PUFFSOP_UNMOUNT ...
and incidentally would've been trivially caught with the old
malloc(9) interface. I wonder if the message is to use a ton of
pools instead of regression-attractive kmem interface).
 1.85 07-Jan-2010  pooka Rename PUFFS_SOPREQ_EXIT to PUFFS_SOPREQSYS_EXIT to better signal
it comes from within the kernel instead of as a direct result of
a user request.

no functional change
 1.84 07-Dec-2009  pooka Process flush requests from the file server in a separate thread
context. This fixes a long-standing but seldom seen deadlock,
where the kernel was holding pages busy (due to e.g. readahead
request) while waiting for the server to respond, and the server
made a callback into the kernel asking to invalidate those pages.
... or, well, theoretically fixes, since I didn't have any reliable
way of repeating the deadlock and I think I saw it only twice.
 1.83 05-Nov-2009  pooka Kill suspend support. It was never implemented correctly:
* it depended on the biglock (in a very cruel way)
* it was attached to userspace transactions rather than logical
fs operations

(If someone wants to revisit it some day, most of the stuff can be
reused from cvs history)
 1.82 18-Mar-2009  cegger Ansify function definitions w/o arguments. Generated with sed.
 1.81 20-May-2008  jmcneill branches: 1.81.6; 1.81.8; 1.81.12; 1.81.16;
Add module dependency on putter.
 1.80 10-May-2008  rumble Convert file systems to dynamically attach with the new module interface.
Make VFS hooks dynamic while we're here and say farewell to VFS_ATTACH and
VFS_HOOKS_ATTACH linksets.

As a consequence, most of the file systems can now be loaded as new style
modules.

Quick sanity check by ad@.
 1.79 29-Apr-2008  ad branches: 1.79.2;
PR kern/38057 ffs makes assumptions about devvp file system
PR kern/33406 softdeps get stuck in endless loop

Introduce VFS_FSYNC() and call it when syncing a block device, if it
has a mounted file system.
 1.78 28-Jan-2008  dholland branches: 1.78.6; 1.78.8; 1.78.10;
Fix some race conditions in rename.
Introduce a per-FS rename lock and new vfsops to manipulate it.
Get this lock while renaming. Also add another relookup() in do_sys_rename,
which is a hack to kludge around some of the worst deficiencies of
ufs_rename.
reviewed-by: pooka (and an earlier rev by ad)
posted on tech-kern with no objections.
 1.77 03-Jan-2008  pooka fix vmlocking2 fallout: fstrans_mount/unmount
 1.76 03-Jan-2008  pooka valloc -> vnalloc, vfree -> vnfree
Avoids collision with userland valloc(3).

no functional change
ad ok
 1.75 02-Jan-2008  pooka More type-punning workarounds. Curiously the kernel compilation
flags cause gcc to not complain.
 1.74 02-Jan-2008  ad Merge vmlocking2 to head.
 1.73 30-Dec-2007  pooka namespace a bit: vfsops -> puffs_vfsop_x() and vops -> puffs_vnop_x()
 1.72 27-Nov-2007  pooka branches: 1.72.2; 1.72.6;
Remove "puffs_cid" from the puffs interface following l-removal
from the kernel vfs interfaces. puffs_cc_getcaller(pcc) can be
used now should the same information be desired.
 1.71 26-Nov-2007  pooka Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.70 20-Nov-2007  pooka Retire M_PUFFS, use kmem(9) instead.
 1.69 16-Nov-2007  pooka Restructure the messaging interface a bit more: make all interfacing
with the file server happen through puffs_msg_enqueue() and
puffs_msg_wait() instead of having a billion different routines.
Build the existing system upon these two. Most importantly though,
decouple insertion into the op queue from the actual wait. This
is useful for a number of reasons coming soon to a cvs repo near you.
 1.68 12-Nov-2007  pooka * split the putter header into a kernel version and a userland version
+ install latter to /usr/include/dev/putter
* remove last dependencies to puffs from putter, it's completely
independent now
 1.67 12-Nov-2007  pooka Move putter code from directly under dev/ to dev/putter/

no functional change
 1.66 10-Nov-2007  pooka Part 2/n of extensive changes to request transport to/from userspace:

Rip the transport code completely out of puffs and generalize it
into an independent module which will be used for multiple purposes
in the future. This module is called the Pass-to-Userspace
Transporter (known as "putter" among friends).

This is very much work-in-progress and one dependency with puffs
remains: the request framing format.

The device name is still /dev/puffs, but that will change soon.

Users of puffs need the following in their kernel configs now:
pseudo-device putter
 1.65 11-Oct-2007  pooka branches: 1.65.2; 1.65.4;
Handle suspend and flush requests from the file server.
 1.64 11-Oct-2007  pooka in case of version mismatch, print the numbers
 1.63 11-Oct-2007  pooka Part 1/n of some pretty extensive changes to how the kernel module
interacts with the userspace file server:

* since the kernel-user communication is not purely request-response
anymore (hasn't been since 2006), try to rename some "request" to
"message". more similar mangling will take place in the future.

* completely rework how messages are allocated. previously most of
them were borrowed from the stack (originally *all* of them),
but now always allocate dynamically. this makes the structure
of the code much cleaner. also makes it possible to fix a
locking order violation. it enables plenty of future enhancements.

* start generalizing the transport interface to be independent of puffs

* move transport interface to read/write instead of ioctl. the
old one had legacy design problems, and besides, ioctl's suck.
implement a very generic version for now; this will be
worked on later hopefully some day reaching "highly optimized".

* implement libpuffs support behind existing library request
interfaces. this will change eventually (I hate those interfaces)
 1.62 11-Oct-2007  pooka Cache vnode member variables necessary for operations after the
userspace call, namely our private mount structure, in the activation
record. This avoids problems in situations where the userspace
file server happens to die during our upcall and the vnode is
forcibly reclaimed before we roll back to the current stack frame.
 1.61 09-Oct-2007  pooka g/c more unused stuff
 1.60 01-Oct-2007  pooka * better error checking: validate error values received from userland
to be valid errno values
* include string describing error in PUFFS_ERR
* get rid of union in puffs_req, it's nothing but trouble
* pass pmp to async i/o callbacks
 1.59 27-Sep-2007  pooka Differentiate between cookie2vnode returning an error and
return to caller, address unknown: no such cookie, no such node.
Make the callers use this info to either create a new vnode or bail.
 1.58 27-Sep-2007  pooka Add error notifications, which are used to deliver errors from the
kernel to the file server for silly things the file server did,
e.g. attempting to create a file with size VSIZENOTSET. The file
server can handle these as it chooses, but the default action is
for it to throw its hands in the air and sing "goodbye, cruel world,
it's over, walk on by".
 1.57 27-Sep-2007  pooka Fix a race in how new cookies are checked. Previously the checking
was done separate of inserting the cookie into the lookup structure
and without any form of interlock. This could lead to the same
cookie pointing to two different nodes. Remedy the race by creating
a separate "checked and ready to be inserted" cookie list which
serves as an interlock without having to hold a fs-global creation
lock.
 1.56 05-Sep-2007  pooka branches: 1.56.2;
move static KASSERT from mount to init
 1.55 04-Sep-2007  pooka * don't allow the file server to specify a node size to be VSIZENOTSET
* KASSERT that VNOVAL == VSIZENOTSET
 1.54 23-Aug-2007  pooka branches: 1.54.2;
Add a third type of fh option, passthrough, where the kernel does
not attempt to handle struct fid at all and passes it as such to
userspace.
 1.53 31-Jul-2007  pooka branches: 1.53.2; 1.53.4;
* nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
use VFS_PROTOS() instead of manually prototyping the methods
 1.52 19-Jul-2007  pooka Make the minimum request size twice the minimum request structure size.
Otherwise ops with payload would have no room for payload.
 1.51 17-Jul-2007  pooka branches: 1.51.2;
Set a file server supplied file system type in the type field and set
the mntfromname to be the place mounted from instead of the type.
 1.50 17-Jul-2007  pooka Make set_statvfs_info() take a parameter for the vfs name instead
of always retrieving it from mp->mnt_op->vfs_name

christos ok
 1.49 14-Jul-2007  dsl Remove the copyout() of the mount args from puffs_mount(), the buffer
supplied is a kernel address.
The puffs userspace code has been changed to do a 2nd call with
MNT_GETARGS to retrieve the information.
 1.48 12-Jul-2007  dsl Change the VFS_MOUNT() interface so that the 'data' buffer passed to the
fs code is a kernel buffer, pass through the length of the buffer as well.
Since the length of the userspace buffer isn't (yet) passed through the mount
system call, add a field to the vfsops structure containing the default length.
Split sys_mount() for calls from compat code.
Ride one of the recent kernel version changes - old fs LKMs will load, but
sys_mount() will reject any attempt to use them.
 1.47 09-Jul-2007  ad Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.46 01-Jul-2007  pooka Instead of supplying a plain pid, supply an abstract struct puffs_cid *,
which can currently be used to query the pid and lwpid.
 1.45 21-Jun-2007  pooka Refactor the pnode2vnode translation slightly so that VFS_ROOT
can use it directly.
 1.44 21-Jun-2007  pooka Reorganize how the root vnode is fetched so that it doesn't always
go through VFS_ROOT() and allow to fetch it without locking it.
This allows us to call the cache flush operations also for the root
vnode and most notably fixes e.g. a "No such file or directory"
for a psshfs root directory ls -l when a file was locally deleted
and remotely re-created.

Also fix some sloppy programming in root node fetch (mostly cosmetic).
 1.43 06-Jun-2007  pooka Move puffs to a two clause license where it already isn't so. And
as agc pointed out, even files with the third clause were already
effectively two clause because of a slight bug in the language...
 1.42 17-May-2007  pooka Make it possible for the file server to specify the root vnode type
and other information instead of always using VDIR. To make this
possible without races, require all root node information already
in puffs_mount() and nuke puffs_start2() and the associated start
operation completely.

requested/inspired by Tobias Nygren
 1.41 01-May-2007  pooka Fix a problem introduced when I converted puffs to use newlock2:
when unmounting the file system in case of a certain timing (and
possibly some other conditions), a thread would wait on a condition
variable, while another thread broadcast the cv and immediately
proceeded to destroy it. The result was a system frozen completely
solid shortly after the process waiting for the cv woke up. So
introduce reference counting to synchronize destruction of the
resources in unmount.

I was able to repeat the problem only on my laptop in some special
cases, so I do not know how common it was. Ironically, killing
the file server process violently instead of unmount() didn't have
this problem because it never entered the unmount path from two
directions.
 1.40 16-Apr-2007  pooka Sanity-check & possibly adjust number of hash buckets already before
returning the mount argument structure to userspace.
 1.39 16-Apr-2007  pooka catch invalid size file handles already in the kernel
 1.38 16-Apr-2007  pooka Give the file server the ability to specify the file handle length
instead of defining a static length file handle on the framework-level.
 1.37 14-Apr-2007  xtraeme size_t is unsigned, so use zu rather than zd which is for ssize_t,
as Matt Thomas pointed out.
 1.36 14-Apr-2007  xtraeme Use zd to printf size_t.
 1.35 13-Apr-2007  pooka Allow file servers to request the number of hash cookie buckets for
pnode -> vnode reverse lookup.
 1.34 13-Apr-2007  pooka * add fhlen to kernel argument structure
* rename it to puffs_kargs instead of puffs_args
 1.33 11-Apr-2007  pooka * support VFS_FHTOVP and VFS_VPTOFH
* support cookies for VOP_READDIR

nfs exporting puffs file systems works now
 1.32 29-Mar-2007  pooka convert to MALLOC_JUSTDEFINE
 1.31 29-Mar-2007  pooka Convert spinlocks & sleep/wakeup to newlock2 locking stuff. Fix a
bunch of bugs.

* park structures are now always allocated from a pool instead of a
mixed stack/malloc allocation
* get rid of the whole adjbuf concept, always just alloc the maximal
amount of memory to satisfy a request
* little regression: don't allow interrupting wait from file system
to userspace; this had problems already before, but now the problems
really started to shine through. I'll try to make this work again
some day.
* fix bmap to return a sensible value in runp
 1.30 20-Mar-2007  pooka * rework the page cache interaction a bit: cache metadata in the
kernel and flush it out all at once instead of continuous updating
* add support for delivering notifications to the file server about
when a page was written to (but disabled by default for now). the
file server can use this to request flushing or invalidating the
kernel page cache
 1.29 13-Mar-2007  ad branches: 1.29.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.
 1.28 16-Feb-2007  hannken branches: 1.28.2; 1.28.6; 1.28.8;
Make fstrans(9) the default helper for file system suspension.
Replaces the now obsolete vn_start_write()/vn_finished_write().
 1.27 29-Jan-2007  hannken Change fstrans enum types to upper case.
No functional change.

From Antti Kantee <pooka@netbsd.org>
 1.26 26-Jan-2007  pooka Initial attempt at suspend/snapshot support for userspace file
servers. This is still pretty much on the level "if it breaks ...".
It should work for single-threaded servers which handle one operation
from start to finish in one go. Also, it does not yet totally
correctly synchronize metadata and data in some cases. So needless
to say, it needs improvement, but it is possible that will have to
wait for some lock revampage.
 1.25 25-Jan-2007  pooka don't hold spinlocks (except vnode interlock) when doing vget()
 1.24 23-Jan-2007  pooka fix comment (no functional change)
 1.23 19-Jan-2007  hannken New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).
 1.22 15-Jan-2007  pooka Store puffs_node's on lists hashed with the cookie value instead
of just one flat list.
 1.21 09-Jan-2007  pooka In vfs_sync(), call VOP_PUTPAGES() for dirty vnodes directly instead
of rolling around VOP_FSYNC(). The user server will be given the
VFS_SYNC instruction and it can do its own equivalent of VOP_FSYNC()
if it pleases, no need for the kernel to explicitly issue #{vnodes}
FSYNCs.
 1.20 09-Jan-2007  pooka Introduce flush operations, which the fs server can use to control
kernel caching. Currently supported are only flushing the name
cache for a directory or flushing the name cache for the entire fs.

Also, get rid of PNODE_INACTIVE status, since it was racy and
essentially didn't work. All this on top of being useless in the
first place ....
 1.19 09-Jan-2007  pooka in vfs_sync flush page cache only for vnodes with dirty pages, not for
vnodes with pages (dirty or otherwise)
 1.18 07-Jan-2007  pooka vfs sync, flushes regular file data only (user server can take care of
flushing any metadata it might have hidden away)
 1.17 02-Jan-2007  pooka * check userspace version and prevent incompatible mount
* some general maintenance
 1.16 10-Dec-2006  pooka branches: 1.16.2;
Fix a race condition that would cause the mountpoint to be cleaned
from under someone waiting for the fs server response in puffs_unmount()
if the descriptor was closed during the response wait (such as a bug
leading to a crash in fs implementation unmount()).
 1.15 09-Dec-2006  chs branches: 1.15.2;
a smorgasbord of improvements to vnode locking and path lookup:
- LOCKPARENT is no longer relevant for lookup(), relookup() or VOP_LOOKUP().
these now always return the parent vnode locked. namei() works as before.
lookup() and various other paths no longer acquire vnode locks in the
wrong order via vrele(). fixes PR 32535.
as a nice side effect, path lookup is also up to 25% faster.
- the above allows us to get rid of PDIRUNLOCK.
- also get rid of WANTPARENT (just use LOCKPARENT and unlock it).
- remove an assumption in layer_node_find() that all file systems implement
a recursive VOP_LOCK() (unionfs doesn't).
- require that all file systems supply vfs_vptofh and vfs_fhtovp routines.
fill in eopnotsupp() for file systems that don't support being exported
and remove the checks for NULL. (layerfs calls these without checking.)
- in union_lookup1(), don't change refcounts in the ISDOTDOT case, just
adjust which vnode is locked. fixes PR 33374.
- apply fixes for ufs_rename() from ufs_vnops.c rev. 1.61 to ext2fs_rename().
 1.14 07-Dec-2006  pooka In case of an error, return an error. Otherwise the worst case was
that dostatvfs() wrote to a recently deceased struct mount.
 1.13 01-Dec-2006  pooka branches: 1.13.2;
prefix kernel flags with PUFFS_KFLAG to have a separate namespace
from the library flags
 1.12 01-Dec-2006  pooka don't call the fs server for all operations, only those it has told
us that it implements
 1.11 18-Nov-2006  pooka branches: 1.11.2;
Always override f_iosize from stat() to DEV_BSIZE for now. Places such
as vnd use the information, so until "dealing with it" is defined, it's
overridden by the kernel.
 1.10 18-Nov-2006  pooka prevent value 0 for mnt_stat.f_iosize, it is sometimes used as a divider
 1.9 18-Nov-2006  pooka Require statvfs info from startreq so that we have that info available.
Also, don't pass fsid to userspace and just fill it in the kernel.
 1.8 17-Nov-2006  pooka Introduce uncached operation, makes sense when the file system backend
can be modified from elsewhere than the file system interface
 1.7 09-Nov-2006  pooka few renames to better differentiate between mount & start.. plus some
other renaming
 1.6 07-Nov-2006  pooka attach to genfs & support page cache. most noticeable effect is
mmap and therefore execution of binaries starting to work, some
speed improvements with large file I/O also. caching semantics
and error case handling most likely need revisiting.
 1.5 06-Nov-2006  pooka make it possible to build & load puffs as an LKM

by Lubomir Kundrak, PR kern/35000
 1.4 27-Oct-2006  pooka support fifos
 1.3 26-Oct-2006  pooka support specfs
 1.2 26-Oct-2006  pooka debug print fixes
 1.1 22-Oct-2006  pooka kernel portion of puffs - the Pass-to-Userspace Framework File System.
It contains the VFS attachment and userspace message-passing interface.

This work was initially started and completed for Google SoC 2005
and tweaked to work a bit better in the past few weeks. While
being far from complete, it is functional enough to be able and
stable to host a fairly general-purpose in-memory file system in
userspace. Even so, puffs should be considered experimental and
no binary compatibility for interfaces or crash-freedom or zero
security implications should be relied upon just yet.

The GSoC project was mentored by William Studenmund and the final
review for the code was done by Christos.
 1.11.2.4 01-Feb-2007  ad Sync with head.
 1.11.2.3 12-Jan-2007  ad Sync with head.
 1.11.2.2 18-Nov-2006  ad Sync with head.
 1.11.2.1 18-Nov-2006  ad file puffs_vfsops.c was added on branch newlock2 on 2006-11-18 21:39:20 +0000
 1.13.2.1 17-Feb-2007  tron Apply patch (requested by chs in ticket #422):
- Fix various deadlock problems with nullfs and unionfs.
- Speed up path lookups by up to 25%.
 1.15.2.3 18-Dec-2006  yamt sync with head.
 1.15.2.2 10-Dec-2006  yamt sync with head.
 1.15.2.1 09-Dec-2006  yamt file puffs_vfsops.c was added on branch yamt-splraiseipl on 2006-12-10 07:18:38 +0000
 1.16.2.9 04-Feb-2008  yamt sync with head.
 1.16.2.8 21-Jan-2008  yamt sync with head
 1.16.2.7 07-Dec-2007  yamt sync with head
 1.16.2.6 15-Nov-2007  yamt sync with head.
 1.16.2.5 27-Oct-2007  yamt sync with head.
 1.16.2.4 03-Sep-2007  yamt sync with head.
 1.16.2.3 26-Feb-2007  yamt sync with head.
 1.16.2.2 30-Dec-2006  yamt sync with head.
 1.16.2.1 10-Dec-2006  yamt file puffs_vfsops.c was added on branch yamt-lazymbuf on 2006-12-30 20:50:01 +0000
 1.28.8.1 11-Jul-2007  mjf Sync with head.
 1.28.6.12 28-Oct-2007  ad Fix up mnt_vnodelist handling.
 1.28.6.11 12-Oct-2007  ad Sync with head.
 1.28.6.10 09-Oct-2007  ad Sync with head.
 1.28.6.9 20-Aug-2007  ad Sync with HEAD.
 1.28.6.8 15-Jul-2007  ad Sync with head.
 1.28.6.7 17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.28.6.6 09-Jun-2007  ad Sync with head.
 1.28.6.5 08-Jun-2007  ad Sync with head.
 1.28.6.4 10-Apr-2007  ad Sync with head.
 1.28.6.3 05-Apr-2007  ad Compile fixes.
 1.28.6.2 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.28.6.1 13-Mar-2007  ad Sync with head.
 1.28.2.3 07-May-2007  yamt sync with head.
 1.28.2.2 15-Apr-2007  yamt sync with head.
 1.28.2.1 24-Mar-2007  yamt sync with head.
 1.29.2.1 29-Mar-2007  reinoud Pullup to -current
 1.51.2.3 10-Sep-2007  skrll Sync with HEAD.
 1.51.2.2 03-Sep-2007  skrll Sync with HEAD.
 1.51.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.53.4.2 31-Jul-2007  pooka * nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
use VFS_PROTOS() instead of manually prototyping the methods
 1.53.4.1 31-Jul-2007  pooka file puffs_vfsops.c was added on branch matt-mips64 on 2007-07-31 21:14:19 +0000
 1.53.2.7 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.53.2.6 21-Nov-2007  joerg Sync with HEAD.
 1.53.2.5 14-Nov-2007  joerg Sync with HEAD.
 1.53.2.4 11-Nov-2007  joerg Sync with HEAD.
 1.53.2.3 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.53.2.2 02-Oct-2007  joerg Sync with HEAD.
 1.53.2.1 03-Sep-2007  jmcneill Sync with HEAD.
 1.54.2.3 23-Mar-2008  matt sync with HEAD
 1.54.2.2 09-Jan-2008  matt sync with HEAD
 1.54.2.1 06-Nov-2007  matt sync with HEAD
 1.56.2.2 14-Oct-2007  yamt sync with head.
 1.56.2.1 06-Oct-2007  yamt sync with head.
 1.65.4.3 18-Feb-2008  mjf Sync with HEAD.
 1.65.4.2 08-Dec-2007  mjf Sync with HEAD.
 1.65.4.1 19-Nov-2007  mjf Sync with HEAD.
 1.65.2.3 21-Nov-2007  bouyer Sync with HEAD
 1.65.2.2 18-Nov-2007  bouyer Sync with HEAD
 1.65.2.1 13-Nov-2007  bouyer Sync with HEAD
 1.72.6.2 08-Jan-2008  bouyer Sync with HEAD
 1.72.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.72.2.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.78.10.4 11-Aug-2010  yamt sync with head.
 1.78.10.3 11-Mar-2010  yamt sync with head
 1.78.10.2 04-May-2009  yamt sync with head.
 1.78.10.1 16-May-2008  yamt sync with head.
 1.78.8.2 04-Jun-2008  yamt sync with head
 1.78.8.1 18-May-2008  yamt sync with head.
 1.78.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.79.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.81.16.1 21-Apr-2010  matt sync to netbsd-5
 1.81.12.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.81.8.4 25-Apr-2014  sborrill Pull up the following revision(s) (requested by maxv in ticket #1901):
sys/kern/vfs_syscalls.c: revision 1.478, 1.480 via patch
sys/coda/coda_vfsops.c: revision 1.81
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.110 via patch
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59 via patch
sys/fs/udf/udf_vfsops.c: revision 1.67
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/kern/vfs_syscalls.c: revision 1.479
sys/miscfs/nullfs/null_vfsops.c: revision 1.88 via patch
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/nfs/nfs_vfsops.c: revision 1.227
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/ufs/mfs/mfs_vfsops.c: revision 1.107

Due to missing checks in the mount syscall, and a wrong assumption on the
file systems side, the kernel could allocate an unbounded or zero-sized
memory buffer, and could dereference a NULL pointer when particular
arguments are given by a user.
 1.81.8.3 02-Nov-2011  riz branches: 1.81.8.3.2;
Pull up following revision(s) (requested by manu in ticket #1679):
sys/fs/puffs/puffs_vnops.c: revision 1.157
sys/fs/puffs/puffs_vnops.c: revision 1.158
sys/fs/puffs/puffs_vnops.c: revision 1.159
sys/fs/puffs/puffs_vfsops.c: revision 1.97
sys/fs/puffs/puffs_vfsops.c: revision 1.99
sys/fs/puffs/puffs_vnops.c: revision 1.160
sys/fs/puffs/puffs_vfsops.c: revision 1.100
sys/miscfs/syncfs/sync_subr.c: revision 1.47
sys/fs/puffs/puffs_node.c: revision 1.21
sys/fs/puffs/puffs_node.c: revision 1.22
sys/fs/puffs/puffs_msgif.c: revision 1.88
sys/fs/puffs/puffs_msgif.c: revision 1.89
sys/fs/puffs/puffs_vnops.c: revision 1.156
Make sure ioflush does not sleep in PUFFS code path, waiting for a mutex,
a memory allocation, or a response from the filesystem.
This avoids deadlocks in the following situations:
1) when memory is low: ioflush waits for the filesystem, the filesystem waits
for memory
2) when the filesystem does not respond (e.g.: network outage on a
distributed filesystem)
Fix the build that was broken by struct lwp *updateproc reference in
RUMP-visible code. Instead of checking that updateproc (aka ioflush,
aka syncer) will not sleep in PUFFS code, I check for any kernel thread:
after all none of them are designed to hang waiting for a remote filesystem
operation to complete.
Roll back the change that forced kernel threads to not sleep in PUFFS.
The change did not reach consensus, since only pagedaemon should need it.
Other threads will tolerate sleeping, and problems here are only symptoms
that something is going wrong in memory management. The cause, not the
symptoms, needs to be fixed.
Make sure pagedaemon does not sleep for memory in puffs_vnop_sleep.
Add KASSERT on any sleeping memory allocation to check it cannot happen again.
Remove #ifdef DIAGNOSTIC guards around KASSERT, as the macro contains them
 1.81.8.2 17-Jul-2011  riz Pull up following revision(s) (requested by manu in ticket #1645):
lib/libc/sys/Makefile.inc 1.207 via patch
lib/libc/sys/extattr_get_file.2 patch
lib/libpuffs/dispatcher.c 1.34,1.36 via patch
lib/libpuffs/puffs.c 1.107 via patch
lib/libpuffs/puffs.h 1.115,1.118 via patch
sys/fs/puffs/puffs_msgif.h 1.71,1.76 via patch
sys/fs/puffs/puffs_vfsops.c 1.88 via patch
sys/fs/puffs/puffs_vnops.c 1.145,1.154 via patch
sys/kern/vfs_xattr.c 1.24-1.27 via patch
sys/kern/vnode_if.c 1.87 via patch
sys/sys/Makefile 1.133 via patch
sys/sys/extattr.h 1.6 via patch
sys/sys/vnode_if.h 1.81 via patch
sys/ufs/ffs/ffs_vnops.c patch
sys/ufs/ufs/ufs_extattr.c 1.31,1.34 via patch

* support extended attributes
* bump major due to structure growth
* add some spare space
* remove ABI silliness
Support extended attributes.
Fix multiple non compliances in our Linux-like extattr API, and make it
public so that it can be used.
Improve listxattr(2) a bit. It attempted to list both system and user
extended attributes, and it failed if the calling user did not have privilege
for reading system EA. Now we just list user EA and skip system EA if
reading them is not allowed.
Fix bug introduced in previous commit: Do not vrele() a vnode we did not
obtain.
Improve UFS1 extended attributes usability
- autocreate attribute backing file for new attributes
- autoload attributes when issuing extattrctl start
- when autoloading attributes, do not display garbage warning when looking
up entries that got ENOENT
Add a flag to VOP_LISTEXTATTR(9) so that the vnode interface can tell the
filesystem in which format extended attribute shall be listed.
There are currently two formats:
- NUL-terminated strings, used for listxattr(2), this is the default.
- one byte length-prefixed, non NUL-terminated strings, used for
extattr_list_file(2), which is obtained by setting the
EXTATTR_LIST_PREFIXLEN flag to VOP_LISTEXTATTR(9)
This approach avoids the need for converting the list back and forth, except
in libperfuse, since FUSE uses NUL-terminated strings, and the kernel may
have requested EXTATTR_LIST_PREFIXLEN.
 1.81.8.1 09-Jan-2010  snj branches: 1.81.8.1.2;
Pull up following revision(s) (requested by pooka in ticket #1212):
sys/fs/puffs/puffs_msgif.c: revision 1.76 via patch
sys/fs/puffs/puffs_sys.h: revision 1.73 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.84 via patch
Process flush requests from the file server in a separate thread
context. This fixes a long-standing but seldom seen deadlock,
where the kernel was holding pages busy (due to e.g. readahead
request) while waiting for the server to respond, and the server
made a callback into the kernel asking to invalidate those pages.
... or, well, theoretically fixes, since I didn't have any reliable
way of repeating the deadlock and I think I saw it only twice.
 1.81.8.3.2.1 28-Apr-2014  sborrill Pull up the following revision(s) (requested by maxv in ticket #1901):
sys/kern/vfs_syscalls.c: revision 1.478, 1.480 via patch
sys/coda/coda_vfsops.c: revision 1.81
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.110 via patch
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59 via patch
sys/fs/udf/udf_vfsops.c: revision 1.67
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/kern/vfs_syscalls.c: revision 1.479
sys/miscfs/nullfs/null_vfsops.c: revision 1.88 via patch
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/nfs/nfs_vfsops.c: revision 1.227
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/ufs/mfs/mfs_vfsops.c: revision 1.107

Due to missing checks in the mount syscall, and a wrong assumption on the
file systems side, the kernel could allocate an unbounded or zero-sized
memory buffer, and could dereference a NULL pointer when particular
arguments are given by a user.
 1.81.8.1.2.1 28-Apr-2014  sborrill Pull up the following revision(s) (requested by maxv in ticket #1901):
sys/kern/vfs_syscalls.c: revision 1.478, 1.480 via patch
sys/coda/coda_vfsops.c: revision 1.81
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.110 via patch
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59 via patch
sys/fs/udf/udf_vfsops.c: revision 1.67
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/kern/vfs_syscalls.c: revision 1.479
sys/miscfs/nullfs/null_vfsops.c: revision 1.88 via patch
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/nfs/nfs_vfsops.c: revision 1.227
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/ufs/mfs/mfs_vfsops.c: revision 1.107

Due to missing checks in the mount syscall, and a wrong assumption on the
file systems side, the kernel could allocate an unbounded or zero-sized
memory buffer, and could dereference a NULL pointer when particular
arguments are given by a user.
 1.81.6.1 28-Apr-2009  skrll Sync with HEAD.
 1.86.2.2 17-Aug-2010  uebayasi Sync with HEAD.
 1.86.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.87.2.4 05-Mar-2011  rmind sync with head
 1.87.2.3 03-Jul-2010  rmind sync with head
 1.87.2.2 30-May-2010  rmind sync with head
 1.87.2.1 16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.95.6.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.100.8.4 27-Feb-2015  martin Pull up following revision(s) (requested by manu in ticket #1260):
lib/libpuffs/puffs.3: revision 1.55,1.60
sys/fs/puffs/puffs_msgif.h: revision 1.84
lib/libperfuse/ops.c: revision 1.83
sys/fs/puffs/puffs_sys.h: revision 1.89
sys/fs/puffs/puffs_vfsops.c: revision 1.116
lib/libperfuse/perfuse.c: revision 1.36
sys/fs/puffs/puffs_vnops.c: revision 1.200-1.202

Use more markup. New sentence, new line. Bump date for previous.

Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE
FUSE filesystems do not expect to get metadata updates for [amc]time
and size, they update the values on their own after operations.

The PUFFS PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem, and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.

Update file size after write without metadata flush
If we do not use metadata flush, we must make sure the size is updated
in the filesystem after a write, otherwise the next GETATTR will get us
a stale value and the file will be truncated.
 1.100.8.3 21-Apr-2014  bouyer Pull up following revision(s) (requested by maxv in ticket #1050):
sys/ufs/chfs/chfs_vfsops.c: revision 1.11
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/fs/nilfs/nilfs_vfsops.c: revision 1.16
sys/ufs/mfs/mfs_vfsops.c: revision 1.107
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/kern/vfs_syscalls.c: revision 1.478
sys/kern/vfs_syscalls.c: revision 1.479
sys/fs/puffs/puffs_vfsops.c: revision 1.110
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/nfs/nfs_vfsops.c: revision 1.227
sys/fs/v7fs/v7fs_vfsops.c: revision 1.10
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/miscfs/nullfs/null_vfsops.c: revision 1.88
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50
sys/coda/coda_vfsops.c: revision 1.81
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/kern/vfs_syscalls.c: revision 1.480
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/kern/vfs_syscalls.c: revision 1.482
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vfsops.c: revision 1.12
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/udf/udf_vfsops.c: revision 1.67
Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.
ok christos@
An (un)privileged user can easily make the kernel dereference a NULL
pointer.
The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).
ok christos@
Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.
If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
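
The checks described in this pull-up can be sketched in userland C as
follows. This is an illustration under assumptions, not the
vfs_syscalls.c code: the names mount_buf, mount_buf_alloc(),
mount_buf_free() and the MOUNT_DATA_MAX cap are hypothetical. It shows
rejecting zero-sized and oversized lengths, tolerating a NULL data
pointer, and remembering the allocated size separately so that a length
later modified by the file system is never used to release the buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define MOUNT_DATA_MAX	1024	/* hypothetical cap, not the kernel's value */

struct mount_buf {
	void	*data;		/* NULL when the caller passed no data */
	size_t	 alloc_len;	/* size actually allocated, kept for the free */
};

/* Validate the user-supplied length before allocating a bounce buffer. */
static int
mount_buf_alloc(struct mount_buf *mb, const void *udata, size_t data_len)
{
	assert(mb != NULL);
	mb->data = NULL;
	mb->alloc_len = 0;

	if (udata == NULL)
		return 0;		/* allowed; the fs must check for NULL */
	if (data_len == 0 || data_len > MOUNT_DATA_MAX)
		return EINVAL;		/* reject zero-sized and huge requests */

	mb->data = malloc(data_len);
	if (mb->data == NULL)
		return ENOMEM;
	mb->alloc_len = data_len;	/* data_len itself may change later */
	return 0;
}

/*
 * Release the buffer.  A sized kernel free would take mb->alloc_len here,
 * never a length the file system may have rewritten.
 */
static void
mount_buf_free(struct mount_buf *mb)
{
	free(mb->data);
	mb->data = NULL;
	mb->alloc_len = 0;
}
```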
 1.100.8.2 12-Aug-2012  martin branches: 1.100.8.2.4; 1.100.8.2.6;
Pull up following revision(s) (requested by manu in ticket #438):
lib/libperfuse/perfuse_priv.h: revision 1.31
sys/fs/puffs/puffs_msgif.h: revision 1.80
sys/fs/puffs/puffs_vnops.c: revision 1.171
lib/libpuffs/puffs_ops.3: revision 1.31
sys/fs/puffs/puffs_vnops.c: revision 1.172
sys/fs/puffs/puffs_vnops.c: revision 1.173
sys/fs/puffs/puffs_vnops.c: revision 1.174
usr.sbin/perfused/perfused.c: revision 1.24
sys/fs/puffs/puffs_sys.h: revision 1.80
sys/fs/puffs/puffs_sys.h: revision 1.81
sys/fs/puffs/puffs_sys.h: revision 1.82
lib/libperfuse/subr.c: revision 1.19
lib/libperfuse/perfuse.c: revision 1.30
sys/fs/puffs/puffs_msgif.c: revision 1.90
sys/fs/puffs/puffs_msgif.c: revision 1.91
sys/fs/puffs/puffs_msgif.c: revision 1.92
lib/libperfuse/ops.c: revision 1.59
lib/libpuffs/puffs.3: revision 1.53
lib/libperfuse/debug.c: revision 1.12
lib/libpuffs/puffs.3: revision 1.54
sys/fs/puffs/puffs_vnops.c: revision 1.167
sys/fs/puffs/puffs_msgif.h: revision 1.79
usr.sbin/perfused/msg.c: revision 1.21
sys/fs/puffs/puffs_vfsops.c: revision 1.102
sys/fs/puffs/puffs_vfsops.c: revision 1.103
sys/fs/puffs/puffs_vfsops.c: revision 1.105
lib/libpuffs/puffs.h: revision 1.123
lib/libperfuse/perfuse_if.h: revision 1.20
lib/libperfuse/perfuse.c: revision 1.29
lib/libpuffs/dispatcher.c: revision 1.42
lib/libpuffs/dispatcher.c: revision 1.43
- Fix same vnodes associated with multiple cookies
The scheme used to retrieve known nodes on lookup was flawed, as it only
used parent and name. This produced a different cookie for the same file
if it was renamed, when looking up ../ or when dealing with multiple files
associated with the same name through link(2).
We therefore abandon the use of node name and introduce hashed lists of
inodes. This causes a huge rewrite of reclaim code, which does not attempt
to keep parents allocated until all their children are reclaimed.
- Fix race conditions in reclaim
There are a few situations where we issue multiple FUSE operations for
a PUFFS operation. On reclaim, we therefore have to wait for all FUSE
operations to complete, not just the current exchanges. We do this by
introducing node reference count with node_ref() and node_rele().
- Detect data loss caused by FAF
VOP_PUTPAGES causes FAF writes where the kernel does not check the
operation result. At least issue a warning on error.
- Enjoy FAF shortcut on setattr
No need to wait for the result if the kernel does not want it. There is
however an exception for setattr that touches the size, we need to wait
for completion because we have other operations queued for after the
resize.
- Fix fchmod() on write-open file
fchmod() on a node open with write privilege will send setattr with both mode
and size set. This confuses some FUSE filesystems. Therefore we send two FUSE
operations, one for mode, and one for size.
- Remove node TTL handling for netbsd-5 for simplicity's sake. The code
still builds on netbsd-5 but does not have the node TTL feature anymore.
It works fine with kernel support on netbsd-6.
- Improve PUFFS_KFLAG_CACHE_FS_TTL by reclaiming older inactive nodes.
The normal kernel behavior is to retain inactive nodes in the freelist
until it runs out of vnodes. This has some merit for local filesystems,
where the cost of an allocation is about the same as the cost of a
lookup. But that situation is not true for distributed filesystems.
On the other hand, keeping inactive nodes for a long time holds memory
in the file server process, and when the kernel runs out of vnodes, it
produces reclaim avalanches that increase latency for other operations.
We do not reclaim inactive vnodes immediately either, as they may be
looked up again shortly. Instead we introduce a grace time and we
reclaim nodes that have been inactive beyond the grace time.
- Fix lookup/reclaim race condition.
The above improvement uncovered a race condition between lookup and
reclaim. If we reclaimed a vnode associated with a userland cookie while
a lookup returning that same cookie was in progress, then the kernel ends
up with a vnode associated with a cookie that has been reclaimed in
userland. Next operation on the cookie will crash (or at least confuse)
the filesystem.
We fix this by introducing a lookup count in kernel and userland. On
reclaim, the kernel sends the count, which enables userland to detect
situations where it initiated a lookup that is not completed in kernel.
In such a situation, the reclaim must be ignored, as the node is about
to be looked up again.
Fix unmount hang bug introduced by last commit.
We introduced a slow queue for delayed reclaims, while the existing
queue for unmount, flush and exit has been renamed fast queue. Both
queues had a timestamp for when an operation should be done, but it was
useless for the fast queue, which is always used to run an operation
ASAP. And the timestamp test had an error that turned ASAP into "at next
tick", but nobody was there to wake the thread at next tick, hence
the hang. The fix is to remove the useless and buggy timestamp test for
fast queue.
Rename slow sopreq queue into node sopreq queue, to reflect the fact that
it is only intended for postponed node reclaims.
When purging the node sopreq queue, do not call puffs_msg_sendresp(), as
it makes no sense.
Fix race condition between (create|mknod|mkdir|symlink) and reclaim, just
like we did it between lookup and reclaim.
Missing bit in previous commit (prevent race between create|mknod|mkdir|symlink
and reclaim)
Bump date for previous.
New sentence, new line; remove trailing whitespace; fix typos;
punctuation nits.
Add PUFFS_KFLAG_CACHE_DOTDOT so that vnodes hold a reference on their
parent, keeping them active, and allowing to lookup .. without sending
a request to the filesystem.
Enable the feature for perfused, as this is how FUSE works.
Missing bit in previous commit (PUFFS_KFLAG_CACHE_DOTDOT option to avoid
looking up ..)
 1.100.8.1 23-Apr-2012  riz Pull up following revision(s) (requested by manu in ticket #195):
lib/libskey/skeysubr.c: revision 1.27
lib/libkvm/kvm_getloadavg.c: revision 1.11
lib/libwrap/update.c: revision 1.9
lib/liby/yyerror.c: revision 1.9
lib/libpuffs/puffs_ops.3: revision 1.30
lib/libwrap/misc.c: revision 1.10
lib/libwrap/hosts_access.c: revision 1.20
lib/libpuffs/pnode.c: revision 1.11
lib/libperfuse/subr.c: revision 1.17
lib/libpuffs/pnode.c: revision 1.12
lib/libperfuse/subr.c: revision 1.18
lib/libwrap/options.c: revision 1.15
lib/libwrap/fix_options.c: revision 1.11
lib/libperfuse/ops.c: revision 1.52
lib/libperfuse/ops.c: revision 1.53
lib/libperfuse/ops.c: revision 1.54
lib/libwrap/hosts_ctl.c: revision 1.5
lib/libintl/gettext.c: revision 1.27
lib/libwrap/shell_cmd.c: revision 1.6
lib/libpuffs/dispatcher.c: revision 1.39
lib/libperfuse/perfuse_priv.h: revision 1.27
lib/libwrap/socket.c: revision 1.19
lib/libpuffs/puffs.3: revision 1.50
lib/libperfuse/perfuse_priv.h: revision 1.28
lib/libpuffs/puffs_priv.h: revision 1.45
lib/libpuffs/puffs.3: revision 1.51
lib/libperfuse/perfuse_priv.h: revision 1.29
lib/libwrap/percent_x.c: revision 1.5
lib/libpuffs/puffs.3: revision 1.52
lib/libperfuse/debug.c: revision 1.11
sys/fs/puffs/puffs_vnops.c: revision 1.165
lib/libwrap/tcpd.h: revision 1.13
sys/fs/puffs/puffs_vnops.c: revision 1.166
lib/libwrap/eval.c: revision 1.7
sys/fs/puffs/puffs_msgif.h: revision 1.78
sys/fs/puffs/puffs_vfsops.c: revision 1.101
lib/libwrap/rfc931.c: revision 1.9
lib/libwrap/clean_exit.c: revision 1.5
lib/libpuffs/puffs.h: revision 1.120
lib/libc/stdlib/jemalloc.c: revision 1.27
lib/librmt/rmtlib.c: revision 1.26
lib/libpuffs/puffs.h: revision 1.121
sys/fs/puffs/puffs_sys.h: revision 1.79
lib/librumpclient/rumpclient.c: revision 1.48
lib/libwrap/refuse.c: revision 1.5
lib/libperfuse/perfuse.c: revision 1.26
lib/libperfuse/perfuse.c: revision 1.27
tests/fs/puffs/t_fuzz.c: revision 1.5
lib/libperfuse/perfuse.c: revision 1.28
lib/libpuffs/dispatcher.c: revision 1.40
sys/fs/puffs/puffs_node.c: revision 1.24
lib/libwrap/diag.c: revision 1.9
lib/libintl/textdomain.c: revision 1.13
Use C89 function definition
Add name and attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
Add PUFFS_KFLAG_CACHE_FS_TTL flag to puffs_init(3) to use name and
attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
The filesystem updates attributes and TTL using
puffs_pn_getvap(3), puffs_pn_getvattl(3), and puffs_pn_getcnttl(3)
Use new PUFFS_KFLAG_CACHE_FS_TTL option to puffs_init(3) so that
FUSE TTL on name and attributes are used. This saves many PUFFS
operations and improves performance.
PUFFS_KFLAG_CACHE_FS_TTL is #ifdef'ed in many places for now so that
libperfuse can still be used on netbsd-5.
Split file system.
Comma fixes.
Remove dangling "and".
Bump date for previous.
- Make sure update_va does not change vnode size when it should not. For
instance when doing a fault-issued VOP_GETPAGES within VOP_WRITE, changing
size leads to panic: genfs_getpages: past eof.
- Handle ticks wrap around for vnode name and attribute timeout
- When using PUFFS_KFLAG_CACHE_FS_TTL, do not use puffs_node to carry
attribute and TTL for a newly created node. Instead extend puffs_newinfo
and add puffs_newinfo_setva() and puffs_newinfo_setttl()
- Remove node_mk_common_final in libperfuse. It used to set uid/gid for
a newly created vnode but has been made redundant a long time ago since
uid and gid are properly set in FUSE header.
- In libperfuse, check for corner case where opc = 0 on INACTIVE and RECLAIM
(how is it possible? Check for it to avoid a crash anyway)
- In libperfuse, make sure we unlimit RLIMIT_AS and RLIMIT_DATA so that
we do not run out of memory because the kernel is lazy at reclaiming vnodes.
- In libperfuse, cleanup style of perfuse_destroy_pn()
Do not set PUFFS_KFLAG_CACHE_FS_TTL for PUFFS tests
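
The "ticks wrap around" item in the entry above refers to comparing
timeouts against a free-running tick counter. A common wrap-safe
comparison can be sketched as below. This is an assumption-laden
illustration, not the puffs code: the function name ticks_expired() is
hypothetical, and the counter is modelled as an unsigned int.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Return true once `now` has reached or passed `deadline`, where both are
 * samples of a free-running unsigned tick counter that may wrap around.
 * Valid as long as the two samples are less than half the counter range
 * apart.  Unsigned subtraction wraps modulo UINT_MAX + 1, so a small
 * difference means "deadline is in the past" even across a wrap.
 */
static bool
ticks_expired(unsigned int now, unsigned int deadline)
{
	return now - deadline <= UINT_MAX / 2;
}
```

A naive `now >= deadline` test would report a deadline set just before
the wrap as still pending for nearly the whole counter range after the
counter rolls over.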
 1.100.8.2.6.1 21-Apr-2014  bouyer Pull up following revision(s) (requested by maxv in ticket #1050):
sys/ufs/chfs/chfs_vfsops.c: revision 1.11
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/fs/nilfs/nilfs_vfsops.c: revision 1.16
sys/ufs/mfs/mfs_vfsops.c: revision 1.107
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/kern/vfs_syscalls.c: revision 1.478
sys/kern/vfs_syscalls.c: revision 1.479
sys/fs/puffs/puffs_vfsops.c: revision 1.110
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/nfs/nfs_vfsops.c: revision 1.227
sys/fs/v7fs/v7fs_vfsops.c: revision 1.10
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/miscfs/nullfs/null_vfsops.c: revision 1.88
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50
sys/coda/coda_vfsops.c: revision 1.81
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/kern/vfs_syscalls.c: revision 1.480
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/kern/vfs_syscalls.c: revision 1.482
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vfsops.c: revision 1.12
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/udf/udf_vfsops.c: revision 1.67
Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.
ok christos@
An (un)privileged user can easily make the kernel dereference a NULL
pointer.
The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).
ok christos@
Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.
If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
 1.100.8.2.4.1 21-Apr-2014  bouyer Pull up following revision(s) (requested by maxv in ticket #1050):
sys/ufs/chfs/chfs_vfsops.c: revision 1.11
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/fs/nilfs/nilfs_vfsops.c: revision 1.16
sys/ufs/mfs/mfs_vfsops.c: revision 1.107
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/kern/vfs_syscalls.c: revision 1.478
sys/kern/vfs_syscalls.c: revision 1.479
sys/fs/puffs/puffs_vfsops.c: revision 1.110
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/nfs/nfs_vfsops.c: revision 1.227
sys/fs/v7fs/v7fs_vfsops.c: revision 1.10
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/miscfs/nullfs/null_vfsops.c: revision 1.88
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50
sys/coda/coda_vfsops.c: revision 1.81
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/kern/vfs_syscalls.c: revision 1.480
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/kern/vfs_syscalls.c: revision 1.482
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vfsops.c: revision 1.12
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/udf/udf_vfsops.c: revision 1.67
Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.
ok christos@
An (un)privileged user can easily make the kernel dereference a NULL
pointer.
The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).
ok christos@
Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.
If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
 1.100.6.1 29-Apr-2012  mrg sync to latest -current.
 1.100.2.4 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.100.2.3 23-Jan-2013  yamt sync with head
 1.100.2.2 30-Oct-2012  yamt sync with head
 1.100.2.1 17-Apr-2012  yamt sync with head
 1.106.2.3 03-Dec-2017  jdolecek update from HEAD
 1.106.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.106.2.1 25-Feb-2013  tls resync with head
 1.107.2.1 18-May-2014  rmind sync with head
 1.109.2.1 10-Aug-2014  tls Rebase.
 1.113.2.4 15-Mar-2015  snj Pull up following revision(s) (requested by tron in ticket #587):
sys/fs/puffs/puffs_vfsops.c: revision 1.117
Remove debug printf
 1.113.2.3 27-Feb-2015  martin Pull up following revision(s) (requested by manu in ticket #555):
lib/libpuffs/puffs.3: revision 1.60
sys/fs/puffs/puffs_msgif.h: revision 1.84
lib/libperfuse/ops.c: revision 1.83
sys/fs/puffs/puffs_sys.h: revision 1.89
sys/fs/puffs/puffs_vfsops.c: revision 1.116
lib/libperfuse/perfuse.c: revision 1.36
sys/fs/puffs/puffs_vnops.c: revision 1.200-1.202

Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE

FUSE filesystems do not expect to get metadata updates for [amc]time
and size, they update the values on their own after operations.

The PUFFS PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem, and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.

Update file size after write without metadata flush
If we do not use metadata flush, we must make sure the size is updated
in the filesystem after a write, otherwise the next GETATTR will get us
a stale value and the file will be truncated.
 1.113.2.2 17-Jan-2015  martin Pull up following revision(s) (requested by maxv in ticket #427):
sys/compat/svr4/svr4_schedctl.c: revision 1.8
sys/netinet/tcp_timer.c: revision 1.88
sys/miscfs/genfs/layer_vfsops.c: revision 1.45
sys/compat/svr4/svr4_ioctl.c: revision 1.37
sys/ufs/chfs/chfs_vfsops.c: revision 1.14
sys/miscfs/fdesc/fdesc_vfsops.c: revision 1.91
sys/compat/linux/arch/i386/linux_ptrace.c: revision 1.30
sys/compat/common/kern_time_50.c: revision 1.28
sys/netinet6/ip6_forward.c: revision 1.74
sys/miscfs/umapfs/umap_vnops.c: revision 1.57
sys/compat/svr4/svr4_fcntl.c: revision 1.74
distrib/sets/lists/comp/mi: revision 1.1931
sys/netinet6/udp6_output.c: revision 1.46
sys/fs/puffs/puffs_compat.c: revision 1.3
sys/fs/udf/udf_rename.c: revision 1.11
sys/compat/svr4/svr4_filio.c: revision 1.24
sys/fs/udf/udf_rename.c: revision 1.12
sys/netinet/tcp_usrreq.c: revision 1.202
sys/miscfs/umapfs/umap_subr.c: revision 1.29
sys/compat/linux/common/linux_fadvise64.c: revision 1.3
sys/netinet/if_atm.c: revision 1.34
sys/miscfs/procfs/procfs_subr.c: revision 1.106
sys/miscfs/genfs/layer_subr.c: revision 1.37
sys/netinet/tcp_sack.c: revision 1.30
sys/compat/freebsd/freebsd_misc.c: revision 1.33
sys/compat/freebsd/freebsd_file.c: revision 1.33
sys/ufs/chfs/chfs_vnode.c: revision 1.12
sys/compat/svr4/svr4_ttold.c: revision 1.34
sys/compat/linux/common/linux_file.c: revision 1.114
sys/compat/linux/arch/mips/linux_machdep.c: revision 1.43
sys/compat/linux/common/linux_signal.c: revision 1.76
sys/compat/common/compat_util.c: revision 1.46
sys/compat/linux/arch/arm/linux_ptrace.c: revision 1.18
sys/compat/svr4/svr4_sockio.c: revision 1.36
sys/compat/linux/arch/arm/linux_machdep.c: revision 1.32
sys/compat/svr4/svr4_signal.c: revision 1.66
sys/kern/kern_exec.c: revision 1.410
sys/fs/puffs/puffs_vfsops.c: revision 1.115
sys/compat/svr4/svr4_exec_elf64.c: revision 1.15
sys/compat/linux/arch/i386/linux_machdep.c: revision 1.159
sys/compat/linux/arch/alpha/linux_machdep.c: revision 1.50
sys/compat/linux32/common/linux32_misc.c: revision 1.24
sys/netinet/in_pcb.c: revision 1.153
sys/sys/malloc.h: revision 1.116
sys/compat/common/if_43.c: revision 1.9
share/man/man9/Makefile: revision 1.380
sys/netinet/tcp_vtw.c: revision 1.12
sys/miscfs/umapfs/umap_vfsops.c: revision 1.95
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.186
sys/compat/common/uipc_syscalls_43.c: revision 1.46
sys/ufs/ext2fs/ext2fs_vnops.c: revision 1.115
sys/fs/puffs/puffs_msgif.c: revision 1.97
sys/compat/svr4/svr4_ipc.c: revision 1.27
sys/compat/linux/common/linux_exec.c: revision 1.117
sys/ufs/ext2fs/ext2fs_readwrite.c: revision 1.66
sys/netinet/tcp_output.c: revision 1.179
sys/compat/svr4/svr4_termios.c: revision 1.28
sys/fs/udf/udf_strat_bootstrap.c: revision 1.4
sys/fs/puffs/puffs_subr.c: revision 1.67
sys/fs/puffs/puffs_node.c: revision 1.36
sys/miscfs/overlay/overlay_vnops.c: revision 1.21
sys/fs/cd9660/cd9660_node.c: revision 1.34
sys/netinet/raw_ip.c: revision 1.146
sys/sys/mallocvar.h: revision 1.13
sys/miscfs/overlay/overlay_vfsops.c: revision 1.63
share/man/man9/malloc.9: revision 1.50
sys/netinet6/dest6.c: revision 1.18
sys/compat/linux/common/linux_uselib.c: revision 1.33
sys/compat/linux/common/linux_socket.c: revision 1.120
share/man/man9/malloc.9: revision 1.51
sys/netinet/tcp_subr.c: revision 1.257
sys/compat/linux/common/linux_socketcall.c: revision 1.45
sys/compat/linux/common/linux_fadvise64_64.c: revision 1.3
sys/compat/freebsd/freebsd_ipc.c: revision 1.17
sys/compat/linux/common/linux_misc_notalpha.c: revision 1.109
sys/compat/linux/arch/alpha/linux_pipe.c: revision 1.17
sys/netinet6/in6_pcb.c: revision 1.132
sys/netinet6/in6_ifattach.c: revision 1.94
sys/compat/svr4/svr4_exec_elf32.c: revision 1.15
sys/miscfs/nullfs/null_vfsops.c: revision 1.90
sys/fs/cd9660/cd9660_util.c: revision 1.12
sys/compat/linux/arch/powerpc/linux_machdep.c: revision 1.48
sys/compat/freebsd/freebsd_exec_elf32.c: revision 1.20
sys/miscfs/procfs/procfs_vfsops.c: revision 1.94
sys/compat/linux/arch/powerpc/linux_ptrace.c: revision 1.28
sys/compat/linux/common/linux_sched.c: revision 1.67
sys/compat/linux/common/linux_exec_aout.c: revision 1.67
sys/compat/linux/common/linux_pipe.c: revision 1.67
sys/compat/linux/common/linux_llseek.c: revision 1.34
sys/compat/linux/arch/mips/linux_ptrace.c: revision 1.10
Do not uselessly include <sys/malloc.h>.
Cleanup:
- remove struct kmembuckets (dead)
- correctly deadify MALLOC_XX
- remove MALLOC_DEFINE_LIMIT and MALLOC_JUSTDEFINE_LIMIT (dead)
- remove malloc_roundup(), malloc_type_setlimit(), MALLOC_DEFINE_LIMIT()
and MALLOC_JUSTDEFINE_LIMIT() from man 9 malloc
New sentence, new line. Bump date for previous.
Obsolete malloc_roundup(9), malloc_type_setlimit(9) and MALLOC_DEFINE_LIMIT(9)
man pages.
 1.113.2.1 29-Aug-2014  martin Pull up following revision(s) (requested by hannken in ticket #67):
sys/fs/puffs/puffs_sys.h: revision 1.86
sys/fs/puffs/puffs_vfsops.c: revision 1.114
sys/fs/puffs/puffs_msgif.c: revision 1.95
sys/fs/puffs/puffs_node.c: revision 1.32
sys/fs/puffs/puffs_vnops.c: revision 1.184
Change puffs from hashlist to vcache.
- field "pa_nhashbuckets" of struct "puffs_kargs" becomes a no-op.
and should be removed on the next protocol version bump.
 1.115.2.3 28-Aug-2017  skrll Sync with HEAD
 1.115.2.2 27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.115.2.1 06-Apr-2015  skrll Sync with HEAD
 1.118.4.1 21-Apr-2017  bouyer Sync with HEAD
 1.118.2.2 26-Apr-2017  pgoyette Sync with HEAD
 1.118.2.1 20-Mar-2017  pgoyette Sync with HEAD
 1.120.12.1 25-Jun-2018  pgoyette Sync with HEAD
 1.121.2.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.121.2.1 08-Apr-2020  martin Merge changes from current as of 20200406
 1.123.2.2 29-Feb-2020  ad Sync with head.
 1.123.2.1 17-Jan-2020  ad Sync with head.
 1.125.8.1 03-Apr-2021  thorpej Sync with HEAD.
 1.125.6.1 03-Apr-2021  thorpej Sync with HEAD.
 1.226 09-Feb-2024  andvar fix spelling mistakes, mainly in comments and log messages.
 1.225 23-Feb-2022  andvar fix various typos in comments, mainly immediatly/immediately/,
as well as shared and recently fixed typos in OpenBSD code by Jonathan Grey.
 1.224 05-Dec-2021  msaitoh s/invlid/invalid/ in comment.
 1.223 20-Oct-2021  thorpej Overhaul of the EVFILT_VNODE kevent(2) filter:

- Centralize vnode kevent handling in the VOP_*() wrappers, rather than
forcing each individual file system to deal with it (except VOP_RENAME(),
because VOP_RENAME() is a mess and we currently have 2 different ways
of handling it; at least it's reasonably well-centralized in the "new"
way).
- Add support for NOTE_OPEN, NOTE_CLOSE, NOTE_CLOSE_WRITE, and NOTE_READ,
compatible with the same events in FreeBSD.
- Track which kevent notifications clients are interested in receiving
to avoid doing work for events no one cares about (avoiding, e.g.
taking locks and traversing the klist to send a NOTE_WRITE when
someone is merely watching for a file to be deleted).

In support of the above:

- Add support in vnode_if.sh for specifying PRE- and POST-op handlers,
to be invoked before and after vop_pre() and vop_post(), respectively.
Basic idea from FreeBSD, but implemented differently.
- Add support in vnode_if.sh for specifying CONTEXT fields in the
vop_*_args structures. These context fields are used to convey information
between the file system VOP function and the VOP wrapper, but do not
occupy an argument slot in the VOP_*() call itself. These context fields
are initialized and subsequently interpreted by PRE- and POST-op handlers.
- Version VOP_REMOVE(), using a context field for the file system to report
back the resulting link count of the target vnode. Return this in tmpfs,
udf, nfs, chfs, ext2fs, lfs, and ufs.

NetBSD 9.99.92.
 1.222 24-Jul-2021  andvar Fix all remaining typos, mainly in comments but also in a few definitions and log messages, reported by me in PR kern/54889.
Also fixed some additional typos in comments, found on review of same files or typos.
 1.221 19-Jul-2021  dholland Abolish all the silly indirection macros for initializing vnode ops tables.

These are things of the form #define foofs_op genfs_op, or #define
foofs_op genfs_eopnotsupp, or similar. They serve no purpose besides
obfuscation, and have gotten cutpasted all over everywhere.

Part 2; cvs randomly didn't commit these changes before, and then hid
them from me until I touched the files to force it to rethink. Dunno
what happened.

There's probably more of these, going to have to scan the tree the
hard way.
 1.220 18-Jul-2021  dholland Use macros for the canned parts of device and fifo vnode op tables.

Add GENFS_SPECOP_ENTRIES and GENFS_FIFOOP_ENTRIES macros that contain
the portion of the vnode ops table declaration that is
(conservatively) the same in every fs. Use these in every fs that
supports devices and/or fifos with separate ops tables.

Note that ptyfs works differently (it has one type of vnode with
open-coded dispatch to the specfs code, which I haven't changed in
this commit) and rump/librump/rumpvfs/rumpfs.c has an indirect dynamic
dispatch that already does more or less the same thing, which I also
haven't changed.

Also note that this anticipates a few bits in the next changeset here
and there, and adds missing but unreachable calls in some cases (e.g.
most fses weren't defining whiteout on devices and fifos, but it isn't
reachable there), and it changes parsepath on devices and fifos to
genfs_badop from genfs_parsepath (but it's not reachable there
either).

It appears that devices in kernfs were missing kqfilter, so it's
possible that if you try to use kqueue on /kern/rootdev that it'll
explode.

And finally note that the ops declaration tables aren't
order-dependent. (Other than vop_default_desc has to come first.)
Otherwise this wouldn't work.
 1.219 29-Jun-2021  dholland Now remove cn_consume from struct componentname.

This change requires a kernel bump.

Note though that I'm not going to version the VOP_LOOKUP args
structure (or any other args structure) as code that doesn't touch
cn_consume doesn't need attention and code that does will fail on it
without further intervention.
 1.218 29-Jun-2021  dholland - Add a new vnode op: VOP_PARSEPATH.
- Move namei_getcomponent to genfs_vnops.c and call it genfs_parsepath.
- Add a parsepath entry to every vnode ops table.

VOP_PARSEPATH takes a directory vnode to be searched and a complete
following path and chooses how much of that path to consume. To begin
with, all parsepath calls are genfs_parsepath, which locates the first
'/' as always.

Note that the call doesn't take the whole struct componentname, only
the string. The other bits of struct componentname should not be
needed and there's no reason to cause potential complications by
exposing them.
 1.217 16-May-2020  christos branches: 1.217.6;
Add ACL support for FFS. From FreeBSD.
 1.216 15-May-2020  maxv hardclock_ticks -> getticks()
 1.215 23-Apr-2020  ad PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)

- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
somewhere. Use it to decide whether to do direct-mapped copy, rather than
poking around directly in the vnode in ubc_uiomove(), which is ugly and
doesn't work for tmpfs. It would be nicer to contain all this in UVM but
the filesystem provides the needed locking here (VV_MAPPED) and to
reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS(). Pass in UBC_ISMAPPED where
appropriate.
 1.214 23-Feb-2020  ad branches: 1.214.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
 1.213 06-Nov-2018  manu branches: 1.213.6;
Fix use after RECLAIM in PUFFS filesystems

From hannken@

When puffs_cookie2vnode() misses an entry and vrele()s it, the operations
puffs_vnop_reclaim() and puffs_vnop_fsync() get called with a VNON
vnode.

Do not notify the server in this case as the cookie is stale.
 1.212 05-Nov-2018  manu Add missing mutex pn->pn_sizemtx lock in puffs_vnop_open()

puffs_vnop_open() calls flushvncache(), which calls dosetattr()
if pn->pn_stat has PNODE_METACACHE_MASK. In that case, the lock
on pn->pn_sizemtx is mandatory and asserted.
 1.211 26-May-2017  riastradh branches: 1.211.2; 1.211.8; 1.211.10;
Make VOP_RECLAIM do the last unlock of the vnode.

VOP_RECLAIM naturally has exclusive access to the vnode, so having it
locked on entry is not strictly necessary -- but it means if there
are any final operations that must be done on the vnode, such as
ffs_update, requiring exclusive access to it, we can now kassert that
the vnode is locked in those operations.

We can't just have the caller release the last lock because some file
systems don't use genfs_lock, and require the vnode to remain valid
for VOP_UNLOCK to work, notably unionfs.
 1.210 26-Apr-2017  riastradh Change VOP_REMOVE and VOP_RMDIR to preserve lock/ref on dvp.

No change to vp -- the plan is to replace the node by the
componentname in the vop parameters, and let all directory vops do
lookups internally.

Proposed on tech-kern with no objections:
https://mail-index.netbsd.org/tech-kern/2017/04/17/msg021825.html
 1.209 11-Apr-2017  riastradh Make VOP_INACTIVE preserve vnode lock on return.

Discussed on tech-kern:
https://mail-index.netbsd.org/tech-kern/2017/04/01/msg021751.html

Ride 7.99.68, a bumpy bus of incremental vfs improvements!
 1.208 08-Apr-2017  hannken Update mtime when updating file size.

PR kern/51762 (mtime not updated by open(O_TRUNC))
 1.207 06-Apr-2017  christos use ubc_zerorange
 1.206 04-Apr-2017  christos use MAX_PAGE_SIZE.
 1.205 21-Jul-2016  christos branches: 1.205.2;
replace variable stack declaration with a large enough one and KASSERT.
 1.204 07-Jul-2016  msaitoh branches: 1.204.2;
KNF. Remove extra spaces. No functional change.
 1.203 20-Apr-2015  riastradh Make VOP_LINK return directory still locked and referenced.

Ride 7.99.10 bump.
 1.202 25-Feb-2015  christos make this compile again.
 1.201 25-Feb-2015  manu Update file size after write without metadata flush

If we do not use metadata flush, we must make sure the size is updated
in the filesystem after a write, otherwise the next GETATTR will get us
a stale value and the file will be truncated.
 1.200 15-Feb-2015  manu Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE

FUSE filesystems do not expect to get metadata updates for [amc]time
and size, they update the values on their own after operations.

The PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem, and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.
 1.199 13-Jan-2015  manu Make sure reads on empty files reach PUFFS filesystems

Sending a read through the page cache will get the operation
short-circuited. This is a problem with some filesystems that
expect to receive the read operation in order to update atime.

We fix that by bypassing the page cache when reading a file
with a size known to be zero.
 1.198 04-Nov-2014  manu branches: 1.198.2;
PUFFS direct I/O cache fix

There are a few situations where we must take care of the cache if direct
I/O was enabled:
- if we do direct I/O for write but not for read, then any write must
invalidate the cache so that a reader gets the written data and not
the not-updated cache.
- if we used a vnode without direct I/O and it is enabled for writing,
we must flush the cache before completing the open operation, so that
the cached writes are not lost.

And at inactive time, we wipe direct I/O flags so that a new open without
direct I/O does not inherit direct I/O.
 1.197 04-Nov-2014  manu Fix PUFFS node use-after-reclaim

When puffs_cookie2vnode() misses an entry, vcache_get()
creates a new node (puffs_vfsop_loadvnode being called to
initialize the PUFFS part), then it discovers it is VNON,
and tries to vrele() it. vrele() calls VOP_INACTIVE(),
which led us in puffs_vnop_inactive() where we sent a
request to the filesystem for a node that already had been
reclaimed.

The fix is to check for VNON nodes in puffs_vnop_inactive()
and to return without doing anything. This is suboptimal, but
a better workaround would probably need to modify vcache API,
with an impact on other filesystems. Let us keep it simple.
 1.196 31-Oct-2014  manu Add PUFFS support for fallocate and fdiscard operations
 1.195 31-Oct-2014  manu According to pooka@'s comment, a long time ago, VOP_STRATEGY could not
fail without taking down the kernel. It seems this is not the case anymore,
hence we can stop dropping errors in puffs_vnop_strategy()

Approved by pooka@
 1.194 07-Oct-2014  he Do the previous correctly...
 1.193 07-Oct-2014  he As is evidenced by several of our 32-bit MIPS ports, it's wrong to
print vsize_t with PRIx64 -- instead use our own PRIxVSIZE macro.
 1.192 06-Oct-2014  he Make this build again without debugging enabled; DPRINTF() can end up
as empty, and in an if conditional, you then need braces if that's the
only potential body.
 1.191 06-Oct-2014  manu Restore LP64 fix that was removed by mistake
 1.190 06-Oct-2014  manu Improve zero-fill of last page after shrink fix:
1) do it only if the file is open for writing, otherwise we send write
requests to the FS on a file that has never been open.
2) do it inside existing if (vap->va_size != VNOVAL) block
 1.189 05-Oct-2014  justin Use PRIx64 for printing offsets
 1.188 05-Oct-2014  manu If we truncate the file, make sure we zero-fill the end of the last
page, otherwise if the file is later truncated to a larger size
(creating a hole), that area will not return zeroes as it should.
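The zero-fill described in 1.188 can be illustrated with a small calculation. This is only a sketch under stated assumptions, not the puffs code: `tail_zero_len` and its `pagesize` parameter are hypothetical names, and the real change works on the page cache rather than on plain integers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes between the new end of file and
 * the end of the page that contains it. These are the bytes that would
 * need zeroing after a shrinking truncate, so that a later extension
 * of the file reads back zeroes from the hole.
 */
static uint64_t
tail_zero_len(uint64_t newsize, uint64_t pagesize)
{
	assert(pagesize != 0);

	uint64_t off = newsize % pagesize;	/* offset within last page */
	return off == 0 ? 0 : pagesize - off;	/* page-aligned: nothing to do */
}
```

With a 4096-byte page, truncating to 5000 bytes leaves 3192 stale bytes in the last page; truncating to an exact page multiple leaves none.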
 1.187 30-Sep-2014  hannken Fix the puffs_sop_thread -> puffs_cookie2vnode path:
- pass the cookie by reference
- add missing mutex_exit()
- update assertion for VNON typed vnodes
 1.186 11-Sep-2014  manu PUFFS fixes for size update after write plus read/write sanity checks

- Always update kernel metadata cache for size when writing
This fixes situation where size update after appending to a file lagged
- Make read/write nilpotent when called with null size, as FFS does
- Return EFBIG instead of EINVAL for negative offsets, as FFS does
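The two sanity checks listed in 1.186 could look roughly like the sketch below. `rw_precheck` is a hypothetical name and the order of the two tests is an assumption; the log only states that zero-size transfers do nothing and that negative offsets yield EFBIG.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical pre-check, not the puffs source. Returns an errno value
 * (0 on success) and sets *done when the caller should return at once
 * without contacting the file server.
 */
static int
rw_precheck(off_t offset, size_t resid, bool *done)
{
	assert(done != NULL);

	*done = false;
	if (resid == 0) {
		*done = true;	/* zero-length transfer: nothing to do */
		return 0;
	}
	if (offset < 0)
		return EFBIG;	/* negative offset, reported as FFS does */
	return 0;
}
```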
 1.185 05-Sep-2014  manu When changing a directory's content, update the ctime/mtime in kernel cache,
otherwise the updated ctime/mtime appears only after the cached entry expires.
 1.184 28-Aug-2014  hannken Change puffs from hashlist to vcache.
- field "pa_nhashbuckets" of struct "puffs_kargs" becomes a no-op
and should be removed on the next protocol version bump.
 1.183 16-Aug-2014  manu Add an oflags input field to open requests so that the filesystem can pass
back information about the file. Implement PUFFS_OPEN_IO_DIRECT, which
will force direct IO (bypassing page cache) for the file.
 1.182 25-Jul-2014  dholland branches: 1.182.2;
Add VOP_FALLOCATE and VOP_FDISCARD to every vnode ops table I can
find.

The filesystem ones all call genfs_eopnotsupp - right now I am only
implementing the plumbing and we can implement fallocate and/or
fdiscard for files later.

The device ones call spec_fallocate (which is also genfs_eopnotsupp)
and spec_fdiscard, which dispatches to the device-level op.

The fifo ones all call vn_fifo_bypass, which also ends up being
EOPNOTSUPP.
 1.181 24-Mar-2014  hannken branches: 1.181.2;
- Make VI_XLOCK, VI_CLEAN and VI_LOCKSHARE private to kern/vfs_*.c.
- Make vwait() static.
- Add vdead_check() to check a vnode for being or becoming dead.

Discussed on tech-kern.

Welcome to 6.99.38
 1.180 07-Feb-2014  hannken Change vnode operation lookup to return the resulting vnode *vpp unlocked.
Change cache_lookup() to return an unlocked vnode.

Discussed on tech-kern@

Welcome to 6.99.31
 1.179 23-Jan-2014  hannken Change vnode operations create, mknod, mkdir and symlink to return
the resulting vnode *vpp unlocked.

Discussed on tech-kern@

Welcome to 6.99.30
 1.178 17-Jan-2014  hannken Change vnode operations create, mknod, mkdir and symlink to keep the
directory node dvp locked on return.

Discussed on tech-kern@

Welcome to 6.99.29
 1.177 17-Oct-2013  christos - remove unused variables
- add _NOERROR flavor macros for the case where errors are ignored.
 1.176 05-Nov-2012  dholland branches: 1.176.2;
Excise struct componentname from the namecache.

This uglifies the interface, because several operations need to be
passed the namei flags and cache_lookup also needs for the time being
to be passed cnp->cn_nameiop. Nonetheless, it's a net benefit.

The glop should be able to go away eventually but requires structural
cleanup elsewhere first.

This change requires a kernel bump.
 1.175 05-Nov-2012  dholland Disentangle the namecache from the internals of namei.

- Move the namecache's hash computation to inside the namecache code,
instead of being spread out all over the place. Remove cn_hash from
struct componentname and delete all uses of it.

- It is no longer necessary (if it ever was) for cache_lookup and
cache_lookup_raw to clear MAKEENTRY from cnp->cn_flags for the cases
that cache_enter already checks for.

- Rearrange the interface of cache_lookup (and cache_lookup_raw) to
make it somewhat simpler, to exclude certain nonexistent error
conditions, and (most importantly) to make it not require write access
to cnp->cn_flags.

This change requires a kernel bump.
 1.174 10-Aug-2012  manu branches: 1.174.2;
Add PUFFS_KFLAG_CACHE_DOTDOT so that vnodes hold a reference on their
parent, keeping them active, and allowing to lookup .. without sending
a request to the filesystem.

Enable the feature for perfused, as this is how FUSE works.
 1.173 10-Aug-2012  manu Missing bit in previous commit (prevent race between create|mknod|mkdir|symlink
and reclaim)
 1.172 10-Aug-2012  manu Fix race condition between (create|mknod|mkdir|symlink) and reclaim, just
like we did it between lookup and reclaim.
 1.171 27-Jul-2012  manu Rename slow sopreq queue into node sopreq queue, to reflect the fact that
it is only intended for postponed node reclaims.
When purging the node sopreq queue, do not call puffs_msg_sendresp(), as
it makes no sense.
 1.170 23-Jul-2012  manu Backout NCHNAMLEN check for cache_enter. That change collided with rmind's
move of this exact check into cache_enter
 1.169 23-Jul-2012  manu Do not call cache_enter with path components bigger than NCHNAMLEN, as it
panics the kernel.
 1.168 22-Jul-2012  rmind Move some of the tests for MAKEENTRY into cache_enter(9). Make some
variables in vfs_cache.c static, __read_mostly, etc.

No objection on tech-kern@.
 1.167 21-Jul-2012  manu - Improve PUFFS_KFLAG_CACHE_FS_TTL by reclaiming older inactive nodes.

The normal kernel behavior is to retain inactive nodes in the freelist
until it runs out of vnodes. This has some merit for local filesystems,
where the cost of an allocation is about the same as the cost of a
lookup. But that situation is not true for distributed filesystems.
On the other hand, keeping inactive nodes for a long time holds memory
in the file server process, and when the kernel runs out of vnodes, it
produces reclaim avalanches that increase latency for other operations.

We do not reclaim inactive vnodes immediately either, as they may be
looked up again shortly. Instead we introduce a grace time and we
reclaim nodes that have been inactive beyond the grace time.

- Fix lookup/reclaim race condition.

The above improvement uncovered a race condition between lookup and
reclaim. If we reclaimed a vnode associated with a userland cookie while
a lookup returning that same cookie was in progress, then the kernel ends
up with a vnode associated with a cookie that has been reclaimed in
userland. Next operation on the cookie will crash (or at least confuse)
the filesystem.

We fix this by introducing a lookup count in kernel and userland. On
reclaim, the kernel sends the count, which enables userland to detect
the situation where it initiated a lookup that is not completed in kernel.
In such a situation, the reclaim must be ignored, as the node is about
to be looked up again.
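The lookup-count handshake from 1.167 can be pictured with the following sketch. The structure and function names are hypothetical and the real protocol fields are not shown in this log; the only idea taken from the commit message is that userland compares its own count with the one the kernel sends along with the reclaim.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland per-node bookkeeping. */
struct node_sketch {
	unsigned int lookups_answered;	/* lookups the server replied to */
};

/*
 * Returns true when the reclaim must be ignored: the server has
 * answered a lookup that the kernel had not yet accounted for when it
 * sent the reclaim, so the node is about to be looked up again.
 */
static bool
reclaim_is_stale(const struct node_sketch *n, unsigned int kernel_count)
{
	assert(n != NULL);
	return n->lookups_answered != kernel_count;
}
```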
 1.166 18-Apr-2012  manu - Make sure update_va does not change vnode size when it should not. For
instance when doing a fault-issued VOP_GETPAGES within VOP_WRITE, changing
size leads to panic: genfs_getpages: past eof.
- Handle ticks wraparound for vnode name and attribute timeout
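A common way to make a tick-based timeout survive counter wraparound is to compare differences rather than absolute values. The sketch below shows that general technique; it is not taken from the puffs source, and `ttl_deadline` and `ttl_expired` are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: compute the tick value at which an entry cached
 * at `now` expires. The TTL must stay below half the counter range for
 * the comparison in ttl_expired() to be unambiguous.
 */
static unsigned int
ttl_deadline(unsigned int now, unsigned int ttl)
{
	assert(ttl <= (unsigned int)INT_MAX);
	return now + ttl;	/* unsigned overflow wraps, by definition */
}

/*
 * True once `now` has reached or passed `deadline`, even if the tick
 * counter wrapped in between. The unsigned difference is small when
 * the deadline lies in the past and huge when it lies in the future.
 */
static bool
ttl_expired(unsigned int now, unsigned int deadline)
{
	return (now - deadline) <= (unsigned int)INT_MAX;
}
```

A plain `now >= deadline` test would report a fresh entry as expired (or an expired one as fresh) whenever the deadline wraps past the maximum counter value; the difference-based test avoids that.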
 1.165 08-Apr-2012  manu Add name and attribute cache with filesystem-provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
 1.164 16-Mar-2012  jakllsch Prevent access beyond end of PUFFS file on read,
similar to as is done for NFS.
 1.163 17-Jan-2012  martin branches: 1.163.2;
Add a few KASSERT() - I have a crash that likely will cause one of them to
fire...
 1.162 18-Nov-2011  christos branches: 1.162.4;
Obey MNT_RELATIME, the only addition is that mkdir in ufs sets IN_ACCESS too.
 1.161 30-Oct-2011  hannken branches: 1.161.2;
Add a comment that pn_sizemtx should be useless as VOP_GETATTR now
needs a shared lock at least.
 1.160 19-Oct-2011  manu Remove #ifdef DIAGNOSTIC guards around KASSERT, as the macro contains them
 1.159 18-Oct-2011  manu Make sure pagedaemon does not sleep for memory in puffs_vnop_sleep.
Add KASSERT on any sleeping memory allocation to check it cannot happen again.
 1.158 17-Oct-2011  manu Roll back the change that forced kernel threads to not sleep in PUFFS.
The change did not reach consensus, since only pagedaemon should need it.
Other threads will tolerate sleeping, and problems here are only symptoms
that something is going wrong in memory management. The cause, not the
symptoms, needs to be fixed.
 1.157 23-Sep-2011  manu Fix the build that was broken by struct lwp *updateproc reference in
RUMP-visible code. Instead of checking that updateproc (aka ioflush,
aka syncer) will not sleep in PUFFS code, I check for any kernel thread:
after all none of them are designed to hang waiting for a remote filesystem
operation to complete.
 1.156 21-Sep-2011  manu Make sure ioflush does not sleep in PUFFS code path, waiting for a mutex,
a memory allocation, or a response from the filesystem.

This avoids deadlocks in the following situations:
1) when memory is low: ioflush waits for the filesystem, the filesystem waits
for memory
2) when the filesystem does not respond (e.g.: network outage on a
distributed filesystem)
 1.155 29-Aug-2011  manu Add a mutex for operations that touch size (setattr, getattr, write, fsync).

This is required to avoid data corruption bugs, where a getattr slices
itself within a setattr operation, and sets the size to the stale value
it got from the filesystem. That value is smaller than the one set by
setattr, and the call to uvm_vnp_setsize() triggered a spurious truncate.
The result is a chunk of zeroed data in the file.

Such a situation can easily happen when the ioflush thread issues a
VOP_FSYNC/puffs_vnop_sync/flushvncache/dosetattr while another process
does a sys_stat/VOP_GETATTR/puffs_vnop_getattr.

This mutex on size operations can be removed the day we decide VOP_GETATTR
has to operate on a locked vnode, since the other operations that touch
size already require that.
 1.154 04-Jul-2011  manu Add a flag to VOP_LISTEXTATTR(9) so that the vnode interface can tell the
filesystem in which format extended attributes shall be listed.

There are currently two formats:
- NUL-terminated strings, used for listxattr(2), this is the default.
- one byte length-prefixed, non NUL-terminated strings, used for
extattr_list_file(2), which is obtained by setting the
EXTATTR_LIST_PREFIXLEN flag to VOP_LISTEXTATTR(9)

This approach avoids the need for converting the list back and forth, except
in libperfuse, since FUSE uses NUL-terminated strings, and the kernel may
have requested EXTATTR_LIST_PREFIXLEN.
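The two list formats named in 1.154 differ only in how each name is delimited. The sketch below converts from the NUL-terminated form to the one-byte length-prefixed form, roughly the kind of conversion the message says libperfuse may still need. `nul_to_prefixlen` is a hypothetical helper written for illustration, not a libperfuse or kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical converter: turn a buffer of NUL-terminated names
 * ("a\0bc\0") into one-byte length-prefixed names ("\1a\2bc").
 * Returns the number of bytes written to dst, or (size_t)-1 if a name
 * is unterminated, longer than 255 bytes, or does not fit in dst.
 */
static size_t
nul_to_prefixlen(const char *src, size_t srclen,
    unsigned char *dst, size_t dstlen)
{
	size_t in = 0, out = 0;

	assert(src != NULL && dst != NULL);

	while (in < srclen) {
		const char *end = memchr(src + in, '\0', srclen - in);
		if (end == NULL)
			return (size_t)-1;	/* last name not terminated */

		size_t len = (size_t)(end - (src + in));
		if (len > 255 || out + 1 + len > dstlen)
			return (size_t)-1;	/* prefix is one byte only */

		dst[out++] = (unsigned char)len;	/* length prefix */
		memcpy(dst + out, src + in, len);	/* name, no NUL */
		out += len;
		in += len + 1;				/* skip the NUL */
	}
	return out;
}
```

Both encodings spend exactly one extra byte per name, so the converted list has the same total size as the input.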
 1.153 12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.152 19-May-2011  rmind branches: 1.152.2;
Remove cache_purge(9) calls from reclamation routines in the file systems,
as vclean(9) performs it for us since Lite2 merge.
 1.151 03-May-2011  manu Call advlock method if supplied
 1.150 11-Jan-2011  kefren branches: 1.150.2;
add advlock to puffs. ok pooka@
should fix kern/43321
 1.149 30-Nov-2010  dholland Abolish the SAVENAME and HASBUF flags. There is now always a buffer,
so the path in a struct componentname is now always valid during VOP
calls.
 1.148 30-Nov-2010  dholland Abolish struct componentname's cn_pnbuf. Use the path buffer in the
pathbuf object passed to namei as work space instead. (For now a pnbuf
pointer appears in struct nameidata, to support certain unclean things
that haven't been fixed yet, but it will be going away in the future.)

This removes the need for the SAVENAME and HASBUF namei flags.
 1.147 14-Jul-2010  pooka RENAME lookup semantics say return EISDIR if dvp = *vpp for the
last component .... obviously(!!)
 1.146 24-Jun-2010  hannken Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
 1.145 21-May-2010  pooka Support extended attributes.
 1.144 29-Mar-2010  pooka Stop exposing fifofs internals and leave only fifo_vnodeop_p visible.
 1.143 27-Mar-2010  pooka \n, police!
 1.142 14-Jan-2010  pooka branches: 1.142.2; 1.142.4;
Since VOP_GETATTR() does not require a locked vnode, resolve and
reference the puffs_node before sending the request to the file
server. This diminishes the window where the inode can be reclaimed
and be invalidated before it is accessed (but does not completely
eliminate the race, as that is a caller problem which we cannot
fix here).
 1.141 04-Dec-2009  pooka Push all information cached in the vnode to the file server before
issuing INACTIVE. PR kern/42194.
Also, send setattr in fsync asynchronously if FSYNC_WAIT is not set.
 1.140 19-Nov-2009  pooka Send VOP_ABORTOP() in case attempting cross-dev rename, part of
PR kern/42210. Also, fix a memory management error in said case.
 1.139 19-Nov-2009  pooka Send VOP_ABORTOP() as a FAF -- we don't care about the return value.
 1.138 05-Nov-2009  pooka Kill suspend support. It was never implemented correctly:
* it depended on the biglock (in a very cruel way)
* it was attached to userspace transactions rather than logical
fs operations

(If someone wants to revisit it some day, most of the stuff can be
reused from cvs history)
 1.137 05-Nov-2009  pooka Reinstate PNODE_DYING. vmlocking had a brief hiatus when it was not
a valid optimization, but that's long gone and once VOP_INACTIVE is
called and the file server says that the vnode is going to be recycled,
it really is going to be recycled, extra references gained or not.
 1.136 17-Oct-2009  pooka Transmit VOP_ABORTOP() to the server.
 1.135 30-Sep-2009  pooka remove leading whitespace. no functional change.
 1.134 30-Sep-2009  pooka * fix a race i introduced almost two years ago in rev 1.116:
operations creating a node cannot be considered outgoing operations,
since after return from userspace they modify file system state
by creating a new node. if we do not protect the file system by
holding the directory lock, a lookup operation might race us into
the kernel and create the node earlier.
* remove pnode from hashlist before sending the reclaim faf off to
userspace. also, hold pmp_lock while frobbing the list.
 1.133 19-Sep-2009  pooka Set SAVENAME for rmdir and remove.

Addresses an easy part of PR kern/38188
 1.132 12-Sep-2009  tsutsui Fix typo:
- pcinfo = kmem_zalloc(sizeof_puffs_cacheinfo) + runsize,
+ pcinfo = kmem_zalloc(sizeof(struct puffs_cacheinfo) + runsize,
in #ifdef'ed out code, per paired kmem_free() in the same function.
Closes PR kern/41840.
 1.131 26-Nov-2008  pooka Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.
 1.130 16-Nov-2008  pooka more <sys/buf.h> police
 1.129 10-Sep-2008  christos branches: 1.129.2; 1.129.4; 1.129.8;
replace 0xa0 with space from Andy Shevchenko
 1.128 30-Jan-2008  ad branches: 1.128.6; 1.128.10; 1.128.12; 1.128.16;
Replace struct lock on vnodes with a simpler lock object built on
krwlock_t. This is a step towards removing lockmgr and simplifying
vnode locking. Discussed on tech-kern.
 1.127 28-Jan-2008  pooka For code clarity typedef void *puffs_cookie_t.

No functional change.
 1.126 25-Jan-2008  ad Remove VOP_LEASE. Discussed on tech-kern.
 1.125 02-Jan-2008  pooka More type-punning workarounds. Curiously the kernel compilation
flags cause gcc to not complain.
 1.124 02-Jan-2008  ad Merge vmlocking2 to head.
 1.123 30-Dec-2007  pooka namespace a bit: vfsops -> puffs_vfsop_x() and vops -> puffs_vnop_x()
 1.122 08-Dec-2007  pooka branches: 1.122.4;
Now that "l" is gone both as an argument to operations and from
componentname, remove all vestiges of puffs_cid.
 1.121 27-Nov-2007  pooka branches: 1.121.2;
Remove "puffs_cid" from the puffs interface following l-removal
from the kernel vfs interfaces. puffs_cc_getcaller(pcc) can be
used now should the same information be desired.
 1.120 26-Nov-2007  pooka Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.119 21-Nov-2007  pooka use BUF_ISREAD/WRITE instead of homegrown variants
 1.118 20-Nov-2007  pooka Retire M_PUFFS, use kmem(9) instead.
 1.117 17-Nov-2007  pooka Make puffs_updatenode() take a puffs_node instead of a vnode. This
way we don't need to worry if a vnode has been reclaimed from under
us.
 1.116 17-Nov-2007  pooka Start playing around with vnode locks. For now, do the very easy
thing and release locks before the userspace wait for operations
which release the lock before exit from the method in any case.
However, releasing the lock after inserting the request on the
operation queue gives us proper ordering possibilities in userspace
(at least if that bit were implemented, but I don't think there is
any file system in userspace that depends on kernel locking and
probably there never should be one).

inspired by a conversation with Nacho Navarro
 1.115 17-Nov-2007  pooka Implement a biodone callback for async writes similar to reads and
use that when possible.
 1.114 16-Nov-2007  pooka Restructure the messaging interface a bit more: make all interfacing
with the file server happen through puffs_msg_enqueue() and
puffs_msg_wait() instead of having a billion different routines.
Build the existing system upon these two. Most importantly though,
decouple insertion into the op queue from the actual wait. This
is useful for a number of reasons coming soon to a cvs repo near you.
 1.113 26-Oct-2007  pooka branches: 1.113.2;
Read/write can reuse message memory if operating uncached. This
will change eventually, but for now just appease a KASSERT by
resetting the message header to 0 after each loop.
 1.112 23-Oct-2007  pooka The kernel (genfs, uvm) can't deal with strategy returning an error
when vclean()ing. Pending an adventure to the genfs/vm labyrinth
to fix this properly, compensate here by not allowing unstrategic
(no pun) return values. They are always due to the userspace server
crashing anyway, so it's no big deal if we lie about the final
resting place of the pages.
 1.111 21-Oct-2007  pooka * release pathname buffer in link
* some variable massage
 1.110 19-Oct-2007  pooka When doing a read operation, don't copy the whole kernel buffer to
userspace, since it doesn't contain any information yet. I should
still rework this more so this is just a quickie to get the read/write
style interface more up to speed with the ioctl version.
 1.109 19-Oct-2007  pooka comment polish
 1.108 18-Oct-2007  pooka Fix wrong argument order which just happened to work by luck.
 1.107 11-Oct-2007  pooka branches: 1.107.2;
Part 1/n of some pretty extensive changes to how the kernel module
interacts with the userspace file server:

* since the kernel-user communication is not purely request-response
anymore (hasn't been since 2006), try to rename some "request" to
"message". more similar mangling will take place in the future.

* completely rework how messages are allocated. previously most of
them were borrowed from the stack (originally *all* of them),
but now always allocate dynamically. this makes the structure
of the code much cleaner. also makes it possible to fix a
locking order violation. it enables plenty of future enhancements.

* start generalizing the transport interface to be independent of puffs

* move transport interface to read/write instead of ioctl. the
old one had legacy design problems, and besides, ioctl's suck.
implement a very generic version for now; this will be
worked on later hopefully some day reaching "highly optimized".

* implement libpuffs support behind existing library request
interfaces. this will change eventually (I hate those interfaces)
 1.106 11-Oct-2007  pooka Cache vnode member variables necessary for operations after the
userspace call, namely our private mount structure, in the activation
record. This avoids problems in situations where the userspace
file server happens to die during our upcall and the vnode is
forcibly reclaimed before we roll back to the current stack frame.
 1.105 10-Oct-2007  ad Merge from vmlocking:

- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
 1.104 04-Oct-2007  pooka g/c the "sizeop" code previously used for ioctl/fcntl. It was already
commented out and has bitrotted beyond all recognition, so it needs
complete rethinking.
 1.103 02-Oct-2007  pooka If kernel resource allocation fails after the file server has
committed something, issue an abort. The abort is done through
the regular op channel, e.g. failed mkdir leads to regular rmdir,
inactive and reclaim. No internal interface is planned currently
for the one file system out of a million which would implement it
to benefit from the one case in a billion where kernel resource
allocation actually does fail and out of that one case in a trillion
where internal vs. external would make a difference.
 1.102 01-Oct-2007  pooka * better error checking: validate error values received from userland
to be valid errno values
* include string describing error in PUFFS_ERR
* get rid of union in puffs_req, it's nothing but trouble
* pass pmp to async i/o callbacks
 1.101 27-Sep-2007  pooka Differentiate between cookie2vnode returning an error and
return to caller, address unknown: no such cookie, no such node.
Make the callers use this info to either create a new vnode or bail.
 1.100 27-Sep-2007  pooka Add error notifications, which are used to deliver errors from the
kernel to the file server for silly things the file server did,
e.g. attempting to create a file with size VSIZENOTSET. The file
server can handle these as it chooses, but the default action is
for it to throw its hands in the air and sing "goodbye, cruel world,
it's over, walk on by".
 1.99 27-Sep-2007  pooka Fix a race in how new cookies are checked. Previously the checking
was done separate of inserting the cookie into the lookup structure
and without any form of interlock. This could lead to the same
cookie pointing to two different nodes. Remedy the race by creating
a separate "checked and ready to be inserted" cookie list which
serves as an interlock without having to hold a fs-global creation
lock.
 1.98 22-Aug-2007  pooka branches: 1.98.2; 1.98.4;
Mimic namei structure changes for puffs. bump both kernel & lib version.
 1.97 13-Aug-2007  pooka * don't call VOP_ACCESS in lookup, that's the file system's problem
* be more careful with r/o fs to catch EEXIST in lookup CREATE
* some comment polish
 1.96 12-Aug-2007  pooka enforce MNT_RDONLY
 1.95 30-Jul-2007  pooka branches: 1.95.4; 1.95.6;
properly setup ubcflags
 1.94 29-Jul-2007  ad It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
 1.93 27-Jul-2007  yamt ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.
 1.92 27-Jul-2007  pooka Change unused fflags parameter in VOP_MMAP to prot and pass in
desired vm protection.
 1.91 22-Jul-2007  pooka use NULL, not 0, to pass a pointer
 1.90 22-Jul-2007  pooka Keep track of the maximum size we have supplied the file server (or
it has supplied us). If we fault pages which are at offset >= server
size, but less than the in-kernel vnode size, inform the file server
of the latest developments in file size before issuing the fault.
This avoids confusion with files which are not written start to finish.

fixes kern/36429 by yamt
 1.89 19-Jul-2007  pooka don't request more than the maximum request size in readdir
 1.88 09-Jul-2007  ad branches: 1.88.2;
s/pagedaemon_lwp/pagedaemon_proc/
 1.87 09-Jul-2007  ad Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.86 02-Jul-2007  pooka support turning REQUIREDIR off and extra consume in lookup
 1.85 02-Jul-2007  pooka Get rid of the "int *refs" parameter to inactive: the same can be
accomplished now with puffs_setbacks.
 1.84 01-Jul-2007  pooka loosen KASSERT: we can also fail due to ENOMEM
 1.83 01-Jul-2007  pooka Give the file server the ability to request the entire pathname buffer
under lookup by using PUFFS_KFLAG_LOOKUP_FULLPNBUF instead of just the
current component.
 1.82 01-Jul-2007  pooka Instead of supplying a plain pid, supply an abstract struct puffs_cid *,
which can currently be used to query the pid and lwpid.
 1.81 01-Jul-2007  pooka make puffs_cred an opaque type
 1.80 30-Jun-2007  pooka Fix logic flaw in KASSERT. Seems like my lkm wasn't compiled with
DIAGNOSTIC ...
 1.79 26-Jun-2007  pooka Simplify code, mainly vop_strategy. No functional change
 1.78 24-Jun-2007  pooka Split the NOCACHE option in twain: NOCACHE_NAME & NOCACHE_PAGE.
 1.77 21-Jun-2007  pooka Refactor the pnode2vnode translation slightly so that VFS_ROOT
can use it directly.
 1.76 06-Jun-2007  pooka Move puffs to a two clause license where it already isn't so. And
as agc pointed out, even files with the third clause were already
effectively two clause because of a slight bug in the language...
 1.75 06-Jun-2007  pooka In very verbose debug mode, print also return values for operations
(well, at least for those that go through checkop()).
 1.74 05-Jun-2007  yamt improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.
 1.73 01-Jun-2007  yamt \xa0 -> space.
 1.72 19-May-2007  pooka Actually, we do need separate "no references in file server" and
"noref + inactive" flags if we wish to correctly support unix open
file semantics and optimize away pre-reclaim cache flushes. So,
add PNODE_DYING which stands for norefs + inactive.
 1.71 18-May-2007  pooka Introduce noref setbacks, which the file server can use to signal
the kernel it has 0 references to the node in question. In other
words, this can be used to avoid inactive(), or, if the file server
does not implement inactive, prompt reclaim for removed nodes.
 1.70 18-May-2007  pooka selrecord() before calling userspace to avoid (very theoretical) race
where selinfo contains uninitialized garbage
 1.69 18-May-2007  pooka Support VOP_POLL. This requires some acrobatics on the puffs_node,
as we give a reference to userspace for the puffs_node for the
duration of the poll call. So reference count puffs_node separately
from the parent vnode. vref()/vrele() is not possible due to a possible
surprise visit from VOP_INACTIVE.
 1.68 15-May-2007  pooka In case strategy memory allocation for B_ASYNC|B_READ fails,
make sure to release the buf.
 1.67 08-May-2007  pooka Adventures in file systems, part (u_quad_t)-1: we can't use the
file system value for the size of device special files, as that
comes from specfs instead of the "host" file system. Therefore,
take care that getattr doesn't override the value of vp->v_size.
 1.66 07-May-2007  pooka Introduce puffs "setbacks", which can be used to set certain flags
for nodes upon return from the userspace. Currently it can be used
to indicate that the file server should be notified of "inactive"
in case the file server has opted to not receive inactive every
time the reference count for a vnode drops to zero. (inactive is
a common event, almost never requires any action and must be executed
synchronously, so it is wasteful).

While doing this, cleanup the release-relock nonsense from the
vntouser*() arguments. It was never enabled and the whole LOCKEDVP()
concept was very broken to begin with.
 1.65 06-May-2007  pooka If setattr is called explicitly, use that as the sign to flush out
all metadata info cached in the kernel while we're setattr'ing in
any case. Solves problems such as truncate (via extend-by-write)
+ chmod resulting in EPERM because the file was already read-only
when the actual truncate was flushed out of the kernel in fsync.
 1.64 24-Apr-2007  pooka If ubc style write fails, do not extend the file by zero-padding
it. It might be that the file server is either crashing or just
returning consistent errors. uiomove() would handle the error,
but if the pages weren't faulted in, memset() to the unfaultable
ubc window would cause a kernel page fault.
 1.63 22-Apr-2007  pooka Issue close to the file server asynchronously. We're not interested
in the return value.
 1.62 22-Apr-2007  pooka define PUFFS_KFLAG_WTCACHE, which makes the page cache write-through
 1.61 20-Apr-2007  pooka * in readdir, don't copy extra memory back and forth to userspace
* consistent usage of the variable argsize with the rest of the module
 1.60 20-Apr-2007  pooka Size of a readdir cookie is sizeof(**ap->a_cookies), not
sizeof(*ap->a_cookies). Fixes nfs readdir in the case that a
directory had lots of entries with short names.
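
The distinction fixed in rev 1.60 can be illustrated with a minimal sketch. It assumes that a_cookies is an off_t ** (a pointer to a cookie array); the struct and function names below are hypothetical stand-ins, not the puffs code itself.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the readdir argument structure. */
struct readdir_args_sketch {
	off_t **a_cookies;	/* out: pointer to an array of cookies */
};

/* Bytes needed for n cookies: each element is an off_t. */
static size_t
cookie_bytes(const struct readdir_args_sketch *ap, size_t n)
{
	return n * sizeof(**ap->a_cookies);	/* sizeof(off_t) */
}

/* The mistaken form: sizes each element as a pointer instead. */
static size_t
cookie_bytes_wrong(const struct readdir_args_sketch *ap, size_t n)
{
	return n * sizeof(*ap->a_cookies);	/* sizeof(off_t *) */
}
```

On platforms where off_t and a pointer differ in width, the two functions return different byte counts for the same number of cookies; sizeof does not evaluate its operand, so neither dereferences a_cookies.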
 1.59 16-Apr-2007  pooka Give the file server the ability to specify the file handle length
instead of defining a static length file handle on the framework-level.
 1.58 11-Apr-2007  pooka * support VFS_FHTOVP and VFS_VPTOFH
* support cookies for VOP_READDIR

nfs exporting puffs file systems works now
 1.57 04-Apr-2007  pooka Make it possible to interrupt waiters for fs operation completion
again. This is useful until locking is further developed and basically
any deadlocks can be solved by killing appropriate processes.

Thanks especially to Tommi Kyntola and Antti Louko for sitting down
with me and discussing resource ownership and locking strategies
in implementing this.
 1.56 30-Mar-2007  pooka * abstract ASYNCBIOREAD and let callers freely issue a callback called
from putop. even though there's only one user currently, makes code
more readable
* move "delta" to a standard parameter in vntouser and get rid of the
specialcase vntouser_delta
 1.55 29-Mar-2007  pooka Convert spinlocks & sleep/wakeup to newlock2 locking stuff. Fix a
bunch of bugs.

* park structures are now always allocated from a pool instead of a
mixed stack/malloc allocation
* get rid of the whole adjbuf concept, always just alloc the maximal
amount of memory to satisfy a request
* little regression: don't allow interrupting wait from file system
to userspace; this had problems already before, but now the problems
really started to shine through. I'll try to make this work again
some day.
* fix bmap to return a sensible value in runp
 1.54 20-Mar-2007  pooka * rework the page cache interaction a bit: cache metadata in the
kernel and flush it out all at once instead of continuous updating
* add support for delivering notifications to the file server about
when a page was written to (but disabled by default for now). the
file server can use this to request flushing or invalidating the
kernel page cache
 1.53 14-Mar-2007  pooka branches: 1.53.2;
Support B_READ|B_ASYNC in strategy by calling biodone() directly
when the file server puts the result.
 1.52 20-Feb-2007  pooka branches: 1.52.4; 1.52.6;
Properly fix rev 1.44: limit error values from the file server to
positive values of errno and 0. Otherwise it can return internal values
such as EJUSTRETURN and screw things up.

thanks to Bill for reminding me to revisit this
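
The check described in rev 1.52 might look roughly like the following sketch. The upper bound, the function name, and the choice of EINVAL as the replacement value are all assumptions made for illustration; the log does not say what the kernel substitutes for a rejected value.

```c
#include <errno.h>

/* Hypothetical upper bound on valid errno values; not the kernel's constant. */
#define SKETCH_ERRNO_MAX 255

/*
 * Pass through 0 and in-range positive errno values. Anything else,
 * such as a negative kernel-internal code, is mapped to EINVAL
 * (an assumed fallback).
 */
static int
sanitize_server_error(int error)
{
	if (error >= 0 && error <= SKETCH_ERRNO_MAX)
		return error;
	return EINVAL;
}
```

With this shape, a file server returning a negative value can no longer be mistaken for an internal control code by the caller.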
 1.51 15-Feb-2007  pooka branches: 1.51.2;
Sanity-check linklen returned from file server in READLINK.
 1.50 10-Feb-2007  pooka * in write, do sync pageflush for the ubc case every 64k, otherwise
the user file server can't really keep up and just writing and writing
may result in kernel memory exhaustion. this lossage is also partially
due to the stupid way mtime + size info is handled currently, but that
should change soon (*knock knock* ;)
* score a few debug printfs
 1.49 09-Feb-2007  pooka honor B_ASYNC
 1.48 09-Feb-2007  pooka assign value for strategy output parameter b_resid instead of decreasing it
 1.47 08-Feb-2007  pooka If the file server doesn't support write, don't use genfs_null_putpages
for putpages, as it assumes a vnode doesn't have any pages. For
mounts using the page cache this is simply not true. Rather,
prevent opening a regular file in write-mode. That way a vnode
can never have dirty pages which would need to be flushed (i.e.
written).
 1.46 08-Feb-2007  pooka chuq shone arcane wisdom on me: b_bcount comes in, b_resid goes out
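
A minimal sketch of the b_bcount/b_resid convention from revs 1.46 and 1.48, using a hypothetical trimmed-down stand-in for struct buf (the real structure has many more fields, and strategy_finish is an invented name):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct buf. */
struct buf_sketch {
	size_t b_bcount;	/* in: bytes requested */
	size_t b_resid;		/* out: bytes not transferred */
};

/* Record the outcome of a transfer: assign b_resid, do not decrement it. */
static void
strategy_finish(struct buf_sketch *bp, size_t transferred)
{
	assert(transferred <= bp->b_bcount);
	bp->b_resid = bp->b_bcount - transferred;
}
```

Because b_resid is an output, whatever value it held on entry is ignored and overwritten.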
 1.45 08-Feb-2007  pooka Don't block and wait for file server response in case strategy is
run in pagedaemon context: it gives the file server way more control
over the fate of the entire kernel than what we're comfortable with.
 1.44 06-Feb-2007  pooka Limit errors from puffs_lookup to 0, EJUSTRETURN and ENOENT, as
that's what namei/lookup expects.
 1.43 29-Jan-2007  hannken Change fstrans enum types to upper case.
No functional change.

From Antti Kantee <pooka@netbsd.org>
 1.42 26-Jan-2007  pooka We don't handle fsync in checkop anymore, so direct the fifoop fsync
also to a place less panicy, namely fifo_fsync (because currently the
metadata information is updated when the node is changed. This will
probably change soon, though).
 1.41 26-Jan-2007  pooka Initial attempt at suspend/snapshot support for userspace file
servers. This is still pretty much on the level "if it breaks ...".
It should work for single-threaded servers which handle one operation
from start to finish in one go. Also, it does not yet totally
correctly synchronize metadata and data in some cases. So needless
to say, it needs improvement, but it is possible that will have to
wait for some lock revampage.
 1.40 25-Jan-2007  pooka if strategy fails, set bp->b_error and B_ERROR
 1.39 25-Jan-2007  pooka don't hold spinlocks (except vnode interlock) when doing vget()
 1.38 21-Jan-2007  pooka optimize a bit: don't flush pages for vnodes which have no references
in the kernel or links in the backend
 1.37 21-Jan-2007  pooka remove diagnostic printf
 1.36 19-Jan-2007  pooka hannken noted that the latest gcc (?) complains about uninitialized
variable use in puffs_strategy() for "dowritefaf" (incorrectly)
and "error" (correctly, although the function is practically of
type void)
 1.35 19-Jan-2007  pooka In case the fs server is in the kernel doing an operation on a
completely different file system, we still might re-enter the same
puffs fs in case we execute something on the other file system,
which wants to get a new vnode and ends up recycling a puffs vnode
for the purpose. In this case the fs server will sleep in the
kernel until it itself handles the operation .... which of course
is a slightly unlikely event.

After analyzing the path from getcleanvnode() to the vnode cemetery,
identify that fsync and putpages (strategy) are the ones in danger
of striking a deadlock deal. Abuse the vnode flag VXLOCK to tell
them "this vnode is irreversibly going to meet its maker, don't
care about user server return values" (failure is not acceptable
down the vgonel() path) and issue the respective operations as
Fire-And-Forget (FAF) operations. no wait -> no deadlock.

This of course is a "fix" skating on thin ice. A better, more
generic solution is already in sight, but will take more effort to
implement.
 1.34 16-Jan-2007  pooka * don't wait for the answer of VOP_RECLAIM, just fire-and-forget
* revoke puffs_revoke. we can deal with it just by calling genfs_revoke
 1.33 15-Jan-2007  pooka Store puffs_node's on lists hashed with the cookie value instead
of just one flat list.
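
The idea in rev 1.33 can be sketched as a small chained hash table keyed by the cookie. This is an illustration under assumptions: the cookie is treated as an opaque pointer, and the bucket count, hash function, and all names are hypothetical rather than taken from the puffs sources.

```c
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 64	/* hypothetical bucket count (power of two) */

struct pnode_sketch {
	void *cookie;			/* opaque node identifier */
	struct pnode_sketch *next;	/* chain within one bucket */
};

static struct pnode_sketch *buckets[NBUCKETS];

/* Pick a bucket; drop low bits that tend to be zero due to alignment. */
static size_t
bucket_of(const void *cookie)
{
	return (size_t)(((uintptr_t)cookie >> 4) & (NBUCKETS - 1));
}

static void
pnode_insert(struct pnode_sketch *pn)
{
	size_t b = bucket_of(pn->cookie);

	pn->next = buckets[b];
	buckets[b] = pn;
}

/* Only one bucket's chain is walked, not every node in the file system. */
static struct pnode_sketch *
pnode_lookup(const void *cookie)
{
	struct pnode_sketch *pn;

	for (pn = buckets[bucket_of(cookie)]; pn != NULL; pn = pn->next)
		if (pn->cookie == cookie)
			return pn;
	return NULL;
}
```

Compared with a single flat list, a lookup inspects only the nodes that hash to the same bucket.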
 1.32 15-Jan-2007  pooka * do not accept the directory cookie as the result of a lookup (otherwise
we'd be locking against ourselves)
* do not accept duplicate cookies when creating new nodes
 1.31 11-Jan-2007  pooka Since fsync is really putpages + fsync, check for both separately
instead of using just putpages to decide the op's fate.

And the real beef in this commit is of course a tyop fix in a comment.
 1.30 09-Jan-2007  pooka Introduce flush operations, which the fs server can use to control
kernel caching. Currently supported are only flushing the name
cache for a directory or flushing the name cache for the entire fs.

Also, get rid of PNODE_INACTIVE status, since it was racy and
essentially didn't work. All this on top of being useless in the
first place ....
 1.29 07-Jan-2007  pooka getcwd wants eofflag - set eofflag in readdir if amount of data is 0
 1.28 02-Jan-2007  pooka In rename, tdvp == tvp holds if we are renaming a directory to "."
(XXX: for all the sense that makes). Deal with it gracefully here
for now.
 1.27 01-Jan-2007  pooka remove r/o mount check done also in vfs lookup()
 1.26 01-Jan-2007  pooka async update node metadata for spec- and fifoops
 1.25 01-Jan-2007  pooka properly handle VOP_REMOVE case where vp == dvp
 1.24 01-Jan-2007  pooka explicitly disable ioctl and fcntl for now - support has bitrotted
 1.23 30-Dec-2006  pooka branches: 1.23.2;
* use PUFFS_KFLAG_NOCACHE to also signal that we don't want the namecache
* enter files into the namecache immediately when new nodes are created
(if it's a caching mount, of course)
 1.22 09-Dec-2006  chs branches: 1.22.2;
a smorgasbord of improvements to vnode locking and path lookup:
- LOCKPARENT is no longer relevant for lookup(), relookup() or VOP_LOOKUP().
these now always return the parent vnode locked. namei() works as before.
lookup() and various other paths no longer acquire vnode locks in the
wrong order via vrele(). fixes PR 32535.
as a nice side effect, path lookup is also up to 25% faster.
- the above allows us to get rid of PDIRUNLOCK.
- also get rid of WANTPARENT (just use LOCKPARENT and unlock it).
- remove an assumption in layer_node_find() that all file systems implement
a recursive VOP_LOCK() (unionfs doesn't).
- require that all file systems supply vfs_vptofh and vfs_fhtovp routines.
fill in eopnotsupp() for file systems that don't support being exported
and remove the checks for NULL. (layerfs calls these without checking.)
- in union_lookup1(), don't change refcounts in the ISDOTDOT case, just
adjust which vnode is locked. fixes PR 33374.
- apply fixes for ufs_rename() from ufs_vnops.c rev. 1.61 to ext2fs_rename().
 1.21 07-Dec-2006  pooka let implementation ultimately decide if mmap is supported - pass
VOP_MMAP to fs server
 1.20 05-Dec-2006  pooka adjust file size in write only if file grows. but since this change is
in the "never use ubc" branch, I don't think it matters except for cosmetics.
 1.19 05-Dec-2006  pooka Allow multiple requests to be transferred in each GET/PUTOP. For
a single request, the performance is still the same.
 1.18 01-Dec-2006  pooka branches: 1.18.2;
prefix kernel flags with PUFFS_KFLAG to have a separate namespace
from the library flags
 1.17 01-Dec-2006  pooka don't call the fs server for all operations, only those it has told
us that it implements
 1.16 28-Nov-2006  pooka don't allow mmap if operating uncached
 1.15 18-Nov-2006  pooka Actually, for NOCACHE, use direct read/write instead of going through
page cache at all and invalidating. XXX: mmap
 1.14 18-Nov-2006  pooka branches: 1.14.2;
make puffs_strategy more robust
 1.13 18-Nov-2006  pooka Require statvfs info from startreq so that we have that info available.
Also, don't pass fsid to userspace and just fill it in the kernel.
 1.12 18-Nov-2006  pooka As a first generation best-effort hack, use NOCACHE to mean "file
size can change without the kernel knowing" and therefore query
the file size before invoking read or write operations.
 1.11 17-Nov-2006  pooka Introduce uncached operation, makes sense when the file system backend
can be modified from elsewhere than the file system interface
 1.10 13-Nov-2006  pooka No need to return a special value for CREATE/RENAME lookup, so just
handle ENOENT. If there's a real error, userspace will return
something else.
 1.9 08-Nov-2006  pooka update struct buf resid in strategy according to what was transferred.
seems like only nestiobuf complains when it wasn't updated ...
 1.8 07-Nov-2006  pooka attach to genfs & support page cache. most noticeable effect is
mmap and therefore execution of binaries starting to work, some
speed improvements with large file I/O also. caching semantics
and error case handling most likely need revisiting.
 1.7 27-Oct-2006  pooka Use spec_fsync for specops vop_fsync: it knows about vflushbuf(), which
is more than what puffs currently knows. makes e.g. ffs unmount for a
puffs-based device node work.
 1.6 27-Oct-2006  pooka support fifos
 1.5 26-Oct-2006  pooka support specfs
 1.4 26-Oct-2006  pooka Fix operations creating new nodes to honor the vnode locking protocol
if the userspace server returns an error. Fixes lockups if any
of the following operations failed: create, mknod, mkdir, symlink
 1.3 25-Oct-2006  pooka pass VOP_INACTIVE() to userspace
 1.2 23-Oct-2006  pooka fix print in VOP_PRINT

also make it compile on amd64. problem noticed by Blair Sadewitz
on current-users
 1.1 22-Oct-2006  pooka kernel portion of puffs - the Pass-to-Userspace Framework File System.
It contains the VFS attachment and userspace message-passing interface.

This work was initially started and completed for Google SoC 2005
and tweaked to work a bit better in the past few weeks. While
being far from complete, it is functional and stable enough to
host a fairly general-purpose in-memory file system in
userspace. Even so, puffs should be considered experimental and
no binary compatibility for interfaces or crash-freedom or zero
security implications should be relied upon just yet.

The GSoC project was mentored by William Studenmund and the final
review for the code was done by Christos.
 1.14.2.5 09-Feb-2007  ad Sync with HEAD.
 1.14.2.4 01-Feb-2007  ad Sync with head.
 1.14.2.3 12-Jan-2007  ad Sync with head.
 1.14.2.2 18-Nov-2006  ad Sync with head.
 1.14.2.1 18-Nov-2006  ad file puffs_vnops.c was added on branch newlock2 on 2006-11-18 21:39:20 +0000
 1.18.2.1 17-Feb-2007  tron Apply patch (requested by chs in ticket #422):
- Fix various deadlock problems with nullfs and unionfs.
- Speed up path lookups by up to 25%.
 1.22.2.2 10-Dec-2006  yamt sync with head.
 1.22.2.1 09-Dec-2006  yamt file puffs_vnops.c was added on branch yamt-splraiseipl on 2006-12-10 07:18:38 +0000
 1.23.2.8 04-Feb-2008  yamt sync with head.
 1.23.2.7 21-Jan-2008  yamt sync with head
 1.23.2.6 07-Dec-2007  yamt sync with head
 1.23.2.5 27-Oct-2007  yamt sync with head.
 1.23.2.4 03-Sep-2007  yamt sync with head.
 1.23.2.3 26-Feb-2007  yamt sync with head.
 1.23.2.2 30-Dec-2006  yamt sync with head.
 1.23.2.1 30-Dec-2006  yamt file puffs_vnops.c was added on branch yamt-lazymbuf on 2006-12-30 20:50:01 +0000
 1.51.2.5 17-May-2007  yamt sync with head.
 1.51.2.4 07-May-2007  yamt sync with head.
 1.51.2.3 15-Apr-2007  yamt sync with head.
 1.51.2.2 24-Mar-2007  yamt sync with head.
 1.51.2.1 27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.52.6.1 11-Jul-2007  mjf Sync with head.
 1.52.4.14 23-Oct-2007  ad Sync with head.
 1.52.4.13 12-Oct-2007  ad Sync with head.
 1.52.4.12 09-Oct-2007  ad Sync with head.
 1.52.4.11 09-Oct-2007  ad Sync with head.
 1.52.4.10 16-Sep-2007  ad Checkpoint work in progress on the vnode lifecycle and reference counting
stuff. This makes it work properly without kernel_lock and fixes a few
quite old bugs. See vfs_subr.c 1.283.2.17 for details.
 1.52.4.9 20-Aug-2007  ad Sync with HEAD.
 1.52.4.8 19-Aug-2007  ad - Back out the biodone() changes.
- Eliminate B_ERROR (from HEAD).
 1.52.4.7 15-Jul-2007  ad Sync with head.
 1.52.4.6 17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.52.4.5 09-Jun-2007  ad Sync with head.
 1.52.4.4 08-Jun-2007  ad Sync with head.
 1.52.4.3 10-Apr-2007  ad Sync with head.
 1.52.4.2 09-Apr-2007  ad - Add two new arguments to kthread_create1: pri_t pri, bool mpsafe.
- Fork kthreads off proc0 as new LWPs, not new processes.
 1.52.4.1 05-Apr-2007  ad Compile fixes.
 1.53.2.1 29-Mar-2007  reinoud Pullup to -current
 1.88.2.2 03-Sep-2007  skrll Sync with HEAD.
 1.88.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.95.6.2 30-Jul-2007  pooka properly setup ubcflags
 1.95.6.1 30-Jul-2007  pooka file puffs_vnops.c was added on branch matt-mips64 on 2007-07-30 14:49:02 +0000
 1.95.4.9 09-Dec-2007  jmcneill Sync with HEAD.
 1.95.4.8 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.95.4.7 21-Nov-2007  joerg Sync with HEAD.
 1.95.4.6 28-Oct-2007  joerg Sync with HEAD.
 1.95.4.5 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.95.4.4 07-Oct-2007  joerg Sync with HEAD.
 1.95.4.3 02-Oct-2007  joerg Sync with HEAD.
 1.95.4.2 03-Sep-2007  jmcneill Sync with HEAD.
 1.95.4.1 16-Aug-2007  jmcneill Sync with HEAD.
 1.98.4.2 14-Oct-2007  yamt sync with head.
 1.98.4.1 06-Oct-2007  yamt sync with head.
 1.98.2.3 23-Mar-2008  matt sync with HEAD
 1.98.2.2 09-Jan-2008  matt sync with HEAD
 1.98.2.1 06-Nov-2007  matt sync with HEAD
 1.107.2.4 21-Nov-2007  bouyer Sync with HEAD
 1.107.2.3 18-Nov-2007  bouyer Sync with HEAD
 1.107.2.2 13-Nov-2007  bouyer Sync with HEAD
 1.107.2.1 25-Oct-2007  bouyer Sync with HEAD.
 1.113.2.4 18-Feb-2008  mjf Sync with HEAD.
 1.113.2.3 27-Dec-2007  mjf Sync with HEAD.
 1.113.2.2 08-Dec-2007  mjf Sync with HEAD.
 1.113.2.1 19-Nov-2007  mjf Sync with HEAD.
 1.121.2.2 26-Dec-2007  ad Sync with head.
 1.121.2.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.122.4.2 08-Jan-2008  bouyer Sync with HEAD
 1.122.4.1 02-Jan-2008  bouyer Sync with HEAD
 1.128.16.2 13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.128.16.1 19-Oct-2008  haad Sync with HEAD.
 1.128.12.1 24-Sep-2008  wrstuden Merge in changes between wrstuden-revivesa-base-2 and
wrstuden-revivesa-base-3.
 1.128.10.4 11-Aug-2010  yamt sync with head.
 1.128.10.3 11-Mar-2010  yamt sync with head
 1.128.10.2 16-Sep-2009  yamt sync with head
 1.128.10.1 04-May-2009  yamt sync with head.
 1.128.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.128.6.1 28-Sep-2008  mjf Sync with HEAD.
 1.129.8.1 21-Apr-2010  matt sync to netbsd-5
 1.129.4.11 02-Nov-2011  riz Pull up following revision(s) (requested by manu in ticket #1679):
sys/fs/puffs/puffs_vnops.c: revision 1.157
sys/fs/puffs/puffs_vnops.c: revision 1.158
sys/fs/puffs/puffs_vnops.c: revision 1.159
sys/fs/puffs/puffs_vfsops.c: revision 1.97
sys/fs/puffs/puffs_vfsops.c: revision 1.99
sys/fs/puffs/puffs_vnops.c: revision 1.160
sys/fs/puffs/puffs_vfsops.c: revision 1.100
sys/miscfs/syncfs/sync_subr.c: revision 1.47
sys/fs/puffs/puffs_node.c: revision 1.21
sys/fs/puffs/puffs_node.c: revision 1.22
sys/fs/puffs/puffs_msgif.c: revision 1.88
sys/fs/puffs/puffs_msgif.c: revision 1.89
sys/fs/puffs/puffs_vnops.c: revision 1.156
Make sure ioflush does not sleep in PUFFS code path, waiting for a mutex,
a memory allocation, or a response from the filesystem.
This avoids deadlocks in the following situations:
1) when memory is low: ioflush waits for the filesystem, the filesystem waits
for memory
2) when the filesystem does not respond (e.g.: network outage on a
distributed filesystem)
Fix the build that was broken by struct lwp *updateproc reference in
RUMP-visible code. Instead of checking that updateproc (aka ioflush,
aka syncer) will not sleep in PUFFS code, I check for any kernel thread:
after all none of them are designed to hang waiting for a remote filesystem
operation to complete.
Roll back the change that forced kernel threads to not sleep in PUFFS.
The change did not reach consensus, since only pagedaemon should need it.
Other threads will tolerate sleeping, and problems here are only symptoms
that something is going wrong in memory management. The cause, not the
symptoms, needs to be fixed.
Make sure pagedaemon does not sleep for memory in puffs_vnop_sleep.
Add KASSERT on any sleeping memory allocation to check it cannot happen again.
Remove #ifdef DIAGNOSTIC guards around KASSERT, as the macro contains them
 1.129.4.10 17-Sep-2011  bouyer Pull up following revision(s) (requested by manu in ticket #1666):
sys/fs/puffs/puffs_sys.h: revision 1.78 via patch
sys/fs/puffs/puffs_node.c: revision 1.20 via patch
sys/fs/puffs/puffs_vnops.c: revision 1.155 via patch
Add a mutex for operations that touch size (setattr, getattr, write, fsync).
This is required to avoid data corruption bugs, where a getattr slices
itself within a setattr operation, and sets the size to the stale value
it got from the filesystem. That value is smaller than the one set by
setattr, and the call to uvm_vnp_setsize() triggered a spurious truncate.
The result is a chunk of zeroed data in the file.
Such a situation can easily happen when the ioflush thread issues a
VOP_FSYNC/puffs_vnop_sync/flushvncache/dosetattrn while another process
does a sys_stat/VOP_GETATTR/puffs_vnop_getattr.
This mutex on size operation can be removed the day we decide VOP_GETATTR
has to operate on a locked vnode, since the other operations that touch
size already require that.
 1.129.4.9 17-Jul-2011  riz Pull up following revision(s) (requested by manu in ticket #1645):
lib/libc/sys/Makefile.inc 1.207 via patch
lib/libc/sys/extattr_get_file.2 patch
lib/libpuffs/dispatcher.c 1.34,1.36 via patch
lib/libpuffs/puffs.c 1.107 via patch
lib/libpuffs/puffs.h 1.115,1.118 via patch
sys/fs/puffs/puffs_msgif.h 1.71,1.76 via patch
sys/fs/puffs/puffs_vfsops.c 1.88 via patch
sys/fs/puffs/puffs_vnops.c 1.145,1.154 via patch
sys/kern/vfs_xattr.c 1.24-1.27 via patch
sys/kern/vnode_if.c 1.87 via patch
sys/sys/Makefile 1.133 via patch
sys/sys/extattr.h 1.6 via patch
sys/sys/vnode_if.h 1.81 via patch
sys/ufs/ffs/ffs_vnops.c patch
sys/ufs/ufs/ufs_extattr.c 1.31,1.34 via patch

* support extended attributes
* bump major due to structure growth
* add some spare space
* remove ABI silliness
Support extended attributes.
Fix multiple non compliances in our Linux-like extattr API, and make it
public so that it can be used.
Improve listxattr(2) a bit. It attempted to list both system and user
extended attributes, and it failed if the calling user did not have privilege
for reading system EA. Now we just list user EA and skip system EA if
reading them is not allowed.
Fix bug introduced in previous commit: Do not vrele() a vnode we did not
obtain.
Improve UFS1 extended attributes usability
- autocreate attribute backing file for new attributes
- autoload attributes when issuing extattrctl start
- when autoloading attributes, do not display garbage warning when looking
up entries that got ENOENT
Add a flag to VOP_LISTEXTATTR(9) so that the vnode interface can tell the
filesystem in which format extended attribute shall be listed.
There are currently two formats:
- NUL-terminated strings, used for listxattr(2), this is the default.
- one byte length-prefixed, non NUL-terminated strings, used for
extattr_list_file(2), which is obtained by setting the
EXTATTR_LIST_PREFIXLEN flag to VOP_LISTEXTATTR(9)
This approach avoids the need for converting the list back and forth, except
in libperfuse, since FUSE uses NUL-terminated strings, and the kernel may
have requested EXTATTR_LIST_PREFIXLEN.
 1.129.4.8 18-Jun-2011  bouyer Pull up following revision(s) (requested by manu in ticket #1623):
lib/libpuffs/puffs.c: revision 1.116 via patch
sys/fs/puffs/puffs_vnops.c: revision 1.151 via patch
Call advlock method if supplied
 1.129.4.7 16-Jan-2010  bouyer Pull up following revision(s) (requested by pooka in ticket #1244):
sys/fs/puffs/puffs_vnops.c: revision 1.142
Since VOP_GETATTR() does not require a locked vnode, resolve and
reference the puffs_node before sending the request to the file
server. This diminishes the window where the inode can be reclaimed
and be invalidated before it is accessed (but does not completely
eliminate the race, as that is a caller problem which we cannot
fix here).
 1.129.4.6 18-Dec-2009  snj Pull up following revision(s) (requested by pooka in ticket #1184):
sys/fs/puffs/puffs_vnops.c: revision 1.141 via patch
Push all information cached in the vnode to the file server before
issuing INACTIVE. PR kern/42194.
Also, send setattr in fsync asynchronously if FSYNC_WAIT is not set.
 1.129.4.5 28-Nov-2009  bouyer Pull up following revision(s) (requested by pooka in ticket #1154):
sys/fs/puffs/puffs_vnops.c: revision 1.140
Send VOP_ABORTOP() in case attempting cross-dev rename, part of
PR kern/42210. Also, fix a memory management error in said case.
 1.129.4.4 28-Nov-2009  bouyer Pull up following revision(s) (requested by pooka in ticket #1153):
sys/fs/puffs/puffs_vnops.c: revision 1.139
Send VOP_ABORTOP() as a FAF -- we don't care about the return value.
 1.129.4.3 18-Oct-2009  sborrill Pull up the following revisions(s) (requested by pooka in ticket #1100):
lib/libpuffs/dispatcher.c: revision 1.33
lib/libpuffs/puffs.c: revision 1.99
lib/libpuffs/puffs.h: revision 1.111
sys/fs/puffs/puffs_msgif.h: revision 1.67 via patch
sys/fs/puffs/puffs_vnops.c: revision 1.136

Support VOP_ABORTOP() in puffs.
 1.129.4.2 03-Oct-2009  snj Pull up following revision(s) (requested by pooka in ticket #1042):
sys/fs/puffs/puffs_node.c: revision 1.14
sys/fs/puffs/puffs_vnops.c: revision 1.134
* fix a race i introduced almost two years ago in rev 1.116:
operations creating a node cannot be considered outgoing operations,
since after return from userspace they modify file system state
by creating a new node. if we do not protect the file system by
holding the directory lock, a lookup operation might race us into
the kernel and create the node earlier.
* remove pnode from hashlist before sending the reclaim faf off to
userspace. also, hold pmp_lock while frobbing the list.
 1.129.4.1 26-Sep-2009  snj Pull up following revision(s) (requested by pooka in ticket #1014):
sys/fs/puffs/puffs_vnops.c: revision 1.133
Set SAVENAME for rmdir and remove.
Addresses an easy part of PR kern/38188
 1.129.2.1 19-Jan-2009  skrll Sync with HEAD.
 1.142.4.5 31-May-2011  rmind sync with head
 1.142.4.4 05-Mar-2011  rmind sync with head
 1.142.4.3 03-Jul-2010  rmind sync with head
 1.142.4.2 30-May-2010  rmind sync with head
 1.142.4.1 16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.142.2.2 17-Aug-2010  uebayasi Sync with HEAD.
 1.142.2.1 30-Apr-2010  uebayasi Sync with HEAD.
 1.150.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.152.2.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.161.2.5 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.161.2.4 16-Jan-2013  yamt sync with (a bit old) head
 1.161.2.3 30-Oct-2012  yamt sync with head
 1.161.2.2 23-May-2012  yamt sync with head.
 1.161.2.1 17-Apr-2012  yamt sync with head
 1.162.4.3 29-Apr-2012  mrg sync to latest -current.
 1.162.4.2 05-Apr-2012  mrg sync to latest -current.
 1.162.4.1 18-Feb-2012  mrg merge to -current.
 1.163.2.12 27-Feb-2015  martin Pull up following revision(s) (requested by manu in ticket #1260):
lib/libpuffs/puffs.3: revision 1.55, 1.60
sys/fs/puffs/puffs_msgif.h: revision 1.84
lib/libperfuse/ops.c: revision 1.83
sys/fs/puffs/puffs_sys.h: revision 1.89
sys/fs/puffs/puffs_vfsops.c: revision 1.116
lib/libperfuse/perfuse.c: revision 1.36
sys/fs/puffs/puffs_vnops.c: revision 1.200-1.202

Use more markup. New sentence, new line. Bump date for previous.

Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE
FUSE filesystems do not expect to get metadata updates for [amc]time
and size; they update the values on their own after operations.

The PUFFS PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem, and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.

Update file size after write without metadata flush
If we do not use metadata flush, we must make sure the size is updated
in the filesystem after a write, otherwise the next GETATTR will get us
a stale value and the file will be truncated.
 1.163.2.11 16-Jan-2015  martin Pull up following revision(s) (requested by manu in ticket #1236):
sys/fs/puffs/puffs_vnops.c: revision 1.199
Make sure reads on empty files reach PUFFS filesystems
Sending a read through the page cache will get the operation
short-circuited. This is a problem with some filesystems that
expect to receive the read operation in order to update atime.
We fix that by bypassing the page cache when reading a file
with a size known to be zero.
 1.163.2.10 09-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #1187):
sys/fs/puffs/puffs_vnops.c: revision 1.198
PUFFS direct I/O cache fix
There are a few situations where we must take care of the cache if
direct I/O was enabled:
- if we do direct I/O for write but not for read, then any write must
invalidate the cache so that a reader gets the written data and not
the not-updated cache.
- if we used a vnode without direct I/O and it is enabled for writing,
we must flush the cache before completing the open operation, so that
the cached writes are not lost.
And at inactive time, we wipe direct I/O flags so that a new open
without direct I/O does not inherit direct I/O.
 1.163.2.9 09-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #1184):
sys/fs/puffs/puffs_vnops.c: revision 1.195
According to pooka@'s comment, a long time ago, VOP_STRATEGY could not
fail without taking down the kernel. It seems this is not the case
anymore, hence we can stop dropping errors in puffs_vnop_strategy()
Approved by pooka@
 1.163.2.8 09-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #1166):
sys/fs/puffs/puffs_vnops.c: revision 1.188-1.194
- If we truncate the file, make sure we zero-fill the end of the last
page, otherwise if the file is later truncated to a larger size
(creating a hole), that area will not return zeroes as it should.
- Use PRIx64 for printing offsets
- Improve zero-fill of last page after shrink fix:
1) do it only if the file is open for writing, otherwise we send write
requests to the FS on a file that has never been open.
2) do it inside existing if (vap->va_size != VNOVAL) block
- Restore LP64 fix that was removed by mistake
- Make this build again without debugging enabled; DPRINTF() can end up
as empty, and in an if conditional, you then need braces if that's the
only potential body.
- As is evidenced by several of our 32-bit MIPS ports, it's wrong to
print vsize_t with PRIx64 -- instead use our own PRIxVSIZE macro.
- Do the previous correctly...
 1.163.2.7 03-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #1152):
sys/fs/puffs/puffs_vnops.c: revision 1.186
PUFFS fixes for size update after write plus read/write sanity checks
- Always update kernel metadata cache for size when writing
This fixes situation where size update after appending to a file lagged
- Make read/write nilpotent when called with null size, as FFS does
- Return EFBIG instead of EINVAL for negative offsets, as FFS does
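The two sanity checks listed above can be pictured roughly as follows. This is a minimal sketch and not the puffs_vnop_read()/puffs_vnop_write() code: the helper name rw_precheck(), the done flag, and the order of the two checks are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical pre-check run before any I/O is attempted.
 * Sets *done to 1 when the caller should return the result immediately.
 */
int
rw_precheck(int64_t offset, size_t resid, int *done)
{
	*done = 0;
	if (resid == 0) {
		*done = 1;	/* nothing to transfer: succeed without I/O */
		return 0;
	}
	if (offset < 0) {
		*done = 1;
		return EFBIG;	/* negative offset: EFBIG rather than EINVAL */
	}
	return 0;		/* proceed with the real read or write */
}
```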
 1.163.2.6 03-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #1149):
sys/fs/puffs/puffs_node.c: revision 1.33
sys/fs/puffs/puffs_vnops.c: revision 1.185
When changing a directory content, update the ctime/mtime in kernel
cache, otherwise the updated ctime/mtime appears after the cached
entry expires.
 1.163.2.5 03-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #1140):
lib/libperfuse/ops.c 1.63-1.69
lib/libperfuse/perfuse.c 1.32-1.33
lib/libperfuse/perfuse_priv.h 1.32-1.34
lib/libperfuse/subr.c 1.20
lib/libpuffs/creds.c 1.16
lib/libpuffs/dispatcher.c 1.47
lib/libpuffs/puffs.h 1.125
lib/libpuffs/puffs_ops.3 1.37-1.38
lib/libpuffs/requests.c 1.24
sys/fs/puffs/puffs_msgif.h 1.81
sys/fs/puffs/puffs_sys.h 1.85
sys/fs/puffs/puffs_vnops.c 1.183
usr.sbin/perfused/msg.c 1.22
Bring libpuffs, libperfuse and perfused on par with -current:
- implement FUSE direct I/O
- remove useless code and warnings
- fix missing GETATTR bugs
- fix extended attribute get and list operations
 1.163.2.4 12-Aug-2012  martin Pull up following revision(s) (requested by manu in ticket #438):
lib/libperfuse/perfuse_priv.h: revision 1.31
sys/fs/puffs/puffs_msgif.h: revision 1.80
sys/fs/puffs/puffs_vnops.c: revision 1.171
lib/libpuffs/puffs_ops.3: revision 1.31
sys/fs/puffs/puffs_vnops.c: revision 1.172
sys/fs/puffs/puffs_vnops.c: revision 1.173
sys/fs/puffs/puffs_vnops.c: revision 1.174
usr.sbin/perfused/perfused.c: revision 1.24
sys/fs/puffs/puffs_sys.h: revision 1.80
sys/fs/puffs/puffs_sys.h: revision 1.81
sys/fs/puffs/puffs_sys.h: revision 1.82
lib/libperfuse/subr.c: revision 1.19
lib/libperfuse/perfuse.c: revision 1.30
sys/fs/puffs/puffs_msgif.c: revision 1.90
sys/fs/puffs/puffs_msgif.c: revision 1.91
sys/fs/puffs/puffs_msgif.c: revision 1.92
lib/libperfuse/ops.c: revision 1.59
lib/libpuffs/puffs.3: revision 1.53
lib/libperfuse/debug.c: revision 1.12
lib/libpuffs/puffs.3: revision 1.54
sys/fs/puffs/puffs_vnops.c: revision 1.167
sys/fs/puffs/puffs_msgif.h: revision 1.79
usr.sbin/perfused/msg.c: revision 1.21
sys/fs/puffs/puffs_vfsops.c: revision 1.102
sys/fs/puffs/puffs_vfsops.c: revision 1.103
sys/fs/puffs/puffs_vfsops.c: revision 1.105
lib/libpuffs/puffs.h: revision 1.123
lib/libperfuse/perfuse_if.h: revision 1.20
lib/libperfuse/perfuse.c: revision 1.29
lib/libpuffs/dispatcher.c: revision 1.42
lib/libpuffs/dispatcher.c: revision 1.43
- Fix same vnodes associated with multiple cookies
The scheme used to retrieve known nodes on lookup was flawed, as it only
used parent and name. This produced a different cookie for the same file
if it was renamed, when looking up ../ or when dealing with multiple files
associated with the same name through link(2).
We therefore abandon the use of node name and introduce hashed lists of
inodes. This causes a huge rewrite of reclaim code, which does not attempt
to keep parents allocated until all their children are reclaimed
- Fix race conditions in reclaim
There are a few situations where we issue multiple FUSE operations for
a PUFFS operation. On reclaim, we therefore have to wait for all FUSE
operations to complete, not just the current exchanges. We do this by
introducing node reference count with node_ref() and node_rele().
- Detect data loss caused by FAF
VOP_PUTPAGES causes FAF writes where the kernel does not check the
operation result. At least issue a warning on error.
- Enjoy FAF shortcut on setattr
No need to wait for the result if the kernel does not want it. There is
however an exception for setattr that touches the size, we need to wait
for completion because we have other operations queued for after the
resize.
- Fix fchmod() on write-open file
fchmod() on a node open with write privilege will send setattr with both mode
and size set. This confuses some FUSE filesystems. Therefore we send two FUSE
operations, one for mode, and one for size.
- Remove node TTL handling for netbsd-5 for simplicity's sake. The code
still builds on netbsd-5 but does not have the node TTL feature anymore.
It works fine with kernel support on netbsd-6.
- Improve PUFFS_KFLAG_CACHE_FS_TTL by reclaiming older inactive nodes.
The normal kernel behavior is to retain inactive nodes in the freelist
until it runs out of vnodes. This has some merit for local filesystems,
where the cost of an allocation is about the same as the cost of a
lookup. But that situation is not true for distributed filesystems.
On the other hand, keeping inactive nodes for a long time holds memory
in the file server process, and when the kernel runs out of vnodes, it
produces reclaim avalanches that increase latency for other operations.
We do not reclaim inactive vnodes immediately either, as they may be
looked up again shortly. Instead we introduce a grace time and we
reclaim nodes that have been inactive beyond the grace time.
- Fix lookup/reclaim race condition.
The above improvement uncovered a race condition between lookup and
reclaim. If we reclaimed a vnode associated with a userland cookie while
a lookup returning that same cookie was in progress, then the kernel ends
up with a vnode associated with a cookie that has been reclaimed in
userland. Next operation on the cookie will crash (or at least confuse)
the filesystem.
We fix this by introducing a lookup count in kernel and userland. On
reclaim, the kernel sends the count, which enables userland to detect
situations where it initiated a lookup that is not completed in kernel.
In such a situation, the reclaim must be ignored, as the node is about
to be looked up again.
Fix hang unmount bug introduced by last commit.
We introduced a slow queue for delayed reclaims, while the existing
queue for unmount, flush and exit has been renamed fast queue. Both
queues had timestamp for when an operation should be done, but it was
useless for the fast queue, which is always used to run an operation
ASAP. And the timestamp test had an error that turned ASAP into "at next
tick", but nobody was there to wake the thread at next tick, hence
the hang. The fix is to remove the useless and buggy timestamp test for
fast queue.
Rename slow sopreq queue into node sopreq queue, to reflect the fact that
it is only intended for postponed node reclaims.
When purging the node sopreq queue, do not call puffs_msg_sendresp(), as
it makes no sense.
Fix race condition between (create|mknod|mkdir|symlink) and reclaim, just
like we did it between lookup and reclaim.
Missing bit in previous commit (prevent race between create|mknod|mkdir|symlink
and reclaim)
Bump date for previous.
New sentence, new line; remove trailing whitespace; fix typos;
punctuation nits.
Add PUFFS_KFLAG_CACHE_DOTDOT so that vnodes hold a reference on their
parent, keeping them active, and allowing to lookup .. without sending
a request to the filesystem.
Enable the feature for perfused, as this is how FUSE works.
Missing bit in previous commit (PUFFS_KFLAG_CACHE_DOTDOT option to avoid
looking up ..)
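The lookup-count check described in this pull-up for the lookup/reclaim race can be pictured with the sketch below. It is only an illustration under stated assumptions: struct demo_node, its field, and demo_reclaim_allowed() are hypothetical names and do not come from libpuffs or libperfuse.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userland node state; the field name is illustrative only. */
struct demo_node {
	uint64_t lookups_started;	/* lookups this server has answered */
};

/*
 * Decide whether a reclaim request may be honoured.  kernel_count is the
 * lookup count the kernel sent along with the reclaim request.
 */
bool
demo_reclaim_allowed(const struct demo_node *n, uint64_t kernel_count)
{
	/*
	 * A mismatch means userland answered a lookup that the kernel has
	 * not finished processing, so the node is about to be looked up
	 * again and the reclaim must be ignored.
	 */
	return n->lookups_started == kernel_count;
}
```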
 1.163.2.3 12-Aug-2012  martin Pull up following revision(s) (requested by manu in ticket #484):
sys/fs/nilfs/nilfs_vnops.c: revision 1.18
sys/ufs/ufs/ufs_lookup.c: revision 1.117
sys/nfs/nfs_vnops.c: revision 1.295
sys/ufs/chfs/chfs_vnops.c: revision 1.8
sys/ufs/ext2fs/ext2fs_lookup.c: revision 1.70
sys/fs/unionfs/unionfs_vnops.c: revision 1.6
sys/kern/vfs_cache.c: revision 1.89
sys/fs/efs/efs_vnops.c: revision 1.26
sys/fs/hfs/hfs_vnops.c: revision 1.26
sys/fs/adosfs/adlookup.c: revision 1.16
sys/fs/puffs/puffs_vnops.c: revision 1.168
sys/fs/tmpfs/tmpfs_vnops.c: revision 1.98
sys/fs/ntfs/ntfs_vnops.c: revision 1.52
sys/fs/cd9660/cd9660_lookup.c: revision 1.20
sys/fs/msdosfs/msdosfs_lookup.c: revision 1.24
sys/fs/smbfs/smbfs_vnops.c: revision 1.80
sys/fs/udf/udf_vnops.c: revision 1.72
sys/fs/filecorefs/filecore_lookup.c: revision 1.14
sys/fs/puffs/puffs_node.c: revision 1.25
Move some of the tests for MAKEENTRY into cache_enter(9). Make some
variables in vfs_cache.c static, __read_mostly, etc.
No objection on tech-kern@.
 1.163.2.2 23-Apr-2012  riz Pull up following revision(s) (requested by manu in ticket #195):
lib/libskey/skeysubr.c: revision 1.27
lib/libkvm/kvm_getloadavg.c: revision 1.11
lib/libwrap/update.c: revision 1.9
lib/liby/yyerror.c: revision 1.9
lib/libpuffs/puffs_ops.3: revision 1.30
lib/libwrap/misc.c: revision 1.10
lib/libwrap/hosts_access.c: revision 1.20
lib/libpuffs/pnode.c: revision 1.11
lib/libperfuse/subr.c: revision 1.17
lib/libpuffs/pnode.c: revision 1.12
lib/libperfuse/subr.c: revision 1.18
lib/libwrap/options.c: revision 1.15
lib/libwrap/fix_options.c: revision 1.11
lib/libperfuse/ops.c: revision 1.52
lib/libperfuse/ops.c: revision 1.53
lib/libperfuse/ops.c: revision 1.54
lib/libwrap/hosts_ctl.c: revision 1.5
lib/libintl/gettext.c: revision 1.27
lib/libwrap/shell_cmd.c: revision 1.6
lib/libpuffs/dispatcher.c: revision 1.39
lib/libperfuse/perfuse_priv.h: revision 1.27
lib/libwrap/socket.c: revision 1.19
lib/libpuffs/puffs.3: revision 1.50
lib/libperfuse/perfuse_priv.h: revision 1.28
lib/libpuffs/puffs_priv.h: revision 1.45
lib/libpuffs/puffs.3: revision 1.51
lib/libperfuse/perfuse_priv.h: revision 1.29
lib/libwrap/percent_x.c: revision 1.5
lib/libpuffs/puffs.3: revision 1.52
lib/libperfuse/debug.c: revision 1.11
sys/fs/puffs/puffs_vnops.c: revision 1.165
lib/libwrap/tcpd.h: revision 1.13
sys/fs/puffs/puffs_vnops.c: revision 1.166
lib/libwrap/eval.c: revision 1.7
sys/fs/puffs/puffs_msgif.h: revision 1.78
sys/fs/puffs/puffs_vfsops.c: revision 1.101
lib/libwrap/rfc931.c: revision 1.9
lib/libwrap/clean_exit.c: revision 1.5
lib/libpuffs/puffs.h: revision 1.120
lib/libc/stdlib/jemalloc.c: revision 1.27
lib/librmt/rmtlib.c: revision 1.26
lib/libpuffs/puffs.h: revision 1.121
sys/fs/puffs/puffs_sys.h: revision 1.79
lib/librumpclient/rumpclient.c: revision 1.48
lib/libwrap/refuse.c: revision 1.5
lib/libperfuse/perfuse.c: revision 1.26
lib/libperfuse/perfuse.c: revision 1.27
tests/fs/puffs/t_fuzz.c: revision 1.5
lib/libperfuse/perfuse.c: revision 1.28
lib/libpuffs/dispatcher.c: revision 1.40
sys/fs/puffs/puffs_node.c: revision 1.24
lib/libwrap/diag.c: revision 1.9
lib/libintl/textdomain.c: revision 1.13
Use C89 function definition
Add name and attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
Add PUFFS_KFLAG_CACHE_FS_TTL flag to puffs_init(3) to use name and
attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
The filesystem updates attributes and TTL using
puffs_pn_getvap(3), puffs_pn_getvattl(3), and puffs_pn_getcnttl(3)
Use new PUFFS_KFLAG_CACHE_FS_TTL option to puffs_init(3) so that
FUSE TTL on name and attributes are used. This saves many PUFFS
operations and improves performance.
PUFFS_KFLAG_CACHE_FS_TTL is #ifdef'ed in many places for now so that
libperfuse can still be used on netbsd-5.
Split file system.
Comma fixes.
Remove dangling "and".
Bump date for previous.
- Make sure update_va does not change vnode size when it should not. For
instance when doing a fault-issued VOP_GETPAGES within VOP_WRITE, changing
size leads to panic: genfs_getpages: past eof.
- Handle ticks wrap around for vnode name and attribute timeout
- When using PUFFS_KFLAG_CACHE_FS_TTL, do not use puffs_node to carry
attribute and TTL for a newly created node. Instead extend puffs_newinfo
and add puffs_newinfo_setva() and puffs_newinfo_setttl()
- Remove node_mk_common_final in libperfuse. It used to set uid/gid for
a newly created vnode but has been made redundant a long time ago since
uid and gid are properly set in FUSE header.
- In libperfuse, check for corner case where opc = 0 on INACTIVE and RECLAIM
(how is it possible? Check for it to avoid a crash anyway)
- In libperfuse, make sure we unlimit RLIMIT_AS and RLIMIT_DATA so that
we do not run out of memory because the kernel is lazy at reclaiming vnodes.
- In libperfuse, cleanup style of perfuse_destroy_pn()
Do not set PUFFS_KFLAG_CACHE_FS_TTL for PUFFS tests
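The ticks wrap-around item in this pull-up concerns a classic pitfall: comparing a tick counter against a deadline with a plain "now >= deadline" breaks once the counter wraps. One common way to make the comparison wrap-safe is sketched below. It is a generic technique shown under assumptions (a 32-bit unsigned counter and the hypothetical name demo_ticks_expired()), not the code from puffs_vnops.c.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true once 'now' has reached 'deadline', even if
 * the 32-bit tick counter wrapped in between.  Only valid while the two
 * values are less than 2^31 ticks apart.
 */
bool
demo_ticks_expired(uint32_t now, uint32_t deadline)
{
	/* Modular difference: small means "at or after", huge means "before". */
	return (uint32_t)(now - deadline) < UINT32_C(0x80000000);
}
```

A deadline set shortly before the counter wraps is then still reported as expired shortly after the wrap, where a naive comparison would treat it as lying far in the future.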
 1.163.2.1 03-Apr-2012  riz Pull up following revision(s) (requested by jakllsch in ticket #154):
sys/fs/puffs/puffs_vnops.c: revision 1.164
Prevent access beyond end of PUFFS file on read,
similar to as is done for NFS.
 1.174.2.3 03-Dec-2017  jdolecek update from HEAD
 1.174.2.2 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.174.2.1 20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.176.2.1 18-May-2014  rmind sync with head
 1.181.2.1 10-Aug-2014  tls Rebase.
 1.182.2.13 27-Feb-2015  martin Pull up following revision(s) (requested by manu in ticket #555):
lib/libpuffs/puffs.3: revision 1.60
sys/fs/puffs/puffs_msgif.h: revision 1.84
lib/libperfuse/ops.c: revision 1.83
sys/fs/puffs/puffs_sys.h: revision 1.89
sys/fs/puffs/puffs_vfsops.c: revision 1.116
lib/libperfuse/perfuse.c: revision 1.36
sys/fs/puffs/puffs_vnops.c: revision 1.200-1.202

Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE

FUSE filesystems do not expect to get metadata updates for [amc]time
and size; they update the values on their own after operations.

The PUFFS PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem, and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.

Update file size after write without metadata flush
If we do not use metadata flush, we must make sure the size is updated
in the filesystem after a write, otherwise the next GETATTR will get us
a stale value and the file will be truncated.
 1.182.2.12 17-Jan-2015  martin Pull up following revision(s) (requested by manu in ticket #423):
sys/fs/puffs/puffs_vnops.c: revision 1.199
Make sure reads on empty files reach PUFFS filesystems
Sending a read through the page cache will get the operation
short-circuited. This is a problem with some filesystems that
expect to receive the read operation in order to update atime.
We fix that by bypassing the page cache when reading a file
with a size known to be zero.
 1.182.2.11 09-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #194):
sys/fs/puffs/puffs_vnops.c: revision 1.197
sys/fs/puffs/puffs_node.c: revision 1.35
Fix PUFFS node use-after-reclaim
When puffs_cookie2vnode() misses an entry, vcache_get()
creates a new node (puffs_vfsop_loadvnode being called to
initialize the PUFFS part), then it discovers it is VNON,
and tries to vrele() it. vrele() calls VOP_INACTIVE(),
which led us in puffs_vnop_inactive() where we sent a
request to the filesystem for a node that already had been
reclaimed.
The fix is to check for VNON nodes in puffs_vnop_inactive()
and to return without doing anything. This is suboptimal, but
a better workaround would probably need to modify vcache API,
with an impact on other filesystems. Let us keep it simple.
 1.182.2.10 09-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #193):
sys/fs/puffs/puffs_vnops.c: revision 1.198
PUFFS direct I/O cache fix
There are a few situations where we must take care of the cache if
direct I/O was enabled:
- if we do direct I/O for write but not for read, then any write must
invalidate the cache so that a reader gets the written data and not
the not-updated cache.
- if we used a vnode without direct I/O and it is enabled for writing,
we must flush the cache before completing the open operation, so that
the cached writes are not lost.
And at inactive time, we wipe direct I/O flags so that a new open
without direct I/O does not inherit direct I/O.
 1.182.2.9 05-Nov-2014  snj Pull up following revision(s) (requested by manu in ticket #182):
sys/fs/puffs/puffs_vnops.c: revision 1.195
According to pooka@'s comment, a long time ago, VOP_STRATEGY could not
fail without taking down the kernel. It seems this is not the case
anymore, hence we can stop dropping errors in puffs_vnop_strategy()
Approved by pooka@
 1.182.2.8 05-Nov-2014  snj Pull up following revision(s) (requested by manu in ticket #181):
lib/libperfuse/fuse.h: revision 1.6
lib/libperfuse/ops.c: revision 1.78
lib/libperfuse/perfuse.c: revision 1.35
lib/libperfuse/perfuse_priv.h: revision 1.36
lib/libpuffs/dispatcher.c: revision 1.48
lib/libpuffs/opdump.c: revision 1.37
lib/libpuffs/puffs.c: revision 1.118
lib/libpuffs/puffs.h: revision 1.126
lib/libpuffs/puffs_ops.3: revisions 1.40-1.41
sys/fs/puffs/puffs_msgif.h: revision 1.82-1.83
sys/fs/puffs/puffs_msgif.h: revision 1.82
sys/fs/puffs/puffs_vnops.c: revision 1.196
Add PUFFS support for fallocate and fdiscard operations
--
libpuffs support for fallocate and fdiscard operations
--
Add PUFFS_HAVE_FALLOCATE in puffs_msgif.h so that filesystem can decide
at build time whether fallocate is usable
--
FUSE fallocate support
There seems to be no fdiscard FUSE operation at the moment, hence that
one is left unused.
 1.182.2.7 14-Oct-2014  martin Pull up revisions 1.192-1.194: fix debug printf formatting and make
it compile without debugging enabled.
 1.182.2.6 13-Oct-2014  martin Pull up following revision(s) (requested by manu in ticket #136):
sys/fs/puffs/puffs_vnops.c: revision 1.189-1.191
If we truncate a file open for writing, make sure we zero-fill the end
of the last page, otherwise if the file is later truncated to a larger
size (creating a hole), that area will not return zeroes as it should.
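To make the zero-fill step concrete, the sketch below computes how many bytes lie between the new end of file and the end of its last page, which is the region that has to be cleared after a shrink. It is a minimal sketch with a hypothetical helper name, demo_tail_to_zero(), and does not reproduce the puffs_vnops.c implementation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: after shrinking a file to new_size, return how
 * many bytes from new_size up to the end of its last page must be
 * zeroed.  page_size must be non-zero.
 */
size_t
demo_tail_to_zero(uint64_t new_size, size_t page_size)
{
	size_t used = (size_t)(new_size % page_size);

	/* A size that ends exactly on a page boundary leaves nothing stale. */
	return used == 0 ? 0 : page_size - used;
}
```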
 1.182.2.5 30-Sep-2014  martin Pull up following revision(s) (requested by hannken in ticket #67):
sys/fs/puffs/puffs_node.c: revision 1.34
sys/fs/puffs/puffs_vnops.c: revision 1.187
Fix the puffs_sop_thread -> puffs_cookie2vnode path:
- pass the cookie by reference
- add missing mutex_exit()
- update assertion for VNON typed vnodes
 1.182.2.4 11-Sep-2014  martin Pull up following revision(s) (requested by manu in ticket #93):
sys/fs/puffs/puffs_vnops.c: revision 1.186
PUFFS fixes for size update after write plus read/write sanity checks
- Always update kernel metadata cache for size when writing
This fixes situation where size update after appending to a file lagged
- Make read/write nilpotent when called with null size, as FFS does
- Return EFBIG instead of EINVAL for negative offsets, as FFS does
 1.182.2.3 10-Sep-2014  martin Pull up following revision(s) (requested by manu in ticket #79):
sys/fs/puffs/puffs_node.c: revision 1.33
sys/fs/puffs/puffs_vnops.c: revision 1.185
When changing a directory content, update the ctime/mtime in kernel
cache, otherwise the updated ctime/mtime appears after the cached
entry expires.
 1.182.2.2 29-Aug-2014  martin Pull up following revision(s) (requested by hannken in ticket #67):
sys/fs/puffs/puffs_sys.h: revision 1.86
sys/fs/puffs/puffs_vfsops.c: revision 1.114
sys/fs/puffs/puffs_msgif.c: revision 1.95
sys/fs/puffs/puffs_node.c: revision 1.32
sys/fs/puffs/puffs_vnops.c: revision 1.184
Change puffs from hashlist to vcache.
- field "pa_nhashbuckets" of struct "puffs_kargs" becomes a no-op.
and should be removed on the next protocol version bump.
 1.182.2.1 26-Aug-2014  riz Pull up following revision(s) (requested by manu in ticket #52):
sys/fs/puffs/puffs_msgif.h: revision 1.81
sys/fs/puffs/puffs_sys.h: revision 1.85
sys/fs/puffs/puffs_vnops.c: revision 1.183
Add an oflags input field to open requests so that the filesystem can pass
back information about the file. Implement PUFFS_OPEN_IO_DIRECT, which
will force direct IO (bypassing page cache) for the file.
 1.198.2.5 28-Aug-2017  skrll Sync with HEAD
 1.198.2.4 05-Oct-2016  skrll Sync with HEAD
 1.198.2.3 09-Jul-2016  skrll Sync with HEAD
 1.198.2.2 06-Jun-2015  skrll Sync with HEAD
 1.198.2.1 06-Apr-2015  skrll Sync with HEAD
 1.204.2.2 26-Apr-2017  pgoyette Sync with HEAD
 1.204.2.1 26-Jul-2016  pgoyette Sync with HEAD
 1.205.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.211.10.2 08-Apr-2020  martin Merge changes from current as of 20200406
 1.211.10.1 10-Jun-2019  christos Sync with HEAD
 1.211.8.1 26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.211.2.1 06-Nov-2018  martin Pull up following revision(s) (requested by manu in ticket #1082):

sys/fs/puffs/puffs_vnops.c: revision 1.213

Fix use after RECLAIM in PUFFS filesystems

From hannken@

When puffs_cookie2vnode() misses an entry and vrele()s it, the operations
puffs_vnop_reclaim() and puffs_vnop_fsync() get called with a VNON
vnode.

Do not notify the server in this case as the cookie is stale.
 1.213.6.1 29-Feb-2020  ad Sync with head.
 1.214.4.1 25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.217.6.1 01-Aug-2021  thorpej Sync with HEAD.
