History log of /src/sys/fs/puffs/puffs_vnops.c
Revision  Date  Author  Comments
 1.226  09-Feb-2024  andvar fix spelling mistakes, mainly in comments and log messages.
 1.225  23-Feb-2022  andvar fix various typos in comments, mainly immediatly/immediately/,
as well shared and recently fixed typos in OpenBSD code by Jonathan Grey.
 1.224  05-Dec-2021  msaitoh s/invlid/invalid/ in comment.
 1.223  20-Oct-2021  thorpej Overhaul of the EVFILT_VNODE kevent(2) filter:

- Centralize vnode kevent handling in the VOP_*() wrappers, rather than
forcing each individual file system to deal with it (except VOP_RENAME(),
because VOP_RENAME() is a mess and we currently have 2 different ways
of handling it; at least it's reasonably well-centralized in the "new"
way).
- Add support for NOTE_OPEN, NOTE_CLOSE, NOTE_CLOSE_WRITE, and NOTE_READ,
compatible with the same events in FreeBSD.
- Track which kevent notifications clients are interested in receiving
to avoid doing work for events no one cares about (avoiding, e.g.,
taking locks and traversing the klist to send a NOTE_WRITE when
someone is merely watching for a file to be deleted).

In support of the above:

- Add support in vnode_if.sh for specifying PRE- and POST-op handlers,
to be invoked before and after vop_pre() and vop_post(), respectively.
Basic idea from FreeBSD, but implemented differently.
- Add support in vnode_if.sh for specifying CONTEXT fields in the
vop_*_args structures. These context fields are used to convey information
between the file system VOP function and the VOP wrapper, but do not
occupy an argument slot in the VOP_*() call itself. These context fields
are initialized and subsequently interpreted by PRE- and POST-op handlers.
- Version VOP_REMOVE() to use a context field for the file system to report
back the resulting link count of the target vnode. Return this in tmpfs,
udf, nfs, chfs, ext2fs, lfs, and ufs.

NetBSD 9.99.92.
 1.222  24-Jul-2021  andvar Fix all remaining typos, mainly in comments but also in a few definitions and log messages, reported by me in PR kern/54889.
Also fixed some additional typos in comments, found on review of same files or typos.
 1.221  19-Jul-2021  dholland Abolish all the silly indirection macros for initializing vnode ops tables.

These are things of the form #define foofs_op genfs_op, or #define
foofs_op genfs_eopnotsupp, or similar. They serve no purpose besides
obfuscation, and have gotten cutpasted all over everywhere.

Part 2; cvs randomly didn't commit these changes before, and then hid
them from me until I touched the files to force it to rethink. Dunno
what happened.

There's probably more of these, going to have to scan the tree the
hard way.
 1.220  18-Jul-2021  dholland Use macros for the canned parts of device and fifo vnode op tables.

Add GENFS_SPECOP_ENTRIES and GENFS_FIFOOP_ENTRIES macros that contain
the portion of the vnode ops table declaration that is
(conservatively) the same in every fs. Use these in every fs that
supports devices and/or fifos with separate ops tables.

Note that ptyfs works differently (it has one type of vnode with
open-coded dispatch to the specfs code, which I haven't changed in
this commit) and rump/librump/rumpvfs/rumpfs.c has an indirect dynamic
dispatch that already does more or less the same thing, which I also
haven't changed.

Also note that this anticipates a few bits in the next changeset here
and there, and adds missing but unreachable calls in some cases (e.g.
most fses weren't defining whiteout on devices and fifos, but it isn't
reachable there), and it changes parsepath on devices and fifos to
genfs_badop from genfs_parsepath (but it's not reachable there
either).

It appears that devices in kernfs were missing kqfilter, so it's
possible that if you try to use kqueue on /kern/rootdev that it'll
explode.

And finally note that the ops declaration tables aren't
order-dependent. (Other than vop_default_desc has to come first.)
Otherwise this wouldn't work.
 1.219  29-Jun-2021  dholland Now remove cn_consume from struct componentname.

This change requires a kernel bump.

Note though that I'm not going to version the VOP_LOOKUP args
structure (or any other args structure) as code that doesn't touch
cn_consume doesn't need attention and code that does will fail on it
without further intervention.
 1.218  29-Jun-2021  dholland - Add a new vnode op: VOP_PARSEPATH.
- Move namei_getcomponent to genfs_vnops.c and call it genfs_parsepath.
- Add a parsepath entry to every vnode ops table.

VOP_PARSEPATH takes a directory vnode to be searched and a complete
following path and chooses how much of that path to consume. To begin
with, all parsepath calls are genfs_parsepath, which locates the first
'/' as always.

Note that the call doesn't take the whole struct componentname, only
the string. The other bits of struct componentname should not be
needed and there's no reason to cause potential complications by
exposing them.
 1.217  16-May-2020  christos branches: 1.217.6;
Add ACL support for FFS. From FreeBSD.
 1.216  15-May-2020  maxv hardclock_ticks -> getticks()
 1.215  23-Apr-2020  ad PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)

- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
somewhere. Use it to decide whether to do direct-mapped copy, rather than
poking around directly in the vnode in ubc_uiomove(), which is ugly and
doesn't work for tmpfs. It would be nicer to contain all this in UVM but
the filesystem provides the needed locking here (VV_MAPPED) and to
reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS(). Pass in UBC_ISMAPPED where
appropriate.
 1.214  23-Feb-2020  ad branches: 1.214.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
 1.213  06-Nov-2018  manu branches: 1.213.6;
Fix use after RECLAIM in PUFFS filesystems

From hannken@

When puffs_cookie2vnode() misses an entry and vrele()s it, the operations
puffs_vnop_reclaim() and puffs_vnop_fsync() get called with a VNON
vnode.

Do not notify the server in this case as the cookie is stale.
 1.212  05-Nov-2018  manu Add missing mutex pn->pn_sizemtx lock in puffs_vnop_open()

puffs_vnop_open() calls flushvncache(), which calls dosetattr()
if pn->pn_stat has PNODE_METACACHE_MASK. In that case, the lock
on pn->pn_sizemtx is mandatory and asserted.
 1.211  26-May-2017  riastradh branches: 1.211.2; 1.211.8; 1.211.10;
Make VOP_RECLAIM do the last unlock of the vnode.

VOP_RECLAIM naturally has exclusive access to the vnode, so having it
locked on entry is not strictly necessary -- but it means if there
are any final operations that must be done on the vnode, such as
ffs_update, requiring exclusive access to it, we can now kassert that
the vnode is locked in those operations.

We can't just have the caller release the last lock because some file
systems don't use genfs_lock, and require the vnode to remain valid
for VOP_UNLOCK to work, notably unionfs.
 1.210  26-Apr-2017  riastradh Change VOP_REMOVE and VOP_RMDIR to preserve lock/ref on dvp.

No change to vp -- the plan is to replace the node by the
componentname in the vop parameters, and let all directory vops do
lookups internally.

Proposed on tech-kern with no objections:
https://mail-index.netbsd.org/tech-kern/2017/04/17/msg021825.html
 1.209  11-Apr-2017  riastradh Make VOP_INACTIVE preserve vnode lock on return.

Discussed on tech-kern:
https://mail-index.netbsd.org/tech-kern/2017/04/01/msg021751.html

Ride 7.99.68, a bumpy bus of incremental vfs improvements!
 1.208  08-Apr-2017  hannken Update mtime when updating file size.

PR kern/51762 (mtime not updated by open(O_TRUNC))
 1.207  06-Apr-2017  christos use ubc_zerorange
 1.206  04-Apr-2017  christos use MAX_PAGE_SIZE.
 1.205  21-Jul-2016  christos branches: 1.205.2;
replace variable stack declaration with a large enough one and KASSERT.
 1.204  07-Jul-2016  msaitoh branches: 1.204.2;
KNF. Remove extra spaces. No functional change.
 1.203  20-Apr-2015  riastradh Make VOP_LINK return directory still locked and referenced.

Ride 7.99.10 bump.
 1.202  25-Feb-2015  christos make this compile again.
 1.201  25-Feb-2015  manu Update file size after write without metadata flush

If we do not use metadata flush, we must make sure the size is updated
in the filesystem after a write, otherwise the next GETATTR will get us
a stale value and the file will be truncated.
 1.200  15-Feb-2015  manu Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE

FUSE filesystems do not expect to get metadata updates for [amc]time
and size; they update the values on their own after operations.

The PUFFS PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem, and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.
 1.199  13-Jan-2015  manu Make sure reads on empty files reach PUFFS filesystems

Sending a read through the page cache will get the operation
short-circuited. This is a problem with some filesystems that
expect to receive the read operation in order to update atime.

We fix that by bypassing the page cache when reading a file
with a size known to be zero.
 1.198  04-Nov-2014  manu branches: 1.198.2;
PUFFS direct I/O cache fix

There are a few situations where we must take care of the cache if direct
I/O was enabled:
- if we do direct I/O for write but not for read, then any write must
invalidate the cache so that a reader gets the written data and not
the not-updated cache.
- if we used a vnode without direct I/O and it is enabled for writing,
we must flush the cache before completing the open operation, so that
the cached writes are not lost.

And at inactive time, we wipe direct I/O flags so that a new open without
direct I/O does not inherit direct I/O.
 1.197  04-Nov-2014  manu Fix PUFFS node use-after-reclaim

When puffs_cookie2vnode() misses an entry, vcache_get()
creates a new node (puffs_vfsop_loadvnode being called to
initialize the PUFFS part), then it discovers it is VNON,
and tries to vrele() it. vrele() calls VOP_INACTIVE(),
which led us in puffs_vnop_inactive() where we sent a
request to the filesystem for a node that already had been
reclaimed.

The fix is to check for VNON nodes in puffs_vnop_inactive()
and to return without doing anything. This is suboptimal, but
a better workaround would probably need to modify vcache API,
with an impact on other filesystems. Let us keep it simple.
 1.196  31-Oct-2014  manu Add PUFFS support for fallocate and fdiscard operations
 1.195  31-Oct-2014  manu According to pooka@'s comment, a long time ago, VOP_STRATEGY could not
fail without taking down the kernel. It seems this is not the case anymore,
hence we can stop dropping errors in puffs_vnop_strategy()

Approved by pooka@
 1.194  07-Oct-2014  he Do the previous correctly...
 1.193  07-Oct-2014  he As is evidenced by several of our 32-bit MIPS ports, it's wrong to
print vsize_t with PRIx64 -- instead use our own PRIxVSIZE macro.
 1.192  06-Oct-2014  he Make this build again without debugging enabled; DPRINTF() can end up
as empty, and in an if conditional, you then need braces if that's the
only potential body.
 1.191  06-Oct-2014  manu Restore LP64 fix that was removed by mistake
 1.190  06-Oct-2014  manu Improve zero-fill of last page after shrink fix:
1) do it only if the file is open for writing, otherwise we send write
requests to the FS on a file that has never been open.
2) do it inside existing if (vap->va_size != VNOVAL) block
 1.189  05-Oct-2014  justin Use PRIx64 for printing offsets
 1.188  05-Oct-2014  manu If we truncate the file, make sure we zero-fill the end of the last
page, otherwise if the file is later truncated to a larger size
(creating a hole), that area will not return zeroes as it should.
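
A minimal sketch of the arithmetic behind this fix, assuming a fixed 4096-byte page; `SKETCH_PAGE_SIZE` and `tail_zero_len()` are hypothetical names for illustration, not identifiers from puffs_vnops.c. It computes how many bytes after the new end of file remain in the last page and would need zeroing:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for the sketch; the kernel uses its own constant. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper: number of bytes between new_size and the end of
 * the page that contains it. Returns 0 when new_size is page-aligned,
 * because no partial page is left behind in that case.
 */
static size_t
tail_zero_len(uint64_t new_size)
{
	uint64_t off = new_size % SKETCH_PAGE_SIZE;

	return off == 0 ? 0 : (size_t)(SKETCH_PAGE_SIZE - off);
}
```

For example, truncating to 5000 bytes leaves 904 bytes in the second page, so the remaining 3192 bytes of that page are the ones to zero.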
 1.187  30-Sep-2014  hannken Fix the puffs_sop_thread -> puffs_cookie2vnode path:
- pass the cookie by reference
- add missing mutex_exit()
- update assertion for VNON typed vnodes
 1.186  11-Sep-2014  manu PUFFS fixes for size update after write plus read/write sanity checks

- Always update kernel metadata cache for size when writing
This fixes situation where size update after appending to a file lagged
- Make read/write nilpotent when called with null size, as FFS does
- Return EFBIG instead of EINVAL for negative offsets, as FFS does
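
The two sanity checks listed above can be sketched as follows. This is an illustration only: `rw_sanity_check()` and `enum rw_check` are hypothetical names, and the order of the two checks is an assumption, not taken from the commit.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical outcome of the check, for illustration only. */
enum rw_check { RW_PROCEED, RW_NOTHING_TO_DO };

/*
 * Hypothetical helper: a zero-length transfer succeeds without doing
 * any work, and a negative offset is rejected with EFBIG (not EINVAL).
 * Returns 0 or an errno value; *out is set only when 0 is returned.
 */
static int
rw_sanity_check(off_t offset, size_t resid, enum rw_check *out)
{
	if (resid == 0) {
		*out = RW_NOTHING_TO_DO;
		return 0;
	}
	if (offset < 0)
		return EFBIG;
	*out = RW_PROCEED;
	return 0;
}
```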
 1.185  05-Sep-2014  manu When changing a directory content, update the ctime/mtime in kernel cache,
otherwise the updated ctime/mtime appears after the cached entry expires.
 1.184  28-Aug-2014  hannken Change puffs from hashlist to vcache.
- field "pa_nhashbuckets" of struct "puffs_kargs" becomes a no-op
and should be removed on the next protocol version bump.
 1.183  16-Aug-2014  manu Add an oflags input field to open requests so that the filesystem can pass
back information about the file. Implement PUFFS_OPEN_IO_DIRECT, which
will force direct IO (bypassing page cache) for the file.
 1.182  25-Jul-2014  dholland branches: 1.182.2;
Add VOP_FALLOCATE and VOP_FDISCARD to every vnode ops table I can
find.

The filesystem ones all call genfs_eopnotsupp - right now I am only
implementing the plumbing and we can implement fallocate and/or
fdiscard for files later.

The device ones call spec_fallocate (which is also genfs_eopnotsupp)
and spec_fdiscard, which dispatches to the device-level op.

The fifo ones all call vn_fifo_bypass, which also ends up being
EOPNOTSUPP.
 1.181  24-Mar-2014  hannken branches: 1.181.2;
- Make VI_XLOCK, VI_CLEAN and VI_LOCKSHARE private to kern/vfs_*.c.
- Make vwait() static.
- Add vdead_check() to check a vnode for being or becoming dead.

Discussed on tech-kern.

Welcome to 6.99.38
 1.180  07-Feb-2014  hannken Change vnode operation lookup to return the resulting vnode *vpp unlocked.
Change cache_lookup() to return an unlocked vnode.

Discussed on tech-kern@

Welcome to 6.99.31
 1.179  23-Jan-2014  hannken Change vnode operations create, mknod, mkdir and symlink to return
the resulting vnode *vpp unlocked.

Discussed on tech-kern@

Welcome to 6.99.30
 1.178  17-Jan-2014  hannken Change vnode operations create, mknod, mkdir and symlink to keep the
directory node dvp locked on return.

Discussed on tech-kern@

Welcome to 6.99.29
 1.177  17-Oct-2013  christos - remove unused variables
- add _NOERROR flavor macros for the case where errors are ignored.
 1.176  05-Nov-2012  dholland branches: 1.176.2;
Excise struct componentname from the namecache.

This uglifies the interface, because several operations need to be
passed the namei flags and cache_lookup also needs for the time being
to be passed cnp->cn_nameiop. Nonetheless, it's a net benefit.

The glop should be able to go away eventually but requires structural
cleanup elsewhere first.

This change requires a kernel bump.
 1.175  05-Nov-2012  dholland Disentangle the namecache from the internals of namei.

- Move the namecache's hash computation to inside the namecache code,
instead of being spread out all over the place. Remove cn_hash from
struct componentname and delete all uses of it.

- It is no longer necessary (if it ever was) for cache_lookup and
cache_lookup_raw to clear MAKEENTRY from cnp->cn_flags for the cases
that cache_enter already checks for.

- Rearrange the interface of cache_lookup (and cache_lookup_raw) to
make it somewhat simpler, to exclude certain nonexistent error
conditions, and (most importantly) to make it not require write access
to cnp->cn_flags.

This change requires a kernel bump.
 1.174  10-Aug-2012  manu branches: 1.174.2;
Add PUFFS_KFLAG_CACHE_DOTDOT so that vnodes hold a reference on their
parent, keeping them active, and allowing to lookup .. without sending
a request to the filesystem.

Enable the feature for perfused, as this is how FUSE works.
 1.173  10-Aug-2012  manu Missing bit in previous commit (prevent race between create|mknod|mkdir|symlink
and reclaim)
 1.172  10-Aug-2012  manu Fix race condition between (create|mknod|mkdir|symlink) and reclaim, just
like we did it between lookup and reclaim.
 1.171  27-Jul-2012  manu Rename slow sopreq queue into node sopreq queue, to reflect the fact that
it is only intended for postponed node reclaims.
When purging the node sopreq queue, do not call puffs_msg_sendresp(), as
it makes no sense.
 1.170  23-Jul-2012  manu Backout NCHNAMLEN check for cache_enter. That change collided with rmind's
move of this exact check into cache_enter
 1.169  23-Jul-2012  manu Do not call cache_enter with path components bigger than NCHNAMLEN, as it
panics the kernel.
 1.168  22-Jul-2012  rmind Move some of the tests for MAKEENTRY into cache_enter(9). Make some
variables in vfs_cache.c static, __read_mostly, etc.

No objection on tech-kern@.
 1.167  21-Jul-2012  manu - Improve PUFFS_KFLAG_CACHE_FS_TTL by reclaiming older inactive nodes.

The normal kernel behavior is to retain inactive nodes in the freelist
until it runs out of vnodes. This has some merit for local filesystems,
where the cost of an allocation is about the same as the cost of a
lookup. But that situation is not true for distributed filesystems.
On the other hand, keeping inactive nodes for a long time holds memory
in the file server process, and when the kernel runs out of vnodes, it
produces reclaim avalanches that increase latency for other operations.

We do not reclaim inactive vnodes immediately either, as they may be
looked up again shortly. Instead we introduce a grace time and we
reclaim nodes that have been inactive beyond the grace time.

- Fix lookup/reclaim race condition.

The above improvement uncovered a race condition between lookup and
reclaim. If we reclaimed a vnode associated with a userland cookie while
a lookup returning that same cookie was in progress, then the kernel ends
up with a vnode associated with a cookie that has been reclaimed in
userland. Next operation on the cookie will crash (or at least confuse)
the filesystem.

We fix this by introducing a lookup count in kernel and userland. On
reclaim, the kernel sends the count, which enables userland to detect
the situation where it initiated a lookup that has not completed in the kernel.
In such a situation, the reclaim must be ignored, as the node is about
to be looked up again.
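
One way to picture the lookup-count check on the userland side is sketched below. The structure and function names (`sketch_node`, `sketch_lookup_reply()`, `sketch_should_reclaim()`) are hypothetical and the real kernel and library bookkeeping may differ; the sketch only shows the comparison described above.

```c
#include <stdbool.h>

/* Hypothetical userland per-node state, for illustration only. */
struct sketch_node {
	unsigned int lookups_sent;	/* lookup replies returning this cookie */
};

/* The server bumps the count each time a lookup reply returns the cookie. */
static void
sketch_lookup_reply(struct sketch_node *n)
{
	n->lookups_sent++;
}

/*
 * kernel_count is the lookup count the kernel sends with the reclaim.
 * If userland has replied to more lookups than the kernel has seen,
 * a lookup is still in flight and the reclaim must be ignored.
 */
static bool
sketch_should_reclaim(const struct sketch_node *n, unsigned int kernel_count)
{
	return kernel_count >= n->lookups_sent;
}
```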
 1.166  18-Apr-2012  manu - Make sure update_va does not change vnode size when it should not. For
instance when doing a fault-issued VOP_GETPAGES within VOP_WRITE, changing
size leads to panic: genfs_getpages: past eof.
- Handle ticks wrap around for vnode name and attribute timeout
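
A common way to make a tick-based timeout robust against counter wrap-around is to compare the wrapped difference rather than the raw values. The sketch below assumes an unsigned tick counter and a hypothetical helper `ticks_expired()`; it is not the code from this revision, and it only works while the deadline is less than half the counter range away.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: true once `now` has reached or passed `deadline`,
 * even if the tick counter wrapped between the two samples.
 * Unsigned subtraction wraps modulo 2^N, so a small difference means
 * "at or after the deadline" and a huge one means "still before it".
 */
static bool
ticks_expired(unsigned int now, unsigned int deadline)
{
	return (now - deadline) <= (unsigned int)INT_MAX;
}
```

A plain `now >= deadline` comparison would report a deadline set just before the wrap as not yet expired for the entire next cycle of the counter.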
 1.165  08-Apr-2012  manu Add name and attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
 1.164  16-Mar-2012  jakllsch Prevent access beyond end of PUFFS file on read,
similar to as is done for NFS.
 1.163  17-Jan-2012  martin branches: 1.163.2;
Add a few KASSERT() - I have a crash that likely will cause one of them to
fire...
 1.162  18-Nov-2011  christos branches: 1.162.4;
Obey MNT_RELATIME, the only addition is that mkdir in ufs sets IN_ACCESS too.
 1.161  30-Oct-2011  hannken branches: 1.161.2;
Add a comment that pn_sizemtx should be useless as VOP_GETATTR now
needs a shared lock at least.
 1.160  19-Oct-2011  manu Remove #ifdef DIAGNOSTIC guards around KASSERT, as the macro contains them
 1.159  18-Oct-2011  manu Make sure pagedaemon does not sleep for memory in puffs_vnop_sleep.
Add KASSERT on any sleeping memory allocation to check it cannot happen again.
 1.158  17-Oct-2011  manu Roll back the change that forced kernel threads to not sleep in PUFFS.
The change did not reach consensus, since only pagedaemon should need it.
Other threads will tolerate sleeping, and problems here are only symptoms
that something is going wrong in memory management. The cause, not the
symptoms, needs to be fixed.
 1.157  23-Sep-2011  manu Fix the build that was broken by struct lwp *updateproc reference in
RUMP-visible code. Instead of checking that updateproc (aka ioflush,
aka syncer) will not sleep in PUFFS code, I check for any kernel thread:
after all, none of them are designed to hang waiting for a remote filesystem
operation to complete.
 1.156  21-Sep-2011  manu Make sure ioflush does not sleep in PUFFS code path, waiting for a mutex,
a memory allocation, or a response from the filesystem.

This avoids deadlocks in the following situations:
1) when memory is low: ioflush waits for the filesystem, the filesystem waits
for memory
2) when the filesystem does not respond (e.g.: network outage on a
distributed filesystem)
 1.155  29-Aug-2011  manu Add a mutex for operations that touch size (setattr, getattr, write, fsync).

This is required to avoid data corruption bugs, where a getattr slices
itself within a setattr operation, and sets the size to the stale value
it got from the filesystem. That value is smaller than the one set by
setattr, and the call to uvm_vnp_setsize() triggered a spurious truncate.
The result is a chunk of zeroed data in the file.

Such a situation can easily happen when the ioflush thread issues a
VOP_FSYNC/puffs_vnop_sync/flushvncache/dosetattr while another process
does a sys_stat/VOP_GETATTR/puffs_vnop_getattr.

This mutex on size operations can be removed the day we decide VOP_GETATTR
has to operate on a locked vnode, since the other operations that touch
size already require that.
 1.154  04-Jul-2011  manu Add a flag to VOP_LISTEXTATTR(9) so that the vnode interface can tell the
filesystem in which format extended attributes shall be listed.

There are currently two formats:
- NUL-terminated strings, used for listxattr(2), this is the default.
- one byte length-prefixed, non NUL-terminated strings, used for
extattr_list_file(2), which is obtained by setting the
EXTATTR_LIST_PREFIXLEN flag to VOP_LISTEXTATTR(9)

This approach avoids the need for converting the list back and forth, except
in libperfuse, since FUSE uses NUL-terminated strings, and the kernel may
have requested EXTATTR_LIST_PREFIXLEN.
 1.153  12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.152  19-May-2011  rmind branches: 1.152.2;
Remove cache_purge(9) calls from reclamation routines in the file systems,
as vclean(9) performs it for us since Lite2 merge.
 1.151  03-May-2011  manu Call advlock method if supplied
 1.150  11-Jan-2011  kefren branches: 1.150.2;
add advlock to puffs. ok pooka@
should fix kern/43321
 1.149  30-Nov-2010  dholland Abolish the SAVENAME and HASBUF flags. There is now always a buffer,
so the path in a struct componentname is now always valid during VOP
calls.
 1.148  30-Nov-2010  dholland Abolish struct componentname's cn_pnbuf. Use the path buffer in the
pathbuf object passed to namei as work space instead. (For now a pnbuf
pointer appears in struct nameidata, to support certain unclean things
that haven't been fixed yet, but it will be going away in the future.)

This removes the need for the SAVENAME and HASBUF namei flags.
 1.147  14-Jul-2010  pooka RENAME lookup semantics say return EISDIR if dvp = *vpp for the
last component .... obviously(!!)
 1.146  24-Jun-2010  hannken Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
 1.145  21-May-2010  pooka Support extended attributes.
 1.144  29-Mar-2010  pooka Stop exposing fifofs internals and leave only fifo_vnodeop_p visible.
 1.143  27-Mar-2010  pooka \n, police!
 1.142  14-Jan-2010  pooka branches: 1.142.2; 1.142.4;
Since VOP_GETATTR() does not require a locked vnode, resolve and
reference the puffs_node before sending the request to the file
server. This diminishes the window where the inode can be reclaimed
and be invalidated before it is accessed (but does not completely
eliminate the race, as that is a caller problem which we cannot
fix here).
 1.141  04-Dec-2009  pooka Push all information cached in the vnode to the file server before
issuing INACTIVE. PR kern/42194.
Also, send setattr in fsync asynchronously if FSYNC_WAIT is not set.
 1.140  19-Nov-2009  pooka Send VOP_ABORTOP() in case attempting cross-dev rename, part of
PR kern/42210. Also, fix a memory management error in said case.
 1.139  19-Nov-2009  pooka Send VOP_ABORTOP() as a FAF -- we don't care about the return value.
 1.138  05-Nov-2009  pooka Kill suspend support. It was never implemented correctly:
* it depended on the biglock (in a very cruel way)
* it was attached to userspace transactions rather than logical
fs operations

(If someone wants to revisit it some day, most of the stuff can be
reused from cvs history)
 1.137  05-Nov-2009  pooka Reinstate PNODE_DYING. vmlocking had a brief hiatus when it was not
a valid optimization, but that's long gone and once VOP_INACTIVE is
called and the file server says that the vnode is going to be recycled,
it really is going to be recycled, extra references gained or not.
 1.136  17-Oct-2009  pooka Transmit VOP_ABORTOP() to the server.
 1.135  30-Sep-2009  pooka remove leading whitespace. no functional change.
 1.134  30-Sep-2009  pooka * fix a race i introduced almost two years ago in rev 1.116:
operations creating a node cannot be considered outgoing operations,
since after return from userspace they modify file system state
by creating a new node. if we do not protect the file system by
holding the directory lock, a lookup operation might race us into
the kernel and create the node earlier.
* remove pnode from hashlist before sending the reclaim faf off to
userspace. also, hold pmp_lock while frobbing the list.
 1.133  19-Sep-2009  pooka Set SAVENAME for rmdir and remove.

Addresses an easy part of PR kern/38188
 1.132  12-Sep-2009  tsutsui Fix typo:
- pcinfo = kmem_zalloc(sizeof_puffs_cacheinfo) + runsize,
+ pcinfo = kmem_zalloc(sizeof(struct puffs_cacheinfo) + runsize,
in #ifdef'ed out code, per paired kmem_free() in the same function.
Closes PR kern/41840.
 1.131  26-Nov-2008  pooka Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.
 1.130  16-Nov-2008  pooka more <sys/buf.h> police
 1.129  10-Sep-2008  christos branches: 1.129.2; 1.129.4; 1.129.8;
replace 0xa0 with space from Andy Shevchenko
 1.128  30-Jan-2008  ad branches: 1.128.6; 1.128.10; 1.128.12; 1.128.16;
Replace struct lock on vnodes with a simpler lock object built on
krwlock_t. This is a step towards removing lockmgr and simplifying
vnode locking. Discussed on tech-kern.
 1.127  28-Jan-2008  pooka For code clarity typedef void *puffs_cookie_t.

No functional change.
 1.126  25-Jan-2008  ad Remove VOP_LEASE. Discussed on tech-kern.
 1.125  02-Jan-2008  pooka More type-punning workarounds. Curiously the kernel compilation
flags cause gcc to not complain.
 1.124  02-Jan-2008  ad Merge vmlocking2 to head.
 1.123  30-Dec-2007  pooka namespace a bit: vfsops -> puffs_vfsop_x() and vops -> puffs_vnop_x()
 1.122  08-Dec-2007  pooka branches: 1.122.4;
Now that "l" is gone both as an argument to operations and from
componentname, remove all vestiges of puffs_cid.
 1.121  27-Nov-2007  pooka branches: 1.121.2;
Remove "puffs_cid" from the puffs interface following l-removal
from the kernel vfs interfaces. puffs_cc_getcaller(pcc) can be
used now should the same information be desired.
 1.120  26-Nov-2007  pooka Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.119  21-Nov-2007  pooka use BUF_ISREAD/WRITE instead of homegrown variants
 1.118  20-Nov-2007  pooka Retire M_PUFFS, use kmem(9) instead.
 1.117  17-Nov-2007  pooka Make puffs_updatenode() take a puffs_node instead of a vnode. This
way we don't need to worry if a vnode has been reclaimed from under
us.
 1.116  17-Nov-2007  pooka Start playing around with vnode locks. For now, do the very easy
thing and release locks before the userspace wait for operations
which release the lock before exit from the method in any case.
However, releasing the lock after inserting the request on the
operation queue gives us proper ordering possibilities in userspace
(at least if that bit were implemented, but I don't think there is
any file system in userspace that depends on kernel locking and
probably there never should be one).

inspired by a conversation with Nacho Navarro
 1.115  17-Nov-2007  pooka Implement a biodone callback for async writes similar to reads and
use that when possible.
 1.114  16-Nov-2007  pooka Restructure the messaging interface a bit more: make all interfacing
with the file server happen through puffs_msg_enqueue() and
puffs_msg_wait() instead of having a billion different routines.
Build the existing system upon these two. Most importantly though,
decouple insertion into the op queue from the actual wait. This
is useful for a number of reasons coming soon to a cvs repo near you.
 1.113  26-Oct-2007  pooka branches: 1.113.2;
Read/write can reuse message memory if operating uncached. This
will change eventually, but for now just appease a KASSERT by
resetting the message header to 0 after each loop.
 1.112  23-Oct-2007  pooka The kernel (genfs, uvm) can't deal with strategy returning an error
when vclean()ing. Pending an adventure to the genfs/vm labyrinth
to fix this properly, compensate here by not allowing unstrategic
(no pun) return values. They are always due to the userspace server
crashing anyway, so it's no big deal if we lie about the final
resting place of the pages.
 1.111  21-Oct-2007  pooka * release pathname buffer in link
* some variable massage
 1.110  19-Oct-2007  pooka When doing a read operation, don't copy the whole kernel buffer to
userspace, since it doesn't contain any information yet. I should
still rework this more so this is just a quickie to get the read/write
style interface more up to speed with the ioctl version.
 1.109  19-Oct-2007  pooka comment polish
 1.108  18-Oct-2007  pooka Fix wrong argument order which just happened to work by luck.
 1.107  11-Oct-2007  pooka branches: 1.107.2;
Part 1/n of some pretty extensive changes to how the kernel module
interacts with the userspace file server:

* since the kernel-user communication is not purely request-response
anymore (hasn't been since 2006), try to rename some "request" to
"message". more similar mangling will take place in the future.

* completely rework how messages are allocated. previously most of
them were borrowed from the stack (originally *all* of them),
but now always allocate dynamically. this makes the structure
of the code much cleaner. also makes it possible to fix a
locking order violation. it enables plenty of future enhancements.

* start generalizing the transport interface to be independent of puffs

* move transport interface to read/write instead of ioctl. the
old one had legacy design problems, and besides, ioctl's suck.
implement a very generic version for now; this will be
worked on later hopefully some day reaching "highly optimized".

* implement libpuffs support behind existing library request
interfaces. this will change eventually (I hate those interfaces)
 1.106  11-Oct-2007  pooka Cache vnode member variables necessary for operations after the
userspace call, namely our private mount structure, in the activation
record. This avoids problems in situations where the userspace
file server happens to die during our upcall and the vnode is
forcibly reclaimed before we roll back to the current stack frame.
 1.105  10-Oct-2007  ad Merge from vmlocking:

- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
 1.104  04-Oct-2007  pooka g/c the "sizeop" code previously used for ioctl/fcntl. It was already
commented out and has bitrotted beyond all recognition, so it needs
complete rethinking.
 1.103  02-Oct-2007  pooka If kernel resource allocation fails after the file server has
committed something, issue an abort. The abort is done through
the regular op channel, e.g. failed mkdir leads to regular rmdir,
inactive and reclaim. No internal interface is planned currently
for the one file system out of a million which would implement it
to benefit from the one case in a billion where kernel resource
allocation actually does fail and out of that one case in a trillion
where internal vs. external would make a difference.
 1.102  01-Oct-2007  pooka * better error checking: validate error values received from userland
to be valid errno values
* include string describing error in PUFFS_ERR
* get rid of union in puffs_req, it's nothing but trouble
* pass pmp to async i/o callbacks
 1.101  27-Sep-2007  pooka Differentiate between cookie2vnode returning an error and
return to caller, address unknown: no such cookie, no such node.
Make the callers use this info to either create a new vnode or bail.
 1.100  27-Sep-2007  pooka Add error notifications, which are used to deliver errors from the
kernel to the file server for silly things the file server did,
e.g. attempting to create a file with size VSIZENOTSET. The file
server can handle these as it chooses, but the default action is
for it to throw its hands in the air and sing "goodbye, cruel world,
it's over, walk on by".
 1.99  27-Sep-2007  pooka Fix a race in how new cookies are checked. Previously the checking
was done separately from inserting the cookie into the lookup structure
and without any form of interlock. This could lead to the same
cookie pointing to two different nodes. Remedy the race by creating
a separate "checked and ready to be inserted" cookie list which
serves as an interlock without having to hold a fs-global creation
lock.
 1.98  22-Aug-2007  pooka branches: 1.98.2; 1.98.4;
Mimic namei structure changes for puffs. bump both kernel & lib version.
 1.97  13-Aug-2007  pooka * don't call VOP_ACCESS in lookup, that's the file system's problem
* be more careful with r/o fs to catch EEXIST in lookup CREATE
* some comment polish
 1.96  12-Aug-2007  pooka enforce MNT_RDONLY
 1.95  30-Jul-2007  pooka branches: 1.95.4; 1.95.6;
properly setup ubcflags
 1.94  29-Jul-2007  ad It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
 1.93  27-Jul-2007  yamt ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly.
 1.92  27-Jul-2007  pooka Change unused fflags parameter in VOP_MMAP to prot and pass in
desired vm protection.
 1.91  22-Jul-2007  pooka use NULL, not 0, to pass a pointer
 1.90  22-Jul-2007  pooka Keep track of the maximum size we have supplied the file server (or
it has supplied us). If we fault pages which are at offset >= server
size, but less than the in-kernel vnode size, inform the file server
of the latest developments in file size before issuing the fault.
This avoids confusion with files which are not written start to finish.

fixes kern/36429 by yamt
 1.89  19-Jul-2007  pooka don't request more than the maximum request size in readdir
 1.88  09-Jul-2007  ad branches: 1.88.2;
s/pagedaemon_lwp/pagedaemon_proc/
 1.87  09-Jul-2007  ad Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.86  02-Jul-2007  pooka support turning REQUIREDIR off and extra consume in lookup
 1.85  02-Jul-2007  pooka Get rid of the "int *refs" parameter to inactive: the same can be
accomplished now with puffs_setbacks.
 1.84  01-Jul-2007  pooka loosen KASSERT: we can also fail due to ENOMEM
 1.83  01-Jul-2007  pooka Give the file server the ability to request the entire pathname buffer
under lookup by using PUFFS_KFLAG_LOOKUP_FULLPNBUF instead of just the
current component.
 1.82  01-Jul-2007  pooka Instead of supplying a plain pid, supply an abstract struct puffs_cid *,
which can currently be used to query the pid and lwpid.
 1.81  01-Jul-2007  pooka make puffs_cred an opaque type
 1.80  30-Jun-2007  pooka Fix logic flaw in KASSERT. Seems like my lkm wasn't compiled with
DIAGNOSTIC ...
 1.79  26-Jun-2007  pooka Simplify code, mainly vop_strategy. No functional change
 1.78  24-Jun-2007  pooka Split the NOCACHE option in twain: NOCACHE_NAME & NOCACHE_PAGE.
 1.77  21-Jun-2007  pooka Refactor the pnode2vnode translation slightly so that VFS_ROOT
can use it directly.
 1.76  06-Jun-2007  pooka Move puffs to a two clause license where it already isn't so. And
as agc pointed out, even files with the third clause were already
effectively two clause because of a slight bug in the language...
 1.75  06-Jun-2007  pooka In very verbose debug mode, print also return values for operations
(well, at least for those that go through checkop()).
 1.74  05-Jun-2007  yamt improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.
 1.73  01-Jun-2007  yamt \xa0 -> space.
 1.72  19-May-2007  pooka Actually, we do need separate "no references in file server" and
"noref + inactive" flags if we wish to correctly support unix open
file semantics and optimize away pre-reclaim cache flushes. So,
add PNODE_DYING which stands for norefs + inactive.
 1.71  18-May-2007  pooka Introduce noref setbacks, which the file server can use to signal
the kernel it has 0 references to the node in question. In other
words, this can be used to avoid inactive(), or, if the file server
does not implement inactive, prompt reclaim for removed nodes.
 1.70  18-May-2007  pooka selrecord() before calling userspace to avoid (very theoretical) race
where selinfo contains uninitialized garbage
 1.69  18-May-2007  pooka Support VOP_POLL. This requires some acrobatics on the puffs_node,
as we give a reference to userspace for the puffs_node for the
duration of the poll call. So reference count puffs_node separately
from the parent vnode. vref()/vrele() is not possible due to a possible
surprise visit from VOP_INACTIVE.
 1.68  15-May-2007  pooka In case strategy memory allocation for B_ASYNC|B_READ fails,
make sure to release the buf.
 1.67  08-May-2007  pooka Adventures in file systems, part (u_quad_t)-1: we can't use the
file system value for the size of device special files, as that
comes from specfs instead of the "host" file system. Therefore,
take care that getattr doesn't override the value of vp->v_size.
 1.66  07-May-2007  pooka Introduce puffs "setbacks", which can be used to set certain flags
for nodes upon return from the userspace. Currently it can be used
to indicate that the file server should be notified of "inactive"
in case the file server has opted to not receive inactive every
time the reference count for a vnode drops to zero. (inactive is
a common event, almost never requires any action and must be executed
synchronously, so it is wasteful).

While doing this, cleanup the release-relock nonsense from the
vntouser*() arguments. It was never enabled and the whole LOCKEDVP()
concept was very broken to begin with.
 1.65  06-May-2007  pooka If setattr is called explicitly, use that as the sign to flush out
all metadata info cached in the kernel while we're setattr'ing in
any case. Solves problems such as truncate (via extend-by-write)
+ chmod resulting in EPERM because the file was already read-only
when the actual truncate was flushed out of the kernel in fsync.
 1.64  24-Apr-2007  pooka If ubc style write fails, do not extend the file by zero-padding
it. It might be that the file server is either crashing or just
returning consistent errors. uiomove() would handle the error,
but if the pages weren't faulted in, memset() to the unfaultable
ubc window would cause a kernel page fault.
 1.63  22-Apr-2007  pooka Issue close to the file server asynchronously. We're not interested
in the return value.
 1.62  22-Apr-2007  pooka define PUFFS_KFLAG_WTCACHE, which makes the page cache write-through
 1.61  20-Apr-2007  pooka * in readdir, don't copy extra memory back and forth to userspace
* consistent usage of the variable argsize with the rest of the module
 1.60  20-Apr-2007  pooka Size of a readdir cookie is sizeof(**ap->a_cookies), not
sizeof(*ap->a_cookies). Fixes nfs readdir in the case that a
directory had lots of entries with short names.
 1.59  16-Apr-2007  pooka Give the file server the ability to specify the file handle length
instead of defining a static length file handle on the framework-level.
 1.58  11-Apr-2007  pooka * support VFS_FHTOVP and VFS_VPTOFH
* support cookies for VOP_READDIR

nfs exporting puffs file systems works now
 1.57  04-Apr-2007  pooka Make it possible to interrupt waiters for fs operation completion
again. This is useful until locking is further developed and basically
any deadlocks can be solved by killing appropriate processes.

Thanks especially to Tommi Kyntola and Antti Louko for sitting down
with me and discussing resource ownership and locking strategies
in implementing this.
 1.56  30-Mar-2007  pooka * abstract ASYNCBIOREAD and let callers freely issue a callback called
from putop. even though there's only one user currently, makes code
more readable
* move "delta" to a standard parameter in vntouser and get rid of the
specialcase vntouser_delta
 1.55  29-Mar-2007  pooka Convert spinlocks & sleep/wakeup to newlock2 locking stuff. Fix a
bunch of bugs.

* park structures are now always allocated from a pool instead of a
mixed stack/malloc allocation
* get rid of the whole adjbuf concept, always just alloc the maximal
amount of memory to satisfy a request
* little regression: don't allow interrupting wait from file system
to userspace; this had problems already before, but now the problems
really started to shine through. I'll try to make this work again
some day.
* fix bmap to return a sensible value in runp
 1.54  20-Mar-2007  pooka * rework the page cache interaction a bit: cache metadata in the
kernel and flush it out all at once instead of continuous updating
* add support for delivering notifications to the file server about
when a page was written to (but disabled by default for now). the
file server can use this to request flushing or invalidating the
kernel page cache
 1.53  14-Mar-2007  pooka branches: 1.53.2;
Support B_READ|B_ASYNC in strategy by calling biodone() directly
when the file server puts the result.
 1.52  20-Feb-2007  pooka branches: 1.52.4; 1.52.6;
Properly fix rev 1.44: limit error values from the file server to
positive values of errno and 0. Otherwise it can return internal values
such as EJUSTRETURN and screw things up.

thanks to Bill for reminding me to revisit this
 1.51  15-Feb-2007  pooka branches: 1.51.2;
Sanity-check linklen returned from file server in READLINK.
 1.50  10-Feb-2007  pooka * in write, do sync pageflush for the ubc case every 64k, otherwise
the user file server can't really keep up and just writing and writing
may result in kernel memory exhaustion. this lossage is also partially
due to the stupid way mtime + size info is handled currently, but that
should change soon (*knock knock* ;)
* score a few debug printfs
 1.49  09-Feb-2007  pooka honor B_ASYNC
 1.48  09-Feb-2007  pooka assign value for strategy output parameter b_resid instead of decreasing it
 1.47  08-Feb-2007  pooka If the file server doesn't support write, don't use genfs_null_putpages
for putpages, as it assumes a vnode doesn't have any pages. For
mounts using the page cache this is simply not true. Rather,
prevent opening a regular file in write-mode. That way a vnode
can never have dirty pages which would need to be flushed (i.e.
written).
 1.46  08-Feb-2007  pooka chuq shone arcane wisdom on me: b_bcount comes in, b_resid goes out
 1.45  08-Feb-2007  pooka Don't block and wait for file server response in case strategy is
run in pagedaemon context: it gives the file server way more control
over the fate of the entire kernel than what we're comfortable with.
 1.44  06-Feb-2007  pooka Limit errors from puffs_lookup to 0, EJUSTRETURN and ENOENT, as
that's what namei/lookup expects.
 1.43  29-Jan-2007  hannken Change fstrans enum types to upper case.
No functional change.

From Antti Kantee <pooka@netbsd.org>
 1.42  26-Jan-2007  pooka We don't handle fsync in checkop anymore, so direct the fifoop fsync
also to a place less panicy, namely fifo_fsync (because currently the
metadata information is updated when the node is changed. This will
probably change soon, though).
 1.41  26-Jan-2007  pooka Initial attempt at suspend/snapshot support for userspace file
servers. This is still pretty much on the level "if it breaks ...".
It should work for single-threaded servers which handle one operation
from start to finish in one go. Also, it does not yet totally
correctly synchronize metadata and data in some cases. So needless
to say, it needs improvement, but it is possible that will have to
wait for some lock revampage.
 1.40  25-Jan-2007  pooka if strategy fails, set bp->b_error and B_ERROR
 1.39  25-Jan-2007  pooka don't hold spinlocks (except vnode interlock) when doing vget()
 1.38  21-Jan-2007  pooka optimize a bit: don't flush pages for vnodes which have no references
in the kernel or links in the backend
 1.37  21-Jan-2007  pooka remove diagnostic printf
 1.36  19-Jan-2007  pooka hannken noted that the latest gcc (?) complains about uninitialized
variable use in puffs_strategy() for "dowritefaf" (incorrectly)
and "error" (correctly, although the function is practically of
type void)
 1.35  19-Jan-2007  pooka In case the fs server is in the kernel doing an operation on a
completely different file system, we still might re-enter the same
puffs fs in case we execute something on the other file system,
which wants to get a new vnode and ends up recycling a puffs vnode
for the purpose. In this case the fs server will sleep in the
kernel until it itself handles the operation .... which of course
is a slightly unlikely event.

After analyzing the path from getcleanvnode() to the vnode cemetery,
identify that fsync and putpages (strategy) are the ones in danger
of striking a deadlock deal. Abuse the vnode flag VXLOCK to tell
them "this vnode is irreversibly going to meet its maker, don't
care about user server return values" (failure is not acceptable
down the vgonel() path) and issue the respective operations as
Fire-And-Forget (FAF) operations. no wait -> no deadlock.

This of course is a "fix" skating on thin ice. A better, more
generic solution is already in sight, but will take more effort to
implement.
 1.34  16-Jan-2007  pooka * don't wait for the answer of VOP_RECLAIM, just fire-and-forget
* revoke puffs_revoke. we can deal with it just by calling genfs_revoke
 1.33  15-Jan-2007  pooka Store puffs_node's on lists hashed with the cookie value instead
of just one flat list.
 1.32  15-Jan-2007  pooka * do not accept the directory cookie as the result of a lookup (otherwise
we'd be locking against ourselves)
* do not accept duplicate cookies when creating new nodes
 1.31  11-Jan-2007  pooka Since fsync is really putpages + fsync, check for both separately
instead of using just putpages to decide the op's fate.

And the real beef in this commit is of course a tyop fix in a comment.
 1.30  09-Jan-2007  pooka Introduce flush operations, which the fs server can use to control
kernel caching. Currently supported are only flushing the name
cache for a directory or flushing the name cache for the entire fs.

Also, get rid of PNODE_INACTIVE status, since it was racy and
essentially didn't work. All this on top of being useless in the
first place ....
 1.29  07-Jan-2007  pooka getcwd wants eofflag - set eofflag in readdir if amount of data is 0
 1.28  02-Jan-2007  pooka In rename, tdvp == tvp holds if we are renaming a directory to "."
(XXX: for all the sense that makes). Deal with it gracefully here
for now.
 1.27  01-Jan-2007  pooka remove r/o mount check done also in vfs lookup()
 1.26  01-Jan-2007  pooka async update node metadata for spec- and fifoops
 1.25  01-Jan-2007  pooka properly handle VOP_REMOVE case where vp == dvp
 1.24  01-Jan-2007  pooka explicitly disable ioctl and fcntl for now - support has bitrotted
 1.23  30-Dec-2006  pooka branches: 1.23.2;
* use PUFFS_KFLAG_NOCACHE to also signal that we don't want the namecache
* enter files into the namecache immediately when new nodes are created
(if it's a caching mount, of course)
 1.22  09-Dec-2006  chs branches: 1.22.2;
a smorgasbord of improvements to vnode locking and path lookup:
- LOCKPARENT is no longer relevant for lookup(), relookup() or VOP_LOOKUP().
these now always return the parent vnode locked. namei() works as before.
lookup() and various other paths no longer acquire vnode locks in the
wrong order via vrele(). fixes PR 32535.
as a nice side effect, path lookup is also up to 25% faster.
- the above allows us to get rid of PDIRUNLOCK.
- also get rid of WANTPARENT (just use LOCKPARENT and unlock it).
- remove an assumption in layer_node_find() that all file systems implement
a recursive VOP_LOCK() (unionfs doesn't).
- require that all file systems supply vfs_vptofh and vfs_fhtovp routines.
fill in eopnotsupp() for file systems that don't support being exported
and remove the checks for NULL. (layerfs calls these without checking.)
- in union_lookup1(), don't change refcounts in the ISDOTDOT case, just
adjust which vnode is locked. fixes PR 33374.
- apply fixes for ufs_rename() from ufs_vnops.c rev. 1.61 to ext2fs_rename().
 1.21  07-Dec-2006  pooka let implementation ultimately decide if mmap is supported - pass
VOP_MMAP to fs server
 1.20  05-Dec-2006  pooka adjust file size in write only if file grows. but since this change is
in the "never use ubc" branch, I don't think it matters except for cosmetics.
 1.19  05-Dec-2006  pooka Allow multiple requests to be transferred in each GET/PUTOP. For
a single request, the performance is still the same.
 1.18  01-Dec-2006  pooka branches: 1.18.2;
prefix kernel flags with PUFFS_KFLAG to have a separate namespace
from the library flags
 1.17  01-Dec-2006  pooka don't call the fs server for all operations, only those it has told
us that it implements
 1.16  28-Nov-2006  pooka don't allow mmap if operating uncached
 1.15  18-Nov-2006  pooka Actually, for NOCACHE, use direct read/write instead of going through
page cache at all and invalidating. XXX: mmap
 1.14  18-Nov-2006  pooka branches: 1.14.2;
make puffs_strategy more robust
 1.13  18-Nov-2006  pooka Require statvfs info from startreq so that we have that info available.
Also, don't pass fsid to userspace and just fill it in the kernel.
 1.12  18-Nov-2006  pooka As a first generation best-effort hack, use NOCACHE to mean "file
size can change without the kernel knowing" and therefore query
the file size before invoking read or write operations.
 1.11  17-Nov-2006  pooka Introduce uncached operation, makes sense when the file system backend
can be modified from elsewhere than the file system interface
 1.10  13-Nov-2006  pooka No need to return a special value for CREATE/RENAME lookup, so just
handle ENOENT. If there's a real error, userspace will return
something else.
 1.9  08-Nov-2006  pooka update struct buf resid in strategy according to what was transferred.
seems like only nestiobuf complains when it wasn't updated ...
 1.8  07-Nov-2006  pooka attach to genfs & support page cache. most noticeable effect is
mmap and therefore execution of binaries starting to work, some
speed improvements with large file I/O also. caching semantics
and error case handling most likely need revisiting.
 1.7  27-Oct-2006  pooka Use spec_fsync for specops vop_fsync: it knows about vflushbuf(), which
is more than what puffs currently knows. makes e.g. ffs unmount for a
puffs-based device node work.
 1.6  27-Oct-2006  pooka support fifos
 1.5  26-Oct-2006  pooka support specfs
 1.4  26-Oct-2006  pooka Fix operations creating new nodes to honor the vnode locking protocol
if the userspace server returns an error. Fixes lockups if any
of the following operations failed: create, mknod, mkdir, symlink
 1.3  25-Oct-2006  pooka pass VOP_INACTIVE() to userspace
 1.2  23-Oct-2006  pooka fix print in VOP_PRINT

also make it compile on amd64. problem noticed by Blair Sadewitz
on current-users
 1.1  22-Oct-2006  pooka kernel portion of puffs - the Pass-to-Userspace Framework File System.
It contains the VFS attachment and userspace message-passing interface.

This work was initially started and completed for Google SoC 2005
and tweaked to work a bit better in the past few weeks. While
being far from complete, it is functional and stable enough to
host a fairly general-purpose in-memory file system in
userspace. Even so, puffs should be considered experimental and
no binary compatibility for interfaces or crash-freedom or zero
security implications should be relied upon just yet.

The GSoC project was mentored by William Studenmund and the final
review for the code was done by Christos.
 1.14.2.5  09-Feb-2007  ad Sync with HEAD.
 1.14.2.4  01-Feb-2007  ad Sync with head.
 1.14.2.3  12-Jan-2007  ad Sync with head.
 1.14.2.2  18-Nov-2006  ad Sync with head.
 1.14.2.1  18-Nov-2006  ad file puffs_vnops.c was added on branch newlock2 on 2006-11-18 21:39:20 +0000
 1.18.2.1  17-Feb-2007  tron Apply patch (requested by chs in ticket #422):
- Fix various deadlock problems with nullfs and unionfs.
- Speed up path lookups by up to 25%.
 1.22.2.2  10-Dec-2006  yamt sync with head.
 1.22.2.1  09-Dec-2006  yamt file puffs_vnops.c was added on branch yamt-splraiseipl on 2006-12-10 07:18:38 +0000
 1.23.2.8  04-Feb-2008  yamt sync with head.
 1.23.2.7  21-Jan-2008  yamt sync with head
 1.23.2.6  07-Dec-2007  yamt sync with head
 1.23.2.5  27-Oct-2007  yamt sync with head.
 1.23.2.4  03-Sep-2007  yamt sync with head.
 1.23.2.3  26-Feb-2007  yamt sync with head.
 1.23.2.2  30-Dec-2006  yamt sync with head.
 1.23.2.1  30-Dec-2006  yamt file puffs_vnops.c was added on branch yamt-lazymbuf on 2006-12-30 20:50:01 +0000
 1.51.2.5  17-May-2007  yamt sync with head.
 1.51.2.4  07-May-2007  yamt sync with head.
 1.51.2.3  15-Apr-2007  yamt sync with head.
 1.51.2.2  24-Mar-2007  yamt sync with head.
 1.51.2.1  27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.52.6.1  11-Jul-2007  mjf Sync with head.
 1.52.4.14  23-Oct-2007  ad Sync with head.
 1.52.4.13  12-Oct-2007  ad Sync with head.
 1.52.4.12  09-Oct-2007  ad Sync with head.
 1.52.4.11  09-Oct-2007  ad Sync with head.
 1.52.4.10  16-Sep-2007  ad Checkpoint work in progress on the vnode lifecycle and reference counting
stuff. This makes it work properly without kernel_lock and fixes a few
quite old bugs. See vfs_subr.c 1.283.2.17 for details.
 1.52.4.9  20-Aug-2007  ad Sync with HEAD.
 1.52.4.8  19-Aug-2007  ad - Back out the biodone() changes.
- Eliminate B_ERROR (from HEAD).
 1.52.4.7  15-Jul-2007  ad Sync with head.
 1.52.4.6  17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.52.4.5  09-Jun-2007  ad Sync with head.
 1.52.4.4  08-Jun-2007  ad Sync with head.
 1.52.4.3  10-Apr-2007  ad Sync with head.
 1.52.4.2  09-Apr-2007  ad - Add two new arguments to kthread_create1: pri_t pri, bool mpsafe.
- Fork kthreads off proc0 as new LWPs, not new processes.
 1.52.4.1  05-Apr-2007  ad Compile fixes.
 1.53.2.1  29-Mar-2007  reinoud Pullup to -current
 1.88.2.2  03-Sep-2007  skrll Sync with HEAD.
 1.88.2.1  15-Aug-2007  skrll Sync with HEAD.
 1.95.6.2  30-Jul-2007  pooka properly setup ubcflags
 1.95.6.1  30-Jul-2007  pooka file puffs_vnops.c was added on branch matt-mips64 on 2007-07-30 14:49:02 +0000
 1.95.4.9  09-Dec-2007  jmcneill Sync with HEAD.
 1.95.4.8  27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.95.4.7  21-Nov-2007  joerg Sync with HEAD.
 1.95.4.6  28-Oct-2007  joerg Sync with HEAD.
 1.95.4.5  26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.95.4.4  07-Oct-2007  joerg Sync with HEAD.
 1.95.4.3  02-Oct-2007  joerg Sync with HEAD.
 1.95.4.2  03-Sep-2007  jmcneill Sync with HEAD.
 1.95.4.1  16-Aug-2007  jmcneill Sync with HEAD.
 1.98.4.2  14-Oct-2007  yamt sync with head.
 1.98.4.1  06-Oct-2007  yamt sync with head.
 1.98.2.3  23-Mar-2008  matt sync with HEAD
 1.98.2.2  09-Jan-2008  matt sync with HEAD
 1.98.2.1  06-Nov-2007  matt sync with HEAD
 1.107.2.4  21-Nov-2007  bouyer Sync with HEAD
 1.107.2.3  18-Nov-2007  bouyer Sync with HEAD
 1.107.2.2  13-Nov-2007  bouyer Sync with HEAD
 1.107.2.1  25-Oct-2007  bouyer Sync with HEAD.
 1.113.2.4  18-Feb-2008  mjf Sync with HEAD.
 1.113.2.3  27-Dec-2007  mjf Sync with HEAD.
 1.113.2.2  08-Dec-2007  mjf Sync with HEAD.
 1.113.2.1  19-Nov-2007  mjf Sync with HEAD.
 1.121.2.2  26-Dec-2007  ad Sync with head.
 1.121.2.1  04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.122.4.2  08-Jan-2008  bouyer Sync with HEAD
 1.122.4.1  02-Jan-2008  bouyer Sync with HEAD
 1.128.16.2  13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.128.16.1  19-Oct-2008  haad Sync with HEAD.
 1.128.12.1  24-Sep-2008  wrstuden Merge in changes between wrstuden-revivesa-base-2 and
wrstuden-revivesa-base-3.
 1.128.10.4  11-Aug-2010  yamt sync with head.
 1.128.10.3  11-Mar-2010  yamt sync with head
 1.128.10.2  16-Sep-2009  yamt sync with head
 1.128.10.1  04-May-2009  yamt sync with head.
 1.128.6.2  17-Jan-2009  mjf Sync with HEAD.
 1.128.6.1  28-Sep-2008  mjf Sync with HEAD.
 1.129.8.1  21-Apr-2010  matt sync to netbsd-5
 1.129.4.11  02-Nov-2011  riz Pull up following revision(s) (requested by manu in ticket #1679):
sys/fs/puffs/puffs_vnops.c: revision 1.157
sys/fs/puffs/puffs_vnops.c: revision 1.158
sys/fs/puffs/puffs_vnops.c: revision 1.159
sys/fs/puffs/puffs_vfsops.c: revision 1.97
sys/fs/puffs/puffs_vfsops.c: revision 1.99
sys/fs/puffs/puffs_vnops.c: revision 1.160
sys/fs/puffs/puffs_vfsops.c: revision 1.100
sys/miscfs/syncfs/sync_subr.c: revision 1.47
sys/fs/puffs/puffs_node.c: revision 1.21
sys/fs/puffs/puffs_node.c: revision 1.22
sys/fs/puffs/puffs_msgif.c: revision 1.88
sys/fs/puffs/puffs_msgif.c: revision 1.89
sys/fs/puffs/puffs_vnops.c: revision 1.156
Make sure ioflush does not sleep in PUFFS code path, waiting for a mutex,
a memory allocation, or a response from the filesystem.
This avoids deadlocks in the following situations:
1) when memory is low: ioflush waits for the filesystem, the filesystem waits
for memory
2) when the filesystem does not respond (e.g.: network outage on a
distributed filesystem)
Fix the build that was broken by struct lwp *updateproc reference in
RUMP-visible code. Instead of checking that updateproc (aka ioflush,
aka syncer) will not sleep in PUFFS code, I check for any kernel thread:
after all none of them are designed to hang waiting for a remote filesystem
operation to complete.
Roll back the change that forced kernel threads to not sleep in PUFFS.
The change did not reach consensus, since only pagedaemon should need it.
Other threads will tolerate sleeping, and problems here are only symptoms
that something is going wrong in memory management. The cause, not the
symptoms, needs to be fixed.
Make sure pagedaemon does not sleep for memory in puffs_vnop_sleep.
Add KASSERT on any sleeping memory allocation to check it cannot happen again.
Remove #ifdef DIAGNOSTIC guards around KASSERT, as the macro contains them
 1.129.4.10  17-Sep-2011  bouyer Pull up following revision(s) (requested by manu in ticket #1666):
sys/fs/puffs/puffs_sys.h: revision 1.78 via patch
sys/fs/puffs/puffs_node.c: revision 1.20 via patch
sys/fs/puffs/puffs_vnops.c: revision 1.155 via patch
Add a mutex for operations that touch size (setattr, getattr, write, fsync).
This is required to avoid data corruption bugs, where a getattr slices
itself within a setattr operation, and sets the size to the stale value
it got from the filesystem. That value is smaller than the one set by
setattr, and the call to uvm_vnp_setsize() triggered a spurious truncate.
The result is a chunk of zeroed data in the file.
Such a situation can easily happen when the ioflush thread issues a
VOP_FSYNC/puffs_vnop_sync/flushvncache/dosetattrn while another process
does a sys_stat/VOP_GETATTR/puffs_vnop_getattr.
This mutex on size operation can be removed the day we decide VOP_GETATTR
has to operate on a locked vnode, since the other operations that touch
size already require that.
 1.129.4.9  17-Jul-2011  riz Pull up following revision(s) (requested by manu in ticket #1645):
lib/libc/sys/Makefile.inc 1.207 via patch
lib/libc/sys/extattr_get_file.2 patch
lib/libpuffs/dispatcher.c 1.34,1.36 via patch
lib/libpuffs/puffs.c 1.107 via patch
lib/libpuffs/puffs.h 1.115,1.118 via patch
sys/fs/puffs/puffs_msgif.h 1.71,1.76 via patch
sys/fs/puffs/puffs_vfsops.c 1.88 via patch
sys/fs/puffs/puffs_vnops.c 1.145,1.154 via patch
sys/kern/vfs_xattr.c 1.24-1.27 via patch
sys/kern/vnode_if.c 1.87 via patch
sys/sys/Makefile 1.133 via patch
sys/sys/extattr.h 1.6 via patch
sys/sys/vnode_if.h 1.81 via patch
sys/ufs/ffs/ffs_vnops.c patch
sys/ufs/ufs/ufs_extattr.c 1.31,1.34 via patch

* support extended attributes
* bump major due to structure growth
* add some spare space
* remove ABI silliness
Support extended attributes.
Fix multiple non compliances in our Linux-like extattr API, and make it
public so that it can be used.
Improve listxattr(2) a bit. It attempts to list both system and user
extended attributes, and it failed if the calling user did not have privilege
for reading system EA. Now we just list user EA and skip system EA if
reading them is not allowed.
Fix bug introduced in previous commit: Do not vrele() a vnode we did not
obtain.
Improve UFS1 extended attributes usability
- autocreate attribute backing file for new attributes
- autoload attributes when issuing extattrctl start
- when autoloading attributes, do not display garbage warning when looking
up entries that got ENOENT
Add a flag to VOP_LISTEXTATTR(9) so that the vnode interface can tell the
filesystem in which format extended attribute shall be listed.
There are currently two formats:
- NUL-terminated strings, used for listxattr(2), this is the default.
- one byte length-prefixed, non NUL-terminated strings, used for
extattr_list_file(2), which is obtained by setting the
EXTATTR_LIST_PREFIXLEN flag to VOP_LISTEXTATTR(9)
This approach avoids the need for converting the list back and forth, except
in libperfuse, since FUSE uses NUL-terminated strings, and the kernel may
have requested EXTATTR_LIST_PREFIXLEN.
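
The two list formats described above can be illustrated with a small conversion routine. This is a minimal sketch only: nul_to_prefixlen() is a hypothetical helper written for this note, not code from libperfuse or the kernel, and it assumes that every name fits in one length byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the NetBSD tree): convert a list of
 * NUL-terminated names ("a\0bc\0") into one-byte length-prefixed,
 * non NUL-terminated names ("\1a\2bc").
 * Returns the number of bytes written to dst, or (size_t)-1 if the last
 * name is not terminated, a name exceeds 255 bytes, or dst is too small.
 */
static size_t
nul_to_prefixlen(const char *src, size_t srclen,
    unsigned char *dst, size_t dstlen)
{
	size_t in = 0, out = 0;

	assert(src != NULL && dst != NULL);
	while (in < srclen) {
		const char *end = memchr(src + in, '\0', srclen - in);
		size_t len;

		if (end == NULL)
			return (size_t)-1;	/* last name not terminated */
		len = (size_t)(end - (src + in));
		if (len > 255 || dstlen - out < len + 1)
			return (size_t)-1;
		dst[out++] = (unsigned char)len;	/* length prefix */
		memcpy(dst + out, src + in, len);	/* name, without NUL */
		out += len;
		in += len + 1;	/* skip the name and its NUL */
	}
	return out;
}
```

Each name costs one extra byte in either format (the NUL or the length prefix), so in this sketch the converted list is always the same size as the input.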
 1.129.4.8  18-Jun-2011  bouyer Pull up following revision(s) (requested by manu in ticket #1623):
lib/libpuffs/puffs.c: revision 1.116 via patch
sys/fs/puffs/puffs_vnops.c: revision 1.151 via patch
Call advlock method if supplied
 1.129.4.7  16-Jan-2010  bouyer Pull up following revision(s) (requested by pooka in ticket #1244):
sys/fs/puffs/puffs_vnops.c: revision 1.142
Since VOP_GETATTR() does not require a locked vnode, resolve and
reference the puffs_node before sending the request to the file
server. This diminishes the window where the inode can be reclaimed
and be invalidated before it is accessed (but does not completely
eliminate the race, as that is a caller problem which we cannot
fix here).
 1.129.4.6  18-Dec-2009  snj Pull up following revision(s) (requested by pooka in ticket #1184):
sys/fs/puffs/puffs_vnops.c: revision 1.141 via patch
Push all information cached in the vnode to the file server before
issuing INACTIVE. PR kern/42194.
Also, send setattr in fsync asynchronously if FSYNC_WAIT is not set.
 1.129.4.5  28-Nov-2009  bouyer Pull up following revision(s) (requested by pooka in ticket #1154):
sys/fs/puffs/puffs_vnops.c: revision 1.140
Send VOP_ABORTOP() in case attempting cross-dev rename, part of
PR kern/42210. Also, fix a memory management error in said case.
 1.129.4.4  28-Nov-2009  bouyer Pull up following revision(s) (requested by pooka in ticket #1153):
sys/fs/puffs/puffs_vnops.c: revision 1.139
Send VOP_ABORTOP() as a FAF -- we don't care about the return value.
 1.129.4.3  18-Oct-2009  sborrill Pull up the following revision(s) (requested by pooka in ticket #1100):
lib/libpuffs/dispatcher.c: revision 1.33
lib/libpuffs/puffs.c: revision 1.99
lib/libpuffs/puffs.h: revision 1.111
sys/fs/puffs/puffs_msgif.h: revision 1.67 via patch
sys/fs/puffs/puffs_vnops.c: revision 1.136

Support VOP_ABORTOP() in puffs.
 1.129.4.2  03-Oct-2009  snj Pull up following revision(s) (requested by pooka in ticket #1042):
sys/fs/puffs/puffs_node.c: revision 1.14
sys/fs/puffs/puffs_vnops.c: revision 1.134
* fix a race i introduced almost two years ago in rev 1.116:
operations creating a node cannot be considered outgoing operations,
since after return from userspace they modify file system state
by creating a new node. if we do not protect the file system by
holding the directory lock, a lookup operation might race us into
the kernel and create the node earlier.
* remove pnode from hashlist before sending the reclaim faf off to
userspace. also, hold pmp_lock while frobbing the list.
 1.129.4.1  26-Sep-2009  snj Pull up following revision(s) (requested by pooka in ticket #1014):
sys/fs/puffs/puffs_vnops.c: revision 1.133
Set SAVENAME for rmdir and remove.
Addresses an easy part of PR kern/38188
 1.129.2.1  19-Jan-2009  skrll Sync with HEAD.
 1.142.4.5  31-May-2011  rmind sync with head
 1.142.4.4  05-Mar-2011  rmind sync with head
 1.142.4.3  03-Jul-2010  rmind sync with head
 1.142.4.2  30-May-2010  rmind sync with head
 1.142.4.1  16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.142.2.2  17-Aug-2010  uebayasi Sync with HEAD.
 1.142.2.1  30-Apr-2010  uebayasi Sync with HEAD.
 1.150.2.1  06-Jun-2011  jruoho Sync with HEAD.
 1.152.2.1  23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.161.2.5  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.161.2.4  16-Jan-2013  yamt sync with (a bit old) head
 1.161.2.3  30-Oct-2012  yamt sync with head
 1.161.2.2  23-May-2012  yamt sync with head.
 1.161.2.1  17-Apr-2012  yamt sync with head
 1.162.4.3  29-Apr-2012  mrg sync to latest -current.
 1.162.4.2  05-Apr-2012  mrg sync to latest -current.
 1.162.4.1  18-Feb-2012  mrg merge to -current.
 1.163.2.12  27-Feb-2015  martin Pull up following revision(s) (requested by manu in ticket #1260):
lib/libpuffs/puffs.3: revision 1.55,1.60
sys/fs/puffs/puffs_msgif.h: revision 1.84
lib/libperfuse/ops.c: revision 1.83
sys/fs/puffs/puffs_sys.h: revision 1.89
sys/fs/puffs/puffs_vfsops.c: revision 1.116
lib/libperfuse/perfuse.c: revision 1.36
sys/fs/puffs/puffs_vnops.c: revision 1.200-1.202

Use more markup. New sentence, new line. Bump date for previous.

Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE
FUSE filesystems do not expect to get metadata updates for [amc]time
and size, they update the values on their own after operations.

The PUFFS PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem, and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.

Update file size after write without metadata flush
If we do not use metadata flush, we must make sure the size is updated
in the filesystem after a write, otherwise the next GETATTR will get us
a stale value and the file will be truncated.
 1.163.2.11  16-Jan-2015  martin Pull up following revision(s) (requested by manu in ticket #1236):
sys/fs/puffs/puffs_vnops.c: revision 1.199
Make sure reads on empty files reach PUFFS filesystems
Sending a read through the page cache will get the operation
short-circuited. This is a problem with some filesystems that
expect to receive the read operation in order to update atime.
We fix that by bypassing the page cache when reading a file
with a size known to be zero.
 1.163.2.10  09-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #1187):
sys/fs/puffs/puffs_vnops.c: revision 1.198
PUFFS direct I/O cache fix
There are a few situations where we must take care of the cache if
direct I/O was enabled:
- if we do direct I/O for write but not for read, then any write must
invalidate the cache so that a reader gets the written data and not
the not-updated cache.
- if we used a vnode without direct I/O and it is enabled for writing,
we must flush the cache before completing the open operation, so that
the cached writes are not lost.
And at inactive time, we wipe direct I/O flags so that a new open
without direct I/O does not inherit direct I/O.
 1.163.2.9  09-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #1184):
sys/fs/puffs/puffs_vnops.c: revision 1.195
According to pooka@'s comment, a long time ago, VOP_STRATEGY could not
fail without taking down the kernel. It seems this is not the case
anymore, hence we can stop dropping errors in puffs_vnop_strategy()
Approved by pooka@
 1.163.2.8  09-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #1166):
sys/fs/puffs/puffs_vnops.c: revision 1.188-1.194
- If we truncate the file, make sure we zero-fill the end of the last
page, otherwise if the file is later truncated to a larger size
(creating a hole), that area will not return zeroes as it should.
- Use PRIx64 for printing offsets
- Improve zero-fill of last page after shrink fix:
1) do it only if the file is open for writing, otherwise we send write
requests to the FS on a file that has never been open.
2) do it inside existing if (vap->va_size != VNOVAL) block
- Restore LP64 fix that was removed by mistake
- Make this build again without debugging enabled; DPRINTF() can end up
as empty, and in an if conditional, you then need braces if that's the
only potential body.
- As is evidenced by several of our 32-bit MIPS ports, it's wrong to
print vsize_t with PRIx64 -- instead use our own PRIxVSIZE macro.
- Do the previous correctly...
 1.163.2.7  03-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #1152):
sys/fs/puffs/puffs_vnops.c: revision 1.186
PUFFS fixes for size update after write plus read/write sanity checks
- Always update kernel metadata cache for size when writing
This fixes situation where size update after appending to a file lagged
- Make read/write nilpotent when called with null size, as FFS does
- Return EFBIG instead of EINVAL for negative offsets, as FFS does
 1.163.2.6  03-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #1149):
sys/fs/puffs/puffs_node.c: revision 1.33
sys/fs/puffs/puffs_vnops.c: revision 1.185
When changing a directory content, update the ctime/mtime in kernel
cache, otherwise the updated ctime/mtime appears after the cached entry
expires.
 1.163.2.5  03-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #1140):
lib/libperfuse/ops.c 1.63-1.69
lib/libperfuse/perfuse.c 1.32-1.33
lib/libperfuse/perfuse_priv.h 1.32-1.34
lib/libperfuse/subr.c 1.20
lib/libpuffs/creds.c 1.16
lib/libpuffs/dispatcher.c 1.47
lib/libpuffs/puffs.h 1.125
lib/libpuffs/puffs_ops.3 1.37-1.38
lib/libpuffs/requests.c 1.24
sys/fs/puffs/puffs_msgif.h 1.81
sys/fs/puffs/puffs_sys.h 1.85
sys/fs/puffs/puffs_vnops.c 1.183
usr.sbin/perfused/msg.c 1.22
Bring libpuffs, libperfuse and perfused on par with -current:
- implement FUSE direct I/O
- remove useless code and warnings
- fix missing GETATTR bugs
- fix extended attribute get and list operations
 1.163.2.4  12-Aug-2012  martin Pull up following revision(s) (requested by manu in ticket #438):
lib/libperfuse/perfuse_priv.h: revision 1.31
sys/fs/puffs/puffs_msgif.h: revision 1.80
sys/fs/puffs/puffs_vnops.c: revision 1.171
lib/libpuffs/puffs_ops.3: revision 1.31
sys/fs/puffs/puffs_vnops.c: revision 1.172
sys/fs/puffs/puffs_vnops.c: revision 1.173
sys/fs/puffs/puffs_vnops.c: revision 1.174
usr.sbin/perfused/perfused.c: revision 1.24
sys/fs/puffs/puffs_sys.h: revision 1.80
sys/fs/puffs/puffs_sys.h: revision 1.81
sys/fs/puffs/puffs_sys.h: revision 1.82
lib/libperfuse/subr.c: revision 1.19
lib/libperfuse/perfuse.c: revision 1.30
sys/fs/puffs/puffs_msgif.c: revision 1.90
sys/fs/puffs/puffs_msgif.c: revision 1.91
sys/fs/puffs/puffs_msgif.c: revision 1.92
lib/libperfuse/ops.c: revision 1.59
lib/libpuffs/puffs.3: revision 1.53
lib/libperfuse/debug.c: revision 1.12
lib/libpuffs/puffs.3: revision 1.54
sys/fs/puffs/puffs_vnops.c: revision 1.167
sys/fs/puffs/puffs_msgif.h: revision 1.79
usr.sbin/perfused/msg.c: revision 1.21
sys/fs/puffs/puffs_vfsops.c: revision 1.102
sys/fs/puffs/puffs_vfsops.c: revision 1.103
sys/fs/puffs/puffs_vfsops.c: revision 1.105
lib/libpuffs/puffs.h: revision 1.123
lib/libperfuse/perfuse_if.h: revision 1.20
lib/libperfuse/perfuse.c: revision 1.29
lib/libpuffs/dispatcher.c: revision 1.42
lib/libpuffs/dispatcher.c: revision 1.43
- Fix same vnodes associated with multiple cookies
The scheme used to retrieve known nodes on lookup was flawed, as it only
used parent and name. This produced a different cookie for the same file
if it was renamed, when looking up ../ or when dealing with multiple files
associated with the same name through link(2).
We therefore abandon the use of node name and introduce hashed lists of
inodes. This causes a huge rewrite of reclaim code, which does not attempt
to keep parents allocated until all their children are reclaimed
- Fix race conditions in reclaim
There are a few situations where we issue multiple FUSE operations for
a PUFFS operation. On reclaim, we therefore have to wait for all FUSE
operations to complete, not just the current exchanges. We do this by
introducing node reference count with node_ref() and node_rele().
- Detect data loss caused by FAF
VOP_PUTPAGES causes FAF writes where the kernel does not check the
operation result. At least issue a warning on error.
- Enjoy FAF shortcut on setattr
No need to wait for the result if the kernel does not want it. There is
however an exception for setattr that touches the size, we need to wait
for completion because we have other operations queued for after the
resize.
- Fix fchmod() on write-open file
fchmod() on a node open with write privilege will send setattr with both mode
and size set. This confuses some FUSE filesystems. Therefore we send two FUSE
operations, one for mode, and one for size.
- Remove node TTL handling for netbsd-5 for simplicity's sake. The code
still builds on netbsd-5 but does not have the node TTL feature anymore.
It works fine with kernel support on netbsd-6.
- Improve PUFFS_KFLAG_CACHE_FS_TTL by reclaiming older inactive nodes.
The normal kernel behavior is to retain inactive nodes in the freelist
until it runs out of vnodes. This has some merit for local filesystems,
where the cost of an allocation is about the same as the cost of a
lookup. But that situation is not true for distributed filesystems.
On the other hand, keeping inactive nodes for a long time holds memory
in the file server process, and when the kernel runs out of vnodes, it
produces reclaim avalanches that increase latency for other operations.
We do not reclaim inactive vnodes immediately either, as they may be
looked up again shortly. Instead we introduce a grace time and we
reclaim nodes that have been inactive beyond the grace time.
- Fix lookup/reclaim race condition.
The above improvement uncovered a race condition between lookup and
reclaim. If we reclaimed a vnode associated with a userland cookie while
a lookup returning that same cookie was in progress, then the kernel ends
up with a vnode associated with a cookie that has been reclaimed in
userland. Next operation on the cookie will crash (or at least confuse)
the filesystem.
We fix this by introducing a lookup count in kernel and userland. On
reclaim, the kernel sends the count, which enables userland to detect
situations where it initiated a lookup that is not completed in kernel.
In such a situation, the reclaim must be ignored, as the node is about
to be looked up again.
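
The lookup count scheme described above can be sketched from the userland side as follows. This is an illustration under assumptions, not the libpuffs or libperfuse code: struct node, nlookup, node_lookup_reply() and node_reclaim() are hypothetical names, and the exact accounting in the real implementation may differ.

```c
#include <assert.h>

/* Hypothetical userland node; the field name is illustrative. */
struct node {
	unsigned int nlookup;	/* lookups answered by userland for this cookie */
};

/* Userland answers a lookup that returns this node. */
static void
node_lookup_reply(struct node *n)
{
	n->nlookup++;
}

/*
 * The kernel asks to reclaim the node and reports how many lookups it
 * has completed for it. Returns 1 if the node may be freed, or 0 if the
 * reclaim must be ignored because a lookup is still in flight.
 */
static int
node_reclaim(struct node *n, unsigned int kernel_nlookup)
{
	/* The kernel cannot have completed more lookups than we answered. */
	assert(kernel_nlookup <= n->nlookup);
	n->nlookup -= kernel_nlookup;
	return n->nlookup == 0;
}
```

If userland has answered two lookups but the kernel reports only one at reclaim time, the remaining count is nonzero and the reclaim is ignored in this sketch.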
Fix hang unmount bug introduced by last commit.
We introduced a slow queue for delayed reclaims, while the existing
queue for unmount, flush and exit has been renamed fast queue. Both
queues had timestamp for when an operation should be done, but it was
useless for the fast queue, which is always used to run an operation
ASAP. And the timestamp test had an error that turned ASAP into "at next
tick", but nobody was there to wake the thread at next tick, hence
the hang. The fix is to remove the useless and buggy timestamp test for
fast queue.
Rename slow sopreq queue into node sopreq queue, to reflect the fact that
it is only intended for postponed node reclaims.
When purging the node sopreq queue, do not call puffs_msg_sendresp(), as
it makes no sense.
Fix race condition between (create|mknod|mkdir|symlink) and reclaim, just
like we did it between lookup and reclaim.
Missing bit in previous commit (prevent race between create|mknod|mkdir|symlink
and reclaim)
Bump date for previous.
New sentence, new line; remove trailing whitespace; fix typos;
punctuation nits.
Add PUFFS_KFLAG_CACHE_DOTDOT so that vnodes hold a reference on their
parent, keeping them active, and allowing to lookup .. without sending
a request to the filesystem.
Enable the feature for perfused, as this is how FUSE works.
Missing bit in previous commit (PUFFS_KFLAG_CACHE_DOTDOT option to avoid
looking up ..)
 1.163.2.3  12-Aug-2012  martin Pull up following revision(s) (requested by manu in ticket #484):
sys/fs/nilfs/nilfs_vnops.c: revision 1.18
sys/ufs/ufs/ufs_lookup.c: revision 1.117
sys/nfs/nfs_vnops.c: revision 1.295
sys/ufs/chfs/chfs_vnops.c: revision 1.8
sys/ufs/ext2fs/ext2fs_lookup.c: revision 1.70
sys/fs/unionfs/unionfs_vnops.c: revision 1.6
sys/kern/vfs_cache.c: revision 1.89
sys/fs/efs/efs_vnops.c: revision 1.26
sys/fs/hfs/hfs_vnops.c: revision 1.26
sys/fs/adosfs/adlookup.c: revision 1.16
sys/fs/puffs/puffs_vnops.c: revision 1.168
sys/fs/tmpfs/tmpfs_vnops.c: revision 1.98
sys/fs/ntfs/ntfs_vnops.c: revision 1.52
sys/fs/cd9660/cd9660_lookup.c: revision 1.20
sys/fs/msdosfs/msdosfs_lookup.c: revision 1.24
sys/fs/smbfs/smbfs_vnops.c: revision 1.80
sys/fs/udf/udf_vnops.c: revision 1.72
sys/fs/filecorefs/filecore_lookup.c: revision 1.14
sys/fs/puffs/puffs_node.c: revision 1.25
Move some of the tests for MAKEENTRY into cache_enter(9). Make some
variables in vfs_cache.c static, __read_mostly, etc.
No objection on tech-kern@.
 1.163.2.2  23-Apr-2012  riz Pull up following revision(s) (requested by manu in ticket #195):
lib/libskey/skeysubr.c: revision 1.27
lib/libkvm/kvm_getloadavg.c: revision 1.11
lib/libwrap/update.c: revision 1.9
lib/liby/yyerror.c: revision 1.9
lib/libpuffs/puffs_ops.3: revision 1.30
lib/libwrap/misc.c: revision 1.10
lib/libwrap/hosts_access.c: revision 1.20
lib/libpuffs/pnode.c: revision 1.11
lib/libperfuse/subr.c: revision 1.17
lib/libpuffs/pnode.c: revision 1.12
lib/libperfuse/subr.c: revision 1.18
lib/libwrap/options.c: revision 1.15
lib/libwrap/fix_options.c: revision 1.11
lib/libperfuse/ops.c: revision 1.52
lib/libperfuse/ops.c: revision 1.53
lib/libperfuse/ops.c: revision 1.54
lib/libwrap/hosts_ctl.c: revision 1.5
lib/libintl/gettext.c: revision 1.27
lib/libwrap/shell_cmd.c: revision 1.6
lib/libpuffs/dispatcher.c: revision 1.39
lib/libperfuse/perfuse_priv.h: revision 1.27
lib/libwrap/socket.c: revision 1.19
lib/libpuffs/puffs.3: revision 1.50
lib/libperfuse/perfuse_priv.h: revision 1.28
lib/libpuffs/puffs_priv.h: revision 1.45
lib/libpuffs/puffs.3: revision 1.51
lib/libperfuse/perfuse_priv.h: revision 1.29
lib/libwrap/percent_x.c: revision 1.5
lib/libpuffs/puffs.3: revision 1.52
lib/libperfuse/debug.c: revision 1.11
sys/fs/puffs/puffs_vnops.c: revision 1.165
lib/libwrap/tcpd.h: revision 1.13
sys/fs/puffs/puffs_vnops.c: revision 1.166
lib/libwrap/eval.c: revision 1.7
sys/fs/puffs/puffs_msgif.h: revision 1.78
sys/fs/puffs/puffs_vfsops.c: revision 1.101
lib/libwrap/rfc931.c: revision 1.9
lib/libwrap/clean_exit.c: revision 1.5
lib/libpuffs/puffs.h: revision 1.120
lib/libc/stdlib/jemalloc.c: revision 1.27
lib/librmt/rmtlib.c: revision 1.26
lib/libpuffs/puffs.h: revision 1.121
sys/fs/puffs/puffs_sys.h: revision 1.79
lib/librumpclient/rumpclient.c: revision 1.48
lib/libwrap/refuse.c: revision 1.5
lib/libperfuse/perfuse.c: revision 1.26
lib/libperfuse/perfuse.c: revision 1.27
tests/fs/puffs/t_fuzz.c: revision 1.5
lib/libperfuse/perfuse.c: revision 1.28
lib/libpuffs/dispatcher.c: revision 1.40
sys/fs/puffs/puffs_node.c: revision 1.24
lib/libwrap/diag.c: revision 1.9
lib/libintl/textdomain.c: revision 1.13
Use C89 function definition
Add name and attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
Add PUFFS_KFLAG_CACHE_FS_TTL flag to puffs_init(3) to use name and
attribute cache with filesystem provided TTL.
lookup, create, mknod, mkdir, symlink, getattr and setattr messages
have been extended so that attributes and their TTL can be provided
by the filesystem. lookup, create, mknod, mkdir, and symlink messages
are also extended so that the filesystem can provide name TTL.
The filesystem updates attributes and TTL using
puffs_pn_getvap(3), puffs_pn_getvattl(3), and puffs_pn_getcnttl(3)
Use new PUFFS_KFLAG_CACHE_FS_TTL option to puffs_init(3) so that
FUSE TTL on name and attributes are used. This saves many PUFFS
operations and improves performance.
PUFFS_KFLAG_CACHE_FS_TTL is #ifdef'ed in many places for now so that
libperfuse can still be used on netbsd-5.
Split file system.
Comma fixes.
Remove dangling "and".
Bump date for previous.
- Make sure update_va does not change vnode size when it should not. For
instance when doing a fault-issued VOP_GETPAGES within VOP_WRITE, changing
size leads to panic: genfs_getpages: past eof.
- Handle ticks wraparound for vnode name and attribute timeout
- When using PUFFS_KFLAG_CACHE_FS_TTL, do not use puffs_node to carry
attribute and TTL for a newly created node. Instead extend puffs_newinfo
and add puffs_newinfo_setva() and puffs_newinfo_setttl()
- Remove node_mk_common_final in libperfuse. It used to set uid/gid for
a newly created vnode but has been made redundant a long time ago since
uid and gid are properly set in FUSE header.
- In libperfuse, check for corner case where opc = 0 on INACTIVE and RECLAIM
(how is it possible? Check for it to avoid a crash anyway)
- In libperfuse, make sure we unlimit RLIMIT_AS and RLIMIT_DATA so that
we do not run out of memory because the kernel is lazy at reclaiming vnodes.
- In libperfuse, cleanup style of perfuse_destroy_pn()
Do not set PUFFS_KFLAG_CACHE_FS_TTL for PUFFS tests
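
For the ticks wraparound item in the list above, one common way to compare a free-running tick counter against a timeout is modular subtraction. This is a generic sketch, not the puffs code: ticks_expired() is a hypothetical name, and a 32-bit unsigned counter is assumed for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustration only: wraparound-safe "has the deadline passed?" test for
 * a free-running 32-bit tick counter. The subtraction is done modulo
 * 2^32; the deadline counts as passed when the difference falls in the
 * lower half of the range, i.e. the deadline is at most 2^31 - 1 ticks
 * in the past.
 */
static int
ticks_expired(uint32_t now, uint32_t deadline)
{
	return (uint32_t)(now - deadline) < UINT32_C(0x80000000);
}
```

A plain "now >= deadline" comparison gives the wrong answer once the counter wraps past zero, which the modular form avoids as long as timeouts stay below half the counter range.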
 1.163.2.1  03-Apr-2012  riz Pull up following revision(s) (requested by jakllsch in ticket #154):
sys/fs/puffs/puffs_vnops.c: revision 1.164
Prevent access beyond end of PUFFS file on read,
similar to as is done for NFS.
 1.174.2.3  03-Dec-2017  jdolecek update from HEAD
 1.174.2.2  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.174.2.1  20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.176.2.1  18-May-2014  rmind sync with head
 1.181.2.1  10-Aug-2014  tls Rebase.
 1.182.2.13  27-Feb-2015  martin Pull up following revision(s) (requested by manu in ticket #555):
lib/libpuffs/puffs.3: revision 1.60
sys/fs/puffs/puffs_msgif.h: revision 1.84
lib/libperfuse/ops.c: revision 1.83
sys/fs/puffs/puffs_sys.h: revision 1.89
sys/fs/puffs/puffs_vfsops.c: revision 1.116
lib/libperfuse/perfuse.c: revision 1.36
sys/fs/puffs/puffs_vnops.c: revision 1.200-1.202

Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE

FUSE filesystems do not expect to get metadata updates for [amc]time
and size, they update the values on their own after operations.

The PUFFS PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem, and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.

Update file size after write without metadata flush
If we do not use metadata flush, we must make sure the size is updated
in the filesystem after a write, otherwise the next GETATTR will get us
a stale value and the file will be truncated.
 1.182.2.12  17-Jan-2015  martin Pull up following revision(s) (requested by manu in ticket #423):
sys/fs/puffs/puffs_vnops.c: revision 1.199
Make sure reads on empty files reach PUFFS filesystems
Sending a read through the page cache will get the operation
short-circuited. This is a problem with some filesystems that
expect to receive the read operation in order to update atime.
We fix that by bypassing the page cache when reading a file
with a size known to be zero.
 1.182.2.11  09-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #194):
sys/fs/puffs/puffs_vnops.c: revision 1.197
sys/fs/puffs/puffs_node.c: revision 1.35
Fix PUFFS node use-after-reclaim
When puffs_cookie2vnode() misses an entry, vcache_get()
creates a new node (puffs_vfsop_loadvnode being called to
initialize the PUFFS part), then it discovers it is VNON,
and tries to vrele() it. vrele() calls VOP_INACTIVE(),
which led us in puffs_vnop_inactive() where we sent a
request to the filesystem for a node that already had been
reclaimed.
The fix is to check for VNON nodes in puffs_vnop_inactive()
and to return without doing anything. This is suboptimal, but
a better workaround would probably need to modify vcache API,
with an impact on other filesystems. Let us keep it simple.
 1.182.2.10  09-Nov-2014  msaitoh Pull up following revision(s) (requested by manu in ticket #193):
sys/fs/puffs/puffs_vnops.c: revision 1.198
PUFFS direct I/O cache fix
There are a few situations where we must take care of the cache if
direct I/O was enabled:
- if we do direct I/O for write but not for read, then any write must
invalidate the cache so that a reader gets the written data and not
the not-updated cache.
- if we used a vnode without direct I/O and it is enabled for writing,
we must flush the cache before completing the open operation, so that
the cached writes are not lost.
And at inactive time, we wipe direct I/O flags so that a new open
without direct I/O does not inherit direct I/O.
 1.182.2.9  05-Nov-2014  snj Pull up following revision(s) (requested by manu in ticket #182):
sys/fs/puffs/puffs_vnops.c: revision 1.195
According to pooka@'s comment, a long time ago, VOP_STRATEGY could not
fail without taking down the kernel. It seems this is not the case
anymore, hence we can stop dropping errors in puffs_vnop_strategy()
Approved by pooka@
 1.182.2.8  05-Nov-2014  snj Pull up following revision(s) (requested by manu in ticket #181):
lib/libperfuse/fuse.h: revision 1.6
lib/libperfuse/ops.c: revision 1.78
lib/libperfuse/perfuse.c: revision 1.35
lib/libperfuse/perfuse_priv.h: revision 1.36
lib/libpuffs/dispatcher.c: revision 1.48
lib/libpuffs/opdump.c: revision 1.37
lib/libpuffs/puffs.c: revision 1.118
lib/libpuffs/puffs.h: revision 1.126
lib/libpuffs/puffs_ops.3: revisions 1.40-1.41
sys/fs/puffs/puffs_msgif.h: revision 1.82-1.83
sys/fs/puffs/puffs_msgif.h: revision 1.82
sys/fs/puffs/puffs_vnops.c: revision 1.196
Add PUFFS support for fallocate and fdiscard operations
--
libpuffs support for fallocate and fdiscard operations
--
Add PUFFS_HAVE_FALLOCATE in puffs_msgif.h so that filesystem can decide
at build time whether fallocate is usable
--
FUSE fallocate support
There seems to be no fdiscard FUSE operation at the moment, hence that
one is left unused.
 1.182.2.7  14-Oct-2014  martin Pull up revisions 1.192-1.194: fix debug printf formatting and make
it compile without debugging enabled.
 1.182.2.6  13-Oct-2014  martin Pull up following revision(s) (requested by manu in ticket #136):
sys/fs/puffs/puffs_vnops.c: revision 1.189-1.191
If we truncate a file open for writing, make sure we zero-fill the end
of the last page, otherwise if the file is later truncated to a larger
size (creating a hole), that area will not return zeroes as it should.
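
The region that has to be zero-filled after such a truncate is the tail of the page containing the new end of file. A minimal sketch of that arithmetic follows; tail_zero_range() is a hypothetical helper for illustration, not the code in puffs_vnops.c, and it assumes a power-of-two page size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustration only: given the new (smaller) file size and the page
 * size, compute the byte range [*start, *end) between the new EOF and
 * the end of the page containing it. pagesize must be a power of two.
 * Returns 0 if the new size is page-aligned and nothing needs zeroing,
 * 1 otherwise.
 */
static int
tail_zero_range(uint64_t newsize, uint64_t pagesize,
    uint64_t *start, uint64_t *end)
{
	uint64_t off;

	assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);

	off = newsize & (pagesize - 1);	/* offset within the last page */
	if (off == 0)
		return 0;
	*start = newsize;
	*end = newsize + (pagesize - off);
	return 1;
}
```

For example, truncating to 5000 bytes with 4096-byte pages leaves bytes 5000 through 8191 of the second page to be zeroed, so that extending the file later exposes zeroes rather than stale data.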
 1.182.2.5  30-Sep-2014  martin Pull up following revision(s) (requested by hannken in ticket #67):
sys/fs/puffs/puffs_node.c: revision 1.34
sys/fs/puffs/puffs_vnops.c: revision 1.187
Fix the puffs_sop_thread -> puffs_cookie2vnode path:
- pass the cookie by reference
- add missing mutex_exit()
- update assertion for VNON typed vnodes
 1.182.2.4  11-Sep-2014  martin Pull up following revision(s) (requested by manu in ticket #93):
sys/fs/puffs/puffs_vnops.c: revision 1.186
PUFFS fixes for size update after write plus read/write sanity checks
- Always update kernel metadata cache for size when writing
This fixes situation where size update after appending to a file lagged
- Make read/write nilpotent when called with null size, as FFS does
- Return EFBIG instead of EINVAL for negative offsets, as FFS does
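
The two sanity checks listed above amount to a short pre-check at the top of the read and write paths. The following is a sketch under assumptions: rw_precheck() and its return convention are hypothetical, and the order of the two tests is a guess rather than something the log states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical pre-check modelled on the commit message. Returns 1 to
 * proceed with the I/O, 0 to return success without doing anything, or
 * a negated errno value for invalid arguments.
 */
static int
rw_precheck(int64_t offset, size_t resid)
{
	if (resid == 0)
		return 0;	/* zero-length request: nothing to do */
	if (offset < 0)
		return -EFBIG;	/* per the log, EFBIG rather than EINVAL */
	return 1;
}
```

With this ordering, a zero-length request succeeds even at a negative offset; whether the real code behaves the same way in that corner case is not stated in the log.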
 1.182.2.3  10-Sep-2014  martin Pull up following revision(s) (requested by manu in ticket #79):
sys/fs/puffs/puffs_node.c: revision 1.33
sys/fs/puffs/puffs_vnops.c: revision 1.185
When changing a directory content, update the ctime/mtime in kernel
cache, otherwise the updated ctime/mtime appears after the cached
entry expires.
 1.182.2.2  29-Aug-2014  martin Pull up following revision(s) (requested by hannken in ticket #67):
sys/fs/puffs/puffs_sys.h: revision 1.86
sys/fs/puffs/puffs_vfsops.c: revision 1.114
sys/fs/puffs/puffs_msgif.c: revision 1.95
sys/fs/puffs/puffs_node.c: revision 1.32
sys/fs/puffs/puffs_vnops.c: revision 1.184
Change puffs from hashlist to vcache.
- field "pa_nhashbuckets" of struct "puffs_kargs" becomes a no-op.
and should be removed on the next protocol version bump.
 1.182.2.1  26-Aug-2014  riz Pull up following revision(s) (requested by manu in ticket #52):
sys/fs/puffs/puffs_msgif.h: revision 1.81
sys/fs/puffs/puffs_sys.h: revision 1.85
sys/fs/puffs/puffs_vnops.c: revision 1.183
Add a oflags input field to open requests so that the filesystem can pass
back information about the file. Implement PUFFS_OPEN_IO_DIRECT, which
will force direct IO (bypassing page cache) for the file.
 1.198.2.5  28-Aug-2017  skrll Sync with HEAD
 1.198.2.4  05-Oct-2016  skrll Sync with HEAD
 1.198.2.3  09-Jul-2016  skrll Sync with HEAD
 1.198.2.2  06-Jun-2015  skrll Sync with HEAD
 1.198.2.1  06-Apr-2015  skrll Sync with HEAD
 1.204.2.2  26-Apr-2017  pgoyette Sync with HEAD
 1.204.2.1  26-Jul-2016  pgoyette Sync with HEAD
 1.205.2.1  21-Apr-2017  bouyer Sync with HEAD
 1.211.10.2  08-Apr-2020  martin Merge changes from current as of 20200406
 1.211.10.1  10-Jun-2019  christos Sync with HEAD
 1.211.8.1  26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.211.2.1  06-Nov-2018  martin Pull up following revision(s) (requested by manu in ticket #1082):

sys/fs/puffs/puffs_vnops.c: revision 1.213

Fix use after RECLAIM in PUFFS filesystems

From hannken@

When puffs_cookie2vnode() misses an entry and vrele()s the vnode, the
operations puffs_vnop_reclaim() and puffs_vnop_fsync() get called with a
VNON vnode.

Do not notify the server in this case as the cookie is stale.
 1.213.6.1  29-Feb-2020  ad Sync with head.
 1.214.4.1  25-Apr-2020  bouyer Sync with bouyer-xenpvh-base2 (HEAD)
 1.217.6.1  01-Aug-2021  thorpej Sync with HEAD.
