History log of /src/sys/miscfs/specfs
Revision  Date  Author  Comments
 1.1 12-Jun-1998  cgd Rework the way kernel include files are installed. In the new method,
as with user-land programs, include files are installed by each directory
in the tree that has includes to install. (This allows more flexibility
as to what gets installed, makes 'partial installs' easier, and gives us
more options as to which machines' includes get installed at any given
time.) The old SYS_INCLUDES={symlinks,copies} behaviours are _both_
still supported, though at least one bug in the 'symlinks' case is
fixed by this change. Include files can't be built before installation,
so directories that have includes as targets (e.g. dev/pci) have to move
those targets into a different Makefile.
 1.219 06-Jan-2025  mlelstv Use correct type function for block and character devices.
Use DIOCGMEDIASIZE ioctl when no partition info is available
to generate st_size information. This helps dk(4) and dm(4) devices.
 1.218 22-Apr-2023  riastradh branches: 1.218.6;
specfs: KNF. No functional change intended.
 1.217 22-Apr-2023  hannken Remove unused specdev member sd_rdev.

Ride 10.99.4
 1.216 15-Oct-2022  riastradh specfs(9): Attribute blame by stack trace for write to r/o medium.
 1.215 21-Sep-2022  riastradh specfs(9): XXX comment: what if read downgrades lock?
 1.214 12-Aug-2022  riastradh specfs: Refuse to open a closing-in-progress block device.

We could wait for close to complete, but if this happened ever so
slightly earlier it would lead to EBUSY anyway, so there's no point
in adding logic for that -- either way the caller neglected to wait
for the last close to finish before trying to open the device
again.

https://mail-index.netbsd.org/current-users/2022/08/09/msg042800.html

Reported-by: syzbot+4388f20706ec8a4c8db0@syzkaller.appspotmail.com
https://syzkaller.appspot.com/bug?id=47c67ab6d3a87514d0707882a9ad6671beaa8642

Reported-by: syzbot+0f1756652dce4cb341ed@syzkaller.appspotmail.com
https://syzkaller.appspot.com/bug?id=a632ce762d64241fc82a9bc57230b7b7c7095d1a
 1.213 12-Aug-2022  riastradh specfs: Assert !closing on successful open.

- If there's a prior concurrent close, it must have interrupted this
open.

- If there's a new concurrent close, it must wait until this open has
released device_lock before it can revoke.
 1.212 12-Aug-2022  riastradh specfs: Assert opencnt>0 on successful open.
 1.211 11-Aug-2022  riastradh specfs: Sprinkle opencnt/opened/closing assertions.

There seems to be a bug here but I'm not sure what it is yet:

https://mail-index.netbsd.org/current-users/2022/08/09/msg042800.html
https://syzkaller.appspot.com/bug?id=47c67ab6d3a87514d0707882a9ad6671beaa8642

The decision to actually invoke d_close is serialized under
device_lock, so it should not be possible for more than one process
to close at the same time, but syzbot and kre found a way for
sd_closing to be false later in spec_close. Let's make sure it's
false when we're making what should be the exclusive decision to
close.

We can't assert !sd_opened before cancel and spec_io_drain, because
those are necessary to interrupt and wait for pending opens that
might later set sd_opened, but we can assert !sd_opened afterward
because once sd_closing is true nothing should set sd_opened.
 1.210 28-Mar-2022  riastradh driver(9): New devsw d_cancel op to interrupt I/O before close.

If specified, when revoking a device node or closing its last open
node, specfs will:

1. Call d_cancel, which should return promptly without blocking.
2. Wait for all concurrent d_read/write/ioctl/&c. to drain.
3. Call d_close.

Otherwise, specfs will:

1. Call d_close.
2. Wait for all concurrent d_read/write/ioctl/&c. to drain.

This fallback is problematic because often parts of d_close rely on
concurrent devsw operations to have completed already, so it is up to
each driver to have its own mechanism for waiting, and the extra step
in (2) is almost redundant. But it is still important to ensure that
devsw operations are not active by the time a module tries to invoke
devsw_detach, because only d_open is protected against that.

The signature of d_cancel matches d_close, mostly so we don't raise
questions about `why is this different?'; the lwp argument is not
useful but we should remove it from open/cancel/close all at the same
time.

The only way d_cancel should fail, if it does at all, is with ENODEV,
meaning the driver doesn't support cancelling outstanding I/O, and
will take responsibility for that in d_close. I would make it return
void and only have bdev_cancel and cdev_cancel possibly return ENODEV
so specfs can detect whether a driver supports it, but this would
break the pattern around devsw operation types.

Drivers are allowed to omit it from struct bdevsw, struct cdevsw --
if so, it is as if they used a function that just returns ENODEV.

XXX kernel ABI change to struct bdevsw/cdevsw requires bump
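
The two close orderings described in revision 1.210 can be illustrated with a small userland sketch. Everything here is hypothetical (struct toy_dev, struct toy_devsw, toy_last_close and the single-letter trace); it is not the real devsw interface and only models the order of the steps, assuming a missing d_cancel is treated like one that returns ENODEV.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device state; trace records the order of the steps. */
struct toy_dev {
	char trace[8];
	size_t n;
};

/* Hypothetical stand-in for a devsw entry, not the real struct cdevsw. */
struct toy_devsw {
	int (*d_cancel)(struct toy_dev *);	/* NULL acts like ENODEV */
	int (*d_close)(struct toy_dev *);
};

void
toy_log(struct toy_dev *dev, char step)
{
	assert(dev->n + 1 < sizeof(dev->trace));
	dev->trace[dev->n++] = step;
	dev->trace[dev->n] = '\0';
}

/* Example driver callbacks: 'C' = cancel, 'X' = close. */
int toy_cancel(struct toy_dev *dev) { toy_log(dev, 'C'); return 0; }
int toy_close(struct toy_dev *dev) { toy_log(dev, 'X'); return 0; }

/* 'D' = wait for concurrent read/write/ioctl to drain (modelled only). */
void toy_drain(struct toy_dev *dev) { toy_log(dev, 'D'); }

int
toy_last_close(const struct toy_devsw *sw, struct toy_dev *dev)
{
	int error;

	error = (sw->d_cancel != NULL) ? sw->d_cancel(dev) : ENODEV;
	if (error == 0) {
		/* Preferred order: cancel, drain, close. */
		toy_drain(dev);
		return sw->d_close(dev);
	}

	/* Fallback order: close first, then drain. */
	error = sw->d_close(dev);
	toy_drain(dev);
	return error;
}
```

With a d_cancel callback the trace reads "CDX"; without one it reads "XD", which is the fallback the commit message calls problematic.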
 1.209 28-Mar-2022  riastradh specfs: Remove specnode from hash table in spec_node_revoke.

Previously, it was possible for spec_node_lookup_by_dev to handle a
specnode that a concurrent spec_node_destroy is about to remove from
the hash table and then free, as soon as spec_node_lookup_by_dev
releases device_lock.

Now, the ordering is:

1. Remove specnode from hash table in spec_node_revoke. At this
point, no _new_ vnode references are possible (other than possibly
one acquired by vcache_vget under v_interlock), but there may be
existing ones.

2. Mark vnode reclaimed so vcache_vget will fail.

3. The last vrele (or equivalent logic in vcache_vget) will then free
the specnode in spec_node_destroy.

This way, _if_ a thread in spec_node_lookup_by_dev finds a specnode
in the hash table under device_lock/v_interlock, _then_ it will not
be freed until the thread completes vcache_vget.

This change requires calling spec_node_revoke unconditionally for
device special nodes, not just for active ones. Might introduce
slightly more contention on device_lock but not much because we
already have to take it in this path anyway a little later in
spec_node_destroy.
 1.208 28-Mar-2022  riastradh specfs: Let spec_node_lookup_by_dev wait for reclaim to finish.

vdevgone relies on this to ensure that if there is a concurrent
revoke in progress, it will wait for that revoke to finish -- that
way, it can guarantee all I/O operations have completed and the
device is closed.
 1.207 28-Mar-2022  riastradh specfs: Assert opencnt is nonzero before decrementing.
 1.206 28-Mar-2022  riastradh specfs: Take an I/O reference across bdev/cdev_open.

- Revoke is used to invalidate all prior access control checks when
device permissions are changing, so it must wait for .d_open to exit
so any new access must go through new access control checks.

- Revoke is used by vdevgone in xyz_detach to wait until all use of
the driver's data structures have completed before xyz_detach frees
them.

So we need to make sure spec_close waits for .d_open too.
 1.205 28-Mar-2022  riastradh specfs: Wait for last close in spec_node_revoke.

Otherwise, revoke -- and vdevgone, in the detach path of removable
devices -- may complete while I/O operations are still running
concurrently.
 1.204 28-Mar-2022  riastradh specfs: Prevent new opens while close is waiting to drain.

Otherwise, bdev/cdev_close could have cancelled all _existing_ opens,
and waited for them to complete (and freed resources used by them) --
but a new one could start, and hang (e.g., a tty), at the same time
spec_close tries to drain all pending I/O operations, one of which
(the new open) is now hanging indefinitely.

Preventing the new open from even starting until bdev/cdev_close is
finished and all I/O operations have drained avoids this deadlock.
 1.203 28-Mar-2022  riastradh specfs: Take an I/O reference in spec_node_setmountedfs.

This is not quite correct. We _should_ require the caller to hold a
vnode lock around spec_node_getmountedfs, and an exclusive vnode lock
around spec_node_setmountedfs, so that it is only necessary to check
whether revoke has already happened, not hold an I/O reference.

Unfortunately, various callers in various file systems don't follow
this sensible rule. So let's at least make sure the vnode can't be
revoked in spec_node_setmountedfs, while we're in bdev_ioctl, and
leave a comment explaining what the sorry state of affairs is and how
to fix it later.
 1.202 28-Mar-2022  riastradh specfs: Drain all I/O operations after last .d_close call.

New kind of I/O reference on specdevs, sd_iocnt. This could be done
with psref instead; I chose a reference count instead for now because
we already have to take a per-object lock anyway, v_interlock, for
vdead_check, so another atomic is not likely to hurt much more. We
can always change the mechanism inside spec_io_enter/exit/drain later
on.

Make sure every access to vp->v_rdev or vp->v_specnode and every call
to a devsw operation is protected either:

- by the vnode lock (with vdead_check if we unlocked/relocked),
- by positive sd_opencnt,
- by spec_io_enter/exit, or
- by sd_opencnt management in open/close.
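
A rough single-threaded model of the I/O reference idea in revision 1.202 follows. The names (struct toy_specdev, toy_io_enter, toy_io_exit, toy_io_drained, toy_revoke_begin) are hypothetical, the ENXIO return is an assumption, and there is no locking or sleeping: the real drain waits, while this sketch only exposes the condition it would wait for.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a specdev; iocnt is modelled on sd_iocnt. */
struct toy_specdev {
	unsigned iocnt;	/* I/O operations currently in flight */
	bool dead;	/* set once revoke has begun */
};

/* Take an I/O reference; refuse once the device is being revoked. */
int
toy_io_enter(struct toy_specdev *sd)
{
	if (sd->dead)
		return ENXIO;	/* assumed error code */
	sd->iocnt++;
	return 0;
}

/* Drop an I/O reference taken by toy_io_enter. */
void
toy_io_exit(struct toy_specdev *sd)
{
	assert(sd->iocnt > 0);
	sd->iocnt--;
}

/* Mark the device dead so that no new references can be taken. */
void
toy_revoke_begin(struct toy_specdev *sd)
{
	sd->dead = true;
}

/* The condition a real drain would sleep on. */
bool
toy_io_drained(const struct toy_specdev *sd)
{
	return sd->iocnt == 0;
}
```

The point of the sketch is the pairing: every devsw call sits between an enter and an exit, so once new entries are refused the count can only fall to zero.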
 1.201 28-Mar-2022  riastradh specfs: Resolve a race between close and a failing reopen.
 1.200 28-Mar-2022  riastradh specfs: Paranoia: Assert opencnt is zero on reclaim.
 1.199 28-Mar-2022  riastradh specfs: Omit needless vdead_check in spec_fdiscard.

The vnode lock is held, so the vnode cannot be revoked without also
changing v_op so subsequent uses under the vnode lock will go to
deadfs's VOP_FDISCARD instead (which is genfs_eopnotsupp).
 1.198 28-Mar-2022  riastradh specfs: Add a comment and assertion to spec_close about refcnts.
 1.197 28-Mar-2022  riastradh specfs: If sd_opencnt is zero, sn_opencnt had better be zero.
 1.196 28-Mar-2022  riastradh specfs: Factor KASSERT out of switch in spec_open.

No functional change.
 1.195 28-Mar-2022  riastradh specfs: sn_gone cannot be set while we hold the vnode lock.

Revoke runs with the vnode lock too, which is exclusive. Add an
assertion to this effect in spec_node_revoke to make it clear.
 1.194 28-Mar-2022  riastradh specfs: Reorganize D_DISK tail of spec_open and explain what's up.

No functional change intended.
 1.193 28-Mar-2022  riastradh specfs: Factor VOP_UNLOCK/vn_lock out of switch for clarity.

No functional change.
 1.192 28-Mar-2022  riastradh specfs: Factor common device_lock out of switch for clarity.

No functional change.
 1.191 28-Mar-2022  riastradh specfs: Delete bogus comment about .d_open/.d_close at same time.

Annoying as it is that .d_open and .d_close can run at the same time,
it is also necessary for tty semantics, where open can block
indefinitely, and it is the responsibility of close (called via
revoke) to interrupt it.
 1.190 28-Mar-2022  riastradh specfs: Split spec_open switch into three sections.

The sections are now:

1. Acquire open reference.

1a (intermezzo). Set VV_ISTTY.

2. Drop the vnode lock to call .d_open and autoload modules if
necessary.

3. Handle concurrent revoke if it happened, or release open reference
if .d_open failed.

No functional change. Sprinkle comments about problems.
 1.189 28-Mar-2022  riastradh specfs: Factor common kauth check out of switch in spec_open.

No functional change.
 1.188 28-Mar-2022  riastradh specfs: Assert v_type is VBLK or VCHR in spec_open.

Nothing else makes sense. Prune dead branches (and replace default
case by panic).
 1.187 28-Mar-2022  riastradh specfs: Call bdev_open without the vnode lock.

There is no need for it to serialize opens, because they are already
serialized by sd_opencnt which for block devices is always either 0
or 1.

There's not obviously any other reason why the vnode lock should be
held across bdev_open, other than that it might be nice to avoid
dropping it if not necessary. For character devices we always have
to drop the vnode lock because open might hang indefinitely, when
opening a tty, which is not allowed while holding the vnode lock.
 1.186 28-Mar-2022  riastradh specfs: Note lock order for vnode lock, device_lock, v_interlock.
 1.185 28-Mar-2022  riastradh driver(9): Eliminate D_MCLOSE.

D_MCLOSE was introduced a few years ago by mistake for audio(4),
which should have used -- and now does use -- fd_clone to create
per-open state. The semantics was originally to call close once
every time the device node is closed, not only for the last close.
Nothing uses it any more, and it complicates reasoning about the
system, so let's simplify it away.
 1.184 19-Mar-2022  hannken Switch spec_vnodeop vector to real vnode locking, VV_LOCKSWORK now.
 1.183 18-Jul-2021  dholland Abolish all the silly indirection macros for initializing vnode ops tables.

These are things of the form #define foofs_op genfs_op, or #define
foofs_op genfs_eopnotsupp, or similar. They serve no purpose besides
obfuscation, and have gotten cutpasted all over everywhere.
 1.182 29-Jun-2021  dholland - Add a new vnode op: VOP_PARSEPATH.
- Move namei_getcomponent to genfs_vnops.c and call it genfs_parsepath.
- Add a parsepath entry to every vnode ops table.

VOP_PARSEPATH takes a directory vnode to be searched and a complete
following path and chooses how much of that path to consume. To begin
with, all parsepath calls are genfs_parsepath, which locates the first
'/' as always.

Note that the call doesn't take the whole struct componentname, only
the string. The other bits of struct componentname should not be
needed and there's no reason to cause potential complications by
exposing them.
 1.181 25-Dec-2020  mlelstv branches: 1.181.4;
When reading from a block device, queue parallel block requests to
fill a buffer with breadn.
 1.180 27-Jun-2020  christos branches: 1.180.2;
Introduce genfs_pathconf() and use it for the default case in all filesystems.
 1.179 23-May-2020  ad Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
 1.178 16-May-2020  christos Add ACL support for FFS. From FreeBSD.
 1.177 13-Apr-2020  jdolecek when determining I/O block size for VBLK device, only use pi_bsize
returned by DIOCGPARTINFO if it's bigger than DEV_BSIZE and less
than MAXBSIZE (MAXPHYS)

fixes panic "buf mem pool index 8" in buf_mempoolidx() when the
disklabel contains bsize 128KB and something reads the block device -
buffer cache can't allocate bufs bigger than MAXPHYS
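
The range check described in revision 1.177 might look roughly like this. The constants and the function name are hypothetical stand-ins (the kernel's DEV_BSIZE, MAXBSIZE and default I/O size may differ), and the strict comparisons follow the wording of the commit message rather than the actual source.

```c
#include <assert.h>

/* Hypothetical values, chosen only for illustration. */
#define TOY_DEV_BSIZE		512u
#define TOY_MAXBSIZE		(64u * 1024u)
#define TOY_DEFAULT_BSIZE	2048u

/*
 * Pick the I/O block size for a block device: trust the size reported
 * by the partition info only if it lies strictly between the device
 * block size and the largest buffer the buffer cache can allocate.
 */
unsigned
toy_blk_iosize(unsigned pi_bsize)
{
	if (pi_bsize > TOY_DEV_BSIZE && pi_bsize < TOY_MAXBSIZE)
		return pi_bsize;
	return TOY_DEFAULT_BSIZE;
}
```

With these assumed limits, a disklabel reporting 128KB falls back to the default instead of asking the buffer cache for a buffer it cannot allocate.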
 1.176 22-Sep-2019  christos branches: 1.176.6;
Add a new member to struct vfsstat and grow the unused members
The new member is called f_mntfromlabel and it is the dkw_wname
of the corresponding wedge. This is now used by df -W to display
the mountpoint name as NAME=
 1.175 03-Sep-2018  riastradh Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
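
The truncation hazard that motivated the rename in revision 1.175 can be shown with a short sketch. toy_uimin and TOY_MIN are hypothetical names standing in for the unsigned-int function and the macro form, and the example assumes unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Function form: both arguments are converted to unsigned int. */
unsigned int
toy_uimin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Macro form: no conversion to unsigned int, so no truncation, but
 * each argument may be evaluated more than once.
 */
#define TOY_MIN(a, b)	((a) < (b) ? (a) : (b))
```

Passing a 64-bit value such as 2^32 + 5 to toy_uimin silently drops the upper bits before the comparison, so the function returns 5 where the macro returns the other operand.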
 1.174 24-Jun-2017  hannken branches: 1.174.4; 1.174.6;
Refuse to open a block device with zero open count when it has
a mountpoint set. This may happen after forced detach or unplug
of a mounted block device.
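
A minimal sketch of the check described in revision 1.174, using hypothetical names (struct toy_bdev, toy_bdev_open). The EBUSY return and the single-open rule for block devices are assumptions about the surrounding code, not taken from this commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-device state for a block device. */
struct toy_bdev {
	unsigned opencnt;	/* number of opens, 0 or 1 here */
	const void *mountpoint;	/* non-NULL while a file system is mounted */
};

/*
 * Refuse the open if the device is already open, or if it still has a
 * mountpoint set even though its open count has dropped to zero (as can
 * happen after a forced detach).
 */
int
toy_bdev_open(struct toy_bdev *bd)
{
	if (bd->opencnt != 0 || bd->mountpoint != NULL)
		return EBUSY;	/* assumed error code */
	bd->opencnt = 1;
	return 0;
}
```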
 1.173 01-Jun-2017  chs branches: 1.173.2;
remove checks for failure after memory allocation calls that cannot fail:

kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.
 1.172 26-May-2017  riastradh Make VOP_RECLAIM do the last unlock of the vnode.

VOP_RECLAIM naturally has exclusive access to the vnode, so having it
locked on entry is not strictly necessary -- but it means if there
are any final operations that must be done on the vnode, such as
ffs_update, requiring exclusive access to it, we can now kassert that
the vnode is locked in those operations.

We can't just have the caller release the last lock because some file
systems don't use genfs_lock, and require the vnode to remain valid
for VOP_UNLOCK to work, notably unionfs.
 1.171 12-Apr-2017  martin branches: 1.171.2;
Make the non-DIAGNOSTIC version compile
 1.170 11-Apr-2017  riastradh Make VOP_INACTIVE preserve vnode lock on return.

Discussed on tech-kern:
https://mail-index.netbsd.org/tech-kern/2017/04/01/msg021751.html

Ride 7.99.68, a bumpy bus of incremental vfs improvements!
 1.169 01-Mar-2017  hannken Add a diagnostic test for buffers written to a block device holding
a read-only mounted file system.

This will become a KASSERT in the near future.
 1.168 02-Jan-2017  hannken branches: 1.168.2;
Rename vget() to vcache_vget() and vcache_tryvget() respectively and
move the definitions to sys/vnode_impl.h.

No functional change intended.

Welcome to 7.99.54
 1.167 09-Dec-2016  nat Add functions to access device flags. This restores simultaneous audio
open/close.

OK hannken@ christos@
 1.166 08-Dec-2016  nat The audio sub-system now supports the following features as
posted to tech-kern:

* Simultaneous playback and mixing of multiple streams
* Playback streams can be of different encoding, frequency, precision
and number of channels
* Simultaneous recording to different formats
* One audio device per process
* Sysctls to set the common format frequency, precision and channels
* Independent mixer controls for recording/playback per stream
* Utilizes little cpu time for multiple streams / good performance
* Compatible with existing programs that use OSS/NetBSD audio
* Changes to audioctl(1) to allow specifying process id for corresponding
audio device
 1.165 08-Sep-2016  pgoyette Revert rev 1.164. This will be redone differently (using "dummy"
modules).

This implementation requires changes to a base kernel in order to
update the set of "special" modules, kinda defeating the purpose of
having modules in the first place. The new method will use dummy
modules (with name tap and tun) which will depend on the real
modules with the if_ prefix.

Coming soon to a NetBSD near you.
 1.164 08-Sep-2016  pgoyette if_config processing wants to auto-load modules named with an if_ prefix,
while specfs wants to auto-load modules without the prefix. For modules
which can be loaded both ways (ie, if_tap and if_tun), provide a simple
conversion table for specfs so it can auto-load the if_ module.

This table should always be quite small, and the auto-load operation is
relatively infrequent, so the additional overhead of comparing names should
be tolerable.
 1.163 20-Aug-2016  hannken Remove now obsolete operation vcache_remove().

Welcome to 7.99.36
 1.162 04-Apr-2016  hannken branches: 1.162.2;
Avoid a race with spec_revoke for the assertion too.

Final fix for PR kern/50467 Panic from disconnecting phone while reading
its contents
 1.161 26-Mar-2016  hannken When spec_strategy() extracts v_rdev take care to avoid a
race with spec_revoke.

Fixes PR kern/50467 Panic from disconnecting phone while reading its contents
 1.160 05-Jan-2016  pgoyette Fix a couple of checks for kernel vm_space, and convert the 'naked
panic' code to KASSERT/KASSERTMSG.

Thanks, Taylor!
 1.159 23-Dec-2015  pgoyette Revert previous
 1.158 22-Dec-2015  pgoyette If we attempt to autoload a driver module, make sure we return an error
if it fails. Otherwise we might end up calling a builtin-but-disabled
driver module and that can generate all sorts of issues...
 1.157 08-Dec-2015  christos Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.
 1.156 08-Dec-2015  christos unfortunately it is not that easy to get rid of DIOCGPART. DTRT for the
raw partition and print a warning if we overflowed. I guess the right solution
for this is to create yet another version of disklabel that is 64 bit friendly.
 1.155 05-Dec-2015  jnemeth messing with uninitialized structs is a bad thing
 1.154 04-Dec-2015  christos Use DIOCGMEDIASIZE instead of DIOCGPART so that we are not limited to 2G.
XXX: All DIOCGPART code needs to be removed...
XXX: pullup-7
 1.153 01-Jul-2015  hannken Unfortunately MFS uses v_data of its anonymous device vnode so
it cannot be used as vcache key. Use v_interlock as key ...
 1.152 30-Jun-2015  hannken Redo previous again, v_specnode is invariant but not unique.

Set "vp->v_data = vp" and use v_data as key.
 1.151 29-Jun-2015  hannken Use the address of vp->v_specnode as vcache key. It is invariant
over the lifetime of the vnode.

The previous worked by luck, it took the first sizeof(void *) bytes
of struct vnode as key.

Resolves CID 1308957: wrong sizeof()
 1.150 29-Jun-2015  christos Revert previous, and explain why.
 1.149 29-Jun-2015  christos CID 1308957: Fix wrong sizeof()
 1.148 23-Jun-2015  hannken Add a vfs_newvnode() method to deadfs and use it to create
anonymous device vnodes with bdevvp() and cdevvp().

Implement spec_inactive() and spec_reclaim() to handle these nodes.
 1.147 20-Apr-2015  riastradh Make vget always return vnode unlocked.

Convert callers who want locks to use vn_lock afterward.

Add extra argument so the compiler will report stragglers.
 1.146 28-Mar-2015  maxv Remove the 'cred' argument from bread(). Remove a now unused var in
ffs_snapshot.c. Update the man page accordingly.

ok hannken@
 1.145 25-Jul-2014  dholland branches: 1.145.2; 1.145.4; 1.145.6;
Add VOP_FALLOCATE and VOP_FDISCARD to every vnode ops table I can
find.

The filesystem ones all call genfs_eopnotsupp - right now I am only
implementing the plumbing and we can implement fallocate and/or
fdiscard for files later.

The device ones call spec_fallocate (which is also genfs_eopnotsupp)
and spec_fdiscard, which dispatches to the device-level op.

The fifo ones all call vn_fifo_bypass, which also ends up being
EOPNOTSUPP.
 1.144 25-Jul-2014  dholland Implement spec_fdiscard() using bdev_discard() and cdev_discard().
Also define spec_fallocate() to genfs_eopnotsupp().
 1.143 24-Mar-2014  hannken branches: 1.143.2;
- Make VI_XLOCK, VI_CLEAN and VI_LOCKSHARE private to kern/vfs_*.c.
- Make vwait() static.
- Add vdead_check() to check a vnode for being or becoming dead.

Discussed on tech-kern.

Welcome to 6.99.38
 1.142 07-Feb-2014  hannken Change vnode operation lookup to return the resulting vnode *vpp unlocked.
Change cache_lookup() to return an unlocked vnode.

Discussed on tech-kern@

Welcome to 6.99.31
 1.141 30-Sep-2013  hannken Replace macro v_specmountpoint with two functions spec_node_getmountedfs()
and spec_node_setmountedfs() to manage the file system mounted on a device.
Assert the device is a block device.

Welcome to 6.99.24

Discussed on tech-kern@ some time ago.

Reviewed by: David Holland <dholland@netbsd.org>
 1.140 20-Jul-2013  dholland oops, spell b_bcount properly
 1.139 20-Jul-2013  dholland In spec_strategy, if fscow_run() fails, set b_resid along with b_error
to avoid panic in biodone. Noticed by riastradh.
 1.138 16-Jun-2013  dholland branches: 1.138.2; 1.138.4;
Hang a warning banner on some nasty code I just found.
 1.137 13-Feb-2013  hannken Make the spec_node table implementation private to spec_vnops.c.

To retrieve a spec_node, two new lookup functions (by device or by mount)
are implemented. Both return a referenced vnode, for an opened block device
the opened vnode is returned so further diagnostic checks "vp == ... sd_bdevvp"
will not fire. Otherwise any vnode matching the criteria gets returned.

No objections on tech-kern.

Welcome to 6.99.17
 1.136 20-Dec-2012  hannken Change bread() and breadn() to never return a buffer on
error and modify all callers to not brelse() on error.

Welcome to 6.99.16

PR kern/46282 (6.0_BETA crash: msdosfs_bmap -> pcbmap -> bread -> bio_doread)
 1.135 29-Apr-2012  chs branches: 1.135.2;
change vflushbuf() to take the full FSYNC_* flags.
translate FSYNC_LAZY into PGO_LAZY for VOP_PUTPAGES() so that
genfs_do_io() can set the appropriate io priority for the I/O.
this is the first part of addressing PR 46325.
 1.134 12-Jun-2011  rmind branches: 1.134.2; 1.134.6; 1.134.8;
Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.133 27-Apr-2011  hannken branches: 1.133.2;
Remove no longer needed flag FSYNC_VFS /* fsync: via FSYNC_VFS() */.
 1.132 26-Apr-2011  hannken Change vflushbuf() to return an error if a synchronous write fails.

Welcome to 5.99.51.
 1.131 21-Aug-2010  pgoyette branches: 1.131.2;
Update the rest of the kernel to conform to the module subsystem's new
locking protocol.
 1.130 24-Jun-2010  hannken Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
 1.129 13-Apr-2010  ahoka Revert my last change, it's not The Right Thing [tm].
 1.128 13-Apr-2010  ahoka Autoload modules with any class.

This fixes autoloading of pf, zfs and possibly others.
 1.127 14-Nov-2009  elad branches: 1.127.2; 1.127.4;
- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.
 1.126 06-Oct-2009  elad Factor out a block of code that appears in three places (Veriexec, keylock,
and securelevel) so that others can use it as well.
 1.125 04-Oct-2009  tsutsui Put workaround fix for LOCKDEBUG panic mentioned in PR kern/41078:
Don't try to load a driver module if the driver already exists but is just
not attached. [bc]dev_open() could return ENXIO even if the driver exists.

XXX: Maybe this should be handled by helper functions for
XXX: module_autoload() calls on demand.
 1.124 25-Apr-2009  rmind - Rearrange pg_delete() and pg_remove() (renamed pg_free), thus
proc_enterpgrp() with proc_leavepgrp() to free process group and/or
session without proc_lock held.
- Rename SESSHOLD() and SESSRELE() to proc_sesshold() and
proc_sessrele(). The latter releases proc_lock now.

Quick OK by <ad>.
 1.123 22-Feb-2009  ad PR kern/26878 FFSv2 + softdep = livelock (no free ram)
PR kern/16942 panic with softdep and quotas
PR kern/19565 panic: softdep_write_inodeblock: indirect pointer #1 mismatch
PR kern/26274 softdep panic: allocdirect_merge: ...
PR kern/26374 Long delay before non-root users can write to softdep partitions
PR kern/28621 1.6.x "vp != NULL" panic in ffs_softdep.c:4653 while unmounting a softdep (+quota) filesystem
PR kern/29513 FFS+Softdep panic with unfsck-able file-corruption
PR kern/31544 The ffs softdep code appears to fail to write dirty bits to disk
PR kern/31981 stopping scsi disk can cause panic (softdep)
PR kern/32116 kernel panic in softdep (assertion failure)
PR kern/32532 softdep_trackbufs deadlock
PR kern/37191 softdep: locking against myself
PR kern/40474 Kernel panic after remounting raid root with softdep

Retire softdep, pass 2. As discussed and later formally announced on the
mailing lists.
 1.122 02-Feb-2009  haad branches: 1.122.2;
Add support for loading pseudo-device drivers. Try to autoload modules from
spec_open routine. If devsw_open fails, get driver name with devsw_getname
routine and autoload module.

For now only the dm driver can be loaded; other pseudo drivers need more work.

Ok by ad@.
 1.121 11-Jan-2009  christos merge christos-time_t
 1.120 29-Dec-2008  pooka Rename specfs_lock as device_lock and move it from specfs to devsw.
Relaxes kernel dependency on vfs.
 1.119 16-May-2008  hannken branches: 1.119.6; 1.119.10;
Make sure all cached buffers with valid, not yet written data have been
run through copy-on-write. Call fscow_run() with valid data where possible.

The LP_UFSCOW hack is no longer needed to protect ffs_copyonwrite() against
endless recursion.

- Add a flag B_MODIFY to bread(), breada() and breadn(). If set the caller
intends to modify the buffer returned.

- Always run copy-on-write on buffers returned from ffs_balloc().

- Add new function ffs_getblk() that gets a buffer, assigns a new blkno,
may clear the buffer and runs copy-on-write. Process possible errors
from getblk() or fscow_run(). Part of PR kern/38664.

Welcome to 4.99.63

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
 1.118 29-Apr-2008  ad branches: 1.118.2;
PR kern/38057 ffs makes assumptions about devvp file system
PR kern/33406 softdeps get stuck in endless loop

Introduce VFS_FSYNC() and call it when syncing a block device, if it
has a mounted file system.
 1.117 28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.116 24-Apr-2008  ad branches: 1.116.2;
Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
 1.115 25-Jan-2008  hannken branches: 1.115.6; 1.115.8;
Spec_open(): clear sd_bdevvp if bdev_open() failed.

Ok: Andrew Doran <ad@netbsd.org>
 1.114 25-Jan-2008  ad Remove VOP_LEASE. Discussed on tech-kern.
 1.113 24-Jan-2008  ad spec_fsync: don't assert that 'vp' holds the block device open. If it's
not open, there shouldn't be dirty buffers so vinvalbuf() is harmless.
 1.112 24-Jan-2008  ad specfs changes for PR kern/37717 (raidclose() is no longer called on
shutdown). There are still problems with device access and a PR will be
filed.

- Kill checkalias(). Allow multiple vnodes to reference a single device.

- Don't play dangerous tricks with block vnodes to ensure that only one
vnode can describe a block device. Instead, prohibit concurrent opens of
block devices. As a bonus remove the unreliable code that prevents
multiple file system mounts on the same device. It's no longer needed.

- Track opens by vnode and by device. Issue cdev_close() when the last open
goes away, instead of abusing vnode::v_usecount to tell if the device is
open.
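
As a rough illustration of the last point, the sketch below counts opens
per vnode and per device and issues the device close only when the
device-wide count reaches zero. It is a toy model, not the spec_vnops.c
code: struct ex_dev, struct ex_node, ex_open() and ex_close() are
hypothetical names, and the closes counter merely stands in for a call to
cdev_close().

```c
#include <assert.h>

/* Hypothetical per-device state shared by every vnode naming the device. */
struct ex_dev {
	int opencnt;	/* opens across all vnodes */
	int closes;	/* times the driver close would have been called */
};

/* Hypothetical per-vnode state. */
struct ex_node {
	struct ex_dev *dev;
	int opencnt;	/* opens through this vnode only */
};

static void
ex_open(struct ex_node *n)
{
	n->opencnt++;
	n->dev->opencnt++;
}

static void
ex_close(struct ex_node *n)
{
	assert(n->opencnt > 0 && n->dev->opencnt > 0);
	n->opencnt--;
	/* Only the last open on the device triggers the driver close. */
	if (--n->dev->opencnt == 0)
		n->dev->closes++;	/* stand-in for cdev_close() */
}
```

With two nodes sharing one ex_dev, closing the first leaves closes at 0;
closing the second raises it to 1.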
 1.111 02-Jan-2008  ad Merge vmlocking2 to head.
 1.110 02-Dec-2007  hannken branches: 1.110.2; 1.110.6;
Fscow_run(): add a flag "bool data_valid" to note still valid data.
Buffers run through copy-on-write are marked B_COWDONE. This condition
is valid until the buffer has run through bwrite() and gets cleared from
biodone().

Welcome to 4.99.39.

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
 1.109 26-Nov-2007  pooka Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.108 10-Oct-2007  ad branches: 1.108.4;
Merge from vmlocking:

- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
 1.107 08-Oct-2007  ad Merge brelse() changes from the vmlocking branch.
 1.106 07-Oct-2007  hannken Update the file system copy-on-write handler.

- Instead of hooking the handler on the specdev of a mounted file system
hook directly on the `struct mount'.

- Rename from `vn_cow_*' to `fscow_*' and move to `kern/vfs_trans.c'. Use
`mount_*specific' instead of clobbering `struct mount' or `struct specinfo'.

- Replace the hand-made reader/writer lock with a krwlock.

- Keep `vn_cow_*' functions and mark as obsolete.

- Welcome to NetBSD 4.99.32 - `struct specinfo' changed size.

Reviewed by: Jason Thorpe <thorpej@netbsd.org>
 1.105 01-Sep-2007  pooka branches: 1.105.2;
Make bioops a pointer and point it to the softdeps struct in softdep
init. Decouples "options SOFTDEP" from the main kernel and ffs code.
 1.104 03-Aug-2007  pooka branches: 1.104.2; 1.104.4; 1.104.6;
ANSI-fy
 1.103 29-Jul-2007  ad It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
 1.102 27-Jul-2007  pooka vop_mmap parameter change
 1.101 22-Jul-2007  pooka Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden
 1.100 09-Jul-2007  ad branches: 1.100.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
 1.99 05-Jun-2007  yamt improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.
 1.98 04-Mar-2007  christos branches: 1.98.2; 1.98.4;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.97 26-Nov-2006  elad branches: 1.97.4;
Implement Veriexec's raw disk policy on-top of kauth(9)'s device scope,
using both the rawio_spec and passthru actions to detect raw disk
activity. Same for kernel memory policy.

Update documentation (no longer need to expose veriexec_rawchk()) and
remove all Veriexec-related bits from specfs.
 1.96 04-Nov-2006  elad Change KAUTH_SYSTEM_RAWIO to KAUTH_DEVICE_RAWIO_SPEC (moving the raw i/o
requests to the device scope) and add KAUTH_DEVICE_RAWIO_PASSTHRU.

Expose iskmemdev() through sys/conf.h.

okay yamt@
 1.95 02-Nov-2006  elad Redo Veriexec raw disk/memory access policies so they hold only if the
request is for write access.
 1.94 01-Nov-2006  elad Only use blkdev/bvp for the Veriexec case. While here, fix up IPS mode
restrictions on kernel memory.

okay yamt@
 1.93 30-Oct-2006  elad oops, remove debug printf slipped in. good catch from yamt@, thanks!
 1.92 30-Sep-2006  jld The poll routine needs to return POLLERR on error, not an errno. Sorry
about that. Pointed out by Juergen Hannken-Illjes in mail.
 1.91 21-Sep-2006  jld Protect spec_poll from racing against revocation and thus dereferencing a
NULL v_specinfo. Mostly copied (with understanding) from rev 1.83's fix
to spec_ioctl, and needed for the same reason (kern/vfs_subr.c r1.231).
 1.90 19-Sep-2006  elad For the VBLK case, we always check vfs_mountedon() and it has nothing
to do with the security model used. Move back the call to spec_open(),
which can now return the real return value from vfs_mountedon() (EBUSY)
and not EPERM, changing semantics.
 1.89 08-Sep-2006  elad branches: 1.89.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)
 1.88 11-Aug-2006  christos branches: 1.88.2;
Pretending to be Elad's keyboard:

fileassoc.diff adds a fileassoc_table_run() routine that allows you to
pass a callback to be called with every entry on a given mount.

veriexec.diff adds some raw device access policies: if raw disk is
opened at strict level 1, all fingerprints on this disk will be
invalidated as a safety measure. level 2 will not allow opening disk
for raw writing if we monitor it, and prevent raw writes to memory.
level 3 will not allow opening any disk for raw writing.

both update all relevant documentation.

veriexec concept is okay blymn@.
 1.87 14-May-2006  elad branches: 1.87.6;
integrate kauth.
 1.86 01-Mar-2006  yamt branches: 1.86.2; 1.86.4; 1.86.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.
 1.85 11-Dec-2005  christos branches: 1.85.2; 1.85.4; 1.85.6;
merge ktrace-lwp.
 1.84 02-Nov-2005  yamt merge yamt-vop branch. remove following VOPs.

VOP_BLKATOFF
VOP_VALLOC
VOP_BALLOC
VOP_REALLOCBLKS
VOP_VFREE
VOP_TRUNCATE
VOP_UPDATE
 1.83 11-Sep-2005  chs branches: 1.83.2;
in spec_ioctl(), don't dereference v_specinfo if it's NULL.
this is needed due to rev. 1.231 of kern/vfs_subr.c, which now sets
v_specinfo to NULL before changing the vnode's ops vector.
 1.82 30-Aug-2005  xtraeme Remove __P()
 1.81 21-Jun-2005  ws branches: 1.81.2;
PR-30566: Poll must not return <sys/errno.h> values.
Start with those places I can easily test.
 1.80 26-Feb-2005  perry branches: 1.80.2;
nuke trailing whitespace
 1.79 25-May-2004  hannken branches: 1.79.4; 1.79.6;
Add ffs internal snapshots. Written by Marshall Kirk McKusick for FreeBSD.

- Not enabled by default. Needs kernel option FFS_SNAPSHOT.
- Change parameters of ffs_blkfree.
- Let the copy-on-write functions return an error so spec_strategy
may fail if the copy-on-write fails.
- Change genfs_*lock*() to use vp->v_vnlock instead of &vp->v_lock.
- Add flag B_METAONLY to VOP_BALLOC to return indirect block buffer.
- Add a function ffs_checkfreefile needed for snapshot creation.
- Add special handling of snapshot files:
Snapshots may not be opened for writing and the attributes are read-only.
Use the mtime as the time this snapshot was taken.
Deny mtime updates for snapshot files.
- Add function transferlockers to transfer any waiting processes from
one lock to another.
- Add vfsop VFS_SNAPSHOT to take a snapshot and make it accessible through
a vnode.
- Add snapshot support to ls, fsck_ffs and dump.

Welcome to 2.0F.

Approved by: Jason R. Thorpe <thorpej@netbsd.org>
 1.78 12-May-2004  jrf caddr_t -> void * and removal of some more casts.
 1.77 14-Feb-2004  hannken branches: 1.77.2; 1.77.4; 1.77.6;
Add a generic copy-on-write hook to add/remove functions that will be
called with every buffer written through spec_strategy().

Used by fss(4). Future file-system-internal snapshots will need them too.

Welcome to 1.6ZK

Approved by: Jason R. Thorpe <thorpej@netbsd.org>
 1.76 25-Jan-2004  hannken Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.
 1.75 10-Dec-2003  hannken The file system snapshot pseudo driver.

Uses a hook in spec_strategy() to save data written from a mounted
file system to its block device and a hook in dounmount().

Not enabled by default in any kernel config.

Approved by: Frank van der Linden <fvdl@netbsd.org>
 1.74 26-Nov-2003  pk spec_close: asserting that the terminal's process group be set if it is
associated with a session is too strong; a foreground group may go away
without being immediately replaced with another.
 1.73 25-Nov-2003  pk spec_close: we don't need to lock the vnode just to make a copy of its flags.
 1.72 24-Nov-2003  pk spec_close: controlling terminal hack: drop session reference count only if
we actually had a reference.
 1.71 06-Nov-2003  dsl When closing a process's controlling terminal, also remove the links
to the session and pgrp from the tty. The way that the console is
handled means that the vrele() may not actually do the final close
on the tty itself.
 1.70 15-Oct-2003  dsl Set vnode size of character disk devices to that of the partition when they
are opened (was always done for block devices).
This means that fstat will report the partition size and hence newfs
needn't grovel into the disklabel to find the filesystem size.
 1.69 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.68 29-Jun-2003  fvdl branches: 1.68.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
 1.67 28-Jun-2003  darrenr Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.66 26-Oct-2002  jdolecek now that mem_no is emitted by config(8), there is no reason to keep
copy of more or less identical iskmemdev() for every arch; move the function
to spec_vnops.c, and g/c machine-dependent copies
 1.65 23-Oct-2002  jdolecek merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
 1.64 06-Sep-2002  gehenna Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed in the port-dependent majors.<arch>
using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.
 1.63 26-Aug-2002  thorpej Fix a signed/unsigned comparison warning from GCC 3.3.
 1.62 10-Jul-2002  wiz Spell acquire with a 'c'.
 1.61 12-May-2002  matt branches: 1.61.2;
Extern speclisth
 1.60 10-Nov-2001  lukem add RCSIDs
 1.59 23-Sep-2001  chs branches: 1.59.2;
change spec_{read,write}() to specify the device blkno in units of DEV_BSIZE
rather than the device's sector size. this allows /dev/rcd0a and /dev/cd0a
to return the same data. fixes PRs 3261 and 14026.
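
The unit change can be sketched as plain arithmetic. The example below is
an illustration under stated assumptions, not the spec_read() or
spec_write() code: ex_blkno() is a hypothetical helper, and the values
512 for DEV_BSIZE and 2048 for a CD sector are assumed for the sake of the
example.

```c
#include <assert.h>
#include <stdint.h>

#define EX_DEV_BSIZE	512	/* assumed value of DEV_BSIZE */
#define EX_CD_SECSIZE	2048	/* assumed CD sector size */

/* Convert a byte offset into a block number expressed in "unit" bytes. */
static int64_t
ex_blkno(int64_t offset, int64_t unit)
{
	assert(offset >= 0 && unit > 0);
	return offset / unit;
}
```

For a byte offset of 4096, ex_blkno(4096, EX_DEV_BSIZE) is 8, while
ex_blkno(4096, EX_CD_SECSIZE) is 2. Using the DEV_BSIZE unit on both paths
means the same number always names the same bytes.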
 1.58 21-Sep-2001  chs use shared locks instead of exclusive for VOP_READ() and VOP_READDIR().
 1.57 15-Sep-2001  chs a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
 1.56 18-Aug-2001  chs branches: 1.56.2;
undo the part of the last revision that made user block device access
use the UBC interfaces. too many problems with that yet.
 1.55 17-Aug-2001  chs initialize the UVM vnode size for block devices.
UBCify user access to block devices.
 1.54 17-Apr-2001  thorpej branches: 1.54.2;
Don't hold vp->v_interlock when calling vcount(); vcount() calls
vgone(), which may sleep.
 1.53 22-Jan-2001  jdolecek branches: 1.53.2;
make filesystem vnodeop, specop, fifoop and vnodeopv_* arrays const
 1.52 08-Nov-2000  chs fix an LP64BE bogon.
 1.51 27-Oct-2000  jmc Remove usecount check in spec_open. It fails to catch VALIAS situations
and vfs_mountedon will handle them all correctly.
 1.50 19-Sep-2000  fvdl Adapt for VOP_FSYNC parameter change.
 1.49 22-Jul-2000  jdolecek change the lf_advlock() arguments from

int lf_advlock __P((struct lockf **,
off_t, caddr_t, int, struct flock *, int));
to

int lf_advlock __P((struct vop_advlock_args *, struct lockf **, off_t));

This matches common usage and is also compatible with similar change
in FreeBSD (though they use u_quad_t as last arg).
 1.48 30-Mar-2000  augustss branches: 1.48.4;
Register, begone!
 1.47 08-Dec-1999  sommerfeld Add appropriate VOP_FCNTL handlers to deadfs and specfs ops vectors.
 1.46 08-Dec-1999  sommerfeld Change to comment (only) indicating what the specfs ops vector is used for.
 1.45 15-Nov-1999  fvdl Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O
 1.44 16-Oct-1999  wrstuden branches: 1.44.2; 1.44.4;
In spec_close(), if we're not doing a non-blocking close and VXLOCK is
not set, unlock the vnode before calling the device's close routine and
relock it after it returns. tty close routines will sleep waiting for
buffers to drain, which won't happen often times as the other side needs
to grab the vnode lock first.

Make all unmount routines lock the device vnode before calling VOP_CLOSE().
 1.43 02-Oct-1998  ross branches: 1.43.6; 1.43.12;
Make spec_write() process errors and return them, otherwise we don't even
notice things like hitting the end of a partition or device. (To be sure,
writes to block special files are rare, but as long as we support them...)
 1.42 18-Aug-1998  thorpej Add some braces to make egcs happy (ambiguous else warning).
 1.41 03-Aug-1998  kleink Recognize _PC_SYNC_IO.
 1.40 05-Jun-1998  kleink Convert fsync vnode operator implementations and usage from the old `waitfor'
argument and MNT_WAIT/MNT_NOWAIT to `flags' and FSYNC_WAIT.
 1.39 01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.38 16-Oct-1997  christos Add missing cast to dev_t
 1.37 09-Oct-1997  mycroft Make various standard wmesg strings const.
 1.36 02-Apr-1997  kleink branches: 1.36.4;
Remove superfluous (uio_resid == 0) check.
 1.35 02-Apr-1997  kleink added advisory record locking support
 1.34 13-Oct-1996  christos backout previous kprintf changes
 1.33 10-Oct-1996  christos printf -> kprintf, sprintf -> ksprintf
 1.32 07-Sep-1996  mycroft Implement poll(2).
 1.31 05-Sep-1996  thorpej Remove some unused variables.
 1.30 01-Sep-1996  mycroft Add a set of generic file system operations that most file systems use.
Also, fix some time stamp bogosities.
 1.29 22-Apr-1996  christos remove include of <sys/cpu.h>
 1.28 09-Feb-1996  christos miscfs prototype changes
 1.27 15-Oct-1995  mycroft Implement VOP_BWRITE() using vn_bwrite(), per r_friedl@informatik.uni-kl.de.
 1.26 24-Jul-1995  cgd branches: 1.26.2;
avoid unnecessary aging of buffers. This used to make sense, when buffer
caches were much smaller, but makes little sense now, and will become more
useless as RAM (and buffer cache) sizes grow. Suggested by Bob Baron.
 1.25 08-Jul-1995  cgd add missing splx(), as suggested by enami@sys.ptg.sony.co.jp.
 1.24 02-Jul-1995  mycroft Make spec_read() and spec_write() vaguely consistent.
 1.23 10-Apr-1995  mycroft Use the new d_type field. Set VISTTY for vnodes of tty devices.
 1.22 14-Dec-1994  mycroft Remove a_fp.
 1.21 13-Dec-1994  mycroft Turn lease_check() into a vnode op, per CSRG.
 1.20 14-Nov-1994  christos fixed struct comment; passed extra argument (struct file *) to open
 1.19 29-Oct-1994  cgd light clean; make sure headers are properly included, types are OK, etc.
 1.18 20-Oct-1994  cgd update for new syscall args description mechanism
 1.17 16-Jul-1994  paulus Support for block special files with sector sizes other than DEV_BSIZE -
if the device has a disklabel with a non-zero sector size value, that
value is used instead of DEV_BSIZE.
 1.16 29-Jun-1994  cgd branches: 1.16.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.15 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.14 24-May-1994  cgd MIN -> min, MAX -> max
 1.13 21-Apr-1994  cgd Convert mount, vnode, and buf structs to use <sys/queue.h>. Also,
some knf and structure frobbing to do along with it.
 1.12 27-Jan-1994  cgd oops; fix that last...
 1.11 27-Jan-1994  cgd hack from Mike Karels to deal with the last close on a controlling
terminal. from 4.4BSD.
 1.10 22-Dec-1993  cgd fix return type of vnode print routine
 1.9 18-Dec-1993  mycroft Canonicalize all #includes.
 1.8 12-Nov-1993  cgd new specfs.h and fifo.h locations
 1.7 30-Oct-1993  glass fix chris typo.
 1.6 29-Oct-1993  cgd limit block sizes requested
 1.5 23-Aug-1993  cgd branches: 1.5.2;
changes from 0.9-ALPHA2 to 0.9-BETA
 1.4 27-Jun-1993  andrew branches: 1.4.2;
ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.
 1.3 20-May-1993  cgd add $Id$ strings, and clean up file headers where necessary
 1.2 21-Mar-1993  cgd after 0.2.2 "stable" patches applied
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.4 01-Mar-1998  fvdl Import some files that were changed after Lite2
 1.1.1.3 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.4.2.1 20-Aug-1993  mycroft Add a small bit of debugging code.
 1.5.2.3 06-Jan-1994  pk Re-instate EOPNOTSUPP.
 1.5.2.2 28-Dec-1993  pk Use ENODEV rather than EOPNOTSUPP for unsupported operations on non-socket devices
 1.5.2.1 12-Nov-1993  cgd new specfs.h and fifo.h locations, and include file syntax updates
 1.16.2.1 16-Jul-1994  cgd update from trunk, per paulus
 1.26.2.1 15-Oct-1995  mycroft Update from main branch.
 1.36.4.1 14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.43.12.2 27-Dec-1999  wrstuden Pull up to last week's -current.
 1.43.12.1 21-Dec-1999  wrstuden Initial commit of recent changes to make DEV_BSIZE go away.

Runs on i386, needs work on other arch's. Main kernel routines should be
fine, but a number of the stand programs need help.

cd, fd, ccd, wd, and sd have been updated. sd has been tested with non-512
byte block devices. vnd, raidframe, and lfs need work.

Non 2**n block support is automatic for LKM's and conditional for kernels
on "options NON_PO2_BLOCKS".
 1.43.6.2 27-Oct-2000  he Pull up revision 1.51 (requested by jmc):
Fix security problem in spec_open().
 1.43.6.1 18-Oct-1999  cgd pull up rev 1.44 from trunk (requested by wrstuden):
In spec_close(), call the device's close routine with the vnode
unlocked if the call might block. Force a non-blocking close if
VXLOCK is set. This eliminates a potential deadlock situation, and
should eliminate the dirty buffers on reboot issue.
 1.44.4.1 19-Oct-1999  fvdl Bring in Kirk McKusick's FFS softdep code on a branch.
 1.44.2.4 21-Apr-2001  bouyer Sync with HEAD
 1.44.2.3 11-Feb-2001  bouyer Sync with HEAD.
 1.44.2.2 22-Nov-2000  bouyer Sync with HEAD.
 1.44.2.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.48.4.4 27-Oct-2001  he Pull up revision 1.59 (via patch, requested by chs):
Change spec_{read,write}() to specify block number in units of
DEV_BSIZE instead of the device's sector size. With this,
/dev/rcd0a and /dev/cd0a return the same data. Fixes PR#3261
and PR#14026.
 1.48.4.3 14-Dec-2000  he Pull up revision 1.50 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.48.4.2 30-Oct-2000  tv Pullup 1.51 [jmc]:
Remove usecount check in spec_open. It fails to catch VALIAS situations
and vfs_mountedon will handle them all correctly.
 1.48.4.1 30-Jul-2000  jdolecek Pullup from trunk (approved by thorpej):
Change lf_advlock() to:
int lf_advlock (struct vop_advlock_args *, struct lockf **, off_t)

This matches its usage. Change inspired by FreeBSD, though we use
off_t instead of u_quad_t as the last argument.

sys/lockf.h rev. 1.9
msdosfs/msdosfs_vnops.c rev. 1.99
kern/vfs_lockf.c rev. 1.17
miscfs/specfs/spec_vnops.c rev. 1.49
nfs/nfs_vnops.c rev. 1.115
ufs/ext2fs/ext2fs_vnops.c rev. 1.28
ufs/ufs/ufs_vnops.c rev. 1.72
 1.53.2.15 11-Nov-2002  nathanw Catch up to -current
 1.53.2.14 17-Sep-2002  nathanw Catch up to -current.
 1.53.2.13 27-Aug-2002  nathanw Catch up to -current.
 1.53.2.12 01-Aug-2002  nathanw Catch up to -current.
 1.53.2.11 15-Jul-2002  nathanw Whitespace.
 1.53.2.10 12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.53.2.9 24-Jun-2002  nathanw Curproc->curlwp renaming.

Change uses of "curproc->l_proc" back to "curproc", which is more like the
original use. Bare uses of "curproc" are now "curlwp".

"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc : NULL)
so that it is always safe to reference curproc (*de*referencing curproc
is another story, but that's always been true).
 1.53.2.8 20-Jun-2002  nathanw Catch up to -current.
 1.53.2.7 14-Nov-2001  nathanw Catch up to -current.
 1.53.2.6 26-Sep-2001  nathanw Catch up to -current.
Again.
 1.53.2.5 21-Sep-2001  nathanw Catch up to -current.
 1.53.2.4 24-Aug-2001  nathanw A few files and lwp/proc conversions I missed in the last big update.
GENERIC runs again.
 1.53.2.3 24-Aug-2001  nathanw Catch up with -current.
 1.53.2.2 21-Jun-2001  nathanw Catch up to -current.
 1.53.2.1 05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.54.2.7 10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.54.2.6 26-Sep-2002  jdolecek spec_kqfilter(): return EOPNOTSUPP for !VCHR case; block devices don't
support kevents, and we don't want to attempt support for any other
files ending up here either (i.e. those which get spec vnodeops via vflush())
 1.54.2.5 06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.54.2.4 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.54.2.3 10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.54.2.2 25-Aug-2001  thorpej Merge Aug 24 -current into the kqueue branch.
 1.54.2.1 10-Jul-2001  lukem add spec_kqfilter()
 1.56.2.6 01-Oct-2001  fvdl Catch up with -current.
 1.56.2.5 28-Sep-2001  fvdl Bring locking state of VCHR vnodes across device entry points back
to what they used to be. The locking state of vnodes across
device functions really needs to be cleaned up, but at least
this is not worse than it was before.
 1.56.2.4 27-Sep-2001  fvdl Do real locking for cloned vnodes (most filesystems have real locking
for spec vnodes, so clones should have it too). Could probably do locking
all the time for spec vnodes, but need to check if vnodes created
during bootstrap with {b,c}devvp will cause trouble if they have actual
locks.
 1.56.2.3 26-Sep-2001  fvdl * add a VCLONED vnode flag that indicates a vnode representing a cloned
device.
* rename REVOKEALL to REVOKEALIAS, and add a REVOKECLONE flag, to pass
to VOP_REVOKE
* the revoke system call will revoke all aliases, as before, but not the
clones
* vdevgone is called when detaching a device, so make it use REVOKECLONE
to get rid of all clones as well
* clean up all uses of VOP_OPEN wrt. locking.
* add a few VOPS to spec_vnops that need to do something when it's a
clone vnode (access and getattr)
* add a copy of the vnode vattr structure of the original 'master' vnode
to the specinfo of a cloned vnode. could possibly redirect getattr to
the 'master' vnode, but this has issues with revoke
* add a vdev_reassignvp function that disassociates a vnode from its
original device, and reassociates it with the specified dev_t. to be
used by cloning devices only, in case a new minor is allocated.
* change all direct references in drivers to v_devcookie and v_rdev
to vdev_privdata(vp) and vdev_rdev(vp). for diagnostic purposes
when debugging race conditions that still exist wrt. locking and
revoking vnodes.
* make the locking state of a vnode consistent when passed to
d_open and d_close (unlocked). locked would be better, but has
some deadlock issues
 1.56.2.2 18-Sep-2001  fvdl Various changes to make cloning devices possible:

* Add an extra argument (struct vnode **) to VOP_OPEN. If it is
not NULL, specfs will create a cloned (aliased) vnode during
the call, and return it there. The caller should release and
unlock the original vnode if a new vnode was returned. The
new vnode is returned locked.

* Add a flag field to the cdevsw and bdevsw structures.
DF_CLONING indicates that it wants a new vnode for each
open (XXX is there a better way? devprop?)

* If a device is cloning, always call the close entry
point for a VOP_CLOSE.


Also, rewrite cons.c to do the right thing with vnodes. Use VOPs
rather than direct device entry calls. Suggested by mycroft@

Light to moderate testing done on an i386 system (arch doesn't matter
though, these are MI changes).
 1.56.2.1 07-Sep-2001  thorpej Commit my "devvp" changes to the thorpej-devvp branch. This
replaces the use of dev_t in most places with a struct vnode *.

This will form the basic infrastructure for real cloning device
support (besides being architecturally cleaner -- it'll be good
to get away from using numbers to represent objects).
 1.59.2.1 12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.61.2.3 29-Aug-2002  gehenna catch up with -current.
 1.61.2.2 15-Jul-2002  gehenna catch up with -current.
 1.61.2.1 16-May-2002  gehenna Replace the direct-access to devsw table with calling devsw APIs.
 1.68.2.6 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.68.2.5 04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.68.2.4 21-Sep-2004  skrll Fix the sync with head I botched.
 1.68.2.3 18-Sep-2004  skrll Sync with HEAD.
 1.68.2.2 03-Aug-2004  skrll Sync with HEAD
 1.68.2.1 02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuzziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.77.6.1 29-Dec-2005  riz Pull up following revision(s) (requested by chs in ticket #10207):
sys/miscfs/specfs/spec_vnops.c: revision 1.83
in spec_ioctl(), don't dereference v_specinfo if it's NULL.
this is needed due to rev. 1.231 of kern/vfs_subr.c, which now sets
v_specinfo to NULL before changing the vnode's ops vector.
 1.77.4.1 29-Dec-2005  riz Pull up following revision(s) (requested by chs in ticket #10207):
sys/miscfs/specfs/spec_vnops.c: revision 1.83
in spec_ioctl(), don't dereference v_specinfo if it's NULL.
this is needed due to rev. 1.231 of kern/vfs_subr.c, which now sets
v_specinfo to NULL before changing the vnode's ops vector.
 1.77.2.1 29-Dec-2005  riz Pull up following revision(s) (requested by chs in ticket #10207):
sys/miscfs/specfs/spec_vnops.c: revision 1.83
in spec_ioctl(), don't dereference v_specinfo if it's NULL.
this is needed due to rev. 1.231 of kern/vfs_subr.c, which now sets
v_specinfo to NULL before changing the vnode's ops vector.
 1.79.6.1 19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.79.4.1 29-Apr-2005  kent sync with -current
 1.80.2.2 11-Nov-2006  bouyer Pull up following revision(s) (requested by jld in ticket #1557):
sys/miscfs/specfs/spec_vnops.c: revision 1.91 via patch
Protect spec_poll from racing against revocation and thus dereferencing a
NULL v_specinfo. Mostly copied (with understanding) from rev 1.83's fix
to spec_ioctl, and needed for the same reason (kern/vfs_subr.c r1.231).
 1.80.2.1 26-Sep-2005  tron branches: 1.80.2.1.2; 1.80.2.1.4;
Pull up following revision(s) (requested by chs in ticket #812):
sys/miscfs/specfs/spec_vnops.c: revision 1.83
in spec_ioctl(), don't dereference v_specinfo if it's NULL.
this is needed due to rev. 1.231 of kern/vfs_subr.c, which now sets
v_specinfo to NULL before changing the vnode's ops vector.
 1.80.2.1.4.1 11-Nov-2006  bouyer Pull up following revision(s) (requested by jld in ticket #1557):
sys/miscfs/specfs/spec_vnops.c: revision 1.91 via patch
Protect spec_poll from racing against revocation and thus dereferencing a
NULL v_specinfo. Mostly copied (with understanding) from rev 1.83's fix
to spec_ioctl, and needed for the same reason (kern/vfs_subr.c r1.231).
 1.80.2.1.2.1 11-Nov-2006  bouyer Pull up following revision(s) (requested by jld in ticket #1557):
sys/miscfs/specfs/spec_vnops.c: revision 1.91 via patch
Protect spec_poll from racing against revocation and thus dereferencing a
NULL v_specinfo. Mostly copied (with understanding) from rev 1.83's fix
to spec_ioctl, and needed for the same reason (kern/vfs_subr.c r1.231).
 1.81.2.7 04-Feb-2008  yamt sync with head.
 1.81.2.6 21-Jan-2008  yamt sync with head
 1.81.2.5 07-Dec-2007  yamt sync with head
 1.81.2.4 27-Oct-2007  yamt sync with head.
 1.81.2.3 03-Sep-2007  yamt sync with head.
 1.81.2.2 30-Dec-2006  yamt sync with head.
 1.81.2.1 21-Jun-2006  yamt sync with head.
 1.83.2.1 20-Oct-2005  yamt adapt specfs and fifofs.
 1.85.6.2 01-Jun-2006  kardel Sync with head.
 1.85.6.1 22-Apr-2006  simonb Sync with head.
 1.85.4.1 09-Sep-2006  rpaulo sync with head
 1.85.2.1 31-Dec-2005  yamt adapt some random parts of kernel to uio_vmspace.
 1.86.6.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.86.4.2 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.86.4.1 08-Mar-2006  elad Adapt to kernel authorization KPI.
 1.86.2.3 14-Sep-2006  yamt sync with head.
 1.86.2.2 03-Sep-2006  yamt sync with head.
 1.86.2.1 24-May-2006  yamt sync with head.
 1.87.6.1 14-Aug-2006  tron Pull up following revision(s) (requested by elad in ticket #15):
sys/miscfs/specfs/spec_vnops.c: revision 1.88
share/man/man9/fileassoc.9: revision 1.7
sys/kern/kern_verifiedexec.c: revision 1.66
sys/sys/verified_exec.h: revision 1.39
sys/sys/fileassoc.h: revision 1.3
lib/libc/gen/sysctl.3: revision 1.178
share/man/man9/veriexec.9: revision 1.4
sys/kern/kern_fileassoc.c: revision 1.6
Pretending to be Elad's keyboard:
fileassoc.diff adds a fileassoc_table_run() routine that allows you to
pass a callback to be called with every entry on a given mount.
veriexec.diff adds some raw device access policies: if raw disk is
opened at strict level 1, all fingerprints on this disk will be
invalidated as a safety measure. level 2 will not allow opening disk
for raw writing if we monitor it, and prevent raw writes to memory.
level 3 will not allow opening any disk for raw writing.
both update all relevant documentation.
veriexec concept is okay blymn@.
 1.88.2.2 12-Jan-2007  ad Sync with head.
 1.88.2.1 18-Nov-2006  ad Sync with head.
 1.89.2.2 10-Dec-2006  yamt sync with head.
 1.89.2.1 22-Oct-2006  yamt sync with head
 1.97.4.1 12-Mar-2007  rmind Sync with HEAD.
 1.98.4.1 11-Jul-2007  mjf Sync with head.
 1.98.2.12 09-Oct-2007  ad Sync with head.
 1.98.2.11 09-Oct-2007  ad Sync with head.
 1.98.2.10 24-Aug-2007  ad Remove the only (and insane) reference to B_TAPE that came along with 386BSD.
 1.98.2.9 20-Aug-2007  ad Sync with HEAD.
 1.98.2.8 20-Aug-2007  ad softdep locking improvements. It hangs looping in flush_inodedep_deps(),
more work required.
 1.98.2.7 19-Aug-2007  ad - Back out the biodone() changes.
- Eliminate B_ERROR (from HEAD).
 1.98.2.6 17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.98.2.5 09-Jun-2007  ad Sync with head.
 1.98.2.4 13-May-2007  ad - Pass the error number and residual count to biodone(), and let it handle
setting error indicators. Prepare to eliminate B_ERROR.
- Add a flag argument to brelse() to be set into the buf's flags, instead
of doing it directly. Typically used to set B_INVAL.
- Add a "struct cpu_info *" argument to kthread_create(), to be used to
create bound threads. Change "bool mpsafe" to "int flags".
- Allow exit of LWPs in the IDL state when (l != curlwp).
- More locking fixes & conversion to the new API.
 1.98.2.3 13-Apr-2007  ad - Make the devsw interface MP safe, and add some comments.
- Allow individual block/character drivers to be marked MP safe.
- Provide wrappers around the device methods that look up the
device, returning ENXIO if it's not found, and acquire the
kernel lock if needed.
 1.98.2.2 13-Apr-2007  ad - Fix a (new) bug where vget tries to acquire freed vnodes' interlocks.
- Minor locking fixes.
 1.98.2.1 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.100.2.2 03-Sep-2007  skrll Sync with HEAD.
 1.100.2.1 15-Aug-2007  skrll Sync with HEAD.
 1.104.6.2 03-Aug-2007  pooka ANSI-fy
 1.104.6.1 03-Aug-2007  pooka file spec_vnops.c was added on branch matt-mips64 on 2007-08-03 08:45:37 +0000
 1.104.4.3 23-Mar-2008  matt sync with HEAD
 1.104.4.2 09-Jan-2008  matt sync with HEAD
 1.104.4.1 06-Nov-2007  matt sync with HEAD
 1.104.2.5 03-Dec-2007  joerg Sync with HEAD.
 1.104.2.4 27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.104.2.3 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.104.2.2 03-Sep-2007  jmcneill Sync with HEAD.
 1.104.2.1 03-Aug-2007  jmcneill file spec_vnops.c was added on branch jmcneill-pm on 2007-09-03 16:48:52 +0000
 1.105.2.1 14-Oct-2007  yamt sync with head.
 1.108.4.2 18-Feb-2008  mjf Sync with HEAD.
 1.108.4.1 08-Dec-2007  mjf Sync with HEAD.
 1.110.6.1 02-Jan-2008  bouyer Sync with HEAD
 1.110.2.1 04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.115.8.1 18-May-2008  yamt sync with head.
 1.115.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.115.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.116.2.5 09-Oct-2010  yamt sync with head
 1.116.2.4 11-Aug-2010  yamt sync with head.
 1.116.2.3 11-Mar-2010  yamt sync with head
 1.116.2.2 04-May-2009  yamt sync with head.
 1.116.2.1 16-May-2008  yamt sync with head.
 1.118.2.1 23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.119.10.3 30-Dec-2008  christos sync with head.
 1.119.10.2 09-Nov-2008  christos account for major and minor being unsigned long long
 1.119.10.1 16-May-2008  christos file spec_vnops.c was added on branch christos-time_t on 2008-11-09 02:05:20 +0000
 1.119.6.3 28-Apr-2009  skrll Sync with HEAD.
 1.119.6.2 03-Mar-2009  skrll Sync with HEAD.
 1.119.6.1 19-Jan-2009  skrll Sync with HEAD.
 1.122.2.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.127.4.4 31-May-2011  rmind sync with head
 1.127.4.3 05-Mar-2011  rmind sync with head
 1.127.4.2 03-Jul-2010  rmind sync with head
 1.127.4.1 16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.127.2.2 22-Oct-2010  uebayasi Sync with HEAD (-D20101022).
 1.127.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.131.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.133.2.1 23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.134.8.2 10-May-2016  snj Pull up following revision(s) (requested by hannken in ticket #1376):
sys/miscfs/specfs/spec_vnops.c: revisions 1.161, 1.162 via patch
When spec_strategy() extracts v_rdev take care to avoid a
race with spec_revoke.
Fixes PR kern/50467 Panic from disconnecting phone while reading its contents
--
Avoid a race with spec_revoke for the assertion too.
Final fix for PR kern/50467 Panic from disconnecting phone while reading
its contents
 1.134.8.1 07-May-2012  riz branches: 1.134.8.1.4; 1.134.8.1.6;
Pull up following revision(s) (requested by chs in ticket #204):
sys/fs/sysvbfs/sysvbfs_vnops.c: revision 1.44
sys/ufs/ffs/ffs_vfsops.c: revision 1.277
sys/fs/v7fs/v7fs_vnops.c: revision 1.11
sys/ufs/chfs/chfs_vnops.c: revision 1.7
sys/ufs/ext2fs/ext2fs_readwrite.c: revision 1.61
sys/miscfs/genfs/genfs_io.c: revision 1.54
sys/kern/vfs_wapbl.c: revision 1.52
sys/uvm/uvm_pager.h: revision 1.43
sys/ufs/ffs/ffs_vnops.c: revision 1.121
sys/kern/vfs_subr.c: revision 1.434
sys/fs/msdosfs/msdosfs_vnops.c: revision 1.83
sys/fs/ntfs/ntfs_vnops.c: revision 1.51
sys/fs/udf/udf_subr.c: revision 1.119
sys/miscfs/specfs/spec_vnops.c: revision 1.135
sys/ufs/ext2fs/ext2fs_vnops.c: revision 1.103
sys/fs/udf/udf_vnops.c: revision 1.71
sys/ufs/ufs/ufs_readwrite.c: revision 1.104
change vflushbuf() to take the full FSYNC_* flags.
translate FSYNC_LAZY into PGO_LAZY for VOP_PUTPAGES() so that
genfs_do_io() can set the appropriate io priority for the I/O.
this is the first part of addressing PR 46325.
mark all wapbl I/O as BPRIO_TIMECRITICAL.
this is the second part of addressing PR 46325.
 1.134.8.1.6.1 10-May-2016  snj Pull up following revision(s) (requested by hannken in ticket #1376):
sys/miscfs/specfs/spec_vnops.c: revisions 1.161, 1.162 via patch
When spec_strategy() extracts v_rdev take care to avoid a
race with spec_revoke.
Fixes PR kern/50467 Panic from disconnecting phone while reading its contents
--
Avoid a race with spec_revoke for the assertion too.
Final fix for PR kern/50467 Panic from disconnecting phone while reading
its contents
 1.134.8.1.4.1 10-May-2016  snj Pull up following revision(s) (requested by hannken in ticket #1376):
sys/miscfs/specfs/spec_vnops.c: revisions 1.161, 1.162 via patch
When spec_strategy() extracts v_rdev take care to avoid a
race with spec_revoke.
Fixes PR kern/50467 Panic from disconnecting phone while reading its contents
--
Avoid a race with spec_revoke for the assertion too.
Final fix for PR kern/50467 Panic from disconnecting phone while reading
its contents
 1.134.6.1 02-Jun-2012  mrg sync to latest -current.
 1.134.2.3 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.134.2.2 23-Jan-2013  yamt sync with head
 1.134.2.1 23-May-2012  yamt sync with head.
 1.135.2.4 03-Dec-2017  jdolecek update from HEAD
 1.135.2.3 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.135.2.2 23-Jun-2013  tls resync from head
 1.135.2.1 25-Feb-2013  tls resync with head
 1.138.4.1 23-Jul-2013  riastradh sync with HEAD
 1.138.2.2 18-May-2014  rmind sync with head
 1.138.2.1 28-Aug-2013  rmind sync with head
 1.143.2.1 10-Aug-2014  tls Rebase.
 1.145.6.1 29-Apr-2016  snj Pull up following revision(s) (requested by hannken in ticket #1154):
sys/miscfs/specfs/spec_vnops.c: revisions 1.161, 1.162
When spec_strategy() extracts v_rdev take care to avoid a
race with spec_revoke.
Fixes PR kern/50467 Panic from disconnecting phone while reading its contents
--
Avoid a race with spec_revoke for the assertion too.
Final fix for PR kern/50467 Panic from disconnecting phone while reading
its contents
 1.145.4.9 28-Aug-2017  skrll Sync with HEAD
 1.145.4.8 05-Feb-2017  skrll Sync with HEAD
 1.145.4.7 05-Oct-2016  skrll Sync with HEAD
 1.145.4.6 22-Apr-2016  skrll Sync with HEAD
 1.145.4.5 19-Mar-2016  skrll Sync with HEAD
 1.145.4.4 27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.145.4.3 22-Sep-2015  skrll Sync with HEAD
 1.145.4.2 06-Jun-2015  skrll Sync with HEAD
 1.145.4.1 06-Apr-2015  skrll Sync with HEAD
 1.145.2.1 29-Apr-2016  snj Pull up following revision(s) (requested by hannken in ticket #1154):
sys/miscfs/specfs/spec_vnops.c: revisions 1.161, 1.162
When spec_strategy() extracts v_rdev take care to avoid a
race with spec_revoke.
Fixes PR kern/50467 Panic from disconnecting phone while reading its contents
--
Avoid a race with spec_revoke for the assertion too.
Final fix for PR kern/50467 Panic from disconnecting phone while reading
its contents
 1.162.2.8 26-Apr-2017  pgoyette Sync with HEAD
 1.162.2.7 20-Mar-2017  pgoyette Sync with HEAD
 1.162.2.6 07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.162.2.5 23-Jul-2016  pgoyette Simplify, remove redundant code.
 1.162.2.4 23-Jul-2016  pgoyette Restore original handling of ioctl() returns. If the underlying disk's
ioctl() returns success, we call uvm_vnp_setsize(). Regardless of any
error from the ioctl() call we should return success.
 1.162.2.3 22-Jul-2016  pgoyette Return the actual error code, rather than blind success.
 1.162.2.2 21-Jul-2016  pgoyette fix an error path to call {b,c}devsw_release()
 1.162.2.1 20-Jul-2016  pgoyette Adapt machine-independent code to the new {b,c}devsw reference-counting
(using localcount(9)). All callers of {b,c}devsw_lookup() now call
{b,c}devsw_lookup_acquire() which retains a reference on the 'struct
{b,c}devsw'. This reference must be released by the caller once it is
finished with the structure's content (or other data that would disappear
if the 'struct {b,c}devsw' were to disappear).
 1.168.2.1 21-Apr-2017  bouyer Sync with HEAD
 1.171.2.1 27-Apr-2017  pgoyette Restore all work from the former pgoyette-localcount branch (which is
now abandoned due to cvs merge botch).

The branch now builds, and installs via anita. There are still some
problems (cgd is non-functional and all atf tests time-out) but they
will get resolved soon.
 1.173.2.1 01-Jul-2017  snj Pull up following revision(s) (requested by hannken in ticket #76):
sys/miscfs/specfs/spec_vnops.c: revision 1.174
Refuse to open a block device with zero open count when it has
a mountpoint set. This may happen after forced detach or unplug
of a mounted block device.
 1.174.6.3 21-Apr-2020  martin Sync with HEAD
 1.174.6.2 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.174.6.1 10-Jun-2019  christos Sync with HEAD
 1.174.4.1 06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.176.6.1 20-Apr-2020  bouyer Sync with HEAD
 1.180.2.1 03-Jan-2021  thorpej Sync w/ HEAD.
 1.181.4.1 01-Aug-2021  thorpej Sync with HEAD.
 1.218.6.1 02-Aug-2025  perseant Sync with HEAD
 1.54 22-Apr-2023  hannken Remove unused specdev member sd_rdev.

Ride 10.99.4
 1.53 26-Oct-2022  riastradh miscfs/specfs/specdev.h: New home for extern spec_vnodeop_opv_desc.

Also use it for extern spec_vnodeop_p, which is already there.
 1.52 28-Mar-2022  riastradh specfs: Reorder struct specnode members to save padding.

Shrinks from 40 bytes to 32 bytes on LP64 systems this way.
 1.51 28-Mar-2022  riastradh specfs: Let spec_node_lookup_by_dev wait for reclaim to finish.

vdevgone relies on this to ensure that if there is a concurrent
revoke in progress, it will wait for that revoke to finish -- that
way, it can guarantee all I/O operations have completed and the
device is closed.
 1.50 28-Mar-2022  riastradh specfs: Prevent new opens while close is waiting to drain.

Otherwise, bdev/cdev_close could have cancelled all _existing_ opens,
and waited for them to complete (and freed resources used by them) --
but a new one could start, and hang (e.g., a tty), at the same time
spec_close tries to drain all pending I/O operations, one of which
(the new open) is now hanging indefinitely.

Preventing the new open from even starting until bdev/cdev_close is
finished and all I/O operations have drained avoids this deadlock.
 1.49 28-Mar-2022  riastradh specfs: Drain all I/O operations after last .d_close call.

New kind of I/O reference on specdevs, sd_iocnt. This could be done
with psref instead; I chose a reference count instead for now because
we already have to take a per-object lock anyway, v_interlock, for
vdead_check, so another atomic is not likely to hurt much more. We
can always change the mechanism inside spec_io_enter/exit/drain later
on.

Make sure every access to vp->v_rdev or vp->v_specnode and every call
to a devsw operation is protected either:

- by the vnode lock (with vdead_check if we unlocked/relocked),
- by positive sd_opencnt,
- by spec_io_enter/exit, or
- by sd_opencnt management in open/close.
 1.48 28-Mar-2022  riastradh specfs: Resolve a race between close and a failing reopen.
 1.47 28-Mar-2022  riastradh specfs: Document sn_opencnt, sd_opencnt, sd_refcnt.
 1.46 18-Jul-2021  dholland Abolish all the silly indirection macros for initializing vnode ops tables.

These are things of the form #define foofs_op genfs_op, or #define
foofs_op genfs_eopnotsupp, or similar. They serve no purpose besides
obfuscation, and have gotten cutpasted all over everywhere.
 1.45 18-Jul-2021  dholland Use macros for the canned parts of device and fifo vnode op tables.

Add GENFS_SPECOP_ENTRIES and GENFS_FIFOOP_ENTRIES macros that contain
the portion of the vnode ops table declaration that is
(conservatively) the same in every fs. Use these in every fs that
supports devices and/or fifos with separate ops tables.

Note that ptyfs works differently (it has one type of vnode with
open-coded dispatch to the specfs code, which I haven't changed in
this commit) and rump/librump/rumpvfs/rumpfs.c has an indirect dynamic
dispatch that already does more or less the same thing, which I also
haven't changed.

Also note that this anticipates a few bits in the next changeset here
and there, and adds missing but unreachable calls in some cases (e.g.
most fses weren't defining whiteout on devices and fifos, but it isn't
reachable there), and it changes parsepath on devices and fifos to
genfs_badop from genfs_parsepath (but it's not reachable there
either).

It appears that devices in kernfs were missing kqfilter, so it's
possible that if you try to use kqueue on /kern/rootdev that it'll
explode.

And finally note that the ops declaration tables aren't
order-dependent. (Other than vop_default_desc has to come first.)
Otherwise this wouldn't work.
 1.44 23-Jun-2015  hannken branches: 1.44.34;
Add a vfs_newvnode() method to deadfs and use it to create
anonymous device vnodes with bdevvp() and cdevvp().

Implement spec_inactive() and spec_reclaim() to handle these nodes.
 1.43 25-Jul-2014  dholland branches: 1.43.4;
Implement spec_fdiscard() using bdev_discard() and cdev_discard().
Also define spec_fallocate() to genfs_eopnotsupp().
 1.42 30-Sep-2013  hannken branches: 1.42.2;
Replace macro v_specmountpoint with two functions spec_node_getmountedfs()
and spec_node_setmountedfs() to manage the file system mounted on a device.
Assert the device is a block device.

Welcome to 6.99.24

Discussed on tech-kern@ some time ago.

Reviewed by: David Holland <dholland@netbsd.org>
 1.41 21-Apr-2013  dholland branches: 1.41.4;
add missing spec_whiteout
 1.40 13-Feb-2013  hannken Make the spec_node table implementation private to spec_vnops.c.

To retrieve a spec_node, two new lookup functions (by device or by mount)
are implemented. Both return a referenced vnode, for an opened block device
the opened vnode is returned so further diagnostic checks "vp == ... sd_bdevvp"
will not fire. Otherwise any vnode matching the criteria gets returned.

No objections on tech-kern.

Welcome to 6.99.17
 1.39 14-Nov-2009  elad branches: 1.39.2; 1.39.12; 1.39.22;
- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.
 1.38 06-Oct-2009  elad Factor out a block of code that appears in three places (Veriexec, keylock,
and securelevel) so that others can use it as well.
 1.37 29-Dec-2008  pooka Rename specfs_lock as device_lock and move it from specfs to devsw.
Relaxes kernel dependency on vfs.
 1.36 28-Apr-2008  martin branches: 1.36.8;
Remove clause 3 and 4 from TNF licenses
 1.35 25-Jan-2008  ad branches: 1.35.6; 1.35.8; 1.35.10;
Remove VOP_LEASE. Discussed on tech-kern.
 1.34 24-Jan-2008  ad specfs changes for PR kern/37717 (raidclose() is no longer called on
shutdown). There are still problems with device access and a PR will be
filed.

- Kill checkalias(). Allow multiple vnodes to reference a single device.

- Don't play dangerous tricks with block vnodes to ensure that only one
vnode can describe a block device. Instead, prohibit concurrent opens of
block devices. As a bonus remove the unreliable code that prevents
multiple file system mounts on the same device. It's no longer needed.

- Track opens by vnode and by device. Issue cdev_close() when the last open
goes away, instead of abusing vnode::v_usecount to tell if the device is
open.
 1.33 07-Oct-2007  hannken branches: 1.33.4;
Update the file system copy-on-write handler.

- Instead of hooking the handler on the specdev of a mounted file system
hook directly on the `struct mount'.

- Rename from `vn_cow_*' to `fscow_*' and move to `kern/vfs_trans.c'. Use
`mount_*specific' instead of clobbering `struct mount' or `struct specinfo'.

- Replace the hand-made reader/writer lock with a krwlock.

- Keep `vn_cow_*' functions and mark as obsolete.

- Welcome to NetBSD 4.99.32 - `struct specinfo' changed size.

Reviewed by: Jason Thorpe <thorpej@netbsd.org>
 1.32 03-Aug-2007  pooka branches: 1.32.2; 1.32.4; 1.32.6; 1.32.8;
cleanup unused prototype
 1.31 22-Jul-2007  pooka Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden
 1.30 14-May-2006  elad branches: 1.30.18; 1.30.28;
integrate kauth.
 1.29 11-Dec-2005  christos branches: 1.29.4; 1.29.6; 1.29.8; 1.29.10; 1.29.12;
merge ktrace-lwp.
 1.28 02-Nov-2005  yamt merge yamt-vop branch. remove following VOPs.

VOP_BLKATOFF
VOP_VALLOC
VOP_BALLOC
VOP_REALLOCBLKS
VOP_VFREE
VOP_TRUNCATE
VOP_UPDATE
 1.27 30-Aug-2005  xtraeme branches: 1.27.2;
Remove __P()
 1.26 25-May-2004  hannken branches: 1.26.12;
Add ffs internal snapshots. Written by Marshall Kirk McKusick for FreeBSD.

- Not enabled by default. Needs kernel option FFS_SNAPSHOT.
- Change parameters of ffs_blkfree.
- Let the copy-on-write functions return an error so spec_strategy
may fail if the copy-on-write fails.
- Change genfs_*lock*() to use vp->v_vnlock instead of &vp->v_lock.
- Add flag B_METAONLY to VOP_BALLOC to return indirect block buffer.
- Add a function ffs_checkfreefile needed for snapshot creation.
- Add special handling of snapshot files:
Snapshots may not be opened for writing and the attributes are read-only.
Use the mtime as the time this snapshot was taken.
Deny mtime updates for snapshot files.
- Add function transferlockers to transfer any waiting processes from
one lock to another.
- Add vfsop VFS_SNAPSHOT to take a snapshot and make it accessible through
a vnode.
- Add snapshot support to ls, fsck_ffs and dump.

Welcome to 2.0F.

Approved by: Jason R. Thorpe <thorpej@netbsd.org>
 1.25 14-Feb-2004  hannken Add a generic copy-on-write hook to add/remove functions that will be
called with every buffer written through spec_strategy().

Used by fss(4). Future file-system-internal snapshots will need them too.

Welcome to 1.6ZK

Approved by: Jason R. Thorpe <thorpej@netbsd.org>
 1.24 07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.23 06-Jan-2003  matt branches: 1.23.2;
Add multiple inclusion protection.
 1.22 23-Oct-2002  jdolecek merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework.
Currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals.

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
 1.21 12-May-2002  matt Extern speclisth
 1.20 17-Aug-2001  chs branches: 1.20.2;
add definitions for UBCification of block devices.
 1.19 08-Dec-1999  sommerfeld branches: 1.19.6; 1.19.8;
Add appropriate VOP_FCNTL handlers to deadfs and specfs ops vectors.
 1.18 15-Nov-1999  fvdl Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O
 1.17 01-Mar-1998  fvdl branches: 1.17.14; 1.17.16; 1.17.20;
Merge with Lite2 + local changes
 1.16 11-Apr-1997  kleink Implement a POSIX compliant genfs VOP_SEEK() and use it in the appropriate
places; by Chris G. Demetriou and myself.
 1.15 02-Apr-1997  kleink added advisory record locking support
 1.14 07-Sep-1996  mycroft Implement poll(2).
 1.13 01-Sep-1996  mycroft Add a set of generic file system operations that most file systems use.
Also, fix some time stamp bogosities.
 1.12 13-Feb-1996  mycroft GC *_nullop(). Minor nits.
 1.11 09-Feb-1996  christos miscfs prototype changes
 1.10 15-Oct-1995  mycroft Implement VOP_BWRITE() using vn_bwrite(), per r_friedl@informatik.uni-kl.de.
 1.9 13-Dec-1994  mycroft branches: 1.9.2;
Turn lease_check() into a vnode op, per CSRG.
 1.8 29-Jun-1994  cgd New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
 1.7 08-Jun-1994  mycroft Update to 4.4-Lite fs code, with local changes.
 1.6 22-Dec-1993  cgd fix return type of vnode print routine
 1.5 07-Sep-1993  ws Changes to VFS readdir semantics
NFS changes for better cookie support
ISOFS changes for better Rockridge support and support for generation numbers
 1.4 27-Jun-1993  andrew ANSIfications - lots of function prototyping.
 1.3 20-May-1993  cgd add rcs ids as necessary, and also clean up headers
 1.2 19-Apr-1993  mycroft Add consistent multiple-inclusion protection.
 1.1 21-Mar-1993  cgd branches: 1.1.1;
Initial revision
 1.1.1.3 01-Mar-1998  fvdl Import 4.4BSD-Lite2
 1.1.1.2 01-Mar-1998  fvdl Import 4.4BSD-Lite for reference
 1.1.1.1 21-Mar-1993  cgd initial import of 386bsd-0.1 sources
 1.9.2.1 15-Oct-1995  mycroft Update from main branch.
 1.17.20.2 27-Dec-1999  wrstuden Pull up to last week's -current.
 1.17.20.1 21-Dec-1999  wrstuden Initial commit of recent changes to make DEV_BSIZE go away.

Runs on i386, needs work on other arch's. Main kernel routines should be
fine, but a number of the stand programs need help.

cd, fd, ccd, wd, and sd have been updated. sd has been tested with non-512
byte block devices. vnd, raidframe, and lfs need work.

Non 2**n block support is automatic for LKM's and conditional for kernels
on "options NON_PO2_BLOCKS".
 1.17.16.2 26-Oct-1999  fvdl Merge changes in the trickle-sync and softdep code as done by Kirk McKusick
in FreeBSD since the version that we based the branch on. Merging mostly
done by Ethan Solomita <ethan@geocast.com>.

Also, make sure the syncer thread/process isn't active when we're
unmounting a filesystem. This could wreak havoc. XXX should be done
on a per-mountpoint basis, but especially the softdep code would
end up to be a big pile of vfs_busy() calls.
 1.17.16.1 19-Oct-1999  fvdl Bring in Kirk McKusick's FFS softdep code on a branch.
 1.17.14.1 20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.19.8.3 23-Jun-2002  jdolecek catch up with -current on kqueue branch
 1.19.8.2 25-Aug-2001  thorpej Merge Aug 24 -current into the kqueue branch.
 1.19.8.1 10-Jul-2001  lukem add spec_kqfilter()
 1.19.6.4 07-Jan-2003  thorpej Sync with HEAD.
 1.19.6.3 11-Nov-2002  nathanw Catch up to -current
 1.19.6.2 20-Jun-2002  nathanw Catch up to -current.
 1.19.6.1 24-Aug-2001  nathanw Catch up with -current.
 1.20.2.5 01-Oct-2001  fvdl Catch up with -current.
 1.20.2.4 27-Sep-2001  fvdl Do real locking for cloned vnodes (most filesystems have real locking
for spec vnodes, so clones should have it too). Could probably do locking
all the time for spec vnodes, but need to check if vnodes created
during bootstrap with {b,c}devvp will cause trouble if they have actual
locks.
 1.20.2.3 26-Sep-2001  fvdl * add a VCLONED vnode flag that indicates a vnode representing a cloned
device.
* rename REVOKEALL to REVOKEALIAS, and add a REVOKECLONE flag, to pass
to VOP_REVOKE
* the revoke system call will revoke all aliases, as before, but not the
clones
* vdevgone is called when detaching a device, so make it use REVOKECLONE
to get rid of all clones as well
* clean up all uses of VOP_OPEN wrt. locking.
* add a few VOPS to spec_vnops that need to do something when it's a
clone vnode (access and getattr)
* add a copy of the vnode vattr structure of the original 'master' vnode
to the specinfo of a cloned vnode. could possibly redirect getattr to
the 'master' vnode, but this has issues with revoke
* add a vdev_reassignvp function that disassociates a vnode from its
original device, and reassociates it with the specified dev_t. to be
used by cloning devices only, in case a new minor is allocated.
* change all direct references in drivers to v_devcookie and v_rdev
to vdev_privdata(vp) and vdev_rdev(vp). for diagnostic purposes
when debugging race conditions that still exist wrt. locking and
revoking vnodes.
* make the locking state of a vnode consistent when passed to
d_open and d_close (unlocked). locked would be better, but has
some deadlock issues
 1.20.2.2 18-Sep-2001  fvdl Various changes to make cloning devices possible:

* Add an extra argument (struct vnode **) to VOP_OPEN. If it is
not NULL, specfs will create a cloned (aliased) vnode during
the call, and return it there. The caller should release and
unlock the original vnode if a new vnode was returned. The
new vnode is returned locked.

* Add a flag field to the cdevsw and bdevsw structures.
DF_CLONING indicates that it wants a new vnode for each
open (XXX is there a better way? devprop?)

* If a device is cloning, always call the close entry
point for a VOP_CLOSE.


Also, rewrite cons.c to do the right thing with vnodes. Use VOPs
rather than direct device entry calls. Suggested by mycroft@

Light to moderate testing done on an i386 system (arch doesn't matter
though, these are MI changes).
 1.20.2.1 07-Sep-2001  thorpej Commit my "devvp" changes to the thorpej-devvp branch. This
replaces the use of dev_t in most places with a struct vnode *.

This will form the basic infrastructure for real cloning device
support (besides being architecturally cleaner -- it'll be good
to get away from using numbers to represent objects).
 1.23.2.4 10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.23.2.3 21-Sep-2004  skrll Fix the sync with head I botched.
 1.23.2.2 18-Sep-2004  skrll Sync with HEAD.
 1.23.2.1 03-Aug-2004  skrll Sync with HEAD
 1.26.12.4 04-Feb-2008  yamt sync with head.
 1.26.12.3 27-Oct-2007  yamt sync with head.
 1.26.12.2 03-Sep-2007  yamt sync with head.
 1.26.12.1 21-Jun-2006  yamt sync with head.
 1.27.2.1 20-Oct-2005  yamt adapt specfs and fifofs.
 1.29.12.1 24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.29.10.1 06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.29.8.1 24-May-2006  yamt sync with head.
 1.29.6.1 01-Jun-2006  kardel Sync with head.
 1.29.4.1 09-Sep-2006  rpaulo sync with head
 1.30.28.1 15-Aug-2007  skrll Sync with HEAD.
 1.30.18.3 09-Oct-2007  ad Sync with head.
 1.30.18.2 20-Aug-2007  ad Sync with HEAD.
 1.30.18.1 13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.32.8.2 03-Aug-2007  pooka cleanup unused prototype
 1.32.8.1 03-Aug-2007  pooka file specdev.h was added on branch matt-mips64 on 2007-08-03 08:50:24 +0000
 1.32.6.1 14-Oct-2007  yamt sync with head.
 1.32.4.2 23-Mar-2008  matt sync with HEAD
 1.32.4.1 06-Nov-2007  matt sync with HEAD
 1.32.2.2 26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.32.2.1 03-Aug-2007  joerg file specdev.h was added on branch jmcneill-pm on 2007-10-26 15:48:57 +0000
 1.33.4.1 18-Feb-2008  mjf Sync with HEAD.
 1.35.10.3 11-Mar-2010  yamt sync with head
 1.35.10.2 04-May-2009  yamt sync with head.
 1.35.10.1 16-May-2008  yamt sync with head.
 1.35.8.1 18-May-2008  yamt sync with head.
 1.35.6.2 17-Jan-2009  mjf Sync with HEAD.
 1.35.6.1 02-Jun-2008  mjf Sync with HEAD.
 1.36.8.1 19-Jan-2009  skrll Sync with HEAD.
 1.39.22.4 03-Dec-2017  jdolecek update from HEAD
 1.39.22.3 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.39.22.2 23-Jun-2013  tls resync from head
 1.39.22.1 25-Feb-2013  tls resync with head
 1.39.12.1 22-May-2014  yamt sync with head.

for reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.39.2.3 28-May-2010  uebayasi specdev::v_phys_addr is now specdev::v_physseg.
 1.39.2.2 28-Apr-2010  uebayasi When mounting a block device as XIP, pass registered struct vm_physseg *
as a cookie from the block device to the caller (== mount code).
struct vm_physseg * will be passed to XIP vnode pager
(genfs_do_getpages_xip()), then converted back to paddr_t.

(My future plan is to pass struct vm_physseg * back to the fault handler,
and to pmap_enter() as is.)
 1.39.2.1 23-Mar-2010  uebayasi Put run-time XIP-specific per-mount data in struct specdev, not struct mount.
 1.41.4.1 18-May-2014  rmind sync with head
 1.42.2.1 10-Aug-2014  tls Rebase.
 1.43.4.1 22-Sep-2015  skrll Sync with HEAD
 1.44.34.1 01-Aug-2021  thorpej Sync with HEAD.
