Home | History | Annotate | Download | only in ext2fs
History log of /src/sys/ufs/ext2fs/ext2fs_vfsops.c
RevisionDateAuthorComments
 1.229  16-Feb-2025  joe remove unecessary branching

break is executed regardless the path of the if branch
 1.228  30-Dec-2024  hannken emove comment "we are always called with the filesystem marked `MPBUSY'."
above some xxx_sync() operations. These operations get called without
any exclusive lock.

This comment appeared with "add quota support" on 1990-05-02.
On 1998/02/18 MNT_MPBUSY disappeared when vfs_busy() was changed from
an exclusive lock to a shared lock.

PR kern/58837 "ffs: Missing locking around fs_fmod/time"
 1.227  02-Jul-2024  rin ext2fs: Fix copy-paste for PR kern/58388
 1.226  01-Jul-2024  riastradh ext2fs: Fix indexing of group descriptors on disk.

XXX Evidently we need some more automatic tests for this!

PR kern/58388
 1.225  27-Aug-2023  christos - fix cgload/cgsave inconsistencies
- add a constant for the rev 0 group descriptor size
 1.224  26-Aug-2023  christos fix kmem_free size for e2fs_gd
 1.223  26-Aug-2023  riastradh ext2fs: Nix trailing whitespace.
 1.222  25-Aug-2023  christos Support INCOMPAT_64BIT on ext4 (Vladimir 'phcoder' Serbinenko)
 1.221  22-May-2022  andvar branches: 1.221.4;
fix various small typos, mainly in comments.
 1.220  19-Mar-2022  hannken Remove now unused VV_LOCKSWORK, all file systems support locking.

Remove unused predicates vn_locked() and vn_anylocked().

Welcome to 9.99.95
 1.219  16-May-2020  christos Add ACL support for FFS. From FreeBSD.
 1.218  04-Apr-2020  ad Merge the remaining changes from the ad-namecache branch, affecting namei()
and getcwd():

- push vnode locking back as far as possible.
- do most lookups directly in the namecache, avoiding vnode locks & refs.
- don't block new refs to vnodes across VOP_INACTIVE().
- get shared locks for VOP_LOOKUP() if the file system supports it.
- correct lock types for VOP_ACCESS() / VOP_GETATTR() in a few places.

Possible future enhancements:

- make the lookups lockless.
- support dotdot lookups by being lockless and inferring absence of chroot.
- maybe make it work for layered file systems.
- avoid vnode references at the root & cwd.
 1.217  16-Mar-2020  pgoyette Use the module subsystem's ability to process SYSCTL_SETUP() entries to
automate installation of sysctl nodes.

Note that there are still a number of device and pseudo-device modules
that create entries tied to individual device units, rather than to the
module itself. These are not changed.
 1.216  27-Feb-2020  ad Tighten up the locking around vp->v_iflag a little more after the recent
split of vmobjlock & v_interlock.
 1.215  17-Jan-2020  ad VFS_VGET(), VFS_ROOT(), VFS_FHTOVP(): give them a "int lktype" argument, to
allow us to get shared locks (or no lock) on the returned vnode. Matches
FreeBSD.
 1.214  20-Jun-2019  pgoyette branches: 1.214.2; 1.214.4;
Split the ufs code out of the ffs module and into its own module.

Adapt chfs and ext2fs modules accordingly.
 1.213  01-Jan-2019  hannken Add "void *extra" argument to vcache_new() so a file system may
pass more information about the file to create.

Welcome to 8.99.30
 1.212  10-Dec-2018  maxv Remove unused mbuf.h includes.
 1.211  28-May-2018  chs branches: 1.211.2;
add a genfs method to allow a file system to limit the range of pages
that are given to a single GOP_WRITE() call. needed by ZFS.
 1.210  30-Jul-2017  riastradh branches: 1.210.2;
kmem_xyz(sizeof(struct foo)) --> kmem_xyz(sizeof(*foo))

No change to amd64 binary.
 1.209  28-May-2017  hannken Change ext2fs to use vcache_new like we did for ffs:
- Change ext2fs_valloc to return an inode number.
- Make ext2fs_makeinode private to ext2fs_vnops.c and
pass vattr instead of mode.
 1.208  17-Apr-2017  hannken branches: 1.208.2;
Remove unused argument "nextp" from vfs_busy() and vfs_unbusy().
Remove argument "keepref" from vfs_unbusy() and add vfs_ref() where needed.
 1.207  17-Apr-2017  hannken Add vfs_ref(mp) and vfs_rele(mp) to add or remove a reference to
struct mount. Rename vfs_destroy(mp) to vfs_rele(mp) and replace
incrementing mp->mnt_refcnt with vfs_ref(mp).
 1.206  01-Apr-2017  riastradh KASSERT(mutex_owned(vp->v_interlock)) in vnode iterator selector.
 1.205  17-Feb-2017  hannken Add generic genfs_suspendctl() and use it for all file systems.
Layered file systems need work.
 1.204  25-Aug-2016  christos branches: 1.204.2;
put back second strlcpy; pointed out by dholland.
 1.203  23-Aug-2016  christos CID 1371644: use strlcpy, remove dup copy.
 1.202  20-Aug-2016  jdolecek fix code which sets REV1 e2fs_fsmnt, set also mount time and mount count
 1.201  20-Aug-2016  jdolecek adjust ext2fs_loadvnode_content() to do the sanity checking before allocating
memory, and avoid reallocaing memory on vnode reload
 1.200  20-Aug-2016  jdolecek add support for GDT_CSUM AKA uninit_bg feature
 1.199  14-Aug-2016  jdolecek switch code to use the EXT2_HAS_{COMPAT|ROCOMPAT|INCOMPAT}_FEATURE() macros instead of open coding the checks
 1.198  13-Aug-2016  christos KNF, no functional changes...
 1.197  05-Aug-2016  jdolecek add devel ifndefs for incompat/rocompat features so that it's possible
to ignore them and mount the filesystem; default is for the mount to fail
 1.196  03-Aug-2016  pgoyette Update previous. Since original format was %llu, replace it with
% PRIu64 (unsigned).
 1.195  03-Aug-2016  pgoyette Use correct printf() format for inode (fixes build for me)
 1.194  03-Aug-2016  jdolecek support arbitrary ext3/ext4 inode size, add all the new ext4 fields ext2fs_dinode, and add support for loading the extra inode data
 1.193  28-Mar-2015  maxv branches: 1.193.2;
Remove the 'cred' argument from bread(). Remove a now unused var in
ffs_snapshot.c. Update the man page accordingly.

ok hannken@
 1.192  27-Mar-2015  riastradh Disentangle buffer-cached I/O from page-cached I/O in UFS.

Page-cached I/O is used for regular files, and is initiated by VFS
users such as userland and NFS.

Buffer-cached I/O is used for directories and symlinks, and is issued
only internally by UFS.

New UFS routine ufs_bufio replaces vn_rdwr for internal use.
ufs_bufio is implemented by new UFS operations uo_bufrd/uo_bufwr,
which sit in ufs_readwrite.c alongside the VOP_READ/VOP_WRITE
implementations.

I preserved the code as much as possible and will leave further
simplification for future commits. I kept the ulfs_readwrite.c
copypasta close to ufs_readwrite.c in case we ever want to merge them
back; likewise ext2fs_readwrite.c.

No externally visible semantic change. All atf fs tests still pass.
 1.191  17-Mar-2015  hannken Change ffs to use vcache_new:
- Change ffs_valloc to return an inode number.
- Remove now obsolete UFS operations UFS_VALLOC and UFS_VFREE.
- Make ufs_makeinode private to ufs_vnops.c and pass vattr instead of mode.
 1.190  23-Feb-2015  maxv Hum. Perhaps I missed a bit of the specification. Let's not be that
severe when checking the superblock.

Should fix ATF.
 1.189  22-Feb-2015  maxv Merge _sbcompute() and _sbcheck() into _sbfill().

In ext2fs_sbfill(), check more fields of the superblock, to prevent
several kernel panics when mounting/unmounting a disk.
 1.188  20-Feb-2015  maxv Several fixes:
- rename ext2fs_checksb() -> ext2fs_sbcheck(): more consistent
- in ext2fs_sbcheck(), add a check to ensure e2fs_inode_size!=0,
otherwise division by zero
- add ext2fs_sbcompute(), to compute dynamic values of the superblock.
It is done twice in _reload() and _mountfs(), so put it in a function.
- reorder the code in charge of loading the superblock: now, read the
superblock, swap it directly, and *then* pass it to ext2fs_sbcheck().
It is similar to what ffs now does. It is better since the fields don't
need to be swapped on the fly in ext2fs_sbcheck().
Tested on amd64.
 1.187  19-Feb-2015  maxv e2fs_sbcheck(): add a check to ensure e2fs_bpg!=0. Otherwise the kernel
panics with a division by zero.

While here, remove the #ifdef's.
 1.186  09-Nov-2014  maxv branches: 1.186.2;
Do not uselessly include <sys/malloc.h>.
 1.185  19-Sep-2014  matt curlwp can never be NULL now.
 1.184  22-Aug-2014  hannken Use mount from argument "mp", "vp->v_mount" is not valid here.

PR kern/49142 (panic in ext2fs_loadvnode mounting an ext2fs filesystem)

Needs pullup to -7
 1.183  09-Jul-2014  maxv branches: 1.183.2;
Remove ROOTNAME (unused).
 1.182  24-May-2014  christos Introduce a selector function to the vfs vnode iterator so that we don't
need to vget() vnodes that we are not interested at, and optimize locking
a bit. Iterator changes reviewed by Hannken (thanks), the rest of the bugs
are mine.
 1.181  08-May-2014  hannken Add a global vnode cache:

- vcache_get() retrieves a referenced and initialised vnode / fs node pair.
- vcache_remove() removes a vnode / fs node pair from the cache.

On cache miss vcache_get() calls new vfs operation vfs_loadvnode() to
initialise a vnode / fs node pair. This call is guaranteed exclusive,
no other thread will try to load this vnode / fs node pair.

Convert ufs/ext2fs, ufs/ffs and ufs/mfs to use this interface.

Remove now unused ufs/ufs_ihash

Discussed on tech-kern.

Welcome to 6.99.41
 1.180  16-Apr-2014  maxv An (un)privileged user can easily make the kernel dereference a NULL
pointer.

The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).

ok christos@
 1.179  23-Mar-2014  hannken branches: 1.179.2;
Change all vfsops to use C99 designated initializers.

No functional changes intended.
 1.178  17-Mar-2014  hannken Change ext2fs_sync() to use vfs_vnode_iterator.
 1.177  05-Mar-2014  hannken Current support for iterating over mnt_vnodelist is rudimentary. Every
caller has to care about list and vnode mutexes, reference count being zero,
intermediate vnode states like VI_CLEAN, VI_XLOCK, VI_MARKER and so on.

Add an interface to iterate over a vnode list:

void vfs_vnode_iterator_init(struct mount *mp, struct vnode_iterator **marker)
void vfs_vnode_iterator_destroy(struct vnode_iterator *marker)
bool vfs_vnode_iterator_next(struct vnode_iterator *marker, struct vnode **vpp)

vfs_vnode_iterator_next() returns either "false / *vpp == NULL" when done
or "true / *vpp != NULL" to return the next referenced vnode from the list.

To make vrecycle() work in this environment change it to

bool vrecycle(struct vnode *vp)

where "vp" is a referenced vnode to be destroyed if this is the last reference.

Discussed on tech-kern.

Welcome to 6.99.34
 1.176  25-Feb-2014  pooka Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
 1.175  23-Nov-2013  christos change the mountlist CIRCLEQ into a TAILQ
 1.174  29-Oct-2013  hannken Vnode API cleanup pass 1.

- Make these defines and functions private to vfs_vnode.c:

VC_MASK, VC_LOCK, DOCLOSE, VI_IANCTREDO and VI_INACTNOW
vclean() and vrelel()

- Remove the long time unused lwp argument from vrecycle().

- Remove vtryget(), it is responsible for ugly hacks and doesn't
look that effective.

Presented on tech-kern.

Welcome to 6.99.25
 1.173  30-Sep-2013  hannken Replace macro v_specmountpoint with two functions spec_node_getmountedfs()
and spec_node_setmountedfs() to manage the file system mounted on a device.
Assert the device is a block device.

Welcome to 6.99.24

Discussed on tech-kern@ some time ago.

Reviewed by: David Holland <dholland@netbsd.org>
 1.172  11-Aug-2013  dholland Kill off uo_unmark_vnode/UFS_UNMARK_VNODE as it's now a leftover.
 1.171  23-Jun-2013  dholland branches: 1.171.2;
fsbtodb() -> FFS_FSBTODB(), EXT2_FSBTODB(), or MFS_FSBTODB()
dbtofsb() -> FFS_DBTOFSB() or EXT2_DBTOFSB()

(Christos already did the lfs ones a few days back)
 1.170  19-Jun-2013  dholland Rename ambiguous macros:
MAXDIRSIZE -> UFS_MAXDIRSIZE or LFS_MAXDIRSIZE
NINDIR -> FFS_NINDIR, EXT2_NINDIR, LFS_NINDIR, or MFS_NINDIR
INOPB -> FFS_INOPB, LFS_INOPB
INOPF -> FFS_INOPF, LFS_INOPF
blksize -> ffs_blksize, ext2_blksize, or lfs_blksize
sblksize -> ffs_blksize

These are not the only ambiguously defined filesystem macros, of
course, there's a pile more. I may not have found all the ambiguous
definitions of blksize(), too, as there are a lot of other things
called 'blksize' in the system.
 1.169  08-Apr-2013  skrll Remove some set but unused variables
 1.168  20-Dec-2012  hannken Change bread() and breadn() to never return a buffer on
error and modify all callers to not brelse() on error.

Welcome to 6.99.16

PR kern/46282 (6.0_BETA crash: msdosfs_bmap -> pcbmap -> bread -> bio_doread)
 1.167  21-Nov-2012  jakllsch Write support for the Ext4 Read-only Compatible Feature "huge_file".

Primarily, this feature extends the inode block count field to 48 bits.
Additionally, this feature allows this field to be represented in file
system block size units rather than DEV_BSIZE units.
 1.166  01-Sep-2012  christos branches: 1.166.2;
really print the incompatible bits.
 1.165  01-Sep-2012  chs when failing a mount due to unsupported features,
print which features are involved.
 1.164  30-Apr-2012  rmind - Replace some malloc(9) uses with kmem(9).
- G/C M_IPMOPTS, M_IPMADDR and M_BWMETER.
 1.163  13-Mar-2012  elad Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.
 1.162  14-Nov-2011  hannken branches: 1.162.4; 1.162.6; 1.162.10; 1.162.12;
VOP_OPEN() needs a locked vnode. All these copy-and-pasted xxxfs_mount()
implementations need more review.
 1.161  07-Oct-2011  hannken branches: 1.161.2;
As vnalloc() always allocates with PR_WAITOK there is no longer the need
to test its result for NULL.
 1.160  12-Jun-2011  rmind Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
 1.159  27-Jul-2010  jakllsch branches: 1.159.6;
Make DEBUG_EXT2 work with 64-bit size_t.
 1.158  21-Jul-2010  hannken Make holding v_interlock mandatory for callers of vget().

Announced some time ago on tech-kern.
 1.157  24-Jun-2010  hannken Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
 1.156  11-Feb-2010  mlelstv branches: 1.156.2;
There is no code left that uses disk size data, so don't query it.
 1.155  31-Jan-2010  mlelstv branches: 1.155.2;
Fix block shift to work with different device block sizes.
 1.154  31-Jan-2010  mlelstv Replace individual queries for partition information with
new helper function.
 1.153  08-Jan-2010  pooka The VATTR_NULL/VREF/VHOLD/HOLDRELE() macros lost their will to live
years ago when the kernel was modified to not alter ABI based on
DIAGNOSTIC, and now just call the respective function interfaces
(in lowercase). Plenty of mix'n match upper/lowercase has creeped
into the tree since then. Nuke the macros and convert all callsites
to lowercase.

no functional change
 1.152  21-Oct-2009  pooka update i_uid and i_gid after chown
 1.151  19-Oct-2009  bouyer Remove closes 3 & 4 from my licence. Lots of thanks to Soren Jacobsen
for the booring work !
 1.150  13-Sep-2009  tsutsui Move declaration of ufs_hashlock into <ufs/ufs_extern.h> from each c source.
 1.149  12-Sep-2009  tsutsui Reduce diffs a bit between ext2fs_reload() and ffs_reload().
 1.148  12-Sep-2009  tsutsui Add a missed brelse(9) call after bread(9) in ext2fs_reload().

This may close PR kern/28712 (ext2fs hang on mount after fsck).
 1.147  12-Sep-2009  tsutsui Pull a fix from ffs_vfsops.c rev 1.248:
> Fix bug introduced in revision 1.174(*) where a NULL fspec with an MNT_UPDATE
> command would always return EINVAL. This broke fsck on root, where fsck'ing
> a dirty root would always return an error causing rc to resort in a reboot.
(*) This is "Apply the NFS exports list rototill patch" change
in ext2fs_vfsops.c rev 1.91.
 1.146  12-Sep-2009  tsutsui Pull a fix for mount function from ffs_vfsops.c rev1.186:
> Change ffs_mount, in MNT_UPDATE case, to check dev_t's for equality
> instead of just vnode pointers. Fixes erroneous "does not match mounted
> device" errors from mount(8) in the presence of MFS /dev, init.root, &c.
 1.145  11-Sep-2009  tsutsui Fix botch around argument check in ext2fs_mount(). Taken from ffs_vfsops.c.

Fixes LOCKDEBUG panic which is the same one mentioned in PR kern/41078
on trying to mount_ext2fs against a raw device, while that panic
seems to have another route cause around module_autoload() in
sys/miscfs/specfs/spec_vnops.c:spec_open().
 1.144  29-Jun-2009  dholland Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.
 1.143  25-Apr-2009  elad Add genfs_can_mount() and use it to prevent some more code duplication of
the security checks when mounting a device (VOP_ACCESS() + kauth(9) call)).

Proposed with no objections on tech-kern@:

http://mail-index.netbsd.org/tech-kern/2009/04/20/msg004859.html

The vnode is always expected to be locked, so no locking is done outside
the file-system code.
 1.142  01-Mar-2009  christos PR/40936: Frederik Sausmikat: ext2fs: add support for inodes > 128 bytes
 1.141  08-Dec-2008  pooka branches: 1.141.2;
Remove no longer valid comment (which probably didn't even say what
it wanted to say in the first place).
 1.140  23-Nov-2008  mrg add support for 32 bit uid/gid fields in ext2, but only do so for
when the revision is > REV0.
 1.139  13-Nov-2008  ad These depend on ffs.
 1.138  13-Nov-2008  ad Remove #ifdef LFS from the ufs code.
 1.137  28-Jun-2008  rumble branches: 1.137.2; 1.137.4; 1.137.6;
Create sysctl entries during module initialisation and destroy them
appropriately.

Many of these file systems are now ready for modularisation.
 1.136  16-May-2008  hannken branches: 1.136.2;
Make sure all cached buffers with valid, not yet written data have been
run through copy-on-write. Call fscow_run() with valid data where possible.

The LP_UFSCOW hack is no longer needed to protect ffs_copyonwrite() against
endless recursion.

- Add a flag B_MODIFY to bread(), breada() and breadn(). If set the caller
intends to modify the buffer returned.

- Always run copy-on-write on buffers returned from ffs_balloc().

- Add new function ffs_getblk() that gets a buffer, assigns a new blkno,
may clear the buffer and runs copy-on-write. Process possible errors
from getblk() or fscow_run(). Part of PR kern/38664.

Welcome to 4.99.63

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
 1.135  10-May-2008  rumble Convert file systems to dynamically attach with the new module interface.
Make VFS hooks dynamic while we're here and say farewell to VFS_ATTACH and
VFS_HOOKS_ATTACH linksets.

As a consequence, most of the file systems can now be loaded as new style
modules.

Quick sanity check by ad@.
 1.134  06-May-2008  ad branches: 1.134.2;
PR kern/38141 lookup/vfs_busy acquire rwlock recursively

Simplify the mount locking. Remove all the crud to deal with recursion on
the mount lock, and crud to deal with unmount as another weirdo lock.

Hopefully this will once and for all fix the deadlocks with this. With this
commit there are two locks on each mount:

- krwlock_t mnt_unmounting. This is used to prevent unmount across critical
sections like getnewvnode(). It's only ever read locked with rw_tryenter(),
and is only ever write locked in dounmount(). A write hold can't be taken
on this lock if the current LWP could hold a vnode lock.

- kmutex_t mnt_updating. This is taken by threads updating the mount, for
example when going r/o -> r/w, and is only present to serialize updates.
In order to take this lock, a read hold must first be taken on
mnt_unmounting, and the two need to be held across the operation.

One effect of this change: previously if an unmount failed, we would make a
half hearted attempt to back out of it gracefully, but that was unlikely to
work in a lot of cases. Now while an unmount that will be aborted is in
progress, new file operations within the mount will fail instead of being
delayed. That is unlikely to be a problem though, because if the admin
requests unmount of a file system then s(he) has made a decision to deny
access to the resource.
 1.133  30-Apr-2008  ad PR kern/38135 vfs_busy/vfs_trybusy confusion

The previous fix worked, but it opened a window where mounts could have
disappeared from mountlist while the caller was traversing it using
vfs_trybusy(). Fix that.
 1.132  29-Apr-2008  ad PR kern/38057 ffs makes assuptions about devvp file system
PR kern/33406 softdeps get stuck in endless loop

Introduce VFS_FSYNC() and call it when syncing a block device, if it
has a mounted file system.
 1.131  05-Feb-2008  ad branches: 1.131.6; 1.131.8; 1.131.10;
Do genfs_node_init() earlier. PR kern/36162.
 1.130  30-Jan-2008  ad PR kern/37706 (forced unmount of file systems is unsafe):

- Do reference counting for 'struct mount'. Each vnode associated with a
mount takes a reference, and in turn the mount takes a reference to the
vfsops.
- Now that mounts are reference counted, replace the overcomplicated mount
locking inherited from 4.4BSD with a recursable rwlock.
 1.129  28-Jan-2008  dholland Fix some race conditions in rename.
Introduce a per-FS rename lock and new vfsops to manipulate it.
Get this lock while renaming. Also add another relookup() in do_sys_rename,
which is a hack to kludge around some of the worst deficiencies of
ufs_rename.
reviewed-by: pooka (and an earlier rev by ad)
posted on tech-kern with no objections.
 1.128  24-Jan-2008  ad specfs changes for PR kern/37717 (raidclose() is no longer called on
shutdown). There are still problems with device access and a PR will be
filed.

- Kill checkalias(). Allow multiple vnodes to reference a single device.

- Don't play dangerous tricks with block vnodes to ensure that only one
vnode can describe a block device. Instead, prohibit concurrent opens of
block devices. As a bonus remove the unreliable code that prevents
multiple file system mounts on the same device. It's no longer needed.

- Track opens by vnode and by device. Issue cdev_close() when the last open
goes away, instead of abusing vnode::v_usecount to tell if the device is
open.
 1.127  03-Jan-2008  pooka valloc -> vnalloc, vfree -> vnfree
Avoids collision with userland valloc(3).

no functional change
ad ok
 1.126  02-Jan-2008  ad Merge vmlocking2 to head.
 1.125  08-Dec-2007  pooka branches: 1.125.4;
Remove cn_lwp from struct componentname. curlwp should be used
from on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
 1.124  01-Dec-2007  tsutsui branches: 1.124.2;
Use e2fs_first_dblock in superblock to read/write group descriptor blocks.
 1.123  26-Nov-2007  pooka Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
 1.122  26-Nov-2007  tsutsui Misc cosmetics.
 1.121  26-Nov-2007  tsutsui Account e2fs_reserved_ngdb blocks accordingly in ext2fs_statvfs().
 1.120  17-Oct-2007  ad branches: 1.120.4;
Sync with ffs: fix ufs_ihashlock / ufs_hash_lock deadlock.
From Sverre Froyen.
 1.119  10-Oct-2007  ad Merge from vmlocking:

- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
 1.118  08-Oct-2007  ad Merge ffs locking & brelse changes from the vmlocking branch.
 1.117  31-Jul-2007  pooka branches: 1.117.2; 1.117.4; 1.117.6; 1.117.8;
* nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
use VFS_PROTOS() instead of manually prototyping the methods
 1.116  26-Jul-2007  pooka Use eopnotsupp() instead of vfs_stdsuspendctl() and retire the latter.
 1.115  20-Jul-2007  pooka In sync, skip over vnodes based on if they are clean rather than
if they have pages.
 1.114  17-Jul-2007  pooka branches: 1.114.2;
Make set_statvfs_info() take a parameter for the vfs name instead
of always retrieving it from mp->mnt_op->vfs_name

christos ok
 1.113  12-Jul-2007  dsl Change the VFS_MOUNT() interface so that the 'data' buffer passed to the
fs code is a kernel buffer, pass though the length of the buffer as well.
Since the length of the userspace buffer isn'it (yet) passed through the mount
system call, add a field to the vfsops structure containing the default length.
Split sys_mount() for calls from compat code.
Ride one of the recent kernel version changes - old fs LKMs will load, but
sys_mount() will reject any attempt to use them.
 1.112  30-Jun-2007  pooka Using POOL_INIT here makes no sense, since file systems always have
an init method. So get rid of it and #ifdef _LKM and just always
init in the init method. Give malloc types the same treatment.
Makes file systems nicer to work with in linksetless environments
and fixes a few LKM discrepancies.
 1.111  05-Jun-2007  yamt improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.
 1.110  12-Mar-2007  ad branches: 1.110.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.
 1.109  04-Mar-2007  christos branches: 1.109.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.108  15-Feb-2007  ad branches: 1.108.2;
Replace some uses of lockmgr() / simplelocks.
 1.107  19-Jan-2007  hannken New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).
 1.106  04-Jan-2007  elad Consistent usage of KAUTH_GENERIC_ISSUSER.
 1.105  16-Nov-2006  christos __unused removal on arguments; approved by core.
 1.104  25-Oct-2006  reinoud Revisit mnt_vnodelist TAILQ patch. Remove all suspicious TAILQ_FOREACH()
loops where vnodes can get removed or added during the loops. This could
lead to panic's on unmount since nodes are skipped or otherwise
TAILQ_NEXT(0xdeadbeef, ...) was dereferenced.
 1.103  20-Oct-2006  reinoud Replace the LIST structure mp->mnt_vnodelist to a TAILQ structure since all
vnodes were synced and processed backwards. This meant that the last
accessed node was processed first and the earlierst last.

An extra benefit is the removal of the ugly hack from the Berkly days on
LFS.

In the proces, i've also replaced the various variations hand written loops
by the TAILQ_FOREACH() macro's.
 1.102  12-Oct-2006  christos - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
 1.101  30-Aug-2006  christos branches: 1.101.2; 1.101.4;
fix incomplete initializer.
 1.100  23-Jul-2006  ad Use the LWP cached credentials where sane.
 1.99  13-Jul-2006  martin Fix alignement problems for fhandle_t, exposed by gcc4.1.

While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ,
version the getfh(2) syscall and explicitly pass the size available in
the filehandle from userland.

Discussed on tech-kern, with lots of help from yamt (thanks!).
 1.98  07-Jun-2006  kardel branches: 1.98.2;
merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
 1.97  14-May-2006  elad branches: 1.97.2;
integrate kauth.
 1.96  18-Mar-2006  bouyer bread() will always return a valid bp. So remplace the (always true) if (bp)
with a KASSERT.
Should fix Coverity ID 2444.
 1.95  21-Feb-2006  thorpej branches: 1.95.2; 1.95.4; 1.95.6;
Use device_class() instead of accessing dv_class directly.
 1.94  11-Dec-2005  christos branches: 1.94.2; 1.94.4; 1.94.6;
merge ktrace-lwp.
 1.93  02-Nov-2005  yamt merge yamt-vop branch. remove following VOPs.

VOP_BLKATOFF
VOP_VALLOC
VOP_BALLOC
VOP_REALLOCBLKS
VOP_VFREE
VOP_TRUNCATE
VOP_UPDATE
 1.92  27-Sep-2005  yamt branches: 1.92.2;
introduce "ufs_ops" and use it for ITIMES.
 1.91  23-Sep-2005  jmmv Apply the NFS exports list rototill patch:

- Remove all NFS related stuff from file system specific code.
- Drop the vfs_checkexp hook and generalize it in the new nfs_check_export
function, thus removing redundancy from all file systems.
- Move all NFS export-related stuff from kern/vfs_subr.c to the new
file sys/nfs/nfs_export.c. The former was becoming large and its code
is always compiled, regardless of the build options. Using the latter,
the code is only compiled in when NFSSERVER is enabled. While doing this,
also make some functions in nfs_subs.c conditional to NFSSERVER.
- Add a new command in nfssvc(2), called NFSSVC_SETEXPORTSLIST, that takes a
path and a set of export entries. At the moment it can only clear the
exports list or append entries, one by one, but it is done in a way that
allows setting the whole set of entries atomically in the future (see the
comment in mountd_set_exports_list or in doc/TODO).
- Change mountd(8) to use the nfssvc(2) system call instead of mount(2) so
that it becomes file system agnostic. In fact, all this whole thing was
done to remove a 'XXX' block from this utility!
- Change the mount*, newfs and fsck* userland utilities to not deal with NFS
exports initialization; done internally by the kernel when initializing
the NFS support for each file system.
- Implement an interface for VFS (called VFS hooks) so that several kernel
subsystems can run arbitrary code upon receipt of specific VFS events.
At the moment, this only provides support for unmount and is used to
destroy NFS exports lists from the file systems being unmounted, though it
has room for extension.

Thanks go to yamt@, chs@, thorpej@, wrstuden@ and others for their comments
and advice in the development of this patch.
 1.90  12-Sep-2005  christos - access the ffs and ext2fs itimes functions through a pointer, so that
if the filesystem is not compiled in the kernel still links. Probably
a better solution is to use weak symbols.
- move the filesystem-specific itime macros to the filesystem header files.
 1.89  30-Aug-2005  xtraeme * Remove __P()
* Use ANSI function declarations on ext2fs and mfs
 1.88  23-Aug-2005  christos Don't overload MAXNAMLEN, use a separate constant for each filesystem type.
 1.87  23-Jul-2005  yamt update file timestamps for nfsd loaned-read and mmap.
PR/25279. discussed on tech-kern@.
 1.86  28-Jun-2005  yamt branches: 1.86.2;
- constify genfs_ops.
- use member designators.
 1.85  29-May-2005  christos - sprinkle const
- avoid shadow variables.
 1.84  29-Mar-2005  thorpej - Define a VFS_ATTACH() macro that places a reference to a vfsops structure
into the "vfsops" link set.
- Use VFS_ATTACH() where vfsops are declared for individual file systems.
- In vfsinit(), traverse the "vfsops" link set, rather than vfs_list_initial[].
 1.83  26-Feb-2005  perry branches: 1.83.2;
nuke trailing whitespace
 1.82  09-Feb-2005  ws Add support for large files (>2GB).
Like Linux, automagically convert old filesystem to use this,
if they are already at revision 1.
For revision 0, just punt (unlike Linux; makes me a bit too nervous.)

There should be an option to fsck_ext2fs to upgrade revision 0 to revision 1.

Reviewd by Manuel (bouyer@).
 1.81  11-Jan-2005  mycroft branches: 1.81.2; 1.81.4;
Rearrange some code slightly to avoid uninitialized variable warnings.
 1.80  09-Jan-2005  mycroft Whoops -- move the location of the VOP_OPEN()/VOP_CLOSE(), et al, from
foo_mountfs() to foo_mount(), to match the new mountroot API.
Also, for ext2fs and lfs, copy some restructuring from ffs to allow changing
file system parameters without specifying the device name.
(ntfs could use some more work.)
 1.79  09-Jan-2005  mycroft Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.
 1.78  02-Jan-2005  thorpej Add the system call and VFS infrastructure for file system extended
attributes.

From FreeBSD.
 1.77  11-Nov-2004  christos Put the correct fragment size in struct statvfs. From Kevin Lahey.
 1.76  21-Sep-2004  thorpej Add a new VNODE_LOCKDEBUG option, which enables checks in the VOP_*()
calls to ensure that the vnode lock state is as expected when the VOP
call is made. Modify vnode_if.src to set the expected state according
to the documenting lock table for each VOP. Modify vnode_if.sh to emit
the checks.

Notes:
- The checks are only performed if the vnode has the VLOCKSWORK bit
set. Some file systems (e.g. specfs) don't even bother with vnode
locks, so of course the checks will fail.
- We can't actually run with VNODE_LOCKDEBUG because there are so many
vnode locking problems, not the least of which is the "use SHARED for
VOP_READ()" issue, which screws things up for the entire call chain.

Inspired by similar changes in OpenBSD, but implemented differently.
 1.75  15-Aug-2004  mycroft Fixing age old cruft:
* Rather than using mnt_maxsymlinklen to indicate that a file systems returns
d_type fields(!), add a new internal flag, IMNT_DTYPE.

Add 3 new elements to ufsmount:
* um_maxsymlinklen, replaces mnt_maxsymlinklen (which never should have existed
in the first place).
* um_dirblksiz, which tracks the current directory block size, eliminating the
FS-specific checks littered throughout the code. This may be used later to
make the block size variable.
* um_maxfilesize, which is the maximum file size, possibly adjusted lower due
to implementation issues.

Sync some bug fixes from FFS into ext2fs, particularly:
* ffs_lookup.c 1.21, 1.28, 1.33, 1.48
* ffs_inode.c 1.43, 1.44, 1.45, 1.66, 1.67
* ffs_vnops.c 1.84, 1.85, 1.86

Clean up some crappy pointer frobnication.
 1.74  14-Aug-2004  mycroft Add a new flag, IN_MODIFY. This is like IN_UPDATE|IN_CHANGE, but unlike
setting those flags, it does not cause the inode to be written in the periodic
sync. This is used for writes to special files (devices and named pipes) and
FIFOs.

Do not preemptively sync updates to access times and modification times. They
are now updated in the inode only opportunistically, or when the file or device
is closed. (Really, it should be delayed beyond close, but this is enough to
help substantially with device nodes.)

And the most amusing part:
Trickle sync was broken on both FFS and ext2fs, in different ways. In FFS, the
periodic call to VFS_SYNC(MNT_LAZY) was still causing all file data to be
synced. In ext2fs, it was causing the metadata to *not* be synced. We now
only call VOP_UPDATE() on the node if we're doing MNT_LAZY. I've confirmed
that we do in fact trickle correctly now.
 1.73  05-Jul-2004  pk Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().
 1.72  25-May-2004  hannken Add ffs internal snapshots. Written by Marshall Kirk McKusick for FreeBSD.

- Not enabled by default. Needs kernel option FFS_SNAPSHOT.
- Change parameters of ffs_blkfree.
- Let the copy-on-write functions return an error so spec_strategy
may fail if the copy-on-write fails.
- Change genfs_*lock*() to use vp->v_vnlock instead of &vp->v_lock.
- Add flag B_METAONLY to VOP_BALLOC to return indirect block buffer.
- Add a function ffs_checkfreefile needed for snapshot creation.
- Add special handling of snapshot files:
Snapshots may not be opened for writing and the attributes are read-only.
Use the mtime as the time this snapshot was taken.
Deny mtime updates for snapshot files.
- Add function transferlockers to transfer any waiting processes from
one lock to another.
- Add vfsop VFS_SNAPSHOT to take a snapshot and make it accessible through
a vnode.
- Add snapshot support to ls, fsck_ffs and dump.

Welcome to 2.0F.

Approved by: Jason R. Thorpe <thorpej@netbsd.org>
 1.71  25-May-2004  atatat Sysctl descriptions under vfs subtree
 1.70  20-May-2004  atatat Explicitly call pool_init() (and pool_destroy()) when being built as
an _LKM.

This adds pools to the list of things that lkms must do manually
because they're set up with link sets. Not that there's anything
wrong with link sets, but that we need to try harder to remember that
lkms are second class citizens. Of a sort.
 1.69  02-May-2004  wiz Fix typo in error message, reported by Piotr Meyer in PR 25418.
 1.68  25-Apr-2004  simonb Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that either in arch-specific
code, or aren't initialiased during initial system startup.

Convert struct session, ucred and lockf to pools.
 1.67  21-Apr-2004  christos Replace the statfs() family of system calls with statvfs().
Retain binary compatibility.
 1.66  24-Mar-2004  atatat branches: 1.66.2;
Tango on sysctl_createv() and flags. The flags have all been renamed,
and sysctl_createv() now uses more arguments.
 1.65  22-Mar-2004  bouyer Fix disclaimer in my copyright. Pointed out by Thomas Klausner.
 1.64  04-Dec-2003  atatat Dynamic sysctl.

Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.

Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded. Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.

All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.

PS - I'm sorry, but there's a distinct lack of documentation at the
moment. I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
 1.63  14-Oct-2003  dbj add mnt_iflag field to struct mount for internal flags
mv MNT_GONE, MNT_UNMOUNT and MNT_WANTRDWR to this field
additonally add mnt_writeopcountupper and mnt_writeopcountlower fields
in preparation for pending write suspension support work
bump kernel version to 1.6ZD
 1.62  05-Oct-2003  bouyer Remove references to University of California from my copyright notices.
 1.61  07-Aug-2003  agc Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.
 1.60  29-Jun-2003  fvdl branches: 1.60.2;
Back out the lwp/ktrace changes. They contained a lot of colateral damage,
and need to be examined and discussed more.
 1.59  29-Jun-2003  thorpej Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.58  28-Jun-2003  darrenr Pass lwp pointers throughtout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
 1.57  16-Apr-2003  christos PR/1796: John Kohl: statfs misbehaves under chrooted environments.

- Under chroot it displays only the visible filesystems with appropriate paths.
- The statfs f_mntonname gets adjusted to contain the real path from root.
- While was there, fixed a bug in ext2fs, locking problems with vfs_getfsstat(),
and factored out some of the vfsop statfs() code to copy_statfs_info(). This
fixes the problem where some filesystems forgot to set fsid.
- Made coda look more like a normal fs.
 1.56  05-Apr-2003  fvdl Actually get an ext2fs_dinode structure from the pool before using it.
 1.55  02-Apr-2003  fvdl Add support for UFS2. UFS2 is an enhanced FFS, adding support for
64 bit block pointers, extended attribute storage, and a few
other things.

This commit does not yet include the code to manipulate the extended
storage (for e.g. ACLs), this will be done later.

Originally written by Kirk McKusick and Network Associates Laboratories for
FreeBSD.
 1.54  21-Mar-2003  dsl Use 'void *' instead of 'caddr_t' in prototypes of VOP_IOCTL, VOP_FCNTL
and VOP_ADVLOCK, delete casts from callers (and some to copyin/out).
 1.53  24-Jan-2003  fvdl Bump daddr_t to 64 bits. Replace it with int32_t in all places where
it was used on-disk, so that on-disk formats remain the same.
Remove ufs_daddr_t and ufs_lbn_t for the time being.
 1.52  21-Sep-2002  christos MNT_GETARGS support
 1.51  06-Sep-2002  gehenna Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches is defined as a constant structure in device drivers.

- The new grammer ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammer.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.
 1.50  30-Jul-2002  soren Die, qaddr_t, die! - mnt_data in struct mount is already effectively
a void *, so stop pretending otherwise.
 1.49  08-Mar-2002  thorpej branches: 1.49.6;
Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.
 1.48  08-Nov-2001  lukem add RCSID
 1.47  06-Nov-2001  simonb Use the sector size from the partition info, not a hard-coded value.
 1.46  06-Nov-2001  simonb Remove a variable that is set but never used.
 1.45  15-Sep-2001  chs branches: 1.45.2;
a whole bunch of changes to improve performance and robustness under load:

- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
 1.44  15-Sep-2001  chs add a new VFS op, vfs_reinit, which is called when desiredvnodes is
adjusted via sysctl. file systems that have hash tables which are
sized based on the value of this variable now resize those hash tables
using the new value. the max number of FFS softdeps is also recalculated.

convert various file systems to use the <sys/queue.h> macros for
their hash tables.
 1.43  30-May-2001  mrg branches: 1.43.4; 1.43.6;
use _KERNEL_OPT
 1.42  22-Jan-2001  jdolecek branches: 1.42.2;
make filesystem vnodeop, specop, fifoop and vnodeopv_* arrays const
 1.41  10-Dec-2000  chs in *_sync(), don't skip vnodes which have (potentially dirty) pages.
 1.40  27-Nov-2000  chs Initial integration of the Unified Buffer Cache project.
 1.39  19-Sep-2000  fvdl Adapt for VOP_FSYNC parameter change.
 1.38  22-Jul-2000  jdolecek ext2fs_reload(), ext2fs_mountfs(): do devvp locking same way as ffs
this has not shown any good or bad effect, but might help narrow
some problems people seen with ext2fs reload (hi Soren!)
 1.37  30-Jun-2000  fvdl Rearrange code around getnewvnode as was already done for ffs, to avoid
locking against oneself because getnewvnode recycles a softdep-using vnode.
 1.36  29-May-2000  mycroft branches: 1.36.2;
Pull in IN_ACCESSED changes and some MNT_LAZY `bug fixes' from FFS.
 1.35  30-Mar-2000  augustss branches: 1.35.2;
Remove register declarations.
 1.34  16-Mar-2000  jdolecek Add new VFS op routine - vfs_done and call it on filesystem detach
in vfs_detach(). vfs_done may free global filesystem's resources,
typically those allocated in respective filesystem's init function.
Needed so those filesystems which went in via LKM have a chance to
clean after themselves before unloading. This fixes random panics
when LKM for filesystem using pools was loaded and unloaded several
times.

For each leaf filesystem, add appropriate vfs_done routine.
 1.33  31-Jan-2000  bouyer Check that we can handle the inode size before mounting the fs, and correct
a return value.
 1.32  28-Jan-2000  bouyer Correct (minor) bogons in filetype option support, and add support
for sparse_super option
 1.31  26-Jan-2000  bouyer First cut at ext2fs rev 1 support (as of mke2fs 1.18): supports the filetype
option read/write and the sparse option read-only.
 1.30  15-Nov-1999  fvdl Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that is has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O
 1.29  20-Oct-1999  enami Check if the type of device node isn't VBAD before touching v_specinfo. If
the device vnode is revoked, the field is NULL and touching it causes null
pointer derefercence.
 1.28  16-Oct-1999  wrstuden branches: 1.28.2; 1.28.4;
In spec_close(), if we're not doing a non-blocking close and VXLOCK is
not set, unlock the vnode before calling the device's close routine and
relock it after it returns. tty close routines will sleep waiting for
buffers to drain, which won't happen often times as the other side needs
to grab the vnode lock first.

Make all unmount routines lock the device vnode before calling VOP_CLOSE().
 1.27  17-Jul-1999  wrstuden branches: 1.27.2;
Adjust mountroot routines to vrele rootvp in case of mount error. Closes
PR 7977 by Neil Carson, <neil@brini.com>.
 1.26  08-Jul-1999  wrstuden Modify file systems to deal with struct lock in struct vnode. All leaf
fs's other than nfs use genfs_lock() for locking.

Modify lookup routines to set PDIRUNLOCK when they unlock the parrent.
 1.25  01-Jun-1999  bouyer memset ump->um_e2fs to 0 after malloc, it is bigger than SBSIZE and thus some
parts were left uninitialised. The symptom was that a read-only mount tried
to rewrite back the superblock.
 1.24  26-Feb-1999  wrstuden branches: 1.24.2; 1.24.4; 1.24.6;
Modify vfsops to seperate vfs_fhtovp() into two routines. vfs_fhtovp() now
only handles the file handle to vnode conversion, and a new call,
vfs_checkexp(), performs the export verification.
 1.23  10-Feb-1999  bouyer Make sure a buffer optained from bread() is always bresle()'d in case of
error. Closes PR kern/1448 from Wolfgang Solfrank.
 1.22  02-Dec-1998  bouyer - intentation
- sync LK_* flags with ffs/ufs
 1.21  01-Dec-1998  bouyer In ext2fs_sync(), don't flush the vnode if vput() returned an error. Fixes
PR kern/6495.
 1.20  23-Oct-1998  thorpej For consistency w/ FFS/LFS, define EXT2_DINODE_SIZE, and use it instead
of pointer arithmetic and/or sizeof(struct ext2fs_dinode).
 1.19  29-Sep-1998  bouyer #include opt_uvm.h only if _KENREL and !_LKM
Make ext2fs_init() call ufs_init(). it was doing the init by itself,
testing for extern done != 0. This bug was hidden by the fact that
ext2fs_init() is called before ffs_init().
 1.18  13-Sep-1998  christos Fix copyright '\t' -> ' '
 1.17  01-Sep-1998  thorpej Use the pool allocator and "nointr" pool page allocator for ext2fs inodes.
 1.16  09-Aug-1998  perry bzero->memset, bcopy->memcpy, bcmp->memcmp
 1.15  05-Jul-1998  jonathan * defopt COMPAT_{09,10,11,12,13} and COMPAT_NOMID.
TODO: revisit interaction between native compat and emul compat usage.
 1.14  24-Jun-1998  sommerfe Always include fifos; "not an option any more".
 1.13  22-Jun-1998  sommerfe defopt for options FIFO
 1.12  05-Jun-1998  kleink Convert fsync vnode operator implementations and usage from the old `waitfor'
argument and MNT_WAIT/MNT_NOWAIT to `flags' and FSYNC_WAIT.
 1.11  18-Mar-1998  bouyer Add support for reading/writing FFS in non-native byte order, conditioned
to "options FFS_EI". The superblock and inodes (without blk addr) are
byteswapped at disk read/write time, other metadatas are byteswapped
when used (as they are acceeded directly in the buffer cache).
This required the addition of a "um_flags" field to struct ufsmount.
ffs_bswap.c contains superblock and inode byteswap routines also used
by userland utilities.
 1.10  02-Mar-1998  bouyer Close kern/5077: When DIAGNOSTIC is defined, don't complain about
bad magic numbers at a mount attempt. A message is still printed when the
magic number is OK, but the version number or the block size is bad.
Patch from Soren S. Jorvang, but different from the one in the PR.
 1.9  01-Mar-1998  fvdl Merge with Lite2 + local changes
 1.8  18-Feb-1998  drochner add missing vfsops element
 1.7  18-Feb-1998  thorpej Place a pointer to an array of our vnodeopv_desc *'s in our vfsops
structure, for use by vfs_attach().
 1.6  27-Oct-1997  bouyer When allocating an inode with dtime set, also bzero e2di_blocks[].
 1.5  23-Oct-1997  bouyer Uses ext2fs_vinit not ufs_vinit.
In ext2fs, an inode is deleted either when mode == 0 or dtime != 0. If
dtime != 0, reset others fields before using the inode, or we could end
up with the wrong v_op in ext2fs_vinit.
While I'm there, kill a unused variable in ext2fs_readwrite
 1.4  09-Oct-1997  bouyer branches: 1.4.2;
Add byte-swapping functions (bswap16, bswap32, bswap64) to libkern.
Only assembly version for i386 bswap16 and bswap32 for now (bswap64 uses
bswap32). Contribution of assembly versions of these are welcome.
Add byte-swapping of ext2fs metadata for big-endian systems.
Tested on i386 and sparc.
 1.3  17-Jul-1997  bouyer branches: 1.3.2;
Add a lock locking around inode hashing.
 1.2  12-Jun-1997  mrg remove swap configuration.
 1.1  11-Jun-1997  bouyer The ext2fs layer, based on the ffs/ufs one. Uses a few functions from
sys/ufs/ufs/
 1.3.2.1  14-Oct-1997  thorpej Update marc-pcmcia branch from trunk.
 1.4.2.2  27-Oct-1997  mellon Pull rev 1.6 up from trunk (bouyer)
 1.4.2.1  23-Oct-1997  mellon Pull rev 1.5 up from trunk
 1.24.6.1  30-Nov-1999  itojun bring in latest KAME (as of 19991130, KAME/NetBSD141) into kame branch
just for reference purposes.
This commit includes 1.4 -> 1.4.1 sync for kame branch.

The branch does not compile at all (due to the lack of ALTQ and some other
source code). Please do not try to modify the branch, this is just for
referenre purposes.

synchronization to latest KAME will take place on HEAD branch soon.
 1.24.4.3  06-Aug-1999  chs UBCify.
 1.24.4.2  02-Aug-1999  thorpej Update from trunk.
 1.24.4.1  21-Jun-1999  thorpej Sync w/ -current.
 1.24.2.3  01-Feb-2000  he Apply patch (requested by bouyer):
Add support for ext2fs revision 1, with read-only support for
the 'sparse_super' and 'filetype' options. Should fix PR#9088.
 1.24.2.2  18-Oct-1999  cgd pull up rev 1.28 from trunk (requested by wrstuden):
In spec_close(), call the device's close routine with the vnode
unlocked if the call might block. Force a non-blocking close if
VXLOCK is set. This eliminates a potential deadlock situation, and
should eliminate the dirty buffers on reboot issue.
 1.24.2.1  21-Jun-1999  perry pullup 1.24->1.25 (bouyer): zero ump->um_e2fs after malloc
 1.27.2.2  27-Dec-1999  wrstuden Pull up to last week's -current.
 1.27.2.1  21-Dec-1999  wrstuden Initial commit of recent changes to make DEV_BSIZE go away.

Runs on i386, needs work on other arch's. Main kernel routines should be
fine, but a number of the stand programs need help.

cd, fd, ccd, wd, and sd have been updated. sd has been tested with non-512
byte block devices. vnd, raidframe, and lfs need work.

Non 2**n block support is automatic for LKM's and conditional for kernels
on "options NON_PO2_BLOCKS".
 1.28.4.3  15-Nov-1999  fvdl Sync with -current
 1.28.4.2  03-Nov-1999  fvdl Give ufs_ihashget an extra argument: the flags passed to vget() for
locking. This way we can avoid locking against ourselves when
ufs_ihashget is called during the flushing of metadata. XXX

Also, comment out a VOP_FSYNC call that I think is now unneeded, and
put a diagnostic printf there to check if this still happens.
 1.28.4.1  19-Oct-1999  fvdl Bring in Kirk McKusick's FFS softdep code on a branch.
 1.28.2.5  11-Feb-2001  bouyer Sync with HEAD.
 1.28.2.4  13-Dec-2000  bouyer Sync with HEAD (for UBC fixes).
 1.28.2.3  08-Dec-2000  bouyer Sync with HEAD.
 1.28.2.2  20-Nov-2000  bouyer Update thorpej_scsipi to -current as of a month ago
 1.28.2.1  20-Oct-1999  thorpej Sync w/ trunk.
 1.35.2.1  22-Jun-2000  minoura Sync w/ netbsd-1-5-base.
 1.36.2.3  14-Dec-2000  he Pull up revision 1.39 (requested by fvdl):
Improve NFS performance, possibly with as much as 100% in
throughput. Please note: this implies a kernel interface change,
VOP_FSYNC gains two arguments.
 1.36.2.2  24-Jul-2000  jdolecek pullup rev. 1.38 from trunk (approved by thorpej):
ext2fs_reload(), ext2fs_mountfs(): do devvp locking same way as ffs
this has not shown any good or bad effect, but might help narrow
some problems people seen with ext2fs reload
 1.36.2.1  03-Jul-2000  fvdl pullup from trunk:

Fix a "locking against myself" problem; holding ufs_hashlock
across getnewvnode() could cause a recursive lock if it resulted in
recycling a vnode that was using softdeps.
 1.42.2.10  18-Oct-2002  nathanw Catch up to -current.
 1.42.2.9  17-Sep-2002  nathanw Catch up to -current.
 1.42.2.8  01-Aug-2002  nathanw Catch up to -current.
 1.42.2.7  12-Jul-2002  nathanw No longer need to pull in lwp.h; proc.h pulls it in for us.
 1.42.2.6  24-Jun-2002  nathanw Curproc->curlwp renaming.

Change uses of "curproc->l_proc" back to "curproc", which is more like the
original use. Bare uses of "curproc" are now "curlwp".

"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc) : NULL)
so that it is always safe to reference curproc (*de*referencing curproc
is another story, but that's always been true).
 1.42.2.5  01-Apr-2002  nathanw Catch up to -current.
(CVS: It's not just a program. It's an adventure!)
 1.42.2.4  14-Nov-2001  nathanw Catch up to -current.
 1.42.2.3  21-Sep-2001  nathanw Catch up to -current.
 1.42.2.2  21-Jun-2001  nathanw Catch up to -current.
 1.42.2.1  05-Mar-2001  nathanw Initial commit of scheduler activations and lightweight process support.
 1.43.6.3  01-Oct-2001  fvdl Catch up with -current.
 1.43.6.2  26-Sep-2001  fvdl * add a VCLONED vnode flag that indicates a vnode representing a cloned
device.
* rename REVOKEALL to REVOKEALIAS, and add a REVOKECLONE flag, to pass
to VOP_REVOKE
* the revoke system call will revoke all aliases, as before, but not the
clones
* vdevgone is called when detaching a device, so make it use REVOKECLONE
to get rid of all clones as well
* clean up all uses of VOP_OPEN wrt. locking.
* add a few VOPS to spec_vnops that need to do something when it's a
clone vnode (access and getattr)
* add a copy of the vnode vattr structure of the original 'master' vnode
to the specinfo of a cloned vnode. could possibly redirect getattr to
the 'master' vnode, but this has issues with revoke
* add a vdev_reassignvp function that disassociates a vnode from its
original device, and reassociates it with the specified dev_t. to be
used by cloning devices only, in case a new minor is allocated.
* change all direct references in drivers to v_devcookie and v_rdev
to vdev_privdata(vp) and vdev_rdev(vp). for diagnostic purposes
when debugging race conditions that still exist wrt. locking and
revoking vnodes.
* make the locking state of a vnode consistent when passed to
d_open and d_close (unlocked). locked would be better, but has
some deadlock issues
 1.43.6.1  18-Sep-2001  fvdl Various changes to make cloning devices possible:

* Add an extra argument (struct vnode **) to VOP_OPEN. If it is
not NULL, specfs will create a cloned (aliased) vnode during
the call, and return it there. The caller should release and
unlock the original vnode if a new vnode was returned. The
new vnode is returned locked.

* Add a flag field to the cdevsw and bdevsw structures.
DF_CLONING indicates that it wants a new vnode for each
open (XXX is there a better way? devprop?)

* If a device is cloning, always call the close entry
point for a VOP_CLOSE.


Also, rewrite cons.c to do the right thing with vnodes. Use VOPs
rather then direct device entry calls. Suggested by mycroft@

Light to moderate testing done an i386 system (arch doesn't matter
though, these are MI changes).
 1.43.4.4  10-Oct-2002  jdolecek sync kqueue with -current; this includes merge of gehenna-devsw branch,
merge of i386 MP branch, and part of autoconf rototil work
 1.43.4.3  06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.43.4.2  16-Mar-2002  jdolecek Catch up with -current.
 1.43.4.1  10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.45.2.1  12-Nov-2001  thorpej Sync the thorpej-mips-cache branch with -current.
 1.49.6.2  29-Aug-2002  gehenna catch up with -current.
 1.49.6.1  16-May-2002  gehenna Use devsw APIs for checking validity of major numbers.
 1.60.2.13  10-Nov-2005  skrll Sync with HEAD. Here we go again...
 1.60.2.12  01-Apr-2005  skrll Sync with HEAD.
 1.60.2.11  04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.60.2.10  15-Feb-2005  skrll Sync with HEAD.
 1.60.2.9  17-Jan-2005  skrll Sync with HEAD.
 1.60.2.8  14-Nov-2004  skrll Sync with HEAD.
 1.60.2.7  24-Sep-2004  skrll Sync with HEAD.
 1.60.2.6  21-Sep-2004  skrll Fix the sync with head I botched.
 1.60.2.5  18-Sep-2004  skrll Sync with HEAD.
 1.60.2.4  25-Aug-2004  skrll Sync with HEAD.
 1.60.2.3  24-Aug-2004  skrll Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
 1.60.2.2  03-Aug-2004  skrll Sync with HEAD
 1.60.2.1  02-Jul-2003  darrenr Apply the aborted ktrace-lwp changes to a specific branch. This is just for
others to review, I'm concerned that patch fuziness may have resulted in some
errant code being generated but I'll look at that later by comparing the diff
from the base to the branch with the file I attempt to apply to it. This will,
at the very least, put the changes in a better context for others to review
them and attempt to tinker with removing passing of 'struct lwp' through
the kernel.
 1.66.2.1  29-May-2004  tron Pull up revision 1.71 (requested by atatat in ticket #393):
Sysctl descriptions under vfs subtree
 1.81.4.2  19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.81.4.1  12-Feb-2005  yamt sync with head.
 1.81.2.1  29-Apr-2005  kent sync with -current
 1.83.2.1  24-Aug-2005  riz Pull up following revision(s) (requested by yamt in ticket #688):
sys/miscfs/genfs/genfs_vnops.c: revision 1.98 via patch
sys/ufs/ffs/ffs_vfsops.c: revision 1.165
sys/ufs/lfs/lfs_extern.h: revision 1.69
sys/fs/filecorefs/filecore_vfsops.c: revision 1.20
sys/nfs/nfs_node.c: revision 1.80
sys/fs/smbfs/smbfs_node.c: revision 1.24
sys/fs/cd9660/cd9660_vfsops.c: revision 1.24
sys/fs/msdosfs/msdosfs_denode.c: revision 1.8
sys/miscfs/genfs/genfs_node.h: revision 1.6
sys/ufs/lfs/lfs_vfsops.c: revision 1.183
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.86
sys/fs/adosfs/advfsops.c: revision 1.23
sys/fs/ntfs/ntfs_vfsops.c: revision 1.31
- constify genfs_ops.
- use member designators.

sys/miscfs/genfs/genfs_vnops.c: revision 1.99 via patch
genfs_getpages: don't forget to put the vnode onto the syncer's work que
ue
even in the case of PGO_LOCKED.

sys/uvm/uvm_bio.c: revision 1.40
sys/uvm/uvm_pager.h: revision 1.29
sys/miscfs/genfs/genfs_vnops.c: revision 1.100 via patch
sys/ufs/ufs/ufs_inode.c: revision 1.50
- introduce PGO_NOBLOCKALLOC and use it for ubc mapping
to prevent unnecessary block allocations in the case that
page size > block size.
- ufs_balloc_range: use VM_PROT_WRITE+PGO_NOBLOCKALLOC rather than
VM_PROT_READ.

sys/uvm/uvm_fault.c: revision 1.96
sys/miscfs/genfs/genfs_vnops.c: revision 1.101 via patch
sys/uvm/uvm_object.h: revision 1.19
sys/miscfs/genfs/genfs_node.h: revision 1.7
ensure that vnodes with dirty pages are always on syncer's queue.
- genfs_putpages: wait for i/o completion of PG_RELEASED/PG_PAGEOUT pages by
setting "wasclean" false when encountering them.
suggested by Stephan Uphoff in PR/24596 (1).
- genfs_putpages: write protect pages when cleaning out, if
we're going to take the vnode off the syncer's queue.
uvm_fault: don't write-map pages unless its vnode is already on
the syncer's queue.
fix PR/24596 (3) but in the different way from the suggested fix.
(to keep our current behaviour, ie. not to require explicit msync.
discussed on tech-kern@.)
- genfs_putpages: don't mistakenly take a vnode off the queue
by introducing a generation number in genfs_node.
genfs_getpages: increment the generation number.
suggested by Stephan Uphoff in PR/24596 (2).
- add some assertions.

sys/miscfs/genfs/genfs_vnops.c: revision 1.102 via patch
genfs_putpages: don't bother to clean the vnode unless VONWORKLST.

sys/ufs/ffs/ffs_vnops.c: revision 1.71
ffs_full_fsync: because VBLK/VCHR can be mmap'ed,
do VOP_PUTPAGES for them as well.

sys/uvm/uvm_fault.c: revision 1.97
uvm_fault: check a correct object in the case of layered filesystems.
fix PR/30811 from Jukka Salmi.

sys/uvm/uvm_object.h: revision 1.20
sys/ufs/ffs/ffs_vfsops.c: revision 1.167
sys/uvm/uvm_bio.c: revision 1.41
sys/ufs/ufs/ufs_vnops.c: revision 1.129
sys/uvm/uvm_mmap.c: revision 1.92
sys/uvm/uvm_fault.c: revision 1.98
sys/kern/vfs_subr.c: revision 1.252
sys/fs/msdosfs/denode.h: revision 1.5
sys/miscfs/genfs/genfs_vnops.c: revision 1.103 via patch
sys/fs/msdosfs/msdosfs_denode.c: revision 1.9
sys/sys/vnode.h: revision 1.141
sys/ufs/ufs/ufs_inode.c: revision 1.51
sys/ufs/ufs/ufs_extern.h: revision 1.45 via patch
sys/miscfs/genfs/genfs_node.h: revision 1.8
sys/ufs/lfs/lfs_vfsops.c: revision 1.184
sys/uvm/uvm_pager.h: revision 1.30
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.87
update file timestamps for nfsd loaned-read and mmap.
PR/25279. discussed on tech-kern@.

sys/miscfs/genfs/genfs_vnops.c: revision 1.104 via patch
don't write-protect wired pages. pointed by Chuck Silvers.
for now, leave a vnode on the syncer's queue, as suggested by him.

sys/ufs/ffs/ffs_vnops.c: revision 1.72
revert VCHR part of ffs_vnops.c 1.71.
as VCHR uses the device pager, no point to call VOP_PUTPAGES here.
pointed by Chuck Silvers.
 1.86.2.9  11-Feb-2008  yamt sync with head.
 1.86.2.8  04-Feb-2008  yamt sync with head.
 1.86.2.7  21-Jan-2008  yamt sync with head
 1.86.2.6  07-Dec-2007  yamt sync with head
 1.86.2.5  27-Oct-2007  yamt sync with head.
 1.86.2.4  03-Sep-2007  yamt sync with head.
 1.86.2.3  26-Feb-2007  yamt sync with head.
 1.86.2.2  30-Dec-2006  yamt sync with head.
 1.86.2.1  21-Jun-2006  yamt sync with head.
 1.92.2.1  20-Oct-2005  yamt adapt ufs.
 1.94.6.3  01-Jun-2006  kardel Sync with head.
 1.94.6.2  22-Apr-2006  simonb Sync with head.
 1.94.6.1  04-Feb-2006  simonb Adapt for timecounters: mostly use get*time() and use "time_second"
instead of "time.tv_sec".
 1.94.4.1  09-Sep-2006  rpaulo sync with head
 1.94.2.1  01-Mar-2006  yamt sync with head.
 1.95.6.2  24-May-2006  tron Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
 1.95.6.1  28-Mar-2006  tron Merge 2006-03-28 NetBSD-current into the "peter-altq" branch.
 1.95.4.3  06-May-2006  christos - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files
that need it.

Approved by core.
 1.95.4.2  19-Apr-2006  elad sync with head.
 1.95.4.1  08-Mar-2006  elad Adapt to kernel authorization KPI.
 1.95.2.5  03-Sep-2006  yamt sync with head.
 1.95.2.4  11-Aug-2006  yamt sync with head
 1.95.2.3  26-Jun-2006  yamt sync with head.
 1.95.2.2  24-May-2006  yamt sync with head.
 1.95.2.1  01-Apr-2006  yamt sync with head.
 1.97.2.1  19-Jun-2006  chap Sync with head.
 1.98.2.1  13-Jul-2006  gdamore Merge from HEAD.
 1.101.4.2  10-Dec-2006  yamt sync with head.
 1.101.4.1  22-Oct-2006  yamt sync with head
 1.101.2.3  01-Feb-2007  ad Sync with head.
 1.101.2.2  12-Jan-2007  ad Sync with head.
 1.101.2.1  18-Nov-2006  ad Sync with head.
 1.108.2.2  24-Mar-2007  yamt sync with head.
 1.108.2.1  12-Mar-2007  rmind Sync with HEAD.
 1.109.2.11  25-Oct-2007  ad Fix up mnt_vnodelist handling.
 1.109.2.10  23-Oct-2007  ad Sync with head.
 1.109.2.9  19-Oct-2007  ad Hook ext2fs_vfree into ext2fs_ufsops (for ufs_reclaim).
 1.109.2.8  20-Aug-2007  ad Sync with HEAD.
 1.109.2.7  29-Jul-2007  ad Add vfs_destroy() to free mount structures. The specificdata_ref was being
leaked.
 1.109.2.6  15-Jul-2007  ad Sync with head.
 1.109.2.5  17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.109.2.4  09-Jun-2007  ad Sync with head.
 1.109.2.3  13-May-2007  ad - Pass the error number and residual count to biodone(), and let it handle
setting error indicators. Prepare to eliminate B_ERROR.
- Add a flag argument to brelse() to be set into the buf's flags, instead
of doing it directly. Typically used to set B_INVAL.
- Add a "struct cpu_info *" argument to kthread_create(), to be used to
create bound threads. Change "bool mpsafe" to "int flags".
- Allow exit of LWPs in the IDL state when (l != curlwp).
- More locking fixes & conversion to the new API.
 1.109.2.2  13-Mar-2007  ad Pull in the initial set of changes for the vmlocking branch.
 1.109.2.1  13-Mar-2007  ad Sync with head.
 1.110.2.1  11-Jul-2007  mjf Sync with head.
 1.114.2.1  15-Aug-2007  skrll Sync with HEAD.
 1.117.8.2  31-Jul-2007  pooka * nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
use VFS_PROTOS() instead of manually prototyping the methods
 1.117.8.1  31-Jul-2007  pooka file ext2fs_vfsops.c was added on branch matt-mips64 on 2007-07-31 21:14:21 +0000
 1.117.6.2  18-Oct-2007  yamt sync with head.
 1.117.6.1  14-Oct-2007  yamt sync with head.
 1.117.4.3  23-Mar-2008  matt sync with HEAD
 1.117.4.2  09-Jan-2008  matt sync with HEAD
 1.117.4.1  06-Nov-2007  matt sync with HEAD
 1.117.2.4  09-Dec-2007  jmcneill Sync with HEAD.
 1.117.2.3  03-Dec-2007  joerg Sync with HEAD.
 1.117.2.2  27-Nov-2007  joerg Sync with HEAD. amd64 Xen support needs testing.
 1.117.2.1  26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.120.4.3  18-Feb-2008  mjf Sync with HEAD.
 1.120.4.2  27-Dec-2007  mjf Sync with HEAD.
 1.120.4.1  08-Dec-2007  mjf Sync with HEAD.
 1.124.2.3  30-Dec-2007  ad Fix remaining problems with ext2fs on this branch.
 1.124.2.2  26-Dec-2007  ad Sync with head.
 1.124.2.1  04-Dec-2007  ad Pull the vmlocking changes into a new branch.
 1.125.4.2  08-Jan-2008  bouyer Sync with HEAD
 1.125.4.1  02-Jan-2008  bouyer Sync with HEAD
 1.131.10.6  11-Aug-2010  yamt sync with head.
 1.131.10.5  11-Mar-2010  yamt sync with head
 1.131.10.4  16-Sep-2009  yamt sync with head
 1.131.10.3  18-Jul-2009  yamt sync with head.
 1.131.10.2  04-May-2009  yamt sync with head.
 1.131.10.1  16-May-2008  yamt sync with head.
 1.131.8.1  18-May-2008  yamt sync with head.
 1.131.6.3  17-Jan-2009  mjf Sync with HEAD.
 1.131.6.2  29-Jun-2008  mjf Sync with HEAD.
 1.131.6.1  02-Jun-2008  mjf Sync with HEAD.
 1.134.2.2  18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.134.2.1  23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.136.2.1  03-Jul-2008  simonb Sync with head.
 1.137.6.7  25-Apr-2014  sborrill Pull up the following revisions(s) (requested by maxv in ticket #1901):
sys/kern/vfs_syscalls.c: revision 1.478, 1.480 via patch
sys/coda/coda_vfsops.c: revision 1.81
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.110 via patch
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59 via patch
sys/fs/udf/udf_vfsops.c: revision 1.67
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/kern/vfs_syscalls.c: revision 1.479
sys/miscfs/nullfs/null_vfsops.c: revision 1.88 via patch
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/nfs/nfs_vfsops.c: revision 1.227
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/ufs/mfs/mfs_vfsops.c: revision 1.107

Due to missing checks in the mount syscall, and a wrong assumption on the
file systems side, the kernel could allocate an unbounded or zero-sized
memory buffer, and could dereference a NULL pointer when particular
arguments are given by a user.
 1.137.6.6  16-Jan-2011  bouyer branches: 1.137.6.6.2;
Pull up following revision(s) (requested by tsutsui in ticket #1486):
sbin/fsck_ext2fs/setup.c: revision 1.26
sbin/newfs_ext2fs/mke2fs.c: revision 1.10
sbin/newfs_ext2fs/mke2fs.c: revision 1.11
sbin/newfs_ext2fs/mke2fs.c: revision 1.12
sbin/fsck_ext2fs/inode.c: revision 1.24
sys/lib/libsa/ext2fs.c: revision 1.6
sbin/newfs_ext2fs/extern.h: revision 1.3
sbin/fsck_ext2fs/inode.c: revision 1.25
sys/lib/libsa/ext2fs.c: revision 1.7
sbin/fsck_ext2fs/inode.c: revision 1.26
sys/ufs/ext2fs/ext2fs_inode.c: revision 1.68
sbin/fsck_ext2fs/inode.c: revision 1.27
sbin/fsck_ext2fs/inode.c: revision 1.28
sys/ufs/ext2fs/ext2fs_dinode.h: revision 1.18
sys/ufs/ext2fs/ext2fs_dinode.h: revision 1.19
sbin/newfs_ext2fs/newfs_ext2fs.c: revision 1.5
sbin/newfs_ext2fs/newfs_ext2fs.8: revision 1.2
sbin/newfs_ext2fs/newfs_ext2fs.c: revision 1.6
sbin/newfs_ext2fs/newfs_ext2fs.8: revision 1.3
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.142
sbin/newfs_ext2fs/newfs_ext2fs.c: revision 1.7
sbin/newfs_ext2fs/newfs_ext2fs.8: revision 1.4
sbin/newfs_ext2fs/newfs_ext2fs.c: revision 1.8
PR/40936: Frederik Sausmikat: ext2fs: add support for inodes > 128 bytes
Support variable inode sizes.
catch up with variable inode size.
Don't use e2fs_inode_size in superblock on E2FS_REV0 file system.
- accept only EXT2_REV0_DINODE_SIZE inodesize on -O 0
- use inodesize to get offset of inode, not struct ext2fs_dinode array
Replace a magic number with a new EXT2_REV0_DINODE_SIZE macro.
Use EXT2_DINODE_SIZE() to get offset of inode, not struct ext2fs_dinode array.
Fix botched logic in inodesize check.
Use inodesize to get offset of inode in one more place.
- add a sanity check for e2fs_inode_size in readsb()
- use EXT2_DINODE_SIZE() rather than sizeof(struct ext2fs_dinode) or
struct ext2fs_dinode array/pointer to see e2fs_ipb and inode offsets
Sort options.
New sentence, new line.
Sort options in usage.
- unsigned -> unsigned int
- remove unnecessary casts from malloc(3) and free(3)
- fix a bogus indent
Use "size > INT32_MAX" rather than "size >= 0x80000000U" to check 2GB limit.
Add missed byteswap ops against ext2fs_dinode members.
Handle 32 bit uid field on E2FS_REV1.
 1.137.6.5  27-Oct-2009  bouyer branches: 1.137.6.5.2;
Pull up following revision(s) (requested by pooka in ticket #1112):
sys/ufs/ext2fs/ext2fs_vnops.c: revision 1.91
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.152
sys/ufs/ext2fs/ext2fs_extern.h: revision 1.42
update i_uid and i_gid after chown
 1.137.6.4  16-Oct-2009  snj Pull up following revision(s) (requested by tsutsui in ticket #1060):
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.148
Add a missed brelse(9) call after bread(9) in ext2fs_reload().
This may close PR kern/28712 (ext2fs hang on mount after fsck).
 1.137.6.3  16-Oct-2009  snj Pull up following revision(s) (requested by tsutsui in ticket #1060):
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.147
Pull a fix from ffs_vfsops.c rev 1.248:
Fix bug introduced in revision 1.174(*) where a NULL fspec with an MNT_UPDATE
command would always return EINVAL. This broke fsck on root, where fsck'ing
a dirty root would always return an error causing rc to resort in a reboot.
(*) This is "Apply the NFS exports list rototill patch" change
in ext2fs_vfsops.c rev 1.91.
 1.137.6.2  16-Oct-2009  snj Pull up following revision(s) (requested by tsutsui in ticket #1060):
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.146
Pull a fix for mount function from ffs_vfsops.c rev1.186:
Change ffs_mount, in MNT_UPDATE case, to check dev_t's for equality
instead of just vnode pointers. Fixes erroneous "does not match mounted
device" errors from mount(8) in the presence of MFS /dev, init.root, &c.
 1.137.6.1  29-Nov-2008  snj branches: 1.137.6.1.4;
Pull up following revision(s) (requested by mrg in ticket #147):
sys/ufs/ext2fs/ext2fs_alloc.c: revision 1.37
sys/ufs/ext2fs/ext2fs_bswap.c: revision 1.14
sys/ufs/ext2fs/ext2fs_dinode.h: revision 1.17
sys/ufs/ext2fs/ext2fs_lookup.c: revision 1.56
sys/ufs/ext2fs/ext2fs_vnops.c: revision 1.83
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.140
sys/ufs/ufs/inode.h: revision 1.55
add support for 32 bit uid/gid fields in ext2, but only do so for
when the revision is > REV0.
 1.137.6.6.2.1  28-Apr-2014  sborrill Pull up the following revisions(s) (requested by maxv in ticket #1901):
sys/kern/vfs_syscalls.c: revision 1.478, 1.480 via patch
sys/coda/coda_vfsops.c: revision 1.81
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.110 via patch
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59 via patch
sys/fs/udf/udf_vfsops.c: revision 1.67
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/kern/vfs_syscalls.c: revision 1.479
sys/miscfs/nullfs/null_vfsops.c: revision 1.88 via patch
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/nfs/nfs_vfsops.c: revision 1.227
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/ufs/mfs/mfs_vfsops.c: revision 1.107

Due to missing checks in the mount syscall, and a wrong assumption on the
file systems side, the kernel could allocate an unbounded or zero-sized
memory buffer, and could dereference a NULL pointer when particular
arguments are given by a user.
 1.137.6.5.2.1  28-Apr-2014  sborrill Pull up the following revisions(s) (requested by maxv in ticket #1901):
sys/kern/vfs_syscalls.c: revision 1.478, 1.480 via patch
sys/coda/coda_vfsops.c: revision 1.81
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50 via patch
sys/fs/puffs/puffs_vfsops.c: revision 1.110 via patch
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59 via patch
sys/fs/udf/udf_vfsops.c: revision 1.67
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/kern/vfs_syscalls.c: revision 1.479
sys/miscfs/nullfs/null_vfsops.c: revision 1.88 via patch
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/nfs/nfs_vfsops.c: revision 1.227
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/ufs/mfs/mfs_vfsops.c: revision 1.107

Due to missing checks in the mount syscall, and a wrong assumption on the
file systems side, the kernel could allocate an unbounded or zero-sized
memory buffer, and could dereference a NULL pointer when particular
arguments are given by a user.
 1.137.6.1.4.1  21-Apr-2010  matt sync to netbsd-5
 1.137.4.3  28-Apr-2009  skrll Sync with HEAD.
 1.137.4.2  03-Mar-2009  skrll Sync with HEAD.
 1.137.4.1  19-Jan-2009  skrll Sync with HEAD.
 1.137.2.1  13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.141.2.2  23-Jul-2009  jym Sync with HEAD.
 1.141.2.1  13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.155.2.2  17-Aug-2010  uebayasi Sync with HEAD.
 1.155.2.1  30-Apr-2010  uebayasi Sync with HEAD.
 1.156.2.4  19-May-2011  rmind Implement sharing of vnode_t::v_interlock amongst vnodes:
- Lock is shared amongst UVM objects using uvm_obj_setlock() or getnewvnode().
- Adjust vnode cache to handle unsharing, add VI_LOCKSHARE flag for that.
- Use sharing in tmpfs and layerfs for underlying object.
- Simplify locking in ubc_fault().
- Sprinkle some asserts.

Discussed with ad@.
 1.156.2.3  05-Mar-2011  rmind sync with head
 1.156.2.2  03-Jul-2010  rmind sync with head
 1.156.2.1  16-Mar-2010  rmind Change struct uvm_object::vmobjlock to be dynamically allocated with
mutex_obj_alloc(). It allows us to share the locks among UVM objects.
 1.159.6.1  23-Jun-2011  cherry Catchup with rmind-uvmplock merge.
 1.161.2.6  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was splitted into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.161.2.5  23-Jan-2013  yamt sync with head
 1.161.2.4  16-Jan-2013  yamt sync with (a bit old) head
 1.161.2.3  30-Oct-2012  yamt sync with head
 1.161.2.2  23-May-2012  yamt sync with head.
 1.161.2.1  17-Apr-2012  yamt sync with head
 1.162.12.1  21-Apr-2014  bouyer Pull up following revision(s) (requested by maxv in ticket #1050):
sys/ufs/chfs/chfs_vfsops.c: revision 1.11
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/fs/nilfs/nilfs_vfsops.c: revision 1.16
sys/ufs/mfs/mfs_vfsops.c: revision 1.107
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/kern/vfs_syscalls.c: revision 1.478
sys/kern/vfs_syscalls.c: revision 1.479
sys/fs/puffs/puffs_vfsops.c: revision 1.110
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/nfs/nfs_vfsops.c: revision 1.227
sys/fs/v7fs/v7fs_vfsops.c: revision 1.10
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/miscfs/nullfs/null_vfsops.c: revision 1.88
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50
sys/coda/coda_vfsops.c: revision 1.81
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/kern/vfs_syscalls.c: revision 1.480
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/kern/vfs_syscalls.c: revision 1.482
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vfsops.c: revision 1.12
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/udf/udf_vfsops.c: revision 1.67
Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.
ok christos@
An (un)privileged user can easily make the kernel dereference a NULL
pointer.
The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).
ok christos@
Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.
If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
 1.162.10.1  21-Apr-2014  bouyer Pull up following revision(s) (requested by maxv in ticket #1050):
sys/ufs/chfs/chfs_vfsops.c: revision 1.11
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/fs/nilfs/nilfs_vfsops.c: revision 1.16
sys/ufs/mfs/mfs_vfsops.c: revision 1.107
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/kern/vfs_syscalls.c: revision 1.478
sys/kern/vfs_syscalls.c: revision 1.479
sys/fs/puffs/puffs_vfsops.c: revision 1.110
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/nfs/nfs_vfsops.c: revision 1.227
sys/fs/v7fs/v7fs_vfsops.c: revision 1.10
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/miscfs/nullfs/null_vfsops.c: revision 1.88
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50
sys/coda/coda_vfsops.c: revision 1.81
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/kern/vfs_syscalls.c: revision 1.480
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/kern/vfs_syscalls.c: revision 1.482
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vfsops.c: revision 1.12
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/udf/udf_vfsops.c: revision 1.67
Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.
ok christos@
An (un)privileged user can easily make the kernel dereference a NULL
pointer.
The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).
ok christos@
Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.
If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
 1.162.6.1  21-Apr-2014  bouyer Pull up following revision(s) (requested by maxv in ticket #1050):
sys/ufs/chfs/chfs_vfsops.c: revision 1.11
sys/fs/unionfs/unionfs_vfsops.c: revision 1.13
sys/fs/nilfs/nilfs_vfsops.c: revision 1.16
sys/ufs/mfs/mfs_vfsops.c: revision 1.107
sys/fs/sysvbfs/sysvbfs_vfsops.c: revision 1.43
sys/ufs/ffs/ffs_vfsops.c: revision 1.297
sys/kern/vfs_syscalls.c: revision 1.478
sys/kern/vfs_syscalls.c: revision 1.479
sys/fs/puffs/puffs_vfsops.c: revision 1.110
sys/fs/cd9660/cd9660_vfsops.c: revision 1.84
sys/nfs/nfs_vfsops.c: revision 1.227
sys/fs/v7fs/v7fs_vfsops.c: revision 1.10
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.180
sys/miscfs/umapfs/umap_vfsops.c: revision 1.92
sys/fs/filecorefs/filecore_vfsops.c: revision 1.76
sys/miscfs/nullfs/null_vfsops.c: revision 1.88
sys/fs/ptyfs/ptyfs_vfsops.c: revision 1.50
sys/coda/coda_vfsops.c: revision 1.81
sys/ufs/lfs/lfs_vfsops.c: revision 1.321
sys/fs/tmpfs/tmpfs_vfsops.c: revision 1.59
sys/fs/hfs/hfs_vfsops.c: revision 1.31
sys/miscfs/overlay/overlay_vfsops.c: revision 1.61
sys/fs/union/union_vfsops.c: revision 1.72
sys/fs/ntfs/ntfs_vfsops.c: revision 1.94
sys/kern/vfs_syscalls.c: revision 1.480
sys/fs/efs/efs_vfsops.c: revision 1.25
sys/kern/vfs_syscalls.c: revision 1.482
sys/fs/msdosfs/msdosfs_vfsops.c: revision 1.107
external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vfsops.c: revision 1.12
sys/miscfs/procfs/procfs_vfsops.c: revision 1.91
sys/fs/smbfs/smbfs_vfsops.c: revision 1.100
sys/fs/adosfs/advfsops.c: revision 1.70
sys/fs/udf/udf_vfsops.c: revision 1.67
Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.
ok christos@
An (un)privileged user can easily make the kernel dereference a NULL
pointer.
The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).
ok christos@
Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.
If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
 1.162.4.2  02-Jun-2012  mrg sync to latest -current.
 1.162.4.1  05-Apr-2012  mrg sync to latest -current.
 1.166.2.4  03-Dec-2017  jdolecek update from HEAD
 1.166.2.3  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.166.2.2  23-Jun-2013  tls resync from head
 1.166.2.1  25-Feb-2013  tls resync with head
 1.171.2.2  18-May-2014  rmind sync with head
 1.171.2.1  28-Aug-2013  rmind sync with head
 1.179.2.1  10-Aug-2014  tls Rebase.
 1.183.2.2  17-Jan-2015  martin Pull up following revision(s) (requested by maxv in ticket #427):
sys/compat/svr4/svr4_schedctl.c: revision 1.8
sys/netinet/tcp_timer.c: revision 1.88
sys/miscfs/genfs/layer_vfsops.c: revision 1.45
sys/compat/svr4/svr4_ioctl.c: revision 1.37
sys/ufs/chfs/chfs_vfsops.c: revision 1.14
sys/miscfs/fdesc/fdesc_vfsops.c: revision 1.91
sys/compat/linux/arch/i386/linux_ptrace.c: revision 1.30
sys/compat/common/kern_time_50.c: revision 1.28
sys/netinet6/ip6_forward.c: revision 1.74
sys/miscfs/umapfs/umap_vnops.c: revision 1.57
sys/compat/svr4/svr4_fcntl.c: revision 1.74
distrib/sets/lists/comp/mi: revision 1.1931
sys/netinet6/udp6_output.c: revision 1.46
sys/fs/puffs/puffs_compat.c: revision 1.3
sys/fs/udf/udf_rename.c: revision 1.11
sys/compat/svr4/svr4_filio.c: revision 1.24
sys/fs/udf/udf_rename.c: revision 1.12
sys/netinet/tcp_usrreq.c: revision 1.202
sys/miscfs/umapfs/umap_subr.c: revision 1.29
sys/compat/linux/common/linux_fadvise64.c: revision 1.3
sys/netinet/if_atm.c: revision 1.34
sys/miscfs/procfs/procfs_subr.c: revision 1.106
sys/miscfs/genfs/layer_subr.c: revision 1.37
sys/netinet/tcp_sack.c: revision 1.30
sys/compat/freebsd/freebsd_misc.c: revision 1.33
sys/compat/freebsd/freebsd_file.c: revision 1.33
sys/ufs/chfs/chfs_vnode.c: revision 1.12
sys/compat/svr4/svr4_ttold.c: revision 1.34
sys/compat/linux/common/linux_file.c: revision 1.114
sys/compat/linux/arch/mips/linux_machdep.c: revision 1.43
sys/compat/linux/common/linux_signal.c: revision 1.76
sys/compat/common/compat_util.c: revision 1.46
sys/compat/linux/arch/arm/linux_ptrace.c: revision 1.18
sys/compat/svr4/svr4_sockio.c: revision 1.36
sys/compat/linux/arch/arm/linux_machdep.c: revision 1.32
sys/compat/svr4/svr4_signal.c: revision 1.66
sys/kern/kern_exec.c: revision 1.410
sys/fs/puffs/puffs_vfsops.c: revision 1.115
sys/compat/svr4/svr4_exec_elf64.c: revision 1.15
sys/compat/linux/arch/i386/linux_machdep.c: revision 1.159
sys/compat/linux/arch/alpha/linux_machdep.c: revision 1.50
sys/compat/linux32/common/linux32_misc.c: revision 1.24
sys/netinet/in_pcb.c: revision 1.153
sys/sys/malloc.h: revision 1.116
sys/compat/common/if_43.c: revision 1.9
share/man/man9/Makefile: revision 1.380
sys/netinet/tcp_vtw.c: revision 1.12
sys/miscfs/umapfs/umap_vfsops.c: revision 1.95
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.186
sys/compat/common/uipc_syscalls_43.c: revision 1.46
sys/ufs/ext2fs/ext2fs_vnops.c: revision 1.115
sys/fs/puffs/puffs_msgif.c: revision 1.97
sys/compat/svr4/svr4_ipc.c: revision 1.27
sys/compat/linux/common/linux_exec.c: revision 1.117
sys/ufs/ext2fs/ext2fs_readwrite.c: revision 1.66
sys/netinet/tcp_output.c: revision 1.179
sys/compat/svr4/svr4_termios.c: revision 1.28
sys/fs/udf/udf_strat_bootstrap.c: revision 1.4
sys/fs/puffs/puffs_subr.c: revision 1.67
sys/fs/puffs/puffs_node.c: revision 1.36
sys/miscfs/overlay/overlay_vnops.c: revision 1.21
sys/fs/cd9660/cd9660_node.c: revision 1.34
sys/netinet/raw_ip.c: revision 1.146
sys/sys/mallocvar.h: revision 1.13
sys/miscfs/overlay/overlay_vfsops.c: revision 1.63
share/man/man9/malloc.9: revision 1.50
sys/netinet6/dest6.c: revision 1.18
sys/compat/linux/common/linux_uselib.c: revision 1.33
sys/compat/linux/common/linux_socket.c: revision 1.120
share/man/man9/malloc.9: revision 1.51
sys/netinet/tcp_subr.c: revision 1.257
sys/compat/linux/common/linux_socketcall.c: revision 1.45
sys/compat/linux/common/linux_fadvise64_64.c: revision 1.3
sys/compat/freebsd/freebsd_ipc.c: revision 1.17
sys/compat/linux/common/linux_misc_notalpha.c: revision 1.109
sys/compat/linux/arch/alpha/linux_pipe.c: revision 1.17
sys/netinet6/in6_pcb.c: revision 1.132
sys/netinet6/in6_ifattach.c: revision 1.94
sys/compat/svr4/svr4_exec_elf32.c: revision 1.15
sys/miscfs/nullfs/null_vfsops.c: revision 1.90
sys/fs/cd9660/cd9660_util.c: revision 1.12
sys/compat/linux/arch/powerpc/linux_machdep.c: revision 1.48
sys/compat/freebsd/freebsd_exec_elf32.c: revision 1.20
sys/miscfs/procfs/procfs_vfsops.c: revision 1.94
sys/compat/linux/arch/powerpc/linux_ptrace.c: revision 1.28
sys/compat/linux/common/linux_sched.c: revision 1.67
sys/compat/linux/common/linux_exec_aout.c: revision 1.67
sys/compat/linux/common/linux_pipe.c: revision 1.67
sys/compat/linux/common/linux_llseek.c: revision 1.34
sys/compat/linux/arch/mips/linux_ptrace.c: revision 1.10
Do not uselessly include <sys/malloc.h>.
Cleanup:
- remove struct kmembuckets (dead)
- correctly deadify MALLOC_XX
- remove MALLOC_DEFINE_LIMIT and MALLOC_JUSTDEFINE_LIMIT (dead)
- remove malloc_roundup(), malloc_type_setlimit(), MALLOC_DEFINE_LIMIT()
and MALLOC_JUSTDEFINE_LIMIT() from man 9 malloc
New sentence, new line. Bump date for previous.
Obsolete malloc_roundup(9), malloc_type_setlimit(9) and MALLOC_DEFINE_LIMIT(9)
man pages.
 1.183.2.1  22-Aug-2014  martin Pull up following revision(s) (requested by hannken in ticket #49):
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.184
Use mount from argument "mp", "vp->v_mount" is not valid here.
PR kern/49142 (panic in ext2fs_loadvnode mounting an ext2fs filesystem)
 1.186.2.3  28-Aug-2017  skrll Sync with HEAD
 1.186.2.2  05-Oct-2016  skrll Sync with HEAD
 1.186.2.1  06-Apr-2015  skrll Sync with HEAD
 1.193.2.4  26-Apr-2017  pgoyette Sync with HEAD
 1.193.2.3  20-Mar-2017  pgoyette Sync with HEAD
 1.193.2.2  06-Aug-2016  pgoyette Sync with HEAD
 1.193.2.1  20-Jul-2016  pgoyette Adapt machine-independant code to the new {b,c}devsw reference-counting
(using localcount(9)). All callers of {b,c}devsw_lookup() now call
{b,c}devsw_lookup_acquire() which retains a reference on the 'struct
{b,c}devsw'. This reference must be released by the caller once it is
finished with the structure's content (or other data that would disappear
if the 'struct {b,c}devsw' were to disappear).
 1.204.2.1  21-Apr-2017  bouyer Sync with HEAD
 1.208.2.2  17-May-2017  pgoyette Typo - insert missing *
 1.208.2.1  17-May-2017  pgoyette Adapt for bdevsw_lookup_acquire()
 1.210.2.3  18-Jan-2019  pgoyette Synch with HEAD
 1.210.2.2  26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.210.2.1  25-Jun-2018  pgoyette Sync with HEAD
 1.211.2.3  13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.211.2.2  08-Apr-2020  martin Merge changes from current as of 20200406
 1.211.2.1  10-Jun-2019  christos Sync with HEAD
 1.214.4.3  29-Feb-2020  ad Sync with head.
 1.214.4.2  19-Jan-2020  ad Set IMNT_SHRLOOKUP and use it for the in-cache case. Need to check what
more can be done with tmpfs though, it can probably do the whole lookup.
 1.214.4.1  17-Jan-2020  ad Sync with head.
 1.214.2.1  07-Jan-2025  martin Pull up following revision(s) (requested by hannken in ticket #1934):

sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.228
sys/ufs/lfs/lfs_vfsops.c: revision 1.383
sys/ufs/ffs/ffs_wapbl.c: revision 1.50
sys/ufs/ffs/ffs_vfsops.c: revision 1.383 (patch)
sys/ufs/ffs/ffs_vfsops.c: revision 1.384 (patch)

Remove comment "we are always called with the filesystem marked `MPBUSY'."
above some xxx_sync() operations. These operations get called without
any exclusive lock.

This comment appeared with "add quota support" on 1990-05-02.
On 1998/02/18 MNT_MPBUSY disappeared when vfs_busy() was changed from
an exclusive lock to a shared lock.

PR kern/58837 "ffs: Missing locking around fs_fmod/time"

Protect test/clear fs->fs_fmod with um_lock like it is already
protected in ffs_alloc.c.

When writing to disk protect moving superblock to buffer with um_lock.

Set/clear fs->fmod while mounting, updating a mount or unmounting
is safe as these operations run exclusive, either mounting creates
a new file system or the file system is suspended. Assert suspension
for update and unmount.

PR kern/58837 "ffs: Missing locking around fs_fmod/time"
 1.221.4.1  07-Jan-2025  martin Pull up following revision(s) (requested by hannken in ticket #1037):

sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.228
sys/ufs/lfs/lfs_vfsops.c: revision 1.383
sys/ufs/ffs/ffs_wapbl.c: revision 1.50
sys/ufs/ffs/ffs_vfsops.c: revision 1.383
sys/ufs/ffs/ffs_vfsops.c: revision 1.384

Remove comment "we are always called with the filesystem marked `MPBUSY'."
above some xxx_sync() operations. These operations get called without
any exclusive lock.

This comment appeared with "add quota support" on 1990-05-02.
On 1998/02/18 MNT_MPBUSY disappeared when vfs_busy() was changed from
an exclusive lock to a shared lock.

PR kern/58837 "ffs: Missing locking around fs_fmod/time"

Protect test/clear fs->fs_fmod with um_lock like it is already
protected in ffs_alloc.c.

When writing to disk protect moving superblock to buffer with um_lock.

Set/clear fs->fmod while mounting, updating a mount or unmounting
is safe as these operations run exclusive, either mounting creates
a new file system or the file system is suspended. Assert suspension
for update and unmount.

PR kern/58837 "ffs: Missing locking around fs_fmod/time"

RSS XML Feed