History log of /src/sys/ufs/lfs/lfs_bio.c |
Revision | | Date | Author | Comments |
1.150 |
| 15-Sep-2025 |
perseant | If we don't have enough space, flush with checkpoint: the Ifile might be clogging up the buffer cache.
Rewrite the logic in lfs_flush() so that the requested filesystem is always flushed, regardless of whether only_onefs is set.
Use LFS_WAIT_BYTES and LFS_WAIT_BUFS as the thresholds when determining whether to wait for resources, rather than their _MAX_ counterparts.
|
1.149 |
| 05-Sep-2020 |
riastradh | Round of uvm.h cleanup.
The poorly named uvm.h is generally supposed to be for uvm-internal users only.
- Narrow it to files that actually need it -- mostly files that need to query whether curlwp is the pagedaemon, which should maybe be exposed by an external header.
- Use uvm_extern.h where feasible and uvm_*.h for things not exposed by it. We should split up uvm_extern.h but this will serve for now to reduce the uvm.h dependencies.
- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use UVMHIST(ubchist), since ubchist is declared in uvm.h but the reference evaporates if UVMHIST is not defined, so we reduce header file dependencies.
- Make uvm_device.h and uvm_swap.h independently includable while here.
ok chs@
|
1.148 |
| 11-Jun-2020 |
ad | uvm_availmem(): give it a boolean argument to specify whether a recent cached value will do, or if the very latest total must be fetched. It can be called thousands of times a second and fetching the totals impacts not only the calling LWP but other CPUs doing unrelated activity in the VM system.
|
1.147 |
| 14-Mar-2020 |
ad | OR into bp->b_cflags; don't overwrite.
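The difference between overwriting and OR-ing a flags word can be sketched in userland C. This is only an illustration: `struct sk_buf` and the `SK_FLAG_*` bits are hypothetical stand-ins, not the kernel's `struct buf` or its real flag values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits, for illustration only; not the kernel's values. */
#define SK_FLAG_BUSY 0x01u
#define SK_FLAG_AGE  0x02u

/* Hypothetical stand-in for a buffer header with a flags word. */
struct sk_buf {
	unsigned int b_cflags;
};

/* Overwriting: any flag that was already set is lost. */
unsigned int
sk_set_overwrite(struct sk_buf *bp, unsigned int flag)
{
	assert(bp != NULL);
	bp->b_cflags = flag;
	return bp->b_cflags;
}

/* OR-ing in: existing flags are preserved and the new one is added. */
unsigned int
sk_set_or(struct sk_buf *bp, unsigned int flag)
{
	assert(bp != NULL);
	bp->b_cflags |= flag;
	return bp->b_cflags;
}
```

Starting from a buffer that already has `SK_FLAG_BUSY` set, `sk_set_overwrite()` leaves only the new flag, while `sk_set_or()` leaves both.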
|
1.146 |
| 23-Feb-2020 |
riastradh | Prevent new dirops while we issue lfs_flush_dirops.
lfs_flush_dirops assumes (by KASSERT((ip->i_state & IN_ADIROP) == 0)) that vnodes on the dchain will not become involved in active dirops even while holding no other locks (lfs_lock, v_interlock), so we must set lfs_writer here. All other callers already set lfs_writer.
We set fs->lfs_writer++ without explicitly doing lfs_writer_enter because
(a) we already waited for the dirops to drain, and (b) we hold lfs_lock and cannot drop it before setting lfs_writer.
|
1.145 |
| 18-Feb-2020 |
chs | remove the aiodoned thread. I originally added this to provide a thread context for doing page cache iodone work, but since then biodone() has changed to hand off all iodone work to a softint thread, so we no longer need the special-purpose aiodoned thread.
|
1.144 |
| 31-Dec-2019 |
ad | branches: 1.144.2; Rename uvm_free() -> uvm_availmem().
|
1.143 |
| 21-Dec-2019 |
ad | uvmexp.free -> uvm_free()
|
1.142 |
| 09-Jun-2018 |
zafer | branches: 1.142.2; 1.142.6; Add missing b_cflags and b_oflags. Ok dholland@ Addresses PR kern/42342 by Yoshihiro Nakajima
|
1.141 |
| 10-Jun-2017 |
maya | branches: 1.141.4; Rename i_flag to i_state.
The similarity to i_flags has previously caused errors.
|
1.140 |
| 08-Jun-2017 |
chs | move some buffer cache internals declarations from buf.h to vfs_bio.c. this is needed to avoid name conflicts with ZFS and also makes it clearer that other code shouldn't be messing with these. remove the LFS debug code that poked around in bufqueues and remove the BQ_EMPTY bufqueue since nothing uses it anymore. provide a function to let LFS and wapbl read the value of nbuf for now.
|
1.139 |
| 17-Apr-2017 |
hannken | branches: 1.139.4; Remove unused argument "nextp" from vfs_busy() and vfs_unbusy(). Remove argument "keepref" from vfs_unbusy() and add vfs_ref() where needed.
|
1.138 |
| 13-Apr-2017 |
hannken | Switch lfs_flush() and lfs_writerd() to mountlist iterator.
|
1.137 |
| 01-Apr-2017 |
maya | Switch lfs_writer_daemon to use condvar instead of mtsleep. track thread existence with struct lwp instead of pid + lid, it's more useful from ddb.
|
1.136 |
| 13-Mar-2017 |
riastradh | #if DIAGNOSTIC panic ---> KASSERT
Replace some #if DEBUG by this too. DEBUG is only for expensive assertions; these are not.
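The shape of this conversion can be sketched as follows. `SK_KASSERT`, `sk_frag_old()` and `sk_frag_new()` are hypothetical userland stand-ins written for this illustration (with `abort()` standing in for `panic()`); they are not the code that was changed.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * SK_KASSERT is a hypothetical userland stand-in for a kernel assertion:
 * active only when DIAGNOSTIC is defined, compiled out otherwise.
 */
#ifdef DIAGNOSTIC
#define SK_KASSERT(e)	assert(e)
#else
#define SK_KASSERT(e)	((void)0)
#endif

/* Old style: an explicit conditional block around the check. */
int
sk_frag_old(int nbytes, int fragsize)
{
#ifdef DIAGNOSTIC
	if (fragsize <= 0)
		abort();	/* stands in for panic() */
#endif
	return nbytes / fragsize;
}

/* New style: the same cheap check as a one-line assertion. */
int
sk_frag_new(int nbytes, int fragsize)
{
	SK_KASSERT(fragsize > 0);
	return nbytes / fragsize;
}
```

Both versions behave identically for valid input; the assertion form simply keeps the check to one line and removes the preprocessor block from the function body.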
|
1.135 |
| 03-Oct-2015 |
hannken | branches: 1.135.2; 1.135.4; Remove dubious vhold()/holdrele() from lfs_reserve(). The vnodes are always referenced on entry.
If we changed ulfs_remove() and ulfs_rmdir() to return the locked dvp, the vnodes would always be locked on entry.
Remove an outdated comment from lfs_reserveavail(), unlocking/relocking the vnode was removed in rev 1.49.
|
1.134 |
| 12-Aug-2015 |
dholland | Hack up dinode usage to be 64 vs. 32 as needed. Part 1.
(This part changes the native lfs code; the ufs-derived code already has 64 vs. 32 logic, but as aspects of it are unsafe, and don't entirely interoperate cleanly with the lfs 64/32 stuff, pass 2 will be rehashing that.)
|
1.133 |
| 02-Aug-2015 |
dholland | Fix assorted 64 -> 32 truncations in lfs. Also, some minor tidyups and corrections in passing.
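A 64 -> 32 truncation of the kind mentioned here can be sketched in plain C. The function names are hypothetical and the example uses unsigned types so that the conversion is well defined; it does not reproduce any specific fix from this revision.

```c
#include <assert.h>
#include <stdint.h>

/* True if a 64-bit block address survives storage in a 32-bit field. */
int
sk_fits_u32(uint64_t addr)
{
	return addr <= UINT32_MAX;
}

/* The bug class: the high 32 bits are silently dropped. */
uint32_t
sk_truncate(uint64_t addr)
{
	return (uint32_t)addr;
}

/* A guarded narrowing: assert that nothing is lost before converting. */
uint32_t
sk_narrow_checked(uint64_t addr)
{
	assert(sk_fits_u32(addr));
	return (uint32_t)addr;
}
```

An address such as `0x100000005` does not fit in 32 bits, and `sk_truncate()` quietly turns it into `5`, which is a valid-looking but wrong block address.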
|
1.132 |
| 28-Jul-2015 |
dholland | Add a new lfs header file: lfs_accessors.h.
This contains all the accessor functions and macros out of lfs.h. Add an include of lfs_accessors.h after all uses of lfs.h... except for code that wants to define its own struct lfs-alike that the accessors are supposed to play along with. For these, set STRUCT_LFS and include lfs_accessors.h after the necessary structure has been defined, so that lfs_accessors.h can emit functions in terms of it.
|
1.131 |
| 25-Jul-2015 |
martin | Use accessors in DEBUG and DIAGNOSTIC code as well
|
1.130 |
| 24-Jul-2015 |
dholland | More lfs superblock accessors. (This changes the rest of the code over; all the accessors were already added.)
The difference between this commit and the previous one is arbitrary, but the previous one passed the regression tests on its own so I'm keeping it separate to help with any bisections that might be needed in the future.
|
1.129 |
| 24-Jul-2015 |
dholland | Switch to accessor functions for elements of the LFS on-disk superblock. This will allow switching between 32/64 bit forms on the fly; it will also allow handling LFS_EI reasonably tidily. (That currently doesn't work on the superblock.)
It also gets rid of cpp abuse in the form of fake structure member macros.
Also, instead of doing sleep/wakeup on &lfs_avail and &lfs_nextseg inside the on-disk superblock, add extra elements to the in-memory struct lfs for this. (XXX: these should be changed to condvars, but not right now)
XXX: this migrates a structure needed by the lfs code in libsa (struct salfs) into lfs.h, where it doesn't belong, but for the time being this is necessary in order to allow the accessors (and the various lfs macros and other goop that relies on them) to compile.
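The accessor approach described above can be sketched roughly as follows. Everything here is a hypothetical simplification: `struct sk_lfs`, the two layout structs, and `sk_getbfree()`/`sk_setbfree()` are invented for illustration and do not match the real definitions in lfs.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical on-disk layouts; field names are illustrative only. */
struct sk_dlfs32 { uint32_t bfree; };
struct sk_dlfs64 { uint64_t bfree; };

/* Hypothetical in-memory filesystem structure wrapping either layout. */
struct sk_lfs {
	int is64;
	union {
		struct sk_dlfs32 u32;
		struct sk_dlfs64 u64;
	} dlfs;
};

/* Callers use the accessor and never touch the layout directly. */
uint64_t
sk_getbfree(const struct sk_lfs *fs)
{
	return fs->is64 ? fs->dlfs.u64.bfree : fs->dlfs.u32.bfree;
}

void
sk_setbfree(struct sk_lfs *fs, uint64_t val)
{
	if (fs->is64) {
		fs->dlfs.u64.bfree = val;
	} else {
		assert(val <= UINT32_MAX);	/* must fit the 32-bit form */
		fs->dlfs.u32.bfree = (uint32_t)val;
	}
}
```

Because the width decision lives in one place, the same calling code works for either superblock form, which is the property a fake structure-member macro cannot provide at run time.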
|
1.128 |
| 27-Nov-2013 |
christos | branches: 1.128.6; Change the queue.3 *_END(&head) macros to NULL. Since we don't have CIRCLEQ anymore, all the macros expand to NULL anyway, so this improves readability. Requested by rmind@
|
1.127 |
| 23-Nov-2013 |
christos | change the mountlist CIRCLEQ into a TAILQ
|
1.126 |
| 28-Jul-2013 |
dholland | Add lfs_kernel.h for declarations that don't need to be exposed to userland.
lfs currently has the following headers:
  lfs.h - on-disk structures and stuff needed for userlevel tools
  lfs_inode.h - additional restricted materials for userlevel tools that operate the fs (newfs_lfs, fsck_lfs, lfs_cleanerd)
  lfs_kernel.h - stuff needed only in the kernel
and the following legacy headers that are expected to be mopped up and folded into one of the above:
  lfs_extern.h - function prototypes
  ulfs_bswap.h - endian-independent support
  ulfs_dinode.h - now contains very little
  ulfs_dirhash.h - dirhash support
  ulfs_extattr.h - extattr support
  ulfs_extern.h - more function prototypes
  ulfs_inode.h - assorted kernel-only declarations
  ulfs_quota.h - quota support
  ulfs_quota1.h - more quota support
  ulfs_quota2.h - more quota support
  ulfs_quotacommon.h - more quota support
  ulfsmount.h - legacy copy of ufsmount material
|
1.125 |
| 18-Jun-2013 |
christos | branches: 1.125.2; Prefix most of the cpp macros with lfs_ and LFS_ to avoid conflicts with ffs. This was done so that boot blocks that want to compile both FFS and LFS in the same file work.
|
1.124 |
| 06-Jun-2013 |
dholland | Split lfs from ufs step 4:
Massedit all ufs symbols to be "ulfs" instead, to make sure there are no conflicts with ufs. Confirmed with grep.
(This required changing a few comments that maybe should have been left alone to say "ulfs", but we'll survive that.)
|
1.123 |
| 06-Jun-2013 |
dholland | Split lfs from ufs, part 2:
Change all <ufs/ufs/foo.h> includes to <ufs/lfs/ulfs_foo.h>.
|
1.122 |
| 16-Feb-2012 |
perseant | branches: 1.122.2; Pass t_renamerace and t_rmdirrace tests.
Adapt dholland@'s fix to ufs_rename to fix PR kern/43582. Address several other MP locking issues discovered during the course of investigating the same problem.
Removed extraneous vn_lock() calls on the Ifile, since the Ifile writes are controlled by the segment lock.
Fix PR kern/45982 by deemphasizing the estimate of how much metadata will fill the empty space on disk when the disk is nearly empty (t_renamerace creates a lot of inode blocks on a tiny empty disk).
|
1.121 |
| 02-Jan-2012 |
perseant | branches: 1.121.2;
* Remove PGO_RECLAIM during lfs_putpages()' call to genfs_putpages(), to avoid a live lock in the latter when reclaiming a vnode with dirty pages.
* Add a new segment flag, SEGM_RECLAIM, to note when a segment is being written for vnode reclamation, and record which inode is being reclaimed, to aid in forensic debugging.
* Add a new segment flag, SEGM_SINGLE, so that opportunistic writes can write a single segment's worth of blocks and then stop, rather than writing all the way up to the cleaner's reserved number of segments.
* Add assert statements to check mutex ownership is the way it ought to be, mostly in lfs_putpages; fix problems uncovered by this.
* Don't clear VU_DIROP until the inode actually makes its way to disk, avoiding a problem where dirop inodes could become separated (uncovered by a modified version of the "ckckp" forensic regression test).
* Move the vfs_getopsbyname() call into lfs_writerd. Prepare code to make lfs_writerd notice when there are no more LFSs, and exit losing the reference, so that, in theory, the module can be unloaded. This code is not enabled, since it causes a crash on exit.
* Set IN_MODIFIED on inodes flushed by lfs_flush_dirops. Really we only need to set IN_MODIFIED if we are going to write them again (e.g., to write pages); need to think about this more.
Finally, several changes to help avoid "no clean segments" panics:
* In lfs_bmapv, note when a vnode is loaded only to discover whether its blocks are live, so it can immediately be recycled. Since the cleaner will try to choose ~empty segments over full ones, this prevents the cleaner from (1) filling the vnode cache with junk, and (2) squeezing any unwritten writes to disk and running the fs out of segments.
* Overestimate by half the amount of metadata that will be required to fill the clean segments. This will make the disk appear smaller, but should help avoid a "no clean segments" panic.
* Rearrange lfs_writerd. In particular, lfs_writerd now pays attention to the number of clean segments available, and holds off writing until there is room.
|
1.120 |
| 11-Jul-2011 |
hannken | branches: 1.120.2; 1.120.6; Change VOP_BWRITE() to take a vnode as its first argument like all other VOPs do. Layered file systems no longer have to modify bp->b_vp and run into trouble when an async VOP_BWRITE() uses the wrong vnode.
- change all occurrences of VOP_BWRITE(bp) to VOP_BWRITE(bp->b_vp, bp).
- remove layer_bwrite().
- welcome to 5.99.55
Addresses PR kern/38762 panic: vwakeup: neg numoutput
No objections from tech-kern@.
|
1.119 |
| 12-Jun-2011 |
rmind | Welcome to 5.99.53! Merge rmind-uvmplock branch:
- Reorganize locking in UVM and provide extra serialisation for pmap(9). New lock order: [vmpage-owner-lock] -> pmap-lock.
- Simplify locking in some pmap(9) modules by removing P->V locking.
- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).
- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner. Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.
- Unify /dev/mem et al in MI code and provide required locking (removes kernel-lock on some ports). Also, avoid cache-aliasing issues.
Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches formed the core changes of this branch.
|
1.118 |
| 24-Jun-2010 |
hannken | branches: 1.118.6; Clean up vnode lock operations pass 2:
VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.
Welcome to 5.99.32.
Discussed on tech-kern.
|
1.117 |
| 16-Feb-2010 |
mlelstv | branches: 1.117.2; Three changes in a single commit.
- drop the notion of frags (LFS fragments) vs fsb (FFS fragments). The code uses a complicated unity function that just makes the code difficult to understand.
- support larger sector sizes. Fix disk address computations to use DEV_BSIZE in the kernel as required by device drivers and to use sector sizes in userland.
- Fix several locking bugs in lfs_bio.c and lfs_subr.c.
|
1.116 |
| 08-Jan-2010 |
pooka | branches: 1.116.2; The VATTR_NULL/VREF/VHOLD/HOLDRELE() macros lost their will to live years ago when the kernel was modified to not alter ABI based on DIAGNOSTIC, and now just call the respective function interfaces (in lowercase). Plenty of mix'n match upper/lowercase has crept into the tree since then. Nuke the macros and convert all callsites to lowercase.
no functional change
|
1.115 |
| 07-Dec-2009 |
eeh | Fix some more hangs and deadlocks.
|
1.114 |
| 06-May-2008 |
ad | branches: 1.114.18; PR kern/38141 lookup/vfs_busy acquire rwlock recursively
Simplify the mount locking. Remove all the crud to deal with recursion on the mount lock, and crud to deal with unmount as another weirdo lock.
Hopefully this will once and for all fix the deadlocks with this. With this commit there are two locks on each mount:
- krwlock_t mnt_unmounting. This is used to prevent unmount across critical sections like getnewvnode(). It's only ever read locked with rw_tryenter(), and is only ever write locked in dounmount(). A write hold can't be taken on this lock if the current LWP could hold a vnode lock.
- kmutex_t mnt_updating. This is taken by threads updating the mount, for example when going r/o -> r/w, and is only present to serialize updates. In order to take this lock, a read hold must first be taken on mnt_unmounting, and the two need to be held across the operation.
One effect of this change: previously if an unmount failed, we would make a half hearted attempt to back out of it gracefully, but that was unlikely to work in a lot of cases. Now while an unmount that will be aborted is in progress, new file operations within the mount will fail instead of being delayed. That is unlikely to be a problem though, because if the admin requests unmount of a file system then s(he) has made a decision to deny access to the resource.
|
1.113 |
| 30-Apr-2008 |
ad | PR kern/38135 vfs_busy/vfs_trybusy confusion
The previous fix worked, but it opened a window where mounts could have disappeared from mountlist while the caller was traversing it using vfs_trybusy(). Fix that.
|
1.112 |
| 29-Apr-2008 |
ad | kern/38135 vfs_busy/vfs_trybusy confusion
The symptom was that file systems would occasionally not appear in output from 'df' or 'mount' if the system was busy. Resolution:
- Make mount locks work somewhat like vm_map locks.
- vfs_trybusy() now only fails if the mount is gone, or if someone is unmounting the file system. Simple contention on mnt_lock doesn't cause it to fail.
- vfs_busy() will wait even if the file system is being unmounted.
|
1.111 |
| 28-Apr-2008 |
martin | Remove clause 3 and 4 from TNF licenses
|
1.110 |
| 20-Feb-2008 |
matt | branches: 1.110.6; 1.110.8; 1.110.10; Merge all the *different* definitions of bufqueues into one common one.
|
1.109 |
| 15-Feb-2008 |
ad | The buffer LOCKED flag need not be under the protection of bufcache_lock, BUSY is enough.
|
1.108 |
| 30-Jan-2008 |
ad | PR kern/37706 (forced unmount of file systems is unsafe):
- Do reference counting for 'struct mount'. Each vnode associated with a mount takes a reference, and in turn the mount takes a reference to the vfsops.
- Now that mounts are reference counted, replace the overcomplicated mount locking inherited from 4.4BSD with a recursable rwlock.
|
1.107 |
| 02-Jan-2008 |
ad | Merge vmlocking2 to head.
|
1.106 |
| 11-Oct-2007 |
ad | branches: 1.106.4; 1.106.6; 1.106.10; Remove LOCK_ASSERT(!simple_lock_held(&foo));
|
1.105 |
| 10-Oct-2007 |
ad | Merge from vmlocking:
- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
|
1.104 |
| 08-Oct-2007 |
ad | Merge ffs locking & brelse changes from the vmlocking branch.
|
1.103 |
| 29-Jul-2007 |
ad | branches: 1.103.4; 1.103.6; 1.103.8; 1.103.10; It's not a good idea for device drivers to modify b_flags, as they don't need to understand the locking around that field. Instead of setting B_ERROR, set b_error instead. b_error is 'owned' by whoever completes the I/O request.
|
1.102 |
| 17-Jul-2007 |
christos | branches: 1.102.2; eliminate MFSNAMELEN
|
1.101 |
| 16-May-2007 |
perseant | Change references to SEGM_W_DIROPS to SEGM_CKP, and replace the logic that formerly used SEGM_W_DIROPS in lfs_segwrite() appropriately. This prevents a problem in which processes could get stuck in "buffers" sleep forever.
|
1.100 |
| 18-Apr-2007 |
perseant | Add/change a couple of comments about locking restrictions.
|
1.99 |
| 17-Apr-2007 |
perseant | Install a new sysctl, vfs.lfs.ignore_lazy_sync, which causes LFS to ignore the "smooth" syncer, as if vfs.sync.*delay = 0, but only for LFS. The default is "on", i.e., ignore lazy sync.
Reduce the amount of polling/busy-waiting done by lfs_putpages(). To accomplish this, copied genfs_putpages() and modified it to indicate which page it was that caused it to return with EDEADLK. fsync()/fdatasync() should no longer ever fail with EAGAIN, and should not consume huge quantities of cpu.
Also, try to make dirops less likely to be written as the result of a VOP_PUTPAGES(), while ensuring that they are written regularly.
|
1.98 |
| 16-Nov-2006 |
christos | branches: 1.98.2; 1.98.4; 1.98.8; 1.98.10; 1.98.16; __unused removal on arguments; approved by core.
|
1.97 |
| 12-Oct-2006 |
christos | - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
|
1.96 |
| 04-Oct-2006 |
christos | fix empty if
|
1.95 |
| 15-Sep-2006 |
yamt | branches: 1.95.2; merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
- implement an alternative replacement policy
|
1.94 |
| 29-Jun-2006 |
perseant | branches: 1.94.4; Don't wake up the cleaner if the filesystem is unwrappable, and fix the compatibility fcntls.
Also includes one-line fixes for an MP locking bug and a zero-length FINFO problem that manifested during testing.
|
1.93 |
| 14-May-2006 |
elad | branches: 1.93.4; integrate kauth.
|
1.92 |
| 04-May-2006 |
perseant | Introduce another per-filesystem parameter, lfs_resvseg, to separate the notion of "how many segments are reserved for the cleaner" from that of "how many segments are not counted in lfs_bfree". The default value used for existing filesystems is the same as the previous implicit value of (lfs_minfreeseg / 2 + 1), modulo some sanity checking.
Count pending dirops on a per-filesystem basis, since once we start writing them we can't stop until we're done. This seems to help stave off the "no clean segments" panic in the case of filling the filesystem with directories and small files (e.g. simultaneously unpacking more copies of pkgsrc than will fit).
|
1.91 |
| 13-Apr-2006 |
perseant | Make lfs_vref/lfs_vunref not need to know about VXLOCK and VFREEING explicitly (especially since we didn't know about VFREEING at all before), but notice the EBUSY return from vget() instead.
Fix some more MP locking protocol issues, most of which were pointed out by Christian Ehrhardt this morning on tech-kern.
|
1.90 |
| 05-Mar-2006 |
christos | branches: 1.90.2; 1.90.4; cleanup more SET/CLR/ISSET lossage
|
1.89 |
| 06-Jan-2006 |
yamt | branches: 1.89.2; 1.89.4; 1.89.6; initialize necessary members of struct buf. PR/32462 from Reinoud Zandijk.
|
1.88 |
| 04-Jan-2006 |
yamt | - add simple functions to allocate/free a buffer for i/o.
- make bufpool static.
|
1.87 |
| 11-Dec-2005 |
christos | branches: 1.87.2; merge ktrace-lwp.
|
1.86 |
| 29-May-2005 |
christos | branches: 1.86.2;
- sprinkle const
- avoid shadow variables.
|
1.85 |
| 23-Apr-2005 |
perseant | Provide a resize_lfs(8), including kernel and cleaner support. The current implementation requires the fs to be mounted while resizing. Tested in both directions, and everything appears to work happily, but ymmv.
|
1.84 |
| 19-Apr-2005 |
perseant | Keep per-inode, per-fs, and subsystem-wide counts of blocks allocated through lfs_balloc(), and use that to estimate the number of dirty pages belonging to LFS (subsystem or filesystem). This is almost certainly wrong for the case of a large mmap()ed region, but the accounting is tighter than what we had before, and performs much better in the typical case of pages dirtied through write().
|
1.83 |
| 06-Apr-2005 |
perseant | Fix some locking issues that appeared with the simple_lock work. Address a "pager_map" deadlock in lfs_putpages().
|
1.82 |
| 01-Apr-2005 |
perseant | Protect various per-fs structures with fs->lfs_interlock simple_lock, to improve behavior in the multiprocessor case. Add debugging segment-lock assertion statements.
|
1.81 |
| 09-Mar-2005 |
perseant | branches: 1.81.2; Be more careful about handling of flags to lfs_flush, to ensure that the lfs_writing mutex is respected.
|
1.80 |
| 08-Mar-2005 |
perseant | Straighten out the maze of ifdefs. Instead, consolidate all the debugging stuff under '#ifdef DEBUG', and use sysctl knobs to turn on/off particular parts of the debugging reporting (if DEBUG is enabled). Re-enable the LFS statistics in sysctl, while I'm there. A bit of a rototill.
|
1.79 |
| 26-Feb-2005 |
perry | nuke trailing whitespace
|
1.78 |
| 26-Feb-2005 |
perseant | Various minor LFS improvements:
* Note when lfs_putpages(9) thinks it is not going to be writing any pages before calling genfs_putpages(9). This prevents a situation in which blocks can be queued for writing without a segment header.
* Correct computation of NRESERVE(), though it is still a gross overestimate in most cases. Note that if NRESERVE() is too high, it may be impossible to create files on the filesystem. We catch this case on filesystem mount and refuse to mount r/w.
* Allow filesystems to be mounted whose block size is == MAXBSIZE.
* Somewhere along the line, ufs_bmaparray(9) started mangling UNWRITTEN entries in indirect blocks again, triggering a failed assertion "daddr <= LFS_MAX_DADDR". Explicitly convert to and from int32_t to correct this.
* Add a high-water mark for the number of dirty pages any given LFS can hold before triggering a flush. This is settable by sysctl, but off (zero) by default.
* Be more careful about the MAX_BYTES and MAX_BUFS computations so we shouldn't see "please increase to at least zero" messages.
* Note that VBLK and VCHR vnodes can have nonzero values in di_db[0] even though their v_size == 0. Don't panic when we see this.
* Change lfs_bfree to a signed quantity. The manner in which it is processed before being passed to the cleaner means that sometimes it may drop below zero, and the cleaner must be aware of this.
* Never report bfree < 0 (or higher than lfs_dsize) through lfs_statvfs(9). This prevents df(1) from ever telling us that our full filesystems have 16TB free.
* Account space allocated through lfs_balloc(9) that does not have associated buffer headers, so that the pagedaemon doesn't run us out of segments.
* Return ENOSPC from lfs_balloc(9) when bfree drops to zero.
* Address a deadlock in lfs_bmapv/lfs_markv when the filesystem is being unmounted.
Because vfs_busy() is a shared lock, and lfs_bmapv/lfs_markv mark the filesystem vfs_busy(), the cleaner can be holding the lock that umount() is blocking on, then try to vfs_busy() again in getnewvnode().
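The clamping of the reported free-block count mentioned in the list above can be sketched as below. `sk_report_bfree()` and `sk_naive_bfree()` are hypothetical helpers written for this illustration, assuming a signed free count and an unsigned data size; they are not the code in lfs_statvfs().

```c
#include <assert.h>
#include <stdint.h>

/* The failure mode: a negative count reinterpreted as unsigned is huge. */
uint64_t
sk_naive_bfree(int64_t bfree)
{
	return (uint64_t)bfree;
}

/* Clamp the signed free count into the range [0, dsize] before reporting. */
uint64_t
sk_report_bfree(int64_t bfree, uint64_t dsize)
{
	if (bfree < 0)
		return 0;
	if ((uint64_t)bfree > dsize)
		return dsize;
	return (uint64_t)bfree;
}
```

With the clamp in place, a slightly negative internal count is reported as zero free blocks rather than as an enormous amount of free space.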
|
1.77 |
| 28-Jan-2004 |
yamt | branches: 1.77.6; 1.77.8; 1.77.10; use bufmem instead of bufpages to make lfs a little less broken.
|
1.76 |
| 04-Dec-2003 |
yamt | use b_private rather than b_saveaddr. XXX LFS_USE_B_INVAL
|
1.75 |
| 03-Oct-2003 |
yamt | assertions.
|
1.74 |
| 23-Sep-2003 |
yamt | remove unnecessary externs of lfs_do_flush.
|
1.73 |
| 07-Sep-2003 |
yamt | - raise spl to bio in lfs_countlocked() rather than having callers do so.
- buffer cache MP locks.
- assert B_CALL buffers are not on the free queue.
|
1.72 |
| 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22364, verified by myself.
|
1.71 |
| 12-Jul-2003 |
yamt | - protect global resource counts with lfs_subsys_lock.
- clean up scattered externs a little.
|
1.70 |
| 02-Jul-2003 |
yamt | a comment.
|
1.69 |
| 02-Jul-2003 |
yamt | use queue.h macros.
|
1.68 |
| 02-Jul-2003 |
yamt | use VFSTOUFS macro.
|
1.67 |
| 02-Jul-2003 |
yamt | - add new functions, lfs_writer_enter/leave, and use them instead of duplicated code fragments.
- add an assertion.
|
1.66 |
| 27-Apr-2003 |
perseant | branches: 1.66.2; Don't change update time on block write; lets e.g. "tar xp" work properly.
|
1.65 |
| 02-Apr-2003 |
fvdl | Add support for UFS2. UFS2 is an enhanced FFS, adding support for 64 bit block pointers, extended attribute storage, and a few other things.
This commit does not yet include the code to manipulate the extended storage (for e.g. ACLs), this will be done later.
Originally written by Kirk McKusick and Network Associates Laboratories for FreeBSD.
|
1.64 |
| 15-Mar-2003 |
perseant | Add simple_lock protection for lfs_seglock and lfs_subsys_pages; these will be expanded to cover other per-fs and subsystem-wide data as well.
Fix a case of IN_MODIFIED being set without updating lfs_uinodes, resulting in a "lfs_uinodes < 0" panic.
Fix a deadlock in lfs_putpages arising from the need to busy all pages in a block; unbusy any that had already been busied before starting over.
|
1.63 |
| 02-Mar-2003 |
perseant | Account SEGUSE_ACTIVE correctly so that the automatic segment cleaning actually happens.
Add a new fcntl call that will write the minimum necessary to checkpoint (i.e., for on-disk directory structure to be consistent, not including updates to file data) so that the cleaner can clean segments more quickly without sacrificing three-way commit for cleaning.
|
1.62 |
| 25-Feb-2003 |
thorpej | Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and use it. This fixes a few places where either b_dep or b_interlock were not properly initialized.
|
1.61 |
| 20-Feb-2003 |
perseant | Tabify, and fix some comment alignment problems.
|
1.60 |
| 19-Feb-2003 |
yamt | workaround for "another flush is..." infinite loop in writerd. if we're writerd, sleep in lfs_flush until another writer goes away instead of busy looping in writerd.
|
1.59 |
| 19-Feb-2003 |
yamt | init b_interlock.
|
1.58 |
| 17-Feb-2003 |
perseant | Add code to UBCify LFS. This is still behind "#ifdef LFS_UBC" for now (there are still some details to work out) but expect that to go away soon. To support these basic changes (creation of lfs_putpages, lfs_gop_write, mods to lfs_balloc) several other changes were made, to wit:
* Create a writer daemon kernel thread whose purpose is to handle page writes for the pagedaemon, but which also takes over some of the functions of lfs_check(). This thread is started the first time an LFS is mounted.
* Add a "flags" parameter to GOP_SIZE. Current values are GOP_SIZE_READ, meaning that the call should return the size of the in-core version of the file, and GOP_SIZE_WRITE, meaning that it should return the on-disk size. One of GOP_SIZE_READ or GOP_SIZE_WRITE must be specified.
* Instead of using malloc(...M_WAITOK) for everything, reserve enough resources to get by and use malloc(...M_NOWAIT), using the reserves if necessary. Use the pool subsystem for structures small enough that this is feasible. This also obsoletes LFS_THROTTLE.
And a few that are not strictly necessary:
* Moves the LFS inode extensions off onto a separately allocated structure; getting closer to LFS as an LKM. "Welcome to 1.6O."
* Unified GOP_ALLOC between FFS and LFS.
* Update LFS copyright headers to correct values.
* Actually cast to unsigned in lfs_shellsort, like the comment says.
* Keep track of which segments were empty before the previous checkpoint; any segments that pass two checkpoints both dirty and empty can be summarily cleaned. Do this. Right now lfs_segclean still works, but this should be turned into an effectless compatibility syscall.
|
1.57 |
| 05-Feb-2003 |
pk | Make the buffer cache code MP-safe.
|
1.56 |
| 24-Jan-2003 |
fvdl | Bump daddr_t to 64 bits. Replace it with int32_t in all places where it was used on-disk, so that on-disk formats remain the same. Remove ufs_daddr_t and ufs_lbn_t for the time being.
|
1.55 |
| 30-Dec-2002 |
yamt | comment and assertions
|
1.54 |
| 30-Dec-2002 |
yamt | move check of lfs_unlockvp from lfs_reserveavail to lfs_reserve because lfs_reservebuf needs same check as well.
|
1.53 |
| 29-Dec-2002 |
yamt | fix vref/vunref mismatch.
|
1.52 |
| 28-Dec-2002 |
yamt | - in lfs_reserve, vref vnodes that we're locking so that cleaner doesn't try to reclaim them. (workaround for deadlock noted in the comment in lfs_reserveavail)
- in lfs_rename, mark vnodes which are being moved as well as directory vnodes.
|
1.51 |
| 26-Dec-2002 |
yamt | - in lfs_reserve, reserve locked buffer count as well.
- don't wait for locking buf in lfs_bwrite_ext to avoid deadlocks.
- skip lfs_reserve when we're doing dirop. reserve more (for lfs_truncate) in set_dirop instead.
this mostly solves PR 18972. (and hopefully PR 19196)
|
1.50 |
| 22-Dec-2002 |
yamt | add a XXX comment. (description of possible deadlock)
|
1.49 |
| 17-Dec-2002 |
yamt | #if 0 out vnode unlock/lock in lfs_reserve for now and add a comment about it. deadlock is better than corruption (or panic), IMO.
|
1.48 |
| 14-Dec-2002 |
yamt | - in lfs_bwrite_ext, if we're the cleaner, mark the inode IN_CLEANING rather than IN_MODIFIED. otherwise cleaned (indirect) blocks belonging to the inode aren't written until the next sync.
- add assertions.
|
1.47 |
| 27-Nov-2002 |
yamt | more XXX comment.
|
1.46 |
| 24-Nov-2002 |
yamt | add a XXX comment to lfs_reserve.
 * it isn't safe to unlock vp here
 * because we're passing data using inode from namei.
 * (eg. i_offset)
|
1.45 |
| 24-Nov-2002 |
yamt | lfs_reserve shouldn't block for lfs_unlockvp. otherwise cleaner deadlocks. PR 19134.
|
1.44 |
| 20-Jun-2002 |
perseant | Fix miscalculation in lfs_fits found by Trevin Beattie <trevin@xmission.com>. Change some of the variable names from "nb", "db" to "fsb" to reflect their calling conventions.
|
1.43 |
| 14-May-2002 |
perseant | branches: 1.43.2; Phase one of my three-phase plan to make LFS play nice with UBC, and bug-fixes I found while making sure there weren't any new ones.
* Make the write clusters keep track of the buffers whose blocks they contain. This should make it possible to (1) write clusters using a page mapping instead of malloc, if desired, and (2) schedule blocks for rewriting (somewhere else) if a write error occurs. Code is present to use pagemove() to construct the clusters but that is untested and will go away anyway in favor of page mapping.
* DEBUG now keeps a log of Ifile writes, so that any lingering instances of the "dirty bufs" problem can be properly debugged.
* Keep track of whether the Ifile has been dirtied by various routines that can be called by lfs_segwrite, and loop on that until it is clean, for a checkpoint. Checkpoints need to be squeaky clean.
* Warn the user (once) if the Ifile grows larger than is reasonable for their buffer cache. Both lfs_mountfs and lfs_unmount check since the Ifile can grow.
* If an inode is not found in a disk block, try rereading the block, under the assumption that the block was copied to a cluster and then freed.
* Protect WRITEINPROG() with splbio() to fix a hang in lfs_update.
|
1.42 |
| 12-May-2002 |
matt | Eliminate commons.
|
1.41 |
| 11-Feb-2002 |
perseant | Include the space taken by inodes in the count made by lfs_check(); make VOP_SETATTR call lfs_check. This prevents large numbers of inode changes (say, at the end of tar(1)) from filling the buffer cache.
|
1.40 |
| 23-Nov-2001 |
chs | add spaces for KNF. confirmed to produce identical objects.
|
1.39 |
| 08-Nov-2001 |
lukem | add RCSID
|
1.38 |
| 06-Nov-2001 |
simonb | Remove some variables that are set but never used.
|
1.37 |
| 26-Oct-2001 |
lukem | remove #include <ufs/ufs/quota.h> where it was just to appease <ufs/ufs/inode.h>, since the latter now includes the former. leave the former in source that obviously uses specific bits of it (for completeness.)
|
1.36 |
| 13-Jul-2001 |
perseant | branches: 1.36.4; Merge the short-lived perseant-lfsv2 branch into the trunk.
Kernels and tools understand both v1 and v2 filesystems; newfs_lfs generates v2 by default. Changes for the v2 layout include:
- Segments of non-PO2 size and arbitrary block offset, so these can be matched to convenient physical characteristics of the partition (e.g., stripe or track size and offset).
- Address by fragment instead of by disk sector, paving the way for non-512-byte-sector devices. In theory fragments can be as large as you like, though in reality they must be smaller than MAXBSIZE in size.
- Use serial number and filesystem identifier to ensure that roll-forward doesn't get old data and think it's new. Roll-forward is enabled for v2 filesystems, though not for v1 filesystems by default.
- The inode free list is now a tailq, paving the way for undelete (undelete is not yet implemented, but can be without further non-backwards-compatible changes to disk structures).
- Inode atime information is kept in the Ifile, instead of on the inode; that is, the inode is never written *just* because atime was changed. Because of this the inodes remain near the file data on the disk, rather than wandering all over as the disk is read repeatedly. This speeds up repeated reads by a small but noticeable amount.
Other changes of note include:
- The ifile written by newfs_lfs can now be of arbitrary length, it is no longer restricted to a single indirect block.
- Fixed an old bug where ctime was changed every time a vnode was created. I need to look more closely to make sure that the times are only updated during write(2) and friends, not after-the-fact during a segment write, and certainly not by the cleaner.
|
1.35 |
| 03-Dec-2000 |
perseant | branches: 1.35.2; 1.35.4; 1.35.6; Fix typo in 'malloc' for non-MALLOCLOG case
|
1.34 |
| 03-Dec-2000 |
perseant | Get rid of some old unnecessary code that cleared B_NEEDCOMMIT from buffers in lfs_writeseg (possibly after they had been freed).
If MALLOCLOG is defined, make lfs_newbuf and lfs_freebuf pass along the caller's file and line to _malloc and _free.
|
1.33 |
| 27-Nov-2000 |
perseant | If LFS_DO_ROLLFORWARD is defined, roll forward from the older checkpoint on mount, through the newer checkpoint and on through any newer partial-segments that may have been written but not checkpointed because of an intervening crash.
LFS_DO_ROLLFORWARD is not defined by default.
|
1.32 |
| 17-Nov-2000 |
perseant | Correct accounting of lfs_avail, locked_queue_count, and locked_queue_bytes. (PR #11468). In the case of fragment allocation, check to see if enough space is available before extending a fragment already scheduled for writing.
The locked_queue_* variables indicate the number of buffer headers and bytes, respectively, that are unavailable to getnewbuf() because they are locked up waiting for LFS to flush them; make sure that that is actually what we're counting, i.e., never count malloced buffers, and always use b_bufsize instead of b_bcount.
If DEBUG is defined, the periodic calls to lfs_countlocked will now complain if either counter is incorrect. (In the future lfs_countlocked will not need to be called at all if DEBUG is not defined.)
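The counting rule described in this commit can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the kernel code: `struct sketch_buf`, `sketch_countlocked`, and `sketch_counters_ok` are hypothetical names, and only the rule itself (skip malloced buffers, sum `b_bufsize` rather than `b_bcount`) is taken from the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a kernel buffer header. */
struct sketch_buf {
	bool   locked;     /* held on the locked queue, waiting for LFS to flush it */
	bool   malloced;   /* data area came from malloc, not the buffer cache */
	size_t b_bufsize;  /* size of the allocated data area */
	size_t b_bcount;   /* bytes of valid data; may be smaller than b_bufsize */
};

/* Hypothetical pair of totals mirroring locked_queue_count/locked_queue_bytes. */
struct sketch_totals {
	long count;
	long bytes;
};

/* Recount from scratch: malloced buffers never count, and the byte
 * total uses b_bufsize because that is what getnewbuf() cannot reuse. */
static struct sketch_totals
sketch_countlocked(const struct sketch_buf *bufs, size_t n)
{
	struct sketch_totals t = { 0, 0 };

	for (size_t i = 0; i < n; i++) {
		if (!bufs[i].locked || bufs[i].malloced)
			continue;
		t.count++;
		t.bytes += (long)bufs[i].b_bufsize;
	}
	return t;
}

/* DEBUG-style cross-check of running counters against a fresh recount. */
static bool
sketch_counters_ok(struct sketch_totals running,
    const struct sketch_buf *bufs, size_t n)
{
	struct sketch_totals t = sketch_countlocked(bufs, n);

	return t.count == running.count && t.bytes == running.bytes;
}
```

A caller that keeps running counters could invoke the cross-check periodically and complain on a mismatch, which is the role the message assigns to lfs_countlocked under DEBUG.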
|
1.31 |
| 12-Nov-2000 |
perseant | Do not needlessly dirty segment table blocks during lfs_segwrite, preventing needless disk activity when the filesystem is idle. (PR #10979.)
|
1.30 |
| 13-Sep-2000 |
perseant | Cast back to int32_t in LFS_EST_BFREE and LFS_EST_RSVD macros, for consistency with their arguments.
Change the debugging printf in lfs_reserve to match, and enclose it in #ifdef DEBUG.
Tested on alpha, arm32, sparc.
|
1.29 |
| 12-Sep-2000 |
perseant | Make this file compile on the alpha as well (use %ld and cast to long, instead of %qd with no cast).
|
1.28 |
| 10-Sep-2000 |
augustss | Make this file compile again.
|
1.27 |
| 09-Sep-2000 |
perseant | Various bug-fixes to LFS, to wit:
Kernel:
* Add runtime quantity lfs_ravail, the number of disk-blocks reserved for writing. Writes to the filesystem first reserve a maximum amount of blocks before their write is allowed to proceed; after the blocks are allocated the reserved total is reduced by a corresponding amount.
If the lfs_reserve function cannot immediately reserve the requested number of blocks, the inode is unlocked, and the thread sleeps until the cleaner has made enough space available for the blocks to be reserved. In this way large files can be written to the filesystem (or, smaller files can be written to a nearly-full but thoroughly clean filesystem) and the cleaner can still function properly.
* Remove explicit switching on dlfs_minfreeseg from the kernel code; it is now merely a fs-creation parameter used to compute dlfs_avail and dlfs_bfree (and used by fsck_lfs(8) to check their accuracy). Its former role is better assumed by a properly computed dlfs_avail.
* Bounds-check inode numbers submitted through lfs_bmapv and lfs_markv. This prevents a panic, but, if the cleaner is feeding the filesystem the wrong data, you are still in a world of hurt.
* Cleanup: remove explicit references of DEV_BSIZE in favor of btodb()/dbtob().
lfs_cleanerd:
* Make -n mean "send N segments' blocks through a single call to lfs_markv". Previously it had meant "clean N segments through N calls to lfs_markv, before looking again to see if more need to be cleaned". The new behavior gives better packing of direct data on disk with as little metadata as possible, largely alleviating the problem that the cleaner can consume more disk through inefficient use of metadata than it frees by moving dirty data away from clean "holes" to produce entirely clean segments.
* Make -b mean "read as many segments as necessary to write N segments of dirty data back to disk", rather than its former meaning of "read as many segments as necessary to free N segments worth of space". The new meaning, combined with the new -n behavior described above, further aids in cleaning storage efficiency as entire segments can be written at once, using as few blocks as possible for segment summaries and inode blocks.
* Make the cleaner take note of segments which could not be cleaned due to error, and not attempt to clean them until they are entirely free of dirty blocks. This prevents the case in which a cleanerd running with -n 1 and without -b (formerly the default) would spin trying repeatedly to clean a corrupt segment, while the remaining space filled and deadlocked the filesystem.
* Update the lfs_cleanerd manual page to describe all the options, including the changes mentioned here (in particular, the -b and -n flags were previously undocumented).
fsck_lfs:
* Check, and optionally fix, lfs_avail (to an exact figure) and lfs_bfree (within a margin of error) in pass 5.
newfs_lfs:
* Reduce the default dlfs_minfreeseg to 1/20 of the total segments.
* Add a warning if the sgs disklabel field is 16 (the default for FFS' cpg, but not usually desirable for LFS' sgs: 5--8 is a better range).
* Change the calculation of lfs_avail and lfs_bfree, corresponding to the kernel changes mentioned above.
mount_lfs:
* Add -N and -b options to pass corresponding -n and -b options to lfs_cleanerd.
* Default to calling lfs_cleanerd with "-b -n 4".
[All of these changes were largely tested in the 1.5 branch, with the idea that they (along with previous un-pulled-up work) could be applied to the branch while it was still in ALPHA2; however my test system has experienced corruption on another filesystem (/dev/console has gone missing :^), and, while I believe this unrelated to the LFS changes, I cannot with good conscience request that the changes be pulled up.]
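The lfs_ravail reservation scheme described under "Kernel" in this commit can be sketched roughly as below. The names `struct sketch_fs`, `sketch_reserve`, and `sketch_commit` are hypothetical, and the sketch returns false where the message says the real lfs_reserve unlocks the inode and sleeps until the cleaner frees space.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified per-filesystem space counters. */
struct sketch_fs {
	long avail;   /* blocks available for new writes */
	long ravail;  /* blocks reserved by writes still in progress */
};

/* Try to reserve the maximum number of blocks a write might need.
 * Returns false when the reservation does not fit; the kernel would
 * sleep at this point rather than fail (assumption: simplified here). */
static bool
sketch_reserve(struct sketch_fs *fs, long nblks)
{
	if (fs->ravail + nblks > fs->avail)
		return false;
	fs->ravail += nblks;
	return true;
}

/* After allocation: drop the reservation and charge only the blocks
 * that were actually used against the available total. */
static void
sketch_commit(struct sketch_fs *fs, long reserved, long used)
{
	fs->ravail -= reserved;
	fs->avail -= used;
}
```

Reserving the worst case up front and releasing the unused part afterwards is what lets a large write proceed without starving the cleaner of the space it needs.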
|
1.26 |
| 05-Jul-2000 |
perseant | Clean up accounting of lfs_uinodes (dirty but unwritten inodes).
Make lfs_uinodes a signed quantity for debugging purposes, and set it to zero at fs mount time.
Enclose setting/clearing of the dirty flags (IN_MODIFIED, IN_ACCESSED, IN_CLEANING) in macros, and use those macros everywhere. Make LFS_ITIMES use these macros; updated the ITIMES macro in inode.h to know about this. Make ufs_getattr use ITIMES instead of FFS_ITIMES.
|
1.25 |
| 03-Jul-2000 |
perseant | Allow the number of free segments reserved for the cleaner to be parametrized in the filesystem, defaulting to MIN_FREE_SEGS = 2 but set to something more reasonable at newfs_lfs time.
Note the number of blocks that have been scheduled for writing but which are not yet on disk in an inode extension, i_lfs_effnblks. Move i_ffs_effnlink out of the ffs extension and onto the main inode, since it's used all over the shared code and the lfs extension would clobber it.
At inode write time, indirect blocks and inode-held blocks of inodes that have i_lfs_effnblks != i_ffs_blocks are cleansed of UNWRITTEN disk addresses, so that these never make it to disk.
|
1.24 |
| 27-Jun-2000 |
perseant | Fixes associated with filling an LFS:
Change the space computation to appear to change the size of the *disk* rather than the *bytes used* when more segment summaries and inode blocks are written. Try to estimate the amount of space that these will take up when more files are written, so the disk size doesn't change too much.
Regularize error returns from lfs_valloc, lfs_balloc, lfs_truncate: they now fail entirely, rather than succeeding half-way and leaving the fs in an inconsistent state.
Rewrite lfs_truncate, mostly stealing from ffs_truncate. The old lfs_truncate had difficulty truncating a large file to a non-zero size (indirect blocks were not handled appropriately).
Unmark VDIROP on fvp after ufs_remove, ufs_rmdir, so these can be reclaimed immediately: this vnode would not be written to disk again anyway if the removal succeeded, and if it failed, no directory operation occurred.
ufs_makeinode and ufs_mkdir now remove IN_ADIROP on error.
|
1.23 |
| 06-Jun-2000 |
perseant | branches: 1.23.2; Protect inode free list with seglock, instead of separate lock, so that the head of the inode free list (on the superblock) always matches the rest of the free list (in the ifile).
Protect lfs_fragextend with seglock, to prevent the segment byte count fudging from making its way to disk.
Don't try to inactivate dirop vnodes that are still in the middle of their dirop (may address PR#10285).
|
1.22 |
| 31-May-2000 |
fredb | Make this build. (Balance parenthesis.
|
1.21 |
| 31-May-2000 |
perseant | update for IN_ACCESSED changes
|
1.20 |
| 27-May-2000 |
perseant | branches: 1.20.2; Prevent dirops from getting around lfs_check and wedging the buffer cache. All the dirop vnops now mark the inodes with a new flag, IN_ADIROP, which is removed as soon as the dirop is done (as opposed to VDIROP which stays until the file is written). To address one issue raised in PR#9357.
|
1.19 |
| 19-May-2000 |
thorpej | NULL != 0
|
1.18 |
| 05-May-2000 |
perseant | Change the way LFS does block accounting, from trying to infer from the buffer cache flags, to marking the inode and/or indirect blocks with a special disk address UNWRITTEN==-2 when a block is accounted for. (This address is never written to disk, but only used in-core. This is essentially the same method of block accounting as on the UBC branch, where the buffer headers don't exist.) Make sure that truncation is handled properly, especially in the case of holey files.
Fixes PR#9994.
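As a rough illustration of the UNWRITTEN marker, the sketch below counts blocks from an array of in-core disk addresses. Only the value -2 comes from the commit message; the helper names, the use of 0 for a hole, and the scrub step (modelled on the later note that UNWRITTEN addresses never reach disk) are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HOLE      ((int32_t)0)   /* assumption: 0 means no block */
#define SKETCH_UNWRITTEN ((int32_t)-2)  /* accounted for, no disk address yet */

/* Blocks accounted to the file: on disk or merely scheduled for writing. */
static long
sketch_effective_blocks(const int32_t *addrs, size_t n)
{
	long count = 0;

	for (size_t i = 0; i < n; i++)
		if (addrs[i] != SKETCH_HOLE)
			count++;
	return count;
}

/* Blocks that already have a real disk address. */
static long
sketch_ondisk_blocks(const int32_t *addrs, size_t n)
{
	long count = 0;

	for (size_t i = 0; i < n; i++)
		if (addrs[i] > 0)
			count++;
	return count;
}

/* Replace the in-core marker before a copy of the addresses is
 * written out, so the marker never makes it to disk. */
static void
sketch_scrub_unwritten(int32_t *addrs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (addrs[i] == SKETCH_UNWRITTEN)
			addrs[i] = SKETCH_HOLE;
}
```

Keeping the marker in the address array itself means the accounting no longer depends on buffer cache flags, which is the change this commit describes.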
|
1.17 |
| 30-Mar-2000 |
augustss | Remove register declarations.
|
1.16 |
| 15-Dec-1999 |
perseant | In lfs_bwrite, don't mark buffers dirty if lfs is mounted read-only. (Previously buffers could be marked dirty by the cleaner, and possibly by other means.)
Also check for softdep mount in vfs_shutdown before trying to bawrite buffers, since other filesystems don't need it and lfs doesn't bawrite. (This fragment reviewed by fvdl.)
Partially addresses PR#8964.
|
1.15 |
| 04-Dec-1999 |
ragge | CL* discarding.
|
1.14 |
| 23-Nov-1999 |
fvdl | Be more careful to block bio interrupts for some data structures. There were at least a few missed cases where vp->v_{clean,dirty}blkhd were unprotected since the softdep/trickle sync merge.
|
1.13 |
| 06-Nov-1999 |
perseant | branches: 1.13.2; Address ufs_hashlock/ufs_ihashins protocol bug, discovered while doing a post-mortem of a production machine. Also, take the active dirop count off of the fs and make it global (since it is measuring a global resource) and tie the threshold value LFS_MAXDIROP to desiredvnodes.
|
1.12 |
| 21-Oct-1999 |
perseant | Under degenerate access patterns (e.g. `bonnie' benchmark) lfs_check could fail, because the particular block being requested was always in the cache (although other routines that cannot afford to call lfs_check have in the meantime stuffed the cache full of dirty blocks). Partially addresses PR 8383.
|
1.11 |
| 01-Jun-1999 |
perseant | branches: 1.11.2; 1.11.4; 1.11.6; Fixed lfs_update (and related functions) so that calls from lfs_fsync will DTRT with vnodes marked VDIROP. In particular, the message "flushing VDIROP" will no longer appear, and the filesystem will remain stable in the event of a crash.
This was particularly a problem with NFS-exported LFSes, since fsync was called on every file close.
|
1.10 |
| 12-Apr-1999 |
perseant | Disallow threshold-initiated cache flush when dirops are active. Also, make SET_ENDOP use lfs_check instead of inlining most of it.
|
1.9 |
| 25-Mar-1999 |
perseant | branches: 1.9.2; Fixes to make dirops and lfs_vflush play together well. In particular, if we are short on vnodes, lfs_vflush from another process can grab a vnode that lfs_markv has already processed but not yet written; but lfs_markv holds the seglock. When lfs_vflush gets around to writing it, the context for copyin is gone. So, now lfs_markv calls copyin itself, rather than having lfs_writeseg do it.
|
1.8 |
| 25-Mar-1999 |
perseant | clean up unused/required #ifdefs
|
1.7 |
| 10-Mar-1999 |
perseant | New sources should leave the LFS in a more-or-less working state. Changes include:
- DIROP segregation is enabled, and greater care is taken to make sure that a checkpoint completes. Fsck is not needed to remount the filesystem.
- Several checks to make sure that the LFS subsystem does not overuse various resources (memory, in particular).
- The cleaner routines, lfs_markv in particular, are completely rewritten. A buffer overflow is removed. Greater care is taken to ensure that inodes come from where lfs_cleanerd say they come from (so we know nothing has changed since lfs_bmapv was called).
- Fragment allocation is fixed, so that writes beyond end-of-file do the right thing.
|
1.6 |
| 01-Mar-1998 |
fvdl | Merge with Lite2 + local changes
|
1.5 |
| 09-Feb-1996 |
christos | lfs prototypes
|
1.4 |
| 18-Jun-1995 |
cgd | don't assume the f_fsnamelen is nul-truncated or longer than MFSNAMELEN
|
1.3 |
| 18-Jan-1995 |
mycroft | Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS differently.
|
1.2 |
| 29-Jun-1994 |
cgd | New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
|
1.1 |
| 08-Jun-1994 |
mycroft | branches: 1.1.1; Update to 4.4-Lite fs code, with local changes.
|
1.1.1.2 |
| 01-Mar-1998 |
fvdl | Import 4.4BSD-Lite2
|
1.1.1.1 |
| 01-Mar-1998 |
fvdl | Import 4.4BSD-Lite for reference
|
1.9.2.4 |
| 17-Dec-1999 |
he | Pull up revision 1.13 (requested by perseant): Address locking protocol error for inode hash, and make the maximum number of active dirops a global quantity.
|
1.9.2.3 |
| 17-Dec-1999 |
he | Pull up revision 1.11 (requested by perseant): Avoid flushing vnodes involved in a dirop, making lfs' promise of "no fsck needed, even in the event of a crash" closer to reality.
|
1.9.2.2 |
| 26-Oct-1999 |
he | Pull up revision 1.12 (requested by perseant): Fix LFS buffer starvation under degenerate access patterns.
|
1.9.2.1 |
| 13-Apr-1999 |
perseant | branches: 1.9.2.1.2; Pull-up of changes made to the trunk on Sunday [1.9->1.10], to wit:
Take out the `#ifdef USE_UFSHASH'; use ufs_hashlock to lock the inode free list instead of free_lock.
Fix inode reporting in lfs_statfs (the meaning of f_files and f_ffree was reversed).
Fix "lfs_ifind: dinode xxx not found" panic. When inodes were freed, then immediately reloaded, their dinodes were located in an inode block which was not on disk at the advertized location, nor in the cache (although it would be flushed to disk next segment write). Fix this by using getblk() instead of lfs_newbuf() for inode blocks.
Better checking for held inode locks in lfs_fastvget, for a number of error conditions. Also change the default setting of lfs_clean_vnhead to 0, which seems to make the locking problems go away (although this is difficult to test as I can't reliably reproduce them).
Make sure that the wakeup occurs for vnodes that lfs_update might be sleeping on (nodes which are not marked IN_MODIFIED/IN_CLEANING, but which have dirty buffers), by marking them with the appropriate flag if dirtybuffers were added while the write was in progress.
Fix block counting during file truncation, if not truncating to zero.
Disallow threshold-initiated cache flush when dirops are active. Also, make SET_ENDOP use lfs_check instead of inlining most of it.
Improve the debugging printfs in the cleaner syscalls (in particular, make it obvious that they're coming from lfs).
Check the superblock version field, and refuse to mount the filesystem if the version number is higher than we know about. This allows, e.g., changes in the format of the ifile, segment size restrictions and boundaries, etc., which would not affect existing fields in the superblock, but which would drastically affect the filesystem, to be smoothly integrated at a later date.
|
1.9.2.1.2.2 |
| 31-Aug-1999 |
perseant | Rudimentary support for LFS under UBC:
- LFS-specific VOP_BALLOC and VOP_PUTPAGES vnode ops.
- getblk VREG panic #ifdef'd out (can be reinstated when Ifile is internalized and Ifile can be made another type from VREG)
- interface to VOP_PUTPAGES changed to pass all pager flags, not just sync. FS putpages routines must know about the pager flags.
- new LFS magic disk address, -2 ("unwritten"), meaning accounted for but not assigned to a fixed disk location (since LFS does these two things separately, and the previous accounting method using buffer headers no longer will work). Changed references to (foo == (daddr_t)-1) to (foo < 0). Since disk drivers reject all addresses < 0, this should not present a problem for other FSs.
|
1.9.2.1.2.1 |
| 21-Jun-1999 |
thorpej | Sync w/ -current.
|
1.11.6.2 |
| 27-Dec-1999 |
wrstuden | Pull up to last week's -current.
|
1.11.6.1 |
| 21-Dec-1999 |
wrstuden | Initial commit of recent changes to make DEV_BSIZE go away.
Runs on i386, needs work on other arch's. Main kernel routines should be fine, but a number of the stand programs need help.
cd, fd, ccd, wd, and sd have been updated. sd has been tested with non-512 byte block devices. vnd, raidframe, and lfs need work.
Non 2**n block support is automatic for LKM's and conditional for kernels on "options NON_PO2_BLOCKS".
|
1.11.4.1 |
| 15-Nov-1999 |
fvdl | Sync with -current
|
1.11.2.3 |
| 08-Dec-2000 |
bouyer | Sync with HEAD.
|
1.11.2.2 |
| 22-Nov-2000 |
bouyer | Sync with HEAD.
|
1.11.2.1 |
| 20-Nov-2000 |
bouyer | Update thorpej_scsipi to -current as of a month ago
|
1.13.2.2 |
| 06-Nov-1999 |
perseant | Address ufs_hashlock/ufs_ihashins protocol bug, discovered while doing a post-mortem of a production machine. Also, take the active dirop count off of the fs and make it global (since it is measuring a global resource) and tie the threshold value LFS_MAXDIROP to desiredvnodes.
|
1.13.2.1 |
| 06-Nov-1999 |
perseant | file lfs_bio.c was added on branch comdex-fall-1999 on 1999-11-06 20:33:06 +0000
|
1.20.2.1 |
| 22-Jun-2000 |
minoura | Sync w/ netbsd-1-5-base.
|
1.23.2.2 |
| 03-Feb-2001 |
he | Pull up revisions 1.31-1.32 (requested by perseant):
o Don't write anything if the filesystem is idle (PR#10979).
o Close up accounting holes in LFS' accounting of immediately-available-space, number of clean segments, and amount of dirty space taken up by metadata (PR#11468, PR#11470, PR#11534).
|
1.23.2.1 |
| 14-Sep-2000 |
perseant | Pull up recent LFS kernel changes (approved by thorpej):
ufs/ufs/inode.h, 1.20--1.22 (add i_lfs_effnblks extension ; make ITIMES aware of LFS_ITIMES; _LKM protection so userland progs compile)
ufs/ufs/ufs_vnops.c, 1.69, 1.71 (remove IN_ADIROP; use ITIMES instead of FFS_ITIMES)
ufs/ufs/ufs_readwrite.c, 1.27 (use lfs_reserve in lfs_write)
ufs/lfs/lfs.h, 1.26--1.32 (define LFS_EST_* macros ; change MIN_FREE_SEGS to lfs_minfreesegs ; add avail and bfree to CLEANERINFO ; change lfs_uinodes to signed ; change lfs_dmeta to signed ; add whitespace to line up structure members ; explicit cast to int32_t in LFS_EST_* macros)
ufs/lfs/lfs_alloc.c, back out 1.34.2.3 (pullups of 1.39, 1.40); then pull up 1.38 (clean up on error) 1.39--1.43 (restore fvdl's ufs_hashlock fix ; restore fvdl's ufs_hashlock fix ; set i_lfs_effnblks ; use UINO macros ; add comments and fix long lines)
ufs/lfs/lfs_balloc.c, 1.19 (don't succeed halfway) 1.21--1.25 (use i_lfs_effnblks ; fix i_lfs_effnblks computation and quieten ; fix i_ffs_blocks in unwritten fragment ; remove useless debugging check ; add comments and (c) 2000)
ufs/lfs/lfs_bio.c, 1.24--1.30 (cleanup and make lfs_flush_fs take "struct lfs *" instead of "struct mount *" ; use lfs_minfreeseg instead of MIN_FREE_SEGS ; use UINO macros, and copy bfree/avail to CLEANERINFO ; add lfs_reserve function ; 1.28--1.30 fix printf formatting)
ufs/lfs/lfs_cksum.c, 1.13 (add (c) 2000)
ufs/lfs/lfs_debug.c, 1.11 (use btodb instead of DEV_BSIZE)
ufs/lfs/lfs_extern.h, 1.18, 1.20--1.21 (function prototype changes)
ufs/lfs/lfs_inode.c, 1.38 (rewrite lfs_truncate from ffs_truncate) 1.40--1.44 (count written and unwritten blocks separately ; use disk block units instead of bytes ; remove unnecessary "mod" variable ; correct B_DELWRI to avoid bawrite panic ; use lfs_reserve)
ufs/lfs/lfs_segment.c, 1.52-1.59 (use lfs_dmeta to note used summaries ; check for UNWRITTEN in indirect blocks ; more debugging stuff inside #ifdef DEBUG_LFS ; use LK_CANRECURSE ; don't drop dirty indirect blocks ; use UINO macros ; don't hose the free list ; use btodb() instead of DEV_BSIZE ; make it compile again (oops))
ufs/lfs/lfs_subr.c, 1.16--1.17 (check for locked inodes before changing ; use btodb() instead of DEV_BSIZE, (c) 2000)
ufs/lfs/lfs_syscalls.c, back out 1.41.4.2 (fvdl's ufs_hashlock fix); then pull up 1.43 (use lfs_dmeta) 1.44--1.45 (restore fvdl's ufs_hashlock fix) 1.46--1.47 (fix lfs_avail leakage from sblock segments ; use UINO macros) 1.49 (bounds-check inode numbers in lfs_markv)
ufs/lfs/lfs_vfsops.c, 1.53 (use LFS_EST_* macros in lfs_statfs) 1.56--1.58 (initialize lfs_minfreeseg, lfs_effnblk ; initialize lfs_uinodes ; initialize lfs_ravail)
ufs/lfs/lfs_vnops.c, 1.40 (remove VDIROP from removed files) 1.42--1.44 (move SET_ENDOP below the removal of VDIROP ; use UINO macros and add lfs_itimes function ; use lfs_reserve in dirops)
|
1.35.6.5 |
| 06-Sep-2002 |
jdolecek | sync kqueue branch with HEAD
|
1.35.6.4 |
| 23-Jun-2002 |
jdolecek | catch up with -current on kqueue branch
|
1.35.6.3 |
| 16-Mar-2002 |
jdolecek | Catch up with -current.
|
1.35.6.2 |
| 10-Jan-2002 |
thorpej | Sync kqueue branch with -current.
|
1.35.6.1 |
| 03-Aug-2001 |
lukem | update to -current
|
1.35.4.4 |
| 13-Jul-2001 |
perseant | Be more careful about when we update ctime/mtime. In particular, if we are only writing indirect blocks, that doesn't count for mtime; and when we first create a vnode, that certainly *does not* count for ctime (a bug that's been there from the beginning).
This does not change the fact that mtime might still be set after write(2) is "completed", but it does make the atime-in-the-ifile code have some effect (noticeably less degradation of read time after an intervening large write).
|
1.35.4.3 |
| 02-Jul-2001 |
perseant | Change disk addressing unit to be the fragment, instead of the disk sector. All quantities in the superblock, inodes, indirect blocks, etc. refer now to this abstract unit (called "fsb" as it is in FFS) instead of disk sectors; as a consequence segment summary blocks have to be multiples of a fragment in size. In v1 filesystems, compatibility code ensures that 1 fsb == 1 sector, regardless of fragment size.
Fragments can now range in size between 512 and 32k; in the event that LFS_LABELPAD (8k) is smaller than the disk address unit size, an extra proto-superblock is kept at 8k from the beginning of the disk, to be used *only* to locate the real superblocks. (Not all of the userland knows about this yet.)
Almost all of this was done not by me, but by joff.
|
1.35.4.2 |
| 29-Jun-2001 |
perseant | Get rid of __P(), protoizing where it had not already been done
|
1.35.4.1 |
| 27-Jun-2001 |
perseant | Import of what I've been calling "LFSv2", that is, LFS with some features added that require changes to the on-disk data structures. These include:
- 64-bit time in everything but inodes
- User-specified segment offset, and segment size no longer restricted to PO2.
- Serial number on segment summaries in addition to timestamp, and a new volume identifier, to make roll-forward feasible without fear of finding old data and thinking it was new.
Although I think this version works at least as well as what's on the trunk, we're not done yet; hence this commit is going in on a branch and not on the trunk. Enhancements that are not here yet include fragment addressing, like FFS does, instead of block addressing.
|
1.35.2.13 |
| 03-Jan-2003 |
thorpej | Sync with HEAD.
|
1.35.2.12 |
| 29-Dec-2002 |
thorpej | Sync with HEAD.
|
1.35.2.11 |
| 19-Dec-2002 |
thorpej | Sync with HEAD.
|
1.35.2.10 |
| 11-Dec-2002 |
thorpej | Sync with HEAD.
|
1.35.2.9 |
| 01-Aug-2002 |
nathanw | Catch up to -current.
|
1.35.2.8 |
| 15-Jul-2002 |
nathanw | Whitespace.
|
1.35.2.7 |
| 24-Jun-2002 |
nathanw | Curproc->curlwp renaming.
Change uses of "curproc->l_proc" back to "curproc", which is more like the original use. Bare uses of "curproc" are now "curlwp".
"curproc" is now #defined in proc.h as ((curlwp) ? (curlwp)->l_proc : NULL) so that it is always safe to reference curproc (*de*referencing curproc is another story, but that's always been true).
|
1.35.2.6 |
| 20-Jun-2002 |
nathanw | Catch up to -current.
|
1.35.2.5 |
| 28-Feb-2002 |
nathanw | Catch up to -current.
|
1.35.2.4 |
| 08-Jan-2002 |
nathanw | Catch up to -current.
|
1.35.2.3 |
| 14-Nov-2001 |
nathanw | Catch up to -current.
|
1.35.2.2 |
| 24-Aug-2001 |
nathanw | Catch up with -current.
|
1.35.2.1 |
| 05-Mar-2001 |
nathanw | Initial commit of scheduler activations and lightweight process support.
|
1.36.4.1 |
| 12-Nov-2001 |
thorpej | Sync the thorpej-mips-cache branch with -current.
|
1.43.2.1 |
| 15-Jul-2002 |
gehenna | catch up with -current.
|
1.66.2.7 |
| 10-Nov-2005 |
skrll | Sync with HEAD. Here we go again...
|
1.66.2.6 |
| 01-Apr-2005 |
skrll | Sync with HEAD.
|
1.66.2.5 |
| 08-Mar-2005 |
skrll | Sync with HEAD.
|
1.66.2.4 |
| 04-Mar-2005 |
skrll | Sync with HEAD.
Hi Perry!
|
1.66.2.3 |
| 21-Sep-2004 |
skrll | Fix the sync with head I botched.
|
1.66.2.2 |
| 18-Sep-2004 |
skrll | Sync with HEAD.
|
1.66.2.1 |
| 03-Aug-2004 |
skrll | Sync with HEAD
|
1.77.10.1 |
| 19-Mar-2005 |
yamt | sync with head. xen and whitespace. xen part is not finished.
|
1.77.8.1 |
| 29-Apr-2005 |
kent | sync with -current
|
1.77.6.1 |
| 10-May-2005 |
riz | Pull up the following revisions (requested by perseant in ticket #1281):
1.8 sys/ufs/lfs/TODO
1.75 sys/ufs/lfs/lfs.h (via patch)
1.74 sys/ufs/lfs/lfs_alloc.c (via patch)
1.49, 1.51 sys/ufs/lfs/lfs_balloc.c (1.51 via patch)
1.78 sys/ufs/lfs/lfs_bio.c
1.62 sys/ufs/lfs/lfs_extern.h (via patch)
1.156 sys/ufs/lfs/lfs_segment.c (via patch)
1.48 sys/ufs/lfs/lfs_subr.c
1.101 sys/ufs/lfs/lfs_syscalls.c
1.163 sys/ufs/lfs/lfs_vfsops.c (via patch)
1.134 sys/ufs/lfs/lfs_vnops.c (via patch)
1.61 sys/ufs/ufs/ufs_readwrite.c (via patch)
1.20 libexec/lfs_cleanerd/clean.h (via patch)
1.52 libexec/lfs_cleanerd/cleanerd.c (via patch)
1.41 libexec/lfs_cleanerd/library.c (via patch)
1.4 regress/sys/fs/lfs/newfs_fsck/Makefile
1.2 regress/sys/fs/lfs/newfs_fsck/mkfs_mount
1.2 regress/sys/fs/lfs/newfs_fsck/smallfiles
1.3 sbin/fsck_lfs/bufcache.c
1.3 sbin/fsck_lfs/bufcache.h
1.3 sbin/fsck_lfs/lfs.h
1.8 sbin/fsck_lfs/lfs.c (via patch)
1.8 sbin/fsck_lfs/pass3.c (via patch)
1.18 sbin/fsck_lfs/pass0.c (via patch)
1.18 sbin/fsck_lfs/utilities.c (via patch)
1.7 sbin/fsck_lfs/segwrite.c
1.19 sbin/fsck_lfs/setup.c (via patch)
1.3 sbin/newfs_lfs/Makefile
0 sbin/newfs_lfs/lfs.c (yes, remove it)
1.1 sbin/newfs_lfs/make_lfs.c
1.15 sbin/newfs_lfs/newfs.c (via patch)
Various minor LFS improvements.
Kernel:
* Note when lfs_putpages(9) thinks it is not going to be writing any pages before calling genfs_putpages(9). This prevents a situation in which blocks can be queued for writing without a segment header.
* Correct computation of NRESERVE(), though it is still a gross overestimate in most cases. Note that if NRESERVE() is too high, it may be impossible to create files on the filesystem. We catch this case on filesystem mount and refuse to mount r/w.
* Allow filesystems to be mounted whose block size is == MAXBSIZE.
* Somewhere along the line, ufs_bmaparray(9) started mangling UNWRITTEN entries in indirect blocks again, triggering a failed assertion "daddr <= LFS_MAX_DADDR". Explicitly convert to and from int32_t to correct this. Should fix PR #29045.
* Add a high-water mark for the number of dirty pages any given LFS can hold before triggering a flush. This is settable by sysctl, but off (zero) by default.
* Be more careful about the MAX_BYTES and MAX_BUFS computations so we shouldn't see "please increase to at least zero" messages.
* Note that VBLK and VCHR vnodes can have nonzero values in di_db[0] even though their v_size == 0. Don't panic when we see this. Fixes PR #26680.
* Change lfs_bfree to a signed quantity. The manner in which it is processed before being passed to the cleaner means that sometimes it may drop below zero, and the cleaner must be aware of this.
* Never report bfree < 0 (or higher than lfs_dsize) through lfs_statfs(9). This prevents df(1) from ever telling us that our full filesystems have 16TB free.
* Account space allocated through lfs_balloc(9) that does not have associated buffer headers, so that the pagedaemon doesn't run us out of segments.
* Return ENOSPC from lfs_balloc(9) when bfree drops to zero.
* Address a deadlock in lfs_bmapv/lfs_markv when the filesystem is being unmounted. Because vfs_busy() is a shared lock, and lfs_bmapv/lfs_markv mark the filesystem vfs_busy(), the cleaner can be holding the lock that umount() is blocking on, then try to vfs_busy() again in getnewvnode().
cleaner:
* Adapt lfs_cleanerd to use the fcntl call to get the Ifile filehandle, so it need not be in the namespace.
* Make lfs_cleanerd be more careful when there are very few available segments.
* Make lfs_cleanerd less verbose when the filesystem is unmounted.
newfs_lfs, fsck_lfs, and regression:
* Extend the lfs library from fsck_lfs(8) so that it can be used with a not-yet-existent LFS. Make newfs_lfs(8) use this library, so it can create LFSs whose Ifile is larger than one segment. Addresses PR #11110.
* Make newfs_lfs(8) use strsuftoi64() for its arguments, a la newfs(8).
* Make fsck_lfs(8) respect the "file system is clean" flag.
* Don't let fsck_lfs(8) think it has dirty blocks when invoked with the -n flag.
* Remove the Ifile from the filesystem namespace. The cleaner now uses a fcntl call on the root inode to find the Ifile filehandle. (As a side-effect, addresses PR #29144.)
|
1.81.2.5 |
| 10-Aug-2006 |
tron | Apply patch (requested by fair in perseant #1457): Bring LFS up to current, including a patch (1.95 lfs_alloc.c) that should prevent the inode free list errors seen on the STABLE branch subsequent to pullup ticket #1327.
|
1.81.2.4 |
| 20-May-2006 |
riz | Pull up following revision(s) (requested by perseant in ticket #1327):
sys/ufs/lfs/lfs_alloc.c: revision 1.92 sys/ufs/lfs/lfs.h: revision 1.105 sys/ufs/lfs/lfs_vfsops.c: revision 1.207 sys/ufs/lfs/lfs_subr.c: revision 1.59 sys/ufs/lfs/lfs_vnops.c: revision 1.173 sys/ufs/lfs/lfs_bio.c: revision 1.92
Introduce another per-filesystem parameter, lfs_resvseg, to separate the notion of "how many segments are reserved for the cleaner" from that of "how many segments are not counted in lfs_bfree". The default value used for existing filesystems is the same as the previous implicit value of (lfs_minfreeseg / 2 + 1), modulo some sanity checking.
Count pending dirops on a per-filesystem basis, since once we start writing them we can't stop until we're done. This seems to help stave off the "no clean segments" panic in the case of filling the filesystem with directories and small files (e.g. simultaneously unpacking more copies of pkgsrc than will fit).
|
1.81.2.3 |
| 20-May-2006 |
riz | Pull up following revision(s) (requested by perseant in ticket #1327):
sys/ufs/lfs/lfs.h: revision 1.102 sys/ufs/lfs/lfs_segment.c: revision 1.173 sys/ufs/lfs/lfs_vnops.c: revision 1.167 via patch sys/ufs/lfs/lfs_bio.c: revision 1.91
Make lfs_vref/lfs_vunref not need to know about VXLOCK and VFREEING explicitly (especially since we didn't know about VFREEING at all before), but notice the EBUSY return from vget() instead.
Fix some more MP locking protocol issues, most of which were pointed out by Christian Ehrhardt this morning on tech-kern.
|
1.81.2.2 |
| 20-May-2006 |
riz | Pull up following revision(s) (requested by perseant in ticket #1327):
sys/ufs/lfs/lfs_vnops.c: revision 1.152 sys/ufs/lfs/lfs_debug.c: revision 1.31 sys/ufs/lfs/lfs_subr.c: revision 1.53 sys/ufs/lfs/lfs_extern.h: revision 1.68 sys/ufs/lfs/lfs_inode.c: revision 1.96 sys/ufs/lfs/lfs_bio.c: revision 1.86 sys/ufs/lfs/lfs_alloc.c: revision 1.83 sys/ufs/lfs/lfs_vfsops.c: revision 1.181 sys/ufs/lfs/lfs.h: revision 1.88 sys/ufs/lfs/lfs_segment.c: revision 1.164
- sprinkle const
- avoid shadow variables.
|
1.81.2.1 |
| 07-May-2005 |
tron | Apply patch (requested by perseant in ticket #242):
* fsck_lfs buffer cache fixes, including PR #29151
* Change fsck_lfs phase 0 message to reflect reality
* fsck_lfs: check phase 5 (cleanerinfo accounting) even on roll-forward
* Keep better track of the free list during roll-forward, avoiding a core dump
* Improve hash table use for fsck_lfs buffer and vnode cache
* Document fsck_lfs flag -f, and implement -q
* Add resize_lfs, including kernel support
* Add LFS to mountd's list of exportable filesystem types
* Make the LFS lkm work again [christos@]
* Add MP locking to the LFS kernel subsystem
* Fix pager_map deadlock in lfs_putpages()
* Avoid incomplete file extension that looks like "partial truncation" to fsck
* Use lfs_malloc for cleaner malloc, since the cleaner often runs in low-memory conditions.
* Use splay trees, not hash table, to track page allocation for write.
* Fix mkdir panic on full fs
* Fix page accounting leak by counting differently.
* Use rightly named structure for lfs_getattr [skrll@]
* Cosmetic changes for readability.
|
1.86.2.7 |
| 27-Feb-2008 |
yamt | sync with head.
|
1.86.2.6 |
| 04-Feb-2008 |
yamt | sync with head.
|
1.86.2.5 |
| 21-Jan-2008 |
yamt | sync with head
|
1.86.2.4 |
| 27-Oct-2007 |
yamt | sync with head.
|
1.86.2.3 |
| 03-Sep-2007 |
yamt | sync with head.
|
1.86.2.2 |
| 30-Dec-2006 |
yamt | sync with head.
|
1.86.2.1 |
| 21-Jun-2006 |
yamt | sync with head.
|
1.87.2.1 |
| 15-Jan-2006 |
yamt | sync with head.
|
1.89.6.4 |
| 11-Aug-2006 |
yamt | sync with head
|
1.89.6.3 |
| 24-May-2006 |
yamt | sync with head.
|
1.89.6.2 |
| 13-Mar-2006 |
yamt | sync with head.
|
1.89.6.1 |
| 05-Mar-2006 |
yamt | separate page replacement policy from the rest of kernel.
|
1.89.4.2 |
| 01-Jun-2006 |
kardel | Sync with head.
|
1.89.4.1 |
| 22-Apr-2006 |
simonb | Sync with head.
|
1.89.2.1 |
| 09-Sep-2006 |
rpaulo | sync with head
|
1.90.4.1 |
| 24-May-2006 |
tron | Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
|
1.90.2.3 |
| 11-May-2006 |
elad | sync with head
|
1.90.2.2 |
| 06-May-2006 |
christos | - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files that need it.
Approved by core.
|
1.90.2.1 |
| 19-Apr-2006 |
elad | sync with head.
|
1.93.4.1 |
| 13-Jul-2006 |
gdamore | Merge from HEAD.
|
1.94.4.1 |
| 18-Nov-2006 |
ad | Sync with head.
|
1.95.2.2 |
| 10-Dec-2006 |
yamt | sync with head.
|
1.95.2.1 |
| 22-Oct-2006 |
yamt | sync with head
|
1.98.16.1 |
| 03-Sep-2007 |
wrstuden | Sync w/ NetBSD-4-RC_1
|
1.98.10.1 |
| 11-Jul-2007 |
mjf | Sync with head.
|
1.98.8.7 |
| 24-Aug-2007 |
ad | Sync with buffer cache locking changes. See buf.h/vfs_bio.c for details. Some minor portions are incomplete and need to be verified as a whole.
|
1.98.8.6 |
| 20-Aug-2007 |
ad | Sync with HEAD.
|
1.98.8.5 |
| 19-Aug-2007 |
ad | - Back out the biodone() changes.
- Eliminate B_ERROR (from HEAD).
|
1.98.8.4 |
| 23-Jun-2007 |
ad | - Lock v_cleanblkhd, v_dirtyblkhd, v_numoutput with the vnode's interlock. Get rid of global_v_numoutput_lock. Partially incomplete as the buffer cache locking doesn't work very well and needs an overhaul.
- Some changes to try and make softdep MP safe. Untested.
|
1.98.8.3 |
| 08-Jun-2007 |
ad | Sync with head.
|
1.98.8.2 |
| 13-May-2007 |
ad | - Pass the error number and residual count to biodone(), and let it handle setting error indicators. Prepare to eliminate B_ERROR.
- Add a flag argument to brelse() to be set into the buf's flags, instead of doing it directly. Typically used to set B_INVAL.
- Add a "struct cpu_info *" argument to kthread_create(), to be used to create bound threads. Change "bool mpsafe" to "int flags".
- Allow exit of LWPs in the IDL state when (l != curlwp).
- More locking fixes & conversion to the new API.
|
1.98.8.1 |
| 13-Mar-2007 |
ad | Pull in the initial set of changes for the vmlocking branch.
|
1.98.4.2 |
| 17-May-2007 |
yamt | sync with head.
|
1.98.4.1 |
| 07-May-2007 |
yamt | sync with head.
|
1.98.2.1 |
| 05-Jun-2007 |
bouyer | Pull up following revision(s) (requested by perseant in ticket #703):
sys/miscfs/genfs/genfs.h 1.21 sys/miscfs/genfs/genfs_vnops.c 1.151 sys/ufs/lfs/lfs.h 1.119, 1.120 sys/ufs/lfs/lfs_bio.c 1.99-101 sys/ufs/lfs/lfs_extern.h 1.89 sys/ufs/lfs/lfs_inode.c 1.108, 1.109 sys/ufs/lfs/lfs_segment.c 1.197, 1.199, 1.200 sys/ufs/lfs/lfs_subr.c 1.69, 1.70 sys/ufs/lfs/lfs_syscalls.c 1.119 sys/ufs/lfs/lfs_vfsops.c 1.234, 1.235 sys/ufs/lfs/lfs_vnops.c 1.195, 1.196, 1.200, 1.202-206
Reduce busy waiting in lfs_putpages(), and other LFS improvements.
|
1.102.2.1 |
| 15-Aug-2007 |
skrll | Sync with HEAD.
|
1.103.10.2 |
| 29-Jul-2007 |
ad | It's not a good idea for device drivers to modify b_flags, as they don't need to understand the locking around that field. Instead of setting B_ERROR, set b_error instead. b_error is 'owned' by whoever completes the I/O request.
|
1.103.10.1 |
| 29-Jul-2007 |
ad | file lfs_bio.c was added on branch matt-mips64 on 2007-07-29 13:31:15 +0000
|
1.103.8.1 |
| 14-Oct-2007 |
yamt | sync with head.
|
1.103.6.3 |
| 23-Mar-2008 |
matt | sync with HEAD
|
1.103.6.2 |
| 09-Jan-2008 |
matt | sync with HEAD
|
1.103.6.1 |
| 06-Nov-2007 |
matt | sync with HEAD
|
1.103.4.1 |
| 26-Oct-2007 |
joerg | Sync with HEAD.
Follow the merge of pmap.c on i386 and amd64 and move pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup code to restore CR4 before jumping back into kernel space as the large page option might cover that.
|
1.106.10.1 |
| 02-Jan-2008 |
bouyer | Sync with HEAD
|
1.106.6.5 |
| 19-Dec-2007 |
ad | Use a global lfs_lock.
|
1.106.6.4 |
| 19-Dec-2007 |
ad | Fix some more problems w/lfs on this branch.
|
1.106.6.3 |
| 19-Dec-2007 |
ad | Get lfs mostly working.
|
1.106.6.2 |
| 08-Dec-2007 |
ad | Minor locking fixes.
|
1.106.6.1 |
| 04-Dec-2007 |
ad | Pull the vmlocking changes into a new branch.
|
1.106.4.1 |
| 18-Feb-2008 |
mjf | Sync with HEAD.
|
1.110.10.3 |
| 11-Aug-2010 |
yamt | sync with head.
|
1.110.10.2 |
| 11-Mar-2010 |
yamt | sync with head
|
1.110.10.1 |
| 16-May-2008 |
yamt | sync with head.
|
1.110.8.1 |
| 18-May-2008 |
yamt | sync with head.
|
1.110.6.1 |
| 02-Jun-2008 |
mjf | Sync with HEAD.
|
1.114.18.1 |
| 19-Dec-2013 |
matt | Adapt to new uvm_estimatepageable arguments
|
1.116.2.2 |
| 17-Aug-2010 |
uebayasi | Sync with HEAD.
|
1.116.2.1 |
| 30-Apr-2010 |
uebayasi | Sync with HEAD.
|
1.117.2.2 |
| 03-Jul-2010 |
rmind | sync with head
|
1.117.2.1 |
| 16-Mar-2010 |
rmind | Change struct uvm_object::vmobjlock to be dynamically allocated with mutex_obj_alloc(). It allows us to share the locks among UVM objects.
|
1.118.6.1 |
| 23-Jun-2011 |
cherry | Catchup with rmind-uvmplock merge.
|
1.120.6.1 |
| 18-Feb-2012 |
mrg | merge to -current.
|
1.120.2.2 |
| 22-May-2014 |
yamt | sync with head.
for a reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
|
1.120.2.1 |
| 17-Apr-2012 |
yamt | sync with head
|
1.121.2.1 |
| 17-Mar-2012 |
bouyer | Pull up following revision(s) (requested by perseant in ticket #116):
sys/ufs/lfs/lfs_alloc.c: revision 1.112 tests/fs/vfs/t_rmdirrace.c: revision 1.9 tests/fs/vfs/t_renamerace.c: revision 1.25 sys/ufs/lfs/lfs_vnops.c: revision 1.240 sys/ufs/lfs/lfs_segment.c: revision 1.224 sys/ufs/lfs/lfs_bio.c: revision 1.122 sys/ufs/lfs/lfs_vfsops.c: revision 1.294 sbin/newfs_lfs/make_lfs.c: revision 1.19 sys/ufs/lfs/lfs.h: revision 1.136
Pass t_renamerace and t_rmdirrace tests.
Adapt dholland@'s fix to ufs_rename to fix PR kern/43582. Address several other MP locking issues discovered during the course of investigating the same problem.
Removed extraneous vn_lock() calls on the Ifile, since the Ifile writes are controlled by the segment lock.
Fix PR kern/45982 by deemphasizing the estimate of how much metadata will fill the empty space on disk when the disk is nearly empty (t_renamerace creates a lot of inode blocks on a tiny empty disk).
|
1.122.2.3 |
| 03-Dec-2017 |
jdolecek | update from HEAD
|
1.122.2.2 |
| 20-Aug-2014 |
tls | Rebase to HEAD as of a few days ago.
|
1.122.2.1 |
| 23-Jun-2013 |
tls | resync from head
|
1.125.2.2 |
| 18-May-2014 |
rmind | sync with head
|
1.125.2.1 |
| 28-Aug-2013 |
rmind | sync with head
|
1.128.6.3 |
| 28-Aug-2017 |
skrll | Sync with HEAD
|
1.128.6.2 |
| 27-Dec-2015 |
skrll | Sync with HEAD (as of 26th Dec)
|
1.128.6.1 |
| 22-Sep-2015 |
skrll | Sync with HEAD
|
1.135.4.1 |
| 21-Apr-2017 |
bouyer | Sync with HEAD
|
1.135.2.2 |
| 26-Apr-2017 |
pgoyette | Sync with HEAD
|
1.135.2.1 |
| 20-Mar-2017 |
pgoyette | Sync with HEAD
|
1.139.4.1 |
| 30-Oct-2017 |
snj | Pull up following revision(s) (requested by maya in ticket #330):
sbin/fsck_lfs/inode.c: 1.69 sbin/fsck_lfs/lfs.c: 1.73 sbin/fsck_lfs/pass6.c: 1.50 sbin/fsck_lfs/segwrite.c: 1.46 sys/ufs/lfs/lfs.h: 1.202-1.203 sys/ufs/lfs/lfs_accessors.h: 1.48 sys/ufs/lfs/lfs_alloc.c: 1.136-1.137 sys/ufs/lfs/lfs_balloc.c: 1.94 sys/ufs/lfs/lfs_bio.c: 1.141 sys/ufs/lfs/lfs_extern.h: 1.113 sys/ufs/lfs/lfs_inode.c: 1.156-1.157 sys/ufs/lfs/lfs_inode.h: 1.20, 1.21, 1.23 sys/ufs/lfs/lfs_itimes.c: 1.20 sys/ufs/lfs/lfs_pages.c: 1.13-1.15 sys/ufs/lfs/lfs_rename.c: 1.22 sys/ufs/lfs/lfs_segment.c: 1.270-1.275 sys/ufs/lfs/lfs_subr.c: 1.94-1.97 sys/ufs/lfs/lfs_syscalls.c: 1.175 sys/ufs/lfs/lfs_vfsops.c: 1.360 sys/ufs/lfs/lfs_vnops.c: 1.316-1.321 sys/ufs/lfs/ulfs_inode.c: 1.20 sys/ufs/lfs/ulfs_inode.h: 1.24 sys/ufs/lfs/ulfs_lookup.c: 1.41 sys/ufs/lfs/ulfs_quota2.c: 1.31 sys/ufs/lfs/ulfs_readwrite.c: 1.24 sys/ufs/lfs/ulfs_vnops.c: 1.49-1.50
Update inode member i_flag --> i_state to keep up with kernel changes
Move definition of IN_ALLMOD near the flag it's a mask for. Now we can see that it doesn't match all the flags, but changing that will require more careful thought.
Correct confusion between i_flag and i_flags
These will have to be renamed. Spotted by Riastradh, thanks!
Add an XXX about the missing flags so it's not buried in a commit message. now the XXX count for LFS is 260
Rename i_flag to i_state. The similarity to i_flags has previously caused errors.
Use continue to denote the no-op loop to match netbsd style
newline for extra clarity.
It isn't safe to drain dirops with seglock held, it'll deadlock if there are any dirops. drain before grabbing seglock. lfs_dirops == 0 is always true (as we already drained dirops), so omit that part of the comparison.
Fixes a lot of LFS deadlocks. PR kern/52301
Many thanks to dholland for help analyzing coredumps
Ifdef out KDASSERT which fires on my machine.
Deduplicate sanity check that seglock is held on segunlock
Revert r1.272 fix to PR kern/52301, the performance hit is making things unusable.
change lfs_nextsegsleep and lfs_allclean_wakeup to use condvar
XXX had to use lfs_lock in lfs_segwait, removed kernel_lock, is this appropriate?
fix buffer overflow/KASSERT when cookies are supplied
lfs no longer uses the ffs-style struct direct, use the correct minimum size from dholland
XXX more wrong
Consistently use {,UN}MARK_VNODE macros rather than function calls.
Not much point doing anything after a panic call
Ask some question about the code in a XXX comment
XXX question our double-flushing of dirops
Fix typo in comment
|
1.141.4.1 |
| 25-Jun-2018 |
pgoyette | Sync with HEAD
|
1.142.6.1 |
| 17-Aug-2020 |
martin | Pull up following revision(s) (requested by riastradh in ticket #1050):
sys/ufs/lfs/lfs_subr.c: revision 1.101 sys/ufs/lfs/lfs_subr.c: revision 1.102 sys/ufs/lfs/lfs_inode.c: revision 1.158 sys/ufs/lfs/lfs_inode.h: revision 1.25 sys/ufs/lfs/lfs_balloc.c: revision 1.95 sys/ufs/lfs/lfs_pages.c: revision 1.21 sys/ufs/lfs/lfs_vnops.c: revision 1.330 sys/ufs/lfs/lfs_alloc.c: revision 1.140 (patch) sys/ufs/lfs/lfs_alloc.c: revision 1.141 (patch) lib/libp2k/p2k.c: revision 1.72 sys/ufs/lfs/lfs.h: revision 1.205 sys/ufs/lfs/lfs.h: revision 1.206 sys/ufs/lfs/lfs_segment.c: revision 1.284 sys/ufs/lfs/lfs.h: revision 1.207 sys/ufs/lfs/lfs_segment.c: revision 1.285 sys/ufs/lfs/lfs_debug.c: revision 1.55 sys/ufs/lfs/lfs_rename.c: revision 1.23 usr.sbin/dumplfs/dumplfs.c: revision 1.65 sys/ufs/lfs/lfs_vfsops.c: revision 1.371 sys/arch/i386/stand/efiboot/bootx64/Makefile: revision 1.3 sys/ufs/lfs/lfs_vfsops.c: revision 1.372 sys/ufs/lfs/lfs_vfsops.c: revision 1.373 sbin/fsck_lfs/pass1.c: revision 1.46 sys/ufs/lfs/lfs_vnops.c: revision 1.326 sys/ufs/lfs/lfs_vnops.c: revision 1.327 sys/ufs/lfs/lfs_vfsops.c: revision 1.375 (patch) sys/ufs/lfs/lfs_vnops.c: revision 1.328 sys/ufs/lfs/lfs_subr.c: revision 1.98 sys/ufs/lfs/lfs_extern.h: revision 1.116 sys/ufs/lfs/lfs_vnops.c: revision 1.329 sys/ufs/lfs/lfs_subr.c: revision 1.99 sys/ufs/lfs/lfs_extern.h: revision 1.117 sys/ufs/lfs/lfs_accessors.h: revision 1.49 sys/ufs/lfs/lfs_extern.h: revision 1.118 sys/rump/fs/lib/liblfs/Makefile: revision 1.15 sys/ufs/lfs/lfs_bio.c: revision 1.146 (patch) sys/ufs/lfs/lfs_bio.c: revision 1.147 sys/ufs/lfs/lfs_subr.c: revision 1.100
Fix kassert in lfs by initializing vp first.
Use a marker node to iterate lfs_dchainhd / i_lfs_dchain.
I believe elements can be removed while the lock is dropped, including the next node we're hanging on to.
Just use VOP_BWRITE for lfs_bwrite_log. Hope this doesn't cause trouble with vfs_suspend.
Teach lfs to transition ro<->rw.
Prevent new dirops while we issue lfs_flush_dirops.
lfs_flush_dirops assumes (by KASSERT((ip->i_state & IN_ADIROP) == 0)) that vnodes on the dchain will not become involved in active dirops even while holding no other locks (lfs_lock, v_interlock), so we must set lfs_writer here. All other callers already set lfs_writer.
We set fs->lfs_writer++ without explicitly doing lfs_writer_enter because (a) we already waited for the dirops to drain, and (b) we hold lfs_lock and cannot drop it before setting lfs_writer.
Assert lfs_writer where I think we can now prove it.
Serialize access to the splay tree with lfs_lock.
Change some cheap KDASSERT into KASSERT.
Take a reference and fix assertions in lfs_flush_dirops.
Fixes panic: KASSERT((ip->i_state & IN_ADIROP) == 0) at lfs_vnops.c:1670
    lfs_flush_dirops
    lfs_check
    lfs_setattr
    VOP_SETATTR
    change_mode
    sys_fchmod
    syscall
This assertion -- and the assertion that vp->v_uflag has VU_DIROP set -- is valid only until we release lfs_lock, because we may race with lfs_unmark_dirop which will remove the nodes and change the flags.
Further, vp itself is valid only as long as it is referenced, which it is as long as it's on the dchain, but lfs_unmark_dirop drops the dchain's reference.
Don't lfs_writer_enter while holding v_interlock.
There's no need to lfs_writer_enter at all here, as far as I can see. lfs_flush_fs will do it for us.
Break deadlock in PR kern/52301.
The lock order is lfs_writer -> lfs_seglock. The problem in 52301 is that lfs_segwrite violates this lock order by sometimes doing lfs_seglock -> lfs_writer, either (a) when doing a checkpoint or (b), opportunistically, when there are no dirops pending. Both cases can deadlock, because dirops sometimes take the seglock (lfs_truncate, lfs_valloc, lfs_vfree):
(a) There may be dirops pending, and they may be waiting for the seglock, so we can't wait for them to complete while holding the seglock.
(b) The test for fs->lfs_dirops == 0 happens unlocked, and the state may change by the time lfs_writer_enter acquires lfs_lock.
To resolve this in each case:
(a) Do lfs_writer_enter before lfs_seglock, since we will need it unconditionally anyway. The worst performance impact of this should be that some dirops get delayed a little bit.
(b) Create a new lfs_writer_tryenter to use at this point so that the test for fs->lfs_dirops == 0 and the acquisition of lfs_writer happen atomically under lfs_lock.
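As an illustration of point (b), here is a minimal user-space sketch of a try-enter that tests the dirop count and claims the writer slot under a single lock hold. This is not the kernel code: struct fs_state, fs_state_init, writer_tryenter and writer_leave are hypothetical names, and a pthread mutex stands in for lfs_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-filesystem state; not the kernel's struct lfs. */
struct fs_state {
	pthread_mutex_t lock;	/* plays the role of lfs_lock */
	unsigned dirops;	/* directory operations currently active */
	unsigned writer;	/* writers currently excluding new dirops */
};

static void
fs_state_init(struct fs_state *fs)
{
	pthread_mutex_init(&fs->lock, NULL);
	fs->dirops = 0;
	fs->writer = 0;
}

/*
 * Succeed only if no dirops are active.  The test and the increment
 * happen under one lock hold, so the state cannot change in between.
 */
static bool
writer_tryenter(struct fs_state *fs)
{
	bool ok;

	pthread_mutex_lock(&fs->lock);
	ok = (fs->dirops == 0);
	if (ok)
		fs->writer++;
	pthread_mutex_unlock(&fs->lock);
	return ok;
}

/* Release a writer slot previously obtained with writer_tryenter(). */
static void
writer_leave(struct fs_state *fs)
{
	pthread_mutex_lock(&fs->lock);
	assert(fs->writer > 0);
	fs->writer--;
	pthread_mutex_unlock(&fs->lock);
}
```

A caller that gets false back would simply skip the opportunistic work rather than block, which is what keeps it from waiting on dirops while holding another lock.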
Initialize/destroy lfs_allclean_wakeup in modcmd, not lfs_mountfs.
Fixes reloading lfs.kmod.
In lfs_update, hold lfs_writer around lfs_vflush.
Otherwise, we might do lfs_vflush -> lfs_seglock -> lfs_segwait(SEGM_CKP) -> lfs_writer_enter which is the reverse of the lfs_writer -> lfs_seglock ordering.
Call lfs_orphan in lfs_rename while we're still in the dirop.
lfs_writer_enter can't fail; keep it simple and don't pretend it can.
Assert that mtsleep can't fail either -- it doesn't catch signals and there's no timeout.
Teach LFS_ORPHAN_NEXTFREE about lfs64.
Dust off the orphan detection code and try to make it work.
Fix !DIAGNOSTIC compile
Fix userland references to LFS_ORPHAN_NEXTFREE.
Forgot to grep for these or do a full distribution build, oops!
Fix missing <sys/evcnt.h> by removing the evcnts instead.
Just wanted to confirm that a race might happen, and indeed it did. These serve little diagnostic value otherwise.
OR into bp->b_cflags; don't overwrite.
CTASSERT lfs on-disk structure sizes.
Avoid misaligned access to lfs64 on-disk records in memory. lfs64 directory entries are only 32-bit aligned in order to conserve space in directory blocks, and we had a hack to stuff a 64-bit inode in them. This replaces the hack by __aligned(4) __packed, and goes further:
1. It's not clear that all the other lfs64 data structures are 64-bit aligned on disk to begin with. We can go through these later and upgrade them from
    struct foo64 { ... } __aligned(4) __packed;
    union foo { struct foo64 f64; ... };
to
    struct foo64 { ... };
    union foo { struct foo64 f64 __aligned(8); ... } __aligned(4) __packed;
if we really want to take advantage of 64-bit memory accesses. However, the __aligned(4) __packed must remain on the union because:
2. We access even the lfs32 data structures via a union that has lfs64 members, and it turns out that compilers will assume access through a union with 64-bit aligned members implies the whole union has 64-bit alignment, even if we're only accessing a 32-bit aligned member.
Fix clang build after packed lfs64 accessor change.
Suppress spurious address-of-packed error in rump lfs too.
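The effect of combining the aligned(4) and packed attributes, as the lfs64 change above describes, can be sketched in user space. This is an illustration only, assuming GCC or Clang attribute syntax; struct rec64, struct rec64_natural and rec64_ino_at are hypothetical names and do not reproduce the real lfs64 on-disk layout.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: a 64-bit field stored at only 32-bit alignment. */
struct rec64 {
	uint64_t ino;
	uint16_t reclen;
	uint16_t namlen;
} __attribute__((aligned(4), packed));

/*
 * Same members without the attributes, for contrast: the compiler is
 * free to require the natural alignment of uint64_t and pad the tail.
 */
struct rec64_natural {
	uint64_t ino;
	uint16_t reclen;
	uint16_t namlen;
};

/*
 * Read the 64-bit field from a record at an arbitrary byte offset.
 * Copying with memcpy avoids any misaligned load through a pointer.
 */
static uint64_t
rec64_ino_at(const unsigned char *buf, size_t off)
{
	struct rec64 r;

	memcpy(&r, buf + off, sizeof r);
	return r.ino;
}
```

With these attributes the compiler reports 4-byte alignment for struct rec64 and no tail padding, so such records can sit back to back on 32-bit boundaries inside a block.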
|
1.142.2.1 |
| 08-Apr-2020 |
martin | Merge changes from current as of 20200406
|
1.144.2.1 |
| 29-Feb-2020 |
ad | Sync with head.
|