History log of /src/sys/kern/vfs_wapbl.c |
Revision | | Date | Author | Comments |
1.117 |
| 07-Dec-2024 |
riastradh | sys/kern/vfs_wapbl.c: Provide stub SET_ERROR for userland builds.
Should fix:
/tmp/build/2024.12.07.14.08.54-amd64/src/sys/kern/vfs_wapbl.c: In function 'wapbl_replay_start':
/tmp/build/2024.12.07.14.08.54-amd64/src/sys/kern/vfs_wapbl.c:2978:24: error: implicit declaration of function 'SET_ERROR'; did you mean 'EV_ERROR'? [-Werror=implicit-function-declaration]
 2978 |                 return SET_ERROR(EINVAL);
      |                        ^~~~~~~~~
      |                        EV_ERROR
|
1.116 |
| 07-Dec-2024 |
riastradh | vfs(9): Sprinkle SET_ERROR dtrace probes.
PR kern/58378: Kernel error code origination lacks dtrace probes
|
1.115 |
| 07-Dec-2024 |
riastradh | vfs(9): Fix some more whitespace issues.
No functional change intended.
|
1.114 |
| 07-Dec-2024 |
riastradh | vfs(9): Sprinkle KNF.
No functional change intended.
|
1.113 |
| 13-May-2024 |
msaitoh | s/of of/of/ in comment.
|
1.112 |
| 09-Apr-2022 |
riastradh | sys: Use membar_release/acquire around reference drop.
This just goes through my recent reference count membar audit and changes membar_exit to membar_release and membar_enter to membar_acquire -- this should make everything cheaper on most CPUs without hurting correctness, because membar_acquire is generally cheaper than membar_enter.
|
1.111 |
| 04-Apr-2022 |
andvar | fix various typos, mainly in comments.
|
1.110 |
| 12-Mar-2022 |
riastradh | sys: Membar audit around reference count releases.
If two threads are using an object that is freed when the reference count goes to zero, we need to ensure that all memory operations related to the object happen before freeing the object.
Using an atomic_dec_uint_nv(&refcnt) == 0 ensures that only one thread takes responsibility for freeing, but it's not enough to ensure that the other thread's memory operations happen before the freeing.
Consider:
    Thread A                        Thread B
    obj->foo = 42;                  obj->baz = 73;
    mumble(&obj->bar);              grumble(&obj->quux);
    /* membar_exit(); */            /* membar_exit(); */
    atomic_dec -- not last          atomic_dec -- last
                                    /* membar_enter(); */
                                    KASSERT(invariant(obj->foo, obj->bar));
                                    free_stuff(obj);
The memory barriers ensure that
    obj->foo = 42;
    mumble(&obj->bar);
in thread A happens before
    KASSERT(invariant(obj->foo, obj->bar));
    free_stuff(obj);
in thread B. Without them, this ordering is not guaranteed.
So in general it is necessary to do
    membar_exit();
    if (atomic_dec_uint_nv(&obj->refcnt) != 0)
            return;
    membar_enter();
to release a reference, for the `last one out hit the lights' style of reference counting. (This is in contrast to the style where one thread blocks new references and then waits under a lock for existing ones to drain with a condvar -- no membar needed thanks to mutex(9).)
I searched for atomic_dec to find all these. Obviously we ought to have a better abstraction for this because there's so much copypasta. This is a stop-gap measure to fix actual bugs until we have that. It would be nice if an abstraction could gracefully handle the different styles of reference counting in use -- some years ago I drafted an API for this, but making it cover everything got a little out of hand (particularly with struct vnode::v_usecount) and I ended up setting it aside to work on psref/localcount instead for better scalability.
I got bored of adding #ifdef __HAVE_ATOMIC_AS_MEMBAR everywhere, so I only put it on things that look performance-critical on 5sec review. We should really adopt membar_enter_preatomic/membar_exit_postatomic or something (except they are applicable only to atomic r/m/w, not to atomic_load/store_*, making the naming annoying) and get rid of all the ifdefs.
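The release pattern described in this entry can be sketched in portable C. This is a userland analogue that assumes C11 `<stdatomic.h>` instead of the kernel's `membar_*` and `atomic_dec_uint_nv` primitives; `struct obj` and `obj_release` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; the field names are illustrative only. */
struct obj {
	atomic_uint refcnt;
	int foo;
};

/*
 * Drop one reference.  Returns true if the caller dropped the last
 * reference and is now responsible for freeing the object.
 */
static bool
obj_release(struct obj *o)
{
	/* The caller must hold a reference in order to drop one. */
	assert(atomic_load_explicit(&o->refcnt, memory_order_relaxed) > 0);

	/*
	 * Release ordering: everything this thread did to *o happens
	 * before the decrement becomes visible (the role played by
	 * membar_exit/membar_release in the log entry).
	 */
	if (atomic_fetch_sub_explicit(&o->refcnt, 1,
	    memory_order_release) != 1)
		return false;	/* not the last reference */

	/*
	 * Acquire fence: the last thread out observes the other
	 * threads' earlier stores before it inspects or frees *o
	 * (the role played by membar_enter/membar_acquire).
	 */
	atomic_thread_fence(memory_order_acquire);
	return true;
}
```

Note that `atomic_fetch_sub_explicit` returns the old value, so the last reference is detected by comparing against 1, whereas the `_nv` kernel primitive returns the new value and is compared against 0.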
|
1.109 |
| 03-Aug-2021 |
chs | initialize wc_unused to 0, to avoid writing uninitialized memory to disk. detected by KMSAN.
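A minimal sketch of the general remedy, assuming a hypothetical on-disk header `struct xhdr` (not the real WAPBL structure): zero the whole record before filling it in, so that fields the code never assigns do not carry stale memory to disk.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk header with a field the code never uses. */
struct xhdr {
	uint32_t type;
	uint32_t len;
	uint32_t unused;
};

/* Initialise a header so that every byte written to disk is defined. */
static void
xhdr_init(struct xhdr *h, uint32_t type, uint32_t len)
{
	assert(h != NULL);
	memset(h, 0, sizeof(*h));	/* covers `unused` as well */
	h->type = type;
	h->len = len;
}
```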
|
1.108 |
| 12-Apr-2020 |
jdolecek | fix wapbl_discard() to actually discard the queued bufs properly - need to set BC_INVAL for them, and also need to explicitly remove them from the BQ_LOCKED queue
fixes DIAGNOSTIC panic when force unmounting unresponsive disk device PR kern/51178 by Michael van Elst
|
1.107 |
| 12-Apr-2020 |
jdolecek | fix race between wapbl_discard() and wapbl_biodone() on forced unmount on shutdown with slow I/O device
wapbl_discard() needs to hold both wl_mtx and bufcache_lock while manipulating wl_entries - the rw lock is not enough, because wapbl_biodone() only takes wl_mtx while removing the finished entry from list
wapbl_biodone() must take bufcache_lock before reading we->we_wapbl, so it's blocked until wapbl_discard() finishes, and takes !wl path appropriately
this is supposed to fix panic on shutdown:
[ 67549.6304123] forcefully unmounting / (/dev/wd0a)...
...
[ 67549.7272030] panic: mutex_vector_enter,510: uninitialized lock (lock=0xffffa722a4f4f5b0, from=ffffffff80a884fa)
...
[ 67549.7272030] wapbl_biodone() at netbsd:wapbl_biodone+0x4d
[ 67549.7272030] biointr() at netbsd:biointr+0x7d
[ 67549.7272030] softint_dispatch() at netbsd:softint_dispatch+0x12c
[ 67549.7272030] Xsoftintr() at netbsd:Xsoftintr+0x4f
|
1.106 |
| 16-Mar-2020 |
pgoyette | branches: 1.106.2; Use the module subsystem's ability to process SYSCTL_SETUP() entries to automate installation of sysctl nodes.
Note that there are still a number of device and pseudo-device modules that create entries tied to individual device units, rather than to the module itself. These are not changed.
|
1.105 |
| 14-Mar-2020 |
ad | OR into bp->b_cflags; don't overwrite.
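The difference between OR-ing a flag in and overwriting the field can be sketched as follows. The flag names and `struct xbuf` are hypothetical stand-ins; the log does not say which flag was involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits; the real BC_* values live in <sys/buf.h>. */
#define	XC_BUSY		0x01u
#define	XC_INVAL	0x02u

struct xbuf {
	unsigned cflags;
};

/* Set the busy bit without disturbing flags that are already set. */
static void
xbuf_mark_busy(struct xbuf *bp)
{
	assert(bp != NULL);
	bp->cflags |= XC_BUSY;	/* "= XC_BUSY" would silently drop XC_INVAL */
}
```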
|
1.104 |
| 08-Mar-2020 |
ad | Typo.
|
1.103 |
| 10-Dec-2018 |
jdolecek | constify wapbl_ops
|
1.102 |
| 10-Dec-2018 |
jdolecek | add wo_wapbl_jlock_assert to wapbl_ops
|
1.101 |
| 02-Dec-2017 |
jdolecek | branches: 1.101.2; 1.101.4; according to benchmark extracting pkgsrc.tar, using FUA and hence waiting for each transfer to write through to the medium is way slower than just letting the drive use a cached write and doing DIOCCACHESYNC on the end
Results were (fs block 32KB / frag 4KB, partition aligned on 32KB boundary):
HDD at siisata(4): no-FUA: 108 sec, w/FUA: 294 sec
SSD at ahcisata(4): no-FUA: 73 sec, w/FUA: 502 sec
change the flag so that FUA is only used for the commit block write; for journal data write, only pass DPO, rely on the cache flush to get them to media
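The flag policy described above might look roughly like this. The sketch uses hypothetical flag bits `XB_FUA` and `XB_DPO` and a hypothetical helper name. The log does not say whether the commit block also carries DPO, so the sketch assumes it does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits standing in for the driver's FUA/DPO flags. */
#define	XB_FUA	0x1u	/* force unit access: write through to the medium */
#define	XB_DPO	0x2u	/* disable page out: no need to keep the data cached */

/*
 * Pick media flags for a journal write.  `allow_dpofua` models the
 * sysctl knob; `is_commit` marks the commit block write.
 */
static unsigned
journal_write_flags(bool allow_dpofua, bool is_commit)
{
	unsigned flags = 0;

	if (!allow_dpofua)
		return 0;	/* rely entirely on the cache flush */
	flags |= XB_DPO;	/* assumption: passed on every journal write */
	if (is_commit)
		flags |= XB_FUA;	/* only the commit block waits for the medium */
	assert((flags & XB_FUA) == 0 || is_commit);
	return flags;
}
```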
|
1.100 |
| 27-Oct-2017 |
joerg | Revert printf return value change.
|
1.99 |
| 27-Oct-2017 |
utkarsh009 | [syzkaller] Cast all the printf's to (void *) as a result of new printf(9) declaration.
|
1.98 |
| 23-Oct-2017 |
jdolecek | remove counter for 'journal I/O bufs biowait' - it's (total - async), so superfluous; adjust the description of the other counters a bit to make them more clear
|
1.97 |
| 08-Jun-2017 |
chs | move some buffer cache internals declarations from buf.h to vfs_bio.c. this is needed to avoid name conflicts with ZFS and also makes it clearer that other code shouldn't be messing with these. remove the LFS debug code that poked around in bufqueues and remove the BQ_EMPTY bufqueue since nothing uses it anymore. provide a function to let LFS and wapbl read the value of nbuf for now.
|
1.96 |
| 10-Apr-2017 |
jdolecek | rename allow_fuadpo to allow_dpofua, so it's the same order as the SCSI flag
|
1.95 |
| 10-Apr-2017 |
jdolecek | improve performance of journal writes by parallelizing the I/O - use 4 bufs by default, add sysctl vfs.wapbl.journal_iobufs to control it
this also removes need to allocate iobuf during commit, so it might help to avoid deadlock during memory shortages like PR kern/47030
|
1.94 |
| 10-Apr-2017 |
jdolecek | change b_wapbllist to TAILQ, to preserve the LRU order
|
1.93 |
| 05-Apr-2017 |
jdolecek | optionally use FUA instead of full cache sync, and DPO for journal writes, when supported by disk device; controlled by sysctl vfs.wapbl.allow_fuadpo, default off for now
discussed on tech-kern
|
1.92 |
| 17-Mar-2017 |
riastradh | Back out part of previous: missed a caller of wapbl_write_inodes.
|
1.91 |
| 17-Mar-2017 |
riastradh | Nix trailing whitespace.
|
1.90 |
| 17-Mar-2017 |
riastradh | Sort includes.
|
1.89 |
| 17-Mar-2017 |
riastradh | Assert write lock in wapbl_write_revocations, wapbl_write_inodes.
Only one call site, so trivial to prove correct.
|
1.88 |
| 05-Mar-2017 |
mrg | add missing sys/evcnt.h include.
|
1.87 |
| 05-Mar-2017 |
jdolecek | add some event counters, for commits, writes, cache flush
|
1.86 |
| 10-Nov-2016 |
jdolecek | branches: 1.86.2; during truncate with wapbl, register deallocation for upper indirect block before recursing into lower blocks, to make sure that it will be removed after all its referenced blocks are removed
fixes 'ffs_blkfree_common: freeing free block' panic triggered by ufs_truncate_retry() when just the upper indirect block registration failed, code tried to free the lower blocks again after wapbl flush
problem found by hannken@, thank you
|
1.85 |
| 28-Oct-2016 |
jdolecek | reorganize ffs_truncate()/ffs_indirtrunc() to be able to partially succeed; change wapbl_register_deallocation() to return EAGAIN rather than panic when code hits the limit
callers changed to either loop calling ffs_truncate() using new utility ufs_truncate_retry() if their semantics requires it, or just ignore the failure; remove ufs_wapbl_truncate()
this fixes possible user-triggerable panic during truncate, and resolves WAPBL performance issue with truncates of large files
PR kern/47146 and kern/49175
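The "return EAGAIN and let the caller retry" scheme can be illustrated with a small simulation. All names here (`x_register_dealloc`, `x_flush`, `x_truncate_retry`, `XDEALLOC_LIMIT`) are hypothetical and only model the control flow; they are not the real `wapbl_register_deallocation()` or `ufs_truncate_retry()`.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical per-transaction budget of deallocation records. */
#define	XDEALLOC_LIMIT	4

static int x_registered;	/* records in the current transaction */
static int x_flushes;		/* number of flushes performed so far */

/* Register one deallocation; report EAGAIN instead of panicking. */
static int
x_register_dealloc(void)
{
	if (x_registered >= XDEALLOC_LIMIT)
		return EAGAIN;
	x_registered++;
	return 0;
}

/* Commit the transaction, making room for more records. */
static void
x_flush(void)
{
	x_registered = 0;
	x_flushes++;
}

/* Free `nblocks` blocks, flushing and retrying whenever the limit hits. */
static int
x_truncate_retry(int nblocks)
{
	assert(nblocks >= 0);
	while (nblocks > 0) {
		int error = x_register_dealloc();

		if (error == EAGAIN) {
			x_flush();	/* partial progress is kept */
			continue;
		}
		if (error != 0)
			return error;
		nblocks--;
	}
	return 0;
}
```

With a limit of 4, freeing 10 blocks takes two intermediate flushes and leaves two records pending in the final transaction.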
|
1.84 |
| 02-Oct-2016 |
jdolecek | drop wl_mtx mutex during call to pool_get() with PR_WAITOK
pointed out by riastradh
|
1.83 |
| 02-Oct-2016 |
jdolecek | fix off-by-one in wapbl_write_revocations() - when exiting the write loop, wd gets set to next unwritten record, not last written one as code assumed; 'lost head!' KASSERT is not triggered any more
|
1.82 |
| 02-Oct-2016 |
jdolecek | wapbl_write_revocations(): fix use-after-free when writing more than one block worth of revocations, introduced in previous commit; discovered by Brad Harder on current-users
|
1.81 |
| 01-Oct-2016 |
jdolecek | allocate wapbl dealloc registration structures via pool, so that there is more flexibility with limit handling
|
1.80 |
| 22-Sep-2016 |
jdolecek | misplaced comment
|
1.79 |
| 22-Sep-2016 |
jdolecek | store the number of block records per block into wl as wl_brperjblock, so that it's visible it's same value everywhere; no functional change
|
1.78 |
| 19-May-2016 |
riastradh | branches: 1.78.2; Replace deprecated disabled code by comment
describing what it intends to do, and why it won't work yet
From coypu.
|
1.77 |
| 07-May-2016 |
riastradh | Tweak comment on wapbl_flush.
|
1.76 |
| 07-May-2016 |
riastradh | Use %jx and a cast to uintmax_t, not %x, to print a dev_t.
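A short illustration of the idiom, assuming a hosted C environment. The helper name `format_dev` is made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Format a dev_t portably.  The width of dev_t differs between
 * systems, so widen it to uintmax_t and use the matching %jx
 * conversion instead of %x.
 */
static int
format_dev(char *buf, size_t len, dev_t dev)
{
	assert(buf != NULL);
	assert(len > 0);
	return snprintf(buf, len, "0x%jx", (uintmax_t)dev);
}
```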
|
1.75 |
| 07-May-2016 |
riastradh | Clarify comment about early exit from wapbl_flush.
Note possible bug. Requires further analysis.
|
1.74 |
| 07-May-2016 |
riastradh | Omit unused parameter to wapbl_fini.
|
1.73 |
| 07-May-2016 |
riastradh | Delete debugging option wapbl_lazy_truncate. Simplify.
Likely nobody has used this in the past decade -- you would have to enter ddb and write 1 to it in order to enable it anyway.
Patch prepared by coypu.
|
1.72 |
| 07-May-2016 |
riastradh | Turn WAPBL_DEBUG panic or KASSERT into KASSERTMSG
From coypu.
|
1.71 |
| 07-May-2016 |
riastradh | Document log layout and internal subroutines of vfs_wapbl.c.
|
1.70 |
| 07-May-2016 |
riastradh | KASSERT(A); KASSERT(B) instead of KASSERT(A && B).
|
1.69 |
| 07-May-2016 |
riastradh | Rename labels to make wapbl_flush a little easier to follow.
out ---> wait_out out2 ---> out
From coypu.
|
1.68 |
| 07-May-2016 |
riastradh | Sort and deduplicate includes.
|
1.67 |
| 03-May-2016 |
riastradh | Fix non-DIAGNOSTIC build.
|
1.66 |
| 03-May-2016 |
riastradh | panic takes no \n.
From coypu.
|
1.65 |
| 03-May-2016 |
riastradh | #ifdef DIAGNOSTIC panic ---> KASSERTMSG
From coypu.
|
1.64 |
| 15-Nov-2015 |
pgoyette | Enable the module's MODULE_CMD_FINI action. It actually works as intended.
|
1.63 |
| 14-Nov-2015 |
pgoyette | Fix obvious typo - even though it is inside a #ifdef notyet ... #endif
|
1.62 |
| 09-Aug-2015 |
mlelstv | Refactor disk address calculation from physical block numbers in the journal into a function. Make that function work correctly with sector sizes != DEV_BSIZE when compiled outside the kernel (i.e. fsck_ffs). Fixes PR bin/45933
|
1.61 |
| 18-Oct-2014 |
snj | branches: 1.61.2; src is too big these days to tolerate superfluous apostrophes. It's "its", people!
|
1.60 |
| 05-Sep-2014 |
matt | Don't nest structure and enum definitions. Don't use C++ keywords new, try, class, private, etc.
|
1.59 |
| 25-Feb-2014 |
pooka | branches: 1.59.4; Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before the sysctl link sets are processed, and remove redundancy.
Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate lines of code.
|
1.58 |
| 15-Sep-2013 |
martin | Remove unused variable
|
1.57 |
| 15-Sep-2013 |
joerg | Provide a prototype for wapbl_space_free under _KERNEL.
|
1.56 |
| 14-Sep-2013 |
joerg | wapbl_advance and friends are only used in the kernel
|
1.55 |
| 09-Feb-2013 |
christos | branches: 1.55.2; why didn't gcc find the formatting error?
|
1.54 |
| 08-Dec-2012 |
hannken | Try to coalesce writes to the journal in MAXPHYS sized and aligned blocks. Speeds up wapbl_flush() on raid5 by a factor of 3-4.
Discussed on tech-kern.
Needs pullup to NetBSD-6.
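One way to picture "MAXPHYS sized and aligned" writes is to cut each write at the next boundary, so that all later writes start aligned and can be full sized. The sketch below assumes a 64KB limit (`XMAXPHYS`) and a hypothetical helper `next_chunk`; it is not the code from this revision.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	XMAXPHYS	(64u * 1024u)	/* assumed value, for illustration */

/*
 * Length of the next write starting at byte offset `off` with `resid`
 * bytes left: stop at the next XMAXPHYS boundary so that subsequent
 * writes start aligned.
 */
static size_t
next_chunk(uint64_t off, size_t resid)
{
	size_t room = XMAXPHYS - (size_t)(off % XMAXPHYS);

	assert(resid > 0);
	return resid < room ? resid : room;
}
```

An unaligned start at offset 4096 therefore yields one short 60KB write, after which every chunk is a full, aligned 64KB.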
|
1.53 |
| 17-Nov-2012 |
hannken | wapbl_biodone: Release the buffer before reclaiming the log. wapbl_flush() may wait for the log to become empty and all buffers should be unbusy before it returns.
|
1.52 |
| 29-Apr-2012 |
chs | branches: 1.52.2; mark all wapbl I/O as BPRIO_TIMECRITICAL. this is the second part of addressing PR 46325.
|
1.51 |
| 28-Jan-2012 |
para | branches: 1.51.2; replacing malloc(9) with kmem(9); wapbl_entries get their own pool, they are freed from softint context
ok: rmind@
|
1.50 |
| 27-Jan-2012 |
para | extending vmem(9) to be able to allocate resources for its own needs. simplifying uvm_map handling (no special kernel entries anymore, no relocking). make malloc(9) a thin wrapper around kmem(9) (with private interface for interrupt safety reasons)
releng@ acknowledged
|
1.49 |
| 11-Jan-2012 |
yamt | comments
|
1.48 |
| 02-Dec-2011 |
yamt | branches: 1.48.2;
- move disk cache flushing code into a separate function.
- more verbose output if vfs.wapbl.verbose_commit >= 2. namely, time taken for each DIOCCACHESYNC call.
    wapbl_flush: 1322826000.785245900 this transaction = 546304 bytes
    wapbl_cache_sync: 1: dev 0x0 0.017572724
    wapbl_cache_sync: 2: dev 0x0 0.007199825
    wapbl_flush: 1322826011.860771302 this transaction = 431104 bytes
    wapbl_cache_sync: 1: dev 0x0 0.019469753
    wapbl_cache_sync: 2: dev 0x0 0.009473410
    wapbl_flush: 1322829266.489154342 this transaction = 187904 bytes
    wapbl_cache_sync: 1: dev 0x4 0.022270180
    wapbl_cache_sync: 2: dev 0x4 0.030749402
- fix a comment.
|
1.47 |
| 01-Sep-2011 |
christos | branches: 1.47.2; add a couple of asserts
|
1.46 |
| 14-Aug-2011 |
christos | fix sign-compare warnings
|
1.45 |
| 12-Jun-2011 |
rmind | Welcome to 5.99.53! Merge rmind-uvmplock branch:
- Reorganize locking in UVM and provide extra serialisation for pmap(9). New lock order: [vmpage-owner-lock] -> pmap-lock.
- Simplify locking in some pmap(9) modules by removing P->V locking.
- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).
- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner. Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.
- Unify /dev/mem et al in MI code and provide required locking (removes kernel-lock on some ports). Also, avoid cache-aliasing issues.
Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches formed the core changes of this branch.
|
1.44 |
| 26-May-2011 |
uebayasi | branches: 1.44.2; Catch up with B_* flag name changes in debug code.
|
1.43 |
| 20-Feb-2011 |
nakayama | Fix digit number of nanosecond.
|
1.42 |
| 18-Feb-2011 |
hannken | Adjust previous: set the dealloc soft limit to half hard limit.
|
1.41 |
| 16-Feb-2011 |
hannken | Set the limit for deallocations in one transaction to a more realistic (and much lower) value. When flushing the log these deallocations will produce new blocks and that may exceed the journal size resulting in a "wapbl_flush: current transaction too big to flush" panic. Seen when removing a large snapshot.
Addresses PR #44568 (WAPBL doesn't play nice with snapshots).
|
1.40 |
| 14-Feb-2011 |
bouyer | if DIAGNOSTIC, check the size of the transaction in wapbl_end(). Hopefully this will point us to the place which generated the large transaction, before an asynchronous panic() in wabl_end()
|
1.39 |
| 08-Jan-2011 |
christos | branches: 1.39.2; 1.39.4; Add two sysctls one that does verbose transaction logging and a second one that disables flushing the disk cache (which is fast but dangerous for data integrity). From simon a long while back.
|
1.38 |
| 09-Nov-2010 |
hannken | Wapbl_register_deallocation(): the taken reader lock is not sufficient to protect wl_dealloc* members. Take the mutex here and change the lock requirements of these fields to "writer lock or mutex".
This error led to file system corruption and "freeing free block" panics.
|
1.37 |
| 10-Sep-2010 |
drochner | fix two bugs reported by Ryo Shimizu:
- wrong initialization reported in a followup to PR bin/43336 (looks harmless because it applies to zero-initialized memory, so LIST_INIT() is a no-op)
- wrong loop count in replay misses a hash bucket (PR kern/43827) (this was introduced by a post-netbsd-5 change, so it isn't related to the PR above)
|
1.36 |
| 21-Apr-2010 |
pooka | dumdidumdum, need _KERNEL in previous for fsck. noticed by moof
|
1.35 |
| 21-Apr-2010 |
pooka | Reduce #ifdef spew by attaching wapbl as a module. (no, it's still too ifdef-ridden to be able to actually do anything useful and module-like like load into any kernel)
|
1.34 |
| 27-Feb-2010 |
mlelstv | branches: 1.34.2; Move block number computations to callers of wapbl_read/wapbl_write and conditionally build DEV_BSIZE adjustments for kernel. fsck_ffs shares the same code but accesses physical blocks.
Also compute correct block numbers for each physical sector.
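The kernel-side DEV_BSIZE adjustment mentioned here amounts to rescaling a physical block number by the ratio of the sector size to the 512-byte device block unit. A minimal sketch, with a hypothetical helper name and the assumption that sector sizes are powers of two no smaller than 512 bytes:

```c
#include <assert.h>
#include <stdint.h>

#define	XDEV_BSHIFT	9	/* log2 of a 512-byte DEV_BSIZE unit */

/*
 * Convert a physical block number on a device whose sectors are
 * (1 << sect_shift) bytes into an address in DEV_BSIZE units.
 * Userland code that addresses physical sectors directly would
 * skip this step.
 */
static int64_t
pbn_to_devblk(int64_t pbn, int sect_shift)
{
	assert(pbn >= 0);
	assert(sect_shift >= XDEV_BSHIFT);
	return pbn << (sect_shift - XDEV_BSHIFT);
}
```

For 512-byte sectors the conversion is the identity; for 4096-byte sectors each physical block spans eight DEV_BSIZE units.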
|
1.33 |
| 27-Feb-2010 |
mlelstv | Store physical block numbers in superblock that point to the journal. Calculate position of both commit headers correctly for disks with large sectors. Correct calculation of circular buffer size.
|
1.32 |
| 26-Feb-2010 |
mlelstv | mnt_fs_bshift is the filesystem block size, not the fragment size.
Revert to physical block size. This is fine as long as filesystem and log stay on a similar physical medium.
|
1.31 |
| 23-Feb-2010 |
mlelstv | Use correct offset to block number calculations.
Also change access to filesystem blocks to be done by fragment instead of by physical block. Fragments are the fundamental blocks of the filesystem.
For a theoretical filesystem that accesses the disk in smaller units than stored in mp->mnt_fs_bshift, the assumption might be wrong. But this will also break other subsystems. The value mp->mnt_dev_bshift which formerly represents the physical sector size is currently only virtual in NetBSD (always DEV_BSIZE).
|
1.30 |
| 06-Feb-2010 |
uebayasi | branches: 1.30.2; __inline -> inline
|
1.29 |
| 25-Nov-2009 |
pooka | make WAPBL_DEBUG_PRINT compile
|
1.28 |
| 01-Oct-2009 |
pooka | Add dealloccnt to list of things to be considered in the stetson-harrison decision making algorithm for flushing a wapbl transaction.
|
1.27 |
| 01-Oct-2009 |
pooka | Turn a KASSERT into a panic. I don't want us to be randomly overwriting memory on non-DIAGNOSTIC kernels if resource estimation fails.
|
1.26 |
| 14-Jul-2009 |
apb | Convert free text inside #ifdef to a proper comment. Inspired by PR 41255 from Kurt Lidl.
|
1.25 |
| 05-Apr-2009 |
lukem | branches: 1.25.2; fix sign-compare issues
|
1.24 |
| 15-Mar-2009 |
cegger | ansify function definitions
|
1.23 |
| 22-Feb-2009 |
ad | PR kern/39564 wapbl performance issues with disk cache flushing
PR kern/40361 WAPBL locking panic in -current
PR kern/40470 WAPBL corrupts ext2fs
PR kern/40562 busy loop in ffs_sync when unmounting a file system
PR kern/40525 panic: ffs_valloc: dup alloc
- A fix for an issue that can lead to "ffs_valloc: dup" due to dirty cg buffers being invalidated. Problem discovered and patch by dholland@.
- If the syncer fails to lazily sync a vnode due to lock contention, retry 1 second later instead of 30 seconds later.
- Flush inode atime updates every ~10 seconds (this makes most sense with logging). Presently they didn't hit the disk for read-only files or devices until the file system was unmounted. It would be better to trickle the updates out but that would require more extensive changes.
- Fix issues with file system corruption, busy looping and other nasty problems when logging and non-logging file systems are intermixed, with one being the root file system.
- For logging, do not flush metadata on an inode-at-a-time basis if the sync has been requested by ioflush. Previously, we could try hundreds of log sync operations a second due to inode update activity, causing the syncer to fall behind and metadata updates to be serialized across the entire file system. Instead, burst out metadata and log flushes at a minimum interval of every 10 seconds on an active file system (happens more often if the log becomes full). Note this does not change the operation of fsync() etc.
- With the flush issue fixed, re-enable concurrent metadata updates in vfs_wapbl.c.
|
1.22 |
| 18-Feb-2009 |
yamt | redo rev.1.19 correctly.
|
1.21 |
| 18-Feb-2009 |
yamt | whitespace
|
1.20 |
| 02-Feb-2009 |
yamt | branches: 1.20.2; remove a non-ascii comment.
|
1.19 |
| 02-Feb-2009 |
yamt | back to malloc for now as wapbl_biodone is called by softint.
|
1.18 |
| 31-Jan-2009 |
yamt | - malloc -> kmem_alloc - kill WAPBL_UVM_ALLOC. - kill wapbl_blk_pool to reduce #ifdef.
|
1.17 |
| 03-Jan-2009 |
yamt | remove extra semicolons.
|
1.16 |
| 24-Nov-2008 |
joerg | Move the specification of the on-disk journal format into a separate header.
|
1.15 |
| 20-Nov-2008 |
joerg | Push functionality to deal with existing inode records into a separate function.
|
1.14 |
| 18-Nov-2008 |
joerg | Decouple journal operation from replay header by copying the interesting fields into wapbl_replay as opposed to embedding wapbl_wc_header.
|
1.13 |
| 18-Nov-2008 |
joerg | #if 0 wapbl_replay_verify.
|
1.12 |
| 18-Nov-2008 |
joerg | Check for NULL before calling free as the kernel free doesn't handle it.
|
1.11 |
| 18-Nov-2008 |
joerg | Rename wapbl_replay_prescan to wapbl_replay_process.
|
1.10 |
| 18-Nov-2008 |
joerg | Refactor wapbl_replay_prescan to use a function for each WAPBL record. Merge wapbl_replay_get_inodes into wapbl_replay_prescan. Change the logic to determine the head: It doesn't make sense to update it if the last inode record seen was not the beginning of the journal, as the beginning of the journal might not be 0, so always update inodeshead.
|
1.9 |
| 17-Nov-2008 |
joerg | In wapbl_replay_write just iterate over the hash table and not the transactions. The initial prescan has already sorted out what blocks are in the journal and removed any revoked blocks, so the hash table is authoritative.
|
1.8 |
| 17-Nov-2008 |
joerg | Remove debug printf.
|
1.7 |
| 17-Nov-2008 |
joerg | Ensure that block records are correctly padded.
|
1.6 |
| 11-Nov-2008 |
joerg | Move WAPBL replay handling from bread() into ufs_strategy. This changes the order of hook processing as the copy-on-write handlers are called after the journal processing. This makes more sense as the journal overwrite is logically part of the disk IO.
|
1.5 |
| 10-Nov-2008 |
joerg | Define wapbl_flush_fn_t only for the kernel.
|
1.4 |
| 10-Nov-2008 |
joerg | Reduce internals of WAPBL exposed to the rest of the system.
|
1.3 |
| 11-Aug-2008 |
yamt | branches: 1.3.2; 1.3.4; 1.3.6; 1.3.8; fix a comment.
|
1.2 |
| 31-Jul-2008 |
simonb | Merge the simonb-wapbl branch. From the original branch commit:
Add Wasabi System's WAPBL (Write Ahead Physical Block Logging) journaling code. Originally written by Darrin B. Jewell while at Wasabi and updated to -current by Antti Kantee, Andy Doran, Greg Oster and Simon Burge.
OK'd by core@, releng@.
|
1.1 |
| 10-Jun-2008 |
simonb | branches: 1.1.2; 1.1.4; file vfs_wapbl.c was initially added on branch simonb-wapbl.
|
1.1.4.2 |
| 13-Dec-2008 |
haad | Update haad-dm branch to haad-dm-base2.
|
1.1.4.1 |
| 19-Oct-2008 |
haad | Sync with HEAD.
|
1.1.2.11 |
| 28-Jul-2008 |
oster | Turn on WAPBL_DEBUG_SERIALIZE in order to use RW_WRITER locks instead of RW_READER locks in wapbl_begin(). Include the following comment as well:
XXX: The original code calls for the use of a RW_READER lock here, but it turns out there are performance issues with high metadata-rate workloads (e.g. multiple simultaneous tar extractions). For now, we force the lock to be RW_WRITER, since that currently has the best performance characteristics (even for a single tar-file extraction).
Approved by: simonb
|
1.1.2.10 |
| 25-Jul-2008 |
simonb | Remove an XXX comment that doesn't apply.
|
1.1.2.9 |
| 01-Jul-2008 |
matt | #include <sys/atomic.h> to make rump happy.
|
1.1.2.8 |
| 30-Jun-2008 |
oster | Protect v_numoutput with v_interlock.
Approved by: simonb
|
1.1.2.7 |
| 19-Jun-2008 |
simonb | Fix reference counting for pool initialisation - atomic_inc_uint_nv() will return 1 (not 0!) for the first time a value is incremented.
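The pitfall is easy to reproduce with C11 atomics, used here as a userland stand-in for the kernel primitive. `inc_uint_nv`, `pool_users` and `pool_ref` are hypothetical names for this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user count guarding one-time pool setup. */
static atomic_uint pool_users;

/*
 * Stand-in for an "increment and return the new value" primitive:
 * atomic_fetch_add returns the OLD value, so add 1 to get the new one.
 */
static unsigned
inc_uint_nv(atomic_uint *p)
{
	unsigned nv = atomic_fetch_add(p, 1) + 1;

	assert(nv != 0);	/* the counter must not wrap */
	return nv;
}

/* Returns true only for the first user, who must initialise the pool. */
static bool
pool_ref(void)
{
	/* Compare against 1, not 0: the first increment yields 1. */
	return inc_uint_nv(&pool_users) == 1;
}
```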
|
1.1.2.6 |
| 18-Jun-2008 |
simonb | In wapbl_stop(), destroy the condvars/mutexes/locks created in wapbl_start(). Fixes a LOCKDEBUG panic.
|
1.1.2.5 |
| 18-Jun-2008 |
rmind | - Remove wapbl_global_mtx, use atomic-ops for reference counting; - Move few pool_put() calls out of the locked code area;
OK by <simonb>.
|
1.1.2.4 |
| 12-Jun-2008 |
martin | License police
|
1.1.2.3 |
| 11-Jun-2008 |
simonb | Fix some whitespace and long line niggles.
|
1.1.2.2 |
| 11-Jun-2008 |
simonb | Fix a couple of typos. From wizd.
|
1.1.2.1 |
| 10-Jun-2008 |
simonb | Initial commit of Wasabi System's WAPBL (Write Ahead Physical Block Logging) journaling code. Originally written by Darrin B. Jewell while at Wasabi and updated to -current by Antti Kantee, Andy Doran, Greg Oster and Simon Burge.
Still a number of issues - look in doc/BRANCHES for "simonb-wapbl" for more info.
|
1.3.8.6 |
| 18-Jun-2011 |
bouyer | Pull up following revision(s) (requested by hannken in ticket #1627):
sys/kern/vfs_wapbl.c: revisions 1.41-1.42
sbin/dump/snapshot.c: revisions 1.6 (patch)
share/man/man4/fss.4: revisions 1.15 (patch)
sys/dev/fss.c: revisions 1.73 (patch)
sys/dev/fssvar.h: revisions 1.25
usr.sbin/fssconfig/fssconfig.c: revisions 1.7
sys/ufs/ffs/ffs_balloc.c: revisions 1.54
sys/ufs/ffs/ffs_snapshot.c: revisions 1.90, 1.98, 1.100-1.101, 1.103-1.110, 1.111, 1.112-1.115 (patch)
- Try to keep snapshot indirect blocks contiguous. This speeds up snapshot creation by a factor of ~3 and reduces the file system suspension time by a factor of ~5.
- Refine the scope of WAPBL transactions and the limit for deallocations in one transaction so we should no longer get a "wapbl_flush: current transaction too big to flush" panic when creating or removing snapshots on larger logging disks.
- fss(4): Allow FSSIOCSET to set the initial flags. Add a new flag "FSS_UNLINK_ON_CREATE" to unlink the backing store before the snapshot gets created. With this change dump(8) no longer dumps the zero-sized, but named snapshot it is working on.
|
1.3.8.5 |
| 07-Mar-2011 |
riz | Pull up following revision(s) (requested by bouyer in ticket #1543):
sys/kern/vfs_wapbl.c: revision 1.27
sys/kern/vfs_wapbl.c: revision 1.28
Turn a KASSERT into a panic. I don't want us to be randomly overwriting memory on non-DIAGNOSTIC kernels if resource estimation fails.
Add dealloccnt to list of things to be considered in the stetson-harrison decision making algorithm for flushing a wapbl transaction.
|
1.3.8.4 |
| 16-Feb-2011 |
bouyer | Pull up following revision(s) (requested by tron in ticket #1535):
sys/kern/vfs_wapbl.c: revision 1.39 via patch
Add two sysctls one that does verbose transaction logging and a second one that disables flushing the disk cache (which is fast but dangerous for data integrity). From simon a long while back.
|
1.3.8.3 |
| 22-Nov-2010 |
riz | Pull up following revision(s) (requested by hannken in ticket #1477):
sys/kern/vfs_wapbl.c: revision 1.38
Wapbl_register_deallocation(): the taken reader lock is not sufficient to protect wl_dealloc* members. Take the mutex here and change the lock requirements of these fields to "writer lock or mutex".
This error led to file system corruption and "freeing free block" panics.
|
1.3.8.2 |
| 13-Sep-2010 |
snj | branches: 1.3.8.2.2; Apply patch (requested by drochner in ticket #1454): Fix inconsistencies in the wapbl replay process which can lead to a premature abort of the fsck run and possibly leave a corrupted filesystem. Addresses PR bin/43336.
|
1.3.8.1 |
| 24-Feb-2009 |
snj | branches: 1.3.8.1.2; 1.3.8.1.4; Pull up following revision(s) (requested by ad in ticket #490):
sys/kern/vfs_wapbl.c: revision 1.23
sys/miscfs/syncfs/sync_subr.c: revision 1.36
sys/miscfs/syncfs/sync_vnops.c: revision 1.26
sys/ufs/ffs/ffs_alloc.c: revision 1.121
sys/ufs/ffs/ffs_vfsops.c: revision 1.242
sys/ufs/ffs/ffs_vnops.c: revision 1.110
PR kern/39564 wapbl performance issues with disk cache flushing
PR kern/40361 WAPBL locking panic in -current
PR kern/40470 WAPBL corrupts ext2fs
PR kern/40562 busy loop in ffs_sync when unmounting a file system
PR kern/40525 panic: ffs_valloc: dup alloc
- A fix for an issue that can lead to "ffs_valloc: dup" due to dirty cg buffers being invalidated. Problem discovered and patch by dholland@.
- If the syncer fails to lazily sync a vnode due to lock contention, retry 1 second later instead of 30 seconds later.
- Flush inode atime updates every ~10 seconds (this makes most sense with logging). Presently they didn't hit the disk for read-only files or devices until the file system was unmounted. It would be better to trickle the updates out but that would require more extensive changes.
- Fix issues with file system corruption, busy looping and other nasty problems when logging and non-logging file systems are intermixed, with one being the root file system.
- For logging, do not flush metadata on an inode-at-a-time basis if the sync has been requested by ioflush. Previously, we could try hundreds of log sync operations a second due to inode update activity, causing the syncer to fall behind and metadata updates to be serialized across the entire file system. Instead, burst out metadata and log flushes at a minimum interval of every 10 seconds on an active file system (happens more often if the log becomes full). Note this does not change the operation of fsync() etc.
- With the flush issue fixed, re-enable concurrent metadata updates in vfs_wapbl.c.
|
1.3.8.2.2.2 |
| 07-Mar-2011 |
riz | Pull up following revision(s) (requested by bouyer in ticket #1543):
sys/kern/vfs_wapbl.c: revision 1.27
sys/kern/vfs_wapbl.c: revision 1.28
Turn a KASSERT into a panic. I don't want us to be randomly overwriting memory on non-DIAGNOSTIC kernels if resource estimation fails.
Add dealloccnt to list of things to be considered in the stetson-harrison decision making algorithm for flushing a wapbl transaction.
|
1.3.8.2.2.1 |
| 22-Nov-2010 |
riz | Pull up following revision(s) (requested by hannken in ticket #1477):
sys/kern/vfs_wapbl.c: revision 1.38
Wapbl_register_deallocation(): the taken reader lock is not sufficient to protect wl_dealloc* members. Take the mutex here and change the lock requirements of these fields to "writer lock or mutex".
This error led to file system corruption and "freeing free block" panics.
|
1.3.8.1.4.1 |
| 20-May-2011 |
matt | bring matt-nb5-mips64 up to date with netbsd-5-1-RELEASE (except compat).
|
1.3.8.1.2.1 |
| 22-Nov-2010 |
riz | Pull up following revision(s) (requested by hannken in ticket #1477):
sys/kern/vfs_wapbl.c: revision 1.38
wapbl_register_deallocation(): the taken reader lock is not sufficient to protect wl_dealloc* members. Take the mutex here and change the lock requirements of these fields to "writer lock or mutex". This error led to file system corruption and "freeing free block" panics.
|
1.3.6.3 |
| 28-Apr-2009 |
skrll | Sync with HEAD.
|
1.3.6.2 |
| 03-Mar-2009 |
skrll | Sync with HEAD.
|
1.3.6.1 |
| 19-Jan-2009 |
skrll | Sync with HEAD.
|
1.3.4.3 |
| 17-Jan-2009 |
mjf | Sync with HEAD.
|
1.3.4.2 |
| 28-Sep-2008 |
mjf | Sync with HEAD.
|
1.3.4.1 |
| 11-Aug-2008 |
mjf | file vfs_wapbl.c was added on branch mjf-devfs2 on 2008-09-28 10:40:54 +0000
|
1.3.2.2 |
| 18-Sep-2008 |
wrstuden | Sync with wrstuden-revivesa-base-2.
|
1.3.2.1 |
| 11-Aug-2008 |
wrstuden | file vfs_wapbl.c was added on branch wrstuden-revivesa on 2008-09-18 04:31:45 +0000
|
1.20.2.2 |
| 23-Jul-2009 |
jym | Sync with HEAD.
|
1.20.2.1 |
| 13-May-2009 |
jym | Sync with HEAD.
Commit is split, to avoid a "too many arguments" protocol error.
|
1.25.2.6 |
| 09-Oct-2010 |
yamt | sync with head
|
1.25.2.5 |
| 11-Aug-2010 |
yamt | sync with head.
|
1.25.2.4 |
| 11-Mar-2010 |
yamt | sync with head
|
1.25.2.3 |
| 18-Jul-2009 |
yamt | sync with head.
|
1.25.2.2 |
| 04-May-2009 |
yamt | sync with head.
|
1.25.2.1 |
| 05-Apr-2009 |
yamt | file vfs_wapbl.c was added on branch yamt-nfs-mp on 2009-05-04 08:13:49 +0000
|
1.30.2.2 |
| 22-Oct-2010 |
uebayasi | Sync with HEAD (-D20101022).
|
1.30.2.1 |
| 30-Apr-2010 |
uebayasi | Sync with HEAD.
|
1.34.2.4 |
| 31-May-2011 |
rmind | sync with head
|
1.34.2.3 |
| 05-Mar-2011 |
rmind | sync with head
|
1.34.2.2 |
| 30-May-2010 |
rmind | sync with head
|
1.34.2.1 |
| 16-Mar-2010 |
rmind | Change struct uvm_object::vmobjlock to be dynamically allocated with mutex_obj_alloc(). It allows us to share the locks among UVM objects.
|
1.39.4.2 |
| 05-Mar-2011 |
bouyer | Sync with HEAD
|
1.39.4.1 |
| 17-Feb-2011 |
bouyer | Sync with HEAD
|
1.39.2.1 |
| 06-Jun-2011 |
jruoho | Sync with HEAD.
|
1.44.2.1 |
| 23-Jun-2011 |
cherry | Catchup with rmind-uvmplock merge.
|
1.47.2.4 |
| 22-May-2014 |
yamt | sync with head.
for a reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
|
1.47.2.3 |
| 16-Jan-2013 |
yamt | sync with (a bit old) head
|
1.47.2.2 |
| 23-May-2012 |
yamt | sync with head.
|
1.47.2.1 |
| 17-Apr-2012 |
yamt | sync with head
|
1.48.2.2 |
| 02-Jun-2012 |
mrg | sync to latest -current.
|
1.48.2.1 |
| 18-Feb-2012 |
mrg | merge to -current.
|
1.51.2.2 |
| 02-Jan-2013 |
riz | Pull up following revision(s) (requested by hannken in ticket #758):
sys/kern/vfs_wapbl.c: revision 1.53
sys/kern/vfs_wapbl.c: revision 1.54
wapbl_biodone: Release the buffer before reclaiming the log. wapbl_flush() may wait for the log to become empty and all buffers should be unbusy before it returns.
Try to coalesce writes to the journal in MAXPHYS sized and aligned blocks. Speeds up wapbl_flush() on raid5 by a factor of 3-4.
Discussed on tech-kern.
Needs pullup to NetBSD-6.
|
1.51.2.1 |
| 07-May-2012 |
riz | Pull up following revision(s) (requested by chs in ticket #204):
sys/fs/sysvbfs/sysvbfs_vnops.c: revision 1.44
sys/ufs/ffs/ffs_vfsops.c: revision 1.277
sys/fs/v7fs/v7fs_vnops.c: revision 1.11
sys/ufs/chfs/chfs_vnops.c: revision 1.7
sys/ufs/ext2fs/ext2fs_readwrite.c: revision 1.61
sys/miscfs/genfs/genfs_io.c: revision 1.54
sys/kern/vfs_wapbl.c: revision 1.52
sys/uvm/uvm_pager.h: revision 1.43
sys/ufs/ffs/ffs_vnops.c: revision 1.121
sys/kern/vfs_subr.c: revision 1.434
sys/fs/msdosfs/msdosfs_vnops.c: revision 1.83
sys/fs/ntfs/ntfs_vnops.c: revision 1.51
sys/fs/udf/udf_subr.c: revision 1.119
sys/miscfs/specfs/spec_vnops.c: revision 1.135
sys/ufs/ext2fs/ext2fs_vnops.c: revision 1.103
sys/fs/udf/udf_vnops.c: revision 1.71
sys/ufs/ufs/ufs_readwrite.c: revision 1.104
change vflushbuf() to take the full FSYNC_* flags. translate FSYNC_LAZY into PGO_LAZY for VOP_PUTPAGES() so that genfs_do_io() can set the appropriate io priority for the I/O. this is the first part of addressing PR 46325.
mark all wapbl I/O as BPRIO_TIMECRITICAL. this is the second part of addressing PR 46325.
|
1.52.2.5 |
| 03-Dec-2017 |
jdolecek | update from HEAD
|
1.52.2.4 |
| 20-Aug-2014 |
tls | Rebase to HEAD as of a few days ago.
|
1.52.2.3 |
| 25-Feb-2013 |
tls | resync with head
|
1.52.2.2 |
| 20-Nov-2012 |
tls | Resync to 2012-11-19 00:00:00 UTC
|
1.52.2.1 |
| 12-Sep-2012 |
tls | Initial snapshot of work to eliminate 64K MAXPHYS. Basically works for physio (I/O to raw devices); needs more doing to get it going with the filesystems, but it shouldn't damage data.
All work's been done on amd64 so far. Not hard to add support to other ports. If others want to pitch in, one very helpful thing would be to sort out when and how IDE disks can do 128K or larger transfers, and adjust the various PCI IDE (or at least ahcisata) drivers and wd.c accordingly -- it would make testing much easier. Another very helpful thing would be to implement a smart minphys() for RAIDframe along the lines detailed in the MAXPHYS-NOTES file.
|
1.55.2.1 |
| 18-May-2014 |
rmind | sync with head
|
1.59.4.1 |
| 09-Aug-2015 |
martin | Pull up following revision(s) (requested by mlelstv in ticket #943):
sys/kern/vfs_wapbl.c: revision 1.62
Refactor disk address calculation from physical block numbers in the journal into a function. Make that function work correctly with sector sizes != DEV_BSIZE when compiled outside the kernel (i.e. fsck_ffs).
Fixes PR bin/45933
|
1.61.2.6 |
| 28-Aug-2017 |
skrll | Sync with HEAD
|
1.61.2.5 |
| 05-Dec-2016 |
skrll | Sync with HEAD
|
1.61.2.4 |
| 05-Oct-2016 |
skrll | Sync with HEAD
|
1.61.2.3 |
| 29-May-2016 |
skrll | Sync with HEAD
|
1.61.2.2 |
| 27-Dec-2015 |
skrll | Sync with HEAD (as of 26th Dec)
|
1.61.2.1 |
| 22-Sep-2015 |
skrll | Sync with HEAD
|
1.78.2.4 |
| 26-Apr-2017 |
pgoyette | Sync with HEAD
|
1.78.2.3 |
| 20-Mar-2017 |
pgoyette | Sync with HEAD
|
1.78.2.2 |
| 07-Jan-2017 |
pgoyette | Sync with HEAD. (Note that most of these changes are simply $NetBSD$ tag issues.)
|
1.78.2.1 |
| 04-Nov-2016 |
pgoyette | Sync with HEAD
|
1.86.2.1 |
| 21-Apr-2017 |
bouyer | Sync with HEAD
|
1.101.4.3 |
| 21-Apr-2020 |
martin | Sync with HEAD
|
1.101.4.2 |
| 08-Apr-2020 |
martin | Merge changes from current as of 20200406
|
1.101.4.1 |
| 10-Jun-2019 |
christos | Sync with HEAD
|
1.101.2.1 |
| 26-Dec-2018 |
pgoyette | Sync with HEAD, resolve a few conflicts
|
1.106.2.1 |
| 20-Apr-2020 |
bouyer | Sync with HEAD
|