History log of /src/sys/ufs/lfs/ulfs_inode.c
Revision  Date  Author  Comments
 1.26  05-Sep-2020  riastradh Round of uvm.h cleanup.

The poorly named uvm.h is generally supposed to be for uvm-internal
users only.

- Narrow it to files that actually need it -- mostly files that need
to query whether curlwp is the pagedaemon, which should maybe be
exposed by an external header.

- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
by it. We should split up uvm_extern.h but this will serve for now
to reduce the uvm.h dependencies.

- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
UVMHIST(ubchist), since ubchist is declared in uvm.h but the
reference evaporates if UVMHIST is not defined, so we reduce header
file dependencies.

- Make uvm_device.h and uvm_swap.h independently includable while
here.

ok chs@
 1.25  23-Feb-2020  ad UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
 1.24  15-Jan-2020  ad Merge from yamt-pagecache (after much testing):

- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.
 1.23  31-Dec-2019  ad branches: 1.23.2;
- Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.
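
The batching idea described in 1.23 can be illustrated with a small user-space sketch. This is not the UVM code: the names struct page, struct cpu_queue, page_set_intent() and queue_flush() are hypothetical, there is a single queue rather than one per CPU, and no locks are taken. Only the 128-entry queue length and the "record an intent now, apply it later in a batch" pattern come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_SIZE 128	/* queue length taken from the commit message */

/* Hypothetical intended states; the real code tracks more than these. */
enum page_intent { INTENT_NONE, INTENT_ACTIVATE, INTENT_DEACTIVATE };

struct page {
	enum page_intent intent;	/* set while holding the page interlock */
	int active;			/* "real" state, changed only at flush time */
};

struct cpu_queue {
	struct page *pages[BATCH_SIZE];
	size_t count;
	unsigned flushes;		/* number of batched updates performed */
};

/*
 * Apply every queued intent in one pass. In a kernel, this is the one
 * place where the shared replacement state would need to be locked.
 */
static void
queue_flush(struct cpu_queue *q)
{
	if (q->count == 0)
		return;
	for (size_t i = 0; i < q->count; i++) {
		struct page *pg = q->pages[i];

		if (pg->intent == INTENT_ACTIVATE)
			pg->active = 1;
		else if (pg->intent == INTENT_DEACTIVATE)
			pg->active = 0;
		pg->intent = INTENT_NONE;
	}
	q->count = 0;
	q->flushes++;
}

/* Record the intended state on the page, then queue it for a later batch. */
static void
page_set_intent(struct cpu_queue *q, struct page *pg, enum page_intent intent)
{
	assert(intent != INTENT_NONE);
	pg->intent = intent;		/* cheap, purely per-page update */
	if (q->count == BATCH_SIZE)
		queue_flush(q);		/* queue full: make the changes real */
	q->pages[q->count++] = pg;
}
```

In this sketch, queueing 129 pages touches the shared state once (when the queue fills) instead of 128 times, which is the kind of saving the commit message attributes to batching.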
 1.22  13-Dec-2019  ad Break the global uvm_pageqlock into a per-page identity lock and a private
lock for use of the pagedaemon policy code. Discussed on tech-kern.

PR kern/54209: NetBSD 8 large memory performance extremely low
PR kern/54210: NetBSD-8 processes presumably not exiting
PR kern/54727: writing a large file causes unreasonable system behaviour
 1.21  28-Oct-2017  pgoyette branches: 1.21.4;
Update the kernhist(9) kernel history code to address issues identified
in PR kern/52639, as well as some general cleaning-up...

(As proposed on tech-kern@ with additional changes and enhancements.)

Details of changes:

* All history arguments are now stored as uintmax_t values[1], both in
the kernel and in the structures used for exporting the history data
to userland via sysctl(9). This avoids problems on some architectures
where passing a 64-bit (or larger) value to printf(3) can cause it to
process the value as multiple arguments. (This can be particularly
problematic when printf()'s format string is not a literal, since in
that case the compiler cannot know how large each argument should be.)

* Update the data structures used for exporting kernel history data to
include a version number as well as the length of history arguments.

* All [2] existing users of kernhist(9) have had their format strings
updated. Each format specifier now includes an explicit length
modifier 'j' to refer to numeric values of the size of uintmax_t.

* All [2] existing users of kernhist(9) have had their format strings
updated to replace uses of "%p" with "%#jx", and the pointer
arguments are now cast to (uintptr_t) before being subsequently cast
to (uintmax_t). This is needed to avoid compiler warnings about
casting "pointer to integer of a different size."

* All [2] existing users of kernhist(9) have had instances of "%s" or
"%c" format strings replaced with numeric formats; several instances
of mis-match between format string and argument list have been fixed.

* vmstat(1) has been modified to handle the new size of arguments in the
history data as exported by sysctl(9).

* vmstat(1) now provides a warning message if the history requested with
the -u option does not exist (previously, this condition was silently
ignored, with only a single blank line being printed).

* vmstat(1) now checks the version and argument length included in the
data exported via sysctl(9) and exits if they do not match the values
with which vmstat was built.

* The kernhist(9) man-page has been updated to note the additional
requirements imposed on the format strings, along with several other
minor changes and enhancements.

[1] It would have been possible to use an explicit length (for example,
uint64_t) for the history arguments. But that would require another
"rototill" of all the users in the future when we add support for an
architecture that supports a larger size. Also, the printf(3) format
specifiers for explicitly-sized values, such as "%"PRIu64, are much
more verbose (and less aesthetically appealing, IMHO) than simply
using "%ju".

[2] I've tried very hard to find "all [the] existing users of kernhist(9)"
but it is possible that I've missed some of them. I would be glad to
update any stragglers that anyone identifies.
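
The format-string rules from 1.21 can be shown with ordinary snprintf(3) in user space. This is only a sketch of the convention: format_hist_entry() is a hypothetical helper, not part of kernhist(9), and the real history macros are not used here. What it takes from the commit message is the 'j' length modifier, the "%#jx" replacement for "%p", and the double cast of pointers through uintptr_t to uintmax_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format one pointer and one counter with every
 * argument widened to uintmax_t, as the commit message requires.
 */
static int
format_hist_entry(char *buf, size_t len, const void *ptr, uint32_t count)
{
	assert(buf != NULL && len > 0);

	/*
	 * Pointer: cast to uintptr_t first, then to uintmax_t, and print
	 * with "%#jx" instead of "%p".
	 * Integer: widen to uintmax_t and print with "%ju".
	 */
	return snprintf(buf, len, "ptr=%#jx count=%ju",
	    (uintmax_t)(uintptr_t)ptr, (uintmax_t)count);
}
```

Because every argument has the same width, the consumer of the formatted data never has to guess how large each argument is, even when the format string is not a literal.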
 1.20  10-Jun-2017  maya Rename i_flag to i_state.

The similarity to i_flags has previously caused errors.
 1.19  26-May-2017  riastradh branches: 1.19.2;
Eliminate crusty debugging sludge.

We have a mostly sane vnode lifecycle now. If this needs debugging,
it should be done once at the call site of VOP_RECLAIM.
 1.18  11-Apr-2017  riastradh Make VOP_INACTIVE preserve vnode lock on return.

Discussed on tech-kern:
https://mail-index.netbsd.org/tech-kern/2017/04/01/msg021751.html

Ride 7.99.68, a bumpy bus of incremental vfs improvements!
 1.17  30-Mar-2017  hannken Remove now redundant calls to fstrans_start()/fstrans_done().

Add fstrans_start()/fstrans_done() to lfs_putpages().
 1.16  20-Aug-2016  hannken branches: 1.16.2;
Remove now obsolete operation vcache_remove().

Welcome to 7.99.36
 1.15  20-Jun-2016  dholland branches: 1.15.2;
One more batch of already-synced ufs changes:

ufs_extern.h 1.79 is equivalent to ulfs_extern.h 1.14
ufsmount.h 1.43 is (roughly) equivalent to lfs_extern.h 1.102
ufs_inode.c 1.94 does not apply to lfs
ufs_inode.c 1.95 does not apply to lfs either
ufs_readwrite.c 1.108 is equivalent to ulfs_readwrite.c 1.8
ufs_readwrite.c 1.109 is equivalent to ulfs_readwrite.c 1.9
ufs_readwrite.c 1.110 is equivalent to ulfs_readwrite.c 1.10
ufs_readwrite.c 1.111 does not apply to lfs
ufs_readwrite.c 1.112 is equivalent to ulfs_readwrite.c 1.11
ufs_readwrite.c 1.113 is equivalent to ulfs_readwrite.c 1.13
ufs_readwrite.c 1.114 is equivalent to ulfs_readwrite.c 1.14
ufs_readwrite.c 1.115 is equivalent to ulfs_readwrite.c 1.15
ufs_readwrite.c 1.116-1.118 does not apply to lfs
ufs_readwrite.c 1.119-1.120 are equivalent to ulfs_readwrite.c 1.16
ufs_rename.c 1.12 is equivalent to lfs_rename.c 1.8
ufs_vnops.c 1.226 is equivalent to ulfs_vnops.c 1.22 and lfs_vnops.c 1.270
ufs_vnops.c 1.227 is equivalent to ulfs_vnops.c 1.23
ufs_vnops.c 1.228-1.229 are equivalent to ulfs_vnops.c 1.24
ufs_vnops.c 1.230 is equivalent to ulfs_vnops.c 1.25 and lfs_vnops.c 1.271
ufs_vnops.c 1.231 originated in lfs
ufs_vnops.c 1.232 does not apply to lfs
 1.14  20-Jun-2016  dholland Merge ufs_inode.c 1.93: missing unlock on error path.
 1.13  20-Jun-2016  dholland Note more already-merged versions:

inode.h 1.68 is subsumed by ulfs_inode.h 1.19
inode.h 1.69-1.72 do not apply to lfs
ufs_extern.h 1.74 was covered when lfs was moved to the new vnode cache
ufs_extern.h 1.75 is equivalent to ulfs_extern.h 1.13
ufs_extern.h 1.76-1.77 do not apply to lfs
ufsmount.h 1.42 does not apply to lfs
ufs_inode.c 1.90 is subsumed by ulfs_inode.c 1.10
ufs_inode.c 1.91-1.92 do not apply to lfs
ufs_lookup.c 1.130 is subsumed by ulfs_lookup.c 1.24
ufs_lookup.c 1.131 is equivalent to ulfs_lookup.c 1.20
ufs_lookup.c 1.132 is equivalent to ulfs_lookup.c 1.21
ufs_lookup.c 1.133 is equivalent to ulfs_lookup.c 1.22
ufs_lookup.c 1.134 is equivalent to ulfs_lookup.c 1.23
ufs_lookup.c 1.135 is equivalent to ulfs_lookup.c 1.25
ufs_quota2.c 1.38 is equivalent to ulfs_quota2.c 1.17
ufs_quota2.c 1.39 is equivalent to ulfs_quota2.c 1.16
ufs_quota2.c 1.40 is equivalent to ulfs_quota2.c 1.18
ufs_vfsops.c 1.53 is subsumed by lfs_vfsops.c 1.324
ufs_vfsops.c 1.54 is subsumed by lfs_vfsops.c 1.324
ufs_vnops.c 1.223-1.224 do not apply to lfs
 1.12  14-Nov-2015  pgoyette Remove historic references to wapbl.
 1.11  01-Sep-2015  dholland Add new accessors for the d_type and d_namlen fields of struct lfs_direct.
Napalm the old byteswap access logic for these.
 1.10  31-May-2015  hannken Change lfs from hash table to vcache.

- Change lfs_valloc() to return an inode number and version instead of
a vnode and move lfs_ialloc() and lfs_vcreate() to new lfs_init_vnode().

- Add lfs_valloc_fixed() to allocate a known inode, used by kernel
roll forward.

- Remove lfs_*ref(), these functions cannot coexist with vcache and
their commented behaviour is far away from their implementation.

- Add the cleaner lwp and blockinfo to struct ulfsmount so lfs_loadvnode()
may use hints from the cleaner.

- Remove vnode locks from ulfs_lookup() like we did with ufs_lookup().
 1.9  28-Jul-2013  dholland branches: 1.9.4; 1.9.6; 1.9.8;
Remove the now-pointless ulfs ops macros.
 1.8  28-Jul-2013  dholland Get rid of the ulfs_ops table as we only have one fs in here now.
 1.7  08-Jun-2013  dholland branches: 1.7.2; 1.7.4;
There is no WAPBL in LFS.
 1.6  08-Jun-2013  dholland mp->mnt_wapbl and mp->mnt_wapbl_replay are always NULL in here.
 1.5  06-Jun-2013  dholland Add lfs_ or ulfs_ in front of extern symbols lacking them, mostly
quota-related (and particularly quota2-related) stuff.
 1.4  06-Jun-2013  dholland Split lfs from ufs step 4:

Massedit all ufs symbols to be "ulfs" instead, to make sure there are
no conflicts with ufs. Confirmed with grep.

(This required changing a few comments that maybe should have been
left alone to say "ulfs", but we'll survive that.)
 1.3  06-Jun-2013  dholland Split lfs from ufs step 3: rearrange config stuff.
Add new options:
LFS_EI
LFS_DIRHASH
LFS_EXTATTR
LFS_EXTATTR_AUTOSTART
LFS_QUOTA
LFS_QUOTA2

and update code referring to the corresponding FFS and UFS config
symbols to use the LFS versions. Disable the one extant reference
to APPLE_UFS in the ulfs files. Use opt_lfs.h only, not opt_ffs.h.
 1.2  06-Jun-2013  dholland Split lfs from ufs, part 2:

Change all <ufs/ufs/foo.h> includes to <ufs/lfs/ulfs_foo.h>.
 1.1  06-Jun-2013  dholland Split lfs from ufs, part 1: cut and paste 15000 lines of ufs as "ulfs".

These are verbatim copies except that I've preserved the ufs rcsids
for reference. Also,
ufs/quota.h -> ulfs_quotacommon.h
ufs/ufs_quota.h -> ulfs_quota.h

Splitting lfs from ufs was ok'd by core some years ago. This is not
from my original tree, which became unmergeable after the several sets
of quota changes; I've done the work over again over the last couple
days.
 1.7.4.1  28-Aug-2013  rmind sync with head
 1.7.2.4  03-Dec-2017  jdolecek update from HEAD
 1.7.2.3  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.7.2.2  23-Jun-2013  tls resync from head
 1.7.2.1  08-Jun-2013  tls file ulfs_inode.c was added on branch tls-maxphys on 2013-06-23 06:18:39 +0000
 1.9.8.6  28-Aug-2017  skrll Sync with HEAD
 1.9.8.5  05-Oct-2016  skrll Sync with HEAD
 1.9.8.4  09-Jul-2016  skrll Sync with HEAD
 1.9.8.3  27-Dec-2015  skrll Sync with HEAD (as of 26th Dec)
 1.9.8.2  22-Sep-2015  skrll Sync with HEAD
 1.9.8.1  06-Jun-2015  skrll Sync with HEAD
 1.9.6.1  10-Jul-2016  martin Pull up following revision(s) (requested by dholland in ticket #1188):
sys/ufs/lfs/ulfs_inode.c: revision 1.14
Merge ufs_inode.c 1.93: missing unlock on error path.
 1.9.4.2  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.9.4.1  28-Jul-2013  yamt file ulfs_inode.c was added on branch yamt-pagecache on 2014-05-22 11:41:19 +0000
 1.15.2.1  26-Apr-2017  pgoyette Sync with HEAD
 1.16.2.1  21-Apr-2017  bouyer Sync with HEAD
 1.19.2.2  02-Nov-2017  snj Pull up following revision(s) (requested by pgoyette in ticket #335):
share/man/man9/kernhist.9: 1.5-1.8
sys/arch/acorn26/acorn26/pmap.c: 1.39
sys/arch/arm/arm32/fault.c: 1.105 via patch
sys/arch/arm/arm32/pmap.c: 1.350, 1.359
sys/arch/arm/broadcom/bcm2835_bsc.c: 1.7
sys/arch/arm/omap/if_cpsw.c: 1.20
sys/arch/arm/omap/tiotg.c: 1.7
sys/arch/evbarm/conf/RPI2_INSTALL: 1.3
sys/dev/ic/sl811hs.c: 1.98
sys/dev/usb/ehci.c: 1.256
sys/dev/usb/if_axe.c: 1.83
sys/dev/usb/motg.c: 1.18
sys/dev/usb/ohci.c: 1.274
sys/dev/usb/ucom.c: 1.119
sys/dev/usb/uhci.c: 1.277
sys/dev/usb/uhub.c: 1.137
sys/dev/usb/umass.c: 1.160-1.162
sys/dev/usb/umass_quirks.c: 1.100
sys/dev/usb/umass_scsipi.c: 1.55
sys/dev/usb/usb.c: 1.168
sys/dev/usb/usb_mem.c: 1.70
sys/dev/usb/usb_subr.c: 1.221
sys/dev/usb/usbdi.c: 1.175
sys/dev/usb/usbdi_util.c: 1.67-1.70
sys/dev/usb/usbroothub.c: 1.3
sys/dev/usb/xhci.c: 1.75
sys/external/bsd/drm2/dist/drm/i915/i915_gem.c: 1.34
sys/kern/kern_history.c: 1.15
sys/kern/kern_xxx.c: 1.74
sys/kern/vfs_bio.c: 1.275-1.276
sys/miscfs/genfs/genfs_io.c: 1.71
sys/sys/kernhist.h: 1.21
sys/ufs/ffs/ffs_balloc.c: 1.63
sys/ufs/lfs/lfs_vfsops.c: 1.361
sys/ufs/lfs/ulfs_inode.c: 1.21
sys/ufs/lfs/ulfs_vnops.c: 1.52
sys/ufs/ufs/ufs_inode.c: 1.102
sys/ufs/ufs/ufs_vnops.c: 1.239
sys/uvm/pmap/pmap.c: 1.37-1.39
sys/uvm/pmap/pmap_tlb.c: 1.22
sys/uvm/uvm_amap.c: 1.108
sys/uvm/uvm_anon.c: 1.64
sys/uvm/uvm_aobj.c: 1.126
sys/uvm/uvm_bio.c: 1.91
sys/uvm/uvm_device.c: 1.66
sys/uvm/uvm_fault.c: 1.201
sys/uvm/uvm_km.c: 1.144
sys/uvm/uvm_loan.c: 1.85
sys/uvm/uvm_map.c: 1.353
sys/uvm/uvm_page.c: 1.194
sys/uvm/uvm_pager.c: 1.111
sys/uvm/uvm_pdaemon.c: 1.109
sys/uvm/uvm_swap.c: 1.175
sys/uvm/uvm_vnode.c: 1.103
usr.bin/vmstat/vmstat.c: 1.219
Reorder to test for null before null deref in debug code
--
Reorder to test for null before null deref in debug code
--
KNF
--
No need for '\n' in UVMHIST_LOG
--
normalise a BIOHIST log message
--
Update the kernhist(9) kernel history code to address issues identified
in PR kern/52639, as well as some general cleaning-up...
(As proposed on tech-kern@ with additional changes and enhancements.)
Details of changes:
* All history arguments are now stored as uintmax_t values[1], both in
the kernel and in the structures used for exporting the history data
to userland via sysctl(9). This avoids problems on some architectures
where passing a 64-bit (or larger) value to printf(3) can cause it to
process the value as multiple arguments. (This can be particularly
problematic when printf()'s format string is not a literal, since in
that case the compiler cannot know how large each argument should be.)
* Update the data structures used for exporting kernel history data to
include a version number as well as the length of history arguments.
* All [2] existing users of kernhist(9) have had their format strings
updated. Each format specifier now includes an explicit length
modifier 'j' to refer to numeric values of the size of uintmax_t.
* All [2] existing users of kernhist(9) have had their format strings
updated to replace uses of "%p" with "%#jx", and the pointer
arguments are now cast to (uintptr_t) before being subsequently cast
to (uintmax_t). This is needed to avoid compiler warnings about
casting "pointer to integer of a different size."
* All [2] existing users of kernhist(9) have had instances of "%s" or
"%c" format strings replaced with numeric formats; several instances
of mis-match between format string and argument list have been fixed.
* vmstat(1) has been modified to handle the new size of arguments in the
history data as exported by sysctl(9).
* vmstat(1) now provides a warning message if the history requested with
the -u option does not exist (previously, this condition was silently
ignored, with only a single blank line being printed).
* vmstat(1) now checks the version and argument length included in the
data exported via sysctl(9) and exits if they do not match the values
with which vmstat was built.
* The kernhist(9) man-page has been updated to note the additional
requirements imposed on the format strings, along with several other
minor changes and enhancements.
[1] It would have been possible to use an explicit length (for example,
uint64_t) for the history arguments. But that would require another
"rototill" of all the users in the future when we add support for an
architecture that supports a larger size. Also, the printf(3) format
specifiers for explicitly-sized values, such as "%"PRIu64, are much
more verbose (and less aesthetically appealing, IMHO) than simply
using "%ju".
[2] I've tried very hard to find "all [the] existing users of kernhist(9)"
but it is possible that I've missed some of them. I would be glad to
update any stragglers that anyone identifies.
--
For some reason this single kernel seems to have outgrown its declared
size as a result of the kernhist(9) changes. Bump the size.
XXX The amount of increase may be excessive - anyone with more detailed
XXX knowledge please feel free to further adjust the value
appropriately.
--
Missed one cast of pointer --> uintptr_t in previous kernhist(9) commit
--
And yet another one. :(
--
Use correct mark-up for NetBSD version.
--
More improvements in grammar and readability.
--
Remove a stray '"' (obvious typo) and add a couple of casts that are
probably needed.
--
And replace an instance of "%p" conversion with "%#jx"
--
Whitespace fix. Give Bl tag table a width. Fix Xr.
 1.19.2.1  30-Oct-2017  snj Pull up following revision(s) (requested by maya in ticket #330):
sbin/fsck_lfs/inode.c: 1.69
sbin/fsck_lfs/lfs.c: 1.73
sbin/fsck_lfs/pass6.c: 1.50
sbin/fsck_lfs/segwrite.c: 1.46
sys/ufs/lfs/lfs.h: 1.202-1.203
sys/ufs/lfs/lfs_accessors.h: 1.48
sys/ufs/lfs/lfs_alloc.c: 1.136-1.137
sys/ufs/lfs/lfs_balloc.c: 1.94
sys/ufs/lfs/lfs_bio.c: 1.141
sys/ufs/lfs/lfs_extern.h: 1.113
sys/ufs/lfs/lfs_inode.c: 1.156-1.157
sys/ufs/lfs/lfs_inode.h: 1.20, 1.21, 1.23
sys/ufs/lfs/lfs_itimes.c: 1.20
sys/ufs/lfs/lfs_pages.c: 1.13-1.15
sys/ufs/lfs/lfs_rename.c: 1.22
sys/ufs/lfs/lfs_segment.c: 1.270-1.275
sys/ufs/lfs/lfs_subr.c: 1.94-1.97
sys/ufs/lfs/lfs_syscalls.c: 1.175
sys/ufs/lfs/lfs_vfsops.c: 1.360
sys/ufs/lfs/lfs_vnops.c: 1.316-1.321
sys/ufs/lfs/ulfs_inode.c: 1.20
sys/ufs/lfs/ulfs_inode.h: 1.24
sys/ufs/lfs/ulfs_lookup.c: 1.41
sys/ufs/lfs/ulfs_quota2.c: 1.31
sys/ufs/lfs/ulfs_readwrite.c: 1.24
sys/ufs/lfs/ulfs_vnops.c: 1.49-1.50
Update inode member i_flag --> i_state to keep up with kernel changes
Move definition of IN_ALLMOD near the flag it's a mask for.
Now we can see that it doesn't match all the flags, but changing that will
require more careful thought.
Correct confusion between i_flag and i_flags
These will have to be renamed.
Spotted by Riastradh, thanks!
Add an XXX about the missing flags so it's not buried in a commit
message.
now the XXX count for LFS is 260
Rename i_flag to i_state.
The similarity to i_flags has previously caused errors.
Use continue to denote the no-op loop to match netbsd style
newline for extra clarity.
It isn't safe to drain dirops with seglock held, it'll deadlock if there
are any dirops. drain before grabbing seglock.
lfs_dirops == 0 is always true (as we already drained dirops), so omit
that part of the comparison.
Fixes a lot of LFS deadlocks. PR kern/52301
Many thanks to dholland for help analyzing coredumps
Ifdef out KDASSERT which fires on my machine.
Deduplicate sanity check that seglock is held on segunlock
Revert r1.272 fix to PR kern/52301, the performance hit is making things
unusable.
change lfs_nextsegsleep and lfs_allclean_wakeup to use condvar
XXX had to use lfs_lock in lfs_segwait, removed kernel_lock, is this
appropriate?
fix buffer overflow/KASSERT when cookies are supplied
lfs no longer uses the ffs-style struct direct, use the correct minimum
size
from dholland
XXX more wrong
Consistently use {,UN}MARK_VNODE macros rather than function calls.
Not much point doing anything after a panic call
Ask some questions about the code in an XXX comment
XXX question our double-flushing of dirops
Fix typo in comment
 1.21.4.1  08-Apr-2020  martin Merge changes from current as of 20200406
 1.23.2.2  29-Feb-2020  ad Sync with head.
 1.23.2.1  17-Jan-2020  ad Sync with head.
