History log of /src/sys/kern/kern_descrip.c |
Revision | | Date | Author | Comments |
1.266 |
| 16-Jul-2025 |
kre | Kernel part of O_CLOFORK implementation (plus kernel revbump)
This is Ricardo Branco's implementation of O_CLOFORK (and associated fcntl, etc) for NetBSD (with a few minor changes by me).
For now, the header file symbols that should be exposed to userland are hidden inside temporary #ifdef _KERNEL blocks, just to keep random userland apps, or config scripts, from seeing any of this before it is better tested.
Userland parts of this will follow soon.
This also bumps the kernel version to 10.99.15 (changes to data structs, and the signature of fd_dup()).
|
1.265 |
| 21-Dec-2024 |
riastradh | closef(9): Assert no ERESTART from struct fileops::fo_close.
This cannot possibly work so make sure we flag it early.
Currently the sys_close wrapper will neuter ERESTART by mapping it to EINTR, but let's catch this mistake earlier where we have better diagnostic information available like what the fo_close function is. (Haven't seen the printf fire in the >decade since I added it, so I think this KASSERT is unlikely to fire.)
|
1.264 |
| 10-Nov-2024 |
kre | Make O_CLOEXEC always close specified files on exec
It turns out that close-on-exec doesn't always close on exec.
If all close-on-exec fd's were made close-on-exec via dup3() or fcntl(F_DUPFD_CLOEXEC) or use of the internal fd_clone() (whose uses I did not fully investigate but I think is used to create a fd for the open of a cloner device, and perhaps other things) then none of the close-on-exec file descriptors will be closed when an exec happens - but will be passed through to the new process (still marked, apparently, as close-on-exec - but still won't be closed if another exec happens) - that is unless...
If at least one fd in the process has close-on-exec set some other way (fcntl(F_SETFD), open(O_CLOEXEC), the similar functions for sockets and epoll, and perhaps others), then all close-on-exec file descriptors in the process will be correctly closed when an exec happens (however they obtained the close-on-exec status).
There are two steps that need to be taken (in the kernel) when turning on close on exec - the obvious one of setting the ff_exclose field in the struct fdfile for the fd. And second, marking the file descriptor table (which holds the fdfile's for one or more processes) as containing file descriptors with close-on-exec set (it is a simple yes/no, and once set is never cleared until an actual exec happens). If that flag is set when an exec happens, all the file descriptors are examined, and those marked close-on-exec are closed. If the file descriptor table doesn't indicate that close-on-exec fds exist in the table, none of that happens.
Several places were setting ff_exclose in the struct fdfile but not bothering to set the fd_exclose field in the file descriptor table.
There's even a function (fd_set_exclose()) whose whole purpose is to do this properly - but it wasn't being used.
Now it is, everywhere (I hope).
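The two-step bookkeeping described above can be modelled in a few lines of userland C. This is only a toy sketch, not the kernel code: the toy_ names and the fixed-size table are made up for illustration, and only the field names ff_exclose and fd_exclose are taken from the log message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NFILES 8	/* hypothetical table size */

/* Toy stand-ins for struct fdfile and the descriptor table. */
struct toy_fdfile {
	bool ff_open;		/* slot is in use */
	bool ff_exclose;	/* per-descriptor close-on-exec mark */
};

struct toy_filedesc {
	struct toy_fdfile fd_files[TOY_NFILES];
	bool fd_exclose;	/* table-level hint: something is marked */
};

/* The buggy pattern: mark the descriptor but forget the table. */
void
toy_mark_only(struct toy_filedesc *fdp, int fd)
{
	fdp->fd_files[fd].ff_exclose = true;
}

/* The intended pattern: update both levels together. */
void
toy_set_exclose(struct toy_filedesc *fdp, int fd, bool exclose)
{
	fdp->fd_files[fd].ff_exclose = exclose;
	if (exclose)
		fdp->fd_exclose = true;
}

/* At exec: scan only if the table-level hint is set. Returns number closed. */
int
toy_closeexec(struct toy_filedesc *fdp)
{
	int closed = 0;

	if (!fdp->fd_exclose)
		return 0;	/* nothing is examined at all */
	for (size_t i = 0; i < TOY_NFILES; i++) {
		struct toy_fdfile *ff = &fdp->fd_files[i];

		if (ff->ff_open && ff->ff_exclose) {
			ff->ff_open = false;
			ff->ff_exclose = false;
			closed++;
		}
	}
	fdp->fd_exclose = false;
	return closed;
}
```

With only toy_mark_only() the exec-time scan is skipped and the descriptor survives; going through toy_set_exclose() makes the same descriptor get closed.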
|
1.263 |
| 14-Jul-2024 |
kre | PR kern/58425 -- Disallow INT_MIN as a (negative) pid arg.
Since -INT_MIN is undefined, and the point of negative pid args is to negate them and use the result as a pgrp id instead, we need to avoid accidentally negating INT_MIN.
Since pid_t is just an integral type, of unspecified width, when testing a pid_t value, test for <= INT_MIN (or > INT_MIN sometimes) rather than == INT_MIN. When testing int values, just == INT_MIN is all that is needed; < INT_MIN cannot occur.
XXX pullup -9, -10
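A minimal sketch of the guard described above, for the int case. The helper name toy_pid_to_pgid and the choice of EINVAL as the error are assumptions made for illustration; the log does not say which errno the kernel returns.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/*
 * Hypothetical helper: turn a negative "pid" argument into a process
 * group id.  Returns 0 and stores the id, or an errno value.
 */
int
toy_pid_to_pgid(int pid, int *pgid)
{
	if (pid == INT_MIN)	/* -INT_MIN overflows: undefined behaviour */
		return EINVAL;
	if (pid >= 0)		/* not a process-group request */
		return EINVAL;
	*pgid = -pid;		/* safe: pid is in [-INT_MAX, -1] */
	return 0;
}
```

Every other negative value, including INT_MIN + 1, negates cleanly, so the single equality test is enough when the argument is an int.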
|
1.262 |
| 04-Oct-2023 |
ad | branches: 1.262.6; kauth_cred_hold(): return cred verbatim so that donating a reference to another data structure can be done more elegantly.
|
1.261 |
| 23-Sep-2023 |
ad | Reapply this change with a couple of bugs fixed:
- Do away with separate pool_cache for some kernel objects that have no special requirements and use the general purpose allocator instead. On one of my test systems this makes for a small (~1%) but repeatable reduction in system time during builds presumably because it decreases the kernel's cache / memory bandwidth footprint a little.
- vfs_lockf: cache a pointer to the uidinfo and put mutex in the data segment.
|
1.260 |
| 12-Sep-2023 |
ad | Back out recent change to replace pool_cache with the general allocator. Will return to this when I have time again.
|
1.259 |
| 10-Sep-2023 |
ad | - Do away with separate pool_cache for some kernel objects that have no special requirements and use the general purpose allocator instead. On one of my test systems this makes for a small (~1%) but repeatable reduction in system time during builds presumably because it decreases the kernel's cache / memory bandwidth footprint a little.
- vfs_lockf: cache a pointer to the uidinfo and put mutex in the data segment.
|
1.258 |
| 10-Sep-2023 |
ad | It's easy to exhaust the open file limit on a system with many CPUs due to caching. Allow a bit of leeway to reduce the element of surprise.
|
1.257 |
| 22-Apr-2023 |
riastradh | fcntl(2), flock(2): Assert FHASLOCK is clear if no fo_advlock.
|
1.256 |
| 22-Apr-2023 |
riastradh | file(9): New fo_advlock operation.
This moves the vnode-specific logic from sys_descrip.c into vfs_vnode.c, like we did for fo_seek.
XXX kernel revbump -- struct fileops API and ABI change
|
1.255 |
| 24-Feb-2023 |
riastradh | kern: Eliminate most __HAVE_ATOMIC_AS_MEMBAR conditionals.
I'm leaving in the conditional around the legacy membar_enters (store-before-load, store-before-store) in kern_mutex.c and in kern_lock.c because they may still matter: store-before-load barriers tend to be the most expensive kind, so eliding them is probably worthwhile on x86. (It also may not matter; I just don't care to do measurements right now, and it's a single valid and potentially justifiable use case in the whole tree.)
However, membar_release/acquire can be mere instruction barriers on all TSO platforms including x86, so there's no need to go out of our way with a bad API to conditionalize them. If the procedure call overhead is measurable we could just change them to be macros on x86 that expand into __insn_barrier.
Discussed on tech-kern: https://mail-index.netbsd.org/tech-kern/2023/02/23/msg028729.html
|
1.254 |
| 23-Feb-2023 |
riastradh | kern_descrip.c: Change membar_enter to membar_acquire in fd_getfile.
membar_acquire is cheaper on many CPUs, and unlikely to be costlier on any CPUs, than the legacy membar_enter.
Add a long comment explaining the interaction between fd_getfile and fd_close and why membar_acquire is safe.
XXX pullup-10
|
1.253 |
| 23-Feb-2023 |
riastradh | kern_descrip.c: Use atomic_store_relaxed/release for ff->ff_file.
1. atomic_store_relaxed in fd_close avoids the appearance of race in sanitizers (minor bug).
2. atomic_store_release in fd_affix is necessary because the lock activity was not, in fact, enough to guarantee ordering (real bug on some architectures like aarch64).
The premise appears to have been that the mutex_enter/exit earlier in fd_affix is enough to guarantee that initialization of fp (A) happens before use of fp by a user once fp is published (B):
    fp->f_... = ...;                // A

    /* fd_affix */
    mutex_enter(&fp->f_lock);
    fp->f_count++;
    mutex_exit(&fp->f_lock);
    ...
    ff->ff_file = fp;               // B
But actually mutex_enter/exit allow the following reordering by the CPU:
    mutex_enter(&fp->f_lock);
    ff->ff_file = fp;               // B
    fp->f_count++;
    fp->f_... = ...;                // A
    mutex_exit(&fp->f_lock);
The only constraints they imply are:
1. fp->f_count++ and B cannot precede mutex_enter
2. mutex_exit cannot precede A and fp->f_count++
They imply no constraint on the relative ordering of A, B, and fp->f_count++ amongst each other, however.
This affects any architecture that has a native load-acquire or store-release operation in mutex_enter/exit, like aarch64, instead of explicit load-before-load/store and load/store-before-store barriers.
No need for atomic_store_* in fd_copy or fd_free because we have exclusive access to ff as is.
XXX pullup-9 XXX pullup-10
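The publish-with-release idea from this entry can be sketched in portable C11. This is an illustration under stated assumptions, not the kernel code: the pub_ names are hypothetical, atomic_store_explicit with memory_order_release stands in for the kernel's atomic_store_release, and an acquire load is used on the reader side as a conservative substitute for atomic_load_consume.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct file and struct fdfile. */
struct pub_file {
	int f_flag;
	unsigned f_count;
};

struct pub_slot {
	_Atomic(struct pub_file *) ff_file;
};

/* Publisher: initialize the object (A), then publish it (B). */
void
pub_affix(struct pub_slot *ff, struct pub_file *fp, int flag)
{
	fp->f_flag = flag;	/* A: plain stores to the object */
	fp->f_count++;
	/* B: release store, so A cannot be reordered after it */
	atomic_store_explicit(&ff->ff_file, fp, memory_order_release);
}

/* Reader: the acquire load pairs with the release store above. */
int
pub_getflag(struct pub_slot *ff)
{
	struct pub_file *fp =
	    atomic_load_explicit(&ff->ff_file, memory_order_acquire);

	return fp == NULL ? -1 : fp->f_flag;
}
```

A reader that observes a non-NULL pointer is then guaranteed to see the initialized f_flag, which a plain store at B would not guarantee.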
|
1.252 |
| 23-Feb-2023 |
riastradh | kern_descrip.c: Fix membars around reference count decrement.
In general, the `last one out hit the lights' style of reference counting (as opposed to the `whoever's destroying must wait for pending users to finish' style) requires memory barriers like so:
    ... usage of resources associated with object ...
    membar_release();
    if (atomic_dec_uint_nv(&obj->refcnt) != 0)
        return;
    membar_acquire();
    ... freeing of resources associated with object ...
This way, all usage happens-before all freeing. This fixes several errors:
- fd_close failed to ensure whatever its caller did would happen-before the freeing, in the case where another thread is concurrently trying to close the fd (ff->ff_file == NULL).
Fix: Add membar_release before atomic_dec_uint(&ff->ff_refcnt) in that branch.
- fd_close failed to ensure all loads its caller had issued will have happened-before the freeing, in the case where the fd is still in use by another thread (fdp->fd_refcnt > 1 and ff->ff_refcnt-- > 0).
Fix: Change membar_producer to membar_release before atomic_dec_uint(&ff->ff_refcnt).
- fd_close failed to ensure that any usage of fp by other callers would happen-before any freeing it does.
Fix: Add membar_acquire after atomic_dec_uint_nv(&ff->ff_refcnt).
- fd_free failed to ensure that any usage of fdp by other callers would happen-before any freeing it does.
Fix: Add membar_acquire after atomic_dec_uint_nv(&fdp->fd_refcnt).
While here, change membar_exit -> membar_release. No semantic change, just updating away from the legacy API.
XXX pullup-8 XXX pullup-9 XXX pullup-10
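The `last one out hit the lights' pattern from this entry maps onto C11 atomics roughly as follows. This is a userland sketch with hypothetical names (toy_obj, toy_obj_release); the release-ordered decrement plays the role of membar_release followed by atomic_dec_uint_nv, and the acquire fence plays the role of membar_acquire.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct toy_obj {
	atomic_uint refcnt;
	int payload;
};

/*
 * Drop one reference.  Returns true if the caller dropped the last
 * reference and is now responsible for freeing the object.
 */
bool
toy_obj_release(struct toy_obj *obj)
{
	/* Release: our earlier uses of obj happen-before the decrement. */
	if (atomic_fetch_sub_explicit(&obj->refcnt, 1,
	    memory_order_release) != 1)
		return false;	/* other references remain */

	/* Acquire: every other user's accesses happen-before the free. */
	atomic_thread_fence(memory_order_acquire);
	return true;
}
```

atomic_fetch_sub_explicit returns the value before the decrement, so comparing against 1 is the equivalent of testing the new value against 0.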
|
1.251 |
| 29-Jun-2021 |
dholland | branches: 1.251.10; Add containment for the cloning devices hack in vn_open.
Cloning devices (and also things like /dev/stderr) work by allocating a struct file, stuffing it in the file table (which is a layer violation), stuffing the file descriptor number for it in a magic field of struct lwp (which is gross), and then "failing" with one of two magic errnos, EDUPFD or EMOVEFD.
Before this commit, all callers of vn_open in the kernel (there are quite a few) were expected to check for these errors and handle the situation. Needless to say, none of them except for open() itself did, resulting in internal negative errnos being returned to userspace.
This hack is fairly deeply rooted and cannot be eliminated all at once. This commit adds logic to handle the magic errnos inside vn_open; now on success vn_open returns either a vnode or an integer file descriptor, along with a flag that says whether the underlying code requested EDUPFD or EMOVEFD. Callers not prepared to cope with file descriptors can pass NULL for the extra return values, in which case if a file descriptor would be produced vn_open fails with EOPNOTSUPP.
Since I'm rearranging vn_open's signature anyway, stop exposing struct nameidata. Instead, take three arguments: an optional vnode to use as the starting point (like openat()), the path, and additional namei flags to use, restricted to NOCHROOT and TRYEMULROOT. (Other namei behavior, e.g. NOFOLLOW, can be requested via the open flags.)
This change requires a kernel bump. Ride the one an hour ago. (That was supposed to be coordinated; did not intend to let an hour slip by. My fault.)
|
1.250 |
| 24-Dec-2020 |
nia | branches: 1.250.4; Avoid negating the minimum size of pid_t (this overflows).
Reported-by: syzbot+e2eb02f9dfaf4f2e6626@syzkaller.appspotmail.com
|
1.249 |
| 28-Aug-2020 |
christos | branches: 1.249.2; We already zeroed the struct, no point in zeroing things twice.
|
1.248 |
| 28-Aug-2020 |
riastradh | Just zero out struct file::f_lock when exposed to userland.
Userland has no business examining a snapshot of the lock state, even if pseudonymized. Should fix hppa build, where kmutex_t is somewhat larger than anticipated by recent changes.
|
1.247 |
| 26-Aug-2020 |
christos | Instead of returning 0 when sysctl kern.expose_address=0, return a random hashed value of the data. This allows sockstat to work without exposing kernel addresses or being setgid kmem.
|
1.246 |
| 23-May-2020 |
ad | Move proc_lock into the data segment. It was dynamically allocated because at the time we had mutex_obj_alloc() but not __cacheline_aligned.
|
1.245 |
| 01-Feb-2020 |
riastradh | Load struct fdfile::ff_file with atomic_load_consume.
Exceptions: when we're only testing whether it's there, not about to dereference it.
Note: We do not use atomic_store_release to set it because the preceding mutex_exit should be enough.
(That said, it's not clear the mutex_enter/exit is needed unless refcnt > 0 already, in which case maybe it would be a win to switch from the membar implied by mutex_enter to the membar implied by atomic_store_release -- which I would generally expect to be much cheaper. And a little clearer without a long comment.)
|
1.244 |
| 01-Feb-2020 |
riastradh | Load struct filedesc::fd_dt with atomic_load_consume.
Exceptions: when fd_refcnt <= 1, or when holding fd_lock.
While here:
- Restore KASSERT(mutex_owned(&fdp->fd_lock)) in fd_unused.
  => This is used only in fd_close and fd_abort, where it holds.
- Move bounds check assertion in fd_putfile to where it matters.
- Store fd_dt with atomic_store_release.
- Move load of fd_dt under lock in knote_fdclose.
- Omit membar_consumer in fdesc_readdir.
  => atomic_load_consume serves the same purpose now.
  => Was needed only on alpha anyway.
|
1.243 |
| 20-Feb-2019 |
christos | branches: 1.243.4; 1.243.6; handle O_NOSIGPIPE too.
|
1.242 |
| 03-Jan-2019 |
maxv | Add KASSERT.
|
1.241 |
| 24-Nov-2018 |
maxv | Fix kernel pointer leaks in the kern.file sysctl, same as kern.file2.
|
1.240 |
| 24-Nov-2018 |
maxv | Rename fill_file -> fill_file2, since that's the KERN_FILE2 sysctl.
|
1.239 |
| 02-Nov-2018 |
maxv | Add LIST_INIT for filehead.
|
1.238 |
| 05-Oct-2018 |
christos | Provide a sysctl kern.expose_address to expose kernel addresses in sysctl structure returns for non-root. Defaults to off. Turning it on will restore sockstat/fstat and friends for regular users.
|
1.237 |
| 13-Sep-2018 |
maxv | Don't leak kernel pointers to userland in kern.file2, same as kern.proc2.
|
1.236 |
| 03-Sep-2018 |
riastradh | Rename min/max -> uimin/uimax for better honesty.
These functions are defined on unsigned int. The generic name min/max should not silently truncate to 32 bits on 64-bit systems. This is purely a name change -- no functional change intended.
HOWEVER! Some subsystems have
    #define min(a, b) ((a) < (b) ? (a) : (b))
    #define max(a, b) ((a) > (b) ? (a) : (b))
even though our standard name for that is MIN/MAX. Although these may invite multiple evaluation bugs, these do _not_ cause integer truncation.
To avoid `fixing' these cases, I first changed the name in libkern, and then compile-tested every file where min/max occurred in order to confirm that it failed -- and thus confirm that nothing shadowed min/max -- before changing it.
I have left a handful of bootloaders that are too annoying to compile-test, and some dead code:
    cobalt
    ews4800mips
    hp300
    hppa
    ia64
    luna68k
    vax
    acorn32/if_ie.c (not included in any kernels)
    macppc/if_gm.c (superseded by gem(4))
It should be easy to fix the fallout once identified -- this way of doing things fails safe, and the goal here, after all, is to _avoid_ silent integer truncations, not introduce them.
Maybe one day we can reintroduce min/max as type-generic things that never silently truncate. But we should avoid doing that for a while, so that existing code has a chance to be detected by the compiler for conversion to uimin/uimax without changing the semantics until we can properly audit it all. (Who knows, maybe in some cases integer truncation is actually intended!)
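The truncation hazard behind this rename can be shown with a small sketch. It assumes a platform where unsigned int is 32 bits wide; the toy_ and TOY_ names are hypothetical, and the explicit casts only make visible the narrowing that would otherwise happen implicitly at the call.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as a min() defined on unsigned int only. */
unsigned int
toy_uimin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Type-preserving macro; note that it evaluates its arguments twice. */
#define TOY_MIN(a, b)	((a) < (b) ? (a) : (b))

/* Passing 64-bit values through the unsigned int version truncates. */
uint64_t
toy_truncating_min(uint64_t a, uint64_t b)
{
	return toy_uimin((unsigned int)a, (unsigned int)b);
}
```

For the inputs 0x100000001 and 5, the function-based version sees 1 and 5 and answers 1, while the macro answers 5.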
|
1.235 |
| 03-Jul-2018 |
kamil | Avoid unportable signed integer left shift in fd_unused()
Detected with Kernel Undefined Behavior Sanitizer.
At least one place was reported; for consistency, fix all the left bit shift operations.
sys/kern/kern_descrip.c:345:2, left shift of 1 by 31 places cannot be represented in type 'int'
sys/kern/kern_descrip.c:346:28, left shift of 1 by 31 places cannot be represented in type 'int'
Reported by <Harry Pantazis>
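A minimal sketch of this class of fix, assuming a descriptor bitmap made of 32-bit words. The toy_ helpers are hypothetical and are not the kernel's fd_used()/fd_unused(); they only show why the shifted constant must be unsigned.

```c
#include <assert.h>
#include <stdint.h>

/* Bit for descriptor fd within its 32-bit bitmap word. */
uint32_t
toy_fd_bit(unsigned fd)
{
	/*
	 * 1 << 31 cannot be represented in a 32-bit int, so it is
	 * undefined behaviour; an unsigned 1 shifted by 31 is fine.
	 */
	return UINT32_C(1) << (fd & 31);
}

void
toy_bit_set(uint32_t *map, unsigned fd)
{
	map[fd >> 5] |= toy_fd_bit(fd);
}

int
toy_bit_isset(const uint32_t *map, unsigned fd)
{
	return (map[fd >> 5] & toy_fd_bit(fd)) != 0;
}
```

Descriptor 31 is the interesting case: it selects the top bit of the first word, which is exactly the shift the sanitizer complained about.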
|
1.234 |
| 03-Jul-2018 |
kamil | Avoid unportable signed integer left shift in fd_copy()
Detected with Kernel Undefined Behavior Sanitizer.
At least one place was reported; for consistency, fix all the left bit shift operations.
sys/kern/kern_descrip.c:1492:3, left shift of 1 by 31 places cannot be represented in type 'int'
sys/kern/kern_descrip.c:1493:28, left shift of 1 by 31 places cannot be represented in type 'int'
Reported by <Harry Pantazis>
|
1.233 |
| 03-Jul-2018 |
kamil | Avoid unportable signed integer left shift in fd_isused()
Detected with Kernel Undefined Behavior Sanitizer.
sys/kern/kern_descrip.c:188:34, left shift of 1 by 31 places cannot be represented in type 'int'
Reported by <Harry Pantazis>
|
1.232 |
| 03-Jul-2018 |
kamil | Avoid unportable signed integer left shift in fd_used()
Detected with Kernel Undefined Behavior Sanitizer.
At least one place was reported; for consistency, fix all the left bit shift operations.
sys/kern/kern_descrip.c:302:26, left shift of 1 by 31 places cannot be represented in type 'int'
Reported by <Harry Pantazis>
|
1.231 |
| 01-Jun-2017 |
chs | branches: 1.231.8; 1.231.10; remove checks for failure after memory allocation calls that cannot fail:
    kmem_alloc() with KM_SLEEP
    kmem_zalloc() with KM_SLEEP
    percpu_alloc()
    pserialize_create()
    psref_class_create()
all of these paths include an assertion that the allocation has not failed, so callers should not assert that again.
|
1.230 |
| 11-May-2017 |
nat | Explicitly set the flags instead of masking set values in.
This fixes FNONBLOCK weirdness seen in audio.c
OK christos@ and martin@.
|
1.229 |
| 03-Aug-2015 |
christos | branches: 1.229.8; 1. mask fflags so we don't tack on whatever oflags were passed from userland
2. honor O_CLOEXEC, so the children of daemons that use cloning devices don't end up with the parent's descriptors
fd_clone and in general the fd approach of 'allocate' > 'play with guts' > 'attach' should be converted to be more constructor like.
XXX: pullup-{6,7}
|
1.228 |
| 21-Sep-2014 |
christos | branches: 1.228.2; remove casts to the same type.
|
1.227 |
| 05-Sep-2014 |
matt | Try not to use f_data, use f_{vnode,socket,pipe,mqueue,kqueue,ksem} to get a correctly typed pointer.
|
1.226 |
| 05-Sep-2014 |
matt | Don't nest structure and enum definitions. Don't use C++ keywords new, try, class, private, etc.
|
1.225 |
| 25-Jul-2014 |
dholland | branches: 1.225.2; Add d_discard to all struct cdevsw instances I could find.
All have been set to "nodiscard"; some should get a real implementation.
|
1.224 |
| 16-Mar-2014 |
dholland | branches: 1.224.2; Change (mostly mechanically) every cdevsw/bdevsw I can find to use designated initializers.
I have not built every extant kernel so I have probably broken at least one build; however I've also found and fixed some wrong cdevsw/bdevsw entries so even if so I think we come out ahead.
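What the conversion to designated initializers buys can be seen in a reduced example. The struct below is a hypothetical, much smaller stand-in for cdevsw, and the toy_ functions are placeholders; only the d_open/d_close/d_read naming style follows the real switch tables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down device switch. */
struct toy_cdevsw {
	int (*d_open)(int);
	int (*d_close)(int);
	int (*d_read)(int);
	int d_flag;
};

int
toy_open(int unit)
{
	return unit;
}

int
toy_noread(int unit)
{
	(void)unit;
	return -1;	/* operation not supported */
}

/*
 * Designated form: each member is named, so adding or reordering
 * members in the struct cannot silently shift an entry into the
 * wrong slot.  Members left out (d_close here) are zero-initialized.
 */
const struct toy_cdevsw toy_designated = {
	.d_open = toy_open,
	.d_read = toy_noread,
	.d_flag = 0,
};
```

With a positional initializer, the same table would have to spell out a placeholder for d_close, and a new member inserted in the middle of the struct would misalign every later entry without a compile error.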
|
1.223 |
| 25-Feb-2014 |
pooka | Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before the sysctl link sets are processed, and remove redundancy.
Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate lines of code.
|
1.222 |
| 15-Sep-2013 |
martin | Remove __CT_LOCAL_.. hack
|
1.221 |
| 14-Sep-2013 |
martin | Avoid warnings for a local CTASSERT
|
1.220 |
| 05-Sep-2013 |
pooka | In fd_abort(), reset ff_exclose to preserve invariants expected by fd_free()
|
1.219 |
| 24-Nov-2012 |
christos | branches: 1.219.2; Return EOPNOTSUPP for fnullop_kqfilter to prevent registration of unsupported fds. XXX: We should really fix the fd's to be supported in the future. Unsupported fd's have a NULL f_event, so registering crashes the kernel with a NULL function dereference of f_event.
|
1.218 |
| 25-Jan-2012 |
christos | branches: 1.218.2; 1.218.6; 1.218.8; As discussed in tech-kern, provide the means to prevent delivery of SIGPIPE on EPIPE for all file descriptor types:
- provide O_NOSIGPIPE for open,kqueue1,pipe2,dup3,fcntl(F_{G,S}ETFL) [NetBSD]
- provide SOCK_NOSIGPIPE for socket,socketpair [NetBSD]
- provide SO_NOSIGPIPE for {g,s}etsockopt [NetBSD/FreeBSD/MacOSX]
- provide F_{G,S}ETNOSIGPIPE for fcntl [MacOSX]
|
1.217 |
| 25-Sep-2011 |
chs | branches: 1.217.2; 1.217.6; in fd_allocfile(), free the fd if we fail to allocate a file.
|
1.216 |
| 15-Jul-2011 |
christos | fail with EINVAL if flags are not O_CLOEXEC|O_NONBLOCK in pipe2(2) and dup3(2)
|
1.215 |
| 26-Jun-2011 |
christos | * Arrange for interfaces that create new file descriptors to be able to set close-on-exec on creation (http://udrepper.livejournal.com/20407.html).
- Add F_DUPFD_CLOEXEC to fcntl(2).
- Add MSG_CMSG_CLOEXEC to recvmsg(2) for unix file descriptor passing.
- Add dup3(2) syscall with a flags argument for O_CLOEXEC, O_NONBLOCK.
- Add pipe2(2) syscall with a flags argument for O_CLOEXEC, O_NONBLOCK.
- Add flags SOCK_CLOEXEC, SOCK_NONBLOCK to the socket type parameter for socket(2) and socketpair(2).
- Add new paccept(2) syscall that takes an additional sigset_t to alter the sigmask temporarily and a flags argument to set SOCK_CLOEXEC, SOCK_NONBLOCK.
- Add new mode character 'e' to fopen(3) and popen(3) to open pipes and file descriptors for close on exec.
- Add new kqueue1(2) syscall with a new flags argument to open the kqueue file descriptor with O_CLOEXEC, O_NONBLOCK.
* Fix the system calls that take socklen_t arguments to actually do so.
* Don't include userland header files (signal.h) from system header files (rump_syscallargs.h).
* Bump libc version for the new syscalls.
|
1.214 |
| 24-Apr-2011 |
rmind | Drop extern inline for fd_getfile(). Apparently, GCC already ignores it.
|
1.213 |
| 23-Apr-2011 |
rmind | - Sprinkle __cacheline_aligned and __read_mostly in file descriptor code.
- While here, remove trailing whitespaces, KNF.
|
1.212 |
| 10-Apr-2011 |
christos | - Add O_CLOEXEC to open(2)
- Add fd_set_exclose() to encapsulate uses of FIO{,N}CLEX, O_CLOEXEC, F{G,S}ETFD
- Add a pipe1() function to allow passing flags to the fd's that pipe(2) opens to ease implementation of linux pipe2(2)
- Factor out fp handling code from open(2) and fhopen(2)
|
1.211 |
| 15-Feb-2011 |
pooka | Support FD_CLOEXEC in rump kernels.
|
1.210 |
| 28-Jan-2011 |
pooka | Move sysctl routines from init_sysctl.c to kern_descrip.c (for descriptors) and kern_proc.c (for processes). This makes them usable in a rump kernel, in case somebody was wondering.
|
1.209 |
| 01-Jan-2011 |
pooka | branches: 1.209.2; 1.209.4; Update comment and inspired by that update variable naming too. no functional change.
|
1.208 |
| 17-Dec-2010 |
yamt | update some comments
|
1.207 |
| 29-Oct-2010 |
pooka | Attach implicit threads to initproc instead of proc0. This way applications which alter, on purpose or by accident, the uid in an implicit thread don't affect kernel threads.
from discussion with njoly
|
1.206 |
| 01-Sep-2010 |
pooka | Actually, the comment probably meant "would be nice to KASSERT here, but can't". So turn it into a KASSERT now that it's possible.
|
1.205 |
| 01-Sep-2010 |
pooka | Remove XXX comment. I'm not sure what it precisely means, but I'm guessing it's from a time when rump used filedesc0 for everything (and that isn't true anymore).
|
1.204 |
| 04-Aug-2010 |
pooka | Remove overzealous KASSERT: the refcount can be non-zero if another thread attempts to use a non-open file descriptor. from ad
fixes PR kern/43694
|
1.203 |
| 01-Jul-2010 |
rmind | Remove pfind() and pgfind(), fix locking in various broken uses of these. Rename real routines to proc_find() and pgrp_find(), remove PFIND_* flags and have consistent behaviour. Provide proc_find_raw() for special cases. Fix memory leak in sysctl_proc_corename().
COMPAT_LINUX: rework ptrace() locking, minimise differences between different versions per-arch.
Note: while this change adds some formal cosmetics for COMPAT_DARWIN and COMPAT_IRIX - locking there is utterly broken (for ages).
Fixes PR/43176.
|
1.202 |
| 20-Dec-2009 |
dsl | branches: 1.202.2; 1.202.4; If a multithreaded app closes an fd while another thread is blocked in read/write/accept, then the expectation is that the blocked thread will exit and the close complete. Since only one fd is affected, but many fd can refer to the same file, the close code can only request the fs code unblock with ERESTART. Fixed for pipes and sockets, ERESTART will only be generated after such a close - so there should be no change for other programs. Also rename fo_abort() to fo_restart() (this used to be fo_drain()). Fixes PR/26567
|
1.201 |
| 09-Dec-2009 |
dsl | Rename fo_drain() to fo_abort(), 'drain' is used to mean 'wait for output to drain' in many places, whereas fo_drain() was called in order to force blocking read()/write() etc calls to return to userspace so that a close() call from a different thread can complete. In the sockets code comment out the broken code in the inner function, it was being called from compat code.
|
1.200 |
| 27-Oct-2009 |
rmind | - Amend fd_hold() to take an argument and add assert (reflects two cases, fork1() and the rest, e.g. kthread_create(), when creating from lwp0).
- lwp_create(): do not touch filedesc internals, use fd_hold().
|
1.199 |
| 16-Aug-2009 |
yamt | assertion
|
1.198 |
| 30-Jun-2009 |
martin | Update fd_freefile when kqueue descriptors are not copied from parent to child. From Wolfgang Solfrank in PR kern/41651. Approved by Andrew Doran.
|
1.197 |
| 08-Jun-2009 |
yamt | fd_free: fix posix advisory locks. PR/41549 from HITOSHI OSADA.
|
1.196 |
| 07-Jun-2009 |
yamt | shut up the following assertion failure and add a comment.
panic: kernel diagnostic assertion "!fd_isused(fdp, fd)" failed: file "/siro/nbsd/src/sys/kern/kern_descrip.c", line 175
|
1.195 |
| 29-May-2009 |
yamt | fd_free: reset fd_himap/lomap to make fd_checkmaps comfortable. PR/41487.
|
1.194 |
| 28-May-2009 |
yamt | wrap a long line.
|
1.193 |
| 26-May-2009 |
ad | PR kern/41487: kern_descrip.c assertion failure
Remove bogus assertion.
|
1.192 |
| 24-May-2009 |
ad | More changes to improve kern_descrip.c.
- Avoid atomics in more places.
- Remove the per-descriptor mutex, and just use filedesc_t::fd_lock. It was only being used to synchronize close, and in any case we needed to take fd_lock to free the descriptor slot.
- Optimize certain paths for the <NDFDFILE case.
- Sprinkle more comments and assertions.
- Cache more stuff in filedesc_t.
- Fix numerous minor bugs spotted along the way.
- Restructure how the open files array is maintained, for clarity and so that we can eliminate the membar_consumer() call in fd_getfile(). This is mostly syntactic sugar; the main functional change is that fd_nfiles now lives alongside the open file array.
Some measurements with libmicro:
- simple file syscalls like close() are between 1 and 10% faster.
- some nice improvements, e.g. poll(1000), which is ~50% faster.
|
1.191 |
| 23-May-2009 |
ad | Make descriptor access and file allocation cheaper in many cases, mostly by avoiding a bunch of atomic operations.
|
1.190 |
| 04-Apr-2009 |
ad | Add fileops::fo_drain(), to be called from fd_close() when there is more than one active reference to a file descriptor. It should dislodge threads sleeping while holding a reference to the descriptor. Implemented only for sockets but should be extended to pipes, fifos, etc.
Fixes the case of a multithreaded process doing something like the following, which would have hung until the process got a signal.
    thr0 accept(fd, ...)
    thr1 close(fd)
|
1.189 |
| 29-Mar-2009 |
rmind | fownsignal: pre-check for zero pgid, avoids locking of proc_lock.
|
1.188 |
| 11-Mar-2009 |
mrg | completely rework the way that orphaned sockets that are being fdpassed via SCM_RIGHTS messages are dealt with:
1. unp_gc: make this a kthread.
2. unp_detach: do not call unp_gc directly. instead, wake up unp_gc kthread.
3. unp_scan: do not close files here. instead, put them on a global list for unp_gc to close, along with a per-file "deferred close count". if file is already enqueued for close, just increment deferred close count. this eliminates the recursive calls.
3. unp_gc: scan files on global deferred close list. close each file N times, as specified by deferred close count in file. continue processing list until it becomes empty (closing may cause additional files to be queued for close).
4. unp_gc: add additional bit to mark files we are scanning. set during initial scan of global file list that currently clears FMARK/FDEFER. during later scans, never examine / garbage collect descriptors that we have not marked during the earlier scan. do not proceed with this initial scan until all deferred closes have been processed. be careful with locking to ensure no races are introduced between deferred close and file scan.
5. unp_gc: use dummy file_t to mark position in list when scanning. allow us to drop filelist_lock. in turn allows us to eliminate kmem_alloc() and safely close files, etc.
6. prohibit transfer of descriptors within SCM_RIGHTS messages if (num_files_in_transit > maxfiles / unp_rights_ratio)
7. fd_allocfile: ensure recycled files don't get scanned.
this is 97% work done by andrew doran, with a couple of minor bug fixes and a lot of testing by yours truly.
|
1.187 |
| 08-Mar-2009 |
ad | Don't bother with file_t::f_iflags any more, as it's not used. Noted by mrg@.
|
1.186 |
| 02-Mar-2009 |
rmind | fd_copy: fix off-by-one bug in a race condition path and assert. Should fix PR/40625. OK by <ad>.
|
1.185 |
| 21-Dec-2008 |
ad | branches: 1.185.2; - Fix a bug where we trashed descriptor zero in the old open files array while ironically trying to preserve the same during copy. Would only have occurred if a multithreaded program expanded the descriptor table and, within a tiny window of exposure, another thread in the program tried to access descriptor zero.
- Convert to use kmem_alloc/kmem_free.
|
1.184 |
| 18-Nov-2008 |
pooka | Move fd_closeexec() and fd_checkstd() from kern_descrip to their own file, subr_exec_fd.c (they're used only by exec).
After this change, the kernel source modules are in a partitioned enough state to allow building a system without vfs at all.
|
1.183 |
| 18-Nov-2008 |
pooka | cwd is logically a vfs concept, so take it out from the bosom of kern_descrip and into vfs_cwd. No functional change.
|
1.182 |
| 02-Jul-2008 |
matt | branches: 1.182.2; 1.182.4; 1.182.6; Change {ff,fd}_exclose and ff_allocated to bool. Change exclose arg to fd_dup to bool. Switch assignments from 1/0 to true/false.
This makes alpha kernels compile. Bump kern to 4.99.69 since structure changed.
|
1.181 |
| 02-Jul-2008 |
matt | Switch from KASSERT to CTASSERT for those asserts testing sizes of types.
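The difference between the two kinds of assertion can be sketched with C11's _Static_assert, used here as a stand-in for CTASSERT. The TOY_CTASSERT macro and the checked property are assumptions for illustration, not the assertions actually present in kern_descrip.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical compile-time assertion macro in the spirit of CTASSERT. */
#define TOY_CTASSERT(x)	_Static_assert(x, #x)

/*
 * A property of type sizes is known at compile time, so a violation
 * stops the build instead of panicking a running DIAGNOSTIC kernel.
 */
TOY_CTASSERT(sizeof(uint32_t) * CHAR_BIT == 32);

/* The same quantity, computed at run time for comparison. */
unsigned
toy_word_bits(void)
{
	return (unsigned)(sizeof(uint32_t) * CHAR_BIT);
}
```

A run-time assert on toy_word_bits() would check the same fact, but only when that code path executes and only in builds with assertions enabled.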
|
1.180 |
| 24-Jun-2008 |
gmcgarry | ioctl commands are unsigned long. Changes ABI for fsetown() and fgetown() on 64-bit architectures.
|
1.179 |
| 05-May-2008 |
ad | branches: 1.179.2; 1.179.4; - Convert hashinit() to use kmem_alloc(). The hash tables can be large and it's better to not have them in kmem_map.
- Convert a couple of minor items along the way to kmem_alloc().
- Fix some memory leaks.
|
1.178 |
| 28-Apr-2008 |
martin | Remove clause 3 and 4 from TNF licenses
|
1.177 |
| 24-Apr-2008 |
ad | branches: 1.177.2; Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since we no longer need to guard against access from hardware interrupt handlers.
Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the child process share the parent's lock so that signal state may be kept in sync. Partially addresses PR kern/37437.
|
1.176 |
| 24-Apr-2008 |
ad | Network protocol interrupts can now block on locks, so merge the globals proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock). Implications:
- Inspecting process state requires thread context, so signals can no longer be sent from a hardware interrupt handler. Signal activity must be deferred to a soft interrupt or kthread.
- As the proc state locking is simplified, it's now safe to take exit() and wait() out from under kernel_lock.
- The system spends less time at IPL_SCHED, and there is less lock activity.
|
1.175 |
| 09-Apr-2008 |
wiz | branches: 1.175.2; Commit fix for the fdfile leak described in PR 38374.
Patch provided by YAMAMOTO Takashi.
Ok ad@
|
1.174 |
| 27-Mar-2008 |
ad | Replace use of CACHE_LINE_SIZE in some obvious places.
|
1.173 |
| 21-Mar-2008 |
ad | File descriptor changes, discussed on tech-kern:
- Redo reference counting to be sane. LWPs accessing files take a short term reference on the local file descriptor. This is the most common case. While a file is in a process descriptor table, a reference is held to the file. The file reference count only changes during control operations like open() or close(). Code that comes at files from an unusual direction (i.e. foreign to the process) like procfs or sysctl takes a reference on the file (f_count), and not on a descriptor.
- Remove knowledge of reference counting and locking from most code that deals with files.
- Make the usual case of file descriptor lookup lockless.
- Make kqueue MP and MT safe. PR kern/38098, PR kern/38137.
- Fix numerous file handling bugs, and bugs in the descriptor code that affected multithreaded processes.
- Split descriptor system calls out into sys_descrip.c.
- A few stylistic changes: KNF, remove unused casts now that caddr_t is gone. Replace dumb gotos with loop control in a few places.
- Don't do redundant pointer passing (struct proc, lwp, filedesc *) unless the routine is likely to be inlined. Most of the time it's about the current process.
|
1.172 |
| 06-Feb-2008 |
ad | branches: 1.172.6;
- Shrink 'struct file' to 60 bytes on 32-bit platforms.
- Align 'struct file' and 'struct filedesc' to CACHE_LINE_SIZE.
|
1.171 |
| 27-Jan-2008 |
dsl | Move the prototype for do_posix_fadvise() somewhere useful.
|
1.170 |
| 27-Jan-2008 |
martin | Implement new version of posix_fadvise as a stub calling the real worker function, and a compatibility stub doing the same with the old argument structure.
|
1.169 |
| 05-Jan-2008 |
ad | Add fgetdummy/fputdummy: allocate and free dummy 'struct file' entries to be used when traversing filehead.
|
1.168 |
| 05-Jan-2008 |
dsl | Use FILE_LOCK() and FILE_UNLOCK()
|
1.167 |
| 26-Dec-2007 |
ad | Merge more changes from vmlocking2, mainly:
- Locking improvements.
- Use pool_cache for more items.
|
1.166 |
| 20-Dec-2007 |
dsl | Convert all the system call entry points from:
int foo(struct lwp *l, void *v, register_t *retval)
to:
int foo(struct lwp *l, const struct foo_args *uap, register_t *retval)
Fixup compat code to not write into 'uap' and (in some cases) to actually pass a correctly formatted 'uap' structure with the right name to the next routine. A few 'compat' routines that just call standard ones have been deleted. All the 'compat' code compiles (along with the kernels required to test build it). 98% done by automated scripts.
|
1.165 |
| 08-Dec-2007 |
pooka | branches: 1.165.4; Remove cn_lwp from struct componentname. curlwp should be used from now on. The NDINIT() macro no longer takes the lwp parameter and associates the credentials of the calling thread with the namei structure.
|
1.164 |
| 29-Nov-2007 |
ad | branches: 1.164.2; Use atomics to adjust filedesc::fd_refcnt.
|
1.163 |
| 29-Nov-2007 |
ad | Use atomics to adjust cwdi_refcnt.
|
1.162 |
| 07-Nov-2007 |
ad | Merge from vmlocking:
- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
|
1.161 |
| 08-Oct-2007 |
ad | branches: 1.161.2; 1.161.4; Merge file descriptor locking, cwdi locking and cross-call changes from the vmlocking branch.
|
1.160 |
| 07-Sep-2007 |
rmind | branches: 1.160.2; Implementation of POSIX message queues.
Reviewed by: <ad>, <tech-kern>
|
1.159 |
| 09-Jul-2007 |
ad | branches: 1.159.2; 1.159.6; 1.159.8; Merge some of the less invasive changes from the vmlocking branch:
- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
|
1.158 |
| 12-May-2007 |
dsl | Split the fcntl locking code out from its copyin/out. Use it to avoid all the stackgap stuff in compat code.
|
1.157 |
| 22-Apr-2007 |
dsl | I'm not sure why I decided that cwdinit() shouldn't copy cwd_edir. Since this is called in fork() it does rather need to give the child process the parent's emulation root. This means that (for example) an emulated shell will, by default, run programs from the emulation root.
|
1.156 |
| 22-Apr-2007 |
dsl | Change the way that emulations locate files within the emulation root to avoid having to allocate space in the 'stackgap' - which is very LWP unfriendly. The additional code for non-emulation namei() is trivial, the reduction for the emulations is massive. The vnode for a process's emulation root is saved in the cwdi structure during process exec. If the emulation root and the TRYEMULROOT flag are set, namei() will do an initial search for absolute pathnames in the emulation root; if that fails it will retry from the normal root. ".." at the emulation root will always go to the real root, even in the middle of paths and when expanding symlinks. Absolute symlinks found using absolute paths in the emulation root will be relative to the emulation root (so /usr/lib/xxx.so -> /lib/xxx.so links inside the emulation root don't need changing). If the root of the emulation would be returned (for an emulation lookup), then the real root is returned instead (matching the behaviour of emul_lookup, but being a cheap comparison here) so that programs that scan "../.." looking for the root directory don't loop forever. The target for symbolic links is no longer mangled (it used to get the CHECK_ALT_xxx() treatment, so could get /emul/xxx prepended). CHECK_ALT_xxx() are no more. Most of the change is deleting them, and adding TRYEMULROOT to the flags to NDINIT(). A lot of the emulation system call stubs could now be deleted.
|
1.155 |
| 21-Mar-2007 |
dsl | Somehow a single K&R function definition was lurking - nuke it.
|
1.154 |
| 12-Mar-2007 |
ad | branches: 1.154.2; 1.154.4; Pass an ipl argument to pool_init/POOL_INIT to be used when initializing the pool's lock.
|
1.153 |
| 10-Mar-2007 |
dsl | branches: 1.153.2; Split the work for sys_stat, sys_lstat, sys_fstat and sys_fhstat out into separate functions that don't do the copyout. This allows all the compat_xxx versions to convert the 'struct stat' to the correct format without using the 'stackgap'. The stackgap isn't at all LWP friendly, and needs to be removed from any compat functions that might involve threads (inc. clone()). The code is still binary compatible with existing LKMs.
|
1.152 |
| 09-Mar-2007 |
ad |
- Make the proclist_lock a mutex. The write:read ratio is unfavourable, and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.
|
1.151 |
| 17-Feb-2007 |
pavel | Change the process/lwp flags seen by userland via sysctl back to the P_*/L_* naming convention, and rename the in-kernel flags to avoid conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD constant.
Restores source compatibility with pre-newlock2 tools like ps or top.
Reviewed by Andrew Doran.
|
1.150 |
| 09-Feb-2007 |
ad | branches: 1.150.2; Merge newlock2 to head.
|
1.149 |
| 31-Jan-2007 |
ad | ffree(): don't call kauth_cred_free() with a held simplelock.
|
1.148 |
| 06-Dec-2006 |
yamt | use KSI_INIT rather than memset. no functional changes.
|
1.147 |
| 01-Nov-2006 |
yamt | remove some __unused from function parameters.
|
1.146 |
| 12-Oct-2006 |
christos |
- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
|
1.145 |
| 02-Sep-2006 |
christos | branches: 1.145.2; 1.145.4; add missing initializer
|
1.144 |
| 23-Jul-2006 |
ad | Use the LWP cached credentials where sane.
|
1.143 |
| 14-May-2006 |
elad | integrate kauth.
|
1.142 |
| 15-Apr-2006 |
christos | Coverity CID 845: Make it clear that devnullfp != NULL.
|
1.141 |
| 07-Mar-2006 |
pooka | branches: 1.141.2; 1.141.4; remove the no longer useful fdavail(), as proposed and (thankfully) not discussed on tech-kern
|
1.140 |
| 31-Jan-2006 |
yamt | branches: 1.140.2; 1.140.4; 1.140.6; falloc: grab fd_slock when calling fd_unused.
|
1.139 |
| 24-Dec-2005 |
perry | branches: 1.139.2; Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
|
1.138 |
| 11-Dec-2005 |
christos | merge ktrace-lwp.
|
1.137 |
| 29-Nov-2005 |
yamt | merge yamt-readahead branch.
|
1.136 |
| 03-Oct-2005 |
mrg | branches: 1.136.6; fix a bug pointed out by der mouse on tech-kern: in F_GETOWN, use a pointer to a temporary "int" variable to pass to fo_ioctl(TIOCGPGRP), not a register_t pointer. (how did F_GETOWN ever work on sparc64 before?)
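The pattern behind this fix can be sketched in user space. This is a hypothetical illustration, not the kernel code: `fake_ioctl_getpgrp` stands in for fo_ioctl(TIOCGPGRP), assumed to store an int through the pointer it is given, and `retval_t` stands in for a 64-bit register_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for register_t on an LP64 platform (64 bits). */
typedef int64_t retval_t;

/* Hypothetical stand-in for fo_ioctl(TIOCGPGRP): stores an int via data. */
static int
fake_ioctl_getpgrp(void *data)
{
	*(int *)data = 1234;
	return 0;
}

/*
 * Fetch the owner into a temporary int, then widen it into the 64-bit
 * return slot.  Passing retval itself would let the callee fill only
 * 4 of its 8 bytes, which on a big-endian machine are the high-order ones.
 */
static int
getown_sketch(retval_t *retval)
{
	int tmp;
	int error;

	assert(retval != NULL);
	error = fake_ioctl_getpgrp(&tmp);
	if (error == 0)
		*retval = tmp;
	return error;
}
```

The temporary costs one extra assignment and makes the result independent of both the width and the byte order of the return slot.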
|
1.135 |
| 19-Aug-2005 |
christos | 64 bit inode changes.
|
1.134 |
| 23-Jun-2005 |
thorpej | branches: 1.134.2; Use ANSI function decls. Apply some static.
|
1.133 |
| 29-May-2005 |
christos |
- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.
|
1.132 |
| 20-May-2005 |
wrstuden | The file being closed is (fdp->fd_lastfile - i), not i. So compare (fdp->fd_lastfile - i) against fd_knlistsize. Otherwise we can call knote_fdclose() on a file descriptor that doesn't have a knote.
This issue explains random panics I have had on process exit over the past few years.
|
1.131 |
| 26-Feb-2005 |
perry | branches: 1.131.2; nuke trailing whitespace
|
1.130 |
| 12-Feb-2005 |
christos | pass the flag to fdclone.
|
1.129 |
| 14-Jan-2005 |
cube | branches: 1.129.2; 1.129.4; As fd_lastfile might be negative, we can't use the (u_int) cast trick to compare fd and fdp->fd_lastfile in fdrelease(), so change the test to a more explicit one. Spotted by Matt Thomas.
Should fix the panic reported by Matthias Scheler.
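A hypothetical reduction of the two range checks shows why the cast trick breaks once the "last file" index may be -1. The function names are illustrative and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>

/* Range check using the unsigned-cast trick; last_fd is converted too. */
static bool
fd_out_of_range_cast(int fd, int last_fd)
{
	return (unsigned int)fd > (unsigned int)last_fd;
}

/* Explicit range check that stays correct when last_fd is -1. */
static bool
fd_out_of_range_explicit(int fd, int last_fd)
{
	assert(last_fd >= -1);
	return fd < 0 || fd > last_fd;
}
```

With last_fd equal to -1 the cast turns the bound into UINT_MAX, so the first check accepts every descriptor, while the explicit form rejects them all.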
|
1.128 |
| 12-Jan-2005 |
cube | fd_lastfile should be -1 when there are no opened file descriptors. Hence, make find_last_set return -1 in that situation, and initialize fd_lastfile to -1. Otherwise, with 0 meaning two things, it confused the F_CLOSEM fcntl, which could end up looping indefinitely (PR#28929 by Brian Marcotte).
However, this change enlightens another bug in fdcopy(), where more entries than needed were cleared in the new file descriptor table, so the memset() call there is fixed too.
Analyzed with the help of Greg Oster.
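The "-1 means empty" convention can be illustrated with a minimal sketch. It scans a plain pointer array, which is an assumption made for brevity; the kernel routine of the same name may operate on different data.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the highest non-NULL slot in ofiles[0..last],
 * or -1 when every slot is empty, so that a result of 0 always means
 * "descriptor 0 is open".
 */
static int
find_last_set_sketch(void *const *ofiles, int last)
{
	int i;

	assert(last >= -1);
	for (i = last; i >= 0; i--) {
		if (ofiles[i] != NULL)
			return i;
	}
	return -1;
}
```

A caller that closes descriptors from the result downwards then terminates naturally on an empty table, because its loop condition is false from the start.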
|
1.127 |
| 30-Nov-2004 |
christos | Cloning cleanup:
1. make fileops const
2. add 2 new negative errno's to `officially' support the cloning hack:
   - EDUPFD (used to overload ENODEV)
   - EMOVEFD (used to overload ENXIO)
3. Created an fdclone() function to encapsulate the operations needed for EMOVEFD, and made all cloners use it.
4. Centralize the local noop/badop fileops functions to: fnullop_fcntl, fnullop_poll, fnullop_kqfilter, fbadop_stat
|
1.126 |
| 31-May-2004 |
pk | Implement mutexes for file descriptor and current working directory access. Fix a potential race condition when reallocating storage for file descriptors (even for non-SMP kernels). Add missing locks for `struct file' ref count updates.
|
1.125 |
| 25-Apr-2004 |
simonb | Initialise (most) pools from a link set instead of explicit calls to pool_init. Untouched pools are ones that are either in arch-specific code, or aren't initialised during initial system startup.
Convert struct session, ucred and lockf to pools.
|
1.124 |
| 05-Apr-2004 |
yamt | add assertions related to file descriptor allocation.
|
1.123 |
| 07-Jan-2004 |
jdolecek | branches: 1.123.2; fix F_MAXFD fcntl - it returned the value as errno instead of as the return value from the syscall. From mouss <usebsd at free dot fr>.
|
1.122 |
| 05-Jan-2004 |
christos | Add F_CLOSEM, F_MAXFD from Matt Thomas.
|
1.121 |
| 30-Nov-2003 |
provos | fix off by one in find_last_set(); triggered for processes that have no open file descriptors; found by tim robbins from freebsd
|
1.120 |
| 26-Nov-2003 |
yamt | fdcopy: copy inline bitmaps properly. hopefully fixes PR/23469.
|
1.119 |
| 09-Nov-2003 |
yamt | fix typos in comments.
|
1.118 |
| 09-Nov-2003 |
yamt |
- fix a use-after-free bug in /dev/fd/* handling. specifically, don't keep a stale pointer in fd_ofiles. it isn't needed anymore as fd allocation is now done using bitmaps.
- clean up dupfdopen() a little.
- don't call fd_used() unnecessarily.
|
1.117 |
| 09-Nov-2003 |
yamt | in the non-overwritten case of sys_dup2(), call fd_used() by itself rather than leaving it to finishdup().
|
1.116 |
| 01-Nov-2003 |
provos | use fdremove to remove kqueue file descriptor so that bitmap information is maintained correctly; found by Juergen Hannken-Illjes
|
1.115 |
| 30-Oct-2003 |
provos | use a two-level bitmap as suggested by mogul and banga for fdalloc; approved thorpej@
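The general idea of a two-level bitmap can be sketched as follows. The layout is an assumption chosen for illustration (one 32-bit summary word over 32 low-level words) and is not claimed to match the structure or names used in the commit.

```c
#include <assert.h>
#include <stdint.h>

#define NBITS   32                      /* bits per bitmap word */
#define NWORDS  32                      /* low-level words => 1024 fds */

struct fdmap {
	uint32_t himap;                 /* bit w set: lomap[w] is full */
	uint32_t lomap[NWORDS];         /* bit b set: fd w*32+b in use */
};

/* Index of the lowest clear bit in w, or -1 if all bits are set. */
static int
first_zero(uint32_t w)
{
	int b;

	for (b = 0; b < NBITS; b++) {
		if ((w & (UINT32_C(1) << b)) == 0)
			return b;
	}
	return -1;
}

/* Allocate the lowest free descriptor, or return -1 if the map is full. */
static int
fdmap_alloc(struct fdmap *m)
{
	int w, b;

	w = first_zero(m->himap);       /* skip full words in one step */
	if (w == -1)
		return -1;
	b = first_zero(m->lomap[w]);
	assert(b != -1);                /* himap said this word has room */
	m->lomap[w] |= UINT32_C(1) << b;
	if (m->lomap[w] == UINT32_MAX)
		m->himap |= UINT32_C(1) << w;
	return w * NBITS + b;
}

/* Release fd and clear the "full" summary bit for its word. */
static void
fdmap_free(struct fdmap *m, int fd)
{
	int w = fd / NBITS, b = fd % NBITS;

	assert(fd >= 0 && fd < NBITS * NWORDS);
	m->lomap[w] &= ~(UINT32_C(1) << b);
	m->himap &= ~(UINT32_C(1) << w);
}
```

Allocation inspects at most two words regardless of how many descriptors are open, which is the attraction over scanning the descriptor array linearly.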
|
1.114 |
| 22-Sep-2003 |
christos |
- pass signo to fownsignal [ok by jd]
- make urg signal handling use fownsignal
- remove out of band detection in sowakeup
|
1.113 |
| 21-Sep-2003 |
jdolecek | cleanup & uniform descriptor owner handling:
* introduce fsetown(), fgetown(), fownsignal() - this sets/retrieves/signals the owner of descriptor, according to appropriate semantics of TIOCSPGRP/FIOSETOWN/SIOCSPGRP/TIOCGPGRP/FIOGETOWN/SIOCGPGRP ioctl; use these routines instead of custom code where appropriate
* make every place handling TIOCSPGRP/TIOCGPGRP handle also FIOSETOWN/FIOGETOWN properly, and remove the translation of FIO[SG]OWN to TIOC[SG]PGRP in sys_ioctl() & sys_fcntl()
* also remove the socket-specific hack in sys_ioctl()/sys_fcntl() and pass the ioctls down to soo_ioctl() as any other ioctl
change discussed on tech-kern@
|
1.112 |
| 13-Sep-2003 |
jdolecek | move dupfd from struct proc to struct lwp - it's per-LWP, not per-process; we use curlwp where the lwp is not directly available, i.e. in device open routines
briefly discussed on tech-kern
|
1.111 |
| 07-Aug-2003 |
agc | Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22364, verified by myself.
|
1.110 |
| 29-Jun-2003 |
fvdl | branches: 1.110.2; Back out the lwp/ktrace changes. They contained a lot of collateral damage, and need to be examined and discussed more.
|
1.109 |
| 28-Jun-2003 |
darrenr | Pass lwp pointers throughout the kernel, as required, so that the lwpid can be inserted into ktrace records. The general change has been to replace "struct proc *" with "struct lwp *" in various function prototypes, pass the lwp through and use l_proc to get the process pointer when needed.
Bump the kernel rev up to 1.6V
|
1.108 |
| 16-May-2003 |
itojun | use strlcat
|
1.107 |
| 22-Mar-2003 |
dsl | Correct rewinding if FIONBIO or FIOASYNC fail in F_SETFL (code used to always turn off FIONBIO if FIOASYNC fails) (approved by christos)
|
1.106 |
| 22-Mar-2003 |
dsl | Change caddr_t to void *
|
1.105 |
| 17-Mar-2003 |
martin | When being passed bogus file descriptors make close(2) return EBADF. From Stephen Ma in PR kern/20762.
|
1.104 |
| 01-Mar-2003 |
yamt | make fdcheckstd f_slock friendly.
|
1.103 |
| 23-Feb-2003 |
pk | Make updating a file's reference and use count MP-safe.
|
1.102 |
| 14-Feb-2003 |
pk | Use a mutex to protect the global list of open files.
|
1.101 |
| 01-Feb-2003 |
thorpej | Add extensible malloc types, adapted from FreeBSD. This turns malloc types into a structure, a pointer to which is passed around, instead of an int constant. Allow the limit to be adjusted when the malloc type is defined, or with a function call, as suggested by Jonathan Stone.
|
1.100 |
| 19-Jan-2003 |
simonb | Remove variable that is only assigned to but not referenced.
|
1.99 |
| 18-Jan-2003 |
thorpej | Merge the nathanw_sa branch.
|
1.98 |
| 06-Jan-2003 |
wiz | descriptor, not decriptor.
|
1.97 |
| 24-Nov-2002 |
scw | Quell uninitialised variable warnings.
|
1.96 |
| 23-Oct-2002 |
jdolecek | merge kqueue branch into -current
kqueue provides a stateful and efficient event notification framework; currently supported events include socket, file, directory, fifo, pipe, tty and device changes, and monitoring of processes and signals
kqueue is supported by all writable filesystems in NetBSD tree (with exception of Coda) and all device drivers supporting poll(2)
based on work done by Jonathan Lemon for FreeBSD; initial NetBSD port done by Luke Mewburn and Jason Thorpe
|
1.95 |
| 23-Sep-2002 |
simonb | fp->f_count is unsigned, don't check if it's less than zero.
|
1.94 |
| 06-Sep-2002 |
gehenna | Merge the gehenna-devsw branch into the trunk.
This merge changes the device switch tables from static array to dynamically generated by config(8).
- All device switches are defined as a constant structure in device drivers.
- The new grammar ``device-major'' is introduced to ``files''.
device-major <prefix> char <num> [block <num>] [<rules>]
- All device major numbers must be listed up in port dependent majors.<arch> by using this grammar.
- Added the new naming convention. The name of the device switch must be <prefix>_[bc]devsw for auto-generation of device switch tables.
- The backward compatibility of loading block/character device switch by LKM framework is broken. This is necessary to convert from block/character device major to device name at runtime and vice versa.
- The restriction to assign device major by LKM is completely removed. We don't need to reserve LKM entries for dynamic loading of device switch.
- At compile time, the device major numbers list is packed into the kernel and the LKM framework will refer to it to assign device major numbers dynamically.
|
1.93 |
| 18-Jun-2002 |
thorpej | sys_fpathconf: Don't panic in the default case; just return EOPNOTSUPP.
|
1.92 |
| 09-May-2002 |
atatat | branches: 1.92.2; Maintain a short list of the actual descriptors that were closed and log that instead of being ambiguous about which of 0, 1, and/or 2 it was that was closed.
|
1.91 |
| 28-Apr-2002 |
enami | Log who invoked the s[ug]id program. Tested by mozilla.
|
1.90 |
| 27-Apr-2002 |
enami | A loop to expand file descriptor table and retry is moved from fdalloc() to caller. So, no longer need to loop in fdalloc().
|
1.89 |
| 27-Apr-2002 |
enami | KNF.
|
1.88 |
| 24-Apr-2002 |
christos | Avoid file use underflow; thanks to YAMAMOTO Takashi for noticing.
|
1.87 |
| 23-Apr-2002 |
christos | Don't forget to set mature and unuse the file.
|
1.86 |
| 23-Apr-2002 |
christos | From OpenBSD, via FreeBSD: If a set{u,g}id binary is invoked with fd < 3 closed, open those fds to /dev/null.
XXX: This needs to be fixed in a better way. The kernel should not need to know about /dev/null or special case 0, 1, 2.
|
1.85 |
| 08-Mar-2002 |
thorpej | Pool deals fairly well with physical memory shortage, but it doesn't deal with shortages of the VM maps where the backing pages are mapped (usually kmem_map). Try to deal with this:
* Group all information about the backend allocator for a pool in a separate structure. The pool references this structure, rather than the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory to become available, but will still fail if it cannot allocate KVA space for the pages. If this happens, carefully drain all pools using the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing some pages, and use that information to make draining easier and more efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be dealt with by the caller.
From art@openbsd.org.
|
1.84 |
| 31-Jan-2002 |
kleink | fcntl(..., F_GETOWN, ...): fix LP64-BE bug; raised by der Mouse on tech-kern.
|
1.83 |
| 07-Dec-2001 |
jdolecek | Back off previous for now, Jason thinks it's not right. Will discuss on tech-kern@
|
1.82 |
| 06-Dec-2001 |
jdolecek | replace FIF_WANTCLOSE/FIF_LARVAL with FWANTCLOSE/FLARVAL, which are set in f_flag of struct file; for now, keep former f_iflags of struct file as _f_spare0, it will be g/c'ed when struct file is changed (this will happen soon)
|
1.81 |
| 12-Nov-2001 |
lukem | add RCSIDs
|
1.80 |
| 18-Jul-2001 |
thorpej | branches: 1.80.2; 1.80.4; Unshare the file descriptor table and `cwdinfo' when we exec. From Matthew Orgass <darkstar@pgh.net>.
|
1.79 |
| 01-Jul-2001 |
thorpej | branches: 1.79.2; Duh, use fd_getfile() in sys_close().
|
1.78 |
| 16-Jun-2001 |
jdolecek | Add DTYPE_PIPE (to be used by new pipe implementation) and handle it accordingly.
|
1.77 |
| 14-Jun-2001 |
thorpej | Fix a partial construction problem that can cause race conditions between creation of a file descriptor and close(2) when using kernel assisted threads. What we do is stick descriptors in the table, but mark them as "larval". This causes essentially everything to treat it as a non-existent descriptor, except for fdalloc(), which sees a filled slot so that it won't (incorrectly) allocate it again. When a descriptor is fully constructed, the code that has constructed it marks it as "mature" (which actually clears the "larval" flag), and things continue to work as normal.
While here, gather all the code that gets a descriptor from the table into a fd_getfile() function, and call it, rather than having the same (sometimes incorrect) code copied all over the place.
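The larval/mature life cycle described above can be sketched in a few lines. Everything here is hypothetical: the flag value, the fixed-size table and the function names are illustrative and do not reproduce the kernel's fdalloc() or fd_getfile().

```c
#include <assert.h>
#include <stddef.h>

#define NSLOTS      8
#define FLAG_LARVAL 0x1                 /* hypothetical flag value */

struct file_sketch {
	int flags;
};

static struct file_sketch *table[NSLOTS];

/* Claim the lowest empty slot for fp and mark it larval; -1 if full. */
static int
slot_alloc(struct file_sketch *fp)
{
	int fd;

	for (fd = 0; fd < NSLOTS; fd++) {
		if (table[fd] == NULL) {        /* larval slots count as filled */
			fp->flags |= FLAG_LARVAL;
			table[fd] = fp;
			return fd;
		}
	}
	return -1;
}

/* Look up fd; a larval entry is reported as if it did not exist. */
static struct file_sketch *
slot_getfile(int fd)
{
	struct file_sketch *fp;

	if (fd < 0 || fd >= NSLOTS)
		return NULL;
	fp = table[fd];
	if (fp == NULL || (fp->flags & FLAG_LARVAL) != 0)
		return NULL;
	return fp;
}

/* Mark a fully constructed entry as mature by clearing the larval flag. */
static void
slot_set_mature(struct file_sketch *fp)
{
	assert((fp->flags & FLAG_LARVAL) != 0);
	fp->flags &= ~FLAG_LARVAL;
}
```

Routing every lookup through a single accessor means the larval test lives in one place, which is the point of gathering the lookups into one function.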
|
1.76 |
| 07-Jun-2001 |
thorpej | Rework fdalloc() even further: split fdalloc() into fdalloc() and fdexpand(). The former will return ENOSPC if there is not space in the current filedesc table. The latter performs the expansion of the filedesc table. This means that fdalloc() won't ever block, and it gives callers an opportunity to clean up before the potentially-blocking fdexpand() call.
Update all fdalloc() callers to deal with the need-to-fdexpand() case.
Rewrite unp_externalize() to use fdalloc() and fdexpand() in a safe way, using an algorithm suggested by Bill Sommerfeld:
- Use a temporary array of integers to hold the new filedesc table indexes. This allows us to repeat the loop if necessary.
- Loop through the array of file *'s, assigning them to filedesc table slots. If fdalloc() indicates expansion is necessary, undo the assignments we've done so far, expand, and retry the whole process.
- Once all file *'s have been assigned to slots, update the f_msgcount and unp_rights counters.
- Right before we return, copy the temporary integer array to the message buffer, and trim the length as before.
Note that once locking is added to the filedesc array, this entire operation will be `atomic', in that the lock will be held while file *'s are assigned to embryonic table slots, thus preventing anything else from using them.
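The assign, undo, expand and retry loop can be sketched in user space. This is a simplified model under stated assumptions: `tbl_alloc` and `tbl_expand` are hypothetical stand-ins for fdalloc() and fdexpand(), the table is a plain pointer array, and the counter updates and message-buffer copy are left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct fdtable {
	void **slots;
	int nslots;
};

/* Non-blocking allocation: lowest free index, or -1 if the table is full. */
static int
tbl_alloc(struct fdtable *t, void *fp)
{
	int i;

	for (i = 0; i < t->nslots; i++) {
		if (t->slots[i] == NULL) {
			t->slots[i] = fp;
			return i;
		}
	}
	return -1;
}

/* Potentially blocking step: grow the table.  Returns 0 on success. */
static int
tbl_expand(struct fdtable *t)
{
	int n = t->nslots > 0 ? t->nslots * 2 : 4;
	void **p = realloc(t->slots, (size_t)n * sizeof(*p));

	if (p == NULL)
		return -1;
	memset(p + t->nslots, 0, (size_t)(n - t->nslots) * sizeof(*p));
	t->slots = p;
	t->nslots = n;
	return 0;
}

/* Place all n files, recording indexes in fds[]; undo and retry when full. */
static int
assign_all(struct fdtable *t, void **files, int n, int *fds)
{
	int i;

	assert(n >= 0);
restart:
	for (i = 0; i < n; i++) {
		fds[i] = tbl_alloc(t, files[i]);
		if (fds[i] == -1) {
			while (--i >= 0)        /* undo assignments so far */
				t->slots[fds[i]] = NULL;
			if (tbl_expand(t) != 0)
				return -1;
			goto restart;
		}
	}
	return 0;
}
```

Because the indexes live in a temporary array until the loop completes, a failed pass leaves no trace in the table and the caller never sees a partial result.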
|
1.75 |
| 06-Jun-2001 |
thorpej | Change fdalloc() to return ERESTART if we had to reallocate the descriptor array, which may have blocked. Change callers of fdalloc() to restart whatever they're doing if this condition happens. (XXX unp_externalize() needs some work, but that will be tackled later.)
Change finishdup() to close the descriptor in the `new' slot if one exists, and change sys_dup2() accordingly.
Closes a race condition when using kernel-assisted user threads.
While here, garbage-collect UF_MAPPED -- it is not used anywhere.
|
1.74 |
| 09-Apr-2001 |
jdolecek | Change the first arg to fileops fo_stat routine to struct file *, adjust callers and appropriate routines to cope. This makes fo_stat more consistent with rest of fileops routines and also makes the fo_stat match FreeBSD as an added bonus. Discussed with Luke Mewburn on tech-kern@.
|
1.73 |
| 07-Apr-2001 |
jdolecek | Add new 'stat' fileop and call the stat function via f_ops rather than directly. For compat syscalls, also add necessary FILE_USE()/FILE_UNUSE(). Now that soo_stat() gets a proc arg, pass it on to usrreq function.
|
1.72 |
| 26-Feb-2001 |
lukem | branches: 1.72.2; convert to ANSI KNF
|
1.71 |
| 15-Aug-2000 |
fvdl | Fix omission in previous.
|
1.70 |
| 15-Aug-2000 |
eeh | Fix LP64BE bug.
|
1.69 |
| 04-Jul-2000 |
jdolecek | change tablefull() to accept one more parameter - optional hint
use that to inform about way to raise current limit when we reach maximum number of processes, descriptors or vnodes
XXX hopefully I caught all users of tablefull()
|
1.68 |
| 27-Jun-2000 |
mrg | remove include of <vm/vm.h>
|
1.67 |
| 26-May-2000 |
sommerfeld | branches: 1.67.4; Eliminate incorrect use of "curproc" in a comment.
|
1.66 |
| 30-Mar-2000 |
augustss | Get rid of register declarations.
|
1.65 |
| 23-Mar-2000 |
thorpej | Implement fdremove() which is used in place of all the code that did the "fdp->fd_ofiles[fd] = 0" assignment; fdremove() make sure the fd_freefiles hints stay in sync.
From OpenBSD.
|
1.64 |
| 22-Mar-2000 |
thorpej | Pool'ify filedesc0 allocation.
|
1.63 |
| 24-Jan-2000 |
thorpej | In cwdinit(), if there isn't a cdir vnode yet, don't VREF() it.
|
1.62 |
| 08-Dec-1999 |
sommerfeld | Fix bug observed by Perry and myself: when emacs was shut down uncleanly due to a lost connection, it would hang in closef() waiting for the usecount to go back to 1.
An audit of FILE_USE() vs FILE_UNUSE() usage led me to discover some incorrect error-path code..
In sys_fcntl(), avoid leaking a file descriptor usecount in an error case of F_SETFL; don't return, instead go to "out" to clean up. I suspect that the F_SETFL would fail because vop_fcntl is not implemented in deadfs.
|
1.61 |
| 03-Aug-1999 |
wrstuden | branches: 1.61.2; 1.61.8; Add support for fcntl(2) to generate VOP_FCNTL calls. Any fcntl call with F_FSCTL set and F_SETFL calls generate calls to a new fileop fo_fcntl. Add genfs_fcntl() and soo_fcntl() which return 0 for F_SETFL and EOPNOTSUPP otherwise. Have all leaf filesystems use genfs_fcntl().
Reviewed by: thorpej Tested by: wrstuden
|
1.60 |
| 20-Jun-1999 |
christos | Fix umask inheritance problem introduced by the cwdi changes, whereby child processes would not inherit the parent's umask but get 022 instead.
|
1.59 |
| 05-May-1999 |
thorpej | Add "use counting" to file entries. When closing a file, and its reference count is 0, wait for use count to drain before finishing the close.
This is necessary in order for multiple processes to safely share file descriptor tables.
|
1.58 |
| 30-Apr-1999 |
thorpej | Break cdir/rdir/cmask info out of struct filedesc, and put it in a new substructure, `cwdinfo'. Implement optional sharing of this substructure.
This is required for clone(2).
|
1.57 |
| 24-Mar-1999 |
mrg | branches: 1.57.4; completely remove Mach VM support. all that is left is the header files as UVM still uses (most of) these.
|
1.56 |
| 22-Mar-1999 |
sommerfe | bug fix to fdavail: be consistent about taking per-process descriptor limit into account when checking against the limit; fdp->fd_nfiles may be greater than the current descriptor limit, and there may be space in fdp->fd_ofiles beyond the limit. If we say it's available, unp_externalize will get confused and panic when fdalloc fails.
|
1.55 |
| 31-Aug-1998 |
thorpej | Use the pool allocator and "nointr" pool page allocator for file structures.
|
1.54 |
| 13-Aug-1998 |
kleink | Per POSIX, fail with EINVAL if advisory locking is attempted on a file type that doesn't support it, rather than using a homegrown EBADF or EOPNOTSUPP.
|
1.53 |
| 04-Aug-1998 |
perry | Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
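The argument reordering in this mapping is easy to get wrong, so a small sketch may help. The `_sketch` wrappers are hypothetical names used only to show each old call next to its replacement; they are not part of any library.

```c
#include <assert.h>
#include <string.h>

/* bcopy(src, dst, len): source comes first; memcpy takes destination first. */
static void
bcopy_sketch(const void *src, void *dst, size_t len)
{
	assert(src != NULL && dst != NULL);
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len): overlap-safe copy maps to memmove. */
static void
ovbcopy_sketch(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bzero(buf, len) maps to memset with an explicit zero fill byte. */
static void
bzero_sketch(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```

Note that memset places the fill byte between the pointer and the length, so a mechanical conversion of bzero(x, y) has to insert the 0 in the middle rather than append it.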
|
1.52 |
| 31-Jul-1998 |
perry | fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.
|
1.51 |
| 01-Mar-1998 |
fvdl | branches: 1.51.2; Merge with Lite2 + local changes
|
1.50 |
| 10-Feb-1998 |
mrg |
- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.
|
1.49 |
| 05-Feb-1998 |
mrg | initial import of the new virtual memory system, UVM, into -current.
UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some minor portions derived from the old Mach code. i provided some help getting swap and paging working, and other bug fixes/ideas. chuck silvers <chuq@chuq.com> also provided some other fixes.
this is the rest of the MI portion changes.
this will be KNF'd shortly. :-)
|
1.48 |
| 05-Jan-1998 |
thorpej | Implement file descriptor table sharing. Partially from FreeBSD.
|
1.47 |
| 20-Oct-1997 |
thorpej | Fix the shared library versioning snafu caused by the recent changes to the stat(2) family and msync(2). This uses a primitive function versioning scheme.
This reverts the libc shared library major version from 13 to 12, and adds a few new interfaces to bring us to libc version 12.20.
From Frank van der Linden <fvdl@NetBSD.ORG>.
|
1.46 |
| 19-Oct-1997 |
mycroft | Minor change; remove unnecessary casts.
|
1.45 |
| 15-Oct-1997 |
mycroft | Adjust u_int arguments of some system calls to int, to match user-level prototypes.
|
1.44 |
| 17-Jul-1997 |
phil | In sys_flock, change EBADF to EINVAL because error was generated by a bad argument, not a bad file descriptor. (Found in response to PR 2602.)
|
1.43 |
| 02-Apr-1997 |
kleink | Like in F_SETLK, check if F_GETLK is actually called with a valid lock type.
|
1.42 |
| 30-Mar-1996 |
christos | Eliminate kern_conf.h
|
1.41 |
| 29-Mar-1996 |
cgd | kill unnecessary (and sometimes dangerous) casts of ioctl commands to int
|
1.40 |
| 14-Mar-1996 |
christos |
- fdopen -> filedescopen
- bring kgdb prototype in scope.
|
1.39 |
| 09-Feb-1996 |
christos | More proto fixes
|
1.38 |
| 04-Feb-1996 |
christos | First pass at prototyping
|
1.37 |
| 07-Oct-1995 |
mycroft | Prefix names of system call implementation functions with `sys_'.
|
1.36 |
| 19-Sep-1995 |
thorpej | Make system calls conform to a standard prototype and bring those prototypes into scope.
|
1.35 |
| 24-Jun-1995 |
christos | Extracted all of the compat_xxx routines, and created a library [libcompat] for them. There are a few #ifdef COMPAT_XX remaining, but they are not easy or worth eliminating (yet).
|
1.34 |
| 10-Apr-1995 |
mycroft | Change `fdclose' to `fdrelease', to avoid confusion with device interfaces.
|
1.33 |
| 08-Mar-1995 |
cgd | need COMPAT_OSF1 for some things
|
1.32 |
| 15-Feb-1995 |
mycroft | NULL out file descriptors as they're closed, for the benefit of fstat(8).
|
1.31 |
| 23-Jan-1995 |
cgd | ooops. forgot to enable fpathconf's use of VOP_PATHCONF!
|
1.30 |
| 12-Jan-1995 |
cgd | cast pointer to long, not int
|
1.29 |
| 14-Dec-1994 |
mycroft | Remove old declaration.
|
1.28 |
| 14-Dec-1994 |
mycroft | Revert dup handling.
|
1.27 |
| 04-Dec-1994 |
mycroft | Abstract out the code to maintain fd_lastfile. Remove the old dup() compatibility kluge. Rearrange fdopen() handling. Make a common function to handle closing a particular file descriptor in a process. Some other cleanup.
|
1.26 |
| 30-Oct-1994 |
cgd | be more careful with types, also pull in headers where necessary.
|
1.25 |
| 20-Oct-1994 |
cgd | update for new syscall args description mechanism
|
1.24 |
| 30-Aug-1994 |
mycroft | Convert process, file, and namei lists and hash tables to use queue.h.
|
1.23 |
| 15-Aug-1994 |
mycroft | Need ofstat() for iBCS2 syscall conversion.
|
1.22 |
| 29-Jun-1994 |
cgd | branches: 1.22.2; New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'
|
1.21 |
| 22-Jun-1994 |
mycroft | Make ogetdtablesize if COMPAT_HPUX.
|
1.20 |
| 16-Jun-1994 |
glass | compat_ultrix
|
1.19 |
| 14-Jun-1994 |
chopps | getdtablesize used by sunos compat code.
|
1.18 |
| 14-Jun-1994 |
cgd | make getdtablesize COMPAT_43; should be COMPAT_44 or _09, but that has probs
|
1.17 |
| 19-May-1994 |
cgd | update to 4.4-Lite, with some local changes
|
1.16 |
| 17-May-1994 |
cgd | copyright foo
|
1.15 |
| 07-May-1994 |
cgd | stub fpathconf
|
1.14 |
| 04-May-1994 |
cgd | Rename a lot of process flags.
|
1.13 |
| 27-Mar-1994 |
cgd | expand uid_t/gid_t/off_t
|
1.12 |
| 04-Jan-1994 |
cgd | generalize dupfdopen() to allow dups and moves. from jsp
|
1.11 |
| 21-Dec-1993 |
cgd | more of the same; gah!
|
1.10 |
| 21-Dec-1993 |
cgd | kill a billism
|
1.9 |
| 18-Dec-1993 |
mycroft | Canonicalize all #includes.
|
1.8 |
| 23-Aug-1993 |
mycroft | branches: 1.8.2; RLIMIT_OFILE --> RLIMIT_NOFILE
|
1.7 |
| 13-Jul-1993 |
cgd | break args structs out, into syscallname_args structs, so gcc2 doesn't whine so much.
|
1.6 |
| 27-Jun-1993 |
andrew | ANSIfications - removed all implicit function return types and argument definitions. Ensured that all files include "systm.h" to gain access to general prototypes. Casts where necessary.
|
1.5 |
| 22-May-1993 |
cgd | add include of select.h if necessary for protos, or delete if extraneous
|
1.4 |
| 18-May-1993 |
cgd | make kernel select interface be one-stop shopping & clean it all up.
|
1.3 |
| 04-Apr-1993 |
cgd | now uses `maxfdescs' to bound `openfiles' resource limit.
|
1.2 |
| 23-Mar-1993 |
cgd | modified files to support kernfs and fdesc fs
|
1.1 |
| 21-Mar-1993 |
cgd | branches: 1.1.1; Initial revision
|
1.1.1.3 |
| 01-Mar-1998 |
fvdl | Import 4.4BSD-Lite2
|
1.1.1.2 |
| 01-Mar-1998 |
fvdl | Import 4.4BSD-Lite for reference
|
1.1.1.1 |
| 21-Mar-1993 |
cgd | initial import of 386bsd-0.1 sources
|
1.8.2.2 |
| 21-Dec-1993 |
cgd | from trunk
|
1.8.2.1 |
| 14-Nov-1993 |
mycroft | Canonicalize all #includes.
|
1.22.2.1 |
| 15-Aug-1994 |
mycroft | update from trunk
|
1.51.2.1 |
| 08-Aug-1998 |
eeh | Revert cdevsw mmap routines to return int.
|
1.57.4.2 |
| 01-Jul-1999 |
thorpej | Sync w/ -current.
|
1.57.4.1 |
| 21-Jun-1999 |
thorpej | Sync w/ -current.
|
1.61.8.1 |
| 27-Dec-1999 |
wrstuden | Pull up to last week's -current.
|
1.61.2.3 |
| 21-Apr-2001 |
bouyer | Sync with HEAD
|
1.61.2.2 |
| 12-Mar-2001 |
bouyer | Sync with HEAD.
|
1.61.2.1 |
| 20-Nov-2000 |
bouyer | Update thorpej_scsipi to -current as of a month ago
|
1.67.4.8 |
| 27-Apr-2002 |
he | Apply patch (requested by christos): Adapt previous pull-up to the branch.
|
1.67.4.7 |
| 26-Apr-2002 |
he | Pull up revisions 1.86-1.88 (requested by christos): If a set{u,g}id binary is invoked with fd < 3 closed, open those file descriptors to /dev/null.
|
1.67.4.6 |
| 09-Feb-2002 |
he | Apply patch (requested by windsor): Correct typo in previous pull-up.
|
1.67.4.5 |
| 09-Feb-2002 |
he | Pull up revision 1.84 (via patch, requested by kleink): Fix an LP64-BE bug with fcntl(..., F_GETOWN, ...).
|
1.67.4.4 |
| 29-Jul-2001 |
he | Pull up revision 1.80 (via patch, requested by thorpej): Unshare the file descriptor table and ``cwdinfo'' when we exec.
|
1.67.4.3 |
| 10-Jun-2001 |
he | Pull up revision 1.75 (via patch, requested by thorpej): Change fdalloc() to return ERESTART if reallocation of the descriptor array was needed, and change uses to handle that condition. Make finishdup() close the descriptor in the new slot if it exists, and change sys_dup2() accordingly. Closes a race condition when using kernel-assisted user threads.
|
1.67.4.2 |
| 26-Aug-2000 |
mrg | pull up 1.70, 1.71. approved by thorpej: 1.70 >Fix LP64BE bug. 1.71 >Fix omission in previous.
|
1.67.4.1 |
| 04-Jul-2000 |
jdolecek | Pullup from trunk [approved by thorpej]:
change tablefull() to accept one more parameter - optional hint
use that to inform about way to raise current limit when we reach maximum number of processes, descriptors or vnodes
|
1.72.2.17 |
| 07-Jan-2003 |
thorpej | Sync with HEAD.
|
1.72.2.16 |
| 11-Dec-2002 |
thorpej | Sync with HEAD.
|
1.72.2.15 |
| 11-Nov-2002 |
nathanw | Catch up to -current
|
1.72.2.14 |
| 18-Oct-2002 |
nathanw | Catch up to -current.
|
1.72.2.13 |
| 17-Sep-2002 |
nathanw | Catch up to -current.
|
1.72.2.12 |
| 12-Jul-2002 |
nathanw | No longer need to pull in lwp.h; proc.h pulls it in for us.
|
1.72.2.11 |
| 10-Jul-2002 |
nathanw | Whitespace.
|
1.72.2.10 |
| 20-Jun-2002 |
nathanw | Catch up to -current.
|
1.72.2.9 |
| 29-May-2002 |
nathanw | #include <sys/sa.h> before <sys/syscallargs.h>, to provide sa_upcall_t now that <sys/param.h> doesn't include <sys/sa.h>.
(Behold the Power of Ed)
|
1.72.2.8 |
| 01-Apr-2002 |
nathanw | Catch up to -current. (CVS: It's not just a program. It's an adventure!)
|
1.72.2.7 |
| 28-Feb-2002 |
nathanw | Catch up to -current.
|
1.72.2.6 |
| 08-Jan-2002 |
nathanw | Catch up to -current.
|
1.72.2.5 |
| 14-Nov-2001 |
nathanw | Catch up to -current.
|
1.72.2.4 |
| 24-Aug-2001 |
nathanw | Catch up with -current.
|
1.72.2.3 |
| 21-Jun-2001 |
nathanw | Catch up to -current.
|
1.72.2.2 |
| 09-Apr-2001 |
nathanw | Catch up with -current.
|
1.72.2.1 |
| 05-Mar-2001 |
nathanw | Initial commit of scheduler activations and lightweight process support.
|
1.79.2.10 |
| 12-Oct-2002 |
jdolecek | need knote_fdclose() in finishdup()
|
1.79.2.9 |
| 10-Oct-2002 |
jdolecek | sync kqueue with -current; this includes merge of gehenna-devsw branch, merge of i386 MP branch, and part of autoconf rototil work
|
1.79.2.8 |
| 06-Sep-2002 |
jdolecek | sync kqueue branch with HEAD
|
1.79.2.7 |
| 23-Jun-2002 |
jdolecek | catch up with -current on kqueue branch
|
1.79.2.6 |
| 16-Mar-2002 |
jdolecek | Catch up with -current.
|
1.79.2.5 |
| 15-Mar-2002 |
jdolecek | fdfree(): fix the argument to knote_fdfree() - 'i' is not the descriptor value, it's an index counting from fdp->fd_lastfile down; this fixes deadlock in closef() when watched descriptor is lower than the kqueue one and the process has open further descriptors
finishdup(): add a comment that a knote_fdfree() call is needed there; will address this later
|
1.79.2.4 |
| 11-Feb-2002 |
jdolecek | Sync w/ -current.
|
1.79.2.3 |
| 10-Jan-2002 |
thorpej | Sync kqueue branch with -current.
|
1.79.2.2 |
| 03-Aug-2001 |
lukem | update to -current
|
1.79.2.1 |
| 10-Jul-2001 |
lukem | create and destroy fd_kn{list,hash} entries as appropriate (for kqueue use)
|
1.80.4.1 |
| 12-Nov-2001 |
thorpej | Sync the thorpej-mips-cache branch with -current.
|
1.80.2.1 |
| 07-Sep-2001 |
thorpej | Commit my "devvp" changes to the thorpej-devvp branch. This replaces the use of dev_t in most places with a struct vnode *.
This will form the basic infrastructure for real cloning device support (besides being architecturally cleaner -- it'll be good to get away from using numbers to represent objects).
|
1.92.2.2 |
| 15-Jul-2002 |
gehenna | catch up with -current.
|
1.92.2.1 |
| 16-May-2002 |
gehenna | Add the character device switch.
|
1.110.2.11 |
| 11-Dec-2005 |
christos | Sync with head.
|
1.110.2.10 |
| 10-Nov-2005 |
skrll | Sync with HEAD. Here we go again...
|
1.110.2.9 |
| 04-Mar-2005 |
skrll | Sync with HEAD.
Hi Perry!
|
1.110.2.8 |
| 24-Feb-2005 |
skrll | Reduce diff to HEAD
|
1.110.2.7 |
| 15-Feb-2005 |
skrll | Sync with HEAD.
|
1.110.2.6 |
| 17-Jan-2005 |
skrll | Sync with HEAD.
|
1.110.2.5 |
| 18-Dec-2004 |
skrll | Sync with HEAD.
|
1.110.2.4 |
| 21-Sep-2004 |
skrll | Fix the sync with head I botched.
|
1.110.2.3 |
| 18-Sep-2004 |
skrll | Sync with HEAD.
|
1.110.2.2 |
| 03-Aug-2004 |
skrll | Sync with HEAD
|
1.110.2.1 |
| 02-Jul-2003 |
darrenr | Apply the aborted ktrace-lwp changes to a specific branch. This is just for others to review. I'm concerned that patch fuzziness may have resulted in some errant code being generated, but I'll look at that later by comparing the diff from the base to the branch with the file I attempt to apply to it. This will, at the very least, put the changes in a better context for others to review them and attempt to tinker with removing passing of 'struct lwp' through the kernel.
|
1.123.2.3 |
| 24-May-2005 |
riz | Pull up revision 1.132 (requested by wrstuden in ticket #1537): The file being closed is (fdp->fd_lastfile - i), not i. So compare (fdp->fd_lastfile - i) against fd_knlistsize. Otherwise we can call knote_fdclose() on a file descriptor that doesn't have a knote. This issue explains random panics I have had on process exit over the past few years.
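The index mix-up described above can be sketched in isolation. This is a minimal illustration, not the kernel's fdfree(): the function name `knote_hook_fds`, the `out` array, and the plain-integer parameters are hypothetical, and it assumes that only descriptors below `knlistsize` can carry a knote.

```c
#include <assert.h>

/*
 * Hypothetical sketch: the loop counter i counts down from lastfile,
 * while the slot visited on each iteration is (lastfile - i).
 * Records in out[] the descriptors for which the knote hook would run
 * and returns how many there were.
 */
static int
knote_hook_fds(int lastfile, int knlistsize, int *out)
{
	int n = 0;

	assert(lastfile >= -1 && knlistsize >= 0);
	for (int i = lastfile; i >= 0; i--) {
		int fd = lastfile - i;	/* descriptor being closed */

		/* Compare the descriptor, not the counter i. */
		if (fd < knlistsize)
			out[n++] = fd;
	}
	return n;
}
```

Comparing `i` instead of `fd` against `knlistsize` would select the descriptors at the opposite end of the table, which matches the symptom the commit message describes.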
|
1.123.2.2 |
| 16-Mar-2005 |
tron | Pull up revision 1.128 via patch (requested by cube in ticket #1089): fd_lastfile should be -1 when there are no opened file descriptors. Hence, make find_last_set return -1 in such situation, and initialize it as such. Otherwise, with 0 meaning two things, it confused the F_CLOSEM fcntl which could end up looping indefinitely (PR#28929 by Brian Marcotte). However, this change brings to light another bug in fdcopy(), where more entries than needed were cleared in the new file descriptor table, so the memset() call there is fixed too. Analyzed with the help of Greg Oster.
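Why the -1 sentinel matters can be shown with a small sketch. It is not the kernel's find_last_set(): the names `last_set` and `close_from`, the 32-bit bitmap layout, and the loop shape are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: return the index of the highest set bit in a
 * bitmap of nwords 32-bit words, or -1 when no bit is set.
 */
static int
last_set(const uint32_t *bitmap, int nwords)
{
	for (int w = nwords - 1; w >= 0; w--) {
		if (bitmap[w] == 0)
			continue;
		for (int b = 31; b >= 0; b--) {
			if (bitmap[w] & (UINT32_C(1) << b))
				return w * 32 + b;
		}
	}
	return -1;	/* empty table: distinct from "descriptor 0" */
}

/*
 * Close every descriptor >= from, highest first, and return how many
 * were closed. The loop ends because an empty table yields -1, which
 * is below any valid 'from'.
 */
static int
close_from(uint32_t *bitmap, int nwords, int from)
{
	int closed = 0, last;

	assert(from >= 0);
	while ((last = last_set(bitmap, nwords)) >= from) {
		bitmap[last / 32] &= ~(UINT32_C(1) << (last % 32));
		closed++;
	}
	return closed;
}
```

If an empty table reported 0 instead of -1, a call with `from == 0` would never see the loop condition become false, which is the kind of endless loop the commit message attributes to F_CLOSEM.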
|
1.123.2.1 |
| 10-Jul-2004 |
tron | branches: 1.123.2.1.2; Pull up revision 1.124 (requested by tls in ticket #634): add assertions related to file descriptor allocation.
|
1.123.2.1.2.2 |
| 24-May-2005 |
riz | Pull up revision 1.132 (requested by wrstuden in ticket #1537): The file being closed is (fdp->fd_lastfile - i), not i. So compare (fdp->fd_lastfile - i) against fd_knlistsize. Otherwise we can call knote_fdclose() on a file descriptor that doesn't have a knote. This issue explains random panics I have had on process exit over the past few years.
|
1.123.2.1.2.1 |
| 16-Mar-2005 |
tron | Pull up revision 1.128 via patch (requested by cube in ticket #1089): fd_lastfile should be -1 when there are no opened file descriptors. Hence, make find_last_set return -1 in such situation, and initialize it as such. Otherwise, with 0 meaning two things, it confused the F_CLOSEM fcntl which could end up looping indefinitely (PR#28929 by Brian Marcotte). However, this change brings to light another bug in fdcopy(), where more entries than needed were cleared in the new file descriptor table, so the memset() call there is fixed too. Analyzed with the help of Greg Oster.
|
1.129.4.1 |
| 19-Mar-2005 |
yamt | sync with head. xen and whitespace. xen part is not finished.
|
1.129.2.1 |
| 29-Apr-2005 |
kent | sync with -current
|
1.131.2.1 |
| 28-May-2005 |
tron | Pull up revision 1.132 (requested by wrstuden in ticket #331): The file being closed is (fdp->fd_lastfile - i), not i. So compare (fdp->fd_lastfile - i) against fd_knlistsize. Otherwise we can call knote_fdclose() on a file descriptor that doesn't have a knote. This issue explains random panics I have had on process exit over the past few years.
|
1.134.2.11 |
| 24-Mar-2008 |
yamt | sync with head.
|
1.134.2.10 |
| 11-Feb-2008 |
yamt | sync with head.
|
1.134.2.9 |
| 04-Feb-2008 |
yamt | sync with head.
|
1.134.2.8 |
| 21-Jan-2008 |
yamt | sync with head
|
1.134.2.7 |
| 07-Dec-2007 |
yamt | sync with head
|
1.134.2.6 |
| 15-Nov-2007 |
yamt | sync with head.
|
1.134.2.5 |
| 27-Oct-2007 |
yamt | sync with head.
|
1.134.2.4 |
| 03-Sep-2007 |
yamt | sync with head.
|
1.134.2.3 |
| 26-Feb-2007 |
yamt | sync with head.
|
1.134.2.2 |
| 30-Dec-2006 |
yamt | sync with head.
|
1.134.2.1 |
| 21-Jun-2006 |
yamt | sync with head.
|
1.136.6.6 |
| 18-Nov-2005 |
yamt | - associate read-ahead context to vnode, rather than file.
- revert VOP_READ prototype.
|
1.136.6.5 |
| 17-Nov-2005 |
yamt | use UVM_ADV_ rather than POSIX_FADV_.
|
1.136.6.4 |
| 16-Nov-2005 |
yamt | update a comment following posix_fadvise prototype change.
|
1.136.6.3 |
| 16-Nov-2005 |
yamt | sys_posix_fadvise: correct how to return an error.
|
1.136.6.2 |
| 15-Nov-2005 |
yamt | add posix_fadvise.
|
1.136.6.1 |
| 15-Nov-2005 |
yamt | - setup/cleanup readahead context.
- adapt to the new VOP_READ prototype.
|
1.139.2.1 |
| 01-Feb-2006 |
yamt | sync with head.
|
1.140.6.4 |
| 03-Sep-2006 |
yamt | sync with head.
|
1.140.6.3 |
| 11-Aug-2006 |
yamt | sync with head
|
1.140.6.2 |
| 24-May-2006 |
yamt | sync with head.
|
1.140.6.1 |
| 13-Mar-2006 |
yamt | sync with head.
|
1.140.4.2 |
| 01-Jun-2006 |
kardel | Sync with head.
|
1.140.4.1 |
| 22-Apr-2006 |
simonb | Sync with head.
|
1.140.2.1 |
| 09-Sep-2006 |
rpaulo | sync with head
|
1.141.4.1 |
| 24-May-2006 |
tron | Merge 2006-05-24 NetBSD-current into the "peter-altq" branch.
|
1.141.2.4 |
| 06-May-2006 |
christos | - Move kauth_cred_t declaration to <sys/types.h>
- Cleanup struct ucred; forward declarations that are unused.
- Don't include <sys/kauth.h> in any header, but include it in the c files that need it.
Approved by core.
|
1.141.2.3 |
| 19-Apr-2006 |
elad | sync with head.
|
1.141.2.2 |
| 08-Mar-2006 |
elad | Adapt to kernel authorization KPI.
|
1.141.2.1 |
| 07-Mar-2006 |
elad | file kern_descrip.c was added on branch elad-kernelauth on 2006-03-08 00:53:40 +0000
|
1.145.4.2 |
| 10-Dec-2006 |
yamt | sync with head.
|
1.145.4.1 |
| 22-Oct-2006 |
yamt | sync with head
|
1.145.2.6 |
| 01-Feb-2007 |
ad | Sync with head.
|
1.145.2.5 |
| 30-Jan-2007 |
ad | Remove support for SA. Ok core@.
|
1.145.2.4 |
| 12-Jan-2007 |
ad | Sync with head.
|
1.145.2.3 |
| 18-Nov-2006 |
ad | Sync with head.
|
1.145.2.2 |
| 17-Nov-2006 |
ad | Checkpoint work in progress.
|
1.145.2.1 |
| 11-Sep-2006 |
ad | - Allocate and free turnstiles where needed.
- Split proclist_mutex and alllwp_mutex out of the proclist_lock, and use in interrupt context.
- Fix an MP race in enterpgrp()/setsid().
- Acquire proclist_lock and p_crmutex in some obvious places.
|
1.150.2.5 |
| 17-May-2007 |
yamt | sync with head.
|
1.150.2.4 |
| 07-May-2007 |
yamt | sync with head.
|
1.150.2.3 |
| 24-Mar-2007 |
yamt | sync with head.
|
1.150.2.2 |
| 12-Mar-2007 |
rmind | Sync with HEAD.
|
1.150.2.1 |
| 27-Feb-2007 |
yamt | - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
|
1.153.2.9 |
| 09-Oct-2007 |
ad | Sync with head.
|
1.153.2.8 |
| 09-Oct-2007 |
ad | Sync with head.
|
1.153.2.7 |
| 01-Sep-2007 |
ad | Use pool_cache for allocating a few more types of objects.
|
1.153.2.6 |
| 09-Jul-2007 |
ad | closef: restore check for l == NULL removed in revision 1.153.2.4. Noted by yamt.
|
1.153.2.5 |
| 08-Jun-2007 |
ad | Sync with head.
|
1.153.2.4 |
| 13-May-2007 |
ad | - Pass the error number and residual count to biodone(), and let it handle setting error indicators. Prepare to eliminate B_ERROR.
- Add a flag argument to brelse() to be set into the buf's flags, instead of doing it directly. Typically used to set B_INVAL.
- Add a "struct cpu_info *" argument to kthread_create(), to be used to create bound threads. Change "bool mpsafe" to "int flags".
- Allow exit of LWPs in the IDL state when (l != curlwp).
- More locking fixes & conversion to the new API.
|
1.153.2.3 |
| 12-Apr-2007 |
ad | filedesc::fd_lock a reader/writer lock, for multithreaded processes.
|
1.153.2.2 |
| 21-Mar-2007 |
ad | - Replace more simple_locks, and fix up in a few places.
- Use condition variables.
- LOCK_ASSERT -> KASSERT.
|
1.153.2.1 |
| 13-Mar-2007 |
ad | Sync with head.
|
1.154.4.1 |
| 29-Mar-2007 |
reinoud | Pullup to -current
|
1.154.2.1 |
| 11-Jul-2007 |
mjf | Sync with head.
|
1.159.8.4 |
| 23-Mar-2008 |
matt | sync with HEAD
|
1.159.8.3 |
| 09-Jan-2008 |
matt | sync with HEAD
|
1.159.8.2 |
| 08-Nov-2007 |
matt | sync with -HEAD
|
1.159.8.1 |
| 06-Nov-2007 |
matt | sync with HEAD
|
1.159.6.5 |
| 09-Dec-2007 |
jmcneill | Sync with HEAD.
|
1.159.6.4 |
| 03-Dec-2007 |
joerg | Sync with HEAD.
|
1.159.6.3 |
| 11-Nov-2007 |
joerg | Sync with HEAD.
|
1.159.6.2 |
| 26-Oct-2007 |
joerg | Sync with HEAD.
Follow the merge of pmap.c on i386 and amd64 and move pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup code to restore CR4 before jumping back into kernel space as the large page option might cover that.
|
1.159.6.1 |
| 02-Oct-2007 |
joerg | Sync with HEAD.
|
1.159.2.1 |
| 10-Sep-2007 |
skrll | Sync with HEAD.
|
1.160.2.1 |
| 14-Oct-2007 |
yamt | sync with head.
|
1.161.4.4 |
| 18-Feb-2008 |
mjf | Sync with HEAD.
|
1.161.4.3 |
| 27-Dec-2007 |
mjf | Sync with HEAD.
|
1.161.4.2 |
| 08-Dec-2007 |
mjf | Sync with HEAD.
|
1.161.4.1 |
| 19-Nov-2007 |
mjf | Sync with HEAD.
|
1.161.2.1 |
| 13-Nov-2007 |
bouyer | Sync with HEAD
|
1.164.2.3 |
| 26-Dec-2007 |
ad | Sync with head.
|
1.164.2.2 |
| 13-Dec-2007 |
ad | Unused var
|
1.164.2.1 |
| 13-Dec-2007 |
ad | Eliminate contention on filelist_lock.
|
1.165.4.2 |
| 08-Jan-2008 |
bouyer | Sync with HEAD
|
1.165.4.1 |
| 02-Jan-2008 |
bouyer | Sync with HEAD
|
1.172.6.5 |
| 17-Jan-2009 |
mjf | Sync with HEAD.
|
1.172.6.4 |
| 02-Jul-2008 |
mjf | Sync with HEAD.
|
1.172.6.3 |
| 29-Jun-2008 |
mjf | Sync with HEAD.
|
1.172.6.2 |
| 02-Jun-2008 |
mjf | Sync with HEAD.
|
1.172.6.1 |
| 03-Apr-2008 |
mjf | Sync with HEAD.
|
1.175.2.1 |
| 18-May-2008 |
yamt | sync with head.
|
1.177.2.8 |
| 09-Oct-2010 |
yamt | sync with head
|
1.177.2.7 |
| 11-Aug-2010 |
yamt | sync with head.
|
1.177.2.6 |
| 11-Mar-2010 |
yamt | sync with head
|
1.177.2.5 |
| 19-Aug-2009 |
yamt | sync with head.
|
1.177.2.4 |
| 18-Jul-2009 |
yamt | sync with head.
|
1.177.2.3 |
| 20-Jun-2009 |
yamt | sync with head
|
1.177.2.2 |
| 04-May-2009 |
yamt | sync with head.
|
1.177.2.1 |
| 16-May-2008 |
yamt | sync with head.
|
1.179.4.2 |
| 03-Jul-2008 |
simonb | Sync with head.
|
1.179.4.1 |
| 27-Jun-2008 |
simonb | Sync with head.
|
1.179.2.3 |
| 18-Sep-2008 |
wrstuden | Sync with wrstuden-revivesa-base-2.
|
1.179.2.2 |
| 14-May-2008 |
wrstuden | Per discussion with ad, remove most of the #include <sys/sa.h> lines as they were including sa.h just for the type(s) needed for syscallargs.h.
Instead, create a new file, sys/satypes.h, which contains just the types needed for syscallargs.h. Yes, there's only one now, but that may change and it's probably more likely to change if it'd be difficult to handle. :-)
Per discussion with matt at n dot o, add an include of satypes.h to sigtypes.h. Upcall handlers are kinda signal handlers, and signalling is the header file that's already included for syscallargs.h that most closely matches SA.
This shaves about 3000 lines off of the diff of the branch relative to the base. That also represents about 18% of the total before this checkin.
I think this reduction is a very good thing.
|
1.179.2.1 |
| 10-May-2008 |
wrstuden | Initial checkin of re-adding SA. Everything except kern_sa.c compiles in GENERIC for i386. This is still a work-in-progress, but this checkin covers most of the mechanical work (changing signalling to be able to accommodate SA's process-wide signalling and re-adding includes of sys/sa.h and savar.h). Subsequent changes will be much more interesting.
Also, kern_sa.c has received partial cleanup. There's still more to do, though.
|
1.182.6.6 |
| 04-Apr-2009 |
snj | Pull up following revision(s) (requested by ad in ticket #661): sys/arch/xen/xen/xenevt.c: revision 1.32 sys/compat/svr4/svr4_net.c: revision 1.56 sys/compat/svr4_32/svr4_32_net.c: revision 1.19 sys/dev/dmover/dmover_io.c: revision 1.32 sys/dev/putter/putter.c: revision 1.21 sys/kern/kern_descrip.c: revision 1.190 sys/kern/kern_drvctl.c: revision 1.23 sys/kern/kern_event.c: revision 1.64 sys/kern/sys_mqueue.c: revision 1.14 sys/kern/sys_pipe.c: revision 1.109 sys/kern/sys_socket.c: revision 1.59 sys/kern/uipc_syscalls.c: revision 1.136 sys/kern/vfs_vnops.c: revision 1.164 sys/kern/uipc_socket.c: revision 1.188 sys/net/bpf.c: revision 1.144 sys/net/if_tap.c: revision 1.55 sys/opencrypto/cryptodev.c: revision 1.47 sys/sys/file.h: revision 1.67 sys/sys/param.h: patch sys/sys/socketvar.h: revision 1.119
Add fileops::fo_drain(), to be called from fd_close() when there is more than one active reference to a file descriptor. It should dislodge threads sleeping while holding a reference to the descriptor. Implemented only for sockets but should be extended to pipes, fifos, etc.
Fixes the case of a multithreaded process doing something like the following, which would have hung until the process got a signal.
thr0 accept(fd, ...)
thr1 close(fd)
|
1.182.6.5 |
| 31-Mar-2009 |
snj | Pull up following revision(s) (requested by rmind in ticket #619): sys/kern/kern_descrip.c: revision 1.189 fownsignal: pre-check for zero pgid, avoids locking of proc_lock.
|
1.182.6.4 |
| 18-Mar-2009 |
snj | Pull up following revision(s) (requested by mrg in ticket #577): sys/kern/kern_descrip.c: revision 1.188 sys/kern/uipc_usrreq.c: revision 1.121 sys/sys/fcntl.h: revision 1.35 sys/sys/file.h: revision 1.66 sys/sys/param.h: patch sys/sys/un.h: revision 1.45
completely rework the way that orphaned sockets that are being fdpassed via SCM_RIGHTS messages are dealt with:
1. unp_gc: make this a kthread.
2. unp_detach: do not call unp_gc directly. instead, wake up unp_gc kthread.
3. unp_scan: do not close files here. instead, put them on a global list for unp_gc to close, along with a per-file "deferred close count". if file is already enqueued for close, just increment deferred close count. this eliminates the recursive calls.
4. unp_gc: scan files on global deferred close list. close each file N times, as specified by deferred close count in file. continue processing list until it becomes empty (closing may cause additional files to be queued for close).
5. unp_gc: add additional bit to mark files we are scanning. set during initial scan of global file list that currently clears FMARK/FDEFER. during later scans, never examine / garbage collect descriptors that we have not marked during the earlier scan. do not proceed with this initial scan until all deferred closes have been processed. be careful with locking to ensure no races are introduced between deferred close and file scan.
6. unp_gc: use dummy file_t to mark position in list when scanning. allow us to drop filelist_lock. in turn allows us to eliminate kmem_alloc() and safely close files, etc.
7. prohibit transfer of descriptors within SCM_RIGHTS messages if (num_files_in_transit > maxfiles / unp_rights_ratio)
8. fd_allocfile: ensure recycled files don't get scanned.
this is 97% work done by andrew doran, with a couple of minor bug fixes and a lot of testing by yours truly.
|
1.182.6.3 |
| 15-Mar-2009 |
snj | Pull up following revision(s) (requested by mrg in ticket #566): sys/kern/init_sysctl.c: revision 1.157 sys/kern/kern_descrip.c: revision 1.187 usr.sbin/pstat/pstat.c: revision 1.112 Don't bother with file_t::f_iflags any more, as it's not used. Noted by mrg@.
|
1.182.6.2 |
| 02-Mar-2009 |
snj | Pull up following revision(s) (requested by rmind in ticket #542): sys/kern/kern_descrip.c: revision 1.186 fd_copy: fix off-by-one bug in a race condition path and assert. Should fix PR/40625. OK by <ad>.
|
1.182.6.1 |
| 02-Feb-2009 |
snj | Pull up following revision(s) (requested by ad in ticket #358): sys/kern/kern_descrip.c: revision 1.185
- Fix a bug where we trashed descriptor zero in the old open files array while ironically trying to preserve the same during copy. Would only have occurred if a multithreaded program expanded the descriptor table and, within a tiny window of exposure, another thread in the program tried to access descriptor zero.
- Convert to use kmem_alloc/kmem_free.
|
1.182.4.3 |
| 28-Apr-2009 |
skrll | Sync with HEAD.
|
1.182.4.2 |
| 03-Mar-2009 |
skrll | Sync with HEAD.
|
1.182.4.1 |
| 19-Jan-2009 |
skrll | Sync with HEAD.
|
1.182.2.1 |
| 13-Dec-2008 |
haad | Update haad-dm branch to haad-dm-base2.
|
1.185.2.2 |
| 23-Jul-2009 |
jym | Sync with HEAD.
|
1.185.2.1 |
| 13-May-2009 |
jym | Sync with HEAD.
Commit is split, to avoid a "too many arguments" protocol error.
|
1.202.4.4 |
| 31-May-2011 |
rmind | sync with head
|
1.202.4.3 |
| 21-Apr-2011 |
rmind | sync with head
|
1.202.4.2 |
| 05-Mar-2011 |
rmind | sync with head
|
1.202.4.1 |
| 03-Jul-2010 |
rmind | sync with head
|
1.202.2.3 |
| 06-Nov-2010 |
uebayasi | Sync with HEAD.
|
1.202.2.2 |
| 22-Oct-2010 |
uebayasi | Sync with HEAD (-D20101022).
|
1.202.2.1 |
| 17-Aug-2010 |
uebayasi | Sync with HEAD.
|
1.209.4.2 |
| 17-Feb-2011 |
bouyer | Sync with HEAD
|
1.209.4.1 |
| 08-Feb-2011 |
bouyer | Sync with HEAD
|
1.209.2.1 |
| 06-Jun-2011 |
jruoho | Sync with HEAD.
|
1.217.6.1 |
| 18-Feb-2012 |
mrg | merge to -current.
|
1.217.2.3 |
| 22-May-2014 |
yamt | sync with head.
for a reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
|
1.217.2.2 |
| 16-Jan-2013 |
yamt | sync with (a bit old) head
|
1.217.2.1 |
| 17-Apr-2012 |
yamt | sync with head
|
1.218.8.1 |
| 24-Nov-2012 |
jdc | Pull up revisions: src/sys/kern/kern_event.c revision 1.79 src/sys/kern/kern_descrip.c revision 1.219 src/lib/libc/sys/kqueue.2 revision 1.33 src/tests/lib/libc/sys/t_kevent.c revision 1.2-1.5 (requested by christos in ticket #716).
- initialize kn_id
- in close, invalidate f_data and f_type early to prevent accidental re-use
- add a DIAGNOSTIC for when we use unsupported fd's and a KASSERT for f_event being NULL.
Return EOPNOTSUPP for fnullop_kqfilter to prevent registration of unsupported fds. XXX: We should really fix the fd's to be supported in the future. Unsupported fd's have a NULL f_event, so registering crashes the kernel with a NULL function dereference of f_event.
mention that kevent now returns EOPNOTSUPP.
Move the references to PRs from code comments to the test description. Once ATF has the ability to output the metadata in the HTML reports, it should be easy to traverse between releng and gnats -reports via links.
Add a (skipped for now) test case for PR 46463
adapt to new reality
Add a test for adding an event to an unsupported fd.
|
1.218.6.3 |
| 03-Dec-2017 |
jdolecek | update from HEAD
|
1.218.6.2 |
| 20-Aug-2014 |
tls | Rebase to HEAD as of a few days ago.
|
1.218.6.1 |
| 25-Feb-2013 |
tls | resync with head
|
1.218.2.1 |
| 24-Nov-2012 |
jdc | Pull up revisions: src/sys/kern/kern_event.c revision 1.79 src/sys/kern/kern_descrip.c revision 1.219 src/lib/libc/sys/kqueue.2 revision 1.33 src/tests/lib/libc/sys/t_kevent.c revision 1.2-1.5 (requested by christos in ticket #716).
- initialize kn_id
- in close, invalidate f_data and f_type early to prevent accidental re-use
- add a DIAGNOSTIC for when we use unsupported fd's and a KASSERT for f_event being NULL.
Return EOPNOTSUPP for fnullop_kqfilter to prevent registration of unsupported fds. XXX: We should really fix the fd's to be supported in the future. Unsupported fd's have a NULL f_event, so registering crashes the kernel with a NULL function dereference of f_event.
mention that kevent now returns EOPNOTSUPP.
Move the references to PRs from code comments to the test description. Once ATF has the ability to output the metadata in the HTML reports, it should be easy to traverse between releng and gnats -reports via links.
Add a (skipped for now) test case for PR 46463
adapt to new reality
Add a test for adding an event to an unsupported fd.
|
1.219.2.1 |
| 18-May-2014 |
rmind | sync with head
|
1.224.2.1 |
| 10-Aug-2014 |
tls | Rebase.
|
1.225.2.2 |
| 03-Jun-2017 |
snj | Pull up following revision(s) (requested by riastradh in ticket #1425): sys/kern/kern_descrip.c: revision 1.230 Explicitly set the flags instead of masking set values in. This fixes FNONBLOCK weirdness seen in audio.c OK christos@ and martin@.
|
1.225.2.1 |
| 04-Aug-2015 |
snj | branches: 1.225.2.1.2; 1.225.2.1.6; Pull up following revision(s) (requested by christos in ticket #933): sys/kern/kern_descrip.c: revision 1.229
1. mask fflags so we don't tack on whatever oflags were passed from userland
2. honor O_CLOEXEC, so the children of daemons that use cloning devices don't end up with the parent's descriptors
fd_clone and in general the fd approach of 'allocate' > 'play with guts' > 'attach' should be converted to be more constructor like.
|
1.225.2.1.6.1 |
| 03-Jun-2017 |
snj | Pull up following revision(s) (requested by riastradh in ticket #1425): sys/kern/kern_descrip.c: revision 1.230 Explicitly set the flags instead of masking set values in. This fixes FNONBLOCK weirdness seen in audio.c OK christos@ and martin@.
|
1.225.2.1.2.1 |
| 03-Jun-2017 |
snj | Pull up following revision(s) (requested by riastradh in ticket #1425): sys/kern/kern_descrip.c: revision 1.230 Explicitly set the flags instead of masking set values in. This fixes FNONBLOCK weirdness seen in audio.c OK christos@ and martin@.
|
1.228.2.2 |
| 28-Aug-2017 |
skrll | Sync with HEAD
|
1.228.2.1 |
| 22-Sep-2015 |
skrll | Sync with HEAD
|
1.229.8.1 |
| 19-May-2017 |
pgoyette | Resolve conflicts from previous merge (all resulting from $NetBSD keyword expansion)
|
1.231.10.2 |
| 08-Apr-2020 |
martin | Merge changes from current as of 20200406
|
1.231.10.1 |
| 10-Jun-2019 |
christos | Sync with HEAD
|
1.231.8.6 |
| 18-Jan-2019 |
pgoyette | Synch with HEAD
|
1.231.8.5 |
| 26-Nov-2018 |
pgoyette | Sync with HEAD, resolve a couple of conflicts
|
1.231.8.4 |
| 20-Oct-2018 |
pgoyette | Sync with head
|
1.231.8.3 |
| 30-Sep-2018 |
pgoyette | Sync with HEAD
|
1.231.8.2 |
| 06-Sep-2018 |
pgoyette | Sync with HEAD
Resolve a couple of conflicts (result of the uimin/uimax changes)
|
1.231.8.1 |
| 28-Jul-2018 |
pgoyette | Sync with HEAD
|
1.243.6.1 |
| 29-Feb-2020 |
ad | Sync with head.
|
1.243.4.3 |
| 20-Nov-2024 |
martin | Pull up following revision(s) (requested by riastradh in ticket #1921):
sys/kern/kern_event.c: revision 1.106 sys/kern/sys_select.c: revision 1.51 sys/kern/subr_exec_fd.c: revision 1.10 sys/kern/sys_aio.c: revision 1.46 sys/kern/kern_descrip.c: revision 1.244 sys/kern/kern_descrip.c: revision 1.245 sys/ddb/db_xxx.c: revision 1.72 sys/ddb/db_xxx.c: revision 1.73 sys/miscfs/fdesc/fdesc_vnops.c: revision 1.132 sys/kern/uipc_usrreq.c: revision 1.195 sys/kern/sys_descrip.c: revision 1.36 sys/kern/uipc_usrreq.c: revision 1.196 sys/kern/uipc_socket2.c: revision 1.135 sys/kern/uipc_socket2.c: revision 1.136 sys/kern/kern_sig.c: revision 1.383 sys/kern/kern_sig.c: revision 1.384 sys/compat/netbsd32/netbsd32_ioctl.c: revision 1.107 sys/miscfs/procfs/procfs_vnops.c: revision 1.208 sys/kern/subr_exec_fd.c: revision 1.9 sys/kern/kern_descrip.c: revision 1.252 (all via patch)
Load struct filedesc::fd_dt with atomic_load_consume.
Exceptions: when fd_refcnt <= 1, or when holding fd_lock.
While here:
- Restore KASSERT(mutex_owned(&fdp->fd_lock)) in fd_unused.
  => This is used only in fd_close and fd_abort, where it holds.
- Move bounds check assertion in fd_putfile to where it matters.
- Store fd_dt with atomic_store_release.
- Move load of fd_dt under lock in knote_fdclose.
- Omit membar_consumer in fdesc_readdir.
  => atomic_load_consume serves the same purpose now.
  => Was needed only on alpha anyway.
Load struct fdfile::ff_file with atomic_load_consume. Exceptions: when we're only testing whether it's there, not about to dereference it.
Note: We do not use atomic_store_release to set it because the preceding mutex_exit should be enough.
(That said, it's not clear the mutex_enter/exit is needed unless refcnt > 0 already, in which case maybe it would be a win to switch from the membar implied by mutex_enter to the membar implied by atomic_store_release -- which I would generally expect to be much cheaper. And a little clearer without a long comment.)
kern_descrip.c: Fix membars around reference count decrement.
In general, the `last one out hit the lights' style of reference counting (as opposed to the `whoever's destroying must wait for pending users to finish' style) requires memory barriers like so:
	... usage of resources associated with object ...
	membar_release();
	if (atomic_dec_uint_nv(&obj->refcnt) != 0)
		return;
	membar_acquire();
	... freeing of resources associated with object ...
This way, all usage happens-before all freeing. This fixes several errors:
- fd_close failed to ensure whatever its caller did would happen-before the freeing, in the case where another thread is concurrently trying to close the fd (ff->ff_file == NULL). Fix: Add membar_release before atomic_dec_uint(&ff->ff_refcnt) in that branch.
- fd_close failed to ensure all loads its caller had issued will have happened-before the freeing, in the case where the fd is still in use by another thread (fdp->fd_refcnt > 1 and ff->ff_refcnt-- > 0). Fix: Change membar_producer to membar_release before atomic_dec_uint(&ff->ff_refcnt).
- fd_close failed to ensure that any usage of fp by other callers would happen-before any freeing it does. Fix: Add membar_acquire after atomic_dec_uint_nv(&ff->ff_refcnt).
- fd_free failed to ensure that any usage of fdp by other callers would happen-before any freeing it does. Fix: Add membar_acquire after atomic_dec_uint_nv(&fdp->fd_refcnt).
While here, change membar_exit -> membar_release. No semantic change, just updating away from the legacy API.
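The `last one out hit the lights' pattern described above can be sketched in portable C11 as follows. This is only an illustration under stated assumptions, not the kernel code: struct obj, obj_create, obj_hold and obj_rele are hypothetical names, and C11 fences stand in for membar_release and membar_acquire.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical object; not a NetBSD structure. */
struct obj {
	atomic_uint refcnt;
	int *resource;
};

/* Create an object holding one reference, or return NULL on failure. */
struct obj *
obj_create(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (o == NULL)
		return NULL;
	o->resource = malloc(sizeof(*o->resource));
	if (o->resource == NULL) {
		free(o);
		return NULL;
	}
	*o->resource = 0;
	atomic_init(&o->refcnt, 1);
	return o;
}

/* Take an extra reference; the caller must already hold one. */
void
obj_hold(struct obj *o)
{
	atomic_fetch_add_explicit(&o->refcnt, 1, memory_order_relaxed);
}

/* Drop a reference; returns true if this call freed the object. */
bool
obj_rele(struct obj *o)
{
	unsigned old;

	/* Stands in for membar_release(): our usage precedes the decrement. */
	atomic_thread_fence(memory_order_release);
	old = atomic_fetch_sub_explicit(&o->refcnt, 1, memory_order_relaxed);
	assert(old > 0);
	if (old != 1)
		return false;	/* other references remain */
	/* Stands in for membar_acquire(): all other usage precedes freeing. */
	atomic_thread_fence(memory_order_acquire);
	free(o->resource);
	free(o);
	return true;
}
```

Only the caller that sees the count drop to zero runs the freeing code, and the pair of fences is what orders every other holder's usage before that freeing.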
|
1.243.4.2 |
| 17-Nov-2024 |
martin | Pull up following revision(s) (requested by kre in ticket #1003):
sys/kern/kern_descrip.c: revision 1.264 (via patch)
Make O_CLOEXEC always close specified files on exec
It turns out that close-on-exec doesn't always close on exec.
If all close-on-exec fd's were made close-on-exec via dup3() or fcntl(F_DUPFD_CLOEXEC) or use of the internal fd_clone() (whose uses I did not fully investigate but I think is used to create a fd for the open of a cloner device, and perhaps other things) then none of the close-on-exec file descriptors will be closed when an exec happens - but will be passed through to the new process (still marked, apparently, as close-on-exec - but still won't be closed if another exec happens) - that is unless...
If at least one fd in the process has close-on-exec set some other way (fcntl(F_SETFD), open(O_CLOEXEC) (and the similar functions for sockets, and epoll) and perhaps others) then all close-on-exec file descriptors in the process will be correctly closed when an exec happens (however they obtained the close-on-exec status).
There are two steps that need to be taken (in the kernel) when turning on close on exec - the obvious one of setting the ff_exclose field in the struct fdfile for the fd. And second, marking the file descriptor table (which holds the fdfile's for one or more processes) as containing file descriptors with close-on-exec set (it is a simple yes/no, and once set is never cleared until an actual exec happens). If it is set when an exec happens, all the file descriptors are examined, and those marked close-on-exec are closed. If the file descriptor table doesn't indicate that close-on-exec fds exist in the table, none of that happens.
Several places were setting ff_exclose in the struct fdfile but not bothering to set the fd_exclose field in the file descriptor table.
There's even a function (fd_set_exclose()) whose whole purpose is to do this properly - but it wasn't being used.
Now it is, everywhere (I hope).
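The two-step requirement above can be illustrated with a small userland model. This is a sketch under stated assumptions, not the kernel implementation: the model_* names, NFILES, and the ff_open field are hypothetical, and only the ff_exclose and fd_exclose field names come from the log message.

```c
#include <assert.h>
#include <stdbool.h>

#define NFILES 8	/* arbitrary table size for the model */

/* Simplified stand-in for struct fdfile. */
struct model_fdfile {
	bool ff_open;
	bool ff_exclose;	/* per-descriptor close-on-exec mark */
};

/* Simplified stand-in for the file descriptor table. */
struct model_filedesc {
	struct model_fdfile fd_files[NFILES];
	bool fd_exclose;	/* table-wide "some fd is close-on-exec" */
};

/* What the buggy paths did: first step only. */
void
model_set_exclose_buggy(struct model_filedesc *fdp, int fd)
{
	assert(fd >= 0 && fd < NFILES);
	fdp->fd_files[fd].ff_exclose = true;
}

/* Both steps: mark the descriptor and mark the table. */
void
model_set_exclose(struct model_filedesc *fdp, int fd)
{
	assert(fd >= 0 && fd < NFILES);
	fdp->fd_files[fd].ff_exclose = true;
	fdp->fd_exclose = true;
}

/* Exec-time pass; returns the number of descriptors closed. */
int
model_closeexec(struct model_filedesc *fdp)
{
	int closed = 0;

	if (!fdp->fd_exclose)
		return 0;	/* table not marked: the scan is skipped */
	fdp->fd_exclose = false;
	for (int fd = 0; fd < NFILES; fd++) {
		struct model_fdfile *ff = &fdp->fd_files[fd];

		if (ff->ff_open && ff->ff_exclose) {
			ff->ff_open = false;
			ff->ff_exclose = false;
			closed++;
		}
	}
	return closed;
}
```

In this model a descriptor marked only through model_set_exclose_buggy survives model_closeexec, which mirrors the behaviour the log describes for descriptors created via dup3() or fcntl(F_DUPFD_CLOEXEC).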
|
1.243.4.1 |
| 07-Aug-2024 |
martin | Pull up following revision(s) (requested by kre in ticket #1859):
sys/kern/kern_proc.c: revision 1.276 (via patch)
sys/kern/kern_ktrace.c: revision 1.185 (via patch)
sys/kern/sys_sig.c: revision 1.58 (via patch)
sys/kern/kern_descrip.c: revision 1.263 (via patch)
lib/libc/compat-43/killpg.c: revision 1.10
sys/kern/tty.c: revision 1.313 (via patch)
tests/lib/libc/sys/t_kill.c: revision 1.2
PR kern/58425 -- Disallow INT_MIN as a (negative) pid arg.
Since -INT_MIN is undefined, and the point of negative pid args is to negate them, and use the result as a pgrp id instead, we need to avoid accidentally negating INT_MIN.
Since pid_t is just an integral type, of unspecified width, when testing a pid_t value, test for <= INT_MIN (or > INT_MIN sometimes) rather than == INT_MIN. When testing int values, just == INT_MIN is all that is needed, < INT_MIN cannot occur.
tests/lib/libc/sys/t_kill: Test kill(INT_MIN, ...) fails with ESRCH. PR kern/58425
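The check described above might look roughly like this in C. This is a minimal sketch: pid_to_pgid is a hypothetical helper invented for illustration, not a function from the NetBSD sources.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper: convert a negative pid argument into a process
 * group id.  Returns false if pid is not a usable negative pid argument.
 */
bool
pid_to_pgid(pid_t pid, pid_t *pgid)
{
	assert(pgid != NULL);
	if (pid >= 0)
		return false;	/* not a process group request */
	/*
	 * pid_t has unspecified width, so compare with <= rather than ==;
	 * this also rejects values below INT_MIN if pid_t is wider than int.
	 */
	if (pid <= INT_MIN)
		return false;	/* negating would be undefined or out of range */
	*pgid = -pid;
	return true;
}
```

A caller would report an error such as ESRCH when the helper returns false, which matches what the t_kill test expects from kill(INT_MIN, ...).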
|
1.249.2.1 |
| 03-Jan-2021 |
thorpej | Sync w/ HEAD.
|
1.250.4.1 |
| 01-Aug-2021 |
thorpej | Sync with HEAD.
|
1.251.10.3 |
| 17-Nov-2024 |
martin | Pull up following revision(s) (requested by kre in ticket #1003):
sys/kern/kern_descrip.c: revision 1.264
Make O_CLOEXEC always close specified files on exec
It turns out that close-on-exec doesn't always close on exec.
If all close-on-exec fd's were made close-on-exec via dup3() or fcntl(F_DUPFD_CLOEXEC) or use of the internal fd_clone() (whose uses I did not fully investigate but I think is used to create a fd for the open of a cloner device, and perhaps other things) then none of the close-on-exec file descriptors will be closed when an exec happens - but will be passed through to the new process (still marked, apparently, as close-on-exec - but still won't be closed if another exec happens) - that is unless...
If at least one fd in the process has close-on-exec set some other way (fcntl(F_SETFD), open(O_CLOEXEC) (and the similar functions for sockets, and epoll) and perhaps others) then all close-on-exec file descriptors in the process will be correctly closed when an exec happens (however they obtained the close-on-exec status).
There are two steps that need to be taken (in the kernel) when turning on close on exec - the obvious one of setting the ff_exclose field in the struct fdfile for the fd. And second, marking the file descriptor table (which holds the fdfile's for one or more processes) as containing file descriptors with close-on-exec set (it is a simple yes/no, and once set is never cleared until an actual exec happens). If it is set when an exec happens, all the file descriptors are examined, and those marked close-on-exec are closed. If the file descriptor table doesn't indicate that close-on-exec fds exist in the table, none of that happens.
Several places were setting ff_exclose in the struct fdfile but not bothering to set the fd_exclose field in the file descriptor table.
There's even a function (fd_set_exclose()) whose whole purpose is to do this properly - but it wasn't being used.
Now it is, everywhere (I hope).
|
1.251.10.2 |
| 07-Aug-2024 |
martin | Pull up following revision(s) (requested by kre in ticket #773):
sys/kern/kern_proc.c: revision 1.276
sys/kern/kern_ktrace.c: revision 1.185
sys/kern/sys_sig.c: revision 1.58
sys/kern/kern_descrip.c: revision 1.263
lib/libc/compat-43/killpg.c: revision 1.10
sys/kern/tty.c: revision 1.313
tests/lib/libc/sys/t_kill.c: revision 1.2
PR kern/58425 -- Disallow INT_MIN as a (negative) pid arg.
Since -INT_MIN is undefined, and the point of negative pid args is to negate them, and use the result as a pgrp id instead, we need to avoid accidentally negating INT_MIN.
Since pid_t is just an integral type, of unspecified width, when testing a pid_t value, test for <= INT_MIN (or > INT_MIN sometimes) rather than == INT_MIN. When testing int values, just == INT_MIN is all that is needed, < INT_MIN cannot occur.
tests/lib/libc/sys/t_kill: Test kill(INT_MIN, ...) fails with ESRCH. PR kern/58425
|
1.251.10.1 |
| 30-Jul-2023 |
martin | Pull up following revision(s) (requested by riastradh in ticket #262):
sys/kern/kern_descrip.c: revision 1.252
sys/kern/kern_descrip.c: revision 1.253
sys/kern/kern_descrip.c: revision 1.254
kern_descrip.c: Fix membars around reference count decrement.
In general, the `last one out hit the lights' style of reference counting (as opposed to the `whoever's destroying must wait for pending users to finish' style) requires memory barriers like so:
    ... usage of resources associated with object ...
    membar_release();
    if (atomic_dec_uint_nv(&obj->refcnt) != 0)
        return;
    membar_acquire();
    ... freeing of resources associated with object ...
This way, all usage happens-before all freeing. This fixes several errors:
- fd_close failed to ensure whatever its caller did would happen-before the freeing, in the case where another thread is concurrently trying to close the fd (ff->ff_file == NULL).
  Fix: Add membar_release before atomic_dec_uint(&ff->ff_refcnt) in that branch.
- fd_close failed to ensure all loads its caller had issued will have happened-before the freeing, in the case where the fd is still in use by another thread (fdp->fd_refcnt > 1 and ff->ff_refcnt-- > 0).
  Fix: Change membar_producer to membar_release before atomic_dec_uint(&ff->ff_refcnt).
- fd_close failed to ensure that any usage of fp by other callers would happen-before any freeing it does.
  Fix: Add membar_acquire after atomic_dec_uint_nv(&ff->ff_refcnt).
- fd_free failed to ensure that any usage of fdp by other callers would happen-before any freeing it does.
  Fix: Add membar_acquire after atomic_dec_uint_nv(&fdp->fd_refcnt).
While here, change membar_exit -> membar_release. No semantic change, just updating away from the legacy API.
kern_descrip.c: Use atomic_store_relaxed/release for ff->ff_file.
1. atomic_store_relaxed in fd_close avoids the appearance of race in sanitizers (minor bug).
2. atomic_store_release in fd_affix is necessary because the lock activity was not, in fact, enough to guarantee ordering (real bug on some architectures like aarch64).
The premise appears to have been that the mutex_enter/exit earlier in fd_affix is enough to guarantee that initialization of fp (A) happens before use of fp by a user once fp is published (B):
    fp->f_... = ...;                // A
    /* fd_affix */
    mutex_enter(&fp->f_lock);
    fp->f_count++;
    mutex_exit(&fp->f_lock);
    ...
    ff->ff_file = fp;               // B
But actually mutex_enter/exit allow the following reordering by the CPU:
    mutex_enter(&fp->f_lock);
    ff->ff_file = fp;               // B
    fp->f_count++;
    fp->f_... = ...;                // A
    mutex_exit(&fp->f_lock);
The only constraints they imply are:
1. fp->f_count++ and B cannot precede mutex_enter
2. mutex_exit cannot precede A and fp->f_count++
They imply no constraint on the relative ordering of A, B, and fp->f_count++ amongst each other, however.
This affects any architecture that has a native load-acquire or store-release operation in mutex_enter/exit, like aarch64, instead of explicit load-before-load/store and load/store-before-store barrier.
No need for atomic_store_* in fd_copy or fd_free because we have exclusive access to ff as is.
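The publish step (B) with a store-release, paired with a reader, can be sketched in portable C11 as follows. This is an illustration only: model_file, model_slot, model_affix and model_getflag are hypothetical names, and the reader uses a load-acquire as a conservative stand-in for the kernel's atomic_load_consume.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file. */
struct model_file {
	int f_flag;
};

/* Hypothetical stand-in for struct fdfile. */
struct model_slot {
	_Atomic(struct model_file *) ff_file;
};

/* Publisher: initialize fp (A), then publish it with store-release (B). */
void
model_affix(struct model_slot *ff, struct model_file *fp, int flag)
{
	assert(fp != NULL);
	fp->f_flag = flag;					/* A */
	atomic_store_explicit(&ff->ff_file, fp,
	    memory_order_release);				/* B */
}

/* Reader: returns the published flag, or -1 if nothing is published. */
int
model_getflag(struct model_slot *ff)
{
	struct model_file *fp =
	    atomic_load_explicit(&ff->ff_file, memory_order_acquire);

	if (fp == NULL)
		return -1;
	return fp->f_flag;	/* ordered after A by the release/acquire pair */
}
```

Because the release is attached to the publishing store itself, a reader that observes the pointer also observes the initialization, regardless of what locking happened earlier in the publisher.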
kern_descrip.c: Change membar_enter to membar_acquire in fd_getfile. membar_acquire is cheaper on many CPUs, and unlikely to be costlier on any CPUs, than the legacy membar_enter. Add a long comment explaining the interaction between fd_getfile and fd_close and why membar_acquire is safe.
|
1.262.6.1 |
| 02-Aug-2025 |
perseant | Sync with HEAD
|