History log of /src/sys/kern/sys_select.c |
Revision | | Date | Author | Comments |
1.68 |
| 26-Nov-2024 |
khorben | Typo in a comment
|
1.67 |
| 18-Oct-2024 |
kre | PR kern/57504 : Check all fds passed in to select
If an application passes in a huge fd_set (select(BIG, ...)) then check every bit in the fd_sets provided, to make sure they are valid.
If BIG is too big (cannot possibly represent an open fd for this process, under any circumstances: ie: not just because that many are not currently open) return EINVAL.
Otherwise, check every set bit to make sure it is valid. Any fd bits set above the application's current highest open fd automatically generate EBADF and a quick(ish) exit.
fds that are within the plausible range are then checked as they always were (it is possible for a few of those to be above the max open fd, as everything in select is done in multiples of __FDBITS (fd_mask) while the max open fd is not so constrained); those were always checked, and continue to use the same mechanism.
This should have zero impact on any sane application which uses the highest fd for which it set a bit, +1, as the first arg to select. However, if there are any broken applications that were relying upon the previous behaviour of simply ignoring any fd_masks that started beyond the max number of open files, then they might (if they happen to have any bits set) now fail.
XXX pullup -10 -- but not for a long time. Someone remind me sometime next year. Leave a long settling time in HEAD just to be sure no issues arise, as in practice, almost nothing should cause any of the new code to be executed.
pullup -9 -- probably not, what this fixes isn't significant enough to bother going that far back for (IMO).
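As an illustration only (a hypothetical userland program, not part of the commit): the documented semantics mean that a set bit which does not name an open descriptor must produce EBADF, and a first argument too large to ever name an open fd for the process produces EINVAL. The EBADF case can be shown like this:

	#include <sys/select.h>

	#include <errno.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>

	int
	main(void)
	{
		fd_set set;
		int fd;

		fd = dup(STDIN_FILENO);	/* obtain a valid descriptor ... */
		close(fd);		/* ... then make it stale */

		FD_ZERO(&set);
		FD_SET(fd, &set);

		/* A set bit that does not refer to an open fd must fail with EBADF. */
		if (select(fd + 1, &set, NULL, NULL, NULL) == -1)
			printf("stale fd %d: %s\n", fd, strerror(errno));

		return 0;
	}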
|
1.66 |
| 15-Oct-2023 |
riastradh | branches: 1.66.6; sys_select.c: Sort includes. No functional change intended.
|
1.65 |
| 15-Oct-2023 |
riastradh | sys/lwp.h: Nix sys/syncobj.h dependency.
Remove it in ddb/db_syncobj.h too.
New sys/wchan.h defines wchan_t so that users need not pull in sys/syncobj.h to get it.
Sprinkle #include <sys/syncobj.h> in .c files where it is now needed.
|
1.64 |
| 08-Oct-2023 |
ad | Ensure that an LWP that has taken a legitimate wakeup never produces an error code from sleepq_block(). Then, it's possible to make cv_signal() work as expected and only ever wake a singular LWP.
|
1.63 |
| 04-Oct-2023 |
ad | Eliminate l->l_biglocks. Originally I think it had a use but these days a local variable will do.
|
1.62 |
| 23-Sep-2023 |
ad | - Simplify how priority boost for blocking in kernel is handled. Rather than setting it up at each site where we block, make it a property of syncobj_t. Then, do not hang onto the priority boost until userret(), drop it as soon as the LWP is out of the run queue and onto a CPU. Holding onto it longer is of questionable benefit.
- This allows two members of lwp_t to be deleted, and mi_userret() to be simplified a lot (next step: trim it down to a single conditional).
- While here, constify syncobj_t and de-inline a bunch of small functions like lwp_lock() which turn out not to be small after all (I don't know why, but atomic_*_relaxed() seem to provoke a compiler shitfit above and beyond what volatile does).
|
1.61 |
| 17-Jul-2023 |
riastradh | kern: New struct syncobj::sobj_name member for diagnostics.
XXX potential kernel ABI change -- not sure any modules actually use struct syncobj but it's hard to rule that out because sys/syncobj.h leaks into sys/lwp.h
|
1.60 |
| 29-Jun-2022 |
riastradh | branches: 1.60.4; sleepq(9): Pass syncobj through to sleepq_block.
Previously the usage pattern was:
	sleepq_enter(sq, l, lock);		// locks l
	...
	sleepq_enqueue(sq, ..., sobj, ...);	// assumes l locked, sets l_syncobj
	...
(*)	sleepq_block(...);			// unlocks l
As long as l remains locked from sleepq_enter to sleepq_block, l_syncobj is stable, and sleepq_block uses it via ktrcsw to determine whether the sleep is on a mutex in order to avoid creating ktrace context-switch records (which involves allocation which is forbidden in softint context, while taking and even sleeping for a mutex is allowed).
However, in turnstile_block, the logic at (*) also involves turnstile_lendpri, which sometimes unlocks and relocks l. At that point, another thread can swoop in and sleepq_remove l, which sets l_syncobj to sched_syncobj. If that happens, ktrcsw does what is forbidden -- tries to allocate a ktrace record for the context switch.
As an optimization, sleepq_block or turnstile_block could stop early if it detects that l_syncobj doesn't match -- we've already been requested to wake up at this point so there's no need to mi_switch. (And then it would be unnecessary to pass the syncobj through sleepq_block, because l_syncobj would remain stable.) But I'll leave that to another change.
Reported-by: syzbot+8b9d7b066c32dbcdc63b@syzkaller.appspotmail.com
|
1.59 |
| 09-Apr-2022 |
riastradh | select(9): Use membar_acquire/release and atomic_store_release.
No store-before-load ordering here -- this was obviously always intended to be load-before-load/store all along.
|
1.58 |
| 12-Feb-2022 |
thorpej | Add inline functions to manipulate the klists that link up knotes via kn_selnext:
- klist_init()
- klist_fini()
- klist_insert()
- klist_remove()
These provide some API insulation from the implementation details of these lists (but not completely; see vn_knote_attach() and vn_knote_detach()). Currently just a wrapper around SLIST(9).
This will make it significantly easier to switch kn_selnext linkage to a different kind of list.
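A rough sketch of what such SLIST(9)-backed wrappers look like (illustrative only; the struct knote layout and exact prototypes below are assumptions, not the kernel's verbatim definitions):

	#include <sys/queue.h>

	struct knote {
		SLIST_ENTRY(knote) kn_selnext;	/* assumed linkage field */
		/* ... */
	};

	SLIST_HEAD(klist, knote);

	static inline void
	klist_init(struct klist *list)
	{
		SLIST_INIT(list);
	}

	static inline void
	klist_fini(struct klist *list)
	{
		/* nothing to tear down for a plain SLIST */
		(void)list;
	}

	static inline void
	klist_insert(struct klist *list, struct knote *kn)
	{
		SLIST_INSERT_HEAD(list, kn, kn_selnext);
	}

	static inline void
	klist_remove(struct klist *list, struct knote *kn)
	{
		SLIST_REMOVE(list, kn, knote, kn_selnext);
	}

Callers go through these wrappers instead of touching kn_selnext directly, so the linkage can later be swapped for a different list type without touching every filter.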
|
1.57 |
| 10-Dec-2021 |
andvar | s/occured/occurred/ in comments, log messages and man pages.
|
1.56 |
| 29-Sep-2021 |
thorpej | - Change selremove_knote() from returning void to bool, and return true if the last knote was removed and there are no more knotes on the selinfo.
- Use this new return value in filt_sordetach(), filt_sowdetach(), filt_fifordetach(), and filt_fifowdetach() to know when to clear SB_KNOTE without having to know select/kqueue implementation details.
|
1.55 |
| 11-Dec-2020 |
thorpej | Add sel{record,remove}_knote(), to hide some of the details surrounding knote / kevent registration in the selinfo structure.
|
1.54 |
| 19-Apr-2020 |
ad | branches: 1.54.2; Set LW_SINTR earlier so it doesn't pose a problem for doing interruptible waits with turnstiles (not currently done).
|
1.53 |
| 26-Mar-2020 |
ad | branches: 1.53.2; Change sleepq_t from a TAILQ to a LIST and remove SOBJ_SLEEPQ_FIFO. Only select/poll used the FIFO method and that was for collisions which rarely occur. Shrinks sleepq_t and condvar_t.
|
1.52 |
| 15-Feb-2020 |
ad | - List all of the syncobjs in syncobj.h.
- Update a comment.
|
1.51 |
| 01-Feb-2020 |
riastradh | Load struct filedesc::fd_dt with atomic_load_consume.
Exceptions: when fd_refcnt <= 1, or when holding fd_lock.
While here:
- Restore KASSERT(mutex_owned(&fdp->fd_lock)) in fd_unused.
  => This is used only in fd_close and fd_abort, where it holds.
- Move bounds check assertion in fd_putfile to where it matters.
- Store fd_dt with atomic_store_release.
- Move load of fd_dt under lock in knote_fdclose.
- Omit membar_consumer in fdesc_readdir.
  => atomic_load_consume serves the same purpose now.
  => Was needed only on alpha anyway.
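In portable C11 terms the pattern amounts to a release store when publishing the table and a consume (or acquire) load when reading it locklessly; a minimal sketch with assumed names (not the kernel's atomic_load_consume()/atomic_store_release() themselves):

	#include <stdatomic.h>

	struct fdtab {
		int dt_nfiles;
		/* ... open file slots ... */
	};

	static _Atomic(struct fdtab *) fd_dt;

	/* Writer (holding the descriptor lock): initialize *newdt fully,
	 * then publish it with release semantics. */
	static void
	fd_dt_publish(struct fdtab *newdt)
	{
		atomic_store_explicit(&fd_dt, newdt, memory_order_release);
	}

	/* Lockless reader: the dependency-ordered load pairs with the
	 * release store, so the table contents are visible before use. */
	static struct fdtab *
	fd_dt_snapshot(void)
	{
		return atomic_load_explicit(&fd_dt, memory_order_consume);
	}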
|
1.50 |
| 22-Nov-2019 |
ad | branches: 1.50.2; Minor correction to previous.
|
1.49 |
| 21-Nov-2019 |
ad | Minor improvements to select/poll:
- Increase the maximum number of clusters from 32 to 64 for large systems. kcpuset_t could potentially be used here but that's an excursion I don't want to go on right now. uint32_t -> uint64_t is very simple.
- In the case of a non-blocking select/poll, or where we won't block because there are events ready to report, stop registering interest in the back-end objects early.
- Change the wmesg for poll back to "poll".
|
1.48 |
| 20-Sep-2019 |
kamil | Validate usec ranges in sys___select50()
selcommon() later checks for a proper timespec; here, only check that the usec field of the timeval is in range before the type conversions.
|
1.47 |
| 20-Aug-2019 |
msaitoh | Use unsigned to avoid undefined behavior. Found by kUBSan.
|
1.46 |
| 26-Jul-2019 |
msaitoh | branches: 1.46.2; Set sc_mask correctly in selsysinit() to avoid undefined behavior. Found by KUBSan.
|
1.45 |
| 08-May-2019 |
christos | Add slop of 1000 and explain why.
|
1.44 |
| 07-May-2019 |
christos | Use the max limit (aka maxfiles or the moral equivalent of OPEN_MAX) which makes poll(2) align with the Posix documentation (which allows EINVAL if nfds > OPEN_MAX). From: Anthony Mallet
|
1.43 |
| 05-May-2019 |
christos | Remove the slop code. Suggested by mrg@
|
1.42 |
| 04-May-2019 |
christos | PR/54158: Anthony Mallet: poll(2) does not allow polling all possible fds (hardcoded limit to 1000 + #<open-fds>). Changed to limit by the max of the resource limit of open descriptors and the above.
|
1.41 |
| 30-Jan-2018 |
ozaki-r | branches: 1.41.4; Apply C99-style struct initialization to syncobj_t
|
1.40 |
| 01-Jun-2017 |
chs | branches: 1.40.2; remove checks for failure after memory allocation calls that cannot fail:
	kmem_alloc() with KM_SLEEP
	kmem_zalloc() with KM_SLEEP
	percpu_alloc()
	pserialize_create()
	psref_class_create()
all of these paths include an assertion that the allocation has not failed, so callers should not assert that again.
|
1.39 |
| 25-Apr-2014 |
pooka | branches: 1.39.4; Remove pollsock(). Since it took only a single socket, it was essentially a complicated way to call soreceive() with a sb_timeo. The only user (netsmb) already did that anyway, so just had to delete the call to pollsock().
|
1.38 |
| 25-Feb-2014 |
pooka | branches: 1.38.2; Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before the sysctl link sets are processed, and remove redundancy.
Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate lines of code.
|
1.37 |
| 26-Jan-2013 |
riastradh | branches: 1.37.2; Assert equality, not assignment, in selrecord.
Code inspection suggests that this fix is not likely to reveal any latent problems.
|
1.36 |
| 29-Aug-2011 |
rmind | branches: 1.36.2; 1.36.12; Add kern.direct_select sysctl. Default to 0 for now.
|
1.35 |
| 09-Aug-2011 |
hannken | No need to lock the selcluster in selscan() if either NO_DIRECT_SELECT is defined or all polls return an event.
|
1.34 |
| 06-Aug-2011 |
hannken | Fix the races of direct select()/poll():
- When sel_do_scan() restarts do a full initialization with selclear() so we start from an empty set without registered events. Defer the evaluation of l_selret after selclear() and add the count of direct events to the count of events.
- For selscan()/pollscan() zero the output descriptors before we poll and for selscan() take the sc_lock before we change them.
- Change sel_setevents() to not count events already set.
Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
Should fix PR #44763 (select/poll direct-set optimization seems racy) and PR #45187 (select(2) sometimes doesn't wakeup)
|
1.33 |
| 28-May-2011 |
christos | If a signal did not fire, restore the original signal mask for pselect/pollts using a signal mask. Tested by tron.
|
1.32 |
| 18-May-2011 |
christos | No need to mask twice. The setup function does it.
|
1.31 |
| 18-May-2011 |
christos | PR/43625: Mark Davies: Fix pselect(2) to honor the temporary mask. pselect(2) (and pollts(2)) are similar to sigsuspend(2) in that they temporarily change the process signal mask and wait for signal delivery. Factor out and share the code that does this.
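A hypothetical userland illustration of the semantics this fixes (not from the commit): the mask passed to pselect(2) is installed only for the duration of the wait and the original mask is restored on return, just as sigsuspend(2) does.

	#include <sys/select.h>

	#include <signal.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>

	static volatile sig_atomic_t got_sigint;

	static void
	on_sigint(int signo)
	{
		got_sigint = 1;
	}

	int
	main(void)
	{
		struct sigaction sa;
		sigset_t blocked, waitmask;
		fd_set rset;

		memset(&sa, 0, sizeof(sa));
		sa.sa_handler = on_sigint;
		sigaction(SIGINT, &sa, NULL);

		/* Block SIGINT while setting up ... */
		sigemptyset(&blocked);
		sigaddset(&blocked, SIGINT);
		sigprocmask(SIG_BLOCK, &blocked, &waitmask);

		/* ... and atomically unblock it only while waiting. */
		sigdelset(&waitmask, SIGINT);
		FD_ZERO(&rset);
		FD_SET(STDIN_FILENO, &rset);
		if (pselect(STDIN_FILENO + 1, &rset, NULL, NULL, NULL, &waitmask) == -1)
			perror("pselect");	/* EINTR means SIGINT was delivered */

		/* The original (blocking) mask is in effect again here. */
		printf("got_sigint = %d\n", (int)got_sigint);
		return 0;
	}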
|
1.30 |
| 06-Mar-2011 |
rmind | In a case of direct select, set only masked events, do not wakeup LWP if no polled/selected events were set; also, count the correct return value for the select.
|
1.29 |
| 18-Dec-2010 |
rmind | branches: 1.29.2;
- Fix a few possible locking issues in execve1() and exit1(). Add a note that scheduler locks are special in this regard - adaptive locks cannot be in the path due to turnstiles. Randomly spotted/reported by uebayasi@.
- Remove unused lwp_relock() and replace lwp_lock_retry() by simplifying lwp_lock() and sleepq_enter() a little.
- Give alllwp its own cache-line and mark lwp_cache pointer as read-mostly.
OK ad@
|
1.28 |
| 15-Oct-2010 |
rmind | Re-enable direct select.
|
1.27 |
| 12-Jul-2010 |
rmind | sel_setevents: fix error - match event-set, as intended. Spotted by Enami Tsugutomo.
|
1.26 |
| 11-Jul-2010 |
rmind | Disable direct select for now, since it still brings problems.
|
1.25 |
| 10-Jul-2010 |
rmind | sel_setevents: fix direct injecting of fd bit for select() case.
|
1.24 |
| 08-Jul-2010 |
rmind | sel_do_scan: do not bother to assert for SEL_SCANNING state before blocking, as it might also be SEL_BLOCKING due to spurious wake-ups. That does no harm.
|
1.23 |
| 08-Jul-2010 |
rmind | Implement direct select/poll support, currently effective for socket and pipe subsystems. Avoids overhead of second selscan() on wake-up, and thus improves performance on certain workloads (especially when polling on many file-descriptors). Also, clean-up sys/fd_set.h header and improve macros.
Welcome to 5.99.36!
|
1.22 |
| 25-Apr-2010 |
ad | Make select/poll work with more than 32 CPUs. No ABI change.
|
1.21 |
| 20-Dec-2009 |
rmind | branches: 1.21.2; 1.21.4; Add comment about locking.
|
1.20 |
| 12-Dec-2009 |
dsl | Bounding the 'nfds' arg to poll() at the current process limit for actual open files is rather gross - the poll map isn't required to be dense. Instead limit to a much larger value (1000 + dt_nfiles) so that user programs cannot allocate indefinite sized blocks of kvm. If the limit is exceeded, then return EINVAL instead of silently truncating the list. (The silent truncation in select isn't quite as bad - although even there any high bits that are set ought to generate an EBADF response.) Move the code that converts ERESTART and EWOULDBLOCK into common code. Effectively fixes PR/17507 since the new limit is unlikely to be detected.
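A sketch of the bound being described (illustrative; the helper name is hypothetical, while the 1000 slop and dt_nfiles follow the commit text):

	#include <errno.h>

	#define POLL_SLOP	1000	/* allowance for a sparse pollfd array */

	static int
	poll_check_nfds(unsigned int nfds, unsigned int dt_nfiles)
	{
		/* Refuse to allocate an arbitrarily large kernel buffer,
		 * but do not silently truncate the caller's list. */
		if (nfds > dt_nfiles + POLL_SLOP)
			return EINVAL;
		return 0;
	}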
|
1.19 |
| 11-Nov-2009 |
rmind | - selcommon/pollcommon: drop redundant l argument.
- Use cached curlwp->l_fd, instead of p->p_fd.
- Inline selscan/pollscan.
|
1.18 |
| 01-Nov-2009 |
rmind | - Move inittimeleft() and gettimeleft() to subr_time.c, where they belong.
- Move abstimeout2timo() there too and export. Use it in lwp_park().
|
1.17 |
| 01-Nov-2009 |
rmind | Move common logic in selcommon() and pollcommon() into sel_do_scan(). Avoids code duplication. XXX: pollsock() should be converted too, except it's a bit ugly.
|
1.16 |
| 21-Oct-2009 |
rmind | Remove uarea swap-out functionality:
- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids a few scans on the LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.
Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on acorn26 (thanks to <bjh21>).
Discussed on <tech-kern>, reviewed by <ad>.
|
1.15 |
| 24-May-2009 |
ad | More changes to improve kern_descrip.c.
- Avoid atomics in more places.
- Remove the per-descriptor mutex, and just use filedesc_t::fd_lock. It was only being used to synchronize close, and in any case we needed to take fd_lock to free the descriptor slot.
- Optimize certain paths for the <NDFDFILE case.
- Sprinkle more comments and assertions.
- Cache more stuff in filedesc_t.
- Fix numerous minor bugs spotted along the way.
- Restructure how the open files array is maintained, for clarity and so that we can eliminate the membar_consumer() call in fd_getfile(). This is mostly syntactic sugar; the main functional change is that fd_nfiles now lives alongside the open file array.
Some measurements with libmicro:
- simple file syscalls like close() are between 1 and 10% faster.
- some nice improvements, e.g. poll(1000), which is ~50% faster.
|
1.14 |
| 29-Mar-2009 |
christos | Move the internal poll/select related API's to use timespec instead of timeval (rides the uvm bump).
|
1.13 |
| 21-Mar-2009 |
ad | Allocate sleep queue locks with mutex_obj_alloc. Reduces memory usage on !MP kernels, and reduces false sharing on MP ones.
|
1.12 |
| 11-Jan-2009 |
christos | branches: 1.12.2; merge christos-time_t
|
1.11 |
| 20-Nov-2008 |
yamt | pollcommon: use a more appropriate type than char[].
|
1.10 |
| 15-Oct-2008 |
ad | branches: 1.10.2; 1.10.4; 1.10.10; 1.10.14;
- Rename cpu_lookup_byindex() to cpu_lookup(). The hardware ID isn't of interest to MI code. No functional change.
- Change /dev/cpu to operate on cpu index, not hardware ID. Now cpuctl shouldn't print confused output.
|
1.9 |
| 04-Jun-2008 |
rmind | branches: 1.9.4; Check the result of allocation in the cases where size is passed by user.
|
1.8 |
| 26-May-2008 |
ad | Take the mutex pointer and waiters count out of sleepq_t: the values can be or are maintained elsewhere. Now a sleepq_t is just a TAILQ_HEAD.
|
1.7 |
| 30-Apr-2008 |
ad | branches: 1.7.2; PR kern/38547 select/poll do not set l_kpriority
Among other things this could have made X11 seem sluggish.
|
1.6 |
| 28-Apr-2008 |
martin | Remove clause 3 and 4 from TNF licenses
|
1.5 |
| 24-Apr-2008 |
ad | branches: 1.5.2; Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since we no longer need to guard against access from hardware interrupt handlers.
Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the child process share the parent's lock so that signal state may be kept in sync. Partially addresses PR kern/37437.
|
1.4 |
| 17-Apr-2008 |
yamt | branches: 1.4.2; s/selwakeup/selnotify/ in a comment.
|
1.3 |
| 29-Mar-2008 |
ad | branches: 1.3.2; 1.3.4; selwakeup: convert a while() loop into a do/while() since the first test isn't needed.
|
1.2 |
| 27-Mar-2008 |
ad | Replace use of CACHE_LINE_SIZE in some obvious places.
|
1.1 |
| 23-Mar-2008 |
ad | branches: 1.1.2; Split select/poll into their own file.
|
1.1.2.2 |
| 24-Mar-2008 |
yamt | sync with head.
|
1.1.2.1 |
| 23-Mar-2008 |
yamt | file sys_select.c was added on branch yamt-lazymbuf on 2008-03-24 09:39:02 +0000
|
1.3.4.5 |
| 17-Jan-2009 |
mjf | Sync with HEAD.
|
1.3.4.4 |
| 05-Jun-2008 |
mjf | Sync with HEAD.
Also fix build.
|
1.3.4.3 |
| 02-Jun-2008 |
mjf | Sync with HEAD.
|
1.3.4.2 |
| 03-Apr-2008 |
mjf | Sync with HEAD.
|
1.3.4.1 |
| 29-Mar-2008 |
mjf | file sys_select.c was added on branch mjf-devfs2 on 2008-04-03 12:43:04 +0000
|
1.3.2.4 |
| 27-Dec-2008 |
christos | merge with head.
|
1.3.2.3 |
| 01-Nov-2008 |
christos | Sync with head.
|
1.3.2.2 |
| 29-Mar-2008 |
christos | Welcome to the time_t=long long dev_t=uint64_t branch.
|
1.3.2.1 |
| 29-Mar-2008 |
christos | file sys_select.c was added on branch christos-time_t on 2008-03-29 20:47:01 +0000
|
1.4.2.3 |
| 17-Jun-2008 |
yamt | sync with head.
|
1.4.2.2 |
| 04-Jun-2008 |
yamt | sync with head
|
1.4.2.1 |
| 18-May-2008 |
yamt | sync with head.
|
1.5.2.5 |
| 11-Aug-2010 |
yamt | sync with head.
|
1.5.2.4 |
| 11-Mar-2010 |
yamt | sync with head
|
1.5.2.3 |
| 20-Jun-2009 |
yamt | sync with head
|
1.5.2.2 |
| 04-May-2009 |
yamt | sync with head.
|
1.5.2.1 |
| 16-May-2008 |
yamt | sync with head.
|
1.7.2.3 |
| 23-Jun-2008 |
wrstuden | Sync w/ -current. 34 merge conflicts to follow.
|
1.7.2.2 |
| 14-May-2008 |
wrstuden | Per discussion with ad at n dot o, revert signal mask handling changes.
The l_sigstk changes are most likely totally un-needed as SA will never use a signal stack - we send an upcall (or will as other diffs are brought in).
The l_sigmask changes were too controversial. In all honesty, I think it's probably best to revert them. The main reason they were there is the fact that in an SA process, we don't mask signals per kernel thread, we mask them per user thread. In the kernel, we want them all to get turned into upcalls. Thus the normal state of l_sigmask in an SA process is for it to always be empty.
While we are in the process of delivering a signal, we want to temporarily mask a signal (so we don't recursively exhaust our upcall stacks). However signal delivery is rare (important, but rare), and delivering back-to-back signals is even rarer. So rather than cause every user of a signal mask to be prepared for this very rare case, we will just add a second check later in the signal delivery code. Said change is not in this diff.
This also un-compensates all of our compatibility code for dealing with SA. SA is a NetBSD-specific thing, so there's no need for Irix, Linux, Solaris, SVR4 and so on to cope with it.
As previously, everything other than kern_sa.c compiles in i386 GENERIC as of this checkin. I will switch to ALL soon for compile testing.
|
1.7.2.1 |
| 10-May-2008 |
wrstuden | Initial checkin of re-adding SA. Everything except kern_sa.c compiles in GENERIC for i386. This is still a work-in-progress, but this checkin covers most of the mechanical work (changing signalling to be able to accommodate SA's process-wide signalling and re-adding includes of sys/sa.h and savar.h). Subsequent changes will be much more interesting.
Also, kern_sa.c has received partial cleanup. There's still more to do, though.
|
1.9.4.2 |
| 13-Dec-2008 |
haad | Update haad-dm branch to haad-dm-base2.
|
1.9.4.1 |
| 19-Oct-2008 |
haad | Sync with HEAD.
|
1.10.14.1 |
| 24-Apr-2015 |
msaitoh | Pull up following revision(s) (requested by prlw1 in ticket #1957):
sys/kern/sys_select.c patch
Limit nfds arg to poll() to a large enough value that user programs cannot allocate indefinite sized blocks of kvm. If the limit is exceeded, then return EINVAL instead of silently truncating the list. Addresses PR/17507. [prlw1, ticket #1957]
|
1.10.10.1 |
| 24-Apr-2015 |
msaitoh | Pull up following revision(s) (requested by prlw1 in ticket #1957):
sys/kern/sys_select.c patch
Limit nfds arg to poll() to a large enough value that user programs cannot allocate indefinite sized blocks of kvm. If the limit is exceeded, then return EINVAL instead of silently truncating the list. Addresses PR/17507. [prlw1, ticket #1957]
|
1.10.4.1 |
| 24-Apr-2015 |
msaitoh | Pull up following revision(s) (requested by prlw1 in ticket #1957):
sys/kern/sys_select.c patch
Limit nfds arg to poll() to a large enough value that user programs cannot allocate indefinite sized blocks of kvm. If the limit is exceeded, then return EINVAL instead of silently truncating the list. Addresses PR/17507.
|
1.10.2.2 |
| 28-Apr-2009 |
skrll | Sync with HEAD.
|
1.10.2.1 |
| 19-Jan-2009 |
skrll | Sync with HEAD.
|
1.12.2.2 |
| 23-Jul-2009 |
jym | Sync with HEAD.
|
1.12.2.1 |
| 13-May-2009 |
jym | Sync with HEAD.
Commit is split, to avoid a "too many arguments" protocol error.
|
1.21.4.4 |
| 31-May-2011 |
rmind | sync with head
|
1.21.4.3 |
| 21-Apr-2011 |
rmind | sync with head
|
1.21.4.2 |
| 05-Mar-2011 |
rmind | sync with head
|
1.21.4.1 |
| 30-May-2010 |
rmind | sync with head
|
1.21.2.3 |
| 22-Oct-2010 |
uebayasi | Sync with HEAD (-D20101022).
|
1.21.2.2 |
| 17-Aug-2010 |
uebayasi | Sync with HEAD.
|
1.21.2.1 |
| 30-Apr-2010 |
uebayasi | Sync with HEAD.
|
1.29.2.1 |
| 06-Jun-2011 |
jruoho | Sync with HEAD.
|
1.36.12.3 |
| 03-Dec-2017 |
jdolecek | update from HEAD
|
1.36.12.2 |
| 20-Aug-2014 |
tls | Rebase to HEAD as of a few days ago.
|
1.36.12.1 |
| 25-Feb-2013 |
tls | resync with head
|
1.36.2.1 |
| 22-May-2014 |
yamt | sync with head.
for a reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
|
1.37.2.1 |
| 18-May-2014 |
rmind | sync with head
|
1.38.2.1 |
| 10-Aug-2014 |
tls | Rebase.
|
1.39.4.1 |
| 28-Aug-2017 |
skrll | Sync with HEAD
|
1.40.2.1 |
| 08-Mar-2020 |
martin | Pull up following revision(s) (requested by mlelstv in ticket #1515):
sys/kern/sys_select.c: revision 1.42-1.45
PR/54158: Anthony Mallet: poll(2) does not allow polling all possible fds (hardcoded limit to 1000 + #<open-fds>). Changed to limit by the max of the resource limit of open descriptors and the above.
Remove the slop code. Suggested by mrg@
Use the max limit (aka maxfiles or the moral equivalent of OPEN_MAX) which makes poll(2) align with the Posix documentation (which allows EINVAL if nfds > OPEN_MAX). From: Anthony Mallet
Add slop of 1000 and explain why.
|
1.41.4.3 |
| 21-Apr-2020 |
martin | Sync with HEAD
|
1.41.4.2 |
| 13-Apr-2020 |
martin | Mostly merge changes from HEAD upto 20200411
|
1.41.4.1 |
| 10-Jun-2019 |
christos | Sync with HEAD
|
1.46.2.2 |
| 20-Nov-2024 |
martin | Pull up following revision(s) (requested by riastradh in ticket #1926):
sys/kern/sys_select.c: revision 1.67 (patch)
tests/lib/libc/sys/t_select.c: revision 1.5 (patch)
PR kern/57504 : Check all fds passed in to select
If an application passes in a huge fd_set (select(BIG, ...)) then check every bit in the fd_sets provided, to make sure they are valid.
If BIG is too big (cannot possibly represent an open fd for this process, under any circumstances: ie: not just because that many are not currently open) return EINVAL.
Otherwise, check every set bit to make sure it is valid. Any fd bits set above the application's current highest open fd automatically generate EBADF and a quick(ish) exit. fds that are within the plausible range are then checked as they always were (it is possible for a few of those to be above the max open fd, as everything in select is done in multiples of __FDBITS (fd_mask) while the max open fd is not so constrained); those were always checked, and continue to use the same mechanism.
This should have zero impact on any sane application which uses the highest fd for which it set a bit, +1, as the first arg to select. However, if there are any broken applications that were relying upon the previous behaviour of simply ignoring any fd_masks that started beyond the max number of open files, then they might (if they happen to have any bits set) now fail.
tests/lib/libc/sys/t_select: Test select on bad file descriptors.
This should immediately fail, not hang, even if the bad fd is high-numbered.
PR kern/57504: select with large enough bogus fd number set hangs instead of failing with EBADF
|
1.46.2.1 |
| 20-Nov-2024 |
martin | Pull up following revision(s) (requested by riastradh in ticket #1921):
sys/kern/kern_event.c: revision 1.106
sys/kern/sys_select.c: revision 1.51
sys/kern/subr_exec_fd.c: revision 1.10
sys/kern/sys_aio.c: revision 1.46
sys/kern/kern_descrip.c: revision 1.244
sys/kern/kern_descrip.c: revision 1.245
sys/ddb/db_xxx.c: revision 1.72
sys/ddb/db_xxx.c: revision 1.73
sys/miscfs/fdesc/fdesc_vnops.c: revision 1.132
sys/kern/uipc_usrreq.c: revision 1.195
sys/kern/sys_descrip.c: revision 1.36
sys/kern/uipc_usrreq.c: revision 1.196
sys/kern/uipc_socket2.c: revision 1.135
sys/kern/uipc_socket2.c: revision 1.136
sys/kern/kern_sig.c: revision 1.383
sys/kern/kern_sig.c: revision 1.384
sys/compat/netbsd32/netbsd32_ioctl.c: revision 1.107
sys/miscfs/procfs/procfs_vnops.c: revision 1.208
sys/kern/subr_exec_fd.c: revision 1.9
sys/kern/kern_descrip.c: revision 1.252
(all via patch)
Load struct filedesc::fd_dt with atomic_load_consume.
Exceptions: when fd_refcnt <= 1, or when holding fd_lock.
While here:
- Restore KASSERT(mutex_owned(&fdp->fd_lock)) in fd_unused.
  => This is used only in fd_close and fd_abort, where it holds.
- Move bounds check assertion in fd_putfile to where it matters.
- Store fd_dt with atomic_store_release.
- Move load of fd_dt under lock in knote_fdclose.
- Omit membar_consumer in fdesc_readdir.
  => atomic_load_consume serves the same purpose now.
  => Was needed only on alpha anyway.
Load struct fdfile::ff_file with atomic_load_consume. Exceptions: when we're only testing whether it's there, not about to dereference it.
Note: We do not use atomic_store_release to set it because the preceding mutex_exit should be enough.
(That said, it's not clear the mutex_enter/exit is needed unless refcnt > 0 already, in which case maybe it would be a win to switch from the membar implied by mutex_enter to the membar implied by atomic_store_release -- which I would generally expect to be much cheaper. And a little clearer without a long comment.)
kern_descrip.c: Fix membars around reference count decrement.
In general, the `last one out hit the lights' style of reference counting (as opposed to the `whoever's destroying must wait for pending users to finish' style) requires memory barriers like so:
	... usage of resources associated with object ...

	membar_release();
	if (atomic_dec_uint_nv(&obj->refcnt) != 0)
		return;
	membar_acquire();

	... freeing of resources associated with object ...
This way, all usage happens-before all freeing. This fixes several errors:
- fd_close failed to ensure whatever its caller did would happen-before the freeing, in the case where another thread is concurrently trying to close the fd (ff->ff_file == NULL).
  Fix: Add membar_release before atomic_dec_uint(&ff->ff_refcnt) in that branch.
- fd_close failed to ensure all loads its caller had issued will have happened-before the freeing, in the case where the fd is still in use by another thread (fdp->fd_refcnt > 1 and ff->ff_refcnt-- > 0).
  Fix: Change membar_producer to membar_release before atomic_dec_uint(&ff->ff_refcnt).
- fd_close failed to ensure that any usage of fp by other callers would happen-before any freeing it does.
  Fix: Add membar_acquire after atomic_dec_uint_nv(&ff->ff_refcnt).
- fd_free failed to ensure that any usage of fdp by other callers would happen-before any freeing it does.
  Fix: Add membar_acquire after atomic_dec_uint_nv(&fdp->fd_refcnt).
While here, change membar_exit -> membar_release. No semantic change, just updating away from the legacy API.
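The same release/acquire reference-count pattern in portable C11, as a sketch with assumed names (the kernel itself uses membar_release()/membar_acquire() around atomic_dec_uint_nv(), as quoted above):

	#include <stdatomic.h>
	#include <stdlib.h>

	struct obj {
		atomic_uint refcnt;
		/* ... resources ... */
	};

	static void
	obj_put(struct obj *o)
	{
		/* release: all prior use of *o happens-before the final free */
		if (atomic_fetch_sub_explicit(&o->refcnt, 1, memory_order_release) != 1)
			return;
		/* acquire: the freeing thread observes every other user's accesses */
		atomic_thread_fence(memory_order_acquire);
		free(o);
	}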
|
1.50.2.1 |
| 29-Feb-2020 |
ad | Sync with head.
|
1.53.2.1 |
| 20-Apr-2020 |
bouyer | Sync with HEAD
|
1.54.2.1 |
| 14-Dec-2020 |
thorpej | Sync w/ HEAD.
|
1.60.4.1 |
| 18-Nov-2024 |
martin | Pull up following revision(s) (requested by riastradh in ticket #1011):
sys/kern/sys_select.c: revision 1.67
tests/lib/libc/sys/t_select.c: revision 1.5
PR kern/57504 : Check all fds passed in to select
If an application passes in a huge fd_set (select(BIG, ...)) then check every bit in the fd_sets provided, to make sure they are valid.
If BIG is too big (cannot possibly represent an open fd for this process, under any circumstances: ie: not just because that many are not currently open) return EINVAL. Otherwise, check every set bit to make sure it is valid. Any fd bits set above the application's current highest open fd automatically generate EBADF and a quick(ish) exit. fds that are within the plausible range are then checked as they always were (it is possible for a few of those to be above the max open fd, as everything in select is done in multiples of __FDBITS (fd_mask) while the max open fd is not so constrained); those were always checked, and continue to use the same mechanism.
This should have zero impact on any sane application which uses the highest fd for which it set a bit, +1, as the first arg to select. However, if there are any broken applications that were relying upon the previous behaviour of simply ignoring any fd_masks that started beyond the max number of open files, then they might (if they happen to have any bits set) now fail.
tests/lib/libc/sys/t_select: Test select on bad file descriptors. This should immediately fail, not hang, even if the bad fd is high-numbered.
PR kern/57504: select with large enough bogus fd number set hangs instead of failing with EBADF
|
1.66.6.1 |
| 02-Aug-2025 |
perseant | Sync with HEAD
|