History log of /src/sys/kern/subr_xcall.c
Revision | Date | Author | Comments |
1.39 |
| 01-Apr-2025 |
ozaki-r | xcall: treat ipl as unsigned as it is (NFC)
|
1.38 |
| 01-Mar-2024 |
mrg | check that l_nopreempt (preemption count) doesn't change after callbacks
check that the idle loop, soft interrupt handlers, workqueue, and xcall callbacks do not modify the preemption count, in most cases, knowing it should be 0 currently.
this work was originally done by simonb. cleaned up slightly, with some minor enhancements by myself and discussion with riastradh@.
other callback call sites could check this as well (such as MD interrupt handlers, or really anything that involves a callback registration; the x86 version is to be committed separately).
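As an illustration of the check described above, here is a hedged sketch (not the committed diff; the wrapper name xc_call_checked and the assertion message are made up for the example). The idea is to snapshot curlwp->l_nopreempt around the callback and assert that the callback left it unchanged:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/lwp.h>
#include <sys/xcall.h>

/* Hedged sketch only: wrap an xcall callback with a preemption-count check. */
static void
xc_call_checked(xcfunc_t func, void *arg1, void *arg2)
{
	const int nopreempt = curlwp->l_nopreempt;	/* count before the callback */

	(*func)(arg1, arg2);

	/* The callback must not change the LWP's preemption count. */
	KASSERTMSG(curlwp->l_nopreempt == nopreempt,
	    "xcall callback %p changed l_nopreempt: %d -> %d",
	    func, nopreempt, curlwp->l_nopreempt);
}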
|
1.37 |
| 06-Aug-2023 |
riastradh | xcall(9): Rename condvars to be less confusing.
The `cv' suffix is not helpful and `xclocv' looks like some kind of clock at first glance. Just say `xclow' and `xchigh'.
|
1.36 |
| 07-Jul-2023 |
riastradh | xcall(9): If !mp_online, raise spl or set LP_BOUND to call func.
High-priority xcalls may reasonably assume that the spl is raised to splsoftserial, so make sure to do that in xc_broadcast.
Low-priority xcalls may reasonably enter paths that assume the lwp is bound to a CPU, so let's make it assertable even if it doesn't have any other consequences when !mp_online.
XXX pullup-8 XXX pullup-9 XXX pullup-10
|
1.35 |
| 09-Apr-2023 |
riastradh | kern: KASSERT(A && B) -> KASSERT(A); KASSERT(B)
|
1.34 |
| 22-Dec-2020 |
ad | branches: 1.34.18; Comments.
|
1.33 |
| 19-Dec-2019 |
thorpej | branches: 1.33.8; Whitespace police (minor infraction).
|
1.32 |
| 01-Dec-2019 |
riastradh | Restore xcall(9) fast path using atomic_load/store_*.
While here, fix a bug that was formerly in xcall(9): a missing acquire operation in the xc_wait fast path so that all memory operations in the xcall on remote CPUs will happen before any memory operations on the issuing CPU after xc_wait returns.
All stores of xc->xc_donep are done with atomic_store_release so that we can safely use atomic_load_acquire to read it outside the lock. However, this fast path only works on platforms with cheap 64-bit atomic load/store, so conditionalize it on __HAVE_ATOMIC64_LOADSTORE. (Under the lock, no need for atomic loads since nobody else will be issuing stores.)
For review, here's the relevant diff from the old version of the fast path, from before it was removed and some other things changed in the file:
diff --git a/sys/kern/subr_xcall.c b/sys/kern/subr_xcall.c
index 45a877aa90e0..b6bfb6455291 100644
--- a/sys/kern/subr_xcall.c
+++ b/sys/kern/subr_xcall.c
@@ -84,6 +84,7 @@ __KERNEL_RCSID(0, "$NetBSD: subr_xcall.c,v 1.27 2019/10/06 15:11:17 uwe Exp $");
 #include <sys/evcnt.h>
 #include <sys/kthread.h>
 #include <sys/cpu.h>
+#include <sys/atomic.h>
 
 #ifdef _RUMPKERNEL
 #include "rump_private.h"
@@ -334,10 +353,12 @@ xc_wait(uint64_t where)
 		xc = &xc_low_pri;
 	}
 
+#ifdef __HAVE_ATOMIC64_LOADSTORE
 	/* Fast path, if already done. */
-	if (xc->xc_donep >= where) {
+	if (atomic_load_acquire(&xc->xc_donep) >= where) {
 		return;
 	}
+#endif
 
 	/* Slow path: block until awoken. */
 	mutex_enter(&xc->xc_lock);
@@ -422,7 +443,11 @@ xc_thread(void *cookie)
 		(*func)(arg1, arg2);
 
 		mutex_enter(&xc->xc_lock);
+#ifdef __HAVE_ATOMIC64_LOADSTORE
+		atomic_store_release(&xc->xc_donep, xc->xc_donep + 1);
+#else
 		xc->xc_donep++;
+#endif
 	}
 	/* NOTREACHED */
 }
@@ -462,7 +487,6 @@ xc__highpri_intr(void *dummy)
 	 * Lock-less fetch of function and its arguments.
 	 * Safe since it cannot change at this point.
 	 */
-	KASSERT(xc->xc_donep < xc->xc_headp);
 	func = xc->xc_func;
 	arg1 = xc->xc_arg1;
 	arg2 = xc->xc_arg2;
@@ -475,7 +499,13 @@ xc__highpri_intr(void *dummy)
 	 * cross-call has been processed - notify waiters, if any.
 	 */
 	mutex_enter(&xc->xc_lock);
-	if (++xc->xc_donep == xc->xc_headp) {
+	KASSERT(xc->xc_donep < xc->xc_headp);
+#ifdef __HAVE_ATOMIC64_LOADSTORE
+	atomic_store_release(&xc->xc_donep, xc->xc_donep + 1);
+#else
+	xc->xc_donep++;
+#endif
+	if (xc->xc_donep == xc->xc_headp) {
 		cv_broadcast(&xc->xc_busy);
 	}
 	mutex_exit(&xc->xc_lock);
|
1.31 |
| 01-Dec-2019 |
ad | Back out the fastpath change in xc_wait(). It's going to be done differently.
|
1.30 |
| 01-Dec-2019 |
ad | Make the fast path in xc_wait() depend on _LP64 for now. Needs 64-bit load/store. To be revisited.
|
1.29 |
| 01-Dec-2019 |
ad | If the system is not up and running yet, just run the function locally.
|
1.28 |
| 11-Nov-2019 |
maxv | Remove lockless reads of 'xc_donep'. This is a uint64_t, and we cannot expect the accesses to be MP-safe on 32-bit arches.
Found by KCSAN.
|
1.27 |
| 06-Oct-2019 |
uwe | xc_barrier - convenience function to xc_broadcast() a nop.
Make the intent more clear and also avoid a bunch of (xcfunc_t)nullop casts that gcc 8 -Wcast-function-type is not happy about.
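For context, a hedged usage sketch (the surrounding function, struct obj, and obj_free are illustrative names, not from this commit): xc_barrier() is convenient for draining remote CPUs before freeing something that has just been unlinked.

#include <sys/xcall.h>

struct obj;				/* illustrative type */
void	obj_free(struct obj *);		/* illustrative helper */

/*
 * Hedged sketch: "obj" has already been removed from all globally
 * visible lists.  Broadcast a nop cross call and wait for it, so any
 * remote CPU still using the old state has drained, then free.
 */
static void
obj_retire(struct obj *obj)
{
	xc_barrier(0);		/* low-priority xcall barrier */
	obj_free(obj);
}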
|
1.26 |
| 07-Feb-2018 |
ozaki-r | branches: 1.26.4; Sprinkle ASSERT_SLEEPABLE into xcall functions
|
1.25 |
| 05-Feb-2018 |
ozaki-r | Sort XC_IPL_* in order of priority (NFC)
|
1.24 |
| 05-Feb-2018 |
ozaki-r | Avoid allocating unused softints whose IPL shares a value with another
|
1.23 |
| 05-Feb-2018 |
ozaki-r | Fix build of kernels where some (or all) IPL_SOFT* levels share a value (e.g., mips)
|
1.22 |
| 03-Feb-2018 |
martin | Try to fix the build: avoid duplicate case labels when IPL_SOFT* are all the same.
|
1.21 |
| 01-Feb-2018 |
ozaki-r | Support arbitrary softint IPLs in high priority xcall
The high priority xcall supported only a softint at IPL_SOFTSERIAL. That meant it didn't work for xcall callbacks that depend on IPLs lower than IPL_SOFTSERIAL.
The change gives xcall multiple softints at different IPLs and allows users to specify an arbitrary IPL. Users specify an IPL via XC_HIGHPRI_IPL passed as the 1st argument of xc_broadcast or xc_unicast.
Note that xcall still serves requests one by one (i.e., doesn't run them concurrently) even if requests have different IPLs.
Proposed on tech-kern@
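A hedged usage sketch of the interface described above (update_cb and update_all_cpus are made-up names for the example): request the high-priority xcall softint at IPL_SOFTNET instead of the default IPL_SOFTSERIAL, then wait for completion.

#include <sys/param.h>
#include <sys/intr.h>
#include <sys/xcall.h>

/* Illustrative callback: per-CPU work that only needs IPL_SOFTNET protection. */
static void
update_cb(void *arg1, void *arg2)
{
	/* ... update per-CPU state ... */
}

static void
update_all_cpus(void *cookie)
{
	uint64_t where;

	/* High-priority xcall, but run from a softint at IPL_SOFTNET. */
	where = xc_broadcast(XC_HIGHPRI_IPL(IPL_SOFTNET), update_cb, cookie, NULL);
	xc_wait(where);
}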
|
1.20 |
| 21-Jun-2017 |
martin | Change a KASSERT to KASSERTMSG and print the xcall function to be invoked as a debugging help.
|
1.19 |
| 21-Nov-2016 |
ozaki-r | branches: 1.19.8; Fix a race condition of low priority xcall
xc_lowpri and xc_thread are racy and xc_wait may return during/before the execution of all xcall callbacks, resulting in a kernel panic at worst.
xc_lowpri serializes multiple jobs with a mutex and a cv. Once all xcall callbacks are done, xc_wait returns and xc_lowpri accepts the next job.
The problem is that the counter of finished xcall callbacks is incremented *before* an xcall callback is actually executed (see xc_tailp++ in xc_thread). So xc_lowpri accepts the next job before all xcall callbacks complete, and the next job begins to run its xcall callbacks.
Even worse, the counter is global and shared between jobs, so if an xcall callback of the next job completes, the shared counter is incremented. That convinces xc_wait of the previous job that all of its xcall callbacks are done, and xc_wait of the previous job returns during/before the execution of its xcall callbacks.
How to fix: for historical reasons (I guess) there are actually two counters of finished xcall callbacks for the low priority xcall: xc_tailp and xc_low_pri.xc_donep. xc_low_pri.xc_donep is incremented correctly, while xc_tailp is incremented wrongly, i.e., before an xcall callback is executed. We can fix the issue by dropping xc_tailp and using only xc_low_pri.xc_donep.
PR kern/51632
|
1.18 |
| 26-Nov-2013 |
rmind | branches: 1.18.4; 1.18.6; 1.18.8; 1.18.10; 1.18.12; Fix previous, use the correct value for softint_establish (SOFTINT_SERIAL).
|
1.17 |
| 26-Nov-2013 |
rmind | Switch XC_HIGHPRI to run at IPL_SOFTSERIAL, i.e. the highest software level. Adjust pcu(9) to this xcall(9) change. This may fix the problems seen after the x86 FPU was converted to use PCU, since it avoids heavy contention at the lower levels (particularly IPL_SOFTNET). This is a good illustration of why software interrupts should generally avoid any blocking on locks.
|
1.16 |
| 25-Oct-2013 |
martin | Mark a diagnostic-only variable
|
1.15 |
| 07-Apr-2013 |
rmind | branches: 1.15.4; xc_highpri: fix assert.
|
1.14 |
| 19-Feb-2013 |
martin | Stopgap fix to make rump cooperate with pserialize, may be revisited later. Patch from pooka, ok: rmind. No related regressions in a complete atf test run (which works again with this, even on non x86 SMP machines).
|
1.13 |
| 13-May-2011 |
rmind | branches: 1.13.4; 1.13.10; 1.13.14; 1.13.16; Sprinkle __cacheline_aligned and __read_mostly.
|
1.12 |
| 22-Jun-2010 |
rmind | branches: 1.12.2; Implement high priority (XC_HIGHPRI) xcall(9) mechanism - a facility to execute functions from software interrupt context, at SOFTINT_CLOCK. Functions must be lightweight. Will be used for passive serialization.
OK ad@.
|
1.11 |
| 30-Nov-2009 |
pooka | branches: 1.11.2; 1.11.4; explicitly initialize static boolean
|
1.10 |
| 05-Mar-2009 |
uebayasi | xc_lowpri: don't truncate `where' from uint64_t to u_int.
|
1.9 |
| 28-Apr-2008 |
martin | branches: 1.9.8; 1.9.10; 1.9.14; Remove clause 3 and 4 from TNF licenses
|
1.8 |
| 24-Apr-2008 |
ad | branches: 1.8.2; xc_broadcast: don't try to run cross calls on CPUs that are not yet running.
|
1.7 |
| 14-Apr-2008 |
ad | branches: 1.7.2; Fix comments.
|
1.6 |
| 10-Mar-2008 |
martin | Use the cpu index instead of the machine-dependent, not very expressive cpuid when naming user-visible kernel entities.
|
1.5 |
| 06-Nov-2007 |
ad | branches: 1.5.2; 1.5.12; 1.5.16; Merge scheduler changes from the vmlocking branch. All discussed on tech-kern:
- Invert priority space so that zero is the lowest priority. Rearrange number and type of priority levels into bands. Add new bands like 'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.
|
1.4 |
| 27-Oct-2007 |
ad | branches: 1.4.2; 1.4.4; Tweak comments.
|
1.3 |
| 08-Oct-2007 |
ad | branches: 1.3.2; 1.3.4; Include sys/cpu.h for archs that don't have CPU_INFO_ITERATOR. Spotted by dsieger@.
|
1.2 |
| 08-Oct-2007 |
ad | Merge file descriptor locking, cwdi locking and cross-call changes from the vmlocking branch.
|
1.1 |
| 26-Aug-2007 |
ad | branches: 1.1.2; 1.1.4; file subr_xcall.c was initially added on branch vmlocking.
|
1.1.4.1 |
| 14-Oct-2007 |
yamt | sync with head.
|
1.1.2.6 |
| 01-Nov-2007 |
ad | - Fix interactivity problems under high load. Because soft interrupts are being stacked on top of regular LWPs, more often than not aston() was being called on a soft interrupt thread instead of a user thread, meaning that preemption was not happening on EOI.
- Don't use bool in a couple of data structures. Sub-word writes are not always atomic and may clobber other fields in the containing word.
- For SCHED_4BSD, make p_estcpu per thread (l_estcpu). Rework how the dynamic priority level is calculated - it's much better behaved now.
- Kill the l_usrpri/l_priority split now that priorities are no longer directly assigned by tsleep(). There are three fields describing LWP priority:
l_priority: Dynamic priority calculated by the scheduler. This does not change for kernel/realtime threads, and always stays within the correct band. Eg for timeshared LWPs it never moves out of the user priority range. This is basically what l_usrpri was before.
l_inheritedprio: Lent to the LWP due to priority inheritance (turnstiles).
l_kpriority: A boolean value set true the first time an LWP sleeps within the kernel. This indicates that the LWP should get a priority boost as compensation for blocking. lwp_eprio() now does the equivalent of sched_kpri() if the flag is set. The flag is cleared in userret().
- Keep track of scheduling class (OTHER, FIFO, RR) in struct lwp, and use this to make decisions in a few places where we previously tested for a kernel thread.
- Partially fix itimers and usr/sys/intr time accounting in the presence of software interrupts.
- Use kthread_create() to create idle LWPs. Move priority definitions from the various modules into sys/param.h.
- newlwp -> lwp_create
|
1.1.2.5 |
| 16-Oct-2007 |
ad | Fix scheduler priority lossage. From rmind@.
|
1.1.2.4 |
| 09-Oct-2007 |
ad | Sync with head.
|
1.1.2.3 |
| 09-Oct-2007 |
ad | Sync with head.
|
1.1.2.2 |
| 30-Aug-2007 |
ad | - Instead of xc_broadcast()/xc_unicast() waiting for completion, have them return a value which can later be passed to xc_wait(). Allows the caller to go and do other stuff in the meantime.
- Fix the xcall thread to work properly in the event of a spurious wakeup.
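A hedged sketch of the asynchronous pattern described in the first item above (sync_cb and other_work are illustrative names, not from this commit): issue the broadcast, overlap other work, and block in xc_wait() only when the result is needed.

#include <sys/param.h>
#include <sys/xcall.h>

static void sync_cb(void *, void *);	/* illustrative callback */
static void other_work(void);		/* illustrative */

static void
example(void *arg)
{
	uint64_t where;

	where = xc_broadcast(0, sync_cb, arg, NULL);	/* returns immediately */
	other_work();					/* overlaps with the cross calls */
	xc_wait(where);					/* block until all CPUs ran sync_cb */
}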
|
1.1.2.1 |
| 26-Aug-2007 |
ad | - Add a generic cross-call facility. Right now this only does threaded cross calls but that should be extended to do IPIs. These are deliberately set up as bound kthreads (and not soft interrupts or something else) so that the called functions can use the spl framework or disable preemption in order to guarantee exclusive access to CPU-local data.
- Use cross calls to take CPUs online or offline. This is OK to do since bound LWPs still execute on offline CPUs. As a result, schedstate_percpu::spc_flags is CPU-local again and doesn't need locking.
|
1.3.4.4 |
| 06-Nov-2007 |
joerg | Sync with HEAD.
|
1.3.4.3 |
| 28-Oct-2007 |
joerg | Sync with HEAD.
|
1.3.4.2 |
| 26-Oct-2007 |
joerg | Sync with HEAD.
Follow the merge of pmap.c on i386 and amd64 and move pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup code to restore CR4 before jumping back into kernel space as the large page option might cover that.
|
1.3.4.1 |
| 08-Oct-2007 |
joerg | file subr_xcall.c was added on branch jmcneill-pm on 2007-10-26 15:48:41 +0000
|
1.3.2.1 |
| 13-Nov-2007 |
bouyer | Sync with HEAD
|
1.4.4.1 |
| 19-Nov-2007 |
mjf | Sync with HEAD.
|
1.4.2.4 |
| 17-Mar-2008 |
yamt | sync with head.
|
1.4.2.3 |
| 15-Nov-2007 |
yamt | sync with head.
|
1.4.2.2 |
| 27-Oct-2007 |
yamt | sync with head.
|
1.4.2.1 |
| 27-Oct-2007 |
yamt | file subr_xcall.c was added on branch yamt-lazymbuf on 2007-10-27 11:35:34 +0000
|
1.5.16.2 |
| 02-Jun-2008 |
mjf | Sync with HEAD.
|
1.5.16.1 |
| 03-Apr-2008 |
mjf | Sync with HEAD.
|
1.5.12.1 |
| 24-Mar-2008 |
keiichi | sync with head.
|
1.5.2.3 |
| 23-Mar-2008 |
matt | sync with HEAD
|
1.5.2.2 |
| 06-Nov-2007 |
matt | sync with HEAD
|
1.5.2.1 |
| 06-Nov-2007 |
matt | file subr_xcall.c was added on branch matt-armv6 on 2007-11-06 23:32:20 +0000
|
1.7.2.1 |
| 18-May-2008 |
yamt | sync with head.
|
1.8.2.4 |
| 11-Aug-2010 |
yamt | sync with head.
|
1.8.2.3 |
| 11-Mar-2010 |
yamt | sync with head
|
1.8.2.2 |
| 04-May-2009 |
yamt | sync with head.
|
1.8.2.1 |
| 16-May-2008 |
yamt | sync with head.
|
1.9.14.1 |
| 13-May-2009 |
jym | Sync with HEAD.
Commit is split, to avoid a "too many arguments" protocol error.
|
1.9.10.1 |
| 15-Mar-2009 |
snj | Pull up following revision(s) (requested by uebayasi in ticket #549):
sys/kern/subr_xcall.c: revision 1.10
xc_lowpri: don't truncate `where' from uint64_t to u_int.
|
1.9.8.1 |
| 28-Apr-2009 |
skrll | Sync with HEAD.
|
1.11.4.2 |
| 31-May-2011 |
rmind | sync with head
|
1.11.4.1 |
| 03-Jul-2010 |
rmind | sync with head
|
1.11.2.1 |
| 17-Aug-2010 |
uebayasi | Sync with HEAD.
|
1.12.2.1 |
| 06-Jun-2011 |
jruoho | Sync with HEAD.
|
1.13.16.2 |
| 06-Jul-2017 |
snj | Pull up following revision(s) (requested by ozaki-r in ticket #1419):
sys/kern/subr_xcall.c: revision 1.19
Fix a race condition of low priority xcall
xc_lowpri and xc_thread are racy and xc_wait may return during/before the execution of all xcall callbacks, resulting in a kernel panic at worst.
xc_lowpri serializes multiple jobs with a mutex and a cv. Once all xcall callbacks are done, xc_wait returns and xc_lowpri accepts the next job.
The problem is that the counter of finished xcall callbacks is incremented *before* an xcall callback is actually executed (see xc_tailp++ in xc_thread). So xc_lowpri accepts the next job before all xcall callbacks complete, and the next job begins to run its xcall callbacks.
Even worse, the counter is global and shared between jobs, so if an xcall callback of the next job completes, the shared counter is incremented. That convinces xc_wait of the previous job that all of its xcall callbacks are done, and xc_wait of the previous job returns during/before the execution of its xcall callbacks.
How to fix: for historical reasons (I guess) there are actually two counters of finished xcall callbacks for the low priority xcall: xc_tailp and xc_low_pri.xc_donep. xc_low_pri.xc_donep is incremented correctly, while xc_tailp is incremented wrongly, i.e., before an xcall callback is executed. We can fix the issue by dropping xc_tailp and using only xc_low_pri.xc_donep.
PR kern/51632
|
1.13.16.1 |
| 20-Apr-2013 |
bouyer | Pull up following revision(s) (requested by rmind in ticket #868):
sys/kern/subr_xcall.c: revision 1.15
xc_highpri: fix assert.
|
1.13.14.4 |
| 03-Dec-2017 |
jdolecek | update from HEAD
|
1.13.14.3 |
| 20-Aug-2014 |
tls | Rebase to HEAD as of a few days ago.
|
1.13.14.2 |
| 23-Jun-2013 |
tls | resync from head
|
1.13.14.1 |
| 25-Feb-2013 |
tls | resync with head
|
1.13.10.2 |
| 06-Jul-2017 |
snj | Pull up following revision(s) (requested by ozaki-r in ticket #1419):
sys/kern/subr_xcall.c: revision 1.19
Fix a race condition of low priority xcall
xc_lowpri and xc_thread are racy and xc_wait may return during/before the execution of all xcall callbacks, resulting in a kernel panic at worst.
xc_lowpri serializes multiple jobs with a mutex and a cv. Once all xcall callbacks are done, xc_wait returns and xc_lowpri accepts the next job.
The problem is that the counter of finished xcall callbacks is incremented *before* an xcall callback is actually executed (see xc_tailp++ in xc_thread). So xc_lowpri accepts the next job before all xcall callbacks complete, and the next job begins to run its xcall callbacks.
Even worse, the counter is global and shared between jobs, so if an xcall callback of the next job completes, the shared counter is incremented. That convinces xc_wait of the previous job that all of its xcall callbacks are done, and xc_wait of the previous job returns during/before the execution of its xcall callbacks.
How to fix: for historical reasons (I guess) there are actually two counters of finished xcall callbacks for the low priority xcall: xc_tailp and xc_low_pri.xc_donep. xc_low_pri.xc_donep is incremented correctly, while xc_tailp is incremented wrongly, i.e., before an xcall callback is executed. We can fix the issue by dropping xc_tailp and using only xc_low_pri.xc_donep.
PR kern/51632
|
1.13.10.1 |
| 20-Apr-2013 |
bouyer | branches: 1.13.10.1.2; Pull up following revision(s) (requested by rmind in ticket #868):
sys/kern/subr_xcall.c: revision 1.15
xc_highpri: fix assert.
|
1.13.10.1.2.1 |
| 06-Jul-2017 |
snj | Pull up following revision(s) (requested by ozaki-r in ticket #1419):
sys/kern/subr_xcall.c: revision 1.19
Fix a race condition of low priority xcall
xc_lowpri and xc_thread are racy and xc_wait may return during/before the execution of all xcall callbacks, resulting in a kernel panic at worst.
xc_lowpri serializes multiple jobs with a mutex and a cv. Once all xcall callbacks are done, xc_wait returns and xc_lowpri accepts the next job.
The problem is that the counter of finished xcall callbacks is incremented *before* an xcall callback is actually executed (see xc_tailp++ in xc_thread). So xc_lowpri accepts the next job before all xcall callbacks complete, and the next job begins to run its xcall callbacks.
Even worse, the counter is global and shared between jobs, so if an xcall callback of the next job completes, the shared counter is incremented. That convinces xc_wait of the previous job that all of its xcall callbacks are done, and xc_wait of the previous job returns during/before the execution of its xcall callbacks.
How to fix: for historical reasons (I guess) there are actually two counters of finished xcall callbacks for the low priority xcall: xc_tailp and xc_low_pri.xc_donep. xc_low_pri.xc_donep is incremented correctly, while xc_tailp is incremented wrongly, i.e., before an xcall callback is executed. We can fix the issue by dropping xc_tailp and using only xc_low_pri.xc_donep.
PR kern/51632
|
1.13.4.1 |
| 22-May-2014 |
yamt | sync with head.
for reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs ("Protocol error: too many arguments").
|
1.15.4.1 |
| 18-May-2014 |
rmind | sync with head
|
1.18.12.1 |
| 18-Jan-2017 |
skrll | Sync with netbsd-5
|
1.18.10.1 |
| 07-Jan-2017 |
pgoyette | Sync with HEAD. (Note that most of these changes are simply $NetBSD$ tag issues.)
|
1.18.8.1 |
| 12-Dec-2016 |
snj | Pull up following revision(s) (requested by ozaki-r in ticket #1306):
sys/kern/subr_xcall.c: revision 1.19
Fix a race condition of low priority xcall
xc_lowpri and xc_thread are racy and xc_wait may return during/before the execution of all xcall callbacks, resulting in a kernel panic at worst.
xc_lowpri serializes multiple jobs with a mutex and a cv. Once all xcall callbacks are done, xc_wait returns and xc_lowpri accepts the next job.
The problem is that the counter of finished xcall callbacks is incremented *before* an xcall callback is actually executed (see xc_tailp++ in xc_thread). So xc_lowpri accepts the next job before all xcall callbacks complete, and the next job begins to run its xcall callbacks.
Even worse, the counter is global and shared between jobs, so if an xcall callback of the next job completes, the shared counter is incremented. That convinces xc_wait of the previous job that all of its xcall callbacks are done, and xc_wait of the previous job returns during/before the execution of its xcall callbacks.
How to fix: for historical reasons (I guess) there are actually two counters of finished xcall callbacks for the low priority xcall: xc_tailp and xc_low_pri.xc_donep. xc_low_pri.xc_donep is incremented correctly, while xc_tailp is incremented wrongly, i.e., before an xcall callback is executed. We can fix the issue by dropping xc_tailp and using only xc_low_pri.xc_donep.
PR kern/51632
|
1.18.6.2 |
| 28-Aug-2017 |
skrll | Sync with HEAD
|
1.18.6.1 |
| 05-Dec-2016 |
skrll | Sync with HEAD
|
1.18.4.1 |
| 12-Dec-2016 |
snj | Pull up following revision(s) (requested by ozaki-r in ticket #1306):
sys/kern/subr_xcall.c: revision 1.19
Fix a race condition of low priority xcall
xc_lowpri and xc_thread are racy and xc_wait may return during/before the execution of all xcall callbacks, resulting in a kernel panic at worst.
xc_lowpri serializes multiple jobs with a mutex and a cv. Once all xcall callbacks are done, xc_wait returns and xc_lowpri accepts the next job.
The problem is that the counter of finished xcall callbacks is incremented *before* an xcall callback is actually executed (see xc_tailp++ in xc_thread). So xc_lowpri accepts the next job before all xcall callbacks complete, and the next job begins to run its xcall callbacks.
Even worse, the counter is global and shared between jobs, so if an xcall callback of the next job completes, the shared counter is incremented. That convinces xc_wait of the previous job that all of its xcall callbacks are done, and xc_wait of the previous job returns during/before the execution of its xcall callbacks.
How to fix: for historical reasons (I guess) there are actually two counters of finished xcall callbacks for the low priority xcall: xc_tailp and xc_low_pri.xc_donep. xc_low_pri.xc_donep is incremented correctly, while xc_tailp is incremented wrongly, i.e., before an xcall callback is executed. We can fix the issue by dropping xc_tailp and using only xc_low_pri.xc_donep.
PR kern/51632
|
1.19.8.2 |
| 02-Apr-2018 |
martin | Pull up following revision(s) (requested by ozaki-r in ticket #687):
sys/kern/kern_rwlock_obj.c: revision 1.4
sys/rump/librump/rumpkern/locks.c: revision 1.80
sys/kern/kern_rwlock.c: revision 1.50
sys/arch/x86/x86/db_memrw.c: revision 1.5,1.6
sys/ddb/db_command.c: revision 1.150-1.153
share/man/man4/ddb.4: revision 1.175 (via patch),1.176-1.178
sys/kern/kern_mutex_obj.c: revision 1.6
sys/kern/subr_lockdebug.c: revision 1.61-1.64
sys/sys/lockdebug.h: revision 1.17
sys/kern/kern_mutex.c: revision 1.71
sys/sys/lockdebug.h: revision 1.18,1.19
sys/kern/subr_xcall.c: revision 1.26
Obtain proper initialized addresses of locks allocated by mutex_obj_alloc or rw_obj_alloc
The initialization addresses recorded for locks allocated by mutex_obj_alloc or rw_obj_alloc were not useful, because they pointed at mutex_obj_alloc or rw_obj_alloc themselves. What we want to know are their callers.
Sprinkle ASSERT_SLEEPABLE into xcall functions
Use db_printf instead of printf in ddb
Add a new command, show lockstat, which shows statistics of locks. Currently the command shows the number of allocated locks. The command is useful only if LOCKDEBUG is enabled.
Add a new command, show all locks, which shows information of active locks
The command shows information about all active (i.e., currently held) locks that are tracked through either LWPs or CPUs by the LOCKDEBUG facility. The /t modifier additionally shows a backtrace for each LWP. This feature is useful for debugging, especially to analyze deadlocks. The command is useful only if LOCKDEBUG is enabled.
Don't pass a unset address to lockdebug_lock_print
x86: avoid accessing invalid addresses in ddb, like arm32. This prevents a command from stopping in the middle of execution when a fault occurs due to an access to an invalid address.
Get rid of a redundant output
Improve wording. Fix a Cm argument.
ddb: rename "show lockstat" to "show lockstats" to avoid conflicting with lockstat(8) Requested by mrg@
|
1.19.8.1 |
| 19-Feb-2018 |
snj | Pull up following revision(s) (requested by ozaki-r in ticket #556):
sys/sys/xcall.h: 1.6
share/man/man9/xcall.9: 1.11-1.12
sys/kern/subr_xcall.c: 1.21-1.25
Refer softint(9)
--
Support arbitrary softint IPLs in high priority xcall
The high priority xcall supported only a softint at IPL_SOFTSERIAL. That meant it didn't work for xcall callbacks that depend on IPLs lower than IPL_SOFTSERIAL.
The change gives xcall multiple softints at different IPLs and allows users to specify an arbitrary IPL. Users specify an IPL via XC_HIGHPRI_IPL passed as the 1st argument of xc_broadcast or xc_unicast.
Note that xcall still serves requests one by one (i.e., doesn't run them concurrently) even if requests have different IPLs.
Proposed on tech-kern@
--
Use high priority xcall with a softint at the same IPL as the psref class's one
This mitigates undesired delay of psref_target_destroy under load such as heavy network traffic that loads softints.
--
Try to fix the build: avoid duplicate case labels when IPL_SOFT* are all the same.
--
Fix build of kernels where some (or all) IPL_SOFT* levels share a value (e.g., mips)
--
Avoid allocating unused softints whose IPL shares a value with another
Sort XC_IPL_* in order of priority (NFC)
|
1.26.4.1 |
| 13-Apr-2020 |
martin | Mostly merge changes from HEAD upto 20200411
|
1.33.8.1 |
| 03-Jan-2021 |
thorpej | Sync w/ HEAD.
|
1.34.18.2 |
| 21-Sep-2024 |
martin | Pull up following revision(s) (requested by rin in ticket #905):
sys/kern/subr_xcall.c: revision 1.36
xcall(9): If !mp_online, raise spl or set LP_BOUND to call func.
High-priority xcalls may reasonably assume that the spl is raised to splsoftserial, so make sure to do that in xc_broadcast.
Low-priority xcalls may reasonably enter paths that assume the lwp is bound to a CPU, so let's make it assertable even if it doesn't have any other consequences when !mp_online.
|
1.34.18.1 |
| 11-Sep-2024 |
martin | Pull up following revision(s) (requested by rin in ticket #821):
sys/arch/x86/x86/intr.c: revision 1.169
sys/kern/kern_softint.c: revision 1.76
sys/kern/subr_workqueue.c: revision 1.48
sys/kern/kern_idle.c: revision 1.36
sys/kern/subr_xcall.c: revision 1.38
check that l_nopreempt (preemption count) doesn't change after callbacks
check that the idle loop, soft interrupt handlers, workqueue, and xcall callbacks do not modify the preemption count, in most cases, knowing it should be 0 currently.
this work was originally done by simonb. cleaned up slightly, with some minor enhancements by myself and discussion with riastradh@. other callback call sites could check this as well (such as MD interrupt handlers, or really anything that involves a callback registration; the x86 version is to be committed separately).
apply some more diagnostic checks for x86 interrupts: convert intr_biglock_wrapper() into a slightly less complete intr_wrapper(), and move the kernel lock/unlock points into the new intr_biglock_wrapper(). add curlwp->l_nopreempt checking for interrupt handlers, including the dtrace wrapper.
XXX: has to copy the i8254_clockintr hack.
tested for a few months by myself, and recently by rin@ on both current and netbsd-10. thanks!
|