History log of /src/sys/kern/kern_cpu.c
Revision  Date  Author  Comments
 1.98  17-Jan-2025  mrg partly prepare for more than 2-level CPU speed scheduler support

put the code that looks at SPCF_IDLE and SPCF_1STCLASS mostly behind
functions that can grow support for more than 2 CPU classes.
4 new functions, with 2 of them just simple aliases for the 1st:

bool cpu_is_type(struct cpu_info *ci, int wanted);
bool cpu_is_idle_1stclass(struct cpu_info *ci);
bool cpu_is_1stclass(struct cpu_info *ci);
bool cpu_is_better(struct cpu_info *ci1, struct cpu_info *ci2);

with this in place, we can retain the preference for running on
1st-class CPUs, while also expanding cpu_is_better() to handle
multiple non-1st-class CPUs. ultimately, i envision a priority number
that lets us mark the fastest turbo-speed cores ahead of others, for
the cases where we can detect this.

XXX: use struct schedstate_percpu instead of cpu_info?

NFCI.
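A minimal sketch of what such wrappers could look like, assuming the
SPCF_* bits are kept in ci->ci_schedstate.spc_flags like the other
scheduler flags (illustrative only, not necessarily the committed code):

	bool
	cpu_is_type(struct cpu_info *ci, int wanted)
	{
		/* true iff all wanted SPCF_* bits are set for this CPU */
		return (ci->ci_schedstate.spc_flags & wanted) == wanted;
	}

	bool
	cpu_is_idle_1stclass(struct cpu_info *ci)
	{
		return cpu_is_type(ci, SPCF_IDLE | SPCF_1STCLASS);
	}

	bool
	cpu_is_1stclass(struct cpu_info *ci)
	{
		return cpu_is_type(ci, SPCF_1STCLASS);
	}

	bool
	cpu_is_better(struct cpu_info *ci1, struct cpu_info *ci2)
	{
		/* prefer 1st-class CPUs for now; a per-class priority
		   number could slot in here once there are >2 classes */
		return cpu_is_1stclass(ci1) && !cpu_is_1stclass(ci2);
	}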
 1.97  02-Sep-2023  riastradh heartbeat(9): Move #ifdef HEARTBEAT to sys/heartbeat.h.

Less error-prone this way, and the callers are less cluttered.
 1.96  02-Sep-2023  riastradh cpu_setstate: Fix call to heartbeat_suspend.

Do this on successful offlining, not on failed offlining.

No functional change right now because heartbeat_suspend is
implemented as a noop -- heartbeat(9) will just check the
SPCF_OFFLINE flag. But if we change it to not be a noop, well, then
we need to call it in the right place.
 1.95  07-Jul-2023  riastradh heartbeat(9): New mechanism to check progress of kernel.

This uses hard interrupts to check progress of low-priority soft
interrupts, and one CPU to check progress of another CPU.

If no progress has been made after a configurable number of seconds
(kern.heartbeat.max_period, default 15), then the system panics --
preferably on the CPU that is stuck so we get a stack trace in dmesg
of where it was stuck, but if the stuckness was detected by another
CPU and the stuck CPU doesn't acknowledge the request to panic within
one second, the detecting CPU panics instead.

This doesn't supplant hardware watchdog timers. It is possible for
hard interrupts to be stuck on all CPUs for some reason too; in that
case heartbeat(9) has no opportunity to complete.

Downside: heartbeat(9) relies on hardclock to run at a reasonably
consistent rate, which might cause trouble for the glorious tickless
future. However, it could be adapted to take a parameter for an
approximate number of units that have elapsed since the last call on
the current CPU, rather than treating that as a constant 1.

XXX kernel revbump -- changes struct cpu_info layout
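In rough pseudocode (hypothetical field and variable names, not the
actual heartbeat(9) internals), the per-CPU check made from the
hard-interrupt side amounts to:

	/*
	 * Illustrative sketch only: a counter the low-priority soft
	 * interrupt advances (ci_hb_count) is compared against the number
	 * of hardclock ticks that have elapsed without it moving.
	 */
	void
	heartbeat_check(struct cpu_info *ci)
	{
		if (ci->ci_hb_count != ci->ci_hb_count_seen) {
			/* progress was made; reset the timer */
			ci->ci_hb_count_seen = ci->ci_hb_count;
			ci->ci_hb_ticks = 0;
		} else if (++ci->ci_hb_ticks > heartbeat_max_period * hz) {
			/* no progress for max_period seconds */
			panic("%s: soft interrupts stuck", cpu_name(ci));
		}
		/* checking another CPU's counter works the same way, plus
		   an IPI asking the stuck CPU to panic first */
	}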
 1.94  26-Feb-2023  skrll ci_data.cpu_kcpuset -> ci_kcpuset

NFCI.
 1.93  08-Oct-2020  rin PR kern/45117

Work around regression introduced in rev 1.92:

http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/kern/kern_cpu.c#rev1.92

by which ``cpuctl offline n'' became broken on architectures without
__HAVE_INTR_CONTROL (i.e., everything other than alpha and x86);
cpu_setintr() always fails on these archs, and we had neglected the
return value from that function until rev 1.91.

XXX
As martin pointed out in the PR, I'm not sure whether the fix in rev 1.92
itself is correct. Insert an XXX comment referring to the PR there.
 1.92  13-Jul-2020  jruoho Do not allow disabling interrupts on the primary CPU. Fixes PR kern/45117.
 1.91  28-May-2020  ad At least panic with a useful message if there are too many CPUs.
 1.90  23-May-2020  ad Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
 1.89  21-Dec-2019  ad Fix build failure.
 1.88  20-Dec-2019  ad Split subr_cpu.c out of kern_cpu.c, to contain routines shared with rump.
 1.87  20-Dec-2019  ad Some more CPU topology stuff:

- Use cegger@'s ACPI SRAT parsing code to figure out NUMA node ID for each
CPU as it is attached.

- For scheduler experiments with SMT, flag CPUs with the lowest numbered SMT
IDs as "primaries", link back to the primaries from secondaries, and build
a circular list of CPUs in each package with identical SMT IDs.

- No need for package/core/smt/numa IDs to be anything other than a u_int.
 1.86  18-Dec-2019  ad Pacify rump build.
 1.85  17-Dec-2019  ad More rump-ing. I will split this into two files during the week.
 1.84  17-Dec-2019  ad Rump is living up to its name
 1.83  17-Dec-2019  ad Hopefully unbreak the build - now that this is included in rump.
 1.82  16-Dec-2019  ad - Extend the per-CPU counters matt@ did to include all of the hot counters
in UVM, excluding uvmexp.free, which needs special treatment and will be
done with a separate commit. Cuts system time for a build by 20-25% on
a 48 CPU machine w/DIAGNOSTIC.

- Avoid 64-bit integer divide on every fault (for rnd_add_uint32).
 1.81  04-Dec-2019  wiz Fix typo in comment (typlogy)
 1.80  03-Dec-2019  ad - Add some more failsafes to the CPU topology stuff, and build a 3rd
circular list of peer CPUs in other packages, so we might scroll through
them in the scheduler when looking to distribute or steal jobs.

- Fold the run queue data structure into spc_schedstate. Makes kern_runq.c
a far more pleasant place to work.

- Remove the code in sched_nextlwp() that tries to steal jobs from other
CPUs. It's not needed, because we do the very same thing in the idle LWP
anyway. Outside the VM system this was one of the main causes of L3
cache misses I saw during builds. On my machine, this change yields a
60%-70% drop in time on the "hackbench" benchmark (there's clearly a bit
more going on here, but basically being less aggressive helps).
 1.79  02-Dec-2019  ad Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.
 1.78  01-Dec-2019  ad Fix false sharing problems with cpu_info. Identified with tprof(8).
This was a very nice win in my tests on a 48 CPU box.

- Reorganise cpu_data slightly according to usage.
- Put cpu_onproc into struct cpu_info alongside ci_curlwp (now is ci_onproc).
- On x86, put some items in their own cache lines according to usage, like
the IPI bitmask and ci_want_resched.
 1.77  13-Nov-2019  mrg put the ucode not found message under #ifdef DEBUG. use printf()
instead of aprint_error().

there's an error returned to userland and displayed by cpuctl.
 1.76  06-Oct-2019  uwe Define cpu_xc_* functions with unused second argument to make them
conform to xcfunc_t callback typedef (-Wcast-function-type).
Same object code is generated.
 1.75  13-Nov-2018  skrll Fix/add KASSERTS to work with a system of MAXCPUS. Add some comments to
explain things.

Discussed with rmind
 1.74  04-Jul-2018  msaitoh Don't allocate memory and return EFTYPE if sc->sc_blobsize==0 to prevent
panic in firmware_malloc().
 1.73  18-Mar-2018  christos branches: 1.73.2;
finish MD glue for compat ucode module.
 1.72  17-Mar-2018  christos move the compat code in compat.
 1.71  29-Aug-2015  maxv branches: 1.71.10; 1.71.16;
Don't decrement the number of offline cpus if we fail to shut down one.

ok christos@, via tech-kern@
 1.70  20-Aug-2015  christos include ioconf.h instead of locally declaring the prototype of the attach
function
 1.69  20-Aug-2015  uebayasi Mark pseudo attach unused arg with __unused.
 1.68  18-Aug-2015  uebayasi Convert pseudo attach functions to take no arguments, as some functions
(pppattach(), putterattach(), etc.) already do. This means that a pseudo
attach function will be able to become a constructor.
 1.67  07-Jan-2015  ozaki-r Pass a correct firmware size (instead of 0) to firmware_free

firmware_free now uses kmem_free(9) instead of free(9),
so we need to pass a correct size to it.
 1.66  25-Jul-2014  dholland branches: 1.66.2; 1.66.4;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.
 1.65  25-Mar-2014  macallan branches: 1.65.2;
snprintf -> vsnprintf in cpu_setmodel()
now this can actually work
hi christos
 1.64  24-Mar-2014  christos - create cpu_{g,s}etmodel() and hide cpu_model from direct access.
 1.63  16-Mar-2014  dholland Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.
 1.62  19-Dec-2013  mlelstv cpu_infos is a NULL terminated array, not an array followed by a 0 byte.
 1.61  24-Nov-2013  rmind Remove cpu_queue (and thus eliminate another use of CIRCLEQ) by replacing
its uses with cpu_infos array. Extra testing by christos@.
 1.60  22-Aug-2013  drochner - extend the pcu(9) API by a function which saves all context on the
current CPU, and use it if a CPU is taken offline
- add a bool argument to pcu_discard which tells whether the internal
"LWP has used the coprocessor" flag should be set or reset. The flag
is reported by pcu_used_p(). If set, future accesses should use the
state stored in the PCB. If reset, it should be reset to default.
The former case is useful for setmcontext().
With that, it should not be necessary anymore to manage the "FPU used"
state by an additional MD variable.

approved by matt
 1.59  17-Oct-2012  drochner branches: 1.59.2;
put binary compatibility support for the old AMD-only CPU microcode
update API inside COMPAT_60
 1.58  01-Sep-2012  matt branches: 1.58.2;
Add a kcpuset_t which just includes ourself.
Add a ci_cpuname for convenience
 1.57  29-Aug-2012  drochner Extend the CPU microcode update framework to support Intel x86 CPUs.
Contrary to the AMD implementation, it doesn't use xcalls to distribute
the update to all CPUs but relies on cpuctl(8) to bind itself to the
right CPU -- to keep it simple and avoid possible problems with
hyperthreading.
Also, it doesn't parse the vendor supplied file to pick the right
part for the present CPU model but relies on userland to prepare
files with specific filenames. I'll commit a pkg for this in a minute
(pkgsrc/sysutils/intel-microcode).
The ioctl interface changed; compatibility is provided (should be
limited to COMPAT_NETBSD6 as soon as this is available).
 1.56  13-Jun-2012  joerg Kill conditionals that are always true. Drop a dead assignment.
 1.55  29-Jan-2012  rmind - Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().
 1.54  17-Jan-2012  cegger fix secmodel implementation of CPU_UCODE.
ok wiz@ for the manpages
ok elad@
 1.53  13-Jan-2012  cegger Support CPU microcode loading via cpuctl(8).
Implemented and enabled via CPU_UCODE kernel config option
for x86 and Xen Dom0.
Tested on different AMD machines with different
CPU families.

ok wiz@ for the manpages
ok releng@
ok core@ via releng@
 1.52  29-Oct-2011  jym branches: 1.52.2; 1.52.6;
Fix comment.
 1.51  11-Sep-2011  jdc Add a cs_hwid field to cpustate and use this to store the ci_cpuid (hardware
ID). Report this as the HwID in cpuctl.
OK jruoho@.
 1.50  07-Aug-2011  rmind - Add an argument to kcpuset_create() for zeroing.
- Add kcpuset_atomic_set(), kcpuset_atomic_clear() and kcpuset_merge().
 1.49  07-Aug-2011  rmind Remove LW_AFFINITY flag and fix some bugs in affinity mask handling.
 1.48  07-Aug-2011  rmind Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@
 1.47  29-Jun-2011  matt Add the new ci to cpu_infos *before* calling routines which may want to
cpu_lookup.
 1.46  13-May-2011  rmind Sprinkle __cacheline_aligned and __read_mostly.
 1.45  22-Dec-2010  matt branches: 1.45.2;
Add CTASSERT to verify __HAVE_CPU_DATA_FIRST is correctly defined or undefined.
 1.44  25-Apr-2010  ad Allocate the cpu_infos array dynamically.
 1.43  13-Jan-2010  mrg branches: 1.43.2; 1.43.4;
introduce a new function that returns a unique string for each cpu:

char *cpu_name(struct cpu_info *);

and use it when setting up the runq event counters, avoiding an 8 byte
kmem(4) allocation for each cpu. there are more places the cpuname is
used that can be converted to using this new interface, but that can
and will be done as future work.

as discussed with rmind.
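For example, a per-CPU event counter can then be attached against the
shared name rather than a kmem(4)-allocated copy (a sketch of
evcnt_attach_dynamic(9) usage with a hypothetical counter, not the exact
kern_runq.c change):

	struct evcnt ci_ev_pull;	/* hypothetical per-CPU counter */

	evcnt_attach_dynamic(&ci_ev_pull, EVCNT_TYPE_MISC, NULL,
	    cpu_name(ci), "runqueue pull");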
 1.42  19-Apr-2009  ad cpuctl:

- Add interrupt shielding (direct hardware interrupts away from the
specified CPUs). Not documented just yet but will be soon.

- Redo /dev/cpu time_t compat so no kernel changes are needed.

x86:

- Make intr_establish, intr_disestablish safe to use when !cold.

- Distribute hardware interrupts among the CPUs, instead of directing
everything to the boot CPU.

- Add MD code for interrupt shielding. This works in most cases but there is
a bug where delivery is not accepted by an LAPIC after redistribution. It
also needs re-balancing to make things fair after interrupts are turned
back on for a CPU.
 1.41  19-Jan-2009  njoly branches: 1.41.2;
Clear error value on exit for IOC_CPU_OGETSTATE ioctl command.
 1.40  19-Jan-2009  christos provide compat_50
 1.39  07-Dec-2008  ad Add cpu_softintr_p() for assertions
 1.38  06-Nov-2008  rmind branches: 1.38.2;
cpuctl_ioctl: use cpu_index(), instead of cpuid.
Fixes cpuctl(8) on some processors.
 1.37  31-Oct-2008  rmind - Avoid the race with CPU online/offline state changes, when setting the
affinity (cpu_lock protects these operations now).
- Disallow setting the state of a CPU to offline if there are bound LWPs
which have no CPU to migrate to.
- Disallow setting of affinity for the LWP(s), if all CPUs in the dynamic
CPU-set are offline.
- sched_setaffinity: fix invalid check of kcpuset_isset().
- Rename cpu_setonline() to cpu_setstate().

Should fix PR/39349.
 1.36  15-Oct-2008  ad branches: 1.36.2; 1.36.4;
- Rename cpu_lookup_byindex() to cpu_lookup(). The hardware ID isn't of
interest to MI code. No functional change.
- Change /dev/cpu to operate on cpu index, not hardware ID. Now cpuctl
shouldn't print confused output.
 1.35  28-Aug-2008  yamt cpu_xc_offline: fix races with eg. sleepq_remove.
 1.34  14-Jul-2008  rmind Fix the locking against oneself, migrate LWPs only from runqueue.
Part of the fix for PR/38882.
 1.33  22-Jun-2008  ad branches: 1.33.2;
When offlining a CPU, ensure that at least one other CPU within the same
processor set remains online, otherwise the system can deadlock.
 1.32  04-Jun-2008  ad branches: 1.32.2;
- vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
 1.31  29-May-2008  rmind Simplification for running LWP migration. Removes double-locking in
mi_switch(), migration for LSONPROC is now performed via idle loop.
Handles/fixes on-CPU case in lwp_migrate(), misc.

Closes PR/38169, idea of migration via idle loop by Andrew Doran.
 1.30  06-May-2008  ad branches: 1.30.2;
LOCKDEBUG: try to speed it up a bit by not using so much global state.

This will break the build briefly but will be followed by another commit
to fix that.
 1.29  28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.28  24-Apr-2008  ad branches: 1.28.2;
Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
 1.27  22-Apr-2008  ad Implement MP callouts as discussed on tech-kern. The CPU binding code is
disabled for the moment until we figure out what we want to do with CPUs
being offlined.
 1.26  12-Apr-2008  ad branches: 1.26.2;
Move the LW_BOUND flag into the thread-private flag word. It can be tested
by other threads/CPUs but that is only done when the LWP is known to be in a
quiescent state (for example, on a run queue).
 1.25  12-Apr-2008  ad Take the run queue management code from the M2 scheduler, and make it
mandatory. Remove the 4BSD run queue code. Effects:

- Pluggable scheduler is only responsible for co-ordinating timeshared jobs.
- All systems run with per-CPU run queues.
- 4BSD scheduler gets processor sets / affinity.
- 4BSD scheduler gets a significant performance boost on some workloads.

Discussed on tech-kern@.
 1.24  11-Apr-2008  ad Maintain a circular queue of cpu_info's.
 1.23  11-Apr-2008  ad Restructure the name cache code to eliminate most lock contention
resulting from forward lookups. Discussed on tech-kern@.
 1.22  22-Mar-2008  ad Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.
 1.21  14-Feb-2008  ad branches: 1.21.6;
Make schedstate_percpu::spc_lwplock an externally allocated item. Remove
the hacks in sparc/cpu.c to reinitialize it. This should be in its own
cache line but that's another change.
 1.20  01-Feb-2008  elad Replace a KAUTH_GENERIC_ISSUSER in the cpuctl code with a proper kauth
request.

Reviewed by ad@, tested by me.
 1.19  15-Jan-2008  joerg Introduce optional cpu_offline_md to execute MD actions at the end of
cpu_offline. Use this on amd64/i386 to force a FPU save. As this was
triggered by npxsave_cpu/fpusave_cpu not working for a different CPU,
remove the cpu_info argument and adjust npxsave_*/fpusave_* to use bool
for the save.

OK ad@
 1.18  15-Jan-2008  rmind Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.
 1.17  14-Jan-2008  yamt add a per-cpu storage allocator.
 1.16  22-Dec-2007  yamt add a function to lookup cpu_info by cpu index.
 1.15  05-Dec-2007  ad branches: 1.15.4;
Match the docs: MUTEX_DRIVER/SPIN are now only for porting code written
for Solaris.
 1.14  07-Nov-2007  ad branches: 1.14.2;
Merge from vmlocking:

- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
 1.13  06-Nov-2007  ad Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.
 1.12  05-Nov-2007  rmind branches: 1.12.2;
cpu_xc_offline: Do not double-lock the runqueues for SCHED_4BSD, it uses a
global sched_mutex. Fixes a hang reported by <jmcneill>. Tested with M2
and 4BSD - seems to be working fine.
 1.11  04-Nov-2007  rmind - Migrate all threads when the state of CPU is changed to offline;
- Fix inverted logic with r_mcount in M2;
- setrunnable: perform sched_takecpu() when making the LWP runnable;
- setrunnable: l_mutex cannot be spc_mutex here;

This makes cpuctl(8) work with SCHED_M2.

OK by <ad>.
 1.10  17-Oct-2007  ad branches: 1.10.2;
Fix reversed args to memset. From Iain Hibbert.
 1.9  15-Oct-2007  ad Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.
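For example, from userland (plain sysconf(3) usage; nothing here beyond
the two new constants):

	#include <stdio.h>
	#include <unistd.h>

	int
	main(void)
	{
		long conf = sysconf(_SC_NPROCESSORS_CONF);	/* CPUs configured */
		long onln = sysconf(_SC_NPROCESSORS_ONLN);	/* CPUs online */

		printf("%ld configured, %ld online\n", conf, onln);
		return 0;
	}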
 1.8  08-Oct-2007  ad Add stubs that provide new soft interrupt API from the vmlocking branch.
For now these just pass through to the current softintr code.

(The naming is different to allow softint/softintr to co-exist for a while.
I'm hoping that should make it easier to transition.)
 1.7  08-Oct-2007  ad Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.
 1.6  18-Aug-2007  ad branches: 1.6.2; 1.6.4; 1.6.6;
Make the uarea cache per-CPU and drain in batches of 4.
 1.5  05-Aug-2007  rmind branches: 1.5.2;
Improve per-CPU support for the workqueue(9):
- Make structures CPU-cache friendly, as suggested and explained
by Andrew Doran. A CACHE_LINE_SIZE definition is introduced.
- Use the current CPU if NULL is passed to workqueue_enqueue().
- Implemented an MI CPU index, which can be used as an array index.
Removed linked-list usage for work queues.

The roundup2() function avoids division, but works only with powers of 2.

Reviewed by: <ad>, <yamt>, <tech-kern>
 1.4  04-Aug-2007  ad A quick hack to get things building again. Don't refer to curlwp
if !MULTIPROCESSOR.
 1.3  04-Aug-2007  ad Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.
 1.2  17-May-2007  yamt branches: 1.2.2; 1.2.4; 1.2.6; 1.2.10;
merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.
 1.1  24-Mar-2007  yamt branches: 1.1.2;
file kern_cpu.c was initially added on branch yamt-idlelwp.
 1.1.2.3  13-May-2007  ad Assign a per-CPU lock to LWPs as they transition into the ONPROC state.

http://mail-index.netbsd.org/tech-kern/2007/05/06/0003.html
 1.1.2.2  24-Mar-2007  yamt update ncpu when attaching cpus, rather than counting cpus after
calling cpu_boot_secondary_processors.
no difference at this point, but it should be more cpu-hotplug friendly.
 1.1.2.1  24-Mar-2007  yamt initialize ci->ci_schedstate.spc_mutex of APs.
(sched_rqinit is called before APs are attached.)
 1.2.10.8  09-Dec-2007  jmcneill Sync with HEAD.
 1.2.10.7  11-Nov-2007  joerg Sync with HEAD.
 1.2.10.6  06-Nov-2007  joerg Sync with HEAD.
 1.2.10.5  04-Nov-2007  jmcneill Sync with HEAD.
 1.2.10.4  26-Oct-2007  joerg Sync with HEAD.

Follow the merge of pmap.c on i386 and amd64 and move
pmap_init_tmp_pgtbl into arch/x86/x86/pmap.c. Modify the ACPI wakeup
code to restore CR4 before jumping back into kernel space as the large
page option might cover that.
 1.2.10.3  03-Sep-2007  jmcneill Sync with HEAD.
 1.2.10.2  09-Aug-2007  jmcneill Sync with HEAD.
 1.2.10.1  04-Aug-2007  jmcneill Sync with HEAD.
 1.2.6.2  03-Sep-2007  skrll Sync with HEAD.
 1.2.6.1  15-Aug-2007  skrll Sync with HEAD.
 1.2.4.2  11-Jul-2007  mjf Sync with head.
 1.2.4.1  17-May-2007  mjf file kern_cpu.c was added on branch mjf-ufs-trans on 2007-07-11 20:09:44 +0000
 1.2.2.9  01-Nov-2007  ad - Fix interactivity problems under high load. Because soft interrupts
are being stacked on top of regular LWPs, more often than not aston()
was being called on a soft interrupt thread instead of a user thread,
meaning that preemption was not happening on EOI.

- Don't use bool in a couple of data structures. Sub-word writes are not
always atomic and may clobber other fields in the containing word.

- For SCHED_4BSD, make p_estcpu per thread (l_estcpu). Rework how the
dynamic priority level is calculated - it's much better behaved now.

- Kill the l_usrpri/l_priority split now that priorities are no longer
directly assigned by tsleep(). There are three fields describing LWP
priority:

l_priority: Dynamic priority calculated by the scheduler.
This does not change for kernel/realtime threads,
and always stays within the correct band. Eg for
timeshared LWPs it never moves out of the user
priority range. This is basically what l_usrpri
was before.

l_inheritedprio: Lent to the LWP due to priority inheritance
(turnstiles).

l_kpriority: A boolean value set true the first time an LWP
sleeps within the kernel. This indicates that the LWP
should get a priority boost as compensation for blocking.
lwp_eprio() now does the equivalent of sched_kpri() if
the flag is set. The flag is cleared in userret().

- Keep track of scheduling class (OTHER, FIFO, RR) in struct lwp, and use
this to make decisions in a few places where we previously tested for a
kernel thread.

- Partially fix itimers and usr/sys/intr time accounting in the presence
of software interrupts.

- Use kthread_create() to create idle LWPs. Move priority definitions
from the various modules into sys/param.h.

- newlwp -> lwp_create
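A sketch of how the effective priority might fall out of these three
fields (illustrative only; the real lwp_eprio()/sched_kpri() details may
differ):

	pri_t
	lwp_eprio(struct lwp *l)
	{
		pri_t pri = l->l_priority;

		if (l->l_kpriority)
			pri = sched_kpri(l);	/* boost for sleeping in kernel */
		/* priority lent through turnstiles wins if it is higher */
		return MAX(pri, l->l_inheritedprio);
	}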
 1.2.2.8  23-Oct-2007  ad Sync with head.
 1.2.2.7  01-Sep-2007  ad - Add a CPU layer to pool caches. In combination with vmem/kmem this
provides CPU-local slab/object and general purpose allocators. The
strategy used is as described in Jeff Bonwick's USENIX paper, except in
at least one place where the described allocation strategy doesn't make
sense. For exclusive access to the CPU layer the IPL is raised or kernel
preemption disabled. Where the interrupt priority levels are software
emulated this is much cheaper than taking a lock, and I think that
writing to a local %pil register is likely to have a similar penalty to
taking a lock.

No tuning of the group sizes is currently done - all groups have 15
items each, but this should be fairly easy to implement. Also, the
reclamation mechanism should probably use a cross-call to drain the
CPU-level caches on remote CPUs.

Currently this causes kernel memory corruption on i386, yet works without
a problem on amd64. The cache layer is disabled for the time being until I
can find the bug.

- Change the pool_cache API so that the caches are themselves dynamically
allocated, and that each cache is tied to a single pool only. Add some
stubs to change pool_cache parameters that call directly through to the
pool layer (e.g. pool_cache_sethiwat). The idea here is that pool_cache
should become the default object allocator (and so LKM friendly), and
that the pool allocator should be for kernel-internal use only. This will
be posted to tech-kern@ for review.
 1.2.2.6  30-Aug-2007  ad - Instead of xc_broadcast()/xc_unicast() waiting for completion, have them
return a value which can later be passed to xc_wait(). Allows the caller
to go and do other stuff in the meantime.
- Fix the xcall thread to work properly in the event of a spurious wakeup.
 1.2.2.5  26-Aug-2007  ad - Add a generic cross-call facility. Right now this only does threaded cross
calls but that should be extended to do IPIs. These are deliberately set
up as bound kthreads (and not soft interrupts or something else) so that
the called functions can use the spl framework or disable preemption in
order to guarantee exclusive access to CPU-local data.

- Use cross calls to take CPUs online or offline. Ok to do since bound LWPs
still execute on offline CPUs. As a result schedstate_percpu::spc_flags
is CPU-local again and doesn't need locking.
 1.2.2.4  20-Aug-2007  ad Sync with HEAD.
 1.2.2.3  17-Jun-2007  ad - Increase the number of thread priorities from 128 to 256. How the space
is set up is to be revisited.
- Implement soft interrupts as kernel threads. A generic implementation
is provided, with hooks for fast-path MD code that can run the interrupt
threads over the top of other threads executing in the kernel.
- Split vnode::v_flag into three fields, depending on how the flag is
locked (by the interlock, by the vnode lock, by the file system).
- Miscellaneous locking fixes and improvements.
 1.2.2.2  09-Jun-2007  ad Sync with head.
 1.2.2.1  17-May-2007  ad file kern_cpu.c was added on branch vmlocking on 2007-06-09 21:37:26 +0000
 1.5.2.2  05-Aug-2007  rmind Improve per-CPU support for the workqueue(9):
- Make structures CPU-cache friendly, as suggested and explained
by Andrew Doran. A CACHE_LINE_SIZE definition is introduced.
- Use the current CPU if NULL is passed to workqueue_enqueue().
- Implemented an MI CPU index, which can be used as an array index.
Removed linked-list usage for work queues.

The roundup2() function avoids division, but works only with powers of 2.

Reviewed by: <ad>, <yamt>, <tech-kern>
 1.5.2.1  05-Aug-2007  rmind file kern_cpu.c was added on branch matt-mips64 on 2007-08-05 01:19:18 +0000
 1.6.6.2  18-Oct-2007  yamt sync with head.
 1.6.6.1  14-Oct-2007  yamt sync with head.
 1.6.4.9  24-Mar-2008  yamt sync with head.
 1.6.4.8  27-Feb-2008  yamt sync with head.
 1.6.4.7  04-Feb-2008  yamt sync with head.
 1.6.4.6  21-Jan-2008  yamt sync with head
 1.6.4.5  07-Dec-2007  yamt sync with head
 1.6.4.4  15-Nov-2007  yamt sync with head.
 1.6.4.3  27-Oct-2007  yamt sync with head.
 1.6.4.2  03-Sep-2007  yamt sync with head.
 1.6.4.1  18-Aug-2007  yamt file kern_cpu.c was added on branch yamt-lazymbuf on 2007-09-03 14:40:44 +0000
 1.6.2.4  23-Mar-2008  matt sync with HEAD
 1.6.2.3  09-Jan-2008  matt sync with HEAD
 1.6.2.2  08-Nov-2007  matt sync with -HEAD
 1.6.2.1  06-Nov-2007  matt sync with HEAD
 1.10.2.1  13-Nov-2007  bouyer Sync with HEAD
 1.12.2.4  18-Feb-2008  mjf Sync with HEAD.
 1.12.2.3  27-Dec-2007  mjf Sync with HEAD.
 1.12.2.2  08-Dec-2007  mjf Sync with HEAD.
 1.12.2.1  19-Nov-2007  mjf Sync with HEAD.
 1.14.2.2  26-Dec-2007  ad Sync with head.
 1.14.2.1  08-Dec-2007  ad Sync with head.
 1.15.4.2  19-Jan-2008  bouyer Sync with HEAD
 1.15.4.1  02-Jan-2008  bouyer Sync with HEAD
 1.21.6.7  17-Jan-2009  mjf Sync with HEAD.
 1.21.6.6  28-Sep-2008  mjf Sync with HEAD.
 1.21.6.5  29-Jun-2008  mjf Sync with HEAD.
 1.21.6.4  05-Jun-2008  mjf Sync with HEAD.

Also fix build.
 1.21.6.3  02-Jun-2008  mjf Sync with HEAD.
 1.21.6.2  14-Apr-2008  mjf - remove comments that are no longer true
- add support to devfsd(8) and devfsctl(4) to handle wedges
- add cpuctl device registration
- extract the alloc part out of device_register_name() into a common function
that can be used by the new device_register_sync(), which is used to
synchronously create device files
 1.21.6.1  03-Apr-2008  mjf Sync with HEAD.
 1.26.2.3  17-Jun-2008  yamt sync with head.
 1.26.2.2  04-Jun-2008  yamt sync with head
 1.26.2.1  18-May-2008  yamt sync with head.
 1.28.2.4  11-Aug-2010  yamt sync with head.
 1.28.2.3  11-Mar-2010  yamt sync with head
 1.28.2.2  04-May-2009  yamt sync with head.
 1.28.2.1  16-May-2008  yamt sync with head.
 1.30.2.2  18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.30.2.1  23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.32.2.2  18-Jul-2008  simonb Sync with head.
 1.32.2.1  27-Jun-2008  simonb Sync with head.
 1.33.2.2  13-Dec-2008  haad Update haad-dm branch to haad-dm-base2.
 1.33.2.1  19-Oct-2008  haad Sync with HEAD.
 1.36.4.2  13-Nov-2008  snj branches: 1.36.4.2.4;
Pull up following revision(s) (requested by rmind in ticket #48):
sys/kern/kern_cpu.c: revision 1.37
sys/arch/x86/x86/cpu.c: revision 1.58
sys/arch/xen/x86/cpu.c: revision 1.29
sys/sys/cpu.h: revision 1.24
sys/kern/sys_sched.c: revision 1.31
- Avoid the race with CPU online/offline state changes, when setting the
affinity (cpu_lock protects these operations now).
- Disallow setting the state of a CPU to offline if there are bound LWPs
which have no CPU to migrate to.
- Disallow setting of affinity for the LWP(s), if all CPUs in the dynamic
CPU-set are offline.
- sched_setaffinity: fix invalid check of kcpuset_isset().
- Rename cpu_setonline() to cpu_setstate().
Should fix PR/39349.
 1.36.4.1  07-Nov-2008  snj Pull up following revision(s) (requested by cegger in ticket #21):
sys/kern/kern_cpu.c: revision 1.38
cpuctl_ioctl: use cpu_index(), instead of cpuid.
Fixes cpuctl(8) on some processors.
 1.36.4.2.4.1  15-Feb-2014  matt Add cpu_softintr_p()
Add cpu_name to cpu_data
 1.36.2.3  28-Apr-2009  skrll Sync with HEAD.
 1.36.2.2  03-Mar-2009  skrll Sync with HEAD.
 1.36.2.1  19-Jan-2009  skrll Sync with HEAD.
 1.38.2.1  07-Dec-2008  ad Pull cpu_softintr_p() from trunk.
 1.41.2.1  13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.43.4.3  31-May-2011  rmind sync with head
 1.43.4.2  05-Mar-2011  rmind sync with head
 1.43.4.1  30-May-2010  rmind sync with head
 1.43.2.1  30-Apr-2010  uebayasi Sync with HEAD.
 1.45.2.1  06-Jun-2011  jruoho Sync with HEAD.
 1.52.6.1  18-Feb-2012  mrg merge to -current.
 1.52.2.3  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.52.2.2  30-Oct-2012  yamt sync with head
 1.52.2.1  17-Apr-2012  yamt sync with head
 1.58.2.3  03-Dec-2017  jdolecek update from HEAD
 1.58.2.2  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.58.2.1  20-Nov-2012  tls Resync to 2012-11-19 00:00:00 UTC
 1.59.2.2  18-May-2014  rmind sync with head
 1.59.2.1  28-Aug-2013  rmind sync with head
 1.65.2.1  10-Aug-2014  tls Rebase.
 1.66.4.2  22-Sep-2015  skrll Sync with HEAD
 1.66.4.1  06-Apr-2015  skrll Sync with HEAD
 1.66.2.1  04-Nov-2015  riz Pull up following revision(s) (requested by maxv in ticket #965):
sys/kern/kern_cpu.c: revision 1.71
Don't decrement the number of offline cpus if we fail to shut down one.
ok christos@, via tech-kern@
 1.71.16.11  26-Nov-2018  pgoyette Sync with HEAD, resolve a couple of conflicts
 1.71.16.10  28-Jul-2018  pgoyette Sync with HEAD
 1.71.16.9  22-Mar-2018  pgoyette Synch with HEAD, resolve conflicts
 1.71.16.8  18-Mar-2018  pgoyette Import from -current the MD glue code for compat cpu_ucode
 1.71.16.7  17-Mar-2018  pgoyette Import christos's changes for the compat_60 cpu_ucode stuff
 1.71.16.6  17-Mar-2018  pgoyette Back out changes on the branch related to kernel microcode compat.

Christos didn't like the way it was done, so waiting for a better
approach/implementation.
 1.71.16.5  17-Mar-2018  pgoyette Use two different compat stubs since they have different prototypes.
 1.71.16.4  17-Mar-2018  pgoyette Typo - add missing (
 1.71.16.3  17-Mar-2018  pgoyette Typos - add missing )'s
 1.71.16.2  16-Mar-2018  pgoyette Move closer to getting a compat_60 module - still needs more work
 1.71.16.1  16-Mar-2018  pgoyette Initial pass at setting up the compat_60 module.

XXX needs some work to properly handle cpu_ucode stuff.

While here, move details of compat_70 init/fini routines into the
module itself.
 1.71.10.1  26-Jul-2018  snj Pull up following revision(s) (requested by msaitoh in ticket #929):
sys/arch/x86/x86/cpu_ucode_intel.c: 1.14
sys/kern/kern_cpu.c: 1.74
Add cpu_ucode_intel_verify() to verify the microcode image. Currently, we don't
verify extended signatures' checksums. I don't have any image which has an
extended signature. If an extended signature is found, the function prints
"This image has extended signature table." and continues.
--
Don't allocate memory and return EFTYPE if sc->sc_blobsize==0 to prevent
panic in firmware_malloc().
 1.73.2.2  13-Apr-2020  martin Mostly merge changes from HEAD upto 20200411
 1.73.2.1  10-Jun-2019  christos Sync with HEAD
