History log of /src/sys/kern/kern_threadpool.c
Revision   Date         Author     Comments
 1.23  23-Jan-2021  riastradh threadpool(9): Fix synchronization between cancel and dispatch.

- threadpool_cancel_job_async tried to prevent
threadpool_dispatcher_thread from taking the job by setting
job->job_thread = NULL and then removing the job from the queue.

- But threadpool_dispatcher_thread didn't notice that job->job_thread
was null until after it had also removed the job from the queue =>
double-remove, *boom*.

The solution is to teach threadpool_dispatcher_thread to wait until
it has acquired the job lock to test whether job->job_thread is still
valid before it decides to remove the job from the queue.

Fixes PR kern/55948.

XXX pullup-9
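
The fix reads more clearly in code. Below is a hedged sketch of the corrected
dispatcher-side ordering; job->job_thread and the hold/rele calls come from the
commit messages here and at 1.13 below, while the remaining member names
(tp_lock, tp_jobs, job_lock, job_entry) and the helper itself are illustrative
assumptions, not the literal kern_threadpool.c source:

    /*
     * Sketch only: the dispatcher peeks at the head of the queue, drops
     * the pool lock, takes the job lock, and removes the job from the
     * queue only after confirming job_thread still points at the
     * dispatcher.  A concurrent threadpool_cancel_job_async() that has
     * already cleared job_thread and dequeued the job is then detected
     * instead of causing a second TAILQ_REMOVE.
     */
    static void
    dispatch_one_job(struct threadpool *pool, struct threadpool_thread *dispatcher)
    {
            struct threadpool_job *job;

            mutex_spin_enter(&pool->tp_lock);
            if ((job = TAILQ_FIRST(&pool->tp_jobs)) == NULL) {
                    mutex_spin_exit(&pool->tp_lock);
                    return;
            }
            threadpool_job_hold(job);       /* keep the job alive across the lock dance */
            mutex_spin_exit(&pool->tp_lock);

            mutex_enter(job->job_lock);
            if (job->job_thread == dispatcher) {
                    /* Still ours: only now is it safe to dequeue. */
                    mutex_spin_enter(&pool->tp_lock);
                    TAILQ_REMOVE(&pool->tp_jobs, job, job_entry);
                    /* ... hand the job to an idle worker thread ... */
                    mutex_spin_exit(&pool->tp_lock);
            } /* else: cancelled; threadpool_cancel_job_async already dequeued it. */
            threadpool_job_rele(job);       /* job lock held, per 1.13 below */
            mutex_exit(job->job_lock);
    }
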
 1.22  13-Jan-2021  skrll Improve English in a comment
 1.21  13-Jan-2021  riastradh threadpool(9): Tidy up thread naming.

- `dispatcher', not `overseer' -- much more appropriate metaphor.
- Just omit `/-1' from unbound thread names.
- Just omit `@-1' from dynamic-priority (PRI_NONE) thread names.
 1.20  13-Jan-2021  riastradh threadpool(9): Make threadpool_percpu_ref_remote non-sleepable.

Needed for threadpool-based workqueue_enqueue to run in interrupt
context.
 1.19  07-Sep-2020  riastradh branches: 1.19.2;
threadpool: Simplify job reference-counting logic.

Use atomic_load_relaxed while here.
 1.18  25-Apr-2020  thorpej Take the ASSERT_SLEEPABLE() out of threadpool_cancel_job() and add a
comment explaining why we can't make that assertion there.
 1.17  09-Feb-2020  riastradh Switch from ad-hoc logging to dtrace probes.
 1.16  09-Feb-2020  riastradh Teach threadpool(9) to use percpu_create, mostly.
 1.15  17-Jan-2019  hannken branches: 1.15.4; 1.15.6; 1.15.8;
Use PRIu64 for "uint64_t tp_refcnt".
 1.14  29-Dec-2018  thorpej Expose the worker thread idle timeout via sysctl as "kern.threadpool.idle_ms".
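
A minimal userland sketch, not from the tree, that reads this knob through
sysctlbyname(3); it assumes the node holds a plain int, and a new value could
likewise be supplied through the newp/newlen arguments or with sysctl(8):

    #include <sys/param.h>
    #include <sys/sysctl.h>
    #include <stdio.h>

    int
    main(void)
    {
            int ms;
            size_t len = sizeof(ms);

            /* Read the worker-thread idle timeout exposed by this revision. */
            if (sysctlbyname("kern.threadpool.idle_ms", &ms, &len, NULL, 0) == -1) {
                    perror("sysctlbyname");
                    return 1;
            }
            printf("threadpool idle timeout: %d ms\n", ms);
            return 0;
    }
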
 1.13  28-Dec-2018  thorpej Fix job reference counting:
- threadpool_job_hold() no longer returns failure on overflow; it
asserts that overflow doesn't happen.
- threadpool_job_rele() must be called with the job lock held.
- Always grab a reference count on the job in threadpool_schedule_job()
if we're going to do any work.
- Drop that reference count directly in threadpool_job_done(); it's not
safe to dereference the job structure after the job function has called it.
- In the overseer thread, when handing the job off to a worker thread, hold an
extra reference briefly, as there's a window where we hold neither the
pool lock nor the job lock, and without this extra reference the job could
be snatched away.
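
A consequence of the threadpool_job_done() rule is that a job function must
treat that call as its final use of the job. Below is a hedged sketch;
struct my_softc and its members are hypothetical, and only the threadpool(9)
calls and the rule being illustrated come from the source:

    #include <sys/param.h>
    #include <sys/containerof.h>
    #include <sys/mutex.h>
    #include <sys/threadpool.h>

    struct my_softc {                       /* hypothetical consumer */
            kmutex_t                sc_lock;    /* also passed to threadpool_job_init() */
            struct threadpool_job   sc_job;
            bool                    sc_pending;
    };

    static void
    my_job_fn(struct threadpool_job *job)
    {
            struct my_softc *sc = container_of(job, struct my_softc, sc_job);

            /* ... do the actual work ... */

            mutex_enter(&sc->sc_lock);
            sc->sc_pending = false;
            threadpool_job_done(job);       /* drops the reference taken at schedule time */
            mutex_exit(&sc->sc_lock);
            /* Per the rule above, `job' must not be dereferenced past this point. */
    }
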
 1.12  27-Dec-2018  thorpej Restore curlwp->l_name in threadpool_job_done(), rather than after the
job function has returned. This lays the groundwork for some job object
reference counting change that will be coming in a subsequent commit.
 1.11  26-Dec-2018  thorpej Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.
 1.10  26-Dec-2018  thorpej Adjust the definition of threadpool_job_fn_t to reflect Taylor's original
intent. (The original didn't compile, and I'm not a very good mind reader.)
 1.9  26-Dec-2018  thorpej Whitespace tweaks.
 1.8  26-Dec-2018  thorpej Stylistic tweak to previous.
 1.7  26-Dec-2018  thorpej Simplify thread reference counting of the thread pool object.
 1.6  26-Dec-2018  thorpej Make the callers of threadpool_create() and threadpool_destroy()
responsible for managing their own storage.
 1.5  26-Dec-2018  thorpej Use uint64_t for the unbound and per-cpu thread pool ref counts; they're
always manipulated under a lock. Rather than bother returning EBUSY,
just assert that the ref count never overflows (if it ever does, you have
bigger problems).
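
A hedged illustration of the pattern described above, with made-up structure
and member names; only the idea (a lock-protected uint64_t reference count
whose overflow is asserted rather than reported) comes from the change:

    struct pool_ref {                       /* hypothetical */
            kmutex_t        pr_lock;
            uint64_t        pr_refcnt;
    };

    /* Sketch only: bump the count under the lock; overflow is a bug, not an error. */
    static void
    pool_ref_hold(struct pool_ref *ref)
    {
            KASSERT(mutex_owned(&ref->pr_lock));
            KASSERT(ref->pr_refcnt < UINT64_MAX);
            ref->pr_refcnt++;
    }
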
 1.4  26-Dec-2018  thorpej - De-opaque'ify struct threadpool_job.
- De-_t'ify all of the structure types.

No functional change, no ABI change (verified with old rump unit test
before and after new librump.so).

Per Taylor's request.
 1.3  25-Dec-2018  thorpej branches: 1.3.2;
Ho ho ho! We can suppress that warning with __diagused! Merry Christmas!
 1.2  25-Dec-2018  kre Fix !DIAGNOSTIC builds.
 1.1  24-Dec-2018  thorpej Add threadpool(9), an abstraction that provides shared pools of kernel
threads running at specific priorities, with support for unbound pools
and per-cpu pools.

Written by riastradh@, and based on the May 2014 draft, with a few changes
by me:
- Working on the assumption that relatively few priorities will actually
be used, reduce the memory footprint by using linked lists, rather than
2 large (and mostly empty) tables. The performance impact is essentially
nil, since these lists are consulted only when pools are created (and
destroyed, for DIAGNOSTIC checks), and the lists will have at most 225
entries.
- Make the threadpool job object, which the caller must allocate storage for,
really opaque.
- Use typedefs for the threadpool types, to reduce the verbosity of the
API somewhat.
- Fix a bunch of pool / worker thread / job object lifecycle bugs.

Also include an ATF unit test, written by me, that exercises the basics
of the API by loading a kernel module that exposes several sysctls that
allow the ATF test script to create and destroy threadpools, schedule a
basic job, and verify that it ran.

And thus NetBSD 8.99.29 has arrived.
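
For orientation, a hedged usage sketch of the API introduced here, written
against the names in the present-day threadpool(9) manual page (at this first
revision the job structure was still opaque and the types were typedef'd; see
1.4 above), showing how a caller obtains a pool, initializes a job with its own
lock and storage, and schedules it:

    #include <sys/param.h>
    #include <sys/mutex.h>
    #include <sys/threadpool.h>

    static kmutex_t example_lock;
    static struct threadpool *example_pool;
    static struct threadpool_job example_job;       /* storage owned by the caller */

    static void
    example_job_fn(struct threadpool_job *job)
    {
            /* ... the work runs in a shared pool thread at the pool's priority ... */
            mutex_enter(&example_lock);
            threadpool_job_done(job);
            mutex_exit(&example_lock);
    }

    static int
    example_start(void)
    {
            int error;

            mutex_init(&example_lock, MUTEX_DEFAULT, IPL_NONE);
            error = threadpool_get(&example_pool, PRI_NONE);    /* shared, unbound pool */
            if (error)
                    return error;
            threadpool_job_init(&example_job, example_job_fn, &example_lock,
                "example");
            threadpool_schedule_job(example_pool, &example_job);
            /*
             * Teardown (not shown) would threadpool_cancel_job() the job,
             * threadpool_job_destroy() it, and threadpool_put() the pool.
             */
            return 0;
    }
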
 1.3.2.3  18-Jan-2019  pgoyette Synch with HEAD
 1.3.2.2  26-Dec-2018  pgoyette Sync with HEAD, resolve a few conflicts
 1.3.2.1  25-Dec-2018  pgoyette file kern_threadpool.c was added on branch pgoyette-compat on 2018-12-26 14:02:04 +0000
 1.15.8.1  29-Feb-2020  ad Sync with head.
 1.15.6.1  25-Jan-2021  martin Pull up following revision(s) (requested by riastradh in ticket #1187):

sys/kern/kern_threadpool.c: revision 1.23

threadpool(9): Fix synchronization between cancel and dispatch.
- threadpool_cancel_job_async tried to prevent
threadpool_dispatcher_thread from taking the job by setting
job->job_thread = NULL and then removing the job from the queue.
- But threadpool_dispatcher_thread didn't notice that job->job_thread
was null until after it had also removed the job from the queue =>
double-remove, *boom*.

The solution is to teach threadpool_dispatcher_thread to wait until
it has acquired the job lock to test whether job->job_thread is still
valid before it decides to remove the job from the queue.

Fixes PR kern/55948.

XXX pullup-9
 1.15.4.3  08-Apr-2020  martin Merge changes from current as of 20200406
 1.15.4.2  10-Jun-2019  christos Sync with HEAD
 1.15.4.1  17-Jan-2019  christos file kern_threadpool.c was added on branch phil-wifi on 2019-06-10 22:09:03 +0000
 1.19.2.1  03-Apr-2021  thorpej Sync with HEAD.
