History log of /src/lib/libpthread/arch/i386
Revision  Date  Author  Comments
 1.3 08-Oct-1997  scottr This incarnation of the pthreads library is ancient and not useful, and
should have been mothballed some time ago...
 1.2 07-Jan-1994  mycroft Delete special rules for syscall.S; they are not needed, and one of them
is wrong anyway.
 1.1 14-Nov-1993  proven branches: 1.1.1;
Initial revision
 1.1.1.1 14-Nov-1993  proven Initial release of the POSIX 1003.4a Draft 7 thread implementation.
 1.10 16-May-2009  ad Remove unused code that's confusing when using cscope/opengrok.
 1.9 28-Apr-2008  martin branches: 1.9.8;
Remove clause 3 and 4 from TNF licenses
 1.8 10-Feb-2008  ad branches: 1.8.4;
- Remove libpthread's atomic ops.
- Remove the old spinlock-based mutex and rwlock implementations.
- Use the atomic ops from libc.
 1.7 13-Nov-2007  ad Mutexes:

- Play scrooge again and chop more cycles off acquire/release.
- Spin while the lock holder is running on another CPU (adaptive mutexes).
- Do non-atomic release.

Threadreg:

- Add the necessary hooks to use a thread register.
- Add the code for i386, using %gs.
- Leave i386 code disabled until xen and COMPAT_NETBSD32 have the changes.
 1.6 13-Nov-2007  ad For PR bin/37347:

- Override __libc_thr_init() instead of using our own constructor.
- Add pthread__getenv() and use instead of getenv(). This is used before
we are up and running and unfortunately getenv() takes locks.

Other changes:

- Cache the spinlock vectors in pthread__st. Internal spinlock operations
now take 1 function call instead of 3 (i386).
- Use pthread__self() internally, not pthread_self().
- Use __attribute__ ((visibility("hidden"))) in some places.
- Kill PTHREAD_MAIN_DEBUG.
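
The idea behind a lock-free environment lookup can be sketched as a plain linear scan over a NAME=value array. This is only an illustration under stated assumptions: scan_env() is a hypothetical helper, not the pthread__getenv() from this revision, and it takes the array as a parameter so that it never touches shared state or locks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: look up name in a NULL-terminated array of
 * "NAME=value" strings without taking any locks.  Returns a pointer
 * to the value, or NULL if the variable is not present.
 */
static const char *
scan_env(char *const *envp, const char *name)
{
	size_t len = strlen(name);

	for (; envp != NULL && *envp != NULL; envp++) {
		/* Match the full name followed immediately by '='. */
		if (strncmp(*envp, name, len) == 0 && (*envp)[len] == '=')
			return *envp + len + 1;
	}
	return NULL;
}
```

A caller early in process startup would typically pass the process environment array (for example the global environ) as envp.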
 1.5 08-Sep-2007  ad - Get rid of self->pt_mutexhint and use pthread__mutex_owned() instead.
- Update some comments and fix minor bugs. Minor cosmetic changes.
- Replace some spinlocks with mutexes and rwlocks.
- Change the process private semaphores to use mutexes and condition
variables instead of doing the synchronization directly. Spinlocks
are no longer used by the semaphore code.
 1.4 07-Sep-2007  ad - Don't take the mutex's spinlock (ptr_interlock) in pthread_cond_wait().
Instead, make the deferred wakeup list a per-thread array and pass down
the lwpid_t's that way.

- In pthread_cond_wait(), take the mutex before dealing with early wakeup.
In this way there should never be contention on the CV's spinlock if
the app follows POSIX rules (there should only be contention on the
user-provided mutex).

- Add a port of the kernel's rwlocks. The rwlock's spinlock is only taken if
there is contention. This is enabled where atomic ops are available. Right
now that is only i386 and amd64 because I don't have other hardware to
test with. It's trivial to add stubs for other architectures as long as
they have compare-and-swap. When we have proper atomic ops the old rwlock
code can be removed.

- Add a new mutex implementation that's similar to the kernel's mutexes, but
uses compare-and-swap to maintain the waiters list, so no spinlocks are
involved. Same caveats apply as for the rwlocks.
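
Maintaining a waiters list with compare-and-swap, as described above, might look roughly like the following. This is a minimal sketch using C11 atomics rather than the pthread__atomic_cas_ptr() stopgap; struct waiter, waiters_push() and waiters_take_all() are hypothetical names, not the ones used in libpthread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical waiter record; not the structure libpthread uses. */
struct waiter {
	struct waiter *next;
	int id;
};

/* Head of an intrusive LIFO list of waiters. */
static _Atomic(struct waiter *) waiters_head = NULL;

/* Push w onto the list without taking a spinlock. */
static void
waiters_push(struct waiter *w)
{
	struct waiter *old = atomic_load(&waiters_head);

	assert(w != NULL);
	do {
		w->next = old;
		/* On failure, old is reloaded with the current head. */
	} while (!atomic_compare_exchange_weak(&waiters_head, &old, w));
}

/* Detach the whole list in one atomic swap and return it. */
static struct waiter *
waiters_take_all(void)
{
	return atomic_exchange(&waiters_head, NULL);
}
```

Taking the whole list with a single swap sidesteps the ABA problem that a lock-free pop of individual entries would have to deal with.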
 1.3 07-Sep-2007  ad Add: pthread__atomic_cas_ptr, pthread__atomic_swap_ptr, pthread__membar_full
This is a stopgap until the thorpej-atomic branch is complete.
 1.2 18-Jan-2003  thorpej branches: 1.2.20; 1.2.24;
Merge the nathanw_sa branch.
 1.1 13-Jul-2001  nathanw branches: 1.1.2;
file _context_u.S was initially added on branch nathanw_sa.
 1.1.2.10 06-Sep-2002  nathanw In the xmm case, FPSAVE should be "fxsave", and FPLOAD should be "fxrstor",
not the other way around. Note to self: Lay off the crack.
 1.1.2.9 14-Aug-2002  nathanw Define, initialize, and use function pointers for _{set,get,swap}context_u
routines, with different versions for saving FP state with fnsave or fxsave,
depending on the machdep.osfxsr sysctl.
 1.1.2.8 24-Apr-2002  nathanw Save FP state in user context switches.

XXX1: doing something clever about lazy-FP-switching would be good.
XXX2: coping with SSE/SSE2 registers would also be good.
 1.1.2.7 01-Mar-2002  nathanw Restore FP state if the kernel handed us any.
[xmms no longer sounds like it's playing Play-Skool LPs]
 1.1.2.6 11-Feb-2002  nathanw Implement _UC_USER for i386: only save callee-save registers, and set a
flag saying so; on restore, if that flag is set, only restore the callee-save
registers.

Speeds up the user-only version of these routines by 35-40% on a
Pentium MMX 233.
 1.1.2.5 04-Sep-2001  nathanw Generate and use symbolic constants for register offsets in mcontext_t,
instead of hand-coding offset numbers.
 1.1.2.4 02-Aug-2001  nathanw Don't bother even storing 0s into the fields for the caller-save
registers (eax, ecx, edx) and unused fields (esp, trapno, err).

Still need to deal more sanely with segment registers.
 1.1.2.3 02-Aug-2001  nathanw Adjust SETC (used by _setcontext_u and _getcontext_u) to not set the
new stack pointer until all the possibly-interesting data below the
new SP has been preserved. Unfortunately, it's about 3% slower.
 1.1.2.2 31-Jul-2001  nathanw Whitespace cleanup.
 1.1.2.1 13-Jul-2001  nathanw Make userlevel-only *context functions part of libpthread rather than libc.
 1.2.24.3 23-Mar-2008  matt sync with HEAD
 1.2.24.2 09-Jan-2008  matt sync with HEAD
 1.2.24.1 06-Nov-2007  matt sync with HEAD
 1.2.20.1 10-Sep-2007  skrll Sync with HEAD.
 1.8.4.1 18-May-2008  yamt sync with head.
 1.9.8.2 28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.9.8.1 28-Apr-2008  martin file _context_u.S was added on branch christos-time_t on 2008-04-28 20:23:03 +0000
 1.9 16-May-2009  ad Remove unused code that's confusing when using cscope/opengrok.
 1.8 07-Jul-2008  gmcgarry branches: 1.8.6;
Selector registers are 16-bit and binutils 2.18 insists that only 16-bit
accesses are permitted on them. Therefore, change movl to movw. No change to
machine code generated.
 1.7 28-Apr-2008  martin branches: 1.7.2;
Remove clause 3 and 4 from TNF licenses
 1.6 20-Jan-2007  christos branches: 1.6.12;
fix warning about indirect call without *
 1.5 30-Nov-2004  nathanw Punt to setcontext() system call if the PSL_T bit (single-step trap)
is set, so that the single-step trap happens in the thread's context
and not in the middle of _setcontext_u.

XXX might be able to do something here with iret, too, but it needs
more testing.
 1.4 10-Nov-2004  kent save&restore %fs and %gs registers for USER_LDT applications.
PR#26900
 1.3 30-Oct-2003  yamt branches: 1.3.2;
use explicit "l" suffixes. (eg. lea -> leal)
 1.2 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.1 14-Aug-2002  nathanw branches: 1.1.2;
file _getsetc.S was initially added on branch nathanw_sa.
 1.1.2.4 22-Oct-2002  skrll RCSId police.
 1.1.2.3 04-Sep-2002  nathanw The _UC_USER_BIT case of SETC (currently) needs to load FP state.

Found by inspection.
 1.1.2.2 15-Aug-2002  nathanw Bit values are not the same as bit positions; fix the test for the
_UC_FPU flag.
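
The difference between a bit position and a bit value is easy to show in C. The constants below are hypothetical and are not the real _UC_* values; the original fix was in assembly, and this sketch does not claim to know in which direction that mix-up went.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag layout for illustration; not the real _UC_* values. */
#define EX_FPU_BIT	3			/* bit position */
#define EX_FPU		(1u << EX_FPU_BIT)	/* bit value, 0x08 */

/* Wrong: masks with the position (3 == 0b011), so it tests bits 0 and 1. */
static bool
has_fpu_wrong(unsigned flags)
{
	return (flags & EX_FPU_BIT) != 0;
}

/* Right: masks with the value (0x08), so it tests bit 3 only. */
static bool
has_fpu(unsigned flags)
{
	return (flags & EX_FPU) != 0;
}
```

The same distinction applies in assembly: an instruction that takes a mask needs the value, while one that takes a bit index needs the position.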
 1.1.2.1 14-Aug-2002  nathanw Define, initialize, and use function pointers for _{set,get,swap}context_u
routines, with different versions for saving FP state with fnsave or fxsave,
depending on the machdep.osfxsr sysctl.
 1.3.2.1 12-Nov-2004  jmc Pullup rev 1.4 (requested by kent in ticket #969)

save&restore %fs and %gs registers for USER_LDT applications. PR#26900
 1.6.12.1 18-May-2008  yamt sync with head.
 1.7.2.1 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.8.6.2 07-Jul-2008  gmcgarry Selector registers are 16-bit and binutils 2.18 insists that only 16-bit
accesses are permitted on them. Therefore, change movl to movw. No change to
machine code generated.
 1.8.6.1 07-Jul-2008  gmcgarry file _getsetc.S was added on branch christos-time_t on 2008-07-07 13:07:56 +0000
 1.8 16-May-2009  ad Remove unused code that's confusing when using cscope/opengrok.
 1.7 28-Apr-2008  martin branches: 1.7.8;
Remove clause 3 and 4 from TNF licenses
 1.6 02-Mar-2007  ad branches: 1.6.12;
Remove the PTHREAD_SA option. If M:N threads is reimplemented it's
better off done with a separate library.
 1.5 07-Sep-2003  cl Remove possible race condition in upcall recycling.
 1.4 17-Jul-2003  nathanw Adapt to structure name changes.
 1.3 26-Jun-2003  nathanw Remove PT_SLEEPUC and add PT_TRAPUC.
 1.2 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.1 05-Mar-2001  nathanw branches: 1.1.2;
file genassym.cf was initially added on branch nathanw_sa.
 1.1.2.9 14-Aug-2002  nathanw Define, initialize, and use function pointers for _{set,get,swap}context_u
routines, with different versions for saving FP state with fnsave or fxsave,
depending on the machdep.osfxsr sysctl.
 1.1.2.8 25-Mar-2002  nathanw Define a couple of constants for FP register use.
 1.1.2.7 11-Feb-2002  nathanw Implement _UC_USER for i386: only save callee-save registers, and set a
flag saying so; on restore, if that flag is set, only restore the callee-save
registers.

Speeds up the user-only version of these routines by 35-40% on a
Pentium MMX 233.
 1.1.2.6 04-Sep-2001  nathanw Generate and use symbolic constants for register offsets in mcontext_t,
instead of hand-coding offset numbers.
 1.1.2.5 04-Sep-2001  nathanw Define STACKSPACE in pthread_md.h, not directly in pthread_switch.S
 1.1.2.4 01-Aug-2001  nathanw Create an offset for pt_sleepuc.
 1.1.2.3 31-Jul-2001  nathanw Create an asm symbol for the offset of EIP in the ucontext.
 1.1.2.2 13-Jul-2001  nathanw Note copyright.
Standardize RCS IDs.
 1.1.2.1 05-Mar-2001  nathanw The beginnings of a scheduler activations-based pthread library.
 1.6.12.1 18-May-2008  yamt sync with head.
 1.7.8.2 28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.7.8.1 28-Apr-2008  martin file genassym.cf was added on branch christos-time_t on 2008-04-28 20:23:03 +0000
 1.4 08-Oct-1997  scottr This incarnation of the pthreads library is ancient and not useful, and
should have been mothballed some time ago...
 1.3 07-Feb-1994  proven More wrapper functions, and some hacks for machine independent sleep
mechanisms.
 1.2 20-Dec-1993  proven Copyrights added to each file.
 1.1 14-Nov-1993  proven branches: 1.1.1;
Initial revision
 1.1.1.1 14-Nov-1993  proven Initial release of the POSIX 1003.4a Draft 7 thread implementation.
 1.4 08-Oct-1997  scottr This incarnation of the pthreads library is ancient and not useful, and
should have been mothballed some time ago...
 1.3 20-Dec-1993  proven Copyrights added to each file.
 1.2 15-Nov-1993  proven OK one more try at getting it right ...
 1.1 14-Nov-1993  proven branches: 1.1.1;
Initial revision
 1.1.1.1 14-Nov-1993  proven Initial release of the POSIX 1003.4a Draft 7 thread implementation.
 1.7 16-May-2009  ad Remove unused code that's confusing when using cscope/opengrok.
 1.6 29-Mar-2009  ad - Make the threadreg code use _lwp_setprivate() instead of MD hooks.

XXX This must not be enabled by default because the LWP private mechanism
is reserved for TLS. It is provided only as a test/demo.

XXX Since ucontext_t does not contain the thread private variable, for a
short time after threads are created their thread specific data is unset.
If a signal arrives during that time we are screwed.

- No longer need pthread__osrev.

- Rearrange _lwp_ctl() calls slightly.
 1.5 28-Apr-2008  martin branches: 1.5.8; 1.5.10;
Remove clause 3 and 4 from TNF licenses
 1.4 13-Nov-2007  ad branches: 1.4.6;
Mutexes:

- Play scrooge again and chop more cycles off acquire/release.
- Spin while the lock holder is running on another CPU (adaptive mutexes).
- Do non-atomic release.

Threadreg:

- Add the necessary hooks to use a thread register.
- Add the code for i386, using %gs.
- Leave i386 code disabled until xen and COMPAT_NETBSD32 have the changes.
 1.3 08-Mar-2003  lukem branches: 1.3.24;
add __RCSID()
 1.2 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.1 15-Aug-2002  nathanw branches: 1.1.2;
file pthread_md.c was initially added on branch nathanw_sa.
 1.1.2.1 15-Aug-2002  nathanw Actually add this file.
 1.3.24.1 09-Jan-2008  matt sync with HEAD
 1.4.6.1 18-May-2008  yamt sync with head.
 1.5.10.1 13-May-2009  jym Sync with HEAD.

Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
 1.5.8.2 28-Apr-2008  martin Remove clause 3 and 4 from TNF licenses
 1.5.8.1 28-Apr-2008  martin file pthread_md.c was added on branch christos-time_t on 2008-04-28 20:23:03 +0000
 1.21 25-May-2023  riastradh libpthread: New pthread__smt_wait to put CPU in low power for spin.

This is now distinct from pthread__smt_pause, which is for spin lock
backoff with no paired wakeup.

On Arm, there is a single-bit event register per CPU, and there are two
instructions to manage it:

- wfe, wait for event -- if event register is clear, enter low power
mode and wait until event register is set; then exit low power mode
and clear event register

- sev, signal event -- sets event register on all CPUs (other
circumstances like interrupts also set the event register and cause
wfe to wake)

These can be used to reduce the power consumption of spinning for a
lock, but only if they are actually paired -- if there's no sev, wfe
might hang indefinitely. Currently only pthread_spin(3) actually
pairs them; the other lock primitives (internal lock, mutex, rwlock)
do not -- they have spin lock backoff loops, but no corresponding
wakeup to cancel a wfe.

It may be worthwhile to teach the other lock primitives to pair
wfe/sev, but that requires some performance measurement to verify
it's actually worthwhile. So for now, we just make sure not to use
wfe when there's no sev, and keep everything else the same -- this
should fix severe performance degradation in libpthread on Arm
without hurting anything else.

No change in the generated code on amd64 and i386. No change in the
generated code for pthread_spin.c on arm and aarch64 -- changes only
the generated code for pthread_lock.c, pthread_mutex.c, and
pthread_rwlock.c, as intended.

PR port-arm/57437

XXX pullup-10
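
The pairing rule described above can be sketched with a tiny spin lock. Everything here is an assumption made for illustration: the ex_* names are hypothetical and are not the pthread__smt_* macros, the wait/wake hooks only map to wfe/sev when built for aarch64, and on other targets they fall back to a pause hint or to nothing.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical hooks; only the aarch64 branch uses wfe/sev. */
#if defined(__aarch64__)
#define ex_spin_wait()	__asm__ volatile("wfe" ::: "memory")
#define ex_spin_wake()	__asm__ volatile("dsb ish\n\tsev" ::: "memory")
#define ex_spin_pause()	__asm__ volatile("yield" ::: "memory")
#elif defined(__i386__) || defined(__x86_64__)
#define ex_spin_wait()	__asm__ volatile("pause" ::: "memory")
#define ex_spin_wake()	((void)0)
#define ex_spin_pause()	__asm__ volatile("pause" ::: "memory")
#else
#define ex_spin_wait()	((void)0)
#define ex_spin_wake()	((void)0)
#define ex_spin_pause()	((void)0)
#endif

typedef struct { atomic_flag held; } ex_spin_t;

/* Returns nonzero if the lock was acquired. */
static int
ex_spin_trylock(ex_spin_t *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held,
	    memory_order_acquire);
}

/* Waits with ex_spin_wait(); safe only because unlock always wakes. */
static void
ex_spin_lock(ex_spin_t *l)
{
	while (!ex_spin_trylock(l))
		ex_spin_wait();
}

/* Releases the lock and issues the paired wakeup. */
static void
ex_spin_unlock(ex_spin_t *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
	ex_spin_wake();
}
```

A lock whose release path has no matching wakeup would use ex_spin_pause() in its backoff loop instead, since nothing would be guaranteed to cancel the wait.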
 1.20 02-Mar-2012  joerg branches: 1.20.24; 1.20.34; 1.20.42;
Avoid getcontext() as it triggers clobbering warnings. Use inline
assembler to get the fields directly. Saves a system call as side
effect.
 1.19 24-Feb-2011  joerg branches: 1.19.4;
Allow storing and receiving the LWP private pointer via ucontext_t
on all platforms except VAX and IA64. Add fast access via register for
AMD64, i386 and SH3 ports. Use this fast access in libpthread to replace
the stack based pthread_self(). Implement skeleton support for Alpha,
HPPA, PowerPC, SPARC and SPARC64, but leave it disabled.

Ports that support this feature provide __HAVE____LWP_GETPRIVATE_FAST in
machine/types.h and a corresponding __lwp_getprivate_fast in
machine/mcontext.h.

This material is based upon work partially supported by
The NetBSD Foundation under a contract with Joerg Sonnenberger.
 1.18 25-Jan-2011  christos make pthread__sp unsigned long.
 1.17 16-May-2009  ad branches: 1.17.2;
Remove unused code that's confusing when using cscope/opengrok.
 1.16 29-Mar-2009  ad - Make the threadreg code use _lwp_setprivate() instead of MD hooks.

XXX This must not be enabled by default because the LWP private mechanism
is reserved for TLS. It is provided only as a test/demo.

XXX Since ucontext_t does not contain the thread private variable, for a
short time after threads are created their thread specific data is unset.
If a signal arrives during that time we are screwed.

- No longer need pthread__osrev.

- Rearrange _lwp_ctl() calls slightly.
 1.15 23-Jun-2008  ad branches: 1.15.6; 1.15.8;
pthread__threadreg_get: mark it const.
 1.14 28-Apr-2008  martin branches: 1.14.2;
Remove clause 3 and 4 from TNF licenses
 1.13 22-Mar-2008  ad branches: 1.13.2;
Cheat and add inlines for _atomic_cas_ptr() to work around gcc emitting
unneeded PIC stuff in mutex_lock() and mutex_unlock(), when a thread
register is used.
 1.12 10-Feb-2008  ad - Remove libpthread's atomic ops.
- Remove the old spinlock-based mutex and rwlock implementations.
- Use the atomic ops from libc.
 1.11 13-Nov-2007  ad Mutexes:

- Play scrooge again and chop more cycles off acquire/release.
- Spin while the lock holder is running on another CPU (adaptive mutexes).
- Do non-atomic release.

Threadreg:

- Add the necessary hooks to use a thread register.
- Add the code for i386, using %gs.
- Leave i386 code disabled until xen and COMPAT_NETBSD32 have the changes.
 1.10 24-Sep-2007  skrll Resurrect the function pointers for lock operations and allow each
architecture to provide asm versions of the RAS operations.

We do this because relying on the compiler to get the RAS right is not
sensible. (It gets alpha wrong and hppa is suboptimal)

Provide asm RAS ops for hppa.

(A slightly different version) reviewed by Andrew Doran.
 1.9 08-Sep-2007  ad - Get rid of self->pt_mutexhint and use pthread__mutex_owned() instead.
- Update some comments and fix minor bugs. Minor cosmetic changes.
- Replace some spinlocks with mutexes and rwlocks.
- Change the process private semaphores to use mutexes and condition
variables instead of doing the synchronization directly. Spinlocks
are no longer used by the semaphore code.
 1.8 07-Sep-2007  ad Add: pthread__atomic_cas_ptr, pthread__atomic_swap_ptr, pthread__membar_full
This is a stopgap until the thorpej-atomic branch is complete.
 1.7 29-Mar-2006  cube branches: 1.7.8; 1.7.12;
Instead of using hard-coded values for various registers, get them from the
current context. Valid values can change depending on how the kernel is
setup. i386 and amd64 happen to be setup differently.
 1.6 24-Dec-2005  perry Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.
 1.5 11-Feb-2004  nathanw branches: 1.5.4; 1.5.6;
Add ucontext conversion macros for an "extra" register set.
 1.4 18-Jan-2003  christos delint
 1.3 18-Jan-2003  christos de-lint
 1.2 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.1 05-Mar-2001  nathanw branches: 1.1.2;
file pthread_md.h was initially added on branch nathanw_sa.
 1.1.2.11 17-Jan-2003  thorpej Include <sys/ucontext.h> to get ucontext_t for prototypes.
 1.1.2.10 16-Jan-2003  thorpej * Move the pthread_sigmask() prototype to <signal.h>.
* Don't include <signal.h> in <pthread.h>.
* Add code to the signal trampoline to convert from the ucontext
to a sigcontext, and back again (XXX though, only callee-save
regs for _UC_USER contexts).

This is necessary in order to support e.g. GCC's libjava, which depends
on the traditional Unix semantics of changes made to the sigcontext
being visible when the handler returns.
 1.1.2.9 06-Dec-2002  nathanw Clear _UC_USER bit when converting from sigcontext to ucontext.
 1.1.2.8 22-Oct-2002  nathanw Define _INITCONTEXT_U_MD() to set segment registers and flag register
to sensible values.
 1.1.2.7 21-Oct-2002  nathanw Add some macros to convert between ucontext_t and struct sigcontext.
 1.1.2.6 14-Aug-2002  nathanw Define, initialize, and use function pointers for _{set,get,swap}context_u
routines, with different versions for saving FP state with fnsave or fxsave,
depending on the machdep.osfxsr sysctl.
 1.1.2.5 24-Apr-2002  nathanw Add macros to convert between struct ucontext and struct reg/fpreg.
 1.1.2.4 14-Nov-2001  briggs Add pthread__uc_pc() to pthread_md.h instead of making assumptions about
the contents of the m.d. mcontext.
 1.1.2.3 04-Sep-2001  nathanw Define STACKSPACE in pthread_md.h, not directly in pthread_switch.S
 1.1.2.2 13-Jul-2001  nathanw Note copyright.
Standardize RCS IDs.
 1.1.2.1 05-Mar-2001  nathanw The beginnings of a scheduler activations-based pthread library.
 1.5.6.1 19-Apr-2006  tron Pull up following revision(s) (requested by cube in ticket #1265):
lib/libpthread/arch/i386/pthread_md.h: revision 1.7
Instead of using hard-coded values for various registers, get them from the
current context. Valid values can change depending on how the kernel is
setup. i386 and amd64 happen to be setup differently.
 1.5.4.1 21-Apr-2006  tron Pull up following revision(s) (requested by cube in ticket #10448):
lib/libpthread/arch/i386/pthread_md.h: revision 1.7
Instead of using hard-coded values for various registers, get them from the
current context. Valid values can change depending on how the kernel is
setup. i386 and amd64 happen to be setup differently.
 1.7.12.3 23-Mar-2008  matt sync with HEAD
 1.7.12.2 09-Jan-2008  matt sync with HEAD
 1.7.12.1 06-Nov-2007  matt sync with HEAD
 1.7.8.1 10-Sep-2007  skrll Sync with HEAD.
 1.13.2.1 18-May-2008  yamt sync with head.
 1.14.2.1 18-Sep-2008  wrstuden Sync with wrstuden-revivesa-base-2.
 1.15.8.1 13-May-2009  jym Sync with HEAD.

Third (and last) commit. See http://mail-index.netbsd.org/source-changes/2009/05/13/msg221222.html
 1.15.6.2 23-Jun-2008  ad pthread__threadreg_get: mark it const.
 1.15.6.1 23-Jun-2008  ad file pthread_md.h was added on branch christos-time_t on 2008-06-23 10:39:39 +0000
 1.17.2.2 05-Mar-2011  bouyer Sync with HEAD
 1.17.2.1 08-Feb-2011  bouyer Sync with HEAD
 1.19.4.1 17-Apr-2012  yamt sync with head
 1.20.42.1 01-Aug-2023  martin Pull up following revision(s) (requested by riastradh in ticket #296):

lib/libpthread/arch/x86_64/pthread_md.h: revision 1.13
lib/libpthread/pthread_int.h: revision 1.110
lib/libpthread/pthread_int.h: revision 1.111
lib/libpthread/arch/i386/pthread_md.h: revision 1.21
lib/libpthread/arch/arm/pthread_md.h: revision 1.12
lib/libpthread/arch/arm/pthread_md.h: revision 1.13
lib/libpthread/pthread_spin.c: revision 1.11
lib/libpthread/arch/aarch64/pthread_md.h: revision 1.2

libpthread: Use __nothing, not /* nothing */, for empty macros.

No functional change intended -- just safer to do it this way in case
the macros are used in if branches or comma expressions.

PR port-arm/57437 (pthread__smt_pause/wake issue)

libpthread: New pthread__smt_wait to put CPU in low power for spin.

This is now distinct from pthread__smt_pause, which is for spin lock
backoff with no paired wakeup.

On Arm, there is a single-bit event register per CPU, and there are two
instructions to manage it:
- wfe, wait for event -- if event register is clear, enter low power
mode and wait until event register is set; then exit low power mode
and clear event register
- sev, signal event -- sets event register on all CPUs (other
circumstances like interrupts also set the event register and cause
wfe to wake)

These can be used to reduce the power consumption of spinning for a
lock, but only if they are actually paired -- if there's no sev, wfe
might hang indefinitely. Currently only pthread_spin(3) actually
pairs them; the other lock primitives (internal lock, mutex, rwlock)
do not -- they have spin lock backoff loops, but no corresponding
wakeup to cancel a wfe.

It may be worthwhile to teach the other lock primitives to pair
wfe/sev, but that requires some performance measurement to verify
it's actually worthwhile. So for now, we just make sure not to use
wfe when there's no sev, and keep everything else the same -- this
should fix severe performance degradation in libpthread on Arm
without hurting anything else.

No change in the generated code on amd64 and i386. No change in the
generated code for pthread_spin.c on arm and aarch64 -- changes only
the generated code for pthread_lock.c, pthread_mutex.c, and
pthread_rwlock.c, as intended.

PR port-arm/57437
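
Why an empty macro is safer as a void expression than as a comment can be seen in a comma expression. This is a small sketch with hypothetical macro names; HOOK_VOID() follows the same idea as __nothing but is not its definition.

```c
#include <assert.h>

/*
 * Hypothetical stand-ins: one macro expands to no tokens at all, the
 * other to a void expression.
 */
#define HOOK_EMPTY()	/* nothing */
#define HOOK_VOID()	((void)0)

static int
count_with_hook(int n)
{
	int i, total = 0;

	for (i = 0; i < n; i++) {
		/*
		 * A comma expression needs an operand on both sides:
		 * (HOOK_EMPTY(), total += i) would not compile.
		 */
		(HOOK_VOID(), total += i);
	}
	return total;
}
```

The void-expression form also keeps a statement such as `if (cond) HOOK_VOID(); else ...` well formed without relying on the stray semicolon.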
 1.20.34.1 04-Aug-2023  martin Pull up following revision(s) (requested by riastradh in ticket #1700):

lib/libpthread/arch/x86_64/pthread_md.h: revision 1.13
lib/libpthread/pthread_int.h: revision 1.110
lib/libpthread/pthread_int.h: revision 1.111
lib/libpthread/arch/i386/pthread_md.h: revision 1.21
lib/libpthread/arch/arm/pthread_md.h: revision 1.12
lib/libpthread/arch/arm/pthread_md.h: revision 1.13
lib/libpthread/pthread_spin.c: revision 1.11
lib/libpthread/arch/aarch64/pthread_md.h: revision 1.2

libpthread: Use __nothing, not /* nothing */, for empty macros.

No functional change intended -- just safer to do it this way in case
the macros are used in if branches or comma expressions.
PR port-arm/57437 (pthread__smt_pause/wake issue)

libpthread: New pthread__smt_wait to put CPU in low power for spin.

This is now distinct from pthread__smt_pause, which is for spin lock
backoff with no paired wakeup.

On Arm, there is a single-bit event register per CPU, and there are two
instructions to manage it:
- wfe, wait for event -- if event register is clear, enter low power
mode and wait until event register is set; then exit low power mode
and clear event register
- sev, signal event -- sets event register on all CPUs (other
circumstances like interrupts also set the event register and cause
wfe to wake)

These can be used to reduce the power consumption of spinning for a
lock, but only if they are actually paired -- if there's no sev, wfe
might hang indefinitely. Currently only pthread_spin(3) actually
pairs them; the other lock primitives (internal lock, mutex, rwlock)
do not -- they have spin lock backoff loops, but no corresponding
wakeup to cancel a wfe.

It may be worthwhile to teach the other lock primitives to pair
wfe/sev, but that requires some performance measurement to verify
it's actually worthwhile. So for now, we just make sure not to use
wfe when there's no sev, and keep everything else the same -- this
should fix severe performance degradation in libpthread on Arm
without hurting anything else.

No change in the generated code on amd64 and i386. No change in the
generated code for pthread_spin.c on arm and aarch64 -- changes only
the generated code for pthread_lock.c, pthread_mutex.c, and
pthread_rwlock.c, as intended.
PR port-arm/57437
 1.20.24.1 04-Aug-2023  martin Pull up following revision(s) (requested by riastradh in ticket #1878):

lib/libpthread/arch/x86_64/pthread_md.h: revision 1.13
lib/libpthread/pthread_int.h: revision 1.110
lib/libpthread/pthread_int.h: revision 1.111
lib/libpthread/arch/i386/pthread_md.h: revision 1.21
lib/libpthread/arch/arm/pthread_md.h: revision 1.12
lib/libpthread/arch/arm/pthread_md.h: revision 1.13
lib/libpthread/pthread_spin.c: revision 1.11
lib/libpthread/arch/aarch64/pthread_md.h: revision 1.2

libpthread: Use __nothing, not /* nothing */, for empty macros.

No functional change intended -- just safer to do it this way in case
the macros are used in if branches or comma expressions.
PR port-arm/57437 (pthread__smt_pause/wake issue)

libpthread: New pthread__smt_wait to put CPU in low power for spin.

This is now distinct from pthread__smt_pause, which is for spin lock
backoff with no paired wakeup.

On Arm, there is a single-bit event register per CPU, and there are two
instructions to manage it:
- wfe, wait for event -- if event register is clear, enter low power
mode and wait until event register is set; then exit low power mode
and clear event register
- sev, signal event -- sets event register on all CPUs (other
circumstances like interrupts also set the event register and cause
wfe to wake)

These can be used to reduce the power consumption of spinning for a
lock, but only if they are actually paired -- if there's no sev, wfe
might hang indefinitely. Currently only pthread_spin(3) actually
pairs them; the other lock primitives (internal lock, mutex, rwlock)
do not -- they have spin lock backoff loops, but no corresponding
wakeup to cancel a wfe.

It may be worthwhile to teach the other lock primitives to pair
wfe/sev, but that requires some performance measurement to verify
it's actually worthwhile. So for now, we just make sure not to use
wfe when there's no sev, and keep everything else the same -- this
should fix severe performance degradation in libpthread on Arm
without hurting anything else.

No change in the generated code on amd64 and i386. No change in the
generated code for pthread_spin.c on arm and aarch64 -- changes only
the generated code for pthread_lock.c, pthread_mutex.c, and
pthread_rwlock.c, as intended.
PR port-arm/57437
 1.10 02-Mar-2007  ad Remove the PTHREAD_SA option. If M:N threads is reimplemented it's
better off done with a separate library.
 1.9 04-Jan-2006  skrll A couple of fixes to make libpthread really shared, i.e. not have text
relocations:

- Don't declare pthread__switch_away global
- Do the PIC dance for pthread__switch_return_point and
pthread__locked_switch. Ideally these (and other) symbols would
be hidden.

Thanks to uwe@, dyoung@ and elad@ for help.

XXX sh3 is still to be done.
XXX vax does strange things.
 1.8 23-Apr-2004  simonb branches: 1.8.2;
s/the the/the/ (only in sources that aren't regularly imported from
elsewhere).
 1.7 30-Oct-2003  yamt use explicit "l" suffixes. (eg. lea -> leal)
 1.6 07-Sep-2003  cl Remove possible race condition in upcall recycling.
 1.5 26-Jun-2003  nathanw Adapt to pt_trapuc: change STACK_SWITCH to check for a value in pt_trapuc
and use it preferentially to a value in pt_uc, clearing it once on the new
stack. Move stores into pt_uc back to before the stack switch; storing
after the stack switch opened a one-instruction race condition where an upcall
that had just started a chain could be preempted again, and would bomb when
restarted due to its pt_uc not yet having been updated. Now that pt_trapuc
is what the upcall code writes to, it is safe to store to pt_uc before
switching stacks.

Remove obsolete pt_sleepuc code.
 1.4 12-Jun-2003  nathanw Two fixes:
* In switch-away cases, write PT_SWITCHTO last (after PT_SWITCHTOUC), so
that pthread__resolve_locks() doesn't see an empty SWITCHTOUC value. This
also permits pthread__resolve_locks() to use the presence of PT_SWITCHTO
as a sign that the thread has done all of its necessary chain work.

* Make the return-point of pthread__switch global and visible, so that its
address can be compared to the PC of a thread, again as a sign that its
chain-work is done.

(other architectures in progress, after they get the *previous* asm fix...)
 1.3 10-Feb-2003  fvdl Continue at the plain switch return point in pthread__switch, not the
locked one, in the !PIC case. From Tor Egge via Havard Eidnes.
 1.2 18-Jan-2003  thorpej Merge the nathanw_sa branch.
 1.1 05-Mar-2001  nathanw branches: 1.1.2;
file pthread_switch.S was initially added on branch nathanw_sa.
 1.1.2.18 02-Jan-2003  nathanw Rewrite pthread__switch() and adjust pthread__locked_switch() to
avoid storing the new saved-context pointer while still using the old
stack. This avoids a race condition with pthread__find_interrupted()
where a thread could lose its old state if it was interrupted in a
certain window in pthread__switch() or pthread__locked_switch() (the
latter was never actually observed, but appeared possible).
 1.1.2.17 18-Dec-2002  nathanw Comment whitespace.
 1.1.2.16 18-Dec-2002  nathanw Tweak ucontext-alignment code so that it consumes at most 12 bytes of
stack space instead of 16, leaving 20 bytes available.
 1.1.2.15 16-Sep-2002  skrll Fix typo in comment.
 1.1.2.14 06-Sep-2002  nathanw Ensure that stack regions allocated for storing ucontext_t's are
aligned to 16-byte boundaries; this ensures that the XMM save areas
are also 16-byte aligned, which is a requirement for use of the fxsave
and fxrstor instructions.
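
Rounding a stack address down to a 16-byte boundary, as fxsave and fxrstor require for their save area, can be sketched in C as follows. The helper names are hypothetical, and the original change was made in assembly in pthread_switch.S, so this only illustrates the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FX_ALIGN	16u	/* alignment required by fxsave/fxrstor */

/* Round addr down to the nearest FX_ALIGN boundary. */
static uintptr_t
align_down(uintptr_t addr)
{
	return addr & ~(uintptr_t)(FX_ALIGN - 1);
}

/*
 * Carve size bytes below sp on a downward-growing stack and return
 * the aligned start of the region.
 */
static uintptr_t
carve_aligned(uintptr_t sp, size_t size)
{
	assert(sp >= size);
	return align_down(sp - size);
}
```

Aligning after subtracting the size can waste up to 15 bytes, which is the kind of overhead the later 1.1.2.16 tweak trims.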
 1.1.2.13 14-Aug-2002  nathanw Define, initialize, and use function pointers for _{set,get,swap}context_u
routines, with different versions for saving FP state with fnsave or fxsave,
depending on the machdep.osfxsr sysctl.
 1.1.2.12 04-Sep-2001  nathanw Define STACKSPACE in pthread_md.h, not directly in pthread_switch.S
 1.1.2.11 01-Aug-2001  nathanw Rework the whole "ucontext vs. temporary crud on the bottom of the
stack" thing again, placing the burden of weirdness on STACK_SWITCH.

Add a lot of comments to explain why this is so tricky, and document
which functions are external.

In the old_preempt case of pthread__locked_switch, save the thread's
real context so we don't try to switch back into the middle of this
code more than once.
 1.1.2.10 31-Jul-2001  nathanw Increase paranoia about keeping the ucontext on the bottom of the
stack: Find the location to put it, but don't move $esp there, so that
subsequent caller-save arguments and return addresses are above it
rather than below it.
 1.1.2.9 31-Jul-2001  nathanw Actually, we *do* need pthread__switch in assembler, in order that
the ucontext_t be the last thing on the stack. Otherwise, the switch
code will overwrite stack data, since it sets the stack to pt_uc.

Use the UT_EIP symbolic constant.

When performing a new_preempt, store into pt_switchto and
pt_switchtouc of the *new* thread, so that resolve_locks stands a
chance of finding the switchto victim.
 1.1.2.8 26-Jul-2001  nathanw pthread__switch() no longer needs to be implemented in assembler,
and doesn't have a lock parameter.
 1.1.2.7 24-Jul-2001  nathanw Substantial rework:
- Make pthread__switch() just a plain switch, and move the
switch-to-next code to pthread__switch_away. Saves a stack switch,
test, and branch in the pthread_switch case, a _getcontext_u() and
associated stack frobbing in the switch_away case, and several
points of sanity.
- Remove a poorly-thought-out ucontext-dodge in STACK_SWITCH.
- Fill in PT_SWITCHTO and PT_SWITCHTOUC in the new_preempt path as
well as the old_preempt path in pthread__locked_switch() and
pthread__upcall_switch().
 1.1.2.6 23-Jul-2001  nathanw Pushing parameters on the stack is not the same as preserving their
values.
 1.1.2.5 20-Jul-2001  nathanw The %esi and %edi registers are callee-save, not caller-save.
 1.1.2.4 16-Jul-2001  nathanw Make pthread__switch() actually decrement fake spinlock counts when
told to do so by pthread__upcall_switch() or pthread__lock_switch().
 1.1.2.3 13-Jul-2001  nathanw Note copyright.
Standardize RCS IDs.
 1.1.2.2 29-Mar-2001  nathanw Don't frob pt_spinlocks in pthread__switch; we don't use it for
switching with locks held.
 1.1.2.1 05-Mar-2001  nathanw The beginnings of a scheduler activations-based pthread library.
 1.8.2.1 08-Jan-2006  riz Pull up following revision(s) (requested by skrll in ticket #1093):
lib/libpthread/arch/sparc/pthread_switch.S: revision 1.8
lib/libpthread/arch/x86_64/pthread_switch.S: revision 1.11
lib/libpthread/arch/sparc64/pthread_switch.S: revision 1.9
lib/libpthread/arch/i386/pthread_switch.S: revision 1.9
A couple of fixes to make libpthread really shared, i.e. not have text
relocations:
- Don't declare pthread__switch_away global
- Do the PIC dance for pthread__switch_return_point and
pthread__locked_switch. Ideally these (and other) symbols would
be hidden.
Thanks to uwe@, dyoung@ and elad@ for help.
XXX sh3 is still to be done.
XXX vax does strange things.
 1.4 08-Oct-1997  scottr This incarnation of the pthreads library is ancient and not useful, and
should have been mothballed some time ago...
 1.3 07-Feb-1994  proven More wrapper functions, and some hacks for machine independent sleep
mechanisms.
 1.2 27-Jan-1994  mycroft Stylistic change.
 1.1 14-Nov-1993  proven branches: 1.1.1;
Initial revision
 1.1.1.1 14-Nov-1993  proven Initial release of the POSIX 1003.4a Draft 7 thread implementation.
