TODO revision 1.5
      1  1.5        ad $NetBSD: TODO,v 1.5 2006/12/25 11:36:36 ad Exp $
      2  1.5        ad 
      3  1.5        ad Bugs to fix, mostly with SA:
      4  1.2   thorpej 
      5  1.3  jdolecek - some blocking routines (like sem_wait()) don't work if SA's aren't
      6  1.3  jdolecek   running yet, because the alarm system isn't up and running or there is no
      7  1.3  jdolecek   thread context to switch to. It would be weird to use them that
      8  1.3  jdolecek   way, but it's perfectly legal.
      9  1.2   thorpej - There is a race between pthread_cancel() and
     10  1.2   thorpej   pthread_cond_broadcast() or pthread_exit() about removing an item
     11  1.2   thorpej   from the sleep queue. The locking protocols there need a little
     12  1.2   thorpej   adjustment.
     13  1.4  christos - pthread_sig.c: pthread__kill_self() passes a bogus ucontext to the handler.
     14  1.4  christos   This is probably not very important.
     15  1.2   thorpej - pthread_sig.c: Come up with a signal trampoline naming convention like
     16  1.2   thorpej   libc's, so that GDB will have an easier time with things.
     17  1.2   thorpej - Consider moving pthread__signal_tramp() to its own file, and building
     18  1.2   thorpej   it with -fasync-unwind-tables, so that DWARF2 EH unwinding works through
     19  1.2   thorpej   it.  (This is required for e.g. GCC's libjava.)
     20  1.2   thorpej - Add locking to ld.elf_so so that multiple threads doing lazy binding
     21  1.2   thorpej   don't trash things.
     22  1.2   thorpej - Verify the cancel stub symbol trickery.
     23  1.2   thorpej 
     24  1.2   thorpej 
     25  1.2   thorpej Interfaces/features to implement:
     26  1.2   thorpej - pthread_atfork()
     27  1.2   thorpej - priority scheduling
     28  1.2   thorpej - libc integration: 
     29  1.2   thorpej    - foo_r interfaces
     30  1.2   thorpej - system integration
     31  1.2   thorpej    - some macros and prototypes belong in headers other than pthread.h
     32  1.2   thorpej 
     33  1.2   thorpej 
     34  1.2   thorpej Features that need more/better regression tests:
     35  1.2   thorpej  - pthread_cond_broadcast()
     36  1.2   thorpej  - pthread_once()
     37  1.2   thorpej  - pthread_get/setspecific()
     38  1.2   thorpej  - signals
     39  1.2   thorpej 
     40  1.2   thorpej 
     41  1.2   thorpej Things that need fixing:
     42  1.2   thorpej - Recycle dead threads for new threads.
     43  1.2   thorpej 
     44  1.2   thorpej Ideas to play with:
     45  1.2   thorpej - Explore the trapcontext vs. usercontext distinction in ucontext_t.
     46  1.2   thorpej - Get rid of thread structures when too many accumulate (is this
     47  1.2   thorpej   actually a good idea?)
     48  1.2   thorpej - Adaptive spin/sleep locks for mutexes.
     49  1.2   thorpej - Currently, each thread uses two real pages of memory: one at the top
     50  1.2   thorpej   of the stack for actual stack data, and one at the bottom for the
     51  1.2   thorpej   pthread_st. If we can get suitable space above the initial stack for
     52  1.2   thorpej   main(), we can cut this to one page per thread. Perhaps crt0 should
     53  1.2   thorpej   do something different (give us more space) if libpthread is linked
     54  1.2   thorpej   in?
     55  1.2   thorpej - Figure out whether/how to expose the inline version of
     56  1.2   thorpej   pthread_self().
     57  1.2   thorpej - Along the same lines, figure out whether/how to use registers reserved
     58  1.2   thorpej   in the ABI for thread-specific-data to implement pthread_self().
     59  1.4  christos - Figure out what to do with changing stack sizes.
     60  1.5        ad 
     61  1.5        ad Future work for 1:1 threads:
     62  1.5        ad 
     63  1.5        ad - Stress testing, particularly with multiple CPUs.
     64  1.5        ad 
     65  1.5        ad - Verify that gdb still works well (basic functionality seems to be OK).
     66  1.5        ad 
     67  1.5        ad - There is a race between pthread_exit() and pthread_create() for
     68  1.5        ad   detached LWPs, where the stack (and pthread structure) could be reclaimed
     69  1.5        ad   before the thread has a chance to call _lwp_exit().  Checking the return
     70  1.5        ad   of _lwp_kill(target, 0) could be used to fix this but that seems a bit
     71  1.5        ad   heavyweight. (See shared page item.)
     72  1.5        ad 
     73  1.5        ad - Adaptive mutexes and spinlocks (see shared page item). These need
     74  1.5        ad   to implement exponential backoff to reduce bus contention. On x86 we
     75  1.5        ad   need to issue the 'pause' instruction while spinning, perhaps on other
     76  1.5        ad   SMT processors too.
     77  1.5        ad 
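The spin-with-backoff idea in the item above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the
libpthread implementation: every sketch_* name and both SPIN_BACKOFF_*
constants are hypothetical, the lock is a plain C11 atomic_flag, and
sched_yield() stands in for whatever "sleep" path an adaptive lock would
really take once spinning stops paying off.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical tuning values; real ones would come from benchmarking. */
#define SPIN_BACKOFF_MIN	4u
#define SPIN_BACKOFF_MAX	1024u

typedef struct {
	atomic_flag locked;
} sketch_spin_t;

#define SKETCH_SPIN_INIT	{ ATOMIC_FLAG_INIT }

/* Hint to the CPU that we are in a spin-wait loop, where the ISA has one. */
static inline void
sketch_cpu_relax(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
	__builtin_ia32_pause();		/* x86 'pause' instruction */
#else
	atomic_signal_fence(memory_order_seq_cst);	/* compiler barrier only */
#endif
}

/* One acquisition attempt; returns true if the lock was taken. */
static inline bool
sketch_spin_trylock(sketch_spin_t *l)
{
	return !atomic_flag_test_and_set_explicit(&l->locked,
	    memory_order_acquire);
}

/* Double the pause window after each failed attempt, up to a cap. */
static inline unsigned
sketch_backoff_next(unsigned cur)
{
	return cur >= SPIN_BACKOFF_MAX / 2 ? SPIN_BACKOFF_MAX : cur * 2;
}

static void
sketch_spin_lock(sketch_spin_t *l)
{
	unsigned backoff = SPIN_BACKOFF_MIN;

	while (!sketch_spin_trylock(l)) {
		/* Stay off the bus for 'backoff' pause cycles. */
		for (unsigned i = 0; i < backoff; i++)
			sketch_cpu_relax();
		/* At the cap, give the CPU away instead of spinning on. */
		if (backoff == SPIN_BACKOFF_MAX)
			sched_yield();
		backoff = sketch_backoff_next(backoff);
	}
}

static void
sketch_spin_unlock(sketch_spin_t *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

A truly adaptive lock would also want the "is the owner running on another
CPU" hint described under the shared page item before deciding whether
spinning is worthwhile at all; this sketch has no such information.
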
     78  1.5        ad - Have a shared page that:
     79  1.5        ad 
     80  1.5        ad   o Allows an LWP to request it not be preempted by the kernel. This would
     81  1.5        ad     be used over critical sections like pthread_cond_wait(), where we can
     82  1.5        ad     acquire a bunch of spin locks: being preempted while holding them would
     83  1.5        ad     suck. _lwp_park() would reset the flag once in kernel mode, and there
     84  1.5        ad     would need to be an equivalent way to do this from user mode. The user
     85  1.5        ad     path would probably need to notice deferred preemption and call
     86  1.5        ad     sched_yield() on exit from the critical section.
     87  1.5        ad 
     88  1.5        ad   o Perhaps has some kind of hint mechanism that gives us a clue about
     89  1.5        ad     whether an LWP is currently running on another CPU. This could be used
     90  1.5        ad     for adaptive locks, but would need to be cheap to do in-kernel.
     91  1.5        ad 
     92  1.5        ad   o Perhaps has a flag value that's reset once a detached LWP has entered
     93  1.5        ad     the kernel and reached lwp_exit1(), meaning that its stack can be reclaimed. Again,
     94  1.5        ad     may or may not be worth it.
     95  1.5        ad 
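The user-mode half of the "do not preempt me" request described in the first
shared page item might look something like the sketch below. It is purely
illustrative: no such page exists yet, so struct sketch_lwp_page, its two
fields and both sketch_crit_* functions are hypothetical. The assumption is
that the page is per-LWP, that the kernel sets "deferred" when it holds back
a preemption, and that compiler barriers are enough because the kernel only
inspects the page on the CPU the LWP is running on.

```c
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical per-LWP page shared between the kernel and user space. */
struct sketch_lwp_page {
	atomic_uint nopreempt;	/* set by user: please do not preempt */
	atomic_uint deferred;	/* set by kernel: a preemption was held back */
};

/* Enter a critical section in which preemption should be deferred. */
static void
sketch_crit_enter(struct sketch_lwp_page *pg)
{
	atomic_store_explicit(&pg->nopreempt, 1, memory_order_relaxed);
	/* Keep the compiler from moving the critical section above the store. */
	atomic_signal_fence(memory_order_seq_cst);
}

/*
 * Leave the critical section.  If the kernel deferred a preemption while
 * the flag was set, pay it back now by yielding.  Returns 1 if we yielded.
 */
static int
sketch_crit_exit(struct sketch_lwp_page *pg)
{
	atomic_signal_fence(memory_order_seq_cst);
	atomic_store_explicit(&pg->nopreempt, 0, memory_order_relaxed);
	if (atomic_exchange_explicit(&pg->deferred, 0, memory_order_relaxed)) {
		sched_yield();
		return 1;
	}
	return 0;
}
```

The in-kernel counterpart (resetting the flag from _lwp_park(), and bounding
how long a preemption may be deferred) is not modelled here.
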
     96  1.5        ad - Keep a pool of dead LWPs so that we do not have to take the full hit of
     97  1.5        ad   _lwp_create() every time pthread_create() is called. If nothing else
     98  1.5        ad   this is important for benchmarks. There are a few different ways this
     99  1.5        ad   could be implemented, but it needs to be clear if the advantages are
    100  1.5        ad   real. Lots of thought and benchmarking required.
    101  1.5        ad 
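One of the "few different ways" a pool could be kept is a small bounded LIFO
free list, sketched below. This is an assumption-laden illustration rather
than a proposal: struct sketch_thread, struct sketch_pool, SKETCH_POOL_MAX
and the sketch_pool_* functions are all hypothetical, the structure stands in
for whatever state would actually be cached, and there is no locking, which a
real version would need around both operations.

```c
#include <stddef.h>

#define SKETCH_POOL_MAX	8	/* hypothetical cap on cached entries */

/* Stand-in for the state that would be kept for a dead LWP. */
struct sketch_thread {
	struct sketch_thread *next;	/* free-list link */
	int id;
};

struct sketch_pool {
	struct sketch_thread *head;	/* most recently retired entry */
	size_t count;
};

/*
 * Retire a dead thread into the pool.  Returns 1 if it was cached, or 0 if
 * the pool is full and the caller must destroy it the slow way.
 */
static int
sketch_pool_put(struct sketch_pool *p, struct sketch_thread *t)
{
	if (p->count >= SKETCH_POOL_MAX)
		return 0;
	t->next = p->head;
	p->head = t;
	p->count++;
	return 1;
}

/*
 * Take a cached thread for reuse.  Returns NULL when the pool is empty, in
 * which case the caller falls back to creating a fresh one.
 */
static struct sketch_thread *
sketch_pool_get(struct sketch_pool *p)
{
	struct sketch_thread *t = p->head;

	if (t != NULL) {
		p->head = t->next;
		t->next = NULL;
		p->count--;
	}
	return t;
}
```

LIFO order is chosen here so that the most recently used entry, whose memory
is most likely still cached, is handed out first. Whether any of this beats
calling _lwp_create() directly is exactly the benchmarking question the item
raises.
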
    102  1.5        ad - LWPs that are parked or that have called nanosleep() (common) burn up
    103  1.5        ad   kernel resources. "struct lwp" itself isn't a big deal, but the VA space
    104  1.5        ad   and swap used by kernel stacks is. _lwp_park() takes a ucontext_t pointer
    105  1.5        ad   in expectation that at some point we may be able to recycle the kernel
    106  1.5        ad   stack and re-start the LWP at the correct point, using pageable user
    107  1.5        ad   memory to hold state. It might also be useful to have a nanosleep call
    108  1.5        ad   that does something similar. Again, lots of thought and benchmarking
    109  1.5        ad   required. (Original idea from matt@)
    110  1.5        ad 
    111  1.5        ad - It's possible that we don't need to take so many spinlocks around
    112  1.5        ad   cancellation points like pthread_cond_wait() given that _lwp_wakeup()
    113  1.5        ad   and _lwp_unpark() need to synchronise anyway.
    114  1.5        ad 
    115  1.5        ad - Need to give consideration to the order in which threads enter and exit
    116  1.5        ad   synchronisation objects, both in the pthread library and in the kernel.
    117  1.5        ad   Commonly locks are acquired/released in order (a, b, c -> c, b, a). The
    118  1.5        ad   pthread spec probably has something to say about this.
    119  1.5        ad 
    120  1.5        ad - The kernel scheduler needs improving to handle LWPs and processor affinity
    121  1.5        ad   better, and user space tools like top(1) and ps(1) need to be changed to
    122  1.5        ad   report correctly.  Tied into that is the need for a mechanism to impose
    123  1.5        ad   limits on various aspects of LWPs.
    124  1.5        ad 
    125  1.5        ad - Streamlining of the park/unpark path.
    126  1.5        ad 
    127  1.5        ad - Priority inheritance and similar nasties.