$NetBSD: TODO,v 1.8 2007/03/02 18:53:51 ad Exp $

Bugs to fix:

- Add locking to ld.elf_so so that multiple threads doing lazy binding
  don't trash things. XXX Still the case?
- Verify the cancel stub symbol trickery.

Interfaces/features to implement:

- priority scheduling
- libc integration:
   - foo_r interfaces
- system integration
   - some macros and prototypes belong in headers other than pthread.h

Features that need more/better regression tests:

 - pthread_cond_broadcast()
 - pthread_once()
 - pthread_get/setspecific()
 - signals

Ideas to play with:

- Explore the trapcontext vs. usercontext distinction in ucontext_t.

- Get rid of thread structures when too many accumulate (is this
  actually a good idea?)

- Currently, each thread uses two real pages of memory: one at the top
  of the stack for actual stack data, and one at the bottom for the
  pthread_st. If we can get suitable space above the initial stack for
  main(), we can cut this to one page per thread. Perhaps crt0 should
  do something different (give us more space) if libpthread is linked
  in?

- Figure out whether/how to expose the inline version of
  pthread_self().

- Along the same lines, figure out whether/how to use registers reserved
  in the ABI for thread-specific data to implement pthread_self().
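
  A minimal sketch of the register idea, assuming GCC-style __thread
  storage: on ABIs with a reserved thread pointer (e.g. %fs on x86_64,
  tpidr_el0 on AArch64), a __thread access compiles to a load relative
  to that register, so caching the thread's identity there makes a
  self-lookup essentially free. The names cached_self and
  my_pthread_self below are hypothetical, not libpthread's.

```c
#include <assert.h>
#include <pthread.h>

/* Each thread has its own copy of this variable; on most ABIs the
 * compiler turns an access into a single load off the reserved
 * thread pointer register. */
static __thread pthread_t cached_self;

/* Prime the cache once, early in the thread's life. */
static void cache_self(void) { cached_self = pthread_self(); }

/* Cheap self-lookup: no call into the threading library. */
static pthread_t my_pthread_self(void) { return cached_self; }

static int demo(void)
{
    cache_self();
    /* After priming, the cached identity agrees with the real one. */
    return pthread_equal(my_pthread_self(), pthread_self()) != 0;
}
```

  The open question from the item above is whether exposing this is
  safe across fork() and dlopen()'d libraries, not whether it is fast.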

- Figure out what to do with changing stack sizes.

- Stress testing, particularly with multiple CPUs.

- A race between pthread_exit() and pthread_create() for detached LWPs,
  where the stack (and pthread structure) could be reclaimed before the
  thread has a chance to call _lwp_exit(), is currently prevented by
  checking the return of _lwp_kill(target, 0).  It could be done more
  efficiently.  (See shared page item.)

- Adaptive mutexes and spinlocks (see shared page item). These need
  to implement exponential backoff to reduce bus contention. On x86 we
  need to issue the 'pause' instruction while spinning, perhaps on other
  SMT processors too.
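
  A sketch of the spin-with-backoff idea, written against C11 atomics
  rather than libpthread's internal primitives, so the spinlock_t,
  spin_lock and cpu_relax names here are illustrative only:

```c
#include <assert.h>
#include <stdatomic.h>

/* On x86, 'pause' tells the core we are in a spin loop, easing bus
 * and SMT-sibling contention; elsewhere this is a no-op. */
static inline void cpu_relax(void)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ volatile("pause");
#endif
}

typedef struct { atomic_flag locked; } spinlock_t;
#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

static void spin_lock(spinlock_t *l)
{
	unsigned backoff = 1;

	while (atomic_flag_test_and_set_explicit(&l->locked,
	    memory_order_acquire)) {
		/* Spin exponentially longer after each failed attempt,
		 * capped so no waiter is starved indefinitely. */
		for (unsigned i = 0; i < backoff; i++)
			cpu_relax();
		if (backoff < 1024)
			backoff <<= 1;
	}
}

static void spin_unlock(spinlock_t *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

static int demo(void)
{
	static spinlock_t l = SPINLOCK_INIT;
	int counter = 0;

	spin_lock(&l);
	counter++;
	spin_unlock(&l);
	spin_lock(&l);	/* reacquire to show the release worked */
	counter++;
	spin_unlock(&l);
	return counter;
}
```

  The adaptive part (spin only while the holder is running on another
  CPU, otherwise sleep) is what needs the shared-page hint below.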

- Have a shared page that:

  o Allows an LWP to request it not be preempted by the kernel. This would
    be used over critical sections like pthread_cond_wait(), where we can
    acquire a bunch of spin locks: being preempted while holding them would
    suck. _lwp_park() would reset the flag once in kernel mode, and there
    would need to be an equivalent way to do this from user mode. The user
    path would probably need to notice deferred preemption and call
    sched_yield() on exit from the critical section.
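
    The user-side half of that protocol might look like the following
    sketch; the shared_page layout and the np_requested/np_deferred
    flag names are pure invention for illustration:

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical per-LWP words in the shared page: the LWP sets
 * np_requested around a critical section; the kernel, wanting to
 * preempt but honouring the request, sets np_deferred instead. */
struct shared_page {
	volatile bool np_requested;
	volatile bool np_deferred;
};

static void critsect_enter(struct shared_page *sp)
{
	sp->np_requested = true;
}

/* Returns 1 if a preemption was deferred on our behalf. */
static int critsect_exit(struct shared_page *sp)
{
	sp->np_requested = false;
	if (sp->np_deferred) {
		/* The kernel held off for us; give the CPU back
		 * promptly rather than overstay our welcome. */
		sp->np_deferred = false;
		sched_yield();
		return 1;
	}
	return 0;
}

static int demo(void)
{
	struct shared_page sp = { false, false };

	critsect_enter(&sp);
	sp.np_deferred = true;	/* simulate a deferred preemption */
	return critsect_exit(&sp);
}
```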

  o Perhaps has some kind of hint mechanism that gives us a clue about
    whether an LWP is currently running on another CPU. This could be used
    for adaptive locks, but would need to be cheap to do in-kernel.

  o Perhaps has a flag value that's reset once a detached LWP has entered
    the kernel and called lwp_exit1(), meaning that its stack can be
    reclaimed. Again, may or may not be worth it.

- Keep a pool of dead LWPs so that we do not have to take the full hit of
  _lwp_create() every time pthread_create() is called. If nothing else
  this is important for benchmarks. There are a few different ways this
  could be implemented, but it needs to be clear whether the advantages
  are real. Lots of thought and benchmarking required.
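
  One of those ways could be a simple capped LIFO free list, sketched
  here with hypothetical thread_cache/dead_thread types standing in
  for the real pthread_st and stack machinery:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for a dead-but-cached thread; the real thing
 * would carry the stack base and the pthread_st. */
struct dead_thread {
	struct dead_thread *next;
};

struct thread_cache {
	struct dead_thread *head;
	int nthreads;
	int max;	/* cap so dead threads do not pile up forever */
};

/* Park a dead thread for reuse; returns 0 if the cache is full and
 * the caller must free the resources for real. */
static int cache_put(struct thread_cache *tc, struct dead_thread *dt)
{
	if (tc->nthreads >= tc->max)
		return 0;
	dt->next = tc->head;
	tc->head = dt;
	tc->nthreads++;
	return 1;
}

/* Recycle the most recently exited thread, or NULL if none cached. */
static struct dead_thread *cache_get(struct thread_cache *tc)
{
	struct dead_thread *dt = tc->head;

	if (dt != NULL) {
		tc->head = dt->next;
		tc->nthreads--;
	}
	return dt;
}

static int demo(void)
{
	struct thread_cache tc = { NULL, 0, 4 };
	struct dead_thread a, b;

	(void)cache_put(&tc, &a);
	(void)cache_put(&tc, &b);
	/* LIFO reuse: the most recently exited thread comes back
	 * first, which also keeps its stack warm in the CPU cache. */
	return cache_get(&tc) == &b && cache_get(&tc) == &a &&
	    cache_get(&tc) == NULL;
}
```

  The hard part is not the list but deciding when a detached LWP's
  resources are actually safe to recycle (see the race item above).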

- LWPs that are parked or that have called nanosleep() (common) burn up
  kernel resources. "struct lwp" itself isn't a big deal, but the VA space
  and swap used by kernel stacks is. _lwp_park() takes a ucontext_t pointer
  in the expectation that at some point we may be able to recycle the
  kernel stack and restart the LWP at the correct point, using pageable
  user memory to hold state. It might also be useful to have a nanosleep
  call that does something similar. Again, lots of thought and benchmarking
  required. (Original idea from matt@)

- Need to give consideration to the order in which threads enter and exit
  synchronisation objects, both in the pthread library and in the kernel.
  Commonly, locks are acquired and released in LIFO order (a, b, c -> c, b, a).
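
  For illustration, that LIFO discipline with three POSIX mutexes
  (the a/b/c names mirror the note above):

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t c = PTHREAD_MUTEX_INITIALIZER;

static int demo(void)
{
	/* Acquire in one fixed global order... */
	pthread_mutex_lock(&a);
	pthread_mutex_lock(&b);
	pthread_mutex_lock(&c);
	/* ...work on state guarded by all three... */
	/* ...and release in the reverse (LIFO) order. */
	pthread_mutex_unlock(&c);
	pthread_mutex_unlock(&b);
	pthread_mutex_unlock(&a);
	return 0;
}
```

  Acquiring nested locks in a single global order is also what keeps
  two threads from deadlocking against each other; the open question
  in this item is whether the wakeup paths should exploit that
  common pattern.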

- The kernel scheduler needs improving to handle LWPs and processor affinity
  better, and user space tools like top(1) and ps(1) need to be changed to
  report correctly.  Tied into that is the need for a mechanism to impose
  limits on various aspects of LWPs.

- Streamlining of the park/unpark path.

- Priority inheritance and similar nasties.