DragonFly BSD
DragonFly commits List (threaded) for 2006-05

cvs commit: src/sys/kern kern_ktr.c kern_lock.c kern_spinlock.c lwkt_rwlock.c lwkt_thread.c lwkt_token.c src/sys/sys globaldata.h spinlock.h spinlock2.h thread.h src/sys/netproto/smb smb_subr.h src/sys/vfs/ntfs ntfs_subr.c src/sys/vm vm_zone.c

From: Matthew Dillon <dillon@xxxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 21 May 2006 13:23:29 -0700 (PDT)

dillon      2006/05/21 13:23:29 PDT

DragonFly src repository

  Modified files:
    sys/kern             kern_ktr.c kern_lock.c kern_spinlock.c 
                         lwkt_rwlock.c lwkt_thread.c lwkt_token.c 
    sys/sys              globaldata.h spinlock.h spinlock2.h 
    sys/netproto/smb     smb_subr.h 
    sys/vfs/ntfs         ntfs_subr.c 
    sys/vm               vm_zone.c 

  Implement a much faster spinlock.
  * Spinlocks can't conflict with FAST interrupts without deadlocking anyway,
    so instead of using a critical section, simply do not allow an interrupt
    thread to preempt the current thread while it is holding a spinlock.  This
    cuts spinlock overhead in half.
  * Implement shared spinlocks in addition to exclusive spinlocks.  Shared
    spinlocks would be used for, e.g., file descriptor table lookups.
  * Cache a shared spinlock by using the spinlock's lock field as a bitfield,
    one bit for each cpu (bit 31 is reserved for exclusive locks).  A shared
    spinlock sets its cpu's shared bit and does not bother clearing it on
    unlock.
    This means that multiple, parallel shared spinlock accessors do NOT incur
    a cache conflict on the spinlock.  ALL parallel shared accessors operate
    at full speed (~10ns vs ~40-100ns in overhead).  90% of the 10ns in
    overhead is due to a necessary MFENCE to interlock against exclusive
    spinlocks on the mutex.  However, this MFENCE only has to play with
    pending cpu-local memory writes so it will always run at near full speed.
  * Exclusive spinlocks in the face of previously cached shared spinlocks
    are now slightly more expensive because they have to clear the cached
    shared spinlock bits by checking the globaldata structure for each
    conflicting cpu to see if it is still holding a shared spinlock.  However,
    only the initial (unavoidable) atomic swap involves potential cache
    conflicts.  The shared bit checks involve only memory reads and the
    situation should be self-correcting from a performance standpoint since
    the shared bits then get cleared.
  * Add sysctls for basic spinlock performance testing.  Setting
    debug.spin_lock_test issues a test.  Tests #2 and #3 loop
    debug.spin_test_count times.  p.s. these tests will stall the whole
    machine while they run.
  	1	Test the indefinite wait code
  	2	Time the best-case exclusive lock overhead
  	3	Time the best-case shared lock overhead
  * TODO: A shared->exclusive spinlock upgrade inline with positive feedback,
    and an exclusive->shared spinlock downgrade inline.

  Revision  Changes    Path
  1.14      +10 -4     src/sys/kern/kern_ktr.c
  1.21      +15 -15    src/sys/kern/kern_lock.c
  1.4       +222 -55   src/sys/kern/kern_spinlock.c
  1.11      +14 -14    src/sys/kern/lwkt_rwlock.c
  1.96      +22 -18    src/sys/kern/lwkt_thread.c
  1.26      +6 -6      src/sys/kern/lwkt_token.c
  1.42      +3 -1      src/sys/sys/globaldata.h
  1.4       +2 -0      src/sys/sys/spinlock.h
  1.9       +128 -84   src/sys/sys/spinlock2.h
  1.81      +1 -1      src/sys/sys/thread.h
  1.13      +2 -2      src/sys/netproto/smb/smb_subr.h
  1.24      +8 -8      src/sys/vfs/ntfs/ntfs_subr.c
  1.20      +7 -7      src/sys/vm/vm_zone.c

