DragonFly kernel List (threaded) for 2004-02
Re: lkwt in DragonFly
On Sat, Feb 07, 2004 at 06:01:55PM -0800, Matthew Dillon wrote:
> Best case is as I outlined above... no critical section or bus locked
> operation is required at all, just an interlock against an IPI on
> the current cpu and that can be done by setting a field in
> the globaldata structure.
That is of course assuming that your token protects your data
from other threads on other CPUs, and not local interrupts as well?
(I'm assuming you'll be masking out interrupts for this).
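
To make the interlock in the quoted text concrete, here is a rough userland sketch of what "setting a field in the globaldata structure" might look like. It is only a simulation under my own assumptions: the field names (gd_ipi_block, gd_pending, gd_npending, gd_handled) and the helpers ipi_block(), ipi_unblock() and ipi_deliver() are hypothetical and are not taken from the DragonFly sources. The point is only that the owning cpu raises a plain per-cpu counter, with no bus-locked instruction, and any IPI work arriving on that cpu is held until the counter drops back to zero.

```c
#include <assert.h>

#define MAX_PENDING 8

/* Hypothetical stand-in for a per-cpu globaldata structure. */
struct globaldata {
    int gd_ipi_block;             /* >0: hold incoming IPI work on this cpu */
    int gd_pending[MAX_PENDING];  /* IPI messages deferred while blocked */
    int gd_npending;              /* number of deferred messages */
    int gd_handled;               /* number of messages actually run */
};

/* Placeholder for whatever work an IPI message would trigger. */
static void ipi_run(struct globaldata *gd, int msg)
{
    (void)msg;
    gd->gd_handled++;
}

/*
 * Simulated arrival of an IPI on this cpu.
 * Returns 1 if the message ran immediately, 0 if it was deferred,
 * and -1 if the deferral queue is full.
 */
static int ipi_deliver(struct globaldata *gd, int msg)
{
    if (gd->gd_ipi_block > 0) {
        if (gd->gd_npending == MAX_PENDING)
            return -1;
        gd->gd_pending[gd->gd_npending++] = msg;
        return 0;
    }
    ipi_run(gd, msg);
    return 1;
}

/* Raise the interlock: a plain store to a field only this cpu touches. */
static void ipi_block(struct globaldata *gd)
{
    gd->gd_ipi_block++;
}

/* Drop the interlock and run anything deferred once it reaches zero. */
static void ipi_unblock(struct globaldata *gd)
{
    int i;

    assert(gd->gd_ipi_block > 0);
    if (--gd->gd_ipi_block > 0)
        return;
    for (i = 0; i < gd->gd_npending; i++)
        ipi_run(gd, gd->gd_pending[i]);
    gd->gd_npending = 0;
}
```

This only models IPIs from other cpus. It says nothing about local interrupts, which is exactly the question I am asking above.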
> In FreeBSD-5 this sort of interlock could very well be a bit more
> expensive because in FreeBSD-5 you have no choice but to use a critical
> section to prevent your thread from being preemptively switched to another
> cpu, and even if you could get away without a critical section you would
> still have no choice but to use integrated %fs:gd_XXX accesses to
> access fields in your globaldata (again because outside of a critical
> section your thread can be preemptively switched to another cpu).
> The preemptive switching in FreeBSD-5 is the single biggest problem
> it has. It makes it impossible to implement optimal algorithms
> which take advantage of the per-cpu globaldata structure because you
> are forced to obtain one or more mutexes, OR use a critical section
> (which is a lot more expensive in FreeBSD-5 than in DFly) no matter
> what you do. It's a cascade effect... and circular reasoning too.
> We want to use mutexes, oh no mutexes obtained by interrupts may
> conflict with mutexes obtained by mainline code, oh gee that means
> we should do priority borrowing and, wow, that means that one mainline
> kernel thread can preempt another mainline kernel thread indirectly,
> oh fubar, that means that all of our globaldata accesses have to be
> atomic (no caching of 'mycpu'), so we can't easily access per-cpu
> globaldata based caches without using a mutex. See? circular reasoning.
It should be theoretically possible to disable CPU migration with a
simple interlock even on FreeBSD 5.x. That, along with masking out
interrupts, could be used to protect pcpu data without the use of
However - and this may not be the case for DragonFly - I have
previously noticed that when I ripped out the PCPU mutexes
protecting my pcpu uma_keg structures (just basic per-cpu
structures), thereby replacing the xchgs on the pcpu mutex with
critical sections, performance in general on a UP machine decreased
by about 8%. I can only assume at this point that the pessimisation
is due to the cost of interrupt unpending when compared to the nasty
scheduling requirements for an xchg (a large and mostly empty
> Matthew Dillon
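
For reference, the two ways of protecting a per-cpu cache that I compared above can be sketched roughly as follows. This is a userland illustration under my own assumptions, not the actual UMA code: struct pcpu_cache, struct thread_state and all of the function names are hypothetical. The first variant takes a per-cpu lock with an atomic exchange (an xchg on x86). The second uses a critical section modelled as a nesting counter, where interrupts arriving inside the section are deferred and have to be unpended on exit.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical per-cpu cache; not the real uma_keg layout. */
struct pcpu_cache {
    atomic_int pc_lock;  /* used only by the xchg-based variant */
    int pc_free;         /* number of cached free items */
};

/* Hypothetical per-thread state for the critical-section variant. */
struct thread_state {
    int td_critnest;     /* critical section nesting depth */
    int td_pending;      /* interrupts deferred while nested */
    int td_ran;          /* interrupts that have actually run */
};

/* Variant 1: per-cpu lock acquired with an atomic exchange. */
static void pcpu_lock(struct pcpu_cache *pc)
{
    while (atomic_exchange_explicit(&pc->pc_lock, 1,
        memory_order_acquire) != 0)
        ;   /* spin until the previous holder releases */
}

static void pcpu_unlock(struct pcpu_cache *pc)
{
    atomic_store_explicit(&pc->pc_lock, 0, memory_order_release);
}

static int cache_alloc_locked(struct pcpu_cache *pc)
{
    int ok;

    pcpu_lock(pc);
    ok = (pc->pc_free > 0);
    if (ok)
        pc->pc_free--;
    pcpu_unlock(pc);
    return ok;
}

/* Variant 2: critical section, no atomic instruction on the fast path. */
static void crit_enter(struct thread_state *td)
{
    td->td_critnest++;
}

static void crit_exit(struct thread_state *td)
{
    assert(td->td_critnest > 0);
    if (--td->td_critnest == 0 && td->td_pending > 0) {
        /* Unpending: run whatever was deferred while nested. */
        td->td_ran += td->td_pending;
        td->td_pending = 0;
    }
}

/* Simulated interrupt: deferred inside a critical section, else run now. */
static void interrupt_arrives(struct thread_state *td)
{
    if (td->td_critnest > 0)
        td->td_pending++;
    else
        td->td_ran++;
}

static int cache_alloc_crit(struct thread_state *td, struct pcpu_cache *pc)
{
    int ok;

    crit_enter(td);
    ok = (pc->pc_free > 0);
    if (ok)
        pc->pc_free--;
    crit_exit(td);
    return ok;
}
```

The sketch shows where the unpending work lands (in crit_exit()), but it makes no attempt to reproduce the relative costs I measured.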
Bosko Milekic * bmilekic@xxxxxxxxxxxxxxxx * bmilekic@xxxxxxxxxxx
TECHNOkRATIS Consulting Services * http://www.technokratis.com/