DragonFly kernel List (threaded) for 2011-01
Re: i386 version of cpu_sfence()
:i386 version of cpu_sfence(), it is just asm volatile ("" :::"memory")
:According to the instruction set, sfence should also ensure the
:"global visibility" (i.e. empty CPU store buffer) of the stores before
:So should we do the same as cpu_mfence(), i.e. use a locked memory access?
cpu_sfence() is basically a NOP, because x86 cpus already order
ordinary writes for global visibility. The asm volatile ("" ::: "memory")
construct is roughly equivalent to cpu_ccfence(): it prevents the
compiler itself from optimizing or reordering memory accesses across
that point in the code.
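
To illustrate the compiler-only barrier, here is a minimal userland sketch,
assuming GCC or Clang inline-asm syntax. The names cc_barrier(), publish(),
is_ready() and get_data() are made up for the example and are not the
kernel's own definitions.

```c
/*
 * Hypothetical stand-in for a compiler-only fence (GCC/Clang syntax).
 * It emits no machine instruction; the "memory" clobber only tells the
 * compiler that memory may have changed, so it cannot cache values in
 * registers or move loads/stores across this point.
 */
#define cc_barrier() __asm__ __volatile__("" ::: "memory")

static int data;
static int ready;

/* Store the payload first, then the flag. */
void publish(int value)
{
    data = value;
    cc_barrier();   /* compiler may not sink 'data' below or hoist 'ready' above */
    ready = 1;
}

/* Plain accessors, only here so the example can be exercised. */
int is_ready(void) { return ready; }
int get_data(void) { return data; }
```

Without the barrier the compiler would be free to emit the two stores in
either order; with it, the order in the generated code matches the source.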
lfence/mfence require actual work. Even though cpus guarantee
ordered global visibility on write, they also reorder reads and
do speculative reads. Thus lfence/mfence must be real instructions.
I dunno what the best approach is but I suggest someone modify
one of the benchmark tests in /usr/src/test/sysperf to measure
the cost of lfence/mfence vs locked memory instructions. For
such a test to be reasonable the test must create two threads
which compete for the memory location in question.
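
A rough userland sketch of such a test, assuming GCC or Clang on x86 with
SSE2 (needed for lfence/mfence) and POSIX threads. This is not an existing
sysperf test; run_bench(), worker() and the do_*() helpers are hypothetical
names. On non-x86 targets it falls back to a generic full barrier just so
the file still compiles.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

enum fence_kind { FENCE_MFENCE, FENCE_LFENCE, FENCE_LOCKED };

#if defined(__x86_64__) || defined(__i386__)
static inline void do_mfence(void) { __asm__ __volatile__("mfence" ::: "memory"); }
static inline void do_lfence(void) { __asm__ __volatile__("lfence" ::: "memory"); }
/* Locked read-modify-write on a thread-local dummy word. */
static inline void do_locked(void)
{
    int dummy = 0;
    __asm__ __volatile__("lock; addl $0,%0" : "+m"(dummy) : : "memory", "cc");
}
#else
/* Assumption: no x86 fences available, use the compiler's full barrier. */
static inline void do_mfence(void) { __sync_synchronize(); }
static inline void do_lfence(void) { __sync_synchronize(); }
static inline void do_locked(void) { __sync_synchronize(); }
#endif

struct bench_arg {
    enum fence_kind kind;
    long iters;
};

/* The memory location both threads compete for. */
static volatile uint32_t shared_word;

static void *worker(void *p)
{
    const struct bench_arg *a = p;

    for (long i = 0; i < a->iters; i++) {
        shared_word = (uint32_t)i;      /* contended store */
        /* The switch costs the same for every kind, so it cancels out
         * when the results are compared against each other. */
        switch (a->kind) {
        case FENCE_MFENCE: do_mfence(); break;
        case FENCE_LFENCE: do_lfence(); break;
        case FENCE_LOCKED: do_locked(); break;
        }
    }
    return NULL;
}

/* Runs two competing threads; returns elapsed wall-clock nanoseconds,
 * including thread creation and join overhead. */
double run_bench(enum fence_kind kind, long iters)
{
    struct bench_arg arg = { kind, iters };
    pthread_t t[2];
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < 2; i++) {
        int rc = pthread_create(&t[i], NULL, worker, &arg);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (double)(t1.tv_sec - t0.tv_sec) * 1e9 +
           (double)(t1.tv_nsec - t0.tv_nsec);
}
```

A driver would call run_bench() once per fence kind with a large iteration
count and divide the result by the number of iterations to get a per-loop
cost. The timing includes thread startup, so small iteration counts will
be dominated by that overhead rather than by the fences.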