DragonFly kernel List (threaded) for 2011-01

Re: i386 version of cpu_sfence()

From:	Matthew Dillon <dillon@xxxxxxxxxxxxxxxxxxxx>
Date:	Fri, 28 Jan 2011 10:54:25 -0800 (PST)

:Hi all,
:
:The i386 version of cpu_sfence() is just asm volatile ("" :::"memory").
:
:According to the instruction set, sfence should also ensure
:"global visibility" (i.e. an empty CPU store buffer) for the stores
:that precede it.
:So should we do the same as cpu_mfence(), i.e. use a locked memory access?
:
:Best Regards,
:sephe

    cpu_sfence() is basically a NOP, because x86 cpus already order
    writes for global visibility.  The asm volatile ("" :::"memory")
    construct is roughly equivalent to cpu_ccfence(): it prevents the
    compiler itself from optimizing or reordering actual instructions
    across that point in the code.
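
    As a minimal sketch of what such a compiler-only fence does, the
    snippet below publishes a value and then sets a flag, with an empty
    asm statement between the two stores.  It assumes GCC or Clang
    extended asm.  The names compiler_fence(), struct mailbox, and
    publish() are made up for illustration and are not the kernel's
    actual definitions.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a compiler-only fence.  The empty asm
 * emits no instruction; the "memory" clobber tells the compiler that
 * memory may have changed, so it must not move or cache memory
 * accesses across this point.
 */
#define compiler_fence()  __asm__ __volatile__("" ::: "memory")

/* Illustrative structure: a payload plus a "ready" flag. */
struct mailbox {
    uint32_t payload;
    uint32_t ready;
};

/*
 * Store the payload, then set the flag.  The fence keeps the compiler
 * from sinking the payload store below the flag store.  It does not
 * by itself constrain what the CPU does.
 */
static void publish(struct mailbox *mb, uint32_t value)
{
    mb->payload = value;
    compiler_fence();
    mb->ready = 1;
}
```

    Calling publish(&mb, 42) on a zeroed mailbox leaves payload at 42
    and ready at 1; the fence only affects the order in which the
    compiler emits the two stores.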

    lfence/mfence require actual work.  Even though cpus guarantee
    ordered global visibility on writes, they also reorder reads and
    do speculative reads.  Thus lfence/mfence have to be real
    instructions.

    I dunno what the best approach is, but I suggest someone modify
    one of the benchmark tests in /usr/src/test/sysperf to measure
    the cost of lfence/mfence vs locked memory instructions.  For
    such a test to be meaningful it must create two threads that
    compete for the memory location in question.
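
    A rough userland sketch of such a measurement follows.  It is not
    one of the existing sysperf tests; it assumes POSIX threads,
    clock_gettime(), and GCC/Clang inline asm, and every name in it
    (fence_kind, shared_word, run_bench, ...) is hypothetical.  One
    thread keeps writing the shared word while the timed thread stores
    to it and then issues the fence under test.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdint.h>
#include <time.h>

enum fence_kind { FENCE_MFENCE, FENCE_LFENCE, FENCE_LOCKED };

static volatile uint32_t shared_word;  /* the contended memory location */
static volatile int stop_flag;         /* tells the competitor to exit */

/* Issue the fence under test. */
static inline void do_fence(enum fence_kind kind)
{
#if defined(__i386__) || defined(__x86_64__)
    switch (kind) {
    case FENCE_MFENCE:
        __asm__ __volatile__("mfence" ::: "memory");
        break;
    case FENCE_LFENCE:
        __asm__ __volatile__("lfence" ::: "memory");
        break;
    case FENCE_LOCKED:
        /* Locked read-modify-write on the contended word itself. */
        __asm__ __volatile__("lock; addl $0,%0"
                             : "+m"(shared_word) : : "memory", "cc");
        break;
    }
#else
    /* Non-x86 fallback so the sketch still builds: full barrier. */
    (void)kind;
    __sync_synchronize();
#endif
}

/* Second thread: keeps writing the same location to create contention. */
static void *competitor(void *arg)
{
    (void)arg;
    while (!stop_flag)
        shared_word++;
    return NULL;
}

/*
 * Time `iters` store+fence pairs against a running competitor.
 * Returns elapsed nanoseconds, or -1 if the thread could not start.
 */
static int64_t run_bench(enum fence_kind kind, long iters)
{
    pthread_t tid;
    struct timespec t0, t1;

    stop_flag = 0;
    if (pthread_create(&tid, NULL, competitor, NULL) != 0)
        return -1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++) {
        shared_word = (uint32_t)i;
        do_fence(kind);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    stop_flag = 1;
    pthread_join(tid, NULL);

    return (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
        + (t1.tv_nsec - t0.tv_nsec);
}
```

    Calling run_bench() once per fence kind with the same iteration
    count and dividing by that count gives a per-operation cost to
    compare.  A serious test would also pin the two threads to
    different cpus and repeat the runs, which this sketch leaves out.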

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

Follow-Ups:
- Re: i386 version of cpu_sfence()
  - From: Sepherosa Ziehau <sepherosa@gmail.com>

References:
- i386 version of cpu_sfence()
  - From: Sepherosa Ziehau <sepherosa@gmail.com>
