DragonFly kernel List (threaded) for 2004-04
Re: pipe testing and kernel copyin/copyout/bcopy performance
On Thu, Apr 29, 2004 at 01:41:35PM -0700, Matthew Dillon wrote:
> :I've also read something about caches not being updated by using SSE
> :instructions such that if you refer to the memory you just copied that the
> :wins for having used SSE in the copy are much diminished.
> These are the so-called 'non-temporal' instructions. So, for example,
> the standard 128 bit move instruction is 'movdqa' or 'movdqu' (for
> double-quad-aligned or double-quad-unaligned). The non-temporal
> version is 'movntdq'.
> The non-temporal instructions supposedly queue directly to memory and
> do not 'pollute' the caches.
... but to clear up one potential misunderstanding -- non-temporal
stores *are* cache-coherent; you'll never get the "wrong answer" due to
using a non-temporal instruction. It just won't be as fast as if you'd
let the cache do its job, that's all.
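
To make that concrete, here is a minimal sketch, not anything taken from the DragonFly tree. It copies a buffer with non-temporal stores and then reads the destination straight back. The helper names copy_nontemporal and nontemporal_copy_roundtrip_ok are hypothetical, and the sketch assumes 16-byte-aligned buffers, a length that is a multiple of 16, and a C11 compiler targeting an SSE2-capable x86 (it falls back to memcpy elsewhere so it still builds).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_load_si128, _mm_stream_si128 */
#endif

/* Hypothetical helper: copy len bytes using non-temporal stores (movntdq).
 * Assumes dst and src are 16-byte aligned and len is a multiple of 16. */
static void copy_nontemporal(void *dst, const void *src, size_t len)
{
    assert(((uintptr_t)dst & 15) == 0 && ((uintptr_t)src & 15) == 0);
    assert((len & 15) == 0);
#if defined(__SSE2__)
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;
    for (size_t i = 0; i < len / 16; i++) {
        __m128i v = _mm_load_si128(&s[i]);  /* movdqa: ordinary cached load */
        _mm_stream_si128(&d[i], v);         /* movntdq: non-temporal store */
    }
    _mm_sfence();  /* order the weakly-ordered streaming stores */
#else
    memcpy(dst, src, len);  /* non-x86 fallback so the sketch still builds */
#endif
}

/* Returns 1 if reading the destination back right after the copy yields
 * exactly the source bytes. */
static int nontemporal_copy_roundtrip_ok(void)
{
    static _Alignas(16) unsigned char src[4096];
    static _Alignas(16) unsigned char dst[4096];

    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i * 31u + 7u);
    memset(dst, 0, sizeof dst);  /* touch dst first so it is likely cached */

    copy_nontemporal(dst, src, sizeof src);
    return memcmp(dst, src, sizeof src) == 0;
}
```

The read-back compares equal even though dst was touched (and so probably cached) just before the streaming stores. What the sketch does not show is the cost: that memcmp will likely have to fetch dst from memory again, which is where the speed win goes.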
(I wonder what happens if you have a word of memory dirty in L1 and
do a movntq directed at the next one. Lots of fiddle-faddle, I'd