DragonFly BSD
DragonFly kernel List (threaded) for 2004-04

Re: pipe testing and kernel copyin/copyout/bcopy performance


From: Andy Isaacson <adi@xxxxxxxxxxxxx>
Date: Thu, 29 Apr 2004 19:13:44 -0500

On Thu, Apr 29, 2004 at 01:41:35PM -0700, Matthew Dillon wrote:
> :I've also read something about caches not being updated by using SSE 
> :instructions such that if you refer to the memory you just copied that the 
> :wins for having used SSE in the copy are much diminished.
> :
> :Dave
> 
>     These are the so-called 'non-temporal' instructions.  So, for example,
>     the standard 128 bit move instruction is 'movdqa' or 'movdqu' (for
>     double-quad-aligned or double-quad-unaligned).  The non-temporal 
>     version is 'movntdq'.
> 
>     The non-temporal instructions supposedly queue directly to memory and
>     do not 'pollute' the caches.

... but to clear up one potential misunderstanding -- non-temporal
stores *are* cache-coherent; you'll never get the "wrong answer" due to
using a non-temporal instruction.  It just won't be as fast as if you'd
let the cache do its job, that's all.
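
To make the coherence point concrete, here is a rough sketch in C using
the SSE2 intrinsics, where _mm_stream_si128 is the intrinsic for
movntdq. The function names copy_nontemporal and copy_roundtrip_ok are
made up for illustration, and the sketch assumes 16-byte-aligned
buffers and a length that is a multiple of 16. Reading the destination
back right after the copy still returns the right bytes; the only
thing at stake is speed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_load_si128, _mm_stream_si128 */
#endif

/* Hypothetical helper: copy n bytes between 16-byte-aligned buffers,
 * using non-temporal stores when SSE2 is available.
 * Assumes n is a multiple of 16. */
static void copy_nontemporal(void *dst, const void *src, size_t n)
{
    assert((uintptr_t)dst % 16 == 0 && (uintptr_t)src % 16 == 0);
    assert(n % 16 == 0);
#if defined(__SSE2__)
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;
    for (size_t i = 0; i < n / 16; i++) {
        __m128i v = _mm_load_si128(s + i);  /* ordinary aligned load (movdqa) */
        _mm_stream_si128(d + i, v);         /* non-temporal store (movntdq) */
    }
    _mm_sfence();  /* order the streaming stores ahead of later stores */
#else
    memcpy(dst, src, n);  /* fallback for builds without SSE2 */
#endif
}

/* Hypothetical check: copy a buffer, then read the destination back. */
static int copy_roundtrip_ok(void)
{
    static _Alignas(16) unsigned char src[256];
    static _Alignas(16) unsigned char dst[256];

    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)i;
    memset(dst, 0, sizeof dst);

    copy_nontemporal(dst, src, sizeof src);
    return memcmp(dst, src, sizeof src) == 0;
}
```

copy_roundtrip_ok() should return 1 on any conforming build. The
sfence matters for ordering against other processors, not for the
copying thread seeing its own data.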

(I wonder what happens if you have a word of memory dirty in L1 and
do a movntq directed at the next one.  Lots of fiddle-faddle, I'd
imagine.)

-andy


