DragonFly kernel List (threaded) for 2003-07
Re: MESI Caching work start (was Re: caching and userapi)
On Fri, Jul 18, 2003 at 07:43:58PM -0700, Matthew Dillon wrote:
> :Matt, this is exactly the method propesed by Larry McVoy to avoid
> :having 1000+ mutexes in a kernel: have a simple, non-mutexed kernel
> :take care of each cpu and then manage to get a global page cache memory
> :across them, with mmap fully working.
> :Have a look at http://lwn.net/2001/1206/a/ccCluster.php3
> :By the way, when dragonfly manages to work with it's internal
> :ports system, can we rename you to 'Matt "Amiga Rules" Dillon'? ;)
> :Greets, Antonio.
> Interesting. That article is dated 4 Dec 2001. I wonder if Larry
> has updated his thinking since then. I also looked at the slides
> and flipped through the two papers mentioned. He doesn't specifically
> mention MESI but it is obvious that he is angling for the same effect.
> The funny thing is that what Larry, and what I am proposing, is not hard
> to do. The MESI (Modified Exclusive Shared Invalid) model has been around
> for ages and is the most common model used in SMP systems for L1/L2
> interactions. It's a very well known technology that even deals with the
> 'cache' nature of the information (since an L1/L2 cache only represents
> a small portion of available system memory, things are constantly being
> thrown away).
> The only major difference between a hardware implementation and a software
> implementation is the fact that it is far more expensive to 'broadcast'
> certain cache state transitions to the other cpus in software. If this
> is the only problem then I am certain the performance issue is solvable.
> I would call this sort of feature *CRITICAL* to DragonFly's future.
> Or 'Matt "Son of Amiga" Dillon', complete with a bouncing baby ball
> logo :-0
> Matthew Dillon
Matt, looking by my books, I've found out references to MESI protocol
on the PowerPC 604 manual (page 3-11), which it uses for it's caches.
Possible combinations seem to be:
m p1 p2 valid references
0. e i i memory
1. s s i memory and p1
2. s s s memory, p1 and p2
3. i e i p1
4. i m i p1
I assume that, for the software caches, there may be no passive devices
like system memory is, but only active ones. Therefore:
p1 p2 valid references
1. s i p1
2. s s both p1 and p2
3. e i p1
4. m i p1
Coding a simple state machine and veryfing it by
brute force should not take too much time ;)
I've also googled a bit and found this one talking about MESI:
Regarding broadcast communication between CPUs, we can model it
as an broadcasting ethernet: you store data on shared media
(a global page for all CPUs, for example), tag some recipients
on this data (like an 48bit ethernet address would be)
and then send a global interrupt to all nodes,
which would copy it to internal buffers and resume
processing as fast as possible.
Recipient tagging would allow partitioning the machine,
such as for example allowing route cache entries only
on CPU 1 and 2 but not on others.
Time to read and code a bit more :)
1. Dado un programa, siempre tiene al menos un fallo.
2. Dadas varias lineas de codigo, siempre se pueden acortar a menos lineas.
3. Por induccion, todos los programas se pueden
reducir a una linea que no funciona.