DragonFly kernel List (threaded) for 2004-02
SMP CPU Synchronization patch - needs testing on SMP systems
This patch needs some serious testing on SMP systems. It is part 1 of
a multi-stage patch to fix serious issues with our VM system.
This patch is an outgrowth of conversations I have had with Alan Cox
and Tor regarding TLB writeback races between cpus.
Basically the problem is that a user process running on one cpu may
modify memory which causes the CPU to issue a TLB writeback of the page
table entry in order to (e.g.) set the 'M'odified or 'A'ccessed bit in
the pte. Since the user process does not need the MP lock, this writeback
can race against another cpu running in the kernel (that is holding the
MP lock) trying to update the same page table entry.
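The lost update described above can be illustrated with a small single-threaded simulation. This is only a sketch: the bit layout, the `pte_t` typedef, and the function names `tlb_writeback` and `racy_write_protect` are assumptions made for the example and are not taken from the patch. The "other cpu" is simulated by a call placed inside the window between the kernel's load and store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical pte bit layout, chosen only for this illustration. */
#define PTE_V  0x001u   /* valid */
#define PTE_RW 0x002u   /* writable */
#define PTE_A  0x020u   /* 'A'ccessed */
#define PTE_M  0x040u   /* 'M'odified */

typedef _Atomic uint32_t pte_t;

/* Stand-in for the TLB writeback issued by the cpu running the user process. */
static void tlb_writeback(pte_t *pte)
{
    atomic_fetch_or(pte, PTE_A | PTE_M);
}

/*
 * Kernel-side write-protect done as load, modify, store.  When
 * user_cpu_writes is nonzero the simulated writeback lands between the
 * load and the store.  Returns the value the kernel believes it replaced.
 */
static uint32_t racy_write_protect(pte_t *pte, int user_cpu_writes)
{
    uint32_t old = atomic_load(pte);

    if (user_cpu_writes)
        tlb_writeback(pte);           /* arrives inside the race window */
    atomic_store(pte, old & ~PTE_RW); /* overwrites the A and M bits */
    return old;
}
```

When the writeback lands inside the window, neither the returned value nor the final pte carries the M bit, so the kernel would treat a dirty page as clean.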
The result is occasional weird failures and panics such as
"dirty page found in cache queue".
This patch basically creates a CPU synchronization and rendezvous API
that allows us to force other cpus into a known state while we make
sensitive page table changes. Also in this patch is a reworking of the
PMAP subsystem to use the new framework. Code to deal with modified bit
races in the VM system is slated for a future stage.
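The general shape of such a rendezvous can be sketched in user space with POSIX threads standing in for cpus. Everything here is an assumption made for illustration: the names `rendezvous_begin`, `rendezvous_end`, and `rendezvous_poll` are hypothetical, there is a single initiator, and the other "cpus" notice the request by polling rather than by an interrupt. It is not the API the patch introduces.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NCPUS 4          /* one initiator plus three simulated cpus */
#define PTE_M 0x040u     /* hypothetical 'M'odified bit */

static atomic_int  requested;   /* nonzero while the initiator wants cpus parked */
static atomic_int  parked;      /* number of cpus currently parked */
static atomic_int  stop_cpus;   /* tells the simulated cpus to exit */
static atomic_uint pte_word;    /* the shared page table entry */

/* Called by each simulated cpu at a safe point: park while requested. */
static void rendezvous_poll(void)
{
    if (atomic_load(&requested)) {
        atomic_fetch_add(&parked, 1);
        while (atomic_load(&requested))
            sched_yield();
        atomic_fetch_sub(&parked, 1);
    }
}

/* Initiator: ask for a rendezvous and wait until all others are parked. */
static void rendezvous_begin(int nothers)
{
    atomic_store(&requested, 1);
    while (atomic_load(&parked) != nothers)
        sched_yield();
}

/* Initiator: release the others and wait until they have all left. */
static void rendezvous_end(void)
{
    atomic_store(&requested, 0);
    while (atomic_load(&parked) != 0)
        sched_yield();
}

/* A simulated cpu: keeps setting the M bit, like a TLB writeback. */
static void *cpu_main(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_cpus)) {
        atomic_fetch_or(&pte_word, PTE_M);
        rendezvous_poll();
    }
    return NULL;
}

/* Runs `rounds` protected updates; returns how many saw interference. */
static int run_demo(int rounds)
{
    pthread_t tid[NCPUS - 1];
    int created = 0, races = 0;

    atomic_store(&stop_cpus, 0);
    while (created < NCPUS - 1 &&
           pthread_create(&tid[created], NULL, cpu_main, NULL) == 0)
        created++;

    for (int r = 0; created == NCPUS - 1 && r < rounds; r++) {
        rendezvous_begin(NCPUS - 1);
        unsigned before = atomic_load(&pte_word);
        sched_yield();                         /* widen the window */
        if (atomic_load(&pte_word) != before)
            races++;                           /* someone wrote while parked */
        atomic_store(&pte_word, before & ~PTE_M);
        if (atomic_load(&pte_word) & PTE_M)
            races++;                           /* M bit reappeared too early */
        rendezvous_end();
    }

    atomic_store(&stop_cpus, 1);
    for (int i = 0; i < created; i++)
        pthread_join(tid[i], NULL);
    return created == NCPUS - 1 ? races : -1;
}
```

While the other threads are parked, the initiator can do a plain load, modify, store on the pte without a writeback landing in between. Waiting for the parked count to drain in `rendezvous_end` keeps a late leaver from being miscounted if a second rendezvous starts immediately.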
I would appreciate it if those of you with SMP systems could test this
patch. I have done some preliminary testing on a Dell-2550 and it was
able to successfully buildworld twice, so I believe the patch is
reasonably stable. But it's a lot of work and a big patch and needs
some third party testing before I feel I can commit it.