ICU/APIC cleanup part 1/many. Move ICU and APIC support files into their own subdirectory, bump the required config version for the build since this move also requires the use of the new arch/ symlink.
Major cleanup of the interrupt registration subsystem.
* Collapse the separate registrations in the kernel interrupt thread and i386 layers into a single machine-independent kernel interrupt thread layer in kern/kern_intr.c. Get rid of the i386 layer's 'MUX' code entirely.
* Have the interrupt vector assembly code (icu_vector.s and apic_vector.s) call a machine-independent function in the kernel interrupt thread layer to figure out how to process an interrupt.
* Move a lot of assembly into the new C interrupt processing function.
* Add support for INTR_MPSAFE. If a device driver registers an interrupt as being MPSAFE, the Big Giant Lock will not be obtained or required.
* Temporarily just schedule the ithread if a FAST interrupt cannot be executed due to its serializer being locked.
* Add LWKT serialization support for a non-blocking 'try' function.
* Get rid of ointhand2_t and adjust all old ISA code to use inthand2_t.
* Supply a frame pointer as a pointer rather than embedding it on the stack.
* Allow FAST and SLOW interrupts to be mixed on the same IRQ, though this will not necessarily result in optimal operation.
* Remove direct APIC/ICU vector calls from the apic/icu vector assembly code. Everything goes through the new routine in kern/kern_intr.c now.
* Add a new flag, INTR_NOPOLL. Interrupts registered with the flag will not be polled by the upcoming emergency general interrupt polling sysctl (e.g. ATA cannot be safely polled due to the way ATA register access interferes with ATA DMA).
* Remove most of the distinction in the i386 assembly layers between FAST and SLOW interrupts (part 1/2).
* Revamp the interrupt name array returned to userland to list multiple drivers associated with the same IRQ.
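The FAST-interrupt fallback described above (try the serializer without blocking, and hand the work to the interrupt thread if it is busy) can be sketched in userland C. This is only an illustration: `struct serializer`, `serializer_try()`, `serializer_exit()`, `dispatch_fast()` and the `scheduled` counter are hypothetical stand-ins, not the names or the code used in kern/kern_intr.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an LWKT serializer: one atomic flag. */
struct serializer {
    atomic_flag locked;
};

/* Non-blocking 'try': returns true only if the serializer was acquired. */
static bool serializer_try(struct serializer *s)
{
    return !atomic_flag_test_and_set_explicit(&s->locked, memory_order_acquire);
}

static void serializer_exit(struct serializer *s)
{
    atomic_flag_clear_explicit(&s->locked, memory_order_release);
}

struct intr_handler {
    struct serializer *ser;   /* optional serializer, may be NULL */
    void (*func)(void *);     /* driver handler */
    void *arg;
    int scheduled;            /* stand-in for "ithread was scheduled" */
};

enum dispatch_result { RAN_FAST, DEFERRED_TO_ITHREAD };

/* Run the handler directly if possible, otherwise defer to the ithread. */
static enum dispatch_result dispatch_fast(struct intr_handler *h)
{
    assert(h->func != NULL);
    if (h->ser != NULL && !serializer_try(h->ser)) {
        h->scheduled++;       /* a real kernel would schedule the ithread */
        return DEFERRED_TO_ITHREAD;
    }
    h->func(h->arg);
    if (h->ser != NULL)
        serializer_exit(h->ser);
    return RAN_FAST;
}

/* Example handler used for illustration: counts invocations. */
static void count_handler(void *arg)
{
    (*(int *)arg)++;
}
```

The point of the 'try' variant is that the interrupt path never spins or sleeps on the serializer; the cost of contention is one deferred run in thread context.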
Fix isa_wrongintr. The APIC vector was being directly assigned to a C function, resulting in register corruption and other nasties. Generate a set of assembly functions to handle wrong-interrupt assignments. For now we just EOI the apic and iret. This bug could cause an SMP box to crash on boot if a spurious IRQ #0 interrupt occurs while it is testing for 8254 interrupt delivery.
Get rid of smp_rendezvous() and all associated support circuitry. Move the two mechanisms still using it (User LDT and MTRR propagation) over to the lwkt_cpusync*() API.
When a cpu is stopped due to a panic or the debugger, it can be in virtually any state, including possibly holding a critical section. IPIQ interrupts must still be processed while we are in this state (even though we could be racing IPIQ processing if we were interrupted at just the wrong time). In particular, dumping is not likely to work if a panic occurs on a cpu != 0 unless we process the IPIQ on the stopped cpus. There are simply too many interactions between cpus. Interrupt threads are LWKT scheduled entities and will generally still not work during a panic while dumping. The dumping code expects this. However, call splz() anyway. We may in the future have to allow certain threads to run while dumping. For example, to allow dumping over the network. There are various ways this can be done, such as by masking gd_runqmask or flagging special threads to be runnable while in a panicked or dumping state.
Remove all remaining SPL code. Replace the mtd_cpl field in the machine-dependent thread structure and the CPL field in the interrupt stack frame with dummies (so structural sizes do not change, yet). Remove all interrupt handler SPL mask and mask pointer code. Remove all spl*() functions except for splz(). Note that doreti uses a temporary CPL mask internally to accumulate a bitmap of FAST interrupts which could not be executed due to not being able to get the BGL. This mask has no outside visibility. Note that gd_fpending and gd_ipending still exist to support critical section interrupt deferment.
Synchronize a bunch of things from FreeBSD-5 in preparation for the new ACPICA driver support.
* Bring in a lot of new bus and pci DEV_METHODs from FreeBSD-5
* split apic.h into apicreg.h and apicio.h
* rename INTR_TYPE_FAST -> INTR_FAST and move the #define
* rename INTR_TYPE_EXCL -> INTR_EXCL and move the #define
* rename some PCIR_ registers and add additional macros from FreeBSD-5
* note: new pcib bus call, host_pcib_get_busno() imported.
* kern/subr_power.c no longer optional.
Other changes:
* machine/smp.h machine smp/smptests.h can now be #included unconditionally, and some APIC_IO vs SMP separation has been done as well.
* gd_acpi_id and gd_apic_id added to machine/globaldata.h prep for new ACPI code.
Despite all the changes, the generated code should be virtually the same. These were mostly additions which the pre-existing code does not (yet) use.
Introduce an MI cpu synchronization API, redo the SMP AP startup code,
and start cleaning up deprecated IPI and clock code. Add a MMU/TLB page
table invalidation API (pmap_inval.c) which properly synchronizes page
table changes with other cpus in SMP environments.
* removed (unused) gd_cpu_lockid
* remove confusing invltlb() and friends, normalize use of cpu_invltlb()
and smp_invltlb().
* redo the SMP AP startup code to make the system work better in
situations where all APs do not startup.
* add memory barrier API, cpu_mb1() and cpu_mb2().
* remove (obsolete, no longer used) old IPI hard and stat clock forwarding
code.
* add a cpu synchronization API which is capable of handling multiple
simultaneous requests without deadlocking or livelocking.
* major changes to the PMAP code to use the new invalidation API.
* remove (unused) all_procs_ipi() and self_ipi().
* only use all_but_self_ipi() if it is known that all AP's started up,
otherwise use a mask.
* remove (obsolete, no longer used) BETTER_CLOCK code
* remove (obsolete, no longer used) Xcpucheckstate IPI code
Testing-by: David Rhodus and others
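The all_but_self_ipi() rule in the list above (use the broadcast form only when every AP is known to have started, otherwise address an explicit mask) can be sketched as follows. `cpumask_t` as a 32-bit mask, `struct ipi_target` and `choose_ipi_target()` are assumptions made for this illustration, not the kernel's actual types or functions.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t cpumask_t;      /* assumption: at most 32 cpus */

struct ipi_target {
    int use_broadcast;           /* 1: the all-but-self shorthand is safe */
    cpumask_t mask;              /* explicit targets, never includes self */
};

/*
 * started_mask has one bit set per cpu that completed startup
 * (including the cpu doing the sending).
 */
static struct ipi_target
choose_ipi_target(int self, int ncpus, cpumask_t started_mask)
{
    struct ipi_target t;
    cpumask_t all;

    assert(ncpus >= 1 && ncpus <= 32);
    assert(self >= 0 && self < ncpus);

    all = (ncpus == 32) ? 0xffffffffu : ((1u << ncpus) - 1u);
    t.use_broadcast = ((started_mask & all) == all);
    t.mask = started_mask & all & ~(1u << self);
    return t;
}
```

With four cpus and a started mask of 0x7 (cpu 3 never came up), the sketch reports that the broadcast form must not be used and returns mask 0x6 for a sender on cpu 0.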
Change lwkt_send_ipiq() and lwkt_wait_ipiq() to take a globaldata_t instead of a cpuid. This is part of an ongoing cleanup to use globaldata_t's to reference other cpus rather than their cpu numbers, reducing the number of serialized memory indirections required in a number of code paths and making more context available to the target code.
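A small sketch of why passing a globaldata_t saves an indirection: with a cpuid the callee must first look the per-cpu structure up, while with a pointer it can use it directly. The table `cpu_table`, the helper `lookup_gd()`, the `ipiq_sent` field and both `send_ipiq_*` functions are hypothetical and exist only for this illustration; they are not the lwkt_send_ipiq() implementation.

```c
#include <assert.h>
#include <stddef.h>

#define NCPUS 4                  /* hypothetical cpu count */

struct globaldata {
    int gd_cpuid;
    int ipiq_sent;               /* hypothetical counter for the sketch */
};
typedef struct globaldata *globaldata_t;

static struct globaldata cpu_table[NCPUS];

/* Extra step needed whenever only a cpu number is available. */
static globaldata_t lookup_gd(int cpuid)
{
    assert(cpuid >= 0 && cpuid < NCPUS);
    return &cpu_table[cpuid];
}

/* Old style: cpuid in, table lookup on every call. */
static int send_ipiq_by_cpuid(int cpuid)
{
    globaldata_t gd = lookup_gd(cpuid);
    return ++gd->ipiq_sent;
}

/* New style: the caller already holds the target's globaldata_t. */
static int send_ipiq_by_gd(globaldata_t gd)
{
    return ++gd->ipiq_sent;
}
```

A caller that sends several messages to the same cpu can look the pointer up once and reuse it, and the callee gets the whole per-cpu context rather than just a number.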
This commit represents a major revamping of the clock interrupt and timebase
infrastructure in DragonFly.
* Rip out the existing 8254 timer 0 code, and also disable the use of
Timer 2 (which means that the PC speaker will no longer go beep). Timer 0
used to represent a periodic interrupt and a great deal of code was in
place to attempt to obtain a timebase off of that periodic interrupt.
Timer 0 is now used in software retriggerable one-shot mode to produce
variable-delay interrupts. A new hardware interrupt clock abstraction
called SYSTIMERS has been introduced which allows threads to register
periodic or one-shot interrupt/IPI callbacks at approximately 1uS
granularity.
Timer 2 is now set in continuous periodic mode with a period of 65536
and provides the timebase for the system, abstracted to 32 bits.
All the old platform-integrated hardclock() and statclock() code has
been rewritten. The old IPI forwarding code has been #if 0'd out and
will soon be entirely removed (the systimer abstraction takes care of
multi-cpu registrations now). The architecture-specific clkintr() now
simply calls an entry point into the systimer and provides a Timer 0
reload and Timer 2 timebase function API.
* On both UP and SMP systems, cpus register systimer interrupts for the Hz
interrupt, the stat interrupt, and the scheduler round-robin interrupt.
The abstraction is carefully designed to allow multiple interrupts occurring
at the same time to be processed in a single hardware interrupt. While
we currently use IPI's to distribute requested interrupts from other cpu's,
the intent is to use the abstraction to take advantage of per-cpu timers
when available (e.g. on the LAPIC) in the future.
systimer interrupts run OUTSIDE THE MP LOCK. Entry points may be called
from the hard interrupt or via an IPI message (IPI messages have always
run outside the MP lock).
* Rip out timecounters and disable alternative timecounter code for other
time sources. This is temporary. Eventually other time sources, such as
the TSC, will be reintegrated as independent, parallel-running entities.
There will be no 'time switching' per se, subsystems will be able to
select which timebase they wish to use. It is desirable to reintegrate
at least the TSC to improve [get]{micro,nano}[up]time() performance.
WARNING: PPS events may not work properly. They were not removed, but
they have not been retested with the new code either.
* Remove spl protection around [get]{micro,nano}[up]time() calls, they are
now internally protected.
* Use uptime instead of realtime in certain CAM timeout tests
* Remove struct clockframe. Use struct intrframe everywhere where clockframe
used to be used.
* Replace most splstatclock() protections with crit_*() protections, because
such protections must now also protect against IPI messaging interrupts.
* Add fields to the per-cpu globaldata structure to access timebase related
information using only a critical section rather than a mutex. However,
the 8254 Timer 2 access code still uses spin locks. More work needs to
be done here, the 'realtime' correction is still done in a single global
'struct timespec basetime' structure.
* Remove the CLKINTR_PENDING icu and apic interrupt hacks.
* Augment the IPI Messaging code to make an intrframe available to callbacks.
* Document 8254 timing modes in i386/isa/timerreg.h. Note that at the
moment we assume an 8254 instead of an 8253 as we are using TIMER_SWSTROBE
mode. This may or may not have to be changed to an 8253 mode.
* Integrate the NTP correction code into the new timebase subsystem.
* Separate boottime from basetime. Once boottime is believed to be stable
it is no longer affected by NTP or other time corrections.
CAVEATS:
* PC speaker no longer works
* Profiling interrupt rate not increased (it needs work to be
made operational on a per-cpu basis rather than system-wide).
* The native timebase API is function-based, but currently hardwired.
* There might or might not be issues with 486 systems due to the
timer mode I am using.
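The "Timer 2 ... period of 65536 ... abstracted to 32 bits" idea above can be sketched as extending a free-running 16-bit counter into a wider count by accumulating deltas. This is an illustration under stated assumptions: the counter is treated as a 16-bit down-counter (as the 8254 counts down), it must be sampled at least once per 65536 ticks, and `struct timebase` and `timebase_read()` are hypothetical names, not the code in the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct timebase {
    uint16_t last_count;   /* previous sample, converted to up-counting */
    uint32_t base;         /* accumulated 32-bit tick count */
};

/*
 * Fold one raw 16-bit down-counter sample into the 32-bit timebase.
 * The unsigned 16-bit subtraction handles wraparound as long as fewer
 * than 65536 ticks elapsed since the previous sample.
 */
static uint32_t timebase_read(struct timebase *tb, uint16_t raw)
{
    uint16_t count;
    uint16_t delta;

    assert(tb != NULL);
    count = (uint16_t)(0u - raw);                 /* down-count -> up-count */
    delta = (uint16_t)(count - tb->last_count);   /* ticks since last sample */
    tb->base += delta;
    tb->last_count = count;
    return tb->base;
}
```

The sketch takes the raw sample as a parameter so it can be exercised without hardware; in a kernel the sample would come from latching and reading the timer, with whatever locking that access requires.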
Fix a number of mp_lock issues. I had outsmarted myself trying to deal with td->td_mpcount / mp_lock races. The new rule is: you first modify td->td_mpcount, then you deal with mp_lock assuming that an interrupt might have already dealt with it for you, and various other pieces of code deal with the race if an interrupt occurs in the middle of the above two data accesses.
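The ordering rule above (bump td->td_mpcount first, then deal with mp_lock, allowing for an interrupt having already acquired it on the thread's behalf) can be sketched with C11 atomics. Everything here is an assumption made for illustration: mp_lock is modelled as holding the owning cpu id or -1 when free, and `try_get_mplock_sketch()` and `rel_mplock_sketch()` are hypothetical functions, not the kernel's get_mplock()/rel_mplock().

```c
#include <assert.h>
#include <stdatomic.h>

#define MP_FREE (-1)

/* Assumption: the lock word holds the owning cpu id, or MP_FREE. */
static atomic_int mp_lock = MP_FREE;

struct thread {
    int td_mpcount;        /* how many times this thread wants the lock */
};

/* Returns 1 if the lock is held for this thread on return, else 0. */
static int try_get_mplock_sketch(struct thread *td, int cpuid)
{
    int expected = MP_FREE;

    ++td->td_mpcount;                       /* step 1: the count first */
    if (atomic_load(&mp_lock) == cpuid)     /* step 2: maybe already ours */
        return 1;
    if (atomic_compare_exchange_strong(&mp_lock, &expected, cpuid))
        return 1;
    --td->td_mpcount;                       /* could not get it: undo */
    return 0;
}

static void rel_mplock_sketch(struct thread *td, int cpuid)
{
    int expected = cpuid;

    assert(td->td_mpcount > 0);
    if (--td->td_mpcount == 0)              /* count first, then the lock */
        atomic_compare_exchange_strong(&mp_lock, &expected, MP_FREE);
}
```

Because the count is published before the lock word is touched, code that interrupts between the two steps can see a non-zero count and take the lock itself; the check for "already ours" then absorbs that case instead of treating it as an error.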
Add the NO_KMEM_MAP kernel configuration option. This is a temporary option that will allow developers to test kmem_map removal and also the upcoming (not this commit) slab allocator. Currently this option removes kmem_map and causes the malloc and zalloc subsystems to use kernel_map exclusively. Change gd_intr_nesting_level. This variable is now only bumped while we are in a FAST interrupt or processing an IPIQ message. This variable is not bumped while we are in a normal interrupt or software interrupt thread. Add warning printf()s if malloc() and related functions detect attempts to use them from within a FAST interrupt or IPIQ. Remove references to the no-longer-used zalloci() and zfreei() functions.
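The gd_intr_nesting_level check described above can be sketched in userland: the level is raised only around FAST interrupt (or IPIQ) processing, and the allocator warns when it is called with the level non-zero. `checked_malloc()`, `run_fast_interrupt()`, `bad_handler()` and `warn_count` are hypothetical names for this illustration; the kernel's actual warning text and call sites are not reproduced here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct globaldata {
    int gd_intr_nesting_level;   /* > 0 only inside FAST interrupt / IPIQ */
};

static int warn_count;           /* hypothetical: counts emitted warnings */

/* Allocator wrapper that complains when used from the wrong context. */
static void *checked_malloc(struct globaldata *gd, size_t size)
{
    if (gd->gd_intr_nesting_level > 0) {
        warn_count++;
        fprintf(stderr, "warning: malloc() from FAST interrupt or IPIQ\n");
    }
    return malloc(size);
}

/* Bump the nesting level only for the duration of the FAST handler. */
static void run_fast_interrupt(struct globaldata *gd,
                               void (*handler)(struct globaldata *))
{
    ++gd->gd_intr_nesting_level;
    handler(gd);
    --gd->gd_intr_nesting_level;
    assert(gd->gd_intr_nesting_level >= 0);
}

/* Example of a misbehaving handler that allocates memory. */
static void bad_handler(struct globaldata *gd)
{
    free(checked_malloc(gd, 16));
}
```

Calling `checked_malloc()` from ordinary thread context leaves `warn_count` untouched, while running `bad_handler` through `run_fast_interrupt()` trips the warning once.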
Collapse gd_astpending and gd_reqpri together into gd_reqflags. gd_reqflags now rolls up requests made pending for doreti. Clean up a number of scheduling primitives and note that we do not need to use locked bus cycles on per-cpu variables. Note that the awful idelayed hack for certain softints (used only by the TTY subsystem, BTW) gets slightly broken in this commit because idelayed has become per-cpu and the clock ints aren't yet distributed.
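A sketch of rolling several pending-request indicators into one per-cpu flags word, so that the return-from-interrupt path can test a single field. The `REQ_*` bit names and the helper functions are hypothetical and chosen only for this example; plain (non-atomic) updates are used because the word is assumed to be touched only by its own cpu.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical request bits, for illustration only. */
#define REQ_AST_PENDING   0x01u
#define REQ_INTR_PENDING  0x02u
#define REQ_IPIQ_PENDING  0x04u

struct globaldata {
    uint32_t gd_reqflags;     /* all pending requests for this cpu */
};

/* Per-cpu variable: an ordinary read-modify-write is sufficient here. */
static void request_set(struct globaldata *gd, uint32_t flag)
{
    assert(flag != 0);
    gd->gd_reqflags |= flag;
}

/* One test tells the exit path whether there is anything to do. */
static int request_any(const struct globaldata *gd)
{
    return gd->gd_reqflags != 0;
}

/* Clear one request and report whether it had been pending. */
static int request_take(struct globaldata *gd, uint32_t flag)
{
    int was_set = (gd->gd_reqflags & flag) != 0;

    gd->gd_reqflags &= ~flag;
    return was_set;
}
```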
Forward FAST interrupts to the MP lock holder + minor fixes.
MP Implementation 3B/4: Remove Xcpuast and Xforward_irq, replacing them with IPI messaging functions. Fix user scheduling issues so user processes are dependably scheduled on available cpus.
MP Implementation 2/4: Implement a poor-man's IPI messaging subsystem, get both cpus arbitrating the BGL for interrupts, IPIing foreign cpu LWKT scheduling requests without crashing, and dealing with the cpl. The APs are in a slightly less degenerate state now, but hardclock and statclock distribution is broken, only one user process is being scheduled at a time, and priorities are all messed up.
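One simple way to picture an IPI messaging subsystem is a small per-target ring of (function, argument) pairs that the sender fills and the target drains when the IPI arrives. The sketch below is single-threaded and omits the memory barriers and the actual IPI that a real cross-cpu queue needs; `struct ipiq`, `ipiq_send()`, `ipiq_process()` and `add_handler()` are hypothetical names, not the code from this commit.

```c
#include <assert.h>
#include <stddef.h>

#define IPIQ_SIZE 8u                 /* must be a power of two */

typedef void (*ipifunc_t)(void *arg);

struct ipiq {
    unsigned windex;                 /* advanced by the sender */
    unsigned rindex;                 /* advanced by the target cpu */
    struct {
        ipifunc_t func;
        void *arg;
    } slots[IPIQ_SIZE];
};

/* Queue a message; returns 0 on success, -1 if the ring is full. */
static int ipiq_send(struct ipiq *q, ipifunc_t func, void *arg)
{
    assert(func != NULL);
    if (q->windex - q->rindex == IPIQ_SIZE)
        return -1;
    q->slots[q->windex & (IPIQ_SIZE - 1u)].func = func;
    q->slots[q->windex & (IPIQ_SIZE - 1u)].arg = arg;
    q->windex++;                     /* a real kernel would now send the IPI */
    return 0;
}

/* Drain the ring on the target; returns the number of messages run. */
static int ipiq_process(struct ipiq *q)
{
    int n = 0;

    while (q->rindex != q->windex) {
        ipifunc_t func = q->slots[q->rindex & (IPIQ_SIZE - 1u)].func;
        void *arg = q->slots[q->rindex & (IPIQ_SIZE - 1u)].arg;

        q->rindex++;                 /* free the slot before running it */
        func(arg);
        n++;
    }
    return n;
}

/* Example message used for illustration: adds 10 to an int. */
static void add_handler(void *arg)
{
    *(int *)arg += 10;
}
```

The free-running indices make the full and empty tests a single subtraction or comparison, which is why the ring size is kept a power of two.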
MP Implementation 1/2: Get the APIC code working again, sweetly integrate the MP lock into the LWKT scheduler, replace the old simplelock code with tokens or spin locks as appropriate. In particular, the vnode interlock (and most other interlocks) are now tokens. Also clean up a few curproc/cred sequences that are no longer needed. The APs are left in degenerate state with non IPI interrupts disabled as additional LWKT work must be done before we can really make use of them, and FAST interrupts are not managed by the MP lock yet. The main thing for this stage was to get the system working with an APIC again. buildworld tested on UP and 2xCPU/MP (Dell 2550)
Remove pre-ELF underscore prefix and asnames macro hacks.
threaded interrupts 1: Rewrite the ICU interrupt code, splz, and doreti code. The APIC code hasn't been done yet. Consolidate many interrupt thread related functions into MI code, especially software interrupts. All normal interrupts and software interrupts are now threaded, and I'm almost ready to deal with interrupt-thread-only preemption. At the moment I run interrupt threads in a critical section and probably will continue to do so until I can make them MP safe.
Finish migrating the cpl into the thread structure.
thread stage 8: add crit_enter(), per-thread cpl handling, fix deferred interrupt handling for critical sections, add some basic passive token code, and blocking/signaling code. Add structural definitions for additional LWKT mechanisms. Remove asleep/await. Add generation number based xsleep/xwakeup. Note that when exiting the last crit_exit() we run splz() to catch up on blocked interrupts. There is also some #if 0'd code that will cause a thread switch to occur 'at odd times'... primarily wakeup()-> lwkt_schedule()->critical_section->switch. This will be useful for testing purposes down the line. The passive token code is mostly disabled at the moment. Its primary use will be under SMP and its primary advantage is very low overhead on UP and, if used properly, should also have good characteristics under SMP.
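The critical section behaviour noted above (interrupts arriving inside a critical section are deferred, and leaving the outermost section catches up on them) can be sketched with a nesting counter and a pending bitmask. The structure, the `*_sketch` function names and the `delivered` field are hypothetical; `run_pending()` merely stands in for what splz() does in the kernel.

```c
#include <assert.h>
#include <stdint.h>

struct thread_state {
    int crit_count;        /* critical section nesting depth */
    uint32_t pending;      /* interrupts deferred while in a section */
    uint32_t delivered;    /* interrupts that have actually run */
};

/* Stand-in for splz(): run everything that was deferred. */
static void run_pending(struct thread_state *ts)
{
    ts->delivered |= ts->pending;
    ts->pending = 0;
}

static void crit_enter_sketch(struct thread_state *ts)
{
    ts->crit_count++;
}

/* Only the exit of the outermost section catches up on interrupts. */
static void crit_exit_sketch(struct thread_state *ts)
{
    assert(ts->crit_count > 0);
    if (--ts->crit_count == 0 && ts->pending != 0)
        run_pending(ts);
}

/* Deliver immediately outside a section, otherwise mark as pending. */
static void interrupt_arrives(struct thread_state *ts, int irq)
{
    assert(irq >= 0 && irq < 32);
    if (ts->crit_count > 0)
        ts->pending |= (uint32_t)1 << irq;
    else
        ts->delivered |= (uint32_t)1 << irq;
}
```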
thread stage 1: convert curproc to curthread, embed struct thread in proc.
Add the DragonFly cvs id and perform general cleanups on cvs/rcs/sccs ids. Most ids have been removed from !lint sections and moved into comment sections.
import from FreeBSD RELENG_4 1.47.2.5