Up to [DragonFly] / src / sys / i386 / i386
Keyword substitution: kv
Default branch: MAIN
Reorganize the way machine architectures are handled. Consolidate the kernel configurations into a single generic directory. Move machine-specific Makefiles and loader scripts into the appropriate architecture directory. Kernel and module builds also generally add sys/arch to the include path so source files that include architecture-specific headers do not have to be adjusted. The moves are: sys/<ARCH> -> sys/arch/<ARCH>; sys/conf/*.<ARCH> -> sys/arch/<ARCH>/conf/*.<ARCH>; sys/<ARCH>/conf/<KERNEL> -> sys/config/<KERNEL>
Allow 'options SMP' *WITHOUT* 'options APIC_IO'. That is, an ability to produce an SMP-capable kernel that uses the PIC/ICU instead of the IO APICs for interrupt routing. SMP boxes with broken BIOSes (namely my Shuttle XPC SN95G5) could very well have serious interrupt routing problems when operating in IO APIC mode. One solution is to not use the IO APICs. That is, to run only the Local APICs for the SMP management. * Don't conditionalize NIDT. Just set it to 256 * Make the ICU interrupt code MP SAFE. This primarily means using the imen_spinlock to protect accesses to icu_imen. * When running SMP without APIC_IO, set the LAPIC TPR to prevent unintentional interrupts. Leave LINT0 enabled (normally with APIC_IO LINT0 is disabled when the IO APICs are activated). LINT0 is the virtual wire between the 8259 and LAPIC 0. * Get rid of NRSVIDT. Just use IDT_OFFSET instead. * Clean up all the APIC_IO tests which should have been SMP tests, and all the SMP tests which should have been APIC_IO tests. Explicitly #ifdef out all code related to the IO APICs when APIC_IO is not set.
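A minimal sketch of the locking pattern the entry above describes, where every read-modify-write of the interrupt-enable mask happens under a spinlock. The names `imen_lock`, `icu_imen_sketch`, `irq_mask_sketch`, and `irq_unmask_sketch` are hypothetical stand-ins written with C11 atomics; they are not the kernel's actual `imen_spinlock` or `icu_imen` code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for imen_spinlock and icu_imen (assumptions). */
static atomic_flag imen_lock = ATOMIC_FLAG_INIT;
static uint32_t icu_imen_sketch = 0xffffffffu;  /* bit set = IRQ masked */

static void imen_acquire(void)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&imen_lock, memory_order_acquire))
        ;
}

static void imen_release(void)
{
    atomic_flag_clear_explicit(&imen_lock, memory_order_release);
}

/* Unmask one IRQ and return the mask that would be written to the PIC. */
static uint32_t irq_unmask_sketch(int irq)
{
    assert(irq >= 0 && irq < 32);
    imen_acquire();
    icu_imen_sketch &= ~(UINT32_C(1) << irq);
    uint32_t mask = icu_imen_sketch;
    imen_release();
    return mask;
}

/* Mask one IRQ again, under the same lock. */
static uint32_t irq_mask_sketch(int irq)
{
    assert(irq >= 0 && irq < 32);
    imen_acquire();
    icu_imen_sketch |= UINT32_C(1) << irq;
    uint32_t mask = icu_imen_sketch;
    imen_release();
    return mask;
}
```

Without the lock, two cpus updating different bits of the same mask word could each write back a stale copy and lose the other's change.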
ICU/APIC cleanup part 7/many. Get rid of most of the dependencies on ICU_LEN, NSWI, and NHWI, by creating a generous system standard maximum for hardware and software interrupts in the MI sys/interrupt.h. The interrupt architecture can then further limit available hardware and software interrupts. For example, i386 uses 32 bit masks and so is limited to 32 hardware interrupts and 32 software interrupts. The name ICU_OFFSET is confusing, rename it to IDT_OFFSET, which is what it really is. Note that this separation is possible due to recent work on the MI interrupt layer. Separate the software interrupt mask from the hardware interrupt mask in the i386 code. Get rid of rndcontrol's 16 irq limit by creating a new ioctl to iterate through interrupt numbers.
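To illustrate why a 32 bit mask caps the architecture at 32 hardware and 32 software interrupts, and what keeping the two masks separate looks like, here is a small sketch. The struct, constants, and function names are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limits: one bit per interrupt in a 32 bit word. */
enum { MAX_HARDINTS_SKETCH = 32, MAX_SOFTINTS_SKETCH = 32 };

struct intr_pending_sketch {
    uint32_t hard;  /* pending hardware interrupts */
    uint32_t soft;  /* pending software interrupts, kept separately */
};

static void intr_set_hard(struct intr_pending_sketch *p, int irq)
{
    assert(irq >= 0 && irq < MAX_HARDINTS_SKETCH);
    p->hard |= UINT32_C(1) << irq;
}

static void intr_set_soft(struct intr_pending_sketch *p, int swi)
{
    assert(swi >= 0 && swi < MAX_SOFTINTS_SKETCH);
    p->soft |= UINT32_C(1) << swi;
}

/* Return the lowest-numbered pending hardware interrupt, or -1 if none. */
static int intr_next_hard(const struct intr_pending_sketch *p)
{
    for (int i = 0; i < MAX_HARDINTS_SKETCH; i++) {
        if (p->hard & (UINT32_C(1) << i))
            return i;
    }
    return -1;
}
```

With separate words, hardware interrupt 5 and software interrupt 5 no longer compete for the same bit position.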
Another major mmx/xmm/FP commit. This is a combination of several patches but since the earlier patches didn't actually fix the crashing and corruption issues we were seeing everything has been rolled into one well tested commit. Make the FP more deterministic by requiring that npxthread and the FP state be properly synchronized, and that the FP be in a 'safe' state (meaning that mmx/xmm registers be useable) when npxthread is NULL. Allow the FP save area to be revectored. Kernel entities which use the FP unit, such as the bcopy code, must save the app state if it hasn't already been saved, then revector the save area. Note that combinations of operations must be protected by a critical section or interrupt disablement. Any clearing or setting of npxthread combined with an fxsave/fnsave/frstor/fxrstor/fninit must be protected as an atomic entity. Since interrupts are threads and can preempt, such preemption will cause a thread switch to occur and thus cause npxthread and the FP state to be manipulated. The kernel can only depend on the FP state being stable for its use after it has revectored the FP save area. This commit fixes a number of issues, including potential filesystem corruption and kernel crashes.
Rewrite the optimized memcpy/bcopy/bzero support subsystem. Rip out the old FreeBSD code almost entirely. * Add support for stacked ONFAULT routines, allowing copyin and copyout to call the general memcpy entry point instead of rolling their own. * Split memcpy/bcopy and bzero into their own files * Add support for XMM (128 bit) and MMX (64 bit) media instruction copies * Rewrite the integer code. Also note that most of the previous integer and FP special case support had been ripped out of DragonFly long ago in that the assembly was no longer being referenced. It doesn't make sense to have a dozen different zeroing/copying routines so focus on the ones that work well with recent (last ~5 years) cpus. * Rewrite the FP state handling code. Instead of restoring the FP state let it hang, which allows userland to make multiple syscalls and/or for the system to make multiple bcopy()/memcpy() calls without having to save/restore the FP state on each call. Userland will take a fault when it needs the FP again. Note that FP optimized copies only occur for block sizes >= 2048 bytes, so this is not something that userland, or the kernel, will trip up on every time it tries to do a bcopy(). * LWKT threads need to be able to save the FP state, add the simple conditional and 5 lines of assembly required to do that. AMD Athlon notes: 64 bit media instructions will get us 90% of the way there. It is possible to squeeze out slightly more memory bandwidth from the 128 bit XMM instructions (SSE2). While it does not exist in this commit there are two additional features that can be used: prefetching and non-temporal writes. Prefetching is a 3DNow! instruction and can squeeze out significant additional performance if you fetch ~128 bytes ahead of the game, but I believe it is AMD-only.
Non-temporal writes can double UNCACHED memory bandwidth, but they have a horrible effect on L1/L2 performance and you can't mix non-temporal writes with normal writes without completely destroying memory performance (e.g. multiple GB/s -> less than 100 MBytes/sec). Neither prefetching nor non-temporal writes are implemented in this commit.
Introduce an MI cpu synchronization API, redo the SMP AP startup code, and start cleaning up deprecated IPI and clock code. Add a MMU/TLB page table invalidation API (pmap_inval.c) which properly synchronizes page table changes with other cpus in SMP environments. * removed (unused) gd_cpu_lockid * remove confusing invltlb() and friends, normalize use of cpu_invltlb() and smp_invltlb(). * redo the SMP AP startup code to make the system work better in situations where all APs do not start up. * add memory barrier API, cpu_mb1() and cpu_mb2(). * remove (obsolete, no longer used) old IPI hard and stat clock forwarding code. * add a cpu synchronization API which is capable of handling multiple simultaneous requests without deadlocking or livelocking. * major changes to the PMAP code to use the new invalidation API. * remove (unused) all_procs_ipi() and self_ipi(). * only use all_but_self_ipi() if it is known that all APs started up, otherwise use a mask. * remove (obsolete, no longer used) BETTER_CLOCK code * remove (obsolete, no longer used) Xcpucheckstate IPI code Testing-by: David Rhodus and others
USER_LDT is now required by a number of packages as well as our upcoming user threads support. Make it non-optional. USER_LDT breaks SysV emulated sysarch(... SVR4_SYSARCH_DSCR) support. For now just #if 0 out the support (which is what FreeBSD-5.x does). Submitted-by: Craig Dooley <firstname.lastname@example.org>
Fix typos in comments.
Collapse gd_astpending and gd_reqpri together into gd_reqflags. gd_reqflags now rolls up requests made pending for doreti. Clean up a number of scheduling primitives and note that we do not need to use locked bus cycles on per-cpu variables. Note that the awful idelayed hack for certain softints (used only by the TTY subsystem, BTW) gets slightly broken in this commit because idelayed has become per-cpu and the clock ints aren't yet distributed.
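A rough sketch of the idea of rolling several pending-request indicators into one per-cpu flags word that the interrupt return path can test in a single comparison. The flag names and the struct below are invented for illustration and do not reproduce the real gd_reqflags bit layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical request bits (assumptions, not the kernel's definitions). */
#define RQF_AST_SKETCH      0x1u  /* an AST is pending */
#define RQF_INTPEND_SKETCH  0x2u  /* a deferred interrupt is pending */

struct globaldata_sketch {
    uint32_t reqflags;  /* per-cpu: only touched by the owning cpu */
};

/* Plain (unlocked) read-modify-write is enough for a per-cpu word. */
static void req_set(struct globaldata_sketch *gd, uint32_t flag)
{
    gd->reqflags |= flag;
}

static void req_clear(struct globaldata_sketch *gd, uint32_t flag)
{
    gd->reqflags &= ~flag;
}

/* The return-from-interrupt path only needs one test for "any work?". */
static int doreti_has_work(const struct globaldata_sketch *gd)
{
    return gd->reqflags != 0;
}
```

Because the word belongs to a single cpu, the updates do not need a bus-locked instruction, which is the point the entry above makes about per-cpu variables.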
MP Implementation 3/4: MAJOR progress on SMP, full userland MP is now working! A number of issues relating to MP lock operation have been fixed, primarily that we have to read %cr2 before get_mplock() since get_mplock() may switch away. Idlethreads can now safely HLT without any performance detriment. The userland scheduler has been almost completely rewritten and is now using an extremely flexible abstraction with a lot of room to grow. pgeflag has been removed from mapdev (without per-page invalidation it isn't safe to use PG_G even on UP). Necessary locked bus cycles have been added for the pmap->pm_active field in swtch.s. CR3 has been unoptimized for the moment (see comment in swtch.s). Since the switch code runs without the MP lock we have to adjust pm_active PRIOR to loading %cr3. Additional sanity checks have been added to the code (see PARANOID_INVLTLB and ONLY_ONE_USER_CPU in the code), plus many more in kern_switch.c. A passive release mechanism has been implemented to optimize P_CURPROC/lwkt priority shifting when going from user->kernel and kernel->user. Note: preemptive interrupts don't care due to the way preemption works so no additional complexity there. Non-locking atomic functions to protect only against local interrupts have been added. astpending now uses non-locking atomic functions to set and clear bits. private_tss has been moved to a per-cpu variable. The LWKT thread module has been considerably enhanced and cleaned up, including some fixes to handle MPLOCKED vs td_mpcount races (so eventually we can do MP locking without a pushfl/cli/popfl combo). stopevent() needs critical section protection, maybe.
MP Implementation 2/4: Implement a poor-man's IPI messaging subsystem, get both cpus arbitrating the BGL for interrupts, IPIing foreign cpu LWKT scheduling requests without crashing, and dealing with the cpl. The APs are in a slightly less degenerate state now, but hardclock and statclock distribution is broken, only one user process is being scheduled at a time, and priorities are all messed up.
Split the struct vmmeter cnt structure into a global vmstats structure and a per-cpu cnt structure. Adjust the sysctls to accumulate statistics over all cpus.
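The split described above can be sketched as per-cpu counters that each cpu bumps without contention, plus a reader that sums them on demand. The array size, struct, and function names are assumptions for this example; the field names are borrowed only to make it readable.

```c
#include <assert.h>
#include <stdint.h>

enum { NCPU_SKETCH = 4 };  /* hypothetical cpu count */

/* Hypothetical per-cpu event counters. */
struct percpu_cnt_sketch {
    uint64_t v_syscall;
    uint64_t v_intr;
};

static struct percpu_cnt_sketch cpu_cnt[NCPU_SKETCH];

/* Each cpu increments only its own slot, so no locking is needed here. */
static void count_syscall(int cpu)
{
    assert(cpu >= 0 && cpu < NCPU_SKETCH);
    cpu_cnt[cpu].v_syscall++;
}

/* What a sysctl handler would do: accumulate the value over all cpus. */
static uint64_t sum_syscalls(void)
{
    uint64_t total = 0;
    for (int i = 0; i < NCPU_SKETCH; i++)
        total += cpu_cnt[i].v_syscall;
    return total;
}
```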
Remove pre-ELF underscore prefix and asnames macro hacks.
threaded interrupts 1: Rewrite the ICU interrupt code, splz, and doreti code. The APIC code hasn't been done yet. Consolidate many interrupt thread related functions into MI code, especially software interrupts. All normal interrupts and software interrupts are now threaded, and I'm almost ready to deal with interrupt-thread-only preemption. At the moment I run interrupt threads in a critical section and probably will continue to do so until I can make them MP safe.
smp/up collapse stage 2 of 2: clean up the globaldata structure, clean up and separate machine dependent portions of thread, proc, and globaldata, and reduce the need to include lots of MD header files.
smp/up collapse stage 1 of 2: Make UP use the globaldata structure the same way SMP does, and start removing all the bad macros and hacks that existed before.
go back to using gd_cpuid instead of gd_cpu.
proc->thread stage3: make time accounting thread based and rework it for performance. Clean up user/sys/interrupt time accounting. Get rid of the microputime and equivalent support code in mi_switch() (it was really a bad idea to put that in the critical path IMHO). Instead account for time statistically from the statclock, which produces time accounting that is just as accurate in the long haul. Remove the u/s/iticks fields from the proc structure and put a slightly different version in the thread structure, so time can be accounted for both threads and processes.
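As an illustration of statistical accounting, the sketch below charges one tick to whichever mode the thread is in each time a sampling clock fires, then converts ticks to time afterwards. The struct, enum, and the sampling rate passed in by the caller are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum tick_mode_sketch { TICK_USER, TICK_SYS, TICK_INTR };

/* Hypothetical per-thread tick counters. */
struct thread_ticks_sketch {
    uint64_t uticks;  /* samples taken in user mode */
    uint64_t sticks;  /* samples taken in system mode */
    uint64_t iticks;  /* samples taken in interrupt context */
};

/* Called from the sampling clock: charge one tick to the current mode. */
static void statclock_sample(struct thread_ticks_sketch *t,
                             enum tick_mode_sketch mode)
{
    switch (mode) {
    case TICK_USER: t->uticks++; break;
    case TICK_SYS:  t->sticks++; break;
    case TICK_INTR: t->iticks++; break;
    }
}

/* Convert a tick count to microseconds given the sampling rate in Hz. */
static uint64_t ticks_to_usec(uint64_t ticks, unsigned stathz)
{
    assert(stathz != 0);
    return ticks * UINT64_C(1000000) / stathz;
}
```

The context switch path does no timestamping at all in this scheme; accuracy comes from averaging over many samples.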
thread stage 8: add crit_enter(), per-thread cpl handling, fix deferred interrupt handling for critical sections, add some basic passive token code, and blocking/signaling code. Add structural definitions for additional LWKT mechanisms. Remove asleep/await. Add generation number based xsleep/xwakeup. Note that when exiting the last crit_exit() we run splz() to catch up on blocked interrupts. There is also some #if 0'd code that will cause a thread switch to occur 'at odd times'... primarily wakeup()-> lwkt_schedule()->critical_section->switch. This will be useful for testing purposes down the line. The passive token code is mostly disabled at the moment. Its primary use will be under SMP and its primary advantage is very low overhead on UP and, if used properly, should also have good characteristics under SMP.
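The deferred-interrupt behaviour noted above can be modelled as a nesting counter plus a pending mask: interrupts that arrive inside a critical section are only recorded, and the outermost exit catches up on them. All names here carry a `_sketch` suffix because they are simplified stand-ins, not the kernel's crit_enter(), crit_exit(), or splz().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread state. */
struct crit_thread_sketch {
    int      critcount;  /* critical section nesting depth */
    uint32_t ipending;   /* interrupts deferred while critcount > 0 */
    int      handled;    /* number of interrupts actually serviced */
};

static void crit_enter_sketch(struct crit_thread_sketch *td)
{
    td->critcount++;
}

/* Service every deferred interrupt, lowest bit first. */
static void splz_sketch(struct crit_thread_sketch *td)
{
    while (td->ipending != 0) {
        td->ipending &= td->ipending - 1;  /* clear lowest set bit */
        td->handled++;
    }
}

static void crit_exit_sketch(struct crit_thread_sketch *td)
{
    assert(td->critcount > 0);
    /* Only the outermost exit catches up on blocked interrupts. */
    if (--td->critcount == 0 && td->ipending != 0)
        splz_sketch(td);
}

/* An interrupt is either serviced at once or recorded for later. */
static void interrupt_arrives_sketch(struct crit_thread_sketch *td, int irq)
{
    assert(irq >= 0 && irq < 32);
    if (td->critcount > 0)
        td->ipending |= UINT32_C(1) << irq;
    else
        td->handled++;
}
```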
thread stage 7: Implement basic LWKTs, use a straight round-robin model for the moment. Also continue consolidating the globaldata structure so both UP and SMP use it with more commonality. Temporarily match user processes up with scheduled LWKTs on a 1:1 basis. Eventually user processes will have LWKTs, but they will not all be scheduled 1:1 with the user process's runnability. With this commit work can potentially start to fan out, but I'm not ready to announce yet.
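A straight round-robin model can be sketched as a circular run queue where the thread just selected is moved to the tail. This is only an illustration under assumed names and a fixed queue size; it says nothing about how the LWKT scheduler is actually structured.

```c
#include <assert.h>

enum { RUNQ_MAX_SKETCH = 8 };  /* hypothetical queue capacity */

/* Hypothetical circular run queue of thread ids. */
struct runq_sketch {
    int ids[RUNQ_MAX_SKETCH];
    int head;   /* index of the next thread to run */
    int count;  /* number of queued threads */
};

/* Append a thread id at the tail; returns -1 if the queue is full. */
static int runq_push(struct runq_sketch *q, int id)
{
    if (q->count == RUNQ_MAX_SKETCH)
        return -1;
    q->ids[(q->head + q->count) % RUNQ_MAX_SKETCH] = id;
    q->count++;
    return 0;
}

/* Pick the head thread and requeue it at the tail; -1 if queue is empty. */
static int runq_next(struct runq_sketch *q)
{
    if (q->count == 0)
        return -1;
    int id = q->ids[q->head];
    q->head = (q->head + 1) % RUNQ_MAX_SKETCH;
    q->count--;
    runq_push(q, id);  /* cannot fail: a slot was just freed */
    return id;
}
```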
thread stage 4: remove curpcb, use td_pcb reference instead. Move the pcb to the end of the thread stack, and note that a pcb will always exist because a thread context will always exist. Also note that vm86 replaces td_pcb temporarily and we really need to rip that out and instead make a copy on the stack, because assumptions are made in regards to the pcb's location.
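One way to picture a pcb placed at the end of the thread stack is to carve it out of the top of the stack allocation and keep a pointer to it in the thread, so a pcb exists whenever a thread does. The layout, struct fields, and the use of malloc() below are assumptions made for a userland sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical saved-context block. */
struct pcb_sketch {
    uintptr_t saved_sp;
    uintptr_t saved_ip;
};

/* Hypothetical thread with its stack and a pointer into that stack. */
struct kthread_sketch {
    char              *kstack;       /* base of the stack allocation */
    size_t             kstack_size;
    struct pcb_sketch *td_pcb;       /* lives in the last bytes of kstack */
};

/* Allocate a stack and place the pcb at its very end. Returns 0 on success. */
static int thread_init_stack(struct kthread_sketch *td, size_t size)
{
    /* Require room for the pcb and a size that keeps it aligned. */
    if (size < sizeof(struct pcb_sketch) ||
        size % _Alignof(struct pcb_sketch) != 0)
        return -1;
    td->kstack = malloc(size);
    if (td->kstack == NULL)
        return -1;
    td->kstack_size = size;
    td->td_pcb = (struct pcb_sketch *)
        (td->kstack + size - sizeof(struct pcb_sketch));
    return 0;
}

static void thread_free_stack(struct kthread_sketch *td)
{
    free(td->kstack);
    td->kstack = NULL;
    td->td_pcb = NULL;
}
```

Code that reaches the context through `td_pcb` then has no need for a separate global current-pcb pointer.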
thread stage 2: convert npxproc to npxthread.
thread stage 1: convert curproc to curthread, embed struct thread in proc.
Add the DragonFly cvs id and perform general cleanups on cvs/rcs/sccs ids. Most ids have been removed from !lint sections and moved into comment sections.
import from FreeBSD RELENG_4 126.96.36.199