DragonFly kernel List (threaded) for 2005-05
The time has come for a kernel interfacing library layer
Ok. The time has come to implement the idea some of us have been
talking about on and off for the last year, and that is to implement a
kernel-interfacing layer between libc and the system (rather than
building the system call stubs into libc).
This layer is going to take over the job of providing the system call
API to the program. It will also allow us to safely use more complex
kernel interfaces, such as a shared-memory userland critical section
for signal interlocks, shared memory access to things like the 'pid',
and so forth, without breaking long-term forwards and backwards
compatibility due to structural changes made in the system.
This will give us the following features:
* The ability to change system structures and system call effects without
having to renumber system calls.
* The ability to use shared memory between userland and the kernel without
breaking forwards and backwards compatibility, i.e. only the layer
itself would access the shared memory, program binaries would not.
Again, an all-userland path.
* The ability to implement 'system calls' that actually run entirely in
userspace (the layer would JMP to code in the layer rather than JMP
to code that calls int 0x80).
* The ability to implement (future) asynchronous-messaged system calls
without userland being aware that the physical ABI into the kernel
* And many other things.
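To make the 'system calls that run entirely in userspace' point concrete, here is a minimal sketch in portable C. It only simulates the idea: the shared page, the trap counter, and every name in it (shared_page, layer_install, prog_getpid, and so on) are hypothetical stand-ins, not part of any existing DragonFly interface. The program always calls through the stub table, and the layer decides whether a given stub enters the kernel or is answered from shared memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page the kernel would share with the layer (simulated). */
struct shared_page {
    int pid;
};

static struct shared_page fake_shared = { 1234 }; /* stands in for a kernel mapping */
static int trap_count;                            /* counts simulated kernel entries */

/* Variant A: pretends to enter the kernel (a real layer would trap here). */
static int layer_getpid_trap(void)
{
    trap_count++;
    return fake_shared.pid;
}

/* Variant B: answered entirely in userspace from the shared page. */
static int layer_getpid_shared(void)
{
    return fake_shared.pid;
}

/* The stub table is the only thing the program binary ever sees. */
enum { STUB_GETPID, STUB_COUNT };
typedef int (*stub_fn)(void);
static stub_fn stub_table[STUB_COUNT];

/* Simulates the layer being mapped in and filling the stub slots. */
void layer_install(int use_shared_memory)
{
    stub_table[STUB_GETPID] =
        use_shared_memory ? layer_getpid_shared : layer_getpid_trap;
}

/* What the program calls; it cannot tell which variant is behind the stub. */
int prog_getpid(void)
{
    assert(stub_table[STUB_GETPID] != NULL);
    return stub_table[STUB_GETPID]();
}

int layer_trap_count(void)
{
    return trap_count;
}
```

Swapping the variant behind the stub changes how the call is serviced without touching the caller, which is the compatibility property the list above is after.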
How will this work? The concept is simple: Instead of implementing
system calls directly, all userland programs instead implement a
special named-section containing system call stubs. This will be a
BSS section (it will not contain actual code). The kernel loader (and ld-elf
when it loads things) will automatically detect the existence of the
section and automatically mmap() the actual syscall layer into the
BSS space, as well as mmap() anything else that it needs for system
interfacing (any additional mmap()'d sections will not be directly
visible to userland. Userland only sees the stub table).
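The detection step could look roughly like the sketch below. It does not parse a real ELF file; it works on a simplified, made-up section list, and the section name ".syscall_stubs" is purely an assumption for illustration (no name has been chosen). The point is only that the loader looks for a named section that occupies address space but carries no file contents, and would then map the layer over it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name for the stub section; the real name is undecided. */
#define STUB_SECTION_NAME ".syscall_stubs"

/* Simplified section kinds for this sketch (not real ELF constants). */
enum sec_type { SEC_PROGBITS, SEC_NOBITS };

/* Simplified stand-in for a section header. */
struct section {
    const char   *name;
    enum sec_type type;
    size_t        size;
};

/*
 * Return the stub section if the image has one, or NULL otherwise.
 * The section must be BSS-like (no file contents) and non-empty,
 * since the layer is mapped over the space it reserves.
 */
const struct section *find_stub_section(const struct section *secs, size_t n)
{
    assert(secs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(secs[i].name, STUB_SECTION_NAME) == 0 &&
            secs[i].type == SEC_NOBITS && secs[i].size > 0)
            return &secs[i];
    }
    return NULL;
}
```

A binary without the section would simply be loaded as before, so old programs with stubs built into libc keep working.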
The kernel will select the layer file that it maps in based on the ABI
version of the userland program versus the ABI version of the kernel.
This theoretically means that we can make any old program work with any
new kernel by building the correct layer, independent of both the
original program's and the original kernel's compilation. This also
means that we can make any new program work with any old kernel through
the same means.
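As a rough illustration of the selection rule, the sketch below picks a layer by the pair (program ABI, kernel ABI). The integer version numbers, the table, and the file names are all invented for the example; how versions are actually encoded and where layer files live is not settled here.

```c
#include <assert.h>
#include <stddef.h>

/* One prebuilt layer per (program ABI, kernel ABI) pair. */
struct layer_entry {
    int         prog_abi;
    int         kern_abi;
    const char *path;      /* hypothetical file name */
};

/* Hypothetical set of layers installed on the system. */
static const struct layer_entry layer_table[] = {
    { 1, 1, "layer-prog1-kern1" },
    { 1, 2, "layer-prog1-kern2" },  /* old program on a new kernel */
    { 2, 1, "layer-prog2-kern1" },  /* new program on an old kernel */
    { 2, 2, "layer-prog2-kern2" },
};

/* Return the layer to map for this pairing, or NULL if none was built. */
const struct layer_entry *select_layer(int prog_abi, int kern_abi)
{
    size_t n = sizeof(layer_table) / sizeof(layer_table[0]);

    assert(prog_abi > 0 && kern_abi > 0);
    for (size_t i = 0; i < n; i++) {
        if (layer_table[i].prog_abi == prog_abi &&
            layer_table[i].kern_abi == kern_abi)
            return &layer_table[i];
    }
    return NULL;
}
```

Supporting a new pairing then means building one more layer, with no change to either the program or the kernel.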
Joerg, to make this work I need two other things:
* We need to have the kernel automatically set up the initial TLS
* We need to reserve some fixed positive-offset space in the TLS
to hold a pointer to errno and other things that the layer might
need to manage. Since the layer is simply going to be mmap()'d and
not dynamically linked, it cannot use the standard TLS variable space
itself. The layer may need some meta-data (basically one generic
pointer's worth) to e.g. store a pointer to the shared memory area
or to other interfacing aspects, private to the layer.
The biggest piece of the puzzle here is storing a pointer to errno
at a fixed %gs:POSITIVE_OFFSET so the syscall layer can actually
The benefits of this are huge.
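The errno arrangement can be sketched as follows. This is a portable simulation, not the real mechanism: the TLS block is an ordinary struct instead of memory addressed through %gs, the slot layout is an assumption, and the 'negative result means -errno' return convention is chosen only to keep the example short. What it shows is that the layer needs nothing but the TLS base and a fixed offset to find where errno lives.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reserved area at a fixed positive offset in the TLS. */
struct tls_reserved {
    int  *errno_ptr;      /* where the layer should store errno */
    void *layer_private;  /* one generic pointer private to the layer */
};

/* The fixed offset the layer is built against. */
#define TLS_ERRNO_SLOT offsetof(struct tls_reserved, errno_ptr)

/*
 * Layer-side return path. The layer is not dynamically linked, so it
 * cannot name errno directly; it loads the pointer from the fixed slot.
 * Assumption for this sketch: a negative kernel_result encodes -errno.
 */
long layer_return(void *tls_base, long kernel_result)
{
    if (kernel_result < 0) {
        int **slot = (int **)((char *)tls_base + TLS_ERRNO_SLOT);

        assert(*slot != NULL);
        **slot = (int)-kernel_result;
        return -1;
    }
    return kernel_result;
}
```

Whoever owns errno (libc, or a threading library with per-thread errno) only has to store its address in the slot once per thread.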