DragonFly kernel List (threaded) for 2003-07
Re: syscall messaging interface API
-----BEGIN PGP SIGNED MESSAGE-----
On a Pentium you can use call gates to switch between protection
levels. There are a couple of problems I see with this, though. One is that it
uses segmentation for more than just providing a flat memory model, which
is all that most, if not all, UNIXes on x86 use it for. Second, it's not
portable if you move to any processor that doesn't have segment support. You
could maybe have both, but I'm sure there's some reason interrupts were chosen.
Most x86 UNIXes only use CPL 0 and 3, for the OS and user code respectively. A
software interrupt is just like any other interrupt: it stores the current
context and jumps right to the IDT entry for whatever interrupt you used. It
seems there are more things to set up to get call gates working.
On Wednesday 23 July 2003 03:45 pm, David Leimbach wrote:
> Didn't the L4 folks find a way to make system calls on Pentiums without
> using software interrupts? Isn't this like 10x faster?
> I need to read stuff at the Pistachio site again but I think I am correct.
> Any chance that could get integrated into DflyBSD? [sorry for shortening
> If one could more fully separate the syscall APIs from the actual
> implementation couldn't each low-level layer do more optimizations of the
> Again... just shooting my naivete around.
> On Wednesday, July 23, 2003, at 02:26PM, Matthew Dillon
> > Here is my idea for the system messaging interface. I will use a
> > new trap gate (0x81) to implement it, because it occurs to me that
> > a message interface really ought to pass and return information in
> > registers rather than on the stack (since the message itself is
> > already in user memory, we might as well only have to do the copyin() on
> > the contents rather than on both the system messaging interface arguments
> > and the contents of the message). And a new trap gate isolates us from
> > the old syscall mechanism.
> > int 0x81 to dispatch, arguments in eax, ecx, edx, return value in eax.
> > error = sendsys(port, msg, msgsize)
> > eax:error = int0x81(eax:port, ecx:msg, edx:msgsize)
> > Send a syscall message to the kernel. The userland requests
> > asynchronous or synchronous operation through the standard message
> > flag MSGF_ASYNC. The userland specifies userland pointers to the
> > userland version of the system port, the userland version of the
> > message, and the size of the message.
> > The kernel copyin()'s the message and acts on it, and returns either
> > a synchronous or an asynchronous error code as per our messaging
> > API. Results (like the return value for read() or lseek()) will be
> > stored in the message structure. Only error (errno) codes are
> > returned in eax.
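
A minimal userland sketch of the convention just described: the call returns only an errno-style code, and the actual result is written into the message. It is not from the original post. The message layout, the opcode CMD_DOUBLE, the value of MSGF_ASYNC, and the helpers mock_sendsys() and sys_double() are all assumptions made up for illustration, and memcpy() merely stands in for copyin(); nothing here issues int 0x81.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MSGF_ASYNC 0x0001        /* flag name from the post; value assumed */

/* Hypothetical message layout; the post does not define one. */
struct sysmsg {
    int  flags;                  /* 0 = synchronous, MSGF_ASYNC otherwise */
    int  cmd;                    /* hypothetical opcode */
    long arg;                    /* hypothetical single argument */
    long result;                 /* where a read()/lseek() value would land */
};

enum { CMD_DOUBLE = 1 };         /* stand-in for a real system call */

/* Mock of the kernel side of sendsys(); nothing here traps into a kernel. */
static int mock_sendsys(void *port, struct sysmsg *umsg, size_t msgsize)
{
    struct sysmsg kmsg;

    (void)port;                  /* ignored initially, as the post says */
    if (umsg == NULL || msgsize != sizeof(kmsg))
        return EINVAL;
    memcpy(&kmsg, umsg, sizeof(kmsg));   /* plays the role of copyin() */
    if (kmsg.flags & MSGF_ASYNC)
        return ENOSYS;           /* async path not modelled in this sketch */
    if (kmsg.cmd != CMD_DOUBLE)
        return ENOSYS;
    umsg->result = kmsg.arg * 2; /* result is stored in the message ... */
    return 0;                    /* ... only an errno-style code is returned */
}

/* Userland wrapper: build the message, send it, unpack the result. */
static int sys_double(long x, long *out)
{
    struct sysmsg m = { 0, CMD_DOUBLE, x, 0 };
    int error;

    assert(out != NULL);
    error = mock_sendsys(NULL, &m, sizeof(m));
    if (error == 0)
        *out = m.result;
    return error;
}
```

A caller would check the return value of sys_double() first and read the output only when it is 0, mirroring the split between the error in eax and the result in the message.
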
> > The kernel will initially ignore the userland version of the system
> > port but eventually we can use this to store interface versioning
> > information (so we don't have to load it into the message every time).
> > The kernel utilizes the reply port stored in the message to return the
> > message to userland. The userland reply port may be NULL, in which
> > case the kernel expects the userland to explicitly wait for the
> > message to be returned or to poll for message completion passively,
> > or the reply port may be non-NULL indicating that the kernel should
> > return the message to the port.
> > The reply port, if non-NULL, controls the action taken when a
> > message is returned. The action can be:
> > * Queue without notification
> > * Queue and perform an upcall to the (port specified) function
> > * Queue and perform an upcall managed by a critical section (the
> > kernel would check to see if the user thread is in a critical
> > section and if so would just flag it. The userland would later
> > detect that flag and flush the kernel's message queue).
> > * ... any other action that we can think of, e.g. things like queue
> > with passive notification but revert to an upcall after a timeout
> > if the userland doesn't call flushsys(). etc.
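
The reply-port handling above could be modelled roughly as follows. This is only a sketch under assumptions: the structure layouts, the enum names PORT_QUEUE and PORT_QUEUE_UPCALL, and the functions return_message() and count_upcall() are hypothetical and do not come from the post. It covers the NULL-port case and the first two listed actions; the critical-section and timeout variants are left out.

```c
#include <assert.h>
#include <stddef.h>

struct port;

struct msg {
    struct msg  *next;
    struct port *reply_port;     /* NULL: userland waits or polls instead */
    int          done;           /* set when the message has been returned */
};

/* Hypothetical names for the first two actions listed above. */
enum port_action { PORT_QUEUE, PORT_QUEUE_UPCALL };

struct port {
    enum port_action action;
    void           (*upcall)(struct port *);  /* the port-specified function */
    struct msg      *head;       /* FIFO of returned messages */
    struct msg      *tail;
    int              upcalls;    /* bookkeeping for the example only */
};

/* Example upcall: just counts how often it was invoked. */
static void count_upcall(struct port *p)
{
    p->upcalls++;
}

/* What the "kernel" does when it hands a finished message back. */
static void return_message(struct msg *m)
{
    struct port *p;

    assert(m != NULL);
    m->done = 1;
    p = m->reply_port;
    if (p == NULL)
        return;                  /* no port: completion is seen by polling */

    m->next = NULL;              /* append to the port's queue */
    if (p->tail != NULL)
        p->tail->next = m;
    else
        p->head = m;
    p->tail = m;

    if (p->action == PORT_QUEUE_UPCALL && p->upcall != NULL)
        p->upcall(p);            /* notify; PORT_QUEUE stays silent */
}
```

Keeping the action on the port rather than on each message matches the description that the reply port "controls the action taken"; a per-message override would be a different design.
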
> > error = waitsys(port, msg)
> > eax:error = int0x81(eax:port, ecx:msg, edx:0)
> > Ask the kernel to block until a message has been returned, or until
> > a message is pending on the specified (userland) message port, or
> > both.
> > error = flushsys()
> > eax:error = int0x81(eax:NULL, ecx:NULL, edx:0)
> > Ask the kernel to flush any pending messages that were held up due
> > to userland being in a critical section. The kernel will have
> > flagged this to the userland and the userland will then call
> > flushsys() when it exits out of its last critical section.
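
One way to picture the flag-and-flush interlock described here, as a single-threaded simulation. The struct uthread fields and the functions kernel_return_message(), mock_flushsys(), crit_enter() and crit_exit() are hypothetical names for illustration; the post does not say where the flag lives or how the nesting count is kept.

```c
#include <assert.h>

/* Hypothetical per-thread state shared between userland and the "kernel". */
struct uthread {
    int crit_nest;        /* depth of nested userland critical sections */
    int flush_needed;     /* set by the kernel when it had to hold a message */
    int held;             /* messages the kernel is holding back */
    int delivered;        /* messages actually handed to userland */
};

/* Kernel side: deliver now, or hold and flag if userland is in a section. */
static void kernel_return_message(struct uthread *t)
{
    if (t->crit_nest > 0) {
        t->held++;
        t->flush_needed = 1;
    } else {
        t->delivered++;
    }
}

/* Mock flushsys(): release everything that was held back. */
static int mock_flushsys(struct uthread *t)
{
    t->delivered += t->held;
    t->held = 0;
    t->flush_needed = 0;
    return 0;
}

static void crit_enter(struct uthread *t)
{
    t->crit_nest++;
}

/* Only leaving the outermost section triggers the flush. */
static void crit_exit(struct uthread *t)
{
    assert(t->crit_nest > 0);
    if (--t->crit_nest == 0 && t->flush_needed)
        (void)mock_flushsys(t);
}
```

The point of the flag is that the common path, leaving a critical section with nothing held, costs only a decrement and a test and never enters the kernel.
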
> > I believe that this gives us flexibility we need. I have also come up
> > with a novel solution for signaling! The userland would queue
> > 'signal' messages to the kernel. The kernel would then 'return' the
> > appropriate signal message when the signal occurs. This gives
> > userland complete control (via the reply port) on how to deal with
> > signals.
> > Signal messages would be like continuous I/O requests. The message
> > would still be 'live' in the kernel even after it has 'returned' it to
> > userland. The kernel would be free to return the message over and over
> > again until the userland tells it to abort the signalling request.
> > The userland would interlock with the kernel by virtue of a flag bit
> > in the message or the reply port. This coupled with a userland
> > version of the critical section would interlock the return-from-softint
> > sequencing (i.e. so the kernel doesn't push an upcall on top of the same
> > upcall that is in the middle of trying to return back).
> > A similar form can be used for things like periodic timer requests...
> > they can stay 'live' in the kernel and simply be returned over and
> > over again to the userland.
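
The "live" request idea, where one queued message is returned repeatedly until userland aborts it, can be sketched as a small simulation. Everything named here (struct sigmsg, live_table, MAX_LIVE, sig_register(), sig_raise(), sig_abort()) is hypothetical; the post does not specify how the kernel would track such requests, and "returning" a message is reduced to bumping a counter.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LIVE 8                /* arbitrary limit for the sketch */

struct sigmsg {
    int signo;                    /* signal this request is interested in */
    int live;                     /* still registered with the "kernel" */
    int returns;                  /* how many times it has been returned */
};

static struct sigmsg *live_table[MAX_LIVE];

/* Userland queues a signal message; it stays live until aborted. */
static int sig_register(struct sigmsg *m)
{
    size_t i;

    assert(m != NULL);
    for (i = 0; i < MAX_LIVE; i++) {
        if (live_table[i] == NULL) {
            live_table[i] = m;
            m->live = 1;
            m->returns = 0;
            return 0;
        }
    }
    return -1;                    /* table full */
}

/* Kernel side: "return" every live message that matches, keep it queued. */
static int sig_raise(int signo)
{
    size_t i;
    int n = 0;

    for (i = 0; i < MAX_LIVE; i++) {
        if (live_table[i] != NULL && live_table[i]->signo == signo) {
            live_table[i]->returns++;
            n++;
        }
    }
    return n;
}

/* Userland aborts the request; later signals no longer return it. */
static void sig_abort(struct sigmsg *m)
{
    size_t i;

    for (i = 0; i < MAX_LIVE; i++) {
        if (live_table[i] == m) {
            live_table[i] = NULL;
            m->live = 0;
        }
    }
}
```

A periodic timer request would fit the same shape, with the timer tick playing the role of sig_raise().
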
> > I know this sounds somewhat complex but it provides us with the
> > greatest flexibility as well as an incremental development approach,
> > e.g. initially all system call messages are synchronous so we don't have
> > to worry about reply ports. Then we implement passive reply ports. Then
> > we implement software interrupts (upcalls), then we implement the more
> > complex signalling semantics. All a very orderly and extremely powerful
> > mechanism.
> > -Matt
Craig Dooley cd5697@xxxxxxxxxx
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (FreeBSD)
-----END PGP SIGNATURE-----