Up to [DragonFly] / src / sys / netproto / natm
Keyword substitution: kv
Default branch: MAIN
Add NETISR_FLAG_NOTMPSAFE, which can be passed as the last parameter to netisr_register(); it is more expressive and less error-prone than 0.
Suggested-by: hsu@
Add the following three running modes for network protocol threads:
  1) BGL (default)
  2) Adaptive BGL. Protocol threads run without BGL by default. BGL will be held if the received msg does not have MSGF_MPSAFE turned on in the ms_flags field
  3) No BGL (experimental)
The code on the main path is done by dillon@
The following three sysctls and tunables are added to adjust the "mode":
  net.netisr.mpsafe_thread
  net.inet.tcp.mpsafe_thread
  net.inet.udp.mpsafe_thread
They have the same set of values:
  0 (default) -- BGL
  1 -- Adaptive BGL
  2 -- No BGL
NETISR_FLAG_MPSAFE is added (netisr.ni_flags), so that:
  - netisr_queue() and schednetisr() could set MSGF_MPSAFE during msg initialization
  - netisr_run() (called by ether_input_oncpu()) could hold BGL based on this flag before calling netisr's handler
PR_MPSAFE is added (protosw.pr_flags), so that transport_processing_oncpu() could hold BGL before calling protocol's input handler
Kernel API changes:
  - The thread parameter to netmsg_service_loop() must be supplied (running mode) and it must have the type of "int *"
  - netisr_register() takes an additional flags parameter to indicate whether its handler is MPSAFE (NETISR_FLAG_MPSAFE) or not
Reviewed-by: dillon@
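The three modes above can be modelled in a few lines of userland C. This is only a sketch of the decision rule, not the kernel code: the flag value, the demo_ names, and the counter standing in for the BGL are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Name taken from the commit message; the value here is made up. */
#define MSGF_MPSAFE	0x0001

/* Mirrors the 0/1/2 values of the mpsafe_thread sysctls. */
enum demo_mode { DEMO_BGL = 0, DEMO_ADAPTIVE_BGL = 1, DEMO_NO_BGL = 2 };

struct demo_msg {
	int	ms_flags;
	void	(*handler)(struct demo_msg *);
};

static int demo_bgl_held;	/* hypothetical stand-in for the Big Giant Lock */
static int demo_seen_locked;	/* what the sample handler observed */

/* Sample handler: records whether the lock was held when it ran. */
static void
demo_handler(struct demo_msg *msg)
{
	(void)msg;
	demo_seen_locked = demo_bgl_held;
}

/* Decide whether the lock must be held around one message. */
static int
demo_need_bgl(enum demo_mode mode, const struct demo_msg *msg)
{
	switch (mode) {
	case DEMO_ADAPTIVE_BGL:
		return ((msg->ms_flags & MSGF_MPSAFE) == 0);
	case DEMO_NO_BGL:
		return (0);
	case DEMO_BGL:
	default:
		return (1);	/* unknown mode: stay conservative */
	}
}

/* One iteration of a protocol thread's service loop. */
static void
demo_service_one(enum demo_mode mode, struct demo_msg *msg)
{
	int locked;

	assert(msg != NULL && msg->handler != NULL);
	locked = demo_need_bgl(mode, msg);
	if (locked)
		demo_bgl_held++;
	msg->handler(msg);
	if (locked)
		demo_bgl_held--;
}
```

In adaptive mode the per-message flag decides, which is why the commit has netisr_queue() and schednetisr() set MSGF_MPSAFE at message initialization.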
Reduce ifnet.if_serializer contention on output path:
  - Push ifnet.if_serializer holding down into each ifnet.if_output implementation
  - Add a serializer into ifaltq, which is used to protect send queue instead of its parent's if_serializer. This change has the following implications:
      o On output path, enqueueing packets and calling ifnet.if_start are decoupled
      o In device drivers, poll->dev_encap_ok->dequeue operation sequence is no longer safe, instead dequeue->dev_encap_fail->prepend should be used
    This serializer will be held by using lwkt_serialize_adaptive_enter()
  - Add altq_started field into ifaltq, which is used to interlock the calling of its parent's if_start, to reduce ifnet.if_serializer contention. if_devstart(), a helper function which utilizes ifaltq.altq_started, is added to reduce code duplication in ethernet device drivers.
  - Add if_cpuid into ifnet. This field indicates on which CPU device driver's interrupt will happen.
  - Add ifq_dispatch(). This function will try to hold ifnet.if_serializer in order to call ifnet.if_start. If this attempt fails, this function will schedule ifnet.if_start to be called on CPU located by ifnet.if_start_cpuid
    if_start_nmsg, which is per-CPU netmsg, is added to ifnet to facilitate ifnet.if_start scheduling.
    ifq_dispatch() is called by ether_output_frame() currently
  - Use ifq_classic_ functions, if altq is not enabled
  - Fix various device drivers bugs in their if_start implementation
  - Add ktr for ifq classic enqueue and dequeue
  - Add ktr for ifnet.if_start
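The dequeue->dev_encap_fail->prepend sequence recommended above might look roughly like the following. It is a simplified userland model: the demo_ queue, packet, and ring types are hypothetical, and the ifaltq serializer that would protect the queue in the kernel is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature send queue; not the real ifaltq. */
struct demo_pkt {
	struct demo_pkt	*next;
};

struct demo_sendq {
	struct demo_pkt	*head;
	struct demo_pkt	*tail;
	int		 len;
};

/* Hypothetical transmit ring with a fixed number of free descriptors. */
struct demo_txring {
	int	free_slots;
};

static void
demo_enqueue(struct demo_sendq *q, struct demo_pkt *p)
{
	p->next = NULL;
	if (q->tail != NULL)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	q->len++;
}

static struct demo_pkt *
demo_dequeue(struct demo_sendq *q)
{
	struct demo_pkt *p = q->head;

	if (p == NULL)
		return (NULL);
	q->head = p->next;
	if (q->head == NULL)
		q->tail = NULL;
	p->next = NULL;
	q->len--;
	return (p);
}

/* Put a packet back at the head so ordering is preserved. */
static void
demo_prepend(struct demo_sendq *q, struct demo_pkt *p)
{
	p->next = q->head;
	q->head = p;
	if (q->tail == NULL)
		q->tail = p;
	q->len++;
}

/* Returns 0 on success, -1 when the ring is full. */
static int
demo_encap(struct demo_txring *r, struct demo_pkt *p)
{
	(void)p;
	if (r->free_slots == 0)
		return (-1);
	r->free_slots--;
	return (0);
}

/* if_start-style loop: take the packet first, give it back on failure. */
static int
demo_start(struct demo_sendq *q, struct demo_txring *r)
{
	struct demo_pkt *p;
	int sent = 0;

	while ((p = demo_dequeue(q)) != NULL) {
		if (demo_encap(r, p) != 0) {
			demo_prepend(q, p);
			break;
		}
		sent++;
	}
	return (sent);
}
```

Because the driver owns the packet before it tries to encapsulate it, a concurrent enqueue on the decoupled output path cannot change which packet is being worked on, which is the hazard with the older poll-then-dequeue order.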
atm_output() must be serialized.
* Greatly reduce the complexity of the LWKT messaging and port abstraction. Significantly reduce the overhead of the subsystem.
* The message abort algorithm has been rewritten. It now sends a separate message to issue the abort instead of trying to requeue the original message. This also means the TAILQ embedded in the lwkt_msg structure can be used by unrelated code during processing of the message.
* Numerous MSGF_ flags have been removed, and all the LWKT msg/port algorithms have been rewritten and simplified. The message structure is now only touched by the current owner in all situations.
* Numerous structural fields have been removed. In particular, the fields used for message abort sequencing have been simplified and we do not try to embed a 'command' field in the base LWKT message any more.
* Clean up the netmsg abstraction, which is used all over the network stack. Instead of trying to overload fields in lwkt_msg we now simply extend the base lwkt_msg into struct netmsg. The function dispatch now takes a netmsg and returns void (before we had to return EASYNC), and we no longer need weird casts. Accept/connect message aborts are now greatly simplified.
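The "extend the base message" idea from the last bullet can be sketched as plain struct embedding. The demo_ types, fields, and the done flag below are illustrative assumptions and do not reproduce the real lwkt_msg or netmsg layouts.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MSGF_DONE	0x0001	/* hypothetical "replied" flag */

/* Stands in for the base LWKT message: no command field, no overloading. */
struct demo_base_msg {
	int	flags;
	int	error;
};

struct demo_netmsg;
typedef void (*demo_dispatch_t)(struct demo_netmsg *);

/* The network message extends the base by embedding it as the first member. */
struct demo_netmsg {
	struct demo_base_msg	base;
	demo_dispatch_t		dispatch;
};

/* Replying is explicit; the status travels in the message, not a return value. */
static void
demo_reply(struct demo_base_msg *msg, int error)
{
	msg->error = error;
	msg->flags |= DEMO_MSGF_DONE;
}

/* A handler takes the extended type directly and returns void. */
static void
demo_connect_handler(struct demo_netmsg *nmsg)
{
	demo_reply(&nmsg->base, 0);
}

/* What a service loop does with each message it receives. */
static void
demo_run(struct demo_netmsg *nmsg)
{
	assert(offsetof(struct demo_netmsg, base) == 0);
	nmsg->dispatch(nmsg);
}
```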
Revamp SYSINIT ordering. Relabel sysinit IDs (SI_* in sys/kernel.h) to make them less confusing, particularly with regard to the relative order init routines are called in. Reorder many sysinits. Reorder the SMP and CLOCK code to bring all the cpus up far earlier in the boot sequence and to make the full threading and clocking subsystems available for device config.
Give the sockbuf structure its own header file and supporting source file. Move all sockbuf-specific functions from kern/uipc_socket2.c into the new kern/uipc_sockbuf.c and move all the sockbuf-specific structures from sys/socketvar.h to sys/sockbuf.h. Change the sockbuf structure to only contain those fields required to properly manage a chain of mbufs. Create a signalsockbuf structure to hold the remaining fields (e.g. selinfo, mbmax, etc). Change the so_rcv and so_snd structures in the struct socket from a sockbuf to a signalsockbuf. Remove the recently added sorecv_direct structure which was being used to provide a direct mbuf path to consumers for socket I/O. Use the newly revamped sockbuf base structure instead. This gives mbuf consumers direct access to the sockbuf API functions for use outside of a struct socket. This will also allow new API functions to be added to the sockbuf interface to ease the job of parsing data out of chained mbufs.
Convert all pr_usrreqs structure initializations to the .name = data format.
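For readers unfamiliar with the ".name = data" form, here is a small comparison using C99 designated initializers. The demo_usrreqs structure and its three members are made up for the example; the real pr_usrreqs has many more entries.

```c
#include <stddef.h>

/* Hypothetical, much smaller cousin of struct pr_usrreqs. */
struct demo_usrreqs {
	int	(*attach)(int);
	int	(*detach)(int);
	int	(*send)(int);
};

static int
demo_attach(int unit)
{
	return (unit + 1);
}

static int
demo_send(int len)
{
	return (len * 2);
}

/* Positional form: every slot must be listed, in declaration order. */
static struct demo_usrreqs demo_positional = {
	demo_attach, NULL, demo_send
};

/* Designated form: order-independent, unnamed members default to NULL. */
static struct demo_usrreqs demo_designated = {
	.send = demo_send,
	.attach = demo_attach
};
```

The designated form keeps working if members are later added to or reordered in the structure, which is the usual motivation for this kind of conversion.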
Rename printf -> kprintf in sys/ and add some defines where necessary (files which are used in userland, too).
Rename sprintf -> ksprintf
Rename snprintf -> ksnprintf
Make allowances for source files that are compiled for both userland and the kernel.
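One plausible shape for the "allowances" in files shared between kernel and userland is a pair of fallback defines. This is a guess at the pattern, not a copy of the actual change; demo_format_unit() is a hypothetical helper, and the snippet assumes _KERNEL is defined only in kernel builds.

```c
#include <stdio.h>

/*
 * In a kernel build _KERNEL is defined and the k-prefixed functions exist.
 * In userland, map the kernel names back onto libc so the same source
 * compiles in both environments.
 */
#ifndef _KERNEL
#define kprintf		printf
#define ksprintf	sprintf
#endif

/*
 * Shared helper written against the kernel names only.
 * The caller must supply a buffer large enough for name plus unit digits.
 */
static int
demo_format_unit(char *buf, const char *name, int unit)
{
	return (ksprintf(buf, "%s%d", name, unit));
}
```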
Embed the netmsg in the mbuf itself rather than allocating one for each received packet. This greatly reduces the overhead in the network receive path (removing a malloc() and free()).
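A rough model of embedding the message in the packet buffer follows. All demo_ names and the field layout are assumptions; the point is only that queueing a packet needs no separate allocation and that the handler can recover the packet from the embedded message.

```c
#include <stddef.h>

struct demo_netmsg {
	void	(*dispatch)(struct demo_netmsg *);
};

/* Hypothetical packet buffer carrying its own message header. */
struct demo_mbuf {
	struct demo_netmsg	m_netmsg;	/* embedded, lives as long as the packet */
	int			m_len;
};

static int demo_handled_len;	/* records what the handler saw */

/* Handler: step back from the embedded message to the enclosing packet. */
static void
demo_input(struct demo_netmsg *nmsg)
{
	struct demo_mbuf *m;

	m = (struct demo_mbuf *)((char *)nmsg -
	    offsetof(struct demo_mbuf, m_netmsg));
	demo_handled_len = m->m_len;
}

/* Prepare a received packet for dispatch without malloc() or free(). */
static struct demo_netmsg *
demo_queue_packet(struct demo_mbuf *m)
{
	m->m_netmsg.dispatch = demo_input;
	return (&m->m_netmsg);
}
```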
* Remove (void) casts for discarded return values.
* Put function types on separate lines.
* Ansify function definitions.
* Remove PROTO_LIST.
* Some style(9) cleanup.
In-collaboration-with: Alexey Slynko <firstname.lastname@example.org>
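As an illustration of what these cleanups do to a function, compare the old form (shown in the comment, since K&R definitions are no longer valid in current C) with the result. The demo_ functions are invented for the example, and the PROTO_LIST usage in the comment is written from memory of that macro's usual form.

```c
#include <stdio.h>

/*
 * Before the cleanup, a function might have looked like this:
 *
 *	static int demo_sum PROTO_LIST((int, int));
 *
 *	static int demo_sum(a, b)
 *		int a;
 *		int b;
 *	{
 *		(void) puts("adding");
 *		return a + b;
 *	}
 */

/* After: plain ANSI prototype, no wrapper macro. */
static int	demo_sum(int, int);

/* After: return type on its own line, typed parameter list. */
static int
demo_sum(int a, int b)
{
	puts("adding");		/* discarded return value, no (void) cast */
	return (a + b);
}
```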
Make all network interrupt service routines MPSAFE part 1/3. Replace the critical section that was previously used to serialize access with the LWKT serializer. Integrate the serializer into the IFNET structure.
Note that kern.intr_mpsafe must be set to 1 for network interrupts to actually run MPSAFE. Also note that any interrupts shared with other non-MP drivers will cause all drivers on that interrupt to run with the Big Giant Lock.
Network interrupt - Each network driver then simply passes that serializer to bus_setup_intr() so only a single serializer is required to process the entire interrupt path. LWKT serialization support is already 100% integrated into the interrupt subsystem so it will already be held as of when the registered interrupt procedure is called.
Ioctl and if_* functions - All callers of if_* functions (such as if_start, if_ioctl, etc) now obtain the IFNET serializer before making the call. Thus all of these entry points into the driver will now be serialized.
if_input - All code that calls if_input now ensures that the serializer is held. It will either already be held (when called from a driver), or the serializer will be wrapped around the call. When packets are forwarded or bridged between interfaces, the target interface serializer will be dropped temporarily to avoid a deadlock.
Device Driver access - dev_* entry points into certain pseudo-network devices now obtain and release the serializer. This had to be done on a device-by-device basis (but there are only a few such devices).
Thanks to several people for helping test the patch, in particular Sepherosa Ziehau.
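The caller-side convention for if_* entry points can be modelled as below. The serializer here is a single-threaded stand-in with a held flag, and every demo_ name is hypothetical; the kernel uses the LWKT serializer API instead.

```c
#include <assert.h>

/* Single-threaded stand-in for a serializer; tracks ownership only. */
struct demo_serializer {
	int	held;
};

static void
demo_serialize_enter(struct demo_serializer *s)
{
	assert(!s->held);
	s->held = 1;
}

static void
demo_serialize_exit(struct demo_serializer *s)
{
	assert(s->held);
	s->held = 0;
}

/* Hypothetical interface structure with the serializer integrated. */
struct demo_ifnet {
	struct demo_serializer	if_serializer;
	void			(*if_start)(struct demo_ifnet *);
	int			starts;
};

/* Driver entry point: relies on the caller already holding the serializer. */
static void
demo_driver_start(struct demo_ifnet *ifp)
{
	assert(ifp->if_serializer.held);
	ifp->starts++;
}

/* Caller side: wrap the serializer around the if_* call. */
static void
demo_call_if_start(struct demo_ifnet *ifp)
{
	demo_serialize_enter(&ifp->if_serializer);
	ifp->if_start(ifp);
	demo_serialize_exit(&ifp->if_serializer);
}
```

The interrupt path follows the same rule from the other direction: the driver hands its serializer to the interrupt subsystem, so the handler also runs with it held.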
spl->critical section conversion.
Remove the canwait argument to dup_sockaddr(). Callers of dup_sockaddr() all assume that it just works, so it really has to work. Since interrupts are now threads we can use M_INTWAIT. While it is possible that a memory deadlock issue exists here (e.g. if swapping over NFS), it isn't likely in this case.
Fix a netmsg memory leak in the ARP code. Adjust all ms_cmd function dispatches to return a proper error code. Reported-by: multiple people
Push the lwkt_replymsg() up one level from netisr_service_loop() to the message handler so we can explicitly reply or not reply as appropriate.
Need header file to dereference proc structure. Reported by: drhodus
Pass the credentials along when available.
Eliminate use of curthread in if_ioctl functions by passing down the ucred structure.
Dispatch upper-half protocol request handling.
Once we distribute socket protocol processing requests to different processors, we no longer have a process context to refer to, so eliminate the use of curproc in soreserve() by passing the sockbuf resource limit all the way down from the system call code to sbreserve(). Eliminate the use of curproc in unp_attach() by passing down the fields it needs from the proc structure. Define a pru_attach_info structure to hold the information the attach usrreq function requires. The thread argument to in_pcballoc() is unused, so we don't need to pass a thread structure down to in_pcballoc().
__FreeBSD__ -> __DragonFly__
if_xname support Part 2/2: Convert remaining netif devices and implement full support for if_xname. Restructure struct ifnet in net/if_var.h, pulling in a few minor additional changes from current including making if_dunit an int, and making if_flags an int. Submitted-by: Max Laier <email@example.com>
Network threading stage 1/3: netisrs are already software interrupts, which means they already run in their own thread. This commit creates multiple supporting threads for netisrs rather than just one and code has been added to begin routing packets to particular threads based on their content. Eventually this will lead to us being able to isolate and serialize PCBs in particular threads. The tail end of the ip_input path's protocol dispatch, the UIPC (user entry) code, and listen socket have not been covered yet and still need to be serialized.
A new debugging sysctl, net.inet.ip.mthread_enable, has been added. It defaults to 1. If you set this sysctl to 0, netisr processing will revert to the prior single-threaded behavior.
Submitted-by: Jeffrey Hsu <hsu@FreeBSD.org>
Additional-work-by: dillon
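Routing packets to threads "based on their content" usually means hashing some header fields so that one flow always lands on the same thread. The sketch below shows that idea only; the hash, the thread count, and the demo_ names are assumptions, not the function the commit introduced.

```c
#include <stdint.h>

#define DEMO_NTHREADS	4	/* hypothetical number of netisr threads */

/* Models net.inet.ip.mthread_enable: 0 reverts to a single thread. */
static int demo_mthread_enable = 1;

/* Header fields that identify a flow. */
struct demo_flow {
	uint32_t	src;
	uint32_t	dst;
	uint16_t	sport;
	uint16_t	dport;
};

/* Pick a thread index; identical flows always map to the same index. */
static unsigned int
demo_pick_thread(const struct demo_flow *f)
{
	uint32_t h;

	if (!demo_mthread_enable)
		return (0);
	h = f->src ^ f->dst ^ (((uint32_t)f->sport << 16) | f->dport);
	h ^= h >> 16;		/* fold the high bits into the low bits */
	return (h % DEMO_NTHREADS);
}
```

Keeping a flow on one thread is what later allows its PCB to be touched by that thread alone, which is the isolation goal the commit message describes.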
Centralize if queue handling. Original patch against FreeBSD submitted by Jonathan Lemon. Reviewed by Matt Dillon.
Do you think /sys/netproto needs to use __P() prototypes? I don't.
kernel tree reorganization stage 1: Major cvs repository work (not logged as commits) plus a major reworking of the #include's to accommodate the relocations.
* CVS repository files manually moved. Old directories left intact and empty (temporary).
* Reorganize all filesystems into vfs/, most devices into dev/, sub-divide devices by function.
* Begin to move device-specific architecture files to the device subdirs rather than throwing them all into, e.g. i386/include
* Reorganize files related to system busses, placing the related code in a new bus/ directory. Also move cam to bus/cam though this may not have been the best idea in retrospect.
* Reorganize emulation code and place it in a new emulation/ directory.
* Remove the -I- compiler option in order to allow #include file localization, rename all config generated X.h files to use_X.h to clean up the conflicts.
* Remove /usr/src/include (or /usr/include) dependencies during the kernel build, beyond what is normally needed to compile helper programs.
* Make config create 'machine' softlinks for architecture specific directories outside of the standard <arch>/include.
* Bump the config rev.
WARNING! after this commit /usr/include and /usr/src/sys/compile/* should be regenerated from scratch.
LINT build test. Aggregated source code adjustments to bring most of the rest of the kernel source up to date, using the LINT build.
Add the DragonFly cvs id and perform general cleanups on cvs/rcs/sccs ids. Most ids have been removed from !lint sections and moved into comment sections.
import from FreeBSD RELENG_4 1.12