Up to [DragonFly] / src / sys / vfs / ntfs
For kmalloc(), MALLOC() and contigmalloc(), use M_ZERO instead of explicitly bzero()ing. Reviewed-by: sephe
getpages/putpages fixup part 1 - Add support for UIO_NOCOPY VOP_WRITEs to filesystems which use the buffer cache, and assert that UIO_NOCOPY is not being used for filesystems which do not. For filesystems using the buffer cache, all we have to do is force a read-before-write to fill in any missing pieces of the buffer. UIO_NOCOPY writes are used for buffer-cache-backed filesystems which do not implement their own vop_putpages code; at the moment msdosfs is the only such filesystem.
Rename printf -> kprintf in sys/ and add some defines where necessary (files which are used in userland, too).
Implement a much faster spinlock.

* Spinlocks can't conflict with FAST interrupts without deadlocking anyway, so instead of using a critical section simply do not allow an interrupt thread to preempt the current thread if it is holding a spinlock. This cuts spinlock overhead in half.
* Implement shared spinlocks in addition to exclusive spinlocks. Shared spinlocks would be used, e.g. for file descriptor table lookups.
* Cache a shared spinlock by using the spinlock's lock field as a bitfield, one bit for each cpu (bit 31 for exclusive locks). A shared spinlock sets its cpu's shared bit and does not bother clearing it on unlock. This means that multiple, parallel shared spinlock accessors do NOT incur a cache conflict on the spinlock. ALL parallel shared accessors operate at full speed (~10ns vs ~40-100ns in overhead). 90% of the 10ns in overhead is due to a necessary MFENCE to interlock against exclusive spinlocks on the mutex. However, this MFENCE only has to play with pending cpu-local memory writes so it will always run at near full speed.
* Exclusive spinlocks in the face of previously cached shared spinlocks are now slightly more expensive because they have to clear the cached shared spinlock bits by checking the globaldata structure for each conflicting cpu to see if it is still holding a shared spinlock. However, only the initial (unavoidable) atomic swap involves potential cache conflicts. The shared bit checks involve only memory reads and the situation should be self-correcting from a performance standpoint since the shared bits then get cleared.
* Add sysctls for basic spinlock performance testing. Setting debug.spin_lock_test issues a test; tests #2 and #3 loop debug.spin_test_count times (p.s. these tests will stall the whole machine):
  1. Test the indefinite wait code
  2. Time the best-case exclusive lock overhead
  3. Time the best-case shared lock overhead
* TODO: a shared->exclusive spinlock upgrade inline with positive feedback, and an exclusive->shared spinlock downgrade inline.
Remove the now unused interlock argument to the lockmgr() procedure. This argument has been abused over the years by kernel programmers attempting to optimize certain locking and data modification sequences, resulting in virtually unreadable code in some cases. The interlock also made porting between BSDs difficult as each BSD implemented its interlock differently. DragonFly has slowly removed use of the interlock argument and we can now finally be rid of it entirely.
Remove remaining uses of the lockmgr LK_INTERLOCK flag.
NTFS sometimes splits the initialization of a new vnode into two parts. Make sure VMIO is enabled for regular files so BUF/BIO ops work in the second part as VREG might not be set in the first part. Reported-by: Stefan Krueger <email@example.com>
Major BUF/BIO work commit. Make I/O BIO-centric and specify the disk or file location with a 64 bit offset instead of a 32 bit block number.

* All I/O is now BIO-centric instead of BUF-centric.
* File/disk addresses universally use a 64 bit bio_offset now. bio_blkno no longer exists.
* Stackable BIOs hold disk offset translations. Translations are no longer overloaded onto a single structure (BUF or BIO).
* bio_offset == NOOFFSET is now universally used to indicate that a translation has not been made. The old (blkno == lblkno) junk has all been removed.
* There is no longer a distinction between logical I/O and physical I/O.
* All driver BUFQs have been converted to BIOQs.
* BMAP, FREEBLKS, getblk, bread, breadn, bwrite, inmem, cluster_*, and findblk all now take and/or return 64 bit byte offsets instead of block numbers. Note that BMAP now returns a byte range for the before and after variables.
Pass LK_PCATCH instead of trying to store tsleep flags in the lock structure, so multiple entities competing for the same lock do not use unexpected flags when sleeping. Only NFS really uses PCATCH with lockmgr locks.
Remove unused variables. Submitted-by: Alexey Slynko <firstname.lastname@example.org>
Fix LINT kernel; spin_lock function definitions have been split into <sys/spinlock2.h>, so one has to explicitly #include it to use these functions now.
Convert the lockmgr interlock from a token to a spinlock. This fixes a problem on SMP boxes where the MP lock would unexpectedly lose atomicity for a short period of time due to token acquisition. Add a tsleep_interlock() call which takes advantage of tsleep()'s cpu locality of reference to provide a helper function which allows us to atomically spin_unlock() and tsleep() in an MP safe manner with only a critical section. Basically all it does is set a cpumask bit for the ident hash index to cause other cpus issuing a wakeup to notify our cpu. Any actual wakeup occurring during the race period after the spin_unlock but before the tsleep() call will be delayed by the critical section until after the tsleep has queued the thread. Clean up some unused junk in vm_map.h.
Make nlink_t 32bit and ino_t 64bit. Implement the old syscall numbers for *stat by wrapping the new syscalls and truncating the values. Add a hack for boot2 to keep ino_t 32bit, otherwise we would have to link the 64bit math code in and that would most likely overflow boot2. Bump libc major to annotate the changed ABI and to work around a problem with strip during installworld; strip is dynamically linked and doesn't play well with the new libc otherwise. Support for 64bit inode numbers is still incomplete because the dirent is still limited to 32bit. The checks for nlink_t have to be redone too.
Remove cast as lvalue
Remove the VREF() macro and uses of it. Remove uses of 0x20 (space) before ^I (tab) inside vnode.h.
Style(9) cleanup to src/sys/vfs, stage 11/21: ntfs. - Convert K&R-style function definitions to ANSI style. Submitted-by: Andre Nathan <email@example.com> Additional-reformatting-by: cpressey
Newtoken commit. Change the token implementation as follows: (1) Obtaining a token no longer enters a critical section. (2) Tokens can be held through scheduler switches and blocking conditions and are effectively released and reacquired on resume. Thus tokens serialize access only while the thread is actually running. Serialization is not broken by preemptive interrupts; that is, interrupt threads which preempt do not release the preempted thread's tokens. (3) Unlike spl's, tokens will interlock w/ interrupt threads on the same or on a different cpu. The vnode interlock code has been rewritten and the API has changed. The mountlist vnode scanning code has been consolidated and all known races have been fixed. The vnode interlock is now a pool token. The code that frees unreferenced vnodes whose last VM page has been freed has been moved out of the low level vm_page_free() code and moved to the periodic filesystem syncer code in vfs_msync(). The SMP startup code and the IPI code have been cleaned up considerably. Certain early token interactions on AP cpus have been moved to the BSP. The LWKT rwlock API has been cleaned up and turned on. Major testing by: David Rhodus
__FreeBSD__ -> __DragonFly__
Primarily add a missing lwkt_reltoken() in ntfs_ntput(), plus a little cleanup. Remove a few gettoken/reltoken pairs (depend on the MP lock more). NTFS, as a non-critical filesystem, will eventually be serialized anyway. From-panics-reported-by: Adam K Kirchhoff <firstname.lastname@example.org>
__P()!=wanted, remove old style prototypes from the vfs subtree
kernel tree reorganization stage 1: Major cvs repository work (not logged as commits) plus a major reworking of the #include's to accommodate the relocations.

* CVS repository files manually moved. Old directories left intact and empty (temporary).
* Reorganize all filesystems into vfs/, most devices into dev/, sub-divide devices by function.
* Begin to move device-specific architecture files to the device subdirs rather than throwing them all into, e.g., i386/include.
* Reorganize files related to system busses, placing the related code in a new bus/ directory. Also move cam to bus/cam, though this may not have been the best idea in retrospect.
* Reorganize emulation code and place it in a new emulation/ directory.
* Remove the -I- compiler option in order to allow #include file localization, rename all config generated X.h files to use_X.h to clean up the conflicts.
* Remove /usr/src/include (or /usr/include) dependencies during the kernel build, beyond what is normally needed to compile helper programs.
* Make config create 'machine' softlinks for architecture specific directories outside of the standard <arch>/include.
* Bump the config rev.

WARNING! After this commit /usr/include and /usr/src/sys/compile/* should be regenerated from scratch.
Remove the priority part of the priority|flags argument to tsleep(). Only flags are passed now. The priority was a user scheduler thingy that is not used by the LWKT subsystem. For process statistics assume sleeps without P_SINTR set to be disk-waits, and sleeps with it set to be normal sleeps. This commit should not contain any operational changes.
MP Implementation 1/2: Get the APIC code working again, sweetly integrate the MP lock into the LWKT scheduler, replace the old simplelock code with tokens or spin locks as appropriate. In particular, the vnode interlock (and most other interlocks) are now tokens. Also clean up a few curproc/cred sequences that are no longer needed. The APs are left in degenerate state with non IPI interrupts disabled as additional LWKT work must be done before we can really make use of them, and FAST interrupts are not managed by the MP lock yet. The main thing for this stage was to get the system working with an APIC again. buildworld tested on UP and 2xCPU/MP (Dell 2550)
proc->thread stage 5: BUF/VFS clearance! Remove the ucred argument from vop_close, vop_getattr, vop_fsync, and vop_createvobject. These VOPs can be called from multiple contexts so the cred is fairly useless, and UFS ignores it anyway. For filesystems (like NFS) that sometimes need a cred we use proc0.p_ucred for now. This removal also removed the need for a 'proc' reference in the related VFS procedures, which greatly helps our proc->thread conversion. bp->b_wcred and bp->b_rcred have also been removed, for the same reason: it makes no sense to have a particular cred when multiple users can access a file. This may create issues with certain types of NFS mounts but if it does we will solve them in a way that doesn't pollute the struct buf.
proc->thread stage 4: rework the VFS and DEVICE subsystems to take thread pointers instead of process pointers as arguments, similar to what FreeBSD-5 did. Note however that ultimately both APIs are going to be message-passing, which means the current thread context will not be usable for creds and descriptor access.
Add the DragonFly cvs id and perform general cleanups on cvs/rcs/sccs ids. Most ids have been removed from !lint sections and moved into comment sections.
import from FreeBSD RELENG_4 188.8.131.52