Up to [DragonFly] / src / sys / vfs / isofs / cd9660
Request diff between arbitrary revisions
Keyword substitution: kv
Default branch: MAIN
Use SYSREF to reference count struct vnode. v_usecount is now v_sysref(.refcnt). v_holdcnt is now v_auxrefs. SYSREF's termination state (using a negative reference count from -0x40000000+) now places the vnode in a VCACHED or VFREE state and deactivates it. The vnode is now assigned a 64 bit unique id via SYSREF. vhold() (which manipulates v_auxrefs) no longer reactivates a vnode and is explicitly used only to track references from auxillary structures and references to prevent premature destruction of the vnode. vdrop() will now only move a vnode from VCACHED to VFREE on the 1->0 transition of v_auxrefs if the vnode is in a termination state. vref() will now panic if used on a vnode in a termination state. vget() must now be used to explicitly reactivate a vnode. These requirements existed before but are now explicitly asserted. vlrureclaim() and allocvnode() should now interact a bit better. In particular, vlrureclaim() will do a better job of finding vnodes to flush and transition from VCACHED to VFREE, and allocvnode() will do a better job finding vnodes to reuse without getting blocked by a flush. allocvnode now uses a real VX lock to sequence vnodes into VRECLAIMED. All vnode special state processing now uses a VX lock. Vnodes are now able to be slowly returned to the memory pool when kern.maxvnodes is reduced at run time. Various initialization elements have been moved to CTOR/DTOR and are no longer in the critical path, improving performance. However, since SYSREF uses atomic_cmpset_int() (aka cmpxchgl), which reduces performance somewhat, overall performance tends to be about the same.
Change the kernel dev_t, representing a pointer to a specinfo structure, to cdev_t. Change struct specinfo to struct cdev. The name 'cdev' was taken from FreeBSD. Remove the dev_t shim for the kernel. This commit generally removes the overloading of 'dev_t' between userland and the kernel. Also fix a bug in libkvm where a kernel dev_t (now cdev_t) was not being properly converted to a userland dev_t.
Rename malloc->kmalloc, free->kfree, and realloc->krealloc. Pass 2
Rename malloc->kmalloc, free->kfree, and realloc->krealloc. Pass 1
The thread/proc pointer argument in the VFS subsystem originally existed for... well, I'm not sure *WHY* it originally existed when most of the time the pointer couldn't be anything other then curthread or curproc or the code wouldn't work. This is particularly true of lockmgr locks. Remove the pointer argument from all VOP_*() functions, all fileops functions, and most ioctl functions.
Clone cd9660_blkatoff() into a new procedure, cd9660_devblkatoff(), which returns a devvp-relative buffer rather then the vp-relative buffer. This allows us to access meta-data relative to a vnode without having to instantiate a VM object for that vnode. The new function is used for all directory scans and (negative offset) meta-data access. This fixes a panic due to recent buffer cache commits that formalized the requirements for using the buffer cache. Also, prior to this change, the CD9660 filesystem was using B_MALLOC buffers for a great deal of meta-data access that could very easily have been backed by the device vnode's VM object instead. B_MALLOC buffers have severe caching limitations. This commit fixes all of that as well.
Major BUF/BIO work commit. Make I/O BIO-centric and specify the disk or file location with a 64 bit offset instead of a 32 bit block number. * All I/O is now BIO-centric instead of BUF-centric. * File/Disk addresses universally use a 64 bit bio_offset now. bio_blkno no longer exists. * Stackable BIO's hold disk offset translations. Translations are no longer overloaded onto a single structure (BUF or BIO). * bio_offset == NOOFFSET is now universally used to indicate that a translation has not been made. The old (blkno == lblkno) junk has all been removed. * There is no longer a distinction between logical I/O and physical I/O. * All driver BUFQs have been converted to BIOQs. * BMAP, FREEBLKS, getblk, bread, breadn, bwrite, inmem, cluster_*, and findblk all now take and/or return 64 bit byte offsets instead of block numbers. Note that BMAP now returns a byte range for the before and after variables.
Greatly reduce the size of ISOFS's inode hash table. CDs and DVDs are small and slow compared to hard disks, an ultra-efficient inode hash table is not necessary. Suggested-by: Joerg Sonnenberger <email@example.com>
VFS messaging/interfacing work stage 8/99: Major reworking of the vnode interlock and other miscellanious things. This patch also fixes FS corruption due to prior vfs work in head. In particular, prior to this patch the namecache locking could introduce blocking conditions that confuse the old vnode deactivation and reclamation code paths. With this patch there appear to be no serious problems even after two days of continuous testing. * VX lock all VOP_CLOSE operations. * Fix two NFS issues. There was an incorrect assertion (found by David Rhodus), and the nfs_rename() code was not properly purging the target file from the cache, resulting in Stale file handle errors during, e.g. a buildworld with an NFS-mounted /usr/obj. * Fix a TTY session issue. Programs which open("/dev/tty" ,...) and then run the TIOCNOTTY ioctl were causing the system to lose track of the open count, preventing the tty from properly detaching. This is actually a very old BSD bug, but it came out of the woodwork in DragonFly because I am now attempting to track device opens explicitly. * Gets rid of the vnode interlock. The lockmgr interlock remains. * Introduced VX locks, which are mandatory vp->v_lock based locks. * Rewrites the locking semantics for deactivation and reclamation. (A ref'd VX lock'd vnode is now required for vgone(), VOP_INACTIVE, and VOP_RECLAIM). New guarentees emplaced with regard to vnode ripouts. * Recodes the mountlist scanning routines to close timing races. * Recodes getnewvnode to close timing races (it now returns a VX locked and refd vnode rather then a refd but unlocked vnode). * Recodes VOP_REVOKE- a locked vnode is now mandatory. * Recodes all VFS inode hash routines to close timing holes. * Removes cache_leaf_test() - vnodes representing intermediate directories are now held so the leaf test should no longer be necessary. * Splits the over-large vfs_subr.c into three additional source files, broken down by major function (locking, mount related, filesystem syncer). * Changes splvm() protection to a critical-section in a number of places (bleedover from another patch set which is also about to be committed). Known issues not yet resolved: * Possible vnode/namecache deadlocks. * While most filesystems now use vp->v_lock, I haven't done a final pass to make vp->v_lock mandatory and to clean up the few remaining inode based locks (nwfs I think and other obscure filesystems). * NullFS gets confused when you hit a mount point in the underlying filesystem. * Only UFS and NFS have been well tested * NFS is not properly timing out namecache entries, causing changes made on the server to not be properly detected on the client if the client already has a negative-cache hit for the filename in question. Testing-by: David Rhodus <firstname.lastname@example.org>, Peter Kadau <email@example.com>, walt <firstname.lastname@example.org>, others
VFS messaging/interfacing work stage 7d/99: More firming up of stage 7. Additional work to deal with old-api/new-api issues. Cut more stuff out of the old-api's cache_enter() routine to deal with deadlocks, at the cost of some performance loss (temporary until the VFS's start using the new APIs). Change UFS and NFS to not purge whole directories in *_rename() and *_rmdir(). Add some minor breakage to the API which will not be fixed until the VFS's get new rename implementations - renaming a directory in which a process has chdir'd will create problems for that process. This doesn't happen normally anyway so this temporary breakage should not cause any significant problems. Bug-reports-by: walt, Sascha Wildner, others
VFS messaging/interfacing work stage 4/99. This stage goes a long ways towards allowing us to move the vnode locking into a kernel layer. It gets rid of a lot of cruft from FreeBSD-4. FreeBSD-5 has done some of this stuff too (such as changing the default locking to stdlock from nolock), but DragonFly is going further. * Consolidate vnode locks into the vnode structure, add an embedded v_lock, and getting rid of both v_vnlock and v_data based head-of-structure locks. * Change the default vops to use a standard vnode lock rather then a fake non-lock. * Get rid of vop_nolock() and friends, we no longer support non-locking vnodes. * Get rid of vop_sharedlock(), we no longer support non standard shared-only locks (only NFS was using it and the mount-crossing lookup code should now prevent races to root from dead NFS volumes). * Integrate lock initialization into getnewvnode(). We do not yet incorporate automatically locking into getnewvnode(). getnewvnode() now has two additional arguments, lktimeout and lkflags, for lock structure initialization. * Change the sync vnode lock from nolock to stdlock. This may require more tuning down the line. Fix various sync_inactive() to properly unlock the lock as per the VOP API. * Properly flag the 'rename' vop operation regarding required tdvp and tvp unlocks (the flags are only used by nullfs). * Get rid of all inode-embedded vnode locks * Remove manual lockinit and use new getnewvnode() args instead. Lock the vnode prior to doing anything that might block in order to avoid synclist access before the vnode has been properly initialize. * Generally change inode hash insertion to also check for a hash collision and return failure if it occurs, rather then doing (often non-atomic) relookups and other checks. These sorts of collisions can occur if a vnode is being destroyed at the same time a new vnode is being created from an inode. A new vnode is not generally accessible, except by the sync code (from the mountlist) until it's underlying inode has been hashed so dealing with a hash collision should be as simple as throwing away the vnode with a vput(). * Do not initialize a new vnode's v_data until after the associated inode has been successfully added to the hash, and make the xxx_inactive() and xxx_reclaim() code friendly towards vnodes with a NULL v_data. * NFS now uses standard locks rather then shared-only locks. * PROCFS now uses standard locks rather then non-locks, and PROCFS's lookup code now understands VOP lookup semantics. PROCFS now uses a real hash table for its node search rather then a single singly-linked list (which should better scale to systems with thousands of processes). * NULLFS should now properly handle lookup() and rename() locks. NULLFS's node handling code has been rewritten. NULLFS's bypass code now understands vnode unlocks (rename case). * UFS no longer needs the ffs_inode_hash_lock hacks. It now uses the new collision-on-hash-add methodology. This will speed up UFS when operating on lots of small files (reported by David Rhodus).
Style(9) cleanup to src/sys/vfs, stage 7/21: isofs. - Convert K&R-style function definitions to ANSI style. Submitted-by: Andre Nathan <email@example.com> Additional-reformatting-by: cpressey
Newtoken commit. Change the token implementation as follows: (1) Obtaining a token no longer enters a critical section. (2) tokens can be held through schedular switches and blocking conditions and are effectively released and reacquired on resume. Thus tokens serialize access only while the thread is actually running. Serialization is not broken by preemptive interrupts. That is, interrupt threads which preempt do no release the preempted thread's tokens. (3) Unlike spl's, tokens will interlock w/ interrupt threads on the same or on a different cpu. The vnode interlock code has been rewritten and the API has changed. The mountlist vnode scanning code has been consolidated and all known races have been fixed. The vnode interlock is now a pool token. The code that frees unreferenced vnodes whos last VM page has been freed has been moved out of the low level vm_page_free() code and moved to the periodic filesystem sycer code in vfs_msycn(). The SMP startup code and the IPI code has been cleaned up considerably. Certain early token interactions on AP cpus have been moved to the BSP. The LWKT rwlock API has been cleaned up and turned on. Major testing by: David Rhodus
Fix races in ihashget that were introduced when I introduced the lwkt_gettoken() API to interlock the vnode and hash table ops. Report-by: David Rhodus.
__P()!=wanted, remove old style prototypes from the vfs subtree
kernel tree reorganization stage 1: Major cvs repository work (not logged as commits) plus a major reworking of the #include's to accomodate the relocations. * CVS repository files manually moved. Old directories left intact and empty (temporary). * Reorganize all filesystems into vfs/, most devices into dev/, sub-divide devices by function. * Begin to move device-specific architecture files to the device subdirs rather then throwing them all into, e.g. i386/include * Reorganize files related to system busses, placing the related code in a new bus/ directory. Also move cam to bus/cam though this may not have been the best idea in retrospect. * Reorganize emulation code and place it in a new emulation/ directory. * Remove the -I- compiler option in order to allow #include file localization, rename all config generated X.h files to use_X.h to clean up the conflicts. * Remove /usr/src/include (or /usr/include) dependancies during the kernel build, beyond what is normally needed to compile helper programs. * Make config create 'machine' softlinks for architecture specific directories outside of the standard <arch>/include. * Bump the config rev. WARNING! after this commit /usr/include and /usr/src/sys/compile/* should be regenerated from scratch.
Register keyword removal Approved by: Matt Dillon
MP Implementation 1/2: Get the APIC code working again, sweetly integrate the MP lock into the LWKT scheduler, replace the old simplelock code with tokens or spin locks as appropriate. In particular, the vnode interlock (and most other interlocks) are now tokens. Also clean up a few curproc/cred sequences that are no longer needed. The APs are left in degenerate state with non IPI interrupts disabled as additional LWKT work must be done before we can really make use of them, and FAST interrupts are not managed by the MP lock yet. The main thing for this stage was to get the system working with an APIC again. buildworld tested on UP and 2xCPU/MP (Dell 2550)
proc->thread stage 4: rework the VFS and DEVICE subsystems to take thread pointers instead of process pointers as arguments, similar to what FreeBSD-5 did. Note however that ultimately both APIs are going to be message-passing which means the current thread context will not be useable for creds and descriptor access.
Add the DragonFly cvs id and perform general cleanups on cvs/rcs/sccs ids. Most ids have been removed from !lint sections and moved into comment sections.
import from FreeBSD RELENG_4 18.104.22.168