Up to [DragonFly] / src / sys / vfs / ufs
Keyword substitution: kv
Default branch: MAIN
Miscellaneous performance adjustments to the kernel
* Add an argument to VOP_BMAP so VFSs can discern the type of operation the BMAP is being done for.
* Normalize the variable name denoting the blocksize to 'blksize' in vfs_cluster.c.
* Fix a bug in the cluster code where a stale bp->b_error could wind up getting returned when B_ERROR is not set.
* Do not B_AGE cluster bufs.
* Pass the block size to both cluster_read() and cluster_write() instead of those routines getting the block size from vp->v_mount->mnt_stat.f_iosize. This allows different areas of a file to use a different block size.
* Properly initialize bp->b_bio2.bio_offset to doffset in cluster_read(). This fixes an issue where VFSs were making an extra, unnecessary call to BMAP.
* Do not recycle vnodes on the free list until numvnodes has reached desiredvnodes. Vnodes were being recycled when their resident page count had dropped to zero, but this is actually too early as the VFS may cache important information in the vnode that would otherwise require a number of I/O's to re-acquire. This mainly helps HAMMER (whose inode lookups are fairly expensive).
* Do not VAGE vnodes.
* Remove the minvnodes test. There is no reason not to load the vnode cache all the way through to its max.
* buf_cmd_t visibility for the new BMAP argument.
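The stale bp->b_error fix above can be illustrated with a small userland sketch. It is not the vfs_cluster.c code: the struct, the flag value, and the helper name are hypothetical stand-ins, and the only point shown is that the error field is consulted only when the error flag is set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for B_ERROR and struct buf (illustration only). */
#define SK_B_ERROR	0x0001

struct sk_buf {
	int b_flags;
	int b_error;	/* may hold a stale value from an earlier I/O */
};

/*
 * Report an error only when the error flag is set, so a leftover
 * b_error value is never returned for a successful I/O.
 */
static int
sk_buf_error(const struct sk_buf *bp)
{
	assert(bp != NULL);
	if (bp->b_flags & SK_B_ERROR)
		return bp->b_error;
	return 0;
}
```

A buffer whose flags are clear yields 0 here even if b_error still holds a value left over from an earlier operation.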
Fix some IO sequencing performance issues and reformulate the strategy we use to deal with potential buffer cache deadlocks. Generally speaking try to remove roadblocks in the vn_strategy() path.
* Remove buf->b_tid (HAMMER no longer needs it)
* Replace IO_NOWDRAIN with IO_NOBWILL, requesting that bwillwrite() not be called. Used by VN to try to avoid deadlocking. Remove B_NOWDRAIN.
* No longer block in bwrite() or getblk() when we have a lot of dirty buffers. getblk() in particular needs to be callable by filesystems to drain dirty buffers and we don't want to deadlock.
* Improve bwillwrite() by having it wake up the buffer flusher at 1/2 the dirty buffer limit but not block, and then block if the limit is reached. This should smooth out flushes during heavy filesystem activity.
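The two-threshold behaviour described for bwillwrite() can be modelled as a pure decision function. This is a sketch under assumptions, not the kernel routine: the enum, the function name, and the use of a simple buffer count are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome codes; these are not kernel identifiers. */
enum sk_bw_action {
	SK_BW_CONTINUE,		/* below half the limit: nothing to do */
	SK_BW_WAKE_FLUSHER,	/* at or above half: wake flusher, do not block */
	SK_BW_BLOCK		/* limit reached: writer must wait */
};

/*
 * Decide what a writer should do given the current number of dirty
 * buffers and the hard dirty-buffer limit.
 */
static enum sk_bw_action
sk_bwillwrite_policy(size_t ndirty, size_t dirty_limit)
{
	assert(dirty_limit > 0);
	if (ndirty >= dirty_limit)
		return SK_BW_BLOCK;
	if (ndirty >= dirty_limit / 2)
		return SK_BW_WAKE_FLUSHER;
	return SK_BW_CONTINUE;
}
```

Waking the flusher early without blocking gives it a head start, so writers only stall if the flusher cannot keep up before the hard limit is reached.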
Fix a bug in vnode_pager_generic_getpages(). This function was improperly setting m->valid to 0 and was also improperly trying to free the page after it had potentially become wired by the buffer cache. Add a sysctl to UFS that allows us to force it to call vop_stdgetpages() for debugging purposes.
getpages/putpages fixup part 1 - Add support for UIO_NOCOPY VOP_WRITEs to filesystems which use the buffer cache and assert that UIO_NOCOPY is not being used for filesystems which do not. For filesystems using the buffer cache all we have to do is force a read-before-write to fill in any missing pieces of the buffer. UIO_NOCOPY writes are used for buffer-cache-backed filesystems which do not implement their own vop_putpages code. At the moment this is only the msdosfs.
Remove the vpp (returned underlying device vnode) argument from VOP_BMAP(). VOP_BMAP() may now only be used to determine linearity and clusterability of the blocks underlying a filesystem object. The meaning of the returned block number (other than being contiguous as a means of indicating linearity or clusterability) is now up to the VFS. This removes visibility into the device(s) underlying a filesystem from the rest of the kernel.
1:1 Userland threading stage 4.2/4: Make signal system fully lwp-aware by splitting ksignal() into appropriate functions. Introduce lwpsignal(), which now contains the logic of ksignal(), but can be used to deliver a signal to a specific lwp. Convert consumers of ksignal() to use lwpsignal() when they actually generate a thread-specific signal. Fully implement proc_stop() and proc_unstop(). Reviewed-by: Thomas E. Spanjaard <firstname.lastname@example.org>
Rename functions to avoid conflicts with libc.
Add some diagnostic messages to try to catch a ufs_dirbad panic before it happens. MFC: Reorder BUF_UNLOCK() - it must occur after b_flags is modified, not before. A newly created non-VMIO buffer is now marked B_INVAL. Callers of getblk() now always clear B_INVAL before issuing a READ I/O or when clearing or overwriting the buffer. Before this change, a getblk() (getnewbuf), brelse(), getblk() sequence on a non-VMIO buffer would result in a buffer with B_CACHE set yet containing uninitialized data. MFC: B_NOCACHE cannot be set on a clean VMIO-backed buffer as this will destroy the VM backing store, which might be dirty. MFC: Reorder vnode_pager_setsize() calls to close a race condition.
Add a read-ahead version of ffs_blkatoff() called ffs_blkatoff_ra(). This code was basically extracted from ffs_read(). ffs_read() now calls ffs_blkatoff_ra(). ufs_readdir() now also calls ffs_blkatoff_ra().
Remove FFS function hooks used by UFS. Simply make direct calls from ufs to ffs. The original ufs routines don't exist anymore anyhow and EXT2 no longer references UFS files directly. UFS and FFS have been 'one' filesystem for two decades. These hooks are no longer needed.
Remove the (unused) copy-on-write support for a vnode's VM object. This support originally existed to support the badly implemented and severely hacked ENABLE_VFS_IOOPT I/O optimization which was removed long ago. This also removes a bunch of cross-module pollution in UFS.
The thread/proc pointer argument in the VFS subsystem originally existed for... well, I'm not sure *WHY* it originally existed when most of the time the pointer couldn't be anything other than curthread or curproc or the code wouldn't work. This is particularly true of lockmgr locks. Remove the pointer argument from all VOP_*() functions, all fileops functions, and most ioctl functions.
Change *_pager_allocate() to take off_t instead of vm_ooffset_t. The actual underlying type (a 64 bit signed integer) is the same. Recent and upcoming work is standardizing on off_t. Move object->un_pager.vnp.vnp_size to vnode->v_filesize. As before, the field is still only valid when a VM object is associated with the vnode.
Major BUF/BIO work commit. Make I/O BIO-centric and specify the disk or file location with a 64 bit offset instead of a 32 bit block number.
* All I/O is now BIO-centric instead of BUF-centric.
* File/Disk addresses universally use a 64 bit bio_offset now. bio_blkno no longer exists.
* Stackable BIO's hold disk offset translations. Translations are no longer overloaded onto a single structure (BUF or BIO).
* bio_offset == NOOFFSET is now universally used to indicate that a translation has not been made. The old (blkno == lblkno) junk has all been removed.
* There is no longer a distinction between logical I/O and physical I/O.
* All driver BUFQs have been converted to BIOQs.
* BMAP, FREEBLKS, getblk, bread, breadn, bwrite, inmem, cluster_*, and findblk all now take and/or return 64 bit byte offsets instead of block numbers. Note that BMAP now returns a byte range for the before and after variables.
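The move from block numbers to 64 bit byte offsets, and the use of a sentinel offset to mean "not yet translated", can be sketched in userland C. Everything here is an assumption made for illustration: the sentinel value, the struct, and the helper names are hypothetical, and the log does not state the kernel's actual NOOFFSET definition.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical sentinel standing in for NOOFFSET (real value not given). */
#define SK_NOOFFSET	((int64_t)-1)

/* Minimal stand-in for one layer of a stacked BIO. */
struct sk_bio {
	int64_t bio_offset;	/* byte offset, or SK_NOOFFSET */
};

/* Convert a logical block number to a 64 bit byte offset. */
static int64_t
sk_lblk_to_offset(int64_t lblkno, int32_t blksize)
{
	assert(lblkno >= 0 && blksize > 0);
	return lblkno * (int64_t)blksize;
}

/* A translation is still needed while the offset holds the sentinel. */
static int
sk_needs_translation(const struct sk_bio *bio)
{
	assert(bio != NULL);
	return bio->bio_offset == SK_NOOFFSET;
}
```

With a sentinel, "untranslated" is an explicit state rather than something inferred from two block numbers happening to be equal, which is the (blkno == lblkno) convention the commit removes.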
Remove the vfs page replacement optimization and its ENABLE_VFS_IOOPT option. This never worked properly... that is, the semantics are broken compared to a normal read or write in that the read 'buffer' will be modified out from under the caller if the underlying file is. What is really needed here is a copy-on-write feature that works in both directions, similar to how a shared buffer is copied after a fork() if either the parent or child modify it. The optimization will eventually be rewritten with that in mind but not right now.
Perform some basic cleanups. Change some types over to C99 standard types. Correct some misspellings. Correct some type usages which could possibly have resulted in overflows in the filesystem code.
Style(9) cleanup to src/sys/vfs, stage 19/21: ufs. - Convert K&R-style function definitions to ANSI style. Submitted-by: Andre Nathan <email@example.com> Additional-reformatting-by: cpressey
msync(..., MS_INVALIDATE) will incorrectly remove dirty pages without synchronizing them to their backing store under certain circumstances, and can also cause struct buf's to become inconsistent. This can be particularly gruesome when MS_INVALIDATE is used on a range of memory that is mmap()'d to be read-only. Fix MS_INVALIDATE's operation (1) by making UFS honor the invalidation request when flushing to backing store to destroy the related struct buf and (2) by never removing pages wired into the buffer cache and never removing pages that are found to still be dirty. Note that NFS was already coded to honor invalidation requests in nfs_write(). Filesystems other than NFS and UFS do not currently support buffer-invalidation-on-write but all that means now is that the pages will remain in cache, rather than be incorrectly removed and cause corruption. Reported-by: Stephan Uphoff <firstname.lastname@example.org>, Julian Elischer <email@example.com>
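Rule (2) above, that an invalidation must skip wired pages and pages that are still dirty, reduces to a simple predicate. The following is only a sketch: the struct and function names are hypothetical and do not reflect the kernel's vm_page layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified page descriptor (illustration only). */
struct sk_page {
	int  wire_count;	/* nonzero while wired, e.g. by the buffer cache */
	bool dirty;		/* contents not yet written to backing store */
};

/*
 * An invalidation pass may discard a page only if it is neither
 * wired nor dirty; otherwise the page must stay in the cache.
 */
static bool
sk_may_remove_page(const struct sk_page *m)
{
	assert(m != NULL);
	return m->wire_count == 0 && !m->dirty;
}
```

Leaving a page cached when the predicate fails is the safe default, since the only cost is a page that stays resident rather than lost data.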
Register keyword removal Approved by: Matt Dillon
MP Implementation 1/2: Get the APIC code working again, sweetly integrate the MP lock into the LWKT scheduler, replace the old simplelock code with tokens or spin locks as appropriate. In particular, the vnode interlock (and most other interlocks) are now tokens. Also clean up a few curproc/cred sequences that are no longer needed. The APs are left in degenerate state with non IPI interrupts disabled as additional LWKT work must be done before we can really make use of them, and FAST interrupts are not managed by the MP lock yet. The main thing for this stage was to get the system working with an APIC again. buildworld tested on UP and 2xCPU/MP (Dell 2550)
Split the struct vmmeter cnt structure into a global vmstats structure and a per-cpu cnt structure. Adjust the sysctls to accumulate statistics over all cpus.
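Accumulating a statistic over all cpus, as the adjusted sysctls are described as doing, might look like the sketch below. The per-cpu struct, its single field, and the function name are hypothetical; the real structures carry many more counters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-cpu counter block with a single example counter. */
struct sk_pcpu_cnt {
	unsigned int events;
};

/* Sum one counter across an array of per-cpu blocks. */
static unsigned long
sk_sum_events(const struct sk_pcpu_cnt *percpu, size_t ncpus)
{
	unsigned long total = 0;
	size_t i;

	assert(percpu != NULL || ncpus == 0);
	for (i = 0; i < ncpus; i++)
		total += percpu[i].events;
	return total;
}
```

Each cpu updates only its own block, so the summation cost is paid by the reader of the statistic instead of by every update.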
simple cleanups (removal of ancient macros)
proc->thread stage 5: BUF/VFS clearance! Remove the ucred argument from vop_close, vop_getattr, vop_fsync, and vop_createvobject. These VOPs can be called from multiple contexts so the cred is fairly useless, and UFS ignores it anyway. For filesystems (like NFS) that sometimes need a cred we use proc0.p_ucred for now. This removal also removed the need for a 'proc' reference in the related VFS procedures, which greatly helps our proc->thread conversion. bp->b_wcred and bp->b_rcred have also been removed, and for the same reason. It makes no sense to have a particular cred when multiple users can access a file. This may create issues with certain types of NFS mounts but if it does we will solve them in a way that doesn't pollute the struct buf.
proc->thread stage 4: rework the VFS and DEVICE subsystems to take thread pointers instead of process pointers as arguments, similar to what FreeBSD-5 did. Note however that ultimately both APIs are going to be message-passing which means the current thread context will not be useable for creds and descriptor access.
Add the DragonFly cvs id and perform general cleanups on cvs/rcs/sccs ids. Most ids have been removed from !lint sections and moved into comment sections.
import from FreeBSD RELENG_4 220.127.116.11