DragonFly BSD

CVS log for src/sys/vfs/udf/udf_vfsops.c


Keyword substitution: kv
Default branch: MAIN


Revision 1.27.4.1: download - view: text, markup, annotated - select for diffs
Thu Sep 25 02:20:56 2008 UTC (6 years, 2 months ago) by dillon
Branches: DragonFly_RELEASE_2_0
CVS tags: DragonFly_RELEASE_2_0_Slip
Diff to: previous 1.27: preferred, unified; next MAIN 1.28: preferred, unified
Changes since revision 1.27: +4 -2 lines
MFC numerous features from HEAD.

* NFS export support for nullfs mounted filesystems,
  intended for nullfs mounted hammer PFSs.

* Each nullfs mount constructs a unique fsid based on
  the underlying mount.

* Each nullfs mount maintains its own netexport structure.

* The mount pointer in the nch (namecache handle) is passed
  into FHTOVP and friends, allowing operations to occur
  on the underlying vnodes but still go through the nullfs
  mount.

Revision 1.28: download - view: text, markup, annotated - select for diffs
Wed Sep 17 21:44:25 2008 UTC (6 years, 2 months ago) by dillon
Branches: MAIN
CVS tags: HEAD
Diff to: previous 1.27: preferred, unified
Changes since revision 1.27: +4 -2 lines
* Implement the ability to export NULLFS mounts via NFS.

* Enforce PFS isolation when exporting a HAMMER PFS via a NULLFS mount.

NOTE: Exporting anything other than HAMMER PFS roots via nullfs does
NOT protect the parent of the exported directory from being accessed via NFS.

Generally speaking this feature is implemented by giving each nullfs mount
a synthesized fsid based on what is being mounted and implementing the
NFS export infrastructure in the nullfs code instead of just bypassing those
functions to the underlying VFS.

Revision 1.27: download - view: text, markup, annotated - select for diffs
Sun Jan 6 16:55:53 2008 UTC (6 years, 10 months ago) by swildner
Branches: MAIN
CVS tags: DragonFly_RELEASE_1_12_Slip, DragonFly_RELEASE_1_12, DragonFly_Preview
Branch point for: DragonFly_RELEASE_2_0
Diff to: previous 1.26: preferred, unified
Changes since revision 1.26: +0 -2 lines
Remove bogus checks after kmalloc(M_WAITOK) which never returns NULL.

Reviewed-by: hasso

Revision 1.26: download - view: text, markup, annotated - select for diffs
Wed May 9 00:53:36 2007 UTC (7 years, 6 months ago) by dillon
Branches: MAIN
CVS tags: DragonFly_RELEASE_1_10_Slip, DragonFly_RELEASE_1_10
Diff to: previous 1.25: preferred, unified
Changes since revision 1.25: +1 -1 lines
Give the device major / minor numbers their own separate 32 bit fields
in the kernel.  Change dev_ops to use a RB tree to index major device
numbers and remove the 256 device major number limitation.

Build a dynamic major number assignment feature into dev_ops_add() and
adjust ASR (which already had a hand-rolled one), and MFS to use the
feature.  MFS at least does not require any filesystem visibility to
access its backing device.  Major device numbers >= 256 are used for
dynamic assignment.

Retain filesystem compatibility for device numbers that fall within the
range that can be represented in UFS or struct stat (which is a single
32 bit field supporting 8 bit major numbers and 24 bit minor numbers).
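
The compatibility rule above can be sketched in a few lines of userland C.
This is only an illustration: struct kdev and the kdev_*() helpers are
hypothetical names, and the bit layout of the legacy field (major in bits
8-15, minor in the remaining 24 bits) is an assumption modeled on the
traditional BSD makedev() convention, not something taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a device number with separate 32 bit fields. */
struct kdev {
	uint32_t major;
	uint32_t minor;
};

/*
 * True if the pair can be represented in the legacy single 32 bit field,
 * i.e. an 8 bit major number and a 24 bit minor number.
 */
static bool
kdev_fits_legacy(struct kdev d)
{
	return (d.major <= 0xffu && d.minor <= 0xffffffu);
}

/*
 * Pack into the legacy field.  ASSUMED layout: bits 8-15 hold the major,
 * bits 0-7 and 16-31 hold the low 8 and high 16 bits of the minor.
 */
static uint32_t
kdev_to_legacy(struct kdev d)
{
	assert(kdev_fits_legacy(d));
	return ((d.minor & 0xffu) | (d.major << 8) |
	    ((d.minor & 0xffff00u) << 8));
}

/* Inverse of kdev_to_legacy() under the same assumed layout. */
static struct kdev
kdev_from_legacy(uint32_t v)
{
	struct kdev d;

	d.major = (v >> 8) & 0xffu;
	d.minor = (v & 0xffu) | ((v >> 8) & 0xffff00u);
	return (d);
}
```

A dynamically assigned major (>= 256) fails kdev_fits_legacy(), which is
why such devices cannot be given a node on a filesystem limited to the
old representation.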

Revision 1.25: download - view: text, markup, annotated - select for diffs
Sat Dec 23 00:41:30 2006 UTC (7 years, 11 months ago) by swildner
Branches: MAIN
CVS tags: DragonFly_RELEASE_1_8_Slip, DragonFly_RELEASE_1_8
Diff to: previous 1.24: preferred, unified
Changes since revision 1.24: +13 -13 lines
Rename printf -> kprintf in sys/ and add some defines where necessary
(files which are used in userland, too).

Revision 1.24: download - view: text, markup, annotated - select for diffs
Fri Oct 27 04:56:34 2006 UTC (8 years, 1 month ago) by dillon
Branches: MAIN
Diff to: previous 1.23: preferred, unified
Changes since revision 1.23: +1 -1 lines
Major namecache work primarily to support NULLFS.

* Move the nc_mount field out of the namecache{} record and use a new
  namecache handle structure called nchandle { mount, ncp } for all
  API accesses to the namecache.

* Remove all mount point linkages from the namecache topology.  Each mount
  now has its own namecache topology rooted at the root of the mount point.

  Mount points are flagged in their underlying filesystem's namecache
  topology but instead of linking the mount into the topology, the flag
  simply triggers a mountlist scan to locate the mount.  ".." is handled
  the same way... when the root of a topology is encountered the scan
  can traverse to the underlying filesystem via a field stored in the
  mount structure.

* Ref the mount structure based on the number of nchandle structures
  referencing it, and do not kfree() the mount structure during a forced
  unmount if refs remain.

These changes have the following effects:

* Traversal across mount points no longer requires locking of any sort,
  preventing process blockages occurring in one mount from leaking across
  a mount point to another mount.

* Aliased namespaces such as those created by NULLFS no longer duplicate the
  namecache topology of the underlying filesystem.  Instead, a NULLFS
  mount simply shares the underlying topology (differentiating between
  it and the underlying topology by the fact that the name cache
  handles { mount, ncp } contain NULLFS's mount pointer).

  This saves an immense amount of memory and allows NULLFS to be used
  heavily within a system without creating any adverse impact on kernel
  memory or performance.

* Since the namecache topology for a NULLFS mount is shared with the
  underlying mount, the namecache records are in fact the same records
  and thus full coherency between the NULLFS mount and the underlying
  filesystem is maintained by design.

* Future efforts, such as a unionfs or shadow fs implementation, now
  have a mount structure to work with.  The new API is a lot more
  flexible than the old one.
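
A rough userland sketch of the { mount, ncp } handle idea follows.  The
structures are simplified stand-ins rather than the kernel definitions, and
every nch_*() helper as well as mount_can_free() is a hypothetical name
introduced for illustration.  It shows two handles sharing one namecache
record while differing only in the mount pointer, plus the per-handle mount
reference that keeps a forced unmount from freeing the mount too early.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (fields illustrative). */
struct namecache {
	const char *nc_name;
};

struct mount {
	const char *mnt_label;
	int mnt_refs;		/* one reference per handle pointing here */
};

/* The handle pairs a namecache record with the mount it is viewed through. */
struct nchandle {
	struct mount *mount;
	struct namecache *ncp;
};

/* Build a handle and account for it on the mount. */
static struct nchandle
nch_make(struct mount *mp, struct namecache *ncp)
{
	struct nchandle nch = { mp, ncp };

	assert(mp != NULL && ncp != NULL);
	mp->mnt_refs++;
	return (nch);
}

/* Drop the handle's reference on its mount. */
static void
nch_drop(struct nchandle *nch)
{
	assert(nch->mount != NULL && nch->mount->mnt_refs > 0);
	nch->mount->mnt_refs--;
	nch->mount = NULL;
	nch->ncp = NULL;
}

/* Same underlying record, regardless of the mount used to reach it. */
static bool
nch_same_record(const struct nchandle *a, const struct nchandle *b)
{
	return (a->ncp == b->ncp);
}

/* Same record seen through the same mount. */
static bool
nch_same_handle(const struct nchandle *a, const struct nchandle *b)
{
	return (a->ncp == b->ncp && a->mount == b->mount);
}

/* A forced unmount may only free the mount once no handle references it. */
static bool
mount_can_free(const struct mount *mp)
{
	return (mp->mnt_refs == 0);
}
```

Because both handles point at the same record, a change made through one
view is visible through the other without any synchronization between the
two mounts, which is the coherency property described above.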

Revision 1.23: download - view: text, markup, annotated - select for diffs
Sun Sep 10 01:26:41 2006 UTC (8 years, 2 months ago) by dillon
Branches: MAIN
Diff to: previous 1.22: preferred, unified
Changes since revision 1.22: +1 -1 lines
Change the kernel dev_t, representing a pointer to a specinfo structure,
to cdev_t.  Change struct specinfo to struct cdev.  The name 'cdev' was taken
from FreeBSD.  Remove the dev_t shim for the kernel.

This commit generally removes the overloading of 'dev_t' between userland and
the kernel.

Also fix a bug in libkvm where a kernel dev_t (now cdev_t) was not being
properly converted to a userland dev_t.

Revision 1.22: download - view: text, markup, annotated - select for diffs
Tue Sep 5 00:55:51 2006 UTC (8 years, 2 months ago) by dillon
Branches: MAIN
Diff to: previous 1.21: preferred, unified
Changes since revision 1.21: +11 -11 lines
Rename malloc->kmalloc, free->kfree, and realloc->krealloc.  Pass 1

Revision 1.21: download - view: text, markup, annotated - select for diffs
Sat Aug 12 00:26:21 2006 UTC (8 years, 3 months ago) by dillon
Branches: MAIN
Diff to: previous 1.20: preferred, unified
Changes since revision 1.20: +2 -2 lines
VNode sequencing and locking - part 3/4.

VNode aliasing is handled by the namecache (aka nullfs), so there is no
longer a need to have VOP_LOCK, VOP_UNLOCK, or VOP_ISSLOCKED as 'VOP'
functions.  Both NFS and DEADFS have been using standard locking functions
for some time and are no longer special cases.  Replace all uses with
native calls to vn_lock, vn_unlock, and vn_islocked.

We can't have these as VOP functions anyhow because of the introduction of
the new SYSLINK transport layer, since vnode locks are primarily used to
protect the local vnode structure itself.

Revision 1.20: download - view: text, markup, annotated - select for diffs
Tue Jul 18 22:22:16 2006 UTC (8 years, 4 months ago) by dillon
Branches: MAIN
Diff to: previous 1.19: preferred, unified
Changes since revision 1.19: +2 -3 lines
Remove several layers in the vnode operations vector init code.  Declare
the operations vector directly instead of via a descriptor array.  Remove
most of the recalculation code, it stopped being needed over a year ago.

This work is similar to what FreeBSD now does, but was developed along a
different line.  Ultimately our vop_ops will become SYSLINK ops for userland
VFS and clustering support.

Revision 1.19: download - view: text, markup, annotated - select for diffs
Sat May 6 18:48:53 2006 UTC (8 years, 6 months ago) by dillon
Branches: MAIN
CVS tags: DragonFly_RELEASE_1_6_Slip, DragonFly_RELEASE_1_6
Diff to: previous 1.18: preferred, unified
Changes since revision 1.18: +12 -12 lines
Remove the thread argument from all mount->vfs_* function vectors,
replacing it with a ucred pointer when applicable.  This cleans up a
considerable amount of VFS function code that previously delved into
the process structure to get the cred, though some code remains.

Get rid of the compatibility thread argument for hpfs and nwfs.  Our
lockmgr calls are now mostly compatible with NetBSD (which doesn't use a
thread argument either).

Get rid of some complex junk in fdesc_statfs() that nobody uses.

Remove the thread argument from dounmount() as well as various other
filesystem specific procedures (quota calls primarily) which no longer
need it due to the lockmgr, VOP, and VFS cleanups.  These cleanups also
have the effect of making the VFS code slightly less dependent on the
calling thread's context.

Revision 1.18: download - view: text, markup, annotated - select for diffs
Sat May 6 02:43:14 2006 UTC (8 years, 6 months ago) by dillon
Branches: MAIN
Diff to: previous 1.17: preferred, unified
Changes since revision 1.17: +5 -5 lines
The thread/proc pointer argument in the VFS subsystem originally existed
for...  well, I'm not sure *WHY* it originally existed when most of the
time the pointer couldn't be anything other than curthread or curproc or
the code wouldn't work.  This is particularly true of lockmgr locks.

Remove the pointer argument from all VOP_*() functions, all fileops functions,
and most ioctl functions.

Revision 1.17: download - view: text, markup, annotated - select for diffs
Fri May 5 21:15:10 2006 UTC (8 years, 6 months ago) by dillon
Branches: MAIN
Diff to: previous 1.16: preferred, unified
Changes since revision 1.16: +4 -4 lines
Simplify vn_lock(), VOP_LOCK(), and VOP_UNLOCK() by removing the thread_t
argument.  These calls now always use the current thread as the lockholder.
Passing a thread_t to these functions has always been questionable at best.

Revision 1.16: download - view: text, markup, annotated - select for diffs
Fri Mar 24 18:35:34 2006 UTC (8 years, 8 months ago) by dillon
Branches: MAIN
Diff to: previous 1.15: preferred, unified
Changes since revision 1.15: +2 -2 lines
Major BUF/BIO work commit.  Make I/O BIO-centric and specify the disk or
file location with a 64 bit offset instead of a 32 bit block number.

* All I/O is now BIO-centric instead of BUF-centric.

* File/Disk addresses universally use a 64 bit bio_offset now.  bio_blkno
  no longer exists.

* Stackable BIOs hold disk offset translations.  Translations are no longer
  overloaded onto a single structure (BUF or BIO).

* bio_offset == NOOFFSET is now universally used to indicate that a
  translation has not been made.  The old (blkno == lblkno) junk has all
  been removed.

* There is no longer a distinction between logical I/O and physical I/O.

* All driver BUFQs have been converted to BIOQs.

* BMAP, FREEBLKS, getblk, bread, breadn, bwrite, inmem, cluster_*,
  and findblk all now take and/or return 64 bit byte offsets instead
  of block numbers.  Note that BMAP now returns a byte range for the before
  and after variables.
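
The move from block numbers to byte offsets can be illustrated with a small
userland sketch.  The struct bio here is a one-field stand-in, the bio_*()
and blkno_to_offset() helpers are hypothetical, and the value chosen for
NOOFFSET is an assumption; only the idea (a 64 bit byte offset plus a
sentinel meaning "not yet translated") comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sentinel meaning "no translation has been made"; value is ASSUMED. */
#define NOOFFSET	((int64_t)-1)

/* One-field stand-in for a BIO layer: just the 64 bit byte offset. */
struct bio {
	int64_t bio_offset;
};

/* Convert a block number to a byte offset; widen before multiplying. */
static int64_t
blkno_to_offset(uint32_t blkno, uint32_t blksize)
{
	return ((int64_t)blkno * (int64_t)blksize);
}

/* Replaces the old style (blkno == lblkno) test for "untranslated". */
static bool
bio_is_translated(const struct bio *bio)
{
	return (bio->bio_offset != NOOFFSET);
}

/* Record the translation for this layer. */
static void
bio_translate(struct bio *bio, uint32_t blkno, uint32_t blksize)
{
	assert(blksize != 0);
	bio->bio_offset = blkno_to_offset(blkno, blksize);
}
```

With 2048 byte blocks, block 0x800000 maps to byte offset 2^34, which no
longer fits in 32 bits; this is the kind of address the 64 bit bio_offset
is meant to carry directly.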

Revision 1.15: download - view: text, markup, annotated - select for diffs
Sat Sep 17 07:43:12 2005 UTC (9 years, 2 months ago) by dillon
Branches: MAIN
CVS tags: DragonFly_RELEASE_1_4_Slip, DragonFly_RELEASE_1_4
Diff to: previous 1.14: preferred, unified
Changes since revision 1.14: +2 -1 lines
Add an argument to vfs_add_vnodeops() to specify VVF_* flags for the vop_ops
structure.  Add a new flag called VVF_SUPPORTS_FSMID to indicate filesystems
which support persistent storage of FSMIDs.  Rework the FSMID code a bit
to reduce overhead.

Use the spare field in the UFS inode structure to implement a persistent
FSMID.  The FSMID is recursively marked in the namecache but not adjusted
until the next getattr() call on the related inode(s), or when the vnode
is reclaimed.

Revision 1.14: download - view: text, markup, annotated - select for diffs
Thu Aug 4 16:44:12 2005 UTC (9 years, 3 months ago) by joerg
Branches: MAIN
Diff to: previous 1.13: preferred, unified
Changes since revision 1.13: +0 -1 lines
Remove unused include of sys/dirent.h.

Revision 1.13: download - view: text, markup, annotated - select for diffs
Tue Jul 26 15:43:36 2005 UTC (9 years, 4 months ago) by hmp
Branches: MAIN
Diff to: previous 1.12: preferred, unified
Changes since revision 1.12: +8 -14 lines
Clean the VFS operations vector and related code:

* take advantage of C99 sparse structure initialisation, this allows
  us to initialise left out vfsops entries cleanly when vfs_register()
  is called; any vfsop entries that are not specified will be assigned
  vfs_std* functions.  the only exception to this rule is VFS_SYNC
  which is assigned vfs_stdnosync() since a file system may not have
  support for it.  file systems can simply assign vfs_stdsync if they
  do not have their own sync operation.

* add KKASSERTS to make sure that the VFS_ROOT, VFS_MOUNT and VFS_UNMOUNT
  vfs operations are provided by a file system being registered.  all of
  the above are necessary to ensure a minimally working file system.

* remove scattered no-op definitions of VFS_START() vfsop vector entry
  and take advantage of sparse vfsop initialisation.  VFS_START is only
  used by MFS to ensure the calling process is not swapped out when
  I/O is initialised.  The entry point is called from the mount path,
  before the file system is marked ready.

* remove scattered no-op definitions of VFS_QUOTACTL() vfsop vector entry
  and take advantage of sparse vfsop initialisation.

* give UFS a VFS_UNINIT vfsop entry and make use of it in ext2fs when
  ripping down the hash tables.

* many file systems in the kernel seem to not implement the complementing
  VFS_UNINIT() vfsop entry, this is not so much of a problem when the
  file system is compiled into the kernel, but it can leave leakage when
  compiled as KLD modules.  add uninitialisation code and entry points
  for ext2fs, ufs, fdescfs.  grab the ufs_ihash_token when free'ing the
  inode hash table at ripping time.

* add typedefs for all the vfsop entry points, make use of it in definition
  of struct vfsops; this results in clean and consolidated code.  use the
  typedefs for vfs_std* function prototypes.
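
The sparse initialisation scheme described in the list above can be
sketched in userland C as follows.  The reduced struct vfsops, the demo_*()
functions and vfs_register_sketch() are hypothetical; plain assert() stands
in for KKASSERT, and the EOPNOTSUPP return from the no-sync default is an
assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct mount;	/* opaque in this sketch */

/* Typedefs for the entry points, reused for the defaults below. */
typedef int vfs_mount_t(struct mount *mp);
typedef int vfs_unmount_t(struct mount *mp);
typedef int vfs_root_t(struct mount *mp);
typedef int vfs_start_t(struct mount *mp);
typedef int vfs_sync_t(struct mount *mp);

struct vfsops {
	vfs_mount_t	*vfs_mount;
	vfs_unmount_t	*vfs_unmount;
	vfs_root_t	*vfs_root;
	vfs_start_t	*vfs_start;
	vfs_sync_t	*vfs_sync;
};

/* Defaults filled in for entries the filesystem left out. */
static vfs_start_t vfs_stdstart;
static vfs_sync_t vfs_stdnosync;

static int vfs_stdstart(struct mount *mp) { (void)mp; return (0); }
static int vfs_stdnosync(struct mount *mp) { (void)mp; return (EOPNOTSUPP); }

/* Registration: check the mandatory entries, default the rest. */
static void
vfs_register_sketch(struct vfsops *ops)
{
	assert(ops->vfs_mount != NULL);		/* KKASSERT in the kernel */
	assert(ops->vfs_unmount != NULL);
	assert(ops->vfs_root != NULL);

	if (ops->vfs_start == NULL)
		ops->vfs_start = vfs_stdstart;
	if (ops->vfs_sync == NULL)
		ops->vfs_sync = vfs_stdnosync;
}

/* A made-up filesystem that only provides the mandatory operations. */
static int demo_mount(struct mount *mp) { (void)mp; return (0); }
static int demo_unmount(struct mount *mp) { (void)mp; return (0); }
static int demo_root(struct mount *mp) { (void)mp; return (0); }

/* C99 designated initialisers: unnamed members start out as NULL. */
static struct vfsops demo_vfsops = {
	.vfs_mount	= demo_mount,
	.vfs_unmount	= demo_unmount,
	.vfs_root	= demo_root,
};
```

After vfs_register_sketch(&demo_vfsops), the vfs_start and vfs_sync slots
point at the defaults, so the filesystem needs no no-op definitions of its
own.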

Revision 1.12: download - view: text, markup, annotated - select for diffs
Wed Feb 2 21:34:18 2005 UTC (9 years, 9 months ago) by joerg
Branches: MAIN
CVS tags: DragonFly_Stable, DragonFly_RELEASE_1_2_Slip, DragonFly_RELEASE_1_2
Diff to: previous 1.11: preferred, unified
Changes since revision 1.11: +0 -3 lines
Don't use the statfs field f_mntonname in filesystems. For the userland
export code, it can be synthesized from mnt_ncp.
For debugging code, use f_mntfromname, it should be enough to find the
culprit. The vfs_unmountall doesn't use code_fullpath to avoid problems
with resource allocation and to make it more likely that a call from ddb
succeeds.
Change getfsstat and fhstatfs to not show directories outside a chroot
path, with the exception of the filesystem counting the chroot root itself.

Revision 1.11: download - view: text, markup, annotated - select for diffs
Fri Dec 17 00:18:36 2004 UTC (9 years, 11 months ago) by dillon
Branches: MAIN
Diff to: previous 1.10: preferred, unified
Changes since revision 1.10: +1 -1 lines
VFS messaging/interfacing work stage 10/99:

Start adding the journaling, range locking, and (very slightly) cache
coherency infrastructure.  Continue cleaning up the VOP operations vector.

Expand on past commits that gave each mount structure its own set of VOP
operations vectors by adding additional vector sets for journaling or
cache coherency operations.  Remove the vv_jops and vv_cops fields
from the vnode operations vector in favor of placing those vop_ops directly
in the mount structure.  Reorganize the VOP calls as a double-indirect
and add a field to the mount structure which represents the current
vnode operations set (which will change when e.g. journaling is turned on
or off).  This creates the infrastructure necessary to allow us to stack
a generic journaling implementation on top of a filesystem.
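
The double-indirect dispatch can be pictured with the following userland
sketch.  All structure layouts and names here (mnt_norm_ops,
mnt_journal_ops, mnt_use_ops, vnode_write() and so on) are hypothetical
simplifications chosen for the example; the point is only that a vnode
refers to a field in its mount, so changing that one field switches the
operations set for every vnode on the mount.

```c
#include <assert.h>
#include <stddef.h>

struct vnode;

/* A one-entry operations vector, enough to show the dispatch. */
struct vop_ops {
	int (*op_write)(struct vnode *vp);
};

struct mount {
	struct vop_ops *mnt_norm_ops;		/* plain filesystem ops */
	struct vop_ops *mnt_journal_ops;	/* journaling layer or NULL */
	struct vop_ops *mnt_use_ops;		/* currently active set */
};

struct vnode {
	struct vop_ops **v_ops;		/* points at mount->mnt_use_ops */
};

/* Distinct return values let the caller see which set was used. */
static int norm_write(struct vnode *vp) { (void)vp; return (1); }
static int journal_write(struct vnode *vp) { (void)vp; return (2); }

/* Attach a vnode to a mount: remember where the active set lives. */
static void
vnode_attach(struct vnode *vp, struct mount *mp)
{
	vp->v_ops = &mp->mnt_use_ops;
}

/* Turning journaling on or off only changes the mount's field. */
static void
mount_set_journaling(struct mount *mp, int on)
{
	if (on && mp->mnt_journal_ops != NULL)
		mp->mnt_use_ops = mp->mnt_journal_ops;
	else
		mp->mnt_use_ops = mp->mnt_norm_ops;
}

/* Dispatch: two pointer hops, vnode -> mount field -> ops vector. */
static int
vnode_write(struct vnode *vp)
{
	assert(vp->v_ops != NULL && *vp->v_ops != NULL);
	return ((*vp->v_ops)->op_write(vp));
}
```

No vnode has to be revisited when journaling is toggled, which is what
makes it practical to stack a generic layer on top of an existing
filesystem.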

Introduce a hard range-locking API for vnodes.   This API will be used by
high level system/vfs calls in order to handle atomicity guarantees.  It is
a prerequisite for: (1) being able to break I/Os up into smaller pieces
for the vm_page list/direct-to-DMA-without-mapping goal, (2) to support
the parallel write operations on a vnode goal, (3) to support the clustered
(remote) cache coherency goal, and (4) to support massive parallelism in
dispatching operations for the upcoming threaded VFS work.

This commit represents only infrastructure and skeleton/API work.

Revision 1.10: download - view: text, markup, annotated - select for diffs
Fri Nov 12 00:09:51 2004 UTC (10 years ago) by dillon
Branches: MAIN
Diff to: previous 1.9: preferred, unified
Changes since revision 1.9: +12 -8 lines
VFS messaging/interfacing work stage 9/99: VFS 'NEW' API WORK.

NOTE: unionfs and nullfs are temporarily broken by this commit.

* Remove the old namecache API.  vfs_cache_lookup(), cache_lookup(),
  cache_enter(), namei() and lookup() are all gone.  VOP_LOOKUP() and
  VOP_CACHEDLOOKUP() have been collapsed into a single non-caching
  VOP_LOOKUP().

* Complete the new VFS CACHE (namecache) API.  The new API is able to
  supply topological guarantees and is able to reserve namespaces,
  including negative cache spaces (whether the target name exists or not),
  which the new API uses to reserve namespace for things like NRENAME
  and NCREATE (and others).

* Complete the new namecache API.  VOP_NRESOLVE, NLOOKUPDOTDOT, NCREATE,
  NMKDIR, NMKNOD, NLINK, NSYMLINK, NWHITEOUT, NRENAME, NRMDIR, NREMOVE.
  These new calls take (typically locked) namecache pointers rather than
  combinations of directory vnodes, file vnodes, and name components.  The
  new calls are *MUCH* simpler in concept and implementation.  For example,
  VOP_RENAME() has 8 arguments while VOP_NRENAME() has only 3 arguments.

  The new namecache API uses the namecache to lock namespaces without having
  to lock the underlying vnodes.  For example, this allows the kernel
  to reserve the target name of a create function trivially.  Namecache
  records are maintained BY THE KERNEL for both positive and negative hits.

  Generally speaking, the kernel layer is now responsible for resolving
  path elements.  NRESOLVE is called when an unresolved namecache record
  needs to be resolved.  Unlike the old VOP_LOOKUP, NRESOLVE is simply
  responsible for associating a vnode to a namecache record (positive hit)
  or telling the system that it's a negative hit, and not responsible for
  handling symlinks or other special cases or doing any of the other
  path lookup work, much unlike the old VOP_LOOKUP.

  It should be particularly noted that the new namecache topology does not
  allow disconnected namecache records.  In rare cases where a vnode must
  be converted to a namecache pointer for new API operation via a file handle
  (i.e. NFS), the cache_fromdvp() function is provided and a new API VOP,
  VOP_NLOOKUPDOTDOT() is provided to allow the namecache to resolve the
  topology leading up to the requested vnode.  These and other topological
  guarantees greatly reduce the complexity of the new namecache API.

  The new namei() is called nlookup().  This function uses a combination
  of cache_n*() calls, VOP_NRESOLVE(), and standard VOP calls to resolve the
  supplied path, deal with symlinks, and so forth, in a nice small compact
  compartmentalized procedure.

* The old VFS code is no longer responsible for maintaining namecache records,
  a function which was mostly ad hoc cache_purge()s occurring before the VFS
  actually knows whether an operation will succeed or not.

  The new VFS code is typically responsible for adjusting the state of
  locked namecache records passed into it.  For example, if NCREATE succeeds
  it must call cache_setvp() to associate the passed namecache record with
  the vnode representing the successfully created file.  The new requirements
  are much less complex than the old requirements.

* Most VFSs still implement the old API calls, albeit somewhat modified
  and in particular the VOP_LOOKUP function is now *MUCH* simpler.  However,
  the kernel now uses the new API calls almost exclusively and relies on
  compatibility code installed in the default ops (vop_compat_*()) to
  convert the new calls to the old calls.

* All kernel system calls and related support functions which used to do
  complex and confusing namei() operations now do far less complex and
  far less confusing nlookup() operations.

* SPECOPS shortcutting has been implemented.  User reads and writes now go
  directly to supporting functions which talk to the device via fileops
  rather than having to be routed through VOP_READ or VOP_WRITE, saving
  significant overhead.  Note, however, that these only really affect
  /dev/null and /dev/zero.

  Implementing this was fairly easy, we now simply pass an optional
  struct file pointer to VOP_OPEN() and let spec_open() handle the
  override.

SPECIAL NOTES: It should be noted that we must still lock a directory vnode
LK_EXCLUSIVE before issuing a VOP_LOOKUP(), even for simple lookups, because
a number of VFS's (including UFS) store active directory scanning information
in the directory vnode.  The legacy NAMEI_LOOKUP cases can be changed to
use LK_SHARED once these VFS cases are fixed.  In particular, we are now
organized well enough to actually be able to do record locking within a
directory for handling NCREATE, NDELETE, and NRENAME situations, but it hasn't
been done yet.

Many thanks to all of the testers and in particular David Rhodus for
finding a large number of panics and other issues.

Revision 1.9: download - view: text, markup, annotated - select for diffs
Tue Oct 12 19:21:10 2004 UTC (10 years, 1 month ago) by dillon
Branches: MAIN
Diff to: previous 1.8: preferred, unified
Changes since revision 1.8: +7 -4 lines
VFS messaging/interfacing work stage 8/99: Major reworking of the vnode
interlock and other miscellaneous things.  This patch also fixes FS
corruption due to prior vfs work in head.  In particular, prior to this
patch the namecache locking could introduce blocking conditions that
confuse the old vnode deactivation and reclamation code paths.  With
this patch there appear to be no serious problems even after two days
of continuous testing.

* VX lock all VOP_CLOSE operations.
* Fix two NFS issues.  There was an incorrect assertion (found by
  David Rhodus), and the nfs_rename() code was not properly
  purging the target file from the cache, resulting in Stale file
  handle errors during, e.g. a buildworld with an NFS-mounted /usr/obj.
* Fix a TTY session issue.  Programs which open("/dev/tty", ...) and
  then run the TIOCNOTTY ioctl were causing the system to lose track
  of the open count, preventing the tty from properly detaching.
  This is actually a very old BSD bug, but it came out of the woodwork
  in DragonFly because I am now attempting to track device opens
  explicitly.
* Gets rid of the vnode interlock.  The lockmgr interlock remains.
* Introduced VX locks, which are mandatory vp->v_lock based locks.
* Rewrites the locking semantics for deactivation and reclamation.
  (A ref'd VX lock'd vnode is now required for vgone(), VOP_INACTIVE,
  and VOP_RECLAIM).  New guarantees emplaced with regard to vnode
  ripouts.
* Recodes the mountlist scanning routines to close timing races.
* Recodes getnewvnode to close timing races (it now returns a
  VX locked and refd vnode rather than a refd but unlocked vnode).
* Recodes VOP_REVOKE- a locked vnode is now mandatory.
* Recodes all VFS inode hash routines to close timing holes.
* Removes cache_leaf_test() - vnodes representing intermediate
  directories are now held so the leaf test should no longer be
  necessary.
* Splits the over-large vfs_subr.c into three additional source
  files, broken down by major function (locking, mount related,
  filesystem syncer).

* Changes splvm() protection to a critical-section in a number of
  places (bleedover from another patch set which is also about to be
  committed).

Known issues not yet resolved:

* Possible vnode/namecache deadlocks.
* While most filesystems now use vp->v_lock, I haven't done a final
  pass to make vp->v_lock mandatory and to clean up the few remaining
  inode based locks (nwfs I think and other obscure filesystems).
* NullFS gets confused when you hit a mount point in the underlying
  filesystem.
* Only UFS and NFS have been well tested
* NFS is not properly timing out namecache entries, causing changes made
  on the server to not be properly detected on the client if the client
  already has a negative-cache hit for the filename in question.

Testing-by: David Rhodus <sdrhodus@gmail.com>,
	    Peter Kadau <peter.kadau@tuebingen.mpg.de>,
	    walt <wa1ter@myrealbox.com>,
	    others

Revision 1.8: download - view: text, markup, annotated - select for diffs
Thu Sep 30 19:00:23 2004 UTC (10 years, 1 month ago) by dillon
Branches: MAIN
Diff to: previous 1.7: preferred, unified
Changes since revision 1.7: +7 -8 lines
VFS messaging/interfacing work stage 7/99.  BEGIN DESTABILIZATION!

Implement the infrastructure required to allow us to begin switching to the
new nlookup() VFS API.

	filedesc->fd_ncdir, fd_nrdir, fd_njdir

	    File descriptors (associated with processes) now record the
	    namecache pointer related to the current directory, root directory,
	    and jail directory, in addition to the vnode pointers.  These
	    pointers are used as the basis for the new path lookup code
	    (nlookup() and friends).

	file->f_ncp

	    File pointers may now have a referenced+unlocked namecache
	    pointer associated with them.  All fp's representing directories
	    have this attached.  This allows fchdir() to properly record
	    the ncp in fdp->fd_ncdir and friends.

	mount->mnt_ncp

	    The namecache topology for crossing a mount point works as
	    follows: when looking up a path element which is a mount point,
	    cache_nlookup() will locate the ncp for the vnode-under the
	    mount point.  mount->mnt_ncp represents the root of the mount,
	    that is the vnode-over.  nlookup() detects the mount point and
	    accesses mount->mnt_ncp to skip past the vnode-under.  When going
	    backwards (..), nlookup() detects the case and skips backwards.

	    The ncp linkages are: ncp->ncp->ncp[vnode_under]->ncp[vnode_over].
	    That is, when going forwards or backwards nlookup must explicitly
	    skip over the double-ncp when crossing a mount point.  This allows
	    us to keep the namecache topology intact across mount points.
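
The forward and backward skips over the double-ncp can be modeled in
userland roughly as below.  This is a sketch of the linkage described in
this commit only; the nc_mounted_here field and the cross_forward() and
cross_dotdot() helpers are hypothetical names, and no locking or
referencing is shown.

```c
#include <assert.h>
#include <stddef.h>

struct mount;

struct namecache {
	struct namecache *nc_parent;
	struct mount *nc_mounted_here;	/* set on the vnode-under record */
	const char *nc_name;
};

struct mount {
	struct namecache *mnt_ncp;	/* root of the mount (vnode-over) */
};

/* Forward: a mount point resolves to the root of whatever is mounted. */
static struct namecache *
cross_forward(struct namecache *ncp)
{
	assert(ncp != NULL);
	while (ncp->nc_mounted_here != NULL)
		ncp = ncp->nc_mounted_here->mnt_ncp;
	return (ncp);
}

/* Backward (".."): from a mount root, skip the vnode-under as well. */
static struct namecache *
cross_dotdot(struct namecache *ncp)
{
	struct namecache *par;

	assert(ncp != NULL);
	/* At the root of a mount, first step back onto the vnode-under. */
	while ((par = ncp->nc_parent) != NULL &&
	    par->nc_mounted_here != NULL &&
	    par->nc_mounted_here->mnt_ncp == ncp)
		ncp = par;
	/* The topmost record has no parent and is its own "..". */
	return (ncp->nc_parent != NULL ? ncp->nc_parent : ncp);
}
```

For a chain root -> under -> over -> file, with a mount on "under" whose
mnt_ncp is "over", cross_forward(under) yields "over" and
cross_dotdot(over) yields "root", so neither direction ever stops on the
covered record.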

NEW CACHE level API functions:

	cache_get()	Reference and lock a namecache entry
	cache_put()	Dereference and unlock a namecache entry
	cache_lock()	lock an already-referenced namecache entry
	cache_unlock()	unlock a locked namecache entry

	    NOTE: namecache locks are exclusive and recursive.  These are
	    the 'namespace' locks that we will be using to guarantee namespace
	    operations such as in a CREATE, RENAME, or REMOVE.

	vfs_cache_setroot() 	Set the new system-wide root directory
	cache_allocroot()   	System bootstrap helper function to allocate
			    	 the root namecache node.

	cache_resolve()		Resolve a NCF_UNRESOLVED namecache node.  The
				namecache node should be locked on call.

	cache_setvp()		(resolver) associate a VP or create a negative
				cache entry representation for a namecache
				pointer and clear NCF_UNRESOLVED.  The
				namecache node should be locked on call.

	cache_setunresolved()	Revert a resolved namecache entry back to an
				unresolved state, disassociating any vnode
				but leaving the topology intact.  The
				namecache node should be locked on call.

	cache_vget()		Obtain the locked+refd vnode related to
				a namecache entry, resolving the entry if
				necessary.  Return ENOENT if the entry
				represents a negative cache hit.

	cache_vref()		Obtain a refd (not locked) vnode related to
				a namecache entry, as above.

	cache_nlookup()		The new namecache lookup routine.  This routine
				does a lookup and allocates a new namecache
				node (into an unresolved state) if necessary.
				Returns a namecache record whether or not
				the item can be found and whether or not it
				represents a positive or negative hit.

	cache_lookup()		OLD API CODE DEPRECATED, but must be maintained
				until everything has been converted over.
	cache_enter()		OLD API CODE DEPRECATED, but must be maintained
				until everything has been converted over.
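
The resolved, unresolved and negative states handled by cache_setvp(),
cache_setunresolved() and cache_vref() can be modeled with a small userland
sketch.  The nc_*() functions below are hypothetical stand-ins, not the
kernel routines: locking is omitted, the NCF_UNRESOLVED value is an
assumption, and unlike the real calls the lookup here expects the entry to
have been resolved already instead of resolving it on demand.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NCF_UNRESOLVED	0x0001	/* flag value is ASSUMED */

struct vnode {
	int v_refs;
};

struct namecache {
	int nc_flag;
	struct vnode *nc_vp;	/* NULL if unresolved or a negative hit */
};

/* Resolver side: attach a vnode, or record a negative hit if vp is NULL. */
static void
nc_setvp(struct namecache *ncp, struct vnode *vp)
{
	assert(ncp->nc_flag & NCF_UNRESOLVED);
	ncp->nc_vp = vp;
	ncp->nc_flag &= ~NCF_UNRESOLVED;
}

/* Forget the association but keep the record (and thus the topology). */
static void
nc_setunresolved(struct namecache *ncp)
{
	ncp->nc_vp = NULL;
	ncp->nc_flag |= NCF_UNRESOLVED;
}

/* Hand back a referenced vnode, or ENOENT for a negative cache hit. */
static int
nc_vref(struct namecache *ncp, struct vnode **vpp)
{
	assert((ncp->nc_flag & NCF_UNRESOLVED) == 0);
	if (ncp->nc_vp == NULL)
		return (ENOENT);
	ncp->nc_vp->v_refs++;
	*vpp = ncp->nc_vp;
	return (0);
}
```

A record that was resolved negatively can later be flipped back to
unresolved and resolved again with a vnode, for example after a create
succeeds on that name.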

NEW default VOPs

	vop_noresolve()		Implements a namecache resolver for VFSs
				which are still using the old VOP_LOOKUP/
				VOP_CACHEDLOOKUP API (which is all of them
				still).

	VOP_LOOKUP		OLD API CODE DEPRECATED, but must be maintained
				until everything has been converted over.
	VOP_CACHEDLOOKUP	OLD API CODE DEPRECATED, but must be maintained
				until everything has been converted over.

NEW PATHNAME LOOKUP CODE

	nlookup_init()		Similar to NDINIT, initialize a nlookupdata
				structure for nlookup() and nlookup_done().

	nlookup()		Lookup a path.  Unlike the old namei/lookup
				code the new lookup code does not do any
				fancy pre-disposition of the cache for
				create/delete, it simply looks up the requested
				path and returns the appropriate locked
				namecache pointer.  The caller can obtain the
				vnode and directory vnode, as applicable, from
				the one namecache structure that is returned.

				Access checks are done on directories leading
				up to the result but not done on the returned
				namecache node.

	nlookup_done()		Mandatory routine to cleanup a nlookupdata
				structure after it has been initialized and
				all operations have been completed on it.

	nlookup_simple()	(in progress) all-in-one wrapped new lookup.

	nlookup_mp()		helper call for resolving a mount point's
				glue NCP.  hackish, will be cleaned up later.

	nreadsymlink()		helper call to resolve a symlink.  Note that
				the namecache does not yet cache symlink data
				but the intention is to eventually do so to
				avoid having to do VFS ops to get the data.

	naccess()		Perform access checks on a namecache node
				given a mode and cred.

	naccess_va()		Perform access checks on a vattr given a
				mode and cred.

Begin switching VFS operations from using namei to using nlookup.
In this batch:

	* mount 	(install mnt_ncp for cross-mount-point handling in
			nlookup, simplify the vfs_mount() API to no longer
			pass a nameidata structure)
	* [l]stat	(use nlookup)
	* [f]chdir	(use nlookup, use recorded f_ncp)
	* [f]chroot	(use nlookup, use recorded f_ncp)

Revision 1.7: download - view: text, markup, annotated - select for diffs
Tue Aug 17 18:57:35 2004 UTC (10 years, 3 months ago) by dillon
Branches: MAIN
CVS tags: DragonFly_Snap29Sep2004, DragonFly_Snap13Sep2004
Diff to: previous 1.6: preferred, unified
Changes since revision 1.6: +5 -1 lines
VFS messaging/interfacing work stage 2/99.  This stage retools the vnode ops
vector dispatch, making the vop_ops a per-mount structure rather than a
per-filesystem structure.  Filesystem mount code, typically in blah_vfsops.c,
must now register various vop_ops pointers in the struct mount to compile
its VOP operations set.

This change will allow us to begin adding per-mount hooks to VFSes to support
things like kernel-level journaling, various forms of cache coherency
management, and so forth.

In addition, the vop_*() calls now require a struct vop_ops pointer as the
first argument instead of a vnode pointer (note: in this commit the VOP_*()
macros currently just pull the vop_ops pointer from the vnode in order to
call the vop_*() procedures).  This change is intended to allow us to divorce
ourselves from the requirement that a vnode pointer always be part of a VOP
call.  In particular, this will allow namespace based routines such as
remove(), mkdir(), stat(), and so forth to pass namecache pointers rather than
locked vnodes and is a very important precursor to the goal of using the
namecache for namespace locking.
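
A minimal userland sketch of the per-mount dispatch idea described above.
Every name here (mock_vop_ops, mock_mount, mock_vnode, mock_vop_getsize(),
MOCK_VOP_GETSIZE(), and the single "getsize" operation) is hypothetical and
chosen for illustration; the real vop_ops vector and macros are not shown in
this log.

```c
#include <assert.h>
#include <stddef.h>

struct mock_vnode;

/* Hypothetical operations vector with a single entry. */
struct mock_vop_ops {
	int	(*vop_getsize)(struct mock_vop_ops *ops, struct mock_vnode *vp);
};

/* Each mount registers its own ops pointer. */
struct mock_mount {
	struct mock_vop_ops	*mnt_ops;
};

struct mock_vnode {
	struct mock_vop_ops	*v_ops;	/* copied from the mount at attach time */
	int			v_size;
};

static void
mock_vnode_attach(struct mock_vnode *vp, struct mock_mount *mp)
{
	vp->v_ops = mp->mnt_ops;
}

/* vop_*() style call: the ops vector is the first argument. */
static int
mock_vop_getsize(struct mock_vop_ops *ops, struct mock_vnode *vp)
{
	return (ops->vop_getsize(ops, vp));
}

/* VOP_*() style macro: pulls the ops vector from the vnode. */
#define MOCK_VOP_GETSIZE(vp)	mock_vop_getsize((vp)->v_ops, (vp))

static int
plain_getsize(struct mock_vop_ops *ops, struct mock_vnode *vp)
{
	(void)ops;
	return (vp->v_size);
}

/* A per-mount hook: counts calls, then behaves like the plain version. */
static int hooked_calls;

static int
hooked_getsize(struct mock_vop_ops *ops, struct mock_vnode *vp)
{
	(void)ops;
	hooked_calls++;
	return (vp->v_size);
}

/* Two mounts of the same "filesystem", only one of them hooked. */
static int
mock_demo(void)
{
	struct mock_vop_ops plain = { plain_getsize };
	struct mock_vop_ops hooked = { hooked_getsize };
	struct mock_mount m1 = { &plain }, m2 = { &hooked };
	struct mock_vnode a = { NULL, 10 }, b = { NULL, 20 };

	mock_vnode_attach(&a, &m1);
	mock_vnode_attach(&b, &m2);
	return (MOCK_VOP_GETSIZE(&a) + MOCK_VOP_GETSIZE(&b));
}
```

Because the vector belongs to the mount, the hook on the second mount does
not affect vnodes on the first, which is the property a per-mount journaling
or cache-coherency layer would rely on.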

Revision 1.6: download - view: text, markup, annotated - select for diffs
Wed May 26 07:45:26 2004 UTC (10 years, 6 months ago) by dillon
Branches: MAIN
CVS tags: DragonFly_1_0_REL, DragonFly_1_0_RC1, DragonFly_1_0A_REL
Diff to: previous 1.5: preferred, unified
Changes since revision 1.5: +1 -1 lines
count_udev() was being called with the wrong argument.

Submitted-by: Hiten Pandya <hmp@backplane.com>

Revision 1.5: download - view: text, markup, annotated - select for diffs
Wed May 19 22:53:06 2004 UTC (10 years, 6 months ago) by dillon
Branches: MAIN
Diff to: previous 1.4: preferred, unified
Changes since revision 1.4: +8 -5 lines
Device layer rollup commit.

* cdevsw_add() is now required.  cdevsw_add() and cdevsw_remove() may specify
  a mask/match indicating the range of supported minor numbers.  Multiple
  cdevsw_add()'s using the same major number, but distinctly different
  ranges, may be issued.  All devices that failed to call cdevsw_add() before
  now do.

* cdevsw_remove() now automatically marks all devices within its supported
  range as being destroyed.

* vnode->v_rdev is no longer resolved when the vnode is created.  Instead,
  only v_udev (a newly added field) is resolved.  v_rdev is resolved when
  the vnode is opened and cleared on the last close.

* A great deal of code was making rather dubious assumptions with regard
  to the validity of devices associated with vnodes, primarily due to
  the persistence of a device structure due to being indexed by (major, minor)
  instead of by (cdevsw, major, minor).  In particular, if you run a program
  which connects to a USB device and then you pull the USB device and plug
  it back in, the vnode subsystem will continue to believe that the device
  is open when, in fact, it isn't (because it was destroyed and recreated).

  In particular, note that all the VFS mount procedures now check devices
  via v_udev instead of v_rdev prior to calling VOP_OPEN(), since v_rdev
  is NULL prior to the first open.

* The disk layer's device interaction has been rewritten.  The disk layer
  (i.e. the slice and disklabel management layer) no longer overloads
  its data onto the device structure representing the underlying physical
  disk.  Instead, the disk layer uses the new cdevsw_add() functionality
  to register its own cdevsw using the underlying device's major number,
  and simply does NOT register the underlying device's cdevsw.  No
  confusion is created because the device hash is now based on
  (cdevsw,major,minor) rather than (major,minor).

  NOTE: This also means that underlying raw disk devices may use the entire
  device minor number instead of having to reserve the bits used by the disk
  layer, and also means that we can (theoretically) stack a fully
  disklabel-supported 'disk' on top of any block device.

* The new reference counting scheme prevents the stale-device problem
  described above by associating a device with a cdevsw and disconnecting
  the device from its cdevsw when the cdevsw is removed.  Additionally, all
  udev2dev() lookups run through the cdevsw mask/match and only successfully
  find devices still associated with an active cdevsw.

* Major work on MFS:  MFS no longer shortcuts vnode and device creation.  It
  now creates a real vnode and a real device and implements real open and
  close VOPs.  Additionally, due to the disk layer changes, MFS is no longer
  limited to 255 mounts.  The new limit is 16 million.  Since MFS creates a
  real device node, mount_mfs will now create a real /dev/mfs<PID> device
  that can be read from userland (e.g. so you can dump an MFS filesystem).

* BUF AND DEVICE STRATEGY changes.  The struct buf contains a b_dev field.
  In order to properly handle stacked devices we now require that the b_dev
  field be initialized before the device strategy routine is called.  This
  required some additional work in various VFS implementations.  To enforce
  this requirement, biodone() now sets b_dev to NODEV.  The new disk layer
  will adjust b_dev before forwarding a request to the actual physical
  device.

* A bug in the ISO CD boot sequence which resulted in a panic has been fixed.

Testing by: lots of people, but David Rhodus found the most egregious bugs.
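
The mask/match registration described in the list above can be illustrated
with a small userland sketch. It assumes that a registration covers every
minor number for which (minor & mask) == match; the log does not spell out
the exact rule. The names mock_cdevsw, mock_cdevsw_add(),
mock_cdevsw_remove() and mock_lookup() are hypothetical and do not reflect
the kernel's actual signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cdevsw registration; not the kernel structure. */
struct mock_cdevsw {
	const char	*name;
	int		major;
	unsigned int	mask;	/* which minor bits are compared */
	unsigned int	match;	/* required value of those bits */
	int		active;	/* cleared when the entry is removed */
};

#define MOCK_MAXSW	8
static struct mock_cdevsw *mock_table[MOCK_MAXSW];

/* Register sw for all minors where (minor & mask) == match. */
static int
mock_cdevsw_add(struct mock_cdevsw *sw, unsigned int mask, unsigned int match)
{
	int i;

	assert((match & ~mask) == 0);	/* match may only use masked bits */
	for (i = 0; i < MOCK_MAXSW; i++) {
		if (mock_table[i] == NULL) {
			sw->mask = mask;
			sw->match = match;
			sw->active = 1;
			mock_table[i] = sw;
			return (0);
		}
	}
	return (-1);	/* table full */
}

/* Removal deactivates the entry so later lookups no longer find it. */
static void
mock_cdevsw_remove(struct mock_cdevsw *sw)
{
	int i;

	sw->active = 0;
	for (i = 0; i < MOCK_MAXSW; i++) {
		if (mock_table[i] == sw)
			mock_table[i] = NULL;
	}
}

/* Lookup runs through mask/match and only returns active entries. */
static struct mock_cdevsw *
mock_lookup(int major, unsigned int minor)
{
	int i;

	for (i = 0; i < MOCK_MAXSW; i++) {
		struct mock_cdevsw *sw = mock_table[i];

		if (sw != NULL && sw->active && sw->major == major &&
		    (minor & sw->mask) == sw->match)
			return (sw);
	}
	return (NULL);
}
```

With mask 0x80, two registrations on the same major can split the minor
space in half (match 0x00 and match 0x80). The sketch does not reject
overlapping ranges, which a real implementation would need to handle.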

Revision 1.4: download - view: text, markup, annotated - select for diffs
Sat Apr 24 04:32:05 2004 UTC (10 years, 7 months ago) by drhodus
Branches: MAIN
Diff to: previous 1.3: preferred, unified
Changes since revision 1.3: +1 -1 lines
Remove the VREF() macro and uses of it.
Remove uses of 0x20 before ^I inside vnode.h

Revision 1.3: download - view: text, markup, annotated - select for diffs
Mon Mar 29 16:38:36 2004 UTC (10 years, 8 months ago) by dillon
Branches: MAIN
Diff to: previous 1.2: preferred, unified
Changes since revision 1.2: +2 -0 lines
UDF was not properly cleaning up getblk'd buffers in the face of error
conditions.  In some places it was assuming that getblk() would not
return a buffer on error, but in fact getblk() generally always returns
a buffer whether an error occurs or not (and always on an I/O error).

Reported-by: David Rhodus <drhodus@crater.dragonflybsd.org>
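
The class of bug described above can be sketched in userland C: a read
routine hands back a buffer even when it reports an error, so the caller
must release it on the error path too. mock_buf, mock_read_block(),
mock_brelse() and mock_load() are hypothetical mocks, not the kernel buffer
cache API, and the error value is an arbitrary placeholder.

```c
#include <assert.h>
#include <stdlib.h>

#define MOCK_EIO	5	/* placeholder error code for the mock */

struct mock_buf {
	int	blkno;
};

static int mock_outstanding;	/* buffers handed out and not yet released */

/*
 * Always hands back a buffer when allocation succeeds, even if the
 * simulated I/O fails (negative block numbers fail in this mock).
 */
static int
mock_read_block(int blkno, struct mock_buf **bpp)
{
	struct mock_buf *bp = malloc(sizeof(*bp));

	*bpp = bp;
	if (bp == NULL)
		return (MOCK_EIO);
	bp->blkno = blkno;
	mock_outstanding++;
	return (blkno < 0 ? MOCK_EIO : 0);
}

static void
mock_brelse(struct mock_buf *bp)
{
	mock_outstanding--;
	free(bp);
}

/* Correct caller: releases the buffer on both success and error paths. */
static int
mock_load(int blkno, int *out)
{
	struct mock_buf *bp = NULL;
	int error;

	error = mock_read_block(blkno, &bp);
	if (error) {
		if (bp != NULL)
			mock_brelse(bp);	/* the step the buggy code skipped */
		return (error);
	}
	*out = bp->blkno;
	mock_brelse(bp);
	return (0);
}
```

After any sequence of mock_load() calls the outstanding count returns to
zero; dropping the release in the error branch would leak one buffer per
failed read.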

Revision 1.2: download - view: text, markup, annotated - select for diffs
Wed Mar 24 17:39:51 2004 UTC (10 years, 8 months ago) by drhodus
Branches: MAIN
Diff to: previous 1.1: preferred, unified
Changes since revision 1.1: +2 -2 lines
Hook in UDF to the build process.

Revision 1.1: download - view: text, markup, annotated - select for diffs
Fri Mar 12 22:38:15 2004 UTC (10 years, 8 months ago) by joerg
Branches: MAIN
Merge the kernel part of UDF support from FreeBSD 5.

This doesn't include the iconv hooks and makes use of M_WAITOK everywhere.
