DragonFly kernel List (threaded) for 2004-01

Re: Background fsck

From:	Matthew Dillon <dillon@xxxxxxxxxxxxxxxxxxxx>
Date:	Mon, 19 Jan 2004 10:49:33 -0800 (PST)

:While it is possible to keep metadata consistent, it's *required* to write
:the journal data synchronously, which will decrease performance. Without
:doing the journal writes synchronously we are at risk of losing
:important metadata updates, and even of losing the chance to bring the
:filesystem into a consistent state.

    You don't have to do meta-data writes synchronously unless you are 
    reusing filesystem blocks that were just recently freed up by a 'rm'.
    In that case you have to sync the removed file's meta-data before
    being able to reuse the blocks.  This actually has a very simple 
    solution... you leave certain bitmap updates in the unwritten meta-data
    log (which itself resides in the buffer cache until written) but you
    do not actually generate dirty buffers for the bitmap updates until
    the journal is processed out of the buffer cache and onto the disk.
    That way the blocks still appear to be allocated in the bitmap and
    will not be reused until the deleted file's meta-data is written out.
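
    The deferred bitmap update described above can be sketched as a toy
    in-memory model.  This is only an illustration under stated
    assumptions, not kernel code: the class BlockBitmap and its methods
    alloc, free and journal_flushed are hypothetical names, and a real
    filesystem would tie journal_flushed to buffer cache I/O completion.

```python
class BlockBitmap:
    """Toy model: block frees stay in the unwritten journal until it is flushed."""

    def __init__(self, nblocks):
        self.allocated = [False] * nblocks  # the bitmap the allocator consults
        self.pending_frees = []             # frees that exist only in the journal

    def alloc(self):
        # Hand out the first block whose bitmap bit is clear.
        for blk, used in enumerate(self.allocated):
            if not used:
                self.allocated[blk] = True
                return blk
        raise RuntimeError("no free blocks")

    def free(self, blk):
        # Record the free in the journal only. The bitmap bit stays set,
        # so the block cannot be handed out again yet.
        self.pending_frees.append(blk)

    def journal_flushed(self):
        # The journal records have reached the disk, so the bitmap may
        # now really be updated and the blocks become reusable.
        for blk in self.pending_frees:
            self.allocated[blk] = False
        self.pending_frees.clear()
```

    In this model a block released by free() is never returned by alloc()
    until journal_flushed() has run, which is the ordering the 'rm' case
    needs.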

    All you have to do is ensure that a particular piece of meta-data
    has reached the journal on the disk platter *before* you initiate
    the random write of the meta-data to its actual place on the disk.
    Since the meta-data can be held in the buffer cache for a long 
    period of time, there is no need to issue synchronous meta-data
    updates... it just becomes a dependency on the block in the buffer
    cache.
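
    A minimal sketch of that ordering rule, assuming each journal record
    gets a sequence number: a dirty meta-data buffer remembers the last
    record that describes it and may be written to its real location only
    once the journal has been flushed at least that far.  Journal,
    MetaBuffer and write_back are hypothetical names for illustration.

```python
class Journal:
    """Toy journal: records get sequence numbers; flush() marks them on disk."""

    def __init__(self):
        self.records = []
        self.flushed_seq = 0  # highest sequence number known to be on disk

    def append(self, record):
        self.records.append(record)
        return len(self.records)  # sequence number of the new record

    def flush(self):
        # Asynchronous in a real system; here it completes immediately.
        self.flushed_seq = len(self.records)


class MetaBuffer:
    """A cached meta-data block that depends on a journal record."""

    def __init__(self, name):
        self.name = name
        self.dirty = False
        self.needs_seq = 0  # journal record that must be on disk first

    def modify(self, journal, record):
        self.needs_seq = journal.append(record)
        self.dirty = True

    def can_write_in_place(self, journal):
        return self.dirty and journal.flushed_seq >= self.needs_seq


def write_back(buffers, journal):
    """Write only the buffers whose journal dependency is satisfied."""
    written = []
    for buf in buffers:
        if buf.can_write_in_place(journal):
            buf.dirty = False
            written.append(buf.name)
    return written
```

    Nothing here is synchronous: a buffer that is not yet eligible simply
    stays dirty in the cache and is picked up by a later write_back pass.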

    Directories require a little more finesse but the same rule basically
    applies.  What you do with directories is actually make the contents
    of the directory itself count as meta-data rather than a file block
    dependency.

:By marking individual cylinder groups dirty or clean, we will have to
:initiate more write operations, which will also slow down filesystem
:operation. It's questionable whether this code would be simpler than
:soft updates, and we would have to maintain it alone (without sharing the
:fixes available in FreeBSD). What's more, with a journal we will face
:increased abrasion on certain parts of the disk (because journals must be
:stored at fixed locations and journal writes are usually rolling), which in
:my opinion is unnecessary when soft updates are available.

    Abrasion?  On a hard disk?  There is no abrasion, but if you are really
    worried you can just make the journal fairly large, like 256MB.  Even
    on a heavily loaded filesystem it would take several minutes, possibly
    even an hour, before the journal rolled over, which makes the whole point
    moot.
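
    As a rough back-of-the-envelope check, the rollover time is just the
    journal size divided by the rate at which meta-data is logged.  The
    two rates below are assumed figures picked for illustration, not
    measurements, and journal_rollover_seconds is a hypothetical helper.

```python
def journal_rollover_seconds(journal_bytes, log_bytes_per_second):
    """Time for a circular journal to wrap around once at a steady log rate."""
    if log_bytes_per_second <= 0:
        raise ValueError("log rate must be positive")
    return journal_bytes / log_bytes_per_second


JOURNAL_BYTES = 256 * 1024 * 1024  # the 256MB journal mentioned above

# Assumed steady meta-data logging rates (illustrative only).
HEAVY_RATE = 1024 * 1024           # 1 MB/s
LIGHT_RATE = 64 * 1024             # 64 KB/s

heavy_minutes = journal_rollover_seconds(JOURNAL_BYTES, HEAVY_RATE) / 60
light_minutes = journal_rollover_seconds(JOURNAL_BYTES, LIGHT_RATE) / 60
```

    With these assumed rates the journal wraps after roughly 4 minutes at
    1 MB/s and roughly 68 minutes at 64 KB/s.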

:Finally, using journalling requires modification of the metadata format,
:which will lead to some problems when users upgrade their systems, so a
:converter might be necessary if DFly finally chooses this approach.

    No it doesn't.  The journal can be stored in a normal fixed-length file
    placed somewhere on the filesystem and chflag'd to be undeletable.
    As long as the file is pre-created, its block numbers are 'fixed' and
    known (because they never change).  Problem solved.
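
    A userland sketch of pre-creating such a file, under several
    assumptions: real zeros are written so the blocks are allocated
    rather than left sparse, and the BSD no-unlink user flag
    (UF_NOUNLINK) stands in for whichever flag is actually intended.
    precreate_journal and mark_undeletable are hypothetical helper names,
    and whether the blocks then stay put depends on the filesystem.

```python
import os
import stat


def precreate_journal(path, size_bytes, chunk=1024 * 1024):
    """Create a fixed-length file filled with zeros so its blocks exist on disk."""
    zeros = bytes(chunk)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        remaining = size_bytes
        while remaining > 0:
            # os.write may write fewer bytes than requested, so count them.
            written = os.write(fd, zeros[:min(chunk, remaining)])
            remaining -= written
        os.fsync(fd)
    finally:
        os.close(fd)


def mark_undeletable(path):
    """Try to set the BSD no-unlink flag; return False where that is unsupported."""
    if not hasattr(os, "chflags"):
        return False
    try:
        flags = os.stat(path).st_flags | stat.UF_NOUNLINK
        os.chflags(path, flags)
    except (OSError, AttributeError):
        return False
    return True
```

    For example, precreate_journal("/journal", 256 * 1024 * 1024) followed
    by mark_undeletable("/journal"), where the path is only a placeholder.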

:Considering that the snapshots are usually ephemeral (because they are
:usually used to backup or have a background fsck), I think it might be
:possible to implement the whole SoftUpdates policies in the VFS layer, as
:David Rhodes pointed out in a recent post. Unfortunately, this apparently
:will drastically increase the complexity of the VFS code and it is
:questionable whether this is worth trying.

    No, softupdates' policies are extremely complex.  They are unique to
    softupdates but they require a lot of interaction with lower filesystem
    layers and with the buffer cache.

:The only thing I am worried about with background fsck is that, while we
:can mount a dirty filesystem and run fsck in the background, an incorrect
:reference count on an i-node may make it impossible to remove it before
:the bgfsck is finally done... This will sometimes cause applications
:to crash...

    This shouldn't happen, because an incorrect ref count will simply prevent
    the inode from being reused.  It will not prevent the associated file
    from being removed from a directory.

    But as I said before, softupdates is a very complex piece of software.
    Kirk has been working on it for years and bugs still crop up, and there
    are a lot of side effects that I don't like.

					-Matt
					Matthew Dillon 
					<dillon@xxxxxxxxxxxxx>

Follow-Ups:
- Re: Background fsck
  - From: Jeroen Ruigrok/asmodai
- Re: Background fsck
  - From: Dan Melomedman

References:
- Background fsck
  - From: Joerg Sonnenberger
- Re: Background fsck
  - From: Matthew Dillon
- Re: Background fsck
  - From: Xin LI
