DragonFly kernel List (threaded) for 2004-12

Re: Description of the Journaling topology

From:	Matthew Dillon <dillon@xxxxxxxxxxxxxxxxxxxx>
Date:	Thu, 30 Dec 2004 16:33:41 -0800 (PST)

:I think that there is a basic synchronisation issue in such topology. 
:Due to buffering, delays, etc it is possible that in some cases 
:filesystem will commit changes to the permanent storage before 
:appropriate journaling entry is created, i.e.:
:
:1. App executes unlink("foo").
:2. Kernel sends appropriate VOP to the filesystem and to the journal.
:3. Filesystem commits metadata update, journal entry still sits 
:somewhere in the buffer.
:4. App executes open("foo", O_CREAT).
:5. Kernel sends appropriate VOP to the filesystem and to the journal.
:6. Journaling system commits unlink() entry to the storage.
:7. Filesystem commits metadata update, machine crashes before journal 
:entry for open() is committed.
:
:On reboot, the kernel tries to replay the journal, and as a result the
:already created file foo is lost. The same situation may happen for
:subsequent writes and other operations. Because the journal lags behind
:storage, it is possible that in the case of failure some data already
:written to the storage is lost.
:
:How are you going to address this issue?
:
:-Maxim
    
    Solving this issue requires the filesystem to be aware of the journal's
    existence, which I've mentioned in past posts.  The filesystem would
    have to buffer related disk operations until it gets positive
    confirmation that the related journal entries have been committed.
    This is similar to what softupdates does, but the implementation 
    would not have to be anywhere near as sophisticated.
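
    The buffering rule described above can be sketched as a toy model.
    This is only an illustration under stated assumptions, not DragonFly
    code: the Journal and GatedFilesystem classes, their methods, and the
    sequence-number scheme are all hypothetical names invented for the
    example.  The filesystem tags each held disk operation with the
    sequence number of its journal entry and releases it only once the
    journal reports that number as committed.

```python
from collections import deque


class Journal:
    """Toy journal (hypothetical): entries are buffered, then committed in order."""

    def __init__(self):
        self.next_seq = 0
        self.committed_seq = -1  # highest sequence number known to be durable
        self.pending = deque()   # entries still sitting in the buffer

    def append(self, entry):
        """Buffer an entry and return its sequence number."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending.append((seq, entry))
        return seq

    def commit(self, count=None):
        """Pretend to flush up to `count` buffered entries to stable storage."""
        n = len(self.pending) if count is None else min(count, len(self.pending))
        for _ in range(n):
            seq, _entry = self.pending.popleft()
            self.committed_seq = seq
        return self.committed_seq


class GatedFilesystem:
    """Toy filesystem (hypothetical) that holds disk ops until the journal acks."""

    def __init__(self, journal):
        self.journal = journal
        self.held = deque()  # (journal_seq, disk_op) waiting for confirmation
        self.disk = []       # ops that have reached "permanent storage"

    def submit(self, op):
        """Send the op to the journal and hold the matching disk write."""
        seq = self.journal.append(op)
        self.held.append((seq, op))

    def flush(self):
        """Write out only the ops whose journal entry is already committed."""
        while self.held and self.held[0][0] <= self.journal.committed_seq:
            _seq, op = self.held.popleft()
            self.disk.append(op)


journal = Journal()
fs = GatedFilesystem(journal)
fs.submit("unlink foo")
fs.submit("create foo")
fs.flush()         # nothing is written: no journal entry is durable yet
journal.commit(1)  # only the unlink entry becomes durable
fs.flush()         # so only the unlink may reach the disk
```

    In this model the disk can never get ahead of the journal, so the
    ordering in the quoted scenario (metadata committed while the journal
    entry still sits in a buffer) cannot occur.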

    Barring that, you might not be able to guarantee that an incremental
    playback from the journal would be sufficient to fully recover the
    filesystem.  But even in that case a full restore from backups and full
    playback from the journal would be able to fully recover the
    filesystem up to N seconds prior to the crash.  It would just take longer.
    So the basic property of being able to restore within N seconds can
    still be guaranteed even without a journal-aware filesystem.
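
    The difference between the two recovery paths can be illustrated with
    a small model.  This is a sketch under assumptions, not a description
    of any real replay tool: the operation tuples, the apply() and replay()
    helpers, and the sample history are all made up for the example.  The
    disk has run ahead of the journal (a second write to "a" reached
    storage but its journal entry was lost), so replaying the journal tail
    on top of the crashed disk yields a state that never existed, while a
    restore from backup plus full playback yields the state as of the last
    committed journal entry.

```python
def apply(state, op):
    """Apply one journaled operation to a dict mapping file name -> contents."""
    kind = op[0]
    if kind == "write":
        _, name, data = op
        state[name] = data
    elif kind == "rename":
        _, src, dst = op
        state[dst] = state.pop(src)
    elif kind == "unlink":
        state.pop(op[1], None)
    else:
        raise ValueError(f"unknown op: {kind}")


def replay(base, entries):
    """Return a copy of `base` with the journal entries applied in order."""
    state = dict(base)
    for op in entries:
        apply(state, op)
    return state


# Hypothetical history: write a=v1, rename a->b, write a=v2, crash.
backup = {}  # last full backup, taken before any of the operations
committed_journal = [
    ("write", "a", "v1"),
    ("rename", "a", "b"),
    # the entry for the second write to "a" never reached the journal
]
disk_at_crash = {"b": "v1", "a": "v2"}  # the filesystem got ahead

# Incremental playback: re-apply the journal tail onto the crashed disk.
incremental = replay(disk_at_crash, committed_journal[1:])

# Full recovery: restore the backup, then play back the whole journal.
full = replay(backup, committed_journal)
```

    Here `incremental` ends up with "b" holding the contents of the second
    write, which matches no point in the real history, whereas `full` is
    exactly the filesystem as it stood at the last committed journal entry.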

					-Matt
					Matthew Dillon 
					<dillon@xxxxxxxxxxxxx>

Follow-Ups:
- Re: Description of the Journaling topology
  - From: Maxim Sobolev

References:
- Description of the Journaling topology
  - From: Matthew Dillon
- Re: Description of the Journaling topology
  - From: Martin P. Hellwig
- Re: Description of the Journaling topology
  - From: Matthew Dillon
- Re: Description of the Journaling topology
  - From: Miguel Mendez
- Re: Description of the Journaling topology
  - From: Matthew Dillon
- Re: Description of the Journaling topology
  - From: Maxim Sobolev
