cvs commit: src/sys/kern vfs_journal.c src/sys/sys kern_syscall.h mountctl.h
dillon 2004/12/30 13:41:06 PST
DragonFly src repository
sys/sys kern_syscall.h mountctl.h
Journaling layer work. Lock down the journaling data format and most
of the record-building API.
The journaling data format consists of two layers: a logical stream
abstraction layer and a recursive subrecord layer.
The memory FIFO and worker thread deal only with the logical stream
abstraction layer. Subrecord data is broken down into logical stream
records which the worker thread then writes out to the journal. Space
for a logical stream record is 'reserved' and then filled in by the
journaling operation. Other threads can reserve their own space in the
memory FIFO, even if earlier reservations have not yet been committed.
The worker thread will only write out completed records and it currently
does so in sequential order, so the worker thread itself may stall
temporarily if the next reservation in the FIFO has not yet been completed.
(this will probably have to be changed in the future but for now it's the
easiest solution, allowing for some parallelism without creating too big
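The reserve/commit behaviour described above can be modelled with a small sketch. This is not the code from vfs_journal.c: the names (mem_fifo, fifo_reserve, fifo_commit, fifo_drain), the fixed slot count, and the integer payload are all hypothetical, and the model is single-threaded, with no locking shown. It only illustrates that reservations may be committed out of order while the worker drains strictly in order and stalls at the first uncommitted slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FIFO_SLOTS 8            /* hypothetical capacity, in records */

struct fifo_slot {
    bool committed;             /* set once the reserving thread has filled it */
    int  payload;               /* stand-in for the stream record contents */
};

struct mem_fifo {
    struct fifo_slot slots[FIFO_SLOTS];
    size_t rindex;              /* next slot the worker will write out */
    size_t windex;              /* next slot handed out to a reserver */
};

/* Reserve the next slot. Returns its index, or -1 if the FIFO is full. */
static int fifo_reserve(struct mem_fifo *f)
{
    if (f->windex - f->rindex == FIFO_SLOTS)
        return -1;
    size_t idx = f->windex++ % FIFO_SLOTS;
    f->slots[idx].committed = false;
    return (int)idx;
}

/* Fill in a previously reserved slot and mark it complete. */
static void fifo_commit(struct mem_fifo *f, int idx, int payload)
{
    assert(idx >= 0 && idx < FIFO_SLOTS);
    f->slots[idx].payload = payload;
    f->slots[idx].committed = true;
}

/*
 * One worker pass: copy out completed records in sequential order and
 * stop at the first reservation that has not been committed yet.
 * Returns the number of records written to out[].
 */
static size_t fifo_drain(struct mem_fifo *f, int *out, size_t max)
{
    size_t n = 0;
    while (f->rindex != f->windex && n < max) {
        struct fifo_slot *s = &f->slots[f->rindex % FIFO_SLOTS];
        if (!s->committed)
            break;              /* the worker "stalls" here */
        out[n++] = s->payload;
        f->rindex++;
    }
    return n;
}
```

If two slots are reserved and only the second is committed, fifo_drain() returns 0. Once the first is committed as well, both come out in reservation order.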
Each logical stream is a (typically) short-lived entity, usually
encompassing a single VFS operation, but may be made up of multiple
stream records. The stream records contain a stream id and bits specifying
whether the record is beginning a new logical stream, in the middle
somewhere, or ending a logical stream. Small transactions may be able
to fit in a single record in which case multiple bits may be set.
Very large transactions, for example when someone does a write(... 10MB),
are fully supported and would likely generate a large number of stream
records. Such transactions would not necessarily stall other operations
from other processes, however, since they would be broken up into smaller
pieces for output to the journal.
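As a rough illustration of how one transaction might map onto stream records, the sketch below splits a payload of a given size into record headers and sets the begin/end bits. The header layout, the flag names SREC_BEGIN and SREC_END, and the 4096-byte payload limit are assumptions made for the example, not the definitions in mountctl.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SREC_BEGIN        0x1u  /* hypothetical: record starts a logical stream */
#define SREC_END          0x2u  /* hypothetical: record ends a logical stream */
#define SREC_MAX_PAYLOAD  4096u /* hypothetical per-record payload limit */

struct stream_rec_hdr {
    uint16_t stream_id;         /* which logical stream the record belongs to */
    uint16_t flags;             /* SREC_BEGIN and/or SREC_END; 0 means middle */
    uint32_t payload_bytes;     /* payload carried by this record */
};

/*
 * Fill hdrs[] with the headers needed to carry 'total' payload bytes for
 * one logical stream. Returns the number of records used, or 0 if
 * max_hdrs is too small.
 */
static size_t stream_split(uint16_t stream_id, size_t total,
                           struct stream_rec_hdr *hdrs, size_t max_hdrs)
{
    size_t n = 0, off = 0;

    do {
        size_t chunk = total - off;
        if (chunk > SREC_MAX_PAYLOAD)
            chunk = SREC_MAX_PAYLOAD;
        if (n == max_hdrs)
            return 0;
        hdrs[n].stream_id = stream_id;
        hdrs[n].flags = 0;
        if (off == 0)
            hdrs[n].flags |= SREC_BEGIN;    /* first record of the stream */
        if (off + chunk == total)
            hdrs[n].flags |= SREC_END;      /* last record of the stream */
        hdrs[n].payload_bytes = (uint32_t)chunk;
        off += chunk;
        n++;
    } while (off < total);

    return n;
}
```

A payload that fits in one record produces a single header with both bits set. A larger payload produces a BEGIN record, zero or more middle records with no bits set, and an END record.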
The stream layer serves to multiplex individual logical streams onto
the memory FIFO and out the journaling stream descriptor.
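Seen from the reading side, multiplexing means that records from different logical streams may arrive interleaved and have to be sorted back out by stream id. The sketch below is one way a consumer could track that. It reuses the hypothetical header and flag names from above (none of them taken from mountctl.h) and assumes stream ids below a small fixed bound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SREC_BEGIN         0x1u /* hypothetical flag: starts a stream */
#define SREC_END           0x2u /* hypothetical flag: ends a stream */
#define DEMUX_MAX_STREAMS  16   /* hypothetical bound on live stream ids */

struct stream_rec_hdr {
    uint16_t stream_id;
    uint16_t flags;
    uint32_t payload_bytes;
};

struct demux_state {
    int      active[DEMUX_MAX_STREAMS]; /* stream currently open? */
    uint32_t total[DEMUX_MAX_STREAMS];  /* payload bytes seen so far */
};

/*
 * Feed one record from the interleaved journal stream.
 * Returns 1 when the record completes its logical stream (the stream's
 * total payload size is stored in *total_out), 0 when the stream is
 * still in progress, and -1 if the record is out of sequence.
 */
static int demux_feed(struct demux_state *st,
                      const struct stream_rec_hdr *r, uint32_t *total_out)
{
    uint16_t id = r->stream_id;

    if (id >= DEMUX_MAX_STREAMS)
        return -1;
    if (r->flags & SREC_BEGIN) {
        if (st->active[id])
            return -1;          /* BEGIN for a stream that is already open */
        st->active[id] = 1;
        st->total[id] = 0;
    } else if (!st->active[id]) {
        return -1;              /* middle/end record without a BEGIN */
    }
    st->total[id] += r->payload_bytes;
    if (r->flags & SREC_END) {
        st->active[id] = 0;
        *total_out = st->total[id];
        return 1;
    }
    return 0;
}
```

A short single-record stream can complete while a longer stream is still open, which is the property that keeps a large transaction from blocking small ones.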
The recursive subrecord layer identifies the transaction as well as any
other necessary data, including UNDO data if the journal is reversible.
A single transaction may contain several sub-records identifying the bits
making up the transaction (for example, a 'mkdir' transaction would need
a subrecord identifying the userid, groupid, file modes, and path).
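To make the nesting concrete, here is a minimal sketch of a 'mkdir' transaction built as one container subrecord holding four leaf subrecords. The type codes, the header layout, and the helpers (subrec_push, subrec_pop, subrec_leaf, build_mkdir) are hypothetical and do not reflect the actual record-building API. Alignment and buffer flushing are ignored, so every size is known by the time the record is closed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subrecord type codes. */
enum { SUB_MKDIR = 1, SUB_UID, SUB_GID, SUB_MODE, SUB_PATH };

struct subrec_hdr {
    uint16_t type;
    uint16_t reserved;
    uint32_t size;              /* header + payload, nested records included */
};

struct subrec_buf {
    uint8_t data[256];
    size_t  len;
};

/* Open a subrecord; returns the offset of its header for a later pop. */
static size_t subrec_push(struct subrec_buf *b, uint16_t type)
{
    struct subrec_hdr h = { type, 0, 0 };   /* size is patched in on pop */
    size_t off = b->len;

    assert(off + sizeof(h) <= sizeof(b->data));
    memcpy(b->data + off, &h, sizeof(h));
    b->len += sizeof(h);
    return off;
}

/* Close the subrecord opened at 'off' by patching its size field. */
static void subrec_pop(struct subrec_buf *b, size_t off)
{
    uint32_t size = (uint32_t)(b->len - off);

    memcpy(b->data + off + offsetof(struct subrec_hdr, size),
           &size, sizeof(size));
}

/* Append a leaf subrecord carrying 'bytes' of payload. */
static void subrec_leaf(struct subrec_buf *b, uint16_t type,
                        const void *payload, size_t bytes)
{
    size_t off = subrec_push(b, type);

    assert(b->len + bytes <= sizeof(b->data));
    memcpy(b->data + b->len, payload, bytes);
    b->len += bytes;
    subrec_pop(b, off);
}

/* Build a mkdir transaction; returns the total encoded length. */
static size_t build_mkdir(struct subrec_buf *b, uint32_t uid, uint32_t gid,
                          uint32_t mode, const char *path)
{
    size_t top = subrec_push(b, SUB_MKDIR);

    subrec_leaf(b, SUB_UID, &uid, sizeof(uid));
    subrec_leaf(b, SUB_GID, &gid, sizeof(gid));
    subrec_leaf(b, SUB_MODE, &mode, sizeof(mode));
    subrec_leaf(b, SUB_PATH, path, strlen(path) + 1);
    subrec_pop(b, top);
    return b->len;
}
```

Because the container's size is patched in only when it is popped, this sketch requires the whole transaction to stay in the buffer until it is complete.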
The record formats also allow for transactional aborts, even if some of the
data has already been pushed out to the descriptor due to limited buffer
space. And, finally, while the subrecord's header format includes a record
size field, this value may not be known for subrecords representing
recursive 'pushes' since the header may be flushed out to the journal long
before the record is completed. This case is also fully supported.
NOTE: The memory FIFO used to ship data to the worker thread is serialized
by the BGL for the moment, but will eventually be made per-cpu to support
lockless operation under SMP.
Revision  Changes    Path
1.4       +834 -57   src/sys/kern/vfs_journal.c
1.25      +1 -0      src/sys/sys/kern_syscall.h
1.3       +245 -15   src/sys/sys/mountctl.h