DragonFly users List (threaded) for 2006-12
Re: Re: Idle question about multi-core processors
:I tend to agree with this sentiment. BGL aside though, I think that
:the VFS is still largely serial, and filesystems will remain an issue.
: That probably is not a week long project. From my understanding, it
:would probably be easier to do the ZFS port than to untangle the mess
:that is FFS.
I do have a plan for FFS. I plan to touch it as little as possible :-)
The plan is actually to allow higher layers in the kernel to read and
write the buffer cache directly and avoid having to dive into the VFS
(at least for data already cached). Part of that plan has already been
implemented... the buffer cache is now indexed based on byte offset
ranges (offset,size) instead of block numbers (lblkno,size).
This means that we can theoretically have vn_read() access the buffer
cache *directly*, based on the file offset, without needing any
knowledge of the file block size or the inner workings of the VFS.
I fully intend to do that eventually.
To complete the equation we need a means to tell the VFS to instantiate
a buffer that does not exist, since the VFS is the only entity that
knows what the file block size for that buffer is going to be:
bp = VOP_GETBUFFER(vp, offset)
Higher layers such as vn_read() and vn_write() would make this call
when they cannot find an existing buffer which serves their needs,
and would then read or write the buffer via bread() or bwrite() (etc)
as needed to make it valid.
This would also allow us to get rid of VOP_READ, VOP_WRITE, VOP_GETPAGES,
and VOP_PUTPAGES. Instead we would just iterate a loop with
VOP_GETBUFFER (if the buffer does not already exist) and then perform
the appropriate operations on the returned buffer and be done with it.
And, as a bonus, the buffer cache now *requires* that filesystem buffers
be VMIO'd, meaning that they are backed by the VM page cache. This
means that we can rely on the buffer's b_pages array of VM pages and
in many cases not even have to map the buffer into kernel virtual memory
(which is the single biggest time cost for current buffer cache
operations).
I don't plan on doing any more work on this until next year, because a lot
of it is going to have to be tightly integrated into the clustering work.
Only moderate work on FFS itself will be needed. Since FFS operates
on buffers internally anyway, it is more a matter of removing all the
UIO support in favor of the direct buffer cache API, and abstracting
file extensions/ftruncate out into the kernel proper.