DragonFly BSD

Google Summer of Code 2010

DragonFly BSD is participating in the Google Summer of Code program for 2010.

Have a look at our SoC pages from 2008 and 2009 for an overview of prior years' projects. The Projects Page is also a potential source of ideas.

For more details on Google's Summer of Code: Google's SoC page

Note to prospective students: These project proposals are meant to be a first approximation; we're looking forward to your own suggestions (even for completely new directions) and will try to integrate your ideas to make the GSoC project more interesting to all parties. Even when a proposal is very specific about the goals that must be achieved and the path that should be taken, these are always negotiable. Keep in mind that we have tried to limit our proposals to those that (based on our past experience) are appropriate for the GSoC program.

Legend:

Project ideas

VFS Quota System

Meta information:


HAMMER Data dedup

The HAMMER filesystem is very efficient at sharing data between its fine-grained snapshots, but when you copy (or otherwise duplicate) a file or directory tree, the data is no longer shared. This is suboptimal: disk space is used poorly, and the same data gets cached multiple times, wasting precious RAM.

The goal of this project is to add a data de-duplication mechanism to the HAMMER filesystem. A reasonable approach would be to detect potential data matches using CRCs during pruning runs. Then you could verify there is actual duplication of data (i.e. the match is not a false positive), collapse the B-Tree data reference and account for the additional reference in the allocation blockmap.
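The detect, verify and collapse steps described above can be illustrated with a minimal userland sketch. Everything here is hypothetical: `struct dd_block`, `dd_init` and `dd_try_merge` are illustrative names, not HAMMER structures, and the `refs` field merely stands in for the accounting that the allocation blockmap would do. The CRC is a plain bitwise CRC-32, chosen only for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a data block plus its reference count. */
struct dd_block {
    const uint8_t *data;
    size_t         len;
    uint32_t       crc;   /* cached checksum used to find candidates */
    int            refs;  /* stands in for blockmap reference accounting */
};

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). */
uint32_t dd_crc32(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

void dd_init(struct dd_block *b, const uint8_t *data, size_t len)
{
    b->data = data;
    b->len = len;
    b->crc = dd_crc32(data, len);
    b->refs = 1;
}

/*
 * Try to fold 'dup' into 'keep'.
 * Returns 1 if the blocks were identical and have been merged, 0 otherwise.
 */
int dd_try_merge(struct dd_block *keep, struct dd_block *dup)
{
    assert(keep != dup);

    /* Step 1: cheap candidate detection via length and CRC. */
    if (keep->len != dup->len || keep->crc != dup->crc)
        return 0;

    /* Step 2: verify byte for byte, since a CRC match may be a false positive. */
    if (memcmp(keep->data, dup->data, keep->len) != 0)
        return 0;

    /* Step 3: collapse the reference and account for the extra sharer. */
    keep->refs += dup->refs;
    dup->refs = 0;
    dup->data = keep->data;
    return 1;
}
```

The byte-for-byte comparison in step 2 is what makes the CRC safe to use as a filter: the checksum only narrows down the candidates, it never decides on its own that two blocks are equal.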

Meta information:


Implement i386 32-bit ABI for x86_64 64-bit kernel

The idea here is to support the execution of 32-bit DragonFly binaries in 64-bit DragonFly environments, something numerous other operating systems have done. Several things must be done to support this. First, the appropriate control bits must be set so that usermode executes in 32-bit compatibility mode instead of 64-bit mode. Second, when a system call is made from 32-bit mode, a translation layer is needed to translate the system call into its 64-bit equivalent within the kernel. Third, the signal handler and trampoline code need to operate on the 32-bit signal frame. Fourth, the 32-bit and 64-bit ELF loaders both have to be in the kernel at the same time, which may require some reworking of procedure names and include files, since the source was originally designed to be one or the other.

There are several hundred system calls which translates to a great deal of 'grunt work' when it comes time to actually do all the translations.
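To give a feel for what the translation layer involves, here is a minimal sketch of converting two argument structures from a 32-bit layout to a 64-bit one. The type and function names (`timeval32`, `iovec32`, `tv32_to_tv64`, `iov32_to_iov64`) are hypothetical and the layouts are simplified stand-ins, not DragonFly's actual definitions; a real implementation would also have to copy the data in from user space first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layouts as a 32-bit process would pass them. */
struct timeval32 { int32_t  tv_sec;   int32_t  tv_usec; };
struct iovec32   { uint32_t iov_base; uint32_t iov_len; };

/* Simplified stand-ins for the layouts the 64-bit kernel works with. */
struct timeval64 { int64_t  tv_sec;   int64_t  tv_usec; };
struct iovec64   { uint64_t iov_base; uint64_t iov_len; };

/* Signed fields are sign-extended so that negative values survive. */
void tv32_to_tv64(const struct timeval32 *in, struct timeval64 *out)
{
    assert(in != 0 && out != 0);
    out->tv_sec  = in->tv_sec;
    out->tv_usec = in->tv_usec;
}

/* User pointers and lengths are unsigned and must be zero-extended. */
void iov32_to_iov64(const struct iovec32 *in, struct iovec64 *out)
{
    assert(in != 0 && out != 0);
    out->iov_base = in->iov_base;
    out->iov_len  = in->iov_len;
}
```

Most of the 'grunt work' consists of writing small converters like these for every system call whose arguments contain pointers, `long` values or structures whose size differs between the two ABIs.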

Meta information:


Implement ARC algorithm extension for the vnode free list

Meta information:


Implement swapoff

Meta information:


Graphics Kernel Memory Manager Support (GEM)

Meta information:


Make DragonFly NUMA-aware

Meta information:


Volume Management based on NetBSD's port of LVM2

NetBSD reimplemented Linux's device mapper (currently only implementing the linear, zero and error targets; Linux has support for a variety of targets, including crypt, stripe, snap, multipath) as dm(4). Device mapper provides the functionality on which to implement volume management; NetBSD has imported LVM2 (which is GPL), but it is possible to create different tools for volume management (e.g. IBM's EVMS was also built on top of device mapper).

The goal of this project is to port both the kernel code, dm(4), and the LVM2 userspace libraries and tools from NetBSD. If time remains, the student should also implement a proof of concept "stripe" target or, for the more ambitious, a "crypt" target.

A possible roadmap for this project would be:

  1. Port the dm(4) code

    This code uses proplib instead of binary ioctls for communicating with userspace. Either port proplib, or convert the code to use ioctls.

  2. Port the userspace tools

    Integrate the tools in our source tree using a separate vendor branch, as is normally done for contrib software (see development(7)). Make any DragonFlyBSD-specific changes necessary.

  3. (Optional) Implement either a "stripe" target or a "crypt" target.

    The stripe target must be designed with robustness and extensibility in mind, though it is not required to go all the way. It should be flexible enough to allow for different RAID level implementations (at least 0, 1 and 5). Additionally, it should be possible to keep an internal (i.e. part of the volume) log to speed up resyncing and parity checking. Implementing those features would be ideal, but is not required.

    The crypt target must allow for different ciphers and cipher parameters and should make use of our in-kernel crypto infrastructure. It is probably necessary to do the encryption asynchronously which will require extending the current infrastructure.
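As an illustration of the simplest case the stripe target would need to handle (RAID 0), the sketch below maps a logical block number to a member disk and a block on that disk. The names `stripe_loc` and `stripe_map` are hypothetical, and this is not taken from the device mapper code; it only shows the address arithmetic under the assumption of equally sized disks and a fixed chunk size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: which disk, and which block on that disk. */
struct stripe_loc {
    unsigned disk;
    uint64_t block;
};

/*
 * RAID 0 address mapping.
 *   lblock       logical block number within the striped volume
 *   chunk_blocks number of consecutive blocks placed on one disk
 *   ndisks       number of member disks
 */
struct stripe_loc stripe_map(uint64_t lblock, uint64_t chunk_blocks,
                             unsigned ndisks)
{
    struct stripe_loc loc;
    uint64_t chunk, within;

    assert(chunk_blocks > 0 && ndisks > 0);

    chunk  = lblock / chunk_blocks;   /* index of the chunk in the volume */
    within = lblock % chunk_blocks;   /* offset inside that chunk */

    loc.disk  = (unsigned)(chunk % ndisks);            /* round-robin */
    loc.block = (chunk / ndisks) * chunk_blocks + within;
    return loc;
}
```

With a chunk size of 4 blocks and 3 disks, logical block 13 falls into chunk 3, which wraps around to disk 0 as that disk's second chunk, so it maps to block 5 of disk 0. Mirroring and parity levels would need a different mapping plus the resync log mentioned above, which is why the target should be designed with extensibility in mind.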

Meta information:


Port DragonFly to Xen platform

Meta information:


Port valgrind to DragonFlyBSD

Valgrind is a very useful tool on a system like DragonFly that's under heavy development. Since valgrind is very target specific, a student doing the port will have to get acquainted with many low level details of the system libraries and the user<->kernel interface (system calls, signal delivery, threading...). This is a project that should appeal to aspiring systems programmers. Ideally, we would want the port to be usable with vkernel processes, thus enabling complex checking of the core kernel code.

The goal of this project is to port valgrind to the DragonFlyBSD platform so that at least the memcheck tool runs sufficiently well to be useful. This is in itself a challenging task. If time remains, the student should try to get at least a trivial valgrind tool to work on a vkernel process.

Meta information:


Adapt pkgsrc to create a package system with dependency independence

Meta information:


Implement virtio drivers on DragonFly to speed up DragonFly as a KVM guest

As virtualization becomes more widespread and KVM looks set to be a strong player in that field, we want DragonFly to have top-notch support for this virtualization platform. For this purpose, we'd like to have virtio-based implementations of a paravirtualized disk driver and a paravirtualized network driver. virtio is an abstraction of a ring buffer that is shared between the host and the guest. On top of this abstraction, one can build a variety of paravirtualized devices, as specified in virtio-spec.

The goal of this project is to create a virtio-ring implementation and then to implement drivers for the network and block devices described in the specification mentioned above. This is a great project for a student who wants to get experience writing (real-world, high-performance) device drivers without having to deal with the quirks of real hardware.
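The idea of a ring buffer shared between a producer and a consumer can be sketched as follows. This is deliberately not the layout defined in virtio-spec: `struct ring`, `ring_push` and `ring_pop` are hypothetical names, both sides live in one address space, and the memory barriers and notification mechanism that a real guest/host ring requires are omitted.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u   /* must be a power of two */

/* Hypothetical descriptor: a buffer address and its length. */
struct ring_desc {
    uint64_t addr;
    uint32_t len;
};

struct ring {
    struct ring_desc desc[RING_SIZE];
    uint16_t head;   /* free-running index, advanced by the producer */
    uint16_t tail;   /* free-running index, advanced by the consumer */
};

/* Producer side: returns 0 on success, -1 if the ring is full. */
int ring_push(struct ring *r, uint64_t addr, uint32_t len)
{
    assert(r != 0);
    if ((uint16_t)(r->head - r->tail) == RING_SIZE)
        return -1;
    r->desc[r->head % RING_SIZE].addr = addr;
    r->desc[r->head % RING_SIZE].len  = len;
    /* A real shared ring needs a write barrier before publishing. */
    r->head++;
    return 0;
}

/* Consumer side: returns 0 on success, -1 if the ring is empty. */
int ring_pop(struct ring *r, struct ring_desc *out)
{
    assert(r != 0 && out != 0);
    if (r->head == r->tail)
        return -1;
    *out = r->desc[r->tail % RING_SIZE];
    r->tail++;
    return 0;
}
```

Because each index is written by only one side, producer and consumer never contend on the same field. The indices are free-running 16-bit counters, which works because 65536 is a multiple of the ring size.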

Meta information:


Port FUSE or PUFFS from FreeBSD/NetBSD

Meta information:


Make vkernels checkpointable

Meta information:


HAMMER compression

Doing compression would require flagging the data record as compressed. It would also require double-buffering: the buffer cache buffer associated with the uncompressed data might have holes in it and might be referenced by user programs, so it cannot serve as a buffer for in-place compression or decompression.

The direct read / direct write mechanism would almost certainly have to be disabled for compressed buffers, and the small-data zone would probably have to be used (the large-data zone is designed only for use with 16K or 64K buffers).

Meta information:


Port usb4bsd

Meta information:


Userland System V Shared Memory / Semaphore / Message Queue implementation

Meta information:


Update our interrupt routing and PCI code

Meta information:


Proportional RSS

The Resident Set Size (RSS) displayed by top keeps track of the number of resident pages in a given process's address space. It is very useful for locating memory hogs, but it doesn't take page sharing into account. For example, if N processes map library L and L's resident pages total 1GB, this 1GB is added to the RSS of all N processes. A more useful number would be the Proportional (or Effective) RSS, in which each resident page is counted as one page divided by the number of processes sharing it. So in the previous example we would add 1GB/N to each process that has L mapped.

The goal of this project is to hack the kernel to allow for effective calculation of the Proportional RSS and modify top to use it in addition to the RSS (i.e. it should display it by default and be able to sort based on it).
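The calculation itself is simple once the kernel can report how many processes share each resident page. The sketch below assumes that information is already available as an array with one sharer count per resident page; the function name `proportional_rss` and its parameters are hypothetical and do not correspond to an existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Proportional RSS in bytes for one process.
 *   sharers[i]  number of processes mapping resident page i (at least 1,
 *               since the process itself maps it)
 *   npages      number of resident pages in the process
 *   page_size   size of one page in bytes
 */
double proportional_rss(const unsigned *sharers, size_t npages,
                        size_t page_size)
{
    double total = 0.0;

    for (size_t i = 0; i < npages; i++) {
        assert(sharers[i] > 0);
        /* Each page is charged in equal parts to all of its sharers. */
        total += (double)page_size / sharers[i];
    }
    return total;
}
```

For a process with two private pages and two pages shared among four processes, with 4096-byte pages, the plain RSS is 16384 bytes while the proportional RSS is 4096 + 4096 + 1024 + 1024 = 10240 bytes. The hard part of the project is obtaining the per-page sharer counts cheaply inside the kernel, not the arithmetic.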

Meta information:


(please add)