DragonFly kernel List (threaded) for 2010-02
kernel work week of 3-Feb-2010 HEADS UP (interleaved swap test)
The latest commit adds write clustering and some additional features
to the swapcache, plus a manual page: swapcache(8).
The write clustering significantly reduces the IOPS rate for writes to
the swapcache and appears to improve the SSD's performance (presumably
it has an easier time write-combining and erasing).
Just for the hell of it I set up two 40G Intel SSDs as 2x interleaved
swap and ran a non-parallel linear file read test on 10G worth of files
after they got cached by swapcache. I was able to achieve around
300MB/sec:
cat test* | dd of=/dev/null bs=32k
10066329600 bytes transferred in 32.916703 secs (305812207 bytes/sec)
10066329600 bytes transferred in 32.879632 secs (306157003 bytes/sec)
10066329600 bytes transferred in 32.867684 secs (306268298 bytes/sec)
10066329600 bytes transferred in 32.779923 secs (307088263 bytes/sec)
10066329600 bytes transferred in 32.837278 secs (306551890 bytes/sec)
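For anyone who wants to double-check the arithmetic on those five runs, here is a small sketch that recomputes the per-run rate and the average from the byte count and elapsed times dd printed. The function and variable names are mine, not part of any tool; the numbers are copied straight from the output above.

```python
# Recompute throughput from the dd output quoted above.
# All names here are illustrative; only the figures come from the runs.

TOTAL_BYTES = 10066329600  # bytes transferred per run (exactly 9600 MiB)
ELAPSED_SECS = [32.916703, 32.879632, 32.867684, 32.779923, 32.837278]


def rates(total_bytes, elapsed_secs):
    """Return bytes/sec for each run."""
    return [total_bytes / secs for secs in elapsed_secs]


def mean(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)


RATES = rates(TOTAL_BYTES, ELAPSED_SECS)
MEAN_RATE = mean(RATES)

if __name__ == "__main__":
    for secs, rate in zip(ELAPSED_SECS, RATES):
        print(f"{secs:.6f} secs -> {rate / 1e6:.1f} MB/s")
    # Report both decimal megabytes and binary mebibytes, since the two
    # differ by about 5% and are easy to mix up.
    print(f"mean: {MEAN_RATE / 1e6:.1f} MB/s ({MEAN_RATE / 2**20:.1f} MiB/s)")
```

The run-to-run spread is well under 1%, so the single-stream number looks quite stable.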
I found it difficult to keep both SSDs maxed out. They were typically
80-100% busy. There are some practical limitations related
to how the cluster read-ahead works. It does synchronous BMAP operations
which the SSD doesn't prioritize over prior reads still in progress,
so there are little spots of serialization that reduce performance.
I tried splitting the file set up into two or three concurrent reads
going at once and that did saturate the SSDs (95-100% busy), but
overall performance actually dropped a little, down to 250MB/sec
in aggregate. My guess is that the SSD optimizes for linear reads,
so the more fragmented requests from the concurrent file operations
weren't as optimal inside the SSD.
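A rough way to reproduce the concurrent variant of the test is sketched below. This is not the procedure used for the numbers above (those came from shell commands); it is a hypothetical Python harness that splits a file list into N streams, reads each stream sequentially in its own thread with 32k reads, and reports the aggregate rate. The round-robin split and all function names are assumptions for illustration.

```python
# Hypothetical harness: read a set of files as N concurrent sequential
# streams and measure aggregate throughput. Not the original test method.
import glob
import threading
import time


def drain(paths, block_size=32 * 1024):
    """Read each file in order, discard the data, return bytes read."""
    total = 0
    for path in paths:
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(block_size)
                if not chunk:
                    break
                total += len(chunk)
    return total


def split_round_robin(paths, streams):
    """Deal the files out to `streams` groups (an assumed split policy)."""
    return [paths[i::streams] for i in range(streams)]


def timed_concurrent_read(paths, streams):
    """Return (total_bytes, elapsed_secs) for `streams` parallel readers."""
    groups = split_round_robin(sorted(paths), streams)
    counts = [0] * streams

    def worker(index):
        counts[index] = drain(groups[index])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(streams)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counts), time.monotonic() - start


if __name__ == "__main__":
    files = glob.glob("test*")  # same file naming as the cat test above
    for streams in (1, 2, 3):
        nbytes, secs = timed_concurrent_read(files, streams)
        if secs > 0:
            print(f"{streams} stream(s): {nbytes / secs / 1e6:.1f} MB/s aggregate")
```

Note that repeated passes over the same files may be served from the buffer cache rather than swapcache unless the data set is larger than RAM, so results from a harness like this are only comparable to the figures above under that condition.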
I also did some testing of the OCZ 120G Colossus. A single Intel 40G
is able to do 180-200MB/sec or so reading. The OCZ doesn't do
as well despite being advertised as having 128M of RAM cache. Its
performance varies between 130MB/s and 220MB/s with an average of
around 170MB/s. A key thing to note with the OCZ is that it does
not do NCQ while the Intel does. I was a bit surprised, actually.
I fully expected the OCZ to negotiate command queueing.