DragonFly BSD
DragonFly kernel List (threaded) for 2013-08

Re: [GSOC] HAMMER2 compression feature week7 report


From: "Samuel J. Greear" <sjg@xxxxxxxxxxxx>
Date: Sun, 4 Aug 2013 14:10:37 -0600

Nice work!

Regarding the performance chart and testing so far: it's good to know that
the CPU overhead is well-bounded, and these small tests were useful for
simply making sure everything works, but I wouldn't spend much, if any,
time on this type of testing going forward. These microbenchmarks only
show cached performance, so the compressed numbers will basically always
look like a net loss here (though it looks like a small one, which is
good). The numbers of real interest are going to be the performance of
uncached benchmarks, i.e. benchmarks that cause a lot of real disk I/O. As
you make it stable I would move on to things like fsstress, blogbench,
bonnie, etc.

If the code is stable enough, I would be interested to hear what the
performance delta is between a pair of runs of dd if=/dev/zero bs=64k
count=5000 or similar (as long as it's much bigger than RAM) with
zero-compression on vs. off. In theory it should look similar to the delta
between cached and uncached I/O.

Sam

On Sun, Aug 4, 2013 at 1:55 PM, Daniel Flores <daniel5555@gmail.com> wrote:

> Hello everyone,
> here is my report for week 7.
>
> This week I had to create a new VM for DragonFly. The new VM has
> different settings and runs faster than the previous one. Since all my
> work and tests will now be done on the new VM, it won't be possible to
> directly compare new results with the results obtained in previous
> tests on the old VM.
>
> Now, as for the work done this week: the code was cleaned up
> significantly and optimized a bit as well. This mostly affected the
> write path, since most of the new code is there. More specifically, the
> write path now looks like this:
>
> The hammer2_write_file() function contains all the code that is shared
> among the 3 possible options for the write path – no compression,
> zero-checking and LZ4 compression. At the point where the paths start
> to differ depending on the selected option, it simply determines the
> option and calls one of 3 functions: hammer2_compress_and_write()
> (LZ4 compression), hammer2_zero_check_and_write() (zero-checking
> option) or hammer2_just_write() (no compression or zero-checking).
> Those functions do everything necessary to finish the write path.
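>
> To help picture the structure, here is a simplified, self-contained
> sketch of that dispatch (only the three function names above are real;
> the enum, signatures and arguments are simplified for illustration):
>
>     /* Illustrative only: signatures and constants are made up. */
>     void hammer2_compress_and_write(const void *data, int bytes);
>     void hammer2_zero_check_and_write(const void *data, int bytes);
>     void hammer2_just_write(const void *data, int bytes);
>
>     enum h2_comp_mode { H2_COMP_NONE, H2_COMP_ZERO_CHECK, H2_COMP_LZ4 };
>
>     static void
>     write_logical_block(enum h2_comp_mode mode, const void *data, int bytes)
>     {
>             switch (mode) {
>             case H2_COMP_LZ4:
>                     /* try LZ4, fall back to the raw block on failure */
>                     hammer2_compress_and_write(data, bytes);
>                     break;
>             case H2_COMP_ZERO_CHECK:
>                     /* only detect all-zero blocks, never compress */
>                     hammer2_zero_check_and_write(data, bytes);
>                     break;
>             default:
>                     /* plain write, no compression or zero detection */
>                     hammer2_just_write(data, bytes);
>                     break;
>             }
>     }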
>
> hammer2_just_write() mostly contains the code that was previously at
> the end of the hammer2_write_file() function.
>
> hammer2_zero_check_and_write() is a very simple function that checks
> whether the block to be written contains only zeros, using a function
> called not_zero_filled_block(), and, if necessary, calls another
> function called zero_check() that deals with the zero-filled block. If
> the block is not zero-filled, the function calls hammer2_just_write().
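>
> A minimal sketch of the zero check itself (the real
> not_zero_filled_block() in the branch may differ in details; this just
> shows the idea of scanning the logical block word by word):
>
>     #include <stddef.h>
>     #include <stdint.h>
>
>     /*
>      * Returns non-zero if the buffer contains any non-zero byte.
>      * Assumes the buffer size is a multiple of 8 bytes, which holds
>      * for the logical block sizes used here.
>      */
>     static int
>     not_zero_filled_block(const void *data, size_t bytes)
>     {
>             const uint64_t *p = data;
>             size_t i;
>
>             for (i = 0; i < bytes / sizeof(*p); ++i) {
>                     if (p[i] != 0)
>                             return (1);
>             }
>             return (0);
>     }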
>
> hammer2_compress_and_write() is the most complex function. It performs
> the compression and then writes the block: the compressed version if
> the compression was successful, and the original version if it wasn't.
> It also uses not_zero_filled_block() and zero_check() for the
> zero-filled block case.
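>
> In case it helps, here is a simplified userland sketch of the
> "compress, else write the original" decision, using the stock liblz4
> API (the kernel code uses its own bundled LZ4 and also does the actual
> block allocation and write, which is omitted here):
>
>     #include <lz4.h>
>
>     /*
>      * Decide which buffer to write: the LZ4-compressed copy in
>      * 'scratch' if compression actually saved space, otherwise the
>      * original block.  The real code may additionally require the
>      * result to fit a smaller block size before storing it compressed.
>      */
>     static const char *
>     pick_write_buffer(const char *block, int block_size,
>                       char *scratch, int scratch_size, int *out_len)
>     {
>             int clen = LZ4_compress_default(block, scratch, block_size,
>                                             scratch_size);
>
>             if (clen > 0 && clen < block_size) {
>                     *out_len = clen;        /* compression paid off */
>                     return (scratch);
>             }
>             *out_len = block_size;          /* incompressible or error */
>             return (block);
>     }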
>
> There are also small improvements; for example, we now use
> objcache_create() instead of objcache_create_simple().
>
> What I'll do now is exhaustively test the code to ensure that it is
> stable. Right now it is not, because we still have a bug that provokes
> file corruption on reads and a system crash under certain
> circumstances. I'll be working on fixing that next week. There are also
> a couple of enhancements for the write path, such as detecting
> incompressible files and not trying to compress them, on which I'll be
> working as well. I also expect that some other bugs will probably be
> found in the process of testing.
>
> Now a bit on tests... Earlier this week I was asked to test the
> performance on small files. The testing methodology was exactly the
> same as the one I employed in the tests from the previous week's
> report. For testing I used 5 files in total:
>
> 1 .jpg (incompressible) – roughly 62KB in size.
> 1 small log file (perfectly compressible) – 62KB in size.
> 1 .png (incompressible) – roughly 2KB in size.
> 1 very small log file (perfectly compressible) – 2KB in size.
> 1 even smaller log file – 512B in size. I didn't use an incompressible
> file here, because all files of that size or smaller are embedded
> directly into an inode, so their path is the same.
>
> For the group test, the same files were copied 20 times per test.
>
> The results are summarized in this table [1].
>
> Basically, it looks like for such small files there is no detectable
> difference in performance. It should be noted that the average seek
> time on modern hard drives is about 0.009 s, so at this scale other
> factors matter more for performance than the path used. It should also
> be noted that the write path with compression currently tries to
> compress the whole logical block even if the file is smaller than the
> block, but this doesn't seem to affect the performance on the scale of
> a single file.
>
> On the other hand, when the total size is large enough, like 2.5MB in
> this case, the difference starts to become perceptible.
>
> My code is available, as usual, in my leaf repository, branch
> "hammer2_LZ4" [2]. I'd appreciate any comments, feedback and criticism.
>
>
> Daniel
>
> [1]
> http://leaf.dragonflybsd.org/~iostream/performance_table_small_files.html
> [2] git://leaf.dragonflybsd.org/~iostream/dragonfly.git
>
>
