DragonFly BSD
DragonFly kernel List (threaded) for 2013-07

Re: [GSOC] HAMMER2 compression feature week1 report


From: Daniel Flores <daniel5555@xxxxxxxxx>
Date: Sun, 28 Jul 2013 21:08:32 +0200


On Sun, Jul 28, 2013 at 7:27 PM, Radio młodych bandytów <
radiomlodychbandytow@o2.pl> wrote:

> Thanks for the data. I took a quick glance now, later will do some more.
> It appears that the requirement to halve input size is just too much for
> lz4 to notably compress even English text...I suspect that few workloads
> will benefit from LZ4 compression then. There's a need for something
> stronger...
>

I think compression will be beneficial for log files, source code
files, and some types of uncompressed files. I don't think that changing the
algorithm will change things significantly, because we have certain
limitations: we compress only one block at a time and we don't
reference previous blocks at the moment, so the algorithms can't perform
at their full potential.
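
To illustrate the limitation, here is a minimal sketch in Python. It is not
HAMMER2 code, and it uses zlib (DEFLATE) from the standard library as a
stand-in for LZ4. The function names and the block_size parameter are
hypothetical. It compares the total output size when every block is
compressed on its own with the size when one compressor sees all blocks in
order and can reference earlier data.

```python
import os
import zlib

BLOCK_SIZE = 64 * 1024  # 64KB, the block size mentioned in this thread


def compress_per_block(data: bytes, block_size: int = BLOCK_SIZE) -> int:
    """Total compressed size when each block is compressed independently."""
    total = 0
    for off in range(0, len(data), block_size):
        total += len(zlib.compress(data[off:off + block_size]))
    return total


def compress_streamed(data: bytes, block_size: int = BLOCK_SIZE) -> int:
    """Total compressed size when a single compressor processes all blocks.

    Later blocks may reference earlier ones, but only within zlib's
    32KB sliding window.
    """
    comp = zlib.compressobj()
    total = 0
    for off in range(0, len(data), block_size):
        total += len(comp.compress(data[off:off + block_size]))
    total += len(comp.flush())
    return total


if __name__ == "__main__":
    # Assumed test data: one random 4KB chunk repeated 8 times, so the only
    # redundancy is between blocks, never inside a single block.
    data = os.urandom(4096) * 8
    print("per block:", compress_per_block(data, 4096))
    print("streamed: ", compress_streamed(data, 4096))
```

With data like this the independent blocks are essentially incompressible,
while the streamed variant can encode every repeat as back-references.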

But I'll try other algorithms if I have enough time.


> Anyway, my results with LZ4 r97 are slightly different, it managed to
> halve 1 block of book1. Which version do you use?
>

I think I use r97 as well. Are you sure that your compression conditions
are the same as mine? I only try to compress one 64KB block at a time, and I
don't use information from previous blocks or anything else.
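
The test conditions described above could be sketched roughly as follows.
This is only an approximation: zlib stands in for LZ4 r97, "halved" is
assumed to mean "compressed size is at most half of the block size", and
the function name is hypothetical.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # one 64KB block at a time


def halvable_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Return (blocks that compress to at most half their size, total blocks).

    Each block is compressed independently; nothing from previous blocks
    is reused.
    """
    kept = 0
    total = 0
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        total += 1
        # Assumption: a block only counts if it shrinks to <= 50%.
        if len(zlib.compress(block)) * 2 <= len(block):
            kept += 1
    return kept, total
```

Running it over a file such as book1 and comparing the two counts would be
one way to check whether both of us are measuring the same thing.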


> Overall, I have a feeling that a stronger LZ77 would be a better fit.
> Something like https://code.google.com/p/data-shrinker/ (Warning: the
> code is demo quality), Shrinker is the strongest-of-fast.
>

OK, this looks promising. When I was choosing an algorithm for the
compression feature, one of the candidates was DEFLATE, which is based on
LZ77. So I'll definitely consider data-shrinker as an alternative to
LZ4.

Thank you.


Daniel



