
Improve Compression for Level 6+ #6

Open · JimBobSquarePants opened this issue Mar 5, 2021 · 18 comments
Labels: enhancement, help wanted
Milestone: Future

JimBobSquarePants commented Mar 5, 2021

Compression results for the Canterbury Corpus are disappointing. We always appear to be a few percentage points behind the other libraries once we hit level 6, except for kennedy.xls, which compresses far better than the alternatives. At lower compression levels we compare very well.

https://github.com/SixLabors/ZlibStream/blob/196c4730ba637a445e840ed7cfe67297e77b47af/benchmarks.md

According to the Squash Benchmark:

  • Zlib-ng achieves a compression ratio of 3.09 for cp.html at level 6, while we only achieve 2.99.
  • Zlib-ng achieves a compression ratio of 5.05 for kennedy.xls at level 6, while we achieve 5.5.

I had a go at porting the deflate_slow method from there, but that dramatically reduced compression in our sparse benchmarks, down to levels matching compression level 3. I haven't ported across deflate_medium yet to experiment (deflate_quick is currently broken and disabled via compiler conditionals).

@nmoinvaz I'd love to have your insight as to the cause of the difference if you have time.

JimBobSquarePants added the enhancement and help wanted labels Mar 5, 2021
JimBobSquarePants added this to the Future milestone Mar 5, 2021
nmoinvaz commented Mar 5, 2021

I'll try and take a look this weekend and see if there is anything obvious.

JimBobSquarePants commented:

Thanks mate! Really appreciate it!

nmoinvaz commented Mar 6, 2021

That Squash benchmark is from this commit, authored in mid-2015, so it is quite outdated. The first thing I would do when making the comparison is check that the deflate configuration tables being compared match 1-to-1.

Config Tables

ZlibStream

```csharp
private static readonly Config[] ConfigTable = new Config[10]
{
    // good lazy nice chain
    new Config(0, 0, 0, 0, STORED),       // 0
#if USE_QUICK
    new Config(4, 4, 8, 4, QUICK),        // 1
#else
    new Config(4, 4, 8, 4, FAST),         // 1
#endif
    new Config(4, 5, 16, 8, FAST),        // 2
    new Config(4, 6, 32, 32, FAST),       // 3
    new Config(4, 4, 16, 16, SLOW),       // 4
    new Config(8, 16, 32, 32, SLOW),      // 5
    new Config(8, 16, 128, 128, SLOW),    // 6
    new Config(8, 32, 128, 256, SLOW),    // 7
    new Config(32, 128, 258, 1024, SLOW), // 8
    new Config(32, 258, 258, 4096, SLOW), // 9
};
```

Zlib-ng

https://github.com/zlib-ng/zlib-ng/blob/51566828e52803a1355837660ac2afd585ac05e4/deflate.c#L142-L168
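
For context, the four Config fields steer deflate's match search roughly as sketched below. This follows classic zlib's deflate_slow/longest_match behaviour; the names are illustrative stand-ins, not ZlibStream's actual internals.

```csharp
// Illustrative sketch of how Config's fields gate the match search in classic
// zlib; MatchConfig is a stand-in type, not ZlibStream's real one.
internal record MatchConfig(int GoodLength, int MaxLazy, int NiceLength, int MaxChain);

internal static class MatchSearchSketch
{
    // "lazy": only search for a better match while the previous one is short.
    public static bool ShouldSearchAgain(MatchConfig c, int prevLength)
        => prevLength < c.MaxLazy;

    // "chain" caps how many hash-chain links are followed; "good" quarters
    // that effort once a decent match is already in hand.
    public static int EffectiveChainLength(MatchConfig c, int prevLength)
        => prevLength >= c.GoodLength ? c.MaxChain >> 2 : c.MaxChain;

    // "nice": the chain walk stops early once a match this long is found.
    public static bool GoodEnough(MatchConfig c, int matchLength)
        => matchLength >= c.NiceLength;
}
```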

Zlib-ng Results

I've run some results using the zlib-ng commit I am currently on, nmoinvaz/zlib-ng@a6797d8. Using minigzip.exe I checked both with and without the new strategies. To narrow down where any problem might be, it is best to turn off the new strategies and see where things stand.

| File        | Level | Original Size | New Strategies | No New Strategies |
| ----------- | ----- | ------------- | -------------- | ----------------- |
| cp.html     | 6     | 25248         | 8011 (3.15)    | 7979 (3.16)       |
| kennedy.xls | 6     | 1029744       | 217452 (4.73)  | 210216 (4.89)     |

With some of the new strategies, zlib-ng trades compression ratio for speed.

Fast-zlib

For levels 8 & 9, I have a PR, zlib-ng/zlib-ng#828, that uses the fast-zlib method. This is an optimization you might want to consider implementing at some point, as it speeds things up a lot. It is tricky to get right; I had to step through fast-zlib and zlib-ng one symbol at a time to get them to produce the same results.

If you take a look at the very bottom of that PR, you can see speed/compression-size benchmarks for zlib-ng vs. madler, which should give you an idea of how zlib-ng has traded compression for speed.

Hash table

ZlibStream uses the same UpdateHash method as zlib-ng, which is a four-byte integer hash, whereas the traditional zlib implementation uses a rolling hash. These two kinds of hash can produce very different results, within the margin of what you are seeing. One idea might be to roll back to the old rolling hash method and see whether the hashing method is the culprit.
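
To make the difference concrete, here is a minimal sketch of the two styles; the constants are illustrative rather than the exact values used by zlib, zlib-ng, or ZlibStream:

```csharp
internal static class HashSketch
{
    private const int HashBits = 15;                    // illustrative table size
    private const uint HashMask = (1u << HashBits) - 1;
    private const int HashShift = 5;                    // classic zlib ties this to HASH_BITS and MIN_MATCH

    // Traditional zlib: rolling hash, folding in one byte as the window slides.
    public static uint Rolling(uint h, byte next) => ((h << HashShift) ^ next) & HashMask;

    // zlib-ng style: hash a whole 32-bit word at once (multiplicative, Knuth-style).
    public static uint FourByte(uint word) => (word * 2654435761u) >> (32 - HashBits);
}
```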

Cloudflare

Cloudflare zlib does something a bit different with hashing: it sets MINMATCH to 4, which produces better results for web content but not for all types of content. ZlibStream currently has:

```csharp
private const int MINMATCH = 3;
```

I would be interested to see whether you could switch MINMATCH from 3 to 4 dynamically, based on RGB/RGBA content, for your use case.
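
A minimal sketch of that idea, assuming a hypothetical content-type flag (ZlibStream hard-codes MINMATCH as a const today, so it would first need to become a field or parameter):

```csharp
// Hypothetical helper: pick the minimum match length from the content type.
internal static int ChooseMinMatch(bool looksLikePixelData)
    // Four-byte minimum matches tend to line up with RGBA pixel boundaries;
    // three bytes stays the safe default for arbitrary content.
    => looksLikePixelData ? 4 : 3;
```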

Hopefully that is some helpful info.

JimBobSquarePants commented:

Thanks so much for all this information, @nmoinvaz; it's all very useful!

nmoinvaz commented Mar 7, 2021

No problem. This resource might also be a good one. It has helped me every so often.

nmoinvaz commented Mar 9, 2021

You may see some performance improvement if you increase the size of the output buffer to 64 KiB, which I have demonstrated here.

JimBobSquarePants commented:

Oooh I'll give that a shot!

nmoinvaz commented:

You might want to make it configurable, as mentioned in that PR discussion; that way, somebody using it for web server connections can set it to use less memory per connection.
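
A hypothetical shape for such an option (ZlibStream's real options type and property names may differ):

```csharp
public sealed class ZlibOutputOptions // illustrative name
{
    // Default to 64 KiB for throughput; servers with many concurrent
    // connections can lower this to bound per-connection memory.
    public int OutputBufferSize { get; set; } = 64 * 1024;
}
```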

nmoinvaz commented Mar 10, 2021

Also, I have had some internal zlib-ng discussions about bypassing the pending buffer altogether in certain instances. For ZlibStream, the API doesn't appear to allow using FlushMode directly, so you could get rid of the pending buffer altogether, or use the pending buffer as the output buffer. That would easily reduce the number of reads/writes, because it is one less buffer all the data has to pass through.

Edit:
Actually, I was wrong; FlushMode is exposed here and here.
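
A rough illustration of the pass-through idea, with hypothetical names (not ZlibStream's actual internals):

```csharp
using System;
using System.IO;

internal static class PendingBufferSketch
{
    // When nothing forces staging, write compressed bytes straight to the
    // destination instead of copying them through the pending buffer first.
    public static void Emit(ReadOnlySpan<byte> compressed, Stream destination,
                            byte[] pending, ref int pendingCount, bool canBypass)
    {
        if (canBypass)
        {
            destination.Write(compressed); // one copy: encoder -> destination
        }
        else
        {
            compressed.CopyTo(pending.AsSpan(pendingCount)); // encoder -> pending -> destination
            pendingCount += compressed.Length;
        }
    }
}
```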

JimBobSquarePants commented:

Yeah I'm trying to expose all the options to fill the gaps in the MS implementation.

AraHaan commented Mar 11, 2021

Not sure if this will help, but in my zlib I nuked a ton of unused internal methods; that might help performance a little (and if not, it will at least make the code easier to read).

I'm still waiting on PRs for performance and memory-usage reductions in the internal bits of the library, whenever that can get done.

Oh, and by the way, I sort of nuked SupportClass, ZStream, and other internal bits in favour of a single ZlibStream class that all of the internals access, making changes only through it, to try to reduce writes there as well (and to cut down the number of byte arrays).

Also, if a PR uses System.Memory for Span and the like, I don't mind (just be sure to bump the net40 target of the multitarget to the minimum .NET Framework version those NuGet packages support).

(Note: you might want to compare the common history between my repository and this one and carefully apply the changes from mine, so that you don't lose all your work so far.)

Also, on the MINMATCH 3-vs-4 question: why not add a new compression level to control that setting? Something like ZlibCompressionLevel.Level10, which just reuses one of the existing levels for compressing but sets MINMATCH to 4 instead of 3.

The catch is that it is a const, so it cannot be changed at runtime; the const could perhaps be replaced with an internal static property, which might actually fit better.
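
One sketch of that replacement; an instance property (rather than internal static) would keep one stream's setting from leaking into another's:

```csharp
internal sealed partial class Deflater // illustrative class name
{
    // Replaces `private const int MINMATCH = 3;` with a per-instance setting.
    internal int MinMatch { get; set; } = 3; // 4 can win on web or RGB(A)-like content
}
```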

And in case you forgot, my zlib is still https://github.com/Elskom/zlib.managed/. At least I have tried to improve some things on it as well.

AraHaan commented Mar 12, 2021

Also, why is it not possible to keep the same compression ratios while also making them fast, as described above?

Why must we, to win on compression speed, also lose output size at the higher levels?

Why can't we have the speed but also keep the compressed size as it is, without giving up any considerable size reductions?

But that's just me; I've always cared about data size, yet hated waiting for the maximum possible compression.

By the way, would someone mind experimenting to see if this would do anything as well?

I thought of it off the top of my head:

```csharp
new Config(32, 516, 516, 8192, SLOW), // 10
```

As another test-only compression level, I would like to know what the results would be, if possible, and whether it generates valid compressed data that can be decompressed. I may have miscalculated the last number; I am not sure.
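
One constraint worth noting: deflate can never emit a match longer than 258 bytes (MAX_MATCH), so 516 in the lazy and nice columns buys essentially nothing over 258. A within-bounds level-10 candidate might look like:

```csharp
// Hypothetical extra level: same lazy/nice ceiling as level 9 (258 is
// deflate's longest possible match), but double the hash-chain search depth.
new Config(32, 258, 258, 8192, SLOW), // 10
```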

AraHaan commented Jul 24, 2021

Welp, after running tests while trying to improve the compression levels and then adding unit tests, it looks like the input gets copied directly into the compressed output rather than actually being compressed: the zlib header is inserted at the beginning, the zlib trailer at the end, and the raw input sits unchanged in between. Ironically, it seems to be the same for every compression level, and the tests still pass. Could someone use the last stable version from NuGet to check whether the compressed output is the same for that file when BestCompression is used?

AraHaan commented Aug 27, 2021

Oh, and @JimBobSquarePants, I recently found an issue with how compression is done in zlib.net and zlib.managed, and it may also apply here. For some reason, when I tried to add a unit test, no matter what compression level I use on the test file, the resulting output after compression is exactly the same data as the input file (with the zlib header and footer wrapped around it).

It could be a simple issue on my part, but I tracked it down all the way to zlib.net as well; the native C implementation does not seem to have this problem on my end.

JimBobSquarePants commented:

@AraHaan Sorry I've not gotten back to you; I've been heavily occupied with my other libraries, so this is on the back burner.

My implementation definitely compresses the output; unfortunately, there's now something wrong with the inflate stream. I broke it at some point and haven't been able to determine when, so I will have to start the inflate stream optimizations again from scratch.

AraHaan commented Aug 28, 2021

I understand. Perhaps the best course of action would be to rebase all of our zlib work on the latest version of the C implementation; Zlib.NET was based on an outdated one, so that might be an option worth looking into.

JimBobSquarePants commented:

A lot of my codebase is based upon zlib-ng; there are significant changes from both yours and zlib.net.

It’ll be a simple case of repeating my steps just with better testing. Time is the biggest factor.

AraHaan commented Aug 28, 2021

Yep
