Fix excessive usage of malloc/free #99

Merged Apr 22, 2016 (1 commit)

Conversation

MrKrzYch00
Contributor

Split ZopfliInitHash into:

  • ZopfliAllocHash - allocate hash memory,
  • ZopfliResetHash - reset hash values.

Allocate the hash outside of ZopfliLZ77Greedy and ZopfliLZ77OptimalRun,
which pass it on to the functions that previously allocated it. Do the
same for the malloc'd costs array.
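
The alloc/reset split described above can be sketched roughly as follows. This is a minimal illustration, not the code from this pull request: `DemoHash`, `DemoAllocHash`, `DemoResetHash`, `DemoCleanHash`, `DemoRun`, `DemoIterate` and `WINDOW_SIZE` are hypothetical names and a simplified structure chosen for the example. The point is only that the buffers are allocated once by the caller and each pass merely resets their values.

```c
#include <assert.h>
#include <stdlib.h>

#define WINDOW_SIZE 1024 /* hypothetical size, not Zopfli's real window */

/* Hypothetical, simplified stand-in for a hash structure. */
typedef struct DemoHash {
  int* head;
  unsigned short* prev;
} DemoHash;

/* Allocate the buffers once. Returns 1 on success, 0 on failure. */
static int DemoAllocHash(size_t window_size, DemoHash* h) {
  h->head = (int*)malloc(sizeof(*h->head) * window_size);
  h->prev = (unsigned short*)malloc(sizeof(*h->prev) * window_size);
  if (!h->head || !h->prev) {
    free(h->head);
    free(h->prev);
    h->head = NULL;
    h->prev = NULL;
    return 0;
  }
  return 1;
}

/* Reset the values only; no allocation happens here. */
static void DemoResetHash(size_t window_size, DemoHash* h) {
  size_t i;
  for (i = 0; i < window_size; i++) {
    h->head[i] = -1;
    h->prev[i] = (unsigned short)i;
  }
}

/* Free the buffers once, after the last pass. */
static void DemoCleanHash(DemoHash* h) {
  free(h->head);
  free(h->prev);
  h->head = NULL;
  h->prev = NULL;
}

/* Stand-in for one pass: resets the caller's hash, then dirties it.
   Returns 1 if the reset produced a clean state. */
static int DemoRun(size_t window_size, DemoHash* h, int pass) {
  int clean;
  DemoResetHash(window_size, h);
  clean = (h->head[0] == -1);
  h->head[0] = pass; /* leave stale data behind for the next pass */
  return clean;
}

/* Allocates once, runs all passes, frees once.
   Returns the number of passes that started clean, or -1 on failure. */
int DemoIterate(int iterations) {
  DemoHash h;
  int i, clean = 0;
  if (!DemoAllocHash(WINDOW_SIZE, &h)) return -1;
  for (i = 0; i < iterations; i++) {
    clean += DemoRun(WINDOW_SIZE, &h, i);
  }
  DemoCleanHash(&h);
  return clean;
}
```

With this shape, the number of malloc/free calls no longer grows with the iteration count; only the cheaper reset loop runs per pass.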

Reasons for this change:

  • the size of the allocation doesn't change,
  • speed up Zopfli^,
  • fix a crash on certain devices^^.

^ Speeds up Zopfli (especially on smaller blocks) by reducing sys time
from ~7s to 0.1s on x64 Linux for a ~5m compression run, and from ~1m
to 0.1s on ARMv7 Linux for a 13m compression run.
^^ Fixes a crash at high iteration counts on some ARM devices that, due
to architecture or an older kernel (not sure which), don't handle very
aggressive heap allocation and freeing well.
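
One way to observe the sys-time effect mentioned above is to compare per-iteration malloc/free against a single reused buffer. The sketch below assumes a POSIX system with `getrusage`; `SysSeconds`, `Touch`, `MeasureSysSeconds` and the 1 MiB buffer size are hypothetical choices for illustration, and the result will vary with the allocator and kernel, so no particular numbers are implied.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/time.h>

/* System CPU time used by this process so far, or -1.0 on error. */
static double SysSeconds(void) {
  struct rusage ru;
  if (getrusage(RUSAGE_SELF, &ru) != 0) return -1.0;
  return (double)ru.ru_stime.tv_sec + (double)ru.ru_stime.tv_usec / 1e6;
}

/* Write one byte per 4096 bytes so the buffer is actually used. */
static void Touch(volatile unsigned char* p, size_t n) {
  size_t i;
  for (i = 0; i < n; i += 4096) p[i] = 1;
}

/* reuse != 0: allocate once and reuse the buffer for every iteration.
   reuse == 0: malloc and free the buffer inside every iteration.
   Returns the sys seconds consumed, or -1.0 on failure. */
double MeasureSysSeconds(int reuse, int iterations) {
  const size_t size = (size_t)1 << 20; /* hypothetical 1 MiB buffer */
  unsigned char* buf = NULL;
  double start, end;
  int i;

  start = SysSeconds();
  if (start < 0.0) return -1.0;

  if (reuse) {
    buf = (unsigned char*)malloc(size);
    if (!buf) return -1.0;
  }
  for (i = 0; i < iterations; i++) {
    if (!reuse) {
      buf = (unsigned char*)malloc(size);
      if (!buf) return -1.0;
    }
    Touch(buf, size);
    if (!reuse) free(buf);
  }
  if (reuse) free(buf);

  end = SysSeconds();
  if (end < 0.0) return -1.0;
  return end - start;
}
```

Calling `MeasureSysSeconds(0, n)` and `MeasureSysSeconds(1, n)` with the same `n` gives a rough comparison on a given machine; running the real binary under the shell's `time` command is the more direct check.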

PS. You don't want to know how many hours I wasted guessing what was wrong on the Odroid U3 when gdb showed false positives: without pthreads it was hitting the assert in ZopfliVerifyLenDist, with pthreads it was crashing on free in ZopfliCleanHash. :)

PS2: TCMalloc didn't help on the Odroid U3; before this fix, crashes due to heap problems still occurred.

@googlebot

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If you signed the CLA as a corporation, please let us know the company's name.

@MrKrzYch00
Contributor Author

I signed it!

Lol, just because those are 2 serious bugs imo that should be fixed asap (my last 2 pull requests).

@googlebot

CLAs look good, thanks!

@wtgtybhertgeghgtwtg

Is there a reason this hasn't been merged?

@lvandeve
Collaborator

Merging it in right now. I measure no real speed difference with gcc on intel. But useful if it fixes an ARM problem.

@lvandeve lvandeve merged commit 365bda1 into google:master Apr 22, 2016
@lvandeve
Collaborator

Merged, thanks for your work!

@MrKrzYch00
Contributor Author

MrKrzYch00 commented Apr 22, 2016

The speed difference is very small on x86: if you compress the lodepng.cpp file with 5000 iterations using an x64 binary, you should see ~3s less sys time. However, it's almost 1m faster on ARMv7. It may depend on the malloc/free implementation or the kernel heap manager.

EDIT: Following pull request with further improvement: #106

@MrKrzYch00 MrKrzYch00 deleted the patch-2 branch April 22, 2016 18:05