
Can we improve slow download time? #55

Open · iHiD opened this issue Oct 11, 2020 · 12 comments

Labels: enhancement (New feature or request)

Comments

@iHiD commented Oct 11, 2020

Hello 👋

Firstly, thank you for your work on this 💙 We're using it all over the place at Exercism and it's proving to be a brilliant tool.

One thing I'm noticing though is that the more it's used, the slower it gets to download things. On a repo I'm working on at the moment, it takes over 5 minutes to download the data and load it into docker. This time seems to increase linearly with each usage, which scares me a little! I've tried experimenting with different concurrency levels but to no avail.

I'm wondering if you know of any way to improve things, either for me as a user, or any ideas about how we could speed up/improve the action itself?

Could we maybe set expiries on the cached data, removing layers that haven't been used in a while? This could happen either in the clean-up phase of the action, or as a stand-alone clean-up action that could run daily?

There are a few of us at Exercism who would happily contribute to making things better if you want us to submit a PR, etc., but I'm wondering if you had any ideas/thoughts/direction regarding how we could improve this?

Thank you!
Jeremy

iHiD added the enhancement (New feature or request) label on Oct 11, 2020
@mayli commented Nov 24, 2020

👍 for this issue; here is my 0s build time with a few minutes of caching operations:

[screenshot: workflow timing showing a near-zero build step but several minutes spent on cache operations]

@cynicaljoy commented Dec 5, 2020

I think it's related to actions/cache#381 -- it looks like the version of actions/cache currently used in this project is @1.

@rcowsill (Contributor) commented Dec 7, 2020

GitHub's naming is confusing... The action actions/cache@v2 uses the @actions/cache NPM package, which lives here: https://github.com/actions/toolkit/tree/main/packages/cache.

This action is already using a version of @actions/cache with the faster Azure SDK segmented downloads.

For the OP, I suggest checking how many images are getting loaded into docker from the cache. Run `docker images -a` before and after the cache loads, and see if the size of images added from cache is getting out of control.
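A minimal way to make that check in a workflow might look like this (a sketch; the action version is the one pinned later in this thread, and step placement is illustrative):

    - name: List images before cache restore
      run: docker images -a

    - uses: satackey/[email protected]
      continue-on-error: true

    - name: List images after cache restore
      # Compare with the listing above; anything extra came out of the cache
      run: docker images -a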

@vpontis commented Jan 1, 2021

> For the OP, I suggest checking how many images are getting loaded into docker from the cache. Run `docker images -a` before and after the cache loads, and see if the size of images added from cache is getting out of control.

We are seeing this happen and our build times are going 📈. Any recommendations on how to fix this?

@mo-mughrabi

Likewise here, we are experiencing slow downloads and/or uploads to the cache.

@rcowsill (Contributor) commented Jan 3, 2021

> > For the OP, I suggest checking how many images are getting loaded into docker from the cache. Run `docker images -a` before and after the cache loads, and see if the size of images added from cache is getting out of control.
>
> We are seeing this happen and our build times are going 📈. Any recommendations on how to fix this?

If you're not already using v0.0.9 or later, upgrading should help some.

Besides that, currently I think the only workaround is to change your cache keys periodically. That will empty the cache, discarding any images that are no longer used.

The slowdown is happening because all the restored images from the cache have to be carried over into the next cache. That's needed to guarantee that any cached images used by docker are still present for the next run to use. Unfortunately it means that unused images are carried over too.

This could be avoided if docker had a way to monitor cache hits, but it doesn't appear to.

It might help to add some new options to the action for discarding cached images. For example, users could specify how many tags to retain, and the action would keep the newest ones up to that limit. It may also be possible to infer which restored images were not used and discard them, but that's difficult for multistage builds.

@CalebAlbers commented Jan 20, 2021

> Besides that, currently I think the only workaround is to change your cache keys periodically. That will empty the cache, discarding any images that are no longer used.

For anyone looking for a way to do this automatically, we're using the month number as a rotating cache key variable, like so:

    - run: echo "MONTH=$(date +%m)" >> $GITHUB_ENV

    - uses: satackey/[email protected]
      # Ignore the failure of a step and avoid terminating the job.
      continue-on-error: true
      with:
        key: ${{ github.workflow }}-${{ env.MONTH }}-{hash}
        restore-keys: |
          ${{ github.workflow }}-${{ env.MONTH }}-

For more active projects, you could use a weekly cache key (`date +%U`). I haven't found a better way yet, but I'm definitely open to suggestions.
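For reference, the weekly variant would only change the env step and the key in the snippet above, something like:

    - run: echo "WEEK=$(date +%U)" >> $GITHUB_ENV

    - uses: satackey/[email protected]
      continue-on-error: true
      with:
        key: ${{ github.workflow }}-${{ env.WEEK }}-{hash}
        restore-keys: |
          ${{ github.workflow }}-${{ env.WEEK }}-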

nkuba added a commit to keep-network/keep-core that referenced this issue Feb 11, 2021
We experienced some problems with `satackey/action-docker-layer-caching`
that caused workflow runners to run out of disk space. This is related
to the way the action stores layers incrementally along with the
previously cached ones. See: satackey/action-docker-layer-caching#55

There is also a solution from Docker that we can use for building images,
and also at a later stage of the RFC-18 implementation to publish images to
registries. See: https://github.com/marketplace/actions/build-and-push-docker-images

With the new solution we can also use caching, and hopefully it won't
cause the problems we had before.
michalinacienciala added a commit to keep-network/keep-core that referenced this issue Feb 15, 2021

Switch to Docker's build-publish-action in GitHub Actions

In this PR we are switching the Go build workflow to Docker's official [Build and Publish Action](https://github.com/marketplace/actions/build-and-push-docker-images).

We were experiencing some issues with running out of disk space while using `satackey/action-docker-layer-caching`. It was happening because the cache size grew with each execution. During testing, it [grew pretty quickly to 7 GB](https://github.com/keep-network/keep-core/runs/1880553919?check_suite_focus=true). There is an issue describing this situation: satackey/action-docker-layer-caching#55. I tried the proposed workarounds but was not satisfied with them.

According to the [documentation](https://docs.github.com/en/actions/guides/caching-dependencies-to-speed-up-workflows#usage-limits-and-eviction-policy), caches are cleaned up when their total size reaches 5 GB, but in this case we had a single cache of 7 GB.

I noticed that Docker released [an action set](https://github.com/marketplace/actions/build-and-push-docker-images) that may suit our needs. It also supports layer caching. Additionally, we can use it later for publishing images to registries. During testing, the cache size was around 1 GB.

Worth mentioning is the clever way GitHub's cache action handles access to caches between branches. Long story short:
1. Restore first matches caches on the current branch, then on parent branches - [read more](https://docs.github.com/en/actions/guides/caching-dependencies-to-speed-up-workflows#matching-a-cache-key)
2. Sibling branches cannot access each other's caches - [read more](https://docs.github.com/en/actions/guides/caching-dependencies-to-speed-up-workflows#restrictions-for-accessing-a-cache)
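For anyone wanting to try the same switch, the commonly documented pattern for that action combines buildx with a local cache directory stored via actions/cache; a rough sketch (image tag and cache key names are placeholders):

    - uses: docker/setup-buildx-action@v1

    - uses: actions/cache@v2
      with:
        path: /tmp/.buildx-cache
        key: ${{ runner.os }}-buildx-${{ github.sha }}
        restore-keys: |
          ${{ runner.os }}-buildx-

    - uses: docker/build-push-action@v2
      with:
        context: .
        push: false
        tags: example/app:latest
        cache-from: type=local,src=/tmp/.buildx-cache
        cache-to: type=local,dest=/tmp/.buildx-cache,mode=max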
@patroza commented Jun 27, 2021

My approach to the problem:

  1. Pull dependent images before the cache action.
  2. Build a hash from the major changers of the Docker image:

         YARN=$(md5sum yarn.lock | awk '{ print $1 }')
         PKG=$(md5sum package.json | awk '{ print $1 }')
         API_PKG=$(md5sum apps/api/package.json | awk '{ print $1 }')
         TYPES_PKG=$(md5sum packages/types/package.json | awk '{ print $1 }')
         CLIENT_PKG=$(md5sum packages/client/package.json | awk '{ print $1 }')

         echo "YARN_HASH=${YARN}_${PKG}_${API_PKG}_${TYPES_PKG}_${CLIENT_PKG}" >> $GITHUB_ENV

  3. Use the hash in the cache key.
  4. Prune images before the cache upload:

         - run: |
             docker image prune -a --force --filter "label!=tag=${{ github.sha }}"

🤞
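Note that the prune-by-label step above only keeps the current run's image if the build actually applies that label; a sketch of the corresponding build step (the image name is a placeholder):

    - run: |
        docker build \
          --label "tag=${{ github.sha }}" \
          -t my-app:${{ github.sha }} .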

@adambiggs

How is anybody even using this action if the cache continually grows with each build?

It seems this action will always make build times worse after the first handful of builds... Am I missing something?

@omacranger

@adambiggs you should have a build step that cleans up images -- at least that's what we do. We prune images older than three days so we can still leverage the cache without having it be astronomical in size.

@adambiggs

Thanks @omacranger. For anyone who might find themselves here, the workaround I ended up with is adding this step at the end of my job:

    - run: docker image prune --all --force --filter "until=48h"

I think a note should really be added to the readme, because some flavour of this workaround seems to be a hard requirement for using this action.
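In case it helps others, a sketch of where that prune step can sit so it runs before the action's post-job cache save (the build step and version pin are placeholders):

    - uses: satackey/[email protected]
      continue-on-error: true

    - run: docker build -t my-app:${{ github.sha }} .

    - name: Drop stale layers before the post-job cache save
      if: always()
      run: docker image prune --all --force --filter "until=48h"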

@jakeonfire commented Dec 3, 2021

I've found it much quicker to download the most recently built (or most similar) image and use `--cache-from`. I'm not sure if there are other cases where this layer-caching solution is cheaper.
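A rough sketch of that approach (registry, image name, and tag are placeholders):

    - run: |
        # Pull the most recent published image; tolerate failure on the first run
        docker pull ghcr.io/my-org/my-app:latest || true
        # Reuse its layers instead of restoring them from the Actions cache
        docker build \
          --cache-from ghcr.io/my-org/my-app:latest \
          -t ghcr.io/my-org/my-app:latest .

Note that with BuildKit enabled, the pulled image needs to have been built with `--build-arg BUILDKIT_INLINE_CACHE=1` for `--cache-from` to find its layer metadata.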

eronisko added commits to SlovakNationalGallery/webumenia.sk that referenced this issue Feb 24, 2022