feature: sync subcharts from different registries #172

Open
wants to merge 6 commits into
base: master

chrissgyulev

Adds a feature to sync sub-charts from non-source registries

Related Issues:

Activated by adding a list of repos via trustedSourceDeps:

# source includes relevant information about the source chart repository
source:
  # Dependencies located in repos from this list will be considered as trusted, and also synced.
  # The entry format is the same as "repo" (see below)
  trustedSourceDeps:
  - kind: HELM
    url: https://grafana.github.io/helm-charts

Also adds replaceDependencyRepo

target:
  # In case there is a need to mirror dependencies (from trustedSourceDeps list, see above) - this must be set to true
  replaceDependencyRepo: true
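
As a rough illustration of how the two settings interact (a sketch only, not the PR's actual Go code; the helper name `rewrite_dependencies`, the dict layout, and the example URLs other than the Grafana one are assumptions): a dependency is repointed to the target repo only when its repository is listed in `trustedSourceDeps` and `replaceDependencyRepo` is true.

```python
from typing import Dict, List


def rewrite_dependencies(
    dependencies: List[Dict[str, str]],
    trusted_source_deps: List[str],
    target_url: str,
    replace_dependency_repo: bool,
) -> List[Dict[str, str]]:
    """Return a copy of the dependency list with trusted repos repointed.

    Hypothetical helper: it mirrors the behaviour described in this PR,
    it is not the PR's implementation.
    """
    rewritten = []
    for dep in dependencies:
        dep = dict(dep)  # copy, so the caller's data is not mutated
        is_trusted = dep["repository"] in trusted_source_deps
        if is_trusted and replace_dependency_repo:
            dep["repository"] = target_url
        rewritten.append(dep)
    return rewritten


deps = [
    {"name": "grafana", "repository": "https://grafana.github.io/helm-charts"},
    {"name": "other", "repository": "https://example.com/charts"},  # placeholder
]
trusted = ["https://grafana.github.io/helm-charts"]
target = "https://target.example.com/charts"  # placeholder target repo

result = rewrite_dependencies(deps, trusted, target, replace_dependency_repo=True)
print(result)
```

With `replace_dependency_repo=False` the list comes back unchanged, which matches the note that trusted dependencies are only mirrored when the flag is set.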

Take a look at examples/sync-deps for a full example. This example directly addresses #147 (it can be launched via the newly added docker-compose.yaml).

Other minor improvements

  • docker-compose.yaml for local testing
  • the Dockerfile now compiles the binary

@chrissgyulev chrissgyulev force-pushed the task/sync-chart-depedencies branch from 13c5584 to 3da9e2e Compare October 22, 2022 08:34
@chrissgyulev chrissgyulev marked this pull request as ready for review October 22, 2022 08:35
Contributor

@jotadrilo jotadrilo left a comment

Hi @chrissgyulev!

Thank you so much for contributing to this repo!

I have added a few comments, but I have a slightly different proposal to tackle this issue:

  • Remove target.replaceDependencyRepo
  • Replace source.trustedSourceDeps with source.ignoreTrustedRepos. This option will exclude certain "trusted" repositories from the sync process (so the resulting Helm Chart will keep using the same external repository).
  • Add target.syncTrustedRepos. This option will enforce syncing certain "trusted" repositories during the sync process (so the resulting Helm Chart will reference the target repository, where the charts from the trusted repos should be present after the process).

NOTE: source.ignoreTrustedRepos can be left out if we don't want to ignore only some repositories and would rather go for an AON (All-Or-None) approach.

The main reason to follow this approach is that an inexperienced reader will clearly identify which repositories will be ignored from the source and which repositories will be fully synced into the target. Having two different settings that interact internally (as in this PR) might be troublesome to understand and debug if the code gets more complex.
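
To make the proposed split concrete, here is a minimal sketch of the per-repository decision it implies. Everything in it is an assumption for illustration: the names `Action` and `classify`, the rule that the ignore list wins when a repo is in both lists, and the `UNDECIDED` outcome for repos in neither list (the proposal does not say what should happen there).

```python
from enum import Enum
from typing import Iterable


class Action(Enum):
    KEEP_EXTERNAL = "keep the external repository reference"
    SYNC_TO_TARGET = "sync the sub-chart and reference the target repository"
    UNDECIDED = "not covered by either list"


def classify(
    repo_url: str,
    ignore_trusted_repos: Iterable[str],
    sync_trusted_repos: Iterable[str],
) -> Action:
    """Decide what to do with a dependency repo under the proposed options."""
    # Assumption: the ignore list takes precedence if a repo is in both lists.
    if repo_url in set(ignore_trusted_repos):
        return Action.KEEP_EXTERNAL
    if repo_url in set(sync_trusted_repos):
        return Action.SYNC_TO_TARGET
    return Action.UNDECIDED


ignore = ["https://ignored.example.com/charts"]    # placeholder URL
sync = ["https://grafana.github.io/helm-charts"]

for url in ignore + sync + ["https://unknown.example.com/charts"]:
    print(url, "->", classify(url, ignore, sync).value)
```

Each repository is looked up in one list at a time, so a reader can tell from the config alone which outcome applies, which is the readability argument made above.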

@@ -4,6 +4,11 @@

# source includes relevant information about the source chart repository
source:
# Dependencies located in repos from this list will be considered as trusted, and also synced.
Contributor

they will be synced only if we enable target.replaceDependencyRepo = true

# repoName is used to modify the README of the chart. Default value: `myrepo`
# In case there is a need to mirror dependencies (from trustedSourceDeps list, see above) - this must be set to true
replaceDependencyRepo: true
# repoName is used to modify the README of the chart. Default value: `myrepo`
Contributor

fix this indentation

@@ -0,0 +1,11 @@
services:
Contributor

Even though I am glad you took the time to craft this, we don't want to support Docker Compose. Please drop this file.

@jotadrilo jotadrilo requested a review from tompizmor October 26, 2022 09:59
@chrissgyulev
Author

OK, I'll make the adjustments taking your comments into account, @jotadrilo. I'll do some testing and get back to you for another review. Thank you for taking the time to review my proposal.

Regards

@MShekow

MShekow commented Feb 22, 2023

> OK, I'll make the adjustments taking your comments into account, @jotadrilo. I'll do some testing and get back to you for another review. Thank you for taking the time to review my proposal.
>
> Regards

It's really great that you started this PR. We are also in need of a solution for this. @chrissgyulev, did you by any chance find some time to adapt the PR?

@chrissgyulev
Author

Hi @MShekow! I'll try to find some spare time to finish it soon. My apologies.

@MShekow

MShekow commented May 8, 2023

> Hi @MShekow! I'll try to find some spare time to finish it soon. My apologies.

Hey @chrissgyulev, any luck yet? We would really benefit from it.

@chrissgyulev chrissgyulev force-pushed the task/sync-chart-depedencies branch from 28896ff to 92e4d21 Compare May 24, 2023 10:20
@chrissgyulev
Author

Hi again! Could you take a look? @MShekow, could you help us by testing your use case? Some things to consider:

  • use examples/sync-deps.yaml config
  • the docs (Readme.md) will be updated when we all agree on this
  • the root /charts-syncer.yaml will also be updated with the options at the end

Thank you very much for your patience!

Regards,
Chriss

@chrissgyulev chrissgyulev requested a review from jotadrilo May 24, 2023 10:27
@MShekow

MShekow commented May 25, 2023

@chrissgyulev Thanks for updating the PR. It works with our use case, as expected 👍

@MShekow

MShekow commented May 25, 2023

There is one problem, though: synchronization now takes forever. I use a self-built Docker image (built from your Dockerfile). When I run the sync with upstream charts-syncer v0.20.1 in verbose mode (-v=5), I see log output such as this:

I0525 13:48:53.010525       1 sync.go:49] Looking for the default config charts-syncer.yaml
I0525 13:48:53.034096       1 config.go:37] 'source.repo.chartsIndex' property is empty. Using "kedacore.github.io//charts/charts-index:latest" default value
I0525 13:48:53.034317       1 config.go:78] 'target.repoName' property is empty. Using "myrepo" default value
I0525 13:48:53.036341       1 syncer.go:124] Using workdir: "/.charts-syncer"
I0525 13:48:53.037858       1 cachedisk.go:39] Allocating cache dir: "/.charts-syncer/8526dab2f201639577faed94e0d05ce310045e77"
I0525 13:48:53.038700       1 helmclassic.go:46] [1dccdb76a544d080f69f602bd11d6bfadea509d4] GET "https://kedacore.github.io/charts/index.yaml"
I0525 13:48:53.441692       1 helmclassic.go:61] [1dccdb76a544d080f69f602bd11d6bfadea509d4] HTTP Status: 200 OK
I0525 13:48:53.491240       1 cachedisk.go:39] Allocating cache dir: "/.charts-syncer/0534ad106412f6f2c7dc103cf297992c9a24b3c8"
I0525 13:48:53.492372       1 index.go:89] Publishing threshold set to "2022-06-15 00:00:00 +0000 UTC"
I0525 13:48:53.492558       1 index.go:105] Found 49 versions for "keda" chart: [2.10.2 2.10.1 2.10.0 2.9.4 2.9.3 2.9.2 2.9.1 2.9.0 2.8.4 2.8.3 2.8.2 2.8.1 2.8.0 2.7.2 2.7.1 2.7.0 2.6.2 2.6.1 2.6.0 2.5.1 2.5.0 2.4.0 2.3.2 2.3.0 2.2.2 2.2.1 2.2.0 2.1.3 2.1.2 2.1.1 2.1.0 2.0.1 2.0.0 2.0.0-rc3 2.0.0-rc2 2.0.0-rc 2.0.0-beta1.2 2.0.0-beta1.1 2.0.0-beta 1.5.0 1.4.2 1.4.1 1.4.0 1.3.2 1.3.1 1.3.0 1.2.0 1.1.0 1.0.0]
I0525 13:48:53.492692       1 index.go:106] Indexing "keda" charts...
I0525 13:48:53.493566       1 index.go:146] Details for "keda-2.10.2" chart: &{PublishedAt:2023-04-13 13:51:45.496102 +0200 +0200 Digest:2e75903cda0780a4a8115dc199541315eaccdbfc3ec3da5ab492c8825080cc99}
I0525 13:48:54.315442       1 index.go:156] Skipping "keda-2.10.2" chart: Already synced
I0525 13:48:54.317325       1 index.go:146] Details for "keda-2.10.1" chart: &{PublishedAt:2023-04-13 13:51:45.493483 +0200 +0200 Digest:7216ff7cff5567152b895017b97a95b41b788589c4be82169d92906519a24f25}
I0525 13:48:54.716542       1 index.go:156] Skipping "keda-2.10.1" chart: Already synced
I0525 13:48:54.717986       1 index.go:146] Details for "keda-2.10.0" chart: &{PublishedAt:2023-04-13 13:51:45.491155 +0200 +0200 Digest:4be1fc8dba9d0e17ff475ca3dcb1183b07164ccaddfc48c67f6369a56f1b1777}
I0525 13:48:55.230277       1 index.go:156] Skipping "keda-2.10.0" chart: Already synced
I0525 13:48:55.232057       1 index.go:146] Details for "keda-2.9.4" chart: &{PublishedAt:2023-04-13 13:51:45.542546 +0200 +0200 Digest:c455dc8d908b6e8575fe0dbe8275861355cb242a5768f23cd909e543fe077438}
I0525 13:48:55.634588       1 index.go:156] Skipping "keda-2.9.4" chart: Already synced
I0525 13:48:55.635255       1 index.go:146] Details for "keda-2.9.3" chart: &{PublishedAt:2023-04-13 13:51:45.539826 +0200 +0200 Digest:52a5de6f5585fb2cfe44ba9ddadcf4cd4208138795313e25ee654d82a424faef}
I0525 13:48:55.909680       1 index.go:156] Skipping "keda-2.9.3" chart: Already synced
I0525 13:48:55.911019       1 index.go:146] Details for "keda-2.9.2" chart: &{PublishedAt:2023-04-13 13:51:45.537878 +0200 +0200 Digest:a1f14048f1788cde92a42412fa789e34d48bb4a8e94d4b43e0c70c8b8c326e43}
I0525 13:48:56.355165       1 index.go:156] Skipping "keda-2.9.2" chart: Already synced

When running it with your version, I get this:

I0525 13:45:15.003480       1 sync.go:49] Looking for the default config charts-syncer.yaml
I0525 13:45:15.030129       1 config.go:37] 'source.repo.chartsIndex' property is empty. Using "kedacore.github.io//charts/charts-index:latest" default value
I0525 13:45:15.030359       1 config.go:78] 'target.repoName' property is empty. Using "myrepo" default value
I0525 13:45:15.032525       1 syncer.go:126] Using workdir: "/.charts-syncer"
I0525 13:45:15.034215       1 cachedisk.go:39] Allocating cache dir: "/.charts-syncer/8526dab2f201639577faed94e0d05ce310045e77"
I0525 13:45:15.035208       1 helmclassic.go:46] [1dccdb76a544d080f69f602bd11d6bfadea509d4] GET "https://kedacore.github.io/charts/index.yaml"
I0525 13:45:15.264517       1 helmclassic.go:61] [1dccdb76a544d080f69f602bd11d6bfadea509d4] HTTP Status: 200 OK
I0525 13:45:15.312766       1 cachedisk.go:39] Allocating cache dir: "/.charts-syncer/0534ad106412f6f2c7dc103cf297992c9a24b3c8"
I0525 13:45:15.313977       1 index.go:91] Publishing threshold set to "2022-06-15 00:00:00 +0000 UTC"
I0525 13:45:15.314199       1 index.go:107] Found 49 versions for "keda" chart: [2.10.2 2.10.1 2.10.0 2.9.4 2.9.3 2.9.2 2.9.1 2.9.0 2.8.4 2.8.3 2.8.2 2.8.1 2.8.0 2.7.2 2.7.1 2.7.0 2.6.2 2.6.1 2.6.0 2.5.1 2.5.0 2.4.0 2.3.2 2.3.0 2.2.2 2.2.1 2.2.0 2.1.3 2.1.2 2.1.1 2.1.0 2.0.1 2.0.0 2.0.0-rc3 2.0.0-rc2 2.0.0-rc 2.0.0-beta1.2 2.0.0-beta1.1 2.0.0-beta 1.5.0 1.4.2 1.4.1 1.4.0 1.3.2 1.3.1 1.3.0 1.2.0 1.1.0 1.0.0]
I0525 13:45:15.314339       1 index.go:108] Indexing "keda" charts...
I0525 13:45:15.315336       1 index.go:148] Details for "keda-2.10.2" chart: &{PublishedAt:2023-04-13 13:51:45.496102 +0200 +0200 Digest:2e75903cda0780a4a8115dc199541315eaccdbfc3ec3da5ab492c8825080cc99}
I0525 13:45:34.128042       1 index.go:158] Skipping "keda-2.10.2" chart: Already synced
I0525 13:45:34.130714       1 index.go:148] Details for "keda-2.10.1" chart: &{PublishedAt:2023-04-13 13:51:45.493483 +0200 +0200 Digest:7216ff7cff5567152b895017b97a95b41b788589c4be82169d92906519a24f25}
I0525 13:45:52.038686       1 index.go:158] Skipping "keda-2.10.1" chart: Already synced
I0525 13:45:52.039583       1 index.go:148] Details for "keda-2.10.0" chart: &{PublishedAt:2023-04-13 13:51:45.491155 +0200 +0200 Digest:4be1fc8dba9d0e17ff475ca3dcb1183b07164ccaddfc48c67f6369a56f1b1777}
I0525 13:46:10.870385       1 index.go:158] Skipping "keda-2.10.0" chart: Already synced
I0525 13:46:10.873858       1 index.go:148] Details for "keda-2.9.4" chart: &{PublishedAt:2023-04-13 13:51:45.542546 +0200 +0200 Digest:c455dc8d908b6e8575fe0dbe8275861355cb242a5768f23cd909e543fe077438}
I0525 13:46:32.068523       1 index.go:158] Skipping "keda-2.9.4" chart: Already synced

As you can see, checking each chart version takes ca. 0.5 sec with the upstream image, and ca. 20 sec with your image :(
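
For anyone who wants to reproduce this comparison from their own logs, here is a small sketch that measures the time between each `Details for "<chart>"` line and the matching `Skipping "<chart>"` line. The regular expression is an assumption based on the log lines shown above, the sample lines are abbreviated copies of them (the `&{...}` details are replaced by `...`), and the script ignores the date field, so it does not handle runs that cross midnight.

```python
import re
from datetime import datetime
from typing import Dict

# Matches lines such as:
# I0525 13:48:53.493566       1 index.go:146] Details for "keda-2.10.2" chart: ...
LINE_RE = re.compile(
    r'^[IWEF]\d{4} (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ \S+\] '
    r'(Details for|Skipping) "([^"]+)"'
)


def check_durations(log_text: str) -> Dict[str, float]:
    """Seconds between the 'Details for' and 'Skipping' line of each chart."""
    started: Dict[str, datetime] = {}
    durations: Dict[str, float] = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        timestamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        kind, chart = match.group(2), match.group(3)
        if kind == "Details for":
            started[chart] = timestamp
        elif chart in started:
            durations[chart] = (timestamp - started.pop(chart)).total_seconds()
    return durations


UPSTREAM_LOG = '''
I0525 13:48:53.493566       1 index.go:146] Details for "keda-2.10.2" chart: ...
I0525 13:48:54.315442       1 index.go:156] Skipping "keda-2.10.2" chart: Already synced
'''

PR_LOG = '''
I0525 13:45:15.315336       1 index.go:148] Details for "keda-2.10.2" chart: ...
I0525 13:45:34.128042       1 index.go:158] Skipping "keda-2.10.2" chart: Already synced
'''

upstream = check_durations(UPSTREAM_LOG)
pr_branch = check_durations(PR_LOG)
print(f"upstream:  {upstream['keda-2.10.2']:.2f}s")
print(f"PR branch: {pr_branch['keda-2.10.2']:.2f}s")
```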

UPDATE: it looks like this is because your branch is outdated (13 commits behind master), and the speed-up was introduced in one of those 13 commits. I guess rebasing would solve it.

@chrissgyulev chrissgyulev force-pushed the task/sync-chart-depedencies branch from 92e4d21 to 009c35d Compare May 26, 2023 14:03
@chrissgyulev
Author

OK @MShekow! I've rebased this branch onto master. Can you try again, please? If possible, with exactly the same configuration. Thank you very much!

@MShekow

MShekow commented May 26, 2023

> OK @MShekow! I've rebased this branch onto master. Can you try again, please? If possible, with exactly the same configuration. Thank you very much!

I just checked it out. The performance is now good again, like in the upstream image. Thank you!

@chrissgyulev
Author

@jotadrilo / @tompizmor, could you take a look?

Regards,
Chriss

@MShekow

MShekow commented Jun 9, 2023

OUTDATED, see comment below

@chrissgyulev I found another problem: the sync generally seems to be broken. Example YAML:

source:
  repo:
    kind: HELM
    url: https://argoproj.github.io/argo-helm
  ignoreTrustedRepos:
    - kind: HELM  # we do not want to synchronize the redis-ha subchart
      url: https://dandydeveloper.github.io/charts
    
target:
  repo:
    kind: OCI
    url: https://our-hosted-registry.com/some-namespace/synced-helm-charts

charts:
  - argo-cd

Log output (with -v=5):

I0609 06:57:31.107580       1 index.go:108] Indexing "argo-cd" charts...
I0609 06:57:31.107613       1 index.go:148] Details for "argo-cd-5.36.1" chart: &{PublishedAt:2023-06-08 22:52:42.13709222 +0000 UTC Digest:a82b57a81c58e1442f6ae445bf52ae4f340734cf74703b90eaad6781a84785c0}
I0609 06:57:31.433754       1 cachedisk.go:55] cache hit { op:has, id:a6f4dfdf6034351d1647bfe7db0c82667a472cdb, filename:argo-cd-5.36.1.tgz }
I0609 06:57:31.433848       1 utils.go:410] [ed71a9541c3494dcdee70e2e89efb958baf2db30] GET "https://github.com/argoproj/argo-helm/releases/download/argo-cd-5.36.1/argo-cd-5.36.1.tgz"
I0609 06:57:31.820786       1 utils.go:422] [ed71a9541c3494dcdee70e2e89efb958baf2db30] HTTP Status: 200 OK
I0609 06:57:31.820814       1 cachedisk.go:55] cache hit { op:has, id:a6f4dfdf6034351d1647bfe7db0c82667a472cdb, filename:argo-cd-5.36.1.tgz }
I0609 06:57:31.820820       1 cachedisk.go:108] cache hit { op:write, id:a6f4dfdf6034351d1647bfe7db0c82667a472cdb, filename:argo-cd-5.36.1.tgz }
I0609 06:57:31.976066       1 cachedisk.go:55] cache hit { op:has, id:a6f4dfdf6034351d1647bfe7db0c82667a472cdb, filename:redis-ha-4.23.0.tgz }
E0609 06:57:31.976111       1 index.go:167] unable to load "argo-cd-5.36.1" chart: invalid "redis-ha-4.23.0" chart dependency: fetching redis-ha:4.23.0 chart: getting redis-ha-4.23.0 from index file: no chart name found
W0609 06:57:31.976133       1 index.go:129] Failed processing argo-cd:5.36.1 chart. The index will remain incomplete.

The problem occurs here:

[screenshot]

i.Entries only contains entries such as "argo-cd"; "redis-ha" is missing.

I created a PR with an idea for how to fix it. See chrissgyulev#1

@MShekow

MShekow commented Jun 10, 2023

You can discard most of the above comment. I simply mistyped the URL for ignoreTrustedRepos, using https://dandydeveloper.github.io/charts instead of https://dandydeveloper.github.io/charts/.
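
The mismatch comes down to a plain string comparison: the two URLs differ only by the trailing slash. A minimal sketch of a more tolerant comparison follows; `normalize_repo_url` and `same_repo` are hypothetical helpers, and nothing here claims that charts-syncer normalizes URLs this way.

```python
def normalize_repo_url(url: str) -> str:
    """Strip surrounding whitespace and any trailing slashes."""
    return url.strip().rstrip("/")


def same_repo(a: str, b: str) -> bool:
    """Compare two repository URLs while ignoring a trailing slash."""
    return normalize_repo_url(a) == normalize_repo_url(b)


configured = "https://dandydeveloper.github.io/charts"    # as typed in the config
referenced = "https://dandydeveloper.github.io/charts/"   # the intended URL

print(configured == referenced)           # exact comparison
print(same_repo(configured, referenced))  # slash-insensitive comparison
```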

I also learnt that errors sometimes arise because older versions of a tier-1 chart reference different sub-charts than the most recent version does. For instance, for the mimir-distributed Helm chart I have to exclude both https://charts.min.io/ and https://helm.min.io/ (depending on how far back you go with the CLI argument --from-date, you have to exclude even more repos).

Still, I think that the PR I created (chrissgyulev#1) may be useful, because it skips dependencies/sub-charts early. Our use case is that we have limited access to the Internet and need firewall clearance for every DNS name/host. We therefore want to keep the number of rules to a minimum and avoid looking up charts that we do not need anyway.
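
To illustrate the firewall angle, this sketch collects the hosts that would still need a clearance once ignored repositories are skipped before any lookup. The function name `hosts_to_allow` is hypothetical, `https://charts.example.org/` is a placeholder for some dependency repo that is not excluded, and the filtering shown is only one way the early skip could work, not the code of chrissgyulev#1.

```python
from typing import Iterable, Set
from urllib.parse import urlparse


def hosts_to_allow(dependency_repos: Iterable[str], ignored_repos: Iterable[str]) -> Set[str]:
    """Hosts that are still contacted when ignored repos are skipped early."""
    ignored = {repo.rstrip("/") for repo in ignored_repos}
    hosts: Set[str] = set()
    for repo in dependency_repos:
        if repo.rstrip("/") in ignored:
            continue  # skipped before any lookup, so no firewall rule is needed
        host = urlparse(repo).hostname
        if host:
            hosts.add(host)
    return hosts


# Dependency repos gathered across several versions of one chart.
dependency_repos = [
    "https://charts.min.io/",
    "https://helm.min.io/",
    "https://charts.example.org/",  # placeholder for a repo we do want to reach
]
ignored_repos = ["https://charts.min.io/", "https://helm.min.io/"]

print(sorted(hosts_to_allow(dependency_repos, ignored_repos)))
```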

@chrissgyulev
Author

ok @MShekow let me take a look. Thank you very much

@bainss

bainss commented May 5, 2024

Is this feature going to be added in the next release?
