HIVE-27277: GH actions to build and push docker image #4274
.github/workflows/docker-image.yml
name: ci hive docker image

on:
  push:
Some thoughts on the frequency:
- We had better trigger the action for every new release.
- For the master branch, I think we can update the image every three months.
I think we should have a -latest tag for the GA version.
We could also have a daily release for the -dev version (or tags); building for every commit would be excessive.
I think we should have 2 workflows:
- GA workflow. Frequency: once per release.
- Latest dev images. Frequency: once per week? On average Hive gets about 10 to 15 commits per week (https://github.com/apache/hive/graphs/commit-activity).
This PR sets up a workflow to build and publish Docker images for the GA versions of Hive.
I will raise a follow-up Jira to address the workflow needed for daily/dev images.
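To make the weekly dev cadence concrete, here is a minimal sketch of what the follow-up dev workflow might look like. The file name, the `dev` tag, and the `DOCKERHUB_TOKEN` secret name are assumptions for illustration, not part of this PR, and the sketch omits whatever preparation the Docker build context needs before the build step.

```yaml
# Hypothetical file: .github/workflows/docker-image-dev.yml
name: ci hive docker image (dev)

on:
  schedule:
    # Every Sunday at 00:00 UTC. Scheduled workflows run on the default branch.
    - cron: '0 0 * * 0'

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # DOCKERHUB_TOKEN is an assumed secret name.
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - uses: docker/build-push-action@v5
        with:
          context: ./packaging/src/docker/
          file: ./packaging/src/docker/Dockerfile
          push: true
          # A single moving "dev" tag; each weekly run overwrites the previous image.
          tags: ${{ secrets.DOCKERHUB_USERNAME }}/hive:dev
```

Using one moving tag keeps the number of dev images constant; dated tags would need a separate cleanup policy.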
Once per week makes sense to me for dev, provided there is some limit on the number of dev images, for example keeping only the latest 10.
.github/workflows/docker-image.yml
context: ./packaging/src/docker/
file: ./packaging/src/docker/Dockerfile
push: true
tags: ${{ secrets.DOCKERHUB_USERNAME }}/hive:test-image
We'd better tag the image with the real version instead of test-image, and I think it would be great if we could determine HADOOP_VERSION and TEZ_VERSION from the project.
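As an illustration of tagging with the real version, the sketch below passes the versions through workflow-level `env` values into both the image tag and the build arguments. It assumes the Dockerfile accepts `HIVE_VERSION`, `HADOOP_VERSION` and `TEZ_VERSION` as build args, and that a `DOCKERHUB_TOKEN` secret exists; the version values are placeholders, not real release numbers.

```yaml
name: ci hive docker image

on:
  workflow_dispatch:

env:
  # Placeholder values for illustration only.
  HIVE_VERSION: x.y.z
  HADOOP_VERSION: x.y.z
  TEZ_VERSION: x.y.z

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # DOCKERHUB_TOKEN is an assumed secret name.
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - uses: docker/build-push-action@v5
        with:
          context: ./packaging/src/docker/
          file: ./packaging/src/docker/Dockerfile
          push: true
          # Tag with the Hive version instead of a fixed "test-image" tag.
          tags: ${{ secrets.DOCKERHUB_USERNAME }}/hive:${{ env.HIVE_VERSION }}
          # Assumes the Dockerfile declares matching ARG names.
          build-args: |
            HIVE_VERSION=${{ env.HIVE_VERSION }}
            HADOOP_VERSION=${{ env.HADOOP_VERSION }}
            TEZ_VERSION=${{ env.TEZ_VERSION }}
```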
How about -hive:dev or -hive:daily?
The GA version should follow common industry practice, e.g. hive4.0-latest, imho.
For GA: the versions set in the .yml file were configured manually after looking at the hive/pom.xml file.
For hive:daily, I think we can obtain them from the pom.xml file.
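One possible way to obtain the versions from pom.xml inside a workflow is the Maven Help Plugin's `help:evaluate` goal. The sketch below assumes the root pom.xml defines `hadoop.version` and `tez.version` properties and that Maven is available on the runner; the job and step names are hypothetical.

```yaml
name: resolve hive image versions

on:
  workflow_dispatch:

jobs:
  resolve-versions:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Read versions from pom.xml
        id: versions
        run: |
          # -q -DforceStdout prints only the evaluated value.
          # hadoop.version and tez.version are assumed property names.
          echo "hive=$(mvn help:evaluate -Dexpression=project.version -q -DforceStdout)" >> "$GITHUB_OUTPUT"
          echo "hadoop=$(mvn help:evaluate -Dexpression=hadoop.version -q -DforceStdout)" >> "$GITHUB_OUTPUT"
          echo "tez=$(mvn help:evaluate -Dexpression=tez.version -q -DforceStdout)" >> "$GITHUB_OUTPUT"

      - name: Show resolved versions
        run: |
          echo "Hive:   ${{ steps.versions.outputs.hive }}"
          echo "Hadoop: ${{ steps.versions.outputs.hadoop }}"
          echo "Tez:    ${{ steps.versions.outputs.tez }}"
```

Later steps in the same job could reference `steps.versions.outputs.*` when setting image tags or build arguments.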
Can we trigger the build for GA automatically?
https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#release
Adding each new GA build manually is troublesome: it adds extra steps to releasing a new version, and sometimes we may even forget about it.
For the old released versions, I think we can push the images manually.
> Can we trigger the build for GA automatically?
> https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#release
Since we are pushing to Docker Hub for the first time, we wanted to trigger it with workflow_dispatch to verify the Docker Hub integration.
Once we verify that this GH action succeeds, we can set it to trigger automatically and update all the images on every release or once every three months.
> I think it makes some troubles every time we should add the new GA build manually, it adds extra steps for releasing the new version, sometimes we may even forget about it.
I agree, but for a new GA:
- We will not have prior knowledge of the Hive, Tez and Hadoop versions to use in the next GA. (A workaround could be to obtain them from pom.xml.)
- Someone will have to build the new GA Docker images locally and verify that they work before we push them to Docker Hub.
That is why I was thinking we should retain the manual step at release time.
Other repos follow something similar: https://github.com/apache/spark-docker/tree/master/.github/workflows
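The manual-first, automatic-later plan discussed in this thread could be expressed with two triggers on the same workflow. This is only a sketch: the `hive_version` input is a hypothetical name, the `release` event fires only for GitHub Releases (not for plain Git tags), and it assumes the release tag name is usable as an image tag.

```yaml
name: ci hive docker image

on:
  # Manual runs, with the version supplied by the person triggering the build.
  workflow_dispatch:
    inputs:
      hive_version:
        description: 'Hive version to build'
        required: true
        type: string
  # Automatic runs when a GitHub Release is published.
  release:
    types: [published]

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    env:
      # Manual input if present, otherwise the tag of the published release.
      HIVE_VERSION: ${{ github.event.inputs.hive_version || github.event.release.tag_name }}
    steps:
      - name: Show the version being built
        run: echo "Building image for Hive ${HIVE_VERSION}"
```

Keeping `workflow_dispatch` alongside `release` leaves room for a human to verify an image locally and then publish it by hand, which is the concern raised above.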
0b9eb3e to 2c09d78
99621f1 to 1e5614c
Set up github actions workflow to build and push docker image to docker hub
Kudos, SonarCloud Quality Gate passed!
@zabetak / @abstractdog / @dengzhhu653 any comments/thoughts on this? @simhadri-g has a planned follow-up on this as well; he can share the details.
Since we are pushing the changes to Docker Hub for the first time, we wanted to trigger it with workflow_dispatch to verify the Docker Hub integration and publish the GA images. Once we verify that this GH action succeeds, I would like to parameterize this and we can set it to trigger automatically every release when a new
Will reopen the PR after testing of the new changes completes. This is to prevent unnecessary runs of the Hive precommit tests.
New PR:
Hi everyone,
I have got the Docker Hub repository for Apache Hive set up by Infra.
https://issues.apache.org/jira/browse/INFRA-24505
DockerHub: https://hub.docker.com/r/apache/hive
In order to publish the Docker image to Docker Hub, this PR sets up a GitHub Actions workflow to build and push the image. The workflow was tested on a Hive fork and the image was successfully pushed here: https://hub.docker.com/repository/docker/simhadri064/hive/tags?page=1&ordering=last_updated
We will need to decide how often we push these images to Docker Hub.
What changes were proposed in this pull request?
Why are the changes needed?
Does this PR introduce any user-facing change?
How was this patch tested?
Set up the same workflow on my fork and pushed to my personal Docker Hub account via GitHub Actions.