Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automate changelog updates #547

Open
wants to merge 6 commits into
base: develop
Choose a base branch
from

Conversation

KshitijThareja
Copy link
Contributor

Description

This PR introduces a new workflow to automate the updates to CHANGELOG.md whenever a PR is merged into KDS. It also adds a modified version of the existing changelog workflow, which now checks for the presence of Changelog section in PRs.

Issue addressed

Addresses #533

Changelog

  • Description: Introduce a new workflow to automate the updates to CHANGELOG.md whenever a PR is merged into KDS. Also add a modified version of the existing changelog workflow, which checks for the presence of Changelog section in PRs.
  • Products impact: -
  • Addresses: -
  • Components: -
  • Breaking: no
  • Impacts a11y: no
  • Guidance: -

Steps to test

  1. Create a new test PR.
  2. Make sure to include the Changelog section, otherwise the changelog action won't be successful.
  3. When the PR is merged, the Changelog section is automatically appended to the CHANGELOG.md file.

(optional) Implementation notes

At a high level, how did you implement this?

Does this introduce any tech-debt items?

Testing checklist

  • Contributor has fully tested the PR manually
  • If there are any front-end changes, before/after screenshots are included
  • Critical and brittle code paths are covered by unit tests
  • The change is described in the changelog section above

Reviewer guidance

  • Is the code clean and well-commented?
  • Are there tests for this change?
  • Are all UI components LTR and RTL compliant (if applicable)?
  • Add other things to check for here

After review

  • The changelog item has been pasted to the CHANGELOG.md

Comments

@MisRob
Copy link
Member

MisRob commented Feb 16, 2024

Thanks @KshitijThareja, amazing! I will take some time to try out the workflow in my test repo t at first and then ask my teammates who are much more skilled in GH actions than me to review in more detail. I'd like to get to it by the beginning of next week, but as usual, it could take up to two weeks.

@MisRob MisRob self-assigned this Feb 16, 2024
@MisRob
Copy link
Member

MisRob commented Feb 22, 2024

Hi @KshitijThareja, I tested the main flows like opening/closing/merging PR in my test repository. Both check for the presence of PR section in the PR description and updating the changelog after merging the PR worked smoothly! This will be such an important help for our workflow.

I will try to review code as best as I can some time next week and will invite @rtibbles to give the final approval as my GH actions and bash skills are not great.

@MisRob
Copy link
Member

MisRob commented Feb 22, 2024

I am not sure how to best confirm the dependabot part right now as I didn't find time yet to configure it in my test repo. But if the code looks well, I'd be fine with trying out after merge since @KshitijThareja apparently tested ;)

echo "${pr_link_ref}" >> pr_info.txt
echo >> pr_info.txt

sed -i "/<!-- Put all new updates into this section, please -->/r pr_info.txt" CHANGELOG.md
Copy link
Member

@MisRob MisRob Feb 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to use something else to identify the correct place to paste the new item because "Put all new updates into this section, please" was just a temporary comment. I will think about it more and come back. Perhaps we could add some kind of special comment to the changelog that would be obviously intended for this action to prevent ourselves from deleting it accidentally.

Copy link
Contributor Author

@KshitijThareja KshitijThareja Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah sure. I can make the changes once we've got that part figured out. We just need something to serve as a marker for the workflow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I just posted a suggestion down there

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other thing is that this gets a little more complex when we have a release branch and the main branch in play. At the moment, this action will be pushing all updates to the default (main) branch, but we actually want to check out the target branch of the closed PR instead to make the CHANGELOG updates. I see you are taking that into account below, but not for the checkout, so this will attempt to push to the branch that the PR was based on, but with the default branch, so a release branch will get pushed to be main instead.

id: check_description
run: |
description=$(jq -r ".pull_request.body" "$GITHUB_EVENT_PATH")
if [[ "$description" == *"## Changelog"* ]]; then
Copy link
Member

@MisRob MisRob Feb 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not required, would just be nice-to-have. People sometimes open a PR and even though they don't delete the Changelog, they leave it unfilled, like this

Screenshot from 2024-02-22 19-46-41

Right now, the action check will pass. If we could check that for example "Description" was changed from its default value, that would help from merging such items accidentally. Or we could do something else along this line if you have any better idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree to that. Description is one field which we expect every contributor to fill out. So it makes sense to check if it has been updated or not to further verify the presence of changelog section.
Also, while going through everything, I realized that I should have included a check to see if the PR is made by dependabot in the check-changelog workflow. If we don't have that, the action will keep failing for dependabot PRs.
If you agree to the above changes, I can start working on them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense to me, thank you

Copy link
Member

@MisRob MisRob Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while going through everything, I realized that I should have included a check to see if the PR is made by dependabot in the check-changelog workflow

We're okay with merging with checks failing as long as there's a good reason for it, so it wouldn't be anything blocking. Nice to have for sure, as dependabot PRs will be a bit more clear.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note that even when we can have more changelog items in the PR description as I've just mentioned below, it's absolutely okay to only check for the first "Description" being changed since we basically just want to detect that someone changed something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I'll try to get these changes done soon 👍🏻

@KshitijThareja
Copy link
Contributor Author

Hi @MisRob. Glad to know that you were able to test the workflows successfully. This PR required me to test the workflows locally multiple times, I guess more than any code I've ever written before 😅... made me realize how important of a step it is.

@MisRob
Copy link
Member

MisRob commented Feb 23, 2024

@KshitijThareja I realized that we can have more changelog items in a PR description (example PR) so to summarize our conversations about recognizing where to copy/paste from/to, I think it'd be reasonable to

  • Mark beginning and end of changelog section in the PR template
<!-- [DO NOT REMOVE-USED BY GH ACTION] CHANGELOG START -->

<!-- [DO NOT REMOVE-USED BY GH ACTION] CHANGELOG END -->
  • Mark place below which changelog items should be pasted in the CHANGELOG.md
<!-- [DO NOT REMOVE-USED BY GH ACTION] PASTE CHANGELOG -->

These comments can then safely be used by the action.

Copy link
Member

@rtibbles rtibbles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of things that need to be considered!

echo "${pr_link_ref}" >> pr_info.txt
echo >> pr_info.txt

sed -i "/<!-- Put all new updates into this section, please -->/r pr_info.txt" CHANGELOG.md
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other thing is that this gets a little more complex when we have a release branch and the main branch in play. At the moment, this action will be pushing all updates to the default (main) branch, but we actually want to check out the target branch of the closed PR instead to make the CHANGELOG updates. I see you are taking that into account below, but not for the checkout, so this will attempt to push to the branch that the PR was based on, but with the default branch, so a release branch will get pushed to be main instead.


on:
workflow_run:
workflows: ["Check PR for Changelog"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need this trigger here? Only need to react to the pull_request closed event below. This trigger will never pass the github.event.pull_request.merged check (and may even fail as it won't have a pull_request attribute).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rtibbles. I understood your concern regarding taking the target branch into consideration for checkout.
I'll make the required changes within some days time.

@KshitijThareja
Copy link
Contributor Author

@rtibbles @MisRob I've updated the workflows as requested. Can you please review the changes once?
Thanks :)

@KshitijThareja
Copy link
Contributor Author

@MisRob I had a question. You had earlier provided an example for multiple changelog entries in a single PR. But now that we are removing the PR number and link from the PR template, how is the multiple changelogs scenario going to look like? A PR author won't just start typing the Description for the next changelog entry just below the previous one I presume?

@MisRob
Copy link
Member

MisRob commented Mar 4, 2024

Hi @KshitijThareja

A PR author won't just start typing the Description for the next changelog entry just below the previous one I presume?

I think so, yes

@KshitijThareja
Copy link
Contributor Author

Hi @MisRob!
What should be the next steps here? Is there any workaround for the multiple changelog entries issue?

@MisRob
Copy link
Member

MisRob commented Mar 12, 2024

Hi @KshitijThareja

What should be the next steps here? Is there any workaround for the multiple changelog entries issue?

Do you think that multiple changelog items can be recognized in the action, for example by searching for the first row ("Description") and last row ("Guidance")? I think that "Description" and "Guidance" should always be there. If yes, I think we could preserve the current changelog format with a PR link for each item, right?

If you don't think it's feasible, we can also think about tweaking the changelog format in regards to where PR links are and you're welcome to make suggestions based on what you think would work well with the automation.

@KshitijThareja
Copy link
Contributor Author

Hi @MisRob
Sorry I forgot to follow up here. I was working on correcting the script to identify separate changelog entries. I've added the required changes. The action will now be able to identify and add PR links to each changelog entry in a PR. One modification that I think would be helpful is asking the users to manually separate each changelog entry by mentioning it explicitly in the PR, for example:

Changelog 1:
< CONTENT >

Changelog 2:
< CONTENT >

---and so on. 

It would be helpful in increasing the readability in the case of multiple changelogs. What are your views on this?

@MisRob
Copy link
Member

MisRob commented Mar 15, 2024

Thank you @KshitijThareja!

Yes, I think it'd be nice if people could separate changelog entries in a clear manner and we can have a recommendation in comments. I don't think we can rely that it will be present though, but that shouldn't be a problem, is that right? Will the script pick it up in either of those cases?

@rtibbles
Copy link
Member

I think the reviewer will have to take responsibility for making sure this is formatted properly before merge - the other possibility is that there could be a separate PR action that validates the changelog section (and maybe comments on the PR with what the changelog change will look like).

@KshitijThareja
Copy link
Contributor Author

Will the script pick it up in either of those cases?

Yeah it will be able to pick up separate changelog items in either of those cases.

@KshitijThareja
Copy link
Contributor Author

Hi @rtibbles!
I don't really think there's a need for a separate PR action for this right now, though it may be good to have. We are already providing them with the PR template that shows how the changelog should be written by default. And I assume most PR authors would try to stick to the template.
We can add clear instructions on how the changelog section should be formatted in the PR template itself. What's your opinion on this?

@rtibbles
Copy link
Member

I don't review a lot of KDS PRs - so of course I am happy to say "the reviewer should do it"! We could also file that automation idea as a follow up issue :)

@KshitijThareja
Copy link
Contributor Author

Sure @rtibbles. I guess it would be better to finalize the current changes first and then have a follow up issue for the automation part. I'd be happy to work on it once this is finalized :)

@MisRob
Copy link
Member

MisRob commented Mar 15, 2024

There's already separate action that checks for the presence of at least one changelog item in the PR description before merging it so we could gradually build on top of that if needed.

Hearing that it the script can pick up more changelog items, no matter whether we title them or no, I feel confident about this version.

Let me give it a test try and then we can proceed to final review and hopefully start using soon :)

Thanks both

@KshitijThareja
Copy link
Contributor Author

Hi @MisRob
Regarding the first issue, I think I know what the error is. It is somehow evaluating "null" return value to true. I'll add another check for that and update the code in some time after testing it.
About the second one, I had started researching as to why that should happen. Thanks for updating that it wasn't an issue. I was basically clueless about what might have caused such a thing to happen.

@MisRob
Copy link
Member

MisRob commented Mar 20, 2024

About the second one, I had started researching as to why that should happen. Thanks for updating that it wasn't an issue. I was basically clueless about what might have caused such a thing to happen.

Ah yeah apologies, I hope you didn't spend long time on it before I figured out. Still learning :)

@KshitijThareja
Copy link
Contributor Author

Hey @MisRob
I've updated the action to identify the presence of Description in changelog correctly. It should work fine now :)

@MisRob
Copy link
Member

MisRob commented Mar 22, 2024

Thanks @KshitijThareja, I re-tested the whole flow and it looks ready to me. After merging, we will need to do some little tweaks to sync the action with the repo settings. I think I will do it because I have access and it's easier to test that way. I may also open a follow-up or two with some minor cosmetic improvements.

@rtibbles gave the code one final look and one thing he noted

the is_dependabot step doesn't actually output a boolean... it just outputs the username of the user, which can just be got from the github context github.event.pull_request.user doesn't have to be blocking though

Would you like to have a look at it now or better as follow-up?

@KshitijThareja
Copy link
Contributor Author

Welcome @MisRob, It was a good learning experience for me :)
About the dependabot check, I'm aware that it output the name of user(dependabot[bot] if the PR is made by dependabot). I can remove that specific step and just check for it using github.event.pull_request.user as suggested by @rtibbles. That will of course help in making the overall code concise. I'll just make the changes and update in some time.

@KshitijThareja
Copy link
Contributor Author

I've made the required changes 👍🏻

@KshitijThareja
Copy link
Contributor Author

Hello @MisRob!
Any updates on this? Please let me know if you feel there's anything else that needs to be worked upon before merging this PR.

@MisRob
Copy link
Member

MisRob commented Mar 27, 2024

Hi @KshitijThareja, thank you, I think there's nothing else to do right now. We will keep it open until I have a few tweaks to sync the action with the repo settings ready, and then merge it together. I think I will open a PR to your branch at some point.

@KshitijThareja
Copy link
Contributor Author

Sure @MisRob, thanks : )

@MisRob
Copy link
Member

MisRob commented Apr 5, 2024

I'm sorry it's taking so long @KshitijThareja - I put too many things on my plate. I will open a follow-up issue for things I wanted to do and ask in our planning meeting for it to get assigned to someone who could help. As we already saw the code in detail and did testing rounds, I think you can consider this accepted even though it will take some more time to merge. Thanks so much!

@KshitijThareja
Copy link
Contributor Author

No problem @MisRob. This PR can be merged whenever the team is done working on the other prequisite steps for the new workflow to be incorporated into KDS. Please do let me know if you require any further assistance from my side.
Thanks :)

@MisRob MisRob removed their assignment Apr 9, 2024
@EshaanAgg
Copy link
Contributor

Hi! I was browsing past this issue, and the work done here and wanted to drop a small suggestion/idea. I felt we put too much stress on writing bash commands to perform the validation on the PR body, parse it correctly, and then update the changelog. Wouldn't it be easier to maintain and update if we wrote a Python script to perform all this validation and parsing, which can be called with the PR's metadata as arguments?

Having scripts in workflows is a common technique I have seen in plenty of organizations for their specific use cases. Having a script would make it easier to add more complex formats in the future, and we could use the numerous scripting languages to make the same very easy and obvious for anyone like me passing by.

Feel free to ignore it if it feels irrelevant. Thanks!

@KshitijThareja
Copy link
Contributor Author

Hi @EshaanAgg!

It would indeed be a great idea to write the scripts in python if we wanted to improve the readability of the code. But considering bash commands would require lesser time to run, I'd say they are much more efficient than python scripts. Also, in my opinion we rarely require changes to be made to the existing GitHub actions suite, that too only if the existing actions aren't working as one would expect them to function or if they are outdated. Considering that, I'd prefer to keep the bash scripts, but I'd love to hear others' opinions on this.

Thank you :)

@EshaanAgg
Copy link
Contributor

EshaanAgg commented Apr 28, 2024

Seems fair. Just some additional thoughts:

  • Since we will be just working with small character texts, I don't think there would be much difference in the speed of the same (there would be of-course, but not something that's too bothersome from a workflow perspective. Maybe a couple of seconds in the worst case?).
  • I am a bit skeptical about the confidence we can have in bash scripts. They need manual testing of PR creation to be vetted, whereas for Python scripts we can just write automated tests (once we have manually verified that indeed the script gets call with the right content from the PR)
  • Scripts in Python can have wider text processing scope, like trimming, basic formatting, custom error messages on missing fields, and handling text in general.

It might require a bit more work and might not be the most "direct" solution, but I think its prospects are worthwhile. Would love to hear more from you and the maintainers as well. Kudos!

@MisRob
Copy link
Member

MisRob commented Apr 29, 2024

I appreciate the discussion and both of you have good points, @KshitijThareja and @EshaanAgg, thank you.

Generally, I lean towards Python or JavaScript scripts since it's much more readable to me and testing would be easier (that's something that's constantly problematic for GH actions..). My perspective is definitely affected by my frontend focus and poor bash skills. That said, since there aren't many of us writing GH actions on daily basis, I suspect that other people on the team might welcome it too, but we'd really need to hear from them. I will reference this conversation in Slack and let's see if someone wants to chime in. I am not entirely sure on Python vs JavaScript. I remember that @rtibbles mentioned to me a few times that he saw people using JavaScript, and I think he will know more than me how it compares.

@KshitijThareja I wanted to note that if we come to some conclusion, this particular PR won't be required to be re-written, so you can take it as a related discussion from which we may open a follow-up issue or no, or just use it as learning for new actions to come.

@KshitijThareja
Copy link
Contributor Author

Sure thing @MisRob.
We should certainly discuss this in the slack channel as others would be able to give their inputs too and then we can come to a collective decision as to what would be the best approach for this.
Thanks @EshaanAgg for suggesting an alternate path to write these actions . Looking forward to a fruitful discussion :)

@bjester
Copy link
Member

bjester commented Apr 29, 2024

I agree using Python or JS is definitely better than Bash. I think the downsides are that Bash has some really helpful and powerful utilities, and you can write really concise commands with it. With Python or JS, interacting with the system, other commands, or the filesystem requires a lot more code. I've found this package helpful before, and I think similar things can be done in Python.

@KshitijThareja
Copy link
Contributor Author

I agree that Python and JS have their merits and are more suited for certain tasks, but it's important to consider the trade-offs in terms of complexity, and integration efforts. For interacting with the system and simple text processing, using bash scripts is a better option in my opinion, whereas I would choose Python for more complex tasks like advanced data processing. By sticking with bash scripts for this particular workflow, we can ensure that it aligns well with our goals of efficiency and reliability.

But yes, if there are compelling reasons for the maintainers to switch to Python/JS, we should definitely try that out.

P.S. Bash's excellence cannot be denied ;)

@rtibbles
Copy link
Member

My only real reason for choosing JS or Python over Bash is maintainability. Bash is powerful, but also far less commonly used by anyone on the LE team, so even if doing something in Bash would result in an action being completed in 2 seconds, the additional maintainability of using JS or Python means that it taking 6 seconds to run is probably not a big deal.

Bash might seem like the obvious choice as it can be defined inline in Github Actions workflows, but Python can be used in this way using the shell option: https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsshell

See an example here: https://github.com/learningequality/kolibri-installer-windows/blob/main/.github/workflows/build_exe.yml#L122

Likewise, JS can be used using the script action https://github.com/actions/github-script - which has built in support for accessing the Github API. Also, it allows defining separate JS files in the repo that can be imported, in order to keep the amount of script inlined in the repo minimal. See here for an example: https://github.com/learningequality/kolibri/blob/develop/.github/workflows/pr_build_comment.yml#L37

Overall, my tendency would be to use Python for file manipulation and string processing (things that tend to be quite easy and concise in Python), and JS via the script action for interacting with the Github API. I do still use direct bash commands occasionally, but only for simple operations like mv, cp, unlink, etc. Again, this is mostly due to my and the team's more limited familiarity with bash.

@KshitijThareja
Copy link
Contributor Author

I undestand your concerns. It would indeed be better to switch to using Python scripts in that case. Please let me know if I can help with the suggested implementation.
Thank you 😀

@bjester
Copy link
Member

bjester commented Apr 29, 2024

@KshitijThareja There are also tradeoffs with Bash too. Bash is great, but for those unfamiliar, managing state across bash scripts and bash functions can be a real challenge. And for example, it looks like you made heavy usage of piping. It can be difficult to debug a long chain of piped commands if one perhaps stops operating as expected.

@KshitijThareja
Copy link
Contributor Author

I guess it's just different perspectives at play. Nevertheless, I understand why using Python/JS for writing these scripts would be the ideal choice in this situation.
I'd also like to thank everyone for participating in this discussion. It was interesting to hear different arguments. In the end, we should go with whatever seems to be the best option to everyone.

@MisRob
Copy link
Member

MisRob commented May 2, 2024

Yes @KshitijThareja, I think a group of people who use bash on a daily basis would see it differently. It seems you have some passion for bash :)! Anyways, thanks for being open and thanks to all who took time to contribute in the discussion, and @EshaanAgg for opening it up, once again.

It's good to have clarity around what would be helpful in general for actions being used by the team with our background. I think we can use @rtibbles summary as good guidance for the upcoming work:

Overall, my tendency would be to use Python for file manipulation and string processing (things that tend to be quite easy and concise in Python), and JS via the script action for interacting with the Github API. I do still use direct bash commands occasionally, but only for simple operations like mv, cp, unlink, etc.

As mentioned earlier @KshitijThareja, as soon as we do the preparations on our side to seamlessly integrate these two actions, we will accept the PR as is so you don't need to refactor the whole thing. It's already very helpful work. It seems most of the people here would consider migrating some parts of the script away from bash useful so we will welcome it, but definitely not required for merge. Just let us know what works for you best.

@KshitijThareja
Copy link
Contributor Author

Hi @MisRob!
I'm okay with anything. If it's possible for the team to merge this PR now, we can have follow-up issues to migrate the required parts of the script. Until then, the current changes should serve their purpose. It's already a very large PR, so it'd be better if we work on the new changes separately.

@MisRob
Copy link
Member

MisRob commented May 8, 2024

Hi @KshitijThareja, yes I agree it would make sense to do as follow-up, after we merge this PR. We can merge it yet though, that can happen after #609

@MisRob
Copy link
Member

MisRob commented May 8, 2024

Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants