GitHub Action
accum-content
Userlog • Changelog • Dependencies • Known issues • Copyright and License
GitHub composite action to periodically download and accumulate content into repository.
Tutorial to use with: https://github.com/andry81-devops/accum-content
All tutorials: https://github.com/andry81/index#tutorials
-
Does store downloaded content in a repository on the GitHub, so all the resources does render from a repository file and does access a repository instead of an external resource, and:
-
Prevents a broken link case holding at the same time the link content up to date and has option to roll back the content to previous version in case of broken or invalid content (if history was not rewrited and purged).
-
Reduces traffic to an external resource and avoids an out-of-service case.
-
Better does cacheing on the GitHub side, when a resource can be checked on a cache expiration and so updated opposite to an external resource, where the resource is relied on external caching and so has to be completely redownloaded.
-
-
Content can be validated or/and altered after download, but before store in a repository. For example, for a SVG file - checks on valid values, size of canvas or malformed svg format.
-
Downloads content under GitHub Actions pipeline IP instead of a client IP.
-
Repository to store content config file and content itself can be different, and you may directly point the content repository as commits list:
https://github.com/{{REPO_OWNER}}/{{REPO}}--gh-content-cache/commits/{{BRANCH}}
-
Workflow is used accum-content.sh bash script to accumulate content
-
The script does accumulate content into a repository and does update the content index file:
content-index.yml
-
Can extract content not related to a specific repository or related to multiple repositories from a target repository.
-
Can validate content by shell code handler before update it and filter out invalid content.
-
Extracted content can be updated separately from a target repository. For example, after a release commit into a target repository.
-
Traffic to an external resource can be replaced by traffic to the GitHub resource with better caching.
-
All content can be stored in a single place and content change be saved into a commits list.
-
Download and accumulate process can be controlled by explicit config file with parameters (can be outside of a content cache repository).
-
The content cache repository can be rewrited to remove old files and history does not touching anything else.
-
CONTINUE_ON_INVALID_INPUT=1
,CONTINUE_ON_EMPTY_CHANGES=1
: Treats invalid input or empty changes as not an error as by default. -
ERROR_ON_EMPTY_CHANGES_WITHOUT_ERRORS=1
ifCONTINUE_ON_EMPTY_CHANGES=1
: Treats empty changes without errors as an error if enabled to continue on empty changes. -
NO_SKIP_UNEXPIRED_ENTRIES=1
: Doesn't skip not yet expired entries and continue to process them. -
NO_DOWNLOAD_ENTRIES=1
,NO_DOWNLOAD_ENTRIES_AND_CREATE_EMPTY_INSTEAD=1
: Skips download of all entries to not wait on download at all. -
Uses and updates content index file
content-index.yml
to control the up to date and by default does generate the content index file from the content config filecontent-config.yml
, if does not exist. -
ENABLE_GENERATE_CHANGELOG_FILE=1
,CHANGELOG_FILE=".../content-changelog.txt"
: Generates a textual changelog file with notes about changes per commit including the changes absence in case of skipped errors or skipped content update. -
ENABLE_COMMIT_MESSAGE_DATE_WITH_TIME=1
: Inserts the time string in formatHH:MMZ
additionally after the date in each commit message (by default inserts only a date for shorter commit messages). -
ENABLE_COMMIT_MESSAGE_WITH_WORKFLOW_RUN_NUMBER=1
: Inserts the workflow run number after date/time prefix in each commit message (by default does not insert for shorter commit messages). -
ENABLE_GITHUB_ACTIONS_RUN_URL_PRINT_TO_CHANGELOG=1
: Prints GitHub Actions Run URL (with or without workflow run number) into the changelog file to reference the log on the GitHub from the changelog file. -
ENABLE_REPO_STORE_COMMITS_URL_PRINT_TO_CHANGELOG=1
: Prints Content Store Repository commit URL into the changelog file to reference the commit from being committed changelog file.Note The actual hash of the commit can not be know on the moment of the commit. So instead of the commit hash, an approximate date of the commit is used (~ +5 min ahead) in format of:
https://github.com/{{REPO_OWNER}}/{{REPO}}--gh-content-cache/commits?branch={{BRANCH}}&time_zone=utc&until=YYYY-MM-DD
-
Can run download validation shell code after a file download (see
content-config.yml
example below). -
USE_APT_GET_INSTALL_CMDLINE="<apt-get-install-cmdline>"
,USE_PYTHON3_PIP_INSTALL_CMDLINE="<pip-install-cmdline>"
: Installsapt-get
andpython3
packages and modules (w/o caching feature support) before script execution.Note This is mostly for the testing purposes. In case of slow installation of many dependencies, packages or modules you have to use an installation together with the caching feature support. See available list of solutions from here: https://stackoverflow.com/questions/59269850/caching-apt-packages-in-github-actions-workflow/73500415#73500415
Warning You must replace all placeholder into respective values:
{{REPO_OWNER}}
-> repository owner{{REPO}}
-> your repository{{BRANCH}}
-> your repository branch or reference
{{REPO_OWNER}}/gh-content-cache-config/raw/master/repo/{{REPO_OWNER}}/{{REPO}}/content-config.yml
:
content-config:
entries:
- init:
#shell: bash
## Input variables:
## GH_WORKFLOW_ROOT
##
## CAUTION:
## First line must be a shebang line, otherwise each script line will run in a child bash process!
##
#run: |
# #!/bin/bash
# exit 0
dirs:
- dir: badges/traffic/views
schedule:
next-update:
timestamp: 04:00Z
#xpath-match-tokens: |
# svg/g[1]/text[3]
files:
- file: all.svg
query-url: https://img.shields.io/badge/dynamic/json?color=success&label=views|all&query=count&url=https://github.com/andry81-stats/{{REPO}}--gh-stats/raw/master/traffic/views/latest-accum.json?raw=True&logo=github
- file: all-14d.svg
query-url: https://img.shields.io/badge/dynamic/json?color=success&label=14d&query=count&url=https://github.com/andry81-stats/{{REPO}}--gh-stats/raw/master/traffic/views/latest.json?raw=True
- file: unq.svg
query-url: https://img.shields.io/badge/dynamic/json?color=success&label=views|unq&query=uniques&url=https://github.com/andry81-stats/{{REPO}}--gh-stats/raw/master/traffic/views/latest-accum.json?raw=True&logo=github
- file: unq-14d.svg
query-url: https://img.shields.io/badge/dynamic/json?color=success&label=14d&query=uniques&url=https://github.com/andry81-stats/{{REPO}}--gh-stats/raw/master/traffic/views/latest.json?raw=True
- dir: badges/traffic/clones
schedule:
next-update:
timestamp: 04:00Z
#xpath-match-tokens: |
# svg/g[1]/text[3]
files:
- file: all.svg
query-url: https://img.shields.io/badge/dynamic/json?color=success&label=clones|all&query=count&url=https://github.com/andry81-stats/{{REPO}}--gh-stats/raw/master/traffic/clones/latest-accum.json?raw=True&logo=github
- file: all-14d.svg
query-url: https://img.shields.io/badge/dynamic/json?color=success&label=14d&query=count&url=https://github.com/andry81-stats/{{REPO}}--gh-stats/raw/master/traffic/clones/latest.json?raw=True
- file: unq.svg
query-url: https://img.shields.io/badge/dynamic/json?color=success&label=clones|unq&query=uniques&url=https://github.com/andry81-stats/{{REPO}}--gh-stats/raw/master/traffic/clones/latest-accum.json?raw=True&logo=github
- file: unq-14d.svg
query-url: https://img.shields.io/badge/dynamic/json?color=success&label=14d&query=uniques&url=https://github.com/andry81-stats/{{REPO}}--gh-stats/raw/master/traffic/clones/latest.json?raw=True
- dir: badges/metrics
schedule:
next-update:
timestamp: 08:00Z
files:
- file: commits-since-latest.svg
query-url: https://img.shields.io/github/commits-since/{{REPO_OWNER}}/{{REPO}}/latest?sort=semver&label=Github%20commits%20since%20latest
- file: latest-release-name.svg
query-url: https://img.shields.io/github/v/release/{{REPO_OWNER}}/{{REPO}}?include_prereleases&label=latest
- file: github-all-releases.svg
query-url: https://img.shields.io/github/downloads/{{REPO_OWNER}}/{{REPO}}/total?label=Github%20dl|all&logo=github
- dir: badges/metrics
schedule:
next-update:
timestamp: 08:00Z
#xpath-match-tokens: |
# svg/g[1]/text[3]
files:
- file: shields-repo-size.svg
query-url: https://img.shields.io/github/repo-size/{{REPO_OWNER}}/{{REPO}}?logo=github
- file: shields-code-size.svg
query-url: https://img.shields.io/github/languages/code-size/{{REPO_OWNER}}/{{REPO}}?logo=github
- file: tokei-lines-of-code.svg
query-url: https://tokei.rs/b1/github/{{REPO_OWNER}}/{{REPO}}?category=code
# download-validate:
# shell: bash
# # Input variables:
# # GH_WORKFLOW_ROOT, GH_WORKFLOW_FLAGS,
# # IS_STORED_FILE_EXIST, STORED_FILE, STORED_FILE_SIZE, STORED_FILE_HASH,
# # DOWNLOADED_FILE, DOWNLOADED_FILE_SIZE, DOWNLOADED_FILE_HASH
# #
# # CAUTION:
# # First line must be a shebang line, otherwise each script line will run in a child bash process!
# #
# run: |
# #!/bin/bash
#
# source "$GH_WORKFLOW_ROOT/_externals/tacklelib/bash/tacklelib/bash_tacklelib" || exit 255
#
# tkl_include_or_abort "$GH_WORKFLOW_ROOT/bash/github/init-xq-workflow.sh"
#
# if (( ${#XQ_CMDLINE_READ[@]} )); then
# IFS=$'\n' read -r value <<< "$("${XQ_CMDLINE_READ[@]}" '.svg.g[1].text[3]."#text"' "$DOWNLOADED_FILE")" || exit 255
# elif (( ${#XMLSTARLET_CMDLINE_SEL[@]} )); then
# value="$(sed 's/<svg [^>]*/<svg/' "$DOWNLOADED_FILE" | "${XMLSTARLET_CMDLINE_SEL[@]}" -t -v "//svg/g[2]/text[4]")" || exit 255
# else
# exit 255
# fi
#
# value="${value//[ $'\t'a-zA-Z]/}"
# value_int="${value%.*}"
# value_fract="${value#*.}"
# [[ "$value_fract" == "$value" ]] && value_fract=0
# (( value_int || value_fract )) && exit 0
#
# exit 255
#- file: tokei-lines.svg
# query-url: https://img.shields.io/tokei/lines/github/{{REPO_OWNER}}/{{REPO}}?logo=tokei
.github/workflows/accum-content.yml
:
name: "content update at least every 3 hours"
on:
schedule:
- cron: "0 */3 * * *"
# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:
inputs:
# for tests
no_skip_unexpired_entries:
description: 'Do not skip unexpired entries'
required: false
default: 'false'
no_download_entries:
description: 'Do not download entries'
required: false
default: 'false'
jobs:
accum-content-cache:
runs-on: ubuntu-latest
steps:
- uses: {{REPO_OWNER}}/gh-action--accum-content@master
with:
deps_repo_owner: {{REPO_OWNER}}
deps_repo_branch: master
deps_repo_read_token: ${{ github.token }}
config_repo_owner: {{REPO_OWNER}}
config_repo: {{REPO}}--gh-content-cache-config
config_repo_branch: master
config_repo_read_token: ${{ secrets.READ_STATS_TOKEN }}
store_repo_owner: {{REPO_OWNER}}
store_repo: {{REPO}}--gh-content-cache
store_repo_branch: master
store_repo_write_token: ${{ secrets.READ_STATS_TOKEN }}
commit_msg_entity: my-store-dir-1
# config repo
content_config_file: repo/{{REPO_OWNER}}/{{REPO}}/content-config.yml
# store repo
content_index_file: repo/{{REPO_OWNER}}/{{REPO}}/content-index.yml
content_index_dir: repo/{{REPO_OWNER}}/{{REPO}}
curl_flags: >-
-H 'Cache-Control: no-cache'
-v
flags: >-
ENABLE_PRINT_INITIAL_ENV_INTO_STDOUT=1
env: >-
ENABLE_GENERATE_CHANGELOG_FILE=1
CHANGELOG_FILE=repo/{{REPO_OWNER}}/{{REPO}}/content-changelog.txt
ENABLE_COMMIT_MESSAGE_DATE_WITH_TIME=1 # insert the time string in format HH:MMZ additionally after the date in each commit message
ENABLE_COMMIT_MESSAGE_WITH_WORKFLOW_RUN_NUMBER=1 # insert the workflow run number after date/time prefix in each commit message
ENABLE_YAML_DIFF_PRINT_AFTER_EDIT=1
ENABLE_YAML_DIFF_PRINT_BEFORE_PATCH=1
ENABLE_GITHUB_ACTIONS_RUN_URL_PRINT_TO_CHANGELOG=1
ENABLE_REPO_STORE_COMMITS_URL_PRINT_TO_CHANGELOG=1
CONTINUE_ON_EMPTY_CHANGES=1
ERROR_ON_EMPTY_CHANGES_WITHOUT_ERRORS=1
# CONTINUE_ON_INVALID_INPUT=1
# ENABLE_YAML_PRINT_AFTER_EDIT=1
# ENABLE_YAML_PRINT_AFTER_PATCH=1
# ENABLE_YAML_PATCH_DIFF_PAUSE_MODE=1
# NO_SKIP_UNEXPIRED_ENTRIES=1
# NO_DOWNLOAD_ENTRIES=1
# NO_DOWNLOAD_ENTRIES_AND_CREATE_EMPTY_INSTEAD=1
# "USE_APT_GET_INSTALL_CMDLINE=-y xmlstarlet"
# "USE_PYTHON3_PIP_INSTALL_CMDLINE=xq"
Note You can use
secrets.READ_STATS_TOKEN
instead ofsecrets.WRITE_STATS_TOKEN
as long as both repositories under the same repository owner.
Warning You must use different values for
deps_repo_owner
,config_repo_owner
andstore_repo_owner
if respective repositories actually under different repository owners.
Note See REUSE section for details if you have multiple repositories and want to store all GitHub workflow scripts (
.github/workflows/*.yml
) in a single repository.
https://github.com/andry81-devops/gh-known-issues#known-issues
https://github.com/andry81-devops/gh-known-issues#last-known-issues-updates
Code and documentation copyright 2022 Andrey Dibrov. Code released under MIT License