Update monorepo
arpitjasa-db committed Jan 25, 2024
1 parent 36213c7 commit 777540e
Showing 48 changed files with 797 additions and 359 deletions.
36 changes: 36 additions & 0 deletions .github/workflows/generate-cicd-zip.yaml
@@ -0,0 +1,36 @@
name: Generate CICD Zip
on:
  pull_request:
    paths:
      - 'template/{{.input_root_dir}}/.github/workflows/{{.input_project_name}}**'
      - 'template/{{.input_root_dir}}/.azure/devops-pipelines/{{.input_project_name}}**'

defaults:
  run:
    working-directory: template/{{.input_root_dir}}


jobs:
  run-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Checkout PR
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          gh pr checkout ${{ github.event.pull_request.number }}
      - name: Generate CICD Zip
        run: |
          cp --parents .github/workflows/\{\{.input_project_name\}\}-* cicd/template
          cp --parents .azure/devops-pipelines/\{\{.input_project_name\}\}-* cicd/template
          tar -czvf cicd.tar.gz cicd
      - name: Add Zip to Pull Request
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          git config --global user.name "GitHub Actions Bot"
          git config --global user.email "[email protected]"
          git add cicd.tar.gz
          git commit -m "Add CICD Zip to Pull Request"
          git push
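
Reviewer note: the "Generate CICD Zip" step copies every matching pipeline file into `cicd/template` with its relative path preserved (`cp --parents`), then packs the `cicd` directory into `cicd.tar.gz`. Despite the step name, the result is a gzipped tarball rather than a zip file. The Python sketch below mirrors that behaviour for illustration only. It is not part of the commit, and the function name `build_cicd_archive` and its parameters are hypothetical.

```python
import shutil
import tarfile
from pathlib import Path


def build_cicd_archive(root: Path, patterns: list[str],
                       archive_name: str = "cicd.tar.gz") -> Path:
    """Copy files matching each glob pattern into root/cicd/template, keeping
    their paths relative to root (like `cp --parents`), then pack root/cicd
    into a gzipped tarball (like `tar -czvf cicd.tar.gz cicd`)."""
    staging = root / "cicd" / "template"
    # The workflow assumes cicd/template already exists; the sketch creates it.
    staging.mkdir(parents=True, exist_ok=True)

    for pattern in patterns:
        for src in sorted(root.glob(pattern)):
            if not src.is_file():
                continue
            dest = staging / src.relative_to(root)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)

    archive = root / archive_name
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps archive paths rooted at "cicd/", not at an absolute path.
        tar.add(root / "cicd", arcname="cicd")
    return archive
```

In the template repository the file names contain the literal text `{{.input_project_name}}`, so a matching pattern would be `.github/workflows/{{.input_project_name}}-*`. Braces are not special characters in Python glob patterns, so no escaping is needed there.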
9 changes: 8 additions & 1 deletion README.md
@@ -55,7 +55,7 @@ https://github.com/databricks/mlops-stacks/assets/87999496/0d220d55-465e-4a69-bd
- Python 3.8+
- [Databricks CLI](https://docs.databricks.com/en/dev-tools/cli/databricks-cli.html) >= v0.211.0

[Databricks CLI](https://docs.databricks.com/en/dev-tools/cli/databricks-cli.html) v0.211.0 contains [Databricks asset bundle templates](https://docs.databricks.com/en/dev-tools/bundles/templates.html) for the purpose of project creation.
[Databricks CLI](https://docs.databricks.com/en/dev-tools/cli/databricks-cli.html) contains [Databricks asset bundle templates](https://docs.databricks.com/en/dev-tools/bundles/templates.html) for the purpose of project creation.

Please follow [the instruction](https://docs.databricks.com/en/dev-tools/cli/databricks-cli-ref.html#install-the-cli) to install and set up databricks CLI. Releases of databricks CLI can be found in the [releases section](https://github.com/databricks/cli/releases) of databricks/cli repository.

@@ -70,6 +70,13 @@ To create a new project, run:

This will prompt for parameters for project initialization. Some of these parameters are required to get started:
* ``input_project_name``: name of the current project
* ``input_setup_cicd_and_project`` : If both CI/CD and the project should be set up, or only one of them.
  * ``CICD_and_Project``
  * ``Project_Only``
  * ``CICD_Only``
  * ``Setup_Monorepo``

  We expect Data Scientists to specify ``Project_Only`` to get started in a development capacity, and when ready to move the project to Staging/Production, CI/CD can be set up. We expect that step to be done by Machine Learning Engineers (MLEs) who can specify ``CICD_Only`` during initialization.
* ``input_root_dir``: name of the root directory. It is recommended to use the name of the current project as the root directory name, except in the case of a monorepo with other projects where the name of the monorepo should be used instead.
* ``input_cloud``: Cloud provider you use with Databricks (AWS or Azure), note GCP is not supported at this time.
* ``input_cicd_platform`` : CI/CD platform of choice (GitHub Actions or GitHub Actions for GitHub Enterprise Servers or Azure DevOps)
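
Reviewer note: the parameters listed above can also be collected in a JSON file of answers instead of being typed at the prompts. The Databricks CLI's `bundle init` command accepts such a file through its `--config-file` option; check `databricks bundle init --help` for your CLI version. The sketch below only builds and validates the file contents. The helper name `build_init_config` is hypothetical. The key names, the three setup values, and the project-name pattern are taken from `databricks_template_schema.json` in this commit. `Setup_Monorepo` appears in the README list but not in the schema's `enum`, so it is left out here.

```python
import json
import re

# Values of the "enum" for input_setup_cicd_and_project in the schema.
SETUP_CHOICES = ("CICD_and_Project", "Project_Only", "CICD_Only")

# Same pattern as the schema: at least 3 characters, none of
# space, ".", "\" or "/".
PROJECT_NAME_PATTERN = re.compile(r"^[^ .\\/]{3,}$")


def build_init_config(setup: str, project_name: str, cloud: str = "azure") -> str:
    """Return a JSON document with a few template answers, after applying
    the same checks the schema declares for them."""
    if setup not in SETUP_CHOICES:
        raise ValueError(f"setup must be one of {SETUP_CHOICES}, got {setup!r}")
    if not PROJECT_NAME_PATTERN.match(project_name):
        raise ValueError(f"invalid project name: {project_name!r}")
    if cloud not in ("azure", "aws"):
        raise ValueError(f"cloud must be 'azure' or 'aws', got {cloud!r}")

    config = {
        "input_setup_cicd_and_project": setup,
        "input_project_name": project_name,
        "input_cloud": cloud,
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(build_init_config("Project_Only", "fraud-model", cloud="aws"))
```

Validating before writing the file surfaces a bad project name early, rather than partway through initialization.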
200 changes: 160 additions & 40 deletions databricks_template_schema.json
@@ -2,125 +2,245 @@
"welcome_message": "Welcome to MLOps Stacks. For detailed information on project generation, see the README at https://github.com/databricks/mlops-stacks/blob/main/README.md.",
"min_databricks_cli_version": "v0.211.0",
"properties": {
"input_project_name": {
"input_setup_cicd_and_project": {
"order": 1,
"type": "string",
"description": "{{if false}}\n\nERROR: This template is not supported by your current Databricks CLI version.\nPlease hit control-C and go to https://docs.databricks.com/en/dev-tools/cli/install.html for instructions on upgrading the CLI to the minimum version supported by MLOps Stacks.\n\n\n{{end}}\nSelect if both CI/CD and the Project should be set up, or only one of them. You can always set up the other later by running initialization again",
"default": "CICD_and_Project",
"enum": ["CICD_and_Project", "Project_Only", "CICD_Only"]
},
"input_project_name": {
"order": 2,
"type": "string",
"default": "my-mlops-project",
"description": "{{if false}}\n\nERROR: This template is no longer supported supported by CLI versions v0.211 and lower.\nPlease hit control-C and go to https://docs.databricks.com/en/dev-tools/cli/install.html for instructions on upgrading the CLI.\n\n\n{{end}}\nProject Name",
"description": "\nProject Name. Default",
"pattern": "^[^ .\\\\/]{3,}$",
"pattern_match_failure_message": "Project name must be at least 3 characters long and cannot contain the following characters: \"\\\", \"/\", \" \" and \".\"."

"pattern_match_failure_message": "Project name must be at least 3 characters long and cannot contain the following characters: \"\\\", \"/\", \" \" and \".\".",
"skip_prompt_if": {
"properties": {
"input_setup_cicd_and_project": {
"const": "CICD_Only"
}
}
}
},
"input_root_dir": {
"order": 2,
"order": 3,
"type": "string",
"default": "my-mlops-project",
"description": "\nRoot directory name. Use a name different from the project name if you intend to use monorepo"
"default": "{{ .input_project_name }}",
"description": "\nRoot directory name. For monorepos, this is the name of the root directory that contains all the projects. Default",
"skip_prompt_if": {
"anyOf":[
{
"properties": {
"input_setup_cicd_and_project": {
"const": "CICD_and_Project"
}
}
},
{
"properties": {
"input_setup_cicd_and_project": {
"const": "Project_Only"
}
}
}
]
}
},
"input_cloud": {
"order": 3,
"order": 4,
"type": "string",
"description": "\nSelect cloud",
"default": "azure",
"enum": ["azure", "aws"]
},
"input_cicd_platform": {
"order": 4,
"order": 5,
"type": "string",
"description": "\nSelect CICD platform",
"default": "github_actions",
"enum": ["github_actions", "github_actions_for_github_enterprise_servers", "azure_devops"]
"enum": ["github_actions", "github_actions_for_github_enterprise_servers", "azure_devops"],
"skip_prompt_if": {
"properties": {
"input_setup_cicd_and_project": {
"const": "Project_Only"
}
}
}
},
"input_databricks_staging_workspace_host": {
"order": 5,
"order": 6,
"type": "string",
"default": "{{if eq .input_cloud `azure`}}https://adb-xxxx.xx.azuredatabricks.net{{else if eq .input_cloud `aws`}}https://your-staging-workspace.cloud.databricks.com{{end}}",
"description": "\nURL of staging Databricks workspace, used to run CI tests on PRs and preview config changes before they're deployed to production. Default",
"pattern": "^(https.*)?$",
"pattern_match_failure_message": "Databricks staging workspace host URLs must start with https. Got invalid workspace host."
"pattern_match_failure_message": "Databricks staging workspace host URLs must start with https. Got invalid workspace host.",
"skip_prompt_if": {
"properties": {
"input_setup_cicd_and_project": {
"const": "Project_Only"
}
}
}
},
"input_databricks_prod_workspace_host": {
"order": 6,
"order": 7,
"type": "string",
"default": "{{if eq .input_cloud `azure`}}https://adb-xxxx.xx.azuredatabricks.net{{else if eq .input_cloud `aws`}}https://your-prod-workspace.cloud.databricks.com{{end}}",
"description": "\nURL of production Databricks workspace. Default",
"pattern": "^(https.*)?$",
"pattern_match_failure_message": "Databricks production workspace host URLs must start with https. Got invalid workspace host."
"pattern_match_failure_message": "Databricks production workspace host URLs must start with https. Got invalid workspace host.",
"skip_prompt_if": {
"properties": {
"input_setup_cicd_and_project": {
"const": "Project_Only"
}
}
}
},
"input_default_branch": {
"order": 7,
"order": 8,
"type": "string",
"default": "main",
"description": "\nName of the default branch, where the prod and staging ML assets are deployed from and the latest ML code is staged. Default"
"description": "\nName of the default branch, where the prod and staging ML assets are deployed from and the latest ML code is staged. Default",
"skip_prompt_if": {
"properties": {
"input_setup_cicd_and_project": {
"const": "Project_Only"
}
}
}
},
"input_release_branch": {
"order": 8,
"order": 9,
"type": "string",
"default": "release",
"description": "\nName of the release branch. The production jobs (model training, batch inference) defined in this stack pull ML code from this branch. Default"
"description": "\nName of the release branch. The production jobs (model training, batch inference) defined in this stack pull ML code from this branch. Default",
"skip_prompt_if": {
"properties": {
"input_setup_cicd_and_project": {
"const": "Project_Only"
}
}
}
},
"input_read_user_group": {
"order": 9,
"order": 10,
"type": "string",
"default": "users",
"description": "\nUser group name to give READ permissions to for project assets (ML jobs, integration test job runs, and machine learning assets). A group with this name must exist in both the staging and prod workspaces. Default"
"description": "\nUser group name to give READ permissions to for project assets (ML jobs, integration test job runs, and machine learning assets). A group with this name must exist in both the staging and prod workspaces. Default",
"skip_prompt_if": {
"properties": {
"input_setup_cicd_and_project": {
"const": "CICD_Only"
}
}
}
},
"input_include_models_in_unity_catalog": {
"order": 10,
"order": 11,
"type": "string",
"description": "\nWhether to use the Model Registry with Unity Catalog",
"default": "yes",
"enum": ["yes", "no"]
"enum": ["yes", "no"],
"skip_prompt_if": {
"properties": {
"input_setup_cicd_and_project": {
"const": "CICD_Only"
}
}
}
},
"input_schema_name": {
"order": 11,
"order": 12,
"type": "string",
"description": "\nName of schema to use when registering a model in Unity Catalog. \nNote that this schema must already exist, and we recommend keeping the name the same as the project name as well as giving the service principals the right access. Default",
"default": "my-mlops-project",
"default": "{{ .input_project_name }}",
"pattern": "^[^ .\\/]*$",
"pattern_match_failure_message": "Valid schema names cannot contain any of the following characters: \" \", \".\", \"\\\", \"/\"",
"skip_prompt_if": {
"properties": {
"input_include_models_in_unity_catalog": {
"const": "no"
"anyOf":[
{
"properties": {
"input_include_models_in_unity_catalog": {
"const": "no"
}
}
},
{
"properties": {
"input_setup_cicd_and_project": {
"const": "CICD_Only"
}
}
}
}
]
}
},
"input_unity_catalog_read_user_group": {
"order": 12,
"order": 13,
"type": "string",
"default": "account users",
"description": "\nUser group name to give EXECUTE privileges to models in Unity Catalog. A group with this name must exist in the Unity Catalog that the staging and prod workspaces can access. Default",
"skip_prompt_if": {
"properties": {
"input_include_models_in_unity_catalog": {
"const": "no"
"anyOf":[
{
"properties": {
"input_include_models_in_unity_catalog": {
"const": "no"
}
}
},
{
"properties": {
"input_setup_cicd_and_project": {
"const": "CICD_Only"
}
}
}
}
]
}
},
"input_include_feature_store": {
"order": 13,
"order": 14,
"type": "string",
"description": "\nWhether to include Feature Store",
"default": "no",
"enum": ["no", "yes"]
},
"input_include_mlflow_recipes": {
"order": 14,
"order": 15,
"type": "string",
"description": "\nWhether to include MLflow Recipes",
"default": "no",
"enum": ["no", "yes"],
"skip_prompt_if": {
"properties": {
"input_include_models_in_unity_catalog": {
"const": "yes"
"anyOf":[
{
"properties": {
"input_include_models_in_unity_catalog": {
"const": "yes"
}
}
},
{
"properties": {
"input_include_feature_store": {
"const": "yes"
}
}
},
{
"properties": {
"input_setup_cicd_and_project": {
"const": "CICD_Only"
}
}
}
}
]
}
}
},
"success_message" : "\n✨ Your MLOps Stack has been created in the '{{.input_project_name}}' directory!\n\nPlease refer to the README.md of your project for further instructions on getting started."
"success_message" : "\n✨ Your MLOps Stack has been created in the '{{.input_root_dir}}/{{.input_project_name}}' directory!\n\nPlease refer to the README.md of your project for further instructions on getting started."
}
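
Reviewer note: the `skip_prompt_if` blocks in this schema use only two constructs, `properties` with `const`, and `anyOf`. The sketch below shows one way such a condition could be evaluated against the answers collected so far. It is an illustration under stated assumptions, not the Databricks CLI's implementation: the function name `matches` is hypothetical, only these two constructs are handled, and an answer that has not been collected yet is assumed never to match.

```python
from typing import Any


def matches(condition: dict[str, Any], answers: dict[str, str]) -> bool:
    """Return True if `answers` satisfies a condition built from
    `properties` + `const` and/or `anyOf`."""
    if "anyOf" in condition:
        return any(matches(sub, answers) for sub in condition["anyOf"])
    for name, rule in condition.get("properties", {}).items():
        # Assumption: a missing answer never satisfies a `const` rule.
        if name not in answers or answers[name] != rule.get("const"):
            return False
    return True


# Condition copied from input_include_mlflow_recipes in the new schema.
RECIPES_SKIP_IF = {
    "anyOf": [
        {"properties": {"input_include_models_in_unity_catalog": {"const": "yes"}}},
        {"properties": {"input_include_feature_store": {"const": "yes"}}},
        {"properties": {"input_setup_cicd_and_project": {"const": "CICD_Only"}}},
    ]
}

if __name__ == "__main__":
    answers = {
        "input_setup_cicd_and_project": "Project_Only",
        "input_include_models_in_unity_catalog": "no",
        "input_include_feature_store": "yes",
    }
    # The Feature Store branch matches, so the recipes prompt would be skipped.
    print(matches(RECIPES_SKIP_IF, answers))  # True
```

Read this way, choosing Feature Store skips the MLflow Recipes prompt altogether. Assuming a skipped prompt falls back to its default of `no`, this covers the case that the `fail` check removed from `library/input_validation.tmpl` used to reject.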
7 changes: 0 additions & 7 deletions library/input_validation.tmpl
@@ -1,9 +1,2 @@
# Validate workspace hostname
{{ define `validation` }}

- Validate feature store and recipes
{{- if and (eq .input_include_feature_store `yes`) (eq .input_include_mlflow_recipes `yes`) -}}
{{ fail `Feature Store cannot be used with MLflow recipes. Please only use one of the two or neither.` }}
{{- end -}}

{{- end -}}