* Added a tests/environment folder to store datasets and bicep templates for test sources
* Added scripts to create Databricks jobs and a notebook to mount storage on Databricks
* Made test environments more consistent across notebooks (secret scope, environment variables)
* A handful of tests were modified to correct mistakes not caught in the source-controlled versions
* Added documentation for the testing environment, including which secrets are used and what they look like
* Added a requirements.txt file for environment deployment
* Hive tests should now run without additional intervention (i.e. they use CREATE IF NOT EXISTS)
* Removed the production environment deployment
* Removed the wasbs-with-parameters test
* After updating all job definitions to be ready for upload, the run-tests script needed to look at .name instead of .settings.name
* Unfortunately, when calling the Jobs API, it returns .settings.name, which must be used
Showing 60 changed files with 1,486 additions and 709 deletions.
@@ -85,18 +85,19 @@ jobs:
          name: FunctionZip
          path: ./artifacts

-     - name: Azure Functions Action
+     - name: Deploy Azure Function to Integration Env
        uses: Azure/[email protected]
        with:
          app-name: ${{ secrets.INT_FUNC_NAME }}
          package: ./artifacts/FunctionZip.zip
          publish-profile: ${{ secrets.INT_PUBLISH_PROFILE }}

-     - uses: azure/login@v1
+     - name: Azure Login
+       uses: azure/login@v1
        with:
          creds: ${{ secrets.INT_AZ_CLI_CREDENTIALS }}

-     - name: Azure CLI script
+     - name: Compare and Update App Settings on Deployed Function
        uses: azure/CLI@v1
        with:
          azcliversion: 2.34.1
@@ -108,7 +109,7 @@ jobs:

      # Start up Synapse Pool and Execute Tests
      - name: Start Integration Synapse SQL Pool
-       run: source tests/integration/manage-sql-pool.sh start ${{ secrets.INT_SUBSCRIPTION_ID }} ${{ secrets.INT_RG_NAME }} ${{ secrets.INT_SYNAPSE_WKSP_NAME }} ${{ secrets.INT_SYNAPSE_SQLPOOL_NAME }}
+       run: source tests/integration/manage-sql-pool.sh start ${{ secrets.INT_SUBSCRIPTION_ID }} ${{ secrets.INT_SYNAPSE_SQLPOOL_RG_NAME }} ${{ secrets.INT_SYNAPSE_WKSP_NAME }} ${{ secrets.INT_SYNAPSE_SQLPOOL_NAME }}
        env:
          AZURE_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
          AZURE_CLIENT_SECRET: ${{ secrets.AZURE_CLIENT_SECRET }}
@@ -124,6 +125,10 @@ jobs:
            token = ${{ secrets.INT_DATABRICKS_ACCESS_TOKEN }}" > ./config.ini
            export DATABRICKS_CONFIG_FILE=./config.ini
+     - name: Confirm Databricks CLI is configured
+       run: databricks clusters spark-versions
+       env:
+         DATABRICKS_CONFIG_FILE: ./config.ini

      - name: Cleanup Integration Environment
        run: python ./tests/integration/runner.py --cleanup --dontwait None None None
@@ -144,7 +149,7 @@ jobs:
          DATABRICKS_CONFIG_FILE: ./config.ini

      - name: Stop Integration Synapse SQL Pool
-       run: source tests/integration/manage-sql-pool.sh stop ${{ secrets.INT_SUBSCRIPTION_ID }} ${{ secrets.INT_RG_NAME }} ${{ secrets.INT_SYNAPSE_WKSP_NAME }} ${{ secrets.INT_SYNAPSE_SQLPOOL_NAME }}
+       run: source tests/integration/manage-sql-pool.sh stop ${{ secrets.INT_SUBSCRIPTION_ID }} ${{ secrets.INT_SYNAPSE_SQLPOOL_RG_NAME }} ${{ secrets.INT_SYNAPSE_WKSP_NAME }} ${{ secrets.INT_SYNAPSE_SQLPOOL_NAME }}
        env:
          AZURE_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
          AZURE_CLIENT_SECRET: ${{ secrets.AZURE_CLIENT_SECRET }}
@@ -172,25 +177,3 @@ jobs:
        with:
          artifacts: ~/artifacts/FunctionZip.zip
          token: ${{ secrets.GITHUB_TOKEN }}
-
-  deployProductionEnvironment:
-    name: Release to Production Environment
-    needs: [createRelease]
-    runs-on: ubuntu-latest
-    environment:
-      name: Production
-    steps:
-      - uses: actions/checkout@v3
-
-      - name: Download Artifact
-        uses: actions/download-artifact@v3
-        with:
-          name: FunctionZip
-          path: ./artifacts
-
-      - name: Azure Functions Action
-        uses: Azure/[email protected]
-        with:
-          app-name: ${{ secrets.FUNC_NAME }}
-          package: ./artifacts/FunctionZip.zip
-          publish-profile: ${{ secrets.PUBLISH_PROFILE }}
@@ -161,3 +161,4 @@ build

 # Ignore local settings
 localsettingsdutils.py
+*.ini
@@ -0,0 +1,122 @@
# Deploying the Test Environment

## Deploying the Connector

## Deploying the Data Sources

For example, to deploy the ADLS Gen2 test source:

```
az deployment group create \
  --template-file ./tests/environment/sources/adlsg2.bicep \
  --resource-group db2pvsasources
```

## Manual Steps

Create a config.ini file:

```ini
databricks_workspace_host_id = adb-workspace.id
databricks_personal_access_token = PERSONAL_ACCESS_TOKEN
databricks_spark3_cluster = CLUSTER_ID
databricks_spark2_cluster = CLUSTER_ID
```

Assign the Service Principal the Storage Blob Data Contributor role on the main ADLS Gen2 instance (for example as sketched below).
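
A minimal sketch with the Azure CLI; the subscription, resource group, storage account, and Service Principal app ID placeholders are assumptions to be replaced with the test environment's values:

```
# Grant the Service Principal data-plane access to the main ADLS Gen2 account (placeholders are assumptions)
az role assignment create \
  --assignee "<SERVICE_PRINCIPAL_APP_ID>" \
  --role "Storage Blob Data Contributor" \
  --scope "/subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.Storage/storageAccounts/<STORAGE_ACCOUNT_NAME>"
```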

Add the Service Principal as a user in the Databricks workspace (one possible approach is sketched below).
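
One possible approach is the workspace's SCIM preview API; the host, token, and app ID placeholders are hypothetical, and the exact endpoint and payload should be confirmed against the Databricks SCIM documentation:

```
# Hypothetical sketch: register the Service Principal in the Databricks workspace via the SCIM preview API
curl -X POST "https://<DATABRICKS_WORKSPACE_HOST>/api/2.0/preview/scim/v2/ServicePrincipals" \
  -H "Authorization: Bearer <PERSONAL_ACCESS_TOKEN>" \
  -H "Content-Type: application/scim+json" \
  -d '{
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:ServicePrincipal"],
        "applicationId": "<SERVICE_PRINCIPAL_APP_ID>",
        "displayName": "<DISPLAY_NAME>"
      }'
```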

Enable mount points with `./tests/environment/dbfs/mounts.py`.

Add the following Key Vault secrets (see the `az keyvault secret set` sketch after this list):

* `tenant-id`
* `storage-service-key`
* `azuresql-username`
* `azuresql-password`
* `azuresql-jdbc-conn-str`, which should be of the form `jdbc:sqlserver://SERVER_NAME.database.windows.net:1433;database=DATABASE_NAME;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;`
* `synapse-storage-key`
* `synapse-query-username`
* `synapse-query-password`

In addition:

* Update the SQL Db and Synapse Server with an AAD Admin
* Add the Service Principal so Databricks can connect to the SQL sources
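
Each secret can be set with the Azure CLI; the vault name and value below are placeholders:

```
# Store one of the required secrets in the test Key Vault (vault name and value are placeholders)
az keyvault secret set \
  --vault-name "<KEY_VAULT_NAME>" \
  --name "azuresql-username" \
  --value "<AZURE_SQL_USERNAME>"
```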

Set the following system environment variables (example exports follow the list):

* `SYNAPSE_SERVICE_NAME`
* `STORAGE_SERVICE_NAME`
* `SYNAPSE_STORAGE_SERVICE_NAME`
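
For example (the mapping of each variable to a deployed service is an assumption based on the names; adjust to the actual environment):

```
# Assumed mapping of environment variables to deployed service names
export SYNAPSE_SERVICE_NAME="<synapse-workspace-name>"
export STORAGE_SERVICE_NAME="<main-adlsg2-account-name>"
export SYNAPSE_STORAGE_SERVICE_NAME="<synapse-adlsg2-account-name>"
```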

Upload the notebooks in `./tests/integration/spark-apps/notebooks/` to `/Shared/examples/` in the Databricks workspace (a CLI sketch follows below).

* Manually for now. TODO: Automate this in Python.
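
Until that is automated, one option is the bulk import command of the (legacy) Databricks CLI, assuming the CLI is already configured for the integration workspace:

```
# Import the local notebooks folder into the workspace under /Shared/examples/
databricks workspace import_dir ./tests/integration/spark-apps/notebooks/ /Shared/examples/
```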

Compile the following apps and upload them to `/dbfs/FileStore/testcases/` (example copy commands follow the list):

* `./tests/integration/spark-apps/jarjobs/abfssInAbfssOut/` with `./gradlew build`
* `./tests/integration/spark-apps/pythonscript/pythonscript.py` by just uploading it.
* `./tests/integration/spark-apps/wheeljobs/abfssintest/` with `python -m build`
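
After building, the artifacts can be copied up with the Databricks CLI; the build output paths and artifact names below are assumptions based on the default Gradle and `python -m build` layouts:

```
# Copy compiled artifacts to DBFS (artifact names and build paths are assumptions)
databricks fs cp ./tests/integration/spark-apps/jarjobs/abfssInAbfssOut/build/libs/<JAR_NAME>.jar dbfs:/FileStore/testcases/
databricks fs cp ./tests/integration/spark-apps/pythonscript/pythonscript.py dbfs:/FileStore/testcases/
databricks fs cp ./tests/integration/spark-apps/wheeljobs/abfssintest/dist/<WHEEL_NAME>.whl dbfs:/FileStore/testcases/
```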

Upload the job definitions using the Python script: `python .\tests\environment\dbfs\create-job.py` (a quick way to verify the upload is sketched below).
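
To confirm the upload, the Jobs 2.1 API can be queried; note that jobs returned by the API expose their name under `.settings.name`, which is what the run-tests script has to read. The host and token placeholders are assumptions:

```
# List the created jobs; the API nests each job's name under settings.name
curl -s -H "Authorization: Bearer <PERSONAL_ACCESS_TOKEN>" \
  "https://<DATABRICKS_WORKSPACE_HOST>/api/2.1/jobs/list" | jq '.jobs[].settings.name'
```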

## GitHub Actions

The workflow expects the following secrets:

* AZURE_CLIENT_ID
* AZURE_CLIENT_SECRET
* AZURE_TENANT_ID
* INT_AZ_CLI_CREDENTIALS, in the `--sdk-auth` JSON format (see the sketch after this list):
```json
{
  "clientId": "xxxx",
  "clientSecret": "yyyy",
  "subscriptionId": "zzzz",
  "tenantId": "μμμμ",
  "activeDirectoryEndpointUrl": "https://login.microsoftonline.com",
  "resourceManagerEndpointUrl": "https://management.azure.com/",
  "activeDirectoryGraphResourceId": "https://graph.windows.net/",
  "sqlManagementEndpointUrl": "https://management.core.windows.net:8443/",
  "galleryEndpointUrl": "https://gallery.azure.com/",
  "managementEndpointUrl": "https://management.core.windows.net/"
}
```
* INT_DATABRICKS_ACCESS_TOKEN
* INT_DATABRICKS_WKSP_ID: adb-xxxx.y
* INT_FUNC_NAME
* INT_PUBLISH_PROFILE, from the Azure Function's publish profile XML
* INT_PURVIEW_NAME
* INT_RG_NAME
* INT_SUBSCRIPTION_ID
* INT_SYNAPSE_SQLPOOL_NAME
* INT_SYNAPSE_SQLPOOL_RG_NAME
* INT_SYNAPSE_WKSP_NAME
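
The `INT_AZ_CLI_CREDENTIALS` JSON above matches the output of `az ad sp create-for-rbac --sdk-auth`; the name, role, and scope below are placeholders:

```
# Generate credentials JSON in the format expected by the azure/login action (placeholders are assumptions)
az ad sp create-for-rbac \
  --name "<SERVICE_PRINCIPAL_NAME>" \
  --role Contributor \
  --scopes "/subscriptions/<SUBSCRIPTION_ID>" \
  --sdk-auth
```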

## config.json

```json
{
  "datasets": {
    "datasetName": {
      "schema": [
        "field1",
        "field2"
      ],
      "data": [
        [
          "val1",
          "val2"
        ]
      ]
    }
  },
  "jobs": {
    "job-name": [
      [
        ("storage"|"sql"|"noop"),
        ("csv"|"delta"|"azuresql"|"synapse"),
        "rawdata/testcase/one/",
        "exampleInputA"
      ]
    ]
  }
}
```