Hello!

I have been using the Terraform provider written by databrickslabs to manage resources on an Azure Databricks workspace. It is very complete and offers most of the functionality needed. Occasionally, however, it is necessary to execute a job to do some maintenance on the workspace, and the provider can only deploy jobs, not run them. This is fair enough, as Terraform is concerned with deploying infrastructure rather than executing jobs. But it would be great if I could execute a job based on the outputs of the resource. Happily, it turns out there is a way, with the bonus of using the credentials stored in a service connection in Azure DevOps. This means we do not need to add, store, or manage any further secrets in the pipeline.

I’ve included snippets rather than an entire project, as I don’t have a complete example I can share right now.

The first step is to get the service principal details and save them as pipeline variables. We will need these later.

  - task: AzureCLI@2
    displayName: "Get Service Principal Variables"
    inputs:
      azureSubscription: ${{ parameters.azureSubscription }}
      scriptType: 'bash'
      scriptLocation: 'inlineScript'
      inlineScript: |
        echo "##vso[task.setvariable variable=spId]$servicePrincipalId"
        echo "##vso[task.setvariable variable=spKey]$servicePrincipalKey"
        echo "##vso[task.setvariable variable=tid]$tenantId"
      addSpnToEnvironment: true
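
A quick aside: variables set like this appear in plain text in later log output. If you would rather have Azure DevOps mask the client secret, flag it as a secret when you set it; a minimal variant of the inline script above:

echo "##vso[task.setvariable variable=spId]$servicePrincipalId"
echo "##vso[task.setvariable variable=spKey;issecret=true]$servicePrincipalKey"
echo "##vso[task.setvariable variable=tId]$tenantId"

Secret variables are not mapped into the environment of later steps automatically, but the explicit env mapping used further down handles that.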

Then, when you deploy your Azure Databricks workspace using the azurerm_databricks_workspace resource, declare these three outputs: the Azure resource ID, the numeric workspace ID, and the workspace URL.

resource "azurerm_databricks_workspace" "ws" {
  name                        = var.ws_name
  resource_group_name         = var.rg_name
  location                    = var.location
  sku                         = var.sku
  managed_resource_group_name = join("", [var.ws_name, "mrg"])
}

output "id" {
  value = azurerm_databricks_workspace.ws.id
}

output "workspace_id" {
  value = azurerm_databricks_workspace.ws.workspace_id
}

output "databricks_instance" {
  value = "https://${azurerm_databricks_workspace.ws.workspace_url}/"
}

Now deploy your workspace. How I am managing the var-file and the backend here is not really relevant. The key point is that you then create some more pipeline variables from the outputs declared in your Terraform.

  - script: |
      terraform -v
      terraform apply -auto-approve -var="client_id=$(spId)" -var="client_secret=$(spKey)" -var="tenant_id=$(tId)" -var-file $(System.DefaultWorkingDirectory)/tf/stages/$(tfstage)/tf.tfvars
      resource_id=$(terraform output -raw id)
      workspace_id=$(terraform output -raw workspace_id)
      databricks_instance=$(terraform output -raw databricks_instance)
      echo "##vso[task.setvariable variable=resource_id]$resource_id"
      echo "##vso[task.setvariable variable=workspace_id]$workspace_id"
      echo "##vso[task.setvariable variable=databricks_instance]$databricks_instance"
    displayName: terraform apply
    workingDirectory: $(System.DefaultWorkingDirectory)/tf
    condition: and(succeeded(), eq('${{ parameters.apply }}', true))

Now to run a bash script. This takes the service principal ID and secret and uses them to generate tokens that can then be used to access the Azure Databricks workspace. It also creates a PAT with a lifespan short enough that it can really only be used within this pipeline run.

  - task: AzureCLI@2
    displayName: "run bash script"
    inputs:
      azureSubscription: ${{ parameters.azureSubscription }}
      scriptType: 'bash'
      scriptLocation: scriptPath
      scriptPath: $(System.DefaultWorkingDirectory)/bash/databricks.sh
      addSpnToEnvironment: true
    condition: and(succeeded(), eq('${{ parameters.apply }}', true))
    env:
      ARM_TENANT_ID: $(tId)
      ARM_CLIENT_ID: $(spId)
      ARM_CLIENT_SECRET: $(spKey)
      RESOURCE_ID: $(resource_id)
      WORKSPACE_URL: $(databricks_instance)
The task above runs bash/databricks.sh, which looks like this:

#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail

MANAGEMENT_RESOURCE_ENDPOINT="https://management.core.windows.net/" # Fixed value (do not change)
AZURE_DATABRICKS_APP_ID="2ff814a6-3304-4ab8-85cb-cd0e6f879c1d" # Fixed value: the Azure Databricks first-party application ID (do not change)

# Debug output - handy while wiring this up, but do not echo the client
# secret in a real pipeline unless the variable is marked as a secret
echo $ARM_TENANT_ID
echo $ARM_CLIENT_ID
echo $RESOURCE_ID
echo $WORKSPACE_URL

# Enable install of extensions without prompt
az config set extension.use_dynamic_install=yes_without_prompt

# token response for the azure databricks app 
token_response=$(az account get-access-token --resource $AZURE_DATABRICKS_APP_ID)
# Extract accessToken value
token=$(jq .accessToken -r <<< "$token_response")

# Get the Azure Management Resource endpoint token
# https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/aad/service-prin-aad-token#--get-the-azure-management-resource-endpoint-token
az_mgmt_resource_endpoint=$(curl -X POST -H 'Content-Type: application/x-www-form-urlencoded' \
    -d "grant_type=client_credentials&client_id=${ARM_CLIENT_ID}&resource=${MANAGEMENT_RESOURCE_ENDPOINT}" \
    --data-urlencode "client_secret=${ARM_CLIENT_SECRET}" \
    "https://login.microsoftonline.com/${ARM_TENANT_ID}/oauth2/token")
# Extract the access_token value
mgmt_access_token=$(jq .access_token -r <<< "$az_mgmt_resource_endpoint" )

# Create a PAT valid for 5 minutes (300 seconds)
create_token="${WORKSPACE_URL}api/2.0/token/create"
pat_token_response=$(curl -X POST \
    -H "Authorization: Bearer $token" \
    -H "X-Databricks-Azure-SP-Management-Token: $mgmt_access_token" \
    -H "X-Databricks-Azure-Workspace-Resource-Id: $RESOURCE_ID" \
    -d '{"lifetime_seconds": 300, "comment": "this is an example token"}' \
    "$create_token")

# Extract the token_value from the response
pat_token=$(jq .token_value -r <<< "$pat_token_response")
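
# The PAT on its own is now enough to authenticate against the workspace REST
# API, without the two management headers - for illustration only:
#   curl -H "Authorization: Bearer $pat_token" "${WORKSPACE_URL}api/2.0/jobs/list"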

# List the available cluster Spark versions
spark_versions="${WORKSPACE_URL}api/2.0/clusters/spark-versions"
spark=$(curl -X GET \
    -H "Authorization: Bearer $token" \
    -H "X-Databricks-Azure-SP-Management-Token: $mgmt_access_token" \
    -H "X-Databricks-Azure-Workspace-Resource-Id: $RESOURCE_ID" \
    "$spark_versions")

echo "spark-versions - "
echo $spark 
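
The PAT only lives for five minutes, so cleaning it up is optional, but if you want to be tidy you can revoke it as soon as you are finished with it. A sketch, assuming the create response contains the usual token_info object:

# Revoke the PAT explicitly rather than waiting for it to expire
token_id=$(jq .token_info.token_id -r <<< "$pat_token_response")
curl -X POST \
    -H "Authorization: Bearer $token" \
    -H "X-Databricks-Azure-SP-Management-Token: $mgmt_access_token" \
    -H "X-Databricks-Azure-Workspace-Resource-Id: $RESOURCE_ID" \
    -d '{"token_id": "'"$token_id"'"}' \
    "${WORKSPACE_URL}api/2.0/token/delete"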

The spark-versions call validates that the authentication has worked, and all being well you should see output that resembles this (I’ve truncated it):

{
  "versions": [
    {
      "key": "8.2.x-scala2.12",
      "name": "8.2 (includes Apache Spark 3.1.1, Scala 2.12)"
    },
    {
      "key": "6.4.x-esr-scala2.11",
      "name": "6.4 Extended Support (includes Apache Spark 2.4.5, Scala 2.11)"
    },
  ...

So all this means you can now interface with the API and do any number of things on the workspace as part of your Terraform deployment, including executing a job.
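
For instance, running an existing job is one more call. A minimal sketch: the job ID is a placeholder, and here I authenticate with the short-lived PAT rather than the AAD token and management headers:

# Trigger a job run; 1234 is a hypothetical job ID - substitute one of your
# own, e.g. taken from a databricks_job resource output in Terraform
run_response=$(curl -X POST \
    -H "Authorization: Bearer $pat_token" \
    -d '{"job_id": 1234}' \
    "${WORKSPACE_URL}api/2.0/jobs/run-now")
echo $run_response

The response contains a run_id, which you can poll with api/2.0/jobs/runs/get?run_id=... if the pipeline needs to wait for the job to complete.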