Azure Databricks is a cloud-based data engineering platform used to process and transform large volumes of data and to explore that data with machine learning models.
CI/CD refers to the process of developing and delivering software in frequent cycles through the use of automation pipelines.
We can set up a CI/CD pipeline for Azure Databricks Notebook deployment as follows:
1. Create a Databricks Workspace
https://azure.microsoft.com/nl-nl/resources/templates/101-databricks-workspace/
2. Assign the Contributor role to the Azure AD Group
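The write-up gives no script for this step. A minimal sketch, assuming the group's display name is stored in a hypothetical pipeline variable $(AadGroupName) and that the resource group and workspace variables are the same ones used in step 3:

```powershell
# Sketch only: $(AadGroupName) is a hypothetical pipeline variable, not part of the original pipeline.
$GroupName = "$(AadGroupName)"

# Look up the Azure AD group by its display name.
$Group = Get-AzADGroup -DisplayName $GroupName
if (-not $Group) { throw "Azure AD group '$GroupName' was not found" }

# Resolve the Databricks workspace and grant the group Contributor on it.
$Databricks = Get-AzResource -ResourceGroupName "$(ResourceGroupName)" -Name "$(DatabricksWorkspace)"
New-AzRoleAssignment -ObjectId $Group.Id -RoleDefinitionName Contributor -Scope $Databricks.ResourceId
```

New-AzRoleAssignment fails if the same assignment already exists, so a re-runnable pipeline would probably check with Get-AzRoleAssignment first.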
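3. Assign the Contributor role to the Service Principal (the object ID used here comes from the script in step 4, so that script has to run in an earlier task)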
# Read the tenant and subscription of the current Azure context
$Context = Get-AzContext
$TenantId = $Context.Subscription.TenantId
$SubscriptionId = $Context.Subscription.Id
# Resolve the Databricks workspace and grant the service principal Contributor on it
$Databricks = Get-AzResource -ResourceGroupName "$(ResourceGroupName)" -Name "$(DatabricksWorkspace)"
New-AzRoleAssignment -ObjectId "$(ServicePrincipalObjectId)" -RoleDefinitionName Contributor -Scope $Databricks.ResourceId
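4. Get the Service Principal Object ID, Application ID and Password
# Object ID of the service principal, used for the role assignment in step 3
$ServicePrincipalObjectId=(Get-AzADServicePrincipal -DisplayNameBeginsWith "$(ServicePrincipalName)").Id
write-host "##vso[task.setvariable variable=ServicePrincipalObjectId]$ServicePrincipalObjectId"
# Password of the service principal, read from Key Vault
# Recent Az.KeyVault versions no longer expose SecretValueText; there, pass -AsPlainText to Get-AzKeyVaultSecret instead
$ServicePrincipalPassword = Get-AzKeyVaultSecret -VaultName "$(KeyVaultName)" -Name "$(SPNPassword)"
$ServicePrincipalPwd = $ServicePrincipalPassword.SecretValueText
# Publish the plain-text value (not the secret object) and mark it as secret so it is masked in the logs
write-host "##vso[task.setvariable variable=ServicePrincipalPassword;issecret=true]$ServicePrincipalPwd"
# Application (client) ID of the service principal
$ServicePrincipalAppId=(Get-AzADServicePrincipal -DisplayNameBeginsWith "$(ServicePrincipalName)").ApplicationId.Guid
write-host "##vso[task.setvariable variable=ServicePrincipalAppId]$ServicePrincipalAppId"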
# Get Subscription ID
$Context = Get-AzContext
$SubscriptionId = $Context.Subscription.Id
write-host "##vso[task.setvariable variable=SubscriptionId]$SubscriptionId"
# Get Tenant ID
$tenantId = $Context.Subscription.TenantId
write-host "##vso[task.setvariable variable=tenantId]$tenantId"
5. Generate a Databricks Token using an Azure DevOps Task
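The marketplace task is not named here, so its inner workings are not shown. If you would rather script this step, the following is a rough sketch of one way to do it with the service principal from step 4 and the Databricks Token API. $(DatabricksUrl) is a hypothetical variable holding the workspace URL, and deriving the token lifetime from $(adb_token_expiry_months) is an assumption made to match the Key Vault expiry in step 6.

```powershell
# Sketch only: $(DatabricksUrl) is a hypothetical variable with the workspace URL.
$BaseUrl       = "$(DatabricksUrl)"
$TokenEndpoint = "https://login.microsoftonline.com/$(tenantId)/oauth2/token"

function Get-AadToken([string]$Resource) {
    # Client-credentials flow for the service principal; a hashtable body is sent form-encoded.
    $Body = @{
        grant_type    = 'client_credentials'
        client_id     = '$(ServicePrincipalAppId)'
        client_secret = '$(ServicePrincipalPassword)'
        resource      = $Resource
    }
    (Invoke-RestMethod -Method Post -Uri $TokenEndpoint -Body $Body).access_token
}

# 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d is the fixed application ID of the Azure Databricks service.
$AdbToken  = Get-AadToken '2ff814a6-3304-4ab8-85cb-cd0e6f879c1d'
$MgmtToken = Get-AadToken 'https://management.core.windows.net/'

$Databricks = Get-AzResource -ResourceGroupName "$(ResourceGroupName)" -Name "$(DatabricksWorkspace)"
$Headers = @{
    'Authorization'                            = "Bearer $AdbToken"
    'X-Databricks-Azure-SP-Management-Token'   = $MgmtToken
    'X-Databricks-Azure-Workspace-Resource-Id' = $Databricks.ResourceId
}

# Assumption: token lifetime mirrors the Key Vault expiry, counting a month as 30 days.
$LifetimeSeconds = [int]"$(adb_token_expiry_months)" * 30 * 24 * 60 * 60
$Request = @{ lifetime_seconds = $LifetimeSeconds; comment = 'Created by the CD pipeline' } | ConvertTo-Json
$Response = Invoke-RestMethod -Method Post -Uri "$BaseUrl/api/2.0/token/create" -Headers $Headers -Body $Request -ContentType 'application/json'

# Expose the token to later tasks as a secret variable named BearerToken.
$BearerToken = $Response.token_value
Write-Host "##vso[task.setvariable variable=BearerToken;issecret=true]$BearerToken"
```

The two extra headers are what allow a service principal that only holds the Contributor role on the workspace (step 3) to call the API before it has been added as a workspace user.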
6. Store the Databricks Token in the Key Vault
Write-Host "Store Databricks Bearer token to $(KeyVaultName)"
$DatabricksSecretName = '$(adb_token_secret_name)'
# Current value of the secret, if any (not used further below)
$CurrentToken = (Get-AzKeyVaultSecret -VaultName "$(KeyVaultName)" -Name "$DatabricksSecretName").SecretValueText
# Set-AzKeyVaultSecret expects a SecureString, so convert the token generated in step 5
$Secret = ConvertTo-SecureString -String '$(BearerToken)' -AsPlainText -Force
Set-AzKeyVaultSecret -VaultName '$(KeyVaultName)' -Name "$DatabricksSecretName" -SecretValue $Secret -ContentType "Databricks Access Token" -Expires (Get-Date).AddMonths($(adb_token_expiry_months))
7. Deploy the Databricks Notebook using an Azure DevOps Task
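As an alternative to a marketplace task, notebooks can be pushed with the Databricks Workspace API. This is a minimal sketch, assuming Python notebooks exported as .py files; $(DatabricksUrl), $(NotebookSourceDir) and $(NotebookTargetDir) are hypothetical variables, and $(BearerToken) is the token from step 5.

```powershell
# Sketch only: the three variables below are hypothetical and not part of the original pipeline.
$BaseUrl   = "$(DatabricksUrl)"
$SourceDir = "$(NotebookSourceDir)"   # folder in the build artifact that holds the notebooks
$TargetDir = "$(NotebookTargetDir)"   # workspace folder, e.g. /Shared/etl
$Headers   = @{ Authorization = "Bearer $(BearerToken)" }

# Make sure the target folder exists; mkdirs succeeds if it is already there.
$Mkdirs = @{ path = $TargetDir } | ConvertTo-Json
Invoke-RestMethod -Method Post -Uri "$BaseUrl/api/2.0/workspace/mkdirs" -Headers $Headers -Body $Mkdirs -ContentType 'application/json'

Get-ChildItem -Path $SourceDir -Filter *.py -File | ForEach-Object {
    $Name = $_.BaseName
    # The import API expects the file content as a base64 string.
    $Content = [Convert]::ToBase64String([IO.File]::ReadAllBytes($_.FullName))
    $Body = @{
        path      = "$TargetDir/$Name"
        format    = 'SOURCE'
        language  = 'PYTHON'
        content   = $Content
        overwrite = $true
    } | ConvertTo-Json
    Invoke-RestMethod -Method Post -Uri "$BaseUrl/api/2.0/workspace/import" -Headers $Headers -Body $Body -ContentType 'application/json'
    Write-Host "Imported notebook $Name to $TargetDir"
}
```

Setting overwrite to true keeps the deployment repeatable: each release replaces the notebook in place instead of failing because it already exists.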
8. Create an Azure Databricks Secret Scope
You can use Azure PowerShell or the Databricks CLI to create a secret scope.
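A rough PowerShell sketch of the scope creation, calling the Databricks Secrets API to create a Databricks-backed scope. $(DatabricksUrl) and $(SecretScopeName) are hypothetical variables, and the key name used for the stored secret is only illustrative.

```powershell
# Sketch only: $(DatabricksUrl) and $(SecretScopeName) are hypothetical variables.
$BaseUrl = "$(DatabricksUrl)"
$Scope   = "$(SecretScopeName)"
$Headers = @{ Authorization = "Bearer $(BearerToken)" }

# List existing scopes so that re-running the pipeline does not fail on a duplicate.
$Existing = (Invoke-RestMethod -Method Get -Uri "$BaseUrl/api/2.0/secrets/scopes/list" -Headers $Headers).scopes
if ($Existing.name -notcontains $Scope) {
    # initial_manage_principal = users lets all workspace users manage the scope.
    $Body = @{ scope = $Scope; initial_manage_principal = 'users' } | ConvertTo-Json
    Invoke-RestMethod -Method Post -Uri "$BaseUrl/api/2.0/secrets/scopes/create" -Headers $Headers -Body $Body -ContentType 'application/json'
    Write-Host "Created secret scope $Scope"
}

# Store a secret in the scope; the key name is illustrative.
$Secret = @{ scope = $Scope; key = 'ServicePrincipalPassword'; string_value = '$(ServicePrincipalPassword)' } | ConvertTo-Json
Invoke-RestMethod -Method Post -Uri "$BaseUrl/api/2.0/secrets/put" -Headers $Headers -Body $Secret -ContentType 'application/json'
```

Notebooks can then read the value with dbutils.secrets.get, passing the same scope and key.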
9. Create a Databricks Cluster
First check whether the Databricks cluster already exists, using an Azure PowerShell or command line task in Azure DevOps, and create the cluster only if it doesn't exist.
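The check-then-create logic could look roughly like this with the Databricks Clusters API. $(DatabricksUrl) and $(ClusterName) are hypothetical variables, and the Spark version, node type and sizing are placeholders rather than values from the original pipeline.

```powershell
# Sketch only: $(DatabricksUrl) and $(ClusterName) are hypothetical variables.
$BaseUrl     = "$(DatabricksUrl)"
$ClusterName = "$(ClusterName)"
$Headers     = @{ Authorization = "Bearer $(BearerToken)" }

# clusters/list returns an object with a "clusters" array (absent when there are none).
$Clusters = (Invoke-RestMethod -Method Get -Uri "$BaseUrl/api/2.0/clusters/list" -Headers $Headers).clusters
$Cluster  = $Clusters | Where-Object { $_.cluster_name -eq $ClusterName } | Select-Object -First 1

if ($Cluster) {
    $ClusterId = $Cluster.cluster_id
    Write-Host "Cluster $ClusterName already exists with id $ClusterId"
} else {
    # Placeholder sizing; adjust the runtime, VM size and worker count to your workload.
    $Body = @{
        cluster_name            = $ClusterName
        spark_version           = '13.3.x-scala2.12'
        node_type_id            = 'Standard_DS3_v2'
        num_workers             = 2
        autotermination_minutes = 30
    } | ConvertTo-Json
    $Created   = Invoke-RestMethod -Method Post -Uri "$BaseUrl/api/2.0/clusters/create" -Headers $Headers -Body $Body -ContentType 'application/json'
    $ClusterId = $Created.cluster_id
    Write-Host "Created cluster $ClusterName with id $ClusterId"
}

# Make the cluster id available to later tasks.
Write-Host "##vso[task.setvariable variable=ClusterId]$ClusterId"
```

The lookup matters because Databricks does not enforce unique cluster names: calling clusters/create on every release would add another cluster with the same name each time.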
End-to-end Azure DevOps CD pipeline for Azure Databricks notebook deployment, cluster creation, secret scope creation, etc.