Hello!

For some time now Microsoft have been investing pretty heavily in terraform, and I don’t mean financially; the AzureRM provider is closely aligned with what you can do using ARM templates and other methods of creating Azure resources. There will, however, always be an inevitable lag between what you can do natively in Azure and what you can do with terraform. This is somewhat of an unpopular opinion to hold, because those who disagree have perhaps nailed their colours to the mast on terraform being the best tool ever. I’ve come across a couple of missing features in terraform, and I’m going to share one of them today because it is not immediately obvious that it is missing.

So, I needed to create a delimited text dataset in Azure Data Factory and provide a couple of parameters. According to the documentation for the resource azurerm_data_factory_dataset_delimited_text, “parameters are set as a map to associate with the Data Factory Dataset.” Alright, simple enough, but what about setting the data type? After all, parameters can be any number of data types. I thought maybe the parameters argument might be a map of maps and that I could pass in something like this…

parameters = {
  container_name = { type = "string", defaultValue = "bob" }
  container_id   = { type = "int", defaultValue = 1 }
}

…but sadly that is not the case, as I got the error message Inappropriate value for attribute "parameters": element "container_name": string required.
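In other words, the element type of the map is string, so (for illustration, going purely off that error message) the only shape that validates is a flat map where every value is a string:

parameters = {
  container_name = "bob"
  container_id   = "1"
}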

That seemed pretty clear, then: the parameters argument only accepts strings. But, ever the optimist, I thought that maybe I could pass in a string of JSON and the provider would parse it into types and defaultValues. Before attempting that, I decided to look through the provider source code, as I could have been wasting my time. Sure enough, a comment in the source states that string parameters are the only supported data type. Hmm, OK. So the code below will deploy the linked service and dataset to an ADF that already exists (note: the Storage Account and the Managed Identity permissions also have to exist already), but even though I’m setting an int for container_id, it is still treated as a string.

terraform {
  required_version = ">= 0.14.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "=2.80.0"
    }
  }

  backend "local" {
  }
}

provider "azurerm" {
  features {}
  skip_provider_registration = true
}


data "azurerm_data_factory" "bzzztadf" {
  resource_group_name = "test-adf"
  name                = "bzzztadf"
}


resource "azurerm_data_factory_linked_service_azure_blob_storage" "msi_linked_blob" {
  name                = "bzzzttestadf"
  resource_group_name = "test-adf"
  data_factory_name   = data.azurerm_data_factory.bzzztadf.name

  service_endpoint     = "https://bzzzttestadf.blob.core.windows.net"
  use_managed_identity = true
}

resource "azurerm_data_factory_dataset_delimited_text" "container_names" {
  name                = "CONTAINERNAMES"
  resource_group_name = "test-adf"
  data_factory_name   = data.azurerm_data_factory.bzzztadf.name
  linked_service_name = azurerm_data_factory_linked_service_azure_blob_storage.msi_linked_blob.name

  azure_blob_storage_location {
    container = "@dataset().container_name"
  }
  parameters = { container_name = "bob", container_id = 1, container_is_active = "false" }
}

How much of an impediment this is depends entirely on your data types, and whether you can get away with strings and then convert them to the right type inside ADF. But sticking with terraform regardless smacks of bloody-mindedness, which I think is the wrong thing to do: not that terraform is bad or awful, but you should choose the most appropriate tool for the job at hand. And if terraform cannot do what you need, then you need to use ARM, or perhaps either the az datafactory dataset create command or New-AzDataFactoryDataset, to get the dataset created.
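If you want the deployment to at least stay inside terraform, one escape hatch is to wrap the dataset in an embedded ARM template via the azurerm_resource_group_template_deployment resource, which passes the typed parameters through untouched. This is a sketch rather than anything I actually ran: the resource names are carried over from the config above, and the deployment name is invented.

resource "azurerm_resource_group_template_deployment" "container_names_typed" {
  name                = "containernames-dataset" # hypothetical deployment name
  resource_group_name = "test-adf"
  deployment_mode     = "Incremental"

  # The ARM schema accepts proper parameter types, unlike the provider resource.
  template_content = jsonencode({
    "$schema"      = "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#"
    contentVersion = "1.0.0.0"
    resources = [{
      type       = "Microsoft.DataFactory/factories/datasets"
      apiVersion = "2018-06-01"
      name       = "${data.azurerm_data_factory.bzzztadf.name}/CONTAINERNAMES"
      properties = {
        type = "DelimitedText"
        linkedServiceName = {
          referenceName = azurerm_data_factory_linked_service_azure_blob_storage.msi_linked_blob.name
          type          = "LinkedServiceReference"
        }
        # Typed parameters, which azurerm_data_factory_dataset_delimited_text cannot express.
        parameters = {
          container_name      = { type = "String", defaultValue = "bob" }
          container_id        = { type = "Int", defaultValue = 1 }
          container_is_active = { type = "Bool", defaultValue = false }
        }
        typeProperties = {
          location = {
            type      = "AzureBlobStorageLocation"
            container = "@dataset().container_name"
          }
        }
      }
    }]
  })
}

The trade-off is that terraform now tracks an opaque deployment rather than the dataset itself, so you lose drift detection on the dataset’s own attributes.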

As for what I did, I made use of Kamil Nowinski’s azure.datafactory.tools PowerShell module to get the job done. Much easier to use!