Processing Cubes After a Data Refresh: Batching Processing Using AMO
Recently I needed to update some cubes in a test environment after the underlying data warehouse had been refreshed. Typically I’d sync the cubes from prod, but as sync causes locking, and I needed to get this done during the day, I needed to reprocess all of the cubes in the test environment, and there were a lot of them, all ranging in sizes from hundreds of gb to a few mb. The simplest way to update the cubes was to run a process default on the dimensions and then run a process default on the measure groups. Some of the measure groups were partitioned, so I could’ve got clever and just processed fully any unprocessed partitions, but I felt a default would do most of the hard work for me.
At any rate, because I just needed to get it going and not to think about checking the progress, I created a PowerShell script to run batch the dimension processing in parallel, and then after that had completed move onto each of the measure groups serially. The reason I processed the measure groups in serial is that if one fails then it does not prevent the others from processing successfully.
To parallelise the processing of any objects you need to set the Server.CaptureXml property to true. Rather than executing any XMLA targeted at the server this captures the XMLA to the Server.CaptureLog property. Once you have captured all the XMLA you want, you set the CaptureXml back to false and execute the Server.ExecuteCaptureLog method, setting both transactional and parallel parameters to “true”.
A slight oddity exists in AMO in that whilst you can find each of the dimensions as an object of the database, you cannot actually process this object. Thus in the script below this part is finding all the dimensions object that are part of the cube schema that we will want to process (The script in full is below). We then use the name property of this object to find the pertinent database dimension to process:
$Dimensions=New-object Microsoft.AnalysisServices.Dimension
$CubeDimensions=$cube.dimensions
foreach ($dimension in $cubeDimensions)
{
$Dim=New-object Microsoft.AnalysisServices.Dimension
$dim=$database.Dimensions.FindByName($dimension.Name)
$dim.Process($DimProcessType)
}
If you were to try this, which on the surface looks like it will work:
$Dimensions=New-object Microsoft.AnalysisServices.Dimension
$CubeDimensions=$cube.dimensions
foreach ($dimension in $cubeDimensions)
{
$dim.Process($DimProcessType)
}
You would get this error:
Method invocation failed because [Microsoft.AnalysisServices.CubeDimension] doesn’t contain a method named ‘Process’.
Looking on the MSDN website you will see that this is correct, and that this class is to describe the relationship between the cube and the dimension. But so we can use that to find the dimension object that is associated to the database and run a process on that.
I’ve set the processing type with defaults to limit the number of parameters that have to be typed, but these can be overwritten.
[CmdletBinding()]
param(
[Parameter(Position=0,mandatory=$true)]
[string] $ssasInstance,
[Parameter(Position=1,mandatory=$true)]
[string] $ssasdatabase,
[Parameter(Position=2,mandatory=$true)]
[string] $ssasCube,
[Parameter(Position=3)]
[string] $DimProcessType="ProcessDefault",
[Parameter(Position=4)]
[string] $MgProcessType="ProcessDefault")
[Void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices")
$server = New-Object Microsoft.AnalysisServices.Server
$server.connect($ssasInstance)
$server.CaptureXml = $true;
$databases=$server.databases
$ProcessType=$TypeOfProcess
$databases|select-object name
$database=$databases[$ssasdatabase]
$Cubes=New-object Microsoft.AnalysisServices.Cube
$Cube=$database.cubes.GetByName($ssascube)
$Dimensions=New-object Microsoft.AnalysisServices.Dimension
$CubeDimensions=$cube.dimensions
foreach ($dimension in $cubeDimensions)
{
$Dim=New-object Microsoft.AnalysisServices.Dimension
$dim=$database.Dimensions.FindByName($dimension.Name)
$dim.Process($DimProcessType)
}
$date1=get-date
write-host "---------------------------------------------"
Write-host "Processing Dimensions for database $database " -foregroundcolor darkgreen -backgroundcolor white
$server.CaptureXml = $false
$server.Capturelog | Out-Null
$server.ExecuteCaptureLog($true,$true)
$date2=get-date
write-host "Processing took " ($date2-$date1).Hours " Hours, " ($date2-$date1).Minutes " Mins, " ($date2-$date1).Seconds " Secs " -foregroundcolor darkgreen -backgroundcolor white
foreach ($measureGroup in $Cube.MeasureGroups)
{
$date1=get-date
write-host "---------------------------------------------"
Write-host "Processing" $measureGroup "for database" $database -foregroundcolor darkgreen -backgroundcolor white
$measureGroup.Process($MgProcessType)
$date2=get-date
write-host "Processing took " ($date2-$date1).Hours " Hours, " ($date2-$date1).Minutes " Mins, " ($date2-$date1).Seconds " Secs " -foregroundcolor darkgreen -backgroundcolor white
}
Happy Scripting!